Large language models (LLMs) sit at the heart of the current wave of generative AI. This introductory module lays the groundwork for working with them clearly: what an LLM is, how it is trained, and, above all, the limits that stay hidden when you simply fire prompts at one.
We start from the foundations: an LLM is a statistical model trained to predict the next token in a text sequence. Behind that modest definition lie billions of parameters, terabytes of training data, and the Transformer architecture that has reshaped natural language processing since 2017. Understanding this token-by-token prediction mechanism radically changes the way you interpret a model’s outputs, and especially the well-known “hallucinations”.
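The token-by-token loop can be made concrete with a deliberately tiny sketch. Everything here is invented for illustration: a hand-written bigram score table stands in for the Transformer, but the generation loop itself (score candidates, softmax into probabilities, sample, append, repeat) is the same shape an LLM follows.

```python
import math
import random

# Toy stand-in for a trained model: hand-picked scores for candidate next
# tokens given the previous token. A real LLM computes such scores (logits)
# with billions of parameters; the surrounding loop is what stays the same.
BIGRAM_SCORES = {
    "the": {"cat": 2.0, "dog": 1.5, "model": 1.0},
    "cat": {"sat": 2.5, "ran": 1.0},
    "dog": {"ran": 2.0, "sat": 0.5},
    "model": {"predicts": 2.0},
    "sat": {"down": 2.0, "quietly": 0.5},
}

def next_token_distribution(prev_token):
    """Turn raw scores into a probability distribution via softmax."""
    scores = BIGRAM_SCORES[prev_token]
    exps = {tok: math.exp(s) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def generate(start, n_tokens, seed=0):
    """Generate text one token at a time, sampling from the distribution."""
    random.seed(seed)
    tokens = [start]
    for _ in range(n_tokens):
        if tokens[-1] not in BIGRAM_SCORES:
            break  # no continuation known for this token
        dist = next_token_distribution(tokens[-1])
        candidates, probs = zip(*dist.items())
        tokens.append(random.choices(candidates, weights=probs, k=1)[0])
    return " ".join(tokens)

print(generate("the", 3))
```

Notice that nothing in this loop checks whether the output is *true*; it only picks plausible continuations. That is the mechanical root of hallucinations.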
The module then covers the three training phases found in modern LLMs (pre-training, supervised fine-tuning, RLHF alignment), the notions of context and context window, and a first look at the differences between open and closed models. By the end, you will be able to read a model card, compare two LLMs on objective criteria, and identify the use cases where they excel, and those where they remain unreliable.
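The context window mentioned above can be sketched in a few lines. This is a simplified illustration, not any real model's API: it counts whitespace-separated words instead of subword tokens, and the window size of 8 is an arbitrary choice for the demo (real windows range from a few thousand to over a million tokens).

```python
# Hypothetical, tiny context window for illustration only.
CONTEXT_WINDOW = 8

def fit_to_window(history, window=CONTEXT_WINDOW):
    """Keep only the most recent tokens that fit in the window.
    'Tokens' here are just words; real models use subword tokenizers."""
    tokens = " ".join(history).split()
    return tokens[-window:]

history = [
    "hello there",
    "how is the weather today",
    "tell me a story about cats",
]
print(fit_to_window(history))
```

The key point the sketch makes visible: once the conversation exceeds the window, the oldest tokens are simply gone, and the model answers as if they had never been said.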