Introduction to Large Language Models (LLMs)

Understand what an LLM is, how it is trained, and what it can — or cannot — do.

By Héric Libong

Large language models (LLMs) sit at the heart of the current wave of generative AI. This introductory module lays the groundwork for working with them clear-sightedly: what an LLM is, how it is trained, and, above all, the limits you don't always see when you simply fire prompts at one.

We start from the foundations: an LLM is a statistical model trained to predict the next token in a text sequence. Behind that modest definition lie billions of parameters, petabytes of training data, and the so-called “Transformer” architecture that has reshaped natural language processing since 2017. Understanding this token-by-token prediction mechanism radically changes the way you interpret a model’s outputs, and especially the famous “hallucinations”.
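The prediction mechanism described above can be sketched in a few lines. This is a deliberately tiny illustration, not how a real LLM works: a real model learns billions of parameters over subword tokens, whereas here the "probabilities" are just bigram counts over a nine-word toy corpus, and generation greedily picks the most frequent continuation.

```python
# Toy illustration of next-token prediction: a bigram "model" that always
# picks the most frequent continuation. Real LLMs use subword tokens,
# billions of parameters, and the Transformer architecture.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat sat".split()

# Count bigrams: how often each token follows the previous one.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent next token seen after `token`."""
    counts = bigrams[token]
    return counts.most_common(1)[0][0] if counts else None

# Generate token by token, feeding each prediction back in.
sequence = ["the"]
for _ in range(3):
    sequence.append(predict_next(sequence[-1]))

print(" ".join(sequence))  # prints "the cat sat on"
```

Even this toy shows why "hallucinations" are unsurprising: the model emits whatever continuation is statistically most plausible given its training data, with no notion of whether the result is true.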

The module then covers the three training phases found in modern LLMs (pre-training, supervised fine-tuning, RLHF alignment), the notions of context and context window, and a first look at the differences between open and closed models. By the end, you will be able to read a model card, compare two LLMs on objective criteria, and identify the use cases where they excel — and the ones where they remain uncertain.
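To make the context-window notion concrete, here is a minimal sketch of a common practical consequence: when a conversation exceeds the window, the oldest content must be dropped. The whitespace "tokenizer" and the 10-token window are simplifications chosen for illustration; real models use subword tokenizers and windows of thousands to millions of tokens.

```python
# Hedged sketch: keep only the most recent messages that fit in a fixed
# context window. Whitespace tokenization and a 10-token window are
# stand-ins for real subword tokenizers and much larger windows.
def tokenize(text):
    return text.split()

def fit_to_window(messages, window=10):
    """Keep the most recent messages whose total token count fits the window."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        n = len(tokenize(msg))
        if used + n > window:
            break
        kept.append(msg)
        used += n
    return list(reversed(kept))  # restore chronological order

history = [
    "hello there",                 # 2 tokens, oldest
    "tell me about transformers",  # 4 tokens
    "what is a context window",    # 5 tokens, newest
]
print(fit_to_window(history))  # the oldest message no longer fits and is dropped
```

This truncation is why a long conversation can "forget" its beginning: anything outside the window simply never reaches the model.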

Tags: LLM AI Beginner NLP