With the recent explosion of interest in large language models (LLMs), these systems can seem almost magical. But let's demystify them.
I wanted to step back and unpack the fundamentals — breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact with today.
This two-part deep dive is something I've been meaning to do for a while and was also inspired by Andrej Karpathy's wildly popular 3.5-hour YouTube video, which has racked up 800,000+ views in just 10 days. Andrej is a founding member of OpenAI, and his insights are gold; you get the idea.
If you have the time, his video is definitely worth watching. But let’s be real — 3.5 hours is a long watch. So, for all the busy folks who don’t want to miss out, I’ve distilled the key concepts from the first 1.5 hours into this 10-minute read, adding my own breakdowns to help you build a solid intuition.
What you’ll get
Part 1 (this article): Covers the fundamentals of LLMs, from pre-training to post-training, including neural networks, hallucinations, and inference.
Part 2: Reinforcement learning from human/AI feedback, plus a look at o1 models, DeepSeek-R1, and AlphaGo
Let's go! I'll start by looking at how LLMs are built.
At a high level, there are 2 key phases: pre-training and post-training.
1. Pre-training
Before an LLM can generate text, it must first learn how language works. This happens through pre-training, a highly computationally intensive task.
Step 1: Data collection and preprocessing
The first step in training an LLM is gathering as much high-quality text as possible. The goal is to create a massive and diverse dataset containing a wide range of human knowledge.
One source is Common Crawl, a free, open repository of web crawl data containing over 250 billion web pages collected across 18 years. However, raw web data is noisy, full of spam, duplicates, and low-quality content, so preprocessing is essential. If you're interested in preprocessed datasets, FineWeb offers a curated version of Common Crawl and is made available on Hugging Face.
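If you want to poke at this data yourself, here's a minimal sketch using the Hugging Face datasets library to stream a small FineWeb sample (the repo name and subset follow the public FineWeb release; streaming avoids downloading the full multi-terabyte dataset):

```python
from datasets import load_dataset

# Stream a 10B-token sample of FineWeb rather than downloading it all
fw = load_dataset("HuggingFaceFW/fineweb", name="sample-10BT",
                  split="train", streaming=True)

for i, doc in enumerate(fw):
    print(doc["text"][:200])  # each record is cleaned web text plus metadata
    if i >= 2:
        break
```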

Once cleaned, the text corpus is ready for tokenization.
Step 2: Tokenization
Before a neural network can process text, it must be converted into numerical form. This is done through tokenization, where words, subwords, or characters are mapped to unique numerical tokens.
Think of tokens as the fundamental building blocks of all language models. In GPT-4, there are 100,277 possible tokens. A popular visualization tool, Tiktokenizer, lets you experiment with tokenization and see how text is broken down into tokens. Try entering a sentence, and you'll see each word or subword assigned its own numerical ID.
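You can reproduce this locally with OpenAI's tiktoken library; here's a quick sketch (cl100k_base is the encoding GPT-4 uses):

```python
import tiktoken

# cl100k_base is the GPT-4 encoding, with a ~100,277-token vocabulary
enc = tiktoken.get_encoding("cl100k_base")

tokens = enc.encode("we are cooking")
print(tokens)                             # a short list of integer token IDs
print([enc.decode([t]) for t in tokens])  # the text fragment behind each ID
```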

Step 3: Neural network training
Once the text is tokenized, the neural network learns to predict the next token based on its context. As shown above, the model takes an input sequence of tokens (e.g., "we", "are", "cook", "ing") and processes it through a giant mathematical expression, which represents the model's architecture, to predict the next token.
A neural network consists of 2 key parts:
- Parameters (weights) — the learned numerical values from training.
- Architecture (mathematical expression) — the structure defining how the input tokens are processed to produce outputs.

Initially, the model’s predictions are random, but as training progresses, it learns to assign probabilities to possible next tokens.
Because the correct next token (e.g., "food") is known from the training data, the model adjusts its billions of parameters (weights) through backpropagation, an optimization process that reinforces correct predictions by increasing their probabilities while reducing the likelihood of incorrect ones.
This process is repeated billions of times across massive datasets.
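To make the loop concrete, here is a heavily simplified training step in PyTorch. Everything here is a toy stand-in: real LLMs use deep transformer architectures, the token IDs below are made up, and a step like this runs billions of times:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, context_len = 100_277, 64, 4

# Toy stand-in for the "giant mathematical expression": embedding + linear.
# Real LLMs use deep transformer stacks instead.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Flatten(),                                    # (1, 4, 64) -> (1, 256)
    nn.Linear(embed_dim * context_len, vocab_size),  # logits over every token
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

context = torch.tensor([[1906, 553, 4321, 287]])  # made-up IDs for "we are cook ing"
target = torch.tensor([2745])                     # made-up ID for "food"

logits = model(context)                 # scores for all 100,277 possible tokens
loss = nn.functional.cross_entropy(logits, target)
loss.backward()                         # backpropagation: compute gradients
optimizer.step()                        # nudge weights toward the correct token
optimizer.zero_grad()
```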
Base model — the output of pre-training
At this stage, the base model has learned:
- How words, phrases and sentences relate to each other
- Statistical patterns in its training data
However, base models are not yet optimised for real-world tasks. You can think of them as an advanced autocomplete system — they predict the next token based on probability, but with limited instruction-following ability.
A base model can sometimes recite training data verbatim and can be used for certain applications through in-context learning, where you guide its responses by providing examples in your prompt. However, to make the model truly useful and reliable, it requires further training.
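For instance, a few-shot prompt like this illustrative one can steer a base model into a translation pattern purely through next-token prediction; the model will likely complete it with "eau" without any instruction tuning:

```
English: cheese -> French: fromage
English: bread  -> French: pain
English: water  -> French:
```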
2. Post-training: making the model useful
Base models are raw and unrefined. To make them helpful, reliable, and safe, they go through post-training, where they are fine-tuned on smaller, specialised datasets.
Because the model is a neural network, it cannot be explicitly programmed like traditional software. Instead, we “program” it implicitly by training it on structured labeled datasets that represent examples of desired interactions.
How post training works
Specialised datasets are created, consisting of structured examples of how the model should respond in different situations.
Some types of post training include:
- Instruction/conversation fine-tuning
Goal: Teach the model to follow instructions, stay task-oriented, engage in multi-turn conversations, follow safety guidelines, and refuse malicious requests.
E.g., InstructGPT (2022): OpenAI hired some 40 contractors to create these labelled datasets. These human annotators wrote prompts and provided ideal responses based on safety guidelines. Today, many datasets are generated automatically, with humans reviewing and editing them for quality.
- Domain-specific fine-tuning
Goal: Adapt the model for specialised fields like medicine, law, and programming.
Post training also introduces special tokens — symbols that were not used during pre-training — to help the model understand the structure of interactions. These tokens signal where a user’s input starts and ends and where the AI’s response begins, ensuring that the model correctly distinguishes between prompts and replies.
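As an illustration, a chat-formatted training example using ChatML-style special tokens (adopted by several open models; the exact tokens vary by model family) looks roughly like this:

```
<|im_start|>user
What is 2 + 2?<|im_end|>
<|im_start|>assistant
2 + 2 = 4.<|im_end|>
```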
Now, we’ll move on to some other key concepts.
Inference — how the model generates new text
Inference can be performed at any stage, even midway through pre-training, to evaluate how well the model has learned.
When given an input sequence of tokens, the model assigns probabilities to all possible next tokens based on patterns it has learned during training.
Instead of always choosing the most likely token, it samples from this probability distribution — similar to flipping a biased coin, where higher-probability tokens are more likely to be selected.
This process repeats iteratively, with each newly generated token becoming part of the input for the next prediction.
Token selection is stochastic, so the same input can produce different outputs. Over time, the model generates text that wasn't explicitly in its training data but follows the same statistical patterns.
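Here's a minimal sketch of that sampling step in Python, with made-up probabilities for four candidate tokens (real models score the entire ~100k-token vocabulary at every step):

```python
import numpy as np

rng = np.random.default_rng()

# Toy distribution over candidate next tokens for the prompt "we are cook ing"
candidates = ["food", "dinner", "up", "books"]
probs = np.array([0.55, 0.25, 0.15, 0.05])

# Sample like flipping a biased coin: high-probability tokens win more often
next_token = rng.choice(candidates, p=probs)
print(next_token)  # usually "food", but occasionally something else
```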
Hallucinations — when LLMs generate false info
Why do hallucinations occur?
Hallucinations happen because LLMs do not “know” facts — they simply predict the most statistically likely sequence of words based on their training data.
Early models struggled significantly with hallucinations.
For instance, in the example below, if the training data contains many “Who is…” questions with definitive answers, the model learns that such queries should always have confident responses, even when it lacks the necessary knowledge.
When asked about an unknown person, the model does not default to “I don’t know” because this pattern was not reinforced during training. Instead, it generates its best guess, often leading to fabricated information.

How do you reduce hallucinations?
Method 1: Saying “I don’t know”
Improving factual accuracy requires explicitly training the model to recognise what it does not know — a task that is more complex than it seems.
This is done via self-interrogation, a process that helps define the model's knowledge boundaries.
Self-interrogation can be automated using another AI model, which generates questions to probe knowledge gaps. If the model being tested produces a false answer, new training examples are added where the correct response is: "I'm not sure. Could you provide more context?"
If a model has seen a question many times in training, it will assign a high probability to the correct answer.
If the model has not encountered the question before, it distributes probability more evenly across multiple possible tokens, making the output more randomised. No single token stands out as the most likely choice.
Fine-tuning explicitly trains the model to handle low-confidence outputs with predefined responses.
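Such a training example might look something like this (an illustrative record; real datasets use model-specific chat formats):

```json
{
  "prompt": "Who is asdja rkjgklfj?",
  "ideal_response": "I'm not sure who that is. Could you provide more context?"
}
```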
For example, when I asked ChatGPT-4o, “Who is asdja rkjgklfj?”, it correctly responded: “I’m not sure who that is. Could you provide more context?”
Method 2: Doing a web search
A more advanced method is to extend the model’s knowledge beyond its training data by giving it access to external search tools.
At a high level, when a model detects uncertainty, it can trigger a web search. The search results are then inserted into the model's context window, essentially allowing this new data to become part of its working memory. The model references this new information while generating a response.
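A rough sketch of that flow in Python; generate, search_web, and the <search_start> tokens are all illustrative assumptions, not a real API:

```python
def search_web(query: str) -> str:
    """Stand-in for a real search tool (an assumption, not an actual API)."""
    return f"[search results for: {query}]"

def answer(generate, prompt: str) -> str:
    draft = generate(prompt)
    # Hypothetical protocol: the model emits special tokens when it is
    # uncertain and wants external information.
    if "<search_start>" in draft:
        query = draft.split("<search_start>")[1].split("<search_end>")[0]
        # Insert the results into the context window (working memory) so the
        # model can reference the new facts while generating its final answer.
        return generate(prompt + "\n" + search_web(query))
    return draft
```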
Vague recollections vs working memory
Generally speaking, LLMs have two types of knowledge access.
- Vague recollections: the knowledge stored in the model's parameters from pre-training. This is based on patterns learned from vast amounts of internet data but is neither precise nor searchable.
- Working memory: the information available in the model's context window, which is directly accessible during inference. Any text provided in the prompt acts as short-term memory, allowing the model to recall details while generating responses.
Adding relevant facts within the context window significantly improves response quality.
Knowledge of self
When asked questions like “Who are you?” or “What built you?”, an LLM will generate a statistical best guess based on its training data, unless explicitly programmed to respond accurately.
LLMs do not have true self-awareness; their responses depend on patterns seen during training.
One way to provide the model with a consistent identity is by using a system prompt, which sets predefined instructions about how it should describe itself, its capabilities, and its limitations.
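In practice this often looks like a system message prepended to every conversation; here's an illustrative example in the widely used role/content chat format (the identity details are made up):

```python
messages = [
    # The system prompt pins down a consistent identity and ground rules
    {"role": "system",
     "content": "You are ExampleBot, built by Example Corp. "
                "Describe yourself and your limitations accurately."},
    {"role": "user", "content": "Who are you?"},
]
```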
To end off
That’s a wrap for Part 1! I hope this has helped you build intuition on how LLMs work. In Part 2, we’ll dive deeper into reinforcement learning and some of the latest models.
Got questions or ideas for what I should cover next? Drop them in the comments — I’d love to hear your thoughts. See you in Part 2! 🙂