From Raw Text to Helpful Assistant
Learn how models have evolved from raw text completion machines to multifunctional, helpful assistants.
In our last lesson, we witnessed the colossal process of pretraining. The result is a powerful base model that has compressed a vast portion of human knowledge into its weights by learning to predict the next token.
But this genius is not a good assistant. It’s a pattern-completion engine, not an instruction-following conversationalist. It has also absorbed biases and toxic content from its training data, which makes it unsafe to use as is. It’s a powerful engine without a steering wheel, brakes, or any sense of the rules of the road. How do we take this raw power and “align” it with human intent and values? This final, crucial process is called alignment.
Stage 1: Supervised fine-tuning (SFT)
The first and most fundamental problem with our base model is that it doesn’t know the “format” of a good answer. If you ask it, “Explain the concept of black holes,” it might just continue your sentence with, “…is a fascinating topic in modern astrophysics,” because that’s a statistically common pattern. We need to teach it the conversational pattern of “User asks a question -> Assistant provides a helpful, complete answer.”
This is the goal of supervised fine-tuning (SFT), also known as instruction tuning. The key ingredient for SFT is a new dataset that is much smaller than the pretraining corpus but of extremely high quality. This dataset is painstakingly created, often by human labelers, and consists of thousands of example conversations in the desired format, such as (instruction, response) pairs.
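To make the (instruction, response) format concrete, here is a minimal Python sketch of how such pairs might be turned into training text. The `SFTExample` class, the `format_example` helper, and the `User:`/`Assistant:` template are hypothetical choices for illustration; each real project defines its own conversation format.

```python
from dataclasses import dataclass


@dataclass
class SFTExample:
    """One curated (instruction, response) pair. Hypothetical structure."""
    instruction: str
    response: str


# Hypothetical template: the role markers are an assumption, not a standard.
TEMPLATE = "User: {instruction}\nAssistant: {response}"


def format_example(example: SFTExample) -> str:
    """Render one pair in the conversational pattern the model should learn."""
    return TEMPLATE.format(
        instruction=example.instruction.strip(),
        response=example.response.strip(),
    )


# A tiny stand-in for the curated dataset (real ones hold thousands of pairs).
dataset = [
    SFTExample(
        instruction="Explain the concept of black holes.",
        response="A black hole is a region of space where gravity is so "
                 "strong that nothing, not even light, can escape.",
    ),
    SFTExample(
        instruction="Give me one tip for writing clear emails.",
        response="State your main request in the first sentence.",
    ),
]

training_texts = [format_example(ex) for ex in dataset]
print(training_texts[0])
```

Every training text follows the same “User asks, Assistant answers” shape, and that shape is what the model picks up during SFT.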
We then take our pretrained base model and continue training it using the exact same four-step training loop we learned about. The only difference is that we are now using this small, curated dataset instead of the massive web corpus.
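The idea of “same loop, different data” can be sketched with a toy model. The code below is not a real language model: it is a small table of next-token scores, and it assumes the four steps are a forward pass, a loss computation, a gradient computation, and a weight update. The two datasets and all function names are made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(weights, sequences, token_ids, lr=0.5, epochs=30):
    """Run the same next-token training loop over any list of token sequences."""
    last_loss = None
    for _ in range(epochs):
        for seq in sequences:
            for prev, nxt in zip(seq, seq[1:]):
                row, target = weights[token_ids[prev]], token_ids[nxt]
                probs = softmax(row)                       # 1. forward pass
                last_loss = -math.log(probs[target])       # 2. cross-entropy loss
                grads = [p - (1.0 if j == target else 0.0)
                         for j, p in enumerate(probs)]     # 3. gradient w.r.t. scores
                for j, g in enumerate(grads):              # 4. weight update
                    row[j] -= lr * g
    return last_loss


def next_token_prob(weights, token_ids, prev, nxt):
    """Probability the toy model assigns to `nxt` right after `prev`."""
    return softmax(weights[token_ids[prev]])[token_ids[nxt]]


# Toy stand-ins: "web" text where a question is followed by another question,
# and a curated example where a question is followed by an assistant answer.
web_corpus = [["what", "is", "a", "black", "hole", "?",
               "what", "is", "a", "star", "?"]]
sft_data = [["what", "is", "a", "black", "hole", "?",
             "Assistant:", "a", "collapsed", "star"]]

vocab = sorted({tok for seq in web_corpus + sft_data for tok in seq})
token_ids = {tok: i for i, tok in enumerate(vocab)}
weights = [[0.0] * len(vocab) for _ in vocab]

train(weights, web_corpus, token_ids)   # "pretraining"
before = next_token_prob(weights, token_ids, "?", "Assistant:")

train(weights, sft_data, token_ids)     # "SFT": same loop, curated data
after = next_token_prob(weights, token_ids, "?", "Assistant:")

print(f"P('Assistant:' after '?') before SFT: {before:.3f}, after SFT: {after:.3f}")
```

The `train` function is called twice and never changes; only the data passed to it does. After the second call, the toy model is far more likely to follow a question with the assistant marker than with another question.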
SFT is like sending our library genius to a “Consulting 101” workshop. We’re not teaching them new facts about the world; we’re teaching them the social rules of the job. By showing them hundreds of examples of good client interactions, they quickly learn the ...