Foundation Models vs. Task-Specific Models
Explore the distinctions between foundation models and task-specific models. Learn how foundation models serve as broad, adaptable bases for tasks via prompting, fine-tuning, or retrieval augmentation. Discover when specialized models outperform foundation models, considering latency, explainability, data availability, and cost. This lesson helps you evaluate and select the right model type for your AI applications based on practical production constraints.
The previous lesson established how LLMs are trained through pre-training, supervised fine-tuning, and RLHF, and it drew a clear line between LLMs and traditional machine learning. That distinction raises a practical question: if a team needs to build a customer-support chatbot, a legal document summarizer, and a code reviewer, should they train three separate models or start from a single, shared base? The answer depends on understanding a category of model that has reshaped the entire AI landscape over the past few years.
What are foundation models
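A foundation model is a large model pre-trained on broad, general-purpose data so that it can serve as an adaptable base for many downstream tasks through prompting, fine-tuning, or retrieval augmentation.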
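The term “foundation model” was coined in 2021 by researchers at Stanford’s Center for Research on Foundation Models (CRFM), part of the Institute for Human-Centered AI (HAI), to emphasize that such a model is a shared base on which applications are built, not a finished product. Think of it like a building’s foundation: the concrete slab does not determine whether the structure above becomes a hospital, an office, or a school, but every one of those buildings depends on it.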
Note: A foundation model is defined by its generality. It is not optimized for any single task during pre-training, which is precisely what makes it adaptable to many tasks afterward.
This generality has a direct business implication. A company that needs a customer-support chatbot, a legal document summarizer, and a code reviewer can start from the same foundation model rather than training three separate systems. Each downstream application adapts the shared model through different techniques, amortizing the enormous upfront cost of pre-training across multiple use cases.
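The idea of several applications sharing one base can be sketched in a few lines of Python. This is a minimal illustration and not a production pattern: base_model is a hypothetical stand-in for a call to a hosted foundation model, the Application class and the instruction strings are invented for this example, and only adaptation by prompting is shown (fine-tuning and retrieval augmentation would replace or extend the build_prompt step).

```python
from dataclasses import dataclass
from typing import Callable


def base_model(prompt: str) -> str:
    """Hypothetical stand-in for a single shared foundation model.

    A real system would send the prompt to a deployed model endpoint;
    this stub only reports the prompt size so the sketch runs offline.
    """
    return f"[model output for a {len(prompt)}-character prompt]"


@dataclass(frozen=True)
class Application:
    """One downstream use case adapted from the shared model by prompting."""

    name: str
    instructions: str            # task-specific adaptation (assumed wording)
    model: Callable[[str], str]  # the shared base, injected rather than copied

    def build_prompt(self, user_input: str) -> str:
        # Prepend the task instructions to whatever the user supplied.
        return f"{self.instructions}\n\nInput:\n{user_input}"

    def run(self, user_input: str) -> str:
        return self.model(self.build_prompt(user_input))


# Three applications, one underlying model.
APPS = {
    "support": Application(
        "support",
        "You are a customer-support assistant. Answer politely and concisely.",
        base_model,
    ),
    "legal": Application(
        "legal",
        "Summarize the following legal document in plain language.",
        base_model,
    ),
    "review": Application(
        "review",
        "Review the following code and list potential bugs.",
        base_model,
    ),
}

if __name__ == "__main__":
    print(APPS["support"].run("My order has not arrived."))
    print(APPS["legal"].run("This agreement is entered into by ..."))
    print(APPS["review"].run("def add(a, b): return a - b"))
```

Each Application holds a reference to the same base_model function, which mirrors the point above: the expensive component is built once, and only the lightweight, task-specific layer differs between use cases.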
Amazon SageMaker JumpStart illustrates this industry shift. It provides access to pre-trained foundation model endpoints that practitioners can deploy and adapt ...