
Language Models, Chat Models, and Embedding Models

Explore the three main language model types in LangChain: completion-style, chat, and embedding models. Understand their input-output structures, use cases, and how LangChain provides a unified interface to switch providers seamlessly, enabling effective AI solutions across various tasks.

Every LLM provider exposes a slightly different API. OpenAI expects one format, Anthropic expects another, and a model hosted on AWS SageMaker expects yet another. If your application code is tightly coupled to any single provider’s API shape, switching providers means rewriting significant portions of your codebase. LangChain solves this fragmentation by introducing a unified abstraction layer that groups all language model interactions into three distinct types: completion-style LLMs, chat models, and embedding models. Each type handles a different shape of input and output, but all three share a common interface pattern that keeps your application code provider-agnostic.
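To make the idea concrete, here is a minimal sketch of how a shared `invoke()` contract keeps application code provider-agnostic. The class and function names below are illustrative stand-ins, not LangChain's actual classes; in real LangChain code the objects would be provider classes such as those from `langchain_openai` or `langchain_anthropic`:

```python
# Illustrative sketch: two hypothetical provider wrappers that expose the
# same invoke() contract, so application code never depends on either
# provider's native API shape. (Stand-ins, not real LangChain classes.)

class OpenAIStyleLLM:
    """Stand-in for a provider whose native API expects a 'prompt' field."""
    def invoke(self, text: str) -> str:
        # Real code would call the provider's HTTP API here.
        return f"[openai-style completion of: {text}]"

class AnthropicStyleLLM:
    """Stand-in for a provider whose native API expects a 'messages' field."""
    def invoke(self, text: str) -> str:
        return f"[anthropic-style completion of: {text}]"

def summarize(llm, ticket_history: str) -> str:
    # Application code only knows about .invoke(); swapping providers
    # means swapping the object passed in, nothing else.
    return llm.invoke(f"Summarize this ticket history:\n{ticket_history}")

print(summarize(OpenAIStyleLLM(), "User reported a login failure."))
print(summarize(AnthropicStyleLLM(), "User reported a login failure."))
```

Because `summarize` depends only on the `invoke()` contract, switching providers changes one line at the call site rather than the function itself.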

Consider a real-world scenario. A customer-support platform uses a chat model to handle live dialogue with users, an embedding model to search a knowledge base for relevant help articles, and a completion-style LLM to summarize ticket histories for internal reports. Three different model types, three different purposes, but LangChain lets you instantiate and call all of them through a consistent API. By the end of this lesson, you will be able to instantiate each type, understand their input and output contracts, and choose the right one for any given task.

Completion-style language models

Completion-style LLMs accept a single string prompt and return a string completion. They correspond to older provider APIs, such as OpenAI’s legacy completions endpoint, where you send in raw text and the model continues it. Think of this like autocomplete on a much larger scale: you provide the beginning of a thought, and the model finishes it.

In LangChain, you work with these models by importing a provider-specific class, instantiating it with parameters like model name, temperature, and max_tokens, and then calling .invoke() with a plain string. The return value is also a plain string.
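The string-in, string-out contract can be sketched with a toy stand-in. The class below is illustrative only; in production code the object would be a real provider class (for example, `OpenAI` from the `langchain_openai` package), constructed with the same kinds of parameters:

```python
# Toy stand-in for a completion-style model, illustrating the contract:
# a plain string goes in via .invoke(), and a plain string comes out.
# The class name and its deterministic "completion" are illustrative.

class CompletionLLM:
    """Toy completion model that deterministically continues the prompt."""
    def __init__(self, model: str, temperature: float = 0.7, max_tokens: int = 256):
        self.model = model
        self.temperature = temperature
        self.max_tokens = max_tokens

    def invoke(self, prompt: str) -> str:
        # A real model would generate a continuation; this toy appends one.
        return prompt + " ...and the model continues the text from here."

llm = CompletionLLM(model="example-completion-model", temperature=0.7)
result = llm.invoke("LangChain groups language models into")
print(result)  # plain string in, plain string out
```

With a real provider class, only the import and constructor change; the `.invoke()` call with a plain string stays the same.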

While chat models have largely superseded completion-style LLMs for most modern tasks, completion models remain relevant in specific scenarios. Simple text generation, batch summarization, and integration with legacy ...