Chunking Strategies and Embedding Model Selection
Understand and compare four chunking strategies—fixed-size, recursive, semantic, and parent-child—and learn how their trade-offs impact retrieval quality. Discover how chunk size affects recall and precision, and evaluate embedding models based on relevant metrics. This lesson equips you to select effective chunking and embedding model combinations tailored to your specific corpus and retrieval goals.
In the previous lesson, we identified retrieval misses as a primary failure mode of the naive RAG pipeline. Many of those misses trace back to a single root cause: the way documents are split into chunks before embedding. Every vector in your store is only as semantically meaningful as the chunk it represents. If a chunk slices a sentence in half or buries a critical definition inside irrelevant surrounding text, no amount of downstream optimization can recover that lost signal. The chunking strategy you choose sets the ceiling on retrieval performance.
This lesson examines four chunking strategies (fixed-size, recursive, semantic, and parent-child) and pairs that discussion with embedding model selection. To ground the concepts, consider an enterprise knowledge base containing legal clauses, technical specifications, and FAQ answers. These documents vary wildly in length, density, and structure. A one-size-fits-all chunking approach will inevitably mangle some of them. By the end of this lesson, you will be able to choose a chunking strategy and embedding model pairing that maximizes recall and precision for a given corpus.
Four chunking strategies compared
Each chunking strategy makes a different bet about how to segment text, and each comes with a distinct trade-off between implementation simplicity and retrieval quality. The following breakdown covers the four approaches you will encounter most often in production RAG systems.
Fixed-size chunking: This method splits text by a token or character count with an optional overlap window. It is the simplest to implement, requiring no understanding of document structure. However, it is structure-blind. It will split a sentence in half or separate a term from its definition without hesitation. The overlap window mitigates some boundary issues, but it cannot prevent semantic fragmentation.
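The sliding-window behavior described above can be sketched in a few lines. This is a minimal character-based illustration, not a production splitter; the function name and default sizes are our own choices, and a token-based variant would count tokens instead of characters.

```python
def fixed_size_chunks(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with an overlap window.

    Structure-blind by design: a boundary may fall mid-sentence or even
    mid-word. The overlap only softens boundary loss; it cannot prevent
    semantic fragmentation.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap  # how far the window advances each iteration
    return [text[i:i + size] for i in range(0, len(text), step)]
```

Note that consecutive chunks share their last and first `overlap` characters, which is exactly the mitigation (and its limit) discussed above.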
Recursive chunking: This approach uses a hierarchy of separators (double newline, single newline, sentence boundary, word boundary) and recursively splits only when a chunk exceeds the size limit. Because it tries the largest structural separator first, it preserves paragraph and sentence boundaries far better than fixed-size splitting. Implementation complexity is moderate, and most frameworks, including LangChain, provide it out of the box.
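The separator-hierarchy idea can be sketched as follows. This is a simplified illustration under our own naming: real implementations such as LangChain's recursive splitter also merge small adjacent pieces back up toward the size limit, which this sketch omits for brevity.

```python
# Largest structural separator first: paragraphs, then lines,
# then sentence-ish boundaries, then words.
SEPARATORS = ["\n\n", "\n", ". ", " "]

def recursive_chunks(text: str, max_size: int = 200,
                     separators: list[str] = SEPARATORS) -> list[str]:
    """Split only when a piece exceeds max_size, recursing to
    progressively finer separators."""
    if len(text) <= max_size or not separators:
        return [text]
    sep, finer = separators[0], separators[1:]
    parts = text.split(sep)
    if len(parts) == 1:  # this separator is absent; try the next, finer one
        return recursive_chunks(text, max_size, finer)
    chunks: list[str] = []
    for part in parts:
        if len(part) <= max_size:
            chunks.append(part)
        else:
            chunks.extend(recursive_chunks(part, max_size, finer))
    return chunks
```

Because the paragraph separator is tried first, a short paragraph survives intact, and only oversized paragraphs are broken down further.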
Semantic chunking: Instead of relying on character patterns, semantic chunking computes embedding similarity (a numerical measure, typically cosine similarity, of how close two text segments are in meaning when represented as vectors) between consecutive sentences and splits where that similarity drops.
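The split-on-similarity-drop rule can be sketched as below. This is a minimal illustration assuming sentence embeddings have already been computed by some model; the `threshold` value and function names are our own, and production variants often use a dynamic percentile-based threshold instead of a fixed one.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences: list[str], embeddings: list[list[float]],
                    threshold: float = 0.5) -> list[str]:
    """Group consecutive sentences; start a new chunk wherever the
    similarity between adjacent sentence embeddings drops below threshold."""
    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences)):
        if cosine(embeddings[i - 1], embeddings[i]) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sentences[i])
    chunks.append(" ".join(current))
    return chunks
```

The cost is clear even from the sketch: every sentence must be embedded before any chunk boundary is known, which is why semantic chunking trades indexing-time compute for cleaner topical boundaries.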