What if your LLM could search like Google?
Most large language models (LLMs) struggle to answer questions about recent events or obscure facts. Traditional fixes like retrieval-augmented generation (RAG) rely on external search APIs, which rack up costs and introduce new points of failure. Enter ZeroSearch, a framework from Alibaba that simulates the search engine during training, eliminating API dependencies while matching, and sometimes exceeding, the accuracy of real-search baselines.
ZeroSearch trains LLMs to search against a simulated search engine rather than a real one. A second, lightly fine-tuned LLM plays the retriever: given a query, it generates documents that are either relevant or deliberately noisy, drawing only on its pretraining knowledge. Trained against this simulator with reinforcement learning, the policy model learns to issue queries and synthesize grounded answers without a single API call.
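To make the idea concrete, here is a minimal sketch of what simulated retrieval could look like. Everything in it is illustrative: the `generate()` stub stands in for a call to the fine-tuned simulation model, the prompt wording is ours, and the linear noise ramp is a stand-in for the paper's actual curriculum schedule.

```python
import random

def generate(prompt: str) -> str:
    # Placeholder for a call to the fine-tuned simulation LLM.
    return f"[document generated for: {prompt[:60]}...]"

def simulated_search(query: str, step: int, total_steps: int, k: int = 5) -> list[str]:
    """Return k simulated documents for a query, mixing relevant and
    noisy results on a curriculum: later in training, noise is more likely."""
    p_noisy = step / total_steps  # illustrative linear ramp, not the paper's schedule
    docs = []
    for _ in range(k):
        style = "noisy" if random.random() < p_noisy else "useful"
        # The real framework conditions a fine-tuned LLM on whether to
        # produce a useful or a noisy document; this prompt only gestures at that.
        prompt = f"You are a search engine. Write one {style} document for the query: {query!r}"
        docs.append(generate(prompt))
    return docs

# Early in training most documents are useful; later, many are noisy,
# which forces the policy model to learn robust retrieval behavior.
print(simulated_search("capital of Australia", step=1, total_steps=10))
```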
In this post, we’ll explore:
- Why traditional RAG and RL-based search incur heavy costs
- How ZeroSearch works under the hood
- What kind of performance gains it offers
- Limitations and future directions for internal tool simulation
Let's get started.