Evaluation of a Distributed Search's Design

Analyze how a distributed search architecture meets critical non-functional requirements, such as availability and scalability. Learn how partitioning, replication across availability zones, and isolating indexing processes ensure high performance and cost reduction in System Design.

Requirements compliance

Let’s evaluate how the distributed search design meets non-functional requirements.

Availability

We use distributed storage to persist:

Documents crawled by the indexer.
Inverted indexes generated by the indexing nodes.

Distributed storage replicates data across regions, which supports cross-region deployment and disaster recovery. Clusters of indexing and search nodes are deployed across multiple Availability Zones (AZs). If an Availability Zone fails, traffic is redirected to a healthy cluster. Within each cluster, redundant nodes provide failover capability.
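The failover behavior described above can be sketched as a small routing function. This is a minimal illustration, not the lesson's implementation: the `Node`, `Cluster`, and `route_query` names, the zone labels, and the boolean health flag are all hypothetical, and a real deployment would rely on a load balancer with health checks rather than in-process state.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True  # assumption: set by some external health check


@dataclass
class Cluster:
    zone: str  # hypothetical AZ label, e.g. "az-1"
    nodes: list = field(default_factory=list)

    def healthy_nodes(self):
        return [n for n in self.nodes if n.healthy]


def route_query(clusters, preferred_zone):
    """Return a healthy node, trying the preferred zone before any other zone."""
    # sorted() is stable, so the preferred zone moves to the front and the
    # remaining clusters keep their original order.
    ordered = sorted(clusters, key=lambda c: c.zone != preferred_zone)
    for cluster in ordered:
        nodes = cluster.healthy_nodes()
        if nodes:
            # Redundant nodes inside a cluster give in-zone failover.
            return nodes[0]
    raise RuntimeError("no healthy search node in any zone")


clusters = [
    Cluster("az-1", [Node("a1"), Node("a2")]),
    Cluster("az-2", [Node("b1")]),
]
print(route_query(clusters, "az-1").name)
```

If one node in the preferred zone fails, the query goes to a redundant node in the same cluster. Only when the whole zone is unhealthy does traffic move to a cluster in another zone.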

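Indexing occurs offline, outside the user’s critical path. Because search queries do not require synchronous access to the absolute latest data, they never have to wait for replication to finish; search nodes keep answering from the index they already hold. This decoupling keeps search operations highly available even while new data is still being indexed and replicated.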

Note: Once the latest data has replicated to all indexing groups and the search nodes have downloaded it, queries reflect the updates.
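One way to picture this decoupling is a search node that keeps serving its current index and swaps in a newer one only after the download completes. The sketch below is illustrative only: `SearchNode`, `install_snapshot`, `build_inverted_index`, and the integer version numbers are assumptions made for the example, not part of the design described in this lesson.

```python
import threading


def build_inverted_index(docs):
    """Offline step: map each term to the set of document IDs containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return index


class SearchNode:
    def __init__(self, index, version):
        self._lock = threading.Lock()
        self._index = index
        self._version = version

    def search(self, term):
        # Queries read whichever index is currently installed; they never
        # wait for indexing or replication to finish.
        with self._lock:
            index = self._index
        return sorted(index.get(term.lower(), ()))

    def install_snapshot(self, index, version):
        """Swap in a fully downloaded index; ignore stale or duplicate versions."""
        with self._lock:
            if version <= self._version:
                return False
            self._index = index
            self._version = version
            return True


node = SearchNode(build_inverted_index({1: "distributed search"}), version=1)
print(node.search("search"))  # served from the version 1 index

# A newer index is built offline, then installed once it has been downloaded.
newer = build_inverted_index({1: "distributed search", 2: "search index"})
node.install_snapshot(newer, version=2)
print(node.search("search"))  # now reflects the update
```

Until `install_snapshot` runs, queries see slightly stale results, which is the trade-off this design accepts in exchange for availability.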