## Availability

We use distributed storage to hold:
* Documents crawled by the indexer
* Inverted indexes generated by the indexing nodes

Distributed storage replicates data across multiple regions, which makes cross-region deployment of indexing and search easier: only the groups of indexing and search nodes need to be replicated. We therefore deploy a cluster of indexing and search nodes in each of several availability zones, so that if one zone fails, a cluster in another zone can process the requests. Having multiple clusters keeps both indexing and search highly available. Moreover, within each cluster, if a node dies, another node can take its place.
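The failover behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual routing layer: the `SearchCluster` class, the `route_query` function, and the zone names are all hypothetical, and a real deployment would rely on a load balancer with health checks rather than a simple loop.

```python
from dataclasses import dataclass


@dataclass
class SearchCluster:
    """Hypothetical cluster of search nodes in one availability zone."""
    zone: str
    healthy_nodes: int

    def search(self, query: str) -> str:
        # A cluster can answer as long as at least one node is alive,
        # because another node takes the place of a dead one.
        if self.healthy_nodes == 0:
            raise ConnectionError(f"no healthy nodes in {self.zone}")
        return f"results for {query!r} from {self.zone}"


def route_query(query: str, clusters: list[SearchCluster]) -> str:
    """Try each cluster in order and fall back to the next zone on failure."""
    for cluster in clusters:
        try:
            return cluster.search(query)
        except ConnectionError:
            continue  # this zone is down, so try the next one
    raise RuntimeError("all clusters are unavailable")


# Assumed example setup: zone-a has failed, zone-b is still serving.
clusters = [
    SearchCluster("zone-a", healthy_nodes=0),
    SearchCluster("zone-b", healthy_nodes=3),
]
print(route_query("distributed search", clusters))
```

Here the query is answered by the cluster in `zone-b` because `zone-a` has no healthy nodes. The request fails only when every cluster is down.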

Indexing is done offline, that is, off the user's critical path. We don't need to replicate indexing operations synchronously, because search queries don't have to reflect data that was only just added to the index. So, search nodes don't wait for a new index to finish replicating before responding; they keep answering queries from the index they already have. This keeps search available to users while the index is being updated.
> Once the latest data has been replicated everywhere and the search nodes have downloaded it, search queries are served from the latest data.
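To make the asynchronous update concrete, here is a small Python sketch under stated assumptions. `DistributedStorage`, `SearchNode`, and their methods are hypothetical names invented for illustration; the in-memory list stands in for replicated storage, and copying a dictionary stands in for downloading an index.

```python
class DistributedStorage:
    """Hypothetical stand-in for replicated storage holding index versions."""

    def __init__(self) -> None:
        self._versions: list[dict[str, list[str]]] = []

    def publish(self, index: dict[str, list[str]]) -> int:
        # Indexing nodes publish a new inverted index offline.
        self._versions.append(index)
        return len(self._versions)

    def latest_version(self) -> int:
        return len(self._versions)

    def download(self, version: int) -> dict[str, list[str]]:
        return dict(self._versions[version - 1])


class SearchNode:
    """Hypothetical search node that serves from its local index copy."""

    def __init__(self, storage: DistributedStorage) -> None:
        self.storage = storage
        self.version = 0
        self.index: dict[str, list[str]] = {}

    def search(self, term: str) -> list[str]:
        # Queries never wait for replication; they use the local copy.
        return self.index.get(term, [])

    def refresh(self) -> None:
        # Runs in the background, off the user's critical path.
        latest = self.storage.latest_version()
        if latest > self.version:
            self.index = self.storage.download(latest)
            self.version = latest


storage = DistributedStorage()
storage.publish({"search": ["doc1"]})
node = SearchNode(storage)
node.refresh()

storage.publish({"search": ["doc1", "doc2"]})
stale = node.search("search")   # still served from the older index
node.refresh()
fresh = node.search("search")   # now served from the latest index
print(stale, fresh)
```

The node keeps answering from the older index until `refresh` swaps in the new version, which mirrors the trade-off in the text: slightly stale results in exchange for uninterrupted search.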
