Workflow of Delete and Snapshot Operations in GFS

Let’s learn how GFS carries out delete and snapshot operations.

In the previous lessons, we discussed the GFS design approach and the workflow of the create, read, and write operations. This lesson covers the workflow of the two remaining operations: deleting a file and taking a snapshot of a file or directory.

Delete file

Files on GFS are huge, so a single file consists of many chunks spread across multiple chunkservers. Moreover, each chunk is replicated on multiple chunkservers for availability. Deleting all of these chunks from every chunkserver before answering the client's delete request would add substantial latency to the request. Worse, if any replica is temporarily down, the system would have to wait for it to recover before it could delete the chunk, leaving the client waiting unnecessarily. To avoid this, the file system implements a service called garbage collection. On a delete request, the manager node marks the file as deleted in the namespace tree it maintains and responds to the client immediately; the garbage collection service deletes the chunks later. This is shown in the following illustration.
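
The lazy deletion described above can also be sketched in a few lines of Python. This is only a minimal, in-memory illustration, not the actual GFS implementation: the `Manager` and `Chunkserver` classes, their fields, and the `delete_file` and `collect_garbage` methods are hypothetical names introduced here, and the garbage collection pass is assumed to be triggered periodically by the manager.

```python
from dataclasses import dataclass, field


@dataclass
class Chunkserver:
    """Hypothetical chunkserver: stores chunk replicas and may be down."""
    name: str
    chunks: set = field(default_factory=set)
    up: bool = True


@dataclass
class Manager:
    """Hypothetical manager node: namespace plus chunk locations."""
    files: dict = field(default_factory=dict)      # path -> list of chunk ids
    locations: dict = field(default_factory=dict)  # chunk id -> chunkservers
    deleted: set = field(default_factory=set)      # paths marked as deleted

    def delete_file(self, path):
        """Client-facing delete: mark the file and return immediately."""
        if path not in self.files or path in self.deleted:
            return False
        self.deleted.add(path)  # no chunkserver is contacted here
        return True

    def collect_garbage(self):
        """Background pass: drop replicas of marked files from live servers.

        Returns the number of replicas removed in this pass.
        """
        removed = 0
        for path in list(self.deleted):
            for chunk_id in self.files[path]:
                pending = []
                for server in self.locations.get(chunk_id, []):
                    if server.up:
                        server.chunks.discard(chunk_id)
                        removed += 1
                    else:
                        pending.append(server)  # retry on a later pass
                if pending:
                    self.locations[chunk_id] = pending
                else:
                    self.locations.pop(chunk_id, None)
            # Forget the file once no replica of any of its chunks is left.
            if not any(c in self.locations for c in self.files[path]):
                del self.files[path]
                self.deleted.discard(path)
        return removed


if __name__ == "__main__":
    a, b, c = (Chunkserver(n, {"c1", "c2"}) for n in "abc")
    manager = Manager(
        files={"/logs/big": ["c1", "c2"]},
        locations={"c1": [a, b, c], "c2": [a, b, c]},
    )
    b.up = False                          # one replica holder is down
    print(manager.delete_file("/logs/big"))  # client gets its answer now
    print(manager.collect_garbage())      # replicas removed from a and c
    b.up = True                           # b recovers
    print(manager.collect_garbage())      # remaining replicas removed from b
```

In this sketch, the client's call touches only the manager's metadata, so its latency does not depend on the number of chunks or on whether every chunkserver is reachable. Replicas on a server that is down are simply left for a later pass.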
