Problem: Find Duplicate File in System
Explore how to use hash tables to find groups of duplicate files in a file system by parsing directory paths and file contents. Learn to efficiently aggregate file paths by content to detect duplicates, understand the implementation steps, and analyze the solution's time and space complexity.
Statement
You are given a list paths, where each element is a string representing directory information. Each string contains a directory path followed by one or more files along with their contents in the following format:
"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"
This indicates that there are n files (f1.txt, f2.txt, …, fn.txt) with contents (f1_content, f2_content, …, fn_content) respectively, all located in the directory "root/d1/d2/.../dm". Here, n ≥ 1 and m ≥ 0. If m = 0, the directory is just the root directory.
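To make the format concrete, here is a minimal sketch of parsing a single directory info string into (file path, content) pairs. The function name parse_directory_info and the sample strings are hypothetical, and the sketch assumes that fields are separated by single spaces and that file contents contain no spaces or parentheses.

```python
from typing import List, Tuple


def parse_directory_info(info: str) -> List[Tuple[str, str]]:
    """Return (full_path, content) pairs for one directory info string.

    Assumption: fields are separated by single spaces, and contents
    contain no spaces or parentheses.
    """
    # The first field is the directory; every remaining field is a file entry.
    directory, *entries = info.split(" ")
    pairs = []
    for entry in entries:
        # "f1.txt(abcd)" splits into name "f1.txt" and rest "abcd)".
        name, _, rest = entry.partition("(")
        content = rest[:-1]  # drop the trailing ")"
        pairs.append((f"{directory}/{name}", content))
    return pairs


# Hypothetical input, not taken from the lesson.
print(parse_directory_info("root/a 1.txt(abcd) 2.txt(efgh)"))
```

Each pair already carries the full "directory_path/file_name.txt" form, so later steps only need to group the pairs by content.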
Your task is to identify all groups of duplicate files in the file system. A group of duplicate files consists of at least two files that have the same content. A file path is a string in the following format:
"directory_path/file_name.txt"
Return a list of all such groups. The answer may be returned in any order.
Note: You may assume that no files or directories share the same name within the same directory, and that each given directory info string represents a unique directory.
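The statement above can be approached with a hash table keyed by file content, as the lesson overview suggests. The following is a rough sketch under stated assumptions rather than a definitive implementation: the function name find_duplicate and the sample input are hypothetical, and contents are assumed to contain no spaces or parentheses.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate(paths: List[str]) -> List[List[str]]:
    """Group file paths by content and keep groups with at least two files."""
    # Maps each file content to the list of full paths that hold it.
    paths_by_content: Dict[str, List[str]] = defaultdict(list)

    for info in paths:
        # First field is the directory; the rest are "name(content)" entries.
        directory, *entries = info.split(" ")
        for entry in entries:
            name, _, rest = entry.partition("(")
            content = rest[:-1]  # drop the trailing ")"
            paths_by_content[content].append(f"{directory}/{name}")

    # A content seen only once has no duplicate, so its group is dropped.
    return [group for group in paths_by_content.values() if len(group) > 1]


# Hypothetical input, not taken from the lesson.
sample = [
    "root/a 1.txt(abcd) 2.txt(efgh)",
    "root/c 3.txt(abcd)",
    "root/c/d 4.txt(efgh)",
    "root 4.txt(efgh)",
]
for group in find_duplicate(sample):
    print(group)
```

This sketch makes a single pass over the input, so both the running time and the extra space grow in proportion to the total length of the input strings.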
Constraints:
...