There are two main components of a filer: directories and files.
My previous approach was to use a sequence number to generate each directoryId. However, this is not scalable: the id generation itself becomes a bottleneck, because obtaining a directoryId requires careful locking and deduplication checking.
In a second design, each directory is deterministically mapped to a version 3 UUID, which uses MD5 to hash a tuple of <uuid, name>. However, this UUID3 approach is logically the same as storing the full path.
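A minimal sketch of the UUID3 idea, using Python's standard `uuid` module. The names `ROOT_ID` and `directory_id` are hypothetical, and using the parent directory's id as the namespace UUID is an assumption made for illustration.

```python
import uuid

# Hypothetical fixed id for the root directory; any constant UUID would do.
ROOT_ID = uuid.uuid3(uuid.NAMESPACE_URL, "filer:/")


def directory_id(path: str) -> uuid.UUID:
    """Derive a directory id by hashing each folder name under its parent's id."""
    current = ROOT_ID
    for name in path.strip("/").split("/"):
        if name:  # skip empty components such as the root itself
            # uuid3 hashes (namespace UUID, name) with MD5.
            current = uuid.uuid3(current, name)
    return current
```

Because the id is a pure function of the path, no locking or sequence generator is needed. The same property means that renaming a folder changes the id of every descendant, which is why this design is logically equivalent to storing the full path.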
Storing the full path is the simplest design.
The separator is a special byte, 0x00.
When writing a file:
    <file parent full path, separator, file name> => fileId, file properties

For folders, the filer breaks the directory path into folders:
    for each folder:
        if it is not in cache:
            check whether the folder is created in the KVS; if not:
                set <folder parent full path, separator, folder name> => directory properties
        if no permission for the folder:
            break
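The write path above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain `dict` stands in for the KVS, a `set` stands in for the folder cache, and `make_key`, `ensure_folders`, `write_file`, and the `can_access` callback are hypothetical names, not the filer's actual API.

```python
SEPARATOR = b"\x00"


def make_key(parent_path: str, name: str) -> bytes:
    """Build the key <parent full path, separator, name>."""
    return parent_path.encode("utf-8") + SEPARATOR + name.encode("utf-8")


def ensure_folders(kvs, cache, dir_path, can_access=lambda path: True):
    """Create missing folder entries along dir_path; stop if access is denied."""
    parent = "/"
    for name in [p for p in dir_path.split("/") if p]:
        full = parent.rstrip("/") + "/" + name
        if full not in cache:
            key = make_key(parent, name)
            if key not in kvs:
                kvs[key] = {"is_dir": True}  # placeholder directory properties
            cache.add(full)
        if not can_access(full):
            return False  # corresponds to "break" in the steps above
        parent = full
    return True


def write_file(kvs, cache, dir_path, file_name, file_id, props):
    """Store <file parent full path, separator, file name> => fileId, properties."""
    if not ensure_folders(kvs, cache, dir_path):
        return False
    parent = "/" + "/".join(p for p in dir_path.split("/") if p)
    kvs[make_key(parent, file_name)] = {"file_id": file_id, **props}
    return True
```

For example, writing `f.txt` into `/a/b` on an empty store produces three keys: one for folder `a` under `/`, one for folder `b` under `/a`, and one for the file itself under `/a/b`.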
The filer caches the most recently used folder permissions with a TTL, so a folder permission change can take up to one TTL interval to take effect.
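A small sketch of such a TTL cache, to make the staleness window concrete. The class name `TTLCache` and its methods are hypothetical, and size-bounded eviction of least recently used entries is left out for brevity.

```python
import time


class TTLCache:
    """Cache folder permissions; entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def put(self, path, permission):
        self._entries[path] = (permission, self._clock() + self._ttl)

    def get(self, path):
        entry = self._entries.get(path)
        if entry is None:
            return None
        permission, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller re-reads the permission from the KVS.
            del self._entries[path]
            return None
        return permission
```

Until an entry expires, `get` keeps returning the old permission even if the stored value has changed, which is the delay described above.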
When listing a directory: do a prefix scan, using (the folder full path + separator) as the prefix.
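A sketch of the prefix scan, assuming the keys are available in sorted byte order (as an ordered KVS would provide). The function name `list_directory` is hypothetical, and a sorted Python list stands in for the store.

```python
import bisect

SEPARATOR = b"\x00"


def list_directory(sorted_keys, dir_path):
    """Return the names of the direct children of dir_path."""
    prefix = dir_path.encode("utf-8") + SEPARATOR
    # Jump to the first key that is >= the prefix, then walk forward.
    start = bisect.bisect_left(sorted_keys, prefix)
    names = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        names.append(key[len(prefix):].decode("utf-8"))
    return names
```

Only direct children match: an entry inside `/a/b` has the key prefix `/a/b` followed by 0x00, which does not start with `/a` followed by 0x00, so nested entries are never returned when listing `/a`.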
The downsides:
1. Renaming a folder requires recursively processing all of its subfolders and files.
2. Moving a folder requires recursively processing all of its subfolders and files.

So these operations are not allowed unless the folder is empty.
Allowed operations:
1. Rename a file
2. Move a file to a different folder
3. Delete an empty folder
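The allowed operations each touch a single key, which is what makes them cheap. A rough sketch, again with a `dict` standing in for the KVS and with hypothetical function names (`move_file`, `delete_empty_folder`):

```python
SEPARATOR = b"\x00"


def make_key(parent_path: str, name: str) -> bytes:
    """Build the key <parent full path, separator, name>."""
    return parent_path.encode("utf-8") + SEPARATOR + name.encode("utf-8")


def move_file(kvs, old_parent, old_name, new_parent, new_name):
    """Rename or move a file by re-keying its single entry."""
    new_key = make_key(new_parent, new_name)
    if new_key in kvs:
        raise FileExistsError(new_name)
    # pop raises KeyError if the source file does not exist.
    kvs[new_key] = kvs.pop(make_key(old_parent, old_name))


def delete_empty_folder(kvs, parent, name):
    """Delete a folder entry only if no key lives directly under it."""
    full = parent.rstrip("/") + "/" + name
    child_prefix = full.encode("utf-8") + SEPARATOR
    # A real ordered KVS would use a prefix scan here instead of a full pass.
    if any(key.startswith(child_prefix) for key in kvs):
        raise OSError("folder is not empty")
    del kvs[make_key(parent, name)]
```

Renaming a non-empty folder, by contrast, would have to rewrite the key of every entry whose path starts with the old folder path, which is the recursive cost noted above.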