mirror of https://github.com/trapexit/mergerfs.git
# inodecalc

Inodes (st_ino) are unique identifiers within a filesystem. Each
mounted filesystem has a device ID (st_dev) as well, and together they
can uniquely identify a file across the whole system. Entries on
the same device with the same inode are in fact references to the same
underlying file. It is a many-to-one relationship between names and an
inode. Directories, however, do not have multiple links on most
systems due to the complexity they add.
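
The identity described above can be observed from user space. The following is a minimal sketch, assuming a POSIX system where hard links are permitted in the temporary directory; the helper names `file_identity` and `demo_hard_link_identity` are made up for illustration.

```python
import os
import tempfile


def file_identity(path):
    """Return the (st_dev, st_ino) pair that identifies the file behind path."""
    st = os.lstat(path)
    return (st.st_dev, st.st_ino)


def demo_hard_link_identity():
    """Create a file, a hard link to it, and an unrelated file, then compare."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        link = os.path.join(tmp, "link.txt")
        other = os.path.join(tmp, "other.txt")
        with open(original, "w") as f:
            f.write("data")
        with open(other, "w") as f:
            f.write("data")
        os.link(original, link)  # a second name for the same inode
        return (
            file_identity(original) == file_identity(link),
            file_identity(original) == file_identity(other),
        )
```

The first comparison is between two names for one inode, the second between two separate files that merely have equal content.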

FUSE allows the server (mergerfs) to set inode values but not device
IDs. Creating an inode value is somewhat complex in mergerfs' case as
files aren't really in its control. If a policy changes what directory
or file is to be selected or something changes out of band it becomes
unclear what value should be used. Most software does not care what
the values are but those that do often break if a value changes
unexpectedly. The tool find will abort a directory walk if it sees a
directory inode change. NFS can return stale handle errors if the
inode changes out of band. File dedup tools will usually leverage
device IDs and inodes as a shortcut in searching for duplicate files
and would resort to full file comparisons should they find different
inode values.
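
To illustrate why dedup tools care about these values, here is a rough sketch of the shortcut mentioned above. It is not taken from any particular tool; the function names and the use of SHA-256 as the "full comparison" are assumptions for the example.

```python
import hashlib
import os
from collections import defaultdict


def group_by_identity(paths):
    """Group paths by (st_dev, st_ino); each group is one underlying file."""
    groups = defaultdict(list)
    for path in paths:
        st = os.stat(path)
        groups[(st.st_dev, st.st_ino)].append(path)
    return groups


def content_digest(path):
    """Expensive fallback: hash the full file contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return sorted lists of paths that have identical content.

    Only one representative per (st_dev, st_ino) group is read, so paths
    already known to be the same file never trigger a full comparison.
    """
    by_content = defaultdict(list)
    for same_file in group_by_identity(paths).values():
        by_content[content_digest(same_file[0])].extend(same_file)
    return [sorted(group) for group in by_content.values() if len(group) > 1]
```

If the inode reported for a hard-linked file differs between its names, a tool built this way still gets the right answer but reads every copy in full.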

mergerfs offers multiple ways to calculate the inode in hopes of
covering different use cases.

* `passthrough`: Passes through the underlying inode value. Mostly
  intended for testing as using this does not address any of the
  problems mentioned above and could confuse file deduplication
  software as inodes from different filesystems can be the same.
* `path-hash`: Hashes the relative path of the entry in question. The
  underlying file's values are completely ignored. This means the
  inode value will always be the same for that file path. This is
  useful when using NFS and you make changes out of band such as
  copying data between branches. This also means that entries that do
  point to the same file will not be recognizable via inodes. That
  does not mean hard links don't work. They will.
* `path-hash32`: 32bit version of `path-hash`.
* `devino-hash`: Hashes the device ID and inode of the underlying
  entry. This won't prevent issues with NFS should the policy pick a
  different file or files move out of band, but entries that share the
  same underlying file (hard links) will present the same inode.
* `devino-hash32`: 32bit version of `devino-hash`.
* `hybrid-hash`: Performs `path-hash` on directories and `devino-hash`
  on other file types. Since directories can't have hard links the
  static value won't make a difference and the files will get values
  useful for finding duplicates. Probably the best to use if not
  using NFS. As such it is the default.
* `hybrid-hash32`: 32bit version of `hybrid-hash`.
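
The differences between the modes listed above can be sketched in a few lines. This is only an illustration of the descriptions, not mergerfs' implementation: the hash function (BLAKE2b truncated to 64 bits), the byte encoding of the inputs, and the XOR fold used for the 32bit variants are all assumptions, so the values produced will not match what mergerfs reports.

```python
import hashlib
import stat


def hash64(data: bytes) -> int:
    """Stand-in 64-bit hash (assumption: not the hash mergerfs uses)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def fold32(value: int) -> int:
    """Reduce 64 bits to 32 by XORing the two halves (hypothetical scheme)."""
    return (value ^ (value >> 32)) & 0xFFFFFFFF


def path_hash(relpath: str) -> int:
    """Depends only on the relative path, never on the underlying file."""
    return hash64(relpath.encode())


def devino_hash(st_dev: int, st_ino: int) -> int:
    """Depends only on the underlying device ID and inode."""
    return hash64(st_dev.to_bytes(8, "little") + st_ino.to_bytes(8, "little"))


def calc_inode(mode, relpath, st_mode, st_dev, st_ino):
    """Compute an inode value for one entry under the named mode."""
    if mode == "passthrough":
        return st_ino
    base = mode[:-2] if mode.endswith("32") else mode
    if base == "path-hash":
        value = path_hash(relpath)
    elif base == "devino-hash":
        value = devino_hash(st_dev, st_ino)
    elif base == "hybrid-hash":
        # Directories get the stable path-based value, everything else
        # gets the value derived from the underlying file.
        if stat.S_ISDIR(st_mode):
            value = path_hash(relpath)
        else:
            value = devino_hash(st_dev, st_ino)
    else:
        raise ValueError(f"unknown inodecalc mode: {mode}")
    return fold32(value) if mode.endswith("32") else value
```

With `path-hash` the result stays the same when the same path is served from a different branch, while with `devino-hash` two names for one underlying file get the same result regardless of their paths.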