From 5152c63480748719058500ae458ec40f27d6bee6 Mon Sep 17 00:00:00 2001 From: Antonio SJ Musumeci Date: Thu, 16 Mar 2023 23:46:33 -0400 Subject: [PATCH] Misc README updates --- README.md | 273 +++++++++++++++++++++++++------------------------ man/mergerfs.1 | 263 ++++++++++++++++++++++++----------------------- 2 files changed, 275 insertions(+), 261 deletions(-) diff --git a/README.md b/README.md index 71b0a0e2..0f70d410 100644 --- a/README.md +++ b/README.md @@ -65,9 +65,10 @@ A + B = C mergerfs does **not** support the copy-on-write (CoW) or whiteout behaviors found in **aufs** and **overlayfs**. You can **not** mount a read-only filesystem and write to it. However, mergerfs will ignore -read-only drives when creating new files so you can mix read-write and -read-only drives. It also does **not** split data across drives. It is -not RAID0 / striping. It is simply a union of other filesystems. +read-only filesystems when creating new files so you can mix +read-write and read-only filesystems. It also does **not** split data +across filesystems. It is not RAID0 / striping. It is simply a union of +other filesystems. # TERMINOLOGY @@ -178,7 +179,7 @@ These options are the same regardless of whether you use them with the policy of `create` (read below). Enabling this will cause rename and link to always use the non-path preserving behavior. This means files, when renamed or linked, will stay on the same - drive. (default: false) + filesystem. (default: false) * **security_capability=BOOL**: If false return ENOATTR when xattr security.capability is queried. (default: true) * **xattr=passthrough|noattr|nosys**: Runtime control of @@ -191,7 +192,7 @@ These options are the same regardless of whether you use them with the copy-on-write function similar to cow-shell. (default: false) * **statfs=base|full**: Controls how statfs works. 'base' means it will always use all branches in statfs calculations. 'full' is in - effect path preserving and only includes drives where the path + effect path preserving and only includes branches where the path exists. (default: base) * **statfs_ignore=none|ro|nc**: 'ro' will cause statfs calculations to ignore available space for branches mounted or tagged as 'read-only' @@ -324,9 +325,9 @@ you're using. Not all features are available in older releases. Use The 'branches' argument is a colon (':') delimited list of paths to be pooled together. It does not matter if the paths are on the same or -different drives nor does it matter the filesystem (within +different filesystems nor does it matter the filesystem type (within reason). Used and available space will not be duplicated for paths on -the same device and any features which aren't supported by the +the same filesystem and any features which aren't supported by the underlying filesystem (such as file attributes or extended attributes) will return the appropriate errors. @@ -334,7 +335,7 @@ Branches currently have two options which can be set. A type which impacts whether or not the branch is included in a policy calculation and a individual minfreespace value. The values are set by prepending an `=` at the end of a branch designation and using commas as -delimiters. Example: /mnt/drive=RW,1234 +delimiters. Example: `/mnt/drive=RW,1234` #### branch mode @@ -590,10 +591,10 @@ something to keep in mind. **WARNING:** Some backup solutions, such as CrashPlan, do not backup the target of a symlink. 
If using this feature it will be necessary to -point any backup software to the original drives or configure the -software to follow symlinks if such an option is -available. Alternatively create two mounts. One for backup and one for -general consumption. +point any backup software to the original filesystems or configure the +software to follow symlinks if such an option is available. +Alternatively create two mounts. One for backup and one for general +consumption. ### nullrw @@ -750,11 +751,11 @@ All policies which start with `ep` (**epff**, **eplfs**, **eplus**, **epmfs**, **eprand**) are `path preserving`. `ep` stands for `existing path`. -A path preserving policy will only consider drives where the relative +A path preserving policy will only consider branches where the relative path being accessed already exists. When using non-path preserving policies paths will be cloned to target -drives as necessary. +branches as necessary. With the `msp` or `most shared path` policies they are defined as `path preserving` for the purpose of controlling `link` and `rename`'s @@ -775,15 +776,15 @@ but it makes things a bit more uniform. | all | Search: For **mkdir**, **mknod**, and **symlink** it will apply to all branches. **create** works like **ff**. | | epall (existing path, all) | For **mkdir**, **mknod**, and **symlink** it will apply to all found. **create** works like **epff** (but more expensive because it doesn't stop after finding a valid branch). | | epff (existing path, first found) | Given the order of the branches, as defined at mount time or configured at runtime, act on the first one found where the relative path exists. | -| eplfs (existing path, least free space) | Of all the branches on which the relative path exists choose the drive with the least free space. | -| eplus (existing path, least used space) | Of all the branches on which the relative path exists choose the drive with the least used space. | -| epmfs (existing path, most free space) | Of all the branches on which the relative path exists choose the drive with the most free space. | +| eplfs (existing path, least free space) | Of all the branches on which the relative path exists choose the branch with the least free space. | +| eplus (existing path, least used space) | Of all the branches on which the relative path exists choose the branch with the least used space. | +| epmfs (existing path, most free space) | Of all the branches on which the relative path exists choose the branch with the most free space. | | eppfrd (existing path, percentage free random distribution) | Like **pfrd** but limited to existing paths. | | eprand (existing path, random) | Calls **epall** and then randomizes. Returns 1. | -| ff (first found) | Given the order of the drives, as defined at mount time or configured at runtime, act on the first one found. | -| lfs (least free space) | Pick the drive with the least available free space. | -| lus (least used space) | Pick the drive with the least used space. | -| mfs (most free space) | Pick the drive with the most available free space. | +| ff (first found) | Given the order of the branches, as defined at mount time or configured at runtime, act on the first one found. | +| lfs (least free space) | Pick the branch with the least available free space. | +| lus (least used space) | Pick the branch with the least used space. | +| mfs (most free space) | Pick the branch with the most available free space. 
| | msplfs (most shared path, least free space) | Like **eplfs** but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. | | msplus (most shared path, least used space) | Like **eplus** but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. | | mspmfs (most shared path, most free space) | Like **epmfs** but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. | @@ -832,7 +833,7 @@ filesystem. `rename` only works within a single filesystem or device. If a rename can't be done atomically due to the source and destination paths existing on different mount points it will return **-1** with **errno = EXDEV** (cross device / improper link). So if a -`rename`'s source and target are on different drives within the pool +`rename`'s source and target are on different filesystems within the pool it creates an issue. Originally mergerfs would return EXDEV whenever a rename was requested @@ -850,25 +851,25 @@ work while still obeying mergerfs' policies. Below is the basic logic. * Using the **rename** policy get the list of files to rename * For each file attempt rename: * If failure with ENOENT (no such file or directory) run **create** policy - * If create policy returns the same drive as currently evaluating then clone the path + * If create policy returns the same branch as currently evaluating then clone the path * Re-attempt rename * If **any** of the renames succeed the higher level rename is considered a success * If **no** renames succeed the first error encountered will be returned * On success: - * Remove the target from all drives with no source file - * Remove the source from all drives which failed to rename + * Remove the target from all branches with no source file + * Remove the source from all branches which failed to rename * If using a **create** policy which does **not** try to preserve directory paths * Using the **rename** policy get the list of files to rename * Using the **getattr** policy get the target path * For each file attempt rename: - * If the source drive != target drive: - * Clone target path from target drive to source drive + * If the source branch != target branch: + * Clone target path from target branch to source branch * Rename * If **any** of the renames succeed the higher level rename is considered a success * If **no** renames succeed the first error encountered will be returned * On success: - * Remove the target from all drives with no source file - * Remove the source from all drives which failed to rename + * Remove the target from all branches with no source file + * Remove the source from all branches which failed to rename The the removals are subject to normal entitlement checks. @@ -894,11 +895,11 @@ the source of the metadata you see in an **ls**. #### statfs / statvfs #### [statvfs](http://linux.die.net/man/2/statvfs) normalizes the source -drives based on the fragment size and sums the number of adjusted +filesystems based on the fragment size and sums the number of adjusted blocks and inodes. This means you will see the combined space of all sources. Total, used, and free. The sources however are dedupped based -on the drive so multiple sources on the same drive will not result in -double counting its space. Filesystems mounted further down the tree +on the filesystem so multiple sources on the same drive will not result in +double counting its space. 
Other filesystems mounted further down the tree of the branch will not be included when checking the mount's stats. The options `statfs` and `statfs_ignore` can be used to modify @@ -1211,8 +1212,8 @@ following: * mergerfs.fsck: Provides permissions and ownership auditing and the ability to fix them * mergerfs.dedup: Will help identify and optionally remove duplicate files * mergerfs.dup: Ensure there are at least N copies of a file across the pool - * mergerfs.balance: Rebalance files across drives by moving them from the most filled to the least filled - * mergerfs.consolidate: move files within a single mergerfs directory to the drive with most free space + * mergerfs.balance: Rebalance files across filesystems by moving them from the most filled to the least filled + * mergerfs.consolidate: move files within a single mergerfs directory to the filesystem with most free space * https://github.com/trapexit/scorch * scorch: A tool to help discover silent corruption of files and keep track of files * https://github.com/trapexit/bbf @@ -1324,37 +1325,18 @@ of sizes below the FUSE message size (128K on older kernels, 1M on newer). -#### policy caching - -Policies are run every time a function (with a policy as mentioned -above) is called. These policies can be expensive depending on -mergerfs' setup and client usage patterns. Generally we wouldn't want -to cache policy results because it may result in stale responses if -the underlying drives are used directly. - -The `open` policy cache will cache the result of an `open` policy for -a particular input for `cache.open` seconds or until the file is -unlinked. Each file close (release) will randomly chose to clean up -the cache of expired entries. - -This cache is really only useful in cases where you have a large -number of branches and `open` is called on the same files repeatedly -(like **Transmission** which opens and closes a file on every -read/write presumably to keep file handle usage low). - - #### statfs caching Of the syscalls used by mergerfs in policies the `statfs` / `statvfs` call is perhaps the most expensive. It's used to find out the -available space of a drive and whether it is mounted +available space of a filesystem and whether it is mounted read-only. Depending on the setup and usage pattern these queries can be relatively costly. When `cache.statfs` is enabled all calls to `statfs` by a policy will be cached for the number of seconds its set to. Example: If the create policy is `mfs` and the timeout is 60 then for -that 60 seconds the same drive will be returned as the target for +that 60 seconds the same filesystem will be returned as the target for creates because the available space won't be updated for that time. @@ -1392,42 +1374,42 @@ for instance. MergerFS does not natively support any sort of tiered caching. Most users have no use for such a feature and its inclusion would complicate the code. However, there are a few situations where a cache -drive could help with a typical mergerfs setup. +filesystem could help with a typical mergerfs setup. -1. Fast network, slow drives, many readers: You've a 10+Gbps network - with many readers and your regular drives can't keep up. -2. Fast network, slow drives, small'ish bursty writes: You have a +1. Fast network, slow filesystems, many readers: You've a 10+Gbps network + with many readers and your regular filesystems can't keep up. +2. 
Fast network, slow filesystems, small'ish bursty writes: You have a 10+Gbps network and wish to transfer amounts of data less than your - cache drive but wish to do so quickly. + cache filesystem but wish to do so quickly. With #1 it's arguable if you should be using mergerfs at all. RAID would probably be the better solution. If you're going to use mergerfs there are other tactics that may help: spreading the data across -drives (see the mergerfs.dup tool) and setting `func.open=rand`, using -`symlinkify`, or using dm-cache or a similar technology to add tiered -cache to the underlying device. +filesystems (see the mergerfs.dup tool) and setting `func.open=rand`, +using `symlinkify`, or using dm-cache or a similar technology to add +tiered cache to the underlying device. With #2 one could use dm-cache as well but there is another solution which requires only mergerfs and a cronjob. -1. Create 2 mergerfs pools. One which includes just the slow drives - and one which has both the fast drives (SSD,NVME,etc.) and slow - drives. -2. The 'cache' pool should have the cache drives listed first. +1. Create 2 mergerfs pools. One which includes just the slow devices + and one which has both the fast devices (SSD,NVME,etc.) and slow + devices. +2. The 'cache' pool should have the cache filesystems listed first. 3. The best `create` policies to use for the 'cache' pool would probably be `ff`, `epff`, `lfs`, or `eplfs`. The latter two under - the assumption that the cache drive(s) are far smaller than the - backing drives. If using path preserving policies remember that + the assumption that the cache filesystem(s) are far smaller than the + backing filesystems. If using path preserving policies remember that you'll need to manually create the core directories of those paths you wish to be cached. Be sure the permissions are in sync. Use - `mergerfs.fsck` to check / correct them. You could also tag the - slow drives as `=NC` though that'd mean if the cache drives fill - you'd get "out of space" errors. + `mergerfs.fsck` to check / correct them. You could also set the + slow filesystems mode to `NC` though that'd mean if the cache + filesystems fill you'd get "out of space" errors. 4. Enable `moveonenospc` and set `minfreespace` appropriately. To make sure there is enough room on the "slow" pool you might want to set `minfreespace` to at least as large as the size of the largest - cache drive if not larger. This way in the worst case the whole of - the cache drive(s) can be moved to the other drives. + cache filesystem if not larger. This way in the worst case the + whole of the cache filesystem(s) can be moved to the other drives. 5. Set your programs to use the cache pool. 6. Save one of the below scripts or create you're own. 7. Use `cron` (as root) to schedule the command at whatever frequency @@ -1442,15 +1424,15 @@ rather than days. May want to use the `fadvise` / `--drop-cache` version of rsync or run rsync with the tool "nocache". *NOTE:* The arguments to these scripts include the cache -**drive**. Not the pool with the cache drive. You could have data loss -if the source is the cache pool. +**filesystem** itself. Not the pool with the cache filesystem. You +could have data loss if the source is the cache pool. ``` #!/bin/bash if [ $# != 3 ]; then - echo "usage: $0 " + echo "usage: $0 " exit 1 fi @@ -1469,15 +1451,15 @@ Move the oldest file from the cache to the backing pool. Continue till below percentage threshold. *NOTE:* The arguments to these scripts include the cache -**drive**. 
Not the pool with the cache drive. You could have data loss -if the source is the cache pool. +**filesystem** itself. Not the pool with the cache filesystem. You +could have data loss if the source is the cache pool. ``` #!/bin/bash if [ $# != 3 ]; then - echo "usage: $0 " + echo "usage: $0 " exit 1 fi @@ -1506,7 +1488,7 @@ FUSE filesystem working from userspace there is an increase in overhead relative to kernel based solutions. That said the performance can match the theoretical max but it depends greatly on the system's configuration. Especially when adding network filesystems into the mix -there are many variables which can impact performance. Drive speeds +there are many variables which can impact performance. Device speeds and latency, network speeds and latency, general concurrency, read/write sizes, etc. Unfortunately, given the number of variables it has been difficult to find a single set of settings which provide @@ -1528,7 +1510,7 @@ understand what behaviors it may impact * disable `async_read` * test theoretical performance using `nullrw` or mounting a ram disk * use `symlinkify` if your data is largely static and read-only -* use tiered cache drives +* use tiered cache devices * use LVM and LVM cache to place a SSD in front of your HDDs * increase readahead: `readahead=1024` @@ -1567,9 +1549,9 @@ the order listed (but not combined). 2. Mount mergerfs over `tmpfs`. `tmpfs` is a RAM disk. Extremely high speed and very low latency. This is a more realistic best case scenario. Example: `mount -t tmpfs -o size=2G tmpfs /tmp/tmpfs` -3. Mount mergerfs over a local drive. NVMe, SSD, HDD, etc. If you have - more than one I'd suggest testing each of them as drives and/or - controllers (their drivers) could impact performance. +3. Mount mergerfs over a local device. NVMe, SSD, HDD, etc. If you + have more than one I'd suggest testing each of them as drives + and/or controllers (their drivers) could impact performance. 4. Finally, if you intend to use mergerfs with a network filesystem, either as the source of data or to combine with another through mergerfs, test each of those alone as above. @@ -1579,7 +1561,7 @@ further testing with different options to see if they impact performance. For reads and writes the most relevant would be: `cache.files`, `async_read`. Less likely but relevant when using NFS or with certain filesystems would be `security_capability`, `xattr`, -and `posix_acl`. If you find a specific system, drive, filesystem, +and `posix_acl`. If you find a specific system, device, filesystem, controller, etc. that performs poorly contact trapexit so he may investigate further. @@ -1632,7 +1614,7 @@ echo 3 | sudo tee /proc/sys/vm/drop_caches * If you don't see some directories and files you expect, policies seem to skip branches, you get strange permission errors, etc. be sure the underlying filesystems' permissions are all the same. Use - `mergerfs.fsck` to audit the drive for out of sync permissions. + `mergerfs.fsck` to audit the filesystem for out of sync permissions. * If you still have permission issues be sure you are using POSIX ACL compliant filesystems. mergerfs doesn't generally make exceptions for FAT, NTFS, or other non-POSIX filesystem. @@ -1684,7 +1666,7 @@ outdated. The reason this is the default is because any other policy would be more expensive and for many applications it is unnecessary. To always return the directory with the most recent mtime or a faked value based -on all found would require a scan of all drives. 
+on all found would require a scan of all filesystems. If you always want the directory information from the one with the most recent mtime then use the `newest` policy for `getattr`. @@ -1709,9 +1691,9 @@ then removing the source. Since the source **is** the target in this case, depending on the unlink policy, it will remove the just copied file and other files across the branches. -If you want to move files to one drive just copy them there and use -mergerfs.dedup to clean up the old paths or manually remove them from -the branches directly. +If you want to move files to one filesystem just copy them there and +use mergerfs.dedup to clean up the old paths or manually remove them +from the branches directly. #### cached memory appears greater than it should be @@ -1772,15 +1754,14 @@ Please read the section above regarding [rename & link](#rename--link). The problem is that many applications do not properly handle `EXDEV` errors which `rename` and `link` may return even though they are -perfectly valid situations which do not indicate actual drive or OS -errors. The error will only be returned by mergerfs if using a path -preserving policy as described in the policy section above. If you do -not care about path preservation simply change the mergerfs policy to -the non-path preserving version. For example: `-o category.create=mfs` - -Ideally the offending software would be fixed and it is recommended -that if you run into this problem you contact the software's author -and request proper handling of `EXDEV` errors. +perfectly valid situations which do not indicate actual device, +filesystem, or OS errors. The error will only be returned by mergerfs +if using a path preserving policy as described in the policy section +above. If you do not care about path preservation simply change the +mergerfs policy to the non-path preserving version. For example: `-o +category.create=mfs` Ideally the offending software would be fixed and +it is recommended that if you run into this problem you contact the +software's author and request proper handling of `EXDEV` errors. #### my 32bit software has problems @@ -1887,9 +1868,10 @@ Users have reported running mergerfs on everything from a Raspberry Pi to dual socket Xeon systems with >20 cores. I'm aware of at least a few companies which use mergerfs in production. [Open Media Vault](https://www.openmediavault.org) includes mergerfs as its sole -solution for pooling drives. The author of mergerfs had it running for -over 300 days managing 16+ drives with reasonably heavy 24/7 read and -write usage. Stopping only after the machine's power supply died. +solution for pooling filesystems. The author of mergerfs had it +running for over 300 days managing 16+ devices with reasonably heavy +24/7 read and write usage. Stopping only after the machine's power +supply died. Most serious issues (crashes or data corruption) have been due to [kernel @@ -1897,14 +1879,14 @@ bugs](https://github.com/trapexit/mergerfs/wiki/Kernel-Issues-&-Bugs). All of which are fixed in stable releases. -#### Can mergerfs be used with drives which already have data / are in use? +#### Can mergerfs be used with filesystems which already have data / are in use? Yes. MergerFS is a proxy and does **NOT** interfere with the normal -form or function of the drives / mounts / paths it manages. +form or function of the filesystems / mounts / paths it manages. MergerFS is **not** a traditional filesystem. MergerFS is **not** RAID. It does **not** manipulate the data that passes through it. 
It -does **not** shard data across drives. It merely shards some +does **not** shard data across filesystems. It merely shards some **behavior** and aggregates others. @@ -1920,8 +1902,8 @@ best off using `mfs` for `category.create`. It will spread files out across your branches based on available space. Use `mspmfs` if you want to try to colocate the data a bit more. You may want to use `lus` if you prefer a slightly different distribution of data if you have a -mix of smaller and larger drives. Generally though `mfs`, `lus`, or -even `rand` are good for the general use case. If you are starting +mix of smaller and larger filesystems. Generally though `mfs`, `lus`, +or even `rand` are good for the general use case. If you are starting with an imbalanced pool you can use the tool **mergerfs.balance** to redistribute files across the pool. @@ -1929,8 +1911,8 @@ If you really wish to try to colocate files based on directory you can set `func.create` to `epmfs` or similar and `func.mkdir` to `rand` or `eprand` depending on if you just want to colocate generally or on specific branches. Either way the *need* to colocate is rare. For -instance: if you wish to remove the drive regularly and want the data -to predictably be on that drive or if you don't use backup at all and +instance: if you wish to remove the device regularly and want the data +to predictably be on that device or if you don't use backup at all and don't wish to replace that data piecemeal. In which case using path preservation can help but will require some manual attention. Colocating after the fact can be accomplished using the @@ -1965,29 +1947,29 @@ That said, for the average person, the following should be fine: `cache.files=off,dropcacheonclose=true,category.create=mfs` -#### Why are all my files ending up on 1 drive?! +#### Why are all my files ending up on 1 filesystem?! -Did you start with empty drives? Did you explicitly configure a +Did you start with empty filesystems? Did you explicitly configure a `category.create` policy? Are you using an `existing path` / `path preserving` policy? The default create policy is `epmfs`. That is a path preserving algorithm. With such a policy for `mkdir` and `create` with a set of -empty drives it will select only 1 drive when the first directory is -created. Anything, files or directories, created in that first -directory will be placed on the same branch because it is preserving -paths. +empty filesystems it will select only 1 filesystem when the first +directory is created. Anything, files or directories, created in that +first directory will be placed on the same branch because it is +preserving paths. This catches a lot of new users off guard but changing the default would break the setup for many existing users. If you do not care about path preservation and wish your files to be spread across all -your drives change to `mfs` or similar policy as described above. If -you do want path preservation you'll need to perform the manual act of -creating paths on the drives you want the data to land on before -transferring your data. Setting `func.mkdir=epall` can simplify -managing path preservation for `create`. Or use `func.mkdir=rand` if -you're interested in just grouping together directory content by -drive. +your filesystems change to `mfs` or similar policy as described +above. If you do want path preservation you'll need to perform the +manual act of creating paths on the filesystems you want the data to +land on before transferring your data. 
Setting `func.mkdir=epall` can +simplify managing path preservation for `create`. Or use +`func.mkdir=rand` if you're interested in just grouping together +directory content by filesystem. #### Do hardlinks work? @@ -2058,8 +2040,8 @@ such, mergerfs always changes its credentials to that of the caller. This means that if the user does not have access to a file or directory than neither will mergerfs. However, because mergerfs is creating a union of paths it may be able to read some files and -directories on one drive but not another resulting in an incomplete -set. +directories on one filesystem but not another resulting in an +incomplete set. Whenever you run into a split permission issue (seeing some but not all files) try using @@ -2153,9 +2135,10 @@ overlayfs have. #### Why use mergerfs over unionfs? UnionFS is more like aufs than mergerfs in that it offers overlay / -CoW features. If you're just looking to create a union of drives and -want flexibility in file/directory placement then mergerfs offers that -whereas unionfs is more for overlaying RW filesystems over RO ones. +CoW features. If you're just looking to create a union of filesystems +and want flexibility in file/directory placement then mergerfs offers +that whereas unionfs is more for overlaying RW filesystems over RO +ones. #### Why use mergerfs over overlayfs? @@ -2179,8 +2162,8 @@ without the single point of failure. #### Why use mergerfs over ZFS? MergerFS is not intended to be a replacement for ZFS. MergerFS is -intended to provide flexible pooling of arbitrary drives (local or -remote), of arbitrary sizes, and arbitrary filesystems. For `write +intended to provide flexible pooling of arbitrary filesystems (local +or remote), of arbitrary sizes, and arbitrary filesystems. For `write once, read many` usecases such as bulk media storage. Where data integrity and backup is managed in other ways. In that situation ZFS can introduce a number of costs and limitations as described @@ -2200,6 +2183,29 @@ There are a number of UnRAID users who use mergerfs as well though I'm not entirely familiar with the use case. +#### Why use mergerfs over StableBit's DrivePool? + +DrivePool works only on Windows so not as common an alternative as +other Linux solutions. If you want to use Windows then DrivePool is a +good option. Functionally the two projects work a bit +differently. DrivePool always writes to the filesystem with the most +free space and later rebalances. mergerfs does not offer rebalance but +chooses a branch at file/directory create time. DrivePool's +rebalancing can be done differently in any directory and has file +pattern matching to further customize the behavior. mergerfs, not +having rebalancing does not have these features, but similar features +are planned for mergerfs v3. DrivePool has builtin file duplication +which mergerfs does not natively support (but can be done via an +external script.) + +There are a lot of misc differences between the two projects but most +features in DrivePool can be replicated with external tools in +combination with mergerfs. + +Additionally DrivePool is a closed source commercial product vs +mergerfs a ISC licensed OSS project. + + #### What should mergerfs NOT be used for? * databases: Even if the database stored data in separate files @@ -2214,7 +2220,7 @@ not entirely familiar with the use case. availability you should stick with RAID. -#### Can drives be written to directly? Outside of mergerfs while pooled? +#### Can filesystems be written to directly? 
Outside of mergerfs while pooled? Yes, however it's not recommended to use the same file from within the pool and from without at the same time (particularly @@ -2244,7 +2250,7 @@ was asked of it: filtering possible branches due to those settings. Only one error can be returned and if one of the reasons for filtering a branch was **minfreespace** then it will be returned as such. **moveonenospc** is only relevant to writing a file which is too -large for the drive its currently on. +large for the filesystem it's currently on. It is also possible that the filesystem selected has run out of inodes. Use `df -i` to list the total and available inodes per @@ -2336,7 +2342,8 @@ away by using realtime signals to inform all threads to change credentials. Taking after **Samba**, mergerfs uses **syscall(SYS_setreuid,...)** to set the callers credentials for that thread only. Jumping back to **root** as necessary should escalated -privileges be needed (for instance: to clone paths between drives). +privileges be needed (for instance: to clone paths between +filesystems). For non-Linux systems mergerfs uses a read-write lock and changes credentials only when necessary. If multiple threads are to be user X diff --git a/man/mergerfs.1 b/man/mergerfs.1 index a6586227..c5d3aa5f 100644 --- a/man/mergerfs.1 +++ b/man/mergerfs.1 @@ -77,9 +77,9 @@ A + B = C mergerfs does \f[B]not\f[R] support the copy-on-write (CoW) or whiteout behaviors found in \f[B]aufs\f[R] and \f[B]overlayfs\f[R]. You can \f[B]not\f[R] mount a read-only filesystem and write to it. -However, mergerfs will ignore read-only drives when creating new files -so you can mix read-write and read-only drives. -It also does \f[B]not\f[R] split data across drives. +However, mergerfs will ignore read-only filesystems when creating new +files so you can mix read-write and read-only filesystems. +It also does \f[B]not\f[R] split data across filesystems. It is not RAID0 / striping. It is simply a union of other filesystems. .SH TERMINOLOGY @@ -210,7 +210,8 @@ Typically rename and link act differently depending on the policy of \f[C]create\f[R] (read below). Enabling this will cause rename and link to always use the non-path preserving behavior. -This means files, when renamed or linked, will stay on the same drive. +This means files, when renamed or linked, will stay on the same +filesystem. (default: false) .IP \[bu] 2 \f[B]security_capability=BOOL\f[R]: If false return ENOATTR when xattr @@ -233,7 +234,7 @@ to cow-shell. .IP \[bu] 2 \f[B]statfs=base|full\f[R]: Controls how statfs works. `base' means it will always use all branches in statfs calculations. -`full' is in effect path preserving and only includes drives where the +`full' is in effect path preserving and only includes branches where the path exists. (default: base) .IP \[bu] 2 @@ -442,10 +443,10 @@ POLICY = mergerfs function policy .PP The `branches' argument is a colon (`:') delimited list of paths to be pooled together. -It does not matter if the paths are on the same or different drives nor -does it matter the filesystem (within reason). +It does not matter if the paths are on the same or different filesystems +nor does it matter the filesystem type (within reason). Used and available space will not be duplicated for paths on the same -device and any features which aren\[cq]t supported by the underlying +filesystem and any features which aren\[cq]t supported by the underlying filesystem (such as file attributes or extended attributes) will return the appropriate errors. 
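.PP
As a minimal sketch (the device paths, mount point, and the 1234 byte
value below are only illustrative; the option string is the one this
document suggests elsewhere for the average user), a pool mixing branch
modes and a per-branch minfreespace could be mounted like so. The
branch modes and per-branch minfreespace shown are described just
below.
.IP
.nf
\f[C]
# /mnt/disk0: read-write, uses the global minfreespace
# /mnt/disk1: read-write, skipped by create policies below 1234 bytes free
# /mnt/disk2: no-create, never selected for newly created files
mergerfs -o cache.files=off,dropcacheonclose=true,category.create=mfs /mnt/disk0=RW:/mnt/disk1=RW,1234:/mnt/disk2=NC /mnt/pool
\f[R]
.fi
.PP
All three branches remain part of the union for reading existing files;
the modes and minfreespace only limit which branches policies may pick
when creating new ones.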
.PP @@ -454,7 +455,7 @@ A type which impacts whether or not the branch is included in a policy calculation and a individual minfreespace value. The values are set by prepending an \f[C]=\f[R] at the end of a branch designation and using commas as delimiters. -Example: /mnt/drive=RW,1234 +Example: \f[C]/mnt/drive=RW,1234\f[R] .SS branch mode .IP \[bu] 2 RW: (read/write) - Default behavior. @@ -748,8 +749,8 @@ This is unlikely to occur in practice but is something to keep in mind. \f[B]WARNING:\f[R] Some backup solutions, such as CrashPlan, do not backup the target of a symlink. If using this feature it will be necessary to point any backup software -to the original drives or configure the software to follow symlinks if -such an option is available. +to the original filesystems or configure the software to follow symlinks +if such an option is available. Alternatively create two mounts. One for backup and one for general consumption. .SS nullrw @@ -939,11 +940,11 @@ All policies which start with \f[C]ep\f[R] (\f[B]epff\f[R], \f[C]path preserving\f[R]. \f[C]ep\f[R] stands for \f[C]existing path\f[R]. .PP -A path preserving policy will only consider drives where the relative +A path preserving policy will only consider branches where the relative path being accessed already exists. .PP When using non-path preserving policies paths will be cloned to target -drives as necessary. +branches as necessary. .PP With the \f[C]msp\f[R] or \f[C]most shared path\f[R] policies they are defined as \f[C]path preserving\f[R] for the purpose of controlling @@ -990,19 +991,19 @@ T} T{ eplfs (existing path, least free space) T}@T{ -Of all the branches on which the relative path exists choose the drive +Of all the branches on which the relative path exists choose the branch with the least free space. T} T{ eplus (existing path, least used space) T}@T{ -Of all the branches on which the relative path exists choose the drive +Of all the branches on which the relative path exists choose the branch with the least used space. T} T{ epmfs (existing path, most free space) T}@T{ -Of all the branches on which the relative path exists choose the drive +Of all the branches on which the relative path exists choose the branch with the most free space. T} T{ @@ -1019,23 +1020,23 @@ T} T{ ff (first found) T}@T{ -Given the order of the drives, as defined at mount time or configured at -runtime, act on the first one found. +Given the order of the branches, as defined at mount time or configured +at runtime, act on the first one found. T} T{ lfs (least free space) T}@T{ -Pick the drive with the least available free space. +Pick the branch with the least available free space. T} T{ lus (least used space) T}@T{ -Pick the drive with the least used space. +Pick the branch with the least used space. T} T{ mfs (most free space) T}@T{ -Pick the drive with the most available free space. +Pick the branch with the most available free space. T} T{ msplfs (most shared path, least free space) @@ -1141,8 +1142,8 @@ If a rename can\[cq]t be done atomically due to the source and destination paths existing on different mount points it will return \f[B]-1\f[R] with \f[B]errno = EXDEV\f[R] (cross device / improper link). -So if a \f[C]rename\f[R]\[cq]s source and target are on different drives -within the pool it creates an issue. +So if a \f[C]rename\f[R]\[cq]s source and target are on different +filesystems within the pool it creates an issue. 
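.PP
A quick hypothetical illustration of why this matters to applications
(the paths are made up, and whether EXDEV is actually returned depends
on which branches the source and target resolve to and on the
configured policies):
.IP
.nf
\f[C]
# If mergerfs returns EXDEV for the rename, GNU mv recovers on its own
# by falling back to a copy followed by an unlink of the source.
mv /mnt/pool/a/file /mnt/pool/b/file
# ln performs no such fallback; if link() returns EXDEV the user simply
# sees the corresponding error ("Invalid cross-device link").
ln /mnt/pool/a/file /mnt/pool/b/file
\f[R]
.fi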
.PP Originally mergerfs would return EXDEV whenever a rename was requested which was cross directory in any way. @@ -1169,7 +1170,7 @@ For each file attempt rename: If failure with ENOENT (no such file or directory) run \f[B]create\f[R] policy .IP \[bu] 2 -If create policy returns the same drive as currently evaluating then +If create policy returns the same branch as currently evaluating then clone the path .IP \[bu] 2 Re-attempt rename @@ -1184,9 +1185,9 @@ returned On success: .RS 2 .IP \[bu] 2 -Remove the target from all drives with no source file +Remove the target from all branches with no source file .IP \[bu] 2 -Remove the source from all drives which failed to rename +Remove the source from all branches which failed to rename .RE .RE .IP \[bu] 2 @@ -1201,10 +1202,10 @@ Using the \f[B]getattr\f[R] policy get the target path For each file attempt rename: .RS 2 .IP \[bu] 2 -If the source drive != target drive: +If the source branch != target branch: .RS 2 .IP \[bu] 2 -Clone target path from target drive to source drive +Clone target path from target branch to source branch .RE .IP \[bu] 2 Rename @@ -1219,9 +1220,9 @@ returned On success: .RS 2 .IP \[bu] 2 -Remove the target from all drives with no source file +Remove the target from all branches with no source file .IP \[bu] 2 -Remove the source from all drives which failed to rename +Remove the source from all branches which failed to rename .RE .RE .PP @@ -1247,14 +1248,14 @@ file/directory which is the source of the metadata you see in an .SS statfs / statvfs .PP statvfs (http://linux.die.net/man/2/statvfs) normalizes the source -drives based on the fragment size and sums the number of adjusted blocks -and inodes. +filesystems based on the fragment size and sums the number of adjusted +blocks and inodes. This means you will see the combined space of all sources. Total, used, and free. -The sources however are dedupped based on the drive so multiple sources -on the same drive will not result in double counting its space. -Filesystems mounted further down the tree of the branch will not be -included when checking the mount\[cq]s stats. +The sources however are dedupped based on the filesystem so multiple +sources on the same drive will not result in double counting its space. +Other filesystems mounted further down the tree of the branch will not +be included when checking the mount\[cq]s stats. .PP The options \f[C]statfs\f[R] and \f[C]statfs_ignore\f[R] can be used to modify \f[C]statfs\f[R] behavior. @@ -1611,11 +1612,11 @@ mergerfs.dedup: Will help identify and optionally remove duplicate files mergerfs.dup: Ensure there are at least N copies of a file across the pool .IP \[bu] 2 -mergerfs.balance: Rebalance files across drives by moving them from the -most filled to the least filled +mergerfs.balance: Rebalance files across filesystems by moving them from +the most filled to the least filled .IP \[bu] 2 mergerfs.consolidate: move files within a single mergerfs directory to -the drive with most free space +the filesystem with most free space .RE .IP \[bu] 2 https://github.com/trapexit/scorch @@ -1746,40 +1747,21 @@ Note that if an application is properly sizing writes then writeback caching will have little or no effect. It will only help with writes of sizes below the FUSE message size (128K on older kernels, 1M on newer). -.SS policy caching -.PP -Policies are run every time a function (with a policy as mentioned -above) is called. -These policies can be expensive depending on mergerfs\[cq] setup and -client usage patterns. 
-Generally we wouldn\[cq]t want to cache policy results because it may -result in stale responses if the underlying drives are used directly. -.PP -The \f[C]open\f[R] policy cache will cache the result of an -\f[C]open\f[R] policy for a particular input for \f[C]cache.open\f[R] -seconds or until the file is unlinked. -Each file close (release) will randomly chose to clean up the cache of -expired entries. -.PP -This cache is really only useful in cases where you have a large number -of branches and \f[C]open\f[R] is called on the same files repeatedly -(like \f[B]Transmission\f[R] which opens and closes a file on every -read/write presumably to keep file handle usage low). .SS statfs caching .PP Of the syscalls used by mergerfs in policies the \f[C]statfs\f[R] / \f[C]statvfs\f[R] call is perhaps the most expensive. -It\[cq]s used to find out the available space of a drive and whether it -is mounted read-only. +It\[cq]s used to find out the available space of a filesystem and +whether it is mounted read-only. Depending on the setup and usage pattern these queries can be relatively costly. When \f[C]cache.statfs\f[R] is enabled all calls to \f[C]statfs\f[R] by a policy will be cached for the number of seconds its set to. .PP Example: If the create policy is \f[C]mfs\f[R] and the timeout is 60 -then for that 60 seconds the same drive will be returned as the target -for creates because the available space won\[cq]t be updated for that -time. +then for that 60 seconds the same filesystem will be returned as the +target for creates because the available space won\[cq]t be updated for +that time. .SS symlink caching .PP As of version 4.20 Linux supports symlink caching. @@ -1815,54 +1797,55 @@ NVMe, SSD, Optane in front of traditional HDDs for instance. MergerFS does not natively support any sort of tiered caching. Most users have no use for such a feature and its inclusion would complicate the code. -However, there are a few situations where a cache drive could help with -a typical mergerfs setup. +However, there are a few situations where a cache filesystem could help +with a typical mergerfs setup. .IP "1." 3 -Fast network, slow drives, many readers: You\[cq]ve a 10+Gbps network -with many readers and your regular drives can\[cq]t keep up. +Fast network, slow filesystems, many readers: You\[cq]ve a 10+Gbps +network with many readers and your regular filesystems can\[cq]t keep +up. .IP "2." 3 -Fast network, slow drives, small\[cq]ish bursty writes: You have a +Fast network, slow filesystems, small\[cq]ish bursty writes: You have a 10+Gbps network and wish to transfer amounts of data less than your -cache drive but wish to do so quickly. +cache filesystem but wish to do so quickly. .PP With #1 it\[cq]s arguable if you should be using mergerfs at all. RAID would probably be the better solution. If you\[cq]re going to use mergerfs there are other tactics that may -help: spreading the data across drives (see the mergerfs.dup tool) and -setting \f[C]func.open=rand\f[R], using \f[C]symlinkify\f[R], or using -dm-cache or a similar technology to add tiered cache to the underlying -device. +help: spreading the data across filesystems (see the mergerfs.dup tool) +and setting \f[C]func.open=rand\f[R], using \f[C]symlinkify\f[R], or +using dm-cache or a similar technology to add tiered cache to the +underlying device. .PP With #2 one could use dm-cache as well but there is another solution which requires only mergerfs and a cronjob. .IP "1." 3 Create 2 mergerfs pools. 
-One which includes just the slow drives and one which has both the fast -drives (SSD,NVME,etc.) and slow drives. +One which includes just the slow devices and one which has both the fast +devices (SSD,NVME,etc.) and slow devices. .IP "2." 3 -The `cache' pool should have the cache drives listed first. +The `cache' pool should have the cache filesystems listed first. .IP "3." 3 The best \f[C]create\f[R] policies to use for the `cache' pool would probably be \f[C]ff\f[R], \f[C]epff\f[R], \f[C]lfs\f[R], or \f[C]eplfs\f[R]. -The latter two under the assumption that the cache drive(s) are far -smaller than the backing drives. +The latter two under the assumption that the cache filesystem(s) are far +smaller than the backing filesystems. If using path preserving policies remember that you\[cq]ll need to manually create the core directories of those paths you wish to be cached. Be sure the permissions are in sync. Use \f[C]mergerfs.fsck\f[R] to check / correct them. -You could also tag the slow drives as \f[C]=NC\f[R] though that\[cq]d -mean if the cache drives fill you\[cq]d get \[lq]out of space\[rq] -errors. +You could also set the slow filesystems mode to \f[C]NC\f[R] though +that\[cq]d mean if the cache filesystems fill you\[cq]d get \[lq]out of +space\[rq] errors. .IP "4." 3 Enable \f[C]moveonenospc\f[R] and set \f[C]minfreespace\f[R] appropriately. To make sure there is enough room on the \[lq]slow\[rq] pool you might want to set \f[C]minfreespace\f[R] to at least as large as the size of -the largest cache drive if not larger. -This way in the worst case the whole of the cache drive(s) can be moved -to the other drives. +the largest cache filesystem if not larger. +This way in the worst case the whole of the cache filesystem(s) can be +moved to the other drives. .IP "5." 3 Set your programs to use the cache pool. .IP "6." 3 @@ -1880,8 +1863,8 @@ May want to use the \f[C]fadvise\f[R] / \f[C]--drop-cache\f[R] version of rsync or run rsync with the tool \[lq]nocache\[rq]. .PP \f[I]NOTE:\f[R] The arguments to these scripts include the cache -\f[B]drive\f[R]. -Not the pool with the cache drive. +\f[B]filesystem\f[R] itself. +Not the pool with the cache filesystem. You could have data loss if the source is the cache pool. .IP .nf @@ -1889,7 +1872,7 @@ You could have data loss if the source is the cache pool. #!/bin/bash if [ $# != 3 ]; then - echo \[dq]usage: $0 \[dq] + echo \[dq]usage: $0 \[dq] exit 1 fi @@ -1907,8 +1890,8 @@ Move the oldest file from the cache to the backing pool. Continue till below percentage threshold. .PP \f[I]NOTE:\f[R] The arguments to these scripts include the cache -\f[B]drive\f[R]. -Not the pool with the cache drive. +\f[B]filesystem\f[R] itself. +Not the pool with the cache filesystem. You could have data loss if the source is the cache pool. .IP .nf @@ -1916,7 +1899,7 @@ You could have data loss if the source is the cache pool. #!/bin/bash if [ $# != 3 ]; then - echo \[dq]usage: $0 \[dq] + echo \[dq]usage: $0 \[dq] exit 1 fi @@ -1946,7 +1929,7 @@ That said the performance can match the theoretical max but it depends greatly on the system\[cq]s configuration. Especially when adding network filesystems into the mix there are many variables which can impact performance. -Drive speeds and latency, network speeds and latency, general +Device speeds and latency, network speeds and latency, general concurrency, read/write sizes, etc. Unfortunately, given the number of variables it has been difficult to find a single set of settings which provide optimal performance. 
@@ -1982,7 +1965,7 @@ disk .IP \[bu] 2 use \f[C]symlinkify\f[R] if your data is largely static and read-only .IP \[bu] 2 -use tiered cache drives +use tiered cache devices .IP \[bu] 2 use LVM and LVM cache to place a SSD in front of your HDDs .IP \[bu] 2 @@ -2029,7 +2012,7 @@ Extremely high speed and very low latency. This is a more realistic best case scenario. Example: \f[C]mount -t tmpfs -o size=2G tmpfs /tmp/tmpfs\f[R] .IP "3." 3 -Mount mergerfs over a local drive. +Mount mergerfs over a local device. NVMe, SSD, HDD, etc. If you have more than one I\[cq]d suggest testing each of them as drives and/or controllers (their drivers) could impact performance. @@ -2046,7 +2029,7 @@ For reads and writes the most relevant would be: \f[C]cache.files\f[R], Less likely but relevant when using NFS or with certain filesystems would be \f[C]security_capability\f[R], \f[C]xattr\f[R], and \f[C]posix_acl\f[R]. -If you find a specific system, drive, filesystem, controller, etc. +If you find a specific system, device, filesystem, controller, etc. that performs poorly contact trapexit so he may investigate further. .PP Sometimes the problem is really the application accessing or writing @@ -2109,7 +2092,7 @@ exibit incorrect behavior if run otherwise.. If you don\[cq]t see some directories and files you expect, policies seem to skip branches, you get strange permission errors, etc. be sure the underlying filesystems\[cq] permissions are all the same. -Use \f[C]mergerfs.fsck\f[R] to audit the drive for out of sync +Use \f[C]mergerfs.fsck\f[R] to audit the filesystem for out of sync permissions. .IP \[bu] 2 If you still have permission issues be sure you are using POSIX ACL @@ -2165,7 +2148,7 @@ appear outdated. The reason this is the default is because any other policy would be more expensive and for many applications it is unnecessary. To always return the directory with the most recent mtime or a faked -value based on all found would require a scan of all drives. +value based on all found would require a scan of all filesystems. .PP If you always want the directory information from the one with the most recent mtime then use the \f[C]newest\f[R] policy for \f[C]getattr\f[R]. @@ -2191,7 +2174,7 @@ Since the source \f[B]is\f[R] the target in this case, depending on the unlink policy, it will remove the just copied file and other files across the branches. .PP -If you want to move files to one drive just copy them there and use +If you want to move files to one filesystem just copy them there and use mergerfs.dedup to clean up the old paths or manually remove them from the branches directly. .SS cached memory appears greater than it should be @@ -2253,16 +2236,15 @@ Please read the section above regarding rename & link. The problem is that many applications do not properly handle \f[C]EXDEV\f[R] errors which \f[C]rename\f[R] and \f[C]link\f[R] may return even though they are perfectly valid situations which do not -indicate actual drive or OS errors. +indicate actual device, filesystem, or OS errors. The error will only be returned by mergerfs if using a path preserving policy as described in the policy section above. If you do not care about path preservation simply change the mergerfs policy to the non-path preserving version. -For example: \f[C]-o category.create=mfs\f[R] -.PP -Ideally the offending software would be fixed and it is recommended that -if you run into this problem you contact the software\[cq]s author and -request proper handling of \f[C]EXDEV\f[R] errors. 
+For example: \f[C]-o category.create=mfs\f[R] Ideally the offending +software would be fixed and it is recommended that if you run into this +problem you contact the software\[cq]s author and request proper +handling of \f[C]EXDEV\f[R] errors. .SS my 32bit software has problems .PP Some software have problems with 64bit inode values. @@ -2373,24 +2355,24 @@ to dual socket Xeon systems with >20 cores. I\[cq]m aware of at least a few companies which use mergerfs in production. Open Media Vault (https://www.openmediavault.org) includes mergerfs as -its sole solution for pooling drives. +its sole solution for pooling filesystems. The author of mergerfs had it running for over 300 days managing 16+ -drives with reasonably heavy 24/7 read and write usage. +devices with reasonably heavy 24/7 read and write usage. Stopping only after the machine\[cq]s power supply died. .PP Most serious issues (crashes or data corruption) have been due to kernel bugs (https://github.com/trapexit/mergerfs/wiki/Kernel-Issues-&-Bugs). All of which are fixed in stable releases. -.SS Can mergerfs be used with drives which already have data / are in use? +.SS Can mergerfs be used with filesystems which already have data / are in use? .PP Yes. MergerFS is a proxy and does \f[B]NOT\f[R] interfere with the normal -form or function of the drives / mounts / paths it manages. +form or function of the filesystems / mounts / paths it manages. .PP MergerFS is \f[B]not\f[R] a traditional filesystem. MergerFS is \f[B]not\f[R] RAID. It does \f[B]not\f[R] manipulate the data that passes through it. -It does \f[B]not\f[R] shard data across drives. +It does \f[B]not\f[R] shard data across filesystems. It merely shards some \f[B]behavior\f[R] and aggregates others. .SS Can mergerfs be removed without affecting the data? .PP @@ -2402,7 +2384,8 @@ probably best off using \f[C]mfs\f[R] for \f[C]category.create\f[R]. It will spread files out across your branches based on available space. Use \f[C]mspmfs\f[R] if you want to try to colocate the data a bit more. You may want to use \f[C]lus\f[R] if you prefer a slightly different -distribution of data if you have a mix of smaller and larger drives. +distribution of data if you have a mix of smaller and larger +filesystems. Generally though \f[C]mfs\f[R], \f[C]lus\f[R], or even \f[C]rand\f[R] are good for the general use case. If you are starting with an imbalanced pool you can use the tool @@ -2413,8 +2396,8 @@ set \f[C]func.create\f[R] to \f[C]epmfs\f[R] or similar and \f[C]func.mkdir\f[R] to \f[C]rand\f[R] or \f[C]eprand\f[R] depending on if you just want to colocate generally or on specific branches. Either way the \f[I]need\f[R] to colocate is rare. -For instance: if you wish to remove the drive regularly and want the -data to predictably be on that drive or if you don\[cq]t use backup at +For instance: if you wish to remove the device regularly and want the +data to predictably be on that device or if you don\[cq]t use backup at all and don\[cq]t wish to replace that data piecemeal. In which case using path preservation can help but will require some manual attention. @@ -2451,9 +2434,9 @@ the documentation will be improved. That said, for the average person, the following should be fine: .PP \f[C]cache.files=off,dropcacheonclose=true,category.create=mfs\f[R] -.SS Why are all my files ending up on 1 drive?! +.SS Why are all my files ending up on 1 filesystem?! .PP -Did you start with empty drives? +Did you start with empty filesystems? 
Did you explicitly configure a \f[C]category.create\f[R] policy? Are you using an \f[C]existing path\f[R] / \f[C]path preserving\f[R] policy? @@ -2461,23 +2444,23 @@ policy? The default create policy is \f[C]epmfs\f[R]. That is a path preserving algorithm. With such a policy for \f[C]mkdir\f[R] and \f[C]create\f[R] with a set -of empty drives it will select only 1 drive when the first directory is -created. +of empty filesystems it will select only 1 filesystem when the first +directory is created. Anything, files or directories, created in that first directory will be placed on the same branch because it is preserving paths. .PP This catches a lot of new users off guard but changing the default would break the setup for many existing users. If you do not care about path preservation and wish your files to be -spread across all your drives change to \f[C]mfs\f[R] or similar policy -as described above. +spread across all your filesystems change to \f[C]mfs\f[R] or similar +policy as described above. If you do want path preservation you\[cq]ll need to perform the manual -act of creating paths on the drives you want the data to land on before -transferring your data. +act of creating paths on the filesystems you want the data to land on +before transferring your data. Setting \f[C]func.mkdir=epall\f[R] can simplify managing path preservation for \f[C]create\f[R]. Or use \f[C]func.mkdir=rand\f[R] if you\[cq]re interested in just -grouping together directory content by drive. +grouping together directory content by filesystem. .SS Do hardlinks work? .PP Yes. @@ -2546,8 +2529,8 @@ of the caller. This means that if the user does not have access to a file or directory than neither will mergerfs. However, because mergerfs is creating a union of paths it may be able to -read some files and directories on one drive but not another resulting -in an incomplete set. +read some files and directories on one filesystem but not another +resulting in an incomplete set. .PP Whenever you run into a split permission issue (seeing some but not all files) try using @@ -2644,7 +2627,7 @@ features which aufs and overlayfs have. .PP UnionFS is more like aufs than mergerfs in that it offers overlay / CoW features. -If you\[cq]re just looking to create a union of drives and want +If you\[cq]re just looking to create a union of filesystems and want flexibility in file/directory placement then mergerfs offers that whereas unionfs is more for overlaying RW filesystems over RO ones. .SS Why use mergerfs over overlayfs? @@ -2664,8 +2647,9 @@ without the single point of failure. .SS Why use mergerfs over ZFS? .PP MergerFS is not intended to be a replacement for ZFS. -MergerFS is intended to provide flexible pooling of arbitrary drives -(local or remote), of arbitrary sizes, and arbitrary filesystems. +MergerFS is intended to provide flexible pooling of arbitrary +filesystems (local or remote), of arbitrary sizes, and arbitrary +filesystems. For \f[C]write once, read many\f[R] usecases such as bulk media storage. Where data integrity and backup is managed in other ways. In that situation ZFS can introduce a number of costs and limitations as @@ -2683,6 +2667,29 @@ open source is important. .PP There are a number of UnRAID users who use mergerfs as well though I\[cq]m not entirely familiar with the use case. +.SS Why use mergerfs over StableBit\[cq]s DrivePool? +.PP +DrivePool works only on Windows so not as common an alternative as other +Linux solutions. +If you want to use Windows then DrivePool is a good option. 
+Functionally the two projects work a bit differently. +DrivePool always writes to the filesystem with the most free space and +later rebalances. +mergerfs does not offer rebalance but chooses a branch at file/directory +create time. +DrivePool\[cq]s rebalancing can be done differently in any directory and +has file pattern matching to further customize the behavior. +mergerfs, not having rebalancing does not have these features, but +similar features are planned for mergerfs v3. +DrivePool has builtin file duplication which mergerfs does not natively +support (but can be done via an external script.) +.PP +There are a lot of misc differences between the two projects but most +features in DrivePool can be replicated with external tools in +combination with mergerfs. +.PP +Additionally DrivePool is a closed source commercial product vs mergerfs +a ISC licensed OSS project. .SS What should mergerfs NOT be used for? .IP \[bu] 2 databases: Even if the database stored data in separate files (mergerfs @@ -2698,7 +2705,7 @@ much latency (if it works at all). As replacement for RAID: mergerfs is just for pooling branches. If you need that kind of device performance aggregation or high availability you should stick with RAID. -.SS Can drives be written to directly? Outside of mergerfs while pooled? +.SS Can filesystems be written to directly? Outside of mergerfs while pooled? .PP Yes, however it\[cq]s not recommended to use the same file from within the pool and from without at the same time (particularly writing). @@ -2729,7 +2736,7 @@ those settings. Only one error can be returned and if one of the reasons for filtering a branch was \f[B]minfreespace\f[R] then it will be returned as such. \f[B]moveonenospc\f[R] is only relevant to writing a file which is too -large for the drive its currently on. +large for the filesystem it\[cq]s currently on. .PP It is also possible that the filesystem selected has run out of inodes. Use \f[C]df -i\f[R] to list the total and available inodes per @@ -2824,7 +2831,7 @@ Taking after \f[B]Samba\f[R], mergerfs uses \f[B]syscall(SYS_setreuid,\&...)\f[R] to set the callers credentials for that thread only. Jumping back to \f[B]root\f[R] as necessary should escalated privileges -be needed (for instance: to clone paths between drives). +be needed (for instance: to clone paths between filesystems). .PP For non-Linux systems mergerfs uses a read-write lock and changes credentials only when necessary.