
Merge pull request #689 from trapexit/readme

fix typos and update FAQ regarding policy preference
trapexit committed 5 years ago (via GitHub)
commit 8cdb7174c4
Changed files:
  1. README.md (47)
  2. man/mergerfs.1 (79)

README.md (47)

@@ -130,7 +130,7 @@ Each branch can have a suffix of `=RW` (read / write), `=RO` (read-only), or `=N
 The above line will use all mount points in /mnt prefixed with **disk** and the **cdrom**.
-To have the pool mounted at boot or otherwise accessable from related tools use **/etc/fstab**.
+To have the pool mounted at boot or otherwise accessible from related tools use **/etc/fstab**.
 ```
 # <file system> <mount point> <type> <options> <dump> <pass>
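For reference, a complete fstab entry of the sort this hunk describes might look like the following (pool name, branch glob, and option choices are illustrative, not prescriptive):

```
# <file system>        <mount point>  <type>         <options>             <dump> <pass>
/mnt/disk*:/mnt/cdrom  /mnt/pool      fuse.mergerfs  defaults,allow_other  0      0
```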
@@ -144,7 +144,7 @@ To have the pool mounted at boot or otherwise accessable from related tools use
 ### fuse_msg_size
-FUSE applications communicate with the kernel over a special character device: `/dev/fuse`. A large portion of the overhead associated with FUSE is the cost of going back and forth from user space and kernel space over that device. Generally speaking the fewer trips needed the better the performance will be. Reducing the number of trips can be done a number of ways. Kernel level caching and increasing message sizes being two significant ones. When it comes to reads and writes if the message size is doubled the number of trips are appoximately halved.
+FUSE applications communicate with the kernel over a special character device: `/dev/fuse`. A large portion of the overhead associated with FUSE is the cost of going back and forth from user space and kernel space over that device. Generally speaking the fewer trips needed the better the performance will be. Reducing the number of trips can be done a number of ways. Kernel level caching and increasing message sizes being two significant ones. When it comes to reads and writes if the message size is doubled the number of trips are approximately halved.
 In Linux 4.20 a new feature was added allowing the negotiation of the max message size. Since the size is in multiples of [pages](https://en.wikipedia.org/wiki/Page_(computer_memory)) the feature is called `max_pages`. There is a maximum `max_pages` value of 256 (1MiB) and minimum of 1 (4KiB). The default used by Linux >=4.20, and hardcoded value used before 4.20, is 32 (128KiB). In mergerfs its referred to as `fuse_msg_size` to make it clear what it impacts and provide some abstraction.
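The page arithmetic behind those sizes is easy to verify, assuming the typical 4KiB page size the paragraph above implies:

```shell
# max_pages is expressed in pages; on typical systems a page is 4096 bytes.
page=4096
echo "min     ($((1)) page):    $((1   * page)) bytes"   # 4KiB
echo "default ($((32)) pages):  $((32  * page)) bytes"   # 128KiB
echo "max     ($((256)) pages): $((256 * page)) bytes"   # 1MiB
```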
@@ -153,7 +153,7 @@ Since there should be no downsides to increasing `fuse_msg_size` / `max_pages`,
 ### symlinkify
-Due to the levels of indirection introduced by mergerfs and the underlying technology FUSE there can be varying levels of performance degredation. This feature will turn non-directories which are not writable into symlinks to the original file found by the `readlink` policy after the mtime and ctime are older than the timeout.
+Due to the levels of indirection introduced by mergerfs and the underlying technology FUSE there can be varying levels of performance degradation. This feature will turn non-directories which are not writable into symlinks to the original file found by the `readlink` policy after the mtime and ctime are older than the timeout.
 **WARNING:** The current implementation has a known issue in which if the file is open and being used when the file is converted to a symlink then the application which has that file open will receive an error when using it. This is unlikely to occur in practice but is something to keep in mind.
@@ -162,7 +162,7 @@ Due to the levels of indirection introduced by mergerfs and the underlying techn
 ### nullrw
-Due to how FUSE works there is an overhead to all requests made to a FUSE filesystem. Meaning that even a simple passthrough will have some slowdown. However, generally the overhead is minimal in comparison to the cost of the underlying I/O. By disabling the underlying I/O we can test the theoretical performance boundries.
+Due to how FUSE works there is an overhead to all requests made to a FUSE filesystem. Meaning that even a simple passthrough will have some slowdown. However, generally the overhead is minimal in comparison to the cost of the underlying I/O. By disabling the underlying I/O we can test the theoretical performance boundaries.
 By enabling `nullrw` mergerfs will work as it always does **except** that all reads and writes will be no-ops. A write will succeed (the size of the write will be returned as if it were successful) but mergerfs does nothing with the data it was given. Similarly a read will return the size requested but won't touch the buffer.
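A rough way to probe the ceiling `nullrw` exposes is a plain `dd` run against the mount. The sketch below uses a temp directory as a stand-in so it runs anywhere; point `POOL` at a mergerfs mount with `nullrw` enabled to measure pure FUSE round-trip overhead instead of real I/O:

```shell
# POOL is a placeholder path, not a real mergerfs mount in this sketch.
POOL="${POOL:-/tmp/nullrw-demo}"
mkdir -p "$POOL"
# Sequential write then read; dd reports throughput on stderr.
dd if=/dev/zero of="$POOL/bench" bs=1M count=8 conv=fsync
dd if="$POOL/bench" of=/dev/null bs=1M
rm -f "$POOL/bench"
```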
@@ -296,7 +296,7 @@ The plan is to rewrite mergerfs to use the low level API so these invasive libfu
 `rename` and `link` are tricky functions in a union filesystem. `rename` only works within a single filesystem or device. If a rename can't be done atomically due to the source and destination paths existing on different mount points it will return **-1** with **errno = EXDEV** (cross device). So if a `rename`'s source and target are on different drives within the pool it creates an issue.
-Originally mergerfs would return EXDEV whenever a rename was requested which was cross directory in any way. This made the code simple and was technically complient with POSIX requirements. However, many applications fail to handle EXDEV at all and treat it as a normal error or otherwise handle it poorly. Such apps include: gvfsd-fuse v1.20.3 and prior, Finder / CIFS/SMB client in Apple OSX 10.9+, NZBGet, Samba's recycling bin feature.
+Originally mergerfs would return EXDEV whenever a rename was requested which was cross directory in any way. This made the code simple and was technically compliant with POSIX requirements. However, many applications fail to handle EXDEV at all and treat it as a normal error or otherwise handle it poorly. Such apps include: gvfsd-fuse v1.20.3 and prior, Finder / CIFS/SMB client in Apple OSX 10.9+, NZBGet, Samba's recycling bin feature.
 As a result a compromise was made in order to get most software to work while still obeying mergerfs' policies. Below is the basic logic.
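The userspace fallback that well-behaved tools such as mv(1) perform when rename(2) fails with EXDEV can be sketched in shell (this illustrates the general copy-then-unlink pattern, not mergerfs' internal logic; the paths are throwaway temp files, and on a single filesystem the plain rename succeeds so the fallback branch is never taken):

```shell
# Try an atomic rename first; on EXDEV fall back to copy + delete.
src=$(mktemp); dst=$(mktemp -u)
echo "some data" > "$src"
if ! mv "$src" "$dst" 2>/dev/null; then
  # rename failed (e.g. cross-device): copy preserving attributes, then unlink
  cp -p "$src" "$dst" && rm -f "$src"
fi
cat "$dst"        # the file's contents survive either path
rm -f "$dst"
```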
@@ -545,12 +545,12 @@ If most files are read once through and closed (like media) it is best to enable
 It is difficult to balance memory usage, cache bloat & duplication, and performance. Ideally mergerfs would be able to disable caching for the files it reads/writes but allow page caching for itself. That would limit the FUSE overhead. However, there isn't a good way to achieve this. It would need to open all files with O_DIRECT which places limitations on the what underlying filesystems would be supported and complicates the code.
-kernel documenation: https://www.kernel.org/doc/Documentation/filesystems/fuse-io.txt
+kernel documentation: https://www.kernel.org/doc/Documentation/filesystems/fuse-io.txt
 #### entry & attribute caching
-Given the relatively high cost of FUSE due to the kernel <-> userspace round trips there are kernel side caches for file entries and attributes. The entry cache limits the `lookup` calls to mergerfs which ask if a file exists. The attribute cache limits the need to make `getattr` calls to mergerfs which provide file attributes (mode, size, type, etc.). As with the page cache these should not be used if the underlying filesystems are being manipulated at the same time as it could lead to odd behavior or data corruption. The options for setting these are `cache.entry` and `cache.negative_entry` for the entry cache and `cache.attr` for the attributes cache. `cache.negative_entry` refers to the timeout for negative responses to lookups (non-existant files).
+Given the relatively high cost of FUSE due to the kernel <-> userspace round trips there are kernel side caches for file entries and attributes. The entry cache limits the `lookup` calls to mergerfs which ask if a file exists. The attribute cache limits the need to make `getattr` calls to mergerfs which provide file attributes (mode, size, type, etc.). As with the page cache these should not be used if the underlying filesystems are being manipulated at the same time as it could lead to odd behavior or data corruption. The options for setting these are `cache.entry` and `cache.negative_entry` for the entry cache and `cache.attr` for the attributes cache. `cache.negative_entry` refers to the timeout for negative responses to lookups (non-existent files).
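Putting those option names together, an fstab entry enabling short entry/attribute timeouts might look like this (timeout values are illustrative; only use them when the branches are not modified outside of mergerfs, per the warning above):

```
# timeouts are in seconds; 0 disables that cache
/mnt/disk*  /mnt/pool  fuse.mergerfs  defaults,allow_other,cache.entry=1,cache.attr=1,cache.negative_entry=0  0 0
```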
 #### policy caching
@@ -576,7 +576,7 @@ As of version 4.20 Linux supports symlink caching. Significant performance incre
 #### readdir caching
-As of version 4.20 Linux supports readdir caching. This can have a significant impact on directory traversal. Especially when combined with entry (`cache.entry`) and attribute (`cache.attr`) caching. Setting `cache.readdir=true` will result in requesting readdir caching from the kernel on each `opendir`. If the kernel doesn't support readdir caching setting the option to `true` has no effect. This option is configuarable at runtime via xattr `user.mergerfs.cache.readdir`.
+As of version 4.20 Linux supports readdir caching. This can have a significant impact on directory traversal. Especially when combined with entry (`cache.entry`) and attribute (`cache.attr`) caching. Setting `cache.readdir=true` will result in requesting readdir caching from the kernel on each `opendir`. If the kernel doesn't support readdir caching setting the option to `true` has no effect. This option is configurable at runtime via xattr `user.mergerfs.cache.readdir`.
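Toggling that xattr at runtime would look something like the sketch below (the control-file path `<pool>/.mergerfs` and the use of setfattr/getfattr are assumptions based on the xattr name above; the snippet is guarded so it is a no-op on a system without a mergerfs mount):

```shell
# Hypothetical runtime toggle of readdir caching via mergerfs' control file.
POOL="${POOL:-/mnt/pool}"
if [ -e "$POOL/.mergerfs" ]; then
  setfattr -n user.mergerfs.cache.readdir -v true "$POOL/.mergerfs"
  getfattr -n user.mergerfs.cache.readdir "$POOL/.mergerfs"
else
  echo "no mergerfs control file at $POOL/.mergerfs"
fi
```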
 #### writeback caching
@@ -684,7 +684,7 @@ If you always want the directory information from the one with the most recent m
 This is not a bug.
-Run in verbose mode to better undertand what's happening:
+Run in verbose mode to better understand what's happening:
 ```
 $ mv -v /mnt/pool/foo /mnt/disk1/foo
@@ -718,7 +718,7 @@ Try enabling the `use_ino` option. Some have reported that it fixes the issue.
 #### rtorrent fails with ENODEV (No such device)
-Be sure to set `cache.files=partial|full|auto-full` or turn off `direct_io`. rtorrent and some other applications use [mmap](http://linux.die.net/man/2/mmap) to read and write to files and offer no failback to traditional methods. FUSE does not currently support mmap while using `direct_io`. There may be a performance penalty on writes with `direct_io` off as well as the problem of double caching but it's the only way to get such applications to work. If the performance loss is too high for other apps you can mount mergerfs twice. Once with `direct_io` enabled and one without it. Be sure to set `dropcacheonclose=true` if not using `direct_io`.
+Be sure to set `cache.files=partial|full|auto-full` or turn off `direct_io`. rtorrent and some other applications use [mmap](http://linux.die.net/man/2/mmap) to read and write to files and offer no fallback to traditional methods. FUSE does not currently support mmap while using `direct_io`. There may be a performance penalty on writes with `direct_io` off as well as the problem of double caching but it's the only way to get such applications to work. If the performance loss is too high for other apps you can mount mergerfs twice. Once with `direct_io` enabled and one without it. Be sure to set `dropcacheonclose=true` if not using `direct_io`.
 #### rtorrent fails with files >= 4GiB
@@ -767,7 +767,7 @@ In Apple's MacOSX 10.9 they replaced Samba (client and server) with their own pr
 #### Trashing files occasionally fails
-This is the same issue as with Samba. `rename` returns `EXDEV` (in our case that will really only happen with path preserving policies like `epmfs`) and the software doesn't handle the situtation well. This is unfortunately a common failure of software which moves files around. The standard indicates that an implementation `MAY` choose to support non-user home directory trashing of files (which is a `MUST`). The implementation `MAY` also support "top directory trashes" which many probably do.
+This is the same issue as with Samba. `rename` returns `EXDEV` (in our case that will really only happen with path preserving policies like `epmfs`) and the software doesn't handle the situation well. This is unfortunately a common failure of software which moves files around. The standard indicates that an implementation `MAY` choose to support non-user home directory trashing of files (which is a `MUST`). The implementation `MAY` also support "top directory trashes" which many probably do.
 To create a `$topdir/.Trash` directory as defined in the standard use the [mergerfs-tools](https://github.com/trapexit/mergerfs-tools) tool `mergerfs.mktrash`.
@@ -796,7 +796,7 @@ In order to fix this please install newer versions of libfuse. If using a Debian
 There seems to be an issue with Linux version `4.9.0` and above in which an invalid message appears to be transmitted to libfuse (used by mergerfs) causing it to exit. No messages will be printed in any logs as its not a proper crash. Debugging of the issue is still ongoing and can be followed via the [fuse-devel thread](https://sourceforge.net/p/fuse/mailman/message/35662577).
-#### mergerfs under heavy load and memory preasure leads to kernel panic
+#### mergerfs under heavy load and memory pressure leads to kernel panic
 https://lkml.org/lkml/2016/9/14/527
@@ -845,7 +845,7 @@ NOTE: This is only relevant to mergerfs versions at or below v2.25.x and should
 Not *really* a bug. The FUSE library will move files when asked to delete them as a way to deal with certain edge cases and then later delete that file when its clear the file is no longer needed. This however can lead to two issues. One is that these hidden files are noticed by `rm -rf` or `find` when scanning directories and they may try to remove them and they might have disappeared already. There is nothing *wrong* about this happening but it can be annoying. The second issue is that a directory might not be able to removed on account of the hidden file being still there.
-Using the **hard_remove** option will make it so these temporary files are not used and files are deleted immedately. That has a side effect however. Files which are unlinked and then they are still used (in certain forms) will result in an error (ENOENT).
+Using the **hard_remove** option will make it so these temporary files are not used and files are deleted immediately. That has a side effect however. Files which are unlinked and then they are still used (in certain forms) will result in an error (ENOENT).
 # FAQ
@@ -867,6 +867,17 @@ MergerFS is **not** a traditional filesystem. MergerFS is **not** RAID. It does
 See the previous question's answer.
+#### What policies should I use?
+Unless you're doing something more niche the average user is probably best off using `mfs` for `category.create`. It will spread files out across your branches based on available space. You may want to use `lus` if you prefer a slightly different distribution of data if you have a mix of smaller and larger drives. Generally though `mfs`, `lus`, or even `rand` are good for the general use case. If you are starting with an imbalanced pool you can use the tool **mergerfs.balance** to redistribute files across the pool.
+If you really wish to try to colocate files based on directory you can set `func.create` to `epmfs` or similar and `func.mkdir` to `rand` or `eprand` depending on if you just want to colocate generally or on specific branches. Either way the *need* to colocate is rare. For instance: if you wish to remove the drive regularly and want the data to predictably be on that drive or if you don't use backup at all and don't wish to replace that data piecemeal. In which case using path preservation can help but will require some manual attention. Colocating after the fact can be accomplished using the **mergerfs.consolidate** tool.
+Ultimately there is no correct answer. It is a preference or based on some particular need. mergerfs is very easy to test and experiment with. I suggest creating a test setup and experimenting to get a sense of what you want.
+The reason `mfs` is not the default `category.create` policy is historical. When/if a 3.X gets released it will be changed to minimize confusion people often have with path preserving policies.
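Applying the recommendation in the added FAQ entry would amount to a mount line along these lines (paths and remaining options are illustrative):

```
# create new files on the branch with the most free space
/mnt/disk*  /mnt/pool  fuse.mergerfs  defaults,allow_other,category.create=mfs  0 0
```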
 #### Do hard links work?
 Yes. You need to use `use_ino` to support proper reporting of inodes.
@@ -892,12 +903,12 @@ If using a network filesystem such as NFS, SMB, CIFS (Samba) be sure to pay clos
 Are you using a path preserving policy? The default policy for file creation is `epmfs`. That means only the drives with the path preexisting will be considered when creating a file. If you don't care about where files and directories are created you likely shouldn't be using a path preserving policy and instead something like `mfs`.
-This can be especially apparent when filling an empty pool from an external source. If you do want path preservation you'll need to perform the manual act of creating paths on the drives you want the data to land on before transfering your data. Setting `func.mkdir=epall` can simplify managing path perservation for `create`.
+This can be especially apparent when filling an empty pool from an external source. If you do want path preservation you'll need to perform the manual act of creating paths on the drives you want the data to land on before transferring your data. Setting `func.mkdir=epall` can simplify managing path preservation for `create`.
 #### Why was libfuse embedded into mergerfs?
-1. A significant number of users use mergerfs on distros with old versions of libfuse which have serious bugs. Requiring updated versions of libfuse on those distros isn't pratical (no package offered, user inexperience, etc.). The only practical way to provide a stable runtime on those systems was to "vendor" / embed the library into the project.
+1. A significant number of users use mergerfs on distros with old versions of libfuse which have serious bugs. Requiring updated versions of libfuse on those distros isn't practical (no package offered, user inexperience, etc.). The only practical way to provide a stable runtime on those systems was to "vendor" / embed the library into the project.
 2. mergerfs was written to use the high level API. There are a number of limitations in the HLAPI that make certain features difficult or impossible to implement. While some of these features could be patched into newer versions of libfuse without breaking the public API some of them would require hacky code to provide backwards compatibility. While it may still be worth working with upstream to address these issues in future versions, since the library needs to be vendored for stability and compatibility reasons it is preferable / easier to modify the API. Longer term the plan is to rewrite mergerfs to use the low level API.
@@ -933,14 +944,14 @@ UnionFS is more like aufs than mergerfs in that it offers overlay / CoW features
 #### Why use mergerfs over LVM/ZFS/BTRFS/RAID0 drive concatenation / striping?
-With simple JBOD / drive concatenation / stripping / RAID0 a single drive failure will result in full pool failure. mergerfs performs a similar behavior without the possibility of catastrophic failure and the difficulties in recovery. Drives may fail however all other data will continue to be accessable.
+With simple JBOD / drive concatenation / stripping / RAID0 a single drive failure will result in full pool failure. mergerfs performs a similar behavior without the possibility of catastrophic failure and the difficulties in recovery. Drives may fail however all other data will continue to be accessible.
-When combined with something like [SnapRaid](http://www.snapraid.it) and/or an offsite backup solution you can have the flexibilty of JBOD without the single point of failure.
+When combined with something like [SnapRaid](http://www.snapraid.it) and/or an offsite backup solution you can have the flexibility of JBOD without the single point of failure.
 #### Why use mergerfs over ZFS?
-MergerFS is not intended to be a replacement for ZFS. MergerFS is intended to provide flexible pooling of arbitrary drives (local or remote), of arbitrary sizes, and arbitrary filesystems. For `write once, read many` usecases such as bulk media storage. Where data integrity and backup is managed in other ways. In that situation ZFS can introduce major maintance and cost burdens as described [here](http://louwrentius.com/the-hidden-cost-of-using-zfs-for-your-home-nas.html).
+MergerFS is not intended to be a replacement for ZFS. MergerFS is intended to provide flexible pooling of arbitrary drives (local or remote), of arbitrary sizes, and arbitrary filesystems. For `write once, read many` usecases such as bulk media storage. Where data integrity and backup is managed in other ways. In that situation ZFS can introduce major maintenance and cost burdens as described [here](http://louwrentius.com/the-hidden-cost-of-using-zfs-for-your-home-nas.html).
 #### Can drives be written to directly? Outside of mergerfs while pooled?

man/mergerfs.1 (79)

@@ -315,7 +315,7 @@ can\[aq]t create but you can change / delete).
 The above line will use all mount points in /mnt prefixed with
 \f[B]disk\f[] and the \f[B]cdrom\f[].
 .PP
-To have the pool mounted at boot or otherwise accessable from related
+To have the pool mounted at boot or otherwise accessible from related
 tools use \f[B]/etc/fstab\f[].
 .IP
 .nf
@@ -345,7 +345,7 @@ Reducing the number of trips can be done a number of ways.
 Kernel level caching and increasing message sizes being two significant
 ones.
 When it comes to reads and writes if the message size is doubled the
-number of trips are appoximately halved.
+number of trips are approximately halved.
 .PP
 In Linux 4.20 a new feature was added allowing the negotiation of the
 max message size.
@@ -370,7 +370,7 @@ See the \f[C]nullrw\f[] section for benchmarking examples.
 .PP
 Due to the levels of indirection introduced by mergerfs and the
 underlying technology FUSE there can be varying levels of performance
-degredation.
+degradation.
 This feature will turn non\-directories which are not writable into
 symlinks to the original file found by the \f[C]readlink\f[] policy
 after the mtime and ctime are older than the timeout.
@@ -396,7 +396,7 @@ Meaning that even a simple passthrough will have some slowdown.
 However, generally the overhead is minimal in comparison to the cost of
 the underlying I/O.
 By disabling the underlying I/O we can test the theoretical performance
-boundries.
+boundaries.
 .PP
 By enabling \f[C]nullrw\f[] mergerfs will work as it always does
 \f[B]except\f[] that all reads and writes will be no\-ops.
@@ -755,7 +755,7 @@ within the pool it creates an issue.
 .PP
 Originally mergerfs would return EXDEV whenever a rename was requested
 which was cross directory in any way.
-This made the code simple and was technically complient with POSIX
+This made the code simple and was technically compliant with POSIX
 requirements.
 However, many applications fail to handle EXDEV at all and treat it as a
 normal error or otherwise handle it poorly.
@@ -1180,7 +1180,7 @@ It would need to open all files with O_DIRECT which places limitations
 on the what underlying filesystems would be supported and complicates
 the code.
 .PP
-kernel documenation:
+kernel documentation:
 https://www.kernel.org/doc/Documentation/filesystems/fuse\-io.txt
 .SS entry & attribute caching
 .PP
@@ -1198,7 +1198,7 @@ The options for setting these are \f[C]cache.entry\f[] and
 \f[C]cache.negative_entry\f[] for the entry cache and
 \f[C]cache.attr\f[] for the attributes cache.
 \f[C]cache.negative_entry\f[] refers to the timeout for negative
-responses to lookups (non\-existant files).
+responses to lookups (non\-existent files).
 .SS policy caching
 .PP
 Policies are run every time a function (with a policy as mentioned
@@ -1254,7 +1254,7 @@ Setting \f[C]cache.readdir=true\f[] will result in requesting readdir
 caching from the kernel on each \f[C]opendir\f[].
 If the kernel doesn\[aq]t support readdir caching setting the option to
 \f[C]true\f[] has no effect.
-This option is configuarable at runtime via xattr
+This option is configurable at runtime via xattr
 \f[C]user.mergerfs.cache.readdir\f[].
 .SS writeback caching
 .PP
@@ -1262,8 +1262,8 @@ writeback caching is a technique for improving write speeds by batching
 writes at a faster device and then bulk writing to the slower device.
 With FUSE the kernel will wait for a number of writes to be made and
 then send it to the filesystem as one request.
-mergerfs currently uses a modified and vendored libfuse 2.9.7 which does
-not support writeback caching.
+mergerfs currently uses a modified and vendored libfuse 2.9.7 which
+does not support writeback caching.
 Adding said feature should not be difficult but benchmarking needs to be
 done to see if what effect it will have.
 .SS tiered caching
recent mtime then use the \f[C]newest\f[] policy for \f[C]getattr\f[].
.PP
This is not a bug.
.PP
Run in verbose mode to better understand what\[aq]s happening:
.IP
.nf
\f[C]
Be sure to set \f[C]cache.files=partial|full|auto\-full\f[] or turn off
\f[C]direct_io\f[].
rtorrent and some other applications use
mmap (http://linux.die.net/man/2/mmap) to read and write to files and
offer no fallback to traditional methods.
FUSE does not currently support mmap while using \f[C]direct_io\f[].
There may be a performance penalty on writes with \f[C]direct_io\f[] off
as well as the problem of double caching but it\[aq]s the only way to
responds similar to older releases of gvfs on Linux.
This is the same issue as with Samba.
\f[C]rename\f[] returns \f[C]EXDEV\f[] (in our case that will really
only happen with path preserving policies like \f[C]epmfs\f[]) and the
software doesn\[aq]t handle the situation well.
This is unfortunately a common failure of software which moves files
around.
The standard indicates that an implementation \f[C]MAY\f[] choose to
No messages will be printed in any logs as it\[aq]s not a proper crash.
Debugging of the issue is still ongoing and can be followed via the
fuse\-devel
thread (https://sourceforge.net/p/fuse/mailman/message/35662577).
.SS mergerfs under heavy load and memory pressure leads to kernel panic
.PP
https://lkml.org/lkml/2016/9/14/527
.IP
The second issue is that a directory might not be able to be removed on
account of the hidden file still being there.
.PP
Using the \f[B]hard_remove\f[] option will make it so these temporary
files are not used and files are deleted immediately.
That has a side effect however.
Files which are unlinked and then still used (in certain forms)
will result in an error (ENOENT).
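.PP
As a sketch (the branch and mount paths here are hypothetical), the
option is passed like any other mount option:
.IP
.nf
\f[C]
mergerfs -o use_ino,hard_remove /mnt/disk0:/mnt/disk1 /mnt/pool
\f[]
.fi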
It merely shards some \f[B]behavior\f[] and aggregates others.
.SS Can mergerfs be removed without affecting the data?
.PP
See the previous question\[aq]s answer.
.SS What policies should I use?
.PP
Unless you\[aq]re doing something niche the average user is probably
best off using \f[C]mfs\f[] for \f[C]category.create\f[].
It will spread files out across your branches based on available space.
You may want to use \f[C]lus\f[] if you prefer a slightly different
distribution of data when you have a mix of smaller and larger drives.
Generally though \f[C]mfs\f[], \f[C]lus\f[], or even \f[C]rand\f[] are
good for the general use case.
If you are starting with an imbalanced pool you can use the tool
\f[B]mergerfs.balance\f[] to redistribute files across the pool.
.PP
If you really wish to try to colocate files based on directory you can
set \f[C]func.create\f[] to \f[C]epmfs\f[] or similar and
\f[C]func.mkdir\f[] to \f[C]rand\f[] or \f[C]eprand\f[] depending on
whether you want to colocate generally or only on specific branches.
Either way the \f[I]need\f[] to colocate is rare.
For instance: if you wish to remove a drive regularly and want the data
to predictably be on that drive or if you don\[aq]t use backups at all
and don\[aq]t wish to replace lost data piecemeal.
In those cases path preservation can help but will require some manual
attention.
Colocating after the fact can be accomplished using the
\f[B]mergerfs.consolidate\f[] tool.
.PP
Ultimately there is no correct answer.
It is a preference or based on some particular need.
mergerfs is very easy to test and experiment with.
I suggest creating a test setup and experimenting to get a sense of what
you want.
.PP
The reason \f[C]mfs\f[] is not the default \f[C]category.create\f[]
policy is historical.
When/if a 3.X gets released it will be changed to minimize confusion
people often have with path preserving policies.
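.PP
As a concrete sketch (the pool path and branch glob are hypothetical),
an \f[C]fstab\f[] entry spreading new files by available space might
look like:
.IP
.nf
\f[C]
/mnt/disk* /mnt/pool fuse.mergerfs allow_other,category.create=mfs 0 0
\f[]
.fi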
.SS Do hard links work?
.PP
Yes.
This can be especially apparent when filling an empty pool from an
external source.
If you do want path preservation you\[aq]ll need to perform the manual
act of creating paths on the drives you want the data to land on before
transferring your data.
Setting \f[C]func.mkdir=epall\f[] can simplify managing path
preservation for \f[C]create\f[].
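.PP
For instance, a path preserving setup (branch and mount paths here are
hypothetical) might combine the two policies:
.IP
.nf
\f[C]
mergerfs -o func.create=epmfs,func.mkdir=epall /mnt/disk0:/mnt/disk1 /mnt/pool
\f[]
.fi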
.SS Why was libfuse embedded into mergerfs?
.IP "1." 3
A significant number of users use mergerfs on distros with old versions
of libfuse which have serious bugs.
Requiring updated versions of libfuse on those distros isn\[aq]t
practical (no package offered, user inexperience, etc.).
The only practical way to provide a stable runtime on those systems was
to "vendor" / embed the library into the project.
.IP "2." 3
With simple JBOD / drive concatenation / striping / RAID0 a single
drive failure will result in full pool failure.
mergerfs performs a similar behavior without the possibility of
catastrophic failure and the difficulties in recovery.
Drives may fail however all other data will continue to be accessible.
.PP
When combined with something like SnapRaid (http://www.snapraid.it)
and/or an offsite backup solution you can have the flexibility of JBOD
without the single point of failure.
.SS Why use mergerfs over ZFS?
.PP
MergerFS is intended to provide flexible pooling of arbitrary drives
For \f[C]write\ once,\ read\ many\f[] usecases such as bulk media
storage.
Where data integrity and backup is managed in other ways.
In that situation ZFS can introduce major maintenance and cost burdens
as described
here (http://louwrentius.com/the-hidden-cost-of-using-zfs-for-your-home-nas.html).
.SS Can drives be written to directly? Outside of mergerfs while pooled?
.PP
