Merge pull request #1342 from techie2000/patch-2

Update README.md
pull/1346/head
trapexit 5 months ago
committed by GitHub
parent
commit
40359df453
No known key found for this signature in database GPG Key ID: B5690EEEBB952194
  1. 113
      README.md

@ -464,7 +464,7 @@ covering different usecases.
32bit versions are provided as there is some software which does not
handle 64bit inodes well.
While there is a risk of hash collision in tests of a couple million
While there is a risk of hash collision in tests of a couple of million
entries there were zero collisions. Unlike a typical filesystem FUSE
filesystems can reuse inodes and not refer to the same entry. The
internal identifier used to reference a file in FUSE is different from
@ -493,7 +493,7 @@ should always be managing inode values.
### pin-threads
Simple strategies for pinning read and/or process threads. If process
threads are not enabled than the strategy simply works on the read
threads are not enabled then the strategy simply works on the read
threads. Invalid values are ignored.
* R1L: All read threads pinned to a single logical CPU.
@ -507,15 +507,15 @@ threads. Invalid values are ignored.
* RPSL: All read and process threads are spread across all logical CPUs.
* RPSP: All read and process threads are spread across all physical CPUs.
* R1PPSP: All read threads are pinned to a single physical CPU while
process threads are spread across all other phsycial CPUs.
process threads are spread across all other physical CPUs.
### fuse_msg_size
FUSE applications communicate with the kernel over a special character
device: `/dev/fuse`. A large portion of the overhead associated with
FUSE is the cost of going back and forth from user space and kernel
space over that device. Generally speaking the fewer trips needed the
FUSE is the cost of going back and forth between user space and kernel
space over that device. Generally speaking, the fewer trips needed the
better the performance will be. Reducing the number of trips can be
done a number of ways. Kernel level caching and increasing message
sizes being two significant ones. When it comes to reads and writes if
@ -527,7 +527,7 @@ max message size. Since the size is in multiples of
[pages](https://en.wikipedia.org/wiki/Page_(computer_memory)) the
feature is called `max_pages`. There is a maximum `max_pages` value of
256 (1MiB) and minimum of 1 (4KiB). The default used by Linux >=4.20,
and hardcoded value used before 4.20, is 32 (128KiB). In mergerfs its
and hardcoded value used before 4.20, is 32 (128KiB). In mergerfs it's
referred to as `fuse_msg_size` to make it clear what it impacts and
provide some abstraction.
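
The page arithmetic above can be checked with a small shell sketch. It assumes a 4KiB page size, as the figures in the text do; the helper name `msg_size_kib` is hypothetical and not part of mergerfs.

```shell
# Assumption: 4096-byte pages, matching the 4KiB figure quoted above.
page_size=4096

# Hypothetical helper: convert a max_pages value to a message size in KiB.
msg_size_kib() {
  echo $(( $1 * page_size / 1024 ))
}

msg_size_kib 1     # minimum max_pages value
msg_size_kib 32    # default max_pages value
msg_size_kib 256   # maximum max_pages value
```

On a real system `getconf PAGESIZE` reports the actual page size, which may differ from the 4KiB assumed here.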
@ -573,11 +573,11 @@ the `newpath`. The `target` value is determined by the value of
* passthrough: Return EXDEV as normal.
* rel-symlink: A relative path from the `newpath`.
* abs-base-symlink: A absolute value using the underlying branch.
* abs-pool-symlink: A absolute value using the mergerfs mount point.
* abs-base-symlink: An absolute value using the underlying branch.
* abs-pool-symlink: An absolute value using the mergerfs mount point.
NOTE: It is possible that some applications check the file they
link. In those cases it is possible it will error or complain.
link. In those cases, it is possible it will error or complain.
### rename-exdev
@ -591,7 +591,7 @@ The `target` value is determined by the value of `rename-exdev`.
* passthrough: Return EXDEV as normal.
* rel-symlink: A relative path from the `newpath`.
* abs-symlink: A absolute value using the mergerfs mount point.
* abs-symlink: An absolute value using the mergerfs mount point.
NOTE: It is possible that some applications check the file they
rename. In those cases it is possible it will error or complain.
@ -619,7 +619,7 @@ something to keep in mind.
the target of a symlink. If using this feature it will be necessary to
point any backup software to the original filesystems or configure the
software to follow symlinks if such an option is available.
Alternatively create two mounts. One for backup and one for general
Alternatively, create two mounts. One for backup and one for general
consumption.
@ -683,16 +683,16 @@ writable.
Even though it's a more niche situation this hack breaks normal
security and behavior and as such is `off` by default. If set to `git`
it will only perform the hack when the path in question includes
`/.git/`. `all` will result it applying anytime a read-only file which
`/.git/`. `all` will result in it applying anytime a read-only file which
is empty is opened for writing.
### export-support
In theory this flag should not be exposed to the end user. It is a
In theory, this flag should not be exposed to the end user. It is a
low-level FUSE flag which indicates whether or not the kernel can send
certain kinds of messages to it for the purposes of using with
NFS. mergerfs does support these messages but due bugs and quirks
certain kinds of messages to it for the purposes of using it with
NFS. mergerfs does support these messages but due to bugs and quirks
found in the kernel and mergerfs this option is provided just in case
it is needed for debugging.
@ -704,7 +704,7 @@ initiated it is not possible to change during run time.
The POSIX filesystem API is made up of a number of
functions. **creat**, **stat**, **chown**, etc. For ease of
configuration in mergerfs most of the core functions are grouped into
configuration in mergerfs, most of the core functions are grouped into
3 categories: **action**, **create**, and **search**. These functions
and categories can be assigned a policy which dictates which branch is
chosen when performing that function.
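
As a rough illustration of assigning policies to categories and to a single function, the sketch below builds an option string and prints the resulting mount command rather than running it. The branch paths and mount point are hypothetical, and the policies shown are only examples, not recommendations.

```shell
# Hypothetical branch paths and mount point; adjust to the actual system.
branches="/mnt/disk1:/mnt/disk2"
mountpoint="/mnt/pool"

# One example policy per category, plus a per-function setting.
opts="category.action=epall,category.create=mfs,category.search=ff,func.getattr=newest"

# Print the command instead of running it so the sketch is safe to try.
echo "mergerfs -o $opts $branches $mountpoint"
```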
@ -778,7 +778,7 @@ device) depending on the most recent reason for filtering a
branch. **ENOENT** will be returned if no eligible branch is found.
If **create**, **mkdir**, **mknod**, or **symlink** fail with `EROFS`
or other fundimental errors then mergerfs will mark any branch found
or other fundamental errors then mergerfs will mark any branch found
to be read-only as such (IE will set the mode `RO`) and will rerun the
policy and try again. This is mostly for `ext4` filesystems that can
suddenly become read-only when it encounters an error.
@ -933,7 +933,7 @@ work while still obeying mergerfs' policies. Below is the basic logic.
* Remove the target from all branches with no source file
* Remove the source from all branches which failed to rename
The the removals are subject to normal entitlement checks.
The removals are subject to normal entitlement checks.
The above behavior will help minimize the likelihood of EXDEV being
returned but it will still be possible.
@ -959,7 +959,7 @@ The options `statfs` and `statfs_ignore` can be used to modify
https://lkml.kernel.org/linux-fsdevel/20211024132607.1636952-1-amir73il@gmail.com/T/
By default FUSE would issue a flush before the release of a file
By default, FUSE would issue a flush before the release of a file
descriptor. This was considered a bit aggressive and a feature added
to give the FUSE server the ability to choose when that happens.
@ -1000,14 +1000,14 @@ value.
1) if no errors: return 0 (success)
2) return first error
Older version of mergerfs would return success if any success occurred
Older versions of mergerfs would return success if any success occurred
but for unlink and rmdir there are downstream assumptions that, while
not impossible to occur, can confuse some software.
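
The rule above (return 0 if nothing failed, otherwise the first error) can be illustrated with a short shell sketch. The function name `first_error` and the idea of passing per-branch results as arguments are assumptions for illustration only; this is not mergerfs' actual code.

```shell
# Hypothetical helper: each argument is the result code of the call on
# one branch (0 means success, anything else is an errno-style error).
first_error() {
  for rc in "$@"; do
    if [ "$rc" -ne 0 ]; then
      # Report the first failure encountered and stop looking.
      echo "$rc"
      return 0
    fi
  done
  # No branch reported an error.
  echo 0
}

first_error 0 0 0    # all branches succeeded
first_error 0 13 2   # the second branch failed first
```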
### others
For search functions there is always a single thing acted on and as
For search functions, there is always a single thing acted on and as
such whatever return value that comes from the single function call is
returned.
@ -1031,7 +1031,7 @@ distros are below.
Most Debian installs are of a stable branch and therefore do not have
the most up to date software. While mergerfs is available via `apt` it
is suggested that uses install the most recent version available from
is suggested that users install the most recent version available from
the [releases page](https://github.com/trapexit/mergerfs/releases).
#### prebuilt deb
@ -1052,7 +1052,7 @@ sudo apt install -y mergerfs
Most Ubuntu installs are of a stable branch and therefore do not have
the most up to date software. While mergerfs is available via `apt` it
is suggested that uses install the most recent version available from
is suggested that users install the most recent version available from
the [releases page](https://github.com/trapexit/mergerfs/releases).
#### prebuilt deb
@ -1116,7 +1116,7 @@ users: https://github.com/trapexit/mergerfs/releases
branches should be considered works in progress.
First get the code from [github](https://github.com/trapexit/mergerfs).
First, get the code from [github](https://github.com/trapexit/mergerfs).
```
$ git clone https://github.com/trapexit/mergerfs.git
@ -1343,7 +1343,7 @@ because mergerfs is no longer part of the workflow. Keep in mind that
this also means certain mergerfs features that work by interrupting
the read/write workflow, such as `moveonenospc`, will no longer work.
Also understand that this will only work on dynamically linked
Also, understand that this will only work on dynamically linked
software. Anything statically compiled will not work. Many GoLang and
Rust apps are statically compiled.
@ -1475,11 +1475,11 @@ to enable `dropcacheonclose` regardless of caching mode in order to
minimize buffer bloat.
It is difficult to balance memory usage, cache bloat & duplication,
and performance. Ideally mergerfs would be able to disable caching for
and performance. Ideally, mergerfs would be able to disable caching for
the files it reads/writes but allow page caching for itself. That
would limit the FUSE overhead. However, there isn't a good way to
achieve this. It would need to open all files with O_DIRECT which
places limitations on the what underlying filesystems would be
places limitations on what the underlying filesystems would be
supported and complicates the code.
kernel documentation: https://www.kernel.org/doc/Documentation/filesystems/fuse-io.txt
@ -1549,7 +1549,7 @@ creates because the available space won't be updated for that time.
As of version 4.20 Linux supports symlink caching. Significant
performance increases can be had in workloads which use a lot of
symlinks. Setting `cache.symlinks=true` will result in requesting
symlink caching from the kernel only if supported. As a result its
symlink caching from the kernel only if supported. As a result it's
safe to enable it on systems prior to 4.20. That said it is disabled
by default for now. You can see if caching is enabled by querying the
xattr `user.mergerfs.cache.symlinks` but given it must be requested at
@ -1690,9 +1690,9 @@ both against your normal setup, a singular branch, and with
# BENCHMARKING
Filesystems are complicated. They do many things and many of those are
interconnected. Additionally, the OS, drivers, hardware, etc. all can
interconnected. Additionally, the OS, drivers, hardware, etc. can all
impact performance. Therefore, when benchmarking, it is **necessary**
that the test focus as narrowly as possible.
that the test focuses as narrowly as possible.
For most throughput is the key benchmark. To test throughput `dd` is
useful but **must** be used with the correct settings in order to
@ -1704,7 +1704,7 @@ representative of the device's true performance.
When benchmarking through mergerfs ensure you only use 1 branch to
remove any possibility of the policies complicating the
situation. Benchmark the underlying filesystem first and then mount
mergerfs over it and test again. If you're experience speeds below
mergerfs over it and test again. If you're experiencing speeds below
your expectation you will need to narrow down precisely which
component is leading to the slowdown. Preferably test the following in
the order listed (but not combined).
@ -1735,7 +1735,7 @@ investigate further.
Sometimes the problem is really the application accessing or writing
data through mergerfs. Some software use small buffer sizes which can
lead to more requests and therefore greater overhead. You can test
this out yourself by replace `bs=1M` in the examples below with `ibs`
this out yourself by replacing `bs=1M` in the examples below with `ibs`
or `obs` and using a size of `512` instead of `1M`. In one example
test using `nullrw` the write speed dropped from 4.9GB/s to 69.7MB/s
when moving from `1M` to `512`. Similar results were had when testing
@ -1806,10 +1806,9 @@ echo 3 | sudo tee /proc/sys/vm/drop_caches
* Some policies mixed with some functions may result in strange
behaviors. Not that some of these behaviors and race conditions
couldn't happen outside **mergerfs** but that they are far more
likely to occur on account of the attempt to merge together multiple
sources of data which could be out of sync due to the different
policies.
* For consistency its generally best to set **category** wide policies
likely to occur on account of the attempt to merge multiple sources
of data which could be out of sync due to the different policies.
* For consistency it's generally best to set **category** wide policies
rather than individual **func**'s. This will help limit the
confusion of tools such as
[rsync](http://linux.die.net/man/1/rsync). However, the flexibility
@ -1908,7 +1907,7 @@ you're having troubles with and asking them to add a fallback to
regular file IO when mmap is unavailable.
If the issue is that scanning doesn't seem to pick up media then be
sure to set `func.getattr=newest` though generally a full scan will
sure to set `func.getattr=newest`, though generally, a full scan will
pick up all media anyway.
@ -1956,7 +1955,7 @@ move files around on that SMB share to fail with a IO error.
[GVFS-fuse v1.22.0](https://bugzilla.gnome.org/show_bug.cgi?id=734568)
and above fixed this issue but a large number of systems use the older
release. On Ubuntu the version can be checked by issuing `apt-cache
release. On Ubuntu, the version can be checked by issuing `apt-cache
showpkg gvfs-fuse`. Most distros released in 2015 seem to have the
updated release and will work fine but older systems may
not. Upgrading gvfs-fuse or the distro in general will address the
@ -1964,7 +1963,7 @@ problem.
In Apple's MacOSX 10.9 they replaced Samba (client and server) with
their own product. It appears their new client does not handle
**EXDEV** either and responds similar to older release of gvfs on
**EXDEV** either and responds similarly to older releases of gvfs on
Linux.
@ -2003,7 +2002,7 @@ compatible with older systems which may not have C++11
compilers. There is enough storage for 256 users' supplemental
groups. Each user is allowed up to 32 supplemental groups. Linux >=
2.6.3 allows up to 65535 groups per user but most other *nixs allow
far less. NFS allowing only 16. The system does handle overflow
far less. NFS allows only 16. The system does handle overflow
gracefully. If the user has more than 32 supplemental groups only the
first 32 will be used. If more than 256 users are using the system
when an uncached user is found it will evict an existing user's cache
@ -2052,7 +2051,7 @@ with it.
It should be noted that NFS and FUSE (the technology mergerfs uses) do
not work perfectly with one another due to certain design choices in
FUSE (and mergerfs.) Due to these issues it is generally recommended
FUSE (and mergerfs.) Due to these issues, it is generally recommended
to use SMB when possible till situations change. That said mergerfs
should generally work as an export of NFS and issues discovered should
still be reported.
@ -2072,7 +2071,7 @@ NFS export settings:
`open_by_handle_at` functions which allow a program to keep a
reference to a file without technically having it open in the typical
sense. The problem is that FUSE has no way to know that NFS has a
handle that it will later use to open the file again. As a result it
handle that it will later use to open the file again. As a result, it
is possible for the kernel to tell mergerfs to forget about the node
and should NFS ever ask for that node's details in the future it would
have nothing to respond with. Keeping nodes around forever is not
@ -2102,7 +2101,7 @@ permission and ownership issues if root squashing is enabled.
[SMB](https://en.wikipedia.org/wiki/Server_Message_Block) is a
protocol most used by Microsoft Windows systems to share file shares,
printers, etc. However, due to the popularity for Windows, it is also
printers, etc. However, due to the popularity of Windows, it is also
supported on many other platforms including Linux. The most popular
way of supporting SMB on Linux is via the software Samba.
@ -2110,7 +2109,7 @@ way of supporting SMB on Linux is via the software Samba.
ways of serving Linux filesystems, via SMB should work fine with
mergerfs. The services do not tend to use the same technologies which
NFS uses and therefore don't have the same issues. There should not be
an special settings required to use mergerfs with Samba. However,
special settings required to use mergerfs with Samba. However,
[CIFSD](https://en.wikipedia.org/wiki/CIFSD) and other programs have
not been extensively tested. If you use mergerfs with CIFSD or other
SMB servers please submit your experiences so these docs can be
@ -2251,7 +2250,7 @@ is to consolidate them back.
#### What settings should I use?
Depends on what features you want. Generally speaking there are no
Depends on what features you want. Generally speaking, there are no
"wrong" settings. All settings are performance or feature related. The
best bet is to read over the available options and choose what fits
your situation. If something isn't clear from the documentation please
@ -2285,7 +2284,7 @@ you'll need to perform the manual act of creating paths on the
filesystems you want the data to land on before transferring your
data. Setting `func.mkdir=epall` can simplify managing path
preservation for `create`. Or use `func.mkdir=rand` if you're
interested in just grouping together directory content by filesystem.
interested in just grouping directory content by filesystem.
#### Do hardlinks work?
@ -2334,7 +2333,7 @@ Keep in mind that you **MUST** consider identity when using
containers. For example: supplemental groups will be picked up from
the container unless you properly manage users and groups by sharing
relevant /etc files or by using some other means to share identity
across containers. Similarly if you use "rootless" containers and user
across containers. Similarly, if you use "rootless" containers and user
namespaces to do uid/gid translations you **MUST** consider that while
managing shared files.
@ -2387,7 +2386,7 @@ container platforms such as Docker.
#### Why use FUSE? Why not a kernel based solution?
As with any solutions to a problem there are advantages and
As with any solution to a problem, there are advantages and
disadvantages to each one.
A FUSE based solution has all the downsides of FUSE:
@ -2428,8 +2427,8 @@ permissions.
#### Why was splice support removed?
After a lot of testing over the years splicing always appeared to be
at best provide equivalent performance and in cases worse
After a lot of testing over the years, splicing always appeared to
at best, provide equivalent performance, and in some cases, worse
performance. Splice is not supported on other platforms forcing a
traditional read/write fallback to be provided. The splice code was
removed to simplify the codebase.
@ -2451,7 +2450,7 @@ removed to simplify the codebase.
#### Can filesystems be written to directly? Outside of mergerfs while pooled?
Yes, however it's not recommended to use the same file from within the
Yes, however, it's not recommended to use the same file from within the
pool and from without at the same time (particularly
writing). Especially if using caching of any kind (cache.files,
cache.entry, cache.attr, cache.negative_entry, cache.symlinks,
@ -2505,9 +2504,9 @@ You can remove the reserve by running: `tune2fs -m 0 <device>`
When file caching is enabled in any form (`cache.files!=off`) it will
issue `getxattr` requests for `security.capability` prior to *every
single write*. This will usually result in a performance degradation,
single write*. This will usually result in performance degradation,
especially when using a network filesystem (such as NFS or SMB.)
Unfortunately at this moment the kernel is not caching the response.
Unfortunately at this moment, the kernel is not caching the response.
To work around this situation mergerfs offers a few solutions.
@ -2524,7 +2523,7 @@ To work around this situation mergerfs offers a few solutions.
forwarded to mergerfs. The downside is that also means the xattr
based config and query functionality won't work either.
4. Disable file caching. If you aren't using applications which use
`mmap` it's probably simpler to just disable it all together. The
`mmap` it's probably simpler to just disable it altogether. The
kernel won't send the requests when caching is disabled.
@ -2538,7 +2537,7 @@ and if it returns **0** then it will
race condition but it doesn't handle other situations. Rather than
attempting to simulate POSIX ACL behavior the proper way to manage
this is to use [seteuid](http://linux.die.net/man/2/seteuid) and
[setegid](http://linux.die.net/man/2/setegid), in effect becoming the
[setegid](http://linux.die.net/man/2/setegid), in effect, becoming the
user making the original call, and perform the action as them. This is
what mergerfs does and why mergerfs should always run as root.
@ -2550,7 +2549,7 @@ thread only. Jumping back to **root** as necessary should escalated
privileges be needed (for instance: to clone paths between
filesystems).
For non-Linux systems mergerfs uses a read-write lock and changes
For non-Linux systems, mergerfs uses a read-write lock and changes
credentials only when necessary. If multiple threads are to be user X
then only the first one will need to change the processes
credentials. So long as the other threads need to be user X they will
@ -2667,7 +2666,7 @@ There are a lot of misc differences between the two projects but most
features in DrivePool can be replicated with external tools in
combination with mergerfs.
Additionally DrivePool is a closed source commercial product vs
Additionally, DrivePool is a closed source commercial product vs
mergerfs a ISC licensed OSS project.
