diff --git a/mkdocs/docs/config/cache.md b/mkdocs/docs/config/cache.md
index 59a84a2a..cbc32e42 100644
--- a/mkdocs/docs/config/cache.md
+++ b/mkdocs/docs/config/cache.md
@@ -11,7 +11,8 @@ works for mergerfs itself. Not the underlying filesystems.
 * `cache.files=full`: Enables page caching. Files are cached across
   opens.
 * `cache.files=auto-full`: Enables page caching. Files are cached
-  across opens if mtime and size are unchanged since previous open.
+  across opens if mtime and size are unchanged since previous
+  open. Cache is dropped if mtime or size change on open.
 * `cache.files=per-process`: Enable page caching (equivalent to
   `cache.files=partial`) only for processes whose 'comm' name matches
   one of the values defined in cache.files.process-names. If the name
@@ -40,6 +41,11 @@ transparently enable page caching when mmap is requested. This means
 it should be safe to set `cache.files=off`. However, on Linux v6.5 and
 below you will need to configure `cache.files` as you need.
 
+If [passthrough](passthrough.md) is enabled so must be page
+caching. mergerfs will set `cache.files=auto-full` if `passthrough` is
+enabled. And when using `passthrough` there is no double page
+caching since it is in fact passing through the IO.
+
 [^1]: This is not unique to mergerfs and affects all FUSE
     filesystems. It is something that the FUSE community hopes to
diff --git a/mkdocs/docs/config/passthrough.md b/mkdocs/docs/config/passthrough.md
index 4d2af3a3..4e027bc3 100644
--- a/mkdocs/docs/config/passthrough.md
+++ b/mkdocs/docs/config/passthrough.md
@@ -3,10 +3,10 @@
 * default: `off`
 * arguments:
     * `off`: Passthrough is never enabled.
-    * `ro`: Only enable passthrough when file opened for reading only.
-    * `wo`: Only enable passthrough when file opened for writing only.
-    * `rw`: Enable passthrough when file opened for reading, writing,
-      or both.
+    * `ro`: Only enable IO passthrough when file opened for reading only.
+    * `wo`: Only enable IO passthrough when file opened for writing only.
+    * `rw`: Enable IO passthrough when file opened for reading, writing,
+      or both.
 
 In [Linux 6.9](https://kernelnewbies.org/Linux_6.9#Faster_FUSE_I.2FO)
 a IO passthrough feature was added to FUSE. Typically `mergerfs` has
@@ -57,7 +57,7 @@ file opened. However, at the moment there is no use case for picking
 and choosing which to enable outside `cache.files=per-process` (which
 is largely unnecessary on Linux v6.6 and above. See
 [direct-io-allow-mmap](options.md)) If such a use case arises please
-reach out to the author to discuss.
+[reach out to the author](../support.md) to discuss.
 
 Unlike [preload.so](../tooling.md#preloadso), `passthrough` will work
 for any software interacting with `mergerfs`. However, `passthrough`
@@ -67,10 +67,18 @@ requires Linux v6.9 or above to work.
 `root` as currently only `root` is allowed to leverage the kernel
 feature.
 
-**NOTE:** If a file has been opened and passthrough enabled, while that
-file is open, if another open request is made `mergerfs` must also
-enable `passthrough` for the second open request. This is a limitation
-of how the passthrough feature works.
+**NOTE:** If a file has been opened and `passthrough` enabled, while
+that file is open, if another open request is made `mergerfs` must
+also enable `passthrough` for the second open request. This is a
+limitation of how the passthrough feature works. Though there is no
+known use case where this is useful.
+
+**NOTE:** In order to add the `passthrough` feature to `mergerfs` it was
+necessary to remove the "feature" where mergerfs could open the same
+file on different branches. Such as using `func.open=rand` and having
+multiple files at the same relative path across different
+branches. This "feature" was very very rarely used and it was
+impossible to support `passthrough` without changing the behavior.
 
 ## Alternatives
 
diff --git a/mkdocs/docs/extended_usage_patterns.md b/mkdocs/docs/extended_usage_patterns.md
index 63399de6..353e5c1a 100644
--- a/mkdocs/docs/extended_usage_patterns.md
+++ b/mkdocs/docs/extended_usage_patterns.md
@@ -14,11 +14,11 @@ bottlenecked by their network, internet connection, or limited size of
 the cache. However, there are a few situations where a tiered cache
 setup could help.
 
-1. Fast network, slow filesystems, many readers: You've a 10+Gbps
+1. Fast network, slow filesystems, many readers: You've a 10Gbps+
    network with many readers and your regular filesystems can't keep
    up.
 2. Fast network, slow filesystems, small'ish bursty writes: You have
-   a 10+Gbps network and wish to transfer amounts of data less than
+   a 10Gbps+ network and wish to transfer amounts of data less than
    your cache filesystem but wish to do so quickly and the time
    between bursts is long enough to migrate data.
 
@@ -27,8 +27,8 @@ level that can aggregate performance or using higher performance
 storage would probably be the better solution. If you're going to use
 mergerfs there are other tactics that may help: spreading the data
 across filesystems (see the mergerfs.dup tool) and setting
-`func.open=rand`, using `symlinkify`, or using dm-cache or a similar
-technology to add tiered cache to the underlying device itself.
+`func.open=rand` or using dm-cache or a similar technology to add
+tiered cache to the underlying device itself.
 
 With #2 one could use a block cache solution as available via LVM and
 dm-cache but there is another solution which requires only mergerfs, a
@@ -52,12 +52,19 @@ script to move files around, and a cron job to run said script.
 * Set your programs to use the **cache** pool.
 * Configure the **base** pool with the `create` policy you would like
   to lay out files as you like.
-* Save one of the below scripts or create your own. The script's
+* Use monstermuffin's
+  [mergerfs-cache-mover](https://github.com/monstermuffin/mergerfs-cache-mover),
+  one of the scripts below, or create your own. The script's
   responsibility is to move files from the **cache** branches (not
   pool) to the **base** pool.
 * Use `cron` (as root) to schedule the command at whatever frequency
   is appropriate for your workflow.
 
+**NOTE:** Due to the additional overhead it is not recommended to nest
+or otherwise create hierarchies of mergerfs pools. It will work but
+the latency increases will further harm performance. Even when using
+passthrough IO or other features.
+
 
 ### time based expiring
 
diff --git a/mkdocs/docs/intro_to_filesystems.md b/mkdocs/docs/intro_to_filesystems.md
index 3db5a5f3..873dc10f 100644
--- a/mkdocs/docs/intro_to_filesystems.md
+++ b/mkdocs/docs/intro_to_filesystems.md
@@ -51,6 +51,8 @@ those needing that knowledge.
 * [file descriptor](https://en.wikipedia.org/wiki/File_descriptor): A
   handle used by software, provided by the operating system, to
   reference open files.
+* [mmap](https://en.wikipedia.org/wiki/Mmap): A way to abstract access
+  to a file by making it appear as a region of memory.
 
 
 ## Files
diff --git a/mkdocs/docs/quickstart.md b/mkdocs/docs/quickstart.md
index fa0fc162..1254c352 100644
--- a/mkdocs/docs/quickstart.md
+++ b/mkdocs/docs/quickstart.md
@@ -30,10 +30,11 @@ caching](config/cache.md) was disabled (ie: `cache.files=off`).
 However, it now will enable page caching if needed for a particular
 file if `mmap` is requested.
 
-`mmap` is needed by certain software to read and write to a
-file. However, many software could work without it and fail to have
-proper error handling. Many programs that use sqlite3 will require
-`mmap` despite [sqlite3 working perfectly
+[mmap](https://en.wikipedia.org/wiki/Mmap) is needed by certain
+software to read and write to a file. However, much software could
+work without it and fail to have proper error handling for when it is
+unavailable. Many programs that use **sqlite3** will require `mmap`
+despite [sqlite3 working perfectly
 fine](known_issues_bugs.md#sqlite3-plex-jellyfin-do-not-work-with-mergerfs)
 without it (and in some cases can be more performant with regular file
 IO.)
diff --git a/tools/mergerfs.percent-full-mover b/tools/mergerfs.percent-full-mover
index 27b8dd11..c8688ee6 100755
--- a/tools/mergerfs.percent-full-mover
+++ b/tools/mergerfs.percent-full-mover
@@ -7,13 +7,13 @@ fi
 
 CACHEFS="${1}"
 BASEPOOL="${2}"
-PERCENTAGE=${3}
+PERCENTAGE="${3}"
 
 set -o errexit
-while [ $(df "${CACHE}" | tail -n1 | awk '{print $5}' | cut -d'%' -f1) -gt ${PERCENTAGE} ]
+while [ $(df "${CACHEFS}" | tail -n1 | awk '{print $5}' | cut -d'%' -f1) -gt ${PERCENTAGE} ]
 do
     # Find the file with the oldest access time
-    FILE=$(find "${CACHE}" -type f -printf '%A@ %P\n' | \
+    FILE=$(find "${CACHEFS}" -type f -printf '%A@ %P\n' | \
                   sort | \
                   head -n 1 | \
                   cut -d' ' -f2-)
@@ -32,6 +32,6 @@ do
           --remove-source-files \
           --relative \
           --log-file=/tmp/mergerfs-cache-rsync.log \
-          "${CACHE}/./${FILE}" \
-          "${BACKING}/"
+          "${CACHEFS}/./${FILE}" \
+          "${BASEPOOL}/"
 done