
Merge pull request #611 from trapexit/man

tweak docs
pull/615/head 2.26.0
trapexit 6 years ago
committed by GitHub
parent
commit
5e16851c79
No known key found for this signature in database GPG Key ID: 4AEE18F83AFDEB23
  1. README.md (8 lines changed)
  2. man/mergerfs.1 (26 lines changed)
  3. src/fs_statvfs_cache.cpp (7 lines changed)
  4. src/policy_cache.cpp (7 lines changed)

8
README.md

@@ -507,11 +507,11 @@ Given the relatively high cost of FUSE due to the kernel <-> userspace round tri
 #### policy caching
-Policies are run every time a function is called. These policies can be expensive depending on the setup and usage patterns. Generally we wouldn't want to cache policy results because it may result in stale responses if the underlying drives are used directly.
+Policies are run every time a function (with a policy as mentioned above) is called. These policies can be expensive depending on mergerfs' setup and client usage patterns. Generally we wouldn't want to cache policy results because it may result in stale responses if the underlying drives are used directly.
 The `open` policy cache will cache the result of an `open` policy for a particular input for `cache.open` seconds or until the file is unlinked. Each file close (release) will randomly chose to clean up the cache of expired entries.
-This cache is useful in cases like that of **Transmission** which has a "open, read/write, close" pattern (which is much more costly due to the FUSE overhead than normal.)
+This cache is really only useful in cases where you have a large number of branches and `open` is called on the same files repeatedly (like **Transmission** which opens and closes a file on every read/write presumably to keep file handle usage low).
 #### statfs caching
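The `open` policy cache behavior described in this hunk (entries valid for `cache.open` seconds, dropped on unlink, expired entries cleaned up later) can be sketched roughly as follows. This is a minimal illustration, not the code in `src/policy_cache.cpp`: the class `OpenPolicyCache`, its method names, and the explicit `now` parameter are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical sketch of a time-limited policy cache; names are illustrative
// and do not come from the mergerfs source.
class OpenPolicyCache
{
public:
  explicit OpenPolicyCache(uint64_t timeout_secs) : _timeout(timeout_secs) {}

  // Returns true and fills `branch` if a non-expired entry exists for `path`.
  bool lookup(const std::string &path, uint64_t now, std::string &branch) const
  {
    auto i = _cache.find(path);
    if(i == _cache.end() || now >= i->second.expires)
      return false;
    branch = i->second.branch;
    return true;
  }

  // Stores the policy result; it stays valid for `timeout_secs` seconds.
  void insert(const std::string &path, const std::string &branch, uint64_t now)
  {
    _cache[path] = Entry{branch, now + _timeout};
  }

  // Would be called when the file is unlinked.
  void erase(const std::string &path) { _cache.erase(path); }

  // Removes expired entries; the README says release triggers this randomly.
  void cleanup(uint64_t now)
  {
    for(auto i = _cache.begin(); i != _cache.end();)
    {
      if(now >= i->second.expires)
        i = _cache.erase(i);
      else
        ++i;
    }
  }

  std::size_t size() const { return _cache.size(); }

private:
  struct Entry { std::string branch; uint64_t expires; };

  uint64_t _timeout;
  std::map<std::string, Entry> _cache;
};
```

Passing the current time in as an argument keeps the sketch deterministic; real code would more likely read the clock itself, as the `get_time` helpers changed in this commit do.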
@@ -605,7 +605,7 @@ done
 * https://github.com/trapexit/backup-and-recovery-howtos : A set of guides / howtos on creating a data storage system, backing it up, maintaining it, and recovering from failure.
 * If you don't see some directories and files you expect in a merged point or policies seem to skip drives be sure the user has permission to all the underlying directories. Use `mergerfs.fsck` to audit the drive for out of sync permissions.
 * Do **not** use `direct_io` if you expect applications (such as rtorrent) to [mmap](http://linux.die.net/man/2/mmap) files. It is not currently supported in FUSE w/ `direct_io` enabled. Enabling `dropcacheonclose` is recommended when `direct_io` is disabled.
-* Since POSIX gives you only error or success on calls its difficult to determine the proper behavior when applying the behavior to multiple targets. **mergerfs** will return an error only if all attempts of an action fail. Any success will lead to a success returned. This means however that some odd situations may arise.
+* Since POSIX functions give only a singular error or success its difficult to determine the proper behavior when applying the function to multiple targets. **mergerfs** will return an error only if all attempts of an action fail. Any success will lead to a success returned. This means however that some odd situations may arise.
 * [Kodi](http://kodi.tv), [Plex](http://plex.tv), [Subsonic](http://subsonic.org), etc. can use directory [mtime](http://linux.die.net/man/2/stat) to more efficiently determine whether to scan for new content rather than simply performing a full scan. If using the default **getattr** policy of **ff** its possible those programs will miss an update on account of it returning the first directory found's **stat** info and its a later directory on another mount which had the **mtime** recently updated. To fix this you will want to set **func.getattr=newest**. Remember though that this is just **stat**. If the file is later **open**'ed or **unlink**'ed and the policy is different for those then a completely different file or directory could be acted on.
 * Some policies mixed with some functions may result in strange behaviors. Not that some of these behaviors and race conditions couldn't happen outside **mergerfs** but that they are far more likely to occur on account of the attempt to merge together multiple sources of data which could be out of sync due to the different policies.
 * For consistency its generally best to set **category** wide policies rather than individual **func**'s. This will help limit the confusion of tools such as [rsync](http://linux.die.net/man/1/rsync). However, the flexibility is there if needed.
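The rule in the reworded bullet above ("an error only if all attempts of an action fail") can be illustrated with a small sketch. The function name `combine_results`, the convention of `0` for success and a negative `errno` for failure, and the choice to report the first error seen are assumptions for this example; the README does not say which error mergerfs returns when every attempt fails.

```cpp
#include <cerrno>
#include <vector>

// Hypothetical helper: folds per-branch results into the single result a
// POSIX-style call can return. 0 means success, negative errno means failure.
int combine_results(const std::vector<int> &results)
{
  bool have_error = false;
  int first_error = -ENOENT; // assumption: nothing attempted counts as "not found"

  for(int rv : results)
  {
    if(rv == 0)
      return 0; // any success leads to a success returned

    if(!have_error)
    {
      first_error = rv; // assumption: report the first failure encountered
      have_error = true;
    }
  }

  return first_error; // reached only when every attempt failed
}
```

With this folding, a `chmod` that succeeds on one branch and fails with `EACCES` on another still reports success, which is one of the "odd situations" the bullet warns about.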
@@ -617,7 +617,7 @@ done
 Remember that the default policy for `getattr` is `ff`. The information for the first directory found will be returned. If it wasn't the directory which had been updated then it will appear outdated.
-The reason this is the default is because any other policy would be far more expensive and for many applications it is unnecessary. To always return the directory with the most recent mtime or a faked value based on all found would require a scan of all drives. That alone is far more expensive than `ff` but would also possibly spin up sleeping drives.
+The reason this is the default is because any other policy would be more expensive and for many applications it is unnecessary. To always return the directory with the most recent mtime or a faked value based on all found would require a scan of all drives.
 If you always want the directory information from the one with the most recent mtime then use the `newest` policy for `getattr`.
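The difference between `ff` and `newest` for `getattr` can be shown with a toy model. The struct `DirInfo` and the functions `pick_ff` and `pick_newest` are hypothetical names for this sketch, and the candidate list is assumed to be already collected; an actual `ff` search would stop at the first branch containing the path, which is why it is cheaper than a policy that must look at every branch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record: one copy of a directory on one branch.
struct DirInfo
{
  std::string branch;
  int64_t     mtime;
};

// "ff" (first found): take the first candidate, whatever its mtime.
const DirInfo *pick_ff(const std::vector<DirInfo> &found)
{
  return found.empty() ? nullptr : &found.front();
}

// "newest": examine every candidate and keep the largest mtime.
const DirInfo *pick_newest(const std::vector<DirInfo> &found)
{
  const DirInfo *best = nullptr;
  for(const DirInfo &d : found)
  {
    if(best == nullptr || d.mtime > best->mtime)
      best = &d;
  }
  return best;
}
```

If the copy on the second branch was modified most recently, `pick_ff` still reports the first branch's older mtime, which is the stale-looking result the paragraph above describes.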

26
man/mergerfs.1

@@ -1076,9 +1076,10 @@ The options for setting these are \f[C]cache.entry\f[] and
 responses to lookups (non\-existant files).
 .SS policy caching
 .PP
-Policies are run every time a function is called.
-These policies can be expensive depending on the setup and usage
-patterns.
+Policies are run every time a function (with a policy as mentioned
+above) is called.
+These policies can be expensive depending on mergerfs\[aq] setup and
+client usage patterns.
 Generally we wouldn\[aq]t want to cache policy results because it may
 result in stale responses if the underlying drives are used directly.
 .PP
@@ -1088,9 +1089,10 @@ the file is unlinked.
 Each file close (release) will randomly chose to clean up the cache of
 expired entries.
 .PP
-This cache is useful in cases like that of \f[B]Transmission\f[] which
-has a "open, read/write, close" pattern (which is much more costly due
-to the FUSE overhead than normal.)
+This cache is really only useful in cases where you have a large number
+of branches and \f[C]open\f[] is called on the same files repeatedly
+(like \f[B]Transmission\f[] which opens and closes a file on every
+read/write presumably to keep file handle usage low).
 .SS statfs caching
 .PP
 Of the syscalls used by mergerfs in policies the \f[C]statfs\f[] /
@@ -1260,9 +1262,9 @@ It is not currently supported in FUSE w/ \f[C]direct_io\f[] enabled.
 Enabling \f[C]dropcacheonclose\f[] is recommended when
 \f[C]direct_io\f[] is disabled.
 .IP \[bu] 2
-Since POSIX gives you only error or success on calls its difficult to
-determine the proper behavior when applying the behavior to multiple
-targets.
+Since POSIX functions give only a singular error or success its
+difficult to determine the proper behavior when applying the function to
+multiple targets.
 \f[B]mergerfs\f[] will return an error only if all attempts of an action
 fail.
 Any success will lead to a success returned.
@@ -1302,12 +1304,10 @@ The information for the first directory found will be returned.
 If it wasn\[aq]t the directory which had been updated then it will
 appear outdated.
 .PP
-The reason this is the default is because any other policy would be far
-more expensive and for many applications it is unnecessary.
+The reason this is the default is because any other policy would be more
+expensive and for many applications it is unnecessary.
 To always return the directory with the most recent mtime or a faked
 value based on all found would require a scan of all drives.
-That alone is far more expensive than \f[C]ff\f[] but would also
-possibly spin up sleeping drives.
 .PP
 If you always want the directory information from the one with the most
 recent mtime then use the \f[C]newest\f[] policy for \f[C]getattr\f[].

7
src/fs_statvfs_cache.cpp

@@ -25,7 +25,7 @@
 #include <pthread.h>
 #include <stdint.h>
 #include <sys/statvfs.h>
-#include <sys/time.h>
+#include <time.h>
 struct Element
 {
@@ -46,11 +46,8 @@ namespace l
 get_time(void)
 {
 uint64_t rv;
-struct timeval now;
-::gettimeofday(&now,NULL);
-rv = now.tv_sec;
+rv = ::time(NULL);
 return rv;
 }

7
src/policy_cache.cpp

@@ -5,7 +5,7 @@
 #include <string>
 #include <vector>
-#include <sys/time.h>
+#include <time.h>
 using std::map;
 using std::string;
@@ -20,11 +20,8 @@ namespace l
 get_time(void)
 {
 uint64_t rv;
-struct timeval now;
-::gettimeofday(&now,NULL);
-rv = now.tv_sec;
+rv = ::time(NULL);
 return rv;
 }
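The `get_time` change made in both `src/fs_statvfs_cache.cpp` and `src/policy_cache.cpp` swaps `gettimeofday` for `time`. Since only whole seconds were ever used, the two forms should be interchangeable for cache expiry purposes. The standalone sketch below places them side by side; the names `get_time_old` and `get_time_new` are made up for the comparison and do not appear in the source.

```cpp
#include <cstdint>
#include <sys/time.h>
#include <time.h>

// Pre-commit form: fetch seconds and microseconds, keep only the seconds.
uint64_t get_time_old(void)
{
  struct timeval now;

  ::gettimeofday(&now, NULL);

  return now.tv_sec;
}

// Post-commit form: ask for whole seconds since the Epoch directly.
uint64_t get_time_new(void)
{
  return ::time(NULL);
}
```

The two calls may read different underlying clocks on some systems, so results taken back to back can differ by a second near a second boundary. For a cache whose timeouts are expressed in whole seconds that difference should not matter.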
