Merge pull request #633 from trapexit/cache.files

add file caching across opens and runtime control
pull/635/head
trapexit 6 years ago
committed by GitHub
parent
commit
c8e178bf71
No known key found for this signature in database GPG Key ID: 4AEE18F83AFDEB23
  1. README.md (93 changes)
  2. libfuse/Makefile (2 changes)
  3. libfuse/include/fuse_common.h (4 changes)
  4. libfuse/lib/fuse.c (180 changes)
  5. man/mergerfs.1 (240 changes)
  6. src/config.cpp (72 changes)
  7. src/config.hpp (31 changes)
  8. src/fuse_create.cpp (29 changes)
  9. src/fuse_getxattr.cpp (10 changes)
  10. src/fuse_listxattr.cpp (1 change)
  11. src/fuse_open.cpp (29 changes)
  12. src/fuse_setxattr.cpp (22 changes)
  13. src/option_parser.cpp (78 changes)

93
README.md

@ -66,34 +66,40 @@ mergerfs does **not** support the copy-on-write (CoW) behavior found in **aufs**
### mount options
* **allow_other**: a libfuse option which allows users besides the one which ran mergerfs to see the filesystem. This is required for most use-cases.
* **direct_io**: causes FUSE to bypass caching which can increase write speeds at the detriment of reads. Note that not enabling `direct_io` will cause double caching of files and therefore less memory for caching generally (enable **dropcacheonclose** to help with this problem). However, `mmap` does not work when `direct_io` is enabled.
* **minfreespace=value**: the minimum space value used for creation policies. Understands 'K', 'M', and 'G' to represent kilobyte, megabyte, and gigabyte respectively. (default: 4G)
* **moveonenospc=true|false**: when enabled (set to **true**) if a **write** fails with **ENOSPC** or **EDQUOT** a scan of all drives will be done looking for the drive with the most free space which is at least the size of the file plus the amount which failed to write. An attempt to move the file to that drive will occur (keeping all metadata possible) and if successful the original is unlinked and the write retried. (default: false)
* **use_ino**: causes mergerfs to supply file/directory inodes rather than libfuse. While not a default it is recommended it be enabled so that linked files share the same inode value.
* **dropcacheonclose=true|false**: when a file is requested to be closed call `posix_fadvise` on it first to instruct the kernel that we no longer need the data and it can drop its cache. Recommended when **direct_io** is not enabled to limit double caching. (default: false)
* **symlinkify=true|false**: when enabled (set to **true**) and a file is not writable and its mtime or ctime is older than **symlinkify_timeout** files will be reported as symlinks to the original files. Please read more below before using. (default: false)
* **symlinkify_timeout=value**: time to wait, in seconds, to activate the **symlinkify** behavior. (default: 3600)
* **nullrw=true|false**: turns reads and writes into no-ops. The request will succeed but do nothing. Useful for benchmarking mergerfs. (default: false)
* **ignorepponrename=true|false**: ignore path preserving on rename. Typically rename and link act differently depending on the policy of `create` (read below). Enabling this will cause rename and link to always use the non-path preserving behavior. This means files, when renamed or linked, will stay on the same drive. (default: false)
* **allow_other**: A libfuse option which allows users besides the one which ran mergerfs to see the filesystem. This is required for most use-cases.
* **minfreespace=value**: The minimum space value used for creation policies. Understands 'K', 'M', and 'G' to represent kilobyte, megabyte, and gigabyte respectively. (default: 4G)
* **moveonenospc=true|false**: When enabled if a **write** fails with **ENOSPC** or **EDQUOT** a scan of all drives will be done looking for the drive with the most free space which is at least the size of the file plus the amount which failed to write. An attempt to move the file to that drive will occur (keeping all metadata possible) and if successful the original is unlinked and the write retried. (default: false)
* **use_ino**: Causes mergerfs to supply file/directory inodes rather than libfuse. While not a default it is recommended it be enabled so that linked files share the same inode value.
* **dropcacheonclose=true|false**: When a file is requested to be closed call `posix_fadvise` on it first to instruct the kernel that we no longer need the data and it can drop its cache. Recommended when **cache.files=partial|full|auto-full** to limit double caching. (default: false)
* **symlinkify=true|false**: When enabled and a file is not writable and its mtime or ctime is older than **symlinkify_timeout** files will be reported as symlinks to the original files. Please read more below before using. (default: false)
* **symlinkify_timeout=value**: Time to wait, in seconds, to activate the **symlinkify** behavior. (default: 3600)
* **nullrw=true|false**: Turns reads and writes into no-ops. The request will succeed but do nothing. Useful for benchmarking mergerfs. (default: false)
* **ignorepponrename=true|false**: Ignore path preserving on rename. Typically rename and link act differently depending on the policy of `create` (read below). Enabling this will cause rename and link to always use the non-path preserving behavior. This means files, when renamed or linked, will stay on the same drive. (default: false)
* **security_capability=true|false**: If false return ENOATTR when xattr security.capability is queried. (default: true)
* **xattr=passthrough|noattr|nosys**: Runtime control of xattrs. Default is to passthrough xattr requests. 'noattr' will short circuit as if nothing exists. 'nosys' will respond with ENOSYS as if xattrs are not supported or disabled. (default: passthrough)
* **link_cow=true|false**: When enabled if a regular file is opened which has a link count > 1 it will copy the file to a temporary file and rename over the original. Breaking the link and providing a basic copy-on-write function similar to cow-shell. (default: false)
* **statfs=base|full**: Controls how statfs works. 'base' means it will always use all branches in statfs calculations. 'full' is in effect path preserving and only includes drives where the path exists. (default: base)
* **statfs_ignore=none|ro|nc**: 'ro' will cause statfs calculations to ignore available space for branches mounted or tagged as 'read-only' or 'no create'. 'nc' will ignore available space for branches tagged as 'no create'. (default: none)
* **posix_acl=true|false:** enable POSIX ACL support (if supported by kernel and underlying filesystem). (default: false)
* **posix_acl=true|false:** Enable POSIX ACL support (if supported by kernel and underlying filesystem). (default: false)
* **async_read=true|false:** Perform reads asynchronously. If disabled or unavailable the kernel will ensure there is at most one pending read request per file handle and will attempt to order requests by offset. (default: true)
* **threads=num**: number of threads to use in multithreaded mode. When set to zero (the default) it will attempt to discover and use the number of logical cores. If the lookup fails it will fall back to using 4. If the thread count is set negative it will look up the number of cores then divide by the absolute value. ie. threads=-2 on an 8 core machine will result in 8 / 2 = 4 threads. There will always be at least 1 thread. NOTE: higher number of threads increases parallelism but usually decreases throughput. (default: number of cores) *NOTE2:* the option is unavailable when built with system libfuse.
* **fsname=name**: sets the name of the filesystem as seen in **mount**, **df**, etc. Defaults to a list of the source paths concatenated together with the longest common prefix removed.
* **func.<func>=<policy>**: sets the specific FUSE function's policy. See below for the list of value types. Example: **func.getattr=newest**
* **threads=num**: Number of threads to use in multithreaded mode. When set to zero it will attempt to discover and use the number of logical cores. If the lookup fails it will fall back to using 4. If the thread count is set negative it will look up the number of cores then divide by the absolute value. ie. threads=-2 on an 8 core machine will result in 8 / 2 = 4 threads. There will always be at least 1 thread. NOTE: higher number of threads increases parallelism but usually decreases throughput. (default: 0)
* **fsname=name**: Sets the name of the filesystem as seen in **mount**, **df**, etc. Defaults to a list of the source paths concatenated together with the longest common prefix removed.
* **func.<func>=<policy>**: Sets the specific FUSE function's policy. See below for the list of value types. Example: **func.getattr=newest**
* **category.<category>=<policy>**: Sets policy of all FUSE functions in the provided category. Example: **category.create=mfs**
* **cache.open=<int>**: 'open' policy cache timeout in seconds. (default: 0)
* **cache.statfs=<int>**: 'statfs' cache timeout in seconds. (default: 0)
* **cache.attr=<int>**: file attribute cache timeout in seconds. (default: 1)
* **cache.entry=<int>**: file name lookup cache timeout in seconds. (default: 1)
* **cache.negative_entry=<int>**: negative file name lookup cache timeout in seconds. (default: 0)
* **cache.symlinks=<bool>**: cache symlinks (if supported by kernel) (default: false)
* **cache.readdir=<bool>**: cache readdir (if supported by kernel) (default: false)
* **cache.attr=<int>**: File attribute cache timeout in seconds. (default: 1)
* **cache.entry=<int>**: File name lookup cache timeout in seconds. (default: 1)
* **cache.negative_entry=<int>**: Negative file name lookup cache timeout in seconds. (default: 0)
* **cache.files=libfuse|off|partial|full|auto-full**: File page caching mode (default: libfuse)
* **cache.symlinks=<bool>**: Cache symlinks (if supported by kernel) (default: false)
* **cache.readdir=<bool>**: Cache readdir (if supported by kernel) (default: false)
* **direct_io**: deprecated - Bypass page cache. Use `cache.files=off` instead. (default: false)
* **kernel_cache**: deprecated - Do not invalidate data cache on file open. Use `cache.files=full` instead. (default: false)
* **auto_cache**: deprecated - Invalidate data cache if file mtime or size change. Use `cache.files=auto-full` instead. (default: false)
* **async_read**: deprecated - Perform reads asynchronously. Use `async_read=true` instead.
* **sync_read**: deprecated - Perform reads synchronously. Use `async_read=false` instead.
**NOTE:** Options are evaluated in the order listed so if the options are **func.rmdir=rand,category.action=ff** the **action** category setting will override the **rmdir** setting.
@ -347,7 +353,7 @@ $ su -
#### Generically
Have git, g++, make, python, automake, libtool installed.
Have git, g++, make, python installed.
```
$ cd mergerfs
@ -499,11 +505,27 @@ A B C
#### page caching
The kernel performs caching of data pages on all files not opened with `O_DIRECT`. Due to mergerfs using FUSE and therefore being a userland process the kernel can double cache the content being read through mergerfs. Once from the underlying filesystem and once for mergerfs. Using `direct_io` and/or `dropcacheonclose` help minimize the double caching. `direct_io` will instruct the kernel to bypass the page cache for files opened through mergerfs. `dropcacheonclose` will cause mergerfs to instruct the kernel to flush a file's page cache for which it had opened when closed. If most data is read once its probably best to enable both (read above for details and limitations).
https://en.wikipedia.org/wiki/Page_cache
tl;dr:
* cache.files=off: Disables page caching. Underlying files cached, mergerfs files are not.
* cache.files=partial: Enables page caching. Underlying files cached, mergerfs files cached while open.
* cache.files=full: Enables page caching. Underlying files cached, mergerfs files cached across opens.
* cache.files=auto-full: Enables page caching. Underlying files cached, mergerfs files cached across opens if mtime and size are unchanged since previous open.
* cache.files=libfuse: follow traditional libfuse `direct_io`, 'kernel_cache`, and `auto_cache` arguments.
FUSE, which mergerfs uses, offers a number of page caching modes. mergerfs tries to simplify their use via the `cache.files` option. It can and should replace usage of `direct_io`, `kernel_cache`, and `auto_cache`.
Due to mergerfs using FUSE and therefore being a userland process proxying existing filesystems the kernel will double cache the content being read and written through mergerfs. Once from the underlying filesystem and once from mergerfs (it sees them as two separate entities). Using `cache.files=off` will keep the double caching from happening by disabling caching of mergerfs but this has the side effect that *all* read and write calls will be passed to mergerfs which may be slower than enabling caching, you lose shared `mmap` support which can affect apps such as rtorrent, and no read-ahead will take place. The kernel will still cache the underlying filesystem data but that only helps so much given mergerfs will still process all requests.
If a cache is desired for mergerfs do not enable `direct_io` and instead possibly use `auto_cache` or `kernel_cache`. By default FUSE will invalidate cached pages when a file is opened. By using `auto_cache` it will instead use `getattr` to check if a file has changed when the file is opened and if so will flush the cache. `ac_attr_timeout` is the timeout for keeping said cache. Alternatively `kernel_cache` will keep the cache across opens unless invalidated through other means. You should only uses these if you do not plan to write/modify the same files through mergerfs and the underlying filesystem at the same time. It could lead to corruption. Then again doing so without caching can also cause issues.
If you do enable file page caching, `cache.files=partial|full|auto-full`, you should also enable `dropcacheonclose` which will cause mergerfs to instruct the kernel to flush the underlying file's page cache when the file is closed. This behavior is the same as the rsync fadvise / drop cache patch and Feh's nocache project.
It's a difficult balance between memory usage, cache bloat & duplication, and performance. Ideally mergerfs would be able to disable caching for the files it reads/writes but allow page caching for itself. That would limit the FUSE overhead. However, there isn't good way to achieve this.
If most files are read once through and closed (like media) it is best to enable `dropcacheonclose` regardless of caching mode in order to minimize buffer bloat.
It is difficult to balance memory usage, cache bloat & duplication, and performance. Ideally mergerfs would be able to disable caching for the files it reads/writes but allow page caching for itself. That would limit the FUSE overhead. However, there isn't a good way to achieve this. It would need to open all files with O_DIRECT which places limitations on the what underlying filesystems would be supported and complicates the code.
kernel documenation: https://www.kernel.org/doc/Documentation/filesystems/fuse-io.txt
#### entry & attribute caching
@ -534,7 +556,7 @@ As of version 4.20 Linux supports symlink caching. Significant performance incre
#### readdir caching
As of version 4.20 Linux supports readdir caching. This can have a significant impact on directory traversal. Especially when combined with entry (`cache.entry`) and attribute ('cache.attr') caching. Setting `cache.readdir=true` will result in requesting readdir caching from the kernel on each `opendir`. If the kernel doesn't support readdir caching setting the option to `true` has no effect. This option is configuarable at runtime via xattr `user.mergerfs.cache.readdir`.
As of version 4.20 Linux supports readdir caching. This can have a significant impact on directory traversal. Especially when combined with entry (`cache.entry`) and attribute (`cache.attr`) caching. Setting `cache.readdir=true` will result in requesting readdir caching from the kernel on each `opendir`. If the kernel doesn't support readdir caching setting the option to `true` has no effect. This option is configuarable at runtime via xattr `user.mergerfs.cache.readdir`.
#### writeback caching
@ -620,7 +642,7 @@ done
* Run mergerfs as `root` (with **allow_other**) unless you're merging paths which are owned by the same user otherwise strange permission issues may arise.
* https://github.com/trapexit/backup-and-recovery-howtos : A set of guides / howtos on creating a data storage system, backing it up, maintaining it, and recovering from failure.
* If you don't see some directories and files you expect in a merged point or policies seem to skip drives be sure the user has permission to all the underlying directories. Use `mergerfs.fsck` to audit the drive for out of sync permissions.
* Do **not** use `direct_io` if you expect applications (such as rtorrent) to [mmap](http://linux.die.net/man/2/mmap) files. It is not currently supported in FUSE w/ `direct_io` enabled. Enabling `dropcacheonclose` is recommended when `direct_io` is disabled.
* Do **not** use `cache.files=off` or `direct_io` if you expect applications (such as rtorrent) to [mmap](http://linux.die.net/man/2/mmap) files. Shared mmap is not currently supported in FUSE w/ `direct_io` enabled. Enabling `dropcacheonclose` is recommended when `cache.files=partial|full|auto-full` or `direct_io=false`.
* Since POSIX functions give only a singular error or success its difficult to determine the proper behavior when applying the function to multiple targets. **mergerfs** will return an error only if all attempts of an action fail. Any success will lead to a success returned. This means however that some odd situations may arise.
* [Kodi](http://kodi.tv), [Plex](http://plex.tv), [Subsonic](http://subsonic.org), etc. can use directory [mtime](http://linux.die.net/man/2/stat) to more efficiently determine whether to scan for new content rather than simply performing a full scan. If using the default **getattr** policy of **ff** its possible those programs will miss an update on account of it returning the first directory found's **stat** info and its a later directory on another mount which had the **mtime** recently updated. To fix this you will want to set **func.getattr=newest**. Remember though that this is just **stat**. If the file is later **open**'ed or **unlink**'ed and the policy is different for those then a completely different file or directory could be acted on.
* Some policies mixed with some functions may result in strange behaviors. Not that some of these behaviors and race conditions couldn't happen outside **mergerfs** but that they are far more likely to occur on account of the attempt to merge together multiple sources of data which could be out of sync due to the different policies.
@ -659,11 +681,7 @@ If you want to move files to one drive just copy them there and use mergerfs.ded
#### cached memory appears greater than it should be
Use the `direct_io` option as described above. Due to what mergerfs is doing there ends up being two caches of a file under normal usage. One from the underlying filesystem and one from mergerfs. Enabling `direct_io` removes the mergerfs cache. This saves on memory but means the kernel needs to communicate with mergerfs more often and can therefore result in slower speeds.
Since enabling `direct_io` disables `mmap` this is not an ideal situation however write speeds should be increased.
If `direct_io` is disabled it is probably a good idea to enable `dropcacheonclose` to minimize double caching.
Use `cache.files=off` or `direct_io=true`. See the section on page caching.
#### NFS clients returning ESTALE / Stale file handle
@ -680,7 +698,7 @@ Try enabling the `use_ino` option. Some have reported that it fixes the issue.
#### rtorrent fails with ENODEV (No such device)
Be sure to turn off `direct_io`. rtorrent and some other applications use [mmap](http://linux.die.net/man/2/mmap) to read and write to files and offer no failback to traditional methods. FUSE does not currently support mmap while using `direct_io`. There may be a performance penalty on writes with `direct_io` off as well as the problem of double caching but it's the only way to get such applications to work. If the performance loss is too high for other apps you can mount mergerfs twice. Once with `direct_io` enabled and one without it. Be sure to set `dropcacheonclose=true` if not using `direct_io`.
Be sure to set `cache.files=partial|full|auto-full` or turn off `direct_io`. rtorrent and some other applications use [mmap](http://linux.die.net/man/2/mmap) to read and write to files and offer no failback to traditional methods. FUSE does not currently support mmap while using `direct_io`. There may be a performance penalty on writes with `direct_io` off as well as the problem of double caching but it's the only way to get such applications to work. If the performance loss is too high for other apps you can mount mergerfs twice. Once with `direct_io` enabled and one without it. Be sure to set `dropcacheonclose=true` if not using `direct_io`.
#### rtorrent fails with files >= 4GiB
@ -828,7 +846,7 @@ See the previous question's answer.
Yes. You need to use `use_ino` to support proper reporting of inodes.
What mergerfs does not do is fake hard links across branches. Read the section "rename & link" for how it.
What mergerfs does not do is fake hard links across branches. Read the section "rename & link" for how it works.
#### Does mergerfs support CoW / copy-on-write?
@ -902,7 +920,7 @@ MergerFS is not intended to be a replacement for ZFS. MergerFS is intended to pr
#### Can drives be written to directly? Outside of mergerfs while pooled?
Yes, however its not recommended to use the same file from within the pool and from without at the same time. Especially if using caching of any kind (cache.entry, cache.attr, ac_attr_timeout, cache.negative_entry, cache.symlinks, auto_cache, kernel_cache).
Yes, however its not recommended to use the same file from within the pool and from without at the same time. Especially if using caching of any kind (cache.files, cache.entry, cache.attr, cache.negative_entry, cache.symlinks, cache.readdir, etc.).
#### Why do I get an "out of space" / "no space left on device" / ENOSPC error even though there appears to be lots of space available?
@ -954,7 +972,7 @@ and the kernel use internally (also called the "nodeid").
#### I notice massive slowdowns of writes over NFS
Due to how NFS works and interacts with FUSE when not using `direct_io` its possible that a getxattr for `security.capability` will be issued prior to any write. This will usually result in a massive slowdown for writes. Using `direct_io` will keep this from happening (and generally good to enable unless you need the features it disables) but the `security_capability` option can also help by short circuiting the call and returning `ENOATTR`.
Due to how NFS works and interacts with FUSE when not using `cache.files=off` or `direct_io` its possible that a getxattr for `security.capability` will be issued prior to any write. This will usually result in a massive slowdown for writes. Using `cache.files=off` or `direct_io` will keep this from happening (and generally good to enable unless you need the features it disables) but the `security_capability` option can also help by short circuiting the call and returning `ENOATTR`.
You could also set `xattr` to `noattr` or `nosys` to short circuit or stop all xattr requests.
@ -981,15 +999,14 @@ For non-Linux systems mergerfs uses a read-write lock and changes credentials on
NOTE: be sure to read about these features before changing them
* enable (or disable) `direct_io`
* enable (or disable) `auto_cache`
* enable (or disable) `kernel_cache`
* enable (or disable) `splice_move`, `splice_read`, and `splice_write`
* increase cache timeouts `cache.attr`, `cache.entry`, `cache.negative_entry`
* enable (or disable) page caching (`cache.files`)
* enable `cache.open`
* enable `cache.statfs`
* enable `cache.symlinks`
* change the number opf worker threads
* enable `cache.readdir`
* change the number of worker threads
* disable `security_capability` and/or `xattr`
* disable `posix_acl`
* disable `async_read`
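
Reviewer's sketch (not part of the commit): the README text above says `cache.files` replaces `direct_io`, `kernel_cache`, and `auto_cache`. A minimal illustration of that mapping follows, assuming each mode reduces to one per-open flag. `struct cache_flags` and `cache_files_to_flags` are hypothetical names; the real translation presumably lives in `src/fuse_open.cpp` and `src/fuse_create.cpp`, which are not shown in this part of the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the three caching bits of fuse_file_info. */
struct cache_flags
{
  uint32_t direct_io  : 1; /* bypass the page cache entirely */
  uint32_t keep_cache : 1; /* keep cached pages across opens */
  uint32_t auto_cache : 1; /* keep pages only if mtime and size are unchanged */
};

/*
 * Translate a cache.files value into per-open flags.
 * Returns 0 when the flags were set, 1 for "libfuse" (defer to the
 * legacy direct_io / kernel_cache / auto_cache arguments), and -1 for
 * an unknown value.
 */
static
int
cache_files_to_flags(const char         *mode_,
                     struct cache_flags *flags_)
{
  assert((mode_ != NULL) && (flags_ != NULL));

  memset(flags_,0,sizeof(*flags_));

  if(strcmp(mode_,"off") == 0)
    {
      flags_->direct_io = 1;
    }
  else if(strcmp(mode_,"partial") == 0)
    {
      /* cached while open only: no flag is set */
    }
  else if(strcmp(mode_,"full") == 0)
    {
      flags_->keep_cache = 1;
    }
  else if(strcmp(mode_,"auto-full") == 0)
    {
      flags_->auto_cache = 1;
    }
  else if(strcmp(mode_,"libfuse") == 0)
    {
      return 1;
    }
  else
    {
      return -1;
    }

  return 0;
}
```

Under this reading `partial` is simply the FUSE default (pages are invalidated on open), which is why it sets nothing.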

2
libfuse/Makefile

@ -1,4 +1,4 @@
VERSION = "2.9.7-mergerfs_2.27.0"
VERSION = "2.9.7-mergerfs_2.28.0"
OPT = -O2
ifeq ($(DEBUG),1)

4
libfuse/include/fuse_common.h

@ -78,8 +78,10 @@ fuse_file_info
/* Requests the kernel to cache entries returned by readdir */
uint32_t cache_readdir : 1;
uint32_t auto_cache : 1;
/** Padding. Do not use*/
uint32_t padding : 25;
uint32_t padding : 24;
/** File handle. May be filled in by filesystem in open().
Available in all other file operations */
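
Reviewer's sketch (not part of the commit): the hunk above adds a one-bit `auto_cache` flag and shrinks `padding` from 25 to 24 bits, so the flag word keeps its size. The structs below are hypothetical; `other_flags : 6` stands in for the flags declared before this hunk, on the assumption that the whole word is 32 bits wide, which the hunk itself does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout before the change. */
struct flags_before
{
  uint32_t other_flags   : 6;  /* assumed: flags declared above the hunk */
  uint32_t cache_readdir : 1;
  uint32_t padding       : 25;
};

/* Hypothetical layout after the change: one new bit, one less padding bit. */
struct flags_after
{
  uint32_t other_flags   : 6;  /* assumed: flags declared above the hunk */
  uint32_t cache_readdir : 1;
  uint32_t auto_cache    : 1;
  uint32_t padding       : 24;
};

/* The new flag is paid for out of padding, so the size does not change. */
static_assert(sizeof(struct flags_after) == sizeof(struct flags_before),
              "adding auto_cache must not grow the flag word");
```

Bit-field layout is implementation-defined, so the check holds for a given compiler and ABI rather than in general.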

180
libfuse/lib/fuse.c

@ -59,19 +59,14 @@ struct fuse_config {
double entry_timeout;
double negative_timeout;
double attr_timeout;
double ac_attr_timeout;
int ac_attr_timeout_set;
int remember;
int nopath;
int debug;
int hard_remove; /* not used */
int use_ino;
int readdir_ino;
int set_mode;
int set_uid;
int set_gid;
int kernel_cache;
int auto_cache;
int intr;
int intr_signal;
int help;
@ -85,11 +80,6 @@ struct fuse_fs {
int debug;
};
struct fusemod_so {
void *handle;
int ctr;
};
struct lock_queue_element {
struct lock_queue_element *next;
pthread_cond_t cond;
@ -162,7 +152,8 @@ struct lock {
struct lock *next;
};
struct node {
struct node
{
struct node *name_next;
struct node *id_next;
fuse_ino_t nodeid;
@ -172,14 +163,12 @@ struct node {
char *name;
uint64_t nlookup;
int open_count;
struct timespec stat_updated;
struct timespec mtime;
off_t size;
struct lock *locks;
uint64_t hidden_fh;
char is_hidden;
char cache_valid;
int treelock;
struct stat stat_cache;
char stat_cache_valid;
char inline_name[32];
};
@ -2315,12 +2304,6 @@ node_open(const struct node *node_)
(node_->open_count > 0));
}
static int mtime_eq(const struct stat *stbuf, const struct timespec *ts)
{
return stbuf->st_mtime == ts->tv_sec &&
ST_MTIM_NSEC(stbuf) == ts->tv_nsec;
}
#ifndef CLOCK_MONOTONIC
#define CLOCK_MONOTONIC CLOCK_REALTIME
#endif
@ -2339,15 +2322,21 @@ static void curr_time(struct timespec *now)
}
}
static void update_stat(struct node *node, const struct stat *stbuf)
static
void
update_stat(struct node *node_,
const struct stat *stnew_)
{
if (node->cache_valid && (!mtime_eq(stbuf, &node->mtime) ||
stbuf->st_size != node->size))
node->cache_valid = 0;
node->mtime.tv_sec = stbuf->st_mtime;
node->mtime.tv_nsec = ST_MTIM_NSEC(stbuf);
node->size = stbuf->st_size;
curr_time(&node->stat_updated);
struct stat *stold;
stold = &node_->stat_cache;
if((node_->stat_cache_valid) &&
((stold->st_mtim.tv_sec != stnew_->st_mtim.tv_sec) ||
(stold->st_mtim.tv_nsec != stnew_->st_mtim.tv_nsec) ||
(stold->st_size != stnew_->st_size)))
node_->stat_cache_valid = 0;
*stold = *stnew_;
}
static int lookup_path(struct fuse *f, fuse_ino_t nodeid,
@ -2372,11 +2361,9 @@ static int lookup_path(struct fuse *f, fuse_ino_t nodeid,
e->generation = node->generation;
e->entry_timeout = f->conf.entry_timeout;
e->attr_timeout = f->conf.attr_timeout;
if (f->conf.auto_cache) {
pthread_mutex_lock(&f->lock);
update_stat(node, &e->attr);
pthread_mutex_unlock(&f->lock);
}
set_stat(f, e->ino, &e->attr);
if (f->conf.debug)
fprintf(stderr, " NODEID: %lu\n",
@ -2645,7 +2632,6 @@ static void fuse_lib_getattr(fuse_req_t req, fuse_ino_t ino,
if (!err) {
pthread_mutex_lock(&f->lock);
node = get_node(f, ino);
if (f->conf.auto_cache)
update_stat(node, &buf);
pthread_mutex_unlock(&f->lock);
set_stat(f, ino, &buf);
@ -2781,11 +2767,9 @@ static void fuse_lib_setattr(fuse_req_t req, fuse_ino_t ino, struct stat *attr,
}
if (!err) {
if (f->conf.auto_cache) {
pthread_mutex_lock(&f->lock);
update_stat(get_node(f, ino), &buf);
pthread_mutex_unlock(&f->lock);
}
set_stat(f, ino, &buf);
fuse_reply_attr(req, &buf, f->conf.attr_timeout);
} else {
@ -3076,40 +3060,47 @@ static void fuse_do_release(struct fuse *f, fuse_ino_t ino, const char *path,
fuse_fs_free_hide(f->fs,fh);
}
static void fuse_lib_create(fuse_req_t req, fuse_ino_t parent,
const char *name, mode_t mode,
static
void
fuse_lib_create(fuse_req_t req,
fuse_ino_t parent,
const char *name,
mode_t mode,
struct fuse_file_info *fi)
{
struct fuse *f = req_fuse_prepare(req);
int err;
char *path;
struct fuse *f;
struct fuse_intr_data d;
struct fuse_entry_param e;
char *path;
int err;
f = req_fuse_prepare(req);
err = get_path_name(f, parent, name, &path);
if (!err) {
if(!err)
{
fuse_prepare_interrupt(f, req, &d);
err = fuse_fs_create(f->fs, path, mode, fi);
if (!err) {
if(!err)
{
err = lookup_path(f, parent, name, path, &e, fi);
if (err)
if(err)
{
fuse_fs_release(f->fs, path, fi);
else if (!S_ISREG(e.attr.st_mode)) {
}
else if(!S_ISREG(e.attr.st_mode))
{
err = -EIO;
fuse_fs_release(f->fs, path, fi);
forget_node(f, e.ino, 1);
} else {
if (f->conf.kernel_cache)
fi->keep_cache = 1;
}
}
fuse_finish_interrupt(f, req, &d);
}
if (!err) {
if(!err)
{
pthread_mutex_lock(&f->lock);
struct node *n = get_node(f,e.ino);
n->open_count++;
get_node(f,e.ino)->open_count++;
pthread_mutex_unlock(&f->lock);
if (fuse_reply_create(req, &e, fi) == -ENOENT) {
@ -3118,7 +3109,9 @@ static void fuse_lib_create(fuse_req_t req, fuse_ino_t parent,
fuse_do_release(f, e.ino, path, fi);
forget_node(f, e.ino, 1);
}
} else {
}
else
{
reply_err(req, err);
}
@ -3132,70 +3125,79 @@ static double diff_timespec(const struct timespec *t1,
((double) t1->tv_nsec - (double) t2->tv_nsec) / 1000000000.0;
}
static void open_auto_cache(struct fuse *f, fuse_ino_t ino, const char *path,
static
void
open_auto_cache(struct fuse *f,
fuse_ino_t ino,
const char *path,
struct fuse_file_info *fi)
{
struct node *node;
pthread_mutex_lock(&f->lock);
node = get_node(f, ino);
if (node->cache_valid) {
struct timespec now;
curr_time(&now);
if (diff_timespec(&now, &node->stat_updated) >
f->conf.ac_attr_timeout) {
struct stat stbuf;
node = get_node(f,ino);
if(node->stat_cache_valid)
{
int err;
struct stat stbuf;
pthread_mutex_unlock(&f->lock);
err = fuse_fs_fgetattr(f->fs, path, &stbuf, fi);
err = fuse_fs_fgetattr(f->fs,path,&stbuf,fi);
pthread_mutex_lock(&f->lock);
if (!err)
update_stat(node, &stbuf);
if(!err)
update_stat(node,&stbuf);
else
node->cache_valid = 0;
}
node->stat_cache_valid = 0;
}
if (node->cache_valid)
if(node->stat_cache_valid)
fi->keep_cache = 1;
node->cache_valid = 1;
node->stat_cache_valid = 1;
pthread_mutex_unlock(&f->lock);
}
static void fuse_lib_open(fuse_req_t req, fuse_ino_t ino,
static
void
fuse_lib_open(fuse_req_t req,
fuse_ino_t ino,
struct fuse_file_info *fi)
{
struct fuse *f = req_fuse_prepare(req);
struct fuse_intr_data d;
char *path;
int err;
char *path;
struct fuse *f;
struct fuse_intr_data d;
f = req_fuse_prepare(req);
err = get_path(f, ino, &path);
if (!err) {
if(!err)
{
fuse_prepare_interrupt(f, req, &d);
err = fuse_fs_open(f->fs, path, fi);
if (!err) {
if (f->conf.kernel_cache)
fi->keep_cache = 1;
if (f->conf.auto_cache)
if(!err)
{
if (fi && fi->auto_cache)
open_auto_cache(f, ino, path, fi);
}
fuse_finish_interrupt(f, req, &d);
}
if (!err) {
if(!err)
{
pthread_mutex_lock(&f->lock);
struct node *n = get_node(f,ino);
n->open_count++;
get_node(f,ino)->open_count++;
pthread_mutex_unlock(&f->lock);
if (fuse_reply_open(req, fi) == -ENOENT) {
/* The open syscall was interrupted, so it
must be cancelled */
/* The open syscall was interrupted, so it must be cancelled */
if(fuse_reply_open(req, fi) == -ENOENT)
fuse_do_release(f, ino, path, fi);
}
} else
else
{
reply_err(req, err);
}
free_path(f, ino, path);
}
@ -4346,12 +4348,8 @@ static const struct fuse_opt fuse_lib_opts[] = {
FUSE_OPT_KEY("-d", FUSE_OPT_KEY_KEEP),
FUSE_LIB_OPT("debug", debug, 1),
FUSE_LIB_OPT("-d", debug, 1),
FUSE_LIB_OPT("hard_remove", hard_remove, 1),
FUSE_LIB_OPT("use_ino", use_ino, 1),
FUSE_LIB_OPT("readdir_ino", readdir_ino, 1),
FUSE_LIB_OPT("kernel_cache", kernel_cache, 1),
FUSE_LIB_OPT("auto_cache", auto_cache, 1),
FUSE_LIB_OPT("noauto_cache", auto_cache, 0),
FUSE_LIB_OPT("umask=", set_mode, 1),
FUSE_LIB_OPT("umask=%o", umask, 0),
FUSE_LIB_OPT("uid=", set_uid, 1),
@ -4360,8 +4358,6 @@ static const struct fuse_opt fuse_lib_opts[] = {
FUSE_LIB_OPT("gid=%d", gid, 0),
FUSE_LIB_OPT("entry_timeout=%lf", entry_timeout, 0),
FUSE_LIB_OPT("attr_timeout=%lf", attr_timeout, 0),
FUSE_LIB_OPT("ac_attr_timeout=%lf", ac_attr_timeout, 0),
FUSE_LIB_OPT("ac_attr_timeout=", ac_attr_timeout_set, 1),
FUSE_LIB_OPT("negative_timeout=%lf", negative_timeout, 0),
FUSE_LIB_OPT("noforget", remember, -1),
FUSE_LIB_OPT("remember=%u", remember, 0),
@ -4377,15 +4373,12 @@ static void fuse_lib_help(void)
fprintf(stderr,
" -o use_ino let filesystem set inode numbers\n"
" -o readdir_ino try to fill in d_ino in readdir\n"
" -o kernel_cache cache files in kernel\n"
" -o [no]auto_cache enable caching based on modification times (off)\n"
" -o umask=M set file permissions (octal)\n"
" -o uid=N set file owner\n"
" -o gid=N set file group\n"
" -o entry_timeout=T cache timeout for names (1.0s)\n"
" -o negative_timeout=T cache timeout for deleted names (0.0s)\n"
" -o attr_timeout=T cache timeout for attributes (1.0s)\n"
" -o ac_attr_timeout=T auto cache timeout for attributes (attr_timeout)\n"
" -o noforget never forget cached inodes\n"
" -o remember=T remember cached inodes for T seconds (0s)\n"
" -o nopath don't supply path if not necessary\n"
@ -4566,9 +4559,6 @@ struct fuse *fuse_new_common(struct fuse_chan *ch, struct fuse_args *args,
fuse_lib_opt_proc) == -1)
goto out_free_fs;
if (!f->conf.ac_attr_timeout_set)
f->conf.ac_attr_timeout = f->conf.attr_timeout;
#if defined(__FreeBSD__) || defined(__NetBSD__)
/*
* In FreeBSD, we always use these settings as inode numbers

240
man/mergerfs.1

@ -82,62 +82,53 @@ so you can mix read\-write and read\-only drives.
.SH OPTIONS
.SS mount options
.IP \[bu] 2
\f[B]allow_other\f[]: a libfuse option which allows users besides the
\f[B]allow_other\f[]: A libfuse option which allows users besides the
one which ran mergerfs to see the filesystem.
This is required for most use\-cases.
.IP \[bu] 2
\f[B]direct_io\f[]: causes FUSE to bypass caching which can increase
write speeds at the detriment of reads.
Note that not enabling \f[C]direct_io\f[] will cause double caching of
files and therefore less memory for caching generally (enable
\f[B]dropcacheonclose\f[] to help with this problem).
However, \f[C]mmap\f[] does not work when \f[C]direct_io\f[] is enabled.
.IP \[bu] 2
\f[B]minfreespace=value\f[]: the minimum space value used for creation
\f[B]minfreespace=value\f[]: The minimum space value used for creation
policies.
Understands \[aq]K\[aq], \[aq]M\[aq], and \[aq]G\[aq] to represent
kilobyte, megabyte, and gigabyte respectively.
(default: 4G)
.IP \[bu] 2
\f[B]moveonenospc=true|false\f[]: when enabled (set to \f[B]true\f[]) if
a \f[B]write\f[] fails with \f[B]ENOSPC\f[] or \f[B]EDQUOT\f[] a scan of
all drives will be done looking for the drive with the most free space
which is at least the size of the file plus the amount which failed to
write.
\f[B]moveonenospc=true|false\f[]: When enabled if a \f[B]write\f[] fails
with \f[B]ENOSPC\f[] or \f[B]EDQUOT\f[] a scan of all drives will be
done looking for the drive with the most free space which is at least
the size of the file plus the amount which failed to write.
An attempt to move the file to that drive will occur (keeping all
metadata possible) and if successful the original is unlinked and the
write retried.
(default: false)
.IP \[bu] 2
\f[B]use_ino\f[]: causes mergerfs to supply file/directory inodes rather
\f[B]use_ino\f[]: Causes mergerfs to supply file/directory inodes rather
than libfuse.
While not a default it is recommended it be enabled so that linked files
share the same inode value.
.IP \[bu] 2
\f[B]dropcacheonclose=true|false\f[]: when a file is requested to be
\f[B]dropcacheonclose=true|false\f[]: When a file is requested to be
closed call \f[C]posix_fadvise\f[] on it first to instruct the kernel
that we no longer need the data and it can drop its cache.
Recommended when \f[B]direct_io\f[] is not enabled to limit double
caching.
Recommended when \f[B]cache.files=partial|full|auto\-full\f[] to limit
double caching.
(default: false)
.IP \[bu] 2
\f[B]symlinkify=true|false\f[]: when enabled (set to \f[B]true\f[]) and
a file is not writable and its mtime or ctime is older than
\f[B]symlinkify_timeout\f[] files will be reported as symlinks to the
original files.
\f[B]symlinkify=true|false\f[]: When enabled and a file is not writable
and its mtime or ctime is older than \f[B]symlinkify_timeout\f[] files
will be reported as symlinks to the original files.
Please read more below before using.
(default: false)
.IP \[bu] 2
\f[B]symlinkify_timeout=value\f[]: time to wait, in seconds, to activate
\f[B]symlinkify_timeout=value\f[]: Time to wait, in seconds, to activate
the \f[B]symlinkify\f[] behavior.
(default: 3600)
.IP \[bu] 2
\f[B]nullrw=true|false\f[]: turns reads and writes into no\-ops.
\f[B]nullrw=true|false\f[]: Turns reads and writes into no\-ops.
The request will succeed but do nothing.
Useful for benchmarking mergerfs.
(default: false)
.IP \[bu] 2
\f[B]ignorepponrename=true|false\f[]: ignore path preserving on rename.
\f[B]ignorepponrename=true|false\f[]: Ignore path preserving on rename.
Typically rename and link act differently depending on the policy of
\f[C]create\f[] (read below).
Enabling this will cause rename and link to always use the non\-path
@ -177,7 +168,7 @@ calculations to ignore available space for branches mounted or tagged as
create\[aq].
(default: none)
.IP \[bu] 2
\f[B]posix_acl=true|false:\f[] enable POSIX ACL support (if supported by
\f[B]posix_acl=true|false:\f[] Enable POSIX ACL support (if supported by
kernel and underlying filesystem).
(default: false)
.IP \[bu] 2
@ -187,9 +178,9 @@ pending read request per file handle and will attempt to order requests
by offset.
(default: true)
.IP \[bu] 2
\f[B]threads=num\f[]: number of threads to use in multithreaded mode.
When set to zero (the default) it will attempt to discover and use the
number of logical cores.
\f[B]threads=num\f[]: Number of threads to use in multithreaded mode.
When set to zero it will attempt to discover and use the number of
logical cores.
If the lookup fails it will fall back to using 4.
If the thread count is set negative it will look up the number of cores
then divide by the absolute value.
@ -198,15 +189,14 @@ threads=\-2 on an 8 core machine will result in 8 / 2 = 4 threads.
There will always be at least 1 thread.
NOTE: higher number of threads increases parallelism but usually
decreases throughput.
(default: number of cores) \f[I]NOTE2:\f[] the option is unavailable
when built with system libfuse.
(default: 0)
.IP \[bu] 2
\f[B]fsname=name\f[]: sets the name of the filesystem as seen in
\f[B]fsname=name\f[]: Sets the name of the filesystem as seen in
\f[B]mount\f[], \f[B]df\f[], etc.
Defaults to a list of the source paths concatenated together with the
longest common prefix removed.
.IP \[bu] 2
\f[B]func.<func>=<policy>\f[]: sets the specific FUSE function\[aq]s
\f[B]func.<func>=<policy>\f[]: Sets the specific FUSE function\[aq]s
policy.
See below for the list of value types.
Example: \f[B]func.getattr=newest\f[]
@ -222,21 +212,44 @@ seconds.
\f[B]cache.statfs=<int>\f[]: \[aq]statfs\[aq] cache timeout in seconds.
(default: 0)
.IP \[bu] 2
\f[B]cache.attr=<int>\f[]: file attribute cache timeout in seconds.
\f[B]cache.attr=<int>\f[]: File attribute cache timeout in seconds.
(default: 1)
.IP \[bu] 2
\f[B]cache.entry=<int>\f[]: file name lookup cache timeout in seconds.
\f[B]cache.entry=<int>\f[]: File name lookup cache timeout in seconds.
(default: 1)
.IP \[bu] 2
\f[B]cache.negative_entry=<int>\f[]: negative file name lookup cache
\f[B]cache.negative_entry=<int>\f[]: Negative file name lookup cache
timeout in seconds.
(default: 0)
.IP \[bu] 2
\f[B]cache.symlinks=<bool>\f[]: cache symlinks (if supported by kernel)
\f[B]cache.files=libfuse|off|partial|full|auto\-full\f[]: File page
caching mode (default: libfuse)
.IP \[bu] 2
\f[B]cache.symlinks=<bool>\f[]: Cache symlinks (if supported by kernel)
(default: false)
.IP \[bu] 2
\f[B]cache.readdir=<bool>\f[]: Cache readdir (if supported by kernel)
(default: false)
.IP \[bu] 2
\f[B]cache.readdir=<bool>\f[]: cache readdir (if supported by kernel)
\f[B]direct_io\f[]: deprecated \- Bypass page cache.
Use \f[C]cache.files=off\f[] instead.
(default: false)
.IP \[bu] 2
\f[B]kernel_cache\f[]: deprecated \- Do not invalidate data cache on
file open.
Use \f[C]cache.files=full\f[] instead.
(default: false)
.IP \[bu] 2
\f[B]auto_cache\f[]: deprecated \- Invalidate data cache if file mtime
or size change.
Use \f[C]cache.files=auto\-full\f[] instead.
(default: false)
.IP \[bu] 2
\f[B]async_read\f[]: deprecated \- Perform reads asynchronously.
Use \f[C]async_read=true\f[] instead.
.IP \[bu] 2
\f[B]sync_read\f[]: deprecated \- Perform reads synchronously.
Use \f[C]async_read=false\f[] instead.
.PP
\f[B]NOTE:\f[] Options are evaluated in the order listed so if the
options are \f[B]func.rmdir=rand,category.action=ff\f[] the
@ -826,7 +839,7 @@ $\ su\ \-
.fi
.SS Generically
.PP
Have git, g++, make, python, automake, libtool installed.
Have git, g++, make, python installed.
.IP
.nf
\f[C]
@ -1041,40 +1054,64 @@ bad blocks and find the files using those blocks
.SH CACHING
.SS page caching
.PP
The kernel performs caching of data pages on all files not opened with
\f[C]O_DIRECT\f[].
Due to mergerfs using FUSE and therefore being a userland process the
kernel can double cache the content being read through mergerfs.
Once from the underlying filesystem and once for mergerfs.
Using \f[C]direct_io\f[] and/or \f[C]dropcacheonclose\f[] help minimize
the double caching.
\f[C]direct_io\f[] will instruct the kernel to bypass the page cache for
files opened through mergerfs.
\f[C]dropcacheonclose\f[] will cause mergerfs to instruct the kernel to
flush a file\[aq]s page cache for which it had opened when closed.
If most data is read once its probably best to enable both (read above
for details and limitations).
.PP
If a cache is desired for mergerfs do not enable \f[C]direct_io\f[] and
instead possibly use \f[C]auto_cache\f[] or \f[C]kernel_cache\f[].
By default FUSE will invalidate cached pages when a file is opened.
By using \f[C]auto_cache\f[] it will instead use \f[C]getattr\f[] to
check if a file has changed when the file is opened and if so will flush
the cache.
\f[C]ac_attr_timeout\f[] is the timeout for keeping said cache.
Alternatively \f[C]kernel_cache\f[] will keep the cache across opens
unless invalidated through other means.
You should only uses these if you do not plan to write/modify the same
files through mergerfs and the underlying filesystem at the same time.
It could lead to corruption.
Then again doing so without caching can also cause issues.
.PP
It\[aq]s a difficult balance between memory usage, cache bloat &
duplication, and performance.
https://en.wikipedia.org/wiki/Page_cache
.PP
tl;dr:
.IP \[bu] 2
cache.files=off: Disables page caching.
Underlying files cached, mergerfs files are not.
.IP \[bu] 2
cache.files=partial: Enables page caching.
Underlying files cached, mergerfs files cached while open.
.IP \[bu] 2
cache.files=full: Enables page caching.
Underlying files cached, mergerfs files cached across opens.
.IP \[bu] 2
cache.files=auto\-full: Enables page caching.
Underlying files cached, mergerfs files cached across opens if mtime and
size are unchanged since previous open.
.IP \[bu] 2
cache.files=libfuse: Follow traditional libfuse \f[C]direct_io\f[],
\f[C]kernel_cache\f[], and \f[C]auto_cache\f[] arguments.
.PP
FUSE, which mergerfs uses, offers a number of page caching modes.
mergerfs tries to simplify their use via the \f[C]cache.files\f[]
option.
It can and should replace usage of \f[C]direct_io\f[],
\f[C]kernel_cache\f[], and \f[C]auto_cache\f[].
.PP
Due to mergerfs using FUSE and therefore being a userland process
proxying existing filesystems the kernel will double cache the content
being read and written through mergerfs.
Once from the underlying filesystem and once from mergerfs (it sees them
as two separate entities).
Using \f[C]cache.files=off\f[] will keep the double caching from
happening by disabling caching of mergerfs but this has the side effect
that \f[I]all\f[] read and write calls will be passed to mergerfs which
may be slower than enabling caching, you lose shared \f[C]mmap\f[]
support which can affect apps such as rtorrent, and no read\-ahead will
take place.
The kernel will still cache the underlying filesystem data but that only
helps so much given mergerfs will still process all requests.
.PP
If you do enable file page caching,
\f[C]cache.files=partial|full|auto\-full\f[], you should also enable
\f[C]dropcacheonclose\f[] which will cause mergerfs to instruct the
kernel to flush the underlying file\[aq]s page cache when the file is
closed.
This behavior is the same as the rsync fadvise / drop cache patch and
Feh\[aq]s nocache project.
.PP
If most files are read once through and closed (like media) it is best
to enable \f[C]dropcacheonclose\f[] regardless of caching mode in order
to minimize buffer bloat.
.PP
It is difficult to balance memory usage, cache bloat & duplication, and
performance.
Ideally mergerfs would be able to disable caching for the files it
reads/writes but allow page caching for itself.
That would limit the FUSE overhead.
However, there isn\[aq]t good way to achieve this.
However, there isn\[aq]t a good way to achieve this.
It would need to open all files with O_DIRECT which places limitations
on what underlying filesystems would be supported and complicates
the code.
.PP
kernel documentation:
https://www.kernel.org/doc/Documentation/filesystems/fuse\-io.txt
.SS entry & attribute caching
.PP
Given the relatively high cost of FUSE due to the kernel <\-> userspace
@ -1142,7 +1179,7 @@ startup you can not change it at runtime.
As of version 4.20 Linux supports readdir caching.
This can have a significant impact on directory traversal.
Especially when combined with entry (\f[C]cache.entry\f[]) and attribute
(\[aq]cache.attr\[aq]) caching.
(\f[C]cache.attr\f[]) caching.
Setting \f[C]cache.readdir=true\f[] will result in requesting readdir
caching from the kernel on each \f[C]opendir\f[].
If the kernel doesn\[aq]t support readdir caching setting the option to
@ -1298,11 +1335,14 @@ all the underlying directories.
Use \f[C]mergerfs.fsck\f[] to audit the drive for out of sync
permissions.
.IP \[bu] 2
Do \f[B]not\f[] use \f[C]direct_io\f[] if you expect applications (such
as rtorrent) to mmap (http://linux.die.net/man/2/mmap) files.
It is not currently supported in FUSE w/ \f[C]direct_io\f[] enabled.
Do \f[B]not\f[] use \f[C]cache.files=off\f[] or \f[C]direct_io\f[] if
you expect applications (such as rtorrent) to
mmap (http://linux.die.net/man/2/mmap) files.
Shared mmap is not currently supported in FUSE w/ \f[C]direct_io\f[]
enabled.
Enabling \f[C]dropcacheonclose\f[] is recommended when
\f[C]direct_io\f[] is disabled.
\f[C]cache.files=partial|full|auto\-full\f[] or
\f[C]direct_io=false\f[].
.IP \[bu] 2
Since POSIX functions give only a singular error or success it\[aq]s
difficult to determine the proper behavior when applying the function to
@ -1380,19 +1420,8 @@ mergerfs.dedup to clean up the old paths or manually remove them from
the branches directly.
.SS cached memory appears greater than it should be
.PP
Use the \f[C]direct_io\f[] option as described above.
Due to what mergerfs is doing there ends up being two caches of a file
under normal usage.
One from the underlying filesystem and one from mergerfs.
Enabling \f[C]direct_io\f[] removes the mergerfs cache.
This saves on memory but means the kernel needs to communicate with
mergerfs more often and can therefore result in slower speeds.
.PP
Since enabling \f[C]direct_io\f[] disables \f[C]mmap\f[] this is not an
ideal situation however write speeds should be increased.
.PP
If \f[C]direct_io\f[] is disabled it is probably a good idea to enable
\f[C]dropcacheonclose\f[] to minimize double caching.
Use \f[C]cache.files=off\f[] or \f[C]direct_io=true\f[].
See the section on page caching.
.SS NFS clients returning ESTALE / Stale file handle
.PP
Be sure to use \f[C]noforget\f[] and \f[C]use_ino\f[] arguments.
@ -1405,7 +1434,8 @@ Try enabling the \f[C]use_ino\f[] option.
Some have reported that it fixes the issue.
.SS rtorrent fails with ENODEV (No such device)
.PP
Be sure to turn off \f[C]direct_io\f[].
Be sure to set \f[C]cache.files=partial|full|auto\-full\f[] or turn off
\f[C]direct_io\f[].
rtorrent and some other applications use
mmap (http://linux.die.net/man/2/mmap) to read and write to files and
offer no fallback to traditional methods.
@ -1675,7 +1705,7 @@ Yes.
You need to use \f[C]use_ino\f[] to support proper reporting of inodes.
.PP
What mergerfs does not do is fake hard links across branches.
Read the section "rename & link" for how it.
Read the section "rename & link" for how it works.
.SS Does mergerfs support CoW / copy\-on\-write?
.PP
Not in the sense of a filesystem like BTRFS or ZFS nor in the overlayfs
@ -1813,9 +1843,8 @@ here (http://louwrentius.com/the-hidden-cost-of-using-zfs-for-your-home-nas.html
.PP
Yes, however it\[aq]s not recommended to use the same file from within the
pool and from without at the same time.
Especially if using caching of any kind (cache.entry, cache.attr,
ac_attr_timeout, cache.negative_entry, cache.symlinks, auto_cache,
kernel_cache).
Especially if using caching of any kind (cache.files, cache.entry,
cache.attr, cache.negative_entry, cache.symlinks, cache.readdir, etc.).
.SS Why do I get an "out of space" / "no space left on device" / ENOSPC
error even though there appears to be lots of space available?
.PP
@ -1905,13 +1934,14 @@ and\ the\ kernel\ use\ internally\ (also\ called\ the\ "nodeid").
.SS I notice massive slowdowns of writes over NFS
.PP
Due to how NFS works and interacts with FUSE when not using
\f[C]direct_io\f[] its possible that a getxattr for
\f[C]security.capability\f[] will be issued prior to any write.
\f[C]cache.files=off\f[] or \f[C]direct_io\f[] it\[aq]s possible that a
getxattr for \f[C]security.capability\f[] will be issued prior to any
write.
This will usually result in a massive slowdown for writes.
Using \f[C]direct_io\f[] will keep this from happening (and generally
good to enable unless you need the features it disables) but the
\f[C]security_capability\f[] option can also help by short circuiting
the call and returning \f[C]ENOATTR\f[].
Using \f[C]cache.files=off\f[] or \f[C]direct_io\f[] will keep this from
happening (and generally good to enable unless you need the features it
disables) but the \f[C]security_capability\f[] option can also help by
short circuiting the call and returning \f[C]ENOATTR\f[].
.PP
You could also set \f[C]xattr\f[] to \f[C]noattr\f[] or \f[C]nosys\f[]
to short circuit or stop all xattr requests.
@ -1971,25 +2001,23 @@ assuming there are few users.
.PP
NOTE: be sure to read about these features before changing them
.IP \[bu] 2
enable (or disable) \f[C]direct_io\f[]
.IP \[bu] 2
enable (or disable) \f[C]auto_cache\f[]
.IP \[bu] 2
enable (or disable) \f[C]kernel_cache\f[]
.IP \[bu] 2
enable (or disable) \f[C]splice_move\f[], \f[C]splice_read\f[], and
\f[C]splice_write\f[]
.IP \[bu] 2
increase cache timeouts \f[C]cache.attr\f[], \f[C]cache.entry\f[],
\f[C]cache.negative_entry\f[]
.IP \[bu] 2
enable (or disable) page caching (\f[C]cache.files\f[])
.IP \[bu] 2
enable \f[C]cache.open\f[]
.IP \[bu] 2
enable \f[C]cache.statfs\f[]
.IP \[bu] 2
enable \f[C]cache.symlinks\f[]
.IP \[bu] 2
change the number opf worker threads
enable \f[C]cache.readdir\f[]
.IP \[bu] 2
change the number of worker threads
.IP \[bu] 2
disable \f[C]security_capability\f[] and/or \f[C]xattr\f[]
.IP \[bu] 2

72
src/config.cpp

@ -52,6 +52,7 @@ Config::Config()
cache_symlinks(false),
cache_readdir(false),
async_read(true),
cache_files(CacheFiles::LIBFUSE),
POLICYINIT(access),
POLICYINIT(chmod),
POLICYINIT(chown),
@ -124,3 +125,74 @@ Config::set_category_policy(const string &category_,
return 0;
}
Config::CacheFiles::operator int() const
{
return _data;
}
Config::CacheFiles::operator std::string() const
{
switch(_data)
{
case OFF:
return "off";
case PARTIAL:
return "partial";
case FULL:
return "full";
case AUTO_FULL:
return "auto-full";
case LIBFUSE:
return "libfuse";
case INVALID:
break;
}
return "";
}
Config::CacheFiles::CacheFiles()
: _data(INVALID)
{
}
Config::CacheFiles::CacheFiles(Config::CacheFiles::Enum data_)
: _data(data_)
{
}
bool
Config::CacheFiles::valid() const
{
return (_data != INVALID);
}
Config::CacheFiles&
Config::CacheFiles::operator=(const Config::CacheFiles::Enum data_)
{
_data = data_;
return *this;
}
Config::CacheFiles&
Config::CacheFiles::operator=(const std::string &data_)
{
if(data_ == "off")
_data = OFF;
else if(data_ == "partial")
_data = PARTIAL;
else if(data_ == "full")
_data = FULL;
else if(data_ == "auto-full")
_data = AUTO_FULL;
else if(data_ == "libfuse")
_data = LIBFUSE;
else
_data = INVALID;
return *this;
}
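
The string conversions above can be exercised in isolation. The following is a minimal standalone sketch, not the code from this commit: `CacheMode`, `cache_mode_from_string`, and `cache_mode_to_string` are hypothetical names that only mirror the behavior of `Config::CacheFiles` shown in the hunk (unknown names map to an invalid value, and the invalid value renders as an empty string).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Config::CacheFiles::Enum from the diff.
enum class CacheMode { INVALID = -1, LIBFUSE, OFF, PARTIAL, FULL, AUTO_FULL };

// Parse a mode name; anything unrecognized maps to INVALID.
inline CacheMode cache_mode_from_string(const std::string &s)
{
  if(s == "libfuse")   return CacheMode::LIBFUSE;
  if(s == "off")       return CacheMode::OFF;
  if(s == "partial")   return CacheMode::PARTIAL;
  if(s == "full")      return CacheMode::FULL;
  if(s == "auto-full") return CacheMode::AUTO_FULL;
  return CacheMode::INVALID;
}

// Render a mode back to its option string; INVALID renders as "".
inline std::string cache_mode_to_string(CacheMode m)
{
  switch(m)
    {
    case CacheMode::LIBFUSE:   return "libfuse";
    case CacheMode::OFF:       return "off";
    case CacheMode::PARTIAL:   return "partial";
    case CacheMode::FULL:      return "full";
    case CacheMode::AUTO_FULL: return "auto-full";
    case CacheMode::INVALID:   break;
    }
  return "";
}

// Example: a valid name survives the round trip, a misspelling does not.
inline void cache_mode_example()
{
  assert(cache_mode_to_string(cache_mode_from_string("auto-full")) == "auto-full");
  assert(cache_mode_from_string("auto_full") == CacheMode::INVALID);
}
```

Matching is exact and case-sensitive, as in the hunk, so `auto_full` or `FULL` would be rejected rather than silently accepted.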

31
src/config.hpp

@ -51,6 +51,34 @@ public:
};
};
class CacheFiles
{
public:
enum Enum
{
INVALID = -1,
LIBFUSE,
OFF,
PARTIAL,
FULL,
AUTO_FULL
};
CacheFiles();
CacheFiles(Enum);
operator int() const;
operator std::string() const;
CacheFiles& operator=(const Enum);
CacheFiles& operator=(const std::string&);
bool valid() const;
private:
Enum _data;
};
public:
Config();
@ -68,6 +96,8 @@ public:
uint64_t minfreespace;
bool moveonenospc;
bool direct_io;
bool kernel_cache;
bool auto_cache;
bool dropcacheonclose;
bool symlinkify;
time_t symlinkify_timeout;
@ -82,6 +112,7 @@ public:
bool cache_symlinks;
bool cache_readdir;
bool async_read;
CacheFiles cache_files;
public:
const Policy *policies[FuseFunc::Enum::END];

29
src/fuse_create.cpp

@ -31,6 +31,7 @@
using std::string;
using std::vector;
typedef Config::CacheFiles CacheFiles;
namespace l
{
@ -123,7 +124,35 @@ namespace FUSE
const ugid::Set ugid(fc->uid,fc->gid);
const rwlock::ReadGuard readlock(&config.branches_lock);
switch(config.cache_files)
{
case CacheFiles::LIBFUSE:
ffi_->direct_io = config.direct_io;
ffi_->keep_cache = config.kernel_cache;
ffi_->auto_cache = config.auto_cache;
break;
case CacheFiles::OFF:
ffi_->direct_io = 1;
ffi_->keep_cache = 0;
ffi_->auto_cache = 0;
break;
case CacheFiles::PARTIAL:
ffi_->direct_io = 0;
ffi_->keep_cache = 0;
ffi_->auto_cache = 0;
break;
case CacheFiles::FULL:
ffi_->direct_io = 0;
ffi_->keep_cache = 1;
ffi_->auto_cache = 0;
break;
case CacheFiles::AUTO_FULL:
ffi_->direct_io = 0;
ffi_->keep_cache = 0;
ffi_->auto_cache = 1;
break;
}
return l::create(config.getattr,
config.create,
config.branches,

10
src/fuse_getxattr.cpp

@ -225,6 +225,14 @@ namespace l
}
}
static
void
getxattr_controlfile(const Config::CacheFiles &cache_files_,
string &attrvalue_)
{
attrvalue_ = (string)cache_files_;
}
static
void
getxattr_controlfile_policies(const Config &config,
@ -371,6 +379,8 @@ namespace l
l::getxattr_controlfile_bool(config.cache_symlinks,attrvalue);
else if((attr[2] == "cache") && (attr[3] == "readdir"))
l::getxattr_controlfile_bool(config.cache_readdir,attrvalue);
else if((attr[2] == "cache") && (attr[3] == "files"))
l::getxattr_controlfile(config.cache_files,attrvalue);
break;
}

1
src/fuse_listxattr.cpp

@ -48,6 +48,7 @@ namespace l
("user.mergerfs.branches")
("user.mergerfs.cache.attr")
("user.mergerfs.cache.entry")
("user.mergerfs.cache.files")
("user.mergerfs.cache.negative_entry")
("user.mergerfs.cache.open")
("user.mergerfs.cache.readdir")

29
src/fuse_open.cpp

@ -31,6 +31,7 @@
using std::string;
using std::vector;
typedef Config::CacheFiles CacheFiles;
namespace l
{
@ -92,7 +93,35 @@ namespace FUSE
const ugid::Set ugid(fc->uid,fc->gid);
const rwlock::ReadGuard readlock(&config.branches_lock);
switch(config.cache_files)
{
case CacheFiles::LIBFUSE:
ffi_->direct_io = config.direct_io;
ffi_->keep_cache = config.kernel_cache;
ffi_->auto_cache = config.auto_cache;
break;
case CacheFiles::OFF:
ffi_->direct_io = 1;
ffi_->keep_cache = 0;
ffi_->auto_cache = 0;
break;
case CacheFiles::PARTIAL:
ffi_->direct_io = 0;
ffi_->keep_cache = 0;
ffi_->auto_cache = 0;
break;
case CacheFiles::FULL:
ffi_->direct_io = 0;
ffi_->keep_cache = 1;
ffi_->auto_cache = 0;
break;
case CacheFiles::AUTO_FULL:
ffi_->direct_io = 0;
ffi_->keep_cache = 0;
ffi_->auto_cache = 1;
break;
}
return l::open(config.open,
config.open_cache,
config.branches,
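
The same `switch` appears in both `src/fuse_create.cpp` and `src/fuse_open.cpp`. One possible way to keep the two in step is a shared helper, sketched below under stated assumptions: `PageCacheMode`, `OpenFlags`, and `flags_for_mode` are hypothetical names, and `OpenFlags` merely stands in for the three `fuse_file_info` fields the diff assigns. This is an illustration of the mapping, not the approach taken in this commit.

```cpp
#include <cassert>

// Hypothetical stand-ins: the real code uses Config::CacheFiles and
// fuse_file_info, neither of which is reproduced here.
enum class PageCacheMode { LIBFUSE, OFF, PARTIAL, FULL, AUTO_FULL };

struct OpenFlags
{
  bool direct_io;
  bool keep_cache;
  bool auto_cache;
};

// Map a mode to per-open flags. `legacy` carries the values of the
// deprecated direct_io / kernel_cache / auto_cache options, which are
// only consulted in LIBFUSE mode.
inline OpenFlags flags_for_mode(PageCacheMode mode, const OpenFlags &legacy)
{
  switch(mode)
    {
    case PageCacheMode::LIBFUSE:   return legacy;
    case PageCacheMode::OFF:       return {true,  false, false};
    case PageCacheMode::PARTIAL:   return {false, false, false};
    case PageCacheMode::FULL:      return {false, true,  false};
    case PageCacheMode::AUTO_FULL: return {false, false, true};
    }
  return legacy; // not reached for valid enumerators
}

// Example: "full" keeps the cache across opens and ignores legacy values.
inline void flags_for_mode_example()
{
  const OpenFlags legacy = {true, false, true};
  const OpenFlags f = flags_for_mode(PageCacheMode::FULL, legacy);
  assert(!f.direct_io && f.keep_cache && !f.auto_cache);
}
```

The values in each case are copied from the hunks above, so `auto-full` sets only `auto_cache` and leaves `keep_cache` clear, exactly as the diff does.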

22
src/fuse_setxattr.cpp

@ -235,6 +235,26 @@ namespace l
return 0;
}
static
int
setxattr(const string &attrval_,
const int flags_,
Config::CacheFiles &cache_files_)
{
Config::CacheFiles tmp;
if((flags_ & XATTR_CREATE) == XATTR_CREATE)
return -EEXIST;
tmp = attrval_;
if(!tmp.valid())
return -EINVAL;
cache_files_ = tmp;
return 0;
}
static
int
setxattr_controlfile_func_policy(Config &config,
@ -438,6 +458,8 @@ namespace l
return l::setxattr_controlfile_cache_negative_entry(attrval,flags);
else if((attr[2] == "cache") && (attr[3] == "readdir"))
return l::setxattr_bool(attrval,flags,config.cache_readdir);
else if((attr[2] == "cache") && (attr[3] == "files"))
return l::setxattr(attrval,flags,config.cache_files);
break;
default:
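
With this handler in place, `cache.files` should be changeable on a running mount through the `user.mergerfs.cache.files` xattr. The sketch below shows what a client-side call might look like on Linux using `setxattr(2)`. The helper name `set_cache_files` is hypothetical, and the control file location (`<mountpoint>/.mergerfs`) is an assumption that is not shown in this diff. Per the hunk above, the handler returns `EEXIST` when `XATTR_CREATE` is passed and `EINVAL` for an unrecognized mode, so the sketch passes `0` for flags.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/xattr.h>

// Set cache.files at runtime via the control file (Linux only).
// `control_file_` is assumed to be "<mountpoint>/.mergerfs".
// Returns 0 on success or the errno value on failure.
inline int set_cache_files(const std::string &control_file_,
                           const std::string &mode_)
{
  // flags = 0: the handler in the diff rejects XATTR_CREATE with EEXIST.
  const int rv = ::setxattr(control_file_.c_str(),
                            "user.mergerfs.cache.files",
                            mode_.data(),
                            mode_.size(),
                            0);
  return ((rv == -1) ? errno : 0);
}

// Example: a path that does not exist yields a non-zero errno value.
inline void set_cache_files_example()
{
  assert(set_cache_files("/nonexistent-dir/.mergerfs","full") != 0);
}
```

Because `fuse_open.cpp` and `fuse_create.cpp` read `config.cache_files` on each call, a change made this way would presumably apply to files opened afterwards rather than to handles that are already open.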

78
src/option_parser.cpp

@ -132,13 +132,13 @@ parse_and_process(const std::string &value,
static
int
parse_and_process(const std::string &value,
bool &boolean)
parse_and_process(const std::string &value_,
bool &boolean_)
{
if(value == "false")
boolean = false;
else if(value == "true")
boolean = true;
if((value_ == "false") || (value_ == "0") || (value_ == "off"))
boolean_ = false;
else if((value_ == "true") || (value_ == "1") || (value_ == "on"))
boolean_ = true;
else
return 1;
@ -155,6 +155,22 @@ parse_and_process(const std::string &value_,
return 0;
}
static
int
parse_and_process(const std::string &value_,
Config::CacheFiles &cache_files_)
{
Config::CacheFiles tmp;
tmp = value_;
if(!tmp.valid())
return 1;
cache_files_ = tmp;
return 0;
}
static
int
parse_and_process_errno(const std::string &value_,
@ -241,6 +257,8 @@ parse_and_process_cache(Config &config_,
return parse_and_process(value_,config_.cache_symlinks);
else if(func_ == "readdir")
return parse_and_process(value_,config_.cache_readdir);
else if(func_ == "files")
return parse_and_process(value_,config_.cache_files);
return 1;
}
@ -252,8 +270,14 @@ parse_and_process_arg(Config &config,
{
if(arg == "defaults")
return 0;
else if(arg == "hard_remove")
return 0;
else if(arg == "direct_io")
return (config.direct_io=true,0);
else if(arg == "kernel_cache")
return (config.kernel_cache=true,0);
else if(arg == "auto_cache")
return (config.auto_cache=true,0);
else if(arg == "async_read")
return (config.async_read=true,0);
else if(arg == "sync_read")
@ -315,6 +339,10 @@ parse_and_process_kv_arg(Config &config,
rv = parse_and_process(value,config.posix_acl);
else if(key == "direct_io")
rv = parse_and_process(value,config.direct_io);
else if(key == "kernel_cache")
rv = parse_and_process(value,config.kernel_cache);
else if(key == "auto_cache")
rv = parse_and_process(value,config.auto_cache);
else if(key == "async_read")
rv = parse_and_process(value,config.async_read);
}
@ -378,14 +406,14 @@ void
usage(void)
{
std::cout <<
"Usage: mergerfs [options] <srcpaths> <destpath>\n"
"Usage: mergerfs [options] <branches> <destpath>\n"
"\n"
" -o [opt,...] mount options\n"
" -h --help print help\n"
" -v --version print version\n"
"\n"
"mergerfs options:\n"
" <srcpaths> ':' delimited list of directories. Supports\n"
" <branches> ':' delimited list of directories. Supports\n"
" shell globbing (must be escaped in shell)\n"
" -o func.<f>=<p> Set function <f> to policy <p>\n"
" -o category.<c>=<p> Set functions in category <c> to <p>\n"
@ -393,43 +421,49 @@ usage(void)
" default = 0 (disabled)\n"
" -o cache.statfs=<int> 'statfs' cache timeout in seconds. Used by\n"
" policies. default = 0 (disabled)\n"
" -o cache.files=libfuse|off|partial|full|auto-full\n"
" * libfuse: Use direct_io, kernel_cache, auto_cache\n"
" values directly\n"
" * off: Disable page caching\n"
" * partial: Clear page cache on file open\n"
" * full: Keep cache on file open\n"
" * auto-full: Keep cache if mtime & size not changed\n"
" default = libfuse\n"
" -o cache.symlinks=<bool>\n"
" enable kernel caching of symlinks (if supported)\n"
" Enable kernel caching of symlinks (if supported)\n"
" default = false\n"
" -o cache.readdir=<bool>\n"
" Enable kernel caching readdir (if supported)\n"
" default = false\n"
" -o cache.attr=<int> file attribute cache timeout in seconds.\n"
" -o cache.attr=<int> File attribute cache timeout in seconds.\n"
" default = 1\n"
" -o cache.entry=<int> file name lookup cache timeout in seconds.\n"
" -o cache.entry=<int> File name lookup cache timeout in seconds.\n"
" default = 1\n"
" -o cache.negative_entry=<int>\n"
" negative file name lookup cache timeout in\n"
" Negative file name lookup cache timeout in\n"
" seconds. default = 0\n"
" -o cache.readdir=<bool>\n"
" enable kernel caching readdir (if supported)\n"
" -o direct_io Bypass page caching, may increase write\n"
" speeds at the cost of reads. Please read docs\n"
" for more details as there are tradeoffs.\n"
" -o use_ino Have mergerfs generate inode values rather than\n"
" autogenerated by libfuse. Suggested.\n"
" -o minfreespace=<int> minimum free space needed for certain policies.\n"
" -o minfreespace=<int> Minimum free space needed for certain policies.\n"
" default = 4G\n"
" -o moveonenospc=<bool> Try to move file to another drive when ENOSPC\n"
" on write. default = false\n"
" -o dropcacheonclose=<bool>\n"
" When a file is closed suggest to OS it drop\n"
" the file's cache. This is useful when direct_io\n"
" is disabled. default = false\n"
" the file's cache. This is useful when using\n"
" 'cache.files'. default = false\n"
" -o symlinkify=<bool> Read-only files, after a timeout, will be turned\n"
" into symlinks. Read docs for limitations and\n"
" possible issues. default = false\n"
" -o symlinkify_timeout=<int>\n"
" timeout in seconds before will turn to symlinks.\n"
" Timeout in seconds before files turn to symlinks.\n"
" default = 3600\n"
" -o nullrw=<bool> Disables reads and writes. For benchmarking.\n"
" default = false\n"
" -o ignorepponrename=<bool>\n"
" Ignore path preserving when performing renames\n"
" and links. default = false\n"
" -o link_cow=<bool> delink/clone file on open to simulate CoW.\n"
" -o link_cow=<bool> Delink/clone file on open to simulate CoW.\n"
" default = false\n"
" -o security_capability=<bool>\n"
" When disabled return ENOATTR when the xattr\n"
