mirror of https://github.com/trapexit/mergerfs.git
Antonio SJ Musumeci
3 weeks ago
2 changed files with 89 additions and 0 deletions
@ -0,0 +1,88 @@ |
|||
# Usage Patterns |
|||
|
|||
## tiered cache |
|||
|
|||
Some storage technologies support what is called "tiered" caching. The |
|||
placing of smaller, faster storage as a transparent cache to larger, |
|||
slower storage. NVMe, SSD, Optane in front of traditional HDDs for |
|||
instance. |
|||
|
|||
mergerfs does not natively support any sort of tiered caching. Most |
|||
users have no use for such a feature and its inclusion would |
|||
complicate the code as it exists today. However, there are a few |
|||
situations where a cache filesystem could help with a typical mergerfs |
|||
setup. |
|||
|
|||
1. Fast network, slow filesystems, many readers: You've a 10+Gbps |
|||
network with many readers and your regular filesystems can't keep |
|||
up. |
|||
2. Fast network, slow filesystems, small'ish bursty writes: You have |
|||
a 10+Gbps network and wish to transfer amounts of data less than |
|||
your cache filesystem but wish to do so quickly and the time |
|||
between bursts is long enough to migrate data. |
|||
|
|||
With #1 it's arguable if you should be using mergerfs at all. A RAID |
|||
level that can aggregate performance or using higher performance |
|||
storage would probably be the better solution. If you're going to use |
|||
mergerfs there are other tactics that may help: spreading the data |
|||
across filesystems (see the mergerfs.dup tool) and setting |
|||
`func.open=rand`, using `symlinkify`, or using dm-cache or a similar |
|||
technology to add tiered cache to the underlying device itself. |
|||
|
|||
With #2 one could use dm-cache as well but there is another solution |
|||
which requires only mergerfs and a cronjob. |
|||
|
|||
1. Create 2 mergerfs pools. One which includes just the slow branches |
|||
and one which has both the fast branches (SSD,NVME,etc.) and slow |
|||
branches. The 'base' pool and the 'cache' pool. |
|||
2. The 'cache' pool should have the cache branches listed first in |
|||
the branch list. |
|||
3. The best `create` policies to use for the 'cache' pool would |
|||
probably be `ff`, `epff`, `lfs`, `msplfs`, or `eplfs`. The latter |
|||
three under the assumption that the cache filesystem(s) are far |
|||
smaller than the backing filesystems. If using path preserving |
|||
policies remember that you'll need to manually create the core |
|||
directories of those paths you wish to be cached. Be sure the |
|||
permissions are in sync. Use `mergerfs.fsck` to check / correct |
|||
them. You could also set the slow filesystems mode to `NC` though |
|||
that'd mean if the cache filesystems fill you'd get "out of space" |
|||
errors. |
|||
4. Enable `moveonenospc` and set `minfreespace` appropriately. To |
|||
make sure there is enough room on the "slow" pool you might want |
|||
to set `minfreespace` to at least as large as the size of the |
|||
largest cache filesystem if not larger. This way in the worst case |
|||
the whole of the cache filesystem(s) can be moved to the other |
|||
drives. |
|||
5. Set your programs to use the 'cache' pool. |
|||
6. Save one of the below scripts or create you're own. The script's |
|||
responsibility is to move files from the cache filesystems (not |
|||
pool) to the 'base' pool. |
|||
7. Use `cron` (as root) to schedule the command at whatever frequency |
|||
is appropriate for your workflow. |
|||
|
|||
|
|||
### time based expiring |
|||
|
|||
Move files from cache to base pool based only on the last time the |
|||
file was accessed. Replace `-atime` with `-amin` if you want minutes |
|||
rather than days. May want to use the `fadvise` / `--drop-cache` |
|||
version of rsync or run rsync with the tool |
|||
[nocache](https://github.com/Feh/nocache). |
|||
|
|||
**NOTE:** The arguments to these scripts include the cache |
|||
**filesystem** itself. Not the pool with the cache filesystem. You |
|||
could have data loss if the source is the cache pool. |
|||
|
|||
[mergerfs.time-based-mover](https://github.com/trapexit/mergerfs/blob/latest-release/tools/mergerfs.time-based-mover?raw=1) |
|||
|
|||
|
|||
### percentage full expiring |
|||
|
|||
Move the oldest file from the cache to the backing pool. Continue till |
|||
below percentage threshold. |
|||
|
|||
**NOTE:** The arguments to these scripts include the cache |
|||
**filesystem** itself. Not the pool with the cache filesystem. You |
|||
could have data loss if the source is the cache pool. |
|||
|
|||
[mergerfs.percent-full-mover](https://github.com/trapexit/mergerfs/blob/latest-release/tools/mergerfs.percent-full-mover?raw=1) |
Write
Preview
Loading…
Cancel
Save
Reference in new issue