# Usage Patterns

## tiered cache

Some storage technologies support what is called "tiered" caching:
placing smaller, faster storage as a transparent cache in front of
larger, slower storage. NVMe, SSD, or Optane in front of traditional
HDDs, for instance.

mergerfs does not natively support any sort of tiered caching. Most
users have no use for such a feature and its inclusion would
complicate the code as it exists today. However, there are a few
situations where a cache filesystem could help with a typical mergerfs
setup.

1. Fast network, slow filesystems, many readers: You have a 10+Gbps
   network with many readers and your regular filesystems can't keep
   up.
2. Fast network, slow filesystems, small'ish bursty writes: You have
   a 10+Gbps network and wish to transfer amounts of data smaller than
   your cache filesystem, but wish to do so quickly, and the time
   between bursts is long enough to migrate data.

With #1 it's arguable whether you should be using mergerfs at all. A
RAID level that can aggregate performance or higher performance
storage would probably be the better solution. If you're going to use
mergerfs there are other tactics that may help: spreading the data
across filesystems (see the mergerfs.dup tool) and setting
`func.open=rand`, using `symlinkify`, or using dm-cache or a similar
technology to add tiered cache to the underlying device itself.

With #2 one could use dm-cache as well but there is another solution
which requires only mergerfs and a cronjob.

1. Create 2 mergerfs pools: one which includes just the slow branches
   and one which has both the fast branches (SSD, NVMe, etc.) and the
   slow branches. These are the 'base' pool and the 'cache' pool
   respectively.
2. The 'cache' pool should have the cache branches listed first in
   the branch list.
3. The best `create` policies to use for the 'cache' pool would
   probably be `ff`, `epff`, `lfs`, `msplfs`, or `eplfs`. The latter
   three work under the assumption that the cache filesystem(s) are
   far smaller than the backing filesystems. If using path preserving
   policies remember that you'll need to manually create the core
   directories of those paths you wish to be cached. Be sure the
   permissions are in sync. Use `mergerfs.fsck` to check / correct
   them. You could also set the slow filesystems' mode to `NC`, though
   that would mean that if the cache filesystems fill you'd get "out
   of space" errors.
4. Enable `moveonenospc` and set `minfreespace` appropriately. To
   make sure there is enough room on the "slow" pool you might want
   to set `minfreespace` to at least the size of the largest cache
   filesystem, if not larger. This way in the worst case the whole of
   the cache filesystem(s) can be moved to the other drives.
5. Set your programs to use the 'cache' pool.
6. Save one of the below scripts or create your own. The script's
   responsibility is to move files from the cache filesystems (not
   pool) to the 'base' pool.
7. Use `cron` (as root) to schedule the command at whatever frequency
   is appropriate for your workflow.

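Steps 1 through 4 can be pictured as two `/etc/fstab` entries. The sketch below only formats such entries as strings; it is a minimal illustration, not a recommended configuration. The branch paths, mount points, the `mfs` policy on the 'base' pool, and the `500G` value are all hypothetical placeholders, and the fstab layout is an assumption about a typical setup.

```python
# Sketch only: every path, mount point, and size here is a hypothetical
# placeholder. Adjust them to your own filesystems.


def fstab_line(branches, mountpoint, options):
    """Format one /etc/fstab entry for a mergerfs pool."""
    return "{} {} mergerfs {} 0 0".format(
        ":".join(branches), mountpoint, ",".join(options)
    )


cache_branches = ["/mnt/nvme0"]             # fast filesystem(s)
slow_branches = ["/mnt/hdd0", "/mnt/hdd1"]  # slow filesystems

# 'base' pool: slow branches only. The create policy is an arbitrary
# illustrative choice; the text above does not prescribe one.
base_line = fstab_line(slow_branches, "/mnt/base", ["category.create=mfs"])

# 'cache' pool: cache branches listed first, then the slow branches.
cache_line = fstab_line(
    cache_branches + slow_branches,
    "/mnt/cache",
    [
        "category.create=ff",    # one of the policies suggested in step 3
        "moveonenospc=true",     # step 4
        "minfreespace=500G",     # assumed >= size of the largest cache fs
    ],
)

print(base_line)
print(cache_line)
```

Programs would then be pointed at the 'cache' pool's mount point, while the mover script reads from the cache filesystem itself and writes to the 'base' pool.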
### time based expiring

Move files from the cache to the base pool based only on the last time
the file was accessed. Replace `-atime` with `-amin` if you want
minutes rather than days. You may want to use the `fadvise` /
`--drop-cache` version of rsync or run rsync with the tool
[nocache](https://github.com/Feh/nocache).

**NOTE:** The arguments to these scripts include the cache
**filesystem** itself, not the pool with the cache filesystem. You
could have data loss if the source is the cache pool.

[mergerfs.time-based-mover](https://github.com/trapexit/mergerfs/blob/latest-release/tools/mergerfs.time-based-mover?raw=1)

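For readers writing their own mover, the idea can be sketched in Python. This is not the linked script; it is a minimal illustration that assumes a plain copy-then-delete via `shutil.move` is acceptable, and the function name `expire_by_atime` and its parameters are hypothetical.

```python
import os
import shutil
import time


def expire_by_atime(cache_fs, base_pool, max_age_days, now=None):
    """Move files not accessed within max_age_days to base_pool.

    cache_fs must be the cache filesystem itself, not a pool that
    contains it. Returns the relative paths that were moved.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    moved = []
    for dirpath, _dirnames, filenames in os.walk(cache_fs):
        for name in filenames:
            src = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(src) or not os.path.isfile(src):
                continue
            # Keep files accessed at or after the cutoff.
            if os.stat(src).st_atime >= cutoff:
                continue
            rel = os.path.relpath(src, cache_fs)
            dst = os.path.join(base_pool, rel)
            # Recreate the relative directory layout on the base pool.
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.move(src, dst)
            moved.append(rel)
    return moved
```

A call such as `expire_by_atime("/mnt/nvme0", "/mnt/base", 7)` (paths hypothetical) would move anything untouched for a week. Access times are only meaningful if the cache filesystem is not mounted with `noatime`.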
### percentage full expiring

Move the oldest file from the cache to the backing pool. Continue
until the cache is below the percentage threshold.

**NOTE:** The arguments to these scripts include the cache
**filesystem** itself, not the pool with the cache filesystem. You
could have data loss if the source is the cache pool.

[mergerfs.percent-full-mover](https://github.com/trapexit/mergerfs/blob/latest-release/tools/mergerfs.percent-full-mover?raw=1)
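The same loop can be sketched in Python for those rolling their own. This is not the linked script; it assumes "oldest" means least recently accessed, and the names `percent_used` and `expire_until_below` are hypothetical. The usage check is passed in as a parameter so the logic can be exercised without filling a real disk.

```python
import os
import shutil


def percent_used(path):
    """Return the percentage of the filesystem at path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def expire_until_below(cache_fs, base_pool, threshold, used_fn=percent_used):
    """Move least recently accessed files until usage <= threshold.

    cache_fs must be the cache filesystem itself, not a pool that
    contains it. Returns the relative paths moved, oldest first.
    """
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(cache_fs):
        for name in filenames:
            src = os.path.join(dirpath, name)
            if os.path.isfile(src) and not os.path.islink(src):
                candidates.append((os.stat(src).st_atime, src))
    candidates.sort()  # smallest atime (oldest access) first

    moved = []
    for _atime, src in candidates:
        # Re-check usage before every move and stop once under threshold.
        if used_fn(cache_fs) <= threshold:
            break
        rel = os.path.relpath(src, cache_fs)
        dst = os.path.join(base_pool, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.move(src, dst)
        moved.append(rel)
    return moved
```

Re-checking usage before each move keeps the number of files moved to the minimum needed to get under the threshold.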