# Usage Patterns

## tiered cache

Some storage technologies support what is called "tiered" caching:
placing smaller, faster storage as a transparent cache in front of
larger, slower storage. NVMe, SSD, or Optane in front of traditional
HDDs, for instance.

mergerfs does not natively support any sort of tiered caching. Most
users have no use for such a feature and its inclusion would
complicate the code as it exists today. However, there are a few
situations where a cache filesystem could help with a typical mergerfs
setup.

1. Fast network, slow filesystems, many readers: You have a 10+Gbps
   network with many readers and your regular filesystems can't keep
   up.
2. Fast network, slow filesystems, smallish bursty writes: You have
   a 10+Gbps network and wish to quickly transfer amounts of data
   smaller than your cache filesystem, and the time between bursts is
   long enough to migrate the data.

With #1 it's arguable whether you should be using mergerfs at all. A
RAID level that can aggregate performance or higher performance
storage would probably be the better solution. If you're going to use
mergerfs there are other tactics that may help: spreading the data
across filesystems (see the mergerfs.dup tool) and setting
`func.open=rand`, using `symlinkify`, or using dm-cache or a similar
technology to add tiered cache to the underlying device itself.

With #2 one could use a block cache solution as available via LVM and
dm-cache, but there is another solution which requires only mergerfs, a
script to move files around, and a cron job to run said script.

* Create two mergerfs pools: the **base** pool, which includes just
  the **slow** branches, and the **cache** pool, which has both the
  **fast** branches (SSD, NVMe, etc.) and the **slow** branches.
* The **cache** pool should have the cache branches listed first in
  the branch list in order to make it easier to prioritize them.
* The best `create` policies to use for the **cache** pool would
  probably be `ff`, `lus`, or `lfs`, the latter two under the
  assumption that the cache filesystem(s) are far smaller than the
  backing filesystems.
* You can also set the **slow** filesystems' mode to `NC`, which would
  give you the ability to use other `create` policies, though that
  would mean that if the cache filesystems fill you'd get "out of
  space" errors. This, however, may be good as it would indicate the
  script moving files around is not configured properly.
* Set your programs to use the **cache** pool.
* Configure the **base** pool with whichever `create` policy lays out
  files the way you want.
* Save one of the below scripts or create your own. The script's
  responsibility is to move files from the **cache** branches (not
  pool) to the **base** pool.
* Use `cron` (as root) to schedule the command at whatever frequency
  is appropriate for your workflow.
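
The pool layout described in the list above might be mounted along the lines of the following sketch. The slow branch paths (`/mnt/hdd0`, `/mnt/hdd1`), the `/mnt/cache-pool` mount point, and the `mfs` policy for the base pool are assumptions chosen purely for illustration. The script only prints the `mergerfs` commands so they can be reviewed before being run or turned into `/etc/fstab` entries.

```shell
#!/bin/sh
# Sketch only: prints mount commands instead of running them.
# Branch paths, mount points and the base pool's "mfs" policy are
# illustrative assumptions, not recommendations.

FAST_BRANCHES="/mnt/ssd/cache00"
SLOW_BRANCHES="/mnt/hdd0:/mnt/hdd1"

# Base pool: slow branches only.
base_pool_cmd() {
    printf 'mergerfs -o category.create=mfs %s /mnt/base-pool\n' \
        "$SLOW_BRANCHES"
}

# Cache pool: fast branches listed first, then the slow ones, so that
# the "ff" create policy considers the cache branches before the others.
cache_pool_cmd() {
    printf 'mergerfs -o category.create=ff %s:%s /mnt/cache-pool\n' \
        "$FAST_BRANCHES" "$SLOW_BRANCHES"
}

base_pool_cmd
cache_pool_cmd
```

Programs would then be pointed at `/mnt/cache-pool`, while the mover script takes `/mnt/ssd/cache00` as its source and `/mnt/base-pool` as its destination.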

### time based expiring

Move files from the cache filesystem to the base pool which have an
access time older than the supplied number of days. Replace `-atime`
with `-amin` in the script if you want minutes rather than days.

**NOTE:** The arguments to these scripts include the cache
**filesystem** itself, not the pool with the cache filesystem. You
could have data loss if the source is the cache pool.
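
For readers writing their own mover, the selection logic can be sketched roughly as below. This is not the `mergerfs.time-based-mover` tool itself. The function name `move_old_files` is hypothetical, the sketch uses plain `mv` rather than whatever the tool uses internally, and it assumes filenames contain no newlines.

```shell
#!/bin/sh
# Sketch of time-based expiry; NOT the mergerfs.time-based-mover tool.
# Hypothetical usage: move_old_files /mnt/ssd/cache00 /mnt/base-pool 1

move_old_files() {
    cache=${1%/}    # cache filesystem (a branch, never the cache pool)
    base=${2%/}     # base pool mount point
    days=$3

    # -atime +N matches files whose age since last access, counted in
    # whole 24 hour periods with the fraction discarded, exceeds N.
    find "$cache" -type f -atime +"$days" | while IFS= read -r path; do
        rel=${path#"$cache"/}                 # path relative to the cache root
        mkdir -p "$base/$(dirname "$rel")"    # recreate the directory layout
        mv -- "$path" "$base/$rel"
    done
}
```

Because the destination is the base pool rather than a specific branch, the base pool's `create` policy decides where each moved file lands.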

[mergerfs.time-based-mover](https://github.com/trapexit/mergerfs/blob/latest-release/tools/mergerfs.time-based-mover?raw=1)

Download:

```
curl -o /usr/local/bin/mergerfs.time-based-mover https://raw.githubusercontent.com/trapexit/mergerfs/refs/heads/latest-release/tools/mergerfs.time-based-mover
# cron invokes the script directly, so it must be executable
chmod +x /usr/local/bin/mergerfs.time-based-mover
```

crontab entry:

```
# m h dom mon dow command
0 * * * * /usr/local/bin/mergerfs.time-based-mover /mnt/ssd/cache00 /mnt/base-pool 1
```

If you have more than one cache filesystem then simply add a cron
entry for each.

If you only want to move files from a subdirectory then pass the
subdirectories instead: `/mnt/ssd/cache00/foo` and
`/mnt/base-pool/foo` respectively.
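
As an illustration of the two variations just described, a crontab covering two cache filesystems, or alternatively restricting the move to a subdirectory, might look like the sketch below. The second cache path (`/mnt/ssd/cache01`) and the staggered minute fields are assumptions made for the example.

```
# m h dom mon dow command
# one entry per cache filesystem (cache01 is a hypothetical second device)
0  * * * * /usr/local/bin/mergerfs.time-based-mover /mnt/ssd/cache00 /mnt/base-pool 1
30 * * * * /usr/local/bin/mergerfs.time-based-mover /mnt/ssd/cache01 /mnt/base-pool 1
# alternatively, limit the move to the "foo" subdirectory
0  * * * * /usr/local/bin/mergerfs.time-based-mover /mnt/ssd/cache00/foo /mnt/base-pool/foo 1
```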

### percentage full expiring

While the cache filesystem's percentage full is above the provided
value, move the oldest file from the cache filesystem to the base pool.

**NOTE:** The arguments to these scripts include the cache
**filesystem** itself, not the pool with the cache filesystem. You
could have data loss if the source is the cache pool.
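
The loop described above can be sketched roughly as follows. This is not the `mergerfs.percent-full-mover` tool itself. The function names are hypothetical, "oldest" is taken here to mean oldest access time, and the sketch assumes GNU `find` (for `-printf`) and filenames without newlines.

```shell
#!/bin/sh
# Sketch of percentage-full expiry; NOT the mergerfs.percent-full-mover tool.
# Hypothetical usage: move_until_below /mnt/ssd/cache00 /mnt/base-pool 80

# Integer percentage in use on the filesystem holding "$1".
percent_used() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Path, relative to "$1", of the file with the oldest access time.
oldest_file() {
    find "$1" -type f -printf '%A@ %P\n' | sort -n | head -n 1 | cut -d' ' -f2-
}

move_until_below() {
    cache=${1%/}    # cache filesystem (a branch, never the cache pool)
    base=${2%/}     # base pool mount point
    limit=$3

    while [ "$(percent_used "$cache")" -gt "$limit" ]; do
        rel=$(oldest_file "$cache")
        [ -n "$rel" ] || break                # nothing left to move
        mkdir -p "$base/$(dirname "$rel")"    # recreate the directory layout
        mv -- "$cache/$rel" "$base/$rel"
    done
}
```

The emptiness check matters: without it the loop would spin forever if the filesystem stays above the limit after every file has been moved, for example when other data shares the device.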

[mergerfs.percent-full-mover](https://github.com/trapexit/mergerfs/blob/latest-release/tools/mergerfs.percent-full-mover?raw=1)

Download:

```
curl -o /usr/local/bin/mergerfs.percent-full-mover https://raw.githubusercontent.com/trapexit/mergerfs/refs/heads/latest-release/tools/mergerfs.percent-full-mover
# cron invokes the script directly, so it must be executable
chmod +x /usr/local/bin/mergerfs.percent-full-mover
```

crontab entry:

```
# m h dom mon dow command
0 * * * * /usr/local/bin/mergerfs.percent-full-mover /mnt/ssd/cache00 /mnt/base-pool 80
```

If you have more than one cache filesystem then simply add a cron
entry for each.

If you only want to move files from a subdirectory then pass the
subdirectories instead: `/mnt/ssd/cache00/foo` and
`/mnt/base-pool/foo` respectively.