seaweedfs

Commit Graph

Author	SHA1	Message	Date
Chris Lu	6442da6f17	mount: efficient file lookup in large directories, skipping directory caching (#7818 ) * mount: skip directory caching on file lookup and write When opening or creating a file in a directory that hasn't been cached yet, don't list the entire directory. Instead: - For reads: fetch only the single file's metadata directly from the filer - For writes: create on filer but skip local cache insertion This fixes a performance issue where opening a file in a directory with millions of files would hang because EnsureVisited() had to list all entries before the open could complete. The directory will still be cached when explicitly listed (ReadDir), but individual file operations now bypass the full directory caching. Key optimizations: - Extract shared lookupEntry() method to eliminate code duplication - Skip EnsureVisited on Lookup (file open) - Skip cache insertion on Mknod, Mkdir, Symlink, Link if dir not cached - Skip cache update on file sync/flush if dir not cached - If directory IS cached and entry not found, return ENOENT immediately Fixes #7145 * mount: add error handling for meta cache insert/update operations Handle errors from metaCache.InsertEntry and metaCache.UpdateEntry calls instead of silently ignoring them. This prevents silent cache inconsistencies and ensures errors are properly propagated. Files updated: - filehandle_read.go: handle InsertEntry error in downloadRemoteEntry - weedfs_file_sync.go: handle InsertEntry error in doFlush - weedfs_link.go: handle UpdateEntry and InsertEntry errors in Link - weedfs_symlink.go: handle InsertEntry error in Symlink * mount: use error wrapping (%w) for consistent error handling Use %w instead of %v in fmt.Errorf to preserve the original error, allowing it to be inspected up the call stack with errors.Is/As.	4 days ago
Chris Lu	0cd9f34177	mount: improve EnsureVisited performance with dedup, parallelism, and batching (#7697 ) * mount: add singleflight to deduplicate concurrent EnsureVisited calls When multiple goroutines access the same uncached directory simultaneously, they would all make redundant network requests to the filer. This change uses singleflight.Group to ensure only one goroutine fetches the directory entries while others wait for the result. This fixes a race condition where concurrent lookups or readdir operations on the same uncached directory would: 1. Make duplicate network requests to the filer 2. Insert duplicate entries into LevelDB cache 3. Waste CPU and network bandwidth * mount: fetch parent directories in parallel during EnsureVisited Previously, when accessing a deep path like /a/b/c/d, the parent directories were fetched serially from target to root. This change: 1. Collects all uncached directories from target to root first 2. Fetches them all in parallel using errgroup 3. Relies on singleflight (from previous commit) for deduplication This reduces latency when accessing deep uncached paths, especially in high-latency network environments where parallel requests can significantly improve performance. * mount: add batch inserts for LevelDB meta cache When populating the meta cache from filer, entries were inserted one-by-one into LevelDB. This change: 1. Adds BatchInsertEntries method to LevelDBStore that uses LevelDB's native batch write API 2. Updates MetaCache to keep a direct reference to the LevelDB store for batch operations 3. Modifies doEnsureVisited to collect entries and insert them in batches of 100 entries Batch writes are more efficient because: - Reduces number of individual write operations - Reduces disk syncs - Improves throughput for large directories * mount: fix potential nil dereference in MarkChildrenCached Add missing check for inode existence in inode2path map before accessing the InodeEntry. This prevents a potential nil pointer dereference if the inode exists in path2inode but not in inode2path (which could happen due to race conditions or bugs). This follows the same pattern used in IsChildrenCached which properly checks for existence before accessing the entry. * mount: fix batch flush when last entry is hidden The previous batch insert implementation relied on the isLast flag to flush remaining entries. However, if the last entry is a hidden system entry (like 'topics' or 'etc' in root), the callback returns early and the remaining entries in the batch are never flushed. Fix by: 1. Only flush when batch reaches threshold inside the callback 2. Flush any remaining entries after ReadDirAllEntries completes 3. Use error wrapping instead of logging+returning to avoid duplicate logs 4. Create new slice after flush to allow GC of flushed entries 5. Add documentation for batchInsertSize constant This ensures all entries are properly inserted regardless of whether the last entry is hidden, and prevents memory retention issues. * mount: add context support for cancellation in EnsureVisited Thread context.Context through the batch insert call chain to enable proper cancellation and timeout support: 1. Use errgroup.WithContext() so if one fetch fails, others are cancelled 2. Add context parameter to BatchInsertEntries for consistency with InsertEntry 3. Pass context to ReadDirAllEntries for cancellation during network calls 4. Check context cancellation before starting work in doEnsureVisited 5. Use %w for error wrapping to preserve error types for inspection This prevents unnecessary work when one directory fetch fails and makes the batch operations consistent with the existing context-aware APIs.	2 weeks ago
tam-i13	b669607fcd	Add error list each entry func (#7485 ) * added error return in type ListEachEntryFunc * return error if errClose * fix fmt.Errorf * fix return errClose * use %w fmt.Errorf * added entry in messege error * add callbackErr in ListDirectoryEntries * fix error * add log * clear err when the scanner stops on io.EOF, so returning err doesn’t surface EOF as a failure. * more info in error * add ctx to logs, error handling * fix return eachEntryFunc * fix * fix log * fix return * fix foundationdb test s * fix eachEntryFunc * fix return resEachEntryFuncErr * Update weed/filer/filer.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update weed/filer/elastic/v7/elastic_store.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update weed/filer/hbase/hbase_store.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update weed/filer/foundationdb/foundationdb_store.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update weed/filer/ydb/ydb_store.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * fix * add scanErr --------- Co-authored-by: Roman Tamarov <r.tamarov@kryptonite.ru> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: chrislu <chris.lu@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	4 weeks ago
jang1lee	4ad669b2aa	Fixes files with TTL can not be read in a mounted folder (#6646 )	9 months ago
jang1lee	f7f6e1158e	Fixes files with TTL are not listed in a mounted folder. (#6621 )	10 months ago
chrislu	9ca30e52d5	fuse mount handles ttl entries fix https://github.com/seaweedfs/seaweedfs/issues/5527	1 year ago
wusong	0cb9ddd8ec	Fix data loss: add lock for metacache (#4664 ) Co-authored-by: wang wusong <wangwusong@virtaitech.com>	2 years ago
wusong	de081c0d64	[mount] fix metacache update (#4161 )	3 years ago
wusong	549354e324	Fix hardlink counting (#4042 ) Signed-off-by: wusong <wangwusong@virtaitech.com> Signed-off-by: wusong <wangwusong@virtaitech.com> Co-authored-by: wusong <wangwusong@virtaitech.com>	3 years ago
chrislu	303bd067b5	Revert "rename: delete source entry metadata only, skipping hard links" This reverts commit `03466f955e`. fix https://github.com/seaweedfs/seaweedfs/issues/3386	3 years ago
chrislu	26dbc6c905	move to https://github.com/seaweedfs/seaweedfs	3 years ago
chrislu	fcf3714443	mount: add back support for filer.path	4 years ago
chrislu	03466f955e	rename: delete source entry metadata only, skipping hard links	4 years ago
chrislu	dbeeda8123	listen for metadata updates	4 years ago
chrislu	fec8428fd8	POSIX: different inode for same named different file types	4 years ago
chrislu	2dcb8cb93b	POSIX: ensure file and directory inodes are different this is just an in memory representation. POSIX wants different inode numbers for the same named file or directory.	4 years ago
chrislu	10ecf80ca1	add a debug capability to list all metadata keys	4 years ago
Chris Lu	cca62fdb30	mount: streaming renaming folders	4 years ago
Chris Lu	93bb7869b8	Revert "mount: fix renaming a deep directory with unvisited directories" This reverts commit `0ccdb937bb`.	4 years ago
Chris Lu	0ccdb937bb	mount: fix renaming a deep directory with unvisited directories	4 years ago
Chris Lu	215b169562	mount: recursively rename locally	5 years ago
Chris Lu	83c037e093	fix logs	5 years ago
Chris Lu	3a86d4dbfd	mount: fix directory invalidation fix https://github.com/chrislusf/seaweedfs/issues/2038	5 years ago
Chris Lu	b81956bcb5	mount: invalidate kernel cache when mounted to a filer path fix https://github.com/chrislusf/seaweedfs/issues/1752#issuecomment-768178422	5 years ago
Chris Lu	a4063a5437	add stream list directory entries	5 years ago
Chris Lu	f002e668de	change limit to int64 in case of overflow	5 years ago
Chris Lu	9a50dbcda0	chagned api	5 years ago
Chris Lu	e71463a9eb	mount: invalide file cache when metadata is changed	5 years ago
Chris Lu	10a4a628e9	refresh cached file entry from sync metadata updates	5 years ago
Chris Lu	58fa506491	minor	5 years ago
Chris Lu	9b4f7fed14	mount: report filer IO error related to https://github.com/chrislusf/seaweedfs/issues/1530	5 years ago
Chris Lu	b18f21cce1	mount: fix bound tree with filer.path fix https://github.com/chrislusf/seaweedfs/issues/1528	5 years ago
Chris Lu	5e239afdfc	hardlink works now	5 years ago
Chris Lu	0adbb56cc1	rename	5 years ago
Chris Lu	7e1aad0b54	mount: map uid/gid between local and filer	5 years ago
Chris Lu	eb7929a971	rename filer2 to filer	5 years ago
Chris Lu	aee27ccbe1	multiple fixes * adjust isOpen count * move ContinuousDirtyPages lock to filehandle * fix problem with MergeIntoVisibles, avoid reusing slices * let filer delete the garbage	5 years ago
Chris Lu	4ccfdaeb4d	prevent nil	5 years ago
Chris Lu	24c8e6bcb4	minor optimization	5 years ago
Chris Lu	60d14a9800	mount: fix difference with storage format in local cache	6 years ago
Chris Lu	ddec7b2bb9	go fmt	6 years ago
Chris Lu	f7a45d448f	FUSE mount: lazy loading meta cache	6 years ago
Chris Lu	e93588ec78	FUSE mount: atomic local cache updates	6 years ago
Chris Lu	ed3cf811f5	refactoring	6 years ago
Chris Lu	a207285af7	cache metadata on startup	6 years ago
Chris Lu	e24b25de78	async meta caching: can stream updates now	6 years ago
Chris Lu	4f02f7121d	read from meta cache meta cache is not initialized	6 years ago

14 Commits (77a56c28572a550b652f1d49ed1a3166a3bd9eb1)