seaweedfs

Commit Graph

Author	SHA1	Message	Date
Chris Lu	753e1db096	Prevent split-brain: Persistent ClusterID and Join Validation (#8022 ) * Prevent split-brain: Persistent ClusterID and Join Validation - Persist ClusterId in Raft store to survive restarts. - Validate ClusterId on Raft command application (piggybacked on MaxVolumeId). - Prevent masters with conflicting ClusterIds from joining/operating together. - Update Telemetry to report the persistent ClusterId. * Refine ClusterID validation based on feedback - Improved error message in cluster_commands.go. - Added ClusterId mismatch check in RaftServer.Recovery. * Handle Raft errors and support Hashicorp Raft for ClusterId - Check for errors when persisting ClusterId in legacy Raft. - Implement ClusterId generation and persistence for Hashicorp Raft leader changes. - Ensure consistent error logging. * Refactor ClusterId validation - Centralize ClusterId mismatch check in Topology.SetClusterId. - Simplify MaxVolumeIdCommand.Apply and RaftServer.Recovery to rely on SetClusterId. * Fix goroutine leak and add timeout - Handle channel closure in Hashicorp Raft leader listener. - Add timeout to Raft Apply call to prevent blocking. * Fix deadlock in legacy Raft listener - Wrap ClusterId generation/persistence in a goroutine to avoid blocking the Raft event loop (deadlock). * Rename ClusterId to SystemId - Renamed ClusterId to SystemId across the codebase (protobuf, topology, server, telemetry). - Regenerated telemetry.pb.go with new field. * Rename SystemId to TopologyId - Rename to SystemId was intermediate step. - Final name is TopologyId for the persistent cluster identifier. - Updated protobuf, topology, raft server, master server, and telemetry. * Optimize Hashicorp Raft listener - Integrated TopologyId generation into existing monitorLeaderLoop. - Removed extra goroutine in master_server.go. * Fix optimistic TopologyId update - Removed premature local state update of TopologyId in master_server.go and raft_hashicorp.go. - State is now solely updated via the Raft state machine Apply/Restore methods after consensus. * Add explicit log for recovered TopologyId - Added glog.V(0) info log in RaftServer.Recovery to print the recovered TopologyId on startup. * Add Raft barrier to prevent TopologyId race condition - Implement ensureTopologyId helper method - Send no-op MaxVolumeIdCommand to sync Raft log before checking TopologyId - Ensures persisted TopologyId is recovered before generating new one - Prevents race where generation happens during log replay * Serialize TopologyId generation with mutex - Add topologyIdGenLock mutex to MasterServer struct - Wrap ensureTopologyId method with lock to prevent concurrent generation - Fixes race where event listener and manual leadership check both generate IDs - Second caller waits for first to complete and sees the generated ID * Add TopologyId recovery logging to Apply method - Change log level from V(1) to V(0) for visibility - Log 'Recovered TopologyId' when applying from Raft log - Ensures recovery is visible whether from snapshot or log replay - Matches Recovery() method logging for consistency * Fix Raft barrier timing issue - Add 100ms delay after barrier command to ensure log application completes - Add debug logging to track barrier execution and TopologyId state - Return early if barrier command fails - Prevents TopologyId generation before old logs are fully applied * ensure leader * address comments * address comments * redundant * clean up * double check * refactoring * comment	3 days ago
Chris Lu	07dc552e1c	master: Fix raft url (#7255 ) * fix signature * fix url scheme	4 months ago
Dmitriy Pavlov	cd78e653e1	add disable volume_growth flag (#7196 )	5 months ago
Chris Lu	e446234e9c	remove spoof-able request header (#7103 ) * remove spoof-able request header https://github.com/seaweedfs/seaweedfs/issues/7094#issuecomment-3158320497 * Update weed/security/guard.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	6 months ago
Chris Lu	0703308270	remote address parsing should handle special cases (#7101 ) * remote address parsing should handle special cases * handling ipv6 * simplify * Update weed/security/guard.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update weed/security/guard.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * x-real-ip * Update guard.go * fixes Hostname Whitelisting: Fully restored - supports localhost, example.com, etc. IP Whitelisting: Still works - supports exact IPs and CIDR ranges Header Support: Consistent handling of X-Forwarded-For, X-Real-IP * simplify * Update weed/security/guard.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update weed/security/guard.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update guard.go * adjust function signature * Update weed/security/guard.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * indention * skip empty host --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	6 months ago
chrislu	798f797158	use float for sleep seconds fix https://github.com/seaweedfs/seaweedfs/pull/6795	7 months ago
chrislu	1733d0ce68	remove features and deployments fields	7 months ago
Chris Lu	a1aab8a083	add telemetry (#6926 ) * add telemetry * fix go mod * add default telemetry server url * Update README.md * replace with broker count instead of s3 count * Update telemetry.pb.go * github action to deploy	7 months ago
Aleksey Kosov	5182d46e22	Added middleware for processing request_id grpc and http requests (#6805 )	8 months ago
Lisandro Pin	fc4df944a0	Remove rate limit semaphore on master's leader selection logic. (#6494 ) This was introduced by `054374c7` (2024-03-12) and serves no practical purpose, yet it caps the maximum QPS master servers can handle.	12 months ago
Konstantin Lebedev	b65eb2ec45	[security] reload whiteList on http seerver (#6302 ) * reload whiteList * white_list add to scaffold	1 year ago
Konstantin Lebedev	fec88e64eb	[master] update LastLeaderChangeTime for hashicorp raft (#6292 )	1 year ago
chrislu	ccf1795e6f	wait a bit before getting the next volume id if the leader is recently elected	1 year ago
chrislu	6564ceda91	skip resource heavy commands from running on master nodes	1 year ago
chrislu	4463296811	add parallel vacuuming	1 year ago
Riccardo Bertossa	6fe8639504	add http endpoint to get the size of a collection (#5910 )	1 year ago
wyang	4b1f539ab8	fix allocate reduplicated volumeId to different volume (#5811 ) * fix allocate reduplicated volumeId to different volume * only check barrier when read --------- Co-authored-by: Yang Wang <yangwang@weride.ai>	2 years ago
vadimartynov	86d92a42b4	Added tls for http clients (#5766 ) * Added global http client * Added Do func for global http client * Changed the code to use the global http client * Fix http client in volume uploader * Fixed pkg name * Fixed http util funcs * Fixed http client for bench_filer_upload * Fixed http client for stress_filer_upload * Fixed http client for filer_server_handlers_proxy * Fixed http client for command_fs_merge_volumes * Fixed http client for command_fs_merge_volumes and command_volume_fsck * Fixed http client for s3api_server * Added init global client for main funcs * Rename global_client to client * Changed: - fixed NewHttpClient; - added CheckIsHttpsClientEnabled func - updated security.toml in scaffold * Reduce the visibility of some functions in the util/http/client pkg * Added the loadSecurityConfig function * Use util.LoadSecurityConfiguration() in NewHttpClient func	2 years ago
Konstantin Lebedev	67edf1d014	[master] Do Automatic Volume Grow in background (#5781 ) * Do Automatic Volume Grow in backgound * pass lastGrowCount to master * fix build * fix count to uint64	2 years ago
vadimartynov	8aae82dd71	Added context for the MasterClient's methods to avoid endless loops (#5628 ) * Added context for the MasterClient's methods to avoid endless loops * Returned WithClient function. Added WithClientCustomGetMaster function * Hid unused ctx arguments * Using a common context for the KeepConnectedToMaster and WaitUntilConnected functions * Changed the context termination check in the tryConnectToMaster function * Added a child context to the tryConnectToMaster function * Added a common context for KeepConnectedToMaster and WaitUntilConnected functions in benchmark	2 years ago
shenxingwuying	ee25ada732	reduce ambiguity about use memory_sequencer (#5555 )	2 years ago
chrislu	55976ae04a	avoid repeated calls to heavy-weighted viper	2 years ago
chrislu	d9490c5e1f	rename	2 years ago
Nico D'Cotta	796b7508f3	Implement SRV lookups for filer (#4767 )	2 years ago
chrislu	a315490f7d	proxy to master uses http address fix https://github.com/seaweedfs/seaweedfs/issues/4607	3 years ago
chrislu	adb90bd252	avoid lower casing the command fix https://github.com/seaweedfs/seaweedfs/pull/4321	3 years ago
Konstantin Lebedev	b9933d5589	master server graceful stop (#3797 )	3 years ago
Konstantin Lebedev	e90ab4ac60	avoid race conditions for OnPeerUpdate (#3525 ) https://github.com/seaweedfs/seaweedfs/issues/3524	3 years ago
Patrick Schmidt	7b424a54dc	Add raft server access mutex to avoid races (#3503 )	3 years ago
chrislu	10414fd81c	ping timeout at 15 seconds this 72 minute timeout setting seems unreasonably long 15 seconds is around the time when a new raft leader should be elected.	3 years ago
askeipx	2e78a522ab	remove old raft servers if they don't answer to pings for too long (#3398 ) * remove old raft servers if they don't answer to pings for too long add ping durations as options rename ping fields fix some todos get masters through masterclient raft remove server from leader use raft servers to ping them CheckMastersAlive for hashicorp raft only * prepare blocking ping * pass waitForReady as param * pass waitForReady through all functions * waitForReady works * refactor * remove unneeded params * rollback unneeded changes * fix	3 years ago
Konstantin Lebedev	4d4cd0948d	avoid infinite loop WaitUntilConnected() (#3431 ) https://github.com/seaweedfs/seaweedfs/issues/3421	3 years ago
Konstantin Lebedev	a98f6d66a3	rollback over onPeerupdate implementation of automatic clean-up of failed servers in favor of synchronous ping	4 years ago
chrislu	26dbc6c905	move to https://github.com/seaweedfs/seaweedfs	4 years ago
chrislu	bb01b68fa0	refactor	4 years ago
chrislu	68065128b8	add dc and rack	4 years ago
chrislu	3828b8ce87	"github.com/chrislusf/raft" => "github.com/seaweedfs/raft"	4 years ago
Konstantin Lebedev	c88ea31f62	fix RUnlock of unlocked RWMutex	4 years ago
Konstantin Lebedev	3c42814b58	avoid deadlock	4 years ago
Konstantin Lebedev	93ca87b7cb	use safe onPeerUpdateDoneCns	4 years ago
Konstantin Lebedev	7875470e74	onPeerUpdateGoroutineCount use int32	4 years ago
Konstantin Lebedev	6c390851e7	fix design	4 years ago
Konstantin Lebedev	f6a966b4fc	add waiting log message	4 years ago
Konstantin Lebedev	6cfbfb0849	check for ping before deleting raft server https://github.com/chrislusf/seaweedfs/issues/3083	4 years ago
Konstantin Lebedev	f419d5643a	fix typo add remove logs	4 years ago
chrislu	24291e23eb	refactor	4 years ago
chrislu	9f20d3ebd1	add dc and rack	4 years ago
chrislu	6adc42147f	fresh filer store bootstrap from the oldest peer	4 years ago
chrislu	b201edb9df	fix wrong assignment	4 years ago
chrislu	9271866d1e	fix segmentation violation fix https://github.com/chrislusf/seaweedfs/issues/3000	4 years ago

140 Commits (59dfe047b6e12ebf67f77c3473bba3b4d44e1090)