Fix webhook duplicate deliveries and POST to GET conversion (#7668)
* Fix webhook duplicate deliveries and POST to GET conversion

Fixes #7667

This commit addresses two critical issues with the webhook notification system:

1. Duplicate webhook deliveries based on worker count
2. POST requests being converted to GET when following redirects

Issue 1: Multiple webhook deliveries
------------------------------------

Problem: The webhook queue was creating multiple handlers (one per worker) that all subscribed to the same topic. With Watermill's gochannel, each handler creates a separate subscription, and all subscriptions receive their own copy of every message, resulting in duplicate webhook calls equal to the worker count.

Solution: Use a single handler instead of multiple handlers to ensure each webhook event is sent only once, regardless of worker configuration.

Issue 2: POST to GET conversion with intelligent redirect handling
------------------------------------------------------------------

Problem: When webhook endpoints returned redirects (301/302/303), Go's default HTTP client would automatically follow them and convert POST requests to GET requests per HTTP specification.
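
The redirect problem described above can be reproduced with the standard library alone. The following is a minimal sketch, not code from this commit: the `/old` and `/new` paths and the helper name `methodSeenAfterRedirect` are made up for the demonstration. It POSTs to a local test server that redirects, then reports which method the redirect target actually received.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// methodSeenAfterRedirect POSTs to a test endpoint that answers with the given
// redirect status and reports which HTTP method reached the redirect target.
// The /old and /new paths are hypothetical and exist only for this demo.
func methodSeenAfterRedirect(status int) (string, error) {
	var (
		mu   sync.Mutex
		seen string
	)
	mux := http.NewServeMux()
	mux.HandleFunc("/old", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/new", status)
	})
	mux.HandleFunc("/new", func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		seen = r.Method
		mu.Unlock()
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	// http.Post uses http.DefaultClient, which follows redirects automatically.
	resp, err := http.Post(srv.URL+"/old", "application/json", strings.NewReader(`{"event":"demo"}`))
	if err != nil {
		return "", err
	}
	resp.Body.Close()

	mu.Lock()
	defer mu.Unlock()
	return seen, nil
}

func main() {
	statuses := []int{
		http.StatusMovedPermanently,  // 301
		http.StatusFound,             // 302
		http.StatusSeeOther,          // 303
		http.StatusTemporaryRedirect, // 307
	}
	for _, status := range statuses {
		method, err := methodSeenAfterRedirect(status)
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Printf("%d -> target saw %s\n", status, method)
	}
}
```

With the default client, the target sees GET for 301, 302, and 303; only 307 (and 308) keep the original method and body, which is why a webhook endpoint behind a 301/302/303 redirect would lose its POST payload.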

Solution: Implement intelligent redirect handling that:

- Prevents automatic redirects to preserve POST method
- Manually follows redirects by recreating POST requests
- Caches the final redirect destination for performance
- Invalidates cache and retries on failures (network or HTTP errors)
- Provides automatic recovery from cached endpoint failures

Benefits:

- Webhooks are now sent exactly once per event
- POST method is always preserved through redirects
- Reduced latency through redirect destination caching
- Automatic failover when cached destinations become unavailable
- Thread-safe concurrent webhook delivery

Testing:

- Added TestQueueNoDuplicateWebhooks to verify single delivery
- Added TestHttpClientFollowsRedirectAsPost for redirect handling
- Added TestHttpClientUsesCachedRedirect for caching behavior
- Added cache invalidation tests for error scenarios
- All 18 webhook tests pass successfully

* Address code review comments

- Add maxWebhookRetryDepth constant to avoid magic number
- Extract cache invalidation logic into invalidateCache() helper method
- Fix redirect handling to properly follow redirects even on retry attempts
- Remove misleading comment about nWorkers controlling handler parallelism
- Fix test assertions to match actual execution flow
- Remove trailing whitespace in test file

All tests passing.

* Refactor: use setFinalURL() instead of invalidateCache()

Replace invalidateCache() with more explicit setFinalURL() function. This is cleaner as it makes the intent clear - we're setting the URL (either to a value or to empty string to clear it), rather than having a separate function just for clearing.

No functional changes, all tests passing.

* Add concurrent webhook delivery using nWorkers configuration

Webhooks were previously sent sequentially (one-by-one), which could be a performance bottleneck for high-throughput scenarios. Now nWorkers configuration is properly used to control concurrent webhook delivery.

Implementation:

- Added semaphore channel (buffered to nWorkers capacity)
- handleWebhook acquires semaphore slot before sending (blocks if at capacity)
- Releases slot after webhook completes
- Allows up to nWorkers concurrent webhook HTTP requests

Benefits:

- Improved throughput for slow webhook endpoints
- nWorkers config now has actual purpose (was validated but unused)
- Default 5 workers provides good balance
- Configurable from 1-100 workers based on needs

Example performance improvement:

- Before: 500ms webhook latency = ~2 webhooks/sec max
- After (5 workers): 500ms latency = ~10 webhooks/sec
- After (10 workers): 500ms latency = ~20 webhooks/sec

All tests passing.

* Replace deprecated AddNoPublisherHandler with AddConsumerHandler

AddNoPublisherHandler is deprecated in Watermill. Use AddConsumerHandler instead, which is the current recommended API for handlers that only consume messages without publishing.

No functional changes, all tests passing.

* Drain response bodies to enable HTTP connection reuse

Added drainBody() calls in all code paths to ensure response bodies are consumed before returning. This is critical for HTTP keep-alive connection reuse.

Without draining:

- Connections are closed after each request
- New TCP handshake + TLS handshake for every webhook
- Higher latency and resource usage

With draining:

- Connections are reused via HTTP keep-alive
- Significant performance improvement for repeated webhooks
- Lower latency (no handshake overhead)
- Reduced resource usage

Implementation:

- Added drainBody() helper that reads up to 1MB (prevents memory issues)
- Drain on success path (line 161)
- Drain on error responses before retry (lines 119, 152)
- Drain on redirect responses before following (line 118)
- Already had drainResponse() for network errors (line 99)

All tests passing.
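
The semaphore-based concurrency limit described in the nWorkers commit above can be sketched as follows. This is an illustration under assumptions, not the code in webhook_queue.go: the `limiter` type, its `do` method, and the `maxInFlight` helper are hypothetical names, and a short sleep stands in for the webhook HTTP request.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// limiter caps the number of sends running at the same time. The type and
// method names are illustrative, not taken from webhook_queue.go.
type limiter struct {
	sem chan struct{} // buffered to nWorkers; one token per in-flight send
}

func newLimiter(nWorkers int) *limiter {
	return &limiter{sem: make(chan struct{}, nWorkers)}
}

// do blocks until a slot is free, runs send, then releases the slot.
func (l *limiter) do(send func()) {
	l.sem <- struct{}{}        // acquire (blocks when nWorkers sends are running)
	defer func() { <-l.sem }() // release
	send()
}

// maxInFlight pushes `events` fake sends through a limiter with nWorkers slots
// and returns the highest number of sends observed running concurrently.
func maxInFlight(nWorkers, events int) int32 {
	l := newLimiter(nWorkers)
	var inFlight, peak int32
	var wg sync.WaitGroup
	for i := 0; i < events; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.do(func() {
				n := atomic.AddInt32(&inFlight, 1)
				// Record a new peak if this send raised the running count.
				for {
					p := atomic.LoadInt32(&peak)
					if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
						break
					}
				}
				time.Sleep(5 * time.Millisecond) // stand-in for the HTTP request
				atomic.AddInt32(&inFlight, -1)
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	fmt.Println("peak concurrency with 5 slots:", maxInFlight(5, 20))
}
```

The throughput figures quoted above follow from the same idea: with a fixed per-request latency, the ceiling is roughly nWorkers divided by that latency, so 5 slots at 500ms gives about 10 webhooks per second.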

* Use existing CloseResponse utility instead of custom drainBody

Replaced custom drainBody() function with the existing util_http.CloseResponse() utility which is already used throughout the codebase.

This provides:

- Consistent behavior with rest of the codebase
- Better logging (logs bytes drained via CountingReader)
- Full body drainage (not limited to 1MB)
- Cleaner code (no duplication)

CloseResponse properly drains and closes the response body to enable HTTP keep-alive connection reuse.

All tests passing.

* Fix: Don't overwrite original error when draining response

Before: err was being overwritten by drainResponse() result
After: Use drainErr to avoid losing the original client.Do() error

This was a subtle bug where if drainResponse() succeeded (returned nil), we would lose the original network error and potentially return a confusing error message.

All tests passing.

* Optimize HTTP client: reuse client and remove redundant timeout

1. Reuse single http.Client instance instead of creating new one per request
   - Reduces allocation overhead
   - More efficient for high-volume webhooks

2. Remove redundant timeout configuration
   - Before: timeout set on both context AND http.Client
   - After: timeout only on context (cleaner, context fires first anyway)

Performance benefits:

- Reduced GC pressure (fewer client allocations)
- Better connection pooling (single transport instance)
- Cleaner code (no redundancy)

All tests passing.
Committed by GitHub
4 changed files with 564 additions and 16 deletions
- weed/notification/webhook/http.go (106 lines changed)
- weed/notification/webhook/http_test.go (372 lines changed)
- weed/notification/webhook/webhook_queue.go (29 lines changed)
- weed/notification/webhook/webhook_queue_test.go (73 lines changed)