Rate Limiting¶
relay provides a client-side rate limiter that throttles outgoing requests to protect upstream services and stay within API quotas.
Configuration¶
import "github.com/jhonsferg/relay"
client := relay.New(relay.Config{
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
RPS: 100, // requests per second
Burst: 20, // burst allowance
},
})
The underlying implementation uses a token-bucket algorithm. RPS sets the refill rate and Burst sets the bucket capacity.
Strategies¶
Token bucket (default)¶
Allows short bursts then smooths out to RPS:
client := relay.New(relay.Config{
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
RPS: 50,
Burst: 10,
Strategy: relay.TokenBucket, // default
},
})
Fixed window¶
Hard limit of N requests per second window. No burst allowed:
client := relay.New(relay.Config{
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
RPS: 30,
Strategy: relay.FixedWindow,
},
})
Leaky bucket¶
Requests are queued and emitted at a constant rate:
client := relay.New(relay.Config{
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
RPS: 20,
QueueSize: 100, // max queued requests
Strategy: relay.LeakyBucket,
},
})
Behaviour when limit is reached¶
By default, relay blocks the goroutine until a token is available. Configure a maximum wait time to return an error instead:
client := relay.New(relay.Config{
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
RPS: 10,
Burst: 2,
MaxWait: 500 * time.Millisecond, // error if no token within 500ms
},
})
When MaxWait is exceeded relay.ErrRateLimitExceeded is returned:
resp, err := client.R().GET(ctx, "/v1/data")
if errors.Is(err, relay.ErrRateLimitExceeded) {
log.Println("rate limit reached, back off and retry later")
}
Adaptive rate limiting¶
Automatically reduce the rate when upstream signals throttling (HTTP 429):
client := relay.New(relay.Config{
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
RPS: 100,
Adaptive: true, // reduce RPS on 429, recover gradually
},
MaxRetries: 3,
HonourRetryAfter: true,
})
With Adaptive: true, relay: 1. Halves RPS on each 429 response. 2. Gradually increases RPS back to the configured maximum over 30 seconds of successful responses.
Per-host rate limiting¶
Apply separate limits per upstream host:
client := relay.New(relay.Config{
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
PerHost: true,
HostLimits: map[string]relay.HostRateLimit{
"api.example.com": {RPS: 100, Burst: 20},
"search.example.com": {RPS: 20, Burst: 5},
},
Default: relay.HostRateLimit{RPS: 50, Burst: 10},
},
})
Request-level bypass¶
Skip rate limiting for a single high-priority request:
Warning
Use BypassRateLimit sparingly. It can cause 429 errors from the upstream if overused.
Metrics and observability¶
Inspect current rate limiter state at runtime:
stats := client.RateLimiterStats()
log.Printf("current RPS: %.1f, tokens available: %d, queued: %d",
stats.CurrentRPS, stats.Available, stats.Queued)
Hook into state changes:
client := relay.New(relay.Config{
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
RPS: 100,
OnThrottle: func(wait time.Duration) {
metrics.ThrottledRequests.Inc()
log.Printf("rate limited - waiting %s", wait)
},
},
})
Full example¶
package main
import (
"context"
"errors"
"log"
"sync"
"time"
"github.com/jhonsferg/relay"
)
func main() {
client := relay.New(relay.Config{
BaseURL: "https://api.example.com",
RateLimiter: relay.RateLimiterConfig{
Enabled: true,
RPS: 20,
Burst: 5,
MaxWait: 2 * time.Second,
Adaptive: true,
},
MaxRetries: 3,
HonourRetryAfter: true,
})
var wg sync.WaitGroup
for i := range 100 {
wg.Add(1)
go func(n int) {
defer wg.Done()
var result map[string]any
_, err := client.R().
SetResult(&result).
GET(context.Background(), "/v1/items")
if errors.Is(err, relay.ErrRateLimitExceeded) {
log.Printf("request %d: rate limit wait exceeded", n)
return
}
if err != nil {
log.Printf("request %d: %v", n, err)
return
}
log.Printf("request %d: ok", n)
}(i)
}
wg.Wait()
}