Prometheus Metrics Extension

The prometheus extension exposes relay request metrics in the Prometheus text format. It integrates directly with prometheus/client_golang, registering counters, histograms, and gauges into a Prometheus registry that you expose via a standard /metrics HTTP endpoint.

Use this extension when you run a Prometheus scrape pipeline and do not need a full OpenTelemetry SDK. If you already have an OTel pipeline, consider the metrics extension instead, which can also export to Prometheus via the OTel Prometheus bridge.

Installation

go get github.com/jhonsferg/relay/ext/prometheus

Import

import relayprom "github.com/jhonsferg/relay/ext/prometheus"

Quick Start

package main

import (
    "context"
    "log"
    "net/http"

    "github.com/jhonsferg/relay"
    relayprom "github.com/jhonsferg/relay/ext/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    client, err := relay.New(
        relay.WithBaseURL("https://api.example.com"),
        relayprom.WithPrometheus(),
    )
    if err != nil {
        log.Fatalf("relay.New: %v", err)
    }
    defer client.Close()

    // Expose metrics at /metrics.
    http.Handle("/metrics", promhttp.Handler())
    go func() {
        if err := http.ListenAndServe(":9090", nil); err != nil {
            log.Fatalf("metrics server: %v", err)
        }
    }()

    ctx := context.Background()
    resp, err := client.Get(ctx, "/health")
    if err != nil {
        log.Fatalf("GET /health: %v", err)
    }
    defer resp.Body.Close()
    log.Printf("status: %d", resp.StatusCode)
}

API Reference

`relayprom.WithPrometheus(opts...)`

func WithPrometheus(opts ...PrometheusOption) relay.Option

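Registers the relay Prometheus collectors and wraps the client transport with an instrumented layer. Registration is checked per registry, so the call is not idempotent: calling it twice with the same registry returns an error from the second call, unless the clients are distinguished by const labels as described under Multiple Clients with Separate Metrics.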

Options

Option Function	Signature	Description
`WithNamespace`	`WithNamespace(ns string)`	Prefix all metric names with `ns_`. Default: `""` (no prefix).
`WithSubsystem`	`WithSubsystem(sub string)`	Add a subsystem component between namespace and metric name.
`WithRegisterer`	`WithRegisterer(reg prometheus.Registerer)`	Use a custom registry instead of `prometheus.DefaultRegisterer`.
`WithGatherer`	`WithGatherer(g prometheus.Gatherer)`	Use a custom gatherer for `promhttp.HandlerFor`.
`WithDurationBuckets`	`WithDurationBuckets(buckets []float64)`	Override histogram bucket boundaries.
`WithConstLabels`	`WithConstLabels(labels prometheus.Labels)`	Attach fixed labels to all metrics from this client.
`WithObserveRequestSize`	`WithObserveRequestSize(bool)`	Enable the request body size histogram. Default: `false`.
`WithObserveResponseSize`	`WithObserveResponseSize(bool)`	Enable the response body size histogram. Default: `false`.

Metric Names and Labels

The extension registers the following collectors. All names assume no namespace prefix; set WithNamespace("myapp") to get myapp_relay_requests_total, etc.
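
The naming rule can be sketched with a small helper. This is an illustration only: `fqName` is a hypothetical function, not part of relay's API. It joins the non-empty parts with underscores, the same way client_golang's `prometheus.BuildFQName` does, and it assumes the extension passes the namespace and subsystem through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName is a hypothetical helper that joins the non-empty namespace,
// subsystem, and metric name with underscores. An empty name yields "".
func fqName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	fmt.Println(fqName("", "", "relay_requests_total"))
	fmt.Println(fqName("myapp", "", "relay_requests_total"))
	fmt.Println(fqName("myapp", "http_client", "relay_requests_total"))
	// Output:
	// relay_requests_total
	// myapp_relay_requests_total
	// myapp_http_client_relay_requests_total
}
```

Under this rule the base metric name appears exactly once in the final name, after the namespace and subsystem.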

`relay_requests_total`

A counter vector counting completed requests.

Type: CounterVec
Labels: method, host, status_code

When a transport-level error occurs and no response is received, status_code is set to "error".

`relay_request_duration_seconds`

A histogram of request durations. The duration is measured from when the first byte is sent until the response body is fully read or closed.

Type: HistogramVec
Labels: method, host, status_code
Default buckets (seconds): .005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10

`relay_active_requests`

A gauge tracking in-flight requests.

Type: GaugeVec
Labels: method, host

`relay_request_size_bytes` (optional)

A histogram of request body sizes in bytes. Enable with WithObserveRequestSize(true).

Type: HistogramVec
Labels: method, host

`relay_response_size_bytes` (optional)

A histogram of response body sizes in bytes. Enable with WithObserveResponseSize(true).

Type: HistogramVec
Labels: method, host, status_code

Complete Example: /metrics Endpoint with Custom Registry

Using a custom registry keeps relay metrics isolated from default Go runtime metrics during testing, or when running multiple relay clients in the same process.

package main

import (
    "context"
    "log"
    "net/http"
    "os"
    "os/signal"
    "syscall"
    "time"

    "github.com/jhonsferg/relay"
    relayprom "github.com/jhonsferg/relay/ext/prometheus"
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/collectors"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    // Create a custom registry with Go runtime and process metrics.
    reg := prometheus.NewRegistry()
    reg.MustRegister(
        collectors.NewGoCollector(),
        collectors.NewProcessCollector(collectors.ProcessCollectorOpts{}),
    )

    // Build the relay client, recording all metrics into the custom registry.
    client, err := relay.New(
        relay.WithBaseURL("https://jsonplaceholder.typicode.com"),
        relay.WithTimeout(10 * time.Second),
        relayprom.WithPrometheus(
            relayprom.WithNamespace("myapp"),
            relayprom.WithSubsystem("http_client"),
            relayprom.WithRegisterer(reg),
            relayprom.WithDurationBuckets([]float64{
                .005, .01, .025, .05, .1, .25, .5, 1, 2.5,
            }),
            relayprom.WithConstLabels(prometheus.Labels{
                "service": "user-service",
            }),
            relayprom.WithObserveResponseSize(true),
        ),
    )
    if err != nil {
        log.Fatalf("relay.New: %v", err)
    }
    defer client.Close()

    // Serve the custom registry on /metrics.
    mux := http.NewServeMux()
    mux.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{
        EnableOpenMetrics: true,
    }))
    mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
        w.WriteHeader(http.StatusOK)
    })

    srv := &http.Server{
        Addr:         ":8080",
        Handler:      mux,
        ReadTimeout:  5 * time.Second,
        WriteTimeout: 10 * time.Second,
    }

    go func() {
        log.Printf("metrics server listening on :8080")
        if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
            log.Fatalf("metrics server: %v", err)
        }
    }()

    // Make some requests so you have data in the metrics.
    ctx := context.Background()
    endpoints := []string{"/posts/1", "/posts/2", "/users/1", "/nonexistent"}
    for _, ep := range endpoints {
        resp, err := client.Get(ctx, ep)
        if err != nil {
            log.Printf("GET %s error: %v", ep, err)
            continue
        }
        resp.Body.Close()
        log.Printf("GET %s -> %d", ep, resp.StatusCode)
    }

    // Wait for interrupt signal before shutting down.
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
    <-quit

    shutCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    if err := srv.Shutdown(shutCtx); err != nil {
        log.Printf("server shutdown error: %v", err)
    }
}

After running this, curl localhost:8080/metrics returns output like:

# HELP myapp_http_client_relay_active_requests Number of HTTP requests currently in flight.
# TYPE myapp_http_client_relay_active_requests gauge
myapp_http_client_relay_active_requests{host="jsonplaceholder.typicode.com",method="GET",service="user-service"} 0
# HELP myapp_http_client_relay_request_duration_seconds HTTP request duration in seconds.
# TYPE myapp_http_client_relay_request_duration_seconds histogram
myapp_http_client_relay_request_duration_seconds_bucket{host="jsonplaceholder.typicode.com",method="GET",service="user-service",status_code="200",le="0.005"} 0
...
# HELP myapp_http_client_relay_requests_total Total number of HTTP requests.
# TYPE myapp_http_client_relay_requests_total counter
myapp_http_client_relay_requests_total{host="jsonplaceholder.typicode.com",method="GET",service="user-service",status_code="200"} 3
myapp_http_client_relay_requests_total{host="jsonplaceholder.typicode.com",method="GET",service="user-service",status_code="404"} 1

Custom Histogram Buckets

The default duration buckets in seconds (.005 through 10) cover most HTTP workloads. Override them when your API has a different latency profile.

Sub-5ms Internal RPC

relayprom.WithDurationBuckets([]float64{
    .0001, .0005, .001, .0025, .005, .01, .025, .05, .1, .25,
})

Long-running Async APIs

relayprom.WithDurationBuckets([]float64{
    1, 2.5, 5, 10, 30, 60, 120, 300, 600,
})

Using prometheus.DefBuckets

To restore the Prometheus default buckets explicitly:

import "github.com/prometheus/client_golang/prometheus"

relayprom.WithDurationBuckets(prometheus.DefBuckets)

Tip: Use prometheus.ExponentialBucketsRange(min, max, count) to generate exponentially spaced buckets without hand-tuning every boundary value.

// 12 buckets from 1ms to 10s, spaced exponentially.
buckets := prometheus.ExponentialBucketsRange(0.001, 10, 12)
relayprom.WithDurationBuckets(buckets)

Multiple Clients with Separate Metrics

If you run multiple relay clients targeting different services, give each its own registry or differentiate them with const labels:

package main

import (
    "github.com/jhonsferg/relay"
    relayprom "github.com/jhonsferg/relay/ext/prometheus"
    "github.com/prometheus/client_golang/prometheus"
)

func buildClients() (*relay.Client, *relay.Client, error) {
    reg := prometheus.NewRegistry()

    orders, err := relay.New(
        relay.WithBaseURL("https://orders.internal"),
        relayprom.WithPrometheus(
            relayprom.WithRegisterer(reg),
            relayprom.WithConstLabels(prometheus.Labels{"upstream": "orders"}),
        ),
    )
    if err != nil {
        return nil, nil, err
    }

    inventory, err := relay.New(
        relay.WithBaseURL("https://inventory.internal"),
        relayprom.WithPrometheus(
            relayprom.WithRegisterer(reg),
            relayprom.WithConstLabels(prometheus.Labels{"upstream": "inventory"}),
        ),
    )
    if err != nil {
        return nil, nil, err
    }

    return orders, inventory, nil
}

Note: Both clients share the same registry in the example above. relay's Prometheus extension uses MustRegisterOrGet semantics: if metrics with the same name but different const labels are already registered, the extension creates new label combinations rather than failing.

Prometheus Alerting Rules

Sample recording and alerting rules for the relay metrics:

groups:
  - name: relay_client
    rules:
      - record: job:relay_request_rate5m
        expr: rate(relay_requests_total[5m])

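      # Aggregate both sides by host. Without aggregation, the division only
      # matches 5xx series against themselves and always yields 1.
      - record: job:relay_error_rate5m
        expr: |
          sum by (host) (rate(relay_requests_total{status_code=~"5.."}[5m]))
          /
          sum by (host) (rate(relay_requests_total[5m]))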

      - alert: RelayHighErrorRate
        expr: job:relay_error_rate5m > 0.05
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Relay client error rate above 5% for {{ $labels.host }}"

      - alert: RelaySlowRequests
        expr: histogram_quantile(0.99, sum by (le, host) (rate(relay_request_duration_seconds_bucket[5m]))) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "P99 latency above 2s for {{ $labels.host }}"