Observability

v2 writes operational data to the persistence backend so native, multi-instance, and edge deployments can share the same view. Hot-path settings are loaded into the control-plane snapshot.

Every gateway request gets a request id. Usage records, downstream logs, and upstream logs carry that id so one call can be joined across tables and console views.
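
As a rough illustration of joining one call across tables by request id, the sketch below uses an in-memory SQLite database. The table names (`usage_records`, `downstream_logs`, `upstream_logs`) and their columns are assumptions made for this example and are not taken from the actual schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical tables and one sample request."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE usage_records (request_id TEXT, model TEXT, input_tokens INTEGER, output_tokens INTEGER);
        CREATE TABLE downstream_logs (request_id TEXT, method TEXT, path TEXT, status INTEGER);
        CREATE TABLE upstream_logs (request_id TEXT, url TEXT, status INTEGER, latency_ms INTEGER);
        """
    )
    conn.execute("INSERT INTO usage_records VALUES ('req-1', 'example-model', 120, 45)")
    conn.execute("INSERT INTO downstream_logs VALUES ('req-1', 'POST', '/v1/chat/completions', 200)")
    conn.execute("INSERT INTO upstream_logs VALUES ('req-1', 'https://provider.example/v1/chat', 200, 830)")
    return conn


def trace_request(conn: sqlite3.Connection, request_id: str) -> list:
    """Return one row per upstream attempt for the given request id."""
    return conn.execute(
        """
        SELECT u.model, d.path, d.status, p.url, p.latency_ms
        FROM usage_records AS u
        LEFT JOIN downstream_logs AS d ON d.request_id = u.request_id
        LEFT JOIN upstream_logs AS p ON p.request_id = u.request_id
        WHERE u.request_id = ?
        """,
        (request_id,),
    ).fetchall()
```

The query returns a list because a single request could plausibly have more than one upstream log row, for example after a retry.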

Usage records include:

  • request id and timestamp;
  • route name, provider id, credential id;
  • org, team, user, and user key ids;
  • operation and kind;
  • model;
  • input, output, cache-read, and cache-creation tokens;
  • cost;
  • latency and usage source.
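
The field list above could be modelled roughly as follows. This is only a sketch: the attribute names, the types (string ids, float cost, latency in milliseconds), and the `tokens_by_team` helper are assumptions for illustration, not the stored schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical names and types mirroring the documented field list.
    request_id: str
    timestamp: datetime
    route_name: str
    provider_id: str
    credential_id: str
    org_id: str
    team_id: str
    user_id: str
    user_key_id: str
    operation: str
    kind: str
    model: str
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_creation_tokens: int
    cost: float
    latency_ms: int
    usage_source: str


def tokens_by_team(records):
    """Sum (input_tokens, output_tokens) per team id."""
    totals = {}
    for record in records:
        inp, out = totals.get(record.team_id, (0, 0))
        totals[record.team_id] = (inp + record.input_tokens, out + record.output_tokens)
    return totals
```

Input and output tokens are summed separately, and cache tokens are left out, because whether cache tokens overlap with input tokens can differ between providers.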

Usage recording is controlled by instance_settings.enable_usage, which defaults to true. Settlement also updates quotas and token-limit counters.

Request logging is split into downstream and upstream streams:

Setting                      Captures
enable_downstream_log        Client-facing method, path, query, status, headers.
enable_downstream_log_body   Downstream request and response bodies.
enable_upstream_log          Provider URL, method, status, latency, headers.
enable_upstream_log_body     Upstream request and response bodies.

Redaction is on by default. disable_log_redaction exists for debugging, but it can expose secrets and should not be enabled casually.
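
To make the idea of default-on redaction concrete, here is a minimal sketch of header masking. The list of sensitive headers, the `[REDACTED]` placeholder, and the `redact_headers` function are assumptions for this example. The real redaction rules are not described here and may cover more than headers.

```python
# Hypothetical set of header names treated as secrets (compared case-insensitively).
SENSITIVE_HEADERS = {"authorization", "proxy-authorization", "x-api-key", "cookie", "set-cookie"}


def redact_headers(headers: dict, disable_log_redaction: bool = False) -> dict:
    """Return a copy of headers that is safe to log.

    Redaction applies unless disable_log_redaction is explicitly set.
    """
    if disable_log_redaction:
        return dict(headers)
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

For example, `redact_headers({"Authorization": "Bearer abc", "Accept": "application/json"})` masks only the `Authorization` value.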

Admin and portal mutation paths emit audit rows with actor id/name, action, target, status, and source IP. Use these to answer “who changed the control plane” rather than to debug LLM payloads.

Credential status rows track each credential/channel pair:

  • health_kind;
  • optional structured health_json;
  • checked_at;
  • last_error.

The pipeline and channel response classifier decide when a credential should be retried, cooled down, or treated as auth-dead. The console shows current status through /admin/credential-statuses.
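
The kind of decision a response classifier makes can be sketched as below. The status-code mapping is an assumption chosen for illustration (401/403 as auth-dead, 429 as cooldown, 5xx as retry). The actual classifier is channel-specific and may also inspect response bodies or headers.

```python
from enum import Enum


class CredentialAction(Enum):
    NO_CHANGE = "no_change"   # response says nothing about the credential
    RETRY = "retry"           # transient upstream failure
    COOLDOWN = "cooldown"     # credential is rate limited for now
    AUTH_DEAD = "auth_dead"   # credential rejected by the provider


def classify_status(status: int) -> CredentialAction:
    """Map an upstream HTTP status to a hypothetical credential action."""
    if status in (401, 403):
        return CredentialAction.AUTH_DEAD
    if status == 429:
        return CredentialAction.COOLDOWN
    if 500 <= status <= 599:
        return CredentialAction.RETRY
    return CredentialAction.NO_CHANGE
```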

/metrics is admin-gated and renders Prometheus text from persisted aggregate data, not process-local counters. Current families include:

  • gproxy_requests_total
  • gproxy_tokens_total
  • gproxy_upstream_latency_ms
  • gproxy_credential_health
  • gproxy_quota_total
  • gproxy_quota_used

This design keeps metrics meaningful across native multi-instance and edge deployments where process-local counters would be misleading.
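
A minimal sketch of rendering Prometheus text from already-aggregated rows, rather than from in-process counters, might look like this. The `render_counter` helper, the label names, and the choice of the counter type are assumptions for this example. Only the family name `gproxy_requests_total` comes from the list above.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, newline, and double quote per the Prometheus text format."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def render_counter(name: str, help_text: str, rows) -> str:
    """Render one metric family from (labels, value) pairs read from storage."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in rows:
        if labels:
            label_text = ",".join(
                f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

Because the values come from persisted aggregates, any instance that serves the endpoint would render the same numbers, which is the property the design aims for.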

instance_settings.retention_days controls cleanup of usage and request-log rows. None or a non-positive value retains rows indefinitely. Retention is for logs and usage data; it should not delete business/control-plane records.
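
The retention rule can be expressed as a small cutoff calculation. In this sketch, `retention_cutoff` is a hypothetical helper: it returns the timestamp before which usage and request-log rows would be eligible for cleanup, or `None` when rows are kept indefinitely.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retention_cutoff(retention_days: Optional[int], now: datetime) -> Optional[datetime]:
    """Return the oldest timestamp to keep, or None to retain rows indefinitely."""
    # None or a non-positive value disables cleanup entirely.
    if retention_days is None or retention_days <= 0:
        return None
    return now - timedelta(days=retention_days)
```

A cleanup job would then delete only usage and request-log rows older than the cutoff and skip the run entirely when the cutoff is `None`.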