Architecture

GPROXY is a Cargo workspace organized into three layers:

Apps — runnable binaries (gproxy, gproxy-recorder).
Crates — the server-side pieces that the main app composes (gproxy-core, gproxy-storage, gproxy-api, gproxy-server).
SDK — framework-agnostic libraries that can be reused outside the server (gproxy-protocol, gproxy-channel, gproxy-engine, gproxy-sdk).

Workspace layout

gproxy/
├── apps/
│   ├── gproxy/              # The main binary (HTTP server + console host)
│   └── gproxy-recorder/     # Upstream traffic recorder (dev/debugging tool)
├── crates/
│   ├── gproxy-core/         # Config, identity, policy, quota, routing types
│   ├── gproxy-storage/      # SeaORM storage + encryption + migrations
│   ├── gproxy-api/          # Admin + user HTTP API, auth, login, CORS
│   └── gproxy-server/       # The Axum server wiring it all together
├── sdk/
│   ├── gproxy-protocol/     # L0: OpenAI/Claude/Gemini wire types + transforms
│   ├── gproxy-channel/      # L1: Channel trait, channel implementations,
│   │                        #     credentials, billing, utils, health
│   ├── gproxy-engine/       # L2: GproxyEngine, ProviderStore, routing,
│   │                        #     retry, credential affinity, backends
│   └── gproxy-sdk/          # Umbrella crate re-exporting the three layers
├── frontend/console/        # React console, embedded into the binary at build time
└── docs/                    # This documentation site

Request lifecycle

At a high level, an incoming LLM request goes through these stages:

                ┌─────────────────────────────────────────────┐
HTTP request ──►│  gproxy-server (Axum)                       │
                │    ├── auth: API key → user identity        │
                │    ├── protocol classification (OpenAI/…)   │
                │    └── handler dispatch                     │
                └───────────────┬─────────────────────────────┘
                                │
                                ▼
                ┌─────────────────────────────────────────────┐
                │  gproxy-engine :: routing                   │
                │    permission → rewrite → alias → execute   │
                └───────────────┬─────────────────────────────┘
                                │ resolved (provider, model)
                                ▼
                ┌─────────────────────────────────────────────┐
                │  gproxy-engine :: GproxyEngine              │
                │    ├── channel.prepare_request(...)         │
                │    ├── HTTP call to upstream                │
                │    ├── retries + health updates             │
                │    └── usage accounting                     │
                └───────────────┬─────────────────────────────┘
                                │
                                ▼
                          upstream LLM API

Auth. The request carries an API key (Authorization: Bearer …). The API layer resolves it to a User, checks that the user and key are enabled, and attaches the identity.
Classification. The router figures out the protocol (OpenAI Chat, OpenAI Responses, Claude, Gemini) and the route kind (model_list, model_get, a chat invocation, etc.).
Model resolution. permission → rewrite → alias → execute is the single canonical order (see Guides → Models & Aliases). Aliases, permission glob patterns, and channel-level rewrite rules are applied here.
Routing. For *-only presets, model_list and model_get are served locally from the models table, never touching upstream. For *-like / pass-through presets, upstream is called and its response is merged with the local models table so admins still see what they registered.
Execution. GproxyEngine::execute asks the resolved channel to prepare the upstream request, makes the call, handles retries, and updates per-credential health state.
Accounting. Usage is captured through a sink, batched, and written to storage by background workers.

Background workers

apps/gproxy/src/workers/mod.rs wires up a set of long-running tasks:

Worker	Purpose
`UsageSink`	Drains usage messages and writes them to storage in batches.
`HealthBroadcaster`	Debounces per-credential health state changes.
`QuotaReconciler`	Periodically reconciles users’ `cost_used` against recorded usage.
`RateLimitGC`	Garbage-collects expired rate-limit counters.

All of them participate in graceful shutdown — see Graceful Shutdown.

Storage

gproxy-storage is built on SeaORM + SQLx and supports SQLite, PostgreSQL, and MySQL. When DATABASE_SECRET_KEY is set, credential material (provider credentials, user passwords, API keys) is encrypted at rest with XChaCha20-Poly1305.

The embedded console, the admin HTTP API, and the TOML seed config all write into the same schema — the TOML file is only consulted on first-time initialization when the database is empty. See TOML Config for how the seed is loaded.

SDK boundary

The sdk/ crates intentionally have no dependency on the database, the HTTP server, or Axum. That separation is what lets you take gproxy-sdk and embed the provider engine into an entirely different application. See the Rust SDK reference for details.