What is GPROXY v2?

GPROXY v2 is the rewrite of the GPROXY LLM gateway. It keeps the original goal from v1: one HTTP entry point for many LLM providers, with routing, credentials, user API keys, policy, usage accounting, and a browser console. The implementation shape is different: v2 is one Rust crate that can build a native server binary and a wasm library for edge runtimes.

The v2 design is intentionally operation-oriented. Protocol behavior is grouped by capability, such as model listing, token counting, content generation, embeddings, images, compacting, and conversations. Provider families still matter at the wire boundary, but they are not the organizing unit for routing or transforms.
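As a rough illustration of operation-first classification, the sketch below maps an inbound method and path to an operation and a wire kind. The `WireKind` type, the `classify` and `needs_transform` functions, the group names, and the path table are assumptions made for this example; only the operation names and the `OperationGroup` term come from this page, and the real GPROXY v2 types may look different.

```rust
/// Capability groups (variant names are assumptions based on the list above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OperationGroup {
    Models,
    Tokens,
    Generation,
    Embeddings,
}

/// Operations named on this page; a real enum would likely have more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operation {
    ListModels,
    CountTokens,
    GenerateContent,
    CreateEmbedding,
}

impl Operation {
    /// Every operation belongs to exactly one capability group.
    fn group(self) -> OperationGroup {
        match self {
            Operation::ListModels => OperationGroup::Models,
            Operation::CountTokens => OperationGroup::Tokens,
            Operation::GenerateContent => OperationGroup::Generation,
            Operation::CreateEmbedding => OperationGroup::Embeddings,
        }
    }
}

/// Hypothetical wire kinds: the protocol family seen at the HTTP boundary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WireKind {
    OpenAi,
    Claude,
    Gemini,
}

/// Hypothetical classifier. The path table is illustrative, not GPROXY's router.
fn classify(method: &str, path: &str) -> Option<(Operation, WireKind)> {
    match (method, path) {
        ("GET", "/v1/models") => Some((Operation::ListModels, WireKind::OpenAi)),
        ("POST", "/v1/chat/completions") => Some((Operation::GenerateContent, WireKind::OpenAi)),
        ("POST", "/v1/embeddings") => Some((Operation::CreateEmbedding, WireKind::OpenAi)),
        ("POST", "/v1/messages") => Some((Operation::GenerateContent, WireKind::Claude)),
        ("POST", "/v1/messages/count_tokens") => Some((Operation::CountTokens, WireKind::Claude)),
        ("POST", p) if p.ends_with(":generateContent") => {
            Some((Operation::GenerateContent, WireKind::Gemini))
        }
        ("POST", p) if p.ends_with(":countTokens") => {
            Some((Operation::CountTokens, WireKind::Gemini))
        }
        _ => None,
    }
}

/// A transform is only needed when inbound and upstream wire kinds differ.
fn needs_transform(inbound: WireKind, upstream: WireKind) -> bool {
    inbound != upstream
}
```

In this sketch, routing and transforms would branch on `Operation` and `OperationGroup`, while `WireKind` is consulted only to decide whether a request has to be translated.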

  • One gateway for multiple providers. Providers are configured as channel instances with settings, credentials, health state, optional TLS fingerprints, and rule sets.
  • OpenAI-, Claude-, and Gemini-compatible traffic. v2 classifies inbound requests by operation and wire kind, then either keeps same-protocol traffic lightweight or transforms requests into the selected provider's native format.
  • Multi-tenant access. Users, organizations, teams, user API keys, route permissions, rate limits, and quotas are part of the control plane.
  • Operational routing. A public model name can resolve to an aggregate route, route members, upstream model ids, and credentials. Failover and health state live around this route execution path.
  • Native and edge deployments. The native binary uses Axum and wreq. The wasm build uses fetch-compatible transports with libSQL/Turso and Upstash-style backends where the platform supports them.
  • Embedded administration. The React console is built separately and can be embedded into the native binary or served as static files next to the API.
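The routing bullet above can be pictured with a small lookup: a public model name resolves through an optional alias to a route, and the first healthy member of that route is chosen. This is a minimal sketch under assumptions. `RoutingTable`, `RouteMember`, their fields, `demo_table`, and the "first healthy member" policy are all hypothetical, and the model and provider names are placeholders.

```rust
use std::collections::HashMap;

/// One candidate upstream for a route (field names are illustrative).
#[derive(Debug, Clone, PartialEq)]
struct RouteMember {
    provider: String,
    upstream_model: String,
    healthy: bool,
}

/// Hypothetical routing table: alias -> route name, route name -> ordered members.
struct RoutingTable {
    aliases: HashMap<String, String>,
    routes: HashMap<String, Vec<RouteMember>>,
}

impl RoutingTable {
    /// Resolve a public model name to the first healthy member, in declared order.
    fn resolve(&self, public_model: &str) -> Option<&RouteMember> {
        // An alias maps to a route; otherwise the name is treated as the route itself.
        let route = self
            .aliases
            .get(public_model)
            .map(String::as_str)
            .unwrap_or(public_model);
        self.routes.get(route)?.iter().find(|m| m.healthy)
    }
}

/// Small fixture: alias "fast" points at route "general", whose first member is down.
fn demo_table() -> RoutingTable {
    let members = vec![
        RouteMember {
            provider: "primary".to_string(),
            upstream_model: "model-a".to_string(),
            healthy: false,
        },
        RouteMember {
            provider: "backup".to_string(),
            upstream_model: "model-b".to_string(),
            healthy: true,
        },
    ];
    RoutingTable {
        aliases: HashMap::from([("fast".to_string(), "general".to_string())]),
        routes: HashMap::from([("general".to_string(), members)]),
    }
}
```

With this fixture, `demo_table().resolve("fast")` skips the unhealthy `primary` member and returns the `backup` member. A real gateway would also select credentials and update health state after each attempt, which this sketch leaves out.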

v1 was organized as a Cargo workspace with app crates, server crates, and SDK crates. v2 collapses that into one crate with clear module boundaries under src/. This is not a downgrade in layering; it is a packaging decision that keeps the native binary, wasm library, and shared runtime code in one place.
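One common way for a single Rust crate to serve both a native binary and a wasm library is conditional compilation on the target architecture. The sketch below shows the general pattern only; the function name `transport_family` and the returned labels are hypothetical, and this page does not say how GPROXY v2 actually gates its modules.

```rust
/// Hypothetical helper: reports which transport family this build would use.
/// Compiled only for non-wasm targets (the native server binary).
#[cfg(not(target_arch = "wasm32"))]
fn transport_family() -> &'static str {
    "native-http-client"
}

/// Compiled only for wasm32 targets (edge runtimes with a fetch-style API).
#[cfg(target_arch = "wasm32")]
fn transport_family() -> &'static str {
    "fetch"
}
```

Callers use `transport_family()` without knowing which variant was compiled in, so shared runtime code can stay identical across both outputs.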

The most important conceptual changes are:

| Area | v1 shape | v2 shape |
| --- | --- | --- |
| Repository | Workspace with apps, crates, and SDK packages | One crate with native and wasm outputs |
| Protocol matrix | Provider-family language appears in more places | Operation / OperationGroup first |
| Config flow | TOML/database-oriented v1 control plane | Import/export snapshots plus persistence backends |
| Console | Separate frontend embedded at build time | React console remains separate but is synced into assets/console |
| Edge | Not the primary runtime shape | First-class wasm library and platform bundles |

| Concept | Meaning in v2 |
| --- | --- |
| Provider | A configured upstream adapter: channel id, settings, credentials, optional proxy and TLS behavior. |
| Channel | The code that prepares provider-native requests and classifies provider-native responses. |
| Operation | A capability such as GenerateContent, ListModels, CreateEmbedding, or CountTokens. |
| Route | A public model entry that selects one or more provider/upstream model members. |
| Alias | A user-facing model name that maps to a route. |
| Rule set | Ordered request mutation rules applied after protocol transform and before channel send. |
| Snapshot | The hot-path control-plane view read by request execution. |
| Cache backend | Ephemeral/shared coordination, sessions, counters, invalidation, and locks. |
| Persistence backend | Durable control-plane records, logs, usage, audit, and metrics. |
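The position of the rule set in the pipeline matters: because rules run after the protocol transform and before the channel send, they see provider-native field names. The sketch below illustrates that ordering with a flat string map standing in for a JSON body. The `Rule` variants, the `transform` stand-in, and the field names are assumptions for illustration; the actual rule vocabulary is not described on this page.

```rust
use std::collections::BTreeMap;

/// A simplified request body: flat string fields instead of real JSON.
type Body = BTreeMap<String, String>;

/// Hypothetical rule kinds for this sketch.
enum Rule {
    Set { key: String, value: String },
    Remove { key: String },
}

/// Apply rules in order; later rules see the effect of earlier ones.
fn apply_rules(mut body: Body, rules: &[Rule]) -> Body {
    for rule in rules {
        match rule {
            Rule::Set { key, value } => {
                body.insert(key.clone(), value.clone());
            }
            Rule::Remove { key } => {
                body.remove(key);
            }
        }
    }
    body
}

/// Stand-in for a protocol transform: rename one field to a provider-native name.
fn transform(mut body: Body) -> Body {
    if let Some(value) = body.remove("max_tokens") {
        body.insert("max_output_tokens".to_string(), value);
    }
    body
}

/// Pipeline order from the table: transform first, then the rule set.
/// The result is what would be handed to the channel for sending.
fn prepare(body: Body, rules: &[Rule]) -> Body {
    apply_rules(transform(body), rules)
}
```

In this sketch, a rule that sets `max_output_tokens` overrides the value produced by the transform, while a rule written against the inbound name `max_tokens` would simply add a new field, since that name no longer exists by the time rules run.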

GPROXY v2 is not a model host; it does not run inference. It is not a generic reverse proxy; it understands LLM protocol operations. It is also not a managed hosted SaaS console; the embedded console is part of your deployment and should sit behind your own network and operational controls.