Architecture

Talos separates API key management into two planes.

Admin plane

The admin plane handles all key management and verification operations: key issuance, rotation, revocation, token derivation, JWKS, and verification (single and batch). It is exposed only to internal services and clients with admin credentials.

Endpoints: /v2alpha1/admin/, including /v2alpha1/admin/apiKeys:verify and /v2alpha1/admin/apiKeys:batchVerify.

For low-latency verification close to clients, deploy the commercial edge proxy as a sidecar. The proxy caches admin verify responses locally, so applications get sub-millisecond cache hits without exposing the admin plane publicly.

Data plane

The data plane handles self-service operations that credential holders perform by proving possession of the credential itself. No admin authentication is required.

Endpoints: POST /v2alpha1/apiKeys:selfRevoke
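
As a rough illustration, a self-revocation call could be assembled as below. Only the method, the path, and the default port come from this page. The host, the helper name, and the JSON field name "credential" are assumptions made for the sketch, so check the API reference for the real request schema. The request is built but not sent.

```python
import json
import urllib.request

# Method and path come from the endpoint listed above.
SELF_REVOKE_PATH = "/v2alpha1/apiKeys:selfRevoke"


def build_self_revoke_request(base_url: str, credential: str) -> urllib.request.Request:
    """Build (but do not send) a self-revocation request.

    Assumption: the credential travels in a JSON body under a
    hypothetical "credential" field.
    """
    body = json.dumps({"credential": credential}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + SELF_REVOKE_PATH,
        data=body,
        method="POST",
        # No Authorization header: possession of the credential is the proof.
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local instance on the default HTTP port.
req = build_self_revoke_request("http://localhost:4420", "example-credential")
print(req.get_method(), req.full_url)
```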

Verification flow

Client --> Verifier --> Cache (hit?) --> Database --> Response
                          |                            ^
                          +-- cache hit ---------------+

Client sends credential to POST /v2alpha1/admin/apiKeys:verify
Talos identifies the credential type (generated, imported, JWT, macaroon)
For generated keys, the UUID is extracted from the token identifier
For imported keys, a tenant-scoped SHA-512/256 hash is computed
Database lookup (or cache hit) returns key metadata
Response includes key status, owner, scopes, and metadata

Deployment topologies

Topology	Edition	Description
Single-node	OSS	One process serves both planes
Split planes	Commercial	Admin and data planes as separate deployments
Edge proxy	Commercial	Sidecar proxy at the edge that caches admin verify responses locally

Both planes share the same database. Verification uses caching (memory or Redis) to minimize database load.

Ports

Port	Purpose
4420	HTTP API (default)
4422	Prometheus metrics

Design philosophy

Separation of concerns

The system is divided into distinct layers:

Admin plane: Management operations (CRUD for keys, rotation, import, token derivation)
Data plane: High-throughput verification operations
Persistence layer: Database abstraction with pluggable drivers
Cache layer: Performance optimization with multiple backends

This separation allows independent scaling of components, different SLOs for different operations (admin targets <100ms p99, data plane targets <3ms p99), and clear boundaries between responsibilities.
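
To make the p99 targets concrete, the sketch below checks a batch of latency samples against them using a nearest-rank percentile. The sample values are made up, and the function names are illustrative rather than part of Talos.

```python
import math

# Targets quoted in the text above, in milliseconds.
ADMIN_P99_TARGET_MS = 100.0
DATA_P99_TARGET_MS = 3.0


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_slo(samples, target_ms):
    """True when the p99 of the samples is below the target."""
    return percentile(samples, 99) < target_ms


# Made-up latencies: 99 fast requests and one slow outlier.
data_plane_ms = [0.8] * 99 + [12.0]
print(percentile(data_plane_ms, 99), meets_slo(data_plane_ms, DATA_P99_TARGET_MS))
```

With 100 samples, a single outlier sits above the 99th percentile and does not break the target, while two outliers would.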

Production-first design

Hard isolation between admin and data operations
Metrics, traces, and structured logs are emitted by default
Graceful degradation when the database or cache backend is unavailable
Zero-downtime deployments via rolling updates and stateless verification

Performance characteristics

Self-contained tokens (JWT/macaroon) enable stateless verification
HMAC-SHA256 keeps the revocation check on the order of microseconds; bcrypt would cap a single core at roughly 10 verifications per second
LRU caching for hot paths
Minimal allocations in the verification path
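
For a sense of what an HMAC-SHA256 check with constant-time comparison looks like, here is a minimal sketch using the Python standard library. How Talos derives the secret and where it stores the digest is not described on this page, so the secret, the key strings, and the function names are assumptions.

```python
import hashlib
import hmac


def key_digest(secret: bytes, api_key: str) -> bytes:
    """Compute the HMAC-SHA256 digest of an API key under a server-side secret."""
    return hmac.new(secret, api_key.encode("utf-8"), hashlib.sha256).digest()


def matches(secret: bytes, presented_key: str, stored_digest: bytes) -> bool:
    """Compare digests in constant time to avoid leaking timing information."""
    return hmac.compare_digest(key_digest(secret, presented_key), stored_digest)


# Hypothetical secret and key values.
secret = b"server-side-secret"
stored = key_digest(secret, "example-api-key")
print(matches(secret, "example-api-key", stored))
print(matches(secret, "wrong-api-key", stored))
```

For comparison, a cap of roughly 10 bcrypt verifications per second on one core corresponds to about 100 ms of CPU time per check.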

System architecture

Clients (CLI, SDK, HTTP)
         |
         v
+----------------------------------+
|  HTTP Server (grpc-gateway)      |
|  Port: 4420                      |
+----------------------------------+
         |
         v
+----------------------------------+
|  Middleware                      |
|  Logging, Metrics, Tracing       |
+----------------------------------+
         |
   +-----+-----------+
   |                 |
   v                 v
+-----------+  +-----------+
| Admin     |  | Data      |
| Plane     |  | Plane     |
| <100ms    |  | <3ms p99  |
+-----------+  +-----------+
   |                 |
   v                 v
+----------------------------------+
|  Service Layer                   |
|  Business logic, Validation      |
+----------------------------------+
         |
   +-----+-----------+
   |                 |
   v                 v
+-----------+  +-----------+
| Persist.  |  | Cache     |
| SQLite    |  | Memory    |
| PG/MySQL  |  | LRU       |
| CRDB      |  | Redis     |
+-----------+  +-----------+

All requests enter through a single HTTP server built on grpc-gateway (port 4420) and pass through middleware for logging, metrics, and tracing before being routed to the appropriate plane.

Component overview

HTTP server

The API layer uses grpc-gateway for HTTP/JSON routing with protobuf-based schemas. It serves both planes through a single port, handles CORS and compression, and exposes OpenAPI documentation.

Service layer

Business logic is split between the admin plane service (key lifecycle, import, token derivation, input validation) and the data plane verifier (token parsing, signature verification, revocation checking, cache management). The verifier is optimized for the hot path with minimal allocations.

Persistence

Database access uses sqlc-generated type-safe queries with pluggable drivers:

SQLite -- OSS edition, zero-config, suitable for millions of keys
PostgreSQL -- production workloads
MySQL -- production workloads
CockroachDB -- distributed deployments

Schema changes are managed through versioned migrations using golang-migrate.

Cache

The cache layer reduces database load on the verification path:

Memory LRU (OSS) -- local to each instance, configurable size limits
Redis (Commercial) -- distributed, supports cluster and sentinel modes
Hierarchical L1+L2 (Commercial) -- memory for speed, Redis for shared state
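
The hierarchical option can be pictured as a small per-instance LRU in front of a shared store. The sketch below is an illustration of that idea, not the Talos implementation. A plain dictionary stands in for Redis, and the class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import Dict, Optional


class TwoTierCache:
    """In-process LRU (L1) in front of a shared store (L2).

    The L2 store is a plain dict standing in for Redis in this sketch.
    """

    def __init__(self, l1_capacity: int, l2: Dict[str, dict]):
        self.l1_capacity = l1_capacity
        self.l1 = OrderedDict()
        self.l2 = l2

    def get(self, key: str) -> Optional[dict]:
        if key in self.l1:
            self.l1.move_to_end(key)       # mark as most recently used
            return self.l1[key]
        value = self.l2.get(key)           # L1 miss: try the shared tier
        if value is not None:
            self._put_l1(key, value)       # promote into L1
        return value

    def put(self, key: str, value: dict) -> None:
        self.l2[key] = value               # write through to the shared tier
        self._put_l1(key, value)

    def _put_l1(self, key: str, value: dict) -> None:
        self.l1[key] = value
        self.l1.move_to_end(key)
        while len(self.l1) > self.l1_capacity:
            self.l1.popitem(last=False)    # evict the least recently used entry


# Two instances sharing one L2 store.
shared: Dict[str, dict] = {}
cache_a = TwoTierCache(l1_capacity=2, l2=shared)
cache_b = TwoTierCache(l1_capacity=2, l2=shared)
cache_a.put("k1", {"status": "active"})
print(cache_b.get("k1"))
```

An entry written through one instance becomes visible to the other via the shared tier, and each instance keeps its own hot set in memory. Expiry and invalidation are omitted.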

Crypto

Talos supports multiple JWT signing algorithms and a separate API key hashing mechanism:

JWT signing algorithms
Ed25519 (EdDSA) -- default, fastest signing and smallest keys
RSA-2048/4096 (RS256) -- legacy compatibility
API key hashing
HMAC-SHA256 -- used for API key revocation checks (<1ms with constant-time comparison)

The JWT signing algorithm is determined per JWK by its alg field, so one JWKS can contain keys for multiple signing algorithms at the same time.
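
A verifier can therefore pick the algorithm by looking up the key ID from the token header in the JWKS. The sketch below shows that lookup on a hand-written key set. The kid values are hypothetical, the key material is placeholder text rather than real keys, and no signature is actually verified.

```python
# Illustrative JWKS with one Ed25519 key and one RSA key.
# Field names follow the JWK format; values are placeholders.
JWKS = {
    "keys": [
        {"kty": "OKP", "crv": "Ed25519", "kid": "ed-key-1",
         "alg": "EdDSA", "x": "<public-key>"},
        {"kty": "RSA", "kid": "rsa-key-1",
         "alg": "RS256", "n": "<modulus>", "e": "AQAB"},
    ]
}


def find_key(jwks: dict, kid: str) -> dict:
    """Return the JWK whose kid matches, or raise KeyError."""
    for jwk in jwks.get("keys", []):
        if jwk.get("kid") == kid:
            return jwk
    raise KeyError(f"no key with kid {kid!r}")


def algorithm_for(jwks: dict, kid: str) -> str:
    """Read the signing algorithm from the matching key's alg field."""
    return find_key(jwks, kid)["alg"]


print(algorithm_for(JWKS, "ed-key-1"))
print(algorithm_for(JWKS, "rsa-key-1"))
```

Taking the algorithm from the trusted key entry, rather than from the token's own header, is a common way to avoid algorithm-confusion mistakes.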

Observability

Built-in instrumentation across three pillars:

Metrics -- Prometheus exposition on port 4422 with request latency histograms and error rate counters
Tracing -- OpenTelemetry with W3C Trace Context propagation, configurable sampling, OTLP and Jaeger exporters
Logging -- structured JSON logging via slog with correlation IDs and contextual fields
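
W3C Trace Context propagation relies on the traceparent request header. As a rough sketch of what a service reads from it, the parser below accepts only the version 00 layout and rejects the all-zero IDs that the specification treats as invalid. The type and function names are illustrative, and the sample header is an example value.

```python
import re
from typing import NamedTuple, Optional

# version "-" trace-id (32 hex) "-" parent-id (16 hex) "-" trace-flags (2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse a version 00 traceparent header; return None if it is invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    if version != "00":
        return None  # this sketch handles version 00 only
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None  # all-zero IDs are invalid
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))


ctx = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(ctx)
```

The trace ID parsed this way is a natural candidate for the correlation ID attached to structured log lines, although this page does not say which identifier Talos uses.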

Scalability

Small (<1k RPS)

A single Talos instance handles both planes with SQLite and an in-memory LRU cache. No external dependencies required.

OSS edition sufficient
1 CPU, 512MB RAM
Cost: $5-10/month

Medium (10-50k RPS)

Separate admin and data plane deployments behind a load balancer. PostgreSQL replaces SQLite for durability. Redis provides shared caching across data plane instances.

Commercial edition
Auto-scaling for data plane
Cost: $100-500/month

Large (200k+ RPS)

A cluster of 10-50+ stateless data plane instances with auto-scaling, backed by a distributed Redis cache and PostgreSQL with read replicas and connection pooling. Supports multi-region deployment.

Commercial edition
Regional data plane deployment
Cost: $1-5k/month
