Architecture
Talos separates API key management into two planes.
Admin plane
The admin plane handles all key management and verification operations: key issuance, rotation, revocation, token derivation, JWKS, and verification (single and batch). It is exposed only to internal services and clients with admin credentials.
Endpoints: all paths under /v2alpha1/admin/, including /v2alpha1/admin/apiKeys:verify and /v2alpha1/admin/apiKeys:batchVerify.
For low-latency verification close to clients, deploy the commercial edge proxy as a sidecar. The proxy caches admin verify responses locally, so applications get sub-millisecond cache hits without exposing the admin plane publicly.
Data plane
The data plane handles self-service operations that credential holders perform by proving possession of the credential itself; no admin authentication is required.
Endpoints: POST /v2alpha1/apiKeys:selfRevoke
Verification flow
Client --> Verifier --> Cache (hit?) --> Database --> Response
                        |                            ^
                        +-- cache hit ---------------+
- Client sends credential to POST /v2alpha1/admin/apiKeys:verify
- Talos identifies the credential type (generated, imported, JWT, macaroon)
- For generated keys, the UUID is extracted from the token identifier
- For imported keys, a tenant-scoped SHA-512/256 hash is computed
- Database lookup (or cache hit) returns key metadata
- Response includes key status, owner, scopes, and metadata
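The lookup portion of this flow follows a cache-aside pattern. The sketch below is illustrative only and is not Talos code: `KeyRecord`, `verify`, and the dict-backed `cache` and `database` are hypothetical stand-ins, and the lookup key is assumed to have been derived already (the UUID for generated keys, the tenant-scoped hash for imported keys).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeyRecord:
    """Hypothetical key metadata; the fields mirror the response description."""
    status: str
    owner: str
    scopes: tuple = ()


def verify(lookup_key: str, cache: dict, database: dict) -> Optional[KeyRecord]:
    """Return key metadata, consulting the cache before the database."""
    record = cache.get(lookup_key)
    if record is not None:
        return record  # cache hit: the database is not touched
    record = database.get(lookup_key)
    if record is not None:
        cache[lookup_key] = record  # populate the cache for later requests
    return record


# Dicts stand in for the real cache and database (assumption for illustration).
database = {"key-123": KeyRecord(status="active", owner="team-a", scopes=("read",))}
cache: dict = {}

first = verify("key-123", cache, database)    # miss: served from the database
second = verify("key-123", cache, database)   # hit: served from the cache
unknown = verify("missing", cache, database)  # not present in either store
```

A real deployment would also need expiry or invalidation so that a revoked key does not keep being served from the cache; that part is left out of this sketch.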
Deployment topologies
| Topology | Edition | Description |
|---|---|---|
| Single-node | OSS | One process serves both planes |
| Split planes | Commercial | Admin and data planes as separate deployments |
| Edge proxy | Commercial | Sidecar proxy at the edge that caches admin verify responses locally |
Both planes share the same database. Verification uses caching (memory or Redis) to minimize database load.
Ports
| Port | Purpose |
|---|---|
| 4420 | HTTP API (default) |
| 4422 | Prometheus metrics |
Design philosophy
Separation of concerns
The system is divided into distinct layers:
- Admin plane: Management operations (CRUD for keys, rotation, import, token derivation)
- Data plane: High-throughput verification operations
- Persistence layer: Database abstraction with pluggable drivers
- Cache layer: Performance optimization with multiple backends
This separation allows independent scaling of components, different SLOs for different operations (admin targets <100ms p99, data plane targets <3ms p99), and clear boundaries between responsibilities.
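To make the two latency targets concrete, the sketch below checks a set of latency samples against them. The sample values and the names `p99`, `meets_slo`, and `SLO_TARGETS_MS` are made up for illustration, and the nearest-rank percentile used here is one common definition, not necessarily the one used by any particular monitoring stack.

```python
# p99 targets quoted in the text, in milliseconds.
SLO_TARGETS_MS = {"admin": 100.0, "data": 3.0}


def p99(samples_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(samples_ms)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n), 1-based rank
    return ordered[rank - 1]


def meets_slo(plane, samples_ms):
    """True if the observed p99 is strictly below the plane's target."""
    return p99(samples_ms) < SLO_TARGETS_MS[plane]


# Hypothetical samples: 100 requests per plane.
admin_samples = [20.0] * 98 + [80.0, 250.0]  # one slow outlier
data_samples = [1.0] * 97 + [2.5, 4.0, 9.0]  # three slow requests

admin_ok = meets_slo("admin", admin_samples)
data_ok = meets_slo("data", data_samples)
```

With these samples a single 250 ms outlier does not break the admin target, while three slow requests out of 100 are enough to push the data plane past its much tighter budget.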
Production-first design
- Hard isolation between admin and data operations
- Metrics, traces, and structured logs are emitted by default
- Graceful degradation when the database or cache backend is unavailable
- Zero-downtime deployments via rolling updates and stateless verification
Performance characteristics
- Self-contained tokens (JWT/macaroon) enable stateless verification
- HMAC-SHA256 keeps the revocation check on the order of microseconds; bcrypt would cap a single core at roughly 10 verifications per second
- LRU caching for hot paths
- Minimal allocations in the verification path
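As a rough illustration of the keyed-hash check mentioned above, the sketch below computes an HMAC-SHA256 digest with Python's standard `hmac` module and compares it in constant time. The secret, the token strings, and the function names are placeholders; how Talos actually keys and stores these digests is not described here and is an assumption of the example.

```python
import hashlib
import hmac

# Placeholder secret for illustration; a real one would come from configuration.
SECRET = b"example-server-secret"


def fingerprint(secret: bytes, token: str) -> str:
    """Hex-encoded HMAC-SHA256 of the token under the given secret."""
    return hmac.new(secret, token.encode("utf-8"), hashlib.sha256).hexdigest()


def matches(secret: bytes, token: str, expected_hex: str) -> bool:
    """Compare digests with hmac.compare_digest to avoid timing side channels."""
    return hmac.compare_digest(fingerprint(secret, token), expected_hex)


stored = fingerprint(SECRET, "example-token")  # digest kept at issuance (assumed)
good = matches(SECRET, "example-token", stored)
bad = matches(SECRET, "other-token", stored)
```

Unlike bcrypt, which is deliberately slow, a single HMAC-SHA256 computation is cheap, which is what makes it suitable for a per-request check.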
System architecture
      Clients (CLI, SDK, HTTP)
                 |
                 v
+----------------------------------+
|    HTTP Server (grpc-gateway)    |
|            Port: 4420            |
+----------------------------------+
                 |
                 v
+----------------------------------+
|            Middleware            |
|    Logging, Metrics, Tracing     |
+----------------------------------+
                 |
           +-----+----------+
           |                |
           v                v
     +-----------+    +-----------+
     |   Admin   |    |   Data    |
     |   Plane   |    |   Plane   |
     |  <100ms   |    | <3ms p99  |
     +-----------+    +-----------+
           |                |
           v                v
+----------------------------------+
|          Service Layer           |
|    Business logic, Validation    |
+----------------------------------+
                 |
           +-----+----------+
           |                |
           v                v
     +-----------+    +-----------+
     | Persist.  |    |   Cache   |
     |  SQLite   |    |  Memory   |
     | PG/MySQL  |    |    LRU    |
     |   CRDB    |    |   Redis   |
     +-----------+    +-----------+
All requests enter through a single HTTP server built on grpc-gateway (port 4420) and pass through middleware for logging, metrics, and tracing before being routed to the appropriate plane.
Component overview
HTTP server
The API layer uses grpc-gateway for HTTP/JSON routing with protobuf-based schemas. It serves both planes through a single port, handles CORS and compression, and exposes OpenAPI documentation.
Service layer
Business logic is split between the admin plane service (key lifecycle, import, token derivation, input validation) and the data plane verifier (token parsing, signature verification, revocation checking, cache management). The verifier is optimized for the hot path with minimal allocations.
Persistence
Database access uses sqlc-generated type-safe queries with pluggable drivers:
- SQLite -- OSS edition, zero-config, suitable for millions of keys
- PostgreSQL -- production workloads
- MySQL -- production workloads
- CockroachDB -- distributed deployments
Schema changes are managed through versioned migrations using golang-migrate.
Cache
The cache layer reduces database load on the verification path:
- Memory LRU (OSS) -- local to each instance, configurable size limits
- Redis (Commercial) -- distributed, supports cluster and sentinel modes
- Hierarchical L1+L2 (Commercial) -- memory for speed, Redis for shared state
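A hierarchical cache of this kind can be sketched as a small in-memory LRU in front of a shared store. This is an illustration rather than the Talos implementation: `TieredCache` is a hypothetical name, and a plain dict stands in for Redis so that the example runs without external services.

```python
from collections import OrderedDict


class TieredCache:
    """L1: bounded in-process LRU. L2: shared store (a dict standing in for Redis)."""

    def __init__(self, l1_capacity: int, l2: dict):
        self.l1 = OrderedDict()
        self.l1_capacity = l1_capacity
        self.l2 = l2

    def _put_l1(self, key, value):
        self.l1[key] = value
        self.l1.move_to_end(key)         # mark as most recently used
        if len(self.l1) > self.l1_capacity:
            self.l1.popitem(last=False)  # evict the least recently used entry

    def get(self, key):
        if key in self.l1:
            self.l1.move_to_end(key)     # refresh recency on an L1 hit
            return self.l1[key]
        value = self.l2.get(key)
        if value is not None:
            self._put_l1(key, value)     # promote to L1 for the next lookup
        return value

    def set(self, key, value):
        self.l2[key] = value             # shared state lives in L2
        self._put_l1(key, value)


shared = {}
cache = TieredCache(l1_capacity=2, l2=shared)
cache.set("a", 1)
cache.set("b", 2)
cache.set("c", 3)  # L1 holds only two entries, so "a" is evicted from L1
```

After these calls, `"a"` is gone from the local tier but still resolvable through the shared tier, and reading it promotes it back into L1.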
Crypto
Talos supports multiple JWT signing algorithms and a separate API key hashing mechanism:
- JWT signing algorithms
  - Ed25519 (EdDSA) -- default, fastest signing and smallest keys
  - RSA-2048/4096 (RS256) -- legacy compatibility
- API key hashing
  - HMAC-SHA256 -- used for API key revocation checks (<1ms with constant-time comparison)
The JWT signing algorithm is determined per JWK by its alg field, so one JWKS can contain keys for multiple signing algorithms at the same time.
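Because the algorithm travels with each key, a verifier can pick it from the JWK whose `kid` matches the token. The sketch below uses the standard JWKS layout (RFC 7517, with the OKP key type from RFC 8037 for Ed25519), made-up key IDs, and omits the key material; `algorithm_for` is a hypothetical helper, not part of Talos.

```python
# Example JWKS with one key per algorithm; key IDs are invented and the
# public key parameters are left out for brevity.
JWKS = {
    "keys": [
        {"kid": "ed-2024", "kty": "OKP", "crv": "Ed25519", "alg": "EdDSA", "use": "sig"},
        {"kid": "rsa-legacy", "kty": "RSA", "alg": "RS256", "use": "sig"},
    ]
}


def algorithm_for(jwks: dict, kid: str) -> str:
    """Return the alg declared by the JWK with the given kid."""
    for jwk in jwks["keys"]:
        if jwk.get("kid") == kid:
            return jwk["alg"]
    raise KeyError(f"no key with kid {kid!r}")


default_alg = algorithm_for(JWKS, "ed-2024")
legacy_alg = algorithm_for(JWKS, "rsa-legacy")
```

Taking the algorithm from the trusted key set, rather than from the token's own header, is a common way to avoid algorithm-confusion problems.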
Observability
Built-in instrumentation across three pillars:
- Metrics -- Prometheus exposition on port 4422 with request latency histograms and error rate counters
- Tracing -- OpenTelemetry with W3C Trace Context propagation, configurable sampling, OTLP and Jaeger exporters
- Logging -- structured JSON logging via slog with correlation IDs and contextual fields
Scalability
Small (<1k RPS)
A single Talos instance handles both planes with SQLite and an in-memory LRU cache. No external dependencies required.
- OSS edition sufficient
- 1 CPU, 512MB RAM
- Cost: $5-10/month
Medium (10-50k RPS)
Separate admin and data plane deployments behind a load balancer. PostgreSQL replaces SQLite for durability. Redis provides shared caching across data plane instances.
- Commercial edition
- Auto-scaling for data plane
- Cost: $100-500/month
Large (200k+ RPS)
A cluster of 10-50+ stateless data plane instances with auto-scaling, backed by a distributed Redis cache and PostgreSQL with read replicas and connection pooling. Supports multi-region deployment.
- Commercial edition
- Regional data plane deployment
- Cost: $1-5k/month
