Benchmarks

Talos includes a k6-based load test suite that measures throughput, latency, and correctness under concurrent load. Use these benchmarks to validate your deployment and catch performance regressions.

Note: These benchmarks require the Commercial edition with PostgreSQL (or CockroachDB/MySQL). The OSS edition uses SQLite, which does not support concurrent writers and cannot handle the parallel load generated by multi-VU test profiles.

Reference results

Measured on an Apple M-series machine (M4 Pro Max) running a single-process Commercial binary with PostgreSQL 16, using the stress profile (ramping 0→437 VUs over 5 minutes):

| Metric | Value |
| --- | --- |
| Total requests | ~5,000,000 |
| Peak throughput | 16,766 req/s |
| Overall p99 latency | 123ms |
| Verify p95 latency | 48ms |
| Verify p99 latency | 95ms |
| Error rate | 0.00% |
| Peak VUs | 437 |
| Key creations | 493/s |
| Verifications | 3,797/s |
| Token derivations | 3,797/s |

Profiles

The test suite provides three profiles selected via the TEST_PROFILE environment variable:

| Profile | VUs | Duration | Executor | Purpose |
| --- | --- | --- | --- | --- |
| smoke | 1 read + 1 write | 15s | constant-vus | Quick validation after changes |
| load | 15 read + 5 write | 2min | constant-vus | Sustained load for regression detection |
| stress | 0→437 ramping | 5min | ramping-vus | Find breaking points and measure peak capacity |

The stress profile ramps through five stages:

  1. Warm-up: 0→25 VUs over 30s
  2. Ramp 1: 25→75 VUs over 60s
  3. Ramp 2: 75→150 VUs over 60s
  4. Hold: 150 VUs for 120s
  5. Ramp down: 150→0 VUs over 30s
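
As a quick sanity check, the stage durations listed above add up to the profile's 5-minute duration. A minimal shell sketch of that arithmetic follows; the variable names are illustrative and are not taken from the test suite.

```shell
# Stage durations in seconds, copied from the stage list above.
# The variable names are illustrative, not part of the test suite.
stage_seconds="30 60 60 120 30"

total=0
for s in $stage_seconds; do
  total=$((total + s))
done

echo "total: ${total}s ($((total / 60)) min)"
```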

Read scenarios (verify, batch verify, get key, list keys, JWKS, derive token) get ~70% of VUs. Write scenarios (create, rotate, revoke, import, update, self-revoke) get ~30%.
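
To get a feel for what the ~70/30 split means in practice, the sketch below divides a VU target into read and write shares. The function name `split_vus` and the round-down rule are assumptions made for illustration; the suite may allocate VUs across scenarios differently.

```shell
# split_vus TOTAL
# Prints an approximate read/write VU split for a given total, giving
# 70% (rounded down) to reads and the remainder to writes.
# Illustrative only: the real suite may round or allocate differently.
split_vus() {
  total=$1
  read_vus=$((total * 70 / 100))
  write_vus=$((total - read_vus))
  echo "total=$total read=$read_vus write=$write_vus"
}

# Example VU targets taken from the stress profile stages.
for target in 25 75 150; do
  split_vus "$target"
done
```

With integer arithmetic the shares do not land exactly on 70/30 for every target, which is consistent with the "~" in the description.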

Running benchmarks

Prerequisites

  • k6 load testing tool
  • Docker (for local PostgreSQL) or an existing PostgreSQL instance
  • Go toolchain (to build the binary)

Quick start

# Smoke test (quick validation)
TEST_PROFILE=smoke bash test/load/run.sh

# Load test (sustained)
TEST_PROFILE=load bash test/load/run.sh

# Stress test (peak capacity)
TEST_PROFILE=stress bash test/load/run.sh

The run.sh script handles everything: builds the commercial binary, starts PostgreSQL in Docker, runs migrations, seeds tenant data, starts the server, and executes k6.

Using an existing database

SKIP_DOCKER=true DB_DSN="postgres://user:pass@host:5432/db?sslmode=disable" \
TEST_PROFILE=load bash test/load/run.sh

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| TEST_PROFILE | smoke | Test profile: smoke, load, or stress |
| BASE_URL | http://localhost:4420 | Server base URL |
| AUTH_TOKEN | test-token | Bearer token for admin endpoints |
| DB_DSN | postgres://talos:talos@localhost:5432/talos_test?sslmode=disable | PostgreSQL connection string |
| SKIP_DOCKER | false | Skip Docker PostgreSQL setup (use existing DB) |
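
If you wrap the suite in your own script, the documented defaults can be mirrored with standard shell parameter expansion. The sketch below is an assumption about how such a wrapper might look, not the contents of run.sh; the `validate_profile` helper is hypothetical.

```shell
# Illustrative only: applies the documented defaults when a variable is
# unset or empty. This is not the actual contents of run.sh.
: "${TEST_PROFILE:=smoke}"
: "${BASE_URL:=http://localhost:4420}"
: "${AUTH_TOKEN:=test-token}"
: "${DB_DSN:=postgres://talos:talos@localhost:5432/talos_test?sslmode=disable}"
: "${SKIP_DOCKER:=false}"

# validate_profile NAME (hypothetical helper)
# Returns 0 for a known profile name, 1 otherwise.
validate_profile() {
  case "$1" in
    smoke|load|stress) return 0 ;;
    *) echo "unknown TEST_PROFILE: $1" >&2; return 1 ;;
  esac
}

validate_profile "$TEST_PROFILE" && echo "profile=$TEST_PROFILE url=$BASE_URL"
```

Validating the profile name up front avoids building the binary and starting a database only to fail on a typo.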

Thresholds

Each profile enforces regression thresholds. Tests fail if any threshold is breached.

Smoke and load profiles

| Metric | Threshold | Rationale |
| --- | --- | --- |
| All checks | 100% pass | Zero tolerance for correctness failures |
| HTTP errors | 0% | No errors allowed at low concurrency |
| Overall p99 | < 500ms | Generous headroom for CI runners |
| Verify p95 | < 50ms | ~25ms measured in CI (postgres) |
| Verify p99 | < 100ms | Allows for CI variance |

Stress profile

| Metric | Threshold | Rationale |
| --- | --- | --- |
| All checks | 100% pass | Correctness under load |
| HTTP errors | < 1% | Small tolerance for stress conditions |
| Overall p99 | < 400ms | ~3x headroom over measured 123ms |
| Verify p95 | < 100ms | ~2x headroom over measured 48ms |
| Verify p99 | < 200ms | ~2x headroom over measured 95ms |
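
The headroom figures in the stress rationale are the ratio of each threshold to the measured reference value. A small sketch of that calculation, using a hypothetical `headroom` helper that is not part of the suite, can be reused when re-baselining thresholds against your own measurements:

```shell
# headroom THRESHOLD_MS MEASURED_MS (hypothetical helper)
# Prints the ratio threshold / measured, rounded to one decimal place.
headroom() {
  awk -v t="$1" -v m="$2" 'BEGIN { printf "%.1f\n", t / m }'
}

headroom 400 123   # overall p99
headroom 100 48    # verify p95
headroom 200 95    # verify p99
```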

Interpreting results

After a k6 run, look for:

  • checks rate: Must be 100%. Any failure indicates a correctness bug.
  • http_req_duration percentiles: Compare against the thresholds above. Significant increases suggest a regression.
  • http_req_failed rate: Should be 0% for smoke and load, and under 1% for stress.
  • Custom counters (key_creations, verifications, token_derivations): Compare rates against the reference results to detect throughput regressions.
  • iteration_duration: End-to-end time for each VU iteration including all operations.

Results are saved to .test/k6-output.txt (human-readable) and .test/k6-results.json (machine-readable).
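
For a quick pass/fail scan of the human-readable output, one option is to count the lines that k6 marks as failed. The sketch below assumes .test/k6-output.txt contains k6's standard end-of-test summary, in which failed checks and thresholds start with a "✗" marker; the `count_failures` helper is hypothetical. It only matches lines that begin with the marker, because rate metrics such as http_req_failed also print "✗" mid-line as part of their pass/fail counts.

```shell
# count_failures FILE (hypothetical helper)
# Prints the number of lines that start with the "✗" marker used by k6's
# default text summary for failed checks and thresholds.
# Assumes FILE holds that standard summary text.
count_failures() {
  grep -c '^[[:space:]]*✗' "$1" || true
}

# Example usage after a run:
#   count_failures .test/k6-output.txt
```

A non-zero count is a prompt to read the full summary, not a replacement for the exit status of the run.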