Production Deployment

Checklist before going live

Rotate the setup token

Change ENGRAM_SETUP_TOKEN to a random secret before exposing the server to any network. This token creates tenants — treat it like a root password.

openssl rand -hex 32
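
If openssl is not available on the host, Python's standard `secrets` module can produce a token of the same shape. This is a minimal sketch; the function name `new_setup_token` is illustrative and not part of Engram.

```python
import secrets


def new_setup_token(num_bytes: int = 32) -> str:
    """Return a random hex token from the OS CSPRNG.

    32 bytes yields 64 hex characters, the same shape as
    the output of `openssl rand -hex 32`.
    """
    return secrets.token_hex(num_bytes)


# Print one token to paste into ENGRAM_SETUP_TOKEN.
print(new_setup_token())
```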

Use managed Postgres

Replace the Docker Compose Postgres with a managed database that has automated backups and point-in-time recovery. Recommended: Neon, Supabase, or AWS RDS with pgvector enabled.

DATABASE_URL=postgresql://user:pass@your-db-host:5432/engram?sslmode=require
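
A quick pre-deploy check can confirm that the connection string actually asks for TLS. The sketch below uses only the standard library; the helper name `requires_tls` is hypothetical, and the accepted `sslmode` values are the standard libpq modes that enforce encryption.

```python
from urllib.parse import parse_qs, urlsplit

# libpq sslmode values that refuse an unencrypted connection.
TLS_MODES = {"require", "verify-ca", "verify-full"}


def requires_tls(database_url: str) -> bool:
    """Return True if the URL's sslmode parameter enforces TLS."""
    query = parse_qs(urlsplit(database_url).query)
    # If the parameter is repeated, inspect the last occurrence.
    mode = query.get("sslmode", [""])[-1]
    return mode in TLS_MODES
```

Calling `requires_tls(os.environ["DATABASE_URL"])` in a deploy script is one way to fail fast when `sslmode=require` was dropped from the URL.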

Set LLM_PROVIDER=none for high throughput

If you don’t need LLM-based contradiction detection, set LLM_PROVIDER=none. This eliminates all external API calls on the POST /v1/memories hot path and reduces P99 latency from ~3s to under 150ms.

Restrict network access

Engram’s API should not be directly internet-accessible in most deployments. Put it behind a reverse proxy (Nginx, Caddy, or a cloud load balancer) and restrict the setup endpoint:

location /v1/setup {
    allow 10.0.0.0/8;
    deny all;
}
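
To see which client addresses the `allow 10.0.0.0/8` rule admits, the same match can be reproduced with Python's `ipaddress` module. This is only an illustration of the CIDR check, not how Nginx is implemented; `setup_allowed` is a hypothetical helper.

```python
import ipaddress

# The single network allowed to reach /v1/setup in the Nginx rule above.
ALLOWED = ipaddress.ip_network("10.0.0.0/8")


def setup_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside the allowed network."""
    return ipaddress.ip_address(client_ip) in ALLOWED
```

Note that other private ranges such as 192.168.0.0/16 and 172.16.0.0/12 are not covered by 10.0.0.0/8; if your internal network uses them, the `allow` line needs to change accordingly.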

Configure rate limits

Set per-tenant rate limits appropriate for your expected traffic:

RATE_LIMIT_RPS=200
RATE_LIMIT_BURST=50
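
How the two values interact depends on the limiter. The sketch below assumes a conventional token bucket in which `RATE_LIMIT_RPS` is the refill rate and `RATE_LIMIT_BURST` is the bucket capacity; this page does not confirm that Engram's limiter works this way, so treat it as a model for reasoning about the numbers rather than a description of the implementation.

```python
class TokenBucket:
    """Generic token bucket: refills at `rps` tokens/second, holds at most `burst`."""

    def __init__(self, rps: float, burst: int):
        self.rps = rps
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # timestamp of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is admitted."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never exceeding the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rps)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rps=200, burst=50)
# 60 requests arriving at the same instant: only the burst capacity is admitted.
admitted = sum(bucket.allow(now=0.0) for _ in range(60))
```

Under this model, a tenant can send up to 50 requests at once and then sustain 200 requests per second.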

Connection pooling

Engram opens one connection per request by default. At high QPS, add PgBouncer in front of Postgres:

# docker-compose.production.yml
services:
  pgbouncer:
    image: edoburu/pgbouncer
    environment:
      # Upstream Postgres that PgBouncer connects to
      DATABASE_URL: postgresql://user:pass@postgres:5432/engram
      POOL_MODE: transaction
      MAX_CLIENT_CONN: 1000
      DEFAULT_POOL_SIZE: 25
    ports:
      - "5432:5432"

Point DATABASE_URL at PgBouncer, not Postgres directly.
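
In practice this means swapping only the host and port in the URL while keeping the credentials and database name. A small sketch of that rewrite follows; `point_at_pooler` is a hypothetical helper, and the host name `pgbouncer` assumes the Compose service name from the snippet above.

```python
from urllib.parse import urlsplit, urlunsplit


def point_at_pooler(database_url: str, pooler_host: str, pooler_port: int = 5432) -> str:
    """Return database_url with its host and port replaced by the pooler's."""
    parts = urlsplit(database_url)
    # Keep "user:pass" (everything before the last "@"), if present.
    userinfo, _, _ = parts.netloc.rpartition("@")
    hostport = f"{pooler_host}:{pooler_port}"
    netloc = f"{userinfo}@{hostport}" if userinfo else hostport
    return urlunsplit(parts._replace(netloc=netloc))


pooled_url = point_at_pooler("postgresql://user:pass@postgres:5432/engram", "pgbouncer")
```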

Reverse proxy with TLS

Caddy (simplest)

engram.yourdomain.com {
    reverse_proxy localhost:8080
}

Caddy handles TLS automatically via Let’s Encrypt.

Nginx

server {
    listen 443 ssl;
    server_name engram.yourdomain.com;

    ssl_certificate     /etc/letsencrypt/live/engram.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/engram.yourdomain.com/privkey.pem;

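    location / {
        proxy_pass http://localhost:8080;
        # Pass the original host, client address, and scheme to Engram
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 30s;
    }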
}

Environment variables (production)

# Server
SERVER_PORT=8080
LOG_LEVEL=warn          # reduce log volume in production

# Database — use your managed Postgres URL
DATABASE_URL=postgresql://user:pass@db-host:5432/engram?sslmode=require

# Auth
ENGRAM_SETUP_TOKEN=<random-secret-from-openssl-rand>

# Embeddings
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=sk-...

# LLM (set to none for embedding-only mode)
LLM_PROVIDER=none

# Rate limiting
RATE_LIMIT_RPS=200
RATE_LIMIT_BURST=50
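
A short pre-flight check can catch a missing or placeholder value before the server starts. The sketch below is an assumption-laden example: the set of variables treated as required and the helper name `check_env` are choices made for illustration, not rules enforced by Engram.

```python
# Variables this sketch treats as mandatory (an assumption, not Engram's own list).
REQUIRED = ("DATABASE_URL", "ENGRAM_SETUP_TOKEN", "EMBEDDING_PROVIDER", "LLM_PROVIDER")


def check_env(env: dict) -> list:
    """Return a list of human-readable problems; empty means the env looks sane."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    # The template above uses an angle-bracket placeholder for the token.
    if env.get("ENGRAM_SETUP_TOKEN", "").startswith("<"):
        problems.append("ENGRAM_SETUP_TOKEN still holds the placeholder value")

    # The OpenAI embedding provider needs an API key.
    if env.get("EMBEDDING_PROVIDER") == "openai" and not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is required when EMBEDDING_PROVIDER=openai")

    return problems
```

Running `check_env(dict(os.environ))` in an entrypoint script and exiting non-zero when the list is non-empty is one way to wire this in.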

Health checks

Use the health endpoint for load balancer probes:

GET /health
# {"status":"ok"}

Recommended probe configuration:

Interval: 10s
Timeout: 5s
Unhealthy threshold: 3 consecutive failures
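
If your load balancer does not provide probes, the same policy can be approximated in a small watchdog. The sketch below uses only the standard library; `probe` and `ProbeState` are hypothetical names, and the base URL passed to `probe` is whatever address your instance listens on.

```python
import json
import urllib.request


def probe(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers 200 with {"status": "ok"}."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and json.load(resp).get("status") == "ok"
    except (OSError, ValueError, AttributeError):
        # Connection errors, timeouts, HTTP errors, or an unexpected body.
        return False


class ProbeState:
    """Tracks consecutive failures against an unhealthy threshold."""

    def __init__(self, unhealthy_threshold: int = 3):
        self.threshold = unhealthy_threshold
        self.failures = 0

    def record(self, ok: bool) -> bool:
        """Record one probe result; return True while the target counts as healthy."""
        self.failures = 0 if ok else self.failures + 1
        return self.failures < self.threshold
```

A caller would invoke `state.record(probe(url))` every 10 seconds and take the instance out of rotation when it returns False.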

Monitoring

Engram exposes a JSON metrics endpoint:

GET /metrics

Key metrics to alert on:

| Metric | Alert threshold |
|---|---|
| `recall_latency_p95` | > 500ms |
| `memory_store_errors` | > 1% error rate |
| `database_connections` | > 80% of pool |
| `decay_job_last_run` | > 2 hours ago |

The endpoint returns JSON, so Prometheus cannot scrape it directly. A native /metrics endpoint in Prometheus exposition format is on the roadmap (P4-5).
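
In the meantime, a small adapter can translate the JSON payload into the Prometheus text format. The sketch below assumes the endpoint returns a flat JSON object whose numeric fields are the metrics; the actual payload shape is not documented here, and the `engram_` prefix and the function name `to_prometheus` are illustrative choices.

```python
import json


def to_prometheus(metrics_json: str, prefix: str = "engram_") -> str:
    """Convert a flat JSON object of numeric metrics to Prometheus text format.

    Every numeric field is exported as a gauge; non-numeric fields are skipped.
    """
    data = json.loads(metrics_json)
    lines = []
    for name, value in sorted(data.items()):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        lines.append(f"# TYPE {prefix}{name} gauge")
        lines.append(f"{prefix}{name} {value}")
    return "\n".join(lines) + "\n"
```

Serving this output from a sidecar gives Prometheus something to scrape until native support lands.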

Backup and recovery

Neon / Supabase

Both offer point-in-time recovery (PITR) as a managed feature. Confirm that it is enabled on your plan and that the retention window meets your recovery objectives.

Self-managed Postgres

Set up pg_dump on a cron:

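# Daily backup to S3 at 02:00. In a crontab, % must be escaped as \%.
# DATABASE_URL must be defined in the crontab or the job's environment.
0 2 * * * pg_dump "$DATABASE_URL" | gzip | aws s3 cp - s3://your-bucket/engram/$(date +\%Y\%m\%d).sql.gz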

Engram stores embeddings in pgvector columns. pg_dump records the CREATE EXTENSION statement but not the extension itself, so the restore target must have pgvector installed before you load the dump.

Scaling

Engram is stateless on the HTTP path — the only shared state is Postgres. To scale horizontally:

Run multiple server instances behind a load balancer
Ensure all instances point to the same DATABASE_URL
Background workers (decay, consolidation) use Postgres advisory locks to prevent duplicate runs — only one instance runs them at a time, regardless of replica count

# Scale to 3 replicas on Fly.io
fly scale count 3
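
The single-runner behaviour of the background workers can be pictured with Postgres advisory locks as follows. This is a sketch of the general pattern, not Engram's code: the lock key `4242` and the function name `run_if_leader` are hypothetical, and `cursor` is any DB-API cursor connected to Postgres (for example from psycopg).

```python
DECAY_LOCK_KEY = 4242  # hypothetical lock id shared by all replicas


def run_if_leader(cursor, job, lock_key: int = DECAY_LOCK_KEY) -> bool:
    """Run job() only if this instance wins the advisory lock.

    Returns True if the job ran, False if another instance holds the lock.
    """
    # pg_try_advisory_lock returns immediately instead of waiting.
    cursor.execute("SELECT pg_try_advisory_lock(%s)", (lock_key,))
    (acquired,) = cursor.fetchone()
    if not acquired:
        return False
    try:
        job()
    finally:
        # Release the lock even if the job raised.
        cursor.execute("SELECT pg_advisory_unlock(%s)", (lock_key,))
    return True
```

Session-level locks like the one in this sketch are bound to a single server connection, so they are not compatible with PgBouncer's transaction pooling mode. Whether Engram uses session-level or transaction-level locks is not stated on this page.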
