
What is Engram?

Engram is memory infrastructure for AI agents: a REST API and SDK that give your agents persistent, reliable memory. Beyond storage and retrieval, it adds a cognitive layer that tracks how confident the agent should be about each thing it knows. Most memory systems treat every stored fact as equally true, forever. Engram doesn’t. Every memory has a confidence score that changes over time based on:
  • New evidence that reinforces or contradicts it
  • How recently it was accessed
  • Whether competing memories suppress it
  • Whether it was explicitly contradicted
This means agents built on Engram stop confidently acting on stale, wrong, or contradictory information.

Why Engram?

Calibrated Confidence

Confidence is updated with log-odds arithmetic, so scores stay properly calibrated probabilities rather than raw cosine similarity passed off as certainty.
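As an illustration of why log-odds arithmetic keeps scores inside the probability range, here is a minimal sketch. The function names and the evidence weights of ±1.0 are assumptions made for this example, not Engram's actual API or tuning.

```python
import math


def to_log_odds(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def to_probability(log_odds: float) -> float:
    """Convert log-odds back to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def update_confidence(confidence: float, evidence_weight: float) -> float:
    """Add evidence in log-odds space.

    A positive weight reinforces the memory and a negative weight
    contradicts it. The weight scale is a hypothetical choice.
    """
    return to_probability(to_log_odds(confidence) + evidence_weight)


reinforced = update_confidence(0.70, +1.0)
contradicted = update_confidence(0.70, -1.0)
print(round(reinforced, 3))    # 0.864
print(round(contradicted, 3))  # 0.462
```

Because evidence is added in log-odds space and mapped back through the logistic function, the result always stays strictly between 0 and 1, however much evidence accumulates.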

Contradiction Detection

When new information contradicts an existing belief, Engram detects the conflict and resolves it by demoting the old memory, archiving it, or letting both coexist, depending on the contradiction type.
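One way to picture the type-dependent handling is a lookup from contradiction type to resolution. The three resolutions (demote, archive, coexist) come from the description above. The contradiction type names and the mapping between them are hypothetical, chosen only to make the sketch concrete.

```python
from enum import Enum


class ContradictionType(Enum):
    # Hypothetical categories; Engram's real taxonomy is not shown here.
    SUPERSEDED = "superseded"  # a newer fact replaces an older one
    DIRECT = "direct"          # both statements cannot be true at once
    CONTEXTUAL = "contextual"  # both can hold in different contexts


class Resolution(Enum):
    ARCHIVE = "archive"
    DEMOTE = "demote"
    COEXIST = "coexist"


# Assumed policy table, for illustration only.
POLICY = {
    ContradictionType.SUPERSEDED: Resolution.ARCHIVE,
    ContradictionType.DIRECT: Resolution.DEMOTE,
    ContradictionType.CONTEXTUAL: Resolution.COEXIST,
}


def resolve(kind: ContradictionType) -> Resolution:
    """Pick how the existing memory is treated for a given contradiction."""
    return POLICY[kind]


print(resolve(ContradictionType.CONTEXTUAL).value)  # coexist
```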

Memory Lifecycle

Unused memories decay. Frequently accessed ones strengthen. Competing similar memories suppress each other. The agent’s knowledge matures over time.
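A rough sketch of decay and strengthening, under stated assumptions: confidence is held as log-odds, idle memories drift toward 50% with a fixed half-life, and each access adds a constant boost. The constants and the `Memory` class are hypothetical, and suppression between competing memories is not modelled.

```python
import math
from dataclasses import dataclass

HALF_LIFE_DAYS = 30.0  # hypothetical decay half-life
ACCESS_BOOST = 0.5     # hypothetical log-odds gain per access


@dataclass
class Memory:
    text: str
    log_odds: float  # 0.0 corresponds to 50% confidence

    @property
    def confidence(self) -> float:
        """Confidence as a probability."""
        return 1.0 / (1.0 + math.exp(-self.log_odds))


def decay(memory: Memory, idle_days: float) -> None:
    """Shrink log-odds toward 0 (50% confidence) while the memory is unused."""
    memory.log_odds *= 0.5 ** (idle_days / HALF_LIFE_DAYS)


def reinforce(memory: Memory) -> None:
    """Strengthen a memory each time it is accessed."""
    memory.log_odds += ACCESS_BOOST


memory = Memory("User prefers dark mode", log_odds=2.0)
decay(memory, idle_days=30.0)  # one half-life: log-odds 2.0 -> 1.0
reinforce(memory)              # one access: log-odds 1.0 -> 1.5
print(memory.log_odds)         # 1.5
```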

Knowledge Health

A metacognitive layer surfaces the overall health of an agent’s knowledge: confidence distribution, contradiction count, staleness indicators, and learning velocity.
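To make the four indicators concrete, the sketch below computes them over an in-memory list. The record fields, the bucket boundaries, the 90-day staleness cutoff, and the 7-day velocity window are all assumptions for illustration. They do not describe Engram's actual response schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class MemoryRecord:
    confidence: float
    created_at: datetime
    last_accessed: datetime
    contradicted: bool = False


def knowledge_health(memories, now,
                     stale_after=timedelta(days=90),
                     window=timedelta(days=7)):
    """Summarise a set of memories; thresholds are hypothetical."""
    buckets = {"low": 0, "medium": 0, "high": 0}
    for m in memories:
        if m.confidence < 0.4:
            buckets["low"] += 1
        elif m.confidence < 0.8:
            buckets["medium"] += 1
        else:
            buckets["high"] += 1
    recent = sum(now - m.created_at <= window for m in memories)
    return {
        "confidence_distribution": buckets,
        "contradiction_count": sum(m.contradicted for m in memories),
        "stale_count": sum(now - m.last_accessed > stale_after for m in memories),
        "learning_velocity": recent / window.days,  # new memories per day
    }


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
sample = [
    MemoryRecord(0.95, now - timedelta(days=2), now - timedelta(days=1)),
    MemoryRecord(0.55, now - timedelta(days=200), now - timedelta(days=120), True),
    MemoryRecord(0.20, now - timedelta(days=5), now - timedelta(days=5)),
]
report = knowledge_health(sample, now)
print(report["confidence_distribution"])  # {'low': 1, 'medium': 1, 'high': 1}
```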

How it works

Engram is a database layer, not another LLM consumer. In embedding-only mode (LLM_PROVIDER=none), it runs entirely on pgvector + heuristics with zero external API calls on the memory store path.
Your Agent  →  POST /v1/memories  →  Engram  →  PostgreSQL + pgvector
Your Agent  ←  GET /v1/memories/recall  ←  Hybrid vector + graph recall
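The two calls in the flow above could be built as follows with only the Python standard library. The endpoint paths come from this page. The base URL, the `content` body field, and the `query` parameter are assumptions, and authentication is left out, so check the API reference before relying on them. The requests are constructed but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # assumed address of a local Engram instance


def build_store_request(content: str) -> Request:
    """Build POST /v1/memories; the 'content' field name is an assumption."""
    body = json.dumps({"content": content}).encode("utf-8")
    return Request(
        f"{BASE_URL}/v1/memories",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_recall_request(query: str) -> Request:
    """Build GET /v1/memories/recall; the 'query' parameter is an assumption."""
    params = urlencode({"query": query})
    return Request(f"{BASE_URL}/v1/memories/recall?{params}", method="GET")


store = build_store_request("The user prefers dark mode")
recall = build_recall_request("ui preferences")
print(store.get_method(), store.full_url)
print(recall.get_method(), recall.full_url)
```

Passing either object to `urllib.request.urlopen` would send it to a running server.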
When a memory is stored:
  1. An embedding is generated (OpenAI or local)
  2. Similar existing memories are found via the HNSW index
  3. Contradiction detection runs (LLM-based or embedding-based)
  4. The memory is stored with an initial confidence score based on evidence type
  5. Graph edges are built to connected memories
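The five store steps can be mimicked with a toy in-memory model. This is a sketch under heavy assumptions, not Engram's implementation: brute-force cosine search stands in for the HNSW index, the embedding and contradiction functions are supplied by the caller, and the evidence types and their starting confidences are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical starting confidence per evidence type (illustrative values).
INITIAL_CONFIDENCE = {"user_stated": 0.9, "observed": 0.7, "inferred": 0.5}


@dataclass
class Memory:
    id: int
    text: str
    embedding: list[float]
    confidence: float
    edges: set[int] = field(default_factory=set)
    conflicts: set[int] = field(default_factory=set)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyStore:
    def __init__(self, embed: Callable[[str], list[float]],
                 contradicts: Callable[[str, str], bool],
                 threshold: float = 0.8) -> None:
        self.embed = embed
        self.contradicts = contradicts
        self.threshold = threshold
        self.memories: list[Memory] = []

    def store(self, text: str, evidence_type: str) -> Memory:
        embedding = self.embed(text)                                  # step 1
        similar = [m for m in self.memories                           # step 2
                   if cosine(embedding, m.embedding) >= self.threshold]
        conflicts = {m.id for m in similar                            # step 3
                     if self.contradicts(text, m.text)}
        memory = Memory(len(self.memories), text, embedding,          # step 4
                        INITIAL_CONFIDENCE[evidence_type], conflicts=conflicts)
        for m in similar:                                             # step 5
            memory.edges.add(m.id)
            m.edges.add(memory.id)
        self.memories.append(memory)
        return memory


# Hand-written vectors replace a real embedding model in this example.
vectors = {"likes tea": [1.0, 0.0], "dislikes tea": [0.9, 0.1], "owns a bike": [0.0, 1.0]}
toy = ToyStore(embed=vectors.__getitem__,
               contradicts=lambda new, old: new.replace("dislikes", "likes") == old)
first = toy.store("likes tea", "user_stated")
second = toy.store("dislikes tea", "observed")
third = toy.store("owns a bike", "inferred")
print(second.edges, second.conflicts, third.edges)  # {0} {0} set()
```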
When memories are recalled:
  1. The query is embedded
  2. HNSW approximate nearest-neighbour search finds candidate memories
  3. Graph traversal surfaces thematically connected memories not caught by vector search
  4. Results are scored by similarity × confidence × freshness and ranked
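The ranking rule in step 4 is a plain product, which the sketch below reproduces. The `Candidate` fields and the exponential freshness curve with a 30-day half-life are assumptions, since this page does not define how freshness is computed.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    similarity: float  # vector similarity to the query, 0..1
    confidence: float  # current confidence score, 0..1
    age_days: float    # days since last access


def freshness(age_days: float, half_life_days: float = 30.0) -> float:
    """Hypothetical freshness curve: halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def score(c: Candidate) -> float:
    """similarity x confidence x freshness, as described in step 4."""
    return c.similarity * c.confidence * freshness(c.age_days)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)


candidates = [
    Candidate("close match, low confidence", 0.95, 0.40, 0.0),  # score 0.38
    Candidate("good match, high confidence", 0.80, 0.90, 0.0),  # score 0.72
    Candidate("good match, but stale", 0.90, 0.90, 60.0),       # score 0.2025
]
ranked = rank(candidates)
print([c.text for c in ranked])
```

In this example the closest vector match does not rank first: its low confidence pulls it below a slightly less similar but well-supported memory, and the stale memory ranks last.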

Architecture

┌───────────────────────────────────────────────────────────┐
│                       Your AI Agent                       │
└─────────────────────────────┬─────────────────────────────┘
                              │ HTTP / Python SDK
┌─────────────────────────────▼─────────────────────────────┐
│                   Engram REST API (Go)                    │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  │
│  │    Memory     │  │ Contradiction │  │   Cognitive   │  │
│  │    Service    │  │   Detector    │  │    Health     │  │
│  └───────┬───────┘  └───────────────┘  └───────────────┘  │
│          │                                                │
│  ┌───────▼─────────────────────────────────────────────┐  │
│  │            PostgreSQL + pgvector (HNSW)             │  │
│  └─────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────┘

Get started

Quickstart

Store your first memory and run your first recall in 5 minutes.

Python SDK

pip install engram.to — full async + sync support with Pydantic v2 models.

Self-Hosting

Run Engram on your own infrastructure with Docker Compose in one command.

Benchmarks

82.0% on LongMemEval (ICLR 2025). Full per-task results and methodology.