§ RecordorAI

AI that remembers. Every client. Every conversation. Yours.

Every AI company has great models. Almost none have solved memory. RecordorAI is the persistent memory layer that makes AI agents actually useful across days, sessions, and users.

§ The Problem

AI agents hit a memory wall. Not a model wall.

Context windows are expensive and ephemeral. Million-token context windows exist — but they're wiped at the end of each session. Paying for 1M tokens on every API call is cost-prohibitive for ongoing agentic workflows.

Memory doesn't survive across sessions. The OpenAI and Anthropic APIs are built around single conversations. An agent that needs to remember what a user decided three weeks ago, across a dozen sessions, is not a supported use case.

Inference cost compounds with history. Every extra turn of conversation adds to the context window, and a naive agent resends that whole window on the next call. At 10K MAU, naive API-based approaches run $75K–180K/month in inference fees before you're even profitable.

Data stays on third-party infrastructure. Every memory API call to a closed provider means your users' conversation history leaves your infrastructure. Enterprise customers won't accept that. Regulated industries can't.

The bottleneck isn't models. It's memory that persists.
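To see why cost compounds, here is a back-of-envelope sketch. It assumes a naive agent that resends the full history on every call, compared with one that sends a fixed-size active window. The function names, the 500-tokens-per-turn figure, and the 5,000-token window are illustrative assumptions, not measurements.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens billed when every call resends the full history."""
    # Call k carries k turns of history, so the total is t * (1 + 2 + ... + n).
    return tokens_per_turn * turns * (turns + 1) // 2


def flat_input_tokens(turns: int, window_tokens: int) -> int:
    """Total input tokens when each call sends a fixed-size window instead."""
    return turns * window_tokens


# Hypothetical workload: 100 turns, 500 new tokens per turn.
naive = cumulative_input_tokens(turns=100, tokens_per_turn=500)
bounded = flat_input_tokens(turns=100, window_tokens=5_000)
print(f"resend everything: {naive:,} tokens; fixed window: {bounded:,} tokens")
```

The naive total grows with the square of the number of turns, while the fixed-window total grows linearly.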

§ Architecture

Tiered memory. Built for agents that work.

Hot — GPU KV Cache

5–10K tokens of active context in GPU VRAM. Instant recall within a session. Near-zero latency. ~$0.003 per 1K tokens — 800x cheaper than GPT-5.4 for equivalent context density.

Warm — Semantic Vector Store

Full conversation history in pgvector + Supabase. Infinite retention, queried on demand. No per-token cost scaling with history length. Effectively free past the first retrieval.

Cold — Entity Compression

Each entity — user, project, decision — summarized into ~500 tokens via extractive compression (not generative). No hallucination risk. Costs almost nothing to store and retrieve.
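As a rough illustration of what extractive (not generative) compression means, the sketch below keeps only sentences that already exist in the source text, ranked by a simple word-frequency score, until a token budget is reached. The scoring method, the function name `extractive_summary`, and counting tokens as whitespace-separated words are all assumptions made for the example. The text above does not describe RecordorAI's actual compressor.

```python
import re
from collections import Counter


def extractive_summary(text: str, token_budget: int = 500) -> str:
    """Pick verbatim sentences from `text` until the budget is spent."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z0-9']+", text.lower()))

    def score(sentence: str) -> float:
        # Average corpus frequency of the sentence's words.
        toks = re.findall(r"[a-z0-9']+", sentence.lower())
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen, used = [], 0
    for i in ranked:
        cost = len(sentences[i].split())  # assumption: one word is one token
        if used + cost <= token_budget:
            chosen.append(i)
            used += cost
    # Emit the kept sentences in their original order, unmodified.
    return " ".join(sentences[i] for i in sorted(chosen))


history = "The user chose Postgres. The user prefers weekly reports. Lunch was fine."
print(extractive_summary(history, token_budget=8))
```

Because every output sentence is copied verbatim from the input, the summary cannot contain a statement the source never made, though it can still drop context.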

§ Implementation

Technical details

SQLite + sqlite-vec + BM25 hybrid store
Cross-encoder reranking for accurate retrieval
3-way Reciprocal Rank Fusion (RRF) for result merging
Negation reformulator for query precision
CoreML/ANE-ported embeddings (+20pt R@1 lift over upstream)
Rust crate FFI interface — no subprocess hop from alice-runtime
qmd-recordorai-shim CLI for OpenClaw compatibility
SQLite · sqlite-vec · BM25 · Cross-Encoder · RRF · CoreML · Rust FFI · Supabase · pgvector

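For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of merging three ranked lists. Each document scores the sum of 1 / (k + rank) over the lists it appears in, with k = 60 as the conventional constant. Which three rankings RecordorAI actually fuses is not stated above, so the three input lists here (vector, BM25, and a third reranked list) and the memory IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of IDs; a higher fused score means a better match."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from three retrievers, best match first.
vector_hits = ["m3", "m1", "m7"]
bm25_hits = ["m1", "m3", "m9"]
reranked_hits = ["m1", "m7", "m3"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits, reranked_hits])
print(fused)
```

RRF uses only rank positions, so it needs no score normalization across retrievers whose raw scores live on different scales.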
§ A.L.I.C.E. Stack

Memory in the full stack

In the A.L.I.C.E. architecture, Hub routes a request to the right runtime, which loads the appropriate agent workspace and persona. When the agent needs to remember or retrieve, it calls RecordorAI via FFI (alice-runtime) or the qmd-recordorai-shim (OpenClaw) — no subprocess overhead, no context loss.

Your agents are only as good as their memory.

RecordorAI is available as a standalone product. Let's talk about what it would take to give your agents a persistent memory.

Learn More at recordor.ai