SQLite LLM Agents Guide: Local Memory for Autonomous AI

If your LLM agents feel fragile or over-engineered, start with SQLite. Inspired by Simon Willison’s take on agent design, the simplest stack often wins for durability, observability, and speed (source).

Why SQLite for agents

Durable by default: ACID transactions and a single-file DB make crashes less scary.
Zero-ops: Ship the DB with your app. No cluster to run for many use cases.
Inspectable: Query every tool call, prompt, and decision in plain SQL.
Search-ready: Full‑text search (FTS5) and vectors via extensions power hybrid RAG.
Good-enough concurrency: Readers scale, one fast writer with WAL mode works for most agents.

See docs for WAL mode and FTS5. For vectors, try sqlite-vec.

A practical schema that just works

sessions(id, created_at, meta JSON): each agent run/session.
events(id, session_id, ts, type, payload JSON): append-only log of prompts, tool calls, errors, outputs.
messages(id, session_id, role, content, tokens, ts): chat turns for quick replay.
tools(name, schema JSON, code_ref, enabled): registry of callable tools.
artifacts(id, session_id, kind, path/blob, meta JSON): files, code, or results produced.
memories(id, scope, content, embedding, ts): long-term facts; index with FTS5 + vectors.
jobs(id, status, run_at, payload JSON, attempts): lightweight queue for background work.

Tip: Keep events append-only for perfect auditability, then derive views (e.g., latest state, success rates) with SQL.

Patterns to copy

Hybrid search: store content in memories, add FTS5 index + vector column. First match semantic neighbors, then rerank with keyword hits.
Tool safety: validate tool inputs against the tools.schema JSON, log every invocation in events with inputs/outputs.
Deterministic replays: bind a fixed seed and model params; replay a session by streaming its events.
Cheap analytics: compute latency, token usage, and failure modes directly from events—no separate telemetry pipeline needed.
Resilience: enable WAL, add periodic VACUUM, and stream backups to object storage with Litestream.

Starter queries

– Last 50 tool calls: select ts, json_extract(payload,'$.tool'), json_extract(payload,'$.status') from events where type='tool_call' order by ts desc limit 50;

– Sessions with failures: select session_id, count(*) as failures from events where type='error' group by session_id order by failures desc;

– Hybrid RAG (conceptual): FTS match on text, union with top-k vector neighbors, then dedupe + score.

Quick-start checklist

Turn on WAL: PRAGMA journal_mode=WAL;
Create the seven tables above; keep events append-only.
Add FTS5 to memories.content and a vector column for embeddings.
Wrap every model/tool step in a transaction and log to events.
Expose a /inspect endpoint to query sessions and events for live debugging.
Automate snapshots or stream to S3 with Litestream.

The takeaway

Most agents don’t need a distributed datastore. Start with SQLite for durable memory, tools, and logs you can actually debug—then scale only when the data demands it.

Enjoy nuggets like this? Subscribe to our weekly brief: theainuggets.com/newsletter.

Subscribe

What's Hot

Why SQLite Belongs at the Core of Your LLM Agents