If your LLM agents feel fragile or over-engineered, start with SQLite. Inspired by Simon Willison’s take on agent design, the simplest stack often wins for durability, observability, and speed (source).
Why SQLite for agents
- Durable by default: ACID transactions and a single-file DB make crashes less scary.
- Zero-ops: Ship the DB with your app. No cluster to run for many use cases.
- Inspectable: Query every tool call, prompt, and decision in plain SQL.
- Search-ready: Full‑text search (FTS5) and vectors via extensions power hybrid RAG.
- Good-enough concurrency: Readers scale, one fast writer with WAL mode works for most agents.
See docs for WAL mode and FTS5. For vectors, try sqlite-vec.
A practical schema that just works
- sessions(id, created_at, meta JSON): each agent run/session.
- events(id, session_id, ts, type, payload JSON): append-only log of prompts, tool calls, errors, outputs.
- messages(id, session_id, role, content, tokens, ts): chat turns for quick replay.
- tools(name, schema JSON, code_ref, enabled): registry of callable tools.
- artifacts(id, session_id, kind, path/blob, meta JSON): files, code, or results produced.
- memories(id, scope, content, embedding, ts): long-term facts; index with FTS5 + vectors.
- jobs(id, status, run_at, payload JSON, attempts): lightweight queue for background work.
Tip: Keep events append-only for perfect auditability, then derive views (e.g., latest state, success rates) with SQL.
Patterns to copy
- Hybrid search: store content in
memories, add FTS5 index + vector column. First match semantic neighbors, then rerank with keyword hits. - Tool safety: validate tool inputs against the
tools.schemaJSON, log every invocation ineventswith inputs/outputs. - Deterministic replays: bind a fixed seed and model params; replay a session by streaming its
events. - Cheap analytics: compute latency, token usage, and failure modes directly from
events—no separate telemetry pipeline needed. - Resilience: enable WAL, add periodic VACUUM, and stream backups to object storage with Litestream.
Starter queries
– Last 50 tool calls: select ts, json_extract(payload,'$.tool'), json_extract(payload,'$.status') from events where type='tool_call' order by ts desc limit 50;
– Sessions with failures: select session_id, count(*) as failures from events where type='error' group by session_id order by failures desc;
– Hybrid RAG (conceptual): FTS match on text, union with top-k vector neighbors, then dedupe + score.
Quick-start checklist
- Turn on WAL:
PRAGMA journal_mode=WAL; - Create the seven tables above; keep
eventsappend-only. - Add FTS5 to
memories.contentand a vector column for embeddings. - Wrap every model/tool step in a transaction and log to
events. - Expose a
/inspectendpoint to query sessions and events for live debugging. - Automate snapshots or stream to S3 with Litestream.
The takeaway
Most agents don’t need a distributed datastore. Start with SQLite for durable memory, tools, and logs you can actually debug—then scale only when the data demands it.
Enjoy nuggets like this? Subscribe to our weekly brief: theainuggets.com/newsletter.

