100% local, OpenAI-compatible memory layer for AI agents. On-device embeddings via Ollama, SQLite + sqlite-vec for vector search, zero cloud round-trips. Drop-in for any code that already speaks `/v1/embeddings`.
Every AI-agent memory repo today — Supermemory, claude-mem, mem0, Graphiti — assumes you'll call OpenAI for embeddings and often a hosted vector DB on top. That means three things for a solo dev or privacy-conscious user:
- Every agent interaction has a network round-trip.
- A paid API bill scales with your agent's usage.
- Your memory lives off-device, off-machine, in someone else's cloud.
local-memory is the opposite: a single Node process, one SQLite file on disk, embeddings via a local Ollama instance running on your own laptop. It exposes an OpenAI-compatible `/v1/embeddings` endpoint so any existing client that accepts a `baseURL` override just works — no SDK swap, no code rewrite. It also exposes a small first-class memory API (`POST /memory`, `/memory/search`, namespaces, export, import) for when you want more than just embeddings.
```sh
# Prereq: Ollama running with an embedding model pulled
ollama pull nomic-embed-text
ollama serve &   # listens on 127.0.0.1:11434

# Start the memory server
npx -y @swarmclawai/local-memory start

# Point your OpenAI client at it
export OPENAI_BASE_URL=http://localhost:3456/v1
export OPENAI_API_KEY=not-used

# From another shell, store and search memories
npx -y @swarmclawai/local-memory add "User prefers kebab-case for slugs"
npx -y @swarmclawai/local-memory search "casing conventions"
```

Install globally, or run on demand:

```sh
pnpm add -g @swarmclawai/local-memory
# or
npm i -g @swarmclawai/local-memory
# or run on demand
npx -y @swarmclawai/local-memory --help
```

| Provider | Default model | Dim | When to use |
|---|---|---|---|
| `ollama` (default) | `nomic-embed-text` | 768 | You have `ollama serve` running locally |
| `mock` | deterministic hash | 64 | Unit tests, offline demos, zero-dep fallback |
Override the embedder with `--provider`, `--model`, `--dim`, and `--embed-base-url`.
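The `mock` provider's "deterministic hash" embedder is an internal detail, but the idea can be sketched in a few lines. The function name and hashing scheme below are illustrative assumptions, not the package's actual implementation:

```typescript
import { createHash } from "node:crypto";

// Sketch of a deterministic hash embedder: same text always yields the
// same vector, no model required. Useful for unit tests and offline demos.
function mockEmbed(text: string, dim = 64): number[] {
  const vec: number[] = [];
  let counter = 0;
  // Keep hashing (text + counter) until we have enough bytes,
  // then map each byte into a component in [-1, 1].
  while (vec.length < dim) {
    const digest = createHash("sha256")
      .update(text)
      .update(String(counter++))
      .digest();
    for (const byte of digest) {
      if (vec.length === dim) break;
      vec.push((byte / 255) * 2 - 1);
    }
  }
  return vec;
}
```

Because the vector is a pure function of the input text, search results are stable across runs — the property that makes a mock embedder usable in CI.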
| Command | Purpose |
|---|---|
| `local-memory start` | Start the HTTP server on port 3456 |
| `local-memory add <text>` | Embed and store a memory (`-n <namespace>`, `--metadata '{...}'`) |
| `local-memory search <query>` | Semantic search (`-n <namespace>`, `-k <limit>`) |
| `local-memory namespaces` | List namespaces + counts |
| `local-memory export` | Dump memories as JSONL on stdout |
| `local-memory import <file>` | Re-embed every line from a JSONL file |
| `local-memory help-agents` | Print the full machine-readable catalog |
Every command accepts `--json` and returns a one-line JSON envelope. Exit codes: `0` success, `1` user error, `2` internal error.
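`export` and `import` speak JSONL: one JSON object per line. The exact field names in the dump are not documented here, so the shape below is a hypothetical illustration of the round-trip, not the real schema:

```typescript
// Hypothetical exported-memory shape — field names are assumptions.
interface MemoryLine {
  text: string;
  namespace?: string;
  metadata?: Record<string, unknown>;
}

// Serialize memories to JSONL: one JSON object per line.
function toJsonl(memories: MemoryLine[]): string {
  return memories.map((m) => JSON.stringify(m)).join("\n");
}

// Parse JSONL back into objects, skipping blank lines.
function fromJsonl(jsonl: string): MemoryLine[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as MemoryLine);
}
```

Because each line is independent, a dump can be streamed, diffed, and re-imported line by line — which is why `import` can simply re-embed every line.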
| Endpoint | Purpose |
|---|---|
| `GET /` | Service info + dim + embedder id |
| `POST /v1/embeddings` | OpenAI-compatible drop-in — body `{input: string \| string[]}` |
| `POST /memory` | Store a memory (`{text, namespace?, metadata?}`) |
| `GET /memory/:id` | Fetch a memory |
| `DELETE /memory/:id` | Remove a memory |
| `POST /memory/search` | Semantic search (`{query, namespace?, limit?}`) |
| `GET /namespaces` | List namespaces |
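The memory endpoints need no SDK at all; plain `fetch` works. The request bodies below come from the table above, the base URL assumes the default port 3456, and the response shape is left as whatever the server returns:

```typescript
const BASE = "http://localhost:3456";

// Store a memory via POST /memory.
async function storeMemory(text: string, namespace?: string) {
  const res = await fetch(`${BASE}/memory`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text, namespace }),
  });
  return res.json();
}

// Semantic search via POST /memory/search.
async function searchMemory(query: string, limit = 5) {
  const res = await fetch(`${BASE}/memory/search`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, limit }),
  });
  return res.json();
}
```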
Any OpenAI SDK that supports `baseURL` works out of the box. Example with the Node SDK:

```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3456/v1",
  apiKey: "not-used",
});

const res = await client.embeddings.create({
  model: "anything",
  input: "the quick brown fox",
});
```

- Embeddings come from a local Ollama process — nothing leaves your machine.
- Vectors are stored in a plain SQLite file via `sqlite-vec`. Default path: `~/.local-memory/memory.db`.
- Search uses `sqlite-vec`'s `MATCH` operator for k-nearest-neighbor retrieval. Results come back with a normalized similarity score in `(0, 1]` — higher is better.
- Namespaces are just a column — free per-app or per-agent memory isolation.
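The exact distance-to-score normalization is internal to local-memory; one common scheme that lands in `(0, 1]` looks like this (a sketch, not the package's actual formula):

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// KNN queries rank by distance (smaller = closer). Mapping distance d
// to 1 / (1 + d) yields a score in (0, 1] where higher is better.
function distanceToScore(d: number): number {
  return 1 / (1 + d);
}
```

A normalization like this is why a score of exactly 1 means a zero-distance (identical-vector) match, while unrelated texts trail off toward 0 without ever reaching it.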
| | local-memory | Supermemory | claude-mem | mem0 | Graphiti |
|---|---|---|---|---|---|
| Runs 100% offline | ✅ | ❌ | partial | ❌ | ❌ |
| OpenAI-compatible drop-in | ✅ | ❌ | ❌ | ❌ | ❌ |
| Single SQLite file | ✅ | ❌ | ❌ | ❌ | ❌ |
| Pluggable embedder | ✅ | ❌ | ❌ | partial | partial |
| Agent-driven CLI | ✅ | — | partial | partial | — |
Every swarmclawai CLI follows the same agent conventions:

- `--json` everywhere, one-line envelope on stdout
- Stderr for logs, stdout for data
- Stable exit codes: `0`/`1`/`2`
- Non-interactive by default
- `local-memory help-agents` returns the entire command catalog as JSON
See AGENTS.md for the full machine-readable reference.
- MCP adapter — expose memory tools to any MCP-compatible agent (Claude Code, Cursor, Cline, Aider, etc.)
- Apple Intelligence + MLX providers for macOS
- Gemini Nano provider for Chrome/Pixel
- Incremental compaction (TTL + importance-weighted pruning)
- Hybrid BM25 + vector retrieval
- Encryption-at-rest flag for the DB
See CONTRIBUTING.md.