Primkit is a monorepo containing four primitives (taskprim, stateprim, knowledgeprim, and queueprim) and a shared infrastructure library (primkit). All primitives follow the same layered architecture; they differ only in the domain model, the store operations, the embedding layer (knowledgeprim only), and the background sweeper (queueprim only).
primkit/
├── go.work # Go workspace (5 modules)
├── Makefile # build, test, lint, fmt, build-pi
├── config.example.yaml # shared config format
├── primkit/ # shared library (module: github.com/propifly/primkit/primkit)
│ ├── config/ # YAML config loader + env var interpolation
│ ├── db/ # SQLite open + migration runner
│ ├── server/ # HTTP server, middleware, JSON helpers
│ ├── auth/ # Bearer token authentication
│ ├── mcp/ # MCP server scaffold
│ └── replicate/ # Litestream WAL replication wrapper
├── taskprim/ # task management primitive (module: github.com/propifly/primkit/taskprim)
│ ├── cmd/taskprim/ # main.go entry point
│ └── internal/
│ ├── model/ # Task, Filter, state machine
│ ├── store/ # Store interface + SQLite implementation
│ ├── cli/ # Cobra commands (add, list, done, kill, ...)
│ ├── api/ # HTTP API handler
│ └── mcpserver/ # MCP tool registration
├── stateprim/ # state persistence primitive (module: github.com/propifly/primkit/stateprim)
│ ├── cmd/stateprim/ # main.go entry point
│ └── internal/
│ ├── model/ # Record, QueryFilter
│ ├── store/ # Store interface + SQLite implementation
│ ├── cli/ # Cobra commands (set, get, append, query, ...)
│ ├── api/ # HTTP API handler
│ └── mcpserver/ # MCP tool registration
├── knowledgeprim/ # knowledge graph primitive (module: github.com/propifly/primkit/knowledgeprim)
│ ├── cmd/knowledgeprim/ # main.go entry point
│ └── internal/
│ ├── model/ # Entity, Edge, SearchFilter, TraversalOpts, DiscoverOpts
│ ├── store/ # Store interface + SQLite implementation (FTS5, vectors)
│ ├── embed/ # Embedding provider abstraction (Gemini, OpenAI, custom)
│ ├── cli/ # Cobra commands (capture, search, connect, discover, ...)
│ ├── api/ # HTTP API handler
│ └── mcpserver/ # MCP tool registration
└── queueprim/ # work queue primitive (module: github.com/propifly/primkit/queueprim)
├── cmd/queueprim/ # main.go entry point
└── internal/
├── model/ # Job, Filter, Priority, Status, QueueInfo, Stats
├── store/ # Store interface + SQLite implementation
├── cli/ # Cobra commands (enqueue, dequeue, complete, fail, ...)
├── api/ # HTTP API handler
└── mcpserver/ # MCP tool registration
Dependencies flow strictly downward. No lateral dependencies between sibling layers.
┌──────────────────────────────────────────────────────────┐
│ Access Interfaces │
│ ┌───────┐ ┌──────────┐ ┌───────────────┐ │
│ │ CLI │ │ HTTP API │ │ MCP Server │ │
│ │(cobra)│ │ (net/http)│ │ (mcp-go) │ │
│ └───┬───┘ └────┬─────┘ └──────┬────────┘ │
│ │ │ │ │
│ └─────────────┼─────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────────┐ │
│ │ │ │ │
│ │ ┌─────▼─────┐ ┌───────▼────────┐ │
│ │ │ Store │ │ Embedder │ │
│ │ │ (interface)│ │ (interface) │ │
│ │ └─────┬─────┘ │ knowledgeprim │ │
│ │ │ │ only │ │
│ │ ┌─────▼─────┐ └────────────────┘ │
│ │ │ Model │ ◄── structs, │
│ │ │ │ validation, │
│ │ └─────┬─────┘ state machine │
│ │ │ │
│ └───────┬───────┼────────────────┐ │
│ │ │ │ │
│ ┌─────────▼┐ ┌────▼──────┐ ┌─────▼──────┐ │
│ │ config │ │ db │ │ replicate │ │
│ │ (YAML) │ │ (SQLite) │ │(Litestream)│ │
│ └──────────┘ └────────────┘ └────────────┘ │
│ │
│ primkit (shared library) │
└──────────────────────────────────────────────────────────┘
Note: The Embedder interface is unique to knowledgeprim. The background sweeper goroutine (for expired claim release) is unique to queueprim. taskprim, stateprim, and queueprim do not have an embedding layer.
CLI, API, and MCP are sibling consumers of the Store interface. They never depend on each other. This means:
- You can use CLI without the HTTP server
- You can use the API without MCP
- Any new access interface just imports the Store
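A minimal sketch of this arrangement, assuming a deliberately tiny `Store` interface: the two-method interface, `memStore`, `cliAdd`, and `apiGet` are hypothetical names for illustration, not the real primitives' code. It shows two sibling access layers that share a store without importing each other.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical, much smaller stand-in for a primitive's
// store interface; the real interfaces expose many more operations.
type Store interface {
	Create(what string) (id string, err error)
	Get(id string) (string, error)
}

// memStore is an in-memory implementation used only for this sketch.
type memStore struct {
	next  int
	items map[string]string
}

func newMemStore() *memStore { return &memStore{items: map[string]string{}} }

func (s *memStore) Create(what string) (string, error) {
	if what == "" {
		return "", errors.New("what is required")
	}
	s.next++
	id := fmt.Sprintf("t_%d", s.next)
	s.items[id] = what
	return id, nil
}

func (s *memStore) Get(id string) (string, error) {
	what, ok := s.items[id]
	if !ok {
		return "", errors.New("not found")
	}
	return what, nil
}

// cliAdd and apiGet model two sibling access layers. Each depends only
// on Store, never on the other.
func cliAdd(s Store, what string) (string, error) { return s.Create(what) }

func apiGet(s Store, id string) (string, error) { return s.Get(id) }

func main() {
	s := newMemStore()
	id, _ := cliAdd(s, "write docs")
	what, _ := apiGet(s, id)
	fmt.Println(id, what)
}
```

Because both functions accept the interface rather than a concrete type, a test can pass an in-memory store while production code passes the SQLite-backed one.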
The Store is the central abstraction. Each primitive defines its own interface in internal/store/store.go.
| Operation | Description |
|---|---|
| `CreateTask` | Persist a new task (store assigns ID, state, timestamps) |
| `GetTask` | Retrieve a single task by ID |
| `ListTasks` | Filter and list tasks (by list, state, labels, source, etc.) |
| `UpdateTask` | Partial update to mutable fields |
| `DoneTask` | Mark task as done (sets resolved_at) |
| `KillTask` | Mark task as killed with reason |
| `MarkSeen` | Record that an agent has seen a task |
| `MarkAllSeen` | Mark all open tasks in a list as seen by an agent |
| `AddDep` | Add a dependency edge (with cycle detection via recursive CTE) |
| `RemoveDep` | Remove a dependency edge |
| `Deps` | List tasks that a given task depends on |
| `Dependents` | List tasks that depend on a given task (reverse lookup) |
| `Frontier` | Open tasks with all dependencies resolved or no dependencies |
| `DepEdges` | Raw dependency edges, optionally filtered by list |
| `ListLabels` | All labels with count of open tasks per label |
| `ClearLabel` | Remove a label from all tasks |
| `ListLists` | All lists with task counts by state |
| `Stats` | Aggregate counts (open, done, killed) |
| `ExportTasks` | Full export for data portability |
| `ImportTasks` | Bulk import preserving IDs |
| `Close` | Release database connection |
| Operation | Description |
|---|---|
| `Set` | Create or update a record (upsert) |
| `Get` | Retrieve by namespace + key |
| `Has` | Check existence |
| `SetIfNew` | Create only if key doesn't exist |
| `Append` | Create immutable record with auto-generated key |
| `Delete` | Remove by namespace + key |
| `Query` | Records matching filter (namespace, key prefix, time window) |
| `Purge` | Delete records older than a duration |
| `ListNamespaces` | All namespaces with record counts |
| `Stats` | Aggregate counts |
| `ExportRecords` | Full export, optionally filtered |
| `ImportRecords` | Bulk import preserving keys |
| `Close` | Release database connection |
| Operation | Description |
|---|---|
| `CaptureEntity` | Persist a new entity with optional embedding vector |
| `GetEntity` | Retrieve a single entity by ID (includes edges) |
| `UpdateEntity` | Partial update to mutable fields |
| `DeleteEntity` | Remove entity and all connected edges |
| `CreateEdge` | Create a weighted, contextualized connection |
| `UpdateEdge` | Update edge context or weight |
| `StrengthenEdge` | Increment an edge's weight by 1.0 |
| `DeleteEdge` | Remove a connection between entities |
| `SearchFTS` | Full-text search via FTS5 (BM25 ranking) |
| `SearchVector` | Semantic search via cosine distance on embeddings |
| `SearchHybrid` | Combined FTS + vector via Reciprocal Rank Fusion (k=60) |
| `Related` | Multi-hop graph traversal with direction and weight filters |
| `Discover` | Pattern detection: orphans, clusters, bridges, temporal groups, weak edges |
| `ListTypes` | All entity types with counts |
| `ListRelationships` | All relationship types with counts |
| `Stats` | Aggregate counts (entities, edges, vectors, orphans, DB size) |
| `ExportEntities` | Full export with optional type filter |
| `ImportEntities` | Bulk import preserving IDs |
| `GetEmbeddingMeta` | Fetch the stored embedding provider/model metadata for this database |
| `SetEmbeddingMeta` | Write or overwrite the embedding metadata record |
| `StripVectors` | Delete all embedding vectors and metadata (reverts to FTS5-only) |
| `UpdateEntityVector` | Upsert a single entity's embedding vector (used by re-embed) |
| `Close` | Release database connection |
| Operation | Description |
|---|---|
| `EnqueueJob` | Persist a new job (store assigns ID, status, timestamps) |
| `DequeueJob` | Atomically claim the next available job in a queue (status=pending AND visible_after ≤ now) |
| `CompleteJob` | Mark a claimed job as done; optionally store output payload |
| `FailJob` | Mark a claimed job as failed; re-queues it if retries remain, otherwise moves it to dead |
| `ReleaseJob` | Return a claimed job to pending immediately (unclaim) |
| `ExtendJob` | Extend a claimed job's visibility timeout to prevent auto-release |
| `PeekJob` | Inspect the next available job without claiming it |
| `GetJob` | Retrieve a single job by ID |
| `ListJobs` | Filter and list jobs (by queue, status, type, age) |
| `ListQueues` | All named queues with job counts by status |
| `Stats` | Aggregate counts across all queues |
| `PurgeJobs` | Delete jobs matching queue + status + age criteria; returns count |
| `ExportJobs` | Full export of all jobs in a queue |
| `ImportJobs` | Bulk import preserving original IDs |
| `SweepExpiredClaims` | Release claimed jobs whose visible_after has passed; called by background sweeper |
| `Close` | Release database connection |
Task {
ID string // t_<nanoid>, assigned by store
List string // required: which list
What string // required: task description
Source string // required: who created it
State State // open → done | killed
WaitingOn *string // optional: blocking dependency
ParentID *string // optional: subtask relationship
Context *string // optional: background info
Labels []string // freeform tags
Created time.Time // assigned by store
Updated time.Time // assigned by store
ResolvedAt *time.Time // set on done/kill
ResolvedReason *string // why it was killed
}
State machine:
done()
open ──────────► done
│
│ kill(reason)
└──────────► killed
Tasks start as open. Transitions to done or killed are one-way. There is no restore/reopen.
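The one-way transitions can be sketched as follows. This is an illustration under assumptions, not the code in internal/model: the `Done` and `Kill` method names and the `ErrNotOpen` error are hypothetical, and the `Task` struct is trimmed to the fields involved.

```go
package main

import (
	"errors"
	"fmt"
)

type State string

const (
	StateOpen   State = "open"
	StateDone   State = "done"
	StateKilled State = "killed"
)

// ErrNotOpen is a hypothetical error name used by this sketch.
var ErrNotOpen = errors.New("task is not open")

// Task is trimmed to the fields the state machine touches.
type Task struct {
	State          State
	ResolvedReason *string
}

// Done moves an open task to done; any other starting state is rejected.
func (t *Task) Done() error {
	if t.State != StateOpen {
		return ErrNotOpen
	}
	t.State = StateDone
	return nil
}

// Kill moves an open task to killed and records the reason.
func (t *Task) Kill(reason string) error {
	if t.State != StateOpen {
		return ErrNotOpen
	}
	t.State = StateKilled
	t.ResolvedReason = &reason
	return nil
}

func main() {
	task := &Task{State: StateOpen}
	fmt.Println(task.Done(), task.State)
	// A resolved task cannot be killed or reopened.
	fmt.Println(task.Kill("too late"), task.State)
}
```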
Dependency graph:
DepEdge {
TaskID string // the task that is blocked
DependsOn string // the task it depends on
}
Stored in task_deps table with composite primary key (task_id, depends_on) and a self-reference check (task_id != depends_on). Cycle detection is enforced via recursive CTE on AddDep. waiting_on (freeform text for external/human blockers) and task_deps (structural task-to-task edges) coexist — they serve different purposes.
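To illustrate the invariant that AddDep enforces, here is an in-memory sketch. It swaps the recursive CTE for a plain graph search in Go, so it shows the rule (no self-references, no cycles) rather than the actual SQL; `DepGraph` and `reaches` are hypothetical names.

```go
package main

import "fmt"

// DepGraph maps a task ID to the IDs it depends on.
type DepGraph map[string][]string

// reaches reports whether target can be reached from start by
// following depends-on edges.
func (g DepGraph) reaches(start, target string) bool {
	seen := map[string]bool{}
	stack := []string{start}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if n == target {
			return true
		}
		if seen[n] {
			continue
		}
		seen[n] = true
		stack = append(stack, g[n]...)
	}
	return false
}

// AddDep records that taskID depends on dependsOn. It rejects
// self-references and any edge that would close a cycle.
func (g DepGraph) AddDep(taskID, dependsOn string) error {
	if taskID == dependsOn {
		return fmt.Errorf("task %s cannot depend on itself", taskID)
	}
	// If dependsOn already (transitively) depends on taskID, the new
	// edge would complete a loop.
	if g.reaches(dependsOn, taskID) {
		return fmt.Errorf("adding %s -> %s would create a cycle", taskID, dependsOn)
	}
	g[taskID] = append(g[taskID], dependsOn)
	return nil
}

func main() {
	g := DepGraph{}
	fmt.Println(g.AddDep("t_a", "t_b"))
	fmt.Println(g.AddDep("t_b", "t_c"))
	fmt.Println(g.AddDep("t_c", "t_a")) // rejected: t_a already leads to t_c
}
```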
Record {
Namespace string // required: scope
Key string // required: identifier
Value json.RawMessage // required: JSON payload
Immutable bool // true for append records
CreatedAt time.Time // assigned by store
UpdatedAt time.Time // assigned by store
}
Three access patterns share the same model:
- Key-value state (`set`/`get`): current state, updatable. `Immutable=false`.
- Dedup lookups (`has`/`set-if-new`): existence checks, create-once semantics.
- Append log (`append`): immutable, timestamped entries. `Immutable=true`, auto-generated key.
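The three patterns can be sketched against one record type. This in-memory store is an assumption-laden illustration, not the SQLite implementation: the generated key format and the rule that `Set` refuses to overwrite an immutable record are guesses made for the example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

type Record struct {
	Namespace string
	Key       string
	Value     json.RawMessage
	Immutable bool
}

// memStore is a hypothetical in-memory stand-in for the real store.
type memStore struct {
	recs map[string]Record
	seq  int
}

func newMemStore() *memStore { return &memStore{recs: map[string]Record{}} }

// recKey joins namespace and key into a single map key.
func recKey(ns, key string) string { return ns + "\x00" + key }

// Set upserts a mutable record. Refusing to overwrite immutable
// records is an assumption of this sketch.
func (s *memStore) Set(ns, key string, v json.RawMessage) error {
	if r, ok := s.recs[recKey(ns, key)]; ok && r.Immutable {
		return errors.New("record is immutable")
	}
	s.recs[recKey(ns, key)] = Record{Namespace: ns, Key: key, Value: v}
	return nil
}

// SetIfNew creates the record only when the key is absent and reports
// whether it was created.
func (s *memStore) SetIfNew(ns, key string, v json.RawMessage) bool {
	if _, ok := s.recs[recKey(ns, key)]; ok {
		return false
	}
	s.recs[recKey(ns, key)] = Record{Namespace: ns, Key: key, Value: v}
	return true
}

// Append stores an immutable record under a generated key. The key
// format here is invented for the example.
func (s *memStore) Append(ns string, v json.RawMessage) string {
	s.seq++
	key := fmt.Sprintf("a_%06d", s.seq)
	s.recs[recKey(ns, key)] = Record{Namespace: ns, Key: key, Value: v, Immutable: true}
	return key
}

func main() {
	s := newMemStore()
	_ = s.Set("agent", "cursor", json.RawMessage(`{"page":1}`))
	_ = s.Set("agent", "cursor", json.RawMessage(`{"page":2}`)) // upsert
	fmt.Println(s.SetIfNew("seen", "msg-42", json.RawMessage(`true`)))
	fmt.Println(s.SetIfNew("seen", "msg-42", json.RawMessage(`true`)))
	key := s.Append("log", json.RawMessage(`{"event":"start"}`))
	fmt.Println(key, s.Set("log", key, json.RawMessage(`{}`)))
}
```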
Entity {
ID string // e_<nanoid>, assigned by store
Type string // required: entity type (article, concept, pattern, etc.)
Title string // required: entity title
Body *string // optional: entity body text
URL *string // optional: source URL
Source string // required: who captured it
Properties json.RawMessage // optional: custom JSON
CreatedAt time.Time // assigned by store
UpdatedAt time.Time // assigned by store
Edges []*Edge // populated on GetEntity
}
Edge {
SourceID string // required: source entity ID
TargetID string // required: target entity ID
Relationship string // required: relationship type (relates_to, extends, etc.)
Weight float64 // starts at 1.0, grows via strengthen
Context *string // optional: WHY this connection exists
CreatedAt time.Time // assigned by store
UpdatedAt time.Time // assigned by store
}
EmbeddingMeta {
Provider string // embedding provider name (e.g., "gemini", "openai")
Model string // model name (e.g., "text-embedding-004")
Dimensions int // vector dimensions produced by this model
CreatedAt time.Time // when the metadata was first recorded
}
EmbeddingMeta is a single-row record (enforced by CHECK (id = 1) in SQLite) that tracks which embedding provider and model generated the vectors in this database. One row per .db file. Used by CheckEmbeddingMeta to prevent silent degradation when the configured provider changes.
Entity types are freeform strings — agents define their own vocabulary (e.g., article, thought, concept, pattern, observation, decision, bug).
Relationship types are also freeform (e.g., relates_to, contradicts, extends, inspired_by, applies_to, similar_to).
Three search modes:
- FTS — keyword search via SQLite FTS5, BM25 ranking
- Vector — semantic search via cosine distance on embeddings
- Hybrid — combines FTS + vector results via Reciprocal Rank Fusion (k=60)
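Reciprocal Rank Fusion scores each result as the sum of 1/(k + rank) over every ranked list it appears in. The sketch below applies that formula with k=60; the `FuseRRF` name and the tie-break by ID are assumptions, and the real store may weight or truncate the lists differently.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfK is the smoothing constant named in the text (k=60).
const rrfK = 60

// FuseRRF merges ranked ID lists. Each list contributes 1/(k + rank)
// for every ID it contains, with rank starting at 1.
func FuseRRF(lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / float64(rrfK+i+1)
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Sort by descending score; break ties by ID so output is stable.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	fts := []string{"e_a", "e_b", "e_c"} // keyword ranking
	vec := []string{"e_b", "e_d", "e_a"} // semantic ranking
	fmt.Println(FuseRRF(fts, vec))
}
```

An entity that appears in both lists outranks one that appears in only one, even when the single appearance is at a higher rank, which is the behavior hybrid search relies on.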
Discovery operations surface non-obvious patterns:
- Orphans — entities with no edges
- Clusters — densely connected entity groups
- Bridges — high-degree entities connecting separate clusters
- Temporal — entity type distribution over time periods
- Weak edges — edges missing context prose
Job {
ID string // q_<nanoid>, assigned by store
Queue string // required: named queue (slashes allowed, e.g., infra/prod)
Type string // optional: job type category for type-filtered dequeue
Priority Priority // high | normal (default) | low
Payload json.RawMessage // required: arbitrary JSON work description
Status Status // pending → claimed → done | failed | dead
ClaimedBy *string // set on dequeue: worker name
ClaimedAt *time.Time // set on dequeue
VisibleAfter time.Time // delayed jobs: not visible until this time
CompletedAt *time.Time // set on complete
Output json.RawMessage // optional: worker result payload
FailureReason *string // set on fail
AttemptCount int // incremented on each dequeue
MaxRetries int // 0 = one-shot; >0 = retry up to N times before dead
CreatedAt time.Time // assigned by store
UpdatedAt time.Time // assigned by store
}
State machine:
enqueue() dequeue() complete()
─────────► pending ──────────► claimed ──────────► done
│
│ fail() + retries remain
├──────────────────────── → pending (re-queued)
│
│ fail() + retries exhausted
│ fail(--dead)
└──────────────────────── → dead
Priority ordering: high → normal → low. Within a priority level, ordering is FIFO.
Visibility timeout: Claimed jobs hold a visible_after lock. If a worker crashes without completing, a background sweeper goroutine releases the claim once visible_after passes, returning the job to pending.
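The ordering and timeout rules can be sketched in memory as below. This is a simplified illustration: the integer priority encoding, the 30-second `claimTimeout`, and the `at` helper are assumptions, and the real DequeueJob is described as an atomic database operation rather than a slice scan.

```go
package main

import (
	"fmt"
	"time"
)

// claimTimeout is an assumed default; the real value is not stated.
const claimTimeout = 30 * time.Second

type Job struct {
	ID           string
	Priority     int // 0 = high, 1 = normal, 2 = low (encoding is an assumption)
	Status       string
	VisibleAfter time.Time
	CreatedAt    time.Time
}

// at is a helper that builds a timestamp from Unix seconds.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// Dequeue claims the best pending, visible job: highest priority
// first, then oldest first. The claim hides the job until now+timeout.
func Dequeue(jobs []*Job, now time.Time, timeout time.Duration) *Job {
	var best *Job
	for _, j := range jobs {
		if j.Status != "pending" || j.VisibleAfter.After(now) {
			continue
		}
		if best == nil || j.Priority < best.Priority ||
			(j.Priority == best.Priority && j.CreatedAt.Before(best.CreatedAt)) {
			best = j
		}
	}
	if best != nil {
		best.Status = "claimed"
		best.VisibleAfter = now.Add(timeout)
	}
	return best
}

// SweepExpiredClaims returns claimed jobs to pending once their
// visibility deadline has passed and reports how many were released.
func SweepExpiredClaims(jobs []*Job, now time.Time) int {
	released := 0
	for _, j := range jobs {
		if j.Status == "claimed" && !j.VisibleAfter.After(now) {
			j.Status = "pending"
			released++
		}
	}
	return released
}

func main() {
	jobs := []*Job{
		{ID: "q_low", Priority: 2, Status: "pending", CreatedAt: at(1)},
		{ID: "q_high", Priority: 0, Status: "pending", CreatedAt: at(2)},
	}
	j := Dequeue(jobs, at(10), claimTimeout)
	fmt.Println(j.ID, j.Status)
	// Simulate a crashed worker: the deadline passes with no completion.
	fmt.Println(SweepExpiredClaims(jobs, at(10).Add(claimTimeout)), j.Status)
}
```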
User → cobra command → parse flags → store.Operation() → format output → stdout
Client → HTTP request
→ RequestID middleware (assigns/propagates X-Request-ID)
→ Logging middleware (logs method, path, status, duration)
→ Recovery middleware (catches panics → 500)
→ Auth middleware (validates Bearer token → 401 if invalid)
→ API handler → store.Operation()
→ JSON response
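The HTTP request flow above is a chain of wrappers around the final handler. A minimal sketch of that composition with net/http follows; `Chain`, the placeholder request ID, and the `/boom` path are hypothetical, and only two of the four middlewares are shown to keep it short.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

type Middleware func(http.Handler) http.Handler

// Chain wraps h so that the first middleware listed runs outermost.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// RequestID propagates an incoming X-Request-ID or assigns a placeholder.
func RequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-Request-ID")
		if id == "" {
			id = "req-generated" // a real server would generate a unique ID
		}
		w.Header().Set("X-Request-ID", id)
		next.ServeHTTP(w, r)
	})
}

// Recovery converts a panic in a downstream handler into a 500 response.
func Recovery(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// demoAPI stands in for the API handler; /boom simulates a bug.
var demoAPI = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	if r.URL.Path == "/boom" {
		panic("handler bug")
	}
	fmt.Fprint(w, `{"ok":true}`)
})

// serve runs one request through h and returns the status and request ID.
func serve(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Header().Get("X-Request-ID")
}

func main() {
	h := Chain(demoAPI, RequestID, Recovery)
	fmt.Println(serve(h, "/ok"))
	fmt.Println(serve(h, "/boom"))
}
```

Placing Recovery inside RequestID means even a panicking request still carries its request ID on the 500 response.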
Agent → MCP protocol (stdio or SSE)
→ mcp-go framework → tool handler → store.Operation()
→ MCP response
Litestream is embedded as a Go library (not a sidecar process). WAL frames are continuously streamed to S3-compatible object storage (R2, S3, B2, GCS).
Replication runs for every command — CLI, serve, and MCP alike. This is managed in root.go via Cobra's PersistentPreRunE and PersistentPostRunE:
PersistentPreRunE:
1. Resolve database path (flag → env var → home default)
2. Load config (YAML + env overrides)
3. RestoreIfNeeded: if replication enabled and local DB missing,
download from replica before opening
4. Open SQLite database
5. Start Litestream replication (continuous WAL streaming)
Command executes (add, list, serve, mcp, etc.)
PersistentPostRunE:
6. Stop replication with final sync
For short-lived CLI commands, the final sync ensures the last WAL changes reach the replica. For long-running serve/MCP commands, replication streams continuously.
Two restore paths:
- Auto-restore (`RestoreIfNeeded`): On startup, if the local DB file doesn't exist but replication is configured, the DB is automatically downloaded from the replica. This enables stateless deployments.
- Manual restore (`taskprim restore`/`stateprim restore`/`knowledgeprim restore`/`queueprim restore`): Point-in-time recovery. Overwrites the local database with the latest replica.
knowledgeprim supports optional vector embeddings for semantic search. The embedding layer is a pluggable interface:
type Embedder interface {
Embed(ctx context.Context, text string) ([]float32, error)
Dimensions() int
Provider() string // "gemini", "openai", or "custom"
Model() string // e.g., "text-embedding-004"
}

Provider() and Model() are used by the metadata safety layer to detect provider changes. Three implementations ship out of the box:
| Provider | Model | Dimensions |
|---|---|---|
| `gemini` | `text-embedding-004` | 768 |
| `openai` | `text-embedding-3-small` | 1536 |
| `custom` | Any OpenAI-compatible endpoint | Configurable |
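To show what satisfying the interface involves, here is a hypothetical test double. `hashEmbedder`, its model name, and the `embedText` helper are invented for this sketch; the vectors are deterministic hashes with no semantic meaning, so it is only useful for exercising code paths without calling a provider.

```go
package main

import (
	"context"
	"fmt"
	"hash/fnv"
)

type Embedder interface {
	Embed(ctx context.Context, text string) ([]float32, error)
	Dimensions() int
	Provider() string
	Model() string
}

// hashEmbedder is a hypothetical test double, not one of the shipped
// implementations.
type hashEmbedder struct{ dims int }

func (h hashEmbedder) Embed(ctx context.Context, text string) ([]float32, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	sum := fnv.New64a()
	sum.Write([]byte(text))
	seed := sum.Sum64()
	vec := make([]float32, h.dims)
	for i := range vec {
		// One linear congruential step per dimension spreads the seed.
		seed = seed*6364136223846793005 + 1442695040888963407
		vec[i] = float32(seed>>40) / float32(1<<24)
	}
	return vec, nil
}

func (h hashEmbedder) Dimensions() int  { return h.dims }
func (h hashEmbedder) Provider() string { return "custom" }
func (h hashEmbedder) Model() string    { return "hash-test-double" }

// embedText is a small helper that supplies a background context.
func embedText(e Embedder, text string) ([]float32, error) {
	return e.Embed(context.Background(), text)
}

func main() {
	var e Embedder = hashEmbedder{dims: 4}
	vec, err := embedText(e, "hello")
	fmt.Println(len(vec), err, e.Provider(), e.Model())
}
```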
Embedding is optional. Without it, knowledgeprim still provides:
- Full-text search (FTS5/BM25)
- Manual edge creation
- Graph traversal
- Discovery operations
You only lose vector search and auto-connect.
Each knowledgeprim database stores a single EmbeddingMeta row (in the embedding_meta table, CHECK (id = 1)) recording which provider and model generated the stored vectors. This prevents silent degradation when switching embedding providers — old 768-dimension Gemini vectors are incompatible with a new 1536-dimension OpenAI config.
Flow on capture or search --mode vector/hybrid:
CheckEmbeddingMeta(provider, model, dimensions)
├── No meta yet → OK (first embed will call EnsureEmbeddingMeta)
├── Meta matches config → OK
└── Meta differs → ErrEmbeddingMismatch with clear message:
"db uses gemini/text-embedding-004 (768d),
config uses openai/text-embedding-3-small (1536d).
Use --mode fts, run re-embed, or pass --force"
Recovery options:
| Option | When to use |
|---|---|
| `knowledgeprim re-embed` | Switching to a new provider: re-generates all vectors |
| `knowledgeprim strip-vectors --confirm` | Dropping back to FTS5-only: removes all vectors and metadata |
| `--force` flag | Bypassing the check temporarily (risky: mixed-dimension vectors in one DB) |
| `--mode fts` on search | Read-only fallback that skips vector operations entirely |
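The metadata check described above reduces to a three-way comparison. The sketch below is a guess at its shape, not the real function: the signature (stored metadata as a pointer, a `force` parameter) and the `isMismatch` helper are assumptions, and `CreatedAt` is left out of the struct so the two values can be compared directly.

```go
package main

import (
	"errors"
	"fmt"
)

// EmbeddingMeta omits CreatedAt so values are comparable with ==.
type EmbeddingMeta struct {
	Provider   string
	Model      string
	Dimensions int
}

var ErrEmbeddingMismatch = errors.New("embedding mismatch")

// CheckEmbeddingMeta compares stored metadata (nil when nothing has
// been recorded yet) with the configured embedder. The signature is
// an assumption made for this sketch.
func CheckEmbeddingMeta(stored *EmbeddingMeta, cfg EmbeddingMeta, force bool) error {
	if stored == nil || *stored == cfg || force {
		return nil
	}
	return fmt.Errorf("%w: db uses %s/%s (%dd), config uses %s/%s (%dd)",
		ErrEmbeddingMismatch,
		stored.Provider, stored.Model, stored.Dimensions,
		cfg.Provider, cfg.Model, cfg.Dimensions)
}

// isMismatch reports whether err wraps ErrEmbeddingMismatch.
func isMismatch(err error) bool { return errors.Is(err, ErrEmbeddingMismatch) }

func main() {
	db := &EmbeddingMeta{Provider: "gemini", Model: "text-embedding-004", Dimensions: 768}
	cfg := EmbeddingMeta{Provider: "openai", Model: "text-embedding-3-small", Dimensions: 1536}
	fmt.Println(CheckEmbeddingMeta(nil, cfg, false)) // no meta yet
	fmt.Println(CheckEmbeddingMeta(db, *db, false))  // meta matches
	fmt.Println(CheckEmbeddingMeta(db, cfg, false))  // mismatch
}
```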
When embedding is configured, CaptureEntity can automatically link new entities to semantically similar existing ones:
- Embed the new entity's title + body
- Cosine distance search against all existing embeddings
- Entities whose distance is below the threshold get automatic `similar_to` edges
- Configurable: threshold (default 0.35), max connections (default 10)
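The selection step can be sketched as a brute-force scan. This assumes cosine distance is defined as 1 minus cosine similarity and that all vectors have the same length; `AutoConnect` and `Candidate` are hypothetical names, and the real store may search differently.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// CosineDistance returns 1 - cosine similarity; 0 means same direction.
// It assumes a and b have equal length.
func CosineDistance(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 1 // treat zero vectors as unrelated
	}
	return 1 - dot/(math.Sqrt(na)*math.Sqrt(nb))
}

type Candidate struct {
	ID       string
	Distance float64
}

// AutoConnect returns up to maxConn existing entities whose distance
// to newVec is below threshold, closest first.
func AutoConnect(newVec []float32, existing map[string][]float32, threshold float64, maxConn int) []Candidate {
	var out []Candidate
	for id, vec := range existing {
		if d := CosineDistance(newVec, vec); d < threshold {
			out = append(out, Candidate{ID: id, Distance: d})
		}
	}
	// Closest first; ties broken by ID so the result is deterministic.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Distance != out[j].Distance {
			return out[i].Distance < out[j].Distance
		}
		return out[i].ID < out[j].ID
	})
	if len(out) > maxConn {
		out = out[:maxConn]
	}
	return out
}

func main() {
	existing := map[string][]float32{
		"e_same": {1, 0},
		"e_near": {0.9, 0.1},
		"e_far":  {0, 1},
	}
	// Defaults from the text: threshold 0.35, max connections 10.
	for _, c := range AutoConnect([]float32{1, 0}, existing, 0.35, 10) {
		fmt.Printf("similar_to %s (distance %.4f)\n", c.ID, c.Distance)
	}
}
```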
Authentication is only active in serve and MCP SSE modes. CLI mode uses filesystem permissions.
- API keys are configured in `config.yaml` and mapped to human-readable names
- Keys are validated using constant-time comparison (prevents timing attacks)
- When no keys are configured, the server runs in open mode (all requests allowed)
- The authenticated key's `name` is injected into the request context and used as the `source` field for created tasks/records
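These rules can be sketched as a single middleware built on crypto/subtle. The `Auth` signature, the token-to-name map, the `sourceKey` context key, and the `/tasks` path are assumptions for illustration; the shared library's actual API may differ.

```go
package main

import (
	"context"
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// sourceKey is a hypothetical context key for the authenticated name.
type sourceKey struct{}

// Auth validates a Bearer token against keys (token -> name). With no
// keys configured it lets every request through (open mode).
func Auth(keys map[string]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if len(keys) == 0 {
			next.ServeHTTP(w, r)
			return
		}
		token, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		matched, name := false, ""
		if ok {
			// Check every key without an early exit so timing does not
			// reveal which key matched.
			for k, n := range keys {
				if subtle.ConstantTimeCompare([]byte(token), []byte(k)) == 1 {
					matched, name = true, n
				}
			}
		}
		if !matched {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), sourceKey{}, name)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// try sends one request and returns the status plus the source name
// the handler saw. The /tasks path is a placeholder.
func try(keys map[string]string, token string) (int, string) {
	h := Auth(keys, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		name, _ := r.Context().Value(sourceKey{}).(string)
		fmt.Fprint(w, name)
	}))
	req := httptest.NewRequest(http.MethodPost, "/tasks", nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	keys := map[string]string{"secret-1": "johanna"}
	fmt.Println(try(keys, "secret-1"))
	fmt.Println(try(keys, "wrong"))
	fmt.Println(try(nil, ""))
}
```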
The HTTP server wraps net/http with:
- Graceful shutdown: Listens for SIGINT/SIGTERM, gives in-flight requests 10 seconds to complete
- Timeouts: 30s read, 30s write, 60s idle
- Middleware chain: RequestID → Logging → Recovery → Auth → Handler
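A minimal sketch of those server settings with the standard library follows. `NewServer` and `Run` are hypothetical names, and the demo cancels itself after a short delay so that it exits without a signal; the shared server package may be structured differently.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// NewServer applies the timeouts listed above.
func NewServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:         addr,
		Handler:      h,
		ReadTimeout:  30 * time.Second,
		WriteTimeout: 30 * time.Second,
		IdleTimeout:  60 * time.Second,
	}
}

// Run serves until ctx is cancelled, then gives in-flight requests up
// to grace to finish before returning.
func Run(ctx context.Context, srv *http.Server, grace time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case err := <-errCh:
		return err // failed to start or stopped unexpectedly
	case <-ctx.Done():
	}

	shutdownCtx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	return srv.Shutdown(shutdownCtx)
}

func main() {
	// Cancel the context on SIGINT or SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Demo only: also cancel after a short delay so the example exits.
	ctx, cancel := context.WithTimeout(ctx, 200*time.Millisecond)
	defer cancel()

	srv := NewServer("127.0.0.1:0", http.NotFoundHandler())
	fmt.Println("server stopped:", Run(ctx, srv, 10*time.Second))
}
```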
Pure Go SQLite via modernc.org/sqlite (no CGo). This simplifies cross-compilation, especially for ARM64 (Raspberry Pi).
- WAL mode: concurrent readers during writes, required for Litestream and serve mode
- Foreign keys: enforced for referential integrity
- Busy timeout: 5 seconds to prevent lock contention errors
- Embedded migrations: SQL files are embedded via `embed.FS` for single-binary deployment
- In-memory mode: available for tests (`db.OpenInMemory()`)
The monorepo uses a Go workspace (go.work) with five modules. The Makefile provides:
| Target | Description |
|---|---|
| `make build` | Compile bin/taskprim, bin/stateprim, bin/knowledgeprim, and bin/queueprim |
| `make build-pi` | Cross-compile for ARM64 Linux |
| `make test` | Run all tests with race detector |
| `make lint` | Run go vet across all modules |
| `make fmt` | Format all code with gofmt |
| `make tidy` | Run go mod tidy for all modules |
| `make all` | tidy → fmt → lint → test → build |