AI DevKit

Your AI coding agent is fast, eager, and reckless. Make it work like a senior engineer instead.

AI DevKit turns one-off AI coding chats into a repeatable software delivery workflow: requirements, design, planning, implementation, tests, verification, memory, and review.

- Stops prompt-and-pray coding: `/new-requirement` makes the agent clarify the problem before touching code
- Blocks fake "done" claims: `verify` requires fresh test/build output before completion claims
- Keeps project knowledge alive: `@ai-devkit/memory` stores decisions, conventions, and fixes across sessions
- Catches drift before push: `/code-review` audits the diff against the design and requirements docs

One config. All coding agents: Claude Code, Cursor, Codex CLI, Gemini CLI, GitHub Copilot, opencode, Antigravity, Amp, Windsurf, Kilo Code, Roo Code.

Run `npx ai-devkit@latest init` and your agent gets:

| What you need | What AI DevKit installs |
| --- | --- |
| A plan before code | `/new-requirement`, `/review-design`, and `/execute-plan` |
| Evidence before "done" | `verify` gates tied to fresh test/build output |
| Memory across sessions | Local SQLite memory exposed through MCP and CLI |
| Same behavior across agents | Generated config for the coding tools your team uses |

Who this is for

Developers who use AI coding agents daily and are tired of:

- re-rigging `CLAUDE.md` / `.cursor/rules` / `AGENTS.md` for every project
- the agent forgetting yesterday's conventions
- "I've successfully implemented the feature" with a red build
- the agent diving into code without a plan and producing the wrong thing

Before AI DevKit, your agent is a capable but inconsistent chatbot. After AI DevKit, it has a workflow, memory, verification gates, and reusable skills that travel with your repo.

| Without AI DevKit | With AI DevKit |
| --- | --- |
| You repeat project rules in every chat | The agent searches project memory and docs first |
| The agent jumps from prompt to code | The agent moves through requirements, design, and plan |
| "Done" means the agent stopped editing | "Done" requires fresh verification output |
| Each agent needs separate hand-maintained rules | One config reconciles commands, skills, and MCP setup |

Start in 30 seconds

npx ai-devkit@latest init

One wizard. Pick your agents, install the workflow, and give them the same operating model. It writes project-local files you can review and commit. Re-run it whenever your agent list or workflow changes.

Here's what lands in your repo:

your-project/
├── .ai-devkit.json              # single source of truth (re-run init anytime)
├── .claude/                     # or .cursor/, .codex/, etc. per agent you picked
│   ├── skills/                  # dev-lifecycle, verify, memory, tdd, ...
│   ├── commands/                # /new-requirement, /execute-plan, /code-review, ...
│   └── settings.json            # MCP servers wired up (incl. @ai-devkit/memory)
└── docs/ai/
    ├── requirements/            # phase 1 — what to build, why
    ├── design/                  # phase 2 — how it'll be built
    ├── planning/                # phase 3 — task-by-task plan
    ├── implementation/          # phase 4 — execution notes
    └── testing/                 # phase 5 — coverage strategy
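The phase layout above can be sketched as a small path helper. This is an illustrative sketch, not AI DevKit's own code: `phaseDocPath` and `allPhaseDocPaths` are hypothetical names, and the `feature-<name>.md` filename is borrowed from the OAuth walkthrough later in this README, where it appears only for the first three phases. Applying it to `implementation/` and `testing/` is an assumption.

```typescript
// Phase directory names come from the docs/ai/ tree in this README.
const PHASES = ["requirements", "design", "planning", "implementation", "testing"] as const;
type Phase = (typeof PHASES)[number];

// Hypothetical helper: path of one phase doc for a feature.
// Assumes every phase uses the `feature-<name>.md` filename pattern.
function phaseDocPath(phase: Phase, featureName: string): string {
  return `docs/ai/${phase}/feature-${featureName}.md`;
}

// Hypothetical helper: all phase doc paths for a feature, in phase order.
function allPhaseDocPaths(featureName: string): string[] {
  return PHASES.map((phase) => phaseDocPath(phase, featureName));
}

// phaseDocPath("design", "oauth-login")
// → "docs/ai/design/feature-oauth-login.md"
```

Keeping one doc per phase under a predictable path is what lets later commands refer to a feature by name alone.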

Or get the full engineering workflow stack

Save templates/senior-engineer.yaml locally and run:

ai-devkit init --template ./senior-engineer.yaml

Bundles the eight built-in skills with curated additions from Anthropic, Vercel, and others — TDD, frontend design, webapp testing, doc co-authoring, React best practices, security review, and more.

A feature, end-to-end

You:    /new-requirement add OAuth login with Google

Agent:  Searches memory for prior auth conventions. Asks clarifying
        questions about scope, users, success criteria. Drafts
        docs/ai/{requirements,design,planning}/feature-oauth-login.md
        in a feature worktree. Stops before coding.

You:    /review-design feature-oauth-login

Agent:  Audits the design doc against the requirements. Flags gaps,
        proposes fixes — before any code gets written.

You:    /execute-plan feature-oauth-login

Agent:  Works the planning doc task-by-task. Updates progress after
        each task. The `verify` skill blocks a task from being
        marked done without fresh test/build output.

You:    /code-review

Agent:  Audits the diff against the design doc — scope creep,
        missing tests, edge cases the requirements named —
        before you push.
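The "fresh test/build output" rule in the `/execute-plan` step can be pictured as a small gate function. The sketch below is a guess at the idea, not the actual `verify` skill: `Evidence`, `Task`, and `canMarkDone` are hypothetical names, and "fresh" is assumed to mean "captured after the last code edit with a zero exit code".

```typescript
// Hypothetical record of one test or build run.
interface Evidence {
  command: string;    // e.g. "npm test"
  exitCode: number;   // 0 means the command succeeded
  capturedAt: number; // epoch milliseconds when the output was captured
}

// Hypothetical view of a task from the planning doc.
interface Task {
  id: string;
  lastEditedAt: number; // epoch milliseconds of the most recent code change
}

type GateResult = { allowed: true } | { allowed: false; reason: string };

// Allows a "done" claim only when passing output exists and is newer
// than the last edit (an assumed definition of "fresh").
function canMarkDone(task: Task, evidence: Evidence | undefined): GateResult {
  if (evidence === undefined) {
    return { allowed: false, reason: "no test/build output recorded" };
  }
  if (evidence.capturedAt < task.lastEditedAt) {
    return { allowed: false, reason: "output is older than the last edit" };
  }
  if (evidence.exitCode !== 0) {
    return { allowed: false, reason: `${evidence.command} exited with code ${evidence.exitCode}` };
  }
  return { allowed: true };
}
```

The ordering of the checks matters: stale output is rejected before its exit code is even considered, so a green run from before the last edit cannot justify a completion claim.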

What changes in the agent

The flow above is powered by eight built-in skills, each addressing a failure mode developers see in real AI coding sessions:

| Failure mode | AI DevKit behavior |
| --- | --- |
| Agent starts coding too early | `dev-lifecycle` forces requirements, design, planning, implementation, tests, and review |
| Agent says "done" without proof | `verify` blocks completion claims without fresh test/build evidence |
| Agent forgets project decisions | `memory` gives it a local, searchable knowledge base across sessions and projects |
| New behavior ships without tests | `tdd` pushes test-first implementation |
| Debugging becomes guess-and-patch | `structured-debug` makes it reproduce, hypothesize, fix, and verify |
| Existing code is opaque | `document-code` maps entry points, dependencies, and behavior |
| Implementation gets bloated | `simplify-implementation` reduces complexity before code ships |
| Documentation is hard to follow | `technical-writer` audits docs for novice-user clarity |
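To make the "searchable knowledge base" row concrete, here is a minimal store-then-search sketch. It is only an illustration of the idea: the README says the real memory is a SQLite file, whereas this sketch keeps entries in an array, and `MemoryEntry`, `MemoryStore`, and the term-count ranking are all assumptions rather than the `@ai-devkit/memory` API.

```typescript
// Hypothetical shape of one stored piece of project knowledge.
interface MemoryEntry {
  title: string;
  content: string;
  tags: string[];
}

// In-memory stand-in for a persistent knowledge base.
class MemoryStore {
  private entries: MemoryEntry[] = [];

  store(entry: MemoryEntry): void {
    this.entries.push(entry);
  }

  // Returns entries matching at least one query term, best match first.
  search(query: string): MemoryEntry[] {
    const terms = query
      .toLowerCase()
      .split(/\s+/)
      .filter((term) => term.length > 0);
    if (terms.length === 0) {
      return [];
    }
    return this.entries
      .map((entry) => {
        const haystack = `${entry.title} ${entry.content} ${entry.tags.join(" ")}`.toLowerCase();
        // Score is the number of distinct query terms found in the entry.
        const score = terms.filter((term) => haystack.indexOf(term) !== -1).length;
        return { entry, score };
      })
      .filter((result) => result.score > 0)
      .sort((a, b) => b.score - a.score)
      .map((result) => result.entry);
  }
}
```

An agent that calls `search` before starting work is what replaces "you repeat project rules in every chat" with a lookup.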

Need more? `ai-devkit skill add <registry> <skill>` pulls from 30+ publishers, including Anthropic, Vercel, Supabase, Microsoft, and Google.

Works with every coding agent

One .ai-devkit.json configures all of them. Add a new agent to your team without rewriting your rules.

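| Agent | Setup | Remote control |
| --- | --- | --- |
| Claude Code | yes | yes |
| Gemini CLI | yes | yes |
| Codex CLI | yes | yes |
| opencode | yes | testing |
| Cursor | yes | — |
| GitHub Copilot | yes | — |
| Antigravity | yes | — |
| Amp | yes | — |
| Windsurf | testing | — |
| Kilo Code | testing | — |
| Roo Code | testing | — |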

- Setup: `ai-devkit init` writes the agent's config (rules, MCP servers, skills, slash commands) so it follows the same workflow.
- Remote control: drive running sessions with `ai-devkit agent send` and route them through external channels.
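If you want to check the support table in a script (for example, to decide which agents a team can drive remotely), it can be encoded as data. The rows below are copied from the table in this section; the `AgentRow` type and the `agentsWith` helper are hypothetical and not part of AI DevKit.

```typescript
// "none" stands for the table cells that list no support.
type Status = "yes" | "testing" | "none";

interface AgentRow {
  name: string;
  setup: Status;
  remoteControl: Status;
}

// Transcribed from the support table in this README.
const SUPPORT: AgentRow[] = [
  { name: "Claude Code", setup: "yes", remoteControl: "yes" },
  { name: "Gemini CLI", setup: "yes", remoteControl: "yes" },
  { name: "Codex CLI", setup: "yes", remoteControl: "yes" },
  { name: "opencode", setup: "yes", remoteControl: "testing" },
  { name: "Cursor", setup: "yes", remoteControl: "none" },
  { name: "GitHub Copilot", setup: "yes", remoteControl: "none" },
  { name: "Antigravity", setup: "yes", remoteControl: "none" },
  { name: "Amp", setup: "yes", remoteControl: "none" },
  { name: "Windsurf", setup: "testing", remoteControl: "none" },
  { name: "Kilo Code", setup: "testing", remoteControl: "none" },
  { name: "Roo Code", setup: "testing", remoteControl: "none" },
];

// Hypothetical helper: names of agents that support a capability,
// optionally counting "testing" as supported.
function agentsWith(capability: "setup" | "remoteControl", includeTesting: boolean): string[] {
  return SUPPORT
    .filter((row) => row[capability] === "yes" || (includeTesting && row[capability] === "testing"))
    .map((row) => row.name);
}

// agentsWith("remoteControl", false)
// → ["Claude Code", "Gemini CLI", "Codex CLI"]
```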

Operate agents like infrastructure

AI DevKit also ships an agent control plane — drive sessions from the CLI, supervise from anywhere:

# List running sessions across providers
ai-devkit agent list

# Send a prompt to a running session and wait for the response
ai-devkit agent send <session-id> "run the tests and report back" --wait

# Pipe a session through Telegram — operate your agent from your phone
ai-devkit channel start telegram

Useful for long-running tasks, scheduled work, or checking on an agent from your phone at lunch.

How is this different from `CLAUDE.md`, `.cursor/rules`, or `AGENTS.md`?

Those files are static instructions the agent re-reads. AI DevKit gives the agent a workflow layer: phase docs, slash commands, skills loaded on demand, local searchable memory, verification gates, and a control surface that works across agents. The rules still matter, but AI DevKit makes them operational.

| Static rules files | AI DevKit |
| --- | --- |
| Tell the agent what you prefer | Installs commands that drive the next step |
| Depend on the agent remembering every rule | Stores and searches reusable project knowledge |
| Cannot prove a task is complete | Requires fresh command output before completion claims |
| Are different for each agent | Generates the right files for each supported agent |

What this isn't

- Not a smarter LLM. Bad models stay bad. This raises the floor on process, not on raw capability.
- Not a magic "write the feature for me" button. You still review the requirements doc, accept the design, and read the diff. The workflow makes that review possible (artifacts to point at) instead of impossible (chat scrollback).
- Not a hosted service. MIT-licensed, runs locally, no telemetry. Memory is a SQLite file on your disk. The agent control plane talks to the agent SDKs you already use.

Documentation & community

- Full guides, workflow patterns, skill authoring → ai-devkit.com/docs
- Release notes → CHANGELOG.md
- Contributing → CONTRIBUTING.md

git clone https://github.com/Codeaholicguy/ai-devkit.git
cd ai-devkit && npm install && npm run build

License

MIT
