Your AI coding agent is fast, eager, and reckless. Make it work like a senior engineer instead.
AI DevKit turns one-off AI coding chats into a repeatable software delivery workflow: requirements, design, planning, implementation, tests, verification, memory, and review.
- Stops prompt-and-pray coding: `/new-requirement` makes the agent clarify the problem before touching code
- Blocks fake "done" claims: `verify` requires fresh test/build output before completion claims
- Keeps project knowledge alive: `@ai-devkit/memory` stores decisions, conventions, and fixes across sessions
- Catches drift before push: `/code-review` audits the diff against the design and requirements docs
One config. All coding agents: Claude Code, Cursor, Codex CLI, Gemini CLI, GitHub Copilot, opencode, Antigravity, Amp, Windsurf, Kilo Code, Roo Code.
Run `npx ai-devkit@latest init` and your agent gets:
| What you need | What AI DevKit installs |
|---|---|
| A plan before code | /new-requirement, /review-design, and /execute-plan |
| Evidence before "done" | verify gates tied to fresh test/build output |
| Memory across sessions | Local SQLite memory exposed through MCP and CLI |
| Same behavior across agents | Generated config for the coding tools your team uses |
AI DevKit is for developers who use AI coding agents daily and are tired of:
- re-rigging `CLAUDE.md` / `.cursor/rules` / `AGENTS.md` for every project
- the agent forgetting yesterday's conventions
- "I've successfully implemented the feature" with a red build
- the agent diving into code without a plan and producing the wrong thing
Before AI DevKit, your agent is a capable but inconsistent chatbot. After AI DevKit, it has a workflow, memory, verification gates, and reusable skills that travel with your repo.
| Without AI DevKit | With AI DevKit |
|---|---|
| You repeat project rules in every chat | The agent searches project memory and docs first |
| The agent jumps from prompt to code | The agent moves through requirements, design, and plan |
| "Done" means the agent stopped editing | "Done" requires fresh verification output |
| Each agent needs separate hand-maintained rules | One config reconciles commands, skills, and MCP setup |

    npx ai-devkit@latest init

One wizard. Pick your agents, install the workflow, and give them the same operating model. It writes project-local files you can review and commit. Re-run it whenever your agent list or workflow changes.
Here's what lands in your repo:

    your-project/
    ├── .ai-devkit.json        # single source of truth (re-run init anytime)
    ├── .claude/               # or .cursor/, .codex/, etc. per agent you picked
    │   ├── skills/            # dev-lifecycle, verify, memory, tdd, ...
    │   ├── commands/          # /new-requirement, /execute-plan, /code-review, ...
    │   └── settings.json      # MCP servers wired up (incl. @ai-devkit/memory)
    └── docs/ai/
        ├── requirements/      # phase 1: what to build, why
        ├── design/            # phase 2: how it'll be built
        ├── planning/          # phase 3: task-by-task plan
        ├── implementation/    # phase 4: execution notes
        └── testing/           # phase 5: coverage strategy

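
The layout above implies a predictable path for each phase document. As a minimal sketch, assuming the `feature-<name>.md` naming shown for requirements, design, and planning in this README's workflow example also applies to the implementation and testing folders (an assumption, not something the README confirms), the paths for one feature could be derived like this. `PHASES` and `phaseDocPaths` are hypothetical names for illustration and are not part of the `ai-devkit` package.

```typescript
// Hypothetical helper: not part of the ai-devkit package.
// The five phase folders under docs/ai/, in workflow order.
const PHASES = [
  "requirements",
  "design",
  "planning",
  "implementation",
  "testing",
] as const;

type Phase = (typeof PHASES)[number];

// Build the expected doc path for every phase of a feature.
// Assumes the "feature-<name>.md" naming is used in each folder.
function phaseDocPaths(feature: string): Record<Phase, string> {
  const paths = {} as Record<Phase, string>;
  PHASES.forEach((phase) => {
    paths[phase] = `docs/ai/${phase}/feature-${feature}.md`;
  });
  return paths;
}

const oauthDocs = phaseDocPaths("oauth-login");
// oauthDocs.requirements is "docs/ai/requirements/feature-oauth-login.md"
```

Keeping the phase list in one ordered constant means a reviewer can check at a glance which documents should exist before implementation starts.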
Save `templates/senior-engineer.yaml` locally and run:

    ai-devkit init --template ./senior-engineer.yaml

Bundles the eight built-in skills with curated additions from Anthropic, Vercel, and others: TDD, frontend design, webapp testing, doc co-authoring, React best practices, security review, and more.

    You:   /new-requirement add OAuth login with Google
    Agent: Searches memory for prior auth conventions. Asks clarifying
           questions about scope, users, success criteria. Drafts
           docs/ai/{requirements,design,planning}/feature-oauth-login.md
           in a feature worktree. Stops before coding.

    You:   /review-design feature-oauth-login
    Agent: Audits the design doc against the requirements. Flags gaps
           and proposes fixes before any code gets written.

    You:   /execute-plan feature-oauth-login
    Agent: Works the planning doc task-by-task. Updates progress after
           each task. The `verify` skill blocks a task from being
           marked done without fresh test/build output.

    You:   /code-review
    Agent: Audits the diff against the design doc (scope creep,
           missing tests, edge cases the requirements named)
           before you push.

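
The `verify` behavior in the `/execute-plan` step above can be pictured as a simple rule: a task counts as done only if test/build output was captured after the most recent code edit, and all of that fresh output passed. The following is a minimal sketch of that rule, not AI DevKit's actual implementation. The `Evidence` and `Task` shapes and the `canMarkDone` function are hypothetical.

```typescript
// Hypothetical types illustrating a "fresh evidence" gate.
interface Evidence {
  command: string;    // e.g. "npm test"
  exitCode: number;   // 0 means the command succeeded
  capturedAt: number; // epoch milliseconds when the output was captured
}

interface Task {
  title: string;
  lastEditAt: number; // epoch milliseconds of the latest code change
  evidence: Evidence[];
}

// A task may be marked done only if some command output is newer than
// the latest edit and every such fresh run succeeded. Output captured
// before the edit is stale and is ignored.
function canMarkDone(task: Task): boolean {
  const fresh = task.evidence.filter((e) => e.capturedAt > task.lastEditAt);
  return fresh.length > 0 && fresh.every((e) => e.exitCode === 0);
}

const oauthTask: Task = {
  title: "Add Google OAuth callback route",
  lastEditAt: 2000,
  evidence: [{ command: "npm test", exitCode: 0, capturedAt: 1500 }],
};

const beforeRerun = canMarkDone(oauthTask); // false: tests ran before the last edit

oauthTask.evidence.push({ command: "npm test", exitCode: 0, capturedAt: 2500 });
const afterRerun = canMarkDone(oauthTask); // true: passing output is newer than the edit
```

Comparing timestamps rather than trusting a "tests passed" message is what rules out the "successfully implemented" claim made against a red or outdated build.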
The flow above is powered by eight built-in skills, each addressing a failure mode developers see in real AI coding sessions:
| Failure mode | AI DevKit behavior |
|---|---|
| Agent starts coding too early | dev-lifecycle forces requirements, design, planning, implementation, tests, and review |
| Agent says "done" without proof | verify blocks completion claims without fresh test/build evidence |
| Agent forgets project decisions | memory gives it a local, searchable knowledge base across sessions and projects |
| New behavior ships without tests | tdd pushes test-first implementation |
| Debugging becomes guess-and-patch | structured-debug makes it reproduce, hypothesize, fix, and verify |
| Existing code is opaque | document-code maps entry points, dependencies, and behavior |
| Implementation gets bloated | simplify-implementation reduces complexity before code ships |
| Documentation is hard to follow | technical-writer audits docs for novice-user clarity |
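
To make the `memory` row concrete: decisions and conventions are stored once, then searched at the start of later sessions. The sketch below illustrates that store-then-search pattern with a plain in-memory array and naive keyword matching. It is not how `@ai-devkit/memory` works internally (this README describes it as local SQLite exposed through MCP and CLI). `MemoryEntry`, `remember`, and `searchMemory` are hypothetical names.

```typescript
// Hypothetical in-memory stand-in for a persistent knowledge base.
interface MemoryEntry {
  topic: string; // short label, e.g. "auth"
  note: string;  // the decision, convention, or fix to remember
}

const memoryEntries: MemoryEntry[] = [];

function remember(topic: string, note: string): void {
  memoryEntries.push({ topic, note });
}

// Return entries whose topic or note contains every query word,
// case-insensitively. A real store would likely rank the results.
function searchMemory(query: string): MemoryEntry[] {
  const words = query
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 0);
  if (words.length === 0) {
    return [];
  }
  return memoryEntries.filter((entry) => {
    const haystack = (entry.topic + " " + entry.note).toLowerCase();
    return words.every((w) => haystack.indexOf(w) !== -1);
  });
}

remember("auth", "Sessions use httpOnly cookies, never localStorage");
remember("testing", "Integration tests run against a local Postgres container");

const authHits = searchMemory("auth cookies");
// authHits contains only the "auth" entry
```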
Need more? `ai-devkit skill add <registry> <skill>` pulls from 30+ publishers, including Anthropic, Vercel, Supabase, Microsoft, and Google.
One `.ai-devkit.json` configures every supported agent. Add a new agent to your team without rewriting your rules.
| Agent | Setup | Remote control |
|---|---|---|
| Claude Code | yes | yes |
| Gemini CLI | yes | yes |
| Codex CLI | yes | yes |
| opencode | yes | testing |
| Cursor | yes | — |
| GitHub Copilot | yes | — |
| Antigravity | yes | — |
| Amp | yes | — |
| Windsurf | testing | — |
| Kilo Code | testing | — |
| Roo Code | testing | — |
- Setup: `ai-devkit init` writes the agent's config (rules, MCP servers, skills, slash commands) so it follows the same workflow.
- Remote control: drive running sessions with `ai-devkit agent send` and route them through external channels.
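
The support table can also be read as data, for example when deciding which agents a team can supervise remotely. The sketch below copies the values from the table above (so it goes stale whenever support changes) and assumes a dash in the table means "not supported". The `SUPPORT` constant and `remoteControllable` function are hypothetical and not part of the `ai-devkit` package.

```typescript
type Level = "yes" | "testing" | "no";

interface AgentSupport {
  agent: string;
  setup: Level;
  remoteControl: Level; // "no" stands in for the dash in the table (assumption)
}

// Values transcribed from the support table in this README.
const SUPPORT: AgentSupport[] = [
  { agent: "Claude Code", setup: "yes", remoteControl: "yes" },
  { agent: "Gemini CLI", setup: "yes", remoteControl: "yes" },
  { agent: "Codex CLI", setup: "yes", remoteControl: "yes" },
  { agent: "opencode", setup: "yes", remoteControl: "testing" },
  { agent: "Cursor", setup: "yes", remoteControl: "no" },
  { agent: "GitHub Copilot", setup: "yes", remoteControl: "no" },
  { agent: "Antigravity", setup: "yes", remoteControl: "no" },
  { agent: "Amp", setup: "yes", remoteControl: "no" },
  { agent: "Windsurf", setup: "testing", remoteControl: "no" },
  { agent: "Kilo Code", setup: "testing", remoteControl: "no" },
  { agent: "Roo Code", setup: "testing", remoteControl: "no" },
];

// Agents whose remote control is marked "yes". Pass true to also
// include agents where remote control is still in testing.
function remoteControllable(includeTesting = false): string[] {
  return SUPPORT.filter(
    (s) =>
      s.remoteControl === "yes" ||
      (includeTesting && s.remoteControl === "testing")
  ).map((s) => s.agent);
}

const stableAgents = remoteControllable();
// ["Claude Code", "Gemini CLI", "Codex CLI"]
const allCandidates = remoteControllable(true);
// the same three plus "opencode"
```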
AI DevKit also ships an agent control plane — drive sessions from the CLI, supervise from anywhere:

    # List running sessions across providers
    ai-devkit agent list

    # Send a prompt to a running session and wait for the response
    ai-devkit agent send <session-id> "run the tests and report back" --wait

    # Pipe a session through Telegram to operate your agent from your phone
    ai-devkit channel start telegram

Useful for long-running tasks, scheduled work, or checking on an agent from your phone at lunch.
Rules files such as `CLAUDE.md`, `.cursor/rules`, and `AGENTS.md` are static instructions the agent re-reads. AI DevKit gives the agent a workflow layer: phase docs, slash commands, skills loaded on demand, local searchable memory, verification gates, and a control surface that works across agents. The rules still matter, but AI DevKit makes them operational.
| Static rules files | AI DevKit |
|---|---|
| Tell the agent what you prefer | Installs commands that drive the next step |
| Depend on the agent remembering every rule | Stores and searches reusable project knowledge |
| Cannot prove a task is complete | Requires fresh command output before completion claims |
| Are different for each agent | Generates the right files for each supported agent |
- Not a smarter LLM. Bad models stay bad — this raises the floor on process, not on raw capability.
- Not a magic "write the feature for me" button. You still review the requirements doc, accept the design, and read the diff. The workflow makes that review possible (artifacts to point at) instead of impossible (chat scrollback).
- Not a hosted service. MIT-licensed, runs locally, no telemetry. Memory is a SQLite file on your disk. The agent control plane talks to the agent SDKs you already use.
- Full guides, workflow patterns, skill authoring → ai-devkit.com/docs
- Release notes → CHANGELOG.md
- Contributing → CONTRIBUTING.md

To build from source:

    git clone https://github.com/Codeaholicguy/ai-devkit.git
    cd ai-devkit && npm install && npm run build

License: MIT