|
1 | | -# AGENTS.md |
| 1 | +# BioAgents AgentKit — AI-powered bioscience research assistant |
2 | 2 |
|
3 | | -Guidance for AI coding agents working on this repository. Read these **before** making changes. |
| 3 | +## Core Principles |
4 | 4 |
|
5 | | ---- |
| 5 | +See `CODING_GUIDELINES.md` for general principles. Additionally: |
6 | 6 |
|
7 | | -## Core Principles |
| 7 | +- State assumptions explicitly before coding; ask if uncertain |
| 8 | +- Minimum code solving the problem; nothing speculative |
| 9 | +- Touch only what you must; preserve adjacent style |
| 10 | +- Verify with tests/typecheck before considering done |
| 11 | + |
| 12 | +## Commands |
| 13 | + |
| 14 | +**MANDATORY after edits:** |
| 15 | +```bash |
| 16 | +bun typecheck && bun test |
| 17 | +``` |
| 18 | + |
| 19 | +**Do NOT run `bun style:write` manually.** Husky runs Biome automatically on staged files during commit (pre-commit hook). This keeps formatting incremental — only touched files get fixed. |
| 20 | + |
| 21 | +Other commands: |
| 22 | +- `bun dev` — API server with hot reload |
| 23 | +- `bun start` — Production server |
| 24 | +- `bun worker` / `bun worker:dev` — BullMQ worker process |
| 25 | +- `bun lint` / `bun lint:fix` — Biome lint |
| 26 | +- `bun format:check` / `bun format:write` — Biome format |
| 27 | +- `bun style:check` / `bun style:write` — Biome lint + format + import sorting (prefer pre-commit hook) |
| 28 | +- `bun typecheck` — TypeScript type checking |
| 29 | +- `bun test` — bun:test |
| 30 | +- `bun build:client` — Build Preact frontend |
| 31 | + |
| 32 | +## Tech Stack |
| 33 | + |
| 34 | +- **Runtime**: Bun (not Node.js) — use `bun` everywhere, not `node`/`npm`/`ts-node` |
| 35 | +- **Web Framework**: Elysia |
| 36 | +- **Database**: Supabase (PostgreSQL) |
| 37 | +- **Job Queue**: BullMQ with Redis (optional, `USE_JOB_QUEUE=true`) |
| 38 | +- **Frontend**: Preact (bundled client in `client/dist/`) |
| 39 | +- **Testing**: bun:test |
| 40 | + |
| 41 | +## Project Structure |
| 42 | + |
| 43 | +``` |
| 44 | +src/ |
| 45 | +├── index.ts # Main server entry (Elysia app) |
| 46 | +├── worker.ts # BullMQ worker entry (separate process) |
| 47 | +├── character.ts # Agent identity/persona |
| 48 | +├── routes/ |
| 49 | +│ ├── chat.ts # POST /api/chat, GET /api/chat/status/:jobId |
| 50 | +│ ├── auth.ts # /api/auth/* (login, logout, status) |
| 51 | +│ ├── artifacts.ts # GET /api/artifacts/download |
| 52 | +│ ├── clarification.ts # /api/clarification/* (pre-research flow) |
| 53 | +│ ├── files.ts # /api/files/* (upload, confirm, status, delete) |
| 54 | +│ ├── deep-research/ # /api/deep-research/* (start, status, branch, paper) |
| 55 | +│ ├── x402/ # Payment-gated routes (Base/USDC) |
| 56 | +│ ├── b402/ # Payment-gated routes (BNB/USDT) |
| 57 | +│ └── admin/ # Bull Board dashboard + job management |
| 58 | +├── agents/ |
| 59 | +│ ├── analysis/ # Data analysis (Edison, BioData) |
| 60 | +│ ├── clarification/ # Pre-research question generation & plan creation |
| 61 | +│ ├── continueResearch/ # Autonomy decision (continue vs ask user) |
| 62 | +│ ├── discovery/ # Structure scientific discoveries from task results |
| 63 | +│ ├── fileUpload/ # File parsing (PDF, Excel, CSV, MD, JSON, TXT, OCR) |
| 64 | +│ ├── hypothesis/ # Hypothesis generation/updates from task outputs |
| 65 | +│ ├── literature/ # Literature search (OpenScholar, Knowledge, Edison, BioLit) |
| 66 | +│ ├── planning/ # Research plan/task generation |
| 67 | +│ ├── reflection/ # World state updates (objective, insights, methodology) |
| 68 | +│ └── reply/ # User-facing response generation |
| 69 | +├── chat-agent/ # Shared chat agent runtime (Claude Sonnet + tool use) |
| 70 | +│ ├── runner.ts # Main execution loop (in-process + queue modes) |
| 71 | +│ ├── loop.ts # Recursive message loop with tool execution |
| 72 | +│ ├── registry.ts # Tool registration |
| 73 | +│ └── tools/ # Chat agent tools (literature-search) |
| 74 | +├── services/ |
| 75 | +│ ├── chat/ # Conversation setup, message tools, payments |
| 76 | +│ ├── deep-research/ # Deep research mode guard/validation |
| 77 | +│ ├── files/ # File upload URL generation, processing, status |
| 78 | +│ ├── paper/ # Paper generation (Markdown → Pandoc → LaTeX → PDF) |
| 79 | +│ ├── queue/ # BullMQ connection, queues, workers, notifications |
| 80 | +│ └── websocket/ # WebSocket handler, Redis pub/sub |
| 81 | +├── middleware/ |
| 82 | +│ ├── authResolver.ts # Multi-method auth (JWT, API key, x402, b402) |
| 83 | +│ ├── rateLimiter.ts # Redis-backed rate limiting |
| 84 | +│ ├── x402/ # x402 payment validation (Base/USDC) |
| 85 | +│ └── b402/ # b402 payment validation (BNB/USDT) |
| 86 | +├── llm/ # LLM provider adapters (OpenAI, Anthropic, Google, OpenRouter) |
| 87 | +├── embeddings/ # Vector search and document processing |
| 88 | +├── mcp/ # MCP server integration |
| 89 | +├── db/ # Database operations (Supabase) |
| 90 | +├── storage/ # File storage (S3 with presigned URLs) |
| 91 | +├── types/ # TypeScript types + Zod schemas |
| 92 | +└── utils/ # Helpers (logger, cache, UUID, state, polyfills) |
| 93 | +``` |
| 94 | + |
| 95 | +## Running Modes |
| 96 | + |
| 97 | +### In-Process (Default) |
| 98 | +```bash |
| 99 | +USE_JOB_QUEUE=false bun run dev |
| 100 | +``` |
| 101 | +Jobs execute in the main process. Simpler for development. |
| 102 | + |
| 103 | +### Queue Mode (Production) |
| 104 | +```bash |
| 105 | +# Terminal 1: API server |
| 106 | +USE_JOB_QUEUE=true bun run dev |
| 107 | + |
| 108 | +# Terminal 2: Worker |
| 109 | +USE_JOB_QUEUE=true bun run worker:dev |
| 110 | +``` |
| 111 | +Jobs are queued in Redis and processed by separate worker processes, which supports horizontal scaling.
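
As a rough illustration of how the two modes relate, the sketch below picks a dispatch path from `USE_JOB_QUEUE`. The `runMode` helper and the `RunMode` type are hypothetical names, not taken from the repository; only the variable name and its `true` value come from the text above. The flag is read inside a function rather than at module level, in line with the TDZ guidance under Known Issues.

```typescript
// Hypothetical helper, not part of the repository.
type RunMode = "in-process" | "queue";

// Read the flag at call time rather than at module load.
// Anything other than the exact string "true" falls back to in-process mode.
function runMode(env: Record<string, string | undefined>): RunMode {
  return env.USE_JOB_QUEUE === "true" ? "queue" : "in-process";
}
```

In application code this would be called as `runMode(process.env)`; passing the environment in as an argument keeps the helper easy to test.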
| 112 | + |
| 113 | +## Key Environment Variables |
| 114 | + |
| 115 | +See `.env.example` for full list. Key groups: |
| 116 | + |
| 117 | +- **Auth**: `BIOAGENTS_SECRET`, `AUTH_MODE` (none/jwt), `UI_PASSWORD` |
| 118 | +- **LLM**: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY` + per-agent model config (`REPLY_LLM_PROVIDER`, `HYP_LLM_MODEL`, etc.) |
| 119 | +- **Chat Agent**: `CHAT_AGENT_MODEL`, `CHAT_AGENT_MAX_TOOL_CALLS`, `CHAT_AGENT_MAX_TOKENS` |
| 120 | +- **Database**: `SUPABASE_URL`, `SUPABASE_ANON_KEY`, `SUPABASE_SERVICE_KEY` |
| 121 | +- **Embeddings/RAG**: `EMBEDDING_PROVIDER`, `TEXT_EMBEDDING_MODEL`, `COHERE_API_KEY`, chunk/vector/reranking settings |
| 122 | +- **External Services**: `EDISON_API_URL`, `OPENSCHOLAR_API_URL`, `BIO_LIT_AGENT_API_URL` |
| 123 | +- **Storage**: `STORAGE_PROVIDER` (s3), `S3_BUCKET`, AWS credentials |
| 124 | +- **Queue**: `USE_JOB_QUEUE`, `REDIS_URL`, concurrency + rate limit settings |
| 125 | +- **Payments**: `X402_ENABLED`, `B402_ENABLED` + CDP credentials |
| 126 | + |
| 127 | +## API Endpoints |
| 128 | + |
| 129 | +### Core |
| 130 | +- `POST /api/chat` — Chat with AI agent |
| 131 | +- `GET /api/chat/status/:jobId` — Check chat job status (queue mode) |
| 132 | +- `POST /api/deep-research/start` — Start deep research session |
| 133 | +- `GET /api/deep-research/status/:messageId` — Check research status |
| 134 | +- `POST /api/deep-research/branch` — Fork research conversation |
| 135 | +- `GET /api/health` — Health check |
8 | 136 |
|
9 | | -### 1. Think before coding |
| 137 | +### Clarification (Pre-Research) |
| 138 | +- `POST /api/clarification/generate-questions` — Generate clarification questions |
| 139 | +- `POST /api/clarification/submit-answers` — Submit answers, create plan |
| 140 | +- `POST /api/clarification/plan-feedback` — Feedback on generated plan |
| 141 | +- `GET /api/clarification/:sessionId` — Get session state |
10 | 142 |
|
11 | | -Don't assume. Don't hide confusion. State ambiguity explicitly. Present multiple interpretations rather than silently picking one. Push back if a simpler approach exists. Stop and ask rather than guess. |
| 143 | +### Files |
| 144 | +- `POST /api/files/upload-url` — Request presigned upload URL |
| 145 | +- `POST /api/files/confirm` — Confirm upload, start processing |
| 146 | +- `GET /api/files/:fileId/status` — Processing status |
| 147 | +- `DELETE /api/files/:fileId` — Delete file |
12 | 148 |
|
13 | | -### 2. Simplicity first |
| 149 | +### Paper Generation |
| 150 | +- `POST /api/deep-research/conversations/:conversationId/paper` — Generate paper |
| 151 | +- `GET /api/deep-research/paper/:paperId` — Get paper with presigned URLs |
| 152 | +- `GET /api/deep-research/conversations/:conversationId/papers` — List papers |
14 | 153 |
|
15 | | -No features beyond what was asked. No abstractions for single-use code. No "flexibility" that wasn't requested. No error handling for impossible scenarios. The test: would a senior engineer say this is overcomplicated? If yes, rewrite it. |
| 154 | +### Payment-Gated |
| 155 | +- `POST /api/x402/chat` — Chat via Base/USDC |
| 156 | +- `POST /api/b402/chat` — Chat via BNB/USDT |
16 | 157 |
|
17 | | -### 3. Surgical changes |
| 158 | +### Admin |
| 159 | +- `/admin/queues` — Bull Board dashboard (when queue enabled) |
18 | 160 |
|
19 | | -Don't "improve" adjacent code. Don't refactor things that aren't broken. Match the existing style even if you'd do it differently. If you notice unrelated dead code, mention it, don't delete it. Every changed line should trace directly to the request. |
| 161 | +## Docker Deployment |
20 | 162 |
|
21 | | -### 4. Goal-driven execution |
| 163 | +### Production (with Job Queue) |
| 164 | +```bash |
| 165 | +docker compose up -d # API + Worker + Redis |
| 166 | +docker compose up -d --scale worker=3 # Scale workers |
| 167 | +``` |
22 | 168 |
|
23 | | -Transform "fix the bug" into "write a test that reproduces it, then make it pass." Transform "add validation" into "write tests for invalid inputs, then make them pass." Give it success criteria and watch it loop until done. |
| 169 | +### Worker-Only |
| 170 | +```bash |
| 171 | +docker compose -f docker-compose.worker.yml up -d |
| 172 | +``` |
| 173 | + |
| 174 | +### Swarm Mode |
| 175 | +```bash |
| 176 | +docker compose -f docker-compose.swarm.yml ... |
| 177 | +``` |
24 | 178 |
|
25 | 179 | --- |
26 | 180 |
|
27 | | -## Project-Specific Guidance |
| 181 | +## Deep Research: The AI Scientist Framework |
| 182 | + |
| 183 | +Deep Research is the PRIMARY way to use this agent. The agent behaves like a real scientist: iterative, methodical, hypothesis-driven. |
| 184 | + |
| 185 | +### Iterative Workflow |
| 186 | + |
| 187 | +``` |
| 188 | +Planning → Execute Tasks → Hypothesis → Reflection → Discovery → Human Steering → Next Cycle |
| 189 | +``` |
| 190 | + |
| 191 | +Each cycle: |
| 192 | +1. **Planning** — Decides WHAT tasks to run based on current state + user input |
| 193 | +2. **Execution** — Runs LITERATURE and ANALYSIS tasks in parallel (external services) |
| 194 | +3. **Hypothesis** — Synthesizes outputs into scientific claims |
| 195 | +4. **Reflection** — Updates world state with insights, evolves objectives |
| 196 | +5. **Discovery** — Identifies novel claims, links to evidence |
| 197 | +6. **Human Steering** — User reviews, approves, or redirects |
| 198 | + |
| 199 | +### Mini-Agent State Ownership |
| 200 | + |
| 201 | +| Agent | Updates | |
| 202 | +|-------|---------| |
| 203 | +| Planning | Returns suggestions (no state mutation) | |
| 204 | +| Hypothesis | `currentHypothesis` | |
| 205 | +| Reflection | `currentObjective`, `keyInsights`, `methodology`, `conversationTitle` | |
| 206 | +| Discovery | `discoveries[]` | |
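
The ownership table can be read as a per-agent allowlist of state fields. The following is a minimal sketch under that reading: the field names come from the table, while the `WorldState` shape, the field types, and the `applyUpdate` helper are assumptions made for illustration and may not match the repository's actual types.

```typescript
// Hypothetical shapes; field names follow the table, types are assumed.
interface Discovery {
  claim: string;
  taskId: string;
  jobId: string;
}

interface WorldState {
  currentHypothesis?: string;
  currentObjective?: string;
  keyInsights: string[];
  methodology?: string;
  conversationTitle?: string;
  discoveries: Discovery[];
}

type AgentName = "planning" | "hypothesis" | "reflection" | "discovery";

// Which fields each mini-agent is allowed to write.
const OWNED_FIELDS: Record<AgentName, ReadonlyArray<keyof WorldState>> = {
  planning: [], // returns suggestions only, no state mutation
  hypothesis: ["currentHypothesis"],
  reflection: ["currentObjective", "keyInsights", "methodology", "conversationTitle"],
  discovery: ["discoveries"],
};

// Return a new state with the patch applied, rejecting fields the agent does not own.
function applyUpdate(
  state: WorldState,
  agent: AgentName,
  patch: Partial<WorldState>,
): WorldState {
  const allowed = OWNED_FIELDS[agent];
  for (const key of Object.keys(patch) as Array<keyof WorldState>) {
    if (allowed.indexOf(key) === -1) {
      throw new Error(agent + " may not update " + String(key));
    }
  }
  return { ...state, ...patch };
}
```

Returning a new object instead of mutating `state` keeps earlier snapshots intact, which fits the mandate never to lose accumulated discoveries.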
| 207 | + |
| 208 | +### Behavioral Mandates |
| 209 | + |
| 210 | +- Update world state after every task completion — NEVER lose accumulated discoveries |
| 211 | +- Maintain traceability: claims → evidence → tasks → jobIds |
| 212 | +- User input ALWAYS overrides agent suggestions |
| 213 | +- Present clear next steps for user approval before execution |
| 214 | +- Every discovery MUST link to supporting evidence (taskId, jobId) |
| 215 | +- Each cycle MUST build meaningfully on prior work |
| 216 | +- LITERATURE and ANALYSIS tasks are executed by EXTERNAL services — handle failures gracefully |
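
One way to enforce the evidence-linking mandate is a check that runs before discoveries are persisted. This is only a sketch: `EvidenceLink`, `DiscoveryRecord`, and `findUntracedClaims` are hypothetical names, and the assumption that evidence is stored as a list of `taskId`/`jobId` pairs is not confirmed by the source.

```typescript
// Hypothetical shapes; the real types live in the repository and may differ.
interface EvidenceLink {
  taskId: string;
  jobId: string;
}

interface DiscoveryRecord {
  claim: string;
  evidence: EvidenceLink[];
}

// Return the claims that have no evidence link with both a taskId and a jobId.
function findUntracedClaims(discoveries: DiscoveryRecord[]): string[] {
  return discoveries
    .filter((d) => !d.evidence.some((e) => e.taskId !== "" && e.jobId !== ""))
    .map((d) => d.claim);
}
```

An empty result means every discovery can be traced back to at least one task and job.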
| 217 | + |
| 218 | +## Known Issues |
| 219 | + |
| 220 | +### TDZ (Temporal Dead Zone) in Workers |
| 221 | + |
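| 222 | +Bun workers initialize modules differently, so a module-level variable can be accessed before it is initialized, which throws a TDZ `ReferenceError`. Read configuration inside functions, and keep singletons on `globalThis`.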
| 223 | + |
| 224 | +```typescript |
| 225 | +// BAD — TDZ error in workers |
| 226 | +const config = process.env.MY_VAR; |
| 227 | + |
| 228 | +// GOOD — inside function |
| 229 | +export async function doSomething() { |
| 230 | + const config = process.env.MY_VAR; |
| 231 | +} |
| 232 | + |
| 233 | +// GOOD — globalThis for singletons |
| 234 | +let cache = (globalThis as any).__myCache; |
| 235 | +if (!cache) { |
| 236 | + cache = new Map(); |
| 237 | + (globalThis as any).__myCache = cache; |
| 238 | +} |
| 239 | +``` |
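
The `globalThis` pattern above can be wrapped in a small typed helper so each singleton is created lazily on first use. This is a sketch, not code from the repository: `getSingleton` and `getCache` are hypothetical names, and the `__myCache` key is reused from the example above.

```typescript
// Hypothetical helper: lazily create a process-wide singleton stored on globalThis.
function getSingleton<T>(key: string, create: () => T): T {
  const store = globalThis as unknown as Record<string, unknown>;
  if (store[key] === undefined) {
    store[key] = create();
  }
  return store[key] as T;
}

// Usage: the Map is created on the first call and reused afterwards.
function getCache(): Map<string, string> {
  return getSingleton("__myCache", () => new Map<string, string>());
}
```

Because nothing runs at module load, importing a file that uses this helper has no initialization-order side effects.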
| 240 | + |
| 241 | +### Canvas Polyfill |
| 242 | + |
| 243 | +`pdf-parse` requires canvas polyfills. Both `index.ts` and `worker.ts` MUST import the polyfill first: |
| 244 | +```typescript |
| 245 | +import "./utils/canvas-polyfill"; |
| 246 | +``` |
| 247 | + |
| 248 | +## Related Documentation |
| 249 | + |
| 250 | +- [AUTH.md](documentation/docs/AUTH.md) — Authentication (JWT, x402/b402 payments) |
| 251 | +- [SETUP.md](documentation/docs/SETUP.md) — Environment setup and LLM configuration |
| 252 | +- [JOB_QUEUE.md](documentation/docs/JOB_QUEUE.md) — BullMQ queue system architecture |
| 253 | +- [FILE_UPLOAD.md](documentation/docs/FILE_UPLOAD.md) — S3 presigned URL file upload flow |
| 254 | + |
| 255 | +## Git & PRs |
28 | 256 |
|
29 | | -For codebase architecture, tech stack, commands, and project structure, see [CLAUDE.md](./CLAUDE.md). |
| 257 | +- Branch naming: `[initials]-[description]` (e.g., `ms-add-basic-code-quality`) |
| 258 | +- Biome lint/format and tests run in CI — do NOT include as manual test plan items |
| 259 | +- PR test plan: only manual verification steps |