Commit f4baf3e

Add basic code quality infrastructure
Replace Prettier with Biome for linting, formatting, and import sorting. Add Husky pre-commit hook that runs Biome on staged files incrementally. Add GitHub Actions CI with lint-format (PR changed files only) and test jobs. Add initial test suites (58 tests) for core utilities: uuid, cache, planningJsonExtractor, state, discovery, DOI, textUtils, authResolver. Rewrite CLAUDE.md into concise agent-optimized AGENTS.md with symlink for backwards compatibility.
1 parent adf5707 commit f4baf3e

18 files changed

Lines changed: 986 additions & 482 deletions

.editorconfig

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+root = true
+
+[*]
+charset = utf-8
+end_of_line = lf
+indent_size = 2
+indent_style = space
+insert_final_newline = true
+trim_trailing_whitespace = true

.github/workflows/ci.yml

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+name: CI
+on:
+  pull_request:
+    branches: [main, dev]
+  push:
+    branches: [main, dev]
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  lint-format:
+    name: Lint & Format
+    if: github.event_name == 'pull_request'
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - uses: biomejs/setup-biome@v2
+        with:
+          version: 2.4.11
+
+      - name: Biome CI (changed files only)
+        run: |
+          FILES=$(git diff --name-only --diff-filter=ACMR "${{ github.event.pull_request.base.sha }}" HEAD -- '*.ts' '*.tsx' '*.js' '*.jsx' '*.json' '*.css' | tr '\n' ' ')
+          if [ -n "$FILES" ]; then
+            biome ci $FILES
+          fi
+
+  test:
+    name: Test
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: oven-sh/setup-bun@v2
+
+      - run: bun install --frozen-lockfile
+
+      - run: bun run typecheck
+
+      - name: Test
+        run: bun test

.husky/pre-commit

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# Only run biome if there are supported files staged
+STAGED_FILES=$(git diff --cached --name-only --diff-filter=ACMR | grep -E '\.(js|jsx|ts|tsx|json|css)$' || true)
+
+if [ -n "$STAGED_FILES" ]; then
+  biome check --staged --error-on-warnings --write
+  git update-index --again
+fi

AGENTS.md

Lines changed: 244 additions & 14 deletions
@@ -1,29 +1,259 @@
1-
# AGENTS.md
1+
# BioAgents AgentKit — AI-powered bioscience research assistant
22

3-
Guidance for AI coding agents working on this repository. Read these **before** making changes.
3+
## Core Principles
44

5-
---
5+
See `CODING_GUIDELINES.md` for general principles. Additionally:
66

7-
## Core Principles
7+
- State assumptions explicitly before coding; ask if uncertain
8+
- Minimum code solving the problem; nothing speculative
9+
- Touch only what you must; preserve adjacent style
10+
- Verify with tests/typecheck before considering done
11+
12+
## Commands
13+
14+
**MANDATORY after edits:**
15+
```bash
16+
bun typecheck && bun test
17+
```
18+
19+
**Do NOT run `bun style:write` manually.** Husky runs Biome automatically on staged files during commit (pre-commit hook). This keeps formatting incremental — only touched files get fixed.
20+
21+
Other commands:
22+
- `bun dev` — API server with hot reload
23+
- `bun start` — Production server
24+
- `bun worker` / `bun worker:dev` — BullMQ worker process
25+
- `bun lint` / `bun lint:fix` — Biome lint
26+
- `bun format:check` / `bun format:write` — Biome format
27+
- `bun style:check` / `bun style:write` — Biome lint + format + import sorting (prefer pre-commit hook)
28+
- `bun typecheck` — TypeScript type checking
29+
- `bun test` — bun:test
30+
- `bun build:client` — Build Preact frontend
31+
32+
## Tech Stack
33+
34+
- **Runtime**: Bun (not Node.js) — use `bun` everywhere, not `node`/`npm`/`ts-node`
35+
- **Web Framework**: Elysia
36+
- **Database**: Supabase (PostgreSQL)
37+
- **Job Queue**: BullMQ with Redis (optional, `USE_JOB_QUEUE=true`)
38+
- **Frontend**: Preact (bundled client in `client/dist/`)
39+
- **Testing**: bun:test
40+
41+
## Project Structure
42+
43+
```
44+
src/
45+
├── index.ts # Main server entry (Elysia app)
46+
├── worker.ts # BullMQ worker entry (separate process)
47+
├── character.ts # Agent identity/persona
48+
├── routes/
49+
│ ├── chat.ts # POST /api/chat, GET /api/chat/status/:jobId
50+
│ ├── auth.ts # /api/auth/* (login, logout, status)
51+
│ ├── artifacts.ts # GET /api/artifacts/download
52+
│ ├── clarification.ts # /api/clarification/* (pre-research flow)
53+
│ ├── files.ts # /api/files/* (upload, confirm, status, delete)
54+
│ ├── deep-research/ # /api/deep-research/* (start, status, branch, paper)
55+
│ ├── x402/ # Payment-gated routes (Base/USDC)
56+
│ ├── b402/ # Payment-gated routes (BNB/USDT)
57+
│ └── admin/ # Bull Board dashboard + job management
58+
├── agents/
59+
│ ├── analysis/ # Data analysis (Edison, BioData)
60+
│ ├── clarification/ # Pre-research question generation & plan creation
61+
│ ├── continueResearch/ # Autonomy decision (continue vs ask user)
62+
│ ├── discovery/ # Structure scientific discoveries from task results
63+
│ ├── fileUpload/ # File parsing (PDF, Excel, CSV, MD, JSON, TXT, OCR)
64+
│ ├── hypothesis/ # Hypothesis generation/updates from task outputs
65+
│ ├── literature/ # Literature search (OpenScholar, Knowledge, Edison, BioLit)
66+
│ ├── planning/ # Research plan/task generation
67+
│ ├── reflection/ # World state updates (objective, insights, methodology)
68+
│ └── reply/ # User-facing response generation
69+
├── chat-agent/ # Shared chat agent runtime (Claude Sonnet + tool use)
70+
│ ├── runner.ts # Main execution loop (in-process + queue modes)
71+
│ ├── loop.ts # Recursive message loop with tool execution
72+
│ ├── registry.ts # Tool registration
73+
│ └── tools/ # Chat agent tools (literature-search)
74+
├── services/
75+
│ ├── chat/ # Conversation setup, message tools, payments
76+
│ ├── deep-research/ # Deep research mode guard/validation
77+
│ ├── files/ # File upload URL generation, processing, status
78+
│ ├── paper/ # Paper generation (Markdown → Pandoc → LaTeX → PDF)
79+
│ ├── queue/ # BullMQ connection, queues, workers, notifications
80+
│ └── websocket/ # WebSocket handler, Redis pub/sub
81+
├── middleware/
82+
│ ├── authResolver.ts # Multi-method auth (JWT, API key, x402, b402)
83+
│ ├── rateLimiter.ts # Redis-backed rate limiting
84+
│ ├── x402/ # x402 payment validation (Base/USDC)
85+
│ └── b402/ # b402 payment validation (BNB/USDT)
86+
├── llm/ # LLM provider adapters (OpenAI, Anthropic, Google, OpenRouter)
87+
├── embeddings/ # Vector search and document processing
88+
├── mcp/ # MCP server integration
89+
├── db/ # Database operations (Supabase)
90+
├── storage/ # File storage (S3 with presigned URLs)
91+
├── types/ # TypeScript types + Zod schemas
92+
└── utils/ # Helpers (logger, cache, UUID, state, polyfills)
93+
```
94+
95+
## Running Modes
96+
97+
### In-Process (Default)
98+
```bash
99+
USE_JOB_QUEUE=false bun run dev
100+
```
101+
Jobs execute in the main process. Simpler for development.
102+
103+
### Queue Mode (Production)
104+
```bash
105+
# Terminal 1: API server
106+
USE_JOB_QUEUE=true bun run dev
107+
108+
# Terminal 2: Worker
109+
USE_JOB_QUEUE=true bun run worker:dev
110+
```
111+
Jobs queued in Redis, processed by separate workers. Supports horizontal scaling.
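
The two running modes above can be pictured as one dispatch decision. The following is a minimal sketch under stated assumptions: only the `USE_JOB_QUEUE` variable comes from the text, while `isQueueMode`, `dispatch`, the `Job` type, and the in-memory array standing in for Redis are hypothetical names that do not appear in the repository.

```typescript
// Hypothetical sketch: none of these names are the project's actual API.
type Env = Record<string, string | undefined>;
type Job = { id: string; run: () => Promise<string> };

// Assumption: only the literal string "true" enables queue mode.
function isQueueMode(env: Env): boolean {
  return env.USE_JOB_QUEUE === "true";
}

// In-memory stand-in for a Redis-backed queue.
const pendingJobs: Job[] = [];

async function dispatch(
  job: Job,
  env: Env,
): Promise<{ mode: "queued" | "in-process"; result?: string }> {
  if (isQueueMode(env)) {
    // Queue mode: hand the job off; a separate worker process would run it later.
    pendingJobs.push(job);
    return { mode: "queued" };
  }
  // In-process mode: run the job directly in the calling process.
  return { mode: "in-process", result: await job.run() };
}
```

Keeping the mode check in one small function means the API server and the worker can agree on the same rule, which is one plausible way to avoid the two processes disagreeing about where a job should run.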
112+
113+
## Key Environment Variables
114+
115+
See `.env.example` for full list. Key groups:
116+
117+
- **Auth**: `BIOAGENTS_SECRET`, `AUTH_MODE` (none/jwt), `UI_PASSWORD`
118+
- **LLM**: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY` + per-agent model config (`REPLY_LLM_PROVIDER`, `HYP_LLM_MODEL`, etc.)
119+
- **Chat Agent**: `CHAT_AGENT_MODEL`, `CHAT_AGENT_MAX_TOOL_CALLS`, `CHAT_AGENT_MAX_TOKENS`
120+
- **Database**: `SUPABASE_URL`, `SUPABASE_ANON_KEY`, `SUPABASE_SERVICE_KEY`
121+
- **Embeddings/RAG**: `EMBEDDING_PROVIDER`, `TEXT_EMBEDDING_MODEL`, `COHERE_API_KEY`, chunk/vector/reranking settings
122+
- **External Services**: `EDISON_API_URL`, `OPENSCHOLAR_API_URL`, `BIO_LIT_AGENT_API_URL`
123+
- **Storage**: `STORAGE_PROVIDER` (s3), `S3_BUCKET`, AWS credentials
124+
- **Queue**: `USE_JOB_QUEUE`, `REDIS_URL`, concurrency + rate limit settings
125+
- **Payments**: `X402_ENABLED`, `B402_ENABLED` + CDP credentials
126+
127+
## API Endpoints
128+
129+
### Core
130+
- `POST /api/chat` — Chat with AI agent
131+
- `GET /api/chat/status/:jobId` — Check chat job status (queue mode)
132+
- `POST /api/deep-research/start` — Start deep research session
133+
- `GET /api/deep-research/status/:messageId` — Check research status
134+
- `POST /api/deep-research/branch` — Fork research conversation
135+
- `GET /api/health` — Health check
8136

9-
### 1. Think before coding
137+
### Clarification (Pre-Research)
138+
- `POST /api/clarification/generate-questions` — Generate clarification questions
139+
- `POST /api/clarification/submit-answers` — Submit answers, create plan
140+
- `POST /api/clarification/plan-feedback` — Feedback on generated plan
141+
- `GET /api/clarification/:sessionId` — Get session state
10142

11-
Don't assume. Don't hide confusion. State ambiguity explicitly. Present multiple interpretations rather than silently picking one. Push back if a simpler approach exists. Stop and ask rather than guess.
143+
### Files
144+
- `POST /api/files/upload-url` — Request presigned upload URL
145+
- `POST /api/files/confirm` — Confirm upload, start processing
146+
- `GET /api/files/:fileId/status` — Processing status
147+
- `DELETE /api/files/:fileId` — Delete file
12148

13-
### 2. Simplicity first
149+
### Paper Generation
150+
- `POST /api/deep-research/conversations/:conversationId/paper` — Generate paper
151+
- `GET /api/deep-research/paper/:paperId` — Get paper with presigned URLs
152+
- `GET /api/deep-research/conversations/:conversationId/papers` — List papers
14153

15-
No features beyond what was asked. No abstractions for single-use code. No "flexibility" that wasn't requested. No error handling for impossible scenarios. The test: would a senior engineer say this is overcomplicated? If yes, rewrite it.
154+
### Payment-Gated
155+
- `POST /api/x402/chat` — Chat via Base/USDC
156+
- `POST /api/b402/chat` — Chat via BNB/USDT
16157

17-
### 3. Surgical changes
158+
### Admin
159+
- `/admin/queues` — Bull Board dashboard (when queue enabled)
18160

19-
Don't "improve" adjacent code. Don't refactor things that aren't broken. Match the existing style even if you'd do it differently. If you notice unrelated dead code, mention it, don't delete it. Every changed line should trace directly to the request.
161+
## Docker Deployment
20162

21-
### 4. Goal-driven execution
163+
### Production (with Job Queue)
164+
```bash
165+
docker compose up -d # API + Worker + Redis
166+
docker compose up -d --scale worker=3 # Scale workers
167+
```
22168

23-
Transform "fix the bug" into "write a test that reproduces it, then make it pass." Transform "add validation" into "write tests for invalid inputs, then make them pass." Give it success criteria and watch it loop until done.
169+
### Worker-Only
170+
```bash
171+
docker compose -f docker-compose.worker.yml up -d
172+
```
173+
174+
### Swarm Mode
175+
```bash
176+
docker compose -f docker-compose.swarm.yml ...
177+
```
24178

25179
---
26180

27-
## Project-Specific Guidance
181+
## Deep Research: The AI Scientist Framework
182+
183+
Deep Research is the PRIMARY way to use this agent. The agent behaves like a real scientist: iterative, methodical, hypothesis-driven.
184+
185+
### Iterative Workflow
186+
187+
```
188+
Planning → Execute Tasks → Hypothesis → Reflection → Discovery → Human Steering → Next Cycle
189+
```
190+
191+
Each cycle:
192+
1. **Planning** — Decides WHAT tasks to run based on current state + user input
193+
2. **Execution** — Runs LITERATURE and ANALYSIS tasks in parallel (external services)
194+
3. **Hypothesis** — Synthesizes outputs into scientific claims
195+
4. **Reflection** — Updates world state with insights, evolves objectives
196+
5. **Discovery** — Identifies novel claims, links to evidence
197+
6. **Human Steering** — User reviews, approves, or redirects
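
The six steps above form a loop that returns to planning after human steering. As a rough illustration only, the ordering can be written as a transition table; the stage names mirror the list, while `Stage`, `NEXT_STAGE`, and `runCycle` are hypothetical and not taken from the codebase.

```typescript
// Hypothetical sketch: stage names follow the list above; the rest is illustrative.
type Stage =
  | "planning"
  | "execution"
  | "hypothesis"
  | "reflection"
  | "discovery"
  | "humanSteering";

// Each stage maps to the one that follows it in a cycle.
const NEXT_STAGE: Record<Stage, Stage> = {
  planning: "execution",
  execution: "hypothesis",
  hypothesis: "reflection",
  reflection: "discovery",
  discovery: "humanSteering",
  humanSteering: "planning", // start of the next cycle
};

// Collects the stages of one full cycle, beginning at `start`.
function runCycle(start: Stage = "planning"): Stage[] {
  const visited: Stage[] = [start];
  let current = NEXT_STAGE[start];
  while (current !== start) {
    visited.push(current);
    current = NEXT_STAGE[current];
  }
  return visited;
}
```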
198+
199+
### Mini-Agent State Ownership
200+
201+
| Agent | Updates |
202+
|-------|---------|
203+
| Planning | Returns suggestions (no state mutation) |
204+
| Hypothesis | `currentHypothesis` |
205+
| Reflection | `currentObjective`, `keyInsights`, `methodology`, `conversationTitle` |
206+
| Discovery | `discoveries[]` |
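
One way to read the ownership table is as a write-permission map. The sketch below assumes simple string and string-array field types, which the document does not specify; the field names come from the table, and `WorldState`, `OWNED_FIELDS`, and `applyUpdate` are hypothetical helpers.

```typescript
// Hypothetical sketch: field names come from the table; the types are assumptions.
interface WorldState {
  currentHypothesis: string;
  currentObjective: string;
  keyInsights: string[];
  methodology: string;
  conversationTitle: string;
  discoveries: string[];
}

type Agent = "planning" | "hypothesis" | "reflection" | "discovery";

// Fields each mini-agent may write; planning only returns suggestions, so it owns none.
const OWNED_FIELDS: Record<Agent, ReadonlyArray<keyof WorldState>> = {
  planning: [],
  hypothesis: ["currentHypothesis"],
  reflection: ["currentObjective", "keyInsights", "methodology", "conversationTitle"],
  discovery: ["discoveries"],
};

// Returns a new state with the patch applied; throws if the agent writes a field it does not own.
function applyUpdate(
  state: WorldState,
  agent: Agent,
  patch: Partial<WorldState>,
): WorldState {
  for (const key of Object.keys(patch) as Array<keyof WorldState>) {
    if (!OWNED_FIELDS[agent].includes(key)) {
      throw new Error(`${agent} agent may not update ${key}`);
    }
  }
  return { ...state, ...patch };
}
```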
207+
208+
### Behavioral Mandates
209+
210+
- Update world state after every task completion — NEVER lose accumulated discoveries
211+
- Maintain traceability: claims → evidence → tasks → jobIds
212+
- User input ALWAYS overrides agent suggestions
213+
- Present clear next steps for user approval before execution
214+
- Every discovery MUST link to supporting evidence (taskId, jobId)
215+
- Each cycle MUST build meaningfully on prior work
216+
- LITERATURE and ANALYSIS tasks are executed by EXTERNAL services — handle failures gracefully
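
The evidence-linking mandate above lends itself to a mechanical check. This is a sketch under assumed data shapes: the `Discovery` and `Evidence` interfaces and the `findUntraceable` helper are hypothetical, and only the `taskId` and `jobId` names come from the mandates.

```typescript
// Hypothetical sketch: these shapes are assumptions for illustration.
interface Evidence {
  taskId: string;
  jobId: string;
}

interface Discovery {
  claim: string;
  evidence: Evidence[];
}

// Returns the claims that have no evidence entry carrying both a taskId and a jobId.
function findUntraceable(discoveries: Discovery[]): string[] {
  return discoveries
    .filter((d) => !d.evidence.some((e) => e.taskId !== "" && e.jobId !== ""))
    .map((d) => d.claim);
}
```

A check like this could run before a discovery is persisted, so that a claim without a traceable source is caught at write time instead of during review.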
217+
218+
## Known Issues
219+
220+
### TDZ (Temporal Dead Zone) in Workers
221+
222+
Bun workers have different module initialization. Module-level variables cause TDZ errors.
223+
224+
```typescript
// BAD: TDZ error in workers
const config = process.env.MY_VAR;

// GOOD: inside function
export async function doSomething() {
  const config = process.env.MY_VAR;
}

// GOOD: globalThis for singletons
let cache = (globalThis as any).__myCache;
if (!cache) {
  cache = new Map();
  (globalThis as any).__myCache = cache;
}
```
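
The two patterns above can be combined into a lazy accessor that reads configuration on first use and stores it on `globalThis`. This is only a sketch: `getConfig`, the `Config` shape, and the `__myConfig` key are hypothetical, and the environment is passed in as a parameter to keep the example self-contained.

```typescript
// Hypothetical sketch: names are illustrative and not part of the codebase.
type Config = { myVar: string | undefined };

// Reads the environment on the first call instead of at module load,
// then returns the same object on every later call.
function getConfig(env: Record<string, string | undefined>): Config {
  const holder = globalThis as unknown as { __myConfig?: Config };
  if (!holder.__myConfig) {
    holder.__myConfig = { myVar: env.MY_VAR };
  }
  return holder.__myConfig;
}
```

Because nothing is evaluated at module scope, importing the file never touches an uninitialized binding; the trade-off is that later changes to the environment are ignored once the value is cached.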
240+
241+
### Canvas Polyfill
242+
243+
`pdf-parse` requires canvas polyfills. Both `index.ts` and `worker.ts` MUST import the polyfill first:
244+
```typescript
245+
import "./utils/canvas-polyfill";
246+
```
247+
248+
## Related Documentation
249+
250+
- [AUTH.md](documentation/docs/AUTH.md) — Authentication (JWT, x402/b402 payments)
251+
- [SETUP.md](documentation/docs/SETUP.md) — Environment setup and LLM configuration
252+
- [JOB_QUEUE.md](documentation/docs/JOB_QUEUE.md) — BullMQ queue system architecture
253+
- [FILE_UPLOAD.md](documentation/docs/FILE_UPLOAD.md) — S3 presigned URL file upload flow
254+
255+
## Git & PRs
28256

29-
For codebase architecture, tech stack, commands, and project structure, see [CLAUDE.md](./CLAUDE.md).
257+
- Branch naming: `[initials]-[description]` (e.g., `ms-add-basic-code-quality`)
258+
- Biome lint/format and tests run in CI — do NOT include as manual test plan items
259+
- PR test plan: only manual verification steps
