|
| 1 | +# BIOS Backend Architecture Review |
| 2 | + |
| 3 | +## Executive summary |
| 4 | + |
| 5 | +BIOS is still a Bun + Elysia monolith for chat and iterative deep research, backed by Supabase/Postgres (with pgvector), Redis/BullMQ, and S3. Since this review was first drafted, the branch has been rebased onto `dev` and PR #156 changes are now present: x402/b402 code was removed, TypeScript strictness was tightened, Biome + Husky were added, and CI now runs typecheck/tests. |
| 6 | + |
| 7 | +Shape today: **stronger baseline, same core orchestration risk**. The repository now has real quality infrastructure, but deep-research orchestration is still split across very large route/worker files, queue vs in-process behavior still diverges in important places, streaming is still absent at the API layer, and the core state model remains JSONB-centric. |
| 8 | + |
| 9 | +## Top findings (prioritized) |
| 10 | + |
| 11 | +1. **Deep-research orchestration is still duplicated across two very large files.** `src/routes/deep-research/start.ts` is 1919 lines and `src/services/queue/workers/deep-research.worker.ts` is 1387 lines. Both implement the same lifecycle (planning -> task execution -> hypothesis -> reflection/discovery -> continue/stop) with near-parallel logic branches. **Impact**: iteration behavior changes require dual edits; high regression risk during refactors. **Recommendation**: extract shared orchestration into `src/services/deep-research/orchestrator.ts` and keep route/worker as thin transport shells. |
| 12 | + |
| 13 | +2. **"No CI/tests" is resolved, but quality gates are still shallow around highest-risk paths.** CI now exists in `.github/workflows/ci.yml` (Biome on changed files for PRs, plus `bun run typecheck` and `bun test`), and there are 14 test files under `src/**/__tests__`. However, coverage is concentrated in utilities/adapters; there are still no route-level tests for `chat`, `deep-research/start`, `deep-research/status`, or queue/in-process parity. **Impact**: the most complex orchestration paths remain largely unguarded. **Recommendation**: add focused integration tests for deep-research start/status/retry and queue-vs-in-process parity checks. |
| 14 | + |
| 15 | +3. **Queue mode and in-process mode still diverge semantically.** Chat in-process goes straight through `runChatAgent` (`src/routes/chat.ts`), while queue chat defaults to a legacy planning/literature/hypothesis/reply pipeline unless `CHAT_AGENT_QUEUE_ENABLED=true` (`src/services/queue/workers/chat.worker.ts`). Deep research also keeps separate in-process and worker implementations. **Impact**: behavior can differ between local/dev and production depending on `USE_JOB_QUEUE` and feature flags. **Recommendation**: converge on one execution core per capability and use adapters for sync/async transport only. |
| 16 | + |
| 17 | +4. **No token-streaming API path yet (SSE/WebSocket payload streaming).** There is no `text/event-stream` route in `src/routes/`; status remains poll-based (`/api/chat/status/:jobId`, `/api/deep-research/status/:messageId`), and WebSocket uses notify-then-fetch metadata (`src/services/websocket/handler.ts`, `src/services/queue/notify.ts`). LLM adapters support streaming callbacks, but route/agent loop plumbing does not expose streamed tokens to clients. **Impact**: CLI/native clients cannot render progressive output. **Recommendation**: add SSE for `/api/chat` and deep-research updates; thread model deltas through chat/deep-research runners. |
| 18 | + |
| 19 | +5. **JSONB state blobs remain the primary domain model.** `states.values` and `conversation_states.values` are still unstructured JSONB in persistence terms. `ConversationStateValues` exists as a TypeScript interface (`src/types/core.ts`), but write-time runtime validation is not enforced in `updateConversationState`/`updateState` (`src/db/operations.ts`). `cleanValues` was improved and extracted (`src/db/cleanValues.ts`) with tests, but this is payload trimming, not schema enforcement. **Impact**: malformed state can still persist and couple many agents/routes implicitly. **Recommendation**: add Zod runtime validation at DB write boundaries and introduce versioned state migrations for breaking shape changes. |
| 20 | + |
| 21 | +6. **`src/db/setup.sql` is still drifted and unsafe as a setup source.** It drops/recreates only 5 tables (`users`, `conversations`, `messages`, `states`, `conversation_states`) while current schema evolves through `supabase/migrations/*.sql` (including `paper`, `token_usage`, `clarification_sessions`, `documents`, branching lineage, and other tables). **Impact**: new engineers using `setup.sql` get an incomplete/outdated DB shape. **Recommendation**: retire `src/db/setup.sql` from onboarding and point setup exclusively to Supabase migrations. |
| 22 | + |
| 23 | +7. **Service-role DB access is centralized but still leaks outside `src/db`.** Core DB operations use `getServiceClient()` (bypassing RLS intentionally), and there are still direct `supabase.from(...)` calls outside `src/db` in `src/routes/deep-research/branch.ts`, `src/routes/deep-research/paper.ts`, and `src/services/paper/generatePaper.ts`. Ownership checks are present in these flows, but enforcement is by convention, not architecture. **Impact**: security relies on every caller consistently implementing authorization checks. **Recommendation**: move all table operations behind `src/db/*` and keep route/service layers authorization-only. |
| 24 | + |
| 25 | +8. **`/api/auth/login` remains single-user by design.** `src/routes/auth.ts` still mints JWTs for a fixed UUID (`550e8400-e29b-41d4-a716-446655440000`) when UI password auth is used. **Impact**: browser login path is effectively single-tenant and not a true account model. **Recommendation**: decide and implement a real identity flow (external IdP JWT issuance, or first-class user auth model) before scaling to multi-user frontend usage. |
| 26 | + |
| 27 | +9. **Admin queue dashboard protection improved in production, but can still be open in non-production queue deployments.** `createQueueDashboard()` now disables dashboard mounting in production when `ADMIN_PASSWORD` is missing (`src/routes/admin/queue-dashboard.ts`). In non-production, if queue mode is on and password is unset, `/admin/queues` is mounted without basic auth (`src/index.ts`). **Impact**: staging/dev environments can unintentionally expose job payloads if reachable. **Recommendation**: require explicit opt-in for unauthenticated dashboard access, or require admin auth in all environments by default. |
| 28 | + |
| 29 | +10. **Rate limiter still no-ops when `USE_JOB_QUEUE=false`.** `checkRateLimit` and `rateLimitMiddleware` return early if queue mode is disabled (`src/middleware/rateLimiter.ts`). **Impact**: local and some deployment modes can run without throttling, masking abuse/perf behavior. **Recommendation**: add in-memory fallback limits when Redis/BullMQ is off, with startup warnings when running unthrottled. |
| 30 | + |
| 31 | +## Action matrix |
| 32 | + |
| 33 | +| Finding | Priority | Effort | Risk reduction | Suggested owner | Dependency | |
| 34 | +| --- | --- | --- | --- | --- | --- | |
| 35 | +| #1 Deep-research orchestration duplication | P0 | L | High | Backend platform | Align on target orchestrator API | |
| 36 | +| #2 High-risk flow test depth gaps | P0 | M | High | QA + backend | #1/#3 test seams defined | |
| 37 | +| #3 Queue vs in-process behavior drift | P0 | L | High | Backend platform | #1 shared execution core | |
| 38 | +| #4 Missing streaming transport | P1 | L | High | API + frontend | #1/#3 core convergence helps | |
| 39 | +| #5 No runtime schema validation for state JSONB | P1 | M | High | Data/backend | Agree state schema/versioning approach | |
| 40 | +| #6 `setup.sql` schema drift | P1 | S | Medium | Developer experience | Docs/setup update approval | |
| 41 | +| #7 Direct Supabase table access outside `src/db` | P1 | M | Medium | Backend platform | Define `src/db` ownership boundaries | |
| 42 | +| #8 Single-user `/api/auth/login` shim | P2 | M | Medium | Product + auth/backend | Decide identity strategy (IdP vs native) | |
| 43 | +| #9 Admin dashboard can be open in non-prod | P2 | S | Medium | Infra/backend | Policy decision for non-prod access | |
| 44 | +| #10 Rate limit disabled when queue disabled | P2 | S | Medium | API/backend | Select in-memory limiter behavior | |
| 45 | + |
| 46 | +### Suggested execution order |
| 47 | + |
| 48 | +1. **Stabilize execution core**: #1 and #3 together. |
| 49 | +2. **Lock confidence**: #2 immediately after initial extraction milestones. |
| 50 | +3. **Unblock multi-frontend UX**: #4 in parallel once core interfaces are stable. |
| 51 | +4. **Harden data/contracts**: #5 and #7. |
| 52 | +5. **Clean operational hygiene**: #6, #9, #10. |
| 53 | +6. **Finalize product auth direction**: #8. |
| 54 | + |
| 55 | +## Repository structure |
| 56 | + |
| 57 | +### What's there |
| 58 | + |
| 59 | +Top-level structure remains coherent for a Bun monolith: `src/`, `client/`, `supabase/migrations/`, docs, compose files, and a single Dockerfile. `src/` keeps clear folders (`routes`, `services`, `agents`, `middleware`, `db`, `llm`, `chat-agent`, etc.). x402/b402 route/middleware trees are gone. |
| 60 | + |
| 61 | +### Assessment |
| 62 | + |
| 63 | +Layering exists but orchestration still lives primarily in route/worker files: |
| 64 | + |
| 65 | +- `src/routes/deep-research/start.ts` (1919 lines) |
| 66 | +- `src/services/queue/workers/deep-research.worker.ts` (1387 lines) |
| 67 | +- `src/routes/deep-research/paper.ts` (676 lines) |
| 68 | +- `src/routes/chat.ts` (620 lines) |
| 69 | +- `src/services/queue/workers/chat.worker.ts` (588 lines) |
| 70 | +- `src/routes/clarification.ts` (457 lines) |
| 71 | + |
| 72 | +`src/services/deep-research/` currently contains `run-guard.ts` only; there is still no extracted orchestration core. Chat has some reuse (`runChatAgent`) but also keeps mode-specific behavior drift (finding #3). |
| 73 | + |
| 74 | +### Recommendations |
| 75 | + |
| 76 | +1. Extract deep-research orchestration to shared service modules and remove route/worker duplication. |
| 77 | +2. Unify chat queue/in-process behavior (single execution core, transport wrappers only). |
| 78 | +3. Move direct Supabase table calls from route/service files into `src/db` operations. |
| 79 | +4. Keep route files as transport/adaptation layers; target <200 lines for most handlers. |
| 80 | +5. Remove `src/db/setup.sql` from setup docs and scripts. |
| 81 | + |
| 82 | +## Deployment & runtime requirements |
| 83 | + |
| 84 | +### What's there |
| 85 | + |
| 86 | +`Dockerfile` builds app + client and installs TeX/Pandoc dependencies needed for paper generation. CI now has: |
| 87 | + |
| 88 | +- `.github/workflows/ci.yml` (lint/format on PR changed files, typecheck, tests, PR docker smoke build) |
| 89 | +- `.github/workflows/deploy-worker.yml` (build/push and deployment webhook on `dev`/`main` pushes) |
| 90 | + |
| 91 | +Compose/swarm deployment files are present for API + worker topologies. |
| 92 | + |
| 93 | +### Assessment |
| 94 | + |
| 95 | +Quality posture is better than before, but runtime/ops risks remain: |
| 96 | + |
| 97 | +- `docker-compose.swarm.yml` still defaults to `biosagent/bios:latest`. |
| 98 | +- `docker-compose.yml` still configures Redis with `--maxmemory-policy noeviction`. |
| 99 | +- Worker healthcheck remains `pgrep -f worker` (process liveness, not queue processing health). |
| 100 | +- Security settings still favor warnings over fail-closed behavior in several startup paths. |
| 101 | + |
| 102 | +### Recommendations |
| 103 | + |
| 104 | +1. Pin deploy images to immutable SHA tags in swarm configs. |
| 105 | +2. Revisit Redis eviction policy for queue robustness under memory pressure. |
| 106 | +3. Replace process-only worker healthchecks with queue-aware health probes. |
| 107 | +4. Add targeted integration tests for deep-research and queue workflows to CI. |
| 108 | + |
| 109 | +## Persistence layers |
| 110 | + |
| 111 | +### What's there |
| 112 | + |
| 113 | +- **Postgres (Supabase)** via migration files under `supabase/migrations/`. |
| 114 | +- **Redis (BullMQ + pub/sub + rate limiter)** for queueing and notifications. |
| 115 | +- **S3-compatible storage** through `src/storage/` provider abstraction. |
| 116 | + |
| 117 | +x402 tables are now removed by migration (`20260414000000_drop_x402_tables.sql`). |
| 118 | + |
| 119 | +### Assessment |
| 120 | + |
| 121 | +Persistence architecture is functional but still schema-contract fragile at the state boundary: |
| 122 | + |
| 123 | +- `ConversationStateValues`/`StateValues` are TypeScript-only contracts; DB writes do not enforce runtime schema. |
| 124 | +- `cleanValues` now has dedicated tests and cleaner isolation, but it is still a manual strip list. |
| 125 | +- RLS is enabled only on newer tables (`token_usage`, `clarification_sessions`) with permissive policies; core conversation/message/state tables still rely on service-role access + middleware checks. |
| 126 | +- `src/db/setup.sql` remains drifted from migration-driven reality. |
| 127 | + |
| 128 | +### Recommendations |
| 129 | + |
| 130 | +1. Add runtime schema validation before persisting state JSONB. |
| 131 | +2. Version state payloads and add migration helpers for shape changes. |
| 132 | +3. Consolidate table access behind `src/db` boundary. |
| 133 | +4. Keep Supabase migrations as the only schema source of truth. |
| 134 | + |
| 135 | +## Readiness for CLI and macOS frontends |
| 136 | + |
| 137 | +**Auth**: header-based auth is now JWT/API key/anonymous (no x402/b402). This fits CLI/native clients. Remaining gap is single-user UI login shim (`/api/auth/login`). |
| 138 | + |
| 139 | +**Streaming UX**: still missing at transport layer. Clients rely on poll status + notify/fetch patterns. |
| 140 | + |
| 141 | +**Contracts**: no generated OpenAPI/Swagger spec; response contracts are mostly implicit per-handler. |
| 142 | + |
| 143 | +**Orchestration reuse**: deep-research logic is transport-coupled (route/worker), making SDK-level reuse harder for additional frontends. |
| 144 | + |
| 145 | +**Bundling**: backend still serves SPA by default via catch-all in `src/index.ts`; no explicit headless server toggle. |
| 146 | + |
| 147 | +**Verdict**: onboarding/quality baseline is improved, but frontend expansion still depends on (1) orchestration extraction, (2) streaming responses, and (3) stronger state-contract enforcement. |
| 148 | + |
| 149 | +## Appendix: notable files and line references |
| 150 | + |
| 151 | +- `src/routes/deep-research/start.ts` - 1919 lines; in-process deep-research orchestration + queue enqueueing. |
| 152 | +- `src/services/queue/workers/deep-research.worker.ts` - 1387 lines; queue-mode deep-research orchestration. |
| 153 | +- `src/routes/chat.ts` - dual-mode chat handler. |
| 154 | +- `src/services/queue/workers/chat.worker.ts` - queue chat path with feature-flagged legacy vs agent-loop execution. |
| 155 | +- `src/chat-agent/runner.ts` and `src/chat-agent/loop.ts` - reusable chat agent core (currently non-streaming route integration). |
| 156 | +- `.github/workflows/ci.yml` - current CI quality gates. |
| 157 | +- `.github/workflows/deploy-worker.yml` - deployment workflow. |
| 158 | +- `src/middleware/rateLimiter.ts` - queue-disabled early-return behavior. |
| 159 | +- `src/routes/auth.ts` - fixed UUID login flow. |
| 160 | +- `src/routes/admin/queue-dashboard.ts` and `src/index.ts` - admin dashboard exposure rules. |
| 161 | +- `src/db/operations.ts` and `src/db/cleanValues.ts` - JSONB write path and cleanup behavior. |
| 162 | +- `src/db/setup.sql` vs `supabase/migrations/*.sql` - schema drift source. |
0 commit comments