# From the repo root (WSL or native Linux):
./quickstart.sh # dev mode — generates secrets, starts all services
./quickstart.sh --prod # production mode — strong secrets, no debug tools

Simulator → Kafka:9092 → Consumer → Postgres:5432 + ClickHouse:8123
↓
Rule Engine (18 rules) → Alerts → Kafka → WebSocket clients
↓
ML Pipeline (features → inference → baseline → content)
Gateway:50051 (Go, gRPC + Kafka)
uvicorn:8000 ← FastAPI API (REST + WS)
↕ gRPC
Trust Engine:50052 (Rust, in-memory graph)
Dashboard:3000 (Docker) / :5173 (Vite dev)
| Service | Port | Runtime | Logs |
|---|---|---|---|
| PostgreSQL | 5432 | Docker | docker logs phantex-postgres |
| Kafka | 9092 | Docker | docker logs phantex-kafka |
| Redis | 6379 | Docker | docker logs phantex-redis |
| ClickHouse | 8123 | Docker | docker logs phantex-clickhouse |
| Neo4j | 7687 | Docker | docker logs phantex-neo4j |
| Trust Engine | 50052 | Rust binary | /tmp/trust-engine.log |
| API (uvicorn) | 8000 | Python | /tmp/phantex-uvicorn.log |
| Kafka Consumer | — | Python | /tmp/phantex-consumer.log |
| Rule Engine | — | Python | /tmp/phantex-rule-engine.log |
| ML Features | — | Python | /tmp/phantex-main_features.log |
| ML Inference | — | Python | /tmp/phantex-main_inference.log |
| ML Baseline | — | Python | /tmp/phantex-main_baseline.log |
| ML Content | — | Python | docker logs phantex-ml-content |
| Gateway | 50051 | Go | docker logs phantex-gateway |
| Storage Writer | — | Go | docker logs phantex-storage-writer |
| Dashboard | 3000 | Docker (nginx) | docker logs phantex-dashboard |
| Kafka UI | 8080 | Docker | docker logs phantex-kafka-ui |
| Simulator | — | Python | /tmp/phantex-simulator.log |
quickstart.sh handles this automatically. Manual order:

1. Docker containers — `docker compose -f docker-compose.dev.yml up -d`
2. Trust Engine — needs port 50052 free
3. Kafka Consumer — writes events from Kafka to Postgres + ClickHouse
4. Rule Engine — reads events from Kafka, evaluates 18 PRL rules, fires alerts
5. ML Pipelines — feature extraction, inference, baseline, content analysis (4 processes)
6. uvicorn — API server; must set `TRUST_ENGINE_ADDR=localhost:50052`
7. Simulator — produces synthetic agent telemetry (2 events/sec, 8% attack)
uvicorn reads backend/.env via pydantic-settings (env_prefix=PHANTEX_):
PHANTEX_CLICKHOUSE_HOST=localhost
PHANTEX_NEO4J_URI=bolt://localhost:7687
PHANTEX_CORS_ORIGINS=["http://localhost:5173","http://localhost:3000"]

Other env vars are set in docker-compose.dev.yml:

- `DATABASE_URL` / `ADMIN_DATABASE_URL` — Postgres connections
- `KAFKA_BOOTSTRAP_SERVERS` — Kafka broker
- `REDIS_URL` — Redis
- `TRUST_ENGINE_ADDR` — gRPC address for the trust client
- `PHANTEX_INTERNAL_TOKEN` — shared secret for the gateway ↔ backend internal API (response action command relay). Set in `docker-compose.dev.yml`; defaults to `phantex-dev-internal-token` in dev. Must be a strong random value in production.
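For reference, a minimal sketch of the pydantic-settings pattern described above (field names and defaults here are illustrative, not the backend's actual Settings class):

```python
# Minimal sketch of the env loading pattern; field names and defaults are
# illustrative, not the backend's actual Settings class.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # env_prefix means PHANTEX_CLICKHOUSE_HOST populates clickhouse_host, etc.
    model_config = SettingsConfigDict(env_prefix="PHANTEX_", env_file="backend/.env")

    clickhouse_host: str = "localhost"
    neo4j_uri: str = "bolt://localhost:7687"
    # JSON-style list values in .env (as above) are parsed into Python lists.
    cors_origins: list[str] = ["http://localhost:5173", "http://localhost:3000"]

settings = Settings()
```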
SOC analyst triggers a response action on an alert → backend queues command in agent_commands table → gateway polls GET /internal/commands/pending/{sensor_id} → relays to sensor → sensor executes (isolate, block_ip, kill_process, quarantine, collect_forensics) → reports back via PATCH /internal/commands/{id}/status.
AlertDetailPage → POST /alerts/{id}/actions → agent_commands (DB)
↓ (gateway polls)
GET /internal/commands/pending/{sensor_id}
↓
Gateway → Sensor (heartbeat response)
↓
Sensor executor (allow-list, no shell)
↓
PATCH /internal/commands/{id}/status
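The "allow-list, no shell" step in the flow above could be sketched as follows; the command names come from the flow, while the binary paths and the `execute` signature are hypothetical:

```python
# Hypothetical sketch of an allow-list executor; real paths and signatures differ.
import subprocess

# Only pre-approved argv vectors; anything else is rejected outright.
ALLOWED_COMMANDS: dict[str, list[str]] = {
    "isolate": ["/opt/phantex/bin/isolate"],                      # hypothetical path
    "block_ip": ["/opt/phantex/bin/block-ip"],                    # hypothetical path
    "kill_process": ["/opt/phantex/bin/kill-process"],            # hypothetical path
    "quarantine": ["/opt/phantex/bin/quarantine"],                # hypothetical path
    "collect_forensics": ["/opt/phantex/bin/collect-forensics"],  # hypothetical path
}

def execute(command: str, args: list[str]) -> int:
    argv = ALLOWED_COMMANDS.get(command)
    if argv is None:
        raise ValueError(f"command not in allow-list: {command}")
    # shell=False: args are passed verbatim, never interpreted by a shell.
    return subprocess.run(argv + args, shell=False).returncode
```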
Internal API auth: requests carry the `PHANTEX_INTERNAL_TOKEN` value in a header, validated with a timing-safe `hmac.compare_digest()` comparison. ABAC permission: `alerts.execute_action` (separate from `alerts.acknowledge`).
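A minimal sketch of that check as a FastAPI dependency (the header name and settings import location are assumptions):

```python
# Sketch only; header name and settings import location are assumptions.
import hmac
from fastapi import Header, HTTPException
from app.config import settings  # assumed location of the Settings instance

async def require_internal_token(x_internal_token: str = Header(...)) -> None:
    expected = settings.internal_token  # loaded from PHANTEX_INTERNAL_TOKEN
    # compare_digest runs in constant time, so an attacker cannot use response
    # timing to guess the token byte by byte.
    if not hmac.compare_digest(x_internal_token.encode(), expected.encode()):
        raise HTTPException(status_code=401, detail="invalid internal token")
```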
Email: [email protected] # Set via PHANTEX_ADMIN_EMAIL env var
Password: changeme # Set via PHANTEX_ADMIN_PASSWORD env var
⚠️ Change default credentials immediately. The seed migration uses `changeme` — override via env vars in any non-local environment.
# Get a token:
curl -s -X POST http://localhost:8000/api/v1/auth/login \
-H 'Content-Type: application/json' \
-d '{"email":"'"${PHANTEX_ADMIN_EMAIL:-admin@phantex.dev}"'","password":"'"${PHANTEX_ADMIN_PASSWORD:-changeme}"'"}'The platform tracks deployed sensors (eBPF probes) via a dedicated sensors table and API.
Migration 030_sensors.sql creates the sensors table with RLS, grants, indexes, and CHECK constraints. Applied via:
# Preferred: use the migration runner
bash backend/migrations/migrate.sh up
# Or manually for a single migration:
docker cp backend/migrations/030_sensors.sql phantex-postgres:/tmp/030_sensors.sql
docker exec phantex-postgres psql -U phantex_admin -d phantex -f /tmp/030_sensors.sql

| Method | Path | Auth | Purpose |
|---|---|---|---|
| GET | `/api/v1/sensors` | JWT (user) | List sensors (cursor pagination, status/search filters) |
| GET | `/api/v1/sensors/{uuid}` | JWT (user) | Sensor detail |
| POST | `/api/internal/sensors/register` | Internal token | Gateway registers a sensor on connect |
| POST | `/api/internal/sensors/heartbeat` | Internal token | Gateway relays sensor heartbeat metrics |
Internal endpoints use timing-safe PHANTEX_INTERNAL_TOKEN auth via hmac.compare_digest.
- Sensors (`/sensors`) — Fleet overview with health badges, probe counts, event throughput, CPU/memory, live refresh.
- Sensor Detail (`/sensors/:id`) — Identity, health metrics, resources, diagnostics cards.
| File | Purpose |
|---|---|
| `backend/migrations/030_sensors.sql` | DDL, RLS, grants, indexes |
| `backend/app/models/sensor.py` | SQLAlchemy model |
| `backend/app/schemas/sensor.py` | Pydantic response/filter schemas |
| `backend/app/services/sensor_service.py` | CRUD + heartbeat + status refresh |
| `backend/app/routers/sensors.py` | Public REST API |
| `backend/app/routers/internal_sensors.py` | Internal gateway API |
| `dashboard/src/pages/SensorsPage.tsx` | Fleet list page |
| `dashboard/src/pages/SensorDetailPage.tsx` | Detail page |
| `dashboard/src/api/sensors.ts` | TanStack Query hooks |
File: backend/app/clickhouse.py
Symptom: /api/v1/analytics/* returned HTTP 503. ClickHouse client threw HttpClient.__init__() got an unexpected keyword argument 'ssl_context'.
Root cause: get_clickhouse() always passed ssl_context=None to clickhouse_connect.get_async_client() even when TLS was disabled. The library rejects None as a value.
Fix: Only pass ssl_context kwarg when the value is not None.
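The shape of the fix, roughly (the surrounding client kwargs are illustrative):

```python
# Sketch of the fix; host/port plumbing is illustrative.
import clickhouse_connect

async def get_clickhouse(host: str, port: int, ssl_context=None):
    kwargs = {"host": host, "port": port}
    # Only forward ssl_context when TLS is actually configured; passing
    # ssl_context=None makes the client reject the kwarg outright
    # ("HttpClient.__init__() got an unexpected keyword argument 'ssl_context'").
    if ssl_context is not None:
        kwargs["ssl_context"] = ssl_context
    return await clickhouse_connect.get_async_client(**kwargs)
```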
File: backend/app/services/trust_client.py
Symptom: /api/v1/trust/health returned {"status":"NOT_SERVING","uptime_secs":0.0} — the fallback response.
Root causes (3 bugs):
| Bug | Detail | Fix |
|---|---|---|
| grpcio not installed | `_grpc_available = False` at import time → fallback mode | Installed `grpcio==1.78.0` + `grpcio-tools` in the backend venv |
| Proto stubs not on sys.path | Generated `trust_pb2_grpc.py` does `from phantex.v1 import trust_pb2` — needs `proto/gen` on `sys.path` | Added path setup at the top of `trust_client.py` (resolves project root + `proto/gen`) |
| `health_check()` never connected | Checked `self._stub is None` and returned the fallback without calling `_ensure_connected()` first | Added `await self._ensure_connected()` before the stub check |
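The path setup for the second bug looks roughly like the sketch below; the `parents[3]` depth is an assumption about where `trust_client.py` sits relative to the repo root:

```python
# Sketch of the sys.path shim at the top of trust_client.py.
# parents[3] assumes backend/app/services/trust_client.py sits three levels
# below the repo root; adjust if the layout differs.
import sys
from pathlib import Path

_REPO_ROOT = Path(__file__).resolve().parents[3]
_PROTO_GEN = _REPO_ROOT / "proto" / "gen"
if str(_PROTO_GEN) not in sys.path:
    # Lets "from phantex.v1 import trust_pb2" in the generated stubs resolve.
    sys.path.insert(0, str(_PROTO_GEN))
```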
File: backend/app/services/trust_client.py
Symptom: Trust graph endpoint logged trust_client.retry attempt=1 error=Channel is closed 3 times, then trust_client.call_failed.
Root cause: grpc.aio.insecure_channel() returns a lazy channel that isn't ready yet. The stub call fired immediately before the TCP handshake completed.
Fix: Added await channel_ready() with timeout after channel creation. Waits for the channel to reach READY state before creating the stub.
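Roughly (the stub class name and timeout value are assumptions):

```python
# Sketch; stub class name and timeout value are assumptions.
import asyncio
import grpc
from phantex.v1 import trust_pb2_grpc  # generated stubs (see path setup above)

async def connect(addr: str):
    channel = grpc.aio.insecure_channel(addr)
    # Wait for the channel to reach READY before issuing RPCs; otherwise the
    # first call can race the handshake and fail with "Channel is closed".
    await asyncio.wait_for(channel.channel_ready(), timeout=5.0)
    return trust_pb2_grpc.TrustEngineStub(channel)  # hypothetical stub class
```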
File: backend/app/routers/ws.py
Symptom: Dashboard WebSocket connection threw TypeError: rate_limit() missing 1 required positional argument: 'request'.
Root cause: router = APIRouter(dependencies=[Depends(rate_limit)]) applies to all routes including WebSocket. rate_limit(request: Request) can't resolve a Request from a WebSocket scope — FastAPI injects WebSocket, not Request.
Fix: Removed the router-level `dependencies=[Depends(rate_limit)]` and applied it only to the two REST endpoints (POST /ws/ticket, GET /ws/status).
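The resulting router shape, sketched (handler bodies, the `rate_limit` import location, and the WebSocket path are illustrative):

```python
# Sketch; handler bodies, rate_limit import location, and WS path are illustrative.
from fastapi import APIRouter, Depends, Request, WebSocket
from app.deps import rate_limit  # assumed import location

router = APIRouter()  # no router-level rate_limit: it would bind to the WS route too

@router.post("/ws/ticket", dependencies=[Depends(rate_limit)])
async def create_ticket(request: Request): ...

@router.get("/ws/status", dependencies=[Depends(rate_limit)])
async def ws_status(request: Request): ...

@router.websocket("/ws")  # WebSocket scope has no Request, so no rate_limit here
async def ws_endpoint(websocket: WebSocket): ...
```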
Symptom: KafkaConnectionError: Unable to bootstrap from [('localhost', 9092)] on startup.
Root cause: the Kafka container takes a few seconds to become ready; aiokafka's built-in retry (5 s delay) reconnects automatically once it is.
Not a bug — graceful degradation working as designed. No code change needed.