TwinBench is the open benchmark for personal AI assistant runtimes. It evaluates the runtime category behind persistent personal AI assistants: systems that remember across sessions, act autonomously, stay safe during background turns, and operate over time rather than only answering a single prompt.
- open and vendor-neutral
- evidence-first
- verified and projected scores separated clearly
- unsupported behavior reported honestly
- reference runtimes welcomed, not privileged
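To make the "verified and projected scores separated clearly" principle concrete, here is a minimal sketch of what a result record could look like. This is purely illustrative: TwinBench does not publish this schema, and the class, field, and capability names (`RuntimeResult`, `memory_recall`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical sketch only: not a published TwinBench schema.
# It shows one way to keep verified scores, projected scores,
# and honestly-reported unsupported behavior in separate buckets.

@dataclass
class RuntimeResult:
    runtime: str
    verified: dict[str, float] = field(default_factory=dict)   # backed by evidence
    projected: dict[str, float] = field(default_factory=dict)  # estimated, not yet verified
    unsupported: list[str] = field(default_factory=list)       # capabilities reported as absent

    def report(self) -> dict:
        # Never collapse the three buckets into one headline number.
        return {
            "runtime": self.runtime,
            "verified": dict(self.verified),
            "projected": dict(self.projected),
            "unsupported": list(self.unsupported),
        }

# Hypothetical runtime and capability names, for illustration only.
result = RuntimeResult(
    runtime="ExampleRuntime",
    verified={"memory_recall": 0.82},
    projected={"background_safety": 0.70},
    unsupported=["autonomous_scheduling"],
)
print(result.report())
```

The point of the design is that a reader can never mistake a projected number for a verified one, because they never share a field.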
The market has benchmarks for coding agents, memory recall, and task completion, but not for the full runtime behavior expected from a real personal AI assistant. TwinBench is an attempt to define that category publicly and make it measurable.
Nullalis is the current reference runtime because it demonstrates the full-stack behavior TwinBench is trying to name. Nullalis does not own the benchmark, and it should be beatable in public.
TwinBench is published by Nova Nuggets, an AI innovation company building toward personal, secure, sovereign AI for everyone.
Nova Nuggets focuses on practical infrastructure and products for long-lived assistants. The benchmark itself should stay neutral and open, while making that mission visible to the people who discover it.