TwinBench Press Kit

One-Line Description

TwinBench is the open benchmark for personal AI assistant runtimes.

Category Definition

TwinBench evaluates the runtime category behind persistent personal AI assistants: systems that remember across sessions, act autonomously, stay safe during background turns, and operate over time rather than only answering a single prompt.

Benchmark Principles

  • open and vendor-neutral
  • evidence-first
  • verified and projected scores separated clearly
  • unsupported behavior reported honestly
  • reference runtimes welcomed, not privileged

Why This Matters

The market has benchmarks for coding agents, memory recall, and task completion, but not for the full runtime behavior expected from a real personal AI assistant. TwinBench is an attempt to define that category publicly and make it measurable.

Reference Runtime

Nullalis is the current reference runtime because it demonstrates the full-stack behavior TwinBench is trying to name. It does not own the benchmark, and other runtimes should be able to beat it in public.

Where to Start

Nova Nuggets

TwinBench is published by Nova Nuggets, an AI innovation company building toward personal, secure, sovereign AI for everyone.

Nova Nuggets focuses on practical infrastructure and products for long-lived assistants. The benchmark should stay neutral and open while still making that mission visible to the people who discover it.