><strong style="color: #ffe7ce">Sep. 10, 2025:</strong></span> We're excited to announce the release of <a href="https://huggingface.co/datasets/birdsql/livesqlbench-base-full-v1" target="_blank"><strong>LiveSQLBench-Base-Full-V1 (600)</strong></a>! It is the first text-to-SQL benchmark covering the full SQL spectrum with a Hierarchical Knowledge Base (HKB) and test cases.
We provide two types of queries, normal and colloquial, so that users can test according to their own needs. The flagship model Gemini-2.5-pro achieves only <strong>28.67</strong> on colloquial queries and <strong>35.67</strong> on normal queries. The <a href="https://huggingface.co/datasets/birdsql/livesqlbench-base-lite" target="_blank">base-lite</a> and <a href="https://huggingface.co/datasets/birdsql/livesqlbench-base-full-v1" target="_blank">base-full-v1</a> releases will be locked versions for the development of research methods.
Detailed performance results are available on our <a href="https://livesqlbench.ai/" target="_blank">website</a>.