feichai0017/QuillSQL

QuillSQL

QuillSQL is a frontend-agnostic Arrow/MLIR query compiler and JIT execution engine. DataFusion is the default frontend today; other frontends such as DuckDB, Spark, or Substrait can be added through adapters that lower their plans into the same PipelineGraph.

License: MIT

Frontend adapters, Arrow batches, and an MLIR JIT execution core.

Architecture

flowchart LR
    F["Frontend adapters\nDataFusion default\nDuckDB / Spark future"]
    P["quill-plan\nPipelineGraph"]
    J["quill-jit\ncompile + select"]
    M["quill-mlir\nMLIR dialect + passes"]
    R["quill-runtime\nArrow kernels"]
    O["Arrow RecordBatch"]

    F --> P --> J --> M --> R --> O
    P --> R

The stable boundary is PipelineGraph, not DataFusion. A frontend adapter owns plan inspection and replacement. QuillSQL owns the neutral pipeline model, MLIR lowering, Arrow runtime, and compiled execution path.
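The boundary described above can be pictured with a small sketch. Everything in it is hypothetical: `Stage`, `PipelineGraph`, `FrontendAdapter`, `ToyPlan`, and `ToyAdapter` are simplified stand-ins invented for illustration, not the real quill-plan or quill-jit types. It only shows the division of labor: the adapter inspects a frontend plan and either lowers it into a neutral graph or declines, in which case the frontend keeps its own operator.

```rust
// Hypothetical, simplified stand-ins. These are NOT the real quill-plan types.
#[derive(Debug, Clone, PartialEq)]
pub enum Stage {
    Filter { column: String, min_exclusive: i64 },
    PlainSum { column: String },
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct PipelineGraph {
    pub stages: Vec<Stage>,
}

/// Hypothetical adapter contract: inspect a frontend plan and either lower it
/// into a neutral graph or return None so the frontend keeps the plan as-is.
pub trait FrontendAdapter {
    type Plan;
    fn lower(&self, plan: &Self::Plan) -> Option<PipelineGraph>;
}

/// A toy frontend plan tree, standing in for a real engine's physical plan.
#[derive(Debug, Clone)]
pub enum ToyPlan {
    Scan,
    Filter { input: Box<ToyPlan>, column: String, min_exclusive: i64 },
    Sum { input: Box<ToyPlan>, column: String },
    Sort { input: Box<ToyPlan> },
}

pub struct ToyAdapter;

impl FrontendAdapter for ToyAdapter {
    type Plan = ToyPlan;

    fn lower(&self, plan: &ToyPlan) -> Option<PipelineGraph> {
        let mut stages = Vec::new();
        let mut node = plan;
        // Walk from the plan root down to the scan, collecting stages.
        loop {
            match node {
                ToyPlan::Scan => break,
                ToyPlan::Filter { input, column, min_exclusive } => {
                    stages.push(Stage::Filter {
                        column: column.clone(),
                        min_exclusive: *min_exclusive,
                    });
                    node = &**input;
                }
                ToyPlan::Sum { input, column } => {
                    stages.push(Stage::PlainSum { column: column.clone() });
                    node = &**input;
                }
                // Unsupported operator: decline the whole plan.
                ToyPlan::Sort { .. } => return None,
            }
        }
        // Stages were collected root-first; execution order is scan-first.
        stages.reverse();
        Some(PipelineGraph { stages })
    }
}
```

In this sketch a second frontend would add another `FrontendAdapter` implementation and leave everything behind `PipelineGraph` untouched.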

Operator fusion is designed to happen in MLIR lowering passes. PipelineGraph keeps semantic operators such as filter, project, plain_sum, and group_aggregate; the lowering layer turns supported shapes into fused loops. That keeps the system extensible without adding query-specific fused operators.
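To make "fused loops" concrete, here is a plain-Rust illustration of the idea, not the MLIR lowering itself. The function names and the `price * discount` expression are assumptions chosen for the example. The first function runs filter, project, and plain_sum as separate steps with materialized intermediates; the second computes the same result in a single pass, which is the shape a fusing lowering pass aims to produce.

```rust
/// Operator-at-a-time: each semantic operator materializes its output.
pub fn unfused_sum(price: &[f64], discount: &[f64], max_discount: f64) -> f64 {
    assert_eq!(price.len(), discount.len());
    // filter: collect the indices of rows that pass the predicate.
    let selected: Vec<usize> = (0..price.len())
        .filter(|&i| discount[i] <= max_discount)
        .collect();
    // project: evaluate price * discount for the selected rows.
    let projected: Vec<f64> = selected.iter().map(|&i| price[i] * discount[i]).collect();
    // plain_sum: reduce the projected column.
    projected.iter().sum()
}

/// Fused: predicate, expression, and accumulation share one loop,
/// with no intermediate buffers.
pub fn fused_sum(price: &[f64], discount: &[f64], max_discount: f64) -> f64 {
    assert_eq!(price.len(), discount.len());
    let mut acc = 0.0;
    for (p, d) in price.iter().zip(discount) {
        if *d <= max_discount {
            acc += p * d;
        }
    }
    acc
}
```

The semantic operators stay separate in the plan; only the generated loop merges them, which is why no query-specific fused operator is needed.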

Packages

| Package | Role |
| --- | --- |
| quill-sql | CLI, server, benchmarks, and release metadata. |
| quill-core | Public Database API and DataFusion-backed shell integration. |
| quill-df | Default DataFusion frontend adapter and CompiledPipelineExec. |
| quill-plan | Frontend-neutral PipelineGraph, expressions, types, stages, and sinks. |
| quill-jit | JIT orchestration, frontend adapter trait, dialect emission, and MLIR backend. |
| quill-runtime | Arrow binding, safety checks, fixed-width kernels, and result materialization. |
| quill-mlir | Optional C++/TableGen MLIR dialect and lowering pass package. |

Why quill-mlir Is Native

Rust builds pipeline graphs, integrates with DataFusion, and manages Arrow batches. The formal MLIR dialect, TableGen operation definitions, verifiers, and pass registration live in quill-mlir because those are native MLIR C++ API surfaces. Rust calls into that package through melior and the jit-mlir feature.

Current compiled coverage is intentionally narrow: fixed-width filter -> project -> record_batch, f64 filter -> SUM, and Q6-shaped Date32/Decimal128 filter -> SUM. Unsupported expressions or unsafe Arrow layouts stay on the safe Rust runtime or DataFusion path.
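The fallback rule can be sketched as a small selection function. This is a deliberately coarse approximation of the coverage list above, and every name in it (`ColumnType`, `Sink`, `ExecPath`, `select_path`) is hypothetical; the real selection logic in quill-jit may check more, such as expression shapes and Arrow layout safety.

```rust
// Hypothetical types for illustration; not the quill-jit API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ColumnType {
    Float64,
    Date32,
    Decimal128,
    Utf8,
}

impl ColumnType {
    /// In this sketch, only Utf8 is treated as variable-width.
    fn is_fixed_width(self) -> bool {
        !matches!(self, ColumnType::Utf8)
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Sink {
    RecordBatch,
    /// SUM over a column of the given type.
    Sum(ColumnType),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExecPath {
    Compiled,
    Fallback,
}

/// Pick the compiled path only for shapes on the allow-list;
/// anything else stays on the safe fallback path.
pub fn select_path(inputs: &[ColumnType], sink: Sink) -> ExecPath {
    if !inputs.iter().all(|c| c.is_fixed_width()) {
        return ExecPath::Fallback;
    }
    match sink {
        Sink::RecordBatch => ExecPath::Compiled,
        Sink::Sum(ColumnType::Float64) | Sink::Sum(ColumnType::Decimal128) => ExecPath::Compiled,
        Sink::Sum(_) => ExecPath::Fallback,
    }
}
```

The design point is that the default answer is the fallback: a shape must be positively recognized before it is compiled.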

Quick Start

cargo run --bin client

# keep DataFusion scratch state under a chosen directory
cargo run --bin client -- --data-dir .quillsql-data

# start web server at http://127.0.0.1:8080
cargo run --bin server

Sample session:

CREATE TABLE t AS SELECT 1 AS id, 10 AS v;
INSERT INTO t VALUES (2, 20), (3, 30);

SELECT id, v FROM t WHERE v > 10 ORDER BY id DESC LIMIT 1;
EXPLAIN SELECT id, COUNT(*) FROM t GROUP BY id ORDER BY id;

Parquet datasets can be registered through Database::register_parquet:

db.register_parquet("events", "/data/events.parquet").await?;
let out = db.run("SELECT count(*) FROM events WHERE user_id IS NOT NULL").await?;

JIT And Benchmarks

# Default checks.
cargo test
cargo clippy --all-targets -- -D warnings
cargo bench --no-run

# MLIR path. Adjust prefixes for your local LLVM/MLIR install.
MLIR_SYS_220_PREFIX=/opt/homebrew/opt/llvm \
LLVM_SYS_220_PREFIX=/opt/homebrew/opt/llvm \
cargo test --features jit-mlir

QUILL_JIT=mlir \
MLIR_SYS_220_PREFIX=/opt/homebrew/opt/llvm \
LLVM_SYS_220_PREFIX=/opt/homebrew/opt/llvm \
cargo bench --features jit-mlir --bench tpch -- q6_scan_filter_aggregate

Benchmark harnesses:

  • jit_micro: lowering, compile, kernel, pipeline, and small SQL paths.
  • tpch: Q6/Q1/Q3 analytical ladder over generated or external Parquet data.

Useful knobs:

  • QUILL_JIT=off: pure DataFusion baseline.
  • QUILL_JIT=runtime: Quill Arrow runtime without executable MLIR.
  • QUILL_JIT=mlir: executable MLIR kernels when built with jit-mlir.
  • QUILL_TPCH_SF=1: generate SF1 TPC-H data.
  • QUILL_TPCH_DIR=/path/to/tpch-parquet: use an existing Parquet dataset.
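As an illustration of how the three QUILL_JIT values map onto execution modes, here is a minimal sketch of parsing the variable. `JitMode`, `parse_jit_mode`, and `jit_mode_from_env` are hypothetical names, and the case-insensitive matching is an assumption; the sketch returns `None` for unset or unrecognized values rather than guessing what QuillSQL's default is.

```rust
// Hypothetical names for illustration; not the QuillSQL API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JitMode {
    /// Pure DataFusion baseline.
    Off,
    /// Quill Arrow runtime without executable MLIR.
    Runtime,
    /// Executable MLIR kernels (requires a jit-mlir build).
    Mlir,
}

/// Parse a QUILL_JIT value. Case-insensitive matching is an assumption.
pub fn parse_jit_mode(value: &str) -> Option<JitMode> {
    match value.trim().to_ascii_lowercase().as_str() {
        "off" => Some(JitMode::Off),
        "runtime" => Some(JitMode::Runtime),
        "mlir" => Some(JitMode::Mlir),
        _ => None,
    }
}

/// Read QUILL_JIT from the environment; None if unset or unrecognized.
pub fn jit_mode_from_env() -> Option<JitMode> {
    std::env::var("QUILL_JIT").ok().and_then(|v| parse_jit_mode(&v))
}
```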

Scope

QuillSQL is no longer a teaching page-store database. It does not ship a custom buffer pool, WAL, table heap, index manager, storage engine, SQL transaction layer, or external KV adapter. The project is now focused on query compilation: frontend plans in, Arrow batches out.

Community

Discord: https://discord.gg/dJqa4RYW65
