aalhour/beachdb


BeachDB is a toy distributed NoSQL database. Built for learning and education, not production.

It starts life as a small, inspectable storage engine, then deliberately grows “real-system bones”: a server API, a failure model, and a Raft-replicated core. The point isn’t to win benchmarks — it’s to understand, measure, and explain what’s actually happening.

Backstory

I’ve been fond of distributed systems and databases for a long time. I wrote my first Hadoop and Apache Spark pipeline back in 2016, then went on to solve hairy stream-processing problems at Shopify, and later worked on Apache HBase at HubSpot where I helped build and operate database infrastructure on top of Kubernetes at massive scale.

BeachDB is my attempt to re-learn the fundamentals by building them from scratch in Go. I’m prioritizing simplicity, clarity, and understanding over scalability, speed, and micro-optimizations.

Architecture

  • LSM storage engine (WAL → memtable → SSTables → compaction)
  • Single-node API (server wrapper for Get/Put/Delete/Scan with timeouts + backpressure)
  • Distributed replication with Raft (single group: leader writes + leader reads; log entry == WriteBatch)
  • Inspectability-first (dump tools + crash tests as part of the architecture)
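
To make the contract behind this architecture concrete, here is a hypothetical sketch of the engine's API surface with a map-backed stand-in. The `Engine` interface and `memEngine` type are assumptions for illustration, not BeachDB's actual code; a real engine would layer the WAL, memtable, and SSTables behind the same surface.

```go
package main

import "fmt"

// Engine is a hypothetical sketch of the key-value contract implied
// above (Scan omitted for brevity); BeachDB's real interface may differ.
type Engine interface {
	Put(key, value []byte) error
	Get(key []byte) (value []byte, found bool, err error)
	Delete(key []byte) error
}

// memEngine is a trivial map-backed stand-in, useful only to
// illustrate the contract, not the layered LSM implementation.
type memEngine struct{ m map[string][]byte }

func newMemEngine() *memEngine { return &memEngine{m: map[string][]byte{}} }

func (e *memEngine) Put(k, v []byte) error {
	e.m[string(k)] = append([]byte(nil), v...) // copy: callers may reuse v
	return nil
}

func (e *memEngine) Get(k []byte) ([]byte, bool, error) {
	v, ok := e.m[string(k)]
	return v, ok, nil
}

func (e *memEngine) Delete(k []byte) error {
	delete(e.m, string(k))
	return nil
}

func main() {
	var db Engine = newMemEngine()
	db.Put([]byte("beach"), []byte("db"))
	v, ok, _ := db.Get([]byte("beach"))
	fmt.Println(ok, string(v)) // true db
}
```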

Key features (shipped as a checklist)

This list is ordered to match the build + blog sequence. I’ll tick these off as they land.

Engine (storage truth)

  • Scope + semantics contract (snapshots, iterators, durability), see: intro blog post
  • WAL v1: checksums + deterministic crash recovery (fsync per committed batch), see: durability blog post
  • Crash-loop harness: kill mid-write, reopen, validate invariants
  • Memtable v1: sorted structure + tombstones, see: memtable blog post
  • Reference-model randomized tests (model vs implementation)
  • SSTables v1: immutable sorted files + sst_dump, see: sstables blog post
  • Merge iterators (memtable + SSTs) + snapshot reads (seqno-based)
  • Manifest/versioning + manifest_dump (startup reconstruction)
  • Read path acceleration: block index + bloom filters + benchmark evidence
  • Compaction v1: one strategy, minimal knobs + amplification measurements
  • Adversarial testing: fault injection + fuzzing (WAL/SST decode paths)
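
The "WAL v1: checksums + deterministic crash recovery" item above can be sketched as a checksummed, length-framed record. The layout here (`[len u32][crc32 u32][payload]`) is an assumed format for illustration, not BeachDB's actual on-disk encoding; the point is that recovery stops at the first record whose checksum fails, which is how a torn write at the tail is detected deterministically.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

var errCorrupt = errors.New("wal: corrupt record")

// encodeRecord frames a payload as [len u32][crc32 u32][payload].
func encodeRecord(payload []byte) []byte {
	buf := make([]byte, 8+len(payload))
	binary.LittleEndian.PutUint32(buf[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(buf[4:8], crc32.ChecksumIEEE(payload))
	copy(buf[8:], payload)
	return buf
}

// decodeRecord verifies length and checksum; a short or mismatched
// record means a torn write, and replay stops there.
func decodeRecord(buf []byte) ([]byte, error) {
	if len(buf) < 8 {
		return nil, errCorrupt
	}
	n := binary.LittleEndian.Uint32(buf[0:4])
	if uint32(len(buf)-8) < n {
		return nil, errCorrupt
	}
	payload := buf[8 : 8+n]
	if crc32.ChecksumIEEE(payload) != binary.LittleEndian.Uint32(buf[4:8]) {
		return nil, errCorrupt
	}
	return payload, nil
}

func main() {
	rec := encodeRecord([]byte("put k1 v1"))
	p, err := decodeRecord(rec)
	fmt.Println(string(p), err)

	rec[len(rec)-1] ^= 0xFF // simulate a corrupted tail byte
	_, err = decodeRecord(rec)
	fmt.Println(err)
}
```

A crash-loop harness like the one listed above would exercise exactly this path: kill mid-write, reopen, and assert that every record before the corrupt tail replays identically.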

Server (systems truth)

  • Binary protocol (framed) + timeouts + backpressure
  • Load generator + p50/p99 latency reporting
  • Metrics/tracing hooks that make performance explainable

Replication (distributed truth)

  • Raft (single group) where a log entry == serialized WriteBatch
  • Deterministic apply + restart safety
  • Snapshotting for fast catch-up

Sequel teaser (maybe)

  • Tables & Regions: table-ish encoding + scans + key-range routing (minimal, no rabbit holes)

Non-goals (by design)

To keep BeachDB small and finishable, these are intentionally out of scope for Season 1:

  • Production readiness, multi-year maintenance guarantees, or compatibility promises
  • Multi-writer concurrency in the engine (single-writer early on)
  • Background compaction early on (added only after invariants are rock-solid)
  • SQL, query planner, joins, secondary indexes
  • Full transactions / serializable isolation
  • Auto sharding, region split/merge, rebalancing, quorum reads, gossip/repair

Philosophy

Every chapter ends with evidence: a dump tool, a crash test, a benchmark, or a diagram.

See docs/principles.md for how I'm keeping this project from turning into a second job :)

License

Apache 2.0 (see: LICENSE)
