A Go-based limit-order matching engine.
# Start the server
go run ./cmd/server
# Run load test (in another terminal)
go run ./cmd/loadtest
# Run unit tests
go test ./...┌─────────────────────────────────────────────────────────────────────┐
│ HTTP Layer │
│ POST /api/v1/orders │ GET /api/v1/orders/{id} │ GET /health │
└───────────────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ Exchange Service │
│ • Manages order books per symbol (sync.Map) │
│ • Routes orders to correct AsyncOrderBook │
│ • Generates fast atomic IDs │
│ • Records metrics and health │
└───────────────────────────────┬─────────────────────────────────────┘
│
┌───────────┴───────────┐
▼ ▼
┌─────────────────────────────┐ ┌─────────────────────────────┐
│ AsyncOrderBook (AAPL) │ │ AsyncOrderBook (GOOGL) │
│ ┌───────────────────────┐ │ │ ┌───────────────────────┐ │
│ │ Dedicated Goroutine │ │ │ │ Dedicated Goroutine │ │
│ │ (single-threaded) │ │ │ │ (single-threaded) │ │
│ └───────────────────────┘ │ │ └───────────────────────┘ │
│ • buyBook map[price]queue │ │ • buyBook map[price]queue │
│ • sellBook map[price]queue │ │ • sellBook map[price]queue │
│ • Sorted price slices │ │ • Sorted price slices │
└─────────────────────────────┘ └─────────────────────────────┘
What: Each symbol has a dedicated goroutine that processes orders sequentially via a buffered channel.
Why this matters:
- No lock contention: The global mutex was the #1 bottleneck. With 200 workers hitting the same mutex, only one could process at a time → ~1,000 orders/sec max.
- Sequential correctness: Orders for the same symbol must be processed in order (price-time priority). A single goroutine guarantees this naturally.
- Parallel across symbols: Different symbols can process simultaneously since each has its own goroutine.
How it works:
┌─────────────────┐
HTTP Handler ──────► orderChan │
HTTP Handler ──────► (buffered) │────► Single Goroutine ────► Match & Update
HTTP Handler ──────► 50,000 cap │
└─────────────────┘
│
▼
Response via per-request channel
What: Use an atomic counter with base-36 encoding instead of uuid.NewString().
Why this matters:
- UUID cost: Expensive ~500ns
- Atomic cost:
atomic.AddUint64()is ~10ns, purely in user space. - At 30K orders/sec: UUIDs would waste 15ms of CPU time per second on ID generation alone.
Implementation:
type AtomicIDGenerator struct {
counter uint64
prefix string
}
func (g *AtomicIDGenerator) Next() string {
id := atomic.AddUint64(&g.counter, 1)
return g.prefix + strconv.FormatUint(id, 36) // "O1a2b3c"
}Trade-off considered: These IDs are not globally unique across restarts. Needs to add some kind of prefix for a distributed system.
What: The Exchange uses sync.Map instead of map + sync.RWMutex for storing order books and order-to-symbol mappings.
Why this matters:
- Read-heavy workload: Most operations are reads (check if book exists) or single writes (store order ID).
- sync.Map is optimized for this pattern—lock-free reads, amortized lock-free writes for existing keys.
- RWMutex contention: Even with read locks, high contention on
RUnlock()was measurable.
How it's used:
// Get or create order book (lock-free fast path)
if bookVal, ok := ex.books.Load(symbol); ok {
return bookVal.(*AsyncOrderBook)
}
// Slow path: create new book, use LoadOrStore to avoid races
newBook := NewAsyncOrderBook(symbol, 50000)
actual, loaded := ex.books.LoadOrStore(symbol, newBook)
if loaded {
newBook.Stop() // Another goroutine won, clean up our book
}Submit a new order.
Request (LIMIT):
{
"symbol": "AAPL",
"side": "BUY",
"type": "LIMIT",
"price": 15050,
"quantity": 100
}Request (MARKET):
{
"symbol": "AAPL",
"side": "BUY",
"type": "MARKET",
"quantity": 100
}Responses:
201 Created— Order accepted, added to book202 Accepted— Partial fill, remainder in book200 OK— Fully filled400 Bad Request— Validation error or insufficient liquidity
Get order status.
Response:
{
"order_id": "O1a2b3c",
"symbol": "AAPL",
"side": "BUY",
"type": "LIMIT",
"price": 15050,
"quantity": 100,
"filled_quantity": 60,
"status": "PARTIAL_FILL",
"timestamp": 1698567890123
}Get order book snapshot.
Response:
{
"symbol": "AAPL",
"timestamp": 1698567890123,
"bids": [
{"price": 15045, "quantity": 500},
{"price": 15040, "quantity": 1000}
],
"asks": [
{"price": 15050, "quantity": 800},
{"price": 15055, "quantity": 600}
]
}Response:
{
"status": "healthy",
"uptime_seconds": 3600,
"orders_processed": 150000
}Response:
{
"orders_received": 150000,
"orders_matched": 120000,
"orders_cancelled": 5000,
"orders_in_book": 25000,
"trades_executed": 95000,
"latency_p50_ms": 0.8,
"latency_p99_ms": 3.2,
"latency_p999_ms": 8.5,
"throughput_orders_per_sec": 28500
}order-matching-engine/
├── cmd/
│ ├── server/ # HTTP server entry point
│ │ └── main.go
│ └── loadtest/ # Load testing tool
│ └── main.go
├── internal/
│ ├── api/http/ # HTTP handlers, routing, DTOs
│ │ ├── handlers.go
│ │ ├── router.go
│ │ ├── dto.go
│ │ └── validation.go
│ ├── config/ # Configuration loading
│ │ └── config.go
│ ├── engine/ # Core matching engine
│ │ ├── orderbook.go # Channel-based async order book
│ │ ├── order.go # Domain types (Order, Trade, MatchResult)
│ │ ├── idgen.go # Atomic ID generator
│ │ └── errors.go # Error definitions
│ ├── log/ # Logging utilities
│ │ └── log.go
│ └── services/ # Business logic layer
│ ├── exchange.go # Order routing & coordination
│ ├── metrics.go # Performance metrics
│ └── health.go # Health tracking
└── test/ # Unit tests
├── engine/
│ └── orderbook_test.go
├── services/
│ └── exchange_test.go
└── api/http/
└── handlers_test.go