Granular Testing Philosophy

Overview

This document explains why we test specific areas of the Granular cache library with particular testing techniques. Understanding the rationale behind our testing strategy helps maintain high code quality and guides future testing decisions.

Core Principles

1. Meaningful Tests Over Coverage Metrics

We aim for 85%+ coverage, but prioritize meaningful tests that catch real bugs over achieving 100% coverage. Tests should:

Verify behavior, not implementation details
Catch regression bugs
Document expected behavior
Fail when code changes break contracts

2. Match Testing Technique to Risk Profile

Different code has different risk profiles. We use specialized testing techniques where they provide the most value:

Property-based tests: For invariants that must hold for all inputs
Fuzz tests: For parsing and input validation
Concurrency tests: For thread-safety guarantees
Benchmarks: For performance-critical paths

3. Fail Fast, Learn Fast

Tests should:

Run quickly in CI/CD pipelines
Provide clear error messages
Use deterministic test data when possible
Isolate failures to specific components

Why We Test What We Test

Validation & Error Handling (`validation_test.go`)

Why Critical:

Error handling is a primary feature of Granular, not an edge case
The WithAccumulateErrors() option changes behavior fundamentally
Validation errors affect user experience directly
Go 1.20+ multi-error unwrapping must work correctly with errors.Is() and errors.As()

Why These Techniques:

Unit tests: Verify error formatting, unwrapping, and accumulation logic
Integration tests: Test validation across the full key-building pipeline
State-based testing: Verify fail-fast vs accumulate-all behavior differences

Real-World Impact: If validation errors are broken, users get:

Confusing error messages
Missing context about multiple failures
Inability to use standard Go error handling patterns

Glob Pattern Matching (`glob_test.go`)

Why Critical:

Glob matching has 0% coverage in matchGlobParts/matchesGlobPattern
Recursive ** patterns are complex with many edge cases
Bugs cause silent failures (wrong files cached or missed)
Used in build pipelines where correctness is essential

Why These Techniques:

Edge case testing: Cover **, **/, empty matches, deep nesting
Property-based testing: Verify glob results are always subset of directory walk
Fuzz testing: Find unexpected pattern/path combinations that break logic

Real-World Impact: Broken glob matching means:

Build tools cache wrong files
Security-sensitive files accidentally included
Builds aren't reproducible
Silent data corruption in cached results

Why Property Tests: The invariant "glob matches ⊆ all files" must hold for every pattern. A single counterexample breaks the contract. Property tests efficiently search for counterexamples.

Concurrency Safety (`concurrency_test.go`)

Why Critical:

Granular is designed for use in concurrent build systems
Race conditions cause non-deterministic failures
Data corruption from races can silently break builds
RWMutex usage must be correct (deadlocks are catastrophic)

Why testing/synctest (Go 1.25):

Deterministic concurrency testing: Traditional concurrent tests are flaky
Controlled scheduling: synctest.Run() controls goroutine interleaving
Reproducible failures: Race conditions become consistently reproducible
No sleeps or timeouts: Tests run fast and reliably

Scenarios Tested:

Concurrent reads: Multiple goroutines reading same key (RLock correctness)
Concurrent writes: Multiple goroutines writing different keys (no contention)
Read during write: Verify no partial/corrupted reads
Delete during read: Ensure safe cleanup
Clear during operations: Test bulk deletion safety
Prune during operations: Background maintenance doesn't corrupt state

Real-World Impact: Race conditions cause:

Builds that pass sometimes, fail others (Heisenbug hell)
Corrupted cache manifests
Deadlocks that freeze CI/CD pipelines
Data races that pass tests but fail in production

Configuration Options (`options_test.go`)

Why Critical:

Options had 0% coverage (WithHashFunc, WithNowFunc)
Changing hash function invalidates entire cache
Custom time functions affect cache expiration and pruning
Wrong option behavior breaks cache semantics

Why These Techniques:

Functional testing: Verify each option changes cache behavior
Combination testing: Options must compose correctly
Deterministic testing: WithNowFunc enables reproducible time-based tests

Real-World Impact: Broken options cause:

Cache invalidation after updates (all entries become misses)
Incorrect pruning (keeping too much or too little)
Flaky tests due to time-based behavior
Inability to test cache in isolation

Design Insight: WithNowFunc() and WithHashFunc() are dependency injection for testability. We test that the injection works correctly, enabling deterministic testing of cache behavior.

Property-Based Tests (`property_test.go`)

Why Property Testing: Property tests verify invariants that must hold for all inputs, not just examples. They're ideal for Granular because:

Hash determinism: Same inputs → same hash (always)
Idempotency: Put(K,V); Put(K,V) → Get(K) == V
Round-trip: Save → Load preserves all data
Immutability: Bytes() copies data (no external mutation)

Why testing/quick (stdlib):

No external dependencies: Keeps codebase lean
Good-enough fuzzing: 100-1000 iterations catch most bugs
Simple API: Easy for contributors to add property tests
Fast execution: Quick tests run in CI without timeout issues

What Property Tests Catch:

Hash collisions: Different inputs producing same hash
Mutation bugs: Data corruption via aliasing
Serialization bugs: Lost data in JSON round-trip
Order sensitivity: Behavior depending on map iteration order (bad!)

Example:

// Property: Hash is deterministic
property := func(seed int64) bool {
    // For any random inputs...
    hash1 := computeHash(inputs)
    hash2 := computeHash(inputs)
    return hash1 == hash2  // Always equal
}

If this property fails even once, hash determinism is broken, and caching becomes unreliable.

Fuzz Testing (`fuzz_test.go`)

Why Fuzz Testing: Fuzzing finds unexpected inputs that developers don't think to test. Critical for:

Manifest JSON parsing: Untrusted data from filesystem
Glob pattern parsing: User-provided patterns with complex rules
Key hashing: Arbitrary file paths and metadata

Why Native Go Fuzzing:

Built into Go 1.18+: No external tools
Coverage-guided: Intelligently explores code paths
Corpus management: Automatically saves interesting inputs
CI integration: Can run continuously to find bugs

What Fuzzing Catches:

Panic on malformed JSON: Corrupted manifest files
Infinite loops: Pathological glob patterns like **/**/**/**
Buffer overflows: (Impossible in Go, but finds logic errors)
Unexpected nil dereferences: Missing nil checks
String escaping bugs: Special characters in paths

Fuzzing Targets:

1. Manifest JSON (`FuzzManifestJSON`)

Why: Manifest files are persisted to disk and can be:

Corrupted by filesystem errors
Maliciously crafted
Truncated mid-write
Contain very large values (DoS potential)

Goal: Never panic, always return error for invalid JSON

2. Glob Patterns (`FuzzGlobPattern`, `FuzzGlobMatching`)

Why: User-provided patterns can be:

Maliciously complex (****/****/****/)
Contain special regex characters ([, ], \)
Have platform-specific path separators
Be empty or have only wildcards

Goal: Never infinite loop, never panic, handle all syntax gracefully

3. Cache Operations (`FuzzCachePutGet`)

Why: End-to-end fuzzing finds interactions between components:

Special characters in filenames
Binary data in metadata
Large data blobs

Goal: Round-trip always preserves data, no corruption

Running Fuzzing:

# Run for 1 minute
go test -fuzz=FuzzManifestJSON -fuzztime=1m

# Run overnight in CI
go test -fuzz=. -fuzztime=8h

Benchmarks (`benchmark_test.go`)

Why Benchmarking: Granular is a performance-sensitive library. Slow caching defeats the purpose of caching. Benchmarks:

Establish performance baselines
Detect regressions in PRs
Guide optimization efforts
Verify buffer pooling works

Critical Paths Benchmarked:

1. Cache Get/Put Operations

Why: These are called on every build operation. Even small slowdowns multiply across large projects.

BenchmarkCacheGet_Hit: Measures hot path (cache effectiveness)
BenchmarkCacheGet_Miss: Measures overhead of miss detection
BenchmarkCachePut_*: Measures write performance at different scales

2. Key Hash Computation

Why: Hash is computed on every cache lookup. Slow hashing = slow builds.

BenchmarkKeyHash_SingleFile: Baseline performance
BenchmarkKeyHash_Glob100Files: Scales to real projects
BenchmarkKeyHash_LargeFile: Tests buffer pooling efficiency

Buffer Pool Insight: We use sync.Pool for file I/O buffers. Benchmarks verify this reduces allocations (B/op should be low).

3. Concurrent Operations

Why: Build systems are highly parallel. Contention on locks kills performance.

BenchmarkConcurrentReads: Should scale linearly (RWMutex works)
BenchmarkConcurrentWrites: Measures lock contention
BenchmarkConcurrentMixedOperations: Realistic workload

What to Watch:

ns/op: Time per operation (lower is better)
B/op: Bytes allocated (lower is better, indicates less GC pressure)
allocs/op: Number of allocations (fewer = better)
MB/s: Throughput for file hashing (higher is better)

Performance Targets:

Cache hit: < 100µs (microseconds)
Cache miss: < 500µs
Hash 1MB file: > 500 MB/s
Concurrent reads: near-linear scaling

Testing Strategy Decision Tree

When to Use Each Technique

Is it parsing external input?
├─ YES → Fuzz test
└─ NO
    │
    Does it have an invariant that must always hold?
    ├─ YES → Property test
    └─ NO
        │
        Is it performance-critical?
        ├─ YES → Benchmark
        └─ NO
            │
            Is it concurrent?
            ├─ YES → testing/synctest
            └─ NO
                │
                Regular unit/integration test

Coverage Goals by Component

Component	Target Coverage	Rationale
Error handling	100%	User-facing, critical paths
Glob matching	100%	Complex logic, many edge cases
Concurrency	100% (scenarios)	Race conditions are catastrophic
Options	100%	Small surface area, easy to test
Cache operations	95%	Core functionality
Stats/Prune	90%	Less critical, background operations
Close()	Can be 0%	Currently a no-op

Anti-Patterns We Avoid

❌ Testing Implementation Details

Bad:

// Testing internal hash buffer size
if len(cache.hashBuffer) != 8192 { ... }

Good:

// Testing hash correctness
if hash1 != hash2 { ... }

❌ Flaky Time-Based Tests

Bad:

time.Sleep(100 * time.Millisecond)
// Hope operation completed...

Good:

cache, _ := Open(".cache", WithFs(fs), WithNowFunc(fixedTime))
// Deterministic time for testing

❌ Overly Specific Assertions

Bad:

if err.Error() != "validation failed: file does not exist: /exact/path/file.txt" { ... }

Good:

var ve *ValidationError
if !errors.As(err, &ve) || !strings.Contains(ve.Error(), "file.txt") { ... }

❌ Ignoring Race Detector

Always run:

go test -race

Race detector catches bugs tests miss.

Continuous Testing

CI/CD Pipeline

# Every commit
- go test -v -race -coverprofile=coverage.out
- go test -bench=. -benchtime=100ms

# Nightly
- go test -fuzz=. -fuzztime=1h
- go test -bench=. -benchtime=10s -benchmem

# Monthly
- Review coverage reports
- Update fuzz corpus
- Check benchmark trends

Coverage Reporting

We use standard Go coverage:

go test -coverprofile=coverage.out
go tool cover -html=coverage.out

Coverage is a metric, not a goal. 85% coverage with meaningful tests beats 100% coverage with weak assertions.

Contributing Tests

Adding a New Test

Choose the right file:
- Validation errors → validation_test.go
- Glob patterns → glob_test.go
- Concurrency → concurrency_test.go
- New invariant → property_test.go
- Performance → benchmark_test.go
Write a clear test name:
- TestValidationError_EmptyErrors ✅
- TestVE ❌

Document why the test exists:

// TestConcurrentReadWrite verifies that concurrent reads during
// a write operation never return corrupted data. This is critical
// for build systems that parallelize cache access.
func TestConcurrentReadWrite(t *testing.T) { ... }

Add to relevant section of this document

Testing Philosophy in Practice

Example: Hash Determinism

Why it matters: If hash is non-deterministic, cache becomes useless. Same inputs must produce same hash every single time.

How we test:

Unit test: Hash file twice, compare results
Property test: Hash random inputs 100 times, all identical
Integration test: Put → Get → Put → Get, verify hashes match

Why multiple approaches:

Unit test: Fast, runs on every commit
Property test: Explores input space, finds edge cases
Integration test: Verifies real-world usage

This defense in depth catches bugs that single-technique testing would miss.

Lessons Learned

1. Glob Matching is Harder Than It Looks

Initially had 0% coverage on matchGlobParts because we thought it was "obviously correct." Fuzz testing found edge cases with ** at the end of patterns and empty path components.

Lesson: Complex recursive algorithms need exhaustive testing.

2. Concurrency Bugs Are Sneaky

Traditional concurrent tests using sync.WaitGroup were flaky. Only after switching to testing/synctest did we get reproducible concurrency tests.

Lesson: Use modern tooling for hard problems.

3. Validation Errors Are First-Class

Users spend more time debugging validation errors than any other part of the API. Investing in clear, well-tested error messages pays dividends in support time saved.

Lesson: Error handling is a feature, not an afterthought.

Future Testing Directions

Potential Additions

Mutation testing: Use go-mutesting to verify tests catch intentional bugs
Chaos testing: Randomly inject filesystem errors, verify graceful degradation
Load testing: Simulate cache with 100k+ entries, measure performance
Integration tests: Test with real tools (protoc, golangci-lint)

Metrics to Track

Coverage trend: Should stay above 85%
Benchmark trend: Should not regress > 5%
Fuzz crash count: Should be zero
Race detector warnings: Should be zero

Conclusion

Testing Granular comprehensively requires multiple techniques because the library has diverse risk profiles:

Correctness-critical: Validation, glob matching → exhaustive unit + property tests
Concurrency-critical: Cache operations → testing/synctest
Security-critical: Input parsing → fuzz testing
Performance-critical: Hash computation → benchmarks

By matching testing technique to risk, we achieve high confidence without excessive test maintenance burden.

The goal: Ship a cache library that users can trust in production build systems handling millions of dollars of engineering time.

Tests are the contract that guarantees this trust.

Uh oh!

FilesExpand file tree

granular-test-philosophy.md

Latest commit

History

granular-test-philosophy.md

File metadata and controls

Granular Testing Philosophy

Overview

Core Principles

1. Meaningful Tests Over Coverage Metrics

2. Match Testing Technique to Risk Profile

3. Fail Fast, Learn Fast

Why We Test What We Test

Validation & Error Handling (validation_test.go)

Glob Pattern Matching (glob_test.go)

Concurrency Safety (concurrency_test.go)

Configuration Options (options_test.go)

Property-Based Tests (property_test.go)

Fuzz Testing (fuzz_test.go)

1. Manifest JSON (FuzzManifestJSON)

2. Glob Patterns (FuzzGlobPattern, FuzzGlobMatching)

3. Cache Operations (FuzzCachePutGet)

Benchmarks (benchmark_test.go)

1. Cache Get/Put Operations

2. Key Hash Computation

3. Concurrent Operations

Testing Strategy Decision Tree

When to Use Each Technique

Coverage Goals by Component

Anti-Patterns We Avoid

❌ Testing Implementation Details

❌ Flaky Time-Based Tests

❌ Overly Specific Assertions

❌ Ignoring Race Detector

Continuous Testing

CI/CD Pipeline

Coverage Reporting

Contributing Tests

Adding a New Test

Testing Philosophy in Practice

Example: Hash Determinism

Lessons Learned

1. Glob Matching is Harder Than It Looks

2. Concurrency Bugs Are Sneaky

3. Validation Errors Are First-Class

Future Testing Directions

Potential Additions

Metrics to Track

Conclusion

Validation & Error Handling (`validation_test.go`)

Glob Pattern Matching (`glob_test.go`)

Concurrency Safety (`concurrency_test.go`)

Configuration Options (`options_test.go`)

Property-Based Tests (`property_test.go`)

Fuzz Testing (`fuzz_test.go`)

1. Manifest JSON (`FuzzManifestJSON`)

2. Glob Patterns (`FuzzGlobPattern`, `FuzzGlobMatching`)

3. Cache Operations (`FuzzCachePutGet`)

Benchmarks (`benchmark_test.go`)