Snowdrop Tangled Agents

AI agents that compete in the quantum game of Tangled



A comprehensive framework for building intelligent agents that play Tangled, a quantum game where players color edges on graph structures and outcomes are determined by quantum annealing. This project combines Monte Carlo Tree Search, MATLAB-powered compute, and machine learning to compete against the game's built-in AI opponents.

Tech Stack

| Component | Technology |
| --- | --- |
| Core Language | Python 3.11+ |
| High-Performance Compute | MATLAB R2024a+ (Deep Learning, Statistics, Parallel Computing Toolboxes) |
| Web Automation | Playwright (Chromium) |
| Database | SQLite with automatic migrations |
| Game Engine | snowdrop-tangled-game-engine |
| Quantum Adjudication | Simulated Annealing, Schrödinger Equation solvers |

Highlights

  • Thompson Sampling Opening Selection - Bayesian approach to exploration-exploitation in AlphaQExplorerStrategy
  • MCTS Strategy Engine - Monte Carlo Tree Search with 500K iterations per move (default), configurable 100K-1M for speed/quality tradeoff
  • MATLAB Integration - High-compute search with parallel parfor rollouts across 6 workers and neural network evaluation
  • Live Web Play - Automated gameplay on tangled-game.com via Playwright
  • Statistical Analysis - SQLite-backed game history with pattern discovery
  • Calibrated Heuristics - Terminal evaluation tuned from 140+ real games

Quick Start

Prerequisites

| Requirement | Version | Notes |
| --- | --- | --- |
| Python | 3.11 – 3.13 | |
| Poetry | 2.x | Package manager. Install: `pipx install poetry` |
| Git | any | |
| MATLAB | R2024a+ | Required for competitive play (see MATLAB Setup) |

Installation

```
# Clone and install
git clone https://github.com/snowdropquantum/snowdrop-tangled-agents.git
cd snowdrop-tangled-agents
poetry install
poetry run playwright install chromium
```

Or use the automated setup script (installs Poetry if missing):

```
# Windows:
setup.bat

# macOS / Linux:
chmod +x setup.sh && ./setup.sh
```

Configuration

```
cp .env.example .env
# Edit .env: add your tangled-game.com credentials (sign up at https://tangled-game.com)
```

See .env.example for all available settings including the optional live stats dashboard.

Play a Game

```
# Play 1 game against MCTS Melissa (default opponent, hybrid_solver strategy)
poetry run python play_tangled.py --games 1

# Play against AlphaQ Up using the AlphaQ Explorer strategy (5K MCTS iterations)
poetry run python play_tangled.py --strategy alphaq_explorer --opponent alphaq \
  --games 3 --mcts-iterations 5000

# View game statistics from the local database
poetry run python play_tangled.py --stats
```

MATLAB Setup

MATLAB powers the high-performance MCTS engine. Without it, the system falls back to a pure-Python heuristic that is not competitive against strong opponents.

Required toolboxes:

| Toolbox | Used for |
| --- | --- |
| Parallel Computing | parfor rollouts — 6 workers × 100 rollouts/iter |
| Statistics and Machine Learning | interp1 P(win) calibration curve |

Optional toolboxes (for advanced features):

| Toolbox | Used for |
| --- | --- |
| Deep Learning | Neural network position evaluation priors |
| Database | Direct SQLite access from MATLAB training pipelines |
| MATLAB Compiler SDK | Building standalone Python-callable packages |
| Reinforcement Learning | RL training experiments |

Install the MATLAB Engine API for Python:

```
# From the MATLAB installation directory:
cd "C:\Program Files\MATLAB\R2026a\extern\engines\python"   # Windows
# cd /Applications/MATLAB/R2026a/extern/engines/python       # macOS
# cd /usr/local/MATLAB/R2026a/extern/engines/python          # Linux

python -m pip install .
```

The engine starts automatically in headless mode when you run a game — no need to open the MATLAB GUI. All .m source files live in snowdrop_tangled_agents/matlab/rl/ within this repository.

Verify MATLAB is detected:

```
poetry run python test_matlab_detection.py
```

Strategies

| Strategy | Engine | Best for |
| --- | --- | --- |
| hybrid_solver | MATLAB Minimax + MCTS + Tabu | Default. Strong general play. |
| alphaq_explorer | MATLAB MCTS + Thompson Sampling | Competing against AlphaQ Up. |
| matlab_mcts | MATLAB MCTS only | Pure MCTS experiments. |
| mcts | Pure Python MCTS | No MATLAB required. Weaker. |
| heuristic | Python heuristics | Fast, always available. Not competitive. |

Running Experiments

After installation and MATLAB setup, you can run the current flagship experiment — systematic re-exploration of all 30 Petersen graph openings against AlphaQ Up:

```
# Phase 1: Screen all 30 openings, 3 games each (est. ~8 hours)
poetry run python play_tangled.py \
  --strategy alphaq_explorer \
  --opponent alphaq \
  --games 90 \
  --mcts-iterations 5000 \
  --opening-mode round_robin
```
See docs/EXPERIMENT_OPENING_RE_EXPLORATION.md for the full experiment design and rationale.

Analyzing Results

All game data is recorded in a local SQLite database at ~/.tangled/game_stats.db. Every game stores:

  • Result, final score, opening move, strategy, MCTS iteration setting
  • Move-by-move: edge, color, score, board state, thinking time
  • MCTS diagnostics: iterations, total rollout simulations, tree depth

```
# Quick summary
poetry run python play_tangled.py --stats

# Direct SQL queries
sqlite3 ~/.tangled/game_stats.db "SELECT opening_edge, opening_color, result, final_score FROM games WHERE opening_mode='round_robin' ORDER BY opening_edge, opening_color"
```

Live Dashboard (Optional)

For real-time monitoring during long experiments, deploy the tangled-stats-dashboard to fly.io or any host that supports WebSockets:

  1. Deploy the dashboard app (see its .env.example for configuration)
  2. Set TANGLED_DASHBOARD_URL and TANGLED_DASHBOARD_API_KEY in your .env
  3. Game telemetry streams automatically during play

If no dashboard is configured, experiments run normally and all data is available for post-analysis from the SQLite database.


How to build your own agent

This repo includes a random agent (see RandomRandyAgent in random_agent.py). To build your own agent, follow the instructions and pattern for that agent. The interface is simple: your agent receives a Game instance, a snapshot of the game in progress that includes the current board state, and returns a move, which is just which edge to pick and what color to apply to it. That's all! The process you use to select that move can be arbitrarily complex, but the agent interface itself stays that simple. A minimal sketch follows.
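
To make the pattern concrete, here is a minimal skeleton. The method names on `game` (such as `get_legal_moves`) and the entry point `choose_move` are illustrative placeholders, not the engine's actual API; mirror whatever RandomRandyAgent in random_agent.py actually calls.

```python
import random

class MyFirstAgent:
    """Skeleton following the RandomRandyAgent pattern: receive a
    Game snapshot, return (edge, color). Method names on `game` are
    placeholders; copy the real calls from random_agent.py."""

    def choose_move(self, game):
        # Any selection logic can go here; a random choice is the
        # simplest legal policy and mirrors the bundled random agent.
        edge, color = random.choice(game.get_legal_moves())
        return edge, color
```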

Testing agents

A test script is included in run_local_parallel_tournament.py. Out of the box it runs 100,000 games for each of the X-Prize graphs (not including the mutant C60 graph), using simulated annealing as the adjudicator and two random agents as the competitors.

Suggested next steps

If you want to build your own agent, I would start by building your own adjudicator (look at both adjudicators in the snowdrop-adjudicators repo). The better you can spoof the D-Wave hardware, the better the quality of the reward your agent will have to learn from. To do this, I would recommend building a lookup-table adjudicator: first enumerate all the terminal states for a small game graph, then adjudicate all of them using the Schrödinger equation adjudicator (this is limited to small game graphs, but it's still a great warm-up for something more ambitious). Once you have your lookup-table adjudicator, you can build an agent that uses it to give you the correct answer every time. Add your adjudicator and new agents to run_local_parallel_tournament.py and see how well they do against the random agent and any other agents you might have built! A sketch of the lookup-table approach is shown below.
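
A minimal sketch of that warm-up, assuming two edge colors and a callable Schrödinger-equation adjudicator from the snowdrop-adjudicators repo (the call signature here is invented for illustration):

```python
from itertools import product

def build_lut(num_edges, schrodinger_adjudicate):
    """Enumerate every terminal coloring of a small game graph and
    cache the exact quantum adjudication for each one."""
    lut = {}
    for coloring in product((1, 2), repeat=num_edges):  # 2 colors per edge
        lut[coloring] = schrodinger_adjudicate(coloring)
    return lut

class LUTAdjudicator:
    """Answers any terminal state of the small graph instantly."""
    def __init__(self, lut):
        self.lut = lut

    def adjudicate(self, coloring):
        return self.lut[tuple(coloring)]
```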


Development Progress

This section documents the actual development steps taken to build a competitive Tangled agent.

1. MCTS Strategy Engine

Implemented a Monte Carlo Tree Search (MCTS) strategy with the following features:

  • UCB1 exploration with configurable exploration constant (sketched after this list)
  • Simulated Annealing adjudicator for terminal state evaluation
  • Opening book with pre-computed strong opening moves for the Petersen graph
  • Time and iteration limits for move calculation
  • Statistics collection for game analysis
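
The UCB1 rule named in the first bullet is the standard tree policy: pick the child maximizing mean value plus an exploration bonus that shrinks with visits. A generic sketch (node field names are illustrative, not the repo's exact structures):

```python
import math

def ucb1_select(children, c=1.414):
    """Tree policy: exploit high mean value, but boost rarely
    visited children; c is the configurable exploration constant."""
    total_visits = sum(ch.visits for ch in children)
    def ucb1(ch):
        if ch.visits == 0:
            return float("inf")  # expand unvisited children first
        mean = ch.total_value / ch.visits
        return mean + c * math.sqrt(math.log(total_visits) / ch.visits)
    return max(children, key=ucb1)
```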

```
# Play with MCTS strategy (default: 500K iterations, unlimited time)
python play_tangled.py --strategy mcts --games 3

# Fast mode for testing (100K iterations, 30s per move)
python play_tangled.py --strategy mcts --mcts-iterations 100000 --mcts-time 30 --games 3
```

2. Web Automation Bot

Built a Playwright-based bot to play on tangled-game.com:

  • Automatic login with credentials from .env
  • Game state parsing from the web interface
  • Move execution via browser automation
  • Score tracking and game result detection
  • Keep-alive mode for continuous play sessions

```
# Play games on tangled-game.com
python play_tangled.py --strategy hybrid --games 10 --keep-open
```

3. Statistics Collection System

Created a SQLite-based statistics system:

  • Game history tracking (opponent, result, scores, timestamps)
  • Move-by-move analysis with state snapshots
  • Adjudicator calibration data collection
  • Win rate analysis by opponent and strategy

```
# View statistics
python play_tangled.py --stats
python play_tangled.py --calibration
```

4. MATLAB Toolbox Integration

Extended the system with MATLAB integration for advanced ML capabilities:

| Toolbox | Purpose |
| --- | --- |
| Deep Learning Toolbox | Neural network training for position evaluation |
| Statistics and ML Toolbox | Opponent clustering and style classification |
| Database Toolbox | Direct MATLAB-SQLite access for training pipelines |
| MATLAB Compiler SDK | Package MATLAB code as Python-callable modules |

Neural Network Architecture

Value Network (position evaluation):

  • Input: 50-element feature vector (board state, edge categories, game phase)
  • Hidden: FC(128) → ReLU → Dropout → FC(64) → ReLU → FC(32) → ReLU
  • Output: Tanh → value ∈ [-1, +1]
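
At inference time that stack reduces to a few matrix products. A shape-level numpy sketch (the real network is trained and evaluated in MATLAB; dropout is identity at inference):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def value_forward(x, p):
    """x: 50-element feature vector. p: dict of weights and biases
    with shapes W1 (128,50), W2 (64,128), W3 (32,64), W4 (1,32)."""
    h = relu(p["W1"] @ x + p["b1"])               # FC(128) -> ReLU
    h = relu(p["W2"] @ h + p["b2"])               # FC(64)  -> ReLU
    h = relu(p["W3"] @ h + p["b3"])               # FC(32)  -> ReLU
    return float(np.tanh(p["W4"] @ h + p["b4"]))  # value in [-1, +1]
```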

Opponent Model (style classification):

  • 20-element opponent feature vector (edge preferences, aggression metrics)
  • K-means clustering into play styles (aggressive, defensive, hub-focused)
  • Adaptive prior adjustment based on opponent tendencies

5. Compiled Package Deployment

Built standalone Python packages using MATLAB Compiler SDK:

| Package | Functions | Purpose |
| --- | --- | --- |
| tangled_value_network | evaluate_position_nn | Neural network inference |
| tangled_opponent_model | classify_opponent, adapt_to_opponent | Opponent modeling |
| tangled_training | train_value_network, cluster_opponents | Model training |

For users without MATLAB:

  1. Download MATLAB Runtime R2026a (free) from MathWorks
  2. Install compiled packages: pip install tangled_value_network tangled_opponent_model

Important: Runtime version must exactly match the compile version (R2026a).

6. Backend Fallback Chain

The system automatically selects the best available backend:

```
1. Compiled Packages (fastest, MATLAB Runtime only)
       ↓ if unavailable
2. MATLAB Engine API (full functionality, requires license)
       ↓ if unavailable
3. Pure Python Heuristics (always available)
```

```
# Check what's available
python play_tangled.py --training-status

# Play with MATLAB enhancement
python play_tangled.py --strategy matlab --use-nn --adapt-opponent --games 5
```
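
The resolution order amounts to a cascade of import attempts. A sketch, with the pure-Python fallback's module path assumed for illustration:

```python
def select_backend():
    """Try backends in the documented priority order."""
    try:
        import tangled_value_network              # 1. compiled package (Runtime only)
        return "compiled", tangled_value_network
    except ImportError:
        pass
    try:
        import matlab.engine                      # 2. Engine API (licensed MATLAB)
        return "engine", matlab.engine.start_matlab()
    except ImportError:
        pass
    import snowdrop_tangled_agents.heuristics as h  # 3. pure Python (assumed path)
    return "python", h
```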

7. D-Wave Inspired Hybrid Solver (January 2026)

A hybrid Minimax-MCTS solver inspired by D-Wave's classical optimization techniques:

Architecture:

  • Alpha-Beta Minimax - Exact search at shallow depths with transposition tables
  • Tabu Search - D-Wave MST2-inspired multistart optimization for rollout refinement
  • MCTS - Monte Carlo Tree Search with progressive widening for deep exploration
  • Expanded LUT - 19 million pre-computed exact minimax values

Lookup Table Coverage:

Grey Edges States Method
0 (terminal) 32,768 Direct lookup
1 491,520 Minimax depth-1
2 3,440,640 Minimax depth-2
3 14,909,440 Minimax depth-3
Total 18,874,368 Exact values

This gives guaranteed optimal play for the last 4 moves of every game.
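
As a sanity check on the table, each row with g grey edges equals C(15, g) × 2^15, and the rows sum to the stated total:

```python
from math import comb

rows = {0: 32_768, 1: 491_520, 2: 3_440_640, 3: 14_909_440}
for g, count in rows.items():
    assert count == comb(15, g) * 2**15   # C(15, g) * 32,768
assert sum(rows.values()) == 18_874_368   # matches the Total row
```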

References:

  • D-Wave qbsolv decomposition strategy
  • D-Wave dwave-tabu (MST2 algorithm)
  • Palubeckis (2004) "Multistart Tabu Search Strategies"

```
# Play with Hybrid Solver (default: 500K iterations, unlimited time)
python play_tangled.py --strategy hybrid_solver --games 3

# Competitive mode with maximum strength (1M iterations)
python play_tangled.py --strategy hybrid_solver --mcts-iterations 1000000 --games 1
```

See docs/HYBRID_MINIMAX_MCTS_PLAN.md for complete implementation plan.

8. MATLAB MCTS Strategy (January 2026)

High-compute MCTS implementation in MATLAB for live play against MCTS Melissa:

Key Features:

  • 500K iterations per move with unlimited time (default, elite quality - ~14 min per move)
  • Configurable 100K-1M iterations via --mcts-iterations for speed/quality tradeoff
  • Parallel parfor rollouts — each MCTS iteration runs NumWorkers (default 6) independent rollouts in parallel and averages results, producing 6x more samples per leaf visit while using the already-allocated worker pool that previously sat idle
  • Domain-specific terminal evaluation calibrated from 140+ games
  • Opening book securing our vertex edges (E9, E10, E11 Green)
  • Adaptive exploration based on score momentum
  • Compute diagnostics using MATLAB's cputime and memory functions

Critical Discoveries:

  • E12 Green is catastrophic: Connecting hub (V6) to opponent vertex (V7) with ferromagnetic coupling helps the opponent. Added -0.8 penalty.
  • E2 Green causes score collapse: Observed -2.6 point drops at move 4. Added -0.5 penalty.
  • Turn-based timing: The game is strictly turn-based; apparent "race conditions" were actually stale DOM state from reading the board before the browser had fully updated.

Results:

  • Before fixes: Score peaked at +2.95, collapsed to -2.6 at move 4
  • After fixes: Stable score progression, competitive games (draws and close losses)

```
# Play with MATLAB MCTS (default: 500K iterations, unlimited time - elite quality)
python play_tangled.py --strategy matlab_mcts --games 3

# Fast testing mode (100K iterations)
python play_tangled.py --strategy matlab_mcts --mcts-iterations 100000 --games 5

# View game statistics
python play_tangled.py --stats
```

See docs/MATLAB_MCTS_STRATEGY.md for complete theory of operation.

9. Game Analytics & Visualization (January 2026)

An instrumented research system for tracking performance, discovering patterns, and measuring improvement:

Visualization Tools:

  • Progress plots — Rolling win rate and score trends over time
  • Edge effectiveness — Score delta and win rate by edge/color combination
  • Opening analysis — Identify winning opening sequences with sample sizes

Design Philosophy:

  • Observability over raw metrics — answer why, not just what
  • Every plot maps to a subsystem lever you can actually adjust
  • Query-first architecture — plots are views, not logic
  • Timestamped outputs for historical comparison

```
# Generate all analysis plots
python -m snowdrop_tangled_agents.tools.plot_progress --all

# Individual plot types
python -m snowdrop_tangled_agents.tools.plot_progress -t progress
python -m snowdrop_tangled_agents.tools.plot_progress -t edge
python -m snowdrop_tangled_agents.tools.plot_progress -t opening
```

Plots are saved to plots/ with naming: {type}_{YYYYMMDD}_{HHMMSS}.png

See docs/GAME_ANALYTICS.md for complete documentation and research applications.

10. AlphaQ Explorer — Closed-Loop Learning Against a New Opponent (January 2026)

A two-phase explore/exploit strategy built to contend with the new AlphaQ Up opponent. The key advance over previous opponent-specific strategies is a closed learning loop: learned edge-value adjustments now actually modify the MATLAB MCTS rollout priors, so the solver gets stronger with every game it plays.

How it works:

| Phase | Games | Behaviour |
| --- | --- | --- |
| Exploration | 0–29 | Cycles all 30 possible first moves (15 edges × 2 colors). Learning is disabled so each opening is tested under identical solver conditions. Win/score is recorded per opening. |
| Exploitation | 30+ | Re-enables REINFORCE learning. Rotates only the top-N openings (default 5) ranked by wins then avg score. After every game, the updated edge_adjustments vector is pushed into MATLAB via hybridSolver.setEdgeBias(). |

The closed loop in detail:

Previously HybridSolverStrategy ran REINFORCE and accumulated per-edge adjustments in Python, but those values were never fed back into the MATLAB solver — the MCTS rollouts kept using the same static heuristic priors every game. This change adds:

  1. An EdgeBias property (1×15 double) to both TangledMCTS and HybridTangledSolver
  2. A setEdgeBias() method on both classes, with the solver forwarding to its MCTS instance
  3. Two lines at the end of computeRolloutPrior: prior += EdgeBias(edge), clamped to [0.001, 0.999]
  4. Bias preservation across setPlayer() so a new MCTS instance inherits learned adjustments
  5. AlphaQExplorerStrategy._push_edge_bias() which calls hybridSolver.setEdgeBias([...]) after each exploitation game, closing the loop

Persistence and resumability: Full strategy state (phase, exploration results, exploitation openings, index) is saved to ~/.tangled/alphaq_explorer_state.json. If a run is interrupted during exploitation, the next run re-enables learning and re-pushes the accumulated bias before playing.
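
On the Python side, the push in step 5 is a single bridge call into the running engine. A sketch, assuming a started MATLAB Engine session with the solver object in its workspace (the clamped prior update from step 3 is shown as a comment):

```python
def push_edge_bias(eng, adjustments, num_edges=15):
    """Forward learned per-edge REINFORCE adjustments into MATLAB
    after each exploitation game (mirrors _push_edge_bias)."""
    bias = [float(adjustments.get(e, 0.0)) for e in range(num_edges)]
    vec = " ".join(f"{b:.6f}" for b in bias)
    eng.eval(f"hybridSolver.setEdgeBias([{vec}]);", nargout=0)

# MATLAB side, end of computeRolloutPrior (per the list above):
#   prior = prior + obj.EdgeBias(edge);
#   prior = min(max(prior, 0.001), 0.999);   % clamp to [0.001, 0.999]
```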

```
# Run exploration games followed by exploitation (default: 500K iterations per move)
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --run 60

# Fast exploration mode for development (100K iterations)
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --mcts-iterations 100000 --run 60

# Check what the explorer learned
cat ~/.tangled/alphaq_explorer_state.json
cat ~/.tangled/hybrid_solver_adjustments.json
```

11. Score-Weighted Draw Reward + Opponent-Conditional Calibration (January 2026)

Two fixes targeting the learning stall observed in Run 47 (0W/27L/33D vs AlphaQ Up).

Fix 1 — Score-weighted draw reward

Against strong opponents wins are rare; most games end in draws. The old REINFORCE reward for a draw was a flat ±0.1 regardless of the final score, which gave the policy gradient almost no signal. The draw reward is now score × 0.65, so a near-miss draw at +0.78 produces reward +0.507 instead of +0.1. The multiplier 0.65 is chosen so that the maximum realistic draw reward (+0.65 at score +1.0) stays below the minimum win reward (1.0), preserving the win > draw > loss ordering. Win and loss branches are unchanged.

| Scenario | Old reward | New reward |
| --- | --- | --- |
| Win at +2.0 | 2.0 | 2.0 |
| Draw at +0.78 | +0.1 | +0.507 |
| Draw at 0.0 | +0.1 | 0.0 |
| Draw at −0.5 | −0.1 | −0.325 |
| Loss at −1.89 | −1.945 | −1.945 |
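
The changed branch is one line; a sketch reproducing the table's draw rows (the win and loss branches are untouched and omitted here):

```python
DRAW_SCORE_WEIGHT = 0.65  # max realistic draw reward (+0.65) < min win reward (1.0)

def draw_reward(final_score):
    """Score-weighted draw reward replacing the old flat +/-0.1."""
    return final_score * DRAW_SCORE_WEIGHT

assert abs(draw_reward(+0.78) - 0.507) < 1e-9
assert draw_reward(0.0) == 0.0
assert abs(draw_reward(-0.5) + 0.325) < 1e-9
```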

Fix 2 — Opponent-conditional calibration

The P(win) calibration curve was fitted on Melissa game data. Loading it for every opponent made the solver complacent at moderate scores against opponents with different noise profiles. loadCalibration() now follows a two-phase lookup:

Named opponent with fitted curve  →  load calibration_<name>.mat
Named opponent, no fitted curve   →  tanh sigmoid fallback
No opponent name (legacy path)    →  load calibration_pwin.mat

calibration_melissa.mat is shipped alongside calibration_pwin.mat so Melissa resolves correctly. New per-opponent curves can be added by dropping a calibration_<name>.mat into data/ — the sanitisation rule is: lowercase, replace non-[a-z0-9] with _, collapse runs, strip trailing _.
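
That sanitisation rule in code (matching the calibration_alphaq_up.mat filename referenced in the verification below):

```python
import re

def calibration_filename(opponent_name):
    """Lowercase, replace non-[a-z0-9] with _, collapse runs,
    strip trailing _."""
    s = re.sub(r"[^a-z0-9]", "_", opponent_name.lower())
    s = re.sub(r"_+", "_", s).rstrip("_")
    return f"calibration_{s}.mat"

assert calibration_filename("AlphaQ Up") == "calibration_alphaq_up.mat"
assert calibration_filename("MCTS Melissa!") == "calibration_mcts_melissa.mat"
```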

Verification (Run 50, 10 games vs AlphaQ Up):

  • Tanh fallback fires correctly (no calibration_alphaq_up.mat exists)
  • Draw rewards: 0.34–0.50 per game (vs flat 0.1 previously)
  • Final edge adjustments are non-uniform: +0.035 on edges played in good draws, −0.081 on edges associated with losses

```
# Play with the updated solver
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --run 10

# Inspect learned bias
cat ~/.tangled/hybrid_solver_adjustments.json
```

12. Thompson Sampling Opening Selection for AlphaQ Explorer (February 2026)

The two-phase explore/exploit strategy (Section 10) had a critical failure: the greedy opening ranking by (wins DESC, avg_score DESC) degenerated to avg_score DESC once all 74 games had 0 wins. This caused the strategy to select E9G (89% loss rate) and E11G (100% loss rate) for exploitation, producing a 0% win rate during games 29–56 and collapsing the entire strategy.

Root Cause: With zero wins across all openings, the sort key (wins, avg_score) defaults to average score. E9G's single draw at high score (0.728) and E11G's high-scoring losses (~0.58) ranked higher than E0G, E6G, E7G—the only openings that never lost.

Solution: Thompson Sampling — a principled Bayesian approach to exploration-exploitation that naturally avoids catastrophic losers:

Key Innovation: Half-Draw Credits

Each opening's posterior: Beta(α, β) where:

```
α = 1 + wins + 0.5 × draws    (successes, half-weighted)
β = 1 + losses + 0.5 × draws  (failures, half-weighted)
```

Why draws are half-weighted: In a 0-win regime, draws represent partial success (game didn't lose). Without this credit, all openings collapse to Beta(1, 1+losses), losing all signal about safety. With it:

  • E0G (0W/10D/0L) → Beta(6, 6), mean=0.5, strong selection
  • E11G (0W/0D/10L) → Beta(1, 11), mean=0.08, almost never selected
  • Untried openings → Beta(1, 1), mean=0.5, explored freely

How it works:

Each game:

  1. For all 30 openings: sample from Beta(α, β)
  2. Play the opening with the highest sample
  3. Record result and update counts
  4. Learning-rate gating: REINFORCE disabled for first 10 games (MIN_GAMES_BEFORE_LEARNING) to collect clean opening data, then enabled
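
The whole selection loop fits in a few lines. A sketch using the half-draw posterior defined above (opening names and counts here are examples):

```python
import random

def select_opening(stats):
    """stats: {opening: (wins, draws, losses)} over all 30 openings.
    Sample each posterior once, play the argmax."""
    def thompson_sample(wins, draws, losses):
        alpha = 1 + wins + 0.5 * draws    # successes, half-weighted
        beta = 1 + losses + 0.5 * draws   # failures, half-weighted
        return random.betavariate(alpha, beta)
    return max(stats, key=lambda o: thompson_sample(*stats[o]))

# E0G (0W/10D/0L) ~ Beta(6,6) dominates E11G (0W/0D/10L) ~ Beta(1,11):
stats = {"E0G": (0, 10, 0), "E11G": (0, 0, 10), "E3R": (0, 0, 0)}
print(select_opening(stats))  # usually E0G; untried E3R still gets sampled
```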

Test Results (8 comprehensive tests, all passing):

| Test | Expected | Result |
| --- | --- | --- |
| Thompson favors safe openings | E0G (5 wins) selected >150/1000 | 186/1000 (18.6%) |
| Thompson explores untried | E0G (untried among 10L openings) >50/1000 | 54–80/1000 (5–8%) |
| v1→v2 migration | 74 games migrated losslessly | 0 data loss |
| Learning gating | Learning disabled games 1–9, enabled game 10+ | Correct transition |
| State persistence | Save/load across sessions | All counts match |

Expected Improvements (vs. old two-phase strategy):

| Metric | Old Strategy | Thompson Sampling |
| --- | --- | --- |
| Safe openings (0L) selected | ~30% | 80%+ |
| Catastrophic losers (100L) selected | ~20% | <5% |
| Win rate trajectory | Cliff collapse at game 30 | Monotonic improvement |
| Opening convergence | Broken (selects worst) | Principled (Bayesian) |

Migration & Backward Compatibility:

  • Old v1 state files (alphaq_explorer_state.json) automatically migrate to v2 on first load
  • All 74 existing games successfully migrated with zero data loss
  • State versioning ensures future compatibility

Code & Documentation:

  • Implementation: snowdrop_tangled_agents/matlab/matlab_strategy.py (complete rewrite)
  • Tests: 8 unit tests in snowdrop_tangled_agents/tests/test_matlab_integration.py
  • Full technical docs: docs/ALPHAQ_STRATEGY.md (mathematical foundation, error analysis, reproduction instructions)

Usage:

```
# Run trial with Thompson Sampling (default: 500K iterations, ~14 min per move)
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --games 10

# Fast mode for testing algorithm behavior (100K iterations, ~3 min per move)
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --mcts-iterations 100000 --games 10

# Inspect opening posteriors
cat ~/.tangled/alphaq_explorer_state.json
```

The Thompson Sampling approach generalizes to any opening-selection problem in Tangled and other stochastic games where a strategy needs to learn which first moves lead to success against a specific opponent.

13. Opening Re-Exploration Experiment (February 2026)

All prior conclusions about optimal openings against AlphaQ Up — including the "Nash Equilibrium" claim from Runs 60–66 — were made with broken MCTS parallelization. The parfor worker pool allocated 6 workers but never dispatched work to them: every iteration performed a single serial rollout while workers sat idle at 0% CPU. This means historical tests used 5,000 rollouts per move when they should have used 3,000,000 (a 600x deficit).

Additionally, the MCTS terminal evaluation used a tanh(score × 0.4) calibration fallback that is catastrophically miscalibrated for AlphaQ: it predicts 66% P(win) at score +0.8, where the actual win rate is 0% across 1,297 games. This caused the search to satisfice on the E7G opening line instead of exploring alternatives.

What changed:

| Fix | Before | After |
| --- | --- | --- |
| Parallel rollouts | 1 per iteration (broken) | 600 per iteration (6 workers × 100) |
| Calibration fallback | tanh(score × 0.4) — 66% at +0.8 | Generic calibration_pwin.mat — 9.5% at +0.8 |
| Total rollouts/move (5K iter) | 5,000 | 3,000,000 |

The experiment: Systematically re-test all 30 possible opening moves (15 edges × 2 colors) against AlphaQ on the Petersen graph with functional parallel MCTS and correct calibration.

| Phase | Openings | Games | Iterations | Rollouts/move |
| --- | --- | --- | --- | --- |
| Phase 1: Screening | All 30 | 90 (3 each) | 5,000 | 3,000,000 |
| Phase 2: Deep test | Promising | 10 each | 10,000 | 6,000,000 |

```
# Run Phase 1: systematic round-robin exploration of all 30 openings (est. ~8 hours)
poetry run python play_tangled.py \
  --strategy alphaq_explorer \
  --opponent alphaq \
  --games 90 \
  --mcts-iterations 5000 \
  --opening-mode round_robin
```

See docs/EXPERIMENT_OPENING_RE_EXPLORATION.md for the full experiment design, calibration analysis, and expected results.

14. Empirical Closure of the Petersen Graph Under Quantum Adjudication (February 2026)

A formal investigation into whether AlphaQ Up can be defeated on the Petersen graph.

Key findings:

  • 0 wins across 120 diverse terminal states against AlphaQ, from 2,436+ game observations
  • SA polarity inversion: Simulated annealing scores are anti-correlated (r = -0.396) with the website's quantum adjudicator against AlphaQ -- states the SA LUT predicts as wins are actually losses
  • AlphaQ zero-loss equilibrium: Max observed website score +0.861, far below the +2 win threshold. AlphaQ's policy restricts the reachable terminal state basin to [-8.8, +0.9]
  • Statistical confidence: 95% confidence that winning terminals comprise < 2.5% of the reachable state space (binomial model; derivation sketched below)
  • Sampling bias caveat: Classical strategies may be biased toward classically accessible regions; winning terminals may exist only in non-classical portions of the state space
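
The 2.5% figure follows from the zero-successes binomial bound (essentially the rule of three), under the model's assumption that the 120 sampled terminals behave as independent draws:

```python
# 95% upper bound on the win fraction p after 0 wins in n trials:
# solve (1 - p)^n = 0.05 for p.
n = 120
p_upper = 1 - 0.05 ** (1 / n)
print(f"p < {p_upper:.4f}")   # p < 0.0247, i.e. the < 2.5% claim
```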

The analysis is grounded in 35 references spanning quantum annealing theory (Kadowaki & Nishimori 1998), order-by-disorder phenomena (Villain et al. 1980), Petersen graph symmetry (Schmidt 2018), and Geordie Rose's Tangled game design and $10,000 AlphaQ Up Challenge.

Tools created during this investigation:

  • tools/build_website_lut.py -- empirical website-calibrated LUT builder
  • strategy/terminal_explorer_strategy.py -- diversity-maximizing terminal state exploration
  • Oracle route cycling mode (--route-mode cycle)

See docs/EMPIRICAL_CLOSURE_OF_THE_PETERSEN_GRAPH_UNDER_QUANTUM_ADJUDICATION.md for the full paper.

Documentation

  • docs/EMPIRICAL_CLOSURE_OF_THE_PETERSEN_GRAPH_UNDER_QUANTUM_ADJUDICATION.md - Formal paper on the empirical negative result against AlphaQ
  • docs/EMPIRICAL_LUT_CONSTRUCTION_RESULTS.md - Campaign results: 65 games, SA polarity inversion data
  • docs/HIGH_COMPUTE_STRATEGY_OPTIONS.md - GPU + MATLAB strategy options for unlimited compute
  • docs/POST_SA_LUT_STRATEGY_OPTIONS.md - Strategy alternatives after SA LUT proven anti-correlated
  • docs/DEPENDENCY_REPOSITORIES.md - Complete documentation of the three dependency repos (game engine, adjudicators, tangled-adjudicate)
  • docs/ALPHAQ_STRATEGY.md - Thompson Sampling mathematical foundation, test results, reproduction guide
  • docs/THEORY_OF_OPERATION.md - Comprehensive system documentation
  • docs/MATLAB_INTEGRATION.md - Complete MATLAB integration guide
  • docs/MATLAB_MCTS_STRATEGY.md - MATLAB MCTS theory of operation and tuning guide
  • docs/HYBRID_MINIMAX_MCTS_PLAN.md - D-Wave inspired hybrid solver implementation plan
  • docs/MCTS_DEPTH_ENHANCEMENT_RESEARCH.md - MCTS depth enhancement literature review
  • docs/GAME_ANALYTICS.md - Game analytics and visualization system
  • docs/TEST_SUITE.md - Regression test documentation
  • docs/THE_MATHEMATICS_OF_TANGLED_GAME.md - Game theory analysis
  • docs/EXPERIMENT_OPENING_RE_EXPLORATION.md - Opening re-exploration experiment design and calibration analysis
