AI agents that compete in the quantum game of Tangled
A comprehensive framework for building intelligent agents that play Tangled, a quantum game where players color edges on graph structures and outcomes are determined by quantum annealing. This project combines Monte Carlo Tree Search, MATLAB-powered compute, and machine learning to compete against the game's built-in AI opponents.
| Component | Technology |
|---|---|
| Core Language | Python 3.11+ |
| High-Performance Compute | MATLAB R2024a+ (Deep Learning, Statistics, Parallel Computing Toolboxes) |
| Web Automation | Playwright (Chromium) |
| Database | SQLite with automatic migrations |
| Game Engine | snowdrop-tangled-game-engine |
| Quantum Adjudication | Simulated Annealing, Schrödinger Equation solvers |
- Thompson Sampling Opening Selection - Bayesian approach to exploration-exploitation in AlphaQExplorerStrategy
- MCTS Strategy Engine - Monte Carlo Tree Search with 500K iterations per move (default), configurable 100K-1M for speed/quality tradeoff
- MATLAB Integration - High-compute search with parallel `parfor` rollouts across 6 workers and neural network evaluation
- Live Web Play - Automated gameplay on tangled-game.com via Playwright
- Statistical Analysis - SQLite-backed game history with pattern discovery
- Calibrated Heuristics - Terminal evaluation tuned from 140+ real games
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11 – 3.13 | |
| Poetry | 2.x | Package manager. Install: pipx install poetry |
| Git | any | |
| MATLAB | R2024a+ | Required for competitive play (see MATLAB Setup) |
# Clone and install
git clone https://github.com/snowdropquantum/snowdrop-tangled-agents.git
cd snowdrop-tangled-agents
poetry install
poetry run playwright install chromium
Or use the automated setup script (installs Poetry if missing):
# Windows:
setup.bat
# macOS / Linux:
chmod +x setup.sh && ./setup.sh
cp .env.example .env
# Edit .env: add your tangled-game.com credentials (sign up at https://tangled-game.com)
See .env.example for all available settings, including the optional live stats dashboard.
# Play 1 game against MCTS Melissa (default opponent, hybrid_solver strategy)
poetry run python play_tangled.py --games 1
# Play against AlphaQ Up using the AlphaQ Explorer strategy (5K MCTS iterations)
poetry run python play_tangled.py --strategy alphaq_explorer --opponent alphaq \
--games 3 --mcts-iterations 5000
# View game statistics from the local database
poetry run python play_tangled.py --stats
MATLAB powers the high-performance MCTS engine. Without it, the system falls back to a pure-Python heuristic that is not competitive against strong opponents.
Required toolboxes:
| Toolbox | Used for |
|---|---|
| Parallel Computing | parfor rollouts — 6 workers × 100 rollouts/iter |
| Statistics and Machine Learning | interp1 P(win) calibration curve |
Optional toolboxes (for advanced features):
| Toolbox | Used for |
|---|---|
| Deep Learning | Neural network position evaluation priors |
| Database | Direct SQLite access from MATLAB training pipelines |
| MATLAB Compiler SDK | Building standalone Python-callable packages |
| Reinforcement Learning | RL training experiments |
Install the MATLAB Engine API for Python:
# From the MATLAB installation directory:
cd "C:\Program Files\MATLAB\R2026a\extern\engines\python" # Windows
# cd /Applications/MATLAB/R2026a/extern/engines/python # macOS
# cd /usr/local/MATLAB/R2026a/extern/engines/python # Linux
python -m pip install .
The engine starts automatically in headless mode when you run a game — no need to open the MATLAB GUI. All .m source files live in snowdrop_tangled_agents/matlab/rl/ within this repository.
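To sanity-check the Engine API installation directly, you can start a headless session from Python (these are standard Engine API calls; the `sqrt` call is just a smoke test):

```python
import matlab.engine

# Start a headless MATLAB session, run a trivial computation, shut down.
eng = matlab.engine.start_matlab()
print(eng.sqrt(4.0))  # -> 2.0
eng.quit()
```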
Verify MATLAB is detected:
poetry run python test_matlab_detection.py

| Strategy | Engine | Best for |
|---|---|---|
| `hybrid_solver` | MATLAB Minimax + MCTS + Tabu | Default. Strong general play. |
| `alphaq_explorer` | MATLAB MCTS + Thompson Sampling | Competing against AlphaQ Up. |
| `matlab_mcts` | MATLAB MCTS only | Pure MCTS experiments. |
| `mcts` | Pure Python MCTS | No MATLAB required. Weaker. |
| `heuristic` | Python heuristics | Fast, always available. Not competitive. |
After installation and MATLAB setup, you can run the current flagship experiment — systematic re-exploration of all 30 Petersen graph openings against AlphaQ Up:
# Phase 1: Screen all 30 openings, 3 games each (est. ~8 hours)
poetry run python play_tangled.py \
--strategy alphaq_explorer \
--opponent alphaq \
--games 90 \
--mcts-iterations 5000 \
--opening-mode round_robin
See docs/EXPERIMENT_OPENING_RE_EXPLORATION.md for the full experiment design and rationale.
All game data is recorded in a local SQLite database at
~/.tangled/game_stats.db. Every game stores:
- Result, final score, opening move, strategy, MCTS iteration setting
- Move-by-move: edge, color, score, board state, thinking time
- MCTS diagnostics: iterations, total rollout simulations, tree depth
# Quick summary
poetry run python play_tangled.py --stats
# Direct SQL queries
sqlite3 ~/.tangled/game_stats.db "SELECT opening_edge, opening_color, result, final_score FROM games WHERE opening_mode='round_robin' ORDER BY opening_edge, opening_color"
For real-time monitoring during long experiments, deploy the tangled-stats-dashboard to fly.io or any host that supports WebSockets:
- Deploy the dashboard app (see its `.env.example` for configuration)
- Set `TANGLED_DASHBOARD_URL` and `TANGLED_DASHBOARD_API_KEY` in your `.env` (example below)
- Game telemetry streams automatically during play
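For example, the dashboard entries in `.env` might look like this (placeholder values):

```
TANGLED_DASHBOARD_URL=https://your-dashboard.fly.dev
TANGLED_DASHBOARD_API_KEY=your-api-key
```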
If no dashboard is configured, experiments run normally and all data is available for post-analysis from the SQLite database.
This repo includes a random agent (see RandomRandyAgent in random_agent.py). To build your own agent, follow the instructions and pattern for that agent. The interface is straightforward: your agent receives a Game instance (a snapshot of the game in progress, including the current game state) and outputs a move, which is just which edge to pick and what color to apply to it. That's all! The process you use to select the move can be arbitrarily complex, but the agent interface itself is simple.
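As a rough sketch of that shape (the method and accessor names below are placeholders; mirror the actual interface used by `RandomRandyAgent` in `random_agent.py`):

```python
import random

class MyAgent:
    """Skeleton agent. The interface shown is illustrative only; copy the
    real one from RandomRandyAgent in random_agent.py."""

    def choose_move(self, game):
        # `game` is a snapshot of the game in progress. A move is simply
        # an (edge, color) pair drawn from the legal options that remain.
        legal_moves = game.get_legal_moves()  # hypothetical accessor
        return random.choice(legal_moves)
```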
A test script is included in run_local_parallel_tournament.py. Out of the box it runs 100,000 games for each of the X-Prize graphs (not including the mutant C60 graph), using simulated annealing as the adjudicator and two random agents as the players.
If you want to build your own agent, I would start by building your own adjudicator (look at both adjudicators in the snowdrop-adjudicators repo). The better you can spoof the D-Wave hardware, the higher the quality of the reward signal your agent has to learn from. To do this, I recommend building a look-up table adjudicator: first enumerate all the terminal states for a small game graph, then adjudicate all of them using the Schrödinger equation adjudicator (this limits you to small game graphs, but it's still a great warm-up for something more ambitious); a sketch of this idea follows below. Once you have your lookup table adjudicator, you can build an agent that uses it to give you the correct answer every time. Add your adjudicator and new agents to run_local_parallel_tournament.py and see how well they do against the random agent and any other agents you have built!
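A minimal sketch of the lookup-table idea, assuming a two-color encoding per edge and an exact `schrodinger_adjudicate` callable standing in for the Schrödinger equation adjudicator from snowdrop-adjudicators (both are assumptions; adapt to the real interfaces):

```python
from itertools import product

COLORS = (1, 2)  # hypothetical edge-color encoding; match the game engine's

def build_lut(num_edges, schrodinger_adjudicate):
    """Enumerate every terminal coloring of a small graph and adjudicate
    each once with the slow, exact solver (2^n states for n edges)."""
    return {
        coloring: schrodinger_adjudicate(coloring)
        for coloring in product(COLORS, repeat=num_edges)
    }

def lut_adjudicate(lut, coloring):
    # Constant-time adjudication once the table is built.
    return lut[tuple(coloring)]
```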
This section documents the actual development steps taken to build a competitive Tangled agent.
Implemented a Monte Carlo Tree Search (MCTS) strategy with the following features:
- UCB1 exploration with a configurable exploration constant (see the sketch after this list)
- Simulated Annealing adjudicator for terminal state evaluation
- Opening book with pre-computed strong opening moves for the Petersen graph
- Time and iteration limits for move calculation
- Statistics collection for game analysis
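For reference, the UCB1 score has the standard form below (a minimal sketch; the exploration constant `c` defaults to √2 here, and the strategy's actual default may differ):

```python
import math

def ucb1_score(child_wins, child_visits, parent_visits, c=math.sqrt(2)):
    """Average reward plus an exploration bonus that shrinks as a child
    is visited more often relative to its parent."""
    if child_visits == 0:
        return float("inf")  # always try unvisited children first
    exploitation = child_wins / child_visits
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploitation + exploration
```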
# Play with MCTS strategy (default: 500K iterations, unlimited time)
python play_tangled.py --strategy mcts --games 3
# Fast mode for testing (100K iterations, 30s per move)
python play_tangled.py --strategy mcts --mcts-iterations 100000 --mcts-time 30 --games 3
Built a Playwright-based bot to play on tangled-game.com:
- Automatic login with credentials from `.env`
- Game state parsing from the web interface
- Move execution via browser automation
- Score tracking and game result detection
- Keep-alive mode for continuous play sessions
# Play games on tangled-game.com
python play_tangled.py --strategy hybrid --games 10 --keep-open
Created a SQLite-based statistics system:
- Game history tracking (opponent, result, scores, timestamps)
- Move-by-move analysis with state snapshots
- Adjudicator calibration data collection
- Win rate analysis by opponent and strategy
# View statistics
python play_tangled.py --stats
python play_tangled.py --calibration
Extended the system with MATLAB integration for advanced ML capabilities:
| Toolbox | Purpose |
|---|---|
| Deep Learning Toolbox | Neural network training for position evaluation |
| Statistics and ML Toolbox | Opponent clustering and style classification |
| Database Toolbox | Direct MATLAB-SQLite access for training pipelines |
| MATLAB Compiler SDK | Package MATLAB code as Python-callable modules |
Value Network (position evaluation):
- Input: 50-element feature vector (board state, edge categories, game phase)
- Hidden: FC(128) → ReLU → Dropout → FC(64) → ReLU → FC(32) → ReLU
- Output: Tanh → value ∈ [-1, +1]
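For readers more at home in PyTorch, an equivalent sketch of that architecture (the shipped network is built in MATLAB's Deep Learning Toolbox; the dropout rate here is an assumption):

```python
import torch.nn as nn

# 50 features -> FC(128) -> ReLU -> Dropout -> FC(64) -> ReLU
#             -> FC(32) -> ReLU -> FC(1) -> Tanh, value in [-1, +1]
value_net = nn.Sequential(
    nn.Linear(50, 128), nn.ReLU(), nn.Dropout(p=0.5),  # dropout rate assumed
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Tanh(),
)
```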
Opponent Model (style classification):
- 20-element opponent feature vector (edge preferences, aggression metrics)
- K-means clustering into play styles (aggressive, defensive, hub-focused)
- Adaptive prior adjustment based on opponent tendencies
Built standalone Python packages using MATLAB Compiler SDK:
| Package | Functions | Purpose |
|---|---|---|
| `tangled_value_network` | `evaluate_position_nn` | Neural network inference |
| `tangled_opponent_model` | `classify_opponent`, `adapt_to_opponent` | Opponent modeling |
| `tangled_training` | `train_value_network`, `cluster_opponents` | Model training |
For users without MATLAB:
- Download MATLAB Runtime R2026a (free) from MathWorks
- Install the compiled packages: pip install tangled_value_network tangled_opponent_model
Important: Runtime version must exactly match the compile version (R2026a).
The system automatically selects the best available backend:
1. Compiled Packages (fastest, MATLAB Runtime only)
↓ if unavailable
2. MATLAB Engine API (full functionality, requires license)
↓ if unavailable
3. Pure Python Heuristics (always available)
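A sketch of that fallback chain (the import-based probing is illustrative; the repo's actual detection logic is what `--training-status` reports):

```python
def select_backend() -> str:
    """Pick the strongest available evaluation backend, in priority order."""
    try:
        import tangled_value_network  # compiled package (MATLAB Runtime only)
        return "compiled"
    except ImportError:
        pass
    try:
        import matlab.engine  # full Engine API (requires a MATLAB license)
        return "engine"
    except ImportError:
        return "python_heuristics"  # always available
```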
# Check what's available
python play_tangled.py --training-status
# Play with MATLAB enhancement
python play_tangled.py --strategy matlab --use-nn --adapt-opponent --games 5
A hybrid Minimax-MCTS solver inspired by D-Wave's classical optimization techniques:
Architecture:
- Alpha-Beta Minimax - Exact search at shallow depths with transposition tables
- Tabu Search - D-Wave MST2-inspired multistart optimization for rollout refinement
- MCTS - Monte Carlo Tree Search with progressive widening for deep exploration
- Expanded LUT - 19 million pre-computed exact minimax values
Lookup Table Coverage:
| Grey Edges | States | Method |
|---|---|---|
| 0 (terminal) | 32,768 | Direct lookup |
| 1 | 491,520 | Minimax depth-1 |
| 2 | 3,440,640 | Minimax depth-2 |
| 3 | 14,909,440 | Minimax depth-3 |
| Total | 18,874,368 | Exact values |
This gives guaranteed optimal play for the last 4 moves of every game.
References:
- D-Wave qbsolv decomposition strategy
- D-Wave dwave-tabu (MST2 algorithm)
- Palubeckis (2004) "Multistart Tabu Search Strategies"
# Play with Hybrid Solver (default: 500K iterations, unlimited time)
python play_tangled.py --strategy hybrid_solver --games 3
# Competitive mode with maximum strength (1M iterations)
python play_tangled.py --strategy hybrid_solver --mcts-iterations 1000000 --games 1
See docs/HYBRID_MINIMAX_MCTS_PLAN.md for the complete implementation plan.
High-compute MCTS implementation in MATLAB for live play against MCTS Melissa:
Key Features:
- 500K iterations per move with unlimited time (default, elite quality - ~14 min per move)
- Configurable 100K-1M iterations via `--mcts-iterations` for a speed/quality tradeoff
- Parallel `parfor` rollouts — each MCTS iteration runs `NumWorkers` (default 6) independent rollouts in parallel and averages the results, producing 6x more samples per leaf visit while using the already-allocated worker pool that previously sat idle
- Domain-specific terminal evaluation calibrated from 140+ games
- Opening book securing our vertex edges (E9, E10, E11 Green)
- Adaptive exploration based on score momentum
- Compute diagnostics using MATLAB's `cputime` and `memory` functions
Critical Discoveries:
- E12 Green is catastrophic: Connecting hub (V6) to opponent vertex (V7) with ferromagnetic coupling helps the opponent. Added -0.8 penalty.
- E2 Green causes score collapse: Observed -2.6 point drops at move 4. Added -0.5 penalty.
- Turn-based timing: Game is strictly turn-based; apparent "race conditions" were actually stale DOM state from reading board before browser fully updated.
Results:
- Before fixes: Score peaked at +2.95, collapsed to -2.6 at move 4
- After fixes: Stable score progression, competitive games (draws and close losses)
# Play with MATLAB MCTS (default: 500K iterations, unlimited time - elite quality)
python play_tangled.py --strategy matlab_mcts --games 3
# Fast testing mode (100K iterations)
python play_tangled.py --strategy matlab_mcts --mcts-iterations 100000 --games 5
# View game statistics
python play_tangled.py --stats
See docs/MATLAB_MCTS_STRATEGY.md for the complete theory of operation.
An instrumented research system for tracking performance, discovering patterns, and measuring improvement:
Visualization Tools:
- Progress plots — Rolling win rate and score trends over time
- Edge effectiveness — Score delta and win rate by edge/color combination
- Opening analysis — Identify winning opening sequences with sample sizes
Design Philosophy:
- Observability over raw metrics — answer why, not just what
- Every plot maps to a subsystem lever you can actually adjust
- Query-first architecture — plots are views, not logic
- Timestamped outputs for historical comparison
# Generate all analysis plots
python -m snowdrop_tangled_agents.tools.plot_progress --all
# Individual plot types
python -m snowdrop_tangled_agents.tools.plot_progress -t progress
python -m snowdrop_tangled_agents.tools.plot_progress -t edge
python -m snowdrop_tangled_agents.tools.plot_progress -t opening
Plots are saved to plots/ with the naming scheme {type}_{YYYYMMDD}_{HHMMSS}.png.
See docs/GAME_ANALYTICS.md for complete documentation and research applications.
A two-phase explore/exploit strategy built to contend with the new AlphaQ Up opponent. The key advance over previous opponent-specific strategies is a closed learning loop: learned edge-value adjustments now actually modify the MATLAB MCTS rollout priors, so the solver gets stronger with every game it plays.
How it works:
| Phase | Games | Behaviour |
|---|---|---|
| Exploration | 0–29 | Cycles all 30 possible first moves (15 edges × 2 colors). Learning is disabled so each opening is tested under identical solver conditions. Win/score is recorded per opening. |
| Exploitation | 30+ | Re-enables REINFORCE learning. Rotates only the top-N openings (default 5) ranked by wins then avg score. After every game, the updated edge_adjustments vector is pushed into MATLAB via hybridSolver.setEdgeBias(). |
The closed loop in detail:
Previously HybridSolverStrategy ran REINFORCE and accumulated per-edge adjustments in Python,
but those values were never fed back into the MATLAB solver — the MCTS rollouts kept using the
same static heuristic priors every game. This change adds:
- An `EdgeBias` property (1×15 double) on both `TangledMCTS` and `HybridTangledSolver`
- A `setEdgeBias()` method on both classes, with the solver forwarding to its MCTS instance
- Two lines at the end of `computeRolloutPrior`: `prior += EdgeBias(edge)`, clamped to [0.001, 0.999] (sketched below)
- Bias preservation across `setPlayer()` so a new MCTS instance inherits learned adjustments
- `AlphaQExplorerStrategy._push_edge_bias()`, which calls `hybridSolver.setEdgeBias([...])` after each exploitation game, closing the loop
Persistence and resumability: Full strategy state (phase, exploration results, exploitation
openings, index) is saved to ~/.tangled/alphaq_explorer_state.json. If a run is interrupted
during exploitation, the next run re-enables learning and re-pushes the accumulated bias before
playing.
# Run exploration games followed by exploitation (default: 500K iterations per move)
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --run 60
# Fast exploration mode for development (100K iterations)
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --mcts-iterations 100000 --run 60
# Check what the explorer learned
cat ~/.tangled/alphaq_explorer_state.json
cat ~/.tangled/hybrid_solver_adjustments.json
Two fixes targeting the learning stall observed in Run 47 (0W/27L/33D vs AlphaQ Up).
Fix 1 — Score-weighted draw reward
Against strong opponents wins are rare; most games end in draws. The old REINFORCE reward for a
draw was a flat ±0.1 regardless of the final score, which gave the policy gradient almost no signal.
The draw reward is now score × 0.65, so a near-miss draw at +0.78 produces reward +0.507 instead
of +0.1. The multiplier 0.65 is chosen so that the maximum realistic draw reward (+0.65 at score
+1.0) stays below the minimum win reward (1.0), preserving the win > draw > loss ordering. Win and loss branches are unchanged; a code sketch of the full reward rule follows the table below.
| Scenario | Old reward | New reward |
|---|---|---|
| Win at +2.0 | 2.0 | 2.0 |
| Draw at +0.78 | +0.1 | +0.507 |
| Draw at 0.0 | +0.1 | 0.0 |
| Draw at −0.5 | −0.1 | −0.325 |
| Loss at −1.89 | −1.945 | −1.945 |
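In code, the reward shaping reads as follows; the draw branch is exactly as specified above, while the win/loss branches are inferred from the table values (±1 + 0.5 × score reproduces both the 2.0 and −1.945 rows):

```python
def reinforce_reward(result: str, final_score: float) -> float:
    """Score-weighted REINFORCE reward, preserving win > draw > loss."""
    if result == "win":
        return 1.0 + 0.5 * final_score   # inferred: 2.0 at score +2.0
    if result == "loss":
        return -1.0 + 0.5 * final_score  # inferred: -1.945 at score -1.89
    return 0.65 * final_score            # draw: +0.507 at score +0.78
```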
Fix 2 — Opponent-conditional calibration
The P(win) calibration curve was fitted on Melissa game data. Loading it for every opponent made the
solver complacent at moderate scores against opponents with different noise profiles. loadCalibration()
now follows a tiered lookup:
1. Named opponent with a fitted curve → load calibration_<name>.mat
2. Named opponent, no fitted curve → tanh sigmoid fallback
3. No opponent name (legacy path) → load calibration_pwin.mat
calibration_melissa.mat is shipped alongside calibration_pwin.mat so Melissa resolves correctly.
New per-opponent curves can be added by dropping a calibration_<name>.mat into data/ — the
sanitisation rule is: lowercase, replace non-[a-z0-9] with _, collapse runs, strip trailing _.
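A minimal Python rendering of that sanitisation rule:

```python
import re

def calibration_filename(opponent_name: str) -> str:
    """Lowercase, replace non-[a-z0-9] with '_', collapse runs, strip trailing '_'."""
    name = re.sub(r"[^a-z0-9]", "_", opponent_name.lower())
    name = re.sub(r"_+", "_", name).rstrip("_")
    return f"calibration_{name}.mat"

# calibration_filename("AlphaQ Up") -> "calibration_alphaq_up.mat"
```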
Verification (Run 50, 10 games vs AlphaQ Up):
- Tanh fallback fires correctly (no calibration_alphaq_up.mat exists)
- Draw rewards: 0.34–0.50 per game (vs the flat 0.1 previously)
- Final edge adjustments are non-uniform: +0.035 on edges played in good draws, −0.081 on edges associated with losses
# Play with the updated solver
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --run 10
# Inspect learned bias
cat ~/.tangled/hybrid_solver_adjustments.json
The two-phase explore/exploit strategy (Section 10) had a critical failure: the greedy opening ranking by (wins DESC, avg_score DESC) degenerated to avg_score DESC when all 74 games had 0 wins. This caused the strategy to select E9G (89% loss rate) and E11G (100% loss rate) for exploitation, producing a 0% win rate during games 29–56 and collapsing the entire strategy.
Root Cause: With zero wins across all openings, the sort key (wins, avg_score) defaults to average score. E9G's single draw at high score (0.728) and E11G's high-scoring losses (~0.58) ranked higher than E0G, E6G, E7G—the only openings that never lost.
Solution: Thompson Sampling — a principled Bayesian approach to exploration-exploitation that naturally avoids catastrophic losers:
Key Innovation: Half-Draw Credits
Each opening's posterior: Beta(α, β) where:
α = 1 + wins + 0.5 × draws (successes, half-weighted)
β = 1 + losses + 0.5 × draws (failures, half-weighted)
Why draws are half-weighted: In a 0-win regime, draws represent partial success (game didn't lose). Without this credit, all openings collapse to Beta(1, 1+losses), losing all signal about safety. With it:
- E0G (0W/10D/0L) → Beta(6, 6), mean=0.5, strong selection
- E11G (0W/0D/10L) → Beta(1, 11), mean=0.08, almost never selected
- Untried openings → Beta(1, 1), mean=0.5, explored freely
How it works:
Each game:
- For all 30 openings: sample from Beta(α, β)
- Play the opening with the highest sample
- Record result and update counts
- Learning-rate gating: REINFORCE disabled for first 10 games (MIN_GAMES_BEFORE_LEARNING) to collect clean opening data, then enabled
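A condensed sketch of the selection step (assuming per-opening counts are tracked as (wins, draws, losses) tuples):

```python
import random

def select_opening(counts):
    """Thompson Sampling with half-draw credits: draw one Beta sample per
    opening and play the argmax. counts maps opening -> (wins, draws, losses)."""
    def sample(stats):
        wins, draws, losses = stats
        return random.betavariate(1 + wins + 0.5 * draws,
                                  1 + losses + 0.5 * draws)
    return max(counts, key=lambda opening: sample(counts[opening]))
```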
Test Results (8 comprehensive tests, all passing):
| Test | Expected | Result |
|---|---|---|
| Thompson favors safe openings | E0G (5 wins) selected >150/1000 | 186/1000 (18.6%) ✓ |
| Thompson explores untried | E0G (untried among 10L openings) >50/1000 | 54–80/1000 (5–8%) ✓ |
| v1→v2 migration | 74 games migrated losslessly | 0 data loss ✓ |
| Learning gating | Learning disabled games 1–9, enabled game 10+ | Correct transition ✓ |
| State persistence | Save/load across sessions | All counts match ✓ |
Expected Improvements (vs. old two-phase strategy):
| Metric | Old Strategy | Thompson Sampling |
|---|---|---|
| Safe openings (0L) selected | ~30% | 80%+ |
| Catastrophic losers (100L) selected | ~20% | <5% |
| Win rate trajectory | Cliff collapse at game 30 | Monotonic improvement |
| Opening convergence | Broken (selects worst) | Principled (Bayesian) |
Migration & Backward Compatibility:
- Old v1 state files (alphaq_explorer_state.json) automatically migrate to v2 on first load
- All 74 existing games successfully migrated with zero data loss
- State versioning ensures future compatibility
Code & Documentation:
- Implementation: snowdrop_tangled_agents/matlab/matlab_strategy.py (complete rewrite)
- Tests: 8 unit tests in snowdrop_tangled_agents/tests/test_matlab_integration.py
- Full technical docs: docs/ALPHAQ_STRATEGY.md (mathematical foundation, error analysis, reproduction instructions)
Usage:
# Run trial with Thompson Sampling (default: 500K iterations, ~14 min per move)
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --games 10
# Fast mode for testing algorithm behavior (100K iterations, ~3 min per move)
python play_tangled.py --strategy alphaq_explorer --opponent alphaq --mcts-iterations 100000 --games 10
# Inspect opening posteriors
cat ~/.tangled/alphaq_explorer_state.json
The Thompson Sampling approach generalizes to any opening-selection problem in Tangled and other stochastic games where a strategy needs to learn which first moves lead to success against a specific opponent.
All prior conclusions about optimal openings against AlphaQ Up — including the
"Nash Equilibrium" claim from Runs 60–66 — were made with broken MCTS
parallelization. The parfor worker pool allocated 6 workers but never dispatched
work to them: every iteration performed a single serial rollout while workers sat
idle at 0% CPU. This means historical tests used 5,000 rollouts per move when
they should have used 3,000,000 (a 600x deficit).
Additionally, the MCTS terminal evaluation used a tanh(score × 0.4) calibration
fallback that is catastrophically miscalibrated for AlphaQ: it predicts 66% P(win)
at score +0.8, where the actual win rate is 0% across 1,297 games. This caused
the search to satisfice on the E7G opening line instead of exploring alternatives.
What changed:
| Fix | Before | After |
|---|---|---|
| Parallel rollouts | 1 per iteration (broken) | 600 per iteration (6 workers × 100) |
| Calibration fallback | tanh(score × 0.4) — 66% at +0.8 | Generic calibration_pwin.mat — 9.5% at +0.8 |
| Total rollouts/move (5K iter) | 5,000 | 3,000,000 |
The experiment: Systematically re-test all 30 possible opening moves (15 edges × 2 colors) against AlphaQ on the Petersen graph with functional parallel MCTS and correct calibration.
| Phase | Openings | Games | Iterations | Rollouts/move |
|---|---|---|---|---|
| Phase 1: Screening | All 30 | 90 (3 each) | 5,000 | 3,000,000 |
| Phase 2: Deep test | Promising | 10 each | 10,000 | 6,000,000 |
# Run Phase 1: systematic round-robin exploration of all 30 openings (est. ~8 hours)
poetry run python play_tangled.py \
--strategy alphaq_explorer \
--opponent alphaq \
--games 90 \
--mcts-iterations 5000 \
--opening-mode round_robin
See docs/EXPERIMENT_OPENING_RE_EXPLORATION.md for the full experiment design, calibration analysis, and expected results.
A formal investigation into whether AlphaQ Up can be defeated on the Petersen graph.
Key findings:
- 0 wins across 120 diverse terminal states against AlphaQ, from 2,436+ game observations
- SA polarity inversion: Simulated annealing scores are anti-correlated (r = -0.396) with the website's quantum adjudicator against AlphaQ; states the SA LUT predicts as wins are actually losses
- AlphaQ zero-loss equilibrium: Max observed website score +0.861, far below the +2 win threshold. AlphaQ's policy restricts the reachable terminal state basin to [-8.8, +0.9]
- Statistical confidence: 95% confidence that winning terminals comprise < 2.5% of the reachable state space (binomial model; see the calculation below)
- Sampling bias caveat: Classical strategies may be biased toward classically accessible regions; winning terminals may exist only in non-classical portions of the state space
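The bound itself is the standard zero-successes binomial argument (the rule of three), treating the 120 tested terminals as independent draws from the reachable space:

```python
# Largest win probability p consistent, at 95% confidence, with 0 wins
# in n = 120 independent trials: solve (1 - p)^n = 0.05 for p.
n = 120
p_upper = 1 - 0.05 ** (1 / n)
print(f"{p_upper:.4f}")  # ~0.0247, i.e. just under 2.5% (close to 3/n)
```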
The analysis is grounded in 35 references spanning quantum annealing theory (Kadowaki & Nishimori 1998), order-by-disorder phenomena (Villain et al. 1980), Petersen graph symmetry (Schmidt 2018), and Geordie Rose's Tangled game design and $10,000 AlphaQ Up Challenge.
Tools created during this investigation:
- tools/build_website_lut.py -- empirical website-calibrated LUT builder
- strategy/terminal_explorer_strategy.py -- diversity-maximizing terminal state exploration
- Oracle route cycling mode (--route-mode cycle)
See docs/EMPIRICAL_CLOSURE_OF_THE_PETERSEN_GRAPH_UNDER_QUANTUM_ADJUDICATION.md for the full paper.
- docs/EMPIRICAL_CLOSURE_OF_THE_PETERSEN_GRAPH_UNDER_QUANTUM_ADJUDICATION.md - Formal paper on the empirical negative result against AlphaQ
- docs/EMPIRICAL_LUT_CONSTRUCTION_RESULTS.md - Campaign results: 65 games, SA polarity inversion data
- docs/HIGH_COMPUTE_STRATEGY_OPTIONS.md - GPU + MATLAB strategy options for unlimited compute
- docs/POST_SA_LUT_STRATEGY_OPTIONS.md - Strategy alternatives after the SA LUT was proven anti-correlated
- docs/DEPENDENCY_REPOSITORIES.md - Complete documentation of the three dependency repos (game engine, adjudicators, tangled-adjudicate)
- docs/ALPHAQ_STRATEGY.md - Thompson Sampling mathematical foundation, test results, reproduction guide
- docs/THEORY_OF_OPERATION.md - Comprehensive system documentation
- docs/MATLAB_INTEGRATION.md - Complete MATLAB integration guide
- docs/MATLAB_MCTS_STRATEGY.md - MATLAB MCTS theory of operation and tuning guide
- docs/HYBRID_MINIMAX_MCTS_PLAN.md - D-Wave-inspired hybrid solver implementation plan
- docs/MCTS_DEPTH_ENHANCEMENT_RESEARCH.md - MCTS depth enhancement literature review
- docs/GAME_ANALYTICS.md - Game analytics and visualization system
- docs/TEST_SUITE.md - Regression test documentation
- docs/THE_MATHEMATICS_OF_TANGLED_GAME.md - Game theory analysis
- docs/EXPERIMENT_OPENING_RE_EXPLORATION.md - Opening re-exploration experiment design and calibration analysis