Benchmarking evolutionary AI on biological organoid data — standardised evaluation of genetic algorithms, evolutionary strategies, and neuroevolution on organoid simulation tasks.
Topics: computational-biology · artificial-general-intelligence · cognitive-architecture · evolutionary-computation · evolutionary-optimization · genetic-evolutionary-algorithms · genetic-programming · neuromorphic-computation · organoid-benchmarks · agi-benchmarks
GENEVO Benchmarks provides a standardised evaluation framework for evolutionary and genetic algorithms applied to synthetic organoid development tasks — a novel benchmark domain at the intersection of computational biology, evolutionary computation, and AI research. It defines a family of optimisation and simulation tasks derived from organoid biology (cellular differentiation, spatial self-organisation, morphogenetic field dynamics) and provides reference implementations of EA baselines for comparison.
The benchmark suite is motivated by the increasing use of organoids — three-dimensional organ-like tissue structures grown from stem cells — in drug discovery and disease modelling. Computational optimisation and evolution of organoid protocols (which growth factors to add, in which concentrations, at which time points) is a high-dimensional, multi-objective optimisation problem with expensive fitness evaluation (each protocol evaluation requires wet-lab or high-fidelity simulation).
The benchmark tasks are designed to be computationally tractable surrogates for real organoid optimisation: a cellular automaton model of differentiation, a reaction-diffusion system modelling morphogen gradients, and an agent-based model of cell migration and aggregation. Each task has a defined fitness function, configurable difficulty parameters, and reference EA performance curves.
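As a sketch of what the reaction-diffusion surrogate might look like (the actual task implementations live in the repository; the function and parameter names below are illustrative, not the suite's API), here is a minimal two-species Gray-Scott-style morphogen step in NumPy:

```python
import numpy as np

def reaction_diffusion_step(u, v, Du=0.16, Dv=0.08, feed=0.035, kill=0.065, dt=1.0):
    """One explicit Euler step of a Gray-Scott morphogen model (illustrative)."""
    def lap(z):
        # Discrete Laplacian with periodic boundaries via np.roll
        return (np.roll(z, 1, 0) + np.roll(z, -1, 0)
                + np.roll(z, 1, 1) + np.roll(z, -1, 1) - 4 * z)
    uvv = u * v * v
    u_next = u + dt * (Du * lap(u) - uvv + feed * (1 - u))
    v_next = v + dt * (Dv * lap(v) + uvv - (feed + kill) * v)
    return u_next, v_next

# Evolve a small grid from a homogeneous state with a local morphogen seed
u = np.ones((64, 64))
v = np.zeros((64, 64))
v[28:36, 28:36] = 0.5
for _ in range(100):
    u, v = reaction_diffusion_step(u, v)
```

A task fitness function would then score the resulting pattern (e.g. gradient sharpness or spatial uniformity) against a target; an EA tunes parameters such as `feed` and `kill`.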
Evolutionary algorithms are increasingly applied to biological optimisation problems, but fair comparison of algorithms across publications is difficult without standardised benchmarks and evaluation protocols. GENEVO Benchmarks was created to provide that standardisation specifically for organoid-inspired tasks — enabling reproducible, apples-to-apples comparisons between genetic algorithms, evolution strategies, neuroevolution approaches, and Bayesian optimisation baselines.
```
Benchmark Task (task_id + difficulty_level)
                      │
┌──────────────────────────────────────────────┐
│ Benchmark Tasks:                             │
│ ├── T1: Cellular differentiation CA model    │
│ ├── T2: Morphogen reaction-diffusion         │
│ ├── T3: Cell migration agent-based model     │
│ └── T4: Multi-lineage specification          │
└──────────────────────────────────────────────┘
                      │
Evolutionary Algorithm Baselines:
├── Genetic Algorithm (SGA, NSGA-II)
├── Evolution Strategies (CMA-ES, (1+1)-ES)
├── NEAT (NeuroEvolution of Augmenting Topologies)
└── Bayesian Optimisation (GPyOpt baseline)
                      │
Standardised evaluation: fitness curves, wall time, sample efficiency
```
T1 (cellular differentiation CA), T2 (morphogen reaction-diffusion), T3 (cell migration ABM), T4 (multi-lineage specification) — each with Easy/Medium/Hard difficulty levels and published reference performance curves.
Reference implementations of Genetic Algorithm (SGA, tournament selection, uniform crossover), NSGA-II (multi-objective), CMA-ES (evolution strategy), NEAT (topology evolution), and Gaussian Process Bayesian Optimisation.
Benchmark scoring based on sample efficiency (fitness at N function evaluations) rather than wall time alone — making results hardware-independent and meaningful for real-world experiment budget allocation.
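The sample-efficiency score can be read directly off a best-so-far fitness curve. A minimal sketch (helper name and curve format are assumptions for illustration, not the suite's API):

```python
import numpy as np

def fitness_at_budget(eval_counts, best_fitness, budget):
    """Best fitness reached within a given function-evaluation budget.

    eval_counts:  cumulative function-evaluation counts (increasing)
    best_fitness: best-so-far fitness recorded at each count
    """
    eval_counts = np.asarray(eval_counts)
    best_fitness = np.asarray(best_fitness)
    mask = eval_counts <= budget
    if not mask.any():
        raise ValueError("budget smaller than first recorded evaluation count")
    return float(best_fitness[mask][-1])

# Example: a fast-starting algorithm vs. a slow-but-steady one
curve_a = ([100, 500, 1000, 5000], [0.2, 0.5, 0.7, 0.9])
curve_b = ([100, 500, 1000, 5000], [0.4, 0.6, 0.65, 0.7])
print(fitness_at_budget(*curve_a, budget=1000))  # 0.7
```

Scoring at a fixed budget rather than wall time is what makes results comparable across hardware: curve B "wins" at a 100-evaluation budget, curve A at 5,000.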
The T4 task requires simultaneous optimisation of differentiation efficiency, spatial uniformity, and viability — three conflicting objectives that call for Pareto-front evolution and NSGA-II-class algorithms.
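To make the multi-objective setting concrete, a minimal Pareto-dominance filter over candidate solutions (pure-Python sketch; the suite's own multi-objective machinery comes from NSGA-II via DEAP):

```python
def pareto_front(points):
    """Return the non-dominated subset of objective tuples,
    assuming all objectives are to be maximised."""
    def dominates(a, b):
        # a dominates b: at least as good everywhere, strictly better somewhere
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Objectives: (differentiation efficiency, spatial uniformity, viability)
solutions = [(0.9, 0.2, 0.5), (0.6, 0.6, 0.6), (0.5, 0.5, 0.5), (0.2, 0.9, 0.7)]
front = pareto_front(solutions)  # (0.5, 0.5, 0.5) is dominated and drops out
```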
Interactive Plotly dashboard comparing fitness curves, sample efficiency, Pareto front area, and wall time across all algorithms on any selected benchmark task.
Plug-in interface for registering any custom EA or optimisation algorithm with the benchmark suite — three methods required: initialise(), ask(), tell().
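The README names the three required methods; the exact signatures below are an assumption for illustration. A minimal (1+1)-ES written against that ask/tell contract might look like:

```python
import numpy as np

class OnePlusOneES:
    """Minimal (1+1)-ES sketched against the suite's plug-in contract
    (initialise/ask/tell). Method signatures here are assumed."""

    def initialise(self, dim, seed=0):
        self.rng = np.random.default_rng(seed)
        self.sigma = 0.5
        self.parent = self.rng.standard_normal(dim)
        self.parent_fitness = -np.inf

    def ask(self):
        # Propose one Gaussian-mutated offspring of the current parent
        self.offspring = (self.parent
                          + self.sigma * self.rng.standard_normal(self.parent.shape))
        return self.offspring

    def tell(self, fitness):
        # Greedy replacement with a 1/5th-success-rule-style step-size update
        if fitness >= self.parent_fitness:
            self.parent, self.parent_fitness = self.offspring, fitness
            self.sigma *= 1.1
        else:
            self.sigma *= 0.98

# Maximise a toy fitness (negated sphere) under the ask/tell loop
algo = OnePlusOneES()
algo.initialise(dim=4, seed=42)
for _ in range(300):
    x = algo.ask()
    algo.tell(-float(np.sum(x ** 2)))
```

The ask/tell split is what lets the benchmark harness own the evaluation loop — the algorithm never calls the fitness function itself, so the suite can count evaluations exactly.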
Fixed random seeds, deterministic simulation, versioned task definitions, and benchmark leaderboard CSV for transparent, reproducible comparison publication.
Animate the best-evolved organoid development trajectory for each benchmark task — providing qualitative insight into what the optimised protocol produces.
| Library / Tool | Role | Why This Choice |
|---|---|---|
| NumPy / SciPy | Simulation core | Numerical implementation of the CA, RD, and ABM simulations |
| DEAP | Genetic algorithms | Flexible EA framework: SGA, NSGA-II, GP |
| CMA-ES (cmaes package) | Evolution strategy | CMA-ES reference implementation |
| neat-python | NEAT neuroevolution | Topology-evolving neural networks |
| GPyOpt / BoTorch | Bayesian optimisation | Gaussian process surrogate optimisation baseline |
| Plotly / Matplotlib | Results visualisation | Fitness curves, Pareto fronts, solution animations |
| pandas | Benchmark results | Algorithm comparison tables and leaderboard |
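For the pandas role above, a sketch of how per-trial results might be aggregated into a leaderboard (the column names are hypothetical, not the suite's actual CSV schema):

```python
import pandas as pd

# Hypothetical benchmark output: one row per (algorithm, trial)
results = pd.DataFrame({
    "algorithm": ["sga", "sga", "cmaes", "cmaes", "neat", "neat"],
    "task": ["T1"] * 6,
    "final_fitness": [0.71, 0.69, 0.88, 0.86, 0.78, 0.80],
})

# Aggregate trials into a mean/std leaderboard, best first
leaderboard = (results.groupby("algorithm")["final_fitness"]
               .agg(["mean", "std"])
               .sort_values("mean", ascending=False))
print(leaderboard)
```

Reporting mean and standard deviation across repeated trials (rather than a single run) is what the `N_TRIALS` setting in the Configuration section exists for.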
- Python 3.9+
- A virtual environment manager (`venv`, `conda`, or equivalent)
- Environment variables as listed in the Configuration section (all have defaults)
```bash
git clone https://github.com/Devanik21/GENEVO-GENetic-EVolutionary-Organoid-Benchmarks.git
cd GENEVO-GENetic-EVolutionary-Organoid-Benchmarks
python -m venv venv && source venv/bin/activate
pip install numpy scipy deap cma neat-python pandas plotly matplotlib streamlit
streamlit run dashboard.py
```

```bash
# Run all baselines on task T1 (medium)
python run_benchmarks.py --task T1 --difficulty medium --n_evals 10000

# Compare specific algorithms
python compare.py \
    --algorithms sga,cmaes,neat \
    --task T2 --difficulty hard \
    --trials 10 --output comparison.csv

# Register and benchmark a custom algorithm
python benchmark_custom.py \
    --algorithm my_algorithm.py \
    --task T3 --n_evals 5000

# Launch results dashboard
streamlit run dashboard.py
```

| Variable | Default | Description |
|---|---|---|
| `DEFAULT_TASK` | `T1` | Benchmark task: T1, T2, T3, T4 |
| `DIFFICULTY` | `medium` | Task difficulty: easy, medium, hard |
| `N_EVAL_BUDGET` | `10000` | Total function evaluation budget |
| `N_TRIALS` | `10` | Repeated trials for statistical significance |
| `SEED_START` | `0` | Starting random seed (incremented per trial) |
Copy `.env.example` to `.env` and populate required values before running.
```
GENEVO-GENetic-EVolutionary-Organoid-Benchmarks/
├── README.md
├── requirements.txt
├── EvolutIoN_criterion.py
└── ...
```
- T5: 3D organoid morphology benchmark with voxel-based simulation for volumetric optimisation
- Surrogate-assisted benchmark mode: meta-model the fitness function for accelerated evaluation
- Transfer learning benchmark: measure algorithm performance when pre-adapted to a related task
- Wet-lab validation protocol: map computational benchmark performance to real organoid experiment outcomes
- Community leaderboard: submit algorithm results to a hosted web leaderboard for cross-group comparison
Contributions, issues, and suggestions are welcome.
- Fork the repository
- Create a feature branch: `git checkout -b feature/your-idea`
- Commit your changes: `git commit -m 'feat: add your idea'`
- Push to your branch: `git push origin feature/your-idea`
- Open a Pull Request with a clear description
Please follow conventional commit messages and add documentation for new features.
Benchmark task difficulty levels are calibrated to require between 1,000 (easy) and 50,000 (hard) fitness evaluations for a well-tuned CMA-ES to reach the reference quality threshold. Computational cost scales linearly with the number of cells in the simulation — T3 ABM tasks are the most computationally intensive.
Devanik Debnath
B.Tech, Electronics & Communication Engineering
National Institute of Technology Agartala
This project is open source and available under the MIT License.
Built with curiosity, depth, and care — because good projects deserve good documentation.