Single-Reactor Pharmaceutical Batch Scheduling Under Uncertainty

Research companion repository for the paper:

Rababah, A. (2025). Heuristic Scheduling Strategies for Single-Reactor Pharmaceutical Batch Production Under Uncertainty: A Comparative Statistical and Machine Learning Analysis. ChemRxiv. https://doi.org/10.26434/chemrxiv-2025-wq0tr

📋 Abstract

Pharmaceutical batch production faces significant scheduling challenges due to operational uncertainties including equipment failures, yield variability, and demand fluctuations. This study evaluates three scheduling heuristics (FIFO, SPT, LPT) for single-reactor configurations across varying uncertainty levels using a discrete-event simulation model of a 10,000L bioreactor producing three antibiotic products.

Key Findings:

Campaign-based strategies reduce mean makespan relative to round-robin FIFO by 16.5% (SPT) and 15.6% (LPT)
SPT and LPT are statistically equivalent (Cohen's d = 0.12), providing operational flexibility
No heuristic × uncertainty interaction (F = 0.103, p = .981): benefits remain consistent across all uncertainty levels
Machine learning models achieve 90-97% accuracy in predicting schedule robustness
Polynomial SVM achieves the highest test accuracy (96.7%, AUC = 0.972); Gradient Boosting achieves the highest AUC (0.997)

🔬 Study Design

Experimental Configuration

Parameter	Value
Reactor Configuration	Single 10,000L bioreactor
Products	3 antibiotics (A: 48h, B: 72h, C: 120h fermentation)
Scheduling Heuristics	FIFO, SPT, LPT
Uncertainty Levels	Low, Medium, High
Total Observations	450 (3 × 3 × 50 scenarios)
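
The 3 × 3 × 50 layout can be enumerated as a quick sanity check on the observation count. This is a minimal sketch: the function name `build_design` and the 1-based replicate numbering are assumptions for illustration, not taken from the repository scripts.

```python
from itertools import product

HEURISTICS = ("FIFO", "SPT", "LPT")
UNCERTAINTY_LEVELS = ("Low", "Medium", "High")
REPLICATES = 50  # scenarios per heuristic x uncertainty cell


def build_design():
    """Return one (heuristic, uncertainty_level, replicate) tuple per observation."""
    return [
        (heuristic, level, replicate)
        for heuristic, level in product(HEURISTICS, UNCERTAINTY_LEVELS)
        for replicate in range(1, REPLICATES + 1)
    ]


design = build_design()
print(len(design))  # 450
```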

Uncertainty Parameters

Parameter	Low	Medium	High
Equipment Failure Probability	2%	5%	8%
Yield Variability (CV)	±5%	±10%	±15%
Processing Time Deviation	±5%	±10%	±15%
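
One way to read the table is as a parameter set from which each simulated batch is drawn. The sketch below uses the table's values, but the distributions (uniform time deviation, normal yield factor, one failure draw per batch) and the name `sample_batch` are assumptions; the paper's simulation model may sample differently.

```python
import random

# Values copied from the Uncertainty Parameters table.
UNCERTAINTY = {
    "Low":    {"failure_prob": 0.02, "yield_cv": 0.05, "time_dev": 0.05},
    "Medium": {"failure_prob": 0.05, "yield_cv": 0.10, "time_dev": 0.10},
    "High":   {"failure_prob": 0.08, "yield_cv": 0.15, "time_dev": 0.15},
}


def sample_batch(nominal_hours, level, rng):
    """Draw one perturbed batch (ASSUMED distributions, for illustration only)."""
    params = UNCERTAINTY[level]
    # Assumption: processing time deviates uniformly within +/- time_dev.
    hours = nominal_hours * (1 + rng.uniform(-params["time_dev"], params["time_dev"]))
    # Assumption: yield factor is normal around 1.0 with std equal to the CV.
    yield_factor = rng.gauss(1.0, params["yield_cv"])
    # Assumption: a single Bernoulli failure draw per batch.
    failed = rng.random() < params["failure_prob"]
    return hours, yield_factor, failed


rng = random.Random(42)  # seeded so repeated runs give the same draws
hours, yield_factor, failed = sample_batch(72.0, "Medium", rng)
print(f"{hours:.1f} h, yield x{yield_factor:.2f}, failed={failed}")
```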

Scheduling Heuristics Evaluated

FIFO (First-In-First-Out): Round-robin cycling through products A→B→C
SPT (Shortest Processing Time): Campaign-based, shortest products first (A→B→C)
LPT (Longest Processing Time): Campaign-based, longest products first (C→B→A)
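
The three orderings can be sketched as sequence builders, together with a changeover count that shows why campaigns need fewer product switches. The function names and the `batches_per_product` input are hypothetical; the fermentation times come from the Experimental Configuration table.

```python
# Nominal fermentation times (hours) from the Experimental Configuration table.
FERMENTATION_HOURS = {"A": 48, "B": 72, "C": 120}


def build_sequence(heuristic, batches_per_product):
    """Return the batch order for a heuristic.

    batches_per_product is a hypothetical input, e.g. {"A": 2, "B": 2, "C": 2}.
    """
    if heuristic == "FIFO":
        # Round-robin: cycle A -> B -> C until every batch is placed.
        remaining = {p: batches_per_product.get(p, 0) for p in ("A", "B", "C")}
        sequence = []
        while any(remaining.values()):
            for product in ("A", "B", "C"):
                if remaining[product] > 0:
                    sequence.append(product)
                    remaining[product] -= 1
        return sequence
    if heuristic in ("SPT", "LPT"):
        # Campaign: run all batches of a product back to back, ordered by
        # fermentation time (ascending for SPT, descending for LPT).
        order = sorted(
            batches_per_product,
            key=lambda p: FERMENTATION_HOURS[p],
            reverse=(heuristic == "LPT"),
        )
        return [p for p in order for _ in range(batches_per_product[p])]
    raise ValueError(f"unknown heuristic: {heuristic!r}")


def count_changeovers(sequence):
    """Count positions where consecutive batches are different products."""
    return sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)


demand = {"A": 2, "B": 2, "C": 2}
for heuristic in ("FIFO", "SPT", "LPT"):
    seq = build_sequence(heuristic, demand)
    print(heuristic, "".join(seq), count_changeovers(seq))
# FIFO ABCABC 5
# SPT AABBCC 2
# LPT CCBBAA 2
```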

📊 Key Results

Statistical Analysis (Two-Way ANOVA)

Source	F	p	η²	Interpretation
Uncertainty Level	346.69	<.001	0.437	Large effect
Heuristic	225.71	<.001	0.285	Large effect
Interaction	0.103	.981	<0.001	No interaction

Model explains ~72% of variance in makespan (η² combined = 0.722)
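
For a balanced design, η² for each effect is its sum of squares divided by the total sum of squares, which is why the two main effects add to 0.437 + 0.285 = 0.722. The sketch below computes these quantities by hand on made-up data; it is not the JASP procedure used in the paper, and the function name `eta_squared` is an assumption.

```python
from collections import defaultdict
from statistics import fmean


def eta_squared(rows):
    """Eta-squared for A, B, and A x B from (factor_a, factor_b, y) rows.

    Assumes a balanced design (equal observations in every cell).
    """
    rows = list(rows)
    grand = fmean([y for _, _, y in rows])
    by_a, by_b, by_cell = defaultdict(list), defaultdict(list), defaultdict(list)
    for a, b, y in rows:
        by_a[a].append(y)
        by_b[b].append(y)
        by_cell[(a, b)].append(y)

    def between_ss(groups):
        # Sum of squares of group means around the grand mean, weighted by size.
        return sum(len(v) * (fmean(v) - grand) ** 2 for v in groups.values())

    ss_total = sum((y - grand) ** 2 for _, _, y in rows)
    ss_a, ss_b = between_ss(by_a), between_ss(by_b)
    ss_ab = between_ss(by_cell) - ss_a - ss_b  # valid only when balanced
    return {"A": ss_a / ss_total, "B": ss_b / ss_total, "AxB": ss_ab / ss_total}


# Made-up, purely additive data: the interaction share should be zero.
toy = [(a, b, a + b) for a in (0, 10) for b in (0, 1, 2)]
print(eta_squared(toy))
```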

Mean Makespan by Condition (Hours)

Uncertainty	SPT	LPT	FIFO
Low	1,762 (CV=2.6%)	1,778 (CV=2.5%)	2,157 (CV=3.0%)
Medium	1,914 (CV=6.1%)	1,942 (CV=6.1%)	2,322 (CV=6.5%)
High	2,300 (CV=11.0%)	2,323 (CV=11.7%)	2,680 (CV=11.6%)
Marginal Mean	1,992	2,014	2,387
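
The headline savings follow directly from the marginal means in the last row. A short check, with `saving_vs_fifo` as an illustrative helper name:

```python
# Marginal mean makespans (hours) from the table above.
MARGINAL_MEAN = {"SPT": 1992, "LPT": 2014, "FIFO": 2387}


def saving_vs_fifo(heuristic):
    """Return (hours saved, percent reduction) relative to FIFO."""
    saved = MARGINAL_MEAN["FIFO"] - MARGINAL_MEAN[heuristic]
    return saved, 100 * saved / MARGINAL_MEAN["FIFO"]


for heuristic in ("SPT", "LPT"):
    hours, pct = saving_vs_fifo(heuristic)
    print(f"{heuristic}: {hours} h saved ({pct:.1f}%)")
# SPT: 395 h saved (16.5%)
# LPT: 373 h saved (15.6%)
```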

Machine Learning Classification Performance

Model	Test Accuracy	AUC	F1	MCC
Polynomial SVM	96.7%	0.972	0.967	0.934
Random Forest	94.4%	0.985	0.944	0.890
RBF SVM	94.4%	0.946	0.945	0.891
Decision Tree	94.4%	0.944	0.945	0.884
Gradient Boosting	93.3%	0.997	0.933	0.869
Linear SVM	92.2%	0.920	0.922	0.845
Logistic Regression	90.0%	0.900	0.900	0.799
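
For readers less familiar with the columns, accuracy, F1, and MCC can all be derived from a binary confusion matrix. The sketch below uses hypothetical counts, not the paper's results, and computes F1 for the positive class only; JASP may report a class-averaged F1, so values need not match the table exactly.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1 (positive class), and MCC from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any margin is zero; return 0.0 by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f1, mcc


# Hypothetical counts for illustration only (not taken from the paper).
acc, f1, mcc = binary_metrics(tp=43, fp=1, fn=2, tn=44)
print(f"accuracy={acc:.3f} f1={f1:.3f} mcc={mcc:.3f}")
```

MCC uses all four cells of the matrix, so it stays informative when the two classes are imbalanced, which is one reason to report it alongside accuracy.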

Feature Importance (Averaged Across Models)

Feature	Relative Importance
Heuristic Type	33%
Uncertainty Level	30%
Total Demand	18%
Average Yield	8%
Downtime Hours	6%
Equipment Failure	5%

📁 Repository Structure

single-reactor-scheduling/
├── README.md                    # This file
├── LICENSE                      # MIT License
├── DATA_DICTIONARY.md           # Variable definitions
├── CITATION.cff                 # Citation metadata
│
├── data/
│   ├── optimized_dataset.csv            # Base simulation output (450 obs)
│   └── optimized_dataset_ML.csv         # ML-ready with target variables
│
├── scripts/
│   ├── create_ml_dataset.py             # Adds ML target variables
│   └── visualization_script.py          # Generates all figures and summary tables
│
├── jasp/
│   ├── statistical_analysis.jasp        # ANOVA, regression, post-hoc
│   └── ml_classification.jasp           # All ML models
│
├── figures/
│   ├── Figure1_Main_Results.png         # Makespan by heuristic & interaction
│   ├── Figure2_Mechanism_Analysis.png   # Changeover & learning effects
│   ├── Figure3_Uncertainty_Effects.png  # Distribution & SPT advantage
│   ├── FigureS1_Data_Quality.png        # Data quality dashboard
│   ├── FigureS2_Statistical_Diagnostics.png  # ANOVA assumptions
│   └── FigureS3_Detailed_Comparison.png # Violin plots by condition
│
└── paper/
    └── Rababah_2025_Single_Reactor_Scheduling.pdf

🛠️ Analysis Workflow

Software Requirements

Software	Version	Purpose
Python	3.8+	Data preparation & visualization
JASP	0.18+	All statistical & ML analysis
pandas	1.5+	Data manipulation
numpy	1.21+	Numerical operations
matplotlib	3.5+	Figure generation
seaborn	0.12+	Statistical visualization
scipy	1.9+	Statistical functions

Division of Labor

Task	Tool	Script/File
Dataset preparation	Python	`create_ml_dataset.py`
Figure generation	Python	`visualization_script.py`
Descriptive statistics	JASP	`statistical_analysis.jasp`
Two-Way ANOVA	JASP	`statistical_analysis.jasp`
Kruskal-Wallis tests	JASP	`statistical_analysis.jasp`
Post-hoc comparisons	JASP	`statistical_analysis.jasp`
Multiple regression	JASP	`statistical_analysis.jasp`
ML Classification	JASP	`ml_classification.jasp`

Reproduction Steps

Step 1: Prepare ML Dataset

cd scripts/
python create_ml_dataset.py

This creates optimized_dataset_ML.csv with three target variables:

schedule_robust: Binary (0=Vulnerable, 1=Robust)
performance_class: 3-class (Excellent/Acceptable/Poor)
performance_numeric: Ordinal encoding (0/1/2)
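
The relationship between `performance_class` and `performance_numeric` can be sketched as a lookup. The README gives the three labels and the codes 0/1/2 but not which label maps to which code, so the mapping below is an assumption; check DATA_DICTIONARY.md or `create_ml_dataset.py` for the actual encoding.

```python
# ASSUMED mapping: the label-to-code assignment is not stated in the README.
PERFORMANCE_CODES = {"Poor": 0, "Acceptable": 1, "Excellent": 2}


def encode_performance(labels):
    """Map performance_class labels to ordinal codes, rejecting unknown labels."""
    try:
        return [PERFORMANCE_CODES[label] for label in labels]
    except KeyError as exc:
        raise ValueError(f"unexpected performance_class: {exc.args[0]!r}") from None


print(encode_performance(["Excellent", "Poor", "Acceptable"]))  # [2, 0, 1]
```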

Step 2: Generate Figures

cd scripts/
python visualization_script.py

This generates the publication-quality figures listed below (each as PNG and PDF), along with three summary tables:

Main Paper Figures:

Figure	Description	Output Files
Figure 1	Main results: boxplots & interaction plot	`Figure1_Main_Results.png/pdf`
Figure 2	Mechanism: changeover time, learning savings, changeover count	`Figure2_Mechanism_Analysis.png/pdf`
Figure 3	Uncertainty effects: distributions & SPT advantage	`Figure3_Uncertainty_Effects.png/pdf`

Supplementary Figures:

Figure	Description	Output Files
Figure S1	Data quality dashboard (6 panels)	`FigureS1_Data_Quality.png/pdf`
Figure S2	ANOVA diagnostics: Q-Q, residuals, homogeneity	`FigureS2_Statistical_Diagnostics.png/pdf`
Figure S3	Violin plots by heuristic × uncertainty	`FigureS3_Detailed_Comparison.png/pdf`

Summary Tables:

Table1_PanelA_Heuristic_Stats.csv
Table1_PanelB_Uncertainty_Stats.csv
Table1_PanelC_CrossTab.csv

Step 3: Run Statistical Analysis in JASP

Open jasp/statistical_analysis.jasp
Analyses included:
- Descriptive Statistics
- Two-Way Factorial ANOVA
- Kruskal-Wallis Tests (non-parametric confirmation)
- Post-Hoc Comparisons (Tukey HSD, Dunn's test)
- Multiple Linear Regression
- Assumption Diagnostics

Step 4: Run ML Classification in JASP

Open jasp/ml_classification.jasp
Set variable types:
- schedule_robust → Nominal (target)
- performance_class → Nominal (multi-class target)
ML models configured:
- Random Forest (100 trees)
- Gradient Boosting (100 iterations)
- SVM (Linear, RBF, Polynomial kernels)
- Decision Tree (max depth = 30)
- Logistic Regression

📖 Data Dictionary

See DATA_DICTIONARY.md for complete variable definitions.

Quick Reference

Variable	Type	Description
`scenario_id`	Integer	Unique identifier (1-450)
`heuristic`	Categorical	FIFO, SPT, or LPT
`uncertainty_level`	Ordinal	Low, Medium, or High
`makespan`	Continuous	Total production time (hours)
`utilization`	Continuous	Equipment utilization (0-1)
`total_changeover_time`	Continuous	Sum of changeover durations
`total_learning_savings`	Continuous	Time saved via learning curves
`schedule_robust`	Binary	ML target (0/1)
`performance_class`	Categorical	ML target (3-class)
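
A small validator can catch rows that fall outside the documented categories and ranges before analysis. This is a sketch using only the standard library and two made-up rows in place of `data/optimized_dataset.csv`; the function name `validate_rows` and the choice of required columns are assumptions.

```python
import csv
import io

REQUIRED = ("scenario_id", "heuristic", "uncertainty_level", "makespan", "utilization")
HEURISTICS = {"FIFO", "SPT", "LPT"}
LEVELS = {"Low", "Medium", "High"}


def validate_rows(reader):
    """Yield typed rows; raise ValueError on values outside the documented ranges."""
    for row in reader:
        missing = [col for col in REQUIRED if col not in row]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        if row["heuristic"] not in HEURISTICS:
            raise ValueError(f"bad heuristic: {row['heuristic']!r}")
        if row["uncertainty_level"] not in LEVELS:
            raise ValueError(f"bad uncertainty_level: {row['uncertainty_level']!r}")
        utilization = float(row["utilization"])
        if not 0.0 <= utilization <= 1.0:
            raise ValueError(f"utilization out of range: {utilization}")
        yield {
            **row,
            "scenario_id": int(row["scenario_id"]),
            "makespan": float(row["makespan"]),
            "utilization": utilization,
        }


# Two made-up rows standing in for the real dataset.
sample = io.StringIO(
    "scenario_id,heuristic,uncertainty_level,makespan,utilization\n"
    "1,SPT,Low,1760.0,0.91\n"
    "2,FIFO,High,2690.5,0.78\n"
)
rows = list(validate_rows(csv.DictReader(sample)))
print(len(rows))  # 2
```

To check the real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample.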

📈 Practical Implications

For Production Planning

✅ Use campaign-based scheduling (SPT or LPT) as the default strategy
❌ Avoid FIFO/round-robin except when explicitly required
📊 Expected savings versus FIFO: ~395 hours (~16.5%) per production cycle with SPT, or ~373 hours (~15.6%) with LPT, based on marginal mean makespan

For Risk Management

🎯 Deploy ML-based prediction tools to flag high-risk schedules
📈 Increase buffer time allocation under high uncertainty
📉 Monitor CV of schedule outcomes as reliability metric
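
The coefficient of variation used as the reliability metric is the sample standard deviation divided by the mean. A minimal sketch with hypothetical makespans:

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two observations")
    return 100 * stdev(values) / mean(values)


# Hypothetical makespans (hours) from repeated runs of one schedule.
makespans = [1750, 1790, 1720, 1805, 1760]
print(f"CV = {coefficient_of_variation(makespans):.1f}%")
```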

Decision Rules from ML Models

IF heuristic = FIFO:
    → VULNERABLE schedule (94-101 of 150 cases)
    
IF heuristic = SPT or LPT:
    IF uncertainty = Low or Medium:
        → ROBUST schedule (116-118 of 200 cases)
    IF uncertainty = High:
        → Check demand level for final classification
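
The rules above translate into a small function. This is a sketch: the name `classify_schedule` and the `CHECK_DEMAND` label are assumptions, and the high-uncertainty branch returns a placeholder because no demand threshold is given. The case counts in the rules show these are majority outcomes, not guarantees.

```python
def classify_schedule(heuristic, uncertainty):
    """Apply the decision rules above and return a label string."""
    if heuristic == "FIFO":
        return "VULNERABLE"
    if heuristic in ("SPT", "LPT"):
        if uncertainty in ("Low", "Medium"):
            return "ROBUST"
        if uncertainty == "High":
            # The rules defer to demand level here without giving a threshold.
            return "CHECK_DEMAND"
    raise ValueError(f"unrecognized input: {heuristic!r}, {uncertainty!r}")


print(classify_schedule("SPT", "Medium"))  # ROBUST
print(classify_schedule("FIFO", "Low"))    # VULNERABLE
print(classify_schedule("LPT", "High"))    # CHECK_DEMAND
```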

📚 Citation

If you use this dataset or methodology, please cite:

@article{rababah2025single,
  title={Heuristic Scheduling Strategies for Single-Reactor Pharmaceutical 
         Batch Production Under Uncertainty: A Comparative Statistical 
         and Machine Learning Analysis},
  author={Rababah, Anfal},
  journal={ChemRxiv},
  year={2025},
  doi={10.26434/chemrxiv-2025-wq0tr}
}

🔗 Related Work

This study complements multi-reactor scheduling research. For facilities with parallel processing capacity, different scheduling considerations apply.

📬 Contact

Anfal Rababah

Email: Anfal0Rababah@gmail.com
ORCID: 0009-0003-7450-8907

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

JASP Team for open-source statistical software
Python scientific computing community (NumPy, pandas, Matplotlib, Seaborn, SciPy)
Anthropic's Claude AI for technical writing assistance
