This project studies how neural scaling laws are fit, comparing methods across objectives, optimizers, reparameterizations, and experimental designs.
- Bias analysis: Systematic errors in Chinchilla Approach 2's parabolic approximation are isolated using noise-free synthetic loss surfaces, tracing them to IsoFLOP sampling grid width, uncentered sampling, and loss surface asymmetry.
- Method comparison: Multiple fitting methods are evaluated — including Approach 2, several Approach 3 variants (direct 5D optimization with different objectives, gradient strategies, and initializations), and Variable Projection (VPNLS) — under both noiseless and noisy conditions.
- Empirical validation: The fitting implementations are validated by reproducing Apple's ml-scalefit results. Biases are quantified against published Llama 3 IsoFLOP data and shown to produce even larger misallocations on simulated multimodal scaling surfaces.
- Novel reparameterization: VPNLS exploits the partially linear structure of the loss surface to reduce fitting to a well-conditioned 2D search, eliminating the biases of Approach 2 while avoiding the numerical difficulties of full 5D optimization.
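The Approach 2 bias mechanism described above can be illustrated with a short sketch: fit a parabola to loss versus log N along a noise-free IsoFLOP slice and read off the vertex. The surface parameters, the 6ND FLOP approximation, and the grid settings below are illustrative assumptions, not the repo's configuration.

```python
import numpy as np

def isoflop_loss(C, N, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Chinchilla-style loss L = E + A/N^alpha + B/D^beta along an IsoFLOP
    slice, with D tied to N through the budget C ~ 6*N*D.
    Parameter values are illustrative, not fitted."""
    D = C / (6.0 * N)
    return E + A / N**alpha + B / D**beta

def approach2_vertex(C, center, width_decades, points=9):
    """Approach 2 in miniature: sample a log-spaced grid of N around
    `center`, fit a parabola to loss vs log10(N), and return the vertex
    as the estimated compute-optimal N for budget C."""
    x = np.linspace(np.log10(center) - width_decades / 2,
                    np.log10(center) + width_decades / 2, points)
    L = isoflop_loss(C, 10.0**x)
    a, b, _ = np.polyfit(x, L, 2)     # L ~ a*x^2 + b*x + c
    return 10.0 ** (-b / (2.0 * a))   # vertex = argmin of the parabola

C = 1e19
# Exact optimum for this surface, from d/dN [E + A*N**-a + B*(6*N/C)**b] = 0:
alpha, beta, A, B = 0.34, 0.28, 406.4, 410.7
N_true = (alpha * A * C**beta / (beta * B * 6.0**beta)) ** (1.0 / (alpha + beta))
for width in (0.5, 1.0, 2.0):
    print(width, approach2_vertex(C, N_true, width) / N_true)
```

Even with noise-free losses and a grid centered on the true optimum, the vertex-to-optimum ratio drifts from 1 as the grid widens, because the slice is asymmetric in log N whenever α ≠ β; shifting the grid center (the drift of Exp 3) adds a further systematic offset.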
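The variable-projection idea behind VPNLS can be sketched directly: for fixed exponents (α, β), the surface L = E + A·N^(−α) + B·D^(−β) is linear in (E, A, B), so the inner problem is an exact non-negative least squares solve and the outer optimizer searches only the 2D (α, β) space. The function name, starting point, optimizer choice, and synthetic surface values below are assumptions for illustration, not the repo's implementation.

```python
import numpy as np
from scipy.optimize import minimize, nnls

def vpnls_fit(N, D, L, x0=(0.5, 0.5)):
    """Variable projection + NNLS (sketch). For fixed (alpha, beta) the model
    L = E + A*N**-alpha + B*D**-beta is linear in (E, A, B): solve that inner
    problem exactly with NNLS, and search only over (alpha, beta)."""
    def design(alpha, beta):
        return np.column_stack([np.ones_like(N), N**-alpha, D**-beta])

    def projected_residual(theta):
        _, rnorm = nnls(design(*theta), L)  # residual norm of the inner fit
        return rnorm

    res = minimize(projected_residual, x0=np.asarray(x0), method="Nelder-Mead",
                   options={"xatol": 1e-10, "fatol": 1e-12, "maxiter": 5000})
    alpha, beta = res.x
    (E, A, B), _ = nnls(design(alpha, beta), L)  # recover the linear parameters
    return E, A, B, alpha, beta

# Illustrative noise-free synthetic surface (parameter values are assumptions):
N_grid, D_grid = np.logspace(7, 9, 8), np.logspace(9, 11, 8)
NN, DD = (g.ravel() for g in np.meshgrid(N_grid, D_grid))
L = 1.69 + 406.4 / NN**0.34 + 410.7 / DD**0.28
E, A, B, alpha, beta = vpnls_fit(NN, DD, L)
print(E, A, B, alpha, beta)
```

Because the inner solve is exact, the outer objective depends on only two variables, which is the source of the conditioning gap quantified in Exp 8.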
Results from this analysis can be found in a few places:

- specs/project.md — the full directory layout and implementation map
- specs/build.md — build and reproduction details

To set up the environment:

```sh
uv sync
```

Experiments from this analysis each isolate a specific bias source or fitting-method comparison, progressing from baseline validation through individual error modes to combined effects and practical cost implications.
| Experiment | Focus |
|---|---|
| Exp 0: Reproductions | Reproduce Apple ml-scalefit results to validate the fitting implementation |
| Exp 1: Empirical Error | How sampling range affects exponent and intercept recovery on a symmetric surface |
| Exp 2: Exponent Imbalance | How α/β asymmetry amplifies fitting errors across surface configurations |
| Exp 3: Drift Sensitivity | How systematic sampling center biases (constant offset and linear drift) affect exponent and intercept accuracy |
| Exp 4: Extrapolation Error | How intercept errors from asymmetry and off-center sampling translate into token count errors when extrapolating to large compute budgets |
| Exp 5: Parameter Recovery | Whether VPNLS (variable projection + NNLS) recovers all five surface parameters without the parabolic approximation's biases, and how optimizer choice affects precision and stability |
| Exp 6: Analytical Error | Closed-form derivation of Approach 2 intercept error as a function of surface exponents and grid specification, validated against numerical results |
| Exp 7: Exponent Inference | How VPNLS and Approach 3 compare to Approach 2 for recovering scaling exponents under noise, sampling drift, and varying data budgets |
| Exp 8: Conditioning Analysis | Why Approach 3's 5D optimization is ill-conditioned (κ ≈ 3.5×10¹¹) and how variable projection reduces the problem to a well-conditioned 2D search (κ ≈ 11) |
| Exp 9: Data Efficiency | How fitting methods compare in accuracy given limited IsoFLOP data budgets |
| Exp 10: Compounding Errors | How individual bias sources accumulate when present simultaneously |
| Exp 11: Cost Estimates | How fitting biases translate into compute allocation errors at training scale |
| Exp 12: Residual Distributions | Residual patterns across fitting methods as a diagnostic for model misspecification |
See specs/experiments.md for full specifications.