TODO

Primary:
- Add limitations:
  - Add limitation on the possibility of sampling grid errors canceling out
  - Mention hardware discretization in extrapolations, re: https://openathena.slack.com/archives/C0884476QSC/p1773257160211219?thread_ts=1773062864.434009&cid=C0884476QSC
    - This is also relevant for the simulations, which run in continuous parameter space
  - Mention C=6ND assumption as a limitation
  - Discuss Olmo hybrid assumption of constrained scaling exponents for architecture comparisons
  - Bootstrap for within-budget resampling on Approach 2 only
- Add https://arxiv.org/abs/2603.03276 as reference on multimodal asymmetry
- Add Gemini Pretraining notes on MoE data scaling asymmetry as additional need for non-symmetric methods beyond multimodal
- Make a reference implementation
- Mention FLOP factor correction and WLS weighting for Approach 2 as possible improvements
  - Or at least mention how heavily the method relies on the C=6ND assumption
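
For context on the C=6ND items above, here is a minimal sketch of the approximation. It assumes the usual dense-transformer reading (N parameters, D training tokens, roughly 6 FLOPs per parameter per token). The function names are hypothetical and not taken from the experiments code.

```python
# Illustrative only: C = 6 * N * D is an approximation for training FLOPs.
# The helper names below are hypothetical, not part of this repo.

def approx_train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute C under the C = 6ND assumption."""
    return 6.0 * n_params * n_tokens


def tokens_for_budget(flops: float, n_params: float) -> float:
    """Tokens implied by a fixed budget C and model size N: D = C / (6N)."""
    return flops / (6.0 * n_params)


budget = approx_train_flops(1e9, 20e9)   # 1B parameters, 20B tokens
tokens = tokens_for_budget(budget, 2e9)  # same budget spent on 2B parameters
```

Any systematic error in the factor of 6 (for example from model shape) shifts every D implied by a fixed budget, which is why the assumption is worth listing as a limitation.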
Secondary:
- Add WLS analysis
  - Cite https://arxiv.org/pdf/2406.19146 when discussing WLS adjustments based on noise at different budgets
    - See 2.3 Data analysis
- Add exp6 validation for proof to appendix
- Consider https://arxiv.org/abs/2603.06603 as another citation for methods that "extend individual terms in isolation (e.g. token scaling terms alone)"
- Cite "Gemstones: A Model Suite for Multi-Faceted Scaling Laws" on how C=6ND breaks down with model shape
- Cite "Scaling Laws for Native Multimodal Models" on the PlantCAD issue for the empirical C ~ D^b method (see C. Scaling Laws)
- Mention the demo prompt examples for making your own simulator; examples:
  - Claude App (prompt)
  - Gemini App
  - Codex App
- Add a note advising against using logloss, given the bias seen in the simulations and in the ml-scalefit reproduction
- Copy intercept-error proof into paper appendix
- Add citations from "Configuration-to-Performance Scaling Law with Neural Ansatz" on other adaptations of functional forms for Chinchilla scaling laws
- Review figures.py for ways to use existing code utilities, then regenerate the figures (or push back into experiments code)
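
As a starting point for the WLS analysis item above, here is a rough sketch of a weighted log-log power-law fit. It assumes the quantity being fit is a per-budget optimum (e.g. an optimal model size per compute budget) with a known noise level in log space. The data, the noise values, and the name `fit_power_law_wls` are all hypothetical.

```python
import numpy as np


def fit_power_law_wls(compute, n_opt, sigma_log):
    """Fit log(n_opt) = a * log(compute) + b by weighted least squares.

    sigma_log is the assumed standard deviation of log(n_opt) at each budget.
    np.polyfit applies weights to the unsquared residuals, so w = 1 / sigma.
    """
    x = np.log(np.asarray(compute, dtype=float))
    y = np.log(np.asarray(n_opt, dtype=float))
    w = 1.0 / np.asarray(sigma_log, dtype=float)
    a, b = np.polyfit(x, y, deg=1, w=w)
    return a, b


# Synthetic, noise-free data following n_opt = 0.1 * C^0.5 (made up for the demo).
compute = np.array([1e18, 1e19, 1e20, 1e21])
n_opt = 0.1 * compute ** 0.5
sigma_log = np.array([0.2, 0.1, 0.05, 0.05])  # hypothetical: noisier at small budgets

a, b = fit_power_law_wls(compute, n_opt, sigma_log)
```

With noise-free inputs the weights do not change the recovered exponent. They only matter once the per-budget estimates carry different amounts of noise, which is the case the cited data-analysis section addresses.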
