goldilocks-core

goldilocks-core is a research-grade Python package for organizing and recommending DFT calculation inputs from structures, machine-learning models, and parsed pseudopotentials.

The project is designed around domain-focused modules such as k-mesh construction, pseudopotential parsing, recommendation advisors, and thin CLI entry points.

What It Does

goldilocks-core currently focuses on two main workflows:

- recommending k-mesh settings from structure-aware logic and an ML-predicted k_index
- parsing UPF pseudopotential files and building local pseudopotential registries

The package is intended to grow toward code- and task-aware input recommendation, where structure, pseudopotential choice, and calculation settings can be coordinated in a clean and testable way.

Current Capabilities

K-mesh stack

- generate candidate k-distance values from reciprocal lattice geometry
- convert k-distance values into Monkhorst-Pack-style meshes
- build indexed KMeshEntry objects
- compute mesh-related metadata such as k-point density intervals and reduced-k-point counts
- map ML-predicted k_index values onto concrete k-mesh recommendations
- expose a minimal CLI entry point for k-mesh recommendation
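
To make the first steps of this stack concrete, here is a minimal, standard-library-only sketch of one common way to turn a k-distance into a Monkhorst-Pack-style mesh and to map a predicted index onto a mesh. It is an illustration, not the package's implementation: the names `mesh_from_kdist`, `MeshEntry`, `build_entries`, and `pick_entry` are hypothetical, the rule `n_i = max(1, ceil(|b_i| / kdist))` is an assumption about the convention, and the real KMeshEntry may carry different fields.

```python
import math
from dataclasses import dataclass


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def reciprocal_lengths(lattice):
    """Return |b1|, |b2|, |b3| including the 2*pi factor (units: 1/length)."""
    a1, a2, a3 = lattice
    volume = dot(a1, cross(a2, a3))
    recip = (cross(a2, a3), cross(a3, a1), cross(a1, a2))
    return tuple(2 * math.pi * math.sqrt(dot(b, b)) / abs(volume) for b in recip)


def mesh_from_kdist(lattice, kdist):
    """Assumed convention: enough divisions that the spacing is at most kdist."""
    return tuple(max(1, math.ceil(length / kdist)) for length in reciprocal_lengths(lattice))


@dataclass(frozen=True)
class MeshEntry:
    """Hypothetical stand-in for an indexed mesh entry."""
    k_index: int
    mesh: tuple
    kdist: float


def build_entries(lattice, kdists):
    """Keep one entry per distinct mesh, ordered from coarse to fine."""
    entries, seen = [], set()
    for kdist in sorted(kdists, reverse=True):
        mesh = mesh_from_kdist(lattice, kdist)
        if mesh not in seen:
            seen.add(mesh)
            entries.append(MeshEntry(len(entries), mesh, kdist))
    return entries


def pick_entry(entries, predicted_index):
    """Round a model output to an integer index and clamp it to the valid range."""
    clamped = min(max(int(round(predicted_index)), 0), len(entries) - 1)
    return entries[clamped]


# Simple cubic cell with a 4.0 Å lattice parameter.
cubic = ((4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0))
entries = build_entries(cubic, [0.8, 0.4, 0.2])
print(pick_entry(entries, 1).mesh)  # (4, 4, 4)
```

Deduplicating by mesh keeps the index space compact, since several nearby k-distance values often produce the same grid. Clamping the predicted index is one possible policy for out-of-range model outputs; raising an error would be an equally reasonable choice.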

Pseudopotential stack

- parse real UPF files into structured metadata
- support both attribute-style and text-style PP_HEADER
- supplement header parsing with PP_INFO when needed
- normalize key fields such as:
  - element
  - pseudo_type
  - functional
  - relativistic
  - z_valence
- scan a local pseudo library into a list of PseudoMetadata
- filter registry entries by element
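
As an illustration of the attribute-style case only, the sketch below reads the normalized fields listed above from a synthetic UPF snippet using the standard library. The function name `read_header_fields` and the sample file are hypothetical, and this is not the package's parser: the text-style PP_HEADER and the PP_INFO fallback are deliberately left out.

```python
import xml.etree.ElementTree as ET

# Synthetic snippet for illustration; not a usable pseudopotential.
SAMPLE_UPF = """<UPF version="2.0.1">
  <PP_INFO>Synthetic example, not a usable pseudopotential.</PP_INFO>
  <PP_HEADER element=" Si" pseudo_type="NC" relativistic="scalar"
             functional="PBE" z_valence="4.0"/>
</UPF>
"""

FIELDS = ("element", "pseudo_type", "functional", "relativistic", "z_valence")


def read_header_fields(upf_text):
    """Return the selected PP_HEADER attributes as a dict (attribute-style only)."""
    root = ET.fromstring(upf_text)
    header = root.find("PP_HEADER")
    if header is None:
        raise ValueError("no PP_HEADER element found")
    # Strip padding; treat missing or empty attributes as None.
    fields = {name: (header.get(name) or "").strip() or None for name in FIELDS}
    if fields["z_valence"] is not None:
        fields["z_valence"] = float(fields["z_valence"])
    return fields


print(read_header_fields(SAMPLE_UPF))
```

Stripping whitespace before comparison matters in practice because attribute values in UPF headers are sometimes padded.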

Installation

This project uses uv for environment and dependency management.

Clone the repository and sync the environment:

uv sync

If you want development tools as well:

uv sync --group dev

Quick Start

Load a structure and get k-mesh advice

from goldilocks_core.advisors import advise_kpoints
from goldilocks_core.io.structures import load_structure
from goldilocks_core.shared.types import ModelSpec

structure = load_structure("path/to/structure.cif")

spec = ModelSpec(
    name="local-kmesh-model",
    version="v0",
    model_type="random_forest",
    target="k_index",
    feature_set="cslr",
    source="local",
    location="path/to/model.joblib",
    revision=None,
)

advice = advise_kpoints(structure, spec)
print(advice.grid)

Parse one UPF file

from goldilocks_core.pseudo.parse_upf import parse_upf_metadata

metadata = parse_upf_metadata("path/to/pseudo.UPF")
print(metadata)

Build a local pseudo registry

from goldilocks_core.pseudo.registry import load_pseudo_metadata, filter_by_element

metadata_list = load_pseudo_metadata("path/to/pseudopotentials")
si_pseudos = filter_by_element(metadata_list, "Si")

print(len(metadata_list))
print(len(si_pseudos))

Python API

The current Python-facing entry points are:

K-mesh and advice

- goldilocks_core.advisors.advise_kpoints
- goldilocks_core.kmesh
- goldilocks_core.io.structures.load_structure

Pseudopotentials

- goldilocks_core.pseudo.parse_upf.parse_upf_metadata
- goldilocks_core.pseudo.registry.load_pseudo_metadata
- goldilocks_core.pseudo.registry.filter_by_element

Shared models

goldilocks_core.shared.types

This package is intended to be notebook-friendly, but the package modules and tests should remain the source of truth rather than notebook-only logic.

CLI

A minimal k-mesh CLI entry point is available.

Show help:

uv run goldilocks-kmesh --help

Current usage pattern:

uv run goldilocks-kmesh path/to/structure.cif --model path/to/model.joblib

At this stage, the CLI is intentionally small and thin. The main logic lives in the Python package APIs.

Project Structure

src/goldilocks_core/
├── advisors/
├── cli/
├── io/
├── kmesh.py
├── ml/
├── pseudo/
└── shared/

High-level responsibilities

- advisors/: coordinates recommendation workflows and policy decisions.
- cli/: exposes thin command-line entry points.
- io/: handles structure loading and normalization.
- kmesh.py: contains k-mesh construction and interval logic.
- ml/: contains feature extraction, model loading, and inference utilities.
- pseudo/: contains UPF parsing and local pseudopotential registry logic.
- shared/: contains reusable shared data models and type definitions.

For a fuller explanation, see docs/architecture.md.

Development

Run the test suite:

uv run pytest

Run formatting and checks:

uv run pre-commit run --all-files

A typical development loop is:

uv run pytest
uv run pre-commit run --all-files

Testing Philosophy

This project uses two complementary validation styles:

- portable tests built from synthetic fixtures under tmp_path
- local exploratory validation against real pseudopotential libraries and notebook experiments

When a local exploration reveals an important behavior, it should be turned into a focused regression test whenever possible.
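
A portable regression test in this style might look like the sketch below. It writes a tiny synthetic UPF file under pytest's `tmp_path` fixture and checks one behavior (padded element symbols). The reader `read_element` is a hypothetical stand-in defined inline so the example is self-contained; in the real test suite the assertion would presumably target the package's own parser instead.

```python
import re
from pathlib import Path


def read_element(path):
    """Stand-in reader: pull the element attribute out of PP_HEADER."""
    text = Path(path).read_text()
    match = re.search(r'<PP_HEADER[^>]*\belement="\s*([A-Za-z]+)\s*"', text)
    if match is None:
        raise ValueError(f"no element attribute in {path}")
    return match.group(1)


def test_read_element_from_synthetic_upf(tmp_path):
    # Synthetic fixture: only the header fields the test needs, with a
    # padded element symbol to pin down the whitespace handling.
    upf = tmp_path / "Si.synthetic.UPF"
    upf.write_text(
        '<UPF version="2.0.1">\n'
        '<PP_HEADER element=" Si" z_valence="4.0"/>\n'
        "</UPF>\n"
    )
    assert read_element(upf) == "Si"
```

Because the fixture is generated inside the test, it runs on any machine without access to a real pseudopotential library.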

Current Status

This project is under active design and development.

The current codebase already has:

- a working ML-driven k-mesh recommendation path
- real UPF parsing across multiple pseudo-library styles
- a local pseudo registry foundation
- an evolving domain-oriented package structure

The next major steps are expected to include:

- richer pseudo registry filtering
- pseudopotential selection logic
- electron metadata derived from selected pseudos
- clearer user-facing workflows for local pseudo management
