Skip to content

Set repository basepath to '~/.cache/skyllh' by default#307

Open
tomaskontrimas wants to merge 3 commits intomasterfrom
config_repository
Open

Set repository basepath to '~/.cache/skyllh' by default#307
tomaskontrimas wants to merge 3 commits intomasterfrom
config_repository

Conversation

@tomaskontrimas
Copy link
Copy Markdown
Collaborator

Provide a default path that is mostly useful for public data analyses as i3skyllh overwrites it to /data/ana/analyses (avoiding data copy).

After this is merged we should update tutorial notebooks in the documentation branch, removing hardcoded paths.

@tomaskontrimas tomaskontrimas self-assigned this Apr 21, 2026
Copilot AI review requested due to automatic review settings April 21, 2026 18:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR sets a non-None default repository base path intended to make public-data analyses work out-of-the-box, and updates related docstrings to describe the new default.

Changes:

  • Set cfg['repository']['base_path'] default to ~/.cache/skyllh (expanded) in the global base config.
  • Update dataset and dataset-collection docstrings to mention the new default base path behavior.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
skyllh/core/config.py Changes the default repository base path from None to a user-cache location.
skyllh/core/dataset.py Updates Dataset constructor docs to mention the default base path.
skyllh/datasets/i3/TestData.py Updates dataset-collection docs to mention the default base path.
skyllh/datasets/i3/PublicData_10y_ps.py Updates dataset-collection docs to mention the default base path.
skyllh/datasets/i3/PublicData_10y_ps_wMC.py Updates dataset-collection docs to mention the default base path.
skyllh/datasets/i3/PublicData_14y_ps.py Updates dataset-collection docs to mention the default base path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread skyllh/core/config.py Outdated
'repository': {
# A base path of repository datasets.
'base_path': None,
'base_path': Path('~/.cache/skyllh').expanduser(),
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_BASECONFIG['repository']['base_path'] is now a Path object. Most of the codebase (and the surrounding docstrings/type hints) treat config values as plain serializable primitives (here: str | None). Keeping a Path in the config can surprise callers (e.g., string concatenation, YAML/JSON serialization) and diverges from the documented str type. Consider storing this as a string (e.g., expand ~ to a str) and only converting to Path at the call site when needed.

Suggested change
'base_path': Path('~/.cache/skyllh').expanduser(),
'base_path': str(Path('~/.cache/skyllh').expanduser()),

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread skyllh/core/dataset.py Outdated
Comment on lines +805 to +806
If set to ``None``, ``cfg['repository']['base_path']`` is used,
which defaults to ``~/.cache/skyllh``.
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base_path parameter docstring now refers to cfg['repository']['base_path'], but cfg is not a documented parameter of Dataset.__init__ (it’s only passed implicitly via **kwargs to HasConfig). This makes the docstring harder to follow and introduces an undefined name in the parameter docs. Consider rephrasing to reference the dataset’s config (self.cfg['repository']['base_path']) or the Config setting in general, without relying on an undocumented cfg symbol.

Suggested change
If set to ``None``, ``cfg['repository']['base_path']`` is used,
which defaults to ``~/.cache/skyllh``.
If set to ``None``, the dataset configuration's repository base
path setting is used, which defaults to ``~/.cache/skyllh``.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants