feat(pt_expt): make model.json optional in .pt2/.pte loading#5416
feat(pt_expt): make model.json optional in .pt2/.pte loading#5416OutisLi wants to merge 4 commits intodeepmodeling:masterfrom
Conversation
📝 WalkthroughWalkthroughDeepEval gains metadata-only loading: Changes
Sequence DiagramsequenceDiagram
actor User
participant Loader
participant DeepEval
participant Metadata
participant DPModel
User->>Loader: Load .pte/.pt2 archive
Loader->>Metadata: Check `model/extra/metadata.json`
alt metadata.json exists
Loader->>Metadata: Extract metadata
alt model.json exists in `model/extra/`
Loader->>DPModel: Load binary model
Loader->>DeepEval: Initialize with DPModel + metadata
else model.json absent
Loader->>DeepEval: Call _init_from_metadata (hoist fields)
Note over DeepEval: _dpmodel = None
end
else metadata.json missing
Loader->>User: Raise ValueError
end
User->>DeepEval: Call get_dim_fparam()
alt _dpmodel exists
DeepEval->>DPModel: Query model for dim_fparam
DPModel->>DeepEval: Return value
else metadata-only
DeepEval->>Metadata: Read hoisted metadata fields
Metadata->>DeepEval: Return value
end
DeepEval->>User: Return result
User->>DeepEval: Call eval_descriptor()
alt _dpmodel exists
DeepEval->>DPModel: Evaluate descriptor
DPModel->>DeepEval: Return result
DeepEval->>User: Return result
else metadata-only
DeepEval->>User: Raise NotImplementedError
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR makes pt_expt .pte/.pt2 inference loading accept archives that omit extra/model.json, using extra/metadata.json as the minimum contract (matching the C++ DeepPotPTExpt reader).
Changes:
- Add a metadata-only init path in
DeepEvaland make.pte/.pt2loaders treatmodel.jsonas optional while requiringmetadata.json. - Extend exported metadata to fully support dpmodel-free metadata accessors (
sel_type, deterministiccategoryserialization). - Add a new end-to-end test suite that strips
extra/model.jsonand asserts parity vs full archives, plus guard tests for missing metadata.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
source/tests/pt_expt/infer/test_deep_eval_metadata_only.py |
New tests validating metadata-only archive loading parity and dpmodel-only feature guards. |
deepmd/pt_expt/utils/serialization.py |
Metadata export augmented for metadata-only inference (cast category to int; include sel_type). |
deepmd/pt_expt/infer/deep_eval.py |
Loaders now require metadata.json but allow missing model.json; new _init_from_metadata; dpmodel-only hooks guarded. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5416 +/- ##
=======================================
Coverage 80.46% 80.47%
=======================================
Files 823 823
Lines 86625 86680 +55
Branches 4139 4139
=======================================
+ Hits 69701 69753 +52
- Misses 15651 15652 +1
- Partials 1273 1275 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
deepmd/pt_expt/utils/serialization.py (1)
345-350: Optional: make the missing-entry error hint at the layout migration.Pre-PR archives stored metadata under
extra/(nomodel/prefix). After this PR, the reader strictly requiresmodel/extra/model.json, so pre-existing.pt2files on disk will fail here with a message that doesn't suggest what changed. Since.pt2has no versioning field, a small nudge in the error text would save users a debugging round-trip.Suggested tweak
- raise ValueError( - f"Invalid .pt2 file '{model_file}': missing '{model_json_entry}'" - ) + raise ValueError( + f"Invalid .pt2 file '{model_file}': missing '{model_json_entry}'. " + "Archives produced before the PyTorch 2.11 layout migration stored " + "metadata under 'extra/' and must be re-exported." + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/utils/serialization.py` around lines 345 - 350, The ValueError raised when model_json_entry is missing (inside the with zipfile.ZipFile(model_file, "r") as zf block) should include a hint about the pre-PR layout: if model.json used to live under "extra/" instead of "model/extra/", add a suggested cause and recovery (e.g., "this archive may be using the older layout where metadata was stored under 'extra/' — re-run the migration or recreate the .pt2") to the error message so users know why model_file failed to load and what to try next.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@deepmd/pt_expt/utils/serialization.py`:
- Around line 345-350: The ValueError raised when model_json_entry is missing
(inside the with zipfile.ZipFile(model_file, "r") as zf block) should include a
hint about the pre-PR layout: if model.json used to live under "extra/" instead
of "model/extra/", add a suggested cause and recovery (e.g., "this archive may
be using the older layout where metadata was stored under 'extra/' — re-run the
migration or recreate the .pt2") to the error message so users know why
model_file failed to load and what to try next.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 167dc0c0-b82c-4db6-8079-f1afcfcfa385
📒 Files selected for processing (3)
deepmd/pt_expt/infer/deep_eval.pydeepmd/pt_expt/utils/serialization.pysource/tests/pt_expt/infer/test_deep_eval.py
✅ Files skipped from review due to trivial changes (1)
- source/tests/pt_expt/infer/test_deep_eval.py
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/pt_expt/infer/deep_eval.py
wanghan-iapcm
left a comment
There was a problem hiding this comment.
Test coverage gap: .pt2 metadata-only branch is untested
The new _init_from_metadata fallback in _load_pt2 (the model_json_str == "" branch) is not exercised end-to-end. test_deep_eval_metadata_only.py only does the zip-strip surgery on .pte.
Since test_deep_eval.py::setUpClass already builds a shared AOTI-compiled .pt2 (paying the ~82 s compile cost once per class), adding a metadata-only .pt2 case is essentially free: strip model/extra/model.json from the shared archive, reload via
DeepPot, and assert numeric parity against the full archive — the same pattern already used for .pte.
Add a spin metadata-only parity test
The new _init_from_metadata reconstructs ModelOutputDef from the serialized fitting_output_defs, but _init_from_model_json builds the spin output def by hand (energy, shape=[1], magnetic=True, etc.). Whether these two paths produce
equivalent defs for a spin model is not asserted anywhere — the existing test_deep_eval_spin.py only loads full archives.
Likewise, get_use_spin's metadata fallback ([bool(v) for v in self.metadata.get("use_spin", [])]) only fires when a spin archive is loaded metadata-only, so it's currently dead from the test suite's perspective.
Suggestion: reuse the .pte fixture in test_deep_eval_spin.py, strip extra/model.json (same pattern as test_deep_eval_metadata_only.py::_strip_extra_model_json), reload, and assert:
- bitwise numeric parity of
eval(energy / force / virial / magnetic force) against the full archive, get_has_spin(),get_use_spin(),get_ntypes_spin()agree,- the reconstructed
_model_output_defmatches the hand-built one (e.g. comparedef_outp.get_data()keys + per-variablemagnetic/r_differentiable/c_differentiableflags).
The pt_expt DeepEval's inference path runs through aoti_load_package / the exported module; `self._dpmodel` is only used to resolve metadata (rcut / type_map / atomic_output_def / dim_fparam / ...), which is already available in extra/metadata.json (the contract the C++ reader DeepPotPTExpt enforces). Drop the requirement that extra/model.json be present: * _load_pt2 / _load_pte: model.json is optional; metadata.json is now the minimum contract. * _init_from_metadata: reconstructs ModelOutputDef from the serialised fitting_output_defs and hoists sel / mixed_types to plain attributes. * get_dim_fparam / get_dim_aparam / get_sel_type / model_type / get_use_spin: fall back to metadata when _dpmodel is None. * eval_descriptor / eval_typeebd / eval_fitting_last_layer: raise a descriptive NotImplementedError in metadata-only mode (they inspect the dpmodel instance directly). Also fixes two metadata-completeness gaps so metadata-only load is exact: * _collect_metadata: add the `sel_type` field so get_sel_type works without a dpmodel round-trip (relevant for dipole / polar / wfc). * _collect_metadata: force vdef.category to plain int for deterministic JSON serialisation across Python versions. Archives produced by existing pt_expt serialisation still contain model.json and continue to use the dpmodel path unchanged. Regression covered by 77 existing tests in test_deep_eval.py + a dedicated new suite (test_deep_eval_metadata_only.py) that strips extra/model.json and asserts bitwise parity against the full archive.
… 2.11 layout
``aoti_compile_and_package`` writes every entry of the compiled ``.pt2``
archive under a top-level ``model/`` directory; deepmd-kit then appended
its own metadata JSON blobs (``metadata.json``, ``model.json``,
``model_def_script.json``) at the root-level ``extra/`` path via
``zipfile``. Starting with PyTorch 2.11, the strict single-model
loader ``torch.export.pt2_archive._package.load_pt2`` refuses archives
that carry files outside ``model/``:
RuntimeError: [enforce fail at inline_container.cc:340] .
file in archive is not in a subdirectory model/: extra/metadata.json
``torch._inductor.package.package.load_package`` catches this error and
falls back to the legacy C++ loader, but prints the misleading warning
``Loading outdated pt2 file. Please regenerate your package.`` every
time the archive is opened -- even though the archive version itself
(``archive_version == '0'``) is already current.
Move the deepmd-kit metadata blobs into ``model/extra/`` so that the
fast path through ``load_pt2`` accepts the archive cleanly and the
misleading warning disappears. A module-level constant
``PT2_EXTRA_PREFIX`` in ``deepmd.pt_expt.utils.serialization`` is the
single source of truth for the prefix; both the writer
(``_deserialize_to_file_pt2``) and the readers
(``_serialize_from_file_pt2``, ``DeepEval._load_pt2``) derive their
entry names from it.
The C++ reader in ``source/api_cc/src/commonPTExpt.h::read_zip_entry``
needs no changes: it already matches ``entry_name`` as a
``/``-delimited suffix, so ``"extra/metadata.json"`` resolves against
both the old root-level and the new ``model/extra/`` location
transparently. The ``test_pt2_has_metadata`` assertion in
``source/tests/pt_expt/infer/test_deep_eval.py`` is updated to expect
the new paths.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
deepmd/pt_expt/infer/deep_eval.py (1)
145-162: Dual code paths for spin_model_output_def— confirm they stay in sync.
_init_from_model_jsonconstructs a fixedOutputVariableDef("energy", ..., magnetic=True)for spin models, while_init_from_metadata(line 204) reconstructs whatever_collect_metadataserialized frommodel.atomic_output_def(). The new_assert_fitting_output_defs_matchtest pins these as equivalent today, but ifSpinModel.atomic_output_def()ever gains additional fields (e.g. an extra output, or a flag flip), only the metadata path will pick them up, silently diverging from the model.json path.Consider unifying both paths through
_collect_metadata-style reconstruction (or, conversely, always derive fromatomic_output_def()when available) so there is a single source of truth.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/infer/deep_eval.py` around lines 145 - 162, The code builds _model_output_def in two different ways (_init_from_model_json uses a hardcoded OutputVariableDef("energy", magnetic=True) while _init_from_metadata uses _collect_metadata/model.atomic_output_def()), which can diverge; update the model-json path in _init_from_model_json to derive the fitting output definition from the model itself (call SpinModel.atomic_output_def() or deserialize via _collect_metadata) so both paths use the same source of truth for _model_output_def; adjust the branch that sets self._model_output_def (the block creating ModelOutputDef(FittingOutputDef([...])) ) to use the dpmodel.atomic_output_def() or the metadata deserializer and keep the _assert_fitting_output_defs_match test unchanged.source/tests/pt_expt/infer/test_deep_eval.py (1)
554-573: Optional: extract the metadata-only zip surgery into a shared helper.The same archive-stripping pattern appears in
test_deep_eval_spin.py(_strip_extra_model_json, usingendswith) andtest_deep_eval_metadata_only.py. Consolidating into a single helper (e.g. undersource/tests/pt_expt/infer/_metadata_only_utils.py) would keep the matching predicate consistent (endswith("extra/model.json")works for both.pteand.pt2) and avoid drift across the three call sites.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 554 - 573, Extract the metadata-only zip surgery used in test_deep_eval.py (the block creating cls.meta_tmpfile and copying entries except "model/extra/model.json") into a shared helper function (e.g., _strip_extra_model_json in a new source/tests/pt_expt/infer/_metadata_only_utils.py) and replace the duplicated logic in test_deep_eval.py, test_deep_eval_spin.py, and test_deep_eval_metadata_only.py with calls to that helper; ensure the helper takes the source zip path and target tmpfile path, uses a stable predicate like entry.filename.endswith("extra/model.json") to skip the extra model.json, and preserves creation of the temporary file and subsequent DeepPot/deserialize_to_file usage by the callers (refer to cls.meta_tmpfile, deserialize_to_file, and DeepPot in the tests to wire return values).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 145-162: The code builds _model_output_def in two different ways
(_init_from_model_json uses a hardcoded OutputVariableDef("energy",
magnetic=True) while _init_from_metadata uses
_collect_metadata/model.atomic_output_def()), which can diverge; update the
model-json path in _init_from_model_json to derive the fitting output definition
from the model itself (call SpinModel.atomic_output_def() or deserialize via
_collect_metadata) so both paths use the same source of truth for
_model_output_def; adjust the branch that sets self._model_output_def (the block
creating ModelOutputDef(FittingOutputDef([...])) ) to use the
dpmodel.atomic_output_def() or the metadata deserializer and keep the
_assert_fitting_output_defs_match test unchanged.
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 554-573: Extract the metadata-only zip surgery used in
test_deep_eval.py (the block creating cls.meta_tmpfile and copying entries
except "model/extra/model.json") into a shared helper function (e.g.,
_strip_extra_model_json in a new
source/tests/pt_expt/infer/_metadata_only_utils.py) and replace the duplicated
logic in test_deep_eval.py, test_deep_eval_spin.py, and
test_deep_eval_metadata_only.py with calls to that helper; ensure the helper
takes the source zip path and target tmpfile path, uses a stable predicate like
entry.filename.endswith("extra/model.json") to skip the extra model.json, and
preserves creation of the temporary file and subsequent
DeepPot/deserialize_to_file usage by the callers (refer to cls.meta_tmpfile,
deserialize_to_file, and DeepPot in the tests to wire return values).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e27497ba-ad1c-44bc-a345-96c043daa787
📒 Files selected for processing (5)
deepmd/pt_expt/infer/deep_eval.pydeepmd/pt_expt/utils/serialization.pysource/tests/pt_expt/infer/test_deep_eval.pysource/tests/pt_expt/infer/test_deep_eval_metadata_only.pysource/tests/pt_expt/infer/test_deep_eval_spin.py
There was a problem hiding this comment.
🧹 Nitpick comments (1)
deepmd/pt_expt/utils/serialization.py (1)
258-282: Spin output-def + mask filter + category coercion look correct.
- For spin models, sourcing from
model.model_output_def().def_outpcorrectly preserves themagnetic=Trueflag that the spin override sets on the energy variable;atomic_output_def()would have dropped it.- Filtering
vdef.name == "mask"is needed becauseModelOutputDefretains the backbone's full atomic output (includingmask) even though spin strips it frommodel_output_type.int(vdef.category)is the right move for anIntEnumto keep JSON output stable across Python versions.Minor style nit (optional): the spin/non-spin branches could be unified by always using
model.model_output_def().def_outp, since for non-spinModelOutputDefsimply wrapsatomic_output_def(). Feel free to ignore if you prefer the explicit branching.♻️ Optional unification
- if is_spin: - fitting_output_def = model.model_output_def().def_outp - else: - fitting_output_def = model.atomic_output_def() + fitting_output_def = model.model_output_def().def_outp🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/utils/serialization.py` around lines 258 - 282, The current branching between is_spin and non-spin to set fitting_output_def is redundant; always use model.model_output_def().def_outp (which wraps atomic_output_def() for non-spin) so remove the if/else and assign fitting_output_def = model.model_output_def().def_outp, keeping the existing mask filter (if is_spin and vdef.name == "mask": continue) and the rest of the loop unchanged; update any comments if desired to reflect the unified source.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@deepmd/pt_expt/utils/serialization.py`:
- Around line 258-282: The current branching between is_spin and non-spin to set
fitting_output_def is redundant; always use model.model_output_def().def_outp
(which wraps atomic_output_def() for non-spin) so remove the if/else and assign
fitting_output_def = model.model_output_def().def_outp, keeping the existing
mask filter (if is_spin and vdef.name == "mask": continue) and the rest of the
loop unchanged; update any comments if desired to reflect the unified source.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 614c407b-a536-47ea-a624-eca9d58b086a
📒 Files selected for processing (2)
deepmd/pt_expt/infer/deep_eval.pydeepmd/pt_expt/utils/serialization.py
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/pt_expt/infer/deep_eval.py
| tests that construct a non-spin archive and then swap | ||
| :attr:`_dpmodel` to a :class:`SpinModel` instance after load. | ||
| """ | ||
| if bool(getattr(self, "_is_spin", False)): |
There was a problem hiding this comment.
use self._is_spin ?
The pt_expt DeepEval's inference path runs through aoti_load_package / the exported module;
self._dpmodelis only used to resolve metadata (rcut / type_map / atomic_output_def / dim_fparam / ...), which is already available in extra/metadata.json (the contract the C++ reader DeepPotPTExpt enforces). Drop the requirement that extra/model.json be present:Also fixes two metadata-completeness gaps so metadata-only load is exact:
sel_typefield so get_sel_type works without a dpmodel round-trip (relevant for dipole / polar / wfc).Archives produced by existing pt_expt serialisation still contain model.json and continue to use the dpmodel path unchanged. Regression covered by 77 existing tests in test_deep_eval.py + a dedicated new suite (test_deep_eval_metadata_only.py) that strips extra/model.json and asserts bitwise parity against the full archive.
Summary by CodeRabbit
New Features
Behavior Changes
Tests