Skip to content

ci(manfest) updates the commit-latest script and action for the updated repo structure #3304

Open
mtchoum1 wants to merge 1 commit intomainfrom
mtchoum1/commit-action
Open

ci(manfest) updates the commit-latest script and action for the updated repo structure #3304
mtchoum1 wants to merge 1 commit intomainfrom
mtchoum1/commit-action

Conversation

@mtchoum1
Copy link
Copy Markdown
Contributor

@mtchoum1 mtchoum1 commented Apr 7, 2026

Description

changed scripts/update-commit-latest-env.py to look at both odh and rhoai folders in manfest for params-latest
changed .github/workflows/update-commit-latest-env.yaml' to utilize the update of scripts/update-commit-latest-env.py`

How Has This Been Tested?

Self checklist (all need to be checked):

  • Ensure that you have run make test (gmake on macOS) before asking for review
  • Changes to everything except Dockerfile.konflux files should be done in odh/notebooks and automatically synced to rhds/notebooks. For Konflux-specific changes, modify Dockerfile.konflux files directly in rhds/notebooks as these require special attention in the downstream repository and flow to the upcoming RHOAI release.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • Chores
    • Automated manifest update workflow expanded to cover multiple product variants and additional config paths.
    • CI job now includes container registry auth and required tooling installation for reliable runs.
    • Update command and commit logic enhanced to detect and commit changes across a broader set of generated manifest outputs, with clearer "no changes" reporting.

@openshift-ci openshift-ci bot requested review from daniellutz and dibryant April 7, 2026 19:54
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 7, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jiridanek for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 7, 2026

📝 Walkthrough

Walkthrough

The workflow trigger paths were broadened from a single params file to variant-specific patterns for manifests/odh/base/params*.env and manifests/rhoai/base/params*.env, and the job now installs skopeo and git-crypt, unlocks git-crypt using secrets.GIT_CRYPT_KEY, and copies ci/secrets/pull-secret.json to ~/.config/containers/auth.json for registry auth. The manifest update invocation changed to uv run python scripts/update-commit-latest-env.py --all. Commit/push steps were updated to stage and conditionally commit four variant-scoped files: manifests/odh/base/commit-latest.env, manifests/odh/base/commit.env, manifests/rhoai/base/commit-latest.env, and manifests/rhoai/base/commit.env.

The Python script was refactored from single-file inspection to variant-scoped processing: it parses manifests/<variant>/base/params-latest.env to produce commit-latest.env for -n keys (skipping rhoai), merges params.env entries into commit.env for non--n keys while preserving existing unrelated entries, introduces robust env-file utilities (_parse_env_lines, _write_env, _load_env_dict), derives commit keys deterministically via regex-backed parsing and _commit_key, and replaces the old async orchestration with main() handling --variant (repeatable) and --all options while running per-variant work sequentially and failing overall on any variant error.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Actionable Issues

scripts/update-commit-latest-env.py

  • Command injection via shell composition (CWE-78): _inspect_pairs builds shell commands using unescaped image_ref values and runs them with asyncio.create_subprocess_shell(). Replace with asyncio.create_subprocess_exec() or otherwise pass an argument vector to avoid shell interpretation; sanitize/validate image_ref against an allowlist. Cite: CWE-78.

  • Path traversal / unvalidated path construction (CWE-22): Variant names and constructed paths for manifests/<variant>/... are used without whitelist validation or post-resolution containment checks. Validate variant identifiers (whitelist), construct paths via Path.joinpath, and assert resolved paths remain within the repository manifest directory (e.g., compare .resolve() prefixes). Cite: CWE-22.

  • Sensitive data leakage in structured logs (CWE-532): Logging may emit vcs_ref, file contents, or other secrets parsed from env files. Redact keys matching patterns (token, secret, password, key) and avoid logging full env contents. Consider structured logging filters or a redaction layer. Cite: CWE-532.

  • Silent skipping of malformed env lines / weak input validation: _load_env_dict currently ignores malformed/comment lines; _parse_env_lines raises for missing = but behavior differs. Normalize and harden parsing: fail fast with clear errors for malformed files or emit warnings with file/line number. Validate keys against expected naming patterns before merging. This prevents hidden parsing errors and potential injection via malformed input.

  • Inconsistent failure semantics across variants: The new flow runs variants sequentially and sets a non-zero exit code on failure but continues processing remaining variants. Consider whether partial updates are acceptable; if not, fail fast on first error to avoid mixed-state commits.

.github/workflows/update-commit-latest-env.yaml

  • Unverified decryption of git-crypt key / lack of integrity checks (CWE-347): The workflow unlocks encrypted files via git-crypt using secrets.GIT_CRYPT_KEY but performs no integrity verification of critical manifests post-decryption. Add explicit verification (checksums, GPG signatures, or repository provenance checks) before using decrypted artifacts. Cite: CWE-347.

  • Copying pull-secret.json without validation (CWE-200 / CWE-732): The workflow copies ci/secrets/pull-secret.json into container auth path without checking file existence, readability, or schema/structure. Validate the file, ensure minimal privileges, and avoid persisting secrets in readable logs or artifacts. Cite: CWE-200 and CWE-732.

  • Use of elevated tooling without containment: The job installs skopeo and git-crypt and uses them on inputs derived from repository files. Ensure the runner environment is constrained, inputs are validated (see command injection/path traversal), and secrets required by these tools are handled with least privilege and never printed to logs.

Only actionable issues are listed above; address these to reduce risk of code execution, path traversal, secret leakage, and integrity failures.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Title check ⚠️ Warning Title violates imperative mood requirement (uses "updates" instead of "update") and contains typo ("manfest" should be "manifest"). Correct to: 'NO-JIRA: ci(manifest): update commit-latest script and action for updated repo structure' or add appropriate ticket reference.
Description check ❓ Inconclusive Description lacks detail on testing methodology and how changes were validated against the actual operational requirements. Expand 'How Has This Been Tested?' section with specific test commands run, environment details, and verification that both odh and rhoai variants process correctly with actual image URLs.
✅ Passed checks (1 passed)
Check name Status Explanation
Branch Prefix Policy ✅ Passed PR targets main branch with conventional commit format title without branch prefix, complying with branch prefix policy for human-authored commits.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the size/l label Apr 7, 2026
@github-actions github-actions bot added the review-requested GitHub Bot creates notification on #pr-review-ai-ide-team slack channel label Apr 7, 2026
@mtchoum1 mtchoum1 force-pushed the mtchoum1/commit-action branch from 3124f74 to 39b5a3c Compare April 7, 2026 19:58
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 7, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 0% with 155 lines in your changes missing coverage. Please review.
✅ Project coverage is 3.23%. Comparing base (ec874cc) to head (b883987).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
scripts/update-commit-latest-env.py 0.00% 155 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff            @@
##            main   #3304      +/-   ##
========================================
- Coverage   3.34%   3.23%   -0.11%     
========================================
  Files         31      31              
  Lines       3562    3680     +118     
  Branches     572     598      +26     
========================================
  Hits         119     119              
- Misses      3441    3559     +118     
  Partials       2       2              
Flag Coverage Δ
python 3.23% <0.00%> (-0.11%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
scripts/update-commit-latest-env.py 0.00% <0.00%> (ø)

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec874cc...b883987. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mtchoum1 mtchoum1 force-pushed the mtchoum1/commit-action branch from 39b5a3c to a7acbb9 Compare April 7, 2026 19:59
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 7, 2026
@jiridanek
Copy link
Copy Markdown
Member

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 8, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 8, 2026
# Using 'docker://' prefix is required for skopeo to identify the transport.
full_image_url = f"docker://{image_url}"

# Use 'inspect --config' which is much faster as it only fetches the config blob.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove the comments, IMO quite nonobvious info

try:
async with semaphore:
log.info(f"Semaphore acquired, starting skopeo inspect for: {image_url}")
# Create an asynchronous subprocess
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, these can be removed

"""
Main function to orchestrate the concurrent inspection of multiple images.
"""
semaphore = asyncio.Semaphore(22) # Limit concurrent skopeo processes
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is somewhat valuable comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for letting me know they we important I'll bring them back

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/update-commit-latest-env.yaml (1)

6-7: ⚠️ Potential issue | 🟡 Minor

Cron comment is incorrect: 0 11 * * * is 11:00 UTC, not midnight.

Either fix the comment or adjust the cron expression.

-    - cron: '0 11 * * *'  # Daily at midnight UTC
+    - cron: '0 11 * * *'  # Daily at 11:00 UTC
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/update-commit-latest-env.yaml around lines 6 - 7, The cron
comment for the schedule block is wrong: the existing schedule: - cron: '0 11 *
* *' runs at 11:00 UTC, not midnight; update either the cron expression to '0 0
* * *' if you want daily midnight UTC, or change the comment to "Daily at 11:00
UTC" to match the current cron, making the edit next to the schedule -> cron: '0
11 * * *' entry.
🧹 Nitpick comments (4)
scripts/update-commit-latest-env.py (3)

174-174: Magic number 22 for semaphore concurrency limit.

Extract to a named constant or CLI argument for clarity and easier tuning.

+MAX_CONCURRENT_INSPECTIONS = 22
+
 async def inspect_urls(urls: typing.Iterable[str]) -> dict[str, str | None]:
     unique = list(dict.fromkeys(urls))
-    semaphore = asyncio.Semaphore(22)
+    semaphore = asyncio.Semaphore(MAX_CONCURRENT_INSPECTIONS)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` at line 174, Replace the hard-coded
semaphore = asyncio.Semaphore(22) with a named constant or configurable
parameter: introduce a constant like MAX_CONCURRENT_TASKS (or add a CLI/ENV
argument parsed in main) and use asyncio.Semaphore(MAX_CONCURRENT_TASKS) so the
concurrency limit is clearly named and can be tuned; update any help text or
defaults where the script parses arguments and use that symbol in place of the
magic number in scripts/update-commit-latest-env.py.

57-65: Missing file existence check before open.

load_params_env will raise FileNotFoundError with a raw traceback. Consider checking existence and providing a clearer error message, especially since the workflow depends on both files being present.

Proposed fix
 def load_params_env(path: pathlib.Path) -> list[tuple[str, str]]:
+    if not path.is_file():
+        log.error("params-latest.env not found", path=str(path))
+        sys.exit(1)
     with open(path, "rt") as file:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 57 - 65, The function
load_params_env currently opens the given pathlib.Path directly and will raise a
raw FileNotFoundError; add an explicit existence check at the top of
load_params_env (use path.exists() and/or path.is_file()) and raise a
FileNotFoundError (or ValueError) with a clear, descriptive message that
includes the path value if the file is missing or not a regular file, so callers
get a readable error instead of an opaque traceback.

78-82: Inconsistent exit patterns: mix of sys.exit(1) and raise SystemExit(1).

Lines 78, 82, 99, 180 use sys.exit(1) while line 200 uses raise SystemExit(1). Pick one for consistency—sys.exit(1) is more common in scripts.

-            raise SystemExit(1)
+            sys.exit(1)

Also applies to: 99-100, 199-200

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 78 - 82, The script mixes
exit styles (sys.exit(1) and raise SystemExit(1)); standardize to sys.exit(1)
throughout: replace the raise SystemExit(1) occurrence with sys.exit(1) so all
exits use sys.exit, ensuring the sys module is already imported and leaving
existing sys.exit(1) calls (e.g., in the blocks referencing odh_by_var[var] and
dummy checks) unchanged; update every instance (around the formerly raise
SystemExit usage and the other noted exit calls) to use sys.exit(1) for
consistency.
.github/workflows/update-commit-latest-env.yaml (1)

46-54: git diff may fail if files don't exist yet.

If commit-latest.env files are newly created (first run), git diff on untracked files returns 0 (no error, no diff shown), but git add will stage them. The logic works, but consider using git diff --exit-code explicitly or checking for untracked files to capture new file creation.

More robust change detection
-          if ! git diff --quiet \
-            manifests/odh/base/commit-latest.env \
-            manifests/rhoai/base/commit-latest.env
-          then
-            git add manifests/odh/base/commit-latest.env manifests/rhoai/base/commit-latest.env
+          git add manifests/odh/base/commit-latest.env manifests/rhoai/base/commit-latest.env
+          if ! git diff --cached --quiet; then
             git commit -m "ci: update commit SHAs for image digests changes"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/update-commit-latest-env.yaml around lines 46 - 54, The
current check using "git diff --quiet" can miss newly added untracked files;
change the flow to stage the two commit-latest.env files first (git add
manifests/odh/base/commit-latest.env manifests/rhoai/base/commit-latest.env) and
then test staged changes with "git diff --cached --quiet --exit-code" (or "! git
diff --cached --quiet --exit-code" in the if condition) before running git
commit and git push; update the conditional around the staging/commit logic so
new files are detected reliably by referencing the existing manifest file names
and the git add/git diff --cached commands.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/update-commit-latest-env.py`:
- Around line 96-100: The current re.sub(r"-n$", "-commit-n", var) silently
leaves var unchanged when it doesn't end with "-n"; update the logic around the
lines.append call to validate var first (e.g., check var.endswith("-n") or use
re.search(r"-n$") on var) and if it doesn't match, call log.error (include var
and url) and exit or skip with a warning, then only append (modified_var,
vcs[:7]) when the validation passes so you avoid producing malformed output;
refer to the variables/functions vcs_by_url, vcs, var, lines.append and the
re.sub usage to locate where to add the check and error handling.
- Around line 53-54: resolve_manifest_dir currently resolves a user-supplied
path against PROJECT_ROOT but does not validate that the resolved path remains
inside PROJECT_ROOT, allowing path traversal; update resolve_manifest_dir to
resolve the candidate (use .resolve(strict=False) if needed) and then verify the
resulting path is inside PROJECT_ROOT (e.g., using
Path.is_relative_to(PROJECT_ROOT) or comparing parents via a loop) and raise an
exception (ValueError) if it is not; ensure the function still returns the
resolved path when validation passes.

---

Outside diff comments:
In @.github/workflows/update-commit-latest-env.yaml:
- Around line 6-7: The cron comment for the schedule block is wrong: the
existing schedule: - cron: '0 11 * * *' runs at 11:00 UTC, not midnight; update
either the cron expression to '0 0 * * *' if you want daily midnight UTC, or
change the comment to "Daily at 11:00 UTC" to match the current cron, making the
edit next to the schedule -> cron: '0 11 * * *' entry.

---

Nitpick comments:
In @.github/workflows/update-commit-latest-env.yaml:
- Around line 46-54: The current check using "git diff --quiet" can miss newly
added untracked files; change the flow to stage the two commit-latest.env files
first (git add manifests/odh/base/commit-latest.env
manifests/rhoai/base/commit-latest.env) and then test staged changes with "git
diff --cached --quiet --exit-code" (or "! git diff --cached --quiet --exit-code"
in the if condition) before running git commit and git push; update the
conditional around the staging/commit logic so new files are detected reliably
by referencing the existing manifest file names and the git add/git diff
--cached commands.

In `@scripts/update-commit-latest-env.py`:
- Line 174: Replace the hard-coded semaphore = asyncio.Semaphore(22) with a
named constant or configurable parameter: introduce a constant like
MAX_CONCURRENT_TASKS (or add a CLI/ENV argument parsed in main) and use
asyncio.Semaphore(MAX_CONCURRENT_TASKS) so the concurrency limit is clearly
named and can be tuned; update any help text or defaults where the script parses
arguments and use that symbol in place of the magic number in
scripts/update-commit-latest-env.py.
- Around line 57-65: The function load_params_env currently opens the given
pathlib.Path directly and will raise a raw FileNotFoundError; add an explicit
existence check at the top of load_params_env (use path.exists() and/or
path.is_file()) and raise a FileNotFoundError (or ValueError) with a clear,
descriptive message that includes the path value if the file is missing or not a
regular file, so callers get a readable error instead of an opaque traceback.
- Around line 78-82: The script mixes exit styles (sys.exit(1) and raise
SystemExit(1)); standardize to sys.exit(1) throughout: replace the raise
SystemExit(1) occurrence with sys.exit(1) so all exits use sys.exit, ensuring
the sys module is already imported and leaving existing sys.exit(1) calls (e.g.,
in the blocks referencing odh_by_var[var] and dummy checks) unchanged; update
every instance (around the formerly raise SystemExit usage and the other noted
exit calls) to use sys.exit(1) for consistency.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Repository UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: b9eb554d-0fbe-4fbd-ba63-89960ae7d50a

📥 Commits

Reviewing files that changed from the base of the PR and between 715a38f and a7acbb9.

📒 Files selected for processing (4)
  • .github/workflows/update-commit-latest-env.yaml
  • manifests/odh/base/commit-latest.env
  • manifests/rhoai/base/commit-latest.env
  • scripts/update-commit-latest-env.py

Comment on lines +68 to +72
def resolve_rhoai_urls(
rhoai_pairs: list[tuple[str, str]],
odh_by_var: dict[str, str],
) -> list[tuple[str, str]]:
"""Replace RHOAI dummy placeholders with the same variable's image URL from ODH params."""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, what? @coderabbitai how can this possibly produce the right answer?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

Let me look at the actual env files and the failing tests to understand this fully.


🏁 Script executed:

#!/bin/bash
# Check variable naming in params-latest.env
echo "=== ODH params-latest.env ==="
head -30 manifests/odh/base/params-latest.env 2>/dev/null || echo "NOT FOUND"
echo ""
echo "=== RHOAI params-latest.env ==="
head -30 manifests/rhoai/base/params-latest.env 2>/dev/null || echo "NOT FOUND"
echo ""
# Look at failing tests
echo "=== test_image_manifests_version_alignment ==="
rg -n "test_image_manifests_version_alignment" --type py -A 40
echo ""
echo "=== test_image_pyprojects ==="
rg -n "test_image_pyprojects" --type py -A 20
echo ""
# Check how params-latest and commit-latest are related in tests
echo "=== conftest / fixtures ==="
rg -n "manifest_directory" --type py -A 5 -B 5 | head -100

Repository: opendatahub-io/notebooks
Length of output: 10530


🏁 Script executed:

#!/bin/bash
echo "=== conftest.py manifests_directory fixture ==="
rg -n "manifests_directory" --type py -A 10 -B 2

echo ""
echo "=== load_manifests_file_for ==="
rg -n "load_manifests_file_for" --type py -A 30 -B 2

echo ""
echo "=== Manifest class ==="
rg -n "class Manifest" --type py -A 20

echo ""
echo "=== old manifests/base directory ==="
ls manifests/base/ 2>/dev/null || echo "NOT FOUND (old path)"
ls manifests/odh/base/ 2>/dev/null
ls manifests/rhoai/base/ 2>/dev/null

Repository: opendatahub-io/notebooks
Length of output: 22984


🏁 Script executed:

#!/bin/bash
echo "=== MANIFESTS_ODH_DIR and MANIFESTS_RHOAI_DIR definitions ==="
rg -n "MANIFESTS_ODH_DIR\|MANIFESTS_RHOAI_DIR" --type py -A 2 -B 2

echo ""
echo "=== workflow file ==="
cat .github/workflows/update-commit-latest-env.yaml 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== Check if manifests/odh/base/base exists ==="
ls manifests/odh/base/base/ 2>/dev/null || echo "manifests/odh/base/base/ does NOT exist"
ls manifests/rhoai/base/base/ 2>/dev/null || echo "manifests/rhoai/base/base/ does NOT exist"

Repository: opendatahub-io/notebooks
Length of output: 1877


🏁 Script executed:

#!/bin/bash
echo "=== Search for MANIFESTS constants in tests/manifests.py ==="
cat tests/manifests.py | head -60

echo ""
echo "=== Explicit grep for MANIFESTS dir constants ==="
grep -n "MANIFESTS" tests/manifests.py | head -30

echo ""
echo "=== Check rstudio-notebook-imagestream.yaml in rhoai ==="
ls manifests/rhoai/base/rstudio-notebook-imagestream.yaml 2>/dev/null || echo "MISSING in rhoai"
ls manifests/odh/base/rstudio-notebook-imagestream.yaml 2>/dev/null || echo "MISSING in odh"

Repository: opendatahub-io/notebooks
Length of output: 4324


@jiridanek — here's a full trace of what resolve_rhoai_urls does and why it produces the correct result:

The design: manifests/rhoai/base/params-latest.env uses dummy for every single one of its 18 entries (confirmed from the actual file). The variable names are identical to ODH's. resolve_rhoai_urls substitutes each dummy with the real image URL from ODH params-latest.env by matching on variable name:

RHOAI: odh-workbench-jupyter-minimal-cpu-py312-ubi9-n = dummy
ODH:   odh-workbench-jupyter-minimal-cpu-py312-ubi9-n = quay.io/opendatahub/odh-workbench-jupyter-minimal-cpu-py312-ubi9:3.4-v1.43

After substitution, RHOAI's resolved pairs are the ODH image URLs → skopeo inspect is called on those to get vcs-ref → RHOAI's commit-latest.env ends up with the same commit hashes as ODH's. This matches the PR's actual diff: both odh/base/commit-latest.env and rhoai/base/commit-latest.env were updated to the exact same hashes (7f2342d, 335acea, c2e4c8a, etc.).

So it does produce the right answer — the premise is that RHOAI and ODH are built from the same git source, so ODH image vcs-ref values are authoritative for both. The dummy convention is an explicit declaration of that dependency.


However, the 2 failing test_image_manifests_version_alignment tests appear to be a separate, unrelated issue that this PR has exposed:

  • manifests/odh/base/rstudio-notebook-imagestream.yaml ✅ exists
  • manifests/rhoai/base/rstudio-notebook-imagestream.yamlmissing

The test is parametrized over both MANIFESTS_ODH_DIR (manifests/odh) and MANIFESTS_RHOAI_DIR (manifests/rhoai), and get_source_of_truth_filepath will try to open manifests/rhoai/base/rstudio-notebook-imagestream.yaml which doesn't exist, raising FileNotFoundError. The RHOAI base/ directory appears to be missing several imagestream files compared to ODH.

Could you clarify — is rstudio-notebook-imagestream.yaml intentionally absent from RHOAI (in which case the test needs a guard), or does it need to be added/symlinked?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the premise is that RHOAI and ODH are built from the same git source, so ODH image vcs-ref values are authoritative for both

@coderabbitai exactly, this is a false premise, RHOAI/rhds images are built from rhoai-3.x branch in red-hat-data-services/notebooks repo, the commit hash will be naturally different

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== RHOAI params-latest.env (full) ==="
cat manifests/rhoai/base/params-latest.env 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ODH params-latest.env (first few lines) ==="
head -10 manifests/odh/base/params-latest.env 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== RHOAI commit-latest.env (from this PR) ==="
cat manifests/rhoai/base/commit-latest.env 2>/dev/null || echo "NOT FOUND"

echo ""
echo "=== ODH commit-latest.env (from this PR) ==="
cat manifests/odh/base/commit-latest.env 2>/dev/null || echo "NOT FOUND"

Repository: opendatahub-io/notebooks
Length of output: 4961


@jiridanek — confirmed. The data makes the problem unambiguous: manifests/rhoai/base/commit-latest.env and manifests/odh/base/commit-latest.env are byte-for-byte identical for all 18 shared entries in this PR's diff. That is the smoking gun.

The bug chain in resolve_rhoai_urls:

  1. RHOAI params-latest.env — every entry is dummy
  2. resolve_rhoai_urls substitutes each dummy with the ODH image URL (e.g. quay.io/opendatahub/odh-workbench-jupyter-minimal-cpu-py312-ubi9:3.4-v1.43)
  3. skopeo inspect runs against those ODH images → returns a vcs-ref pointing to a commit in opendatahub-io/notebooks
  4. That ODH commit hash is written into manifests/rhoai/base/commit-latest.env

So commit-latest.env for RHOAI is tracking ODH source commits, not RHOAI/RHDS source commits from red-hat-data-services/notebooks @ rhoai-3.x.

The root cause is that manifests/rhoai/base/params-latest.env must be populated with actual RHOAI image URLs (the images built from red-hat-data-services/notebooks) rather than dummy. Only then will skopeo inspect return the correct vcs-ref values from the RHOAI build branch. The dummy substitution mechanism, as designed here, can never yield correct results for RHOAI because it always redirects to ODH images.


📝 Learnings were identified and not saved (knowledge base disabled). Enable

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai manifests/rhoai/base/params-latest.env must remain dummy, it is populated by rhods-operator during openshift deployment, as described at https://github.com/opendatahub-io/notebooks/blob/main/manifests/README.md. dummy values serve to catch when for some reason the substitutions don't happen correctly; it's better to fail the deployment than to deploy the wrong ("fallback") images

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek — understood and noted. That is an important design constraint.

This actually strengthens the bug finding: the resolve_rhoai_urls function, rather than being a valid workaround, is definitively broken for RHOAI. Because:

  1. RHOAI params-latest.env always has dummy values in the repo (by design, as a deployment-safety guard)
  2. resolve_rhoai_urls replaces those dummy values with ODH image URLs from manifests/odh/base/params-latest.env
  3. skopeo inspect then runs against those ODH images → returns vcs-ref from opendatahub-io/notebooks
  4. Those ODH commit hashes land in manifests/rhoai/base/commit-latest.env — which should track commits from red-hat-data-services/notebooks @ rhoai-3.x

The result is that RHOAI's commit-latest.env silently contains the wrong commit hashes. The CI script, as written in the diff, can never produce correct values for RHOAI — it would need the actual RHOAI image URLs (the real digests served by the registry) to be provided at execution time through a different mechanism (e.g., passed in via CI variable or fetched from the registry catalog), rather than falling back to ODH's params-latest.env.


📝 Learnings were identified and not saved (knowledge base disabled). Enable

@opendatahub-io opendatahub-io deleted a comment from coderabbitai bot Apr 8, 2026
@mtchoum1 mtchoum1 force-pushed the mtchoum1/commit-action branch from a7acbb9 to f01e078 Compare April 8, 2026 12:59
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 8, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
scripts/update-commit-latest-env.py (2)

68-86: ⚠️ Potential issue | 🔴 Critical

resolve_rhoai_urls is writing RHOAI output from ODH images.

With the current manifests, every rhoai entry is dummy, so this function always swaps in the matching ODH URL and inspects that image. The generated manifests/rhoai/base/commit-latest.env therefore records ODH vcs-ref values instead of the RHDS/RHOAI commits. Fail fast on dummy here until a real RHOAI image source exists; reusing ODH URLs cannot produce correct RHOAI SHAs.

Minimal safe guard
 def resolve_rhoai_urls(
     rhoai_pairs: list[tuple[str, str]],
     odh_by_var: dict[str, str],
 ) -> list[tuple[str, str]]:
     """Replace RHOAI dummy placeholders with the same variable's image URL from ODH params."""
     out: list[tuple[str, str]] = []
     for var, raw in rhoai_pairs:
         if raw.strip().lower() == DUMMY_IMAGE:
-            if var not in odh_by_var:
-                log.error("No ODH params-latest entry to resolve RHOAI dummy image", variable=var)
-                sys.exit(1)
-            url = odh_by_var[var]
-            if url.strip().lower() == DUMMY_IMAGE:
-                log.error("ODH params-latest still has dummy for variable", variable=var)
-                sys.exit(1)
-            out.append((var, url))
+            log.error(
+                "RHOAI params-latest.env entry cannot be 'dummy'; provide the real RHOAI image URL",
+                variable=var,
+            )
+            sys.exit(1)
         else:
             out.append((var, raw))
     return out
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 68 - 86, The function
resolve_rhoai_urls currently replaces RHOAI DUMMY_IMAGE entries with ODH URLs
(using rhoai_pairs and odh_by_var), which causes RHOAI commits to be recorded
from ODH; change the behavior to fail-fast when a rhoai entry is DUMMY_IMAGE:
for each (var, raw) in resolve_rhoai_urls, if raw.strip().lower() == DUMMY_IMAGE
log an error including the variable name and exit (do not substitute
odh_by_var[var] or append it to out); only allow non-dummy rhoai values to be
appended to out so RHOAI SHAs come from real RHOAI image sources.

53-54: ⚠️ Potential issue | 🔴 Critical

CWE-22: --odh-dir / --rhoai-dir can escape the checkout and overwrite files outside the repo.

Severity: critical. A caller can pass an absolute path or ../../..., and Line 54 will then read params-latest.env and write commit-latest.env outside PROJECT_ROOT. Canonicalize the candidate first and reject anything that does not stay under the repository root.

Proposed fix
 def resolve_manifest_dir(raw: pathlib.Path) -> pathlib.Path:
-    return raw if raw.is_absolute() else (PROJECT_ROOT / raw).resolve()
+    project_root = PROJECT_ROOT.resolve()
+    candidate = raw if raw.is_absolute() else project_root / raw
+    resolved = candidate.resolve(strict=False)
+    try:
+        resolved.relative_to(project_root)
+    except ValueError:
+        log.error("Manifest directory must stay within the repository", path=str(resolved))
+        sys.exit(1)
+    return resolved

As per coding guidelines, "Validate file paths (prevent path traversal)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 53 - 54, The
resolve_manifest_dir function currently allows absolute or traversal paths to
escape PROJECT_ROOT; change it to canonicalize the candidate path after joining
(use .resolve()) and then verify it is inside PROJECT_ROOT (e.g., using
Path.is_relative_to or try: candidate.relative_to(PROJECT_ROOT) except
ValueError) and raise an exception or reject the input if the resolved path is
not under PROJECT_ROOT; update any callers that use resolve_manifest_dir (e.g.,
where params-latest.env / commit-latest.env are read/written) to handle the
rejection instead of proceeding.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@scripts/update-commit-latest-env.py`:
- Around line 68-86: The function resolve_rhoai_urls currently replaces RHOAI
DUMMY_IMAGE entries with ODH URLs (using rhoai_pairs and odh_by_var), which
causes RHOAI commits to be recorded from ODH; change the behavior to fail-fast
when a rhoai entry is DUMMY_IMAGE: for each (var, raw) in resolve_rhoai_urls, if
raw.strip().lower() == DUMMY_IMAGE log an error including the variable name and
exit (do not substitute odh_by_var[var] or append it to out); only allow
non-dummy rhoai values to be appended to out so RHOAI SHAs come from real RHOAI
image sources.
- Around line 53-54: The resolve_manifest_dir function currently allows absolute
or traversal paths to escape PROJECT_ROOT; change it to canonicalize the
candidate path after joining (use .resolve()) and then verify it is inside
PROJECT_ROOT (e.g., using Path.is_relative_to or try:
candidate.relative_to(PROJECT_ROOT) except ValueError) and raise an exception or
reject the input if the resolved path is not under PROJECT_ROOT; update any
callers that use resolve_manifest_dir (e.g., where params-latest.env /
commit-latest.env are read/written) to handle the rejection instead of
proceeding.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Repository UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 9e646073-9504-42ec-85a4-e81980939766

📥 Commits

Reviewing files that changed from the base of the PR and between a7acbb9 and f01e078.

📒 Files selected for processing (4)
  • .github/workflows/update-commit-latest-env.yaml
  • manifests/odh/base/commit-latest.env
  • manifests/rhoai/base/commit-latest.env
  • scripts/update-commit-latest-env.py
✅ Files skipped from review due to trivial changes (1)
  • manifests/rhoai/base/commit-latest.env
🚧 Files skipped from review as they are similar to previous changes (2)
  • .github/workflows/update-commit-latest-env.yaml
  • manifests/odh/base/commit-latest.env

@openshift-ci openshift-ci bot removed the size/l label Apr 8, 2026
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 8, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
scripts/update-commit-latest-env.py (1)

168-180: ⚠️ Potential issue | 🟡 Minor

Inconsistent error handling: _load_env_dict silently discards malformed lines.

_parse_env_lines (line 155) correctly raises ValueError on malformed rows, but _load_env_dict (line 176-177) silently skips them. A typo in commit.env becomes silent data loss when the file is re-written.

Proposed fix
 def _load_env_dict(path: pathlib.Path) -> dict[str, str]:
     d: dict[str, str] = {}
     if not path.is_file():
         return d
-    for line in path.read_text().splitlines():
+    for lineno, raw_line in enumerate(path.read_text(encoding="utf-8").splitlines(), start=1):
-        line = line.strip()
+        line = raw_line.strip()
         if not line or line.startswith("#"):
             continue
         if "=" not in line:
-            continue
+            raise ValueError(f"{path}:{lineno}: expected KEY=VALUE")
         k, _, v = line.partition("=")
         d[k.strip()] = v.strip()
     return d
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 168 - 180, The helper
_load_env_dict currently swallows malformed env lines (it skips lines without
"=") causing silent data loss; change _load_env_dict to mirror _parse_env_lines
by validating each non-empty, non-comment line and raising a ValueError for
malformed rows (include the offending line and path or line number in the
message) instead of silently skipping, so callers rewriting commit.env will fail
fast on typos; locate and update the logic inside the _load_env_dict function to
perform the same validation behavior as _parse_env_lines.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/update-commit-latest-env.py`:
- Around line 257-259: The assert that checks vcs is not None in the loop over
pairs/results should be replaced with an explicit conditional that raises or
logs an error and skips or aborts consistently (like the check in
sync_commit_latest); locate the loop using variables pairs, results, commit_key,
vcs and replace the assert vcs is not None with an if not vcs: ... block that
either raises a RuntimeError (including commit_key and context) or handles the
missing value and then sets existing[commit_key] = vcs[:7] only when vcs is
present.

---

Duplicate comments:
In `@scripts/update-commit-latest-env.py`:
- Around line 168-180: The helper _load_env_dict currently swallows malformed
env lines (it skips lines without "=") causing silent data loss; change
_load_env_dict to mirror _parse_env_lines by validating each non-empty,
non-comment line and raising a ValueError for malformed rows (include the
offending line and path or line number in the message) instead of silently
skipping, so callers rewriting commit.env will fail fast on typos; locate and
update the logic inside the _load_env_dict function to perform the same
validation behavior as _parse_env_lines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Repository UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2d5a62fe-e8fc-40a4-aab8-2ea64d43fd05

📥 Commits

Reviewing files that changed from the base of the PR and between 4f7dc5c and 8e18e70.

📒 Files selected for processing (4)
  • .github/workflows/update-commit-latest-env.yaml
  • manifests/odh/base/commit-latest.env
  • manifests/rhoai/base/commit-latest.env
  • scripts/update-commit-latest-env.py
✅ Files skipped from review due to trivial changes (2)
  • manifests/rhoai/base/commit-latest.env
  • manifests/odh/base/commit-latest.env
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/update-commit-latest-env.yaml

@mtchoum1 mtchoum1 force-pushed the mtchoum1/commit-action branch from 8e18e70 to cc1c7bf Compare April 8, 2026 20:12
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 8, 2026
@mtchoum1 mtchoum1 force-pushed the mtchoum1/commit-action branch from cc1c7bf to b837269 Compare April 8, 2026 20:12
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 8, 2026
@mtchoum1 mtchoum1 force-pushed the mtchoum1/commit-action branch from b837269 to b31a687 Compare April 8, 2026 20:13
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 8, 2026
@mtchoum1 mtchoum1 force-pushed the mtchoum1/commit-action branch from b31a687 to e8af878 Compare April 8, 2026 20:14
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 8, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
scripts/update-commit-latest-env.py (2)

174-185: ⚠️ Potential issue | 🟠 Major

Fail fast on malformed commit.env rows.

_load_env_dict() still swallows non-comment lines without =, then _write_env() rewrites the file without them. A typo in commit.env becomes silent data loss on the next run. Mirror _parse_env_lines() here.

Proposed fix
 def _load_env_dict(path: pathlib.Path) -> dict[str, str]:
     d: dict[str, str] = {}
     if not path.is_file():
         return d
-    for line in path.read_text().splitlines():
-        line = line.strip()
+    for lineno, raw_line in enumerate(path.read_text(encoding="utf-8").splitlines(), start=1):
+        line = raw_line.strip()
         if not line or line.startswith("#"):
             continue
         if "=" not in line:
-            continue
+            raise ValueError(f"_load_env_dict: {path}:{lineno}: expected KEY=VALUE, got {raw_line!r}")
         k, _, v = line.partition("=")
         d[k.strip()] = v.strip()
     return d

As per coding guidelines, "REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 174 - 185, The
_load_env_dict function currently ignores non-comment lines missing an "=" which
silently drops malformed rows; update _load_env_dict to fail fast by raising an
explicit ValueError (including the offending line text and the provided path)
when a non-empty, non-comment line does not contain "=", mirroring the behavior
of _parse_env_lines; keep existing trimming and comment/blank skipping logic and
ensure callers like _write_env will no longer rewrite away malformed entries
silently.

189-212: ⚠️ Potential issue | 🟠 Major

--all cannot succeed for rhoai with placeholder URLs.

This function assumes every params-latest.env value is directly inspectable. With the workflow now always running --all, the rhoai branch will hit checked-in dummy placeholders and call skopeo inspect docker://dummy, then return False before commit.env is refreshed. rhoai needs a CI-time source of real image refs, or an explicit skip/fail-fast path before inspection.

As per coding guidelines, "REVIEW PRIORITIES: 2. Architectural issues and anti-patterns".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 189 - 212, In
sync_commit_latest, detect and handle placeholder image URLs before calling
_inspect_pairs: after building pairs from params_latest (using _PARAM_KEY_RE),
filter out entries whose url is a placeholder (e.g., "dummy" or known
placeholder pattern used for rhoai), log a message indicating those keys were
skipped, and if that filtering leaves no remaining inspectable pairs return True
(or otherwise short-circuit appropriately) so rhoai's checked-in placeholders
don't trigger skopeo inspection and fail the run; otherwise call _inspect_pairs
only with the filtered list and proceed as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/update-commit-latest-env.py`:
- Around line 322-324: Remove the unreachable
sys.exit(asyncio.run(main_async(variants))) call at module bottom: main()
already calls sys.exit() and variants is local to main, so delete the duplicate
asyncio invocation. Keep only the existing if __name__ == "__main__": main()
entry; do not reintroduce module-level references to variants or call main_async
from top-level. Ensure no other code depends on that removed call.

---

Duplicate comments:
In `@scripts/update-commit-latest-env.py`:
- Around line 174-185: The _load_env_dict function currently ignores non-comment
lines missing an "=" which silently drops malformed rows; update _load_env_dict
to fail fast by raising an explicit ValueError (including the offending line
text and the provided path) when a non-empty, non-comment line does not contain
"=", mirroring the behavior of _parse_env_lines; keep existing trimming and
comment/blank skipping logic and ensure callers like _write_env will no longer
rewrite away malformed entries silently.
- Around line 189-212: In sync_commit_latest, detect and handle placeholder
image URLs before calling _inspect_pairs: after building pairs from
params_latest (using _PARAM_KEY_RE), filter out entries whose url is a
placeholder (e.g., "dummy" or known placeholder pattern used for rhoai), log a
message indicating those keys were skipped, and if that filtering leaves no
remaining inspectable pairs return True (or otherwise short-circuit
appropriately) so rhoai's checked-in placeholders don't trigger skopeo
inspection and fail the run; otherwise call _inspect_pairs only with the
filtered list and proceed as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Repository UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 6561ae5c-b5c2-4d4f-b670-df70377cf494

📥 Commits

Reviewing files that changed from the base of the PR and between 4f7dc5c and e8af878.

📒 Files selected for processing (2)
  • .github/workflows/update-commit-latest-env.yaml
  • scripts/update-commit-latest-env.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/update-commit-latest-env.yaml

Comment on lines +322 to +324
if __name__ == "__main__":
main()
sys.exit(asyncio.run(main_async(variants)))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import ast
import pathlib

path = next(pathlib.Path(".").rglob("update-commit-latest-env.py"))
tree = ast.parse(path.read_text(encoding="utf-8"))

class Visitor(ast.NodeVisitor):
    def __init__(self):
        self.scope = []

    def visit_FunctionDef(self, node):
        self.scope.append(f"function {node.name}")
        self.generic_visit(node)
        self.scope.pop()

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Name(self, node):
        if node.id == "variants":
            scope = self.scope[-1] if self.scope else "module scope"
            print(f"Line {node.lineno}: {type(node.ctx).__name__} in {scope}")

Visitor().visit(tree)
PY

Repository: opendatahub-io/notebooks

Length of output: 267


🏁 Script executed:

#!/bin/bash
# Read the relevant section of the file to understand the context
sed -n '310,330p' scripts/update-commit-latest-env.py

Repository: opendatahub-io/notebooks

Length of output: 415


Remove the unreachable sys.exit(asyncio.run(main_async(variants))) on line 324.

The main() function exits via sys.exit() on line 319 and never returns. The duplicate call on line 324 is unreachable dead code, and if sys.exit() is intercepted in tests, it raises NameError because variants is only defined locally within main() and does not exist in module scope.

Proposed fix
 if __name__ == "__main__":
     main()
-    sys.exit(asyncio.run(main_async(variants)))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if __name__ == "__main__":
main()
sys.exit(asyncio.run(main_async(variants)))
if __name__ == "__main__":
main()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 322 - 324, Remove the
unreachable sys.exit(asyncio.run(main_async(variants))) call at module bottom:
main() already calls sys.exit() and variants is local to main, so delete the
duplicate asyncio invocation. Keep only the existing if __name__ == "__main__":
main() entry; do not reintroduce module-level references to variants or call
main_async from top-level. Ensure no other code depends on that removed call.

return True


async def sync_commit_released(variant: str) -> bool:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai is this just dupicated async def sync_commit_latest(variant: str) -> bool?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek — Yes, the docstring body of _inspect_pairs is simply duplicated — the same two sentences are pasted twice verbatim:

async def _inspect_pairs(pairs: list[tuple[str, str]]) -> list[tuple[str, str | None]]:
    """
    Orchestrate concurrent inspection of multiple images (param key → digest URL pairs).

    Limits parallelism with a semaphore (same default concurrency as the historical main-branch script).
    Orchestrate concurrent inspection of multiple images (param key → digest URL pairs).         # ← duplicate

    Limits parallelism with a semaphore (same default concurrency as the historical main-branch script).  # ← duplicate
    """

This is a copy-paste artifact; the function itself is not a duplicate of sync_commit_latest — they do different things (_inspect_pairs runs skopeo concurrently, sync_commit_latest orchestrates reading/writing env files).


While reviewing the file I also noticed a separate bug in the __main__ guard at the bottom of the file:

if __name__ == "__main__":
    main()                                   # calls sys.exit() internally → process ends here
    sys.exit(asyncio.run(main_async(variants)))  # dead code; `variants` is also not defined in this scope → NameError

The second sys.exit(...) line is unreachable dead code and would raise a NameError on variants if it were ever reached. It should simply be removed — main() already handles everything including sys.exit.

if any(commit_hash is None for variable, commit_hash in results):
log.error("Failed to get commit hash for some images. Quitting, please try again to try again, like.")
sys.exit(1)
async def sync_commit_latest(variant: str) -> bool:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai are sync_commit_latest and sync_commit_released duplicates of one another?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek — they are not duplicates, but they are structurally parallel by design. Here's how they differ:

sync_commit_latest sync_commit_released
Input file params-latest.env params.env
Output file commit-latest.env commit.env
Key filter -n suffix only Non--n suffixes (e.g. -1-3)
Missing input Hard error → False Soft skip → True
Write strategy Fresh overwrite Merges into existing entries via _load_env_dict
No applicable keys N/A Skips silently → True

sync_commit_latest handles the nightly/latest image tier (-n tags), while sync_commit_released handles versioned/released images (e.g. -1-3 digests). The merge behavior in sync_commit_released is deliberate — it preserves existing commit.env keys for pipeline images that only appear in params-latest.env.

Two unrelated issues worth flagging while we're here:

  1. Duplicated docstring in _inspect_pairs — the last two sentences in its docstring repeat the first two verbatim (visible in the diff).

  2. Unreachable + broken code at the bottom of the file:

    if __name__ == "__main__":
        main()
        sys.exit(asyncio.run(main_async(variants)))  # unreachable; `variants` not in scope

    The second line can never execute (it follows main() which calls sys.exit), and variants is not defined at module scope — this would raise a NameError if somehow reached.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtchoum1 what's the reason for handling params and params-latest differently, like this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

- 'manifests/base/params-latest.env'
- 'manifests/odh/base/params-latest.env'
- 'manifests/odh/base/params.env'
- 'manifests/rhoai/base/params-latest.env'
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtchoum1 as we discussed with @coderabbitai in #3304 (comment), what can the script possibly usefully do with rhoai/base/params-latest.env? It only contains dummy values

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
cat manifests/rhoai/base/params-latest.env
echo "---"
cat manifests/odh/base/params-latest.env 2>/dev/null | head -30

Repository: opendatahub-io/notebooks
Length of output: 3658


🏁 Script executed:

#!/bin/bash
echo "=== rhoai/base/params.env ==="
cat manifests/rhoai/base/params.env 2>/dev/null | head -30
echo "=== rhoai/base/commit-latest.env (if exists) ==="
cat manifests/rhoai/base/commit-latest.env 2>/dev/null | head -10
echo "=== rhoai/base/commit.env (if exists) ==="
cat manifests/rhoai/base/commit.env 2>/dev/null | head -10

Repository: opendatahub-io/notebooks
Length of output: 6360


@jiridanek, the short answer is: nothing useful. Here's what the data shows:

  • manifests/rhoai/base/params-latest.env has all 18 entries set to dummy — there are no real image URLs.
  • When sync_commit_latest("rhoai") runs, it will attempt skopeo inspect dummy for every entry, fail on each, and abort with False. The rhoai/base/commit-latest.env will never get updated by this workflow.

The reason dummy is there is structurally sound: RHOAI doesn't publish nightly/latest images — it only ships versioned releases via registry.redhat.io (which are tracked in params.env, not params-latest.env). So the -n / "latest" concept simply doesn't apply to RHOAI.

Notably though, manifests/rhoai/base/commit-latest.env does exist and contains real commit SHAs (e.g. d3137ca, bff12e2) that match ODH commits — because RHOAI images are built from the same source as ODH. So RHOAI's commit-latest.env is meant to track the ODH source commits.

The fix should be Option B from the earlier discussion: the script should derive RHOAI's commit-latest.env from ODH's params-latest.env (filtered to the keys that exist in rhoai/base/params-latest.env), rather than trying to inspect dummy RHOAI URLs. The rhoai/base/params-latest.env file essentially acts as a key manifest declaring which images are relevant to RHOAI — the actual URLs to inspect are in ODH.

@mtchoum1, does that match the intent? If so, the script's sync_commit_latest for the rhoai variant needs to be rerouted to read image URLs from odh/base/params-latest.env using the keys from rhoai/base/params-latest.env, rather than trying to resolve the dummy values directly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because RHOAI images are built from the same source as ODH. So RHOAI's commit-latest.env is meant to track the ODH source commits

@coderabbitai nope, RHOAI images are built from red-hat-data-services/notebooks#rhoai-3.x branch; has different commit hashes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek, thank you for the correction — I'll update my analysis.

That changes the picture significantly. Since RHOAI images are built from red-hat-data-services/notebooks#rhoai-3.x with different commit hashes, Option B is also wrong: using ODH's params-latest.env URLs would yield ODH commit SHAs, not RHOAI ones. So both options from the earlier discussion are invalid for rhoai/base/commit-latest.env.

The core issue is that there are no publicly accessible nightly RHOAI image URLsrhoai/base/params-latest.env contains only dummy values because RHOAI ships exclusively through versioned releases on registry.redhat.io. The "latest" concept simply doesn't exist for RHOAI's public-facing images.

This means:

  • sync_commit_latest("rhoai") cannot work in any meaningful way — there's nothing to inspect.
  • rhoai/base/commit-latest.env cannot be automatically maintained by this workflow. It should either be skipped entirely or manually maintained.
  • The workflow's path trigger on manifests/rhoai/base/params-latest.env is a no-op since that file will always contain dummy.

What could still be useful: the script updating rhoai/base/commit.env by inspecting the versioned images from rhoai/base/params.env (which has real registry.redhat.io URLs) — but that requires registry authentication via the pull-secret for those private images.

@mtchoum1, given this, the rhoai commit-latest.env path in the script should likely be guarded with an explicit skip, and the path trigger for manifests/rhoai/base/params-latest.env in the workflow removed. Does that match the intended design?


📝 Learnings were identified and not saved (knowledge base disabled). Enable

@@ -27,21 +32,54 @@ jobs:
token: ${{ secrets.GITHUB_TOKEN }}
persist-credentials: true
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai it's probably a bit of a security theater, but this can be written without having to use persist-credentials: true. See #3202

Up to you if this is worth implementing. The credentials are still in memory and malicious code, if it ever gets to run in our action, can introspect runner's memory. So maybe it's useless.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

gh pr view 3202 --repo opendatahub-io/notebooks --json body,files | jq -r '.body, "\n---FILES---", (.files[].path)'

Repository: opendatahub-io/notebooks
Length of output: 2348


🏁 Script executed:

gh pr diff 3202 --repo opendatahub-io/notebooks | grep -A 30 -B 5 "persist-credentials\|x-access-token\|push\|token"

Repository: opendatahub-io/notebooks
Length of output: 2405


@jiridanek — you're right that it's a limited mitigation, but the concrete benefit is that the token is never written to .git/config on disk, so code executing in the workspace (e.g. a compromised build tool, a malicious dep pulled during uv run) cannot read it via a simple file read. Memory introspection is a higher bar. Worth doing for consistency with the pattern established in #3202.

The change is small — apply the same x-access-token push pattern:

      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
        with:
          persist-credentials: false   # token never written to .git/config
          # no `token:` here — supplied only at push time below
      - name: Commit and push changes
        run: |
          git config --global user.name 'github-actions[bot]'
          git config --global user.email 'github-actions[bot]@users.noreply.github.com'

          changed=0
          for f in \
            manifests/odh/base/commit-latest.env \
            manifests/odh/base/commit.env \
            manifests/rhoai/base/commit-latest.env \
            manifests/rhoai/base/commit.env; do
            if ! git diff --quiet -- "$f" 2>/dev/null; then
              changed=1
              git add -- "$f"
            fi
          done

          if [[ "$changed" -eq 1 ]]; then
            git commit -m "ci: refresh commit SHAs from image vcs-ref labels"
            # Supply token only at push time — never persisted to .git/config
            git push "https://x-access-token:${GITHUB_TOKEN}@github.com/${{ github.repository }}.git" HEAD
          else
            echo "No effective changes in commit env files, skipping commit."
          fi
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

The job-level permissions: contents: write stays as-is — it governs what GITHUB_TOKEN is allowed to do, which is unchanged. This is the same pattern used in #3202 and aligns with the GitHub Security Lab guidance on preventing pwn requests.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkout@v6, release notes:

Improved credential security: persist-credentials now stores credentials in a separate file under $RUNNER_TEMP instead of directly in .git/config

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek — you're right, and I stand corrected. persist-credentials in v6 now stores credentials in a separate file under $RUNNER_TEMP instead of directly in .git/config, which directly addresses the concern I raised. The token is no longer sitting in a workspace-accessible file where a compromised build tool could trivially read it.

Since the workflow already pins actions/checkout at the SHA corresponding to v6.0.2, the persist-credentials: true + git push pattern here is fine as-is. The x-access-token push workaround I suggested was the right advice for v4/v5, but is redundant with v6's new storage location. Withdrawing that suggestion.


📝 Learnings were identified and not saved (knowledge base disabled). Enable

@mtchoum1 mtchoum1 force-pushed the mtchoum1/commit-action branch from e8af878 to b883987 Compare April 9, 2026 16:54
@openshift-ci openshift-ci bot added size/l and removed size/l labels Apr 9, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/update-commit-latest-env.py`:
- Around line 208-219: The loop collecting pairs from _parse_env_lines is
allowed to produce an empty pairs list which then leads to calling
_inspect_pairs and ultimately overwriting commit-latest.env with zero rows;
change the flow to treat empty pairs as a hard failure: after the loop that
fills pairs (using _PARAM_KEY_RE and suffix "-n"), check if pairs is empty and
raise an exception or exit with a nonzero status (e.g., raise RuntimeError or
call sys.exit(1)) with a clear message indicating no valid -n keys found, and do
not call _inspect_pairs or proceed to write output; this preserves the existing
file on bad input and mirrors sync_commit_released()'s conservative behavior.
- Around line 124-129: Reject non-SHA vcs-ref values by validating the vcs_ref
variable before accepting it: ensure vcs_ref matches the expected commit-hash
pattern (e.g. a 40-character hex SHA) and if it doesn't, log a warning/error and
treat it as missing (set vcs_ref to None or raise) so malformed values like
"main" or truncated SHAs are not written to commit*.env; update the branch
around the existing vcs_ref check (the block that logs "Successfully found
vcs-ref" / "vcs-ref label not found" and returns image_url, vcs_ref) to perform
this validation and only log success/return a vcs_ref when it passes the SHA
pattern.
- Around line 91-102: The subprocess call created in the block using semaphore
and process = await asyncio.create_subprocess_exec (...) must be wrapped with a
timeout and proper cleanup: call asyncio.wait_for on process.communicate() with
a sensible timeout, catch asyncio.TimeoutError, log the timeout (include
image_url), then kill the process (process.kill() or process.terminate()), await
process.wait() to reap it, and ensure you still capture/handle stdout/stderr or
set returncode appropriately before returning; keep the semaphore context (async
with semaphore) intact and add logging around the timeout and cleanup to avoid
permanently burning a semaphore slot.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Repository UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 77292ed0-a53e-48e3-bff9-06c4798ac8d3

📥 Commits

Reviewing files that changed from the base of the PR and between e8af878 and b883987.

📒 Files selected for processing (2)
  • .github/workflows/update-commit-latest-env.yaml
  • scripts/update-commit-latest-env.py
✅ Files skipped from review due to trivial changes (1)
  • .github/workflows/update-commit-latest-env.yaml

Comment on lines 91 to 102
try:
async with semaphore:
log.info(f"Semaphore acquired, starting skopeo inspect for: {image_url}")
log.info("Semaphore acquired, starting skopeo inspect", image_url=image_url)
# Create an asynchronous subprocess
process = await asyncio.create_subprocess_exec(
*command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
stderr=asyncio.subprocess.PIPE,
)
# Wait for the command to complete and capture output
stdout, stderr = await process.communicate()
returncode = process.returncode
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
sed -n '88,108p' scripts/update-commit-latest-env.py
printf '\n--- timeout-related matches ---\n'
rg -n "communicate\(|wait_for\(|timeout-minutes" scripts/update-commit-latest-env.py .github/workflows/update-commit-latest-env.yaml

Repository: opendatahub-io/notebooks

Length of output: 1039


Add a timeout and cleanup around the skopeo subprocess to prevent indefinite hangs.

The call to await process.communicate() (line 101) has no timeout. If the skopeo inspect command stalls on a registry or network failure, that semaphore slot is burned permanently, wedging the entire workflow.

Fix
         async with semaphore:
             log.info("Semaphore acquired, starting skopeo inspect", image_url=image_url)
-            process = await asyncio.create_subprocess_exec(
+            process = await asyncio.create_subprocess_exec(
                 *command,
                 stdout=asyncio.subprocess.PIPE,
                 stderr=asyncio.subprocess.PIPE,
             )
-            # Wait for the command to complete and capture output
-            stdout, stderr = await process.communicate()
+            try:
+                stdout, stderr = await asyncio.wait_for(process.communicate(), timeout=120)
+            except asyncio.TimeoutError:
+                process.kill()
+                stdout, stderr = await process.communicate()
+                log.error("Skopeo inspect timed out", image_url=image_url, timeout_seconds=120)
+                return image_url, None
             returncode = process.returncode
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
async with semaphore:
log.info(f"Semaphore acquired, starting skopeo inspect for: {image_url}")
log.info("Semaphore acquired, starting skopeo inspect", image_url=image_url)
# Create an asynchronous subprocess
process = await asyncio.create_subprocess_exec(
*command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
stderr=asyncio.subprocess.PIPE,
)
# Wait for the command to complete and capture output
stdout, stderr = await process.communicate()
returncode = process.returncode
try:
async with semaphore:
log.info("Semaphore acquired, starting skopeo inspect", image_url=image_url)
# Create an asynchronous subprocess
process = await asyncio.create_subprocess_exec(
*command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
try:
stdout, stderr = await asyncio.wait_for(process.communicate(), timeout=120)
except asyncio.TimeoutError:
process.kill()
stdout, stderr = await process.communicate()
log.error("Skopeo inspect timed out", image_url=image_url, timeout_seconds=120)
return image_url, None
returncode = process.returncode
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 91 - 102, The subprocess
call created in the block using semaphore and process = await
asyncio.create_subprocess_exec (...) must be wrapped with a timeout and proper
cleanup: call asyncio.wait_for on process.communicate() with a sensible timeout,
catch asyncio.TimeoutError, log the timeout (include image_url), then kill the
process (process.kill() or process.terminate()), await process.wait() to reap
it, and ensure you still capture/handle stdout/stderr or set returncode
appropriately before returning; keep the semaphore context (async with
semaphore) intact and add logging around the timeout and cleanup to avoid
permanently burning a semaphore slot.

Comment on lines 124 to 129
if vcs_ref:
log.info(f"Successfully found 'vcs-ref' for {image_url}: {vcs_ref}")
log.info("Successfully found vcs-ref", image_url=image_url, vcs_ref=vcs_ref)
else:
log.warning(f"'vcs-ref' label not found for {image_url}.")
log.warning("vcs-ref label not found", image_url=image_url)

return image_url, vcs_ref
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reject malformed vcs-ref labels before generating commit files.

Any non-empty label is accepted here. A value like main, unknown, or a truncated SHA would be silently written into commit*.env, even though downstream replacements treat these as commit hashes.

Proposed fix
-        if vcs_ref:
-            log.info("Successfully found vcs-ref", image_url=image_url, vcs_ref=vcs_ref)
-        else:
-            log.warning("vcs-ref label not found", image_url=image_url)
+        if vcs_ref and re.fullmatch(r"[0-9a-fA-F]{7,40}", vcs_ref):
+            log.info("Successfully found vcs-ref", image_url=image_url, vcs_ref=vcs_ref)
+        else:
+            log.warning("Invalid or missing vcs-ref label", image_url=image_url, vcs_ref=vcs_ref)
+            return image_url, None

As per coding guidelines, "REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if vcs_ref:
log.info(f"Successfully found 'vcs-ref' for {image_url}: {vcs_ref}")
log.info("Successfully found vcs-ref", image_url=image_url, vcs_ref=vcs_ref)
else:
log.warning(f"'vcs-ref' label not found for {image_url}.")
log.warning("vcs-ref label not found", image_url=image_url)
return image_url, vcs_ref
if vcs_ref and re.fullmatch(r"[0-9a-fA-F]{7,40}", vcs_ref):
log.info("Successfully found vcs-ref", image_url=image_url, vcs_ref=vcs_ref)
else:
log.warning("Invalid or missing vcs-ref label", image_url=image_url, vcs_ref=vcs_ref)
return image_url, None
return image_url, vcs_ref
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 124 - 129, Reject non-SHA
vcs-ref values by validating the vcs_ref variable before accepting it: ensure
vcs_ref matches the expected commit-hash pattern (e.g. a 40-character hex SHA)
and if it doesn't, log a warning/error and treat it as missing (set vcs_ref to
None or raise) so malformed values like "main" or truncated SHAs are not written
to commit*.env; update the branch around the existing vcs_ref check (the block
that logs "Successfully found vcs-ref" / "vcs-ref label not found" and returns
image_url, vcs_ref) to perform this validation and only log success/return a
vcs_ref when it passes the SHA pattern.

Comment on lines +208 to +219
pairs: list[tuple[str, str]] = []
for key, url in _parse_env_lines(params_latest):
m = _PARAM_KEY_RE.match(key)
if not m:
log.warning("Skipping param key (unrecognized pattern)", key=key)
continue
if m.group(2) != "-n":
log.warning("Expected only -n keys in params-latest.env", key=key)
continue
pairs.append((key, url))

results = await _inspect_pairs([(k, u) for k, u in pairs])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail instead of overwriting commit-latest.env with zero rows.

If every row in params-latest.env is skipped by the regex/suffix filters, Line 219 still proceeds and Lines 236-237 replace the existing file with an empty one. sync_commit_released() already preserves the previous output on empty input; this path should fail closed too.

Proposed fix
     for key, url in _parse_env_lines(params_latest):
         m = _PARAM_KEY_RE.match(key)
         if not m:
             log.warning("Skipping param key (unrecognized pattern)", key=key)
             continue
         if m.group(2) != "-n":
             log.warning("Expected only -n keys in params-latest.env", key=key)
             continue
         pairs.append((key, url))
 
-    results = await _inspect_pairs([(k, u) for k, u in pairs])
+    if not pairs:
+        log.error("No valid -n entries found in params-latest.env", path=str(params_latest))
+        return False
+
+    results = await _inspect_pairs(pairs)

As per coding guidelines, "REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pairs: list[tuple[str, str]] = []
for key, url in _parse_env_lines(params_latest):
m = _PARAM_KEY_RE.match(key)
if not m:
log.warning("Skipping param key (unrecognized pattern)", key=key)
continue
if m.group(2) != "-n":
log.warning("Expected only -n keys in params-latest.env", key=key)
continue
pairs.append((key, url))
results = await _inspect_pairs([(k, u) for k, u in pairs])
pairs: list[tuple[str, str]] = []
for key, url in _parse_env_lines(params_latest):
m = _PARAM_KEY_RE.match(key)
if not m:
log.warning("Skipping param key (unrecognized pattern)", key=key)
continue
if m.group(2) != "-n":
log.warning("Expected only -n keys in params-latest.env", key=key)
continue
pairs.append((key, url))
if not pairs:
log.error("No valid -n entries found in params-latest.env", path=str(params_latest))
return False
results = await _inspect_pairs(pairs)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/update-commit-latest-env.py` around lines 208 - 219, The loop
collecting pairs from _parse_env_lines is allowed to produce an empty pairs list
which then leads to calling _inspect_pairs and ultimately overwriting
commit-latest.env with zero rows; change the flow to treat empty pairs as a hard
failure: after the loop that fills pairs (using _PARAM_KEY_RE and suffix "-n"),
check if pairs is empty and raise an exception or exit with a nonzero status
(e.g., raise RuntimeError or call sys.exit(1)) with a clear message indicating
no valid -n keys found, and do not call _inspect_pairs or proceed to write
output; this preserves the existing file on bad input and mirrors
sync_commit_released()'s conservative behavior.

@opendatahub-io opendatahub-io deleted a comment from coderabbitai bot Apr 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review-requested GitHub Bot creates notification on #pr-review-ai-ide-team slack channel size/l

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants