
chore(release): cut 5.1.1 — bug-fix release from AI code review#31

Merged
m4r1k merged 12 commits into main from chore/release-5.1.1
Apr 22, 2026

Conversation

@m4r1k
Owner

@m4r1k m4r1k commented Apr 22, 2026

Summary

Bug-fix release with one small TUI improvement. Bundles the actionable findings from a third-party AI code review (CodeRabbit Pro + Cubic.Dev) of the v5.1.0 codebase. Drop-in upgrade.

Each commit is one themed group; see the commit body for the per-fix bullets with reviewer attribution.

Commit groups

  • fix(security) — XCP-ng vm-shutdown UUID binding, shell-quoted compose path, pypi workflow input validation + build/publish split for OIDC scope.
  • fix(operational) — silent config drops (notifications.suppress + multi-UPS statistics), missing local poweroff in is_local redundancy groups, multi-UPS state-file write race, TUI live-blending key mismatch, unchecked virsh-list exit in the wait loop, multi-flag umount options, voltage severity escalation (F1), shutdown/remote.py robustness, graph.py correctness, battery.py / logger.py / stats.py / utils.py / cli.py fixes, bash completion quoting.
  • fix(packaging) — pyproject version target, DEB prerm/postrm lifecycle handling, daemon-reload chroot guard, postinstall fresh-vs-upgrade disambiguation.
  • fix(ci) — gh-pages force-push removal, nFPM pinning + checksum, integration || true removal, e2e readiness fail-fast, codeql contents:read, every action SHA-pinned across all workflows, ssh-target /var/run/sshd creation, reference-config corrections (SSH host key checking, Apprise URL formats, compose_files: [], F2 homelab parallel: false comment).
  • test — autouse StatsConfig/StatsStore isolation fixture (production /var/lib/eneru no longer leaks from tests), multi-namespace run_command patching helper, failsafe tests routed through real _main_loop, F3 thread-survival assertion in redundancy evaluator, e2e exit-code capture via PIPESTATUS, low-battery scenario runtime above critical threshold, virsh happy-path command sequence assertions.
  • chore(cleanup) — version docstring refresh, format_seconds negative clamp, is_numeric rejects bool/NaN/Inf, hardened logger handler cleanup, legacy ups-monitor syslog tag renamed to eneru (power events were going to journalctl under the pre-v5.0 identifier).
  • feat(tui) — events panel drops the 24h truncation; ↑/↓ / PgUp/PgDn / Home/End scroll through full history; <M> toggles 8↔500 visible rows.
  • chore(release) — version bump to 5.1.1 + changelog entry (subsequently trimmed to bug-fix-release scale).

Test plan

  • All 10 required CI checks pass (validate × 6 Python versions + 4 E2E matrix jobs)
  • Manual smoke: sudo python3 /opt/ups-monitor/eneru.py --validate-config <example> after building the deb
  • Verify TUI events panel scrolling against a UPS with multi-day history (<↑>/<↓>, <Home>/<End>)
  • Verify journalctl now shows power events under -t eneru (was -t ups-monitor)
  • Confirm no test run leaks new files into /var/lib/eneru/

Notes

  • AI re-review (@coderabbitai full review + @cubic-dev-ai review this pull request) is gated on operator approval per the project's quota-conservation policy in CLAUDE.md.

🤖 Generated with Claude Code


Summary by cubic

Bug-fix release with one TUI improvement: the events panel now scrolls through full history. Includes security hardening, CI/packaging fixes, and multiple correctness fixes. Drop‑in upgrade.

  • Bug Fixes

    • TUI: events panel shows full history with arrow/page/home scrolling; keypad enabled so keys work reliably; first scroll auto-expands; toggles 8↔500 rows.
    • Filesystems: multi‑flag umount options are split correctly; malformed options are logged and skipped (also in dry‑run), and dry‑run logs show the exact argv.
    • XCP‑ng/XenServer shutdown binds UUIDs correctly and uses force=true.
    • Live graph blending uses uppercase state‑file keys; no right‑edge lag.
    • Multi‑UPS: per‑UPS temp files prevent state/history write races.
    • Redundancy is_local: quorum loss now powers off the host.
    • VM shutdown: virsh list failures no longer yield false “all stopped”; force‑destroy still runs.
    • Voltage alerts escalate to severe while pending when thresholds are crossed.
    • Config parsing restores notifications.suppress, voltage_hysteresis_seconds, and multi‑UPS statistics.
    • Power events syslog tag renamed to eneru (was ups-monitor).
  • Security, CI, and Packaging

    • Remote stop_compose path is shlex.quoted; template no longer double‑quotes.
    • PyPI workflow split: build and publish separated; version input validated; OIDC id-token: write limited to publish.
    • All GitHub Actions SHA‑pinned; refresh instructions added to each workflow; maintenance guide added to CLAUDE.md.
    • Release workflow serializes gh-pages deploys to avoid concurrent races; no git push -f to gh-pages.
    • CI installs fail fast (dropped || true masks); readiness checks error on timeout; added CodeQL contents: read; E2E low‑battery scenario tuned to fire against all thresholds.
    • DEB lifecycle scripts handle all states; guarded systemctl daemon-reload when systemd isn’t present.
    • pyproject.toml now reads version from eneru.version.__version__; examples use safer SSH defaults, correct Slack Apprise URL, compose_files: [], and fixed NAS shutdown ordering.

Written for commit e9ae4fc. Summary will update on new commits.

Summary by CodeRabbit

Release Notes - Version 5.1.1

  • New Features

    • TUI now displays full event history with arrow-key, page, and home/end scrolling instead of fixed 24-hour window.
    • Added toggle to adjust maximum visible event rows.
  • Bug Fixes

    • Fixed VM shutdown parameter binding for XCP-ng/XenServer.
    • Corrected voltage severity re-evaluation logic.
    • Resolved battery anomaly false-alert suppression.
    • Fixed multi-UPS temp file race conditions.
    • Improved VM shutdown loop stability.
    • Fixed shell command timeout handling.
  • Security

    • Removed insecure SSH defaults; now enforces known_hosts verification.
    • Added shell injection mitigation in remote commands.
    • Enabled SHA-pinning in CI/build workflows.
  • Documentation

    • Updated configuration examples for security best practices.

m4r1k and others added 9 commits April 22, 2026 20:57
Security and supply-chain hardening from a comprehensive third-party
AI code review (CodeRabbit Pro + Cubic.Dev). All four fixes address
exploitable injection vectors or unsafe credential scoping in the
build / release path. No runtime behavior change for users; the
XCP-ng UUID fix repairs a silent no-op on XenServer hosts.

Bundled fixes:

- src/eneru/actions.py:50,56 — XCP-ng `xargs ... xe vm-shutdown uuid=`
  left uuid= empty; UUIDs were passed positionally and `xe` (which
  parses arguments as key=value pairs) silently no-op'd on XenServer
  / XCP-ng hosts. Switch both the graceful and force passes to
  `xargs -r -I {} xe vm-shutdown uuid={}` (and `uuid={} force=true`,
  matching xe's keyword form rather than the invalid `--force`).
  Reported by @coderabbitai and @cubic-dev-ai.
- src/eneru/actions.py:78,80 (via shutdown/remote.py) — stop_compose
  template wrapped {path} in double quotes; double quotes don't block
  $(), backticks, or ${...} in the remote shell. Drop the surrounding
  double quotes from the template and apply shlex.quote(path) at the
  .format() call site in RemoteShutdownMixin so the path is the only
  quoting boundary. Reported by @coderabbitai.
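The quoting boundary can be sketched in a few lines. This is illustrative, not the project's exact template: double quotes leave `$()` live on the remote shell, while `shlex.quote()` single-quotes the path so substitution is inert.

```python
import shlex

# Hypothetical malicious path supplied via config; inside double quotes the
# remote shell would still expand the $(...) command substitution.
path = '/srv/$(touch /tmp/pwned)/compose.yml'

double_quoted = 'docker compose -f "{}" stop'.format(path)   # $() still live
safe = 'docker compose -f {} stop'.format(shlex.quote(path)) # single-quoted

print(safe)  # docker compose -f '/srv/$(touch /tmp/pwned)/compose.yml' stop
```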
- .github/workflows/pypi.yml:45,57 — workflow_dispatch input was
  interpolated directly into `run:` shell with PYPI_API_TOKEN in scope
  via id-token: write. Pass via env: block, validate against a strict
  PEP-440 allow-list before use. Same pattern in release.yml:52
  hardened the same way. Reported by @cubic-dev-ai.
- .github/workflows/pypi.yml — single job granted id-token: write to
  every step including `pip install build twine`; a compromised
  third-party dependency could exfiltrate the OIDC token. Split into
  a `build` job (contents: read only, builds the wheel and uploads as
  artifact) and a `publish` job (needs: build, id-token: write only,
  downloads the artifact and runs twine upload). Reported by
  @cubic-dev-ai.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Production-code correctness fixes from the AI code review. These
include silent config-drop bugs (notifications.suppress and
multi-UPS statistics), the missing local-host poweroff in is_local
redundancy groups, the multi-UPS state-file write race, the TUI
live-blending key mismatch, and a long tail of robustness fixes
across shutdown, stats, logging, and utility modules. Several
findings were cross-confirmed by both reviewers.

Bundled fixes:

- src/eneru/config.py — _parse_notifications never read `suppress` or
  `voltage_hysteresis_seconds` from YAML; defaults were always used,
  validator passed against defaults, user YAML was silently ignored.
  Read both keys from notif_data and pass into NotificationsConfig.
  Reported by @cubic-dev-ai.
- src/eneru/multi_ups.py:104-110 — per-group monitor-config build
  dropped `statistics`, so custom stats DB path / retention settings
  were silently ignored in multi-UPS mode. Pass it through. Reported
  by @cubic-dev-ai.
- src/eneru/redundancy.py + multi_ups.py — for is_local: true
  redundancy groups, the executor stopped local services + remote
  servers but never invoked local_shutdown.command. Quorum-loss
  completed with the local host still running. Add a
  local_shutdown_callback parameter delegating to the coordinator's
  defense-in-depth _handle_local_shutdown so the in-memory lock and
  global flag prevent double-fire if the per-UPS path also requests
  a local shutdown. Reported by @cubic-dev-ai.
- src/eneru/monitor.py:595 + health/battery.py:36 —
  `with_suffix('.tmp')` replaced the per-UPS suffix (`.ups1`, `.ups2`)
  with `.tmp`, collapsing every monitor's temp file onto a shared
  one and racing on the atomic rename. Use `with_name(name + '.tmp')`
  in both call sites. Reported by @cubic-dev-ai.
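The pathlib behavior behind this race is easy to reproduce (paths here are illustrative, not eneru's actual state-file names):

```python
from pathlib import Path

state1 = Path('/var/lib/eneru/state.ups1')
state2 = Path('/var/lib/eneru/state.ups2')

# with_suffix() swaps out the per-UPS suffix, collapsing both monitors onto
# one shared temp file and racing on the atomic rename:
assert state1.with_suffix('.tmp') == state2.with_suffix('.tmp')

# with_name() appends instead, keeping one temp file per monitor:
tmp1 = state1.with_name(state1.name + '.tmp')  # .../state.ups1.tmp
tmp2 = state2.with_name(state2.name + '.tmp')  # .../state.ups2.tmp
assert tmp1 != tmp2
```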
- src/eneru/shutdown/vms.py:53 — wait-loop `virsh list` ignored
  exit_code; if libvirtd died mid-shutdown the loop computed
  remaining_vms=[] from empty stdout and reported "All VMs stopped
  gracefully", skipping force-destroy. Treat non-zero exit
  conservatively: keep the prior remaining_vms and re-poll on the
  next interval. Reported by @cubic-dev-ai.
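A minimal sketch of the conservative polling step; the function name and the `(exit_code, stdout, stderr)` tuple shape are assumptions for illustration, not eneru's actual API:

```python
def poll_remaining_vms(run_command, previous_remaining):
    # run_command is assumed to return (exit_code, stdout, stderr).
    exit_code, stdout, _ = run_command(['virsh', 'list', '--name'])
    if exit_code != 0:
        # libvirtd may have died mid-shutdown: keep the prior view and
        # re-poll next interval instead of reporting "all VMs stopped".
        return previous_remaining
    return [name for name in stdout.splitlines() if name.strip()]

# libvirtd dead: empty stdout must NOT be read as "all stopped".
assert poll_remaining_vms(lambda c: (1, '', 'err'), ['vm1']) == ['vm1']
# healthy poll: parse the listing as before.
assert poll_remaining_vms(lambda c: (0, 'vm1\n\n', ''), ['vm1', 'vm2']) == ['vm1']
```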
- src/eneru/shutdown/filesystems.py:58-61 — `cmd.append(options)`
  bundled multi-flag option strings like "-l -f" into a single
  subprocess argument; umount rejected the literal as an unknown
  option. Use `cmd.extend(shlex.split(options))`. Reported by
  @cubic-dev-ai.
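The argv difference is visible directly (the "-l -f" string is an example option value):

```python
import shlex

options = "-l -f"  # lazy + force, as one configured string

# append() hands umount a single bogus argv entry '-l -f':
bad = ['umount'] + [options]
# extend(shlex.split(...)) yields separate flags:
good = ['umount'] + shlex.split(options)

assert bad == ['umount', '-l -f']
assert good == ['umount', '-l', '-f']
```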
- src/eneru/tui.py:114 — `_STATE_FILE_TO_COLUMN` was keyed on NUT-
  style dotted names (`battery.charge`, `ups.load`) but the state
  file uses uppercase (`BATTERY=`, `LOAD=`). Live blending received
  zero samples; TUI graph rightmost edge always lagged by ≤10s.
  Realign the dict to actual state-file keys; drop entries that have
  no state-file source (battery voltage, temperature, frequencies).
  Reported by @cubic-dev-ai.
- src/eneru/shutdown/remote.py:91 — `cmd.timeout or
  server.command_timeout` promoted 0 to the default timeout via
  Python truthiness. Use explicit `is not None` check. Reported by
  @coderabbitai.
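The truthiness trap generalizes; a sketch with an assumed default (30 is illustrative, not the project's actual value):

```python
DEFAULT_TIMEOUT = 30  # stand-in for server.command_timeout

def timeout_buggy(cmd_timeout):
    return cmd_timeout or DEFAULT_TIMEOUT  # 0 is falsy -> silently promoted

def timeout_fixed(cmd_timeout):
    return cmd_timeout if cmd_timeout is not None else DEFAULT_TIMEOUT

assert timeout_buggy(0) == 30   # an explicit 0 is lost
assert timeout_fixed(0) == 0    # explicit 0 preserved
assert timeout_fixed(None) == 30
```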
- src/eneru/shutdown/remote.py:108 — thread worker swallowed
  unexpected exceptions silently. Log them with the server display
  name so silent failures become visible. Reported by @cubic-dev-ai.
- src/eneru/shutdown/remote.py:152 — ssh_options normalization
  silently broke non-`-o`-style flags (e.g., `-i keyfile`). Pass
  through any flag starting with `-`; only normalize bare key=value
  pairs to `-o key=value`; split single "-o foo=bar" strings into
  two argv entries. Reported by @coderabbitai.
- src/eneru/shutdown/remote.py:179 — docstring claimed the method
  returns False on SSH connection failure but it always returns
  True. Update the docstring to match the actual best-effort
  behavior. Reported by @coderabbitai.
- src/eneru/multi_ups.py:_drain_all_groups — invoked shutdown on
  peer monitors while their main loops were still running.
  Restructure: signal stop_event first, give peer loops a brief
  join window, then run per-monitor shutdown sequences sequentially
  before the final join. Reported by @coderabbitai.
- src/eneru/graph.py:126 — min/max over a list that may contain
  None / non-numeric placeholders raised TypeError. Filter to
  numeric values before bounds calculation; render an empty grid
  when nothing numeric remains. Reported by @cubic-dev-ai.
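The filtering step can be sketched as (sample values are illustrative):

```python
import math

samples = [12.0, None, 'n/a', 13.5, 11.8]  # one graph column with placeholders

numeric = [v for v in samples
           if isinstance(v, (int, float))
           and not isinstance(v, bool)
           and math.isfinite(v)]
# None -> caller renders an empty grid instead of raising TypeError in min/max
bounds = (min(numeric), max(numeric)) if numeric else None
assert bounds == (11.8, 13.5)
```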
- src/eneru/graph.py:212 — pixel-row aggregator used the wrong
  direction; the variable named `highest_on` actually tracked the
  bottommost set pixel, so any "on" cell rendered as the tiniest
  block. Use min(r) so the topmost set pixel drives the block size.
  Reported by @cubic-dev-ai.
- src/eneru/health/battery.py:35 — `except Exception: pass` in the
  persistence helper hid all I/O failures. Log at WARNING level and
  keep the path non-fatal. Reported by @coderabbitai.
- src/eneru/health/battery.py:116 — pending anomaly confirmation
  could fire after the drop was no longer >20%, producing a false
  alarm. Re-validate the current drop magnitude before firing.
  Reported by @cubic-dev-ai.
- src/eneru/logger.py:29-30 — handler cleanup directly cleared
  `logger.handlers` and didn't close existing handlers; FileHandler
  fds leaked on re-init. Iterate, removeHandler, close; also set
  propagate=False so root-logger handlers don't duplicate output.
  Reported by @coderabbitai and @cubic-dev-ai.
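A sketch of the described cleanup pattern (the helper name is illustrative):

```python
import logging

def reset_logger(logger: logging.Logger) -> None:
    # Remove AND close each handler so FileHandler fds don't leak on re-init;
    # clearing logger.handlers in place would skip the close().
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()
    # Stop records from also reaching root-logger handlers (duplicate output).
    logger.propagate = False

log = logging.getLogger('demo')
log.addHandler(logging.StreamHandler())
reset_logger(log)
assert log.handlers == [] and log.propagate is False
```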
- src/eneru/stats.py:148 — `isolation_level=None` made
  `with self._conn:` a no-op so each executemany row committed
  individually instead of being batched. Drop the override and
  let sqlite3 use the default deferred mode the with-blocks expect.
  Reported by @coderabbitai.
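The transaction semantics are reproducible with an in-memory database:

```python
import sqlite3

# Default isolation (deferred): `with conn:` wraps the batch in ONE
# transaction and commits once at block exit.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE samples (v REAL)')
with conn:
    conn.executemany('INSERT INTO samples VALUES (?)', [(1.0,), (2.0,), (3.0,)])

count = conn.execute('SELECT COUNT(*) FROM samples').fetchone()[0]
assert count == 3
# With isolation_level=None (autocommit), the with-block's commit/rollback is
# a no-op and every executemany row would have committed individually.
```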
- src/eneru/stats.py:605 — SQLite URI built via raw string
  interpolation broke on file paths containing `?` or `#`. Apply
  urllib.parse.quote on the path before assembling the URI.
  Reported by @cubic-dev-ai.
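The breakage and fix in miniature (the path is a hypothetical awkward example):

```python
from urllib.parse import quote

db_path = '/var/lib/eneru/stats v2 #1.db'  # hypothetical path with '#' and ' '

raw = 'file:{}?mode=ro'.format(db_path)          # '#' starts a URI fragment
safe = 'file:{}?mode=ro'.format(quote(db_path))  # percent-encoded, '/' kept

assert '%23' in safe                  # '#' no longer truncates the path
assert '#' not in safe.split('?')[0]
```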
- src/eneru/utils.py:37 — `run_command` returned stdout/stderr as
  None when capture_output=False; callers that indexed/stripped them
  crashed. Normalize to empty string in the no-capture branch.
  Reported by @cubic-dev-ai.
- src/eneru/cli.py:204 — validate skipped top-level YAML warnings
  for single-UPS configs and silently swallowed parse errors. Re-
  parse for both single-UPS and multi-UPS configs; surface parse
  errors with a non-zero exit. Reported by @coderabbitai.
- src/eneru/completion/eneru.bash:37 — bash completion for --config
  used unquoted `compgen -f` substitution; filenames with spaces
  split into multiple completions. Use mapfile -t. Reported by
  @coderabbitai.
- src/eneru/health/voltage.py:324 (F1) — severity escalation from
  non-severe to severe within the same LOW/HIGH state never
  refreshed `voltage_pending_severe`, so the severe-bypass path in
  _maybe_notify_voltage_pending was never reached and operators
  waited the full hysteresis window for an immediate-class alert.
  Re-validate severity when the state is unchanged and lift the
  pending record's severe flag (and refresh the recorded voltage
  / threshold) so the next _maybe_notify_voltage_pending fires
  immediately. Reported by @cubic-dev-ai.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Packaging-script hardening for chroot/container build environments,
correct Debian lifecycle handling, and a setuptools dynamic-version
robustness fix.

Bundled fixes:

- pyproject.toml — `version = {attr = "eneru.__version__"}` evaluated
  src/eneru/__init__.py at build time; a future top-level import
  there would break wheel builds. Point at `eneru.version.__version__`
  so setuptools resolves directly from version.py. Reported by
  @coderabbitai and @cubic-dev-ai.
- packaging/scripts/preremove.sh + postremove.sh — DEB lifecycle args
  `failed-upgrade`, `deconfigure`, `abort-install`, `abort-upgrade`,
  and `disappear` fell through to the is_removal=true default; the
  service was stopped/disabled during non-removal flows. Replace the
  if/elif cascade with an explicit `case` covering the full Debian
  Policy enumeration; preserve RPM numeric handling (0 = removal,
  >=1 = upgrade) in the fallback. Reported by @coderabbitai and
  @cubic-dev-ai.
- packaging/scripts/postinstall.sh + postremove.sh — `systemctl
  daemon-reload` under `set -e` aborted in chroot/container builds
  without a live systemd, leaving the package half-applied. Guard
  each call with `[ -d /run/systemd/system ]`. Reported by
  @coderabbitai and @cubic-dev-ai.
- packaging/scripts/postinstall.sh — DEB upgrade detection used
  `[ "$1" = "configure" ]` plus `systemctl list-unit-files`, but
  `configure` is the arg for BOTH fresh installs AND upgrades, and
  the unit file is part of the new package so list-unit-files always
  returns true. Disambiguate via `$2` (previous version, empty on
  fresh install). Drop the unused `was_enabled` variable shellcheck
  flagged. Reported by @coderabbitai and @cubic-dev-ai.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

CI / supply-chain hardening (pinning, checksum verification,
permission scoping, fail-fast on readiness timeouts), e2e test
infrastructure robustness, plus reference-config corrections so
copy-pasting an example YAML produces a working configuration.

Bundled fixes:

- .github/workflows/release.yml — `git push -f` to gh-pages risked
  clobbering on transient races between concurrent release jobs.
  Drop --force; let the push fail loudly on non-fast-forward so the
  operator can reconcile manually. Reported by @cubic-dev-ai.
- .github/workflows/integration.yml + release.yml — nFPM was downloaded
  from `releases/latest` with no version pin or integrity check. Pin
  to a specific release via NFPM_VERSION env var and verify the
  archive against the goreleaser-published checksums.txt before
  extracting / installing. Reported by @cubic-dev-ai.
- .github/workflows/integration.yml — DEB and RPM install commands
  trailed `|| true` that silently masked real install failures (kept
  CI green even when the package never installed). Drop the `|| true`
  on dpkg + apt-get install -f; preserve the `rpm -ivh ... || dnf
  install --skip-broken` chain because it's the documented fallback
  for missing optional deps. Reported by @cubic-dev-ai.
- .github/actions/e2e-setup/action.yml — both readiness loops (NUT
  on port 3493, SSH on 2222/2223/2224) fell out of the loop on
  timeout but never `exit 1`; the composite action reported success
  even when services never came up. Track readiness explicitly per
  loop and fail with a stderr message if the deadline elapses.
  Reported by @coderabbitai and @cubic-dev-ai.
- .github/workflows/codeql.yml — only `security-events: write` was
  declared; once any permission is set, the others default to `none`
  and actions/checkout fails on private repos. Add `contents: read`.
  Reported by @cubic-dev-ai.
- .github/workflows/{validate,integration,e2e,codeql,pypi,release}.yml
  + .github/actions/e2e-setup/action.yml — every third-party action
  reference is now SHA-pinned (with the corresponding tag in a
  comment) so a moved tag in the upstream action repo cannot
  silently change what runs in CI. Refresh by re-running
  `gh api repos/<owner>/<repo>/git/refs/tags/<tag>`. Reported by
  @cubic-dev-ai.
- pytest.ini — blanket `filterwarnings = ignore::DeprecationWarning`
  hid every Python deprecation across the entire suite, including
  the daemon's own code. Drop it: the suite is currently warning-
  clean, and a future deprecation should fail loudly so we react.
  Reported by @coderabbitai.
- tests/e2e/ssh-target/Dockerfile + entrypoint.sh — Alpine's
  openssh-server doesn't create /var/run/sshd at install time and
  sshd refuses to start without it; the container exited
  immediately after launch. mkdir -p both at build time and as a
  defensive fallback in the entrypoint. Reported by @cubic-dev-ai.
- examples/config-reference.yaml + config-dual-ups.yaml — SSH
  `StrictHostKeyChecking=no` was the default in the reference
  examples; copy-pasters inherited a MITM-able default. Comment
  the option out, mark it as testing-only, and document that
  production should rely on populated known_hosts. Reported by
  @cubic-dev-ai.
- examples/config-enterprise.yaml:50 — Slack Apprise URL
  `slack://xoxb-token/CHANNEL_ID?footer=no` doesn't match either
  Apprise's webhook or bot-OAuth form. Replace with the documented
  webhook form and link to the Apprise wiki for the bot-OAuth
  alternative. Reported by @cubic-dev-ai.
- examples/config-reference.yaml:190 — `compose_files:` written as
  a YAML key with no value parses as `null`, not an empty list;
  loader/runtime fails on a copy-paste-unmodified reference. Use
  `compose_files: []` (explicit empty list). Reported by
  @coderabbitai.
- examples/config-homelab.yaml:88-95 (F2) — comment claimed
  `parallel: false` runs the NAS LAST; the legacy flag actually
  runs the server in a phase BEFORE the parallel-order=0 batch
  (config.py RemoteServerConfig: "False = run sequentially before
  the parallel batch"). Switch to `shutdown_order: 2` and rewrite
  the comment to document the legacy flag's real semantics.
  Reported by @cubic-dev-ai.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Test-suite quality fixes: SQLite-store leakage to /var/lib/eneru
during pytest runs, run_command patches that missed the shutdown-
mixin namespaces, failsafe tests that inlined production logic
instead of routing through the real loop, swallowed e2e exit codes,
and missing assertions on happy-path tests. None affect runtime
behavior; all reduce false-confidence in the test suite.

Bundled fixes:

- tests/conftest.py — tests instantiating UPSGroupMonitor or
  StatsStore without overriding StatsConfig.db_directory leaked
  SQLite files into the real /var/lib/eneru. Add an autouse
  isolate_stats_db_directory fixture that monkeypatches the
  dataclass default to a per-test tmp_path so no test can hit the
  system path. Reported by user during release execution.
- tests/conftest.py — add a patch_run_command_everywhere fixture
  that patches `run_command` in every shutdown-module namespace.
  `from eneru.utils import run_command` binds the symbol at import
  time, so a test that only patches `eneru.monitor.run_command`
  doesn't intercept calls made by the shutdown mixins (vms,
  containers, filesystems, remote). Reported by @cubic-dev-ai.
- tests/test_monitor_core.py:312 —
  test_connection_lost_while_ob_triggers_shutdown inlined the
  failsafe logic and asserted the inlined simulation; the real
  _main_loop could regress and the test would still pass. Route
  through TestAdvisoryTriggers._run_one_iteration so the production
  code path is exercised. Reported by @coderabbitai.
- tests/test_monitor_core.py:421 —
  test_shutdown_continues_on_step_failure documented a behavior
  gap rather than verifying behavior: it caught the RuntimeError
  in the test body and asserted only that the failing step was
  called. Rename to test_shutdown_aborts_on_step_failure and
  assert the actual current contract (raise propagates, subsequent
  steps are NOT invoked) so the test is honest and a future
  best-effort change must update it deliberately. Reported by
  @coderabbitai and @cubic-dev-ai.
- tests/test_redundancy.py:343 (F3) —
  test_evaluator_swallows_exceptions_and_continues didn't assert
  the thread was alive AFTER the bad monitor had time to raise on
  every tick. A thread killed by the unhandled exception would
  still leave `not ev.is_alive()` True after join (a dead thread
  joins instantly), so the test silently passed even when the
  swallowing contract broke. Add `assert ev.is_alive()` before
  signaling stop. Reported by @cubic-dev-ai.
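The assertion-placement point generalizes to any swallow-and-continue worker; a self-contained sketch (not the project's evaluator):

```python
import threading
import time

stop = threading.Event()

def evaluator():
    while not stop.is_set():
        try:
            raise RuntimeError('bad monitor')  # raises on every tick
        except Exception:
            pass                               # contract under test: survive
        time.sleep(0.01)

t = threading.Thread(target=evaluator, daemon=True)
t.start()
time.sleep(0.1)        # give the bad monitor time to raise on several ticks
assert t.is_alive()    # the assertion that was missing: the thread survived
stop.set()
t.join(timeout=1)
# A dead thread joins instantly, which is why join-then-check proved nothing.
assert not t.is_alive()
```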
- tests/test_cli.py:233 — test_validate_config_nonexistent_file
  asserted only exit 0 + "Configuration is valid"; passed whether
  the typo'd path was used or the default-config fallback fired.
  Now also assert the fallback warning containing the typo'd path
  in stdout so a regression that swallowed the warning would fail
  loudly. Reported by @coderabbitai.
- tests/test_shutdown_vms.py:101 —
  test_shutdown_vms_graceful_shutdown_completes_first_poll set up
  the run_command sequence but never asserted what was actually
  issued; it would pass even if the wrong virsh command (or none)
  ran. Assert the virsh list / shutdown sequence and the absence
  of `virsh destroy` on the happy path. Reported by @cubic-dev-ai.
- tests/e2e/groups/single-ups.sh:35,63 — `eneru run ... | tee ... ||
  true` masked any non-zero exit (including a real crash) so the
  test passed even when eneru never produced log output. Capture
  the exit code via PIPESTATUS and fail loudly on anything other
  than 0 (or 124 from the timeout-bounded test). Reported by
  @cubic-dev-ai.
- tests/e2e/groups/multi-ups.sh:391 — parse_epoch silently produced
  a "now" epoch when the shutdown.log file was empty, corrupting
  the phase-ordering comparison further down. Fail loudly on empty
  input. Reported by @cubic-dev-ai.
- tests/e2e/scenarios/low-battery.dev — runtime was 180 s, below
  the 600 s critical-runtime threshold; the scenario actually
  exercised the critical-runtime trigger, not the low-battery
  trigger, making test failures hard to localise. Bump runtime to
  900 s so the test unambiguously exercises the low-battery path.
  Reported by @cubic-dev-ai.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

Aggregate of cosmetic / lint / robustness fixes from the AI code
review, plus the legacy `ups-monitor` syslog identifier rebrand and
a hardened SQLite-isolation fixture for the test suite. The emoji-
in-logs convention (per root CLAUDE.md "Code Style") is intentional
and remains unchanged despite Ruff lint comments to the contrary.

Bundled fixes:

- src/eneru/monitor.py:391 — power events shelled out to `logger -t
  ups-monitor`, the legacy package identifier from before the v5.0
  rebrand. Every other journal line emitted by the daemon already
  uses the `eneru` identifier, so journalctl filtering on `-t eneru`
  silently missed the power-event side channel. Rename to `-t eneru`.
  Reported by user (release execution).
- src/eneru/version.py:3-5 — stale `4.3.0` example in the docstring;
  refresh to `5.1.0` so the example tracks the current series.
- src/eneru/utils.py — `is_numeric` accepted booleans (subtype of
  int) and NaN/±Inf, both of which the UPS-metric callers
  (voltage/runtime/load math) treat as garbage. Reject bools and
  use math.isfinite to fence off the float edge cases.
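A minimal sketch of the described check; the real eneru helper may differ in detail:

```python
import math

def is_numeric(value) -> bool:
    # bool is a subtype of int, so screen it out first; isfinite fences
    # off NaN and +/-Inf, which the metric math treats as garbage.
    if isinstance(value, bool):
        return False
    return isinstance(value, (int, float)) and math.isfinite(value)

assert is_numeric(230) and is_numeric(49.9)
assert not is_numeric(True)           # bool sneaks through isinstance(int)
assert not is_numeric(float('nan'))
assert not is_numeric(float('inf'))
```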
- src/eneru/utils.py — `format_seconds` produced strings like
  "-1m 30s" when a misbehaving driver briefly returned a negative
  runtime. Clamp to max(0, ...) before bucket math.
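A simplified version of the clamp (bucket logic reduced to minutes/seconds for illustration):

```python
def format_seconds(seconds: float) -> str:
    seconds = max(0, int(seconds))  # clamp a driver's transient negative value
    minutes, secs = divmod(seconds, 60)
    return f'{minutes}m {secs}s'

assert format_seconds(-90) == '0m 0s'  # no more negative-runtime strings
assert format_seconds(95) == '1m 35s'
```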
- tests/conftest.py — the autouse isolate_stats_db_directory
  fixture introduced earlier in this release patched the dataclass
  CLASS attribute, which is a no-op for new instances because the
  dataclass-generated __init__ captures defaults at decoration
  time. Replace with a two-layer wrapper that patches both
  StatsConfig.__init__ (for fresh defaults) and StatsStore.__init__
  (for direct paths under /var/lib/eneru). Add a
  no_stats_isolation marker so tests that specifically need to
  verify the production default can opt out. Reported by user
  (release execution): leaked TestUPS@localhost DAEMON_START
  events showing up in the production /var/lib/eneru/default.db.
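The no-op is a general dataclass property, reproducible with a simplified stand-in (this is not the real StatsConfig):

```python
from dataclasses import dataclass

@dataclass
class StatsConfig:                        # simplified stand-in
    db_directory: str = '/var/lib/eneru'

# Patching the CLASS attribute does nothing for new instances: the generated
# __init__ captured the default at decoration time.
StatsConfig.db_directory = '/tmp/test'
assert StatsConfig().db_directory == '/var/lib/eneru'

# Wrapping __init__ (the fixture's approach, sketched) actually takes effect:
_orig_init = StatsConfig.__init__
def _patched_init(self, db_directory='/tmp/test'):
    _orig_init(self, db_directory=db_directory)
StatsConfig.__init__ = _patched_init
assert StatsConfig().db_directory == '/tmp/test'
```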
- pytest.ini — register the new `no_stats_isolation` marker so
  --strict-markers passes.
- tests/test_stats.py — mark test_defaults +
  test_default_yaml_omitted_section with @pytest.mark.no_stats_isolation
  so they continue to verify the literal "/var/lib/eneru" default.

Project-policy items deliberately NOT changed:
- Emoji in log messages (per root CLAUDE.md "Code Style — Emojis
  in logs/notifications") are intentional semantic markers, not
  decoration. Suppress the Ruff lint rule rather than the emojis.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Drop the 24h fixed window on the events panel and let the operator
scroll through every event in the SQLite store with the arrow keys.
The events table is one row per power event (not per poll), so even
years of history stay tiny -- truncating it by age was hiding the
operator's own outage history without any storage benefit.

Changes:

- src/eneru/tui.py: query_events_for_display now defaults to no time
  bound (time_range_seconds=None pulls everything; explicit window
  still supported by callers that want one). EVENTS_TIME_WINDOW
  removed; EVENTS_MAX_ROWS_NORMAL=8 / EVENTS_MAX_ROWS_MORE=500
  introduced for the <M>-toggled cap.
- src/eneru/tui.py: render_logs_panel now takes a scroll_offset so
  the visible slice anchors against the operator's scroll position
  instead of always pinning to the most recent rows.
- src/eneru/tui.py: arrow-key scrolling in the main loop --
  Up/Down moves one row, PgUp/PgDn moves ten, Home jumps to the
  oldest event, End returns to the most recent. <R> resets the
  offset. The render path clamps the offset to the current list
  size so a refresh that shrinks the list doesn't strand the user
  on an empty view.
- src/eneru/tui.py: bottom hint row gains <↑↓> Scroll.
- tests/test_tui.py: TestEventsTimescaleDecoupled rewritten to
  assert the new contract (no positional time-range argument
  passed to query_events_for_display) instead of the old
  EVENTS_TIME_WINDOW constant.
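The offset clamping described above amounts to bounding the scroll position by the current list size; a sketch, with names that are not the actual tui.py symbols:

```python
def clamp_scroll_offset(offset: int, total_rows: int, visible_rows: int) -> int:
    # Offset 0 = pinned to the most recent rows; larger = deeper into history.
    max_offset = max(0, total_rows - visible_rows)
    return max(0, min(offset, max_offset))

# A refresh that shrinks the list pulls the operator back into range
# instead of stranding them on an empty view:
assert clamp_scroll_offset(400, 50, 8) == 42
assert clamp_scroll_offset(5, 3, 8) == 0   # fewer rows than the viewport
assert clamp_scroll_offset(-1, 50, 8) == 0
```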

Reported by user during release execution: a clean install of
v5.1.0 would only show the most recent ~1 day of events even with
multi-year retention configured, so historical outages dropped off
the operator's TUI as soon as the 24h window expired.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Bug-fix release with one small TUI improvement (events panel pulls
all events with arrow-key scrolling). Bundles fixes from a
third-party AI code review (CodeRabbit Pro + Cubic.Dev) of the
v5.1.0 codebase.

See docs/changelog.md for the grouped summary; `git log v5.1.0..v5.1.1`
for per-commit detail with reviewer attribution on every fix.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
The first cut clocked in at ~900 words, which is the v5.1.0
density — too much for a bug-fix-only release. Compress per-fix
prose into one-line headlines, collapse the "long tail" enumeration
into a single bullet, drop redundant attribution preamble, and
point readers at `git log v5.1.0..v5.1.1` for detail. Final entry
is ~530 words.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
@coderabbitai

coderabbitai Bot commented Apr 22, 2026

📝 Walkthrough

Walkthrough

This PR bumps version to 5.1.1 and includes CI hardening via SHA-pinned GitHub Actions, fixes to remote shutdown command UUID binding for XCP-ng and Docker/Podman, TUI event scrolling enhancements replacing the fixed 24-hour window, battery/voltage anomaly detection improvements, packaging script robustness for systemd detection, and configuration example updates removing insecure SSH defaults.

Changes

- CI/GitHub Actions Hardening (.github/workflows/codeql.yml, .github/workflows/e2e.yml, .github/workflows/integration.yml, .github/workflows/pypi.yml, .github/workflows/release.yml, .github/workflows/validate.yml): all GitHub Action references (actions/checkout, setup-python, codecov-action, etc.) pinned to specific commit SHAs instead of floating version tags; nFPM installation now uses a pinned version with checksum verification; job-level permissions and OIDC token scoping refined for security.
- GitHub Actions Composite Action (.github/actions/e2e-setup/action.yml): pins actions/setup-python to a commit SHA; strengthens NUT/SSH readiness-waiting logic with explicit boolean flags and exit 1 failure handling.
- Remote Shutdown Commands (src/eneru/actions.py, src/eneru/shutdown/remote.py, src/eneru/shutdown/vms.py): XCP-ng/XenServer graceful and forced shutdown now properly bind UUIDs via xargs -I {} uuid={} (fixing an empty UUID parameter); Docker/Podman compose stop removes template quotes to allow proper shell argument parsing; shlex.quote applied to path placeholders; SSH option parsing enhanced; virsh list polling now preserves previous state on errors and re-evaluates anomalies.
- TUI Enhancements (src/eneru/tui.py): events query no longer bound by a 24-hour time window; supports full history with a configurable max-events cap; adds scroll_offset state for arrow/page/home/end scrolling; state-file key mapping updated to match the daemon's uppercase format (STATUS=, BATTERY=, etc.).
- Health & Battery Monitoring (src/eneru/health/battery.py, src/eneru/health/voltage.py): battery history temp file uses per-UPS filenames (prevents a multi-UPS write race); anomaly detection re-checks drop magnitude after 3-poll confirmation; voltage state transitions now upgrade pending severity when subsequent polls become severe within the same state window.
- Configuration & Packaging (src/eneru/config.py, packaging/scripts/postinstall.sh, packaging/scripts/postremove.sh, packaging/scripts/preremove.sh, pyproject.toml): config parser accepts suppress and voltage_hysteresis_seconds notification fields; postinstall/postremove/preremove scripts use conditional systemctl daemon-reload (checks /run/systemd/system); DEB upgrade detection via the $2 parameter; version resolution redirects to src/eneru/version.py.
- Configuration Examples (examples/config-dual-ups.yaml, examples/config-enterprise.yaml, examples/config-homelab.yaml, examples/config-reference.yaml): SSH host key checking guidance changed from an active StrictHostKeyChecking=no to comments directing use of known_hosts (the insecure option remains only in testing comments); Slack notification URL format updated to Apprise-compatible syntax; NAS shutdown reordered via shutdown_order: 2 instead of the legacy parallel: false.
- Core Monitoring & Coordination (src/eneru/monitor.py, src/eneru/multi_ups.py, src/eneru/redundancy.py): syslog identifier changed from ups-monitor to eneru; multi-UPS monitor startup ensures per-group config includes statistics; redundancy executor now accepts an optional local_shutdown_callback and delegates local poweroff to the coordinator; drain sequence signals the stop event and waits for peer poll loops before invoking shutdown.
- Utilities & Diagnostics (src/eneru/utils.py, src/eneru/cli.py, src/eneru/completion/eneru.bash, src/eneru/logger.py, src/eneru/graph.py, src/eneru/stats.py, src/eneru/shutdown/filesystems.py): is_numeric rejects NaN/±Inf and booleans; run_command guarantees string outputs; format_seconds clamps negatives to zero; CLI validate command re-parses config based on file existence; bash completion uses mapfile for safe whitespace handling; logger closes handlers to prevent leaks; graph auto-computes bounds from numeric data only; stats uses SQLite deferred transactions and URL-escapes read-only paths; umount options tokenized via shlex.split.
- Documentation & Versioning (docs/changelog.md, src/eneru/version.py): version bumped to 5.1.1 with a comprehensive changelog entry documenting TUI scrolling, shutdown/redundancy fixes, multi-UPS race conditions, voltage re-evaluation, remote-shell injection mitigation, OIDC token scoping, SHA pinning, Debian lifecycle handling, and test isolation improvements.
- E2E & Unit Tests (tests/conftest.py, tests/e2e/groups/multi-ups.sh, tests/e2e/groups/single-ups.sh, tests/e2e/scenarios/low-battery.dev, tests/e2e/ssh-target/Dockerfile, tests/e2e/ssh-target/entrypoint.sh, tests/test_cli.py, tests/test_monitor_core.py, tests/test_multi_ups.py, tests/test_redundancy.py, tests/test_remote_commands.py, tests/test_shutdown_vms.py, tests/test_stats.py, tests/test_tui.py): fixtures added for stats DB isolation and run_command patched everywhere; no_stats_isolation marker for opt-out; E2E scripts now validate exit codes and timestamps; low-battery scenario updated for determinism; SSH target container ensures /var/run/sshd exists; new tests verify UUID binding, compose path quoting, and virsh command sequences; TUI test refocused on max-events query without a time window; state-file helper updated to the uppercase format.
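The shlex.quote mitigation called out under Remote Shutdown Commands can be sketched in a few lines. The template below is illustrative only; the project's actual REMOTE_ACTIONS entries may differ:

```python
import shlex

# Hypothetical template in the style of REMOTE_ACTIONS (illustrative, not the real entry).
STOP_COMPOSE_TEMPLATE = "docker compose -f {path}/docker-compose.yml stop -t {timeout}"

def render_stop_compose(path: str, timeout: int) -> str:
    # shlex.quote neutralizes $(), backticks, and ${} before the string
    # is embedded into the remote shell command.
    return STOP_COMPOSE_TEMPLATE.format(path=shlex.quote(path), timeout=timeout)

cmd = render_stop_compose("/srv/$(reboot)/stacks", 30)
assert cmd == "docker compose -f '/srv/$(reboot)/stacks'/docker-compose.yml stop -t 30"
```

Without the quoting, the `$(reboot)` segment would be expanded by the remote shell.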

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Hops of joy for this fine release,
Where events scroll without time's cease,
UUIDs bind, and SSH's made wise,
Shutdown flows dance—no more surprise!
Redundancy nestles, stats stay apart,
Five-one-one beats in your heart. 🎉

🚥 Pre-merge checks: ✅ 5 passed

- Title check: ✅ Passed. The PR title clearly summarizes the main change: cutting a 5.1.1 bug-fix release with fixes from AI code review, which directly corresponds to the changeset's content of version bumps, security/operational fixes, CI improvements, and test isolation.
- Docstring Coverage: ✅ Passed. Docstring coverage is 81.97%, which is sufficient; the required threshold is 80.00%.
- Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
- Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
- Description check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.




codecov-commenter commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 61.60000% with 48 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.06%. Comparing base (3463bc9) to head (e9ae4fc).
⚠️ Report is 13 commits behind head on main.

Files with missing lines (patch coverage, missing lines):

- src/eneru/tui.py: 29.03% (21 missing, 1 partial) ⚠️
- src/eneru/health/battery.py: 14.28% (5 missing, 1 partial) ⚠️
- src/eneru/graph.py: 61.53% (1 missing, 4 partials) ⚠️
- src/eneru/shutdown/remote.py: 50.00% (4 missing, 1 partial) ⚠️
- src/eneru/cli.py: 33.33% (4 missing) ⚠️
- src/eneru/logger.py: 71.42% (2 missing) ⚠️
- src/eneru/shutdown/vms.py: 77.77% (1 missing, 1 partial) ⚠️
- src/eneru/utils.py: 71.42% (1 missing, 1 partial) ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #31      +/-   ##
==========================================
- Coverage   79.87%   79.06%   -0.82%     
==========================================
  Files          23       23              
  Lines        3593     3664      +71     
  Branches      692      708      +16     
==========================================
+ Hits         2870     2897      +27     
- Misses        572      607      +35     
- Partials      151      160       +9     


The Commit 5 bump of the low-battery.dev runtime to 900s removed the
critical-runtime fallback that had been masking a real bug:
UPS2 in tests/e2e/config-e2e-multi-ups.yaml has
low_battery_threshold=15, and the LOW BATTERY trigger uses strict
`<`, so battery.charge=15 never fires UPS2's shutdown. Test 15
(Non-local failure) sat for the full 15-minute job timeout waiting
for a shutdown that wasn't going to come.

Drop battery.charge to 14 so the scenario fires LOW BATTERY against
both UPS1 (threshold 20) and UPS2 (threshold 15), and the
single-UPS tests still exercise the same trigger.
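The strict-comparison semantics described above reduce to a one-liner; the helper name is illustrative, not the daemon's actual code:

```python
def low_battery_triggered(charge: float, threshold: float) -> bool:
    # Strict `<`: a charge exactly at the threshold does NOT fire.
    return charge < threshold

# Thresholds from the scenario: UPS1 = 20, UPS2 = 15.
assert low_battery_triggered(15, 20)        # UPS1 fires at charge 15
assert not low_battery_triggered(15, 15)    # UPS2 never fires at charge 15
assert low_battery_triggered(14, 15)        # charge 14 fires both UPSes
```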

Co-Authored-By: Claude Opus 4.7 <[email protected]>

m4r1k commented Apr 22, 2026

@coderabbitai full review


m4r1k commented Apr 22, 2026

@cubic-dev-ai review this pull request


coderabbitai Bot commented Apr 22, 2026

✅ Actions performed

Full review triggered.


cubic-dev-ai Bot commented Apr 22, 2026


@m4r1k I have started the AI code review. It will take a few minutes to complete.


cubic-dev-ai Bot left a comment

5 issues found across 49 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="examples/config-enterprise.yaml">

<violation number="1" location="examples/config-enterprise.yaml:54">
P3: The Slack example URL puts `?footer=no` after the `#channel` fragment, so the footer option is not passed as a query parameter.</violation>
</file>

<file name="src/eneru/shutdown/filesystems.py">

<violation number="1" location="src/eneru/shutdown/filesystems.py:64">
P1: `shlex.split(options)` can raise on malformed config input and crash the shutdown sequence; handle parse errors before extending the command.</violation>
</file>

<file name="tests/e2e/groups/single-ups.sh">

<violation number="1" location="tests/e2e/groups/single-ups.sh:43">
P2: Do not accept a clean 0 exit here; that lets a premature daemon exit pass this monitoring test.</violation>
</file>

<file name="src/eneru/tui.py">

<violation number="1" location="src/eneru/tui.py:906">
P1: Enable keypad mode before handling `curses.KEY_*`; otherwise arrow/page/home keys can be read as raw escape input and trigger the ESC-to-quit branch instead of scrolling.</violation>
</file>

<file name="tests/conftest.py">

<violation number="1" location="tests/conftest.py:260">
P2: Patch `eneru.utils.run_command` too; otherwise `command_exists()` still shells out during the shutdown tests.</violation>
</file>
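The config-enterprise.yaml finding is easy to confirm with the standard library: in generic URL parsing, everything after `#` belongs to the fragment, so a `?footer=no` placed after `#channel` never reaches the query string. The token segments below are placeholders:

```python
from urllib.parse import urlsplit

wrong = urlsplit("slack://tokenA/tokenB/tokenC/#channel?footer=no")
assert wrong.query == ""                        # option swallowed by the fragment
assert wrong.fragment == "channel?footer=no"

right = urlsplit("slack://tokenA/tokenB/tokenC/?footer=no#channel")
assert right.query == "footer=no"               # query must precede the fragment
assert right.fragment == "channel"
```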
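The filesystems.py finding is also reproducible: `shlex.split` raises `ValueError` on unbalanced quotes. A defensive parse in the suggested direction might look like this; the helper name and fallback are illustrative:

```python
import shlex

def parse_umount_options(options: str) -> list[str]:
    # A malformed config value (e.g. an unclosed quote) must not crash
    # the shutdown sequence; fall back to no extra options.
    try:
        return shlex.split(options)
    except ValueError:
        return []

assert parse_umount_options("-l -f") == ["-l", "-f"]
assert parse_umount_options("-o 'unclosed") == []   # shlex.split alone would raise
```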



coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
examples/config-homelab.yaml (1)

68-99: ⚠️ Potential issue | 🟡 Minor

Update the section guidance to include shutdown_order phases.

The NAS block now correctly uses shutdown_order: 2, but Lines 68-71 still describe only the legacy two-phase parallel model. That makes this example internally inconsistent and can mislead users into using parallel: false for “shutdown last”.

📝 Suggested wording
-# Remote servers with optional pre-shutdown commands
-# Servers are processed in two phases:
-#   1. Sequential: Servers with parallel: false are shutdown one by one (in config order)
-#   2. Parallel: Remaining servers (parallel: true, the default) are shutdown concurrently
+# Remote servers with optional pre-shutdown commands
+# Servers are processed by effective shutdown order, lowest to highest.
+# Servers with the same order run concurrently.
+#   - Default: order 0, parallel batch
+#   - shutdown_order: N: explicit phase; higher numbers run later
+#   - Legacy parallel: false: sequential phases before order 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/config-homelab.yaml` around lines 68 - 99, The example text
describing remote_servers is outdated: it only explains the legacy two-phase
"parallel" model and misleads users about how to schedule a server "last";
update the guidance to mention the new shutdown_order phases and how they
interact with the old parallel flag (i.e., that shutdown_order: <int> controls
later phases and parallel: false runs a server sequentially before the parallel
batch per RemoteServerConfig), and clarify that to run something after the
parallel batch you must set shutdown_order >= 1 (example: NAS uses
shutdown_order: 2) rather than using parallel: false.
tests/test_cli.py (1)

245-257: ⚠️ Potential issue | 🟡 Minor

Use tmp_path for the missing config path.

The hard-coded /nonexistent/path/config.yaml is environment-dependent. A temp path that is intentionally not created keeps the test deterministic.

Proposed fix
-    def test_validate_config_nonexistent_file(self, capsys):
+    def test_validate_config_nonexistent_file(self, tmp_path, capsys):
@@
-        typo_path = "/nonexistent/path/config.yaml"
+        typo_path = str(tmp_path / "missing-config.yaml")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_cli.py` around lines 245 - 257, Replace the hard-coded missing
config path with a temp path created from the pytest tmp_path fixture (e.g.,
typo_path = tmp_path / "config.yaml") and ensure you do not create that file so
it remains missing; update the test to use str(typo_path) where needed, keep the
argv patching that calls main(), and leave the same SystemExit and captured
output assertions (asserting "Configuration is valid", "Config file not found",
and the tmp_path string) so the test is deterministic across environments.
🧹 Nitpick comments (15)
.github/workflows/validate.yml (1)

19-39: LGTM on SHA pins.

Consider adding the same "how to refresh pins" comment as in codeql.yml for consistency across workflows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/validate.yml around lines 19 - 39, Add the same "how to
refresh pins" explanatory comment next to the pinned action SHAs in this
workflow (e.g., the uses: actions/checkout@de0fac2e..., uses:
actions/setup-python@a309ff8b4..., and uses: codecov/codecov-action@57e3a13...
lines) so it matches codeql.yml; insert a brief one-line note explaining the
source/command or script to update the SHA pins and where to run it (copy the
exact phrasing from codeql.yml for consistency).
.github/workflows/release.yml (1)

485-495: Recommended: add a concurrency: group now that git push -f is gone.

Removing the force-push is correct — silently clobbering a parallel release's commits is worse than failing. The flip side is that two overlapping release runs (e.g., a tag push and a manual dispatch, or a tag + an rc cut close together) will each rebuild and sign packages, do all the repo work, and only discover the race at the final git push. The loser then has to be re-run manually.

Serializing the job with a concurrency group avoids the wasted work and the operator toil, while still surfacing real divergence if gh-pages was edited out of band.

♻️ Suggested change
 on:
   release:
     types: [published]
   workflow_dispatch:
@@
 permissions:
   contents: write

+# Serialize gh-pages deploys so concurrent releases don't race the
+# non-fast-forward push at the end of the job.
+concurrency:
+  group: release-gh-pages
+  cancel-in-progress: false
+
 jobs:
   build-packages:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 485 - 495, Add a concurrency
group to the release workflow so overlapping runs are serialized and wasted work
avoided: update the job that contains the "Deploy to gh-pages" step to include a
concurrency: block (e.g., concurrency: { group: "${{ github.repository
}}-release-gh-pages-${{ github.ref }}", cancel-in-progress: false } or similar)
at the job level so concurrent release jobs targeting gh-pages are run
one-at-a-time while still failing on real divergence from the remote before
pushing with the existing git push step.
tests/test_multi_ups.py (1)

1279-1283: Assertion relaxation matches the new two-phase drain.

_drain_all_groups now joins each thread twice (short window post-stop_event, then final deadline join), so assert_called_once() would always fail. assert live_thread.join.called is correct. If you want a slightly stronger invariant without coupling to the exact count, assert live_thread.join.call_count >= 2 would lock in the two-phase contract and catch regressions if someone collapses the loops back into one.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_multi_ups.py` around lines 1279 - 1283, The test was relaxed to
use assert live_thread.join.called because _drain_all_groups now calls
Thread.join twice per thread; update the assertion to enforce the two-phase
contract by replacing assert live_thread.join.called with assert
live_thread.join.call_count >= 2 so the test verifies at least two joins
occurred for live_thread (keep the existing check for the timeout log:
any("still running after drain timeout" in m for m in logs)).
src/eneru/shutdown/filesystems.py (1)

53-57: Dry-run log shows unsplit options.

Now that the real path splits options via shlex.split, the [DRY-RUN] line on Line 55 still prints the raw {options} string as a single token, which no longer mirrors the actual argv that would be exec'd (e.g. "-l -f" vs -l -f). Cosmetic only, but easy to align:

♻️ Proposed fix
-            if self.config.behavior.dry_run:
-                self._log_message(
-                    f"  🧪 [DRY-RUN] Would execute: timeout {timeout}s umount {options} {mount_point}"
-                )
-                continue
+            if self.config.behavior.dry_run:
+                argv = ["umount", *shlex.split(options), mount_point] if options else ["umount", mount_point]
+                self._log_message(
+                    f"  🧪 [DRY-RUN] Would execute: timeout {timeout}s {' '.join(shlex.quote(a) for a in argv)}"
+                )
+                continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/shutdown/filesystems.py` around lines 53 - 57, The dry-run message
prints the raw options string which no longer matches the actual argv because
the real execution uses shlex.split; update the dry-run branch in the same
method that checks self.config.behavior.dry_run to reconstruct the tokenized
form (e.g. split options with shlex.split and join with spaces or format as
separate tokens) before calling self._log_message so the logged command mirrors
the real exec args (use the same shlex.split logic already used elsewhere on the
options variable).
src/eneru/stats.py (1)

608-619: URL-encoding the DB path — correct, and urlquote's default safe='/' is what you want here.

POSIX paths with ? or # would previously be truncated or mis-parsed into URI query/fragment components by sqlite3.connect(..., uri=True). quote(str(path)) with the default safe-set leaves / unescaped (so the URI remains a legal path) while encoding ?, #, space, etc. Good.

Minor: the docstring still says "Returns None if the file doesn't exist" but the declared return type is sqlite3.Connection (non-Optional) with a # type: ignore on Line 610. Consider typing it Optional[sqlite3.Connection] so callers like tui.py:208 (which already does if conn is not None) are type-checked honestly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/stats.py` around lines 608 - 619, The function currently returns
None when the DB file doesn't exist but its signature/docstring declare a
non-Optional sqlite3.Connection and use a "# type: ignore" to silence this;
update the function signature return type to Optional[sqlite3.Connection],
remove the "# type: ignore" suppression, and adjust the docstring to say
"Returns None if the file doesn't exist" so callers (e.g., the code that checks
if conn is not None) are correctly type-checked; no change to the urlquote/uri
logic (urlquote, conn, sqlite3.connect, PRAGMA busy_timeout remain the same).
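The escaping behavior under discussion is simple to verify; the path below is invented for illustration:

```python
from urllib.parse import quote

path = "/var/lib/eneru/stats db?.sqlite"    # hypothetical awkward path
encoded = quote(path)                        # default safe="/" keeps the slashes
assert encoded == "/var/lib/eneru/stats%20db%3F.sqlite"

# Read-only SQLite URI built from the encoded path: the literal "?" and " "
# can no longer be misread as URI query/fragment delimiters.
uri = f"file:{encoded}?mode=ro"
assert uri == "file:/var/lib/eneru/stats%20db%3F.sqlite?mode=ro"
```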
src/eneru/utils.py (1)

8-29: LGTM — stricter is_numeric and normalized run_command returns line up with the call-site assumptions.

Rejecting bool and non-finite floats matches how callers in monitor.py, health/voltage.py, and the shutdown mixins consume these values (all do int(float(x)) or treat depletion_rate > 0 as a number), and normalizing stdout/stderr to "" matches shutdown/containers.py:142 and shutdown/remote.py:193 that unconditionally .strip() them. The negative-clamp in format_seconds is also a sensible display fix.

One small thing: moving import math to module top would be idiomatic (it's the only inline import here and is_numeric is on the hot path for every poll) — purely optional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/utils.py` around lines 8 - 29, The is_numeric function currently
imports math inline; move the import to the module top for idiomatic style and
to avoid the per-call import overhead: add a single top-level "import math" and
remove the inline "import math" from inside is_numeric so the function continues
to use math.isfinite without repeated imports.
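A minimal sketch of the stricter semantics (not the project's exact implementation):

```python
import math

def is_numeric(value) -> bool:
    # bool is a subclass of int, so float(True) == 1.0; reject it explicitly.
    if isinstance(value, bool):
        return False
    try:
        return math.isfinite(float(value))
    except (TypeError, ValueError):
        return False

assert is_numeric("42.5")
assert not is_numeric(True)            # boolean rejected
assert not is_numeric(float("nan"))    # non-finite rejected
assert not is_numeric("inf")
assert not is_numeric(None)            # float(None) raises TypeError
```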
tests/test_tui.py (1)

109-128: Source-string inspection is fine as a guardrail, but consider also asserting at call time.

Using inspect.getsource + substring/regex checks can silently drift if someone reformats the call (e.g. splits args across lines with unusual inner spacing, binds an alias such as qe = query_events_for_display, or renames a kwarg). A complementary behavioral assertion that patches query_events_for_display and inspects its actual call args during a short run_tui invocation would be more robust. Not blocking; the current checks do catch the exact regression you described.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_tui.py` around lines 109 - 128, Add a runtime assertion test that
patches/wraps the query_events_for_display function during a short invocation of
run_tui to inspect how it was called: use unittest.mock.patch or monkeypatch to
replace eneru.tui.query_events_for_display with a wrapper that records
args/kwargs, call tui_mod.run_tui(...) with minimal config, then assert the
recorded call has no positional time-range argument (i.e., len(args) == 1 and
args[0] is the config) and that 'max_events' is present in kwargs, and finally
restore the original function; reference run_tui and query_events_for_display
when locating where to add this behavioral check.
src/eneru/multi_ups.py (1)

265-307: Drain ordering is correct; minor nit on the Phase-1 join budget.

Signaling stop_event before running peer shutdown sequences avoids concurrent state-file writes and notifications inside peer poll cycles. The two-phase join (short window, then full-timeout) is the right shape.

Minor: join_deadline = time.time() + max(1, timeout // 4) caps the per-peer pre-drain wait at timeout/4 total across all threads combined (since remaining shrinks monotonically as threads are joined serially). With default timeout=120 that's a 30s combined budget — fine, but if a single peer hangs for the full 30s, later peers effectively get a 0-timeout join. Consider splitting the budget per-thread if a peer commonly takes seconds to exit its loop:

🛠️ Optional per-thread budget
-        self._stop_event.set()
-        join_deadline = time.time() + max(1, timeout // 4)
-        for thread in self._threads:
-            remaining = max(0.0, join_deadline - time.time())
-            thread.join(timeout=remaining)
+        self._stop_event.set()
+        per_thread_budget = max(1.0, (timeout / 4) / max(len(self._threads), 1))
+        for thread in self._threads:
+            thread.join(timeout=per_thread_budget)

Not blocking for 5.1.1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/multi_ups.py` around lines 265 - 307, The pre-drain join loop in
_drain_all_groups currently uses a single join_deadline (join_deadline =
time.time() + max(1, timeout // 4)) which gives a combined budget across all
threads and can starve later threads; change it to compute a per-thread join
budget derived from that phase budget divided by max(1, len(self._threads)) (or
a sane minimum per-thread like 0.5s) and use that per-thread timeout when
calling thread.join so each thread gets its own slice of the Phase-1 window
(adjust references to self._threads and join_deadline accordingly in
_drain_all_groups).
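The two-phase shape, with the per-thread Phase-1 budget this comment suggests, can be sketched standalone (a simplification, not the actual `_drain_all_groups` code):

```python
import threading
import time

def drain(threads: list[threading.Thread], stop_event: threading.Event,
          timeout: float = 120.0) -> list[str]:
    stop_event.set()  # signal poll loops before any join
    # Phase 1: each thread gets its own slice of the short window, so one
    # hung peer cannot consume the whole budget and starve later joins.
    budget = max(1.0, (timeout / 4) / max(len(threads), 1))
    for t in threads:
        t.join(timeout=budget)
    # Phase 2: final deadline join for any stragglers.
    deadline = time.time() + timeout
    for t in threads:
        t.join(timeout=max(0.0, deadline - time.time()))
    return [t.name for t in threads if t.is_alive()]

stop = threading.Event()
workers = [threading.Thread(target=stop.wait, name=f"peer-{i}") for i in range(3)]
for w in workers:
    w.start()
assert drain(workers, stop, timeout=8.0) == []   # all loops exit once signaled
```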
tests/conftest.py (2)

245-275: LGTM — explicit target list is the right trade-off over from X import Y rebinding.

Documenting why a single patch("eneru.utils.run_command") is insufficient is exactly the context the next contributor will need. One suggestion for future-proofing: if you add another shutdown mixin (shutdown/foo.py) that does from eneru.utils import run_command, the fixture will silently under-patch it. Consider an assertion or a helper that discovers these sites automatically (e.g. scanning eneru.shutdown submodules for a run_command attribute) so new modules don't escape isolation unnoticed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/conftest.py` around lines 245 - 275, Replace the hardcoded targets in
the patch_run_command_everywhere fixture with a discovery step that scans the
eneru.shutdown package for submodules that expose a run_command symbol: use
pkgutil.iter_modules over eneru.shutdown.__path__ and importlib.import_module to
load each submodule, collect module paths where hasattr(module, "run_command"),
then build the targets list from those module paths (plus any known non-shutdown
locations like eneru.monitor and eneru.multi_ups). After discovery, assert that
the discovered shutdown modules are included in the targets (or fail the test)
so new shutdown submodules that import run_command don’t silently escape
patching; keep the rest of the fixture (creating patchers, starting mocks,
default return_value, yield, stop) unchanged.

42-103: Solid two-layer isolation; watch the StatsConfig.__init__ wrapper signature if the dataclass grows.

The approach is correct for today's StatsConfig, but patched_cfg_init(self, db_directory=isolated_str, **kw) bakes in the assumption that db_directory is the only positional-friendly parameter. If someone later adds a second field before db_directory in the dataclass, a positional call like StatsConfig("/path", 100) will misroute the first arg. A slightly more future-proof shape is to detect whether the caller supplied db_directory and only inject the default otherwise:

🛠️ Optional hardening
-    def patched_cfg_init(self, db_directory=isolated_str, **kw):
-        return original_cfg_init(self, db_directory=db_directory, **kw)
+    def patched_cfg_init(self, *args, **kw):
+        if "db_directory" not in kw and not args:
+            kw["db_directory"] = isolated_str
+        return original_cfg_init(self, *args, **kw)

Not urgent — current tests pass cleanly with the existing shape.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/conftest.py` around lines 42 - 103, The patched StatsConfig.__init__
currently assumes callers won't pass positional args for db_directory; update
patched_cfg_init to accept (*args, **kw) and only inject
db_directory=isolated_str when the caller did not supply it (i.e. 'db_directory'
not in kw and no positional arg was provided for it). Concretely, change
patched_cfg_init to signature patched_cfg_init(self, *args, **kw), and if
'db_directory' not in kw and len(args) == 0 set kw['db_directory'] =
isolated_str before calling original_cfg_init(self, *args, **kw); otherwise
forward arguments unchanged to original_cfg_init. This preserves current
behavior but avoids misrouting when new dataclass fields are added before
db_directory.
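The hardened wrapper can be exercised standalone; StatsConfig here is a simplified stand-in, not the real eneru dataclass:

```python
from dataclasses import dataclass
from functools import wraps

@dataclass
class StatsConfig:                      # simplified stand-in
    db_directory: str = "/var/lib/eneru"
    max_rows: int = 100

original_init = StatsConfig.__init__
ISOLATED = "/tmp/test-stats"

@wraps(original_init)
def patched_init(self, *args, **kw):
    # Inject the isolated default only when the caller supplied nothing,
    # so positional and keyword callers are both routed correctly.
    if "db_directory" not in kw and not args:
        kw["db_directory"] = ISOLATED
    original_init(self, *args, **kw)

StatsConfig.__init__ = patched_init
assert StatsConfig().db_directory == ISOLATED
assert StatsConfig("/explicit").db_directory == "/explicit"
assert StatsConfig(db_directory="/kw").db_directory == "/kw"
```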
src/eneru/shutdown/remote.py (2)

93-120: LGTM on timeout and thread-worker hardening.

The is not None check correctly preserves per-command timeout=0 as an explicit override (matches RemoteCommandConfig.timeout: Optional[int] = None in config.py and the test at test_monitor_core.py:877-886). The except Exception in the worker thread at line 113 is appropriate here despite Ruff BLE001 — a bare-crash in a thread target would otherwise disappear silently with no log trail. Consider adding # noqa: BLE001 on line 113 to document the deliberate catch-all and silence the static analyzer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/shutdown/remote.py` around lines 93 - 120, Add an inline noqa to
the broad exception handler in shutdown_server_thread to document the deliberate
catch-all and silence the Ruff BLE001 warning: locate the except Exception as
exc: block inside the shutdown_server_thread function (in
src/eneru/shutdown/remote.py) and append a trailing comment "# noqa: BLE001" to
that except line, keeping the existing log message and behavior intact.

232-249: Validate stop_compose path before formatting the template.

The validation that stop_compose has a path at lines 245-249 runs after the template has already been rendered with shlex.quote("") at line 240. The continue does prevent the malformed command from being sent, so the "stop_compose requires 'path'" warning is effectively dead code from a correctness standpoint. The ordering also obscures intent: for any future template that made path mandatory in a way shlex.quote("") couldn't satisfy, a bad command would slip through. A minor reordering makes the precondition check authoritative.

♻️ Proposed reorder
             if cmd_config.action:
                 action_name = cmd_config.action.lower()

                 if action_name not in REMOTE_ACTIONS:
                     self._log_message(
                         f"    ⚠️ [{idx}/{cmd_count}] Unknown action: {action_name} (skipping)"
                     )
                     continue

+                # Validate stop_compose has path
+                if action_name == "stop_compose" and not cmd_config.path:
+                    self._log_message(
+                        f"    ⚠️ [{idx}/{cmd_count}] stop_compose requires 'path' parameter (skipping)"
+                    )
+                    continue
+
                 # Get command template and substitute placeholders.
                 # `path` is shlex-quoted because the template embeds it
                 # directly into the remote shell — without quoting, a
                 # malicious or malformed path could expand $(), `…`, or
                 # ${…} on the remote host.
                 command_template = REMOTE_ACTIONS[action_name]
                 command = command_template.format(
                     timeout=timeout,
                     path=shlex.quote(cmd_config.path or "")
                 )
                 description = action_name
-
-                # Validate stop_compose has path
-                if action_name == "stop_compose" and not cmd_config.path:
-                    self._log_message(
-                        f"    ⚠️ [{idx}/{cmd_count}] stop_compose requires 'path' parameter (skipping)"
-                    )
-                    continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/shutdown/remote.py` around lines 232 - 249, Move the "stop_compose
requires 'path'" precondition check to run before you fetch/format the
REMOTE_ACTIONS template: in the loop where you access
REMOTE_ACTIONS[action_name] and call command_template.format(...), check if
action_name == "stop_compose" and ensure cmd_config.path is present first (the
block that logs the warning and continue), then only construct command_template
and call shlex.quote(cmd_config.path) and format the template; this ensures the
validation in the loop for stop_compose is authoritative and prevents any
template formatting with an empty path.
src/eneru/tui.py (2)

1032-1036: Nit: redundant max(0, …) on KEY_HOME.

len(log_events) is already non-negative, so max(0, len(log_events)) is equivalent to len(log_events). The inner clamp inside render_logs_panel (lines 721-722) is what actually pins the view to the oldest event — this assignment just needs to overshoot. Trivial simplification:

             elif key == curses.KEY_HOME:
                 # Jump to the oldest event currently in the list.
-                events_scroll = max(0, len(log_events))
+                events_scroll = len(log_events)
             elif key == curses.KEY_END:
                 events_scroll = 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/tui.py` around lines 1032 - 1036, The KEY_HOME handler uses max(0,
len(log_events)) unnecessarily; replace the assignment events_scroll = max(0,
len(log_events)) with the simpler events_scroll = len(log_events) in the
input/key handling block where KEY_HOME is processed (the same block that sets
events_scroll for KEY_END), since render_logs_panel already clamps the scroll;
reference variables/events_scroll, log_events and the render_logs_panel function
to locate the change.

962-972: Scroll keys are a no-op in normal mode — consider making <M> implicit on first scroll.

When show_more=False, events_cap = EVENTS_MAX_ROWS_NORMAL = 8 at line 963, so query_events_for_display pulls at most 8 rows. The events panel typically displays ~8 rows, meaning ↑/PgUp/Home have nothing to scroll to and silently do nothing, despite the always-visible <↑↓> Scroll hint at line 759.

The comment at lines 34-39 says the panel "pulls every event from the SQLite store … and lets the operator scroll through with the arrow keys" — but that's only true once the user presses <M>. Two low-friction options:

  1. Auto-promote to show_more=True when the user presses ↑/PgUp/Home (keep <M> as an explicit toggle).
  2. Dim the <↑↓> Scroll hint when show_more=False, or change its label to match reality.

Either clears up the "my arrow keys are broken" moment for first-time users. Not a correctness issue.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/tui.py` around lines 962 - 972, The events panel is limited to
EVENTS_MAX_ROWS_NORMAL when show_more is False, making arrow/PgUp/Home appear to
do nothing; modify the scroll-key handler (the function that processes
↑/PgUp/Home) to detect the first scroll attempt when show_more is False, set
show_more = True, recompute events_cap = EVENTS_MAX_ROWS_MORE and re-run
query_events_for_display(config, max_events=events_cap) (or fallback to
parse_log_events) before performing the actual scroll action so the implicit
"<M>" behavior is applied on first scroll; ensure the state change is idempotent
so subsequent scrolls behave normally.
tests/test_monitor_core.py (1)

320-331: Cross-class reference to a helper defined much later in the file.

TestFailsafe.test_connection_lost_while_ob_triggers_shutdown calls TestAdvisoryTriggers._run_one_iteration (defined 600+ lines below at line 1026). It works because the reference is resolved at test-execution time, but the coupling is awkward — a reader scanning TestFailsafe has no hint where _run_one_iteration lives, and TestAdvisoryTriggers is now an implicit dependency of TestFailsafe.

Consider promoting _run_one_iteration to a module-level helper (e.g. def _run_one_iteration(monitor, ups_data_response): near make_monitor) so both test classes reference it symmetrically.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_monitor_core.py` around lines 320 - 331,
TestFailsafe.test_connection_lost_while_ob_triggers_shutdown references
TestAdvisoryTriggers._run_one_iteration (defined much later), creating a
confusing cross-class dependency; move the helper to module scope so both tests
can call it. Extract the implementation of
TestAdvisoryTriggers._run_one_iteration into a top-level function (e.g. def
_run_one_iteration(monitor, ups_data_response):) placed near make_monitor,
update both TestAdvisoryTriggers and TestFailsafe to call the module-level
_run_one_iteration, and remove or deprecate the class-level helper to eliminate
the implicit dependency while preserving behavior used in
test_connection_lost_while_ob_triggers_shutdown and other tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/integration.yml:
- Around line 197-203: The dnf fallback using "--skip-broken" can hide a failed
eneru install when rpm fails, so change the "Install dependencies and package"
step (the rpm -ivh --nodeps || dnf install -y ./eneru-*.rpm --skip-broken
command) to either not use --skip-broken or to explicitly verify the install
succeeded afterward; e.g., run dnf without --skip-broken (so failure propagates)
or keep the fallback but add a check that the installed artifact exists (test -f
/opt/ups-monitor/eneru.py or rpm -q eneru) and exit non-zero if it’s missing so
the job fails fast.

In `@src/eneru/config.py`:
- Around line 592-605: The code uses notif_data.get(..., default) which won't
protect against explicit nulls; update the notifications parsing so you only
override the defaults when the YAML value is present and not None — e.g., read
tmp = notif_data.get('suppress') and if tmp is not None then set notif_suppress
= tmp (same for notif_voltage_hysteresis), and ensure any non-None value is of
the expected type before assigning (to satisfy NotificationsConfig and the
checks in validate_config); refer to notif_suppress, notif_voltage_hysteresis,
NotificationsConfig and validate_config to locate and apply this change.

In `@src/eneru/graph.py`:
- Around line 126-137: The bounds-calculation currently filters numeric values
using isinstance(v, (int, float)), which excludes numeric strings; change the
filter in the y_min/y_max auto-bound logic (the block that builds numeric =
[...]) to accept any value coercible by float() by attempting float(v) in a
try/except and collecting the float-converted values, then compute min/max from
those floats; make the same adjustment in the duplicate bounds block used
elsewhere (the similar numeric-filtering code later in the file) so plotting and
scaling accept numeric strings and other float-coercible inputs.

In `@src/eneru/logger.py`:
- Line 26: The logger name is still "ups-monitor" which breaks the syslog
rebrand; update the call to logging.getLogger(...) in the class where
self.logger is assigned (the self.logger = logging.getLogger("ups-monitor")
statement) to use "eneru" instead so the logger identifier matches the rest of
the codebase and docs.

In `@src/eneru/version.py`:
- Around line 4-5: Update the stale format comment to reflect the current
tagged-release example by changing "5.1.0" to "5.1.1" so the comment above the
__version__ assignment matches the bumped value; locate the comment near the
__version__ = "5.1.1" line and update the example string in that comment
accordingly.

In `@tests/test_redundancy.py`:
- Around line 363-370: The test currently checks ev.is_alive() after a fixed
sleep which can race with the evaluator calling snapshot(); modify the test to
wait (with a short timeout) until the failing snapshot path has actually been
exercised by polling the bad_monitor.state.snapshot flag/call indicator (or
equivalent sentinel) before asserting ev.is_alive(); in practice add a small
loop that repeatedly checks bad_monitor.state.snapshot (or a boolean set by the
snapshot implementation) and breaks on success or fails after timeout, then
perform the ev.is_alive() assertion so the check is gated on snapshot having
been invoked.

In `@tests/test_shutdown_vms.py`:
- Around line 118-127: The test currently only checks membership for per-VM
shutdown commands, which doesn't enforce order; update the assertions after
collecting issued = [c.args[0] for c in mock_run.call_args_list] to assert the
shutdown commands occur in the expected order (e.g., compute indices of
["virsh","shutdown","vm1"] and ["virsh","shutdown","vm2"] in issued and assert
index(vm1) < index(vm2) or assert the slice of issued contains the two shutdown
commands in that exact sequence), leaving the existing checks for the initial
"virsh list ..." and the absence of "virsh destroy" intact.
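A minimal sketch of the ordering assertion, assuming `issued` is the argv list collected from `mock_run.call_args_list` (the command sequence below is illustrative, not the test's real capture):

```python
def assert_issued_in_order(issued, first, second):
    """Assert both commands were issued and `first` preceded `second`."""
    assert first in issued and second in issued
    assert issued.index(first) < issued.index(second)

# Illustrative captured command sequence:
issued = [
    ["virsh", "list", "--name"],
    ["virsh", "shutdown", "vm1"],
    ["virsh", "shutdown", "vm2"],
]
assert_issued_in_order(issued,
                       ["virsh", "shutdown", "vm1"],
                       ["virsh", "shutdown", "vm2"])
# The existing membership checks stay intact alongside the order check:
assert ["virsh", "destroy", "vm1"] not in issued
```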

---

Outside diff comments:
In `@examples/config-homelab.yaml`:
- Around line 68-99: The example text describing remote_servers is outdated: it
only explains the legacy two-phase "parallel" model and misleads users about how
to schedule a server "last"; update the guidance to mention the new
shutdown_order phases and how they interact with the old parallel flag (i.e.,
that shutdown_order: <int> controls later phases and parallel: false runs a
server sequentially before the parallel batch per RemoteServerConfig), and
clarify that to run something after the parallel batch you must set
shutdown_order >= 1 (example: NAS uses shutdown_order: 2) rather than using
parallel: false.

In `@tests/test_cli.py`:
- Around line 245-257: Replace the hard-coded missing config path with a temp
path created from the pytest tmp_path fixture (e.g., typo_path = tmp_path /
"config.yaml") and ensure you do not create that file so it remains missing;
update the test to use str(typo_path) where needed, keep the argv patching that
calls main(), and leave the same SystemExit and captured output assertions
(asserting "Configuration is valid", "Config file not found", and the tmp_path
string) so the test is deterministic across environments.

---

Nitpick comments:
In @.github/workflows/release.yml:
- Around line 485-495: Add a concurrency group to the release workflow so
overlapping runs are serialized and wasted work avoided: update the job that
contains the "Deploy to gh-pages" step to include a concurrency: block (e.g.,
concurrency: { group: "${{ github.repository }}-release-gh-pages-${{ github.ref
}}", cancel-in-progress: false } or similar) at the job level so concurrent
release jobs targeting gh-pages are run one-at-a-time while still failing on
real divergence from the remote before pushing with the existing git push step.
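A sketch of the suggested job-level block; the group name is illustrative and can be any stable key scoped to the gh-pages deploy:

```yaml
concurrency:
  group: "${{ github.repository }}-release-gh-pages-${{ github.ref }}"
  cancel-in-progress: false   # queue overlapping runs instead of cancelling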

In @.github/workflows/validate.yml:
- Around line 19-39: Add the same "how to refresh pins" explanatory comment next
to the pinned action SHAs in this workflow (e.g., the uses:
actions/checkout@de0fac2e..., uses: actions/setup-python@a309ff8b4..., and uses:
codecov/codecov-action@57e3a13... lines) so it matches codeql.yml; insert a
brief one-line note explaining the source/command or script to update the SHA
pins and where to run it (copy the exact phrasing from codeql.yml for
consistency).

In `@src/eneru/multi_ups.py`:
- Around line 265-307: The pre-drain join loop in _drain_all_groups currently
uses a single join_deadline (join_deadline = time.time() + max(1, timeout // 4))
which gives a combined budget across all threads and can starve later threads;
change it to compute a per-thread join budget derived from that phase budget
divided by max(1, len(self._threads)) (or a sane minimum per-thread like 0.5s)
and use that per-thread timeout when calling thread.join so each thread gets its
own slice of the Phase-1 window (adjust references to self._threads and
join_deadline accordingly in _drain_all_groups).
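A sketch of the per-thread slicing described above; the function name and shape are illustrative, not the actual `_drain_all_groups` code:

```python
import threading

def drain_phase1(threads, timeout):
    """Join each thread with its own slice of the Phase-1 budget."""
    phase_budget = max(1, timeout // 4)  # same overall Phase-1 window as today
    # Divide the window across threads, with a sane 0.5 s floor per thread,
    # so an early slow thread cannot consume the whole shared deadline.
    per_thread = max(0.5, phase_budget / max(1, len(threads)))
    for t in threads:
        t.join(timeout=per_thread)
    return per_thread
```

With `timeout=40` and four threads, each thread gets a 2.5 s join budget instead of competing for one shared 10 s deadline.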

In `@src/eneru/shutdown/filesystems.py`:
- Around line 53-57: The dry-run message prints the raw options string which no
longer matches the actual argv because the real execution uses shlex.split;
update the dry-run branch in the same method that checks
self.config.behavior.dry_run to reconstruct the tokenized form (e.g. split
options with shlex.split and join with spaces or format as separate tokens)
before calling self._log_message so the logged command mirrors the real exec
args (use the same shlex.split logic already used elsewhere on the options
variable).
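A minimal sketch of the idea: build the dry-run message from the same tokens the real execution path produces with shlex.split. The function name and message format here are illustrative:

```python
import shlex

def describe_umount(options, mount_point):
    """Render the dry-run line from the tokenized argv, not the raw string."""
    argv = ["umount"] + shlex.split(options or "") + [mount_point]
    return "[DRY-RUN] Would run: " + " ".join(argv)
```

For example, `describe_umount("-f -l", "/mnt/data")` mirrors the real argv `["umount", "-f", "-l", "/mnt/data"]` rather than echoing the unsplit options string.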

In `@src/eneru/shutdown/remote.py`:
- Around line 93-120: Add an inline noqa to the broad exception handler in
shutdown_server_thread to document the deliberate catch-all and silence the Ruff
BLE001 warning: locate the except Exception as exc: block inside the
shutdown_server_thread function (in src/eneru/shutdown/remote.py) and append a
trailing comment "# noqa: BLE001" to that except line, keeping the existing log
message and behavior intact.
- Around line 232-249: Move the "stop_compose requires 'path'" precondition
check to run before you fetch/format the REMOTE_ACTIONS template: in the loop
where you access REMOTE_ACTIONS[action_name] and call
command_template.format(...), check if action_name == "stop_compose" and ensure
cmd_config.path is present first (the block that logs the warning and continue),
then only construct command_template and call shlex.quote(cmd_config.path) and
format the template; this ensures the validation in the loop for stop_compose is
authoritative and prevents any template formatting with an empty path.

In `@src/eneru/stats.py`:
- Around line 608-619: The function currently returns None when the DB file
doesn't exist but its signature/docstring declare a non-Optional
sqlite3.Connection and use a "# type: ignore" to silence this; update the
function signature return type to Optional[sqlite3.Connection], remove the "#
type: ignore" suppression, and adjust the docstring to say "Returns None if the
file doesn't exist" so callers (e.g., the code that checks if conn is not None)
are correctly type-checked; no change to the urlquote/uri logic (urlquote, conn,
sqlite3.connect, PRAGMA busy_timeout remain the same).

In `@src/eneru/tui.py`:
- Around line 1032-1036: The KEY_HOME handler uses max(0, len(log_events))
unnecessarily; replace the assignment events_scroll = max(0, len(log_events))
with the simpler events_scroll = len(log_events) in the input/key handling block
where KEY_HOME is processed (the same block that sets events_scroll for
KEY_END), since render_logs_panel already clamps the scroll; reference
variables/events_scroll, log_events and the render_logs_panel function to locate
the change.
- Around line 962-972: The events panel is limited to EVENTS_MAX_ROWS_NORMAL
when show_more is False, making arrow/PgUp/Home appear to do nothing; modify the
scroll-key handler (the function that processes ↑/PgUp/Home) to detect the first
scroll attempt when show_more is False, set show_more = True, recompute
events_cap = EVENTS_MAX_ROWS_MORE and re-run query_events_for_display(config,
max_events=events_cap) (or fallback to parse_log_events) before performing the
actual scroll action so the implicit "<M>" behavior is applied on first scroll;
ensure the state change is idempotent so subsequent scrolls behave normally.

In `@src/eneru/utils.py`:
- Around line 8-29: The is_numeric function currently imports math inline; move
the import to the module top for idiomatic style and to avoid the per-call
import overhead: add a single top-level "import math" and remove the inline
"import math" from inside is_numeric so the function continues to use
math.isfinite without repeated imports.
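The resulting shape, sketched with an approximate body (the real is_numeric may differ in detail; only the top-level import placement is the point):

```python
import math  # module-level, not re-imported on every call

def is_numeric(value) -> bool:
    """Return True when `value` coerces to a finite float."""
    try:
        return math.isfinite(float(value))
    except (TypeError, ValueError):
        return False
```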

In `@tests/conftest.py`:
- Around line 245-275: Replace the hardcoded targets in the
patch_run_command_everywhere fixture with a discovery step that scans the
eneru.shutdown package for submodules that expose a run_command symbol: use
pkgutil.iter_modules over eneru.shutdown.__path__ and importlib.import_module to
load each submodule, collect module paths where hasattr(module, "run_command"),
then build the targets list from those module paths (plus any known non-shutdown
locations like eneru.monitor and eneru.multi_ups). After discovery, assert that
the discovered shutdown modules are included in the targets (or fail the test)
so new shutdown submodules that import run_command don’t silently escape
patching; keep the rest of the fixture (creating patchers, starting mocks,
default return_value, yield, stop) unchanged.
- Around line 42-103: The patched StatsConfig.__init__ currently assumes callers
won't pass positional args for db_directory; update patched_cfg_init to accept
(*args, **kw) and only inject db_directory=isolated_str when the caller did not
supply it (i.e. 'db_directory' not in kw and no positional arg was provided for
it). Concretely, change patched_cfg_init to signature patched_cfg_init(self,
*args, **kw), and if 'db_directory' not in kw and len(args) == 0 set
kw['db_directory'] = isolated_str before calling original_cfg_init(self, *args,
**kw); otherwise forward arguments unchanged to original_cfg_init. This
preserves current behavior but avoids misrouting when new dataclass fields are
added before db_directory.
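The discovery step from the first conftest.py item can be sketched as a generic helper; in the fixture it would run against `eneru.shutdown` with symbol `"run_command"`, but the demonstration below uses a stdlib package:

```python
import importlib
import pkgutil

def modules_exposing(package, symbol):
    """Return dotted names of `package` submodules that define `symbol`."""
    found = []
    for info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(f"{package.__name__}.{info.name}")
        if hasattr(module, symbol):
            found.append(module.__name__)
    return found
```

The fixture would then assert that every discovered module is present in its hard-coded targets list, so a new shutdown submodule importing run_command fails the suite instead of silently escaping the patch.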
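The `patched_cfg_init` guard from the second conftest.py item, sketched against a stand-in dataclass (the real StatsConfig has more fields; `isolated_str` stands in for the fixture's tmp directory):

```python
from dataclasses import dataclass

# Stand-in for the real StatsConfig; field order matters for the guard below.
@dataclass
class StatsConfig:
    enabled: bool = True
    db_directory: str = "/var/lib/eneru"

original_cfg_init = StatsConfig.__init__
isolated_str = "/tmp/eneru-test-stats"  # the fixture's tmp_path in practice

def patched_cfg_init(self, *args, **kw):
    # Inject the isolated directory only when the caller supplied neither
    # positional args nor an explicit db_directory keyword; otherwise
    # forward everything unchanged.
    if "db_directory" not in kw and len(args) == 0:
        kw["db_directory"] = isolated_str
    original_cfg_init(self, *args, **kw)

StatsConfig.__init__ = patched_cfg_init
```

This keeps today's behavior while avoiding misrouted positional arguments if a new dataclass field is ever added before db_directory.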

In `@tests/test_monitor_core.py`:
- Around line 320-331:
TestFailsafe.test_connection_lost_while_ob_triggers_shutdown references
TestAdvisoryTriggers._run_one_iteration (defined much later), creating a
confusing cross-class dependency; move the helper to module scope so both tests
can call it. Extract the implementation of
TestAdvisoryTriggers._run_one_iteration into a top-level function (e.g. def
_run_one_iteration(monitor, ups_data_response):) placed near make_monitor,
update both TestAdvisoryTriggers and TestFailsafe to call the module-level
_run_one_iteration, and remove or deprecate the class-level helper to eliminate
the implicit dependency while preserving behavior used in
test_connection_lost_while_ob_triggers_shutdown and other tests.

In `@tests/test_multi_ups.py`:
- Around line 1279-1283: The test was relaxed to use assert
live_thread.join.called because _drain_all_groups now calls Thread.join twice
per thread; update the assertion to enforce the two-phase contract by replacing
assert live_thread.join.called with assert live_thread.join.call_count >= 2 so
the test verifies at least two joins occurred for live_thread (keep the existing
check for the timeout log: any("still running after drain timeout" in m for m in
logs)).

In `@tests/test_tui.py`:
- Around line 109-128: Add a runtime assertion test that patches/wraps the
query_events_for_display function during a short invocation of run_tui to
inspect how it was called: use unittest.mock.patch or monkeypatch to replace
eneru.tui.query_events_for_display with a wrapper that records args/kwargs, call
tui_mod.run_tui(...) with minimal config, then assert the recorded call has no
positional time-range argument (i.e., len(args) == 1 and args[0] is the config)
and that 'max_events' is present in kwargs, and finally restore the original
function; reference run_tui and query_events_for_display when locating where to
add this behavioral check.
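The recording-wrapper idea can be sketched with a spy; the stub below stands in for `eneru.tui.query_events_for_display`, which a real test would replace via mock.patch around a short run_tui invocation:

```python
from unittest import mock

# Stand-in for eneru.tui.query_events_for_display (signature assumed).
def query_events_for_display(config, max_events=None):
    return []

spy = mock.Mock(side_effect=query_events_for_display)

config = {"stats": {}}          # minimal placeholder config
spy(config, max_events=8)       # stands in for the call made inside run_tui

args, kwargs = spy.call_args
# The shape the review wants asserted: no positional time-range argument,
# config is the sole positional arg, max_events arrives by keyword.
assert len(args) == 1 and args[0] is config
assert "max_events" in kwargs
```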

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7db37aa5-3772-4abb-945b-20cc83661dd1

📥 Commits

Reviewing files that changed from the base of the PR and between 3463bc9 and 681248b.

📒 Files selected for processing (49)
  • .github/actions/e2e-setup/action.yml
  • .github/workflows/codeql.yml
  • .github/workflows/e2e.yml
  • .github/workflows/integration.yml
  • .github/workflows/pypi.yml
  • .github/workflows/release.yml
  • .github/workflows/validate.yml
  • docs/changelog.md
  • examples/config-dual-ups.yaml
  • examples/config-enterprise.yaml
  • examples/config-homelab.yaml
  • examples/config-reference.yaml
  • packaging/scripts/postinstall.sh
  • packaging/scripts/postremove.sh
  • packaging/scripts/preremove.sh
  • pyproject.toml
  • pytest.ini
  • src/eneru/actions.py
  • src/eneru/cli.py
  • src/eneru/completion/eneru.bash
  • src/eneru/config.py
  • src/eneru/graph.py
  • src/eneru/health/battery.py
  • src/eneru/health/voltage.py
  • src/eneru/logger.py
  • src/eneru/monitor.py
  • src/eneru/multi_ups.py
  • src/eneru/redundancy.py
  • src/eneru/shutdown/filesystems.py
  • src/eneru/shutdown/remote.py
  • src/eneru/shutdown/vms.py
  • src/eneru/stats.py
  • src/eneru/tui.py
  • src/eneru/utils.py
  • src/eneru/version.py
  • tests/conftest.py
  • tests/e2e/groups/multi-ups.sh
  • tests/e2e/groups/single-ups.sh
  • tests/e2e/scenarios/low-battery.dev
  • tests/e2e/ssh-target/Dockerfile
  • tests/e2e/ssh-target/entrypoint.sh
  • tests/test_cli.py
  • tests/test_monitor_core.py
  • tests/test_multi_ups.py
  • tests/test_redundancy.py
  • tests/test_remote_commands.py
  • tests/test_shutdown_vms.py
  • tests/test_stats.py
  • tests/test_tui.py

Comment on lines 197 to +203
      - name: Install dependencies and package
        run: |
          dnf install -y ${{ matrix.python_pkg }}
-         # Install the .rpm package (ignore dependency on nut-client for CI)
-         rpm -ivh --nodeps eneru-*.rpm || dnf install -y ./eneru-*.rpm --skip-broken || true
+         # Try rpm first (fast, with --nodeps for the missing nut-client).
+         # If that fails for non-dep reasons, fall back to dnf with
+         # --skip-broken. Real install failures still propagate.
+         rpm -ivh --nodeps eneru-*.rpm || dnf install -y ./eneru-*.rpm --skip-broken

⚠️ Potential issue | 🟡 Minor

Minor: dnf install … --skip-broken can still hide a failed eneru install.

Dropping || true is the right call. One residual gap: if the initial rpm -ivh --nodeps fails for a non-dep reason and the fallback runs, dnf --skip-broken will happily exit 0 while silently skipping the eneru-*.rpm itself (e.g., signature/repo issues). The subsequent test -f /opt/ups-monitor/eneru.py in the Verify step does catch this, so it's not a correctness bug today — just worth noting if you ever remove that sanity check.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/integration.yml around lines 197 - 203, The dnf fallback
using "--skip-broken" can hide a failed eneru install when rpm fails, so change
the "Install dependencies and package" step (the rpm -ivh --nodeps || dnf
install -y ./eneru-*.rpm --skip-broken command) to either not use --skip-broken
or to explicitly verify the install succeeded afterward; e.g., run dnf without
--skip-broken (so failure propagates) or keep the fallback but add a check that
the installed artifact exists (test -f /opt/ups-monitor/eneru.py or rpm -q
eneru) and exit non-zero if it’s missing so the job fails fast.

Comment thread src/eneru/config.py
Comment on lines +592 to +605
        # Defaults match NotificationsConfig dataclass.
        notif_suppress: List[str] = []
        notif_voltage_hysteresis = 30

        if 'notifications' in data:
            notif_data = data['notifications']
            notif_title = notif_data.get('title')
            avatar_url = notif_data.get('avatar_url')
            notif_timeout = notif_data.get('timeout', 10)
            notif_retry_interval = notif_data.get('retry_interval', 5)
            notif_suppress = notif_data.get('suppress', notif_suppress)
            notif_voltage_hysteresis = notif_data.get(
                'voltage_hysteresis_seconds', notif_voltage_hysteresis,
            )

⚠️ Potential issue | 🟡 Minor

suppress: null / voltage_hysteresis_seconds: null in YAML bypasses the defaults.

dict.get(key, default) only returns default when the key is absent. If a user writes:

notifications:
  suppress: null        # or: suppress:
  voltage_hysteresis_seconds:

…the key is present with value None, so notif_suppress becomes None and notif_voltage_hysteresis becomes None, both of which then flow straight into NotificationsConfig(...) and violate the declared List[str] / int contract. validate_config does catch the int case (line 935, not isinstance(hys, int)), but suppress=None slips through: a truthiness check like if config.notifications.suppress: treats None as falsy and never validates it, while any later code assuming a list would break.

🛡️ Proposed fix
-            notif_suppress = notif_data.get('suppress', notif_suppress)
-            notif_voltage_hysteresis = notif_data.get(
-                'voltage_hysteresis_seconds', notif_voltage_hysteresis,
-            )
+            if notif_data.get('suppress') is not None:
+                notif_suppress = notif_data['suppress']
+            if notif_data.get('voltage_hysteresis_seconds') is not None:
+                notif_voltage_hysteresis = notif_data['voltage_hysteresis_seconds']
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/config.py` around lines 592 - 605, The code uses
notif_data.get(..., default) which won't protect against explicit nulls; update
the notifications parsing so you only override the defaults when the YAML value
is present and not None — e.g., read tmp = notif_data.get('suppress') and if tmp
is not None then set notif_suppress = tmp (same for notif_voltage_hysteresis),
and ensure any non-None value is of the expected type before assigning (to
satisfy NotificationsConfig and the checks in validate_config); refer to
notif_suppress, notif_voltage_hysteresis, NotificationsConfig and
validate_config to locate and apply this change.

Comment thread src/eneru/graph.py
Comment on lines +126 to +137
        # Auto-bound from the numeric subset only — `data` may carry
        # None or non-numeric placeholders for missing samples; min/max
        # over a mixed list raises TypeError on Python 3.
        if y_min is None or y_max is None:
            numeric = [v for v in data if isinstance(v, (int, float))]
            if not numeric:
                # Nothing to scale against; render an empty grid.
                return [" " * width for _ in range(height)]
            if y_min is None:
                y_min = min(numeric)
            if y_max is None:
                y_max = max(numeric)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

wc -l src/eneru/graph.py

Repository: m4r1k/Eneru

Length of output: 77


🏁 Script executed:

cat -n src/eneru/graph.py | sed -n '120,240p'

Repository: m4r1k/Eneru

Length of output: 5730


🏁 Script executed:

cat -n src/eneru/graph.py | sed -n '100,125p'

Repository: m4r1k/Eneru

Length of output: 1599


🏁 Script executed:

cat -n src/eneru/graph.py | head -20

Repository: m4r1k/Eneru

Length of output: 819


🏁 Script executed:

cat -n src/eneru/graph.py | head -40

Repository: m4r1k/Eneru

Length of output: 1487


Align bounds calculation with plotting's numeric coercion.

Bounds use isinstance(v, (int, float)) but plotting accepts anything float(value) can coerce. This can cause graphs to render blank when data contains numeric strings like "230.5".

Proposed fix
+import math
 import locale
 import os
@@
         if y_min is None or y_max is None:
-            numeric = [v for v in data if isinstance(v, (int, float))]
+            numeric = []
+            for value in data:
+                try:
+                    v = float(value)
+                except (TypeError, ValueError):
+                    continue
+                if math.isfinite(v):
+                    numeric.append(v)
             if not numeric:
                 # Nothing to scale against; render an empty grid.
                 return [" " * width for _ in range(height)]

Also applies to: 167-170

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/graph.py` around lines 126 - 137, The bounds-calculation currently
filters numeric values using isinstance(v, (int, float)), which excludes numeric
strings; change the filter in the y_min/y_max auto-bound logic (the block that
builds numeric = [...]) to accept any value coercible by float() by attempting
float(v) in a try/except and collecting the float-converted values, then compute
min/max from those floats; make the same adjustment in the duplicate bounds
block used elsewhere (the similar numeric-filtering code later in the file) so
plotting and scaling accept numeric strings and other float-coercible inputs.

Comment thread src/eneru/logger.py
@@ -26,8 +26,20 @@ def __init__(self, log_file: Optional[str], config: Config):
        self.logger = logging.getLogger("ups-monitor")

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm whether "ups-monitor" still appears anywhere (tests, systemd units, docs) to
# validate the rebrand scope and whether this file was missed.
rg -nP '\bups-monitor\b' -C2
rg -nP 'getLogger\(' --type=py -C1

Repository: m4r1k/Eneru

Length of output: 29390


Update logger name from "ups-monitor" to "eneru" to complete the syslog rebrand.

The changelog and code comments confirm the syslog identifier was renamed to "eneru" in v5.0 (see docs/changelog.md and the comment in src/eneru/monitor.py line 391), but logging.getLogger("ups-monitor") here was not updated. This inconsistency breaks downstream syslog filtering.

Proposed change
-        self.logger = logging.getLogger("ups-monitor")
+        self.logger = logging.getLogger("eneru")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
self.logger = logging.getLogger("ups-monitor")
self.logger = logging.getLogger("eneru")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/logger.py` at line 26, The logger name is still "ups-monitor" which
breaks the syslog rebrand; update the call to logging.getLogger(...) in the
class where self.logger is assigned (the self.logger =
logging.getLogger("ups-monitor") statement) to use "eneru" instead so the logger
identifier matches the rest of the codebase and docs.

Comment thread src/eneru/version.py
Comment on lines +4 to +5
# Format: "5.1.0" for tagged releases, "5.1.0-5-gabcdef1" for dev builds
__version__ = "5.1.1"

⚠️ Potential issue | 🟡 Minor

Stale example in the format comment.

The comment still uses 5.1.0 as the tagged-release example after bumping __version__ to 5.1.1. Minor, but worth refreshing to keep the example aligned with the current version.

📝 Proposed tweak
-# Format: "5.1.0" for tagged releases, "5.1.0-5-gabcdef1" for dev builds
+# Format: "5.1.1" for tagged releases, "5.1.1-5-gabcdef1" for dev builds
 __version__ = "5.1.1"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/eneru/version.py` around lines 4 - 5, Update the stale format comment to
reflect the current tagged-release example by changing "5.1.0" to "5.1.1" so the
comment above the __version__ assignment matches the bumped value; locate the
comment near the __version__ = "5.1.1" line and update the example string in
that comment accordingly.

Comment thread tests/test_redundancy.py
Comment on lines 363 to +370
ev.start()
time.sleep(0.15)
# F3: assert the evaluator thread survived the exception storm
# BEFORE we signal stop. A thread killed by the unhandled
# RuntimeError would still leave `not ev.is_alive()` True after
# join (a dead thread joins instantly), so the original test
# would silently pass even if the swallowing contract broke.
assert ev.is_alive()

⚠️ Potential issue | 🟡 Minor

Wait until the failing snapshot path is exercised.

ev.is_alive() after a fixed sleep can pass even if the evaluator has not called snapshot() yet. Gate the assertion on bad_monitor.state.snapshot being called first.

Proposed test hardening
         ev.start()
-        time.sleep(0.15)
+        deadline = time.time() + 2
+        while bad_monitor.state.snapshot.call_count == 0 and time.time() < deadline:
+            time.sleep(0.01)
+        assert bad_monitor.state.snapshot.call_count > 0
         # F3: assert the evaluator thread survived the exception storm
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 ev.start()
-time.sleep(0.15)
+deadline = time.time() + 2
+while bad_monitor.state.snapshot.call_count == 0 and time.time() < deadline:
+    time.sleep(0.01)
+assert bad_monitor.state.snapshot.call_count > 0
 # F3: assert the evaluator thread survived the exception storm
 # BEFORE we signal stop. A thread killed by the unhandled
 # RuntimeError would still leave `not ev.is_alive()` True after
 # join (a dead thread joins instantly), so the original test
 # would silently pass even if the swallowing contract broke.
 assert ev.is_alive()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_redundancy.py` around lines 363-370, the test currently checks
ev.is_alive() after a fixed sleep which can race with the evaluator calling
snapshot(); modify the test to wait (with a short timeout) until the failing
snapshot path has actually been exercised by polling the
bad_monitor.state.snapshot flag/call indicator (or equivalent sentinel) before
asserting ev.is_alive(); in practice add a small loop that repeatedly checks
bad_monitor.state.snapshot (or a boolean set by the snapshot implementation) and
breaks on success or fails after timeout, then perform the ev.is_alive()
assertion so the check is gated on snapshot having been invoked.

Comment thread tests/test_shutdown_vms.py
Comment on lines +118 to +127
# Assert the actual virsh command sequence — without this, the test
# passed even when the wrong command (or no command at all) was
# issued during shutdown.
issued = [c.args[0] for c in mock_run.call_args_list]
assert issued[0] == ["virsh", "list", "--name", "--state-running"]
# Per-VM shutdowns are issued in the order virsh list returned them.
assert ["virsh", "shutdown", "vm1"] in issued
assert ["virsh", "shutdown", "vm2"] in issued
# No `destroy` should fire when graceful shutdown completes cleanly.
assert not any(c[:2] == ["virsh", "destroy"] for c in issued)

⚠️ Potential issue | 🟡 Minor

Pin the ordered command sequence, not just membership.

Lines 123-125 say the per-VM shutdowns are ordered, but the `in issued` membership checks would still pass if vm2 was shut down before vm1.

Proposed test tightening
     issued = [c.args[0] for c in mock_run.call_args_list]
-    assert issued[0] == ["virsh", "list", "--name", "--state-running"]
-    # Per-VM shutdowns are issued in the order virsh list returned them.
-    assert ["virsh", "shutdown", "vm1"] in issued
-    assert ["virsh", "shutdown", "vm2"] in issued
+    assert issued[:4] == [
+        ["virsh", "list", "--name", "--state-running"],
+        ["virsh", "shutdown", "vm1"],
+        ["virsh", "shutdown", "vm2"],
+        ["virsh", "list", "--name", "--state-running"],
+    ]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 # Assert the actual virsh command sequence — without this, the test
 # passed even when the wrong command (or no command at all) was
 # issued during shutdown.
 issued = [c.args[0] for c in mock_run.call_args_list]
-assert issued[0] == ["virsh", "list", "--name", "--state-running"]
-# Per-VM shutdowns are issued in the order virsh list returned them.
-assert ["virsh", "shutdown", "vm1"] in issued
-assert ["virsh", "shutdown", "vm2"] in issued
+assert issued[:4] == [
+    ["virsh", "list", "--name", "--state-running"],
+    ["virsh", "shutdown", "vm1"],
+    ["virsh", "shutdown", "vm2"],
+    ["virsh", "list", "--name", "--state-running"],
+]
 # No `destroy` should fire when graceful shutdown completes cleanly.
 assert not any(c[:2] == ["virsh", "destroy"] for c in issued)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_shutdown_vms.py` around lines 118-127, the test currently only
checks membership for per-VM shutdown commands, which doesn't enforce order;
update the assertions after collecting issued = [c.args[0] for c in
mock_run.call_args_list] to assert the shutdown commands occur in the expected
order (e.g., compute indices of ["virsh","shutdown","vm1"] and
["virsh","shutdown","vm2"] in issued and assert index(vm1) < index(vm2) or
assert the slice of issued contains the two shutdown commands in that exact
sequence), leaving the existing checks for the initial "virsh list ..." and the
absence of "virsh destroy" intact.
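The index-based variant the prompt describes can be sketched like this; `issued` here is a hand-rolled stand-in for the argv list the test captures from the mocked run_command, not the real fixture:

```python
# Hypothetical captured argv list, standing in for
# [c.args[0] for c in mock_run.call_args_list] in the test.
issued = [
    ["virsh", "list", "--name", "--state-running"],
    ["virsh", "shutdown", "vm1"],
    ["virsh", "shutdown", "vm2"],
    ["virsh", "list", "--name", "--state-running"],
]

# list.index raises ValueError if a command was never issued,
# so these lines double as membership checks.
i_vm1 = issued.index(["virsh", "shutdown", "vm1"])
i_vm2 = issued.index(["virsh", "shutdown", "vm2"])
assert i_vm1 < i_vm2  # vm1 must be shut down before vm2
```

Compared with the slice assertion in the diff, the index form tolerates extra commands interleaved in the sequence while still pinning the relative order.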

m4r1k and others added 2 commits April 22, 2026 22:57
Bundle the actionable findings from the second AI review pass (cubic.dev
+ CodeRabbit Pro on commits 0c16da6..681248b) plus the codecov-driven
backfill for the F1 voltage-escalation fix and the new redundancy
local_shutdown_callback. Every behaviour change ships with a test.

Cubic.dev — P1 fixes:

- src/eneru/tui.py: enable curses keypad mode (`stdscr.keypad(True)`)
  in run_tui's main loop. Without it the new arrow / page / home
  scroll bindings receive raw escape sequences instead of KEY_*
  constants, and the leading ESC byte (27) hits the quit branch —
  the entire TUI scroll feature was broken on terminals where curses
  doesn't enable keypad by default. Reported by @cubic-dev-ai.
  + tests/test_tui.py: source-inspection guardrail asserting
  stdscr.keypad(True) is present in run_tui.
- src/eneru/shutdown/filesystems.py: wrap shlex.split(options) in
  try/except ValueError so a malformed YAML options string (e.g.
  unclosed quote) skips the offending mount with a logged error
  instead of crashing the entire shutdown sequence. Reported by
  @cubic-dev-ai.
  + tests/test_shutdown_filesystems.py: new test feeds a malformed
  options string and asserts the bad mount is skipped while the
  good one still reaches umount.
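A rough sketch of that guard, assuming a hypothetical split_mount_options helper rather than the actual filesystems.py code:

```python
import shlex

def split_mount_options(options: str):
    """Parse a YAML options string; None signals a malformed string."""
    try:
        return shlex.split(options)
    except ValueError:
        # e.g. an unclosed quote; the real code logs an error here.
        return None

# One bad mount must not abort the whole shutdown sequence.
mounts = {"/mnt/good": "-l -f", "/mnt/bad": "-l 'unclosed"}
unmounted, skipped = [], []
for mount, opts in mounts.items():
    extra = split_mount_options(opts)
    if extra is None:
        skipped.append(mount)
        continue
    unmounted.append(["umount", *extra, mount])

assert skipped == ["/mnt/bad"]
assert unmounted == [["umount", "-l", "-f", "/mnt/good"]]
```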

Cubic.dev — P2 fixes:

- tests/e2e/groups/single-ups.sh: tighten Test 2's exit-code check
  to ONLY accept 124 (SIGTERM from `timeout`); the previous "0 or
  124" let a premature daemon exit during a "monitor normal state"
  check pass silently. Reported by @cubic-dev-ai.
- tests/conftest.py: patch_run_command_everywhere now also patches
  eneru.utils.run_command so command_exists() (which resolves through
  utils at call time) doesn't shell out during shutdown tests.
  Reported by @cubic-dev-ai.
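The multi-namespace problem comes from `from x import f` binding the function into the importing module, so patching only the defining module misses call sites that resolved the name at import time. A self-contained illustration with two stand-in modules (demo_utils / demo_shutdown and this helper are hypothetical; eneru's real helper is patch_run_command_everywhere in tests/conftest.py):

```python
import sys
import types
from contextlib import ExitStack
from unittest import mock

# Stand-in modules: demo_utils defines run_command, demo_shutdown
# re-bound it via `from demo_utils import run_command`.
utils = types.ModuleType("demo_utils")
utils.run_command = lambda argv: ("real", argv)
shutdown = types.ModuleType("demo_shutdown")
shutdown.run_command = utils.run_command  # simulates the from-import
sys.modules.update({"demo_utils": utils, "demo_shutdown": shutdown})

def patch_run_command_everywhere(fake):
    """Patch run_command in every namespace holding its own reference."""
    stack = ExitStack()
    for name in ("demo_utils", "demo_shutdown"):
        stack.enter_context(mock.patch(f"{name}.run_command", fake))
    return stack

with patch_run_command_everywhere(lambda argv: ("fake", argv)):
    assert shutdown.run_command(["true"]) == ("fake", ["true"])
    assert utils.run_command(["true"]) == ("fake", ["true"])
assert utils.run_command(["true"])[0] == "real"  # restored on exit
```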

Cubic.dev — P3 fix:

- examples/config-enterprise.yaml: Slack URL `?footer=no` moved
  before `#operations`; the previous order placed the query string
  inside the URL fragment where Apprise can't parse it. Reported by
  @cubic-dev-ai.

CodeRabbit — actionable fixes:

- src/eneru/shutdown/remote.py: validate stop_compose has a `path`
  BEFORE rendering the REMOTE_ACTIONS template; the old ordering
  formatted with shlex.quote("") first and the precondition warning
  was effectively dead code (any future template change that made
  path mandatory in a way shlex.quote("") couldn't satisfy would
  silently slip through). Reported by @coderabbitai.
  + tests/test_remote_commands.py: new test patches the template's
  .format and asserts it's never invoked when path is missing.
- src/eneru/tui.py: arrow / PgUp / Home auto-promote `show_more=True`
  on first press. The 8-row normal cap matched the visible window,
  so scrolling toward older history was a silent no-op despite the
  always-visible `<↑↓> Scroll` hint. Also drop the redundant
  max(0, len(log_events)) on KEY_HOME (render_logs_panel already
  clamps). Reported by @coderabbitai.
  + tests/test_tui.py: source-inspection guardrail asserting the
  three keys are grouped and show_more flips before scroll math.
- examples/config-homelab.yaml: section comment above the
  remote_servers block rewritten to describe the shutdown_order
  phase model (the old text only described the legacy two-phase
  parallel flag, which is exactly what the NAS block stopped using
  in the F2 fix). Reported by @coderabbitai.
- tests/test_cli.py: test_validate_config_nonexistent_file now uses
  tmp_path instead of a hard-coded /nonexistent/path/config.yaml.
  Reported by @coderabbitai.
- tests/test_monitor_core.py: promote _run_one_iteration to module
  scope so TestFailsafe doesn't reach across 700 lines into
  TestAdvisoryTriggers. Class-level shim kept for the existing
  self._run_one_iteration call sites. Reported by @coderabbitai.
- src/eneru/utils.py: `import math` moved to module top (was inline
  inside is_numeric on the per-poll hot path). Reported by
  @coderabbitai.
- src/eneru/stats.py: open_readonly return type annotated
  Optional[sqlite3.Connection]; drop the `# type: ignore` since the
  declared type now matches the documented `None on missing file`
  behaviour. Reported by @coderabbitai.
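A minimal sketch of an Optional-returning read-only opener in the spirit of the stats.py annotation fix; the function below is a standalone illustration, not the actual eneru implementation:

```python
import sqlite3
from pathlib import Path
from typing import Optional

def open_readonly(db_path: Path) -> Optional[sqlite3.Connection]:
    """Return a read-only connection, or None if the file is missing.

    The Optional annotation surfaces the documented "None on missing
    file" behaviour to type checkers, so call sites need no
    `# type: ignore`.
    """
    if not db_path.exists():
        return None
    # mode=ro via a SQLite URI refuses writes instead of creating files.
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)

assert open_readonly(Path("/no/such/stats.db")) is None
```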

Codecov-driven backfill:

- tests/test_voltage.py: TestVoltageSeverityEscalation covers the
  F1 fix from Commit 2 (severity escalation within the same LOW/HIGH
  state must refresh voltage_pending_severe so the immediate-notify
  bypass fires). Three cases: mild→severe escalation, no double-fire
  after already-notified, HIGH-side symmetry.
- tests/test_redundancy.py: TestLocalShutdownCallback covers the
  Commit 2 redundancy local-poweroff fix. Four cases: callback fires
  on is_local quorum loss, callback NOT invoked for non-local groups,
  callback optional (no crash when unset), callback skipped when
  remote shutdown raises (documented behaviour).

Co-Authored-By: Claude Opus 4.7 <[email protected]>
… maintenance

Picked up the four worth-doing items skipped from the previous AI
review pass; left the four genuinely-not-worth-it ones for the
backlog.

CodeRabbit nitpicks landed:

- .github/workflows/release.yml: add `concurrency: group:
  release-gh-pages, cancel-in-progress: false` so two parallel
  releases (e.g. tag push overlapping with a manual workflow_dispatch)
  don't both rebuild + sign packages and only collide at the final
  non-fast-forward push. Reported by @coderabbitai.
- tests/conftest.py: rewrite `patched_cfg_init` to use *args/**kw
  and only inject the isolated default when the caller didn't supply
  one. Today db_directory is the first dataclass field so positional
  calls work; this hardens the fixture against future StatsConfig
  field reordering. Reported by @coderabbitai.
  + tests/test_stats.py TestIsolateStatsDbDirectoryFixture: covers
  default construction, explicit-keyword override, explicit-positional
  override, and pass-through of unrelated kwargs.
- src/eneru/shutdown/filesystems.py: dry-run log line previously
  printed the raw `options` string (`umount -l -f /mnt`) which
  doesn't match the actual argv after shlex.split. Build the argv
  up-front and render with shlex.quote in the dry-run line so the
  log matches reality. As a side benefit, malformed-options
  detection now also fires in dry-run mode, surfacing config errors
  before a real shutdown attempt. Reported by @coderabbitai.
  + tests/test_shutdown_filesystems.py: two new tests cover the
  argv-rendered dry-run line and the dry-run malformed-options
  detection.
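The build-argv-first shape can be sketched roughly like this (plan_umount and its return convention are illustrative, not the eneru filesystems.py API):

```python
import shlex

def plan_umount(mount_point: str, options: str, dry_run: bool):
    """Build the umount argv once, then either log or return it.

    Building argv before the dry-run branch means the logged line is
    the exact command that would run, and malformed options are
    caught even in dry-run mode.
    """
    try:
        argv = ["umount", *shlex.split(options), mount_point]
    except ValueError:
        return None  # malformed options now surface in dry-run too
    rendered = " ".join(shlex.quote(a) for a in argv)
    if dry_run:
        return f"[DRY RUN] {rendered}"
    return argv  # a real run would exec this argv

# The dry-run line matches the real argv, quoting included.
assert plan_umount("/mnt/my data", "-l -f", dry_run=True) == \
    "[DRY RUN] umount -l -f '/mnt/my data'"
assert plan_umount("/mnt/x", "-l 'oops", dry_run=True) is None
```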

Workflow consistency:

- .github/workflows/{validate,integration,e2e,pypi,release}.yml +
  .github/actions/e2e-setup/action.yml: copy the SHA-pin refresh
  comment from codeql.yml to every workflow with pinned actions, so
  a maintainer reading any one file sees the refresh procedure
  inline. Reported by @coderabbitai.

CLAUDE.md:

- New "GitHub Actions SHA pin maintenance" section documenting that
  pins drift over time, refresh cadence (~every 3 months / on
  upstream security advisory / on major-version bump), the exact
  `gh api` command for both tag-tracked and branch-tracked actions,
  and a snapshot table of the currently-pinned action set so a
  future Claude session has the full picture without grepping
  every workflow.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
@m4r1k m4r1k merged commit bd94bfe into main Apr 22, 2026
37 of 39 checks passed
@m4r1k m4r1k deleted the chore/release-5.1.1 branch April 22, 2026 21:14