
feat(Release): 2026-05-12 #58

Merged
ar7casper merged 11 commits into master from release/2026-05-12 on May 13, 2026

Conversation

ar7casper (Collaborator) commented May 12, 2026

Summary

Release bundle for 2026-05-12. Combines seven independently-reviewed PRs into one RC for cleaner master history. Scope is parser depth (TypeScript DI, Express anonymous handlers), dependency UX (auto-reinstall on pyproject.toml change, JS parser bootstrap), CLI consistency (parse default), and a new opt-in LLM reachability stage.

Included PRs

  • fix: default parse --level to reachable to match scan and Python CLI #35
    Brings the Go CLI's parse command into alignment with scan and the Python CLI, both of which already defaulted to reachable. The documentation has always said the default is reachable — PR 35 makes the code match the documented contract.

  • feat: auto-detect dependency changes and reinstall openant #36
    Stores a SHA-256 hash of pyproject.toml at ~/.openant/venv/.deps-hash. Every CLI invocation compares the current hash against the stored one and re-runs pip install -e <core> when they differ. This catches a stale venv after git pull.

  • fix: lazy-install JS parser npm deps on first use #37
    openant parse on a JS/TS repo no longer fails with Cannot find module 'ts-morph'. It auto-runs npm install once on the first JS parse, using node_modules/.package-lock.json as the completion sentinel. A cross-platform file lock prevents concurrent installs from corrupting each other. Closes #6 (JS/TS parser fails on fresh install: missing npm dependencies).

  • feat: DI-aware call resolution with nominal type matching for TypeScript/NestJS #39
    Resolves this.userService.findById() style calls in NestJS / Angular codebases. Covers constructor injection, field-decorator injection (@Inject / @InjectRepository / etc.), and Angular's functional inject() API. Resolution priority: exact type → nominal (implements/extends) → unambiguous prefix. Every step returns null on ambiguity. Class-level metadata is file-qualified as relativePath:className to support multi-module monorepos.

  • feat: auto-detect language in init #40
    openant init now works on non-git directories (commit_sha = "nogit" placeholder). A shared config/languages.json is consumed by both the Go CLI and the Python parser adapter, which eliminates Go↔Python extension-list drift. Language auto-detection is exposed as opt-in via -l auto (an experimental dominance heuristic; see #61, "Validate language auto-detection accuracy before defaulting to it", for the validation work needed before it becomes the default).

  • fix: extract Express.js anonymous route handler callbacks #49
    router.post('/x', auth, async (req, res) => {...}) style handlers are now extracted as units. Synthesized units carry the type route_handler (last callback) or route_middleware (earlier callbacks); both are registered in ENTRY_POINT_TYPES so the reachability filter doesn't drop them. A receiver filter prevents false positives on cache/query-builder .get/.post(...) calls. Named middleware identifiers become call-graph edges, so authenticateToken shows up as an upstream dependency. Closes #21 ([bug] JavaScript parser misses Express.js anonymous route handler callbacks).

  • feat: LLM review stage for enhanced reachability detection #50
    New opt-in --llm-reachability pass that uses Opus to surface entry points the structural analysis misses (framework handlers, plugin/CLI registrations, message queues, external input sites). Semantics are promote-only: the pass never demotes structurally-detected units. The bundle also makes the 5 parsers that previously did not write call_graph.json (C/Ruby/PHP/JS/Go) emit it, so the post-LLM re-filter works across languages.
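
The resolution priority described for #39 (exact type, then nominal, then unambiguous prefix, with null on ambiguity) can be illustrated with a small Python sketch. This is not the parser's code: `ClassInfo`, `resolve_injected_type`, and the reading of "prefix" as "the declared name is a prefix of the class name" are all assumptions made for illustration, as is the choice to stop at the first step that has any candidates.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClassInfo:
    """Hypothetical class metadata, keyed as relativePath:className."""
    key: str    # e.g. "src/users/user.service.ts:UserService"
    name: str   # bare class name
    supertypes: list[str] = field(default_factory=list)  # implements/extends names


def _only(candidates: list[ClassInfo]) -> Optional[ClassInfo]:
    # Exactly one candidate resolves; zero or several counts as ambiguity.
    return candidates[0] if len(candidates) == 1 else None


def resolve_injected_type(declared: str, classes: list[ClassInfo]) -> Optional[ClassInfo]:
    """Resolve a declared injection type: exact, then nominal, then prefix."""
    exact = [c for c in classes if c.name == declared]
    if exact:
        return _only(exact)  # two same-named classes -> None, no fall-through
    nominal = [c for c in classes if declared in c.supertypes]
    if nominal:
        return _only(nominal)
    # Assumed prefix rule: declared name is a prefix of the class name.
    prefix = [c for c in classes if c.name.startswith(declared)]
    return _only(prefix)
```

Returning `None` rather than guessing keeps the call graph free of speculative edges, which matters when the graph later drives reachability filtering.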

Changelog

CHANGELOG.md has a top entry for [2026-05-12] — Parser depth, dependency UX, and LLM reachability (opt-in) covering the user-facing impact of each PR. (Not yet committed — landing in a separate commit before merge.)

Test plan

  • Each constituent PR's test suite passes individually (verified at merge time).
  • Combined CI green on Ubuntu / macOS / Windows.
  • Manual: end-to-end scan across each supported language to confirm DI resolution + Express handler extraction + LLM reachability flow.

Closes #6
Closes #21

ar7casper force-pushed the release/2026-05-12 branch from 21cd56f to 463e9d0 on May 12, 2026 07:35
ar7casper (Collaborator, Author) commented:

closes #6

joshbouncesecurity and others added 8 commits May 13, 2026 09:22
Covers the seven PRs in this release:
- #35: parse --level default → reachable (CLI consistency with scan + Python CLI)
- #36: auto-detect dep changes via ~/.openant/venv/.deps-hash
- #37: lazy JS parser npm bootstrap on first use
- #39: TypeScript/NestJS DI-aware call resolution (constructor + field + functional inject())
- #40: --language auto opt-in for openant init + non-git path support + shared config/languages.json
- #49: Express anonymous route handler extraction (route_handler / route_middleware)
- #50: --llm-reachability opt-in stage + cross-parser call_graph.json contract

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
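
For readers unfamiliar with the hash check behind #36, here is a minimal Python sketch of the idea: hash pyproject.toml, compare with a stored digest, and record the new digest after reinstalling. The function names are hypothetical and the real CLI may structure this differently; only the SHA-256 digest and the hash-file concept come from the PR description.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def deps_changed(pyproject: Path, hash_file: Path) -> bool:
    """True when pyproject.toml differs from the recorded hash, or none is recorded."""
    current = file_sha256(pyproject)
    stored = hash_file.read_text(encoding="utf-8").strip() if hash_file.exists() else ""
    return current != stored


def record_deps_hash(pyproject: Path, hash_file: Path) -> None:
    """Store the current hash; intended to run only after a successful reinstall."""
    hash_file.parent.mkdir(parents=True, exist_ok=True)
    hash_file.write_text(file_sha256(pyproject), encoding="utf-8")
```

Recording the hash only after the reinstall succeeds means a failed install is retried on the next invocation instead of being silently skipped.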
…line.py

When test_pipeline.py is invoked as a subprocess by core/parser_adapter.py
(which is the production path for openant scan/parse), the cwd may not
include the openant-core root on sys.path. The `from utilities.file_io
import ...` line was running before the sys.path.insert(...) that adds the
core root, causing ModuleNotFoundError under any environment that didn't
already have openant-core installed via `pip install -e` (e.g. local dev).

CI passes via the python-tests workflow's `pip install -e .` step, so the
issue was invisible there. Production also works for the same reason. But
local pytest from openant-core/ without PYTHONPATH=. surfaced the bug
(6 cross-parser tests failed with ModuleNotFoundError).

Move the sys.path.insert(...) above the utilities imports so all utilities
imports resolve via the explicit path modification, matching the JS/Go pattern
established in #56. Verified locally: 15 passed / 2 Docker-skipped (was
9 passed / 6 failed) on tests/test_call_graph_output.py without PYTHONPATH.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
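
The ordering issue in the commit above can be shown with a short Python sketch: the path fix has to run before any import that depends on it. `ensure_on_sys_path` is a hypothetical helper name, and the directory layout in the comments is an assumption rather than the repository's actual structure.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(root: Path) -> None:
    """Put root at the front of sys.path, without adding it twice."""
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)


# In a script launched as a subprocess, the call must precede the imports
# that rely on it (hypothetical layout: the script sits one level below root):
#
#   ensure_on_sys_path(Path(__file__).resolve().parent.parent)
#   from utilities.file_io import read_json   # resolves only after the insert
```

An editable install hides the problem because the package is already importable, which is consistent with the commit's note that CI did not surface the failure.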
…stage

The four file I/O sites in the LLM-reachability stage of scan_repository
were using bare `with open(..., encoding="utf-8")` instead of the
read_json / write_json helpers from utilities.file_io introduced in #56.

Functionally equivalent (both go through open_utf8 under the hood) but
inconsistent with the post-#56 convention used elsewhere in the file
(line 30 already imports read_json; the rest of scanner.py at lines 167,
696, 704 already uses it).

Convert all four sites:
  - dataset load (active_dataset_path) → read_json
  - app context load (app_context_path) → read_json
  - signals write (llm_reachability.json) → write_json
  - dataset persist (active_dataset_path) → write_json

Verified locally: 44 passed / 2 Docker-skipped on test_llm_reachability.py
+ test_call_graph_output.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
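
To make the "functionally equivalent" claim in the commit above concrete, here is a sketch of what UTF-8 JSON helpers of this kind typically look like. These are stand-ins written for illustration: the real read_json / write_json in utilities.file_io may have different signatures and options, and the `indent` parameter here is an assumption.

```python
import json
from pathlib import Path
from typing import Any, Union

PathLike = Union[str, Path]


def read_json(path: PathLike) -> Any:
    """Load JSON from a file, always decoding as UTF-8 (illustrative stand-in)."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def write_json(path: PathLike, data: Any, indent: int = 2) -> None:
    """Write JSON to a file as UTF-8, keeping non-ASCII characters readable."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=indent, ensure_ascii=False)
```

Routing every call site through one pair of helpers keeps the encoding decision in a single place, which is the consistency argument the commit makes.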
Per-unit code-blob truncation in the LLM reachability stage was hardcoded
at MAX_CODE_BYTES = 1500. That caps the LLM at ~30-50 lines per unit and
silently drops entry-point indicators past that cutoff in long handlers,
generated code, and class methods where the security-relevant pattern is
embedded mid-body.

Default stays 1500 (no behaviour change for existing users) but power
users can opt into a larger budget via --llm-reachability-max-code-bytes
on `openant scan`. Common values:
  - 1500 (default): cheapest, ~30-50 lines per unit
  - 4096: ~$1-4 extra per scan, fits a full handler body + utility funcs
  - 8192: ~$2-8 extra per scan, edge cases with very long handlers

Surface:
  - Go CLI: --llm-reachability-max-code-bytes int (default 1500),
    forwarded to the Python CLI only when non-default.
  - Python CLI: matching --llm-reachability-max-code-bytes argparse flag,
    threaded into scan_repository(llm_reachability_max_code_bytes=...).
  - core.scanner.scan_repository: new param, passed into
    analyze_reachability(max_code_bytes=...).
  - core.llm_reachability: max_code_bytes parameter chained through
    analyze_reachability → build_prompt → _unit_for_prompt → _trim_code.

Backward compatibility:
  - Module constant MAX_CODE_BYTES = 1500 kept as alias of new
    DEFAULT_MAX_CODE_BYTES so any external caller importing the old name
    still works.
  - All function signatures default the new param, so existing callers
    (including tests) work unchanged.

Test: tests/test_llm_reachability.py adds
test_max_code_bytes_override_keeps_more_context which verifies a
FINAL_MARKER past byte 1500 is dropped at default but preserved at
max_code_bytes=4096.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
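
The byte-budget behaviour described in the commit above can be sketched as follows. `trim_code` is a hypothetical stand-in for the `_trim_code` helper the commit names, and the truncation strategy (cut the UTF-8 bytes, then drop any partial trailing character) is an assumption; only the constant names and the 1500 default come from the commit message.

```python
DEFAULT_MAX_CODE_BYTES = 1500
MAX_CODE_BYTES = DEFAULT_MAX_CODE_BYTES  # old name kept as an alias


def trim_code(code: str, max_code_bytes: int = DEFAULT_MAX_CODE_BYTES) -> str:
    """Truncate code to at most max_code_bytes of UTF-8 without splitting a character."""
    raw = code.encode("utf-8")
    if len(raw) <= max_code_bytes:
        return code
    # errors="ignore" drops an incomplete multi-byte character at the cut point.
    return raw[:max_code_bytes].decode("utf-8", errors="ignore")


# A marker placed past byte 1500 is lost at the default budget but kept at 4096.
long_unit = "x = 1\n" * 300 + "FINAL_MARKER"
print("FINAL_MARKER" in trim_code(long_unit))                       # False
print("FINAL_MARKER" in trim_code(long_unit, max_code_bytes=4096))  # True
```

Because the parameter defaults to the old constant, callers that pass nothing see the same truncation as before.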
ar7casper changed the title from "Release/2026-05-12" to "feat(Release): /2026-05-12" on May 13, 2026
ar7casper changed the title from "feat(Release): /2026-05-12" to "feat(Release): 2026-05-12" on May 13, 2026
ar7casper merged commit 368b559 into master on May 13, 2026
9 checks passed
ar7casper deleted the release/2026-05-12 branch on May 13, 2026 12:47

Development

Successfully merging this pull request may close these issues.

  • [bug] JavaScript parser misses Express.js anonymous route handler callbacks (#21)
  • JS/TS parser fails on fresh install: missing npm dependencies (#6)

3 participants