feat(plugin): install plugins from the GitGuardian platform#1235
feat(plugin): install plugins from the GitGuardian platform#1235benjaminrigaud-gg wants to merge 23 commits into
Conversation
… with compat alias
- Rename GITGUARDIAN_API enum value to PLATFORM with value 'platform'
- Add _missing_ classmethod to accept legacy 'gitguardian_api' value for backward compatibility
- Update all references in ggshield/cmd/plugin/{update,install}.py
- Update all references in ggshield/core/plugin/downloader.py
- Update all test assertions in tests/unit/core/plugin/test_{client,downloader}.py
- Add comprehensive tests for new PLATFORM enum value and backward compat
- Enables support for new manifest format while maintaining legacy manifest compatibility
…load - Reshape PluginDownloadInfo to carry only filename, sha256, version, size_bytes. The upstream URL is no longer exposed to the client: the platform proxies bytes from satori-repository via its own /download endpoint and the client reads sha256/version/size from response headers (X-Plugin-SHA256, X-Plugin-Version, Content-Length). - Drop PluginCatalog.plan/features in favor of per-plugin `available` + `reason` surfaced by the list endpoint. - Add PluginsNotEnabledError so CLI commands can distinguish "feature disabled for this workspace" from a generic 404. - Rewrite PluginAPIClient to call /v1/endpoints/plugins (list/detail/ download/installed) with API-key auth, streaming the wheel body for download. - Refactor download_and_install to accept a byte stream instead of fetching a URL, so the download path is purely "stream bytes → verify sha256 → extract".
Update all three entry points to use the /v1/endpoints/plugins streaming download endpoint and the reshaped client surface: - install: streams the wheel body + SHA256 header, passes the byte stream to download_and_install. - update: same streaming path + calls report_installation on success (best-effort — failures log but don't break the update). - status: surface PluginsNotEnabledError cleanly; drop plan/features columns now that availability is computed per-plugin on the backend.
…ndant try/except - Rename `version` to `resolved_version` in download_plugin to avoid shadowing the function parameter - Add missing `return` after `ctx.exit()` in update.py error handlers to prevent UnboundLocalError on catalog - Remove redundant try/except around report_installation calls — the method itself already swallows all exceptions and logs a warning - Update test to assert report_installation is called rather than testing redundant exception swallowing Refs: END-30
``ggshield plugin list`` showed either ``local`` or ``pip`` for every installed plugin. That split described HOW the loader discovered the plugin on the machine — useful internally, confusing for users who want to know WHERE a plugin came from. Read the install provenance from the manifest (``PluginDownloader.get_plugin_source``) and render it verbatim: ``platform``, ``local file``, ``url``, ``github release``, ``github artifact`` (underscores replaced by spaces). Legacy manifests without a ``source`` field still resolve to ``platform`` via the existing ``_missing_`` compatibility shim, so pre-1.50 installs don't silently re-label as something else. Fall back to ``pip`` when the plugin has no on-disk wheel (entry-point only — same as before) and to ``on-disk`` for the degenerate wheel-without-manifest case (hand-dropped file). Tests switch to a parametrized matrix covering all five PluginSourceType values plus the two fallback paths.
Catalog installs (``download_and_install``) now name the on-disk plugin directory after the wheel's PEP 503-normalised distribution name, the same way local-wheel installs (``install_from_wheel``) already do. Convergent naming means a user who installed the satori wheel from disk and then ``ggshield plugin install machine_scan`` ends up with one plugin directory (``satori-python/``) flipped from ``source.type=local_file`` to ``platform`` in place, instead of two side-by-side directories backed by the same wheel bytes. Distribution name is parsed from ``download_info.filename`` (the catalog response's Content-Disposition, validated by ``_sanitize_wheel_filename`` upstream of this call) — no need to open the streamed bytes as a ZIP. The catalog reference is still passed in as ``plugin_name`` so callers (``cmd/plugin/install.py``, ``update.py``) keep working unchanged: the loader keys discovered plugins by entry-point name, the ``update --check`` pairing keys on catalog reference, and ``_resolve_plugin_dir`` already falls back to a wheel-based entry-point scan when the dir name and reference diverge — so ``plugin uninstall <reference>`` and friends still locate the install correctly.
- ``PluginSourceType.GITGUARDIAN_API`` was removed but tests/code still used the attribute form (raises ``AttributeError``). Add a class-level alias mapping the legacy name to ``PLATFORM`` so attribute access and value-based lookup (via ``_missing_``) both resolve to the same enum member. - Drop ``--force`` arguments from ``ggshield plugin install`` test invocations. The flag was assumed by the test suite but never wired into the CLI; click was returning ``USAGE_ERROR`` for "No such option". - Drop assertions on warning messages (``"not from GitGuardian"``, ``"No SHA256 checksum provided"``) the rewritten install command doesn't emit. Replace with the matching success-output assertion so the test still pins the install path. - ``test_install_url_with_sha256``: update the ``download_from_url.assert_called_once_with`` signature to match the current keyword-arg form (``signature_mode=...``). - ``test_status_shows_signature_label_for_installed_plugin``: drop ``plan`` and ``features`` kwargs from ``PluginCatalog`` — both were removed when the catalog moved to per-plugin ``available`` + ``reason``. - ``test_download_and_install_from_chunks``: mock ``verify_wheel_signature`` so the synthetic-bytes test doesn't trip STRICT signature verification (the real bytes are never a valid sigstore-signed wheel).
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1235 +/- ##
==========================================
+ Coverage 93.56% 93.60% +0.04%
==========================================
Files 179 180 +1
Lines 9307 9459 +152
==========================================
+ Hits 8708 8854 +146
- Misses 599 605 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
``download_and_install`` previously moved the streamed wheel into its final location *before* verifying the sigstore signature, then deleted ``wheel_path`` on STRICT verification failure. For an *update* install that meant a working previously-installed wheel could be removed when the new wheel's signature failed to verify. Verify on the temp path instead: drop the bundle next to ``temp_path``, call ``verify_wheel_signature(temp_path, ...)``, and only swap the final wheel + bundle into place once verification passes. A STRICT failure now leaves the previous install untouched. Temp wheel + temp bundle are cleaned up unconditionally in the finally block.
Older ggshield builds named the plugin directory after the catalog reference (``plugin_name``); current builds name it after the wheel's distribution name. For a user upgrading from one of those builds, ``plugin install <ref>`` lands the new install in ``plugins_dir/<wheel-distribution-name>/`` while ``plugins_dir/<reference>/`` lingers from the previous version. ``_resolve_plugin_dir`` prefers the direct-name match over the wheel-based entry-point scan, so ``status``/``update`` would keep reading the pre-upgrade manifest and the user would see the install "do nothing". After a successful platform install, drop the legacy directory if it exists, is distinct from the new install dir, and looks like a plugin (``manifest.json`` present). Also revoke the trust record keyed on the catalog reference so a future install under that key starts clean.
Codecov flagged this PR's coverage drop (-0.87% project, 77.81% patch) with the bulk of the missed lines in client.py's ``download_signature_bundle`` and downloader.py's new ``_cleanup_legacy_install_dir``. Add unit tests for both: - ``TestDownloadSignatureBundle``: happy-path bytes, foreign-origin rejection (token-leak guard), Content-Length cap rejection, streaming-body cap rejection, network-error wrapping. - ``TestExtractServerDetail``: covers the helper that surfaces DRF ``detail`` strings (non-JSON body, requests exception, missing / non-string detail field). - ``TestCleanupLegacyInstallDir``: removal of the legacy ``plugins_dir/<reference>/`` install when the new wheel-distribution dir is distinct, no-op when same/missing/non-plugin/invalid-name, trust-record revocation after removal, and OSError-during-rmtree swallowed (logged, not raised).
CI's pre-commit job rejected the previous push: - black wanted to reformat ``ggshield/core/plugin/downloader.py`` and ``tests/unit/core/plugin/test_client.py`` (whitespace-only changes). - isort wanted to reorder ``ggshield/cmd/plugin/install.py``. - flake8 F401 flagged a ``PluginDownloadInfo`` import in ``tests/unit/core/plugin/test_client.py`` left over after a previous test refactor — only referenced in a docstring.
Push patch coverage above the project threshold by adding tests for the most-recent-PR additions still unreached by the suite: client.py - ``download_plugin``: missing ``X-Plugin-SHA256`` / ``X-Plugin-Version`` headers, oversize wheel via Content-Length, ``requests`` exception wrapping, and ``--version`` query-param forwarding. - ``_assert_all_https``: HTTP redirect rejection. - ``_sanitize_wheel_filename``: ``..``, embedded NUL, backslash. - ``_iter_with_size_cap``: total bytes overflow mid-stream. downloader.py - ``_wheel_distribution_name``: lowercase + underscore→hyphen normalisation, non-``.whl`` filename and missing version segment rejection. - ``download_and_install`` bundle path: bundle written next to the wheel; ``temp_path`` and temp ``.sigstore`` cleaned up when SHA verification fails; STRICT signature failure on update leaves the previous install (wheel + manifest) untouched. - ``_assert_all_https``: HTTP redirect rejection. - ``get_signature_label``: ``status=valid`` without identity → "signed". Local module coverage: client.py 95% → 99%; downloader.py 91% → 93%; combined 92% → 95%.
- client.py: ``PluginSourceType`` ``_missing_`` fallback for unknown values (returns None → constructor raises ValueError). Brings the module to 100% coverage. - downloader.py: ``_stream_to_file`` empty-chunk skip + size-cap raise + partial-file cleanup in one test. ``download_and_install`` temp bundle file cleaned up in the finally clause when signature verification fails after the bundle was written (the only path that exercises ``temp_bundle_path.exists()`` → ``unlink()``).
- ``_extract_github_repo``: valid URL → ``owner/repo``; ``.git`` suffix stripped; non-github URL → None; path-traversal segments rejected (``..``, ``.``); backslash inside a segment rejected. - ``download_from_github_release``: when ``Path.replace()`` of the ``.tmp`` manifest into place raises, the finally clause unlinks the leftover ``.tmp`` file rather than leaving it on disk.
Brings ggshield/core/plugin/downloader.py to 100% line coverage by exercising: - install_from_wheel co-located bundle copy - download_from_github_artifact: safe_unpack failures, multi-wheel warning, invalid wheel metadata, sigstore bundle copy - uninstall / get_manifest / get_wheel_path / get_installed_signature_label defensive paths (invalid plugin names, missing manifests, malformed JSON, unknown source types) - _is_valid_plugin_name rejection of empty, dot-segments, and null bytes
There was a problem hiding this comment.
A few comments here and there for the actual PR.
More broadly, my general impression (not restricted to this PR) is that the plugin management's logic is hard to follow. There is a lot of duplicated work and not very clear boundaries/separation of concerns, including:
- The
PluginDownloaderis more a "PluginManager" as it also handles installing/uninstalling the plugins and do a lot of the security checks... - ... except for some plugin management logic which is still in the commands
- the different download methods repeat almost identical logic for checksum computation, signature verification and registration/cleanup. There should be a common flow that does all that and only delegate fetching the wheel file to a specific method per possible source type.
- install and update commands repeat the same error management and logging for every kind of download
This PR could be merged (after resolving a few comments) as it doesn't really make things worse in that regard, but I think there is a strong opportunity to refactor this module.
| def _sanitize_wheel_filename(raw: str) -> str: | ||
| """Return a wheel filename safe to use as a single path segment. | ||
|
|
||
| Strips any path components the server may have included and rejects | ||
| values that would resolve outside the plugin directory (``..``, empty | ||
| segment, embedded NUL, trailing ``.whl`` missing). | ||
| """ | ||
| name = PurePosixPath(raw).name | ||
| if not name or name in {".", ".."} or "\x00" in name or "\\" in name: | ||
| raise PluginAPIError(f"Server returned unsafe filename: {raw!r}") | ||
| return name |
There was a problem hiding this comment.
- This doesn't check the ".whl" suffix.
- There is almost the same logic twice in
downloader.py, we should probably make this a util.
There was a problem hiding this comment.
Fixed across fb4a26c + 5606d65 + 4486174 — extracted to wheel_utils.sanitize_wheel_filename with the .whl-suffix check; client.py and the URL-download filename fallback in downloader.download_from_url now use it (the latter via try/except InvalidWheelError so the "plugin.whl" fallback still kicks in for query-string URLs).
| except PluginsNotEnabledError: | ||
| raise |
There was a problem hiding this comment.
I don't get the purpose of these two lines :/
There was a problem hiding this comment.
Fixed in 5606d65 — confirmed dead code. PluginsNotEnabledError(Exception) does not inherit from RequestException, so the bare except RequestException below would never have caught it. Removed.
| """ | ||
| if not filename.endswith(".whl"): | ||
| raise DownloadError(f"Not a wheel filename: {filename!r}") | ||
| stem = filename[: -len(".whl")] |
There was a problem hiding this comment.
nit: stem = filename.removesuffix(".whl") exists :)
| ) | ||
|
|
||
| def get_download_info( | ||
| @contextmanager |
There was a problem hiding this comment.
The API feels awkward: The PluginAPIClient class is only used when downloading from the platform and PluginDownloader.download_and_install() is only used with this context manager. Given how the install/update command works, I suppose we could simply have something like PluginDownloader.download_from_platform() that instantiates a PluginAPIClient to work.
There was a problem hiding this comment.
Agreed; deferring to a follow-up PR. The plan there is to collapse PluginAPIClient into PluginDownloader.download_from_platform() so we have one entry point per source type and a single shared post-download flow. Keeping this PR review-fix-scoped to keep the diff small.
| @@ -100,79 +172,155 @@ def __init__(self) -> None: | |||
| def download_and_install( | |||
There was a problem hiding this comment.
There are inconsistencies within the method names: this one is called "download_and_install", there is "install_from_wheel" (makes sense because no download), but "download_from_github_artifact" and "download_from_github_release" also install, whereas their name suggests they only download.
Also, as stated in the review of client.py, it is really a "download_from_platform" in disguise, because all other methods don't reuse it with bytes downloaded from various sources.
There was a problem hiding this comment.
Agreed; deferring to the same follow-up PR as #4. The intent is to make download_from_* only download and install_* only install, with a single post-download flow (verify → register → cleanup) reused across all sources.
| wheel_files = list(extract_dir.glob("**/*.whl")) | ||
| # Sort to ensure deterministic wheel selection when multiple | ||
| # wheels are shipped in the same artifact — without this, the | ||
| # order depends on filesystem traversal and an attacker-shaped |
There was a problem hiding this comment.
nit: not clear what "attacker-shaped" means here. If we suppose the upstream repository can be compromised or that we downloaded an adversarial wheel, this is most likely game over already.
However, handling the multiple-wheel-per-artifact case is good regardless.
There was a problem hiding this comment.
Fixed in 4486174 — rewrote the comment to spell out the multi-wheel-artifact reasoning. You're right that game-over is game-over once an attacker controls the upstream; the comment now says we just don't want to add ordering non-determinism on top.
Both client.py and downloader.py carried identical copies of _assert_all_https / _is_loopback that differed only in the exception class raised. Move them into a shared module and parametrise the exception via an exc_factory callable so each layer keeps its domain-exception contract. Add an opt-in loopback bypass gated by the GITGUARDIAN_ALLOW_INSECURE_ LOOPBACK=1 environment variable. Off by default (production guarantee unchanged); intended for local dev and QA against on-prem stacks served on http://localhost:3000. Both files still own their wrappers in this commit; the migration to the shared helper lands in follow-up commits.
Adds the .whl-suffix check the existing client.py copy was missing (per PR #1235 review) and gives both client.py and downloader.py a single home for wheel-filename hygiene. Raises InvalidWheelError so callers needn't import a separate exception type. Migration of client.py and downloader.py to the new helper lands in follow-up commits.
Drops the local copies of _assert_all_https / _is_loopback / _sanitize_wheel_filename in favour of the shared helpers. The sanitiser call site translates InvalidWheelError -> PluginAPIError to keep the function's exception contract for callers. Also removes the dead 'except PluginsNotEnabledError: raise' clause. PluginsNotEnabledError inherits from Exception (not RequestException), so the bare 'except RequestException' below would never have caught it; the explicit re-raise was a no-op. The four tests that imported the now-deleted private helpers are removed; equivalent coverage lives in test_http_security.py and test_wheel_utils.py. Addresses comments #1, #2 from PR #1235 review.
Drops the local copies of _assert_all_https / _is_loopback in favour
of the shared helpers (http_security.assert_all_https, parametrised
with InsecureSourceError). The URL-download filename hygiene now
delegates to wheel_utils.sanitize_wheel_filename, keeping its
'plugin.whl' fallback by catching InvalidWheelError.
Drive-by:
- _wheel_distribution_name: filename[:-len('.whl')] -> filename.removesuffix('.whl')
- Rewrite the multi-wheel-artifact ordering comment to spell out the
reasoning the reviewer asked about.
The single test that imported the now-deleted private helper is
removed; equivalent coverage lives in test_http_security.py.
Addresses comments #3, #6 from PR #1235 review (and finishes #1).
isort prefers a one-liner here once the multi-line wraps fit in the line budget. No semantic change.
|
Inline comments resolved as of 4486174. The structural feedback (boundaries between |
Context
This PR is the ggshield side of the GitGuardian-platform plugin install flow. With the matching backend deployed,
ggshield plugin install/update/statusdiscover and pull plugins from the user's GitGuardian instance instead of a hard-coded GitHub release URL.What has been done
Core install flow
PluginAPIClientrewritten to call/v1/endpoints/plugins(list / detail / streaming download / signature) with API-key auth. The client no longer sees the upstream wheel URL: the platform proxies bytes from a GitGuardian-hosted repository and exposesX-Plugin-SHA256,X-Plugin-Version,Content-Length(andX-Plugin-Signature-URL) on the download response.download_and_installaccepts a byte stream rather than fetching a URL — keeps the verify-sha256 / extract pipeline pure.install/update/statuscommands wired to the new client surface.updatecalls the newinstalledendpoint best-effort.statussurfaces a dedicatedPluginsNotEnabledErrorwhen the workspace doesn't have the feature.Install-directory naming
download_and_install(catalog flow) now names the on-disk plugin directory after the wheel's PEP 503-normalised distribution name, the same wayinstall_from_wheel(local-file flow) already does. The catalog reference is still passed in asplugin_nameand stored in the manifest, but the on-disk dir comes from the wheel filename.plugin install <ref>ends up with one plugin directory flipped fromsource.type=local_filetoplatformin place — not two side-by-side directories backed by the same wheel bytes.update --checkpairing keys on entry-point name = catalog reference (independent of the on-disk dir name), so the rename above doesn't change the pairing logic._resolve_plugin_dir's wheel-based entry-point fallback already handles the dir-name / reference divergence forplugin uninstall.Manifest
PluginSourceType.PLATFORM(value"platform") replacesGITGUARDIAN_API. A_missing_classmethod accepts the legacy"gitguardian_api"value so manifests written by pre-1.50 ggshield keep parsing.ggshield plugin listnow reads the install source from the manifest and renders it verbatim (platform,local file,url,github release,github artifact); falls back topipfor entry-point-only plugins andon-diskfor the wheel-without-manifest case.Smaller fixes
download_plugin(version→resolved_version)returnafterctx.exit()inupdate.pyerror handlers (was hittingUnboundLocalError)report_installation(method already swallows + logs).worktrees/added to.gitignoreValidation
Unit and functional tests in
tests/unit/cmd/plugin/andtests/unit/core/plugin/cover the new client, the rewritten command flows, and the source-type compat layer (parametrized matrix across all PluginSourceType values plus the two fallback paths).End-to-end against a live backend: a separate harness drives ggshield as a real user would against a GitGuardian instance with the plugin feature enabled, then asserts on real CLI output and on-disk manifests. 14 scenarios passing as of the latest push:
```
status / install pinned / install latest / list /
status post-install (update available) / update --check / update --all /
re-install / uninstall / unknown-plugin error / unknown-version error /
strict-sig install (sigstore bundle proxied + verified) /
local→local wheel upgrade / local-wheel-then-platform install (in-place)
```
Manual smoke test against a GitGuardian instance with the plugin feature enabled:
PR check list