|
| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +Guidance for AI coding assistants working in `fluxcd/image-reflector-controller`. Read this file before making changes. |
| 4 | + |
| 5 | +## Contribution workflow for AI agents |
| 6 | + |
| 7 | +These rules come from [`fluxcd/flux2/CONTRIBUTING.md`](https://github.com/fluxcd/flux2/blob/main/CONTRIBUTING.md) and apply to every Flux repository. |
| 8 | + |
| 9 | +- **Do not add `Signed-off-by` or `Co-authored-by` trailers with your agent name.** Only a human can legally certify the DCO. |
| 10 | +- **Disclose AI assistance** with an `Assisted-by` trailer naming your agent and model: |
| 11 | + ```sh |
| 12 | + git commit -s -m "Add support for X" --trailer "Assisted-by: <agent-name>/<model-id>" |
| 13 | + ``` |
| 14 | + The `-s` flag adds the human's `Signed-off-by` from their git config — do not remove it. |
| 15 | +- **Commit message format:** Subject in imperative mood ("Add feature X" instead of "Adding feature X"), capitalized, no trailing period, ≤50 characters. Body wrapped at 72 columns, explaining what and why. No `@mentions` or `#123` issue references in the commit — put those in the PR description. |
| 16 | +- **Trim verbiage:** in PR descriptions, commit messages, and code comments. No marketing prose, no restating the diff, no emojis. |
| 17 | +- **Rebase, don't merge:** Never merge `main` into the feature branch; rebase onto the latest `main` and push with `--force-with-lease`. Squash before merge when asked. |
| 18 | +- **Pre-PR gate:** `make tidy fmt vet && make test` must pass and the working tree must be clean after codegen. Commit regenerated files in the same PR. |
| 19 | +- **Flux is GA:** Backward compatibility is mandatory. Breaking changes to CRD fields, status, CLI flags, metrics, or observable behavior will be rejected. Design additive changes and keep older API versions round-tripping. |
| 20 | +- **Copyright:** All new `.go` files must begin with the boilerplate from `hack/boilerplate.go.txt` (Apache 2.0). Update the year to the current year when copying. |
| 21 | +- **Spec docs:** New features and API changes must be documented in `docs/spec/v1/` — `imagepolicies.md` and `imagerepositories.md`. Update the relevant file in the same PR that introduces the change. |
| 22 | +- **Tests:** New features, improvements and fixes must have test coverage. Add unit tests in `internal/controller/*_test.go` and other `internal/*` packages as appropriate. Follow the existing patterns for test organization, fixtures, and assertions. Run tests locally before pushing. |
| 23 | + |
| 24 | +## Code quality |
| 25 | + |
| 26 | +Before submitting code, review your changes for the following: |
| 27 | + |
| 28 | +- **No secrets in logs or events.** Never surface registry credentials, cloud provider tokens, or pull-secret contents in error messages, conditions, events, or log lines. |
| 29 | +- **No unchecked I/O.** Close HTTP response bodies, file handles, and registry connections in `defer` statements. Check and propagate errors from I/O operations. |
| 30 | +- **No unbounded reads.** Use `io.LimitReader` when reading from network sources. Repositories with very large tag sets (tens of thousands) consume significant memory; respect existing limits. |
| 31 | +- **No direct Badger imports from reconcilers.** Database access goes through the `DatabaseReader`/`DatabaseWriter` interfaces in `internal/controller/database.go`. That boundary is load-bearing for testability. |
| 32 | +- **Registry rate limits.** New scan paths must go through the existing reconcile rate limiter and the `TokenCache`. Do not call `remote.List` in tight loops — public registries (Docker Hub in particular) rate-limit aggressively. |
| 33 | +- **Auth through `AuthOptionsGetter`.** Registry auth resolution lives in `internal/registry/options.go`. Add new auth knobs there, not inline in the reconciler. Cloud provider tokens must use the shared `TokenCache`. |
| 34 | +- **Error handling.** Wrap errors with `%w` for chain inspection. Do not swallow errors silently. Return actionable error messages that help users diagnose the issue without leaking internal state. |
| 35 | +- **Resource cleanup.** Ensure temporary files and directories are cleaned up on all code paths (success and error). Use `defer` and `t.TempDir()` in tests. |
| 36 | +- **Concurrency safety.** Do not introduce shared mutable state without synchronization. Reconcilers run concurrently; per-object work must be isolated. Leader election is required because BadgerDB does not support concurrent access from multiple replicas. |
| 37 | +- **No panics.** Never use `panic` in runtime code paths. Return errors and let the reconciler handle them gracefully. |
| 38 | +- **Minimal surface.** Keep new exported APIs, flags, and environment variables to the minimum needed. The `api/` module is consumed by image-automation-controller — every exported change is a cross-repo contract change. |
| 39 | + |
| 40 | +## Project overview |
| 41 | + |
| 42 | +image-reflector-controller is a component of the [Flux GitOps Toolkit](https://fluxcd.io/flux/components/). It reconciles two CRDs under `image.toolkit.fluxcd.io/v1`: |
| 43 | + |
| 44 | +- `ImageRepository` — scans an OCI image repository at a fixed interval, lists the tags via the registry API, and persists them. |
| 45 | +- `ImagePolicy` — references an `ImageRepository`, filters its tag set (optional regex), and elects a "latest" tag using one of three policies: `semver` (Masterminds/semver ranges), `alphabetical`, or `numerical`. It can also reflect the image digest for the elected tag (`digestReflectionPolicy: Never | IfNotPresent | Always`). |
| 46 | + |
| 47 | +Scan results are stored in an on-disk BadgerDB database at `--storage-path` (default `/data`). The elected latest image ref is written to the `ImagePolicy` status and consumed by image-automation-controller, which patches that ref into Git. This controller never writes to Git. Tag listing uses `github.com/google/go-containerregistry` (`remote.List`). Auth resolves via `spec.secretRef`, `spec.serviceAccountName` (with image pull secrets), `spec.certSecretRef`, `spec.proxySecretRef`, and the `fluxcd/pkg/auth` providers for ECR, GCR/AR, and ACR (including object-level workload identity when the feature gate is enabled). |
| 48 | + |
| 49 | +## Repository layout |
| 50 | + |
| 51 | +- `main.go` — controller entrypoint: flag parsing, BadgerDB open, manager setup, reconciler wiring, feature gates, token cache, leader election. |
| 52 | +- `api/v1/` — CRD Go types (`imagerepository_types.go`, `imagepolicy_types.go`, `condition_types.go`, `groupversion_info.go`). Its own Go module, imported via `replace` from the root `go.mod`; consumed by image-automation-controller. `api/v1beta1/` and `api/v1beta2/` exist for conversion and must not be broken. `zz_generated.deepcopy.go` is generated. |
| 53 | +- `internal/controller/` — `ImageRepositoryReconciler`, `ImagePolicyReconciler`, plus the `DatabaseReader`/`DatabaseWriter` interfaces they depend on. Envtest-based suite in `suite_test.go`. |
| 54 | +- `internal/database/` — BadgerDB-backed tag store (`badger.go`) and a periodic GC runnable (`badger_gc.go`) plugged into the manager. |
| 55 | +- `internal/policy/` — policy evaluation: `semver.go`, `alphabetical.go`, `numerical.go`, `filter.go` (regex pre-filter with replace), `factory.go` (`PolicerFromSpec`), `policer.go` (the `Policer` interface). |
| 56 | +- `internal/registry/` — `AuthOptionsGetter` that turns an `ImageRepository` into go-containerregistry `remote.Option`s (secrets, TLS, proxy, cloud provider auth, token cache). `helper.go` has shared registry helpers. |
| 57 | +- `internal/features/` — feature gate registration (workload identity, secret caching). |
| 58 | +- `internal/test/` — in-process test registry, TLS server, and proxy used by the unit/envtest suite. |
| 59 | +- `config/` — Kustomize manifests: `crd/bases/` (generated CRDs), `manager/`, `rbac/`, `default/`, `samples/`. |
| 60 | +- `docs/spec/` — human-written CRD spec docs. `docs/api/v1/` — generated API reference. |
| 61 | +- `hack/` — codegen boilerplate and api-docs templates. |
| 62 | +- `tests/integration/` — cloud provider e2e tests (AWS/Azure/GCP) using terraform + tftestenv. Separate Go module. |
| 63 | + |
| 64 | +## APIs and CRDs |
| 65 | + |
| 66 | +- Group/version: `image.toolkit.fluxcd.io/v1`. Kinds: `ImageRepository`, `ImagePolicy`. Types in `api/v1/`. |
| 67 | +- CRDs under `config/crd/bases/` and API reference docs under `docs/api/v1/image-reflector.md` are generated — do not hand-edit. Regenerate with `make manifests` and `make api-docs`. |
| 68 | +- The `api` module is imported by image-automation-controller. Any change to exported API types is a cross-repo contract change. Additive only; breaking changes will be rejected. |
| 69 | + |
| 70 | +## Build, test, lint |
| 71 | + |
| 72 | +All targets in the root `Makefile`. Tool versions pinned via `CONTROLLER_GEN_VERSION` and `GEN_API_REF_DOCS_VERSION`. |
| 73 | + |
| 74 | +- `make tidy` — tidy the root, `api/`, and `tests/integration/` modules. |
| 75 | +- `make fmt` / `make vet` — run in root and `api/`. |
| 76 | +- `make generate` — `controller-gen object` against `api/` (deepcopy). |
| 77 | +- `make manifests` — regenerate CRDs and RBAC into `config/crd/bases` and `config/rbac`. |
| 78 | +- `make api-docs` — regenerate `docs/api/v1/image-reflector.md`. |
| 79 | +- `make manager` — `go build -o bin/manager main.go`. |
| 80 | +- `make test` — runs `tidy generate fmt vet manifests api-docs install-envtest`, then `go test ./... -coverprofile cover.out` at the root and in `api/`. Envtest assets are downloaded via `setup-envtest` and exposed through `KUBEBUILDER_ASSETS`. Honors `GO_TEST_ARGS` for extra flags (e.g. `-run`). |
| 81 | +- `make run` — runs the controller against the cluster in `~/.kube/config` using `--storage-path=./data`. |
| 82 | +- `make install` / `make deploy` — apply CRDs / full manager via kustomize. |
| 83 | +- `make docker-build` — buildx image build. Platforms via `BUILD_PLATFORMS`. |
| 84 | + |
| 85 | +Cloud e2e tests live in `tests/integration/` and have their own `Makefile` (`make test-aws`, `make test-azure`, `make test-gcp`). They require real cloud credentials and provision infrastructure via terraform; do not run them by default. |
| 86 | + |
| 87 | +## Codegen and generated files |
| 88 | + |
| 89 | +Check `go.mod` and the `Makefile` for current dependency and tool versions. After changing API types or kubebuilder markers, regenerate and commit the results: |
| 90 | + |
| 91 | +```sh |
| 92 | +make generate manifests api-docs |
| 93 | +``` |
| 94 | + |
| 95 | +Generated files (never hand-edit): |
| 96 | + |
| 97 | +- `api/v1/zz_generated.deepcopy.go` |
| 98 | +- `config/crd/bases/*.yaml` |
| 99 | +- `config/rbac/role.yaml` |
| 100 | +- `docs/api/v1/image-reflector.md` |
| 101 | + |
| 102 | +No load-bearing `replace` directives beyond the standard `api/` local replace. |
| 103 | + |
| 104 | +Bump `fluxcd/pkg/*` modules as a set. Run `make tidy` after any bump. Use `make <target>` so the pinned tool versions are installed rather than invoking `controller-gen` from elsewhere. |
| 105 | + |
| 106 | +## Conventions |
| 107 | + |
| 108 | +- Standard `gofmt`, `go vet`. All exported names need doc comments; non-trivial unexported types and functions should also have them. Match the style of the file you are editing. |
| 109 | +- Reconcilers use `fluxcd/pkg/runtime/patch` + `conditions` for status updates. Do not mutate status in place then `client.Status().Update`; use the patch helper. |
| 110 | +- Events go through `fluxcd/pkg/runtime/events.Recorder`; metrics through `fluxcd/pkg/runtime/metrics` via the shared `helper.Metrics`. |
| 111 | +- Rate limiting on reconciles is wired via `helper.GetRateLimiter` and the `RateLimiterOptions` flags; do not add your own workqueue rate limiter. |
| 112 | +- Honor `spec.provider`, `spec.serviceAccountName`, `spec.secretRef`, `spec.certSecretRef`, `spec.proxySecretRef` when wiring auth options. |
| 113 | +- Cloud provider tokens are cached in a `pkg/cache.TokenCache` keyed per involved object; respect that cache in new auth paths. |
| 114 | +- Tag filtering: the optional `spec.filterTags` regex runs before the policer; the policer only sees tags that matched and were optionally rewritten via the replace pattern. See `internal/policy/filter.go`. |
| 115 | +- Policy selection is dispatched in `internal/policy/factory.go` (`PolicerFromSpec`); add new policy types there and implement the `Policer` interface. |
| 116 | +- The `latestTagsCount` constant in `imagerepository_controller.go` controls how many recent tags are surfaced on `ImageRepository` status — do not change it casually. |
| 117 | + |
| 118 | +## Testing |
| 119 | + |
| 120 | +- Unit and envtest suites live next to the code they test (`internal/controller/*_test.go`, `internal/policy/*_test.go`, `internal/database/*_test.go`, `internal/registry/options_test.go`). |
| 121 | +- The controller suite (`internal/controller/suite_test.go`) boots a controller-runtime envtest environment; `KUBEBUILDER_ASSETS` must point at an installed kube-apiserver/etcd. `make install-envtest` installs them; `make test` wires the variable for you. |
| 122 | +- Registry-touching tests use the in-process helpers in `internal/test/` (plain registry, TLS registry, proxy) backed by `go-containerregistry/pkg/registry`. Prefer these over network calls. |
| 123 | +- Fixture images are created in-memory with `go-containerregistry/pkg/v1/random` and pushed to the test registry. |
| 124 | +- Run a single test: `make test GO_TEST_ARGS='-run TestImagePolicyReconciler_Reconcile'`. |
| 125 | +- `tests/integration/` is a separate module and is **not** exercised by `make test`; do not add unit tests there. |
| 126 | + |
| 127 | +## Gotchas and non-obvious rules |
| 128 | + |
| 129 | +- The Badger database at `--storage-path` is the source of truth for cached tag lists and is assumed to be on a persistent volume in production. Two controller replicas against the same volume is not safe; leader election is enabled for a reason. |
| 130 | +- `--storage-value-log-file-size` (default `1<<28`) caps Badger's mmap'd value log. Effective memory is roughly 2× that. Repositories with very large tag sets (tens of thousands) push this. Do not raise the default casually. |
| 131 | +- `--gc-interval` (minutes) drives the `BadgerGarbageCollector` runnable registered on the manager. `0` disables GC. Tests should not depend on GC running. |
| 132 | +- Public registry rate limits (Docker Hub in particular) are a real source of flakes. |
| 133 | +- Semver ranges use Masterminds/semver v3 semantics (`>=1.0.0 <2.0.0`, `~1.2`, `^1.2`). Not Cargo, not npm. Prereleases are excluded unless the range explicitly allows them. |
| 134 | +- The `filterTags.extract` replace pattern rewrites the tag string the policer sees; the original tag is still what gets published as `latestRef.tag`. Keep that distinction when adding new policies. |
| 135 | +- `DigestReflectionPolicy: Always` is the only mode where `ImagePolicy.spec.interval` is meaningful; the CEL validation rules on the CRD enforce that pairing. Do not loosen them without a very good reason. |
| 136 | +- `v1beta1` and `v1beta2` API packages exist for conversion. Removing fields or kinds from those is a breaking change for existing clusters. |
| 137 | +- Cloud auth feature gate: `ObjectLevelWorkloadIdentity` changes how `spec.serviceAccountName` is resolved for ECR/GCR/ACR. Changes to auth code paths must consider both gate-on and gate-off behavior, and `auth.InconsistentObjectLevelConfiguration`. |
0 commit comments