Stack stuck in StackProcessing after successful update (workspace pod continuously re-runs and leaks secrets) #1163
Description
What happened?
Despite the stack completing successfully (result: succeeded), the Stack CR status remains Ready: False and stays in StackProcessing. This causes the workspace pod to continuously re-run, repeatedly syncing outputs and creating a new *-stack-outputs secret on each cycle.
With resyncFrequencySeconds: 60, this results in a new secret every ~60 seconds, unbounded accumulation of secrets, and excessive API calls.
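As a rough sense of scale, the growth rate can be estimated from the resync interval alone. This is a minimal sketch that assumes exactly one new secret per resync cycle, which is an upper-bound simplification rather than measured behavior; the function name `secrets_per_day` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def secrets_per_day(resync_seconds: int) -> float:
    """Estimate new secrets per day, assuming one secret per resync cycle.

    Assumption: every resync creates exactly one *-stack-outputs secret.
    """
    if resync_seconds <= 0:
        # With periodic resync disabled, this estimate does not apply; the
        # issue reports that something else still triggers reconciliation.
        raise ValueError("resync_seconds must be positive")
    return SECONDS_PER_DAY / resync_seconds


# One secret per 60-second cycle over a full day.
print(secrets_per_day(60))
```

At a 60-second interval, this estimate comes to 1440 secrets per day per stack if nothing cleans them up.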
Setting resyncFrequencySeconds: 0 does not fully resolve the issue: new secrets continue to be created, though at a reduced rate. Something is still re-enqueuing the stack for reconciliation even when periodic resyncs are disabled. We suspect this may be related to workspace pod watch events, but have not confirmed the exact trigger.
Example
apiVersion: auto.pulumi.com/v1alpha1
kind: Stack
metadata:
  name: example-stack
  namespace: example-ns
spec:
  resyncFrequencySeconds: 60
  fluxSource:
    sourceRef:
      type: OCIRepository
      name: example-source
  # ... standard stack config

Observed behavior:
# Stack status shows succeeded but keeps processing
$ kubectl get stacks -n example-ns
NAME AGE STATE
example-stack 2h succeeded
# Secrets accumulate continuously (~1 per resync interval)
$ kubectl get secrets -n example-ns | grep -c stack-outputs
74
# Update records accumulate
$ kubectl get updates -n example-ns | wc -l
1132
# New secrets appear every ~60s even after successful completion
$ kubectl get secrets -n example-ns --sort-by=.metadata.creationTimestamp | grep stack-outputs | tail -5
example-stack-19d081acf7d-stack-outputs Opaque 6 4m38s
example-stack-19d081c2795-stack-outputs Opaque 6 3m14s
example-stack-19d081d6f5d-stack-outputs Opaque 6 102s
example-stack-19d081e49ed-stack-outputs Opaque 6 26s

With resyncFrequencySeconds: 0:
Secrets still appear, though less frequently. The stack is still being re-enqueued by something other than the periodic resync timer.
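To put numbers on "less frequently", the gaps between secret creations can be computed from the output of `kubectl get secrets -n example-ns -o json`. The following is a minimal sketch, not part of the operator: the function name `stack_output_intervals` is hypothetical, and it relies only on the standard `metadata.name` and `metadata.creationTimestamp` fields of a Kubernetes list response.

```python
import json
from datetime import datetime, timezone

# Kubernetes serializes creationTimestamp as RFC 3339 in UTC with a "Z" suffix.
K8S_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def stack_output_intervals(secret_list_json: str,
                           suffix: str = "-stack-outputs") -> list:
    """Return seconds between consecutive creations of matching secrets."""
    items = json.loads(secret_list_json).get("items", [])
    stamps = sorted(
        datetime.strptime(
            item["metadata"]["creationTimestamp"], K8S_TIME_FORMAT
        ).replace(tzinfo=timezone.utc)
        for item in items
        if item["metadata"]["name"].endswith(suffix)
    )
    # Pair each timestamp with its successor and take the difference.
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

Comparing the intervals under resyncFrequencySeconds: 60 and resyncFrequencySeconds: 0 would show how often the non-periodic trigger fires, which may help narrow down what is re-enqueuing the stack.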
Output of pulumi about
- PKO version: 2.4.1
- Kubernetes: 1.33
- Pulumi program source: OCI image
- State backend: DIY
Additional context
Impact
- Unbounded *-stack-outputs secret accumulation (etcd pressure, potential quota issues)
- Excessive state backend API calls
- Unnecessary cloud provider API calls on each pulumi up (even though nothing changes)
What we've tried
| Workaround | Result |
|---|---|
| resyncFrequencySeconds: 60 | New secret every 60s, unbounded growth |
| resyncFrequencySeconds: 0 | Reduced frequency but secrets still appear (something else triggers reconciliation) |
| Manual kubectl annotate to force reconcile | Stack completes but returns to StackProcessing |
| Deleting accumulated secrets | They reappear on next reconcile cycle |
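For anyone who needs to cap accumulation until the root cause is fixed, the cleanup step can be sketched as choosing every matching secret except the newest few. This is only a stopgap illustration: the helper `secrets_to_prune` is hypothetical, it does not stop new secrets from appearing, and whether the operator tolerates deletion of older output secrets is an assumption that has not been verified here.

```python
def secrets_to_prune(names_and_times, keep=1, suffix="-stack-outputs"):
    """Return names of matching secrets to delete, keeping the newest `keep`.

    `names_and_times` is an iterable of (name, creationTimestamp) pairs, with
    timestamps in the Kubernetes "YYYY-MM-DDTHH:MM:SSZ" form. Strings in that
    fixed-width UTC form sort chronologically, so no parsing is needed.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    matching = [(name, ts) for name, ts in names_and_times
                if name.endswith(suffix)]
    # Newest first, so everything after the first `keep` entries is older.
    matching.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in matching[keep:]]
```

The returned names could then be passed to `kubectl delete secret -n example-ns`, ideally after a dry run to confirm that nothing still references them.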
Potentially related
- #1105: Stack processing race condition
- PRs #1155 (Fix workspace watch spuriously aborting in-flight Updates), #1141 (Fix Update controller status write conflicts with SSA), #1147 (Migrate Workspace controller status writes to Server-Side Apply), and #1152 (Fix Stack controller status write conflicts by migrating to Server-Side Apply): partial fixes merged, but the issue persists in v2.4.1
Contributing
Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).