PSI metrics emitted as zeroes when PSI is unavailable

What is the problem?

When PSI (Pressure Stall Information) is not available on the host (kernel compiled without `CONFIG_PSI`, or booted with `psi=0`), cAdvisor still emits the PSI Prometheus metrics (`container_pressure_cpu_stalled_seconds_total`, `container_pressure_cpu_waiting_seconds_total`, etc.) with zero values. This is misleading for monitoring and alerting systems, which cannot distinguish "PSI unavailable" from "PSI available but the system is idle."
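For reference, the two "unavailable" cases can be told apart from "idle" by trying to read a pressure file. The sketch below is only an illustration: `psiAvailable` is a hypothetical helper (not a cAdvisor or opencontainers/cgroups function), and it assumes that a disabled PSI shows up either as a missing file or as an `ENOTSUP` read error, which are the same two conditions described for `statPSI()` under "Main cause".

```go
package main

import (
	"errors"
	"os"
	"syscall"
)

// psiAvailable reports whether the pressure file at path can be read.
// Hypothetical helper for illustration; the path is supplied by the caller,
// e.g. a cgroup's cpu.pressure file.
func psiAvailable(path string) (bool, error) {
	_, err := os.ReadFile(path)
	switch {
	case err == nil:
		// The file exists and is readable: PSI is available, even if
		// every counter in it is currently zero.
		return true, nil
	case errors.Is(err, os.ErrNotExist), errors.Is(err, syscall.ENOTSUP):
		// Missing file or "not supported": treat as PSI unavailable,
		// not as an error.
		return false, nil
	default:
		// Anything else (permissions, I/O error) is a real failure.
		return false, err
	}
}
```

The point of the three-way result is that "unavailable" is reported separately from both "available" and "failed", which is exactly the distinction that gets lost further down the pipeline.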


Main cause
`statPSI()` in opencontainers/cgroups (fs2/psi.go) correctly returns nil when PSI files don't exist or the kernel returns `ENOTSUP`. However, the nil signal is lost in cAdvisor's processing:

`setPSIStats()` in container/libcontainer/handler.go receives the nil `*cgroups.PSIStats` and silently does nothing, leaving `info.PSIStats` at its zero value.

The Prometheus collector in metrics/prometheus.go reads those zero values and emits them as real metrics.

Since `CpuStats.PSI`, `DiskIoStats.PSI`, and `MemoryStats.PSI` in info/v1/container.go are value types (not pointers), there is no way for downstream code to distinguish "PSI unavailable" from "PSI value is genuinely zero."
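A minimal reproduction of the lost signal, using simplified stand-in types rather than the real cAdvisor definitions (the field layout and the `statsFor` helper are assumptions made for the example; only the nil-means-do-nothing behavior is taken from the description above):

```go
package main

// Simplified stand-ins for the cAdvisor types; the real structs carry
// more fields. Only the value-vs-pointer shape matters here.
type PSIData struct {
	Total uint64
}

type PSIStats struct {
	Some PSIData
	Full PSIData
}

type CpuStats struct {
	PSI PSIStats // value type: always present, never "absent"
}

// setPSIStats mirrors the behavior described above: a nil source is
// silently ignored, leaving the destination at its zero value.
func setPSIStats(src *PSIStats, dst *PSIStats) {
	if src == nil {
		return
	}
	*dst = *src
}

// statsFor builds the stats a collector would see for a given source.
// Hypothetical helper, present only to make the comparison easy.
func statsFor(src *PSIStats) CpuStats {
	var s CpuStats
	setPSIStats(src, &s.PSI)
	return s
}
```

With these definitions, `statsFor(nil)` (PSI unavailable) and `statsFor(&PSIStats{})` (PSI available, no pressure recorded) compare equal, so nothing downstream can tell the two cases apart.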

Possible approaches

- Change PSI fields to pointer types (`PSI *PSIStats` instead of `PSI PSIStats`) so nil means unavailable. The Prometheus collector can then skip metrics when the pointer is nil. This is a breaking API change.

- Add a `PSISupported` bool field alongside the existing PSI fields. Non-breaking, but adds a redundant field that consumers must remember to check.

- Rely on callers to gate PressureMetrics in includedMetrics before creating the manager. This is what Kubernetes currently does (kubernetes/kubernetes#137326), but it requires every cAdvisor consumer to implement its own PSI detection.
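As a rough illustration of the first approach, the sketch below shows a pointer-typed PSI field and a collector-side function that skips the metrics when the pointer is nil. The types, the `sample` struct, and `cpuPressureSamples` are all hypothetical and much simpler than the real code in info/v1/container.go and metrics/prometheus.go. The mapping of "some" to waiting and "full" to stalled, and the microsecond-to-second conversion, are likewise assumptions made for the example.

```go
package main

// Simplified stand-in; the real PSIStats has a different layout.
type PSIStats struct {
	SomeTotalUsec uint64
	FullTotalUsec uint64
}

type CpuStats struct {
	PSI *PSIStats // nil means PSI is unavailable on this host
}

// sample is a hypothetical name/value pair standing in for a
// Prometheus metric.
type sample struct {
	Name  string
	Value float64
}

// cpuPressureSamples returns no samples at all when PSI is unavailable,
// and real (possibly zero) values when it is available.
func cpuPressureSamples(s CpuStats) []sample {
	if s.PSI == nil {
		return nil // skip emission instead of reporting fake zeroes
	}
	return []sample{
		{"container_pressure_cpu_waiting_seconds_total", float64(s.PSI.SomeTotalUsec) / 1e6},
		{"container_pressure_cpu_stalled_seconds_total", float64(s.PSI.FullTotalUsec) / 1e6},
	}
}
```

Under this shape, an idle host with PSI enabled still exports two zero-valued series, while a host without PSI exports none, so an alert on a missing series becomes possible.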

Related
kubernetes/kubernetes#136333 (original bug report)
kubernetes/kubernetes#137326 (kubelet-side fix)

PSI metrics emitted as zeroes when PSI is unavailable #3852
