PSI metrics emitted as zeroes when PSI is unavailable #3852

@amritansh1502

What is the problem?

When PSI (Pressure Stall Information) is not available on the host (kernel compiled without CONFIG_PSI, or booted with psi=0), cAdvisor still emits PSI Prometheus metrics (container_pressure_cpu_stalled_seconds_total, container_pressure_cpu_waiting_seconds_total, etc.) with zero values. This is misleading for monitoring and alerting systems, which cannot distinguish "PSI unavailable" from "PSI available but system is idle."

Main cause
statPSI() in opencontainers/cgroups (fs2/psi.go) correctly returns nil when PSI files don't exist or the kernel returns ENOTSUP. However, the nil signal is lost in cAdvisor's processing:

setPSIStats() in container/libcontainer/handler.go receives the nil *cgroups.PSIStats and silently does nothing, leaving the corresponding info.PSIStats at its zero value.

The Prometheus collector in metrics/prometheus.go reads those zero values and emits them as real metrics.

Since CpuStats.PSI, DiskIoStats.PSI, and MemoryStats.PSI in info/v1/container.go are value types (not pointers), downstream code has no way to distinguish "PSI unavailable" from "PSI value is genuinely zero."
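The loss of the nil signal can be reproduced with a minimal sketch. The types and the `collect` helper below are simplified, hypothetical stand-ins, not the actual definitions in info/v1/container.go or handler.go; they only assume what is described above (a value-typed PSI field and a setter that returns early on nil).

```go
package main

import "fmt"

// PSIData and PSIStats are simplified stand-ins (assumption) for the
// cAdvisor types; the real structs carry more fields.
type PSIData struct {
	Total uint64
}

type PSIStats struct {
	Some PSIData
	Full PSIData
}

// CpuStats holds PSI as a value type, as described in the issue.
type CpuStats struct {
	PSI PSIStats
}

// setPSIStats mirrors the described behavior: a nil source is ignored,
// so the destination keeps its zero value.
func setPSIStats(src *PSIStats, dst *PSIStats) {
	if src == nil {
		return
	}
	*dst = *src
}

// collect is a hypothetical helper that builds CpuStats from a source
// that may be nil (PSI unavailable) or non-nil (PSI available).
func collect(src *PSIStats) CpuStats {
	var stats CpuStats
	setPSIStats(src, &stats.PSI)
	return stats
}

func main() {
	unavailable := collect(nil)  // host without PSI
	idle := collect(&PSIStats{}) // host with PSI, no stalls recorded

	// Both results are identical, so a collector cannot tell them apart.
	fmt.Println(unavailable == idle)
}
```

In this sketch the two cases compare equal, which is the ambiguity the Prometheus collector inherits.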

Possible approaches

  • Change PSI fields to pointer types (PSI *PSIStats instead of PSI PSIStats) so nil means unavailable. The Prometheus collector can then skip metrics when the pointer is nil. This is a breaking API change.

  • Add a PSISupported bool field alongside the existing PSI fields. Non-breaking, but adds a redundant field that consumers must remember to check.

  • Rely on callers to gate PressureMetrics in includedMetrics before creating the manager. This is what Kubernetes currently does (kubernetes/kubernetes#137326, "Fix zero PSI metrics emitted when OS doesn't enable PSI"), but it requires every cAdvisor consumer to implement its own PSI detection.
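As a rough illustration of the first approach, the sketch below uses a pointer field and a collector that skips PSI metrics when the pointer is nil. Everything here is an assumption made for illustration: the struct shapes are simplified, `psiMetricLines` is a hypothetical stand-in for the collector in metrics/prometheus.go, and the mapping of Some/Full to the waiting/stalled metric names and the microsecond unit of Total are assumed rather than taken from the cAdvisor source.

```go
package main

import "fmt"

// Simplified stand-in types (assumption), not the real cAdvisor structs.
type PSIData struct {
	Total uint64 // assumed to be microseconds
}

type PSIStats struct {
	Some PSIData
	Full PSIData
}

// CpuStats with a pointer field: nil means PSI is unavailable.
type CpuStats struct {
	PSI *PSIStats
}

// psiMetricLines is a hypothetical collector stand-in. It returns no
// lines when PSI is unavailable instead of emitting zeroes.
func psiMetricLines(s CpuStats) []string {
	if s.PSI == nil {
		return nil
	}
	return []string{
		// Assumed mapping: Some -> waiting, Full -> stalled.
		fmt.Sprintf("container_pressure_cpu_waiting_seconds_total %g",
			float64(s.PSI.Some.Total)/1e6),
		fmt.Sprintf("container_pressure_cpu_stalled_seconds_total %g",
			float64(s.PSI.Full.Total)/1e6),
	}
}

func main() {
	// PSI unavailable: nothing is emitted.
	fmt.Println(len(psiMetricLines(CpuStats{})))

	// PSI available but idle: zero-valued metrics are emitted.
	for _, line := range psiMetricLines(CpuStats{PSI: &PSIStats{}}) {
		fmt.Println(line)
	}
}
```

With this shape, "unavailable" produces no series at all, while "available but idle" still produces explicit zeroes. The cost is the one noted above: existing consumers that read the PSI field as a value would need to be updated.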

Related
kubernetes/kubernetes#136333 (original bug report)
kubernetes/kubernetes#137326 (kubelet-side fix)
