
Wrong CPU usage per container in Docker Swarm — per-CPU series cause incorrect values and >100% results #3853

@Ghusn-Mhsen


Environment

  • cAdvisor: 0.56.2
  • Docker: Swarm mode
  • Kernel: 5.4.0-216-generic (cgroup v1)
  • Metrics backend: OpenTelemetry Collector → Uptrace (Prometheus-compatible API)
  • Visualization: Grafana

Problem

When querying container_cpu_usage_seconds_total in a Docker Swarm cluster, cAdvisor exposes one series per CPU core (e.g. cpu="cpu00" through cpu="cpu07" on an 8-core node) instead of a single aggregated series per container.

As a result, the standard PromQL query appears to return values that are N× too high (where N is the number of cores), and in some cases the result exceeds 100%, which should be impossible.

Steps to reproduce

1. Query raw metric in Grafana Explore:

container_cpu_usage_seconds_total{
  host_name="my-node",
  name=~".*my-container.*",
  image!=""
}

2. Observe multiple series, one per CPU core:

{name="syncthing_app.1.xxx", cpu="cpu00"} = 0.155
{name="syncthing_app.1.xxx", cpu="cpu01"} = 0.162
{name="syncthing_app.1.xxx", cpu="cpu02"} = 0.201
... (8 series total for an 8-core node)

3. Apply the commonly documented query:

sum by (name, host_name) (
  rate(container_cpu_usage_seconds_total{
    image!="", name!=""
  }[2m])
)
/ on(host_name) group_left()
max by (host_name) (machine_cpu_cores)
* 100

4. Result shows 47% while docker stats shows 2.65% for the same container.
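
To make it easier to compare against the numbers above, here is a minimal Python sketch of the arithmetic that the query in the steps above is meant to perform: a per-series rate, summed over the cpu label for each container, then divided by the core count and multiplied by 100. The function name cpu_percent, the container name "app", and all sample values are hypothetical and are not taken from the affected cluster. The sketch uses a plain delta over the window, whereas PromQL's rate() also extrapolates to the window boundaries, so the figures are only approximately comparable.

```python
from collections import defaultdict


def cpu_percent(samples_t0, samples_t1, window_seconds, cores):
    """Return {container name: CPU usage as a percentage of `cores` cores}.

    samples_t0 / samples_t1 map (name, cpu) -> cumulative CPU seconds,
    mirroring one container_cpu_usage_seconds_total series per core.
    """
    totals = defaultdict(float)
    for (name, cpu), end in samples_t1.items():
        start = samples_t0.get((name, cpu))
        if start is None or end < start:
            # New series or counter reset: skipped in this simplified sketch.
            continue
        # Per-series rate in CPU seconds per second, summed across cores.
        totals[name] += (end - start) / window_seconds
    return {name: rate / cores * 100 for name, rate in totals.items()}


# Hypothetical counters for one container on two of eight cores, 120 s apart.
t0 = {("app", "cpu00"): 100.0, ("app", "cpu01"): 200.0}
t1 = {("app", "cpu00"): 112.0, ("app", "cpu01"): 236.0}

# 48 CPU seconds over 120 s, spread over 8 cores: roughly 5% of the machine.
result = cpu_percent(t0, t1, window_seconds=120, cores=8)
print(result)
```

Passing cores=1 instead expresses the same usage as a share of a single core, a figure that can legitimately exceed 100% on a multi-core node. Each (name, cpu) pair contributes exactly once here; if the same pair reached the backend more than once, the sum would be inflated accordingly.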

Question

Is this behavior normal for cAdvisor? If so, can someone explain the correct way to calculate CPU usage per container when running Docker Swarm on a multi-core node?
