1 change: 1 addition & 0 deletions config_examples/README.md
@@ -29,6 +29,7 @@ Dynatrace distribution of the OpenTelemetry Collector.
- [Redaction Processor](redaction.yaml)
- [Host Metrics Receiver](host-metrics.yaml)
- [Dynatrace Resource Detector](resource-detection.yaml)
- [Large Scale Prometheus Scraping](./prometheus-large-scale)

## Sending data to Dynatrace

26 changes: 26 additions & 0 deletions config_examples/prometheus-large-scale/README.md
@@ -0,0 +1,26 @@
# Prometheus Large-Scale

Tiered OTel Collector setup for scraping Prometheus targets at scale and shipping to Dynatrace.

## Architecture

- **Tier 1 — Scraper** (`tier1-scraper.values.yaml`): scrapes targets assigned by Target Allocator, load-balances OTLP to tier 2.
- **Tier 2 — Gateway** (`tier2-gateway.values.yaml`): enriches metrics, exports to Dynatrace.
- **Target Allocator** (`allocator.values.yaml`): distributes scrape targets across tier 1 replicas (consistent-hashing).
- **Selfmon Scraper** (`selfmon-scraper.yaml`): scrapes collector/allocator self-metrics and sends them directly to Dynatrace.
- **ScrapeConfig** (`scrapeconfig.yaml`): example Prometheus Operator `ScrapeConfig` CR consumed by the Target Allocator.
- **RBAC** (`rbac.yaml`): ServiceAccounts + roles for scraper, gateway, sink, allocator.

## Deploy

Set `NAMESPACE` (the manifests reference it as `${NAMESPACE}`), apply the RBAC and ScrapeConfig manifests, then install the Helm charts:

```sh
kubectl apply -f rbac.yaml
kubectl apply -f scrapeconfig.yaml

helm install otel-allocator open-telemetry/opentelemetry-target-allocator -f allocator.values.yaml
helm install otel-scraper open-telemetry/opentelemetry-collector -f tier1-scraper.values.yaml
helm install otel-gateway open-telemetry/opentelemetry-collector -f tier2-gateway.values.yaml
helm install otel-selfmon open-telemetry/opentelemetry-collector -f selfmon-scraper.yaml
```
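
`kubectl apply` does not expand shell variables inside manifests, so the `${NAMESPACE}` placeholder in `rbac.yaml` and `scrapeconfig.yaml` needs to be rendered before the files are applied. A minimal sketch, assuming a POSIX shell with `sed`; the `render_manifest` helper and the `monitoring` namespace are hypothetical and not part of this example set:

```sh
#!/bin/sh
# Hypothetical helper (not shipped with these examples): reads a manifest on
# stdin and replaces every literal ${NAMESPACE} with the value of the
# NAMESPACE environment variable. Aborts if NAMESPACE is unset or empty.
render_manifest() {
  sed "s|\${NAMESPACE}|${NAMESPACE:?NAMESPACE must be set}|g"
}

# Usage (assumes kubectl is configured for the target cluster):
#   export NAMESPACE=monitoring
#   render_manifest < rbac.yaml | kubectl apply -f -
#   render_manifest < scrapeconfig.yaml | kubectl apply -f -
```

Piping the rendered output to `kubectl apply -f -` leaves the files in the repository untouched, so the same manifests can be reused for other namespaces.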
40 changes: 40 additions & 0 deletions config_examples/prometheus-large-scale/allocator.values.yaml
@@ -0,0 +1,40 @@
nameOverride: ""
fullnameOverride: "tiered-allocator"

replicaCount: 1

targetAllocator:
  podAnnotations:
    metrics.dynatrace.com/scrape: "true"
    metrics.dynatrace.com/port: "8080"
  image:
    repository: ghcr.io/open-telemetry/opentelemetry-operator/target-allocator
    tag: "0.150.0"
  serviceAccount:
    create: false
    name: "tiered-otel-allocator"
  service:
    port: 8080
  config:
    allocation_strategy: consistent-hashing
    collector_namespace: ${NAMESPACE}
    collector_selector:
      matchlabels:
        app.kubernetes.io/name: opentelemetry-collector
        app.kubernetes.io/instance: otel-scraper
    prometheus_cr:
      enabled: true
      scrapeInterval: 60s
      service_monitor_selector:
        prometheus.dynatrace.com: "true"
      pod_monitor_selector:
        prometheus.dynatrace.com: "true"
      scrape_config_selector:
        prometheus.dynatrace.com: "true"

resources:
  limits:
    memory: 200Mi
  requests:
    cpu: 10m
    memory: 150Mi
208 changes: 208 additions & 0 deletions config_examples/prometheus-large-scale/rbac.yaml
@@ -0,0 +1,208 @@
apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
  name: tiered-otel-scraper
  namespace: ${NAMESPACE}
---
apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
  name: tiered-otel-gateway
  namespace: ${NAMESPACE}
---
apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
  name: tiered-otel-sink
  namespace: ${NAMESPACE}
---
apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
  name: tiered-otel-allocator
  namespace: ${NAMESPACE}
---
# Scraper (tier 1): k8s resolver for loadbalancing exporter
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tiered-otel-scraper
rules:
  - apiGroups: [""]
    resources:
      - pods
      - endpoints
      - services
    verbs: [get, list, watch]
  - apiGroups: [discovery.k8s.io]
    resources:
      - endpointslices
    verbs: [get, list, watch]
---
# Gateway (tier 2): k8s_attributes processor
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tiered-otel-gateway
rules:
  - apiGroups: [""]
    resources:
      - pods
      - namespaces
      - nodes
    verbs: [get, watch, list]
  - apiGroups: [apps]
    resources:
      - replicasets
    verbs: [get, watch, list]
  - apiGroups: [batch]
    resources:
      - jobs
      - cronjobs
    verbs: [get, watch, list]
---
# Sink (tier 3): k8s_attributes processor
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tiered-otel-sink
rules:
  - apiGroups: [""]
    resources:
      - pods
      - namespaces
      - nodes
    verbs: [get, watch, list]
  - apiGroups: [apps]
    resources:
      - replicasets
    verbs: [get, watch, list]
  - apiGroups: [batch]
    resources:
      - jobs
      - cronjobs
    verbs: [get, watch, list]
---
# Allocator: service discovery + Prometheus CR access
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tiered-otel-allocator
rules:
  - apiGroups: [""]
    resources:
      - pods
      - endpoints
      - services
      - nodes
      - nodes/metrics
    verbs: [get, watch, list]
  - apiGroups: [discovery.k8s.io]
    resources:
      - endpointslices
    verbs: [get, watch, list]
  - apiGroups: [""]
    resources:
      - configmaps
    verbs: [get]
  - apiGroups: [networking.k8s.io]
    resources:
      - ingresses
    verbs: [get, list, watch]
  - nonResourceURLs: ["/metrics"]
    verbs: [get]
  - apiGroups: [monitoring.coreos.com]
    resources:
      - servicemonitors
      - podmonitors
      - scrapeconfigs
      - probes
    verbs: ["*"]
  - apiGroups: [""]
    resources:
      - namespaces
    verbs: [get, list, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiered-otel-scraper
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tiered-otel-scraper
subjects:
  - kind: ServiceAccount
    name: tiered-otel-scraper
    namespace: ${NAMESPACE}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiered-otel-gateway
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tiered-otel-gateway
subjects:
  - kind: ServiceAccount
    name: tiered-otel-gateway
    namespace: ${NAMESPACE}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiered-otel-sink
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tiered-otel-sink
subjects:
  - kind: ServiceAccount
    name: tiered-otel-sink
    namespace: ${NAMESPACE}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiered-otel-allocator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tiered-otel-allocator
subjects:
  - kind: ServiceAccount
    name: tiered-otel-allocator
    namespace: ${NAMESPACE}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tiered-otel-sink
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces", "nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["replicasets", "deployments", "statefulsets", "daemonsets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiered-otel-sink
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tiered-otel-sink
subjects:
  - kind: ServiceAccount
    name: tiered-otel-sink
    namespace: ${NAMESPACE}
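
After the manifests above are applied, the bindings can be spot-checked with `kubectl auth can-i` and ServiceAccount impersonation. A rough sketch, assuming the caller is allowed to impersonate ServiceAccounts; the `sa_subject` helper is hypothetical and only builds the `--as` user name:

```sh
#!/bin/sh
# Hypothetical helper: prints the user name Kubernetes assigns to a
# ServiceAccount, in the form system:serviceaccount:<namespace>:<name>.
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Usage (assumes kubectl is configured and NAMESPACE is exported):
#   kubectl auth can-i list endpointslices.discovery.k8s.io \
#     --as="$(sa_subject "$NAMESPACE" tiered-otel-scraper)"
#   kubectl auth can-i watch scrapeconfigs.monitoring.coreos.com \
#     --as="$(sa_subject "$NAMESPACE" tiered-otel-allocator)"
```

Each command prints `yes` or `no`, which makes a missing rule easier to find than reading collector or allocator logs.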
52 changes: 52 additions & 0 deletions config_examples/prometheus-large-scale/scrapeconfig.yaml
@@ -0,0 +1,52 @@
---
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: dynatrace-com
  namespace: ${NAMESPACE}
  labels:
    prometheus.dynatrace.com: "true"
spec:
  jobName: dynatrace-com
  scrapeInterval: 60s
  sampleLimit: 5000000
  labelLimit: 50
  labelNameLengthLimit: 100
  labelValueLengthLimit: 1000
  kubernetesSDConfigs:
    - role: Pod
      namespaces:
        names:
          - avalanche
  relabelings:
    - sourceLabels:
        - __meta_kubernetes_pod_annotation_metrics_dynatrace_com_scrape
        - __meta_kubernetes_pod_annotationpresent_metrics_dynatrace_com_scrape
      action: keep
      regex: true;true
    - sourceLabels:
        - __meta_kubernetes_pod_annotation_metrics_dynatrace_com_secure
        - __meta_kubernetes_pod_annotationpresent_metrics_dynatrace_com_secure
      action: replace
      regex: true;true
      targetLabel: __scheme__
      replacement: https
    - sourceLabels:
        - __address__
        - __meta_kubernetes_pod_annotation_metrics_dynatrace_com_port
        - __meta_kubernetes_pod_annotationpresent_metrics_dynatrace_com_port
      action: replace
      regex: (.+?)(?::\d+)?;(\d+);true
      targetLabel: __address__
      replacement: $1:$2
    - sourceLabels:
        - __meta_kubernetes_pod_annotation_metrics_dynatrace_com_path
        - __meta_kubernetes_pod_annotationpresent_metrics_dynatrace_com_path
      action: replace
      regex: (.+);true
      targetLabel: __metrics_path__
      replacement: $1
    - sourceLabels:
        - __meta_kubernetes_pod_phase
      action: drop
      regex: (Failed|Succeeded)
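
The `__address__` rule above swaps the discovered port for the one given in the `metrics.dynatrace.com/port` annotation, and leaves the address alone when the annotation is missing. The sketch below approximates that behaviour in plain shell for ordinary `host` or `host:port` inputs. It is an illustration only, not how Prometheus relabeling is implemented, and the `rewrite_address` function is hypothetical:

```sh
#!/bin/sh
# Hypothetical illustration of the __address__ relabel rule.
# Arguments: <address> <port annotation value> <annotation present flag>
rewrite_address() {
  address=$1
  port=$2
  present=$3

  # The rule only matches when the annotation value is all digits ...
  case $port in
    ''|*[!0-9]*) printf '%s\n' "$address"; return ;;
  esac
  # ... and the annotationpresent label equals "true".
  if [ "$present" != "true" ]; then
    printf '%s\n' "$address"
    return
  fi

  # Drop a trailing ":<digits>" from the address, if there is one.
  host=$address
  case $address in
    *:*)
      suffix=${address##*:}
      case $suffix in
        ''|*[!0-9]*) ;;               # not a numeric port, keep as is
        *) host=${address%:*} ;;
      esac
      ;;
  esac

  printf '%s:%s\n' "$host" "$port"
}

# Usage:
#   rewrite_address 10.1.2.3:8080 9001 true   # annotation port replaces 8080
#   rewrite_address 10.1.2.3:8080 "" ""       # no annotation, address unchanged
```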