A Practical Baseline for OpenTelemetry on Kubernetes
By Nicolas Narbais
How to split node-local and cluster-wide OpenTelemetry Collectors on Kubernetes with Helm, explicit OTLP export to a backend, and a small proof path for traces, metrics, and logs.
A useful Kubernetes OTel stack starts with topology, not SDKs: one telemetry-emitting service, one node-local Collector path, one cluster-wide Collector path, and one backend exporter you configure explicitly.
Introduction
The first Kubernetes OpenTelemetry setup usually fails for a boring reason: the Collector is deployed in the wrong shape.
A Kubernetes OTel stack that works starts with ownership: which Collector sees node-local data, which Collector sees cluster-wide data, and which exporter sends it to a backend.
It is tempting to look for one YAML file that collects everything: application traces, application logs, node metrics, pod metrics, cluster metrics, Kubernetes events, and backend export. The official getting-started flow uses a cleaner split: a DaemonSet for node-local and workload-local telemetry, and a Deployment for cluster-wide metrics and events (OpenTelemetry Kubernetes Getting Started).
This article defines a minimal topology baseline. It proves Kubernetes telemetry can be collected, enriched, routed, and exported without collapsing node-local and cluster-wide responsibilities.
The hard parts are deployment shape, metadata, and data routing, even when SDKs are configured correctly. If you get those wrong, a perfectly instrumented service still produces telemetry that is hard to correlate, hard to route, or never exported to a real backend.
This baseline is intentionally small:
- one service that emits real OTLP telemetry
- one DaemonSet Collector path
- one Deployment Collector path
- one OTLP-capable backend
- one trace, one metric, and one log to prove the path works
What Each Collector Is Responsible For
Start with the ownership model.
flowchart LR
A[Sample API Pod] -->|OTLP traces metrics logs| B[Node-local Collector DaemonSet]
C[Container stdout/stderr] --> B
D[Kubelet node pod container metrics] --> B
B -->|OTLP export| G[Backend]
E[Kubernetes API server] -->|cluster metrics events| F[Cluster Collector Deployment]
F -->|OTLP export| G
The DaemonSet Collector is the node-local path. It runs one Collector per node, which makes it a good fit for telemetry that is naturally tied to workloads on that node: service OTLP, container logs, and kubelet-derived node, pod, and container metrics. The official guide describes this DaemonSet as the place for the OTLP receiver, Kubernetes attributes processor, kubeletstats receiver, and filelog receiver (OpenTelemetry Kubernetes Getting Started).
The Deployment Collector is the cluster-wide path. It runs as a normal Deployment and collects data that should not be duplicated per node: cluster metrics and Kubernetes events.
The Collector chart docs make the same split visible in the presets: kubelet and host metrics are recommended with mode=daemonset, while cluster metrics and Kubernetes events are recommended with mode=deployment or a single-replica StatefulSet (OpenTelemetry Collector Chart).
This is also where the Collector mental model matters. A Collector configuration is a set of receivers, processors, exporters, and service pipelines. A component becomes active only when it appears in a pipeline (Collector configuration).
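To make that mental model concrete, here is a minimal hand-written sketch (not chart output): the zipkin receiver is defined but referenced by no pipeline, so the Collector never starts it.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  zipkin: {}
processors:
  batch: {}
exporters:
  debug: {}
service:
  pipelines:
    traces:
      receivers: [otlp]      # zipkin is omitted here, so it stays inactive
      processors: [batch]
      exporters: [debug]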
Install the Node-Local Collector Path
Install the chart once in DaemonSet mode. This is the path your sample service will use for OTLP, and it is also where node-local logs and kubelet metrics belong.
First add the chart repository:
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
kubectl create namespace observability --dry-run=client -o yaml | kubectl apply -f -
Check the available chart versions before you choose the version for your baseline:
helm search repo open-telemetry/opentelemetry-collector --versions | head
export OTEL_COLLECTOR_CHART_VERSION="replace-with-chart-version"
In production, pin the chart version and test rendered config changes before upgrading. The Collector chart is a larger compatibility surface than the sample telemetry generator you use later.
The receiver and processor lists below are representative. Render your chart version first and copy the exact generated component IDs. Do not assume these names survive chart upgrades.
Use a values-node.yaml like this:
fullnameOverride: otel-node
mode: daemonset
image:
  repository: otel/opentelemetry-collector-k8s
extraEnvs:
  - name: BACKEND_TOKEN
    valueFrom:
      secretKeyRef:
        name: otel-backend
        key: token
service:
  enabled: true
presets:
  kubernetesAttributes:
    enabled: true
  kubeletMetrics:
    enabled: true
  logsCollection:
    enabled: true
config:
  exporters:
    otlphttp/backend:
      endpoint: https://otel-backend.example.com
      headers:
        authorization: "Bearer ${env:BACKEND_TOKEN}"
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [memory_limiter, k8sattributes, batch]
        exporters: [otlphttp/backend]
      metrics:
        receivers: [otlp, prometheus, kubeletstats]
        processors: [memory_limiter, k8sattributes, batch]
        exporters: [otlphttp/backend]
      logs:
        receivers: [otlp, filelog]
        processors: [memory_limiter, k8sattributes, batch]
        exporters: [otlphttp/backend]
This values overlay is representative and shows only the pieces this baseline changes. Pipeline overlays are easy to get wrong: when you change a pipeline, explicitly list the receivers, processors, and exporters that should remain active. This example keeps the beginner traces path OTLP-only. Include jaeger or zipkin only if you intentionally want to preserve those chart trace receivers.
Create the backend token Secret before installing:
kubectl create secret generic otel-backend \
--namespace observability \
--from-literal=token="$BACKEND_TOKEN" \
--dry-run=client -o yaml | kubectl apply -f -
Then install it:
helm install otel-node open-telemetry/opentelemetry-collector \
--namespace observability \
--version "$OTEL_COLLECTOR_CHART_VERSION" \
--values values-node.yaml
The important part is explicit export; the specific backend is interchangeable. The Kubernetes getting-started docs call out the caveat directly: the chart sends data to the debug exporter by default, so you have to configure an exporter if you want to use the telemetry somewhere durable (OpenTelemetry Kubernetes Getting Started). The chart docs show the default exporter as debug, which is fine for a smoke test. A real backend needs its own exporter (OpenTelemetry Collector Chart).
Some vendors document an OTLP HTTP base endpoint. Others document signal-specific paths or require custom headers. Use your backend’s OTLP HTTP configuration instead of the placeholder URL.
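As a hedged sketch of that difference, the otlphttp exporter takes either a base endpoint, to which it appends the standard /v1/traces, /v1/metrics, and /v1/logs paths, or signal-specific endpoints when a vendor documents custom paths. The URLs below are placeholders.
exporters:
  otlphttp/backend:
    # Option 1: base endpoint; the exporter appends the standard /v1/<signal> paths.
    endpoint: https://otel-backend.example.com
    # Option 2: signal-specific endpoints, which override the base endpoint per signal.
    # traces_endpoint: https://otel-backend.example.com/custom/traces
    # metrics_endpoint: https://otel-backend.example.com/custom/metrics
    # logs_endpoint: https://otel-backend.example.com/custom/logs
    headers:
      authorization: "Bearer ${env:BACKEND_TOKEN}"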
Debug first is fine for a local demo. A monitoring stack needs a durable backend exporter.
This is where many first installs become demos instead of monitoring pipelines. The Collector pod starts, the chart creates pipelines, and the debug exporter prints telemetry somewhere in the logs. That proves the Collector can receive and process data. It leaves the incident workflow untested: searching the data, attaching it to an alert, and retaining it after the pod restarts.
Put the backend exporter in the first real values file, even if the endpoint is only a staging backend. Keep the debug exporter during bring-up if you want. Make the durable path primary before the first service sends telemetry.
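If you do keep the debug exporter during bring-up, a minimal overlay on the node values above is to list it next to the backend exporter and remove it once the durable path is verified:
config:
  exporters:
    debug: {}
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [memory_limiter, k8sattributes, batch]
        exporters: [otlphttp/backend, debug]   # drop debug after verification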
The same rule applies to authentication. Do not paste a real token into values.yaml. The snippet reads BACKEND_TOKEN from a Kubernetes Secret through extraEnvs. Replace that with your normal secret-management path if you use External Secrets, Vault, sealed secrets, or another mechanism.
Use the debug exporter to prove that the Collector received telemetry. Send durable telemetry to a backend. Be especially careful when collecting container logs: exporting collected logs back to Collector stdout can create confusing feedback loops unless exclusions are configured deliberately.
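One chart-level guard is worth checking here: the logsCollection preset has an option to exclude the Collector's own logs, which is the usual way to avoid that feedback loop. The option name below reflects the chart docs at the time of writing; verify it against your rendered config.
presets:
  logsCollection:
    enabled: true
    # Keep this false so the Collector does not re-ingest its own stdout,
    # which matters if a debug exporter ever prints collected logs.
    includeCollectorLogs: false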
Install the Cluster-Wide Collector Path
Now install the chart a second time in Deployment mode. Keep this Collector at a single replica, because it collects cluster-wide state: duplicate cluster receivers produce duplicate data.
The receiver and processor lists below are representative. Render your chart version first and copy the exact generated component IDs. Do not assume these names survive chart upgrades.
Use values-cluster.yaml:
fullnameOverride: otel-cluster
mode: deployment
replicaCount: 1
image:
  repository: otel/opentelemetry-collector-k8s
extraEnvs:
  - name: BACKEND_TOKEN
    valueFrom:
      secretKeyRef:
        name: otel-backend
        key: token
presets:
  clusterMetrics:
    enabled: true
  kubernetesEvents:
    enabled: true
config:
  exporters:
    otlphttp/backend:
      endpoint: https://otel-backend.example.com
      headers:
        authorization: "Bearer ${env:BACKEND_TOKEN}"
  service:
    pipelines:
      # Include the otlp receiver only if you intentionally want to preserve the
      # chart's default receiver. For a stricter cluster-only path, keep only
      # k8s_cluster and k8sobjects.
      metrics:
        # receivers: [otlp, prometheus, k8s_cluster]
        receivers: [k8s_cluster]
        processors: [memory_limiter, batch]
        exporters: [otlphttp/backend]
      logs:
        # receivers: [otlp, k8sobjects]
        receivers: [k8sobjects]
        processors: [memory_limiter, batch]
        exporters: [otlphttp/backend]
Install it as a separate release:
helm install otel-cluster open-telemetry/opentelemetry-collector \
--namespace observability \
--version "$OTEL_COLLECTOR_CHART_VERSION" \
--values values-cluster.yaml
This second install prevents a common beginner mistake: forcing all Kubernetes monitoring through one Deployment Collector. A Deployment can receive application OTLP if services can reach it, but node-local log files and kubelet-local metrics fit a DaemonSet better, and cluster-wide receivers belong in a single Deployment or single-replica StatefulSet.
A gateway pattern solves a separate problem. A gateway Collector is a central point that receives telemetry from clients or other Collectors and then processes, load-balances, or exports it onward (Gateway deployment pattern). You may add one later, especially when you need centralized routing, tail sampling, or backend isolation. The beginner baseline only needs to separate node-local collection from cluster-wide collection and export both paths somewhere real.
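If you add a gateway later, the node-local side changes very little. A minimal sketch, assuming a gateway Service named otel-gateway (a hypothetical name), swaps the backend exporter for an OTLP exporter that points at the gateway:
config:
  exporters:
    otlp/gateway:
      endpoint: otel-gateway.observability.svc.cluster.local:4317
      tls:
        insecure: true   # assumes plain in-cluster traffic; adjust for your environment
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [memory_limiter, k8sattributes, batch]
        exporters: [otlp/gateway]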
Render Before You Install
Do not rely on values-file intent. Render the chart and inspect the final Kubernetes objects and Collector config before applying them:
helm template otel-node open-telemetry/opentelemetry-collector \
--namespace observability \
--version "$OTEL_COLLECTOR_CHART_VERSION" \
--values values-node.yaml > rendered-node.yaml
helm template otel-cluster open-telemetry/opentelemetry-collector \
--namespace observability \
--version "$OTEL_COLLECTOR_CHART_VERSION" \
--values values-cluster.yaml > rendered-cluster.yaml
Check the generated manifests for the things that usually break first: receiver names, processor names, exporter names, pipeline membership, service names, OTLP ports, environment variables, and RBAC. If you use the umbrella chart, run helm template kubernetes-otel ./kubernetes-otel-stack --namespace observability and do the same inspection there.
These quick checks make the review less abstract:
grep -n "otelcol" -A120 rendered-node.yaml
grep -n "receivers:" -A50 rendered-node.yaml
grep -n "exporters:" -A30 rendered-node.yaml
grep -n "internalTrafficPolicy" -B5 -A5 rendered-node.yaml
For the node path, you should see the otlp receiver in the application pipelines, filelog in the logs pipeline, kubeletstats in the metrics pipeline, k8sattributes in active pipelines, and otlphttp/backend in every durable pipeline. For the cluster path, you should see k8s_cluster in metrics and k8sobjects in logs. You should not see an accidental debug exporter in the durable logs path.
A DaemonSet gives you one Collector pod per node. Client connections through a normal ClusterIP Service can still route to another node. Some chart versions or values configure internalTrafficPolicy: Local for DaemonSet services. Confirm it in the rendered Service manifest. After install, confirm the live Service too:
kubectl get svc otel-node -n observability -o yaml | grep internalTrafficPolicy
You can also inspect the mounted Collector config from the running workloads. The config path can vary by chart version, so use /conf/relay.yaml or the path shown in the rendered manifest:
kubectl exec -n observability daemonset/otel-node -- cat /conf/relay.yaml
kubectl exec -n observability deploy/otel-cluster -- cat /conf/relay.yaml
If strict node-local delivery matters, use a node-local access pattern such as internalTrafficPolicy: Local, host networking, or another environment-appropriate routing mechanism, and validate the routing explicitly.
Send Test Telemetry, Then Test a Real Service
Start with telemetrygen so you can prove the Collector and backend path before debugging an application. Port-forward the node Collector service and send one trace, one metric, and one log through the same OTLP receiver:
# Example pin: replace @latest with a collector-contrib release that matches your environment.
go install github.com/open-telemetry/opentelemetry-collector-contrib/cmd/telemetrygen@latest
# Run the port-forward in a separate terminal; it blocks while active.
kubectl -n observability port-forward svc/otel-node 4317:4317
telemetrygen traces --otlp-insecure --otlp-endpoint localhost:4317 --traces 1
telemetrygen metrics --otlp-insecure --otlp-endpoint localhost:4317 --metrics 1
telemetrygen logs --otlp-insecure --otlp-endpoint localhost:4317 --logs 1
That proves the receiver, pipeline, exporter, and backend path. It leaves Kubernetes workload metadata untested, because telemetry generated from your laptop is not emitted by a pod.
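If you want one more smoke test before deploying a real service, you can run telemetrygen inside the cluster so the node Collector sees a real pod. The image path below is the collector-contrib published telemetrygen image; confirm the image and tag for your environment before relying on it.
kubectl run telemetrygen-smoke --rm -it --restart=Never \
  --image=ghcr.io/open-telemetry/opentelemetry-collector-contrib/telemetrygen:latest \
  -- traces --otlp-insecure \
  --otlp-endpoint otel-node.observability.svc.cluster.local:4317 --traces 1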
After that, test a real in-cluster service that emits OTLP. The app’s business logic is secondary. The image must emit real telemetry. Replace the placeholder image below with your instrumented service or a small sample app you control.
Here is the shape of the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-api
  template:
    metadata:
      labels:
        app: sample-api
    spec:
      containers:
        - name: sample-api
          image: ghcr.io/your-org/your-instrumented-api:replace-me
          ports:
            - containerPort: 8080
          env:
            - name: OTEL_SERVICE_NAME
              value: sample-api
            - name: OTEL_EXPORTER_OTLP_PROTOCOL
              value: http/protobuf
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://otel-node.observability.svc.cluster.local:4318
---
apiVersion: v1
kind: Service
metadata:
  name: sample-api
spec:
  selector:
    app: sample-api
  ports:
    - port: 80
      targetPort: 8080
Apply it:
kubectl apply -f sample-api.yaml
kubectl rollout status deploy/sample-api
kubectl port-forward svc/sample-api 8080:80
curl http://localhost:8080/hello
In a stricter node-local setup, you may expose the DaemonSet receiver through host networking or another node-local address. The official agent deployment pattern describes the same idea more generally: applications send OTLP to a Collector instance running alongside the application or on the same host, such as a DaemonSet (Agent deployment pattern).
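A common shape for that stricter setup, sketched here under the assumption that the DaemonSet exposes the OTLP HTTP port on the node address (for example via a hostPort), resolves the node IP with the downward API instead of using the ClusterIP Service:
env:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: OTEL_EXPORTER_OTLP_PROTOCOL
    value: http/protobuf
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    # Sends to the Collector on the same node; requires the DaemonSet to listen on the node address.
    value: http://$(NODE_IP):4318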
Prove One Trace, One Metric, and One Log Arrived
Do not call the stack done when the pods are Running. Prove the path.
Start in Kubernetes:
kubectl get pods -n observability
kubectl logs -n observability -l app.kubernetes.io/instance=otel-node --tail=50
kubectl logs -n observability -l app.kubernetes.io/instance=otel-cluster --tail=50
Then check the backend:
| Signal | What to look for |
|---|---|
| Trace | A trace or span with service.name=sample-api after the curl /hello request |
| Metric | Node, pod, container, or cluster metric from the Collector presets |
| Log | A log line from the sample API or Kubernetes event logs from the cluster Collector |
If you only see debug logs in the Collector pod and nothing in the backend, the stack is still a demo. Recheck the exporter block and the service pipelines. The chart presets can add receivers and processors. Routing still ends at the exporters you configure.
This is the difference between a toy local demo and a baseline monitoring pipeline.
There is one more practical check: verify metadata. A trace without Kubernetes context is hard to use when the problem is a crash loop, noisy node, or bad rollout. Look for attributes such as namespace, pod name, and node name on the telemetry your backend receives. The Kubernetes attributes preset exists because raw application telemetry is not enough in a cluster. You need workload identity attached while the Collector still has access to Kubernetes metadata.
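As a concrete target for that check, the Kubernetes attributes processor attaches resource attributes along these lines by default. The values shown are hypothetical; the attribute names are what to search for in your backend.
k8s.namespace.name: default
k8s.pod.name: sample-api-6d9c7f9b7d-abcde
k8s.node.name: worker-node-1
k8s.deployment.name: sample-api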
Use this failure checklist when the proof fails:
| Symptom | Likely cause |
|---|---|
| Collector pods run but the backend is empty | Exporter is missing from the pipeline, endpoint is wrong, or auth environment variable is missing |
| Traces arrive without pod metadata | Kubernetes attributes processor is missing or not active in the pipeline |
| Cluster metrics are duplicated | Cluster receiver is running in multiple replicas or in the DaemonSet path |
| App cannot export OTLP | Service name, port, or protocol does not match the receiver endpoint |
| Logs are noisy or recursive | debug exporter and logs collection are misconfigured or exclusions are incomplete |
The Pitfalls to Avoid
The first pitfall is trying to do all Kubernetes monitoring with one Deployment Collector. That shape misses or complicates node-local collection. Logs live on nodes. Kubelet metrics are node-local. A DaemonSet is the natural fit.
The second pitfall is assuming Helm exports somewhere automatically. The chart gives you useful defaults and presets. Default debug export only writes telemetry for local inspection (OpenTelemetry Collector Chart).
The third pitfall is treating the OpenTelemetry Demo as a production pattern. The Kubernetes getting-started page is explicit: the Demo is a good way to see OTel in action. It is not intended as an example of how to monitor Kubernetes itself (OpenTelemetry Kubernetes Getting Started, OpenTelemetry Demo Docs).
The fourth pitfall is forgetting metadata. Without Kubernetes attributes, traces, metrics, and logs are much harder to correlate back to pods, namespaces, and nodes. The Helm chart has a Kubernetes attributes preset, and the getting-started flow puts metadata enrichment in the DaemonSet path for a reason.
What to Build Next
This baseline is the smallest shape that teaches the right instincts.
From here, harden it:
- replace placeholder backend auth with your secret management pattern
- set resource requests and limits for both Collector releases (see the sketch after this list)
- review RBAC generated by presets
- add sampling and filtering policy
- decide which telemetry routes through a central gateway
- create dashboards that prove coverage and routing health
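For the requests-and-limits item above, a minimal values overlay looks like this; the numbers are placeholders to size against your own telemetry volume, not recommendations:
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    memory: 512Mi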
You can also decide whether the node-local Collector should export directly to the backend or forward to a gateway Collector first. Direct export is simpler. A gateway gives you a central place for heavier processing and routing. Introduce that second tier when you have a reason. Keep it separate from the cluster-wide Deployment Collector used for Kubernetes metrics and events. They solve different problems.
Start with the baseline. A Kubernetes OTel stack works when responsibility is clear: node-local telemetry through a DaemonSet, cluster-wide telemetry through a Deployment, metadata attached early, and exporters configured from day one.
That foundation supports the rest of the stack.