This tutorial step covers the basic usage of the OpenTelemetry Collector on Kubernetes and how to reduce costs using sampling techniques.
In chapter 3 we saw the schematic structure of the dice game application. The following diagram illustrates how the telemetry data collected there is exported and stored.
Sampling refers to the practice of selectively capturing and recording traces of requests flowing through a distributed system, rather than capturing every single request. It is crucial in distributed tracing systems because modern distributed applications often generate a massive volume of requests and transactions, which can overwhelm the tracing infrastructure or lead to excessive storage costs if every request is traced in detail.
For example, a medium-sized setup producing ~1M traces per minute can result in a cost of approximately $250,000 per month. (Note that this depends on your infrastructure costs, the SaaS provider you choose, the amount of metadata, etc.) You may want to check some service costs to get a better idea.
Pricing:
- AWS X-Ray (calculator)
- GCP Cloud Trace (pricing)
GCP

| Feature | Price | Free allotment per month | Effective date |
| --- | --- | --- | --- |
| Trace ingestion | $0.20/million spans | First 2.5 million spans | November 1, 2018 |
---
X-Ray Tracing

- Traces recorded cost $5.00 per 1 million traces recorded ($0.000005 per trace).
- Traces retrieved cost $0.50 per 1 million traces retrieved ($0.0000005 per trace).
- Traces scanned cost $0.50 per 1 million traces scanned ($0.0000005 per trace).
- X-Ray Insights traces stored cost $1.00 per million traces recorded ($0.000001 per trace).
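As a rough back-of-the-envelope check (assuming AWS X-Ray list pricing and that every trace is recorded), the ~$250,000/month figure above works out as follows:

1,000,000 traces/minute × 60 × 24 × 30 ≈ 43,200 million traces/month
43,200 million traces × $5.00 per million traces recorded ≈ $216,000/month
plus retrieval, scanning, and Insights costs, which pushes the total toward ~$250,000/month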
For more details, check the official documentation.
Head sampling is a sampling technique used to make a sampling decision as early as possible. The decision to sample or drop a span or trace is made without inspecting the trace as a whole.
For the list of all available samplers, check the official documentation.
Update the sampling percentage in the auto-instrumentation (Instrumentation) CR and restart the deployment for the configuration to take effect.
kubectl apply -f https://raw.githubusercontent.com/pavolloffay/kubecon-eu-2024-opentelemetry-kubernetes-tracing-tutorial/main/app/instrumentation-head-sampling.yaml
kubectl rollout restart deployment.apps/backend1-deployment -n tutorial-application
kubectl get pods -w -n tutorial-application
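For reference, the sampler section of the Instrumentation CR applied above looks roughly like the sketch below (the metadata name is illustrative; the type and argument map to the environment variables shown in the next step):

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation   # illustrative name; see the applied YAML for the actual resource
spec:
  sampler:
    type: parentbased_traceidratio
    argument: "0.5"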
Describe the pod spec for the backend1 deployment to see the updated sampling rate.
kubectl describe pod backend1-deployment-64ddcc76fd-w85zh -n tutorial-application
Environment:
OTEL_TRACES_SAMPLER: parentbased_traceidratio
- OTEL_TRACES_SAMPLER_ARG: 1
+ OTEL_TRACES_SAMPLER_ARG: 0.5
This tells the SDK to sample traces such that only about 50% of them are recorded and exported.
You can also configure the parent-based TraceID-ratio sampler in code. A Sampler can be set on the tracer provider using the WithSampler option, as follows:
import "go.opentelemetry.io/otel/sdk/trace"

provider := trace.NewTracerProvider(
	trace.WithSampler(trace.ParentBased(trace.TraceIDRatioBased(0.5))),
)
Tail sampling is where the decision to sample a trace is made by considering all or most of the spans within the trace. Tail sampling lets you sample traces based on specific criteria derived from different parts of a trace, which isn't possible with head sampling.
Update the environment variables below in the backend2 deployment, which generates random spans with errors and high latencies.
kubectl set env deployment backend2-deployment RATE_ERROR=50 RATE_HIGH_DELAY=50 -n tutorial-application
kubectl get pods -n tutorial-application -w
Deploy the OpenTelemetry Collector with the tail_sampling processor enabled.
kubectl apply -f https://raw.githubusercontent.com/pavolloffay/kubecon-eu-2024-opentelemetry-kubernetes-tracing-tutorial/main/backend/05-collector-1.yaml
kubectl get pods -n observability-backend -w
Now, let’s walk through the tail_sampling processor configuration, placed in the processors section of the collector configuration file:
# 1. Sample 100% of traces with ERROR-ing spans
# 2. Sample 100% of traces which have a duration longer than 500ms
# 3. Randomly sample 10% of the remaining traces (no errors, no high latency)
processors:
tail_sampling:
decision_wait: 10s # time to wait before making a sampling decision
num_traces: 100 # number of traces to be kept in memory
expected_new_traces_per_sec: 10 # expected rate of new traces per second
policies:
[
{
name: keep-errors,
type: status_code,
status_code: {status_codes: [ERROR]}
},
{
name: keep-slow-traces,
type: latency,
latency: {threshold_ms: 500}
},
{
name: randomized-policy,
type: probabilistic,
probabilistic: {sampling_percentage: 10}
}
]
Now let's execute some requests against the app at http://localhost:4000/ and look at the traces in the Jaeger console at http://localhost:16686/.
The image below is an example of what you might see in your backend with this sample configuration. With this configuration, you’ll get all traces with errors or latencies exceeding 500ms, as well as a random sample of the remaining traces based on the rate we’ve configured.
You also have the flexibility to add other policies. For the list of all policies, check the official documentation.
Here are a few examples:
- always_sample: Sample all traces.
- string_attribute: Sample based on string attribute values; both exact and regular-expression matches are supported. For example, you could sample based on specific custom attribute values, as sketched below.
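A string_attribute policy added to the policies list above might look like the following sketch (the policy name, attribute key, and values are illustrative, not part of the tutorial app):

{
  name: keep-selected-players,
  type: string_attribute,
  string_attribute: {key: app.player.name, values: [alice, bob]}
}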
Note
This is an optional, more advanced section.
All spans of a trace must be processed by the same collector for tail sampling to function properly, posing scalability challenges. Initially, a single collector may suffice, but as the system grows, a two-layer setup becomes necessary. It requires two deployments of the collector, with the first layer routing all spans of a trace to the same collector in the downstream deployment (using a load-balancing exporter), and the second layer performing the tail sampling.
Apply the YAML below to deploy a layer of Collectors containing the load-balancing exporter in front of collectors performing tail-sampling:
kubectl apply -f https://raw.githubusercontent.com/pavolloffay/kubecon-eu-2024-opentelemetry-kubernetes-tracing-tutorial/main/backend/05-collector-2.yaml
kubectl get pods -n observability-backend -w
jaeger-bc5f49d78-627ct 1/1 Running 0 100m
otel-collector-b48b5d66d-k5dsc 1/1 Running 0 4m42s
otel-gateway-collector-0 1/1 Running 0 3m38s
otel-gateway-collector-1 1/1 Running 0 3m38s
prometheus-77f88ccf7f-dfwh2 1/1 Running 0 100m
Now, let’s walk through the load-balancing exporter configuration, placed in the exporters section of the collector (layer 1) configuration file:
exporters:
debug:
# routing_key property is used to route spans to exporters based on traceID/service name
loadbalancing:
routing_key: "traceID"
protocol:
otlp:
timeout: 1s
tls:
insecure: true
resolver:
k8s:
service: otel-gateway.observability-backend
ports:
- 4317
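The second layer (the otel-gateway collectors listed above) then receives the routed spans over OTLP and applies the tail_sampling processor shown earlier. A minimal sketch of that pipeline, with illustrative exporter wiring, could look like this:

# Layer 2 (otel-gateway) sketch: receive OTLP from the load-balancing layer
# and apply the tail_sampling processor configured as shown earlier.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]   # the tail_sampling config from the previous section
      exporters: [otlp]             # illustrative; in the tutorial the sampled traces are sent to the Jaeger backend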
Note
This is an optional, more advanced section.
This extension allows serving sampling strategies following Jaeger's remote sampling API. It can be configured to proxy requests to a backing remote sampling server, which could be a Jaeger collector down the pipeline, or to serve strategies from a static JSON file on the local file system or at an HTTP URL.
extensions:
jaegerremotesampling:
source:
reload_interval: 30s
remote:
endpoint: jaeger-collector:14250
jaegerremotesampling/1:
source:
reload_interval: 1s
file: /etc/otelcol/sampling_strategies.json
jaegerremotesampling/2:
source:
reload_interval: 1s
file: http://jaeger.example.com/sampling_strategies.json
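The referenced sampling_strategies.json follows Jaeger's strategies file format. A minimal sketch (the service name and sampling ratios are illustrative) could look like this:

{
  "default_strategy": {
    "type": "probabilistic",
    "param": 0.1
  },
  "service_strategies": [
    {
      "service": "backend1",
      "type": "probabilistic",
      "param": 0.5
    }
  ]
}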
For more details, check the official documentation.