Background
In early October 2025, we were running a PoC that used the OpenTelemetry
exporter for Google Cloud Trace on an internal API. The API generated
42 million spans between October 1st and October 6th.
Most of those spans returned an unset
status, and we are happy with that.
Cloud Trace is not a free service: it bills per million spans ingested from your application beyond the free quota. Since we are not interested in every span generated and captured by Cloud Trace, we conducted research on OpenTelemetry trace sampling. The purpose is crystal clear: ingest only the traces and spans we are interested in, and save budget.
In this article, we are going to explore the definition of sampling, the sampling methods available for OpenTelemetry traces, how they behave, how to implement them, and their strengths, drawbacks, and challenges.
Sampling Definition
The Oxford dictionary defines sampling as "a small amount of a substance taken from a larger amount and tested in order to obtain information about the substance". In the observability realm, sampling is a way to select a subset of telemetry data (traces, spans, or metrics) that still represents the whole. The main purpose of sampling is to reduce cost without losing visibility. For example, sampling 10% of the 42 million spans from the Background would leave roughly 4.2 million spans to ingest, ideally while still preserving the signals we care about.
When To and Not To Sample
OpenTelemetry mentions several situations in which you might need sampling:
- Your application generates more than 1,000 traces per second
- Most of your trace data represents healthy traffic with little variation
- You are interested in specific criteria, such as errors or high latency
- You have domain-specific criteria you can use to determine relevant data beyond errors and latency
- You have common rules for collecting or dropping data
- You have a way to tell your application to sample data based on its volume
- You have the ability to store unsampled data
- Your biggest concern is your budget
In addition, OpenTelemetry notes situations in which you might not need sampling:
- Your application generates little data (tens of small traces per second or fewer)
- You only use observability data in aggregate, and can therefore pre-aggregate it
- You are bound by regulations that do not allow you to drop data
OpenTelemetry sampling can be performed in two ways: head sampling and tail sampling.
Head Sampling
Head sampling is a technique that makes the sampling decision as early as possible. The decision to sample or drop a trace or span is made up front, without inspecting the trace end to end.
This sampling technique has several strengths:
- Easy to implement and understand
- Efficient
- Can be done at any point in the trace collection pipeline
However, the primary downside of head sampling is that it cannot make decisions based on the whole trace. For example, you cannot guarantee that you only collect traces with an error status. For situations or criteria like that, you need tail sampling.
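In the OpenTelemetry Go SDK, the head-sampling decision is made by the sampler configured on the tracer provider at the moment a root span is started. Below is a minimal sketch, assuming the go.opentelemetry.io/otel and go.opentelemetry.io/otel/sdk/trace packages; the ParentBased wrapper and the ratio are illustrative choices, not part of the demo code later in this article.
package sampling // illustrative package, not part of the demo repository

import (
	"go.opentelemetry.io/otel"
	traceSdk "go.opentelemetry.io/otel/sdk/trace"
)

// SetupHeadSampling registers a tracer provider whose sampler decides the
// fate of a trace as soon as its root span is created.
func SetupHeadSampling(ratio float64) {
	tp := traceSdk.NewTracerProvider(
		// TraceIDRatioBased decides from the trace ID alone; ParentBased makes
		// child spans follow the decision already recorded in their parent's
		// span context, so a trace is kept or dropped as a whole.
		traceSdk.WithSampler(traceSdk.ParentBased(traceSdk.TraceIDRatioBased(ratio))),
	)
	otel.SetTracerProvider(tp)
}
With a ratio of 0.25, roughly a quarter of new root spans (and therefore traces) would be kept; the demo below uses the plain TraceIDRatioBased sampler with a ratio of 0.1.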
We are going to use the following configuration files throughout the head-sampling demonstration.
Dockerfile
A Dockerfile to package the application as a Docker image. We'll use it later alongside Docker Compose.
FROM golang:1.22-alpine AS build
ARG APP_VERSION=latest
WORKDIR /src/
COPY . /src/.
RUN CGO_ENABLED=0 go build -o /bin/go-tracing-head-sampling ./tracing-head-sampling/main.go
# final - service
FROM alpine AS api-service
COPY --from=build /bin/go-tracing-head-sampling /bin/go-tracing-head-sampling
# HTTP Port
EXPOSE 9320
ENTRYPOINT ["/bin/go-tracing-head-sampling"]
docker-compose.yaml
A Docker Compose file to run the application and its dependencies. We use the OpenTelemetry Collector to ingest the traces and spans generated by the application, Grafana Tempo to store them, and Grafana to visualize the result. We also need a load generator to produce traces automatically through HTTP calls.
version: "3"
networks:
observability:
services:
tempo:
image: grafana/tempo:main-c5323bf
command: [ "-config.file=/etc/tempo.yaml" ]
volumes:
- ./config/tempo-config.yaml:/etc/tempo.yaml
- ./config/tempo-data:/var/tempo
ports:
- "3200:3200" # tempo
- "4317"
- "4318"
networks:
- observability
otel-collector:
image: otel/opentelemetry-collector:latest
volumes:
- ./config/otel-config.yaml:/etc/otel/config.yaml
command:
- '--config=/etc/otel/config.yaml'
ports:
- "4317:4317" #grpc
- "4318:4318" #http
networks:
- observability
app:
container_name: app
build:
context: ./
dockerfile: ./Dockerfile
ports:
- 9320:9320
networks:
- observability
depends_on:
- otel-collector
environment:
- OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
- OTEL_SERVICE_NAME=go-loki-app-demo
grafana:
environment:
- GF_PATHS_PROVISIONING=/etc/grafana/provisioning
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
entrypoint:
- sh
- -euc
- |
mkdir -p /etc/grafana/provisioning/datasources
cat <<EOF > /etc/grafana/provisioning/datasources/ds.yaml
apiVersion: 1
datasources:
- name: Loki
type: loki
access: proxy
orgId: 1
url: http://loki:3100
basicAuth: false
isDefault: true
version: 1
editable: false
- name: Tempo
type: tempo
access: proxy
orgId: 1
url: http://tempo:3200
basicAuth: false
version: 1
editable: false
EOF
/run.sh
image: grafana/grafana:latest
ports:
- "3000:3000"
networks:
- observability
load-generator:
image: alpine/curl:latest
volumes:
- ./config/load-generator.sh:/usr/local/bin/load-generator.sh
command: ["sh", "/usr/local/bin/load-generator.sh"]
networks:
- observability
depends_on:
- app
load-generator.sh
A script that sends HTTP requests to the application.
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
otel-config.yaml
The configuration file for the OpenTelemetry Collector.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:

exporters:
  otlphttp/logs:
    endpoint: "http://loki:3100/otlp"
    tls:
      insecure: true
  otlphttp/trace:
    endpoint: http://tempo:4318
    tls:
      insecure: true

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/logs]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/trace]
The NewTracer function is where we apply head sampling in Go.
It samples only 10% of generated traces.
package main

// Imports for the whole main.go file: this snippet and the driver code below.
import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"
	"strconv"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	traceSdk "go.opentelemetry.io/otel/sdk/trace"
	"go.opentelemetry.io/otel/trace"
)

// tracerName is set when NewTracer() is called.
var tracerName string

// customTracer starts as a placeholder and is re-created at the end of
// NewTracer(), once the real tracer provider has been registered.
var customTracer = otel.Tracer(tracerName)

func NewTracer(
	ctx context.Context,
	serviceName string,
) error {
	// Propagate trace context and baggage across service boundaries.
	propagator := propagation.NewCompositeTextMapPropagator(
		propagation.TraceContext{},
		propagation.Baggage{},
	)
	otel.SetTextMapPropagator(propagator)

	// Export spans over OTLP/HTTP to the collector.
	traceClientHTTP := otlptracehttp.NewClient(
		otlptracehttp.WithInsecure(),
		otlptracehttp.WithEndpoint("otel-collector:4318"),
	)
	traceExporter, err := otlptrace.New(ctx, traceClientHTTP)
	if err != nil {
		return err
	}

	res, err := resource.New(
		ctx,
		resource.WithContainer(),
		resource.WithFromEnv(),
	)
	if err != nil {
		return err
	}

	tracerProvider := traceSdk.NewTracerProvider(
		traceSdk.WithBatcher(
			traceExporter,
			traceSdk.WithBatchTimeout(time.Second),
		),
		traceSdk.WithResource(res),
		// Head sampling: keep only 10% of traces, decided up front from the trace ID.
		traceSdk.WithSampler(traceSdk.TraceIDRatioBased(0.1)),
	)

	tracerName = serviceName
	otel.SetTracerProvider(tracerProvider)
	// Re-create the tracer so it uses the service name and the new provider.
	customTracer = otel.Tracer(tracerName)
	return nil
}
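One thing NewTracer does not hand back to its caller is the tracer provider itself, so there is no way to flush spans that are still sitting in the batcher when the process exits. A small sketch of what the caller could add, assuming a hypothetical variant of NewTracer that also returns the *traceSdk.TracerProvider it builds:
// Hypothetical sketch: assumes NewTracer is changed to also return the
// *traceSdk.TracerProvider it registers, so the caller can flush it on exit.
tracerProvider, err := NewTracer(ctx, "go-observability")
if err != nil {
	log.Fatalln("fail setup tracer")
}
defer func() {
	// Give the batch span processor up to 5 seconds to export buffered spans.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := tracerProvider.Shutdown(shutdownCtx); err != nil {
		log.Println("tracer shutdown:", err)
	}
}()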
Here is the driver code and the trace instrumentation.
func main() {
	ctx := context.Background()
	log.Println(os.Getenv("OTEL_EXPORTER_OTLP_ENDPOINT"))
	err := NewTracer(ctx, "go-observability")
	if err != nil {
		log.Fatalln("fail setup tracer")
	}
	RunServer()
}

func RunServer() {
	http.HandleFunc("/", home())
	if err := http.ListenAndServe(":9320", nil); err != nil {
		log.Fatalf("could not start server: %v\n", err)
	}
}

func home() http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx := r.Context()
		ctx, span := HTTPLevelSpan(ctx, r)
		defer span.End()

		// ?fail=true simulates a failing request.
		failQuery := r.URL.Query().Get("fail")
		if failQuery != "" {
			isFail, err := strconv.ParseBool(failQuery)
			if err != nil {
				w.WriteHeader(http.StatusInternalServerError)
				w.Write([]byte("error parse query"))
				log.Println("error parse query")
				return
			}
			if isFail {
				w.WriteHeader(http.StatusInternalServerError)
				w.Write([]byte("encounter fail"))
				log.Println("encounter fail")
				return
			}
		}

		w.WriteHeader(http.StatusOK)
		w.Write([]byte("Hello, World!"))
	}
}

func HTTPLevelSpan(
	ctx context.Context,
	r *http.Request,
) (context.Context, trace.Span) {
	spanName := fmt.Sprintf("%s %s", r.Method, r.URL.String())
	ctx, span := customTracer.Start(ctx, spanName)
	if span.IsRecording() {
		// The Go version used by this repo is not compatible with
		// go.opentelemetry.io/otel/semconv/v1.32.0, so for the purpose of this
		// example we set the semantic-convention attributes manually.
		span.SetAttributes(attribute.String("http.request.method", r.Method))
		span.SetAttributes(attribute.String("server.address", r.RemoteAddr))
		span.SetAttributes(attribute.String("url.full", r.URL.String()))
	}
	return ctx, span
}
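As an aside, the manual attribute-setting in HTTPLevelSpan exists mainly because of the semconv version constraint mentioned in the comment. If the module versions allow it, the otelhttp middleware from go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp (an extra dependency this demo does not use) can create the server span and set the HTTP semantic-convention attributes automatically; head sampling still applies because the middleware uses the globally registered tracer provider. A sketch:
// Alternative to calling HTTPLevelSpan inside the handler: wrap the handler
// with the otelhttp middleware so it starts the server span itself.
// otelhttp is go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.
func RunServerWithOtelHTTP() {
	http.Handle("/", otelhttp.NewHandler(home(), "home"))
	if err := http.ListenAndServe(":9320", nil); err != nil {
		log.Fatalf("could not start server: %v\n", err)
	}
}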
Run the whole stack with docker compose up -d
, then explore Grafana with the Grafana Tempo data source. It might look like this.
Without head sampling, we would see all 10 traces generated by the load
generator; since we are using head sampling with a 10% ratio, we only get
about 1 out of those 10. Because TraceIDRatioBased is probabilistic, the
exact count can vary slightly from run to run.
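If you want to confirm the head-sampling decision without opening Grafana, it is already visible in the span context of each request. A small sketch you could drop into the home() handler after HTTPLevelSpan has started the span (the log line is illustrative, not part of the demo code):
// sc carries the trace ID and the sampled flag set by the head sampler.
sc := trace.SpanContextFromContext(ctx)
// With TraceIDRatioBased(0.1), roughly 1 in 10 requests logs sampled=true.
log.Printf("trace_id=%s sampled=%t", sc.TraceID(), sc.IsSampled())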
Tail Sampling
Tail sampling is a way to sample a trace by considering all or part of the
spans within that trace. It enables you to choose what to sample and what to
drop based on what you are interested in: for example, only traces with an
error status, traces exceeding a certain duration, or traces carrying a
specific attribute. You can find more tail sampling configuration options here.
Aside from its strengths, tail sampling has downsides:
- Difficult to implement
- Difficult to operate
- Today, it is often vendor-specific technology
We are going to use the following configuration files throughout the
tail-sampling demonstration.
Dockerfile
FROM golang:1.22-alpine AS build
ARG APP_VERSION=latest
WORKDIR /src/
COPY . /src/.
RUN CGO_ENABLED=0 go build -o /bin/go-tracing-tail-sampling ./tracing-tail-sampling/main.go
# final - service
FROM alpine AS api-service
COPY --from=build /bin/go-tracing-tail-sampling /bin/go-tracing-tail-sampling
# HTTP Port
EXPOSE 9320
ENTRYPOINT ["/bin/go-tracing-tail-sampling"]
docker-compose.yaml
version: "3"
networks:
observability:
services:
tempo:
image: grafana/tempo:main-c5323bf
command: [ "-config.file=/etc/tempo.yaml" ]
volumes:
- ./config/tempo-config.yaml:/etc/tempo.yaml
- ./config/tempo-data:/var/tempo
ports:
- "3200:3200" # tempo
- "4317"
- "4318"
networks:
- observability
otel-collector:
image: otel/opentelemetry-collector:latest
volumes:
- ./config/otel-config.yaml:/etc/otel/config.yaml
command:
- '--config=/etc/otel/config.yaml'
ports:
- "4317:4317" #grpc
- "4318:4318" #http
networks:
- observability
app:
container_name: app
build:
context: ./
dockerfile: ./Dockerfile
ports:
- 9320:9320
networks:
- observability
depends_on:
- otel-collector
environment:
- OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
- OTEL_SERVICE_NAME=go-loki-app-demo
grafana:
environment:
- GF_PATHS_PROVISIONING=/etc/grafana/provisioning
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
entrypoint:
- sh
- -euc
- |
mkdir -p /etc/grafana/provisioning/datasources
cat <<EOF > /etc/grafana/provisioning/datasources/ds.yaml
apiVersion: 1
datasources:
- name: Loki
type: loki
access: proxy
orgId: 1
url: http://loki:3100
basicAuth: false
isDefault: true
version: 1
editable: false
- name: Tempo
type: tempo
access: proxy
orgId: 1
url: http://tempo:3200
basicAuth: false
version: 1
editable: false
EOF
/run.sh
image: grafana/grafana:latest
ports:
- "3000:3000"
networks:
- observability
load-generator:
image: alpine/curl:latest
volumes:
- ./config/load-generator.sh:/usr/local/bin/load-generator.sh
command: ["sh", "/usr/local/bin/load-generator.sh"]
networks:
- observability
depends_on:
- app
load-generator.sh
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
curl -v http://host.docker.internal:9320/?fail=true
sleep 0.2
curl -v http://host.docker.internal:9320/
sleep 0.2
otel-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
  tail_sampling:
    decision_wait: 10s               # Optional: time to wait before making a sampling decision
    num_traces: 10000                # Optional: number of traces kept in memory
    expected_new_traces_per_sec: 100 # Optional: expected new traces per second, used for resource allocation
    policies:
      - name: errors-only-policy
        type: status_code
        status_code:
          status_codes: [ERROR]

exporters:
  otlphttp/logs:
    endpoint: "http://loki:3100/otlp"
    tls:
      insecure: true
  otlphttp/trace:
    endpoint: http://tempo:4318
    tls:
      insecure: true

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/logs]
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [otlphttp/trace]
The rest of the configuration files are the same; the biggest changes lie in
the OpenTelemetry Collector configuration file and the Docker Compose file.
In the collector configuration, we add a more specific processor,
tail_sampling, alongside the batch processor and wire it into the traces
pipeline. At the moment, we are interested only in traces that return an
error status. Consequently, no matter how important or numerous the spans
within a trace are, the trace will be dropped unless it returns an error
status.
We are going to use the contrib
OpenTelemetry Collector Docker image
(otel/opentelemetry-collector-contrib) instead of the core
image, because the core distribution does not include
the tail sampling processor (read more about it here).
When exploring Grafana with the Grafana Tempo data source,
we'll get something like this.
Among all requests, the OpenTelemetry Collector only samples
the traces with an error status. In this example, we simulate the error
through the fail
query parameter on the HTTP URL.
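Note that the status_code policy matches the span status recorded by the application, not the HTTP response code, so the handler in tracing-tail-sampling/main.go has to mark its span as an error explicitly. A minimal sketch of what that could look like inside the fail branch of the home() handler shown earlier (the error value and message are illustrative; codes is go.opentelemetry.io/otel/codes):
if isFail {
	err := errors.New("encounter fail")      // illustrative error value
	span.RecordError(err)                    // attach the error as a span event
	span.SetStatus(codes.Error, err.Error()) // set the span status to ERROR so the policy matches
	w.WriteHeader(http.StatusInternalServerError)
	w.Write([]byte("encounter fail"))
	log.Println("encounter fail")
	return
}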
Related Research
There is existing research on OpenTelemetry sampling:
- Tail sampling had the highest CPU, memory, and network overhead compared to head sampling; this is the trade-off for reliably capturing errors.
- Enabling tail sampling consumes more physical resources than leaving it disabled. Consider head sampling when a representative subset of traces/spans is enough.
References
- https://opentelemetry.io/docs/concepts/sampling/
- https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34418
- https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor
- Karkan, T. M. (2024). Performance Overhead Of OpenTelemetry Sampling Methods In A Cloud Infrastructure (Dissertation). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-225869
- Shuvo, G. K. (2021). Tail Based Sampling Framework for Distributed Tracing Using Stream Processing (Dissertation). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-306699