From Configurations to Conclusions: Lessons from Fine-Tuning the OpenTelemetry Collector for Tracing
Adopting OpenTelemetry’s distributed tracing capabilities has changed the way we analyze and monitor complex systems. Tracing shortens time to triage in massively distributed systems.
OpenTelemetry provides SDKs and a collector for vendor-neutral ingest into an observability platform of choice. As many have described over time, the collector in particular provides building blocks, much like Lego bricks. And as with Lego bricks, one could just as easily put together a masterpiece or something really ugly. In our journey to adopt OpenTelemetry, we went through a plethora of configuration patterns, ranging from a single tier of collectors to multiple tiers of collectors, each with its own performance characteristics.
In this talk, we will go over the various capabilities available when adopting the OpenTelemetry Collector for distributed tracing, how our pipeline configurations evolved, and the lessons we learned along the way.
This talk is heavily geared towards sharing real-life experiences of using OpenTelemetry at scale. We will discuss the following aspects:
* OpenTelemetry pipeline components such as the OTLP receiver, tail sampling processor, span metrics processor/connector, and OTLP exporter
* A one-tier configuration of the collector for enrichment, tail sampling, and RED metrics generation
* A two-tier configuration of the collector to do the same
* A pivot to solving tail sampling and RED metrics generation differently
* Things to be aware of when setting up collectors
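
To illustrate one of the concerns behind the tiered configurations listed above: tail sampling decides on whole traces, so all spans of a trace generally need to arrive at the same collector instance. The following is a minimal Python sketch of trace-ID-based routing from a first tier to a second tier. The backend names, the span dictionary shape, and the hash-modulo scheme are assumptions made for illustration; this is not the collector's own load-balancing implementation.

```python
import hashlib
from collections import defaultdict


def pick_backend(trace_id: str, backends: list[str]) -> str:
    """Map a trace ID to one backend deterministically (hash modulo N)."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


def route_spans(spans: list[dict], backends: list[str]) -> dict[str, list[dict]]:
    """Group spans by the backend chosen for their trace ID."""
    routed: dict[str, list[dict]] = defaultdict(list)
    for span in spans:
        routed[pick_backend(span["trace_id"], backends)].append(span)
    return dict(routed)


# Hypothetical second-tier collector addresses and sample spans.
BACKENDS = ["tier2-collector-0:4317", "tier2-collector-1:4317", "tier2-collector-2:4317"]
SPANS = [
    {"trace_id": "a1", "name": "GET /cart"},
    {"trace_id": "b2", "name": "GET /checkout"},
    {"trace_id": "a1", "name": "SELECT cart_items"},
]

if __name__ == "__main__":
    for backend, batch in route_spans(SPANS, BACKENDS).items():
        print(backend, [s["name"] for s in batch])
```

Because the backend depends only on the trace ID, both spans of trace `a1` end up in the same batch. A plain hash-modulo scheme reshuffles most traces whenever the backend list changes, which is one of the trade-offs to weigh when scaling a second tier.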