Controlling Telemetry explosion at the Edge with OtelCol and OTTL

Telemetry has been exploding due to all these new AI workloads, and I feel like there hasn’t been a lot of guidance around controlling it. Everybody’s observability bill is up and the backend vendors are raking it in: Datadog stock went up almost 100% in the last 30 days (yes, some of the rise is due to their new AI observability tooling, but if you read the earnings report, revenue from their backend business is booming even more; they call it non-AI revenue). And all these vendors are selling you a paid solution for it. They give you levers and knobs to drop/sample telemetry after ingest, but the cost is baked into the price, because of course it is! They have to make their money somehow, and once your telemetry has been shipped, landed in their backend, and then deleted, you’ve already paid for it. Edge reduction itself isn’t new. Cribl, Vector, and collector processors have done it for years, but doing it in the collector with OTTL means no proprietary agent and no lock-in.

With OTel graduating last month and OpAMP becoming a very real thing, it’s so easy to drop/sample telemetry at the edge. It saves you egress, shipping, and ingestion costs. Not to mention, you’re not using a vendor’s proprietary tooling to control your telemetry, meaning you’re not locked in. Wanna switch backends tomorrow? You can, because all your config is based on OSS standards. Anyway, I wrote up a practical guide on how to actually do it, with real config examples, if anyone’s interested.
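
To give a rough idea of what dropping/sampling at the edge looks like, here is a minimal sketch of a collector config, not the config from the guide. It assumes the contrib distribution (which ships the `filter` and `probabilistic_sampler` processors). The `/healthz` route, the 10% sampling rate, the processor name `filter/edge`, and the exporter endpoint are all placeholders I made up for illustration; adjust for your own setup and check the processor docs for your collector version.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  # Drop telemetry that matches any OTTL condition below.
  filter/edge:
    error_mode: ignore   # if a condition errors, keep the item and move on
    traces:
      span:
        # Hypothetical example: drop health-check spans.
        - 'attributes["http.route"] == "/healthz"'
    logs:
      log_record:
        # Drop logs below INFO, but keep records with no severity set.
        - 'severity_number != SEVERITY_NUMBER_UNSPECIFIED and severity_number < SEVERITY_NUMBER_INFO'

  # Keep roughly 10% of the traces that survive the filter (assumed rate).
  probabilistic_sampler:
    sampling_percentage: 10

  batch:

exporters:
  otlp:
    endpoint: backend.example.com:4317   # placeholder backend address

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter/edge, probabilistic_sampler, batch]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [filter/edge, batch]
      exporters: [otlp]
```

The filter runs before the sampler so the cheap, obvious noise is gone before you spend any of your sampling budget on it, and nothing dropped here ever leaves the box, so you don't pay egress or ingest for it.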

submitted by /u/Broad_Technology_531 to r/devops