Standard probabilistic sampling can fail to capture intermittent issues in high-throughput Kafka architectures. Instead, we found it much more effective to use a hybrid approach. We configured a `ParentBased` sampler that always traced messages from critical business topics, like `order-fulfillment`, while using the `TraceIdRatioBased` sampler with a low percentage for all other topics to throttle volume. This immediately surfaced a recurring latency problem that would have been lost in the noise. A downstream `inventory-service` consumer was intermittently timing out--but only when processing a specific, less common event type from the `order-fulfillment` topic. The full trace showed that this consumer was making a synchronous, multi-second call to a legacy third-party API when this product SKU was requested. Without the parent based decision to trace the whole critical workflow, we only saw isolated slow spans and couldn't trace them back to the specific kafka message responsible for it.