Compaction Threshold Tuning in Apache Druid

Effective segment lifecycle management in Apache Druid requires precise calibration of compaction thresholds to balance query latency, storage footprint, and cluster resource utilization. As ingestion pipelines scale, static compaction configurations inevitably lead to either excessive task fragmentation or prolonged lock contention on historical nodes. This guide details the operational mechanics of threshold tuning, providing Python-driven automation patterns and orchestration strategies aligned with modern OLAP platform requirements. For foundational context on how compaction intersects with broader data lifecycle policies, refer to Segment Compaction, Retention & Storage Optimization.

Core Threshold Vectors and Resource Calculus

Druid’s compaction engine operates on a set of interdependent thresholds that dictate task parallelism, segment merging behavior, and memory allocation. The primary configuration vector is the compaction spec submitted to the Overlord, where the following parameters require dynamic calibration:

  • targetSegmentsPerTask: Defines the maximum number of input segments a single compaction task will consume. Default values often range between 50–100, but high-cardinality time-series workloads require lowering this to 20–30 to prevent OOM conditions during index merging.
  • maxRowsPerSegment: Controls the output segment row ceiling. Aligning this with query scan patterns (typically 5M–7.5M rows) optimizes vectorized execution and reduces page cache thrashing.
  • inputSegmentSizeBytes: Acts as a hard guardrail against merging excessively large segments. When combined with maxRowsPerSegment, it enforces predictable memory footprints.
  • maxNumConcurrentSubTasks: Governs parallelism within a single compaction task. This should be tuned relative to available worker slots and JVM heap constraints.

Threshold tuning is not a static exercise; it requires continuous alignment with ingestion velocity and query topology. Implementing Segment Size Optimization Strategies ensures that compaction outputs remain within the optimal 500MB–1GB range, minimizing deep storage I/O overhead while preserving query efficiency.

Python Pipeline Integration and Dynamic Threshold Adjustment

Modern analytics platforms treat Druid compaction as an API-driven orchestration target rather than a static cron job. The following Python specification demonstrates how to dynamically compute and submit compaction thresholds based on real-time segment telemetry from the Coordinator API.

import requests
import math
from typing import Dict, Any

DRUID_COORDINATOR = "http://coordinator:8081"
DRUID_OVERLORD = "http://overlord:8090"

def fetch_segment_metrics(datasource: str) -> Dict[str, Any]:
    """Pulls current segment distribution and calculates baseline thresholds."""
    response = requests.get(
        f"{DRUID_COORDINATOR}/druid/coordinator/v1/datasources/{datasource}/segments",
        params={"full": "true"}
    )
    response.raise_for_status()
    segments = response.json()

    # The datasources/segments endpoint reports per-segment byte size (not row
    # counts), so thresholds are derived from the average segment footprint.
    total_bytes = sum(s.get("size", 0) for s in segments)
    avg_segment_bytes = total_bytes / max(len(segments), 1)
    
    # Dynamic threshold calculation
    target_bytes_per_task = 512 * 1024 * 1024  # aim for ~512MB of input per task
    target_segments = max(20, min(50, math.floor(target_bytes_per_task / max(avg_segment_bytes, 1))))
    max_rows = 5_000_000  # output row ceiling aligned with Druid's segment-size guidance
    
    return {
        "targetSegmentsPerTask": target_segments,
        "maxRowsPerSegment": max_rows,
        "inputSegmentSizeBytes": 1_073_741_824,  # 1GB hard cap
        "maxNumConcurrentSubTasks": 4
    }

def submit_compaction_task(datasource: str, interval: str, spec: Dict[str, Any]) -> None:
    """Submits dynamically tuned compaction spec to the Overlord."""
    payload = {
        "type": "compact",
        "dataSource": datasource,
        "interval": interval,
        "ioConfig": {"type": "compact"},
        "tuningConfig": spec
    }
    resp = requests.post(f"{DRUID_OVERLORD}/druid/indexer/v1/task", json=payload)
    resp.raise_for_status()

This pattern uses the standard requests library (see the Python Requests documentation) for JSON serialization; production deployments should layer in authentication and retry handling. By querying segment metadata at runtime, the pipeline avoids the guesswork inherent in static YAML configurations. For production-grade execution, integrate this logic with Automated Compaction Task Scheduling to align task submission windows with off-peak query loads and ingestion pauses.

Resilience Patterns and Failure Mitigation

Compaction tasks are inherently resource-intensive and susceptible to transient failures, particularly when merging high-cardinality dimensions or operating near JVM heap limits. A robust operational posture requires explicit failure handling rather than relying on default retry behaviors.

When a compaction task exceeds memory bounds or encounters deep storage latency spikes, the pipeline should automatically degrade thresholds, reduce concurrency, and reschedule. The Apache Druid compaction documentation outlines tuning parameters that directly impact merge stability, but programmatic intervention is often faster than manual tuning.

Implementing Fallback Chains for Failed Compactions ensures that degraded tasks still progress without blocking downstream retention policies. Typical fallback sequences include:

  1. Halving maxNumConcurrentSubTasks and retrying with identical input intervals.
  2. Reducing targetSegmentsPerTask to 10 to isolate problematic segments.
  3. Escalating to a manual review queue if three consecutive attempts fail with CONTAINER_EXITED or OUT_OF_MEMORY.

By treating threshold values as runtime variables rather than static constants, data engineering teams can maintain sub-second query SLAs while preventing historical node saturation. Continuous telemetry ingestion, coupled with automated spec generation, transforms compaction from a maintenance burden into a self-optimizing pipeline component.

Back to Apache Druid Segment Lifecycle