RisingWave’s built-in Iceberg maintenance, including automatic compaction and snapshot expiration, runs on the compactor node. When you set enable_compaction = true on an internal Iceberg table or an Iceberg sink, the compactor node executes those background maintenance tasks.
Why a dedicated compactor is needed
When RisingWave writes to Iceberg, it produces many small data files and frequent snapshots. Without compaction:
- Query performance degrades due to excessive file scanning.
- Storage costs increase from accumulated small files and stale snapshots.
- Metadata overhead grows with each new snapshot, slowing down catalog operations.
Deploy a compactor node
Kubernetes (Helm)
If you deployed RisingWave using the Helm chart, add or update the compactorComponent section in your values.yaml file.
Minimal configuration
values.yaml
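The exact keys depend on your chart version; the following is a minimal sketch assuming the standard compactorComponent layout of the RisingWave Helm chart, sized to the minimum requirements listed later on this page (1 core, 2 GB):

```yaml
# values.yaml (sketch; verify field names against your chart version)
compactorComponent:
  replicas: 1
  resources:
    requests:
      cpu: 1
      memory: 2Gi
    limits:
      cpu: 1
      memory: 2Gi
```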
Production configuration
For production workloads with frequent writes or large data volumes, allocate more CPU and memory:
values.yaml
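As a starting point for a medium workload (see the sizing table below), a sketch of a larger allocation, again assuming the standard compactorComponent layout:

```yaml
# values.yaml (sketch; sized for a medium workload, 10–100 GB/day)
compactorComponent:
  replicas: 1
  resources:
    requests:
      cpu: 4
      memory: 8Gi
    limits:
      cpu: 4
      memory: 8Gi
```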
For the full list of configurable fields, refer to the compactorComponent section of the RisingWave Helm chart documentation.
Kubernetes (Operator)
If you deployed RisingWave using the Kubernetes Operator, add or update the compactor section under spec.components in your RisingWave custom resource.
To dedicate a compactor node for Iceberg maintenance, add a node group named iceberg-compactor and set the RW_COMPACTOR_MODE environment variable to dedicated_iceberg. This node group handles only Iceberg compaction and snapshot expiration. The default node group (with empty name "") continues to handle regular Hummock compaction and must remain in place.
Minimal configuration
risingwave.yaml
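A minimal sketch of the two node groups described above. The nodeGroups layout and the metadata values are assumptions based on typical risingwave-operator custom resources; verify them against your operator version:

```yaml
# risingwave.yaml (sketch; verify the schema against your operator version)
apiVersion: risingwave.risingwavelabs.com/v1alpha1
kind: RisingWave
metadata:
  name: risingwave
spec:
  components:
    compactor:
      nodeGroups:
        # Default node group: keeps handling regular Hummock compaction.
        - name: ""
          replicas: 1
        # Dedicated node group for Iceberg compaction and snapshot expiration.
        - name: iceberg-compactor
          replicas: 1
          template:
            spec:
              env:
                - name: RW_COMPACTOR_MODE
                  value: dedicated_iceberg
```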
Production configuration
For production workloads with frequent writes or large data volumes, allocate more CPU and memory to both node groups:
risingwave.yaml
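Extending the minimal sketch with resource requests for both node groups. Field names are again assumptions based on typical risingwave-operator custom resources:

```yaml
# risingwave.yaml (sketch; sized for a medium workload, 10–100 GB/day)
apiVersion: risingwave.risingwavelabs.com/v1alpha1
kind: RisingWave
metadata:
  name: risingwave
spec:
  components:
    compactor:
      nodeGroups:
        - name: ""
          replicas: 1
          template:
            spec:
              resources:
                requests:
                  cpu: 4
                  memory: 8Gi
        - name: iceberg-compactor
          replicas: 1
          template:
            spec:
              env:
                - name: RW_COMPACTOR_MODE
                  value: dedicated_iceberg
              resources:
                requests:
                  cpu: 4
                  memory: 8Gi
```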
For a complete example, see the risingwave-postgresql-s3-with-iceberg-compaction.yaml reference in the risingwave-operator repository.
Verify the compactor is running
After applying the configuration, check that the compactor Pod's status is Running:
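A sketch of the check, assuming the Helm chart or operator labels compactor Pods with risingwave/component=compactor (verify the label on your deployment); this requires a live cluster:

```shell
# List compactor Pods; each should report STATUS "Running".
kubectl get pods -l risingwave/component=compactor

# Inspect logs if a Pod is not healthy (Pod name is illustrative).
kubectl logs risingwave-compactor-0
```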
Sizing guidelines
The right compactor size depends on your write volume and compaction frequency. Use the following guidelines as a starting point.
Minimum requirements
| Resource | Value |
|---|---|
| CPU | 1 core |
| Memory | 2 GB |
Recommended sizing by workload
| Workload | Write volume | Compaction frequency | CPU | Memory |
|---|---|---|---|---|
| Light | < 10 GB/day | Hourly (default) | 2 cores | 4 GB |
| Medium | 10–100 GB/day | Hourly or more frequent | 4 cores | 8 GB |
| Heavy | > 100 GB/day | Sub-hourly | 8+ cores | 16+ GB |
Sizing considerations
- CPU: Compaction is CPU-intensive due to file reading, sorting, and writing. Allocate more CPU for high write volumes or shorter compaction intervals.
- Memory: The compactor buffers file data in memory during compaction. For large target file sizes (for example, compaction.target_file_size_mb = 512), increase memory proportionally.
- Replicas: In most cases, a single compactor replica is sufficient. Consider adding a second replica if the compactor consistently becomes a bottleneck (observable via the RisingWave monitoring dashboard).
Adjusting compaction frequency
Reducing compaction_interval_sec increases how often compaction runs, which keeps tables healthier but increases compactor load. Increase CPU and memory if you lower the interval significantly.
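For illustration, a hypothetical Iceberg sink that enables compaction and lowers the interval to 30 minutes; the sink name, source, and elided connector options are placeholders, and only the parameters named on this page (enable_compaction, compaction_interval_sec) are taken from the text:

```sql
-- Hypothetical sink; fill in your own connector and catalog options.
CREATE SINK my_iceberg_sink FROM my_materialized_view
WITH (
    connector = 'iceberg',
    -- ... catalog, warehouse, and table options ...
    enable_compaction = 'true',
    compaction_interval_sec = '1800'  -- run compaction every 30 minutes
);
```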