Streaming
Barrier pending for too long
No barrier has been committed in this project for more than 15 minutes.
Triggers
- Streaming graph bottlenecks. Typical causes include join amplification, insufficient resources, and suboptimal streaming queries (e.g., OverWindow, Joins).
- Compaction write stalls, which lengthen the barrier sync duration.
Diagnosis
- Check CPU and memory utilization for all nodes. If either is maxed out, the cluster likely has insufficient resources.
- Run SHOW JOBS to check whether any jobs are still being created and backfilled. Backfilling can put additional pressure on the cluster.
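The backfill check can be run from any SQL client connected to the cluster. This is a minimal sketch; the exact columns returned may differ between RisingWave versions.

```sql
-- List streaming jobs that are still being created, along with
-- their statements and progress.
SHOW JOBS;
```

If the result is empty, no job is currently being created, so backfill pressure is unlikely to explain the pending barrier.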
Sink lag too large
Data for a particular sink has been pending in RisingWave’s internal log store for more than 30 minutes.
Triggers
- Slow external sink processing.
- Insufficient sink parallelism.
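If insufficient sink parallelism turns out to be the cause, one possible adjustment is sketched below. It assumes your RisingWave version supports ALTER SINK ... SET PARALLELISM. The sink name my_sink and the value 8 are hypothetical placeholders, not recommendations.

```sql
-- List existing sinks to find the one that is lagging.
SHOW SINKS;

-- Hypothetical example: raise the parallelism of a sink named my_sink.
ALTER SINK my_sink SET PARALLELISM = 8;
```

Raising parallelism may not help when the external system itself is the slow part, so it is worth checking the downstream system first.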
Compaction
Compaction back pressure
Back pressure from compaction has been detected in your cluster.
Triggers
- Insufficient compaction resources.
Diagnosis
- Check compaction CPU usage.
- Check the CPU ratio of compute nodes to compactor nodes.