Setting up node-specific configurations
Node-specific configurations are set in the risingwave.toml configuration file. To set them up, follow these steps:
- Create or locate your risingwave.toml file. This file contains all your node-specific configurations. If it doesn’t exist, create a new one.
- Edit the risingwave.toml file. Open the file in a text editor and specify each configuration item in the format config_name = value.
- Save your changes. After editing, save the risingwave.toml file.
- Provide the configuration file to the node. You can pass its path with the --config-path command-line argument when starting the node. Alternatively, you can set the RW_CONFIG_PATH environment variable to the path of your risingwave.toml file. For example, in a Kubernetes environment, you can copy the configuration file into the Docker container, or mount a path containing the configuration file into the pod. Then, specify the path to the configuration file using the RW_CONFIG_PATH environment variable or the --config-path command-line argument.
- Restart the node. The changes take effect only after the node is restarted.
Values in risingwave.toml override the default values in the source code. If no configuration file is specified, the default values in /risingwave/src/common/src/config.rs are used.
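As a minimal sketch of the config_name = value format from the editing step, the fragment below sets two items in the [storage.cache_refill] section, using the default values listed in the tables later on this page. The values are illustrative only; any item left out of the file keeps its built-in default.

```toml
# risingwave.toml
# Items are grouped under a TOML section header, one config_name = value pair per line.
[storage.cache_refill]
timeout_ms = 6000   # integer value
threshold = 0.5     # floating-point value
```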
For more details about the parameters in the configuration file, see RisingWave configuration files directory. There you’ll find information like the definitions and default values of these parameters.
Node-specific configurations
Configurations for different components lie under different TOML sections. Take extra care with configuration items prefixed with unsafe_. Typically, these configurations can cause system or data damage if wrongly configured. You may want to contact our technical support before changing unsafe_-prefixed configurations.
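For illustration, a file that configures several components might be laid out as follows. This sketch uses only section and item names that appear on this page, with their documented default values; it is not a complete or recommended configuration.

```toml
# Each component reads its own section of risingwave.toml.
[streaming]
# unsafe_ prefix: change only with care.
unsafe_enable_strict_consistency = true

[storage.data_file_cache]
# Capacity in MB; the cache stays disabled while dir is left empty.
capacity_mb = 1024

[storage.cache_refill]
concurrency = 10
```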
System configurations
System configurations are used to initialize the system parameters at the first startup. Once the system has started, the system parameters are managed by the Meta service and can be altered using the ALTER SYSTEM SET command.
Example for the system configuration section:
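A minimal sketch of such a section, assuming the barrier_interval_ms and checkpoint_frequency system parameters. These names and values do not appear on this page, so check them against the configuration reference before use.

```toml
[system]
# Assumed parameter names and values, shown for illustration only.
barrier_interval_ms = 1000
checkpoint_frequency = 1
```

Going by the description above, these values only seed the system parameters at the first startup, so later changes should go through ALTER SYSTEM SET rather than through this file.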
Streaming configurations
Streaming configurations can be set in the [streaming] section of the configuration file.
Configuration | Default | Description |
---|---|---|
unsafe_enable_strict_consistency | true | Control the strictness of stream consistency. When set to false, data inconsistency like double-insertion or double-deletion with the same primary keys will be tolerated. |
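As an illustrative sketch, the item in the table above could be relaxed as follows. Whether this is appropriate depends on your workload; because it is an unsafe_ item, it is best changed only after consulting technical support.

```toml
[streaming]
# Default is true. Setting false tolerates inconsistencies such as
# double-insertion or double-deletion with the same primary key.
unsafe_enable_strict_consistency = false
```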
Storage configurations
Storage configurations can be set in the [storage] and [storage.xxx] sections.
File cache and block cache
In RisingWave, several node-specific configurations control the refilling process of the file cache and block cache.
What are the file cache and block cache refilling?
The file cache serves as an extension of the memory block cache for the LSM-tree, which is used to speed up storage IO-intensive workloads.
The file cache uses the disk to cache LSM-tree blocks. It offers a larger and more cost-effective storage capacity compared to memory, as well as lower latency and greater stability than S3. In addition, the disk is more durable than memory. Therefore, the file cache can enhance the performance of RisingWave’s storage system and mitigate the cold start problem between reboots.
The compaction operation of the LSM-tree may influence the effectiveness of the file cache. Therefore, block cache refilling should always be used to improve the effectiveness of the file cache. With block cache refilling, RisingWave prefetches the latest version of recently used blocks before metadata updates are applied after compaction, and then fills the file cache.
When to use the file cache and the block cache refilling?
While the file cache can boost storage system performance, it adds overhead, especially when block cache refilling is enabled. Use the following checklist to assess whether the file cache suits your workload and configuration:
- The miss ratio and the miss rate of the in-memory block cache or meta cache are high.
- Neither CPU usage nor network bandwidth is fully utilized.
- There is spare disk space.
The related configuration sections are:
- Data file cache config: [storage.data_file_cache]
- Meta file cache config: [storage.meta_file_cache]
- Cache refill config: [storage.cache_refill]
Configuration | Default | Description |
---|---|---|
dir | "" | The directory for the file cache. If left empty, the file cache will be disabled. |
capacity_mb | 1024 | The file cache capacity in MB. |
file_capacity_mb | 64 | The capacity for each cache file in MB. |
flushers | 4 | Worker count for concurrently writing cache files. |
reclaimers | 4 | Worker count for concurrently reclaiming cache files. |
recover_concurrency | 8 | Worker count for restoring cache when opening. |
insert_rate_limit_mb | 0 | File cache insertion rate limit in MB/s. This option is important because disk bandwidth is usually lower than memory bandwidth. |
indexer_shards | 64 | The shard number of the indexer. |
compression | "none" | Compression algorithm for cached data. Supports none, lz4, and zstd. |
Configuration | Default | Description |
---|---|---|
data_refill_levels | [] | Only blocks in the given levels will be refilled. |
timeout_ms | 6000 | The metadata update will be delayed at most timeout_ms to wait for refilling. |
concurrency | 10 | Block refilling concurrency (by unit level). |
unit | 64 | The length of continuous data blocks that can be batched and refilled in one request. |
threshold | 0.5 | Only units whose recently used block ratio exceeds the threshold will be refilled. |
recent_filter_layers | 6 | Number of layers in the recent filter. |
recent_filter_rotate_interval_ms | 10000 | Time interval for rotating recent filter layers. |
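A sketch of how two of these sections might be combined to enable the data file cache together with block cache refilling. The directory path and the refill levels are hypothetical values chosen for illustration, and it is assumed here that the first table above applies to [storage.data_file_cache]. The remaining values are the defaults from the tables.

```toml
[storage.data_file_cache]
# Hypothetical path; leaving dir empty disables the file cache.
dir = "/var/lib/risingwave/data_file_cache"
capacity_mb = 1024
file_capacity_mb = 64
compression = "none"   # none, lz4, or zstd

[storage.cache_refill]
# Hypothetical levels; only blocks in these levels are refilled.
data_refill_levels = [0, 1, 2]
timeout_ms = 6000
threshold = 0.5
```

Since refilling adds overhead, it may be worth starting with a small set of levels and widening it only if the cache miss metrics stay high.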
Other storage configurations
In addition to the above, RisingWave provides other storage configurations to help control the overall buffer and cache limits. See Dedicated compute node for more information.
UDF configurations
Added in v2.1.5, v2.2.4, v2.3.0: Allow enabling/disabling the creation of embedded Python, JavaScript, and WebAssembly UDFs.
UDF configurations can be set in the [udf] section of the configuration file. For example, to allow the creation of embedded Python UDFs, set enable_embedded_python_udf to true in the configuration file.
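For instance, the Python UDF setting described above would look like this in risingwave.toml. This is a minimal fragment; the items for the other UDF languages are not named on this page and are omitted.

```toml
[udf]
# Allow the creation of embedded Python UDFs.
enable_embedded_python_udf = true
```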