`streaming` role and one compute node with the `serving` role.
When launching a compute node, you can specify its role with either the `--role` command-line argument or the `RW_COMPUTE_NODE_ROLE` environment variable.
You need to restart the node to update the role. A role can be one of:
- `both`: The default role, if not specified. Indicates that the compute node is available for both streaming and serving.
- `serving`: Indicates that the compute node is read-only and executes batch queries only.
- `streaming`: Indicates that the compute node is only available for streaming.
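As a minimal sketch of the two ways to set a role, the snippet below builds the launch command for a given role and rejects anything other than the three values listed above. It assumes the node is started as `risingwave compute-node`, which may differ in your installation, and `compute_node_cmd` is a hypothetical helper introduced only for illustration.

```shell
# Build the launch command for a compute node with a given role.
# Assumption: the binary is invoked as `risingwave compute-node`; adjust this
# to match your installation. `compute_node_cmd` is a hypothetical helper.
compute_node_cmd() {
  role="${1:-both}"   # "both" is the default role
  case "$role" in
    both|serving|streaming) ;;
    *) echo "unknown role: $role" >&2; return 1 ;;
  esac
  echo "risingwave compute-node --role $role"
}

# Print the command for a serving-only node; run the printed line to launch.
compute_node_cmd serving

# Equivalent form using the environment variable instead of the flag:
#   RW_COMPUTE_NODE_ROLE=serving risingwave compute-node
```

Because the role is read at startup, changing it means stopping the node and launching it again with the new value.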
In a production environment, it’s advisable to use separate nodes for batch and streaming operations. The `both` mode, which allows a node to handle both batch and streaming queries, is more suited for testing scenarios. While it’s possible to execute batch and streaming queries concurrently, it’s recommended to avoid running resource-intensive batch and streaming queries at the same time.
Enable decoupling with Kubernetes operator/Helm
To enable decoupling of streaming and serving nodes, set `spec.enableEmbeddedServingMode` to `true` in your RisingWave custom resource definition, then apply the changes to your Kubernetes cluster.
The `enableEmbeddedServingMode` field is only available in v0.7.1 or later versions of the RisingWave operator.
After enabling the embedded serving mode, the frontend component will be transformed into a combination of frontend and serving compute node, while the compute component will be dedicated to streaming operations only. This means:
- The frontend pods will now handle both frontend tasks and serving (batch) queries.
- The compute pods will exclusively handle streaming tasks.
- You can scale the frontend and compute components independently based on your serving and streaming workload requirements.
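The change to the custom resource might look like the sketch below, which writes a manifest fragment with the field enabled. The `apiVersion`, the resource name `risingwave`, and the file name `risingwave.yaml` are assumptions for illustration; a real manifest also carries the rest of your existing spec, which is omitted here.

```shell
# Write a manifest fragment that turns on embedded serving mode.
# Assumptions: the apiVersion, the resource name "risingwave", and the file
# name are illustrative. Merge the field into your existing manifest rather
# than replacing it with this fragment.
cat > risingwave.yaml <<'EOF'
apiVersion: risingwave.risingwavelabs.com/v1alpha1
kind: RisingWave
metadata:
  name: risingwave
spec:
  enableEmbeddedServingMode: true
  # ... keep the rest of your existing spec here ...
EOF

# Apply it to the cluster once the manifest is complete:
# kubectl apply -f risingwave.yaml
```

The `kubectl apply` line is left commented out because the fragment on its own is not a complete resource definition.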
Configure a `serving` compute node for batch queries
You can use a TOML configuration file to configure a `serving` compute node. For detailed instructions, see Node-specific configurations.
Unlike a general-purpose `both` compute node, a `serving` compute node doesn’t require memory allocation or reservation for the shared buffer and operator caches. Instead, it’s more efficient to increase the sizes of the block and meta caches. However, making these caches too large can limit the amount of data that batch queries can process.
Here’s an example configuration for a `serving` compute node with 16 GB of memory, which you can find in /risingwave/src/config/serving.toml: