> ## Documentation Index
> Fetch the complete documentation index at: https://docs.risingwave.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Kubernetes (Operator)

> This article will help you use the [Kubernetes Operator for RisingWave](https://github.com/risingwavelabs/risingwave-operator) (hereinafter 'the Operator') to deploy a RisingWave cluster in Kubernetes.

The Operator is a deployment and management system for RisingWave. It runs on top of Kubernetes and provides functionalities like provisioning, upgrading, scaling, and destroying the RisingWave instances inside the cluster.

## Prerequisites

* **[Install kubectl](http://pwittrock.github.io/docs/tasks/tools/install-kubectl/)**:
  Ensure that the Kubernetes command-line tool [kubectl](https://kubernetes.io/docs/reference/kubectl/) is installed in your environment.
* **[Install psql](/deploy/install-psql-without-postgresql)**:
  Ensure that the PostgreSQL interactive terminal [psql](https://www.postgresql.org/docs/current/app-psql.html) is installed in your environment.
* **[Install and run Docker](https://docs.docker.com/get-docker/)**:
  Ensure that [Docker](https://docs.docker.com/desktop/) is installed in your environment and running.
* Ensure you allocate enough resources for the deployment. For details, see [Hardware requirements](/deploy/hardware-requirements).

## Create a Kubernetes cluster

<Note>
  The steps in this section are intended for creating a Kubernetes cluster in your local environment.
  If you are using a managed Kubernetes service such as AKS, GKE, and EKS, refer to the corresponding documentation for instructions.
</Note>

<Steps>
  <Step>
    [Install kind](https://kind.sigs.k8s.io/docs/user/quick-start#installation).

    [kind](https://kind.sigs.k8s.io/) is a tool for running local Kubernetes clusters using Docker containers as cluster nodes. You can see the available tags of `kind` on [Docker Hub](https://hub.docker.com/r/kindest/node/tags).
  </Step>

  <Step>
    Create a cluster.

    ```bash theme={null}
    kind create cluster
    ```
  </Step>

  <Step>
    ***Optional:*** Check if the cluster is created properly.

    ```bash theme={null}
    kubectl cluster-info
    ```
  </Step>
</Steps>

## Deploy the Operator

Before the deployment, ensure that the following requirements are satisfied.

* Docker version ≥ 18.09
* `kubectl` version ≥ 1.18
* For Linux, set the value of the `sysctl` parameter [net.ipv4.ip\_forward](https://linuxconfig.org/how-to-turn-on-off-ip-forwarding-in-linux) to 1.

<Steps>
  <Step>
    [Install cert-manager](https://cert-manager.io/docs/installation/) and wait a minute to allow for initialization.
  </Step>

  <Step>
    Install the latest version of the Operator.

    ```bash theme={null}
    kubectl apply --server-side -f https://github.com/risingwavelabs/risingwave-operator/releases/latest/download/risingwave-operator.yaml
    ```

    <Accordion title="If you'd like to install a certain version of the Operator">
      Run the following command to install a specific version instead of the latest version.

      ```bash theme={null}
      # Replace ${VERSION} with the version you want to install, e.g., v1.3.0
      kubectl apply --server-side -f https://github.com/risingwavelabs/risingwave-operator/releases/download/${VERSION}/risingwave-operator.yaml
      ```

      **Compatibility table**

      | Operator                                                                                                               | RisingWave | Kubernetes |
      | :--------------------------------------------------------------------------------------------------------------------- | :--------- | :--------- |
      | v0.4.0                                                                                                                 | v0.18.0+   | v1.21+     |
      | v0.3.6                                                                                                                 | v0.18.0+   | v1.21+     |
      | You can find the release notes of each version [here](https://github.com/risingwavelabs/risingwave-operator/releases). |            |            |
    </Accordion>

    <Note>
      The following errors might occur if `cert-manager` is not fully initialized. Simply wait for another minute and rerun the command above.

      ```bash theme={null}
      Error from server (InternalError): Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "<https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s>": dial tcp 10.105.102.32:443: connect: connection refused
      ```
    </Note>
  </Step>

  <Step>
    ***Optional:*** Check if the Pods are running.

    ```bash theme={null}
    kubectl -n cert-manager get pods
    kubectl -n risingwave-operator-system get pods
    ```
  </Step>
</Steps>

## Deploy a RisingWave instance

RisingWave Kubernetes Operator extends the Kubernetes with CRDs (Custom Resource Definitions) to manage RisingWave. That means all you need to do is to create a RisingWave resource in your Kubernetes cluster, and the RisingWave Kubernetes Operator will take care of the rest.

### Resource requests and limits recommendations

For production Kubernetes deployments, we **strongly recommend** setting CPU and memory requests equal to limits for all RisingWave components. This configuration ensures:

* **Predictable scheduling**: Kubernetes can more accurately schedule pods on nodes with sufficient resources
* **Guaranteed QoS class**: Pods are assigned the highest QoS class, making them less likely to be evicted under resource pressure
* **Consistent performance**: Helps provide more predictable CPU usage and stable memory allocation
* **Resource isolation**: Better protects workloads from noisy neighbors in multi-tenant clusters

In the RisingWave Operator CRD, `resources` is nested under each component's `template.spec`. Apply `requests` equal to `limits` per component (`meta`, `compute`, `compactor`, `frontend`):

```yaml theme={null}
spec:
  components:
    compute:          # repeat for meta, compactor, frontend
      nodeGroups:
      - name: ''
        replicas: 1
        template:
          spec:
            resources:
              limits:
                cpu: <cpu>
                memory: <memory>
              requests:
                cpu: <cpu>       # should equal limits
                memory: <memory> # should equal limits
```

For per-component minimum and recommended values, see [Hardware requirements](/deploy/hardware-requirements).

### Use the example resource files

The RisingWave resource is a custom resource that defines a RisingWave cluster. In [this directory](https://github.com/risingwavelabs/risingwave-operator/blob/main/docs/manifests/risingwave), you can find resource examples that deploy RisingWave with different configurations of metadata store and state backend. Based on your requirements, you can use these resource files directly or as a reference for your customization. The [stable directory](https://github.com/risingwavelabs/risingwave-operator/tree/main/docs/manifests/stable/) contains resource files that we have tested compatibility with the latest released version of the RisingWave Operator:

The resource files are named using the convention of `risingwave-<meta_store>-<state_backend>.yaml`. For example, `risingwave-postgresql-s3.yaml` means that this manifest file uses PostgreSQL as the meta storage and AWS S3 as the state backend.

RisingWave supports using these systems or services as state backends.

* MinIO
* AWS S3
* S3-compatible object storages
* Google Cloud Storage
* Azure Blob Storage
* Alibaba Cloud OSS
* HDFS

You can [customize the state backend](#optional-customize-the-state-backend), or [customize the state store directory](#optional-customize-the-state-store-directory).

### Optional: Customize the state backend

#### Download source file

If you intend to customize a resource file, download the file to a local path and edit it:

```bash theme={null}
curl https://raw.githubusercontent.com/risingwavelabs/risingwave-operator/main/docs/manifests/<sub-directory> -o risingwave.yaml

```

You can also create your own resource file from scratch if you are familiar with Kubernetes resource files.

Then, apply the resource file by using the following command:

```bash theme={null}
kubectl apply -f a.yaml      # relative path
kubectl apply -f /tmp/a.yaml # absolute path

```

#### Customize the state backends

To customize the state backend of your RisingWave cluster, edit the `spec:stateStore` section under the RisingWave resource (`kind: RisingWave`).

<Tabs>
  <Tab title="AWS S3">
    ```yaml theme={null}
    spec:
      stateStore:
        # Prefix to objects in the object stores or directory in file system. Default to "hummock".
        dataDirectory: hummock

        # Declaration of the S3 state store backend.
        s3:
          # Region of the S3 bucket.
          region: us-east-1

          # Name of the S3 bucket.
          bucket: risingwave

          # Credentials to access the S3 bucket.
          credentials:
            # Name of the Kubernetes secret that stores the credentials.
            secretName: s3-credentials

            # Key of the access key ID in the secret.
            accessKeyRef: AWS_ACCESS_KEY_ID

            # Key of the secret access key in the secret.
            secretAccessKeyRef: AWS_SECRET_ACCESS_KEY

            # Optional, set it to true when the credentials can be retrieved
            # with the service account token, e.g., running inside the EKS.
            #
            # useServiceAccount: true

    ```
  </Tab>

  <Tab title="MinIO">
    <Note>
      The performance of MinIO is closely tied to the disk performance of the node where it is hosted. We have observed that AWS EBS does not perform well in our tests. For optimal performance, we recommend using S3 or a compatible cloud service.
    </Note>

    ```yaml theme={null}
    spec:
      stateStore:
        # Prefix to objects in the object stores or directory in file system. Default to "hummock".
        dataDirectory: hummock

        # Declaration of the MinIO state store backend.
        minio:
          # Endpoint of the MinIO service.
          endpoint: risingwave-minio:9301

          # Name of the MinIO bucket.
          bucket: hummock001

          # Credentials to access the MinIO bucket.
          credentials:
            # Name of the Kubernetes secret that stores the credentials.
            secretName: minio-credentials

            # Key of the username ID in the secret.
            usernameKeyRef: username

            # Key of the password key in the secret.
            passwordKeyRef: password

    ```
  </Tab>

  <Tab title="S3-compatible">
    ```yaml theme={null}
    spec:
      stateStore:
        # Prefix to objects in the object stores or directory in file system. Default to "hummock".
        dataDirectory: hummock

        # Declaration of the S3 compatible state store backend.
        s3:
          # Endpoint of the S3 compatible object storage.
          #
          # Here we use Tencent Cloud Object Store (COS) in ap-guangzhou as an example.
          endpoint: cos.ap-guangzhou.myqcloud.com

          # Region of the S3 compatible bucket.
          region: ap-guangzhou

          # Name of the S3 compatible bucket.
          bucket: risingwave

          # Credentials to access the S3 compatible bucket.
          credentials:
            # Name of the Kubernetes secret that stores the credentials.
            secretName: cos-credentials

            # Key of the access key ID in the secret.
            accessKeyRef: ACCESS_KEY_ID

            # Key of the secret access key in the secret.
            secretAccessKeyRef: SECRET_ACCESS_KEY

    ```
  </Tab>

  <Tab title="Azure Blob Storage">
    ```yaml theme={null}
    spec:
      stateStore:
        # Prefix to objects in the object stores or directory in file system. Default to "hummock".
        dataDirectory: hummock

        # Declaration of the Azure Blob Storage state store backend.
        azureBlob:
          # Endpoint of the Azure Blob service.
          endpoint: https://you-blob-service.blob.core.windows.net

          # Working directory root of the Azure Blob service.
          root: risingwave

          # Container name of the Azure Blob service.
          container: risingwave

          # Credentials to access the Azure Blob Storage container.
          credentials:
            # Name of the Kubernetes secret that stores the credentials.
            secretName: azure-credentials

            # Key of the account name in the secret.
            accountNameRef: AccountName

            # Key of the account name in the secret.
            accountKeyRef: AccountKey
    ```
  </Tab>

  <Tab title="Google Cloud Storage">
    ```yaml theme={null}
    spec:
      stateStore:
        # Prefix to objects in the object stores or directory in file system. Default to "hummock".
        dataDirectory: hummock

        # Declaration of the Google Cloud Storage state store backend.
        gcs:
          # Name of the Google Cloud Storage bucket.
          bucket: risingwave

          # Root directory of the Google Cloud Storage bucket.
          root: risingwave

          # Credentials to access the Google Cloud Storage bucket.
          credentials:
            # Name of the Kubernetes secret that stores the credentials.
            secretName: gcs-credentials

            # Key of the service account credentials in the secret.
            serviceAccountCredentialsKeyRef: ServiceAccountCredentials

            # Optional, set it to true when the credentials can be retrieved.
            # useWorkloadIdentity: true

    ```
  </Tab>

  <Tab title="Alibaba Cloud OSS">
    ```yaml theme={null}
    spec:
      stateStore:
        # Prefix to objects in the object stores or directory in file system. Default to "hummock".
        dataDirectory: hummock

        # Declaration of the Alibaba Cloud OSS state store backend.
        aliyunOSS:
          # Region of the Alibaba Cloud OSS bucket.
          region: cn-hangzhou

          # Name of the Alibaba Cloud OSS compatible bucket.
          bucket: risingwave

          # Use internal endpoint or not. Check the following document for details:
          # https://www.alibabacloud.com/help/en/oss/user-guide/regions-and-endpoints
          internalEndpoint: false

          # Credentials to access the Alibaba Cloud OSS bucket.
          credentials:
            # Name of the Kubernetes secret that stores the credentials.
            secretName: oss-credentials

            # Key of the access key ID in the secret.
            accessKeyRef: ACCESS_KEY_ID

            # Key of the secret access key in the secret.
            secretAccessKeyRef: SECRET_ACCESS_KEY_ID

    ```
  </Tab>

  <Tab title="HDFS">
    #### Customize HDFS as state backend

    To customize HDFS as state backend in RisingWave, mount Hadoop configurations with ConfigMap.

    1. Firstly, prepare the config files locally like below, make sure the entire directory are mounted as a Config Map.

    ```bash theme={null}
    ls $HADOOP_HOME/etc/hadoop
    core-site.xml hdfs-site.xml

    ```

    2. Next, create a ConfigMap, where `hadoop-conf` is the name of ConfigMap:

    ```bash theme={null}
    kubectl create configmap hadoop-conf --from-file $HADOOP_HOME/etc/hadoop

    ```

    3. Then mount the Hadoop configuration files using this ConfigMap:

    <Accordion title="See the code example">
      ```yaml theme={null}
      apiVersion: risingwave.risingwavelabs.com/v1alpha1
      kind: RisingWave
      metadata:
        name: risingwave
      spec:
        image: <the image you packaged>
        metaStore:
          memory: true
        stateStore:
          hdfs:
            nameNode: <your_cluster_name/namenode>
            root: <data_directory>
        components:
          meta:
            nodeGroups:
            - name: ''
              replicas: 1
              template:
                spec:
                  resources:
                    limits:
                      cpu: 1
                      memory: 4Gi
                    requests:
                      cpu: 1
                      memory: 4Gi
                  env:
                  - name: HADOOP_CONF_DIR
                    value: /var/etc/hadoop/conf
                  volumes:
                  - name: hadoop-conf
                    configMap:
                      name: hadoop-conf
                  volumeMounts:
                  - name: hadoop-conf
                    mountPath: /var/etc/hadoop/conf
                    readOnly: true
          compute:
            nodeGroups:
            - name: ''
              replicas: 1
              template:
                spec:
                  resources:
                    limits:
                      cpu: 4
                      memory: 16Gi
                    requests:
                      cpu: 4
                      memory: 16Gi
                  env:
                  - name: HADOOP_CONF_DIR
                    value: /var/etc/hadoop/conf
                  volumes:
                  - name: hadoop-conf
                    configMap:
                      name: hadoop-conf
                  volumeMounts:
                  - name: hadoop-conf
                    mountPath: /var/etc/hadoop/conf
                    readOnly: true
          compactor:
            nodeGroups:
            - name: ''
              replicas: 1
              template:
                spec:
                  resources:
                    limits:
                      cpu: 2
                      memory: 8Gi
                    requests:
                      cpu: 2
                      memory: 8Gi
                  env:
                  - name: HADOOP_CONF_DIR
                    value: /var/etc/hadoop/conf
                  volumes:
                  - name: hadoop-conf
                    configMap:
                      name: hadoop-conf
                  volumeMounts:
                  - name: hadoop-conf
                    mountPath: /var/etc/hadoop/conf
                    readOnly: true
          frontend:
            nodeGroups:
            - name: ''
              replicas: 1
              template:
                spec:
                  resources:
                    limits:
                      cpu: 1
                      memory: 4Gi
                    requests:
                      cpu: 1
                      memory: 4Gi

      ```
    </Accordion>

    Replace the following placeholders:

    * `<your-risingwave-image-name>`: The name of your RisingWave Docker image.
    * `<your-hadoop-namenode-address>`: The address of your Hadoop NameNode.
    * `<your-hadoop-data-directory>`: The directory where data is stored.

    Note that the value of the `HADOOP_CONF_DIR` environment variable and the path where the `hadoop-conf` ConfigMap is mounted should be identical.

    #### Customize HDFS client

    You can also customize the HDFS client as needed. We have built an image based on Hadoop 2.7.3. If you want to create your own RisingWave image, please adjust the Hadoop configuration according to your specific cluster information and ensure that:

    * The `CLASSPATH` is correctly set.
    * `HADOOP_CONF_DIR` is placed at the beginning of the `CLASSPATH`.

    See [Docker file for HDFS](https://github.com/risingwavelabs/risingwave/blob/main/docker/Dockerfile.hdfs) for the latest version.

    Then build the image and push it to somewhere. Make sure it’s reachable by the Kubernetes clusters or the machine running docker.
  </Tab>
</Tabs>

### Optional: Customize the state store directory

You can customize the directory for storing state data via the `spec: stateStore: dataDirectory` parameter in the `risingwave.yaml` file that you want to use to deploy a RisingWave instance. If you have multiple RisingWave instances, ensure the value of `dataDirectory` for the new instance is unique (the default value is `hummock`). Otherwise, the new RisingWave instance may crash. Save the changes to the `risingwave.yaml` file before running the `kubectl apply -f <...risingwave.yaml>` command. The directory path cannot be an absolute address, such as `/a/b`, and must be no longer than 180 characters.

### Validate the status of the instance

You can check the status of the RisingWave instance by running the following command.

```bash theme={null}
kubectl get risingwave
```

If the instance is running properly, the output should look like this:

```bash theme={null}
NAME        RUNNING   STORAGE(META)   STORAGE(OBJECT)   AGE
risingwave  True      postgresql      S3                30s

```

## Connect to RisingWave

<Tabs>
  <Tab title="ClusterIP">
    By default, the Operator creates a service for the frontend component, through which you can interact with RisingWave, with the type of `ClusterIP`. But it is not accessible outside Kubernetes. Therefore, you need to create a standalone Pod for PostgreSQL inside Kubernetes.

    <Steps>
      <Step title="Create a Pod.">
        ```bash theme={null}
        kubectl apply -f https://raw.githubusercontent.com/risingwavelabs/risingwave-operator/main/docs/manifests/psql/psql-console.yaml
        ```
      </Step>

      <Step title=" Attach to the Pod so that you can execute commands inside the container.">
        ```bash theme={null}
        kubectl exec -it psql-console -- bash
        ```
      </Step>

      <Step title="Connect to RisingWave via `psql`.">
        ```bash theme={null}
        psql -h risingwave-frontend -p 4567 -d dev -U root
        ```
      </Step>
    </Steps>
  </Tab>

  <Tab title="NodePort">
    You can connect to RisingWave from Nodes such as EC2 in Kubernetes

    1. In the `risingwave.yaml` file that you use to deploy the RisingWave instance, add a `frontendServiceType` parameter to the configuration of the RisingWave service, and set its value to `NodePort`.

    ```bash theme={null}
    # ...
    kind: RisingWave
    ...
    spec:
      frontendServiceType: NodePort
    # ...
    ```

    2. Connect to RisingWave by running the following commands on the Node.

    ```bash theme={null}
    export RISINGWAVE_NAME=risingwave-postgresql-hdfs
    export RISINGWAVE_NAMESPACE=default
    export RISINGWAVE_HOST=`kubectl -n ${RISINGWAVE_NAMESPACE} get node -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'`
    export RISINGWAVE_PORT=`kubectl -n ${RISINGWAVE_NAMESPACE} get svc -l risingwave/name=${RISINGWAVE_NAME},risingwave/component=frontend -o jsonpath='{.items[0].spec.ports[0].nodePort}'`
    psql -h ${RISINGWAVE_HOST} -p ${RISINGWAVE_PORT} -d dev -U root
    ```
  </Tab>

  <Tab title="LoadBalancer">
    If you are using EKS, GCP, or other managed Kubernetes services provided by cloud vendors, you can expose the Service to the public network with a load balancer in the cloud.

    1. In the `risingwave.yaml` file that you use to deploy the RisingWave instance, add a `frontendServiceType` parameter to the configuration of the RisingWave service, and set its value to `LoadBalancer`.

    ```bash theme={null}
    # ...
    kind: RisingWave
    ...
    spec:
      frontendServiceType: LoadBalancer
    # ...
    ```

    2. Connect to RisingWave with the following commands.

    ```bash theme={null}
    export RISINGWAVE_NAME=risingwave-postgresql-hdfs
    export RISINGWAVE_NAMESPACE=default
    export RISINGWAVE_HOST=`kubectl -n ${RISINGWAVE_NAMESPACE} get svc -l risingwave/name=${RISINGWAVE_NAME},risingwave/component=frontend -o jsonpath='{.items[0].status.loadBalancer.ingress[0].ip}'`
    export RISINGWAVE_PORT=`kubectl -n ${RISINGWAVE_NAMESPACE} get svc -l risingwave/name=${RISINGWAVE_NAME},risingwave/component=frontend -o jsonpath='{.items[0].spec.ports[0].port}'`
    psql -h ${RISINGWAVE_HOST} -p ${RISINGWAVE_PORT} -d dev -U root
    ```
  </Tab>
</Tabs>

Now you can ingest and transform streaming data. See [Quick start](/get-started/quickstart) for details.

## Set up RisingWave Console

After deploying your RisingWave cluster on Kubernetes, we recommend setting up RisingWave Console. The Console provides a web interface for managing and monitoring your cluster. It simplifies daily operations and helps you quickly collect and share diagnostic information when contacting RisingWave support team.

For detailed instructions, see [RisingWave Console setup guide](/web-ui/introduction).

<head>
  <script type="application/ld+json">
    {`
          {
            "@context": "https://schema.org",
            "@type": "HowTo",
            "name": "How to deploy RisingWave on Kubernetes with the Operator",
            "description": "Deploy a RisingWave streaming database cluster on Kubernetes using the RisingWave Operator. Includes cluster creation, operator installation, instance deployment, and connecting via psql.",
            "step": [
              {
                "@type": "HowToStep",
                "name": "Create a Kubernetes cluster",
                "text": "Set up a Kubernetes cluster using kind, minikube, EKS, GKE, or AKS. Verify the cluster is running with kubectl cluster-info."
              },
              {
                "@type": "HowToStep",
                "name": "Deploy the RisingWave Operator",
                "text": "Install cert-manager, then install the RisingWave Operator using kubectl apply. Verify the operator pod is running."
              },
              {
                "@type": "HowToStep",
                "name": "Deploy a RisingWave instance",
                "text": "Apply a RisingWave custom resource YAML file to create a RisingWave instance. Configure state backend (S3, GCS, Azure Blob, HDFS) and meta store (etcd, PostgreSQL, MySQL)."
              },
              {
                "@type": "HowToStep",
                "name": "Connect to RisingWave",
                "text": "Use kubectl port-forward or a LoadBalancer service to expose the frontend, then connect with psql: psql -h host -p port -d dev -U root"
              }
            ]
          }
          `}
  </script>
</head>
