- Databricks Unity Catalog: Write data from RisingWave directly to a Databricks-managed Iceberg table.
- AWS Glue as a federated catalog: Write data from RisingWave to an Iceberg table that uses AWS Glue as its catalog, and then connect Databricks to Glue.
Using Unity Catalog
This pattern is ideal when you want to manage your Iceberg tables centrally within the Databricks ecosystem. RisingWave acts as a streaming ETL engine, writing data directly into your Unity Catalog.

How it works

RisingWave → Iceberg table on S3 → Databricks Unity Catalog
Prerequisites
- A running RisingWave cluster.
- A Databricks workspace with Unity Catalog enabled.
- Permissions to create and access credentials for external access in Unity Catalog.
Step 1: Configure Unity Catalog for external access
- Follow the Databricks documentation to configure your Unity Catalog metastore to allow external clients like RisingWave to access it. You can also configure external managed locations at the catalog and table levels.
- Acquire the credentials needed to connect. You will need the following parameters for your sink:
  - `catalog.uri`: The REST endpoint for your Unity Catalog.
  - `catalog.oauth2_server_uri`: The OAuth token endpoint.
  - `catalog.credential`: Your client ID and secret, formatted as `<oauth_client_id>:<oauth_client_secret>`.
  - `warehouse.path`: The name of the catalog in Unity Catalog.
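As a rough illustration, the values typically take the shape below. The host, client ID/secret, and catalog name are placeholders, and the exact endpoint paths depend on your workspace, so verify them against the Databricks documentation:

```sql
-- Placeholder values; substitute your own workspace host, OAuth client, and catalog name.
catalog.uri = 'https://<workspace-host>/api/2.1/unity-catalog/iceberg-rest',
catalog.oauth2_server_uri = 'https://<workspace-host>/oidc/v1/token',
catalog.credential = '<oauth_client_id>:<oauth_client_secret>',
warehouse.path = '<catalog_name>'
```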
Step 2: Sink data from RisingWave to Unity Catalog
Create a `SINK` in RisingWave that writes to your Databricks-managed table. Note that currently, only append-only sinks are supported for this integration.
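A minimal sketch of such a sink, assuming a source materialized view named `my_mv` and a target table `my_table` in schema `my_schema` (all object names and hosts below are placeholders; consult the RisingWave Iceberg sink documentation for the full option list):

```sql
CREATE SINK my_databricks_sink FROM my_mv
WITH (
    connector = 'iceberg',
    type = 'append-only',        -- only append-only sinks are supported for this integration
    force_append_only = 'true',  -- convert upserts to appends if my_mv is not append-only
    catalog.type = 'rest',
    catalog.uri = 'https://<workspace-host>/api/2.1/unity-catalog/iceberg-rest',
    catalog.oauth2_server_uri = 'https://<workspace-host>/oidc/v1/token',
    catalog.credential = '<oauth_client_id>:<oauth_client_secret>',
    warehouse.path = '<catalog_name>',
    database.name = 'my_schema',
    table.name = 'my_table'
);
```

Once the sink is created, RisingWave continuously writes new rows from `my_mv` into the Iceberg table, which Databricks can then query through Unity Catalog.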