- Create a connection: Tell RisingWave where to store your Iceberg data and metadata.
- Create a table: Define your table with the `ENGINE = iceberg` clause.
- Stream and query data: Insert data and query it in real time.
Hands-on Tutorial: Streaming Iceberg Quickstart
This end-to-end tutorial provides a Docker Compose file to instantly set up the environment and includes all the code you need to run the examples below.
Step 1: Create a connection with the hosted catalog
First, you need to tell RisingWave where to store the table files and metadata. For the simplest setup, you can use RisingWave's built-in hosted catalog, which manages the metadata for you without requiring any external services like AWS Glue or a separate database.

While you can also use external catalogs like AWS Glue or a JDBC database to create native Iceberg tables, this tutorial uses the hosted catalog because it requires no additional setup. For details on all available options, see the Catalogs guide.
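A minimal sketch of this step is shown below. The connection name `lakehouse_conn`, the S3 bucket, region, and credentials are placeholders to replace with your own values, and the option names are assumed to match RisingWave's Iceberg connection parameters; check the connection reference if your version differs.

```sql
-- Sketch: an Iceberg connection that uses RisingWave's hosted catalog.
-- The warehouse path, region, and credentials below are placeholders.
CREATE CONNECTION lakehouse_conn WITH (
    type = 'iceberg',
    warehouse.path = 's3://my-bucket/warehouse/',  -- placeholder bucket
    s3.access.key = 'xxxxxxxx',                    -- placeholder credentials
    s3.secret.key = 'xxxxxxxx',
    s3.region = 'us-east-1',
    hosted_catalog = true                          -- use the built-in hosted catalog
);

-- Assumed session setting that points the Iceberg table engine at this connection.
SET iceberg_engine_connection = 'public.lakehouse_conn';
```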
Step 2: Create a native Iceberg table
Next, create the table using the `ENGINE = iceberg` clause. This tells RisingWave to store the data in the Iceberg format. You can set a low `commit_checkpoint_interval` to enable low-latency commits, which is ideal for streaming workloads.
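Here is a sketch of such a table, assuming an illustrative `user_events` schema and the connection from Step 1. Setting `commit_checkpoint_interval = 1` commits to Iceberg on every checkpoint, the lowest-latency option.

```sql
-- Sketch: a native Iceberg table; the columns and names are illustrative.
CREATE TABLE user_events (
    user_id    INT,
    event_type VARCHAR,
    event_ts   TIMESTAMP,
    PRIMARY KEY (user_id, event_ts)
)
WITH (commit_checkpoint_interval = 1)  -- commit to Iceberg on every checkpoint
ENGINE = iceberg;
```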
Step 3: Stream data in and query it
The table is now ready to accept streaming data. You can insert data into the table, and it will be committed to Iceberg in near real time.
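A rough sketch, using the illustrative `user_events` table from Step 2:

```sql
-- Insert a few rows; they are committed to the Iceberg table at the next checkpoint.
INSERT INTO user_events VALUES
    (1, 'login', TIMESTAMP '2024-01-01 12:00:00'),
    (2, 'click', TIMESTAMP '2024-01-01 12:00:05');

-- Optionally force a checkpoint so the rows are visible immediately.
FLUSH;

-- Query the Iceberg table directly from RisingWave.
SELECT * FROM user_events ORDER BY user_id;
```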
Next steps
- Run the full tutorial: To run these examples in a pre-configured environment, head over to the Streaming Iceberg quickstart demo.
- Connect to an existing lake: If you already have Iceberg tables, see the Quickstart: Read from and write to existing Iceberg tables to learn how to connect to them.
- Dive deeper: For more detailed information, explore the guides on catalogs and how to create and manage native tables.