How it works
When you enable the hosted catalog in an Iceberg connection, RisingWave uses its internal metastore (typically a PostgreSQL instance) as a standard Iceberg JDBC catalog.
- Table metadata for tables created with the ENGINE = iceberg clause is stored in two system views in RisingWave: iceberg_tables and iceberg_namespace_properties.
- This implementation is not a proprietary format; it adheres to the standard Iceberg JDBC catalog protocol.
- This ensures that the tables remain open and accessible to external tools like Spark, Trino, and Flink that can connect to a JDBC catalog.
Create a connection with the hosted catalog
To use the hosted catalog, create an Iceberg connection and set the hosted_catalog parameter to true.
Syntax
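The statement has roughly the following shape. Treat it as a sketch rather than an authoritative reference: connection_name is a placeholder, and the type and warehouse.path keys are assumed from RisingWave's usual Iceberg connection options. Only hosted_catalog is specified on this page.

```sql
-- Sketch of an Iceberg connection that uses the hosted catalog.
-- Assumption: `type` and `warehouse.path` follow RisingWave's usual Iceberg
-- connection options; verify them against the reference for your version.
CREATE CONNECTION connection_name WITH (
    type = 'iceberg',
    warehouse.path = '<storage_path>',
    <object_storage_parameters>,
    hosted_catalog = true  -- let RisingWave manage the catalog metadata
);
```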
<storage_path> and <object_storage_parameters> depend on your chosen storage backend (S3, GCS, or Azure Blob). See the object storage configuration for specific parameter details.
Parameters
For object storage configuration parameters, see Object storage configuration.
Hosted catalog-specific parameters
Field | Description |
---|---|
hosted_catalog | Required. Set to true to enable the Hosted Iceberg Catalog for this connection. This instructs RisingWave to manage the catalog metadata internally. |
Create your first table
Now that you understand how to configure a connection with the hosted catalog, the next step is to create a table and start streaming data. For a complete, step-by-step guide, please follow our quickstart tutorial.
Quickstart: Create a streaming Iceberg table
This tutorial walks you through creating your first native Iceberg table from scratch using the hosted catalog.
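As a rough preview of what the tutorial covers, the sketch below creates a table on the hosted catalog. It assumes a connection named my_iceberg_conn already exists in the public schema (a hypothetical name) and that the iceberg_engine_connection session variable is how the table engine selects a connection in your RisingWave version. The table and column names are illustrative only.

```sql
-- Assumption: point the Iceberg table engine at an existing connection.
-- `public.my_iceberg_conn` is a hypothetical connection name.
SET iceberg_engine_connection = 'public.my_iceberg_conn';

-- ENGINE = iceberg makes this a native Iceberg table whose metadata
-- is tracked by the hosted catalog.
CREATE TABLE user_events (
    user_id INT PRIMARY KEY,
    event_type VARCHAR,
    event_time TIMESTAMP
) ENGINE = iceberg;

-- Write a row through RisingWave as with any other table.
INSERT INTO user_events VALUES (1, 'login', '2024-01-01 00:00:00');
```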
Benefits of the hosted catalog
- Zero external dependencies: No need to set up AWS Glue, PostgreSQL, or other catalog services
- Rapid prototyping: Get started with Iceberg immediately without infrastructure setup
- Standard compliance: Uses the standard Iceberg JDBC catalog protocol for compatibility
- External accessibility: Tables can be accessed by external Iceberg-compatible tools
- Reduced complexity: Fewer moving parts in your data architecture
External access to hosted catalog tables
Since the hosted catalog implements the standard Iceberg JDBC catalog protocol, external tools can access your tables by connecting to RisingWave’s metastore.
Spark example
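A sketch of Spark properties for an Iceberg JDBC catalog, usable in spark-defaults.conf or as individual --conf options. The property keys are the standard Iceberg Spark and JDBC catalog options; the catalog name rw_hosted, the host, port, database, credentials, and warehouse path are all placeholders to replace with your metastore's values. It assumes a PostgreSQL metastore, with the Iceberg Spark runtime and a PostgreSQL JDBC driver on the classpath.

```properties
# Register an Iceberg catalog in Spark (rw_hosted is a hypothetical name)
spark.sql.catalog.rw_hosted=org.apache.iceberg.spark.SparkCatalog
# Back the catalog with Iceberg's JDBC catalog implementation
spark.sql.catalog.rw_hosted.catalog-impl=org.apache.iceberg.jdbc.JdbcCatalog
# JDBC URL of RisingWave's metastore (assumed to be PostgreSQL)
spark.sql.catalog.rw_hosted.uri=jdbc:postgresql://<metastore_host>:<port>/<database>
spark.sql.catalog.rw_hosted.jdbc.user=<user>
spark.sql.catalog.rw_hosted.jdbc.password=<password>
# Same warehouse location used in the RisingWave connection
spark.sql.catalog.rw_hosted.warehouse=<storage_path>
```

The Iceberg JDBC catalog scopes its rows by catalog name, so the name configured in Spark may need to match the one RisingWave records in iceberg_tables. Object storage credentials are omitted here because they depend on your backend.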
Trino example
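A sketch of a Trino catalog file (for example etc/catalog/rw_hosted.properties, a hypothetical file name) that uses Trino's Iceberg connector with a JDBC catalog. The property names come from the Trino Iceberg connector; every value is a placeholder, a PostgreSQL metastore is assumed, and object storage settings are omitted because they depend on your backend.

```properties
connector.name=iceberg
iceberg.catalog.type=jdbc
# Catalog name as recorded in the metastore (placeholder)
iceberg.jdbc-catalog.catalog-name=<catalog_name>
iceberg.jdbc-catalog.driver-class=org.postgresql.Driver
# JDBC URL of RisingWave's metastore (assumed to be PostgreSQL)
iceberg.jdbc-catalog.connection-url=jdbc:postgresql://<metastore_host>:<port>/<database>
iceberg.jdbc-catalog.connection-user=<user>
iceberg.jdbc-catalog.connection-password=<password>
iceberg.jdbc-catalog.default-warehouse-dir=<storage_path>
```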
Add the JDBC catalog settings to your Trino catalog configuration.
When to use external catalogs instead
While the hosted catalog is great for getting started, you might want to use external catalogs when:
- Multi-system environments: Multiple systems need to share the same catalog metadata
- Enterprise requirements: You need integration with existing catalog infrastructure (AWS Glue, etc.)
- Governance: You have strict data governance requirements that mandate specific catalog systems
- Scale: You’re managing hundreds or thousands of Iceberg tables across multiple systems
System tables
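The two catalog views named earlier, iceberg_tables and iceberg_namespace_properties, can be queried like any other relation. A minimal sketch: the column layout is not documented on this page, so the queries select everything rather than assume column names, and the views may need schema qualification depending on your search path.

```sql
-- List the Iceberg tables registered in the hosted catalog
SELECT * FROM iceberg_tables;

-- List namespace-level properties stored by the hosted catalog
SELECT * FROM iceberg_namespace_properties;
```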
When using the hosted catalog, you can inspect the catalog metadata through RisingWave’s system tables.
Best practices
- For simplified management: Use the hosted catalog for development, testing, or production scenarios where RisingWave is the primary system managing the Iceberg tables and a shared external catalog is not needed.
- Backup your metadata: Since metadata is stored in RisingWave’s metastore, ensure you have proper backup procedures for it.
- Monitor storage growth: Keep an eye on metastore storage as you create more tables.
- Plan for scale: Consider external catalogs if you anticipate managing many tables or integrating with multiple systems.
Next steps
- Create your first table: Follow the Create and manage native Iceberg tables guide for detailed table creation and usage.
- Configure object storage: Review Object storage configuration for your storage backend.
- Explore external access: Test connecting external tools like Spark or Trino to your hosted catalog tables.