Access and use your processed data

This section explains how to access and interact with data in RisingWave. RisingWave offers several methods for serving results, catering to various use cases, from ad-hoc analysis to application integration.

Access data with `SELECT` statements

You can query data directly from RisingWave using standard SQL SELECT statements against sources, tables, or materialized views. This method is ideal for ad-hoc analysis, exploring the latest data, and integrating with applications or visualization tools that need to pull data.

These queries are handled by RisingWave’s Serving Nodes, which act as a PostgreSQL-compatible frontend designed to serve queries with high concurrency and low latency. You can connect to them with psql or any other PostgreSQL-compatible client. RisingWave is also compatible with many data visualization tools.

Key features:

Uses familiar SQL syntax.
Provides immediate access to the most up-to-date results.
Offers flexibility to filter, aggregate, and join data.

Example: Retrieve the latest aggregated results from a materialized view.
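
The example above could look roughly like the following in Python. This is a minimal sketch, not a definitive implementation: it assumes the psycopg2 driver is installed and that RisingWave is reachable at localhost:4566 with user root and database dev. The materialized view page_views_per_minute and its columns are hypothetical names used only for illustration.

```python
"""Query a RisingWave materialized view over the PostgreSQL wire protocol."""

# Assumed connection settings for a local RisingWave instance.
RW_CONN = {
    "host": "localhost",
    "port": 4566,
    "user": "root",
    "dbname": "dev",
}

# Hypothetical materialized view and columns, for illustration only.
LATEST_RESULTS_SQL = """
    SELECT window_start, page_id, view_count
    FROM page_views_per_minute
    ORDER BY window_start DESC
    LIMIT %s
"""


def fetch_latest(limit=10, conn_params=RW_CONN):
    """Return the newest `limit` rows from the materialized view."""
    # Imported lazily so this module loads even where psycopg2 is absent.
    import psycopg2

    conn = psycopg2.connect(**conn_params)
    try:
        with conn.cursor() as cur:
            # psycopg2 substitutes %s safely on the client side.
            cur.execute(LATEST_RESULTS_SQL, (limit,))
            return cur.fetchall()
    finally:
        conn.close()
```

Calling fetch_latest(5) returns up to five tuples, one per row. Because the materialized view is maintained continuously, each call reflects the results at the time the query runs.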

Integrate with PostgreSQL via foreign data wrapper (FDW)

RisingWave seamlessly integrates with existing PostgreSQL ecosystems through its Foreign Data Wrapper (FDW) functionality. Access data in RisingWave’s tables and materialized views as if it were part of your PostgreSQL database. This allows you to leverage existing PostgreSQL tools and workflows. For details, see Integrate with other databases.

Key features:

Enables unified access to RisingWave and PostgreSQL data.
Allows you to use existing PostgreSQL tools and applications.
Simplifies integration into existing data infrastructure.
Performance: While FDW offers convenience, it may introduce some performance overhead compared to directly accessing RisingWave. RisingWave pushes down filters in WHERE clauses to optimize data retrieval. However, complex SELECT statements with joins, aggregations, or LIMIT clauses are processed in PostgreSQL after fetching the data from RisingWave.
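
The setup and the performance note above can be sketched as follows. The SQL is standard postgres_fdw syntax and is run against PostgreSQL, not RisingWave. The host, port, database, and user values, the order_totals materialized view, and the local customers table are all assumptions for illustration. The explain helper relies on PostgreSQL's EXPLAIN VERBOSE, which prints a "Remote SQL" line for foreign scans, so you can inspect which part of a query is sent to RisingWave.

```python
"""Expose a RisingWave materialized view inside PostgreSQL via postgres_fdw."""

# One-time setup, executed in PostgreSQL. All names and connection
# values here are illustrative assumptions.
FDW_SETUP_SQL = [
    "CREATE EXTENSION IF NOT EXISTS postgres_fdw",
    """CREATE SERVER risingwave FOREIGN DATA WRAPPER postgres_fdw
       OPTIONS (host 'localhost', port '4566', dbname 'dev')""",
    """CREATE USER MAPPING FOR CURRENT_USER SERVER risingwave
       OPTIONS (user 'root')""",
    """CREATE FOREIGN TABLE order_totals (customer_id bigint, total numeric)
       SERVER risingwave
       OPTIONS (schema_name 'public', table_name 'order_totals')""",
]

# A simple WHERE filter: the kind of predicate that can be pushed down.
FILTER_SQL = "SELECT customer_id, total FROM order_totals WHERE total > %s"

# A join with a local PostgreSQL table (hypothetical `customers`):
# the join itself is evaluated in PostgreSQL after rows are fetched.
JOIN_SQL = """
    SELECT c.name, o.total
    FROM customers AS c
    JOIN order_totals AS o ON o.customer_id = c.id
    WHERE o.total > %s
"""


def setup_fdw(pg_conn_params):
    """Run the one-time FDW setup statements against PostgreSQL."""
    import psycopg2  # lazy import; requires the psycopg2 package

    conn = psycopg2.connect(**pg_conn_params)
    try:
        with conn.cursor() as cur:
            for statement in FDW_SETUP_SQL:
                cur.execute(statement)
        conn.commit()
    finally:
        conn.close()


def explain(pg_conn_params, sql, params):
    """Return the PostgreSQL plan lines for `sql`, including Remote SQL."""
    import psycopg2

    conn = psycopg2.connect(**pg_conn_params)
    try:
        with conn.cursor() as cur:
            cur.execute("EXPLAIN VERBOSE " + sql, params)
            return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()
```

Comparing explain(params, FILTER_SQL, (100,)) with explain(params, JOIN_SQL, (100,)) is one way to check how much work each query leaves to PostgreSQL.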

Example: Join data in a PostgreSQL table with a continuously updated materialized view in RisingWave.

Subscribe to real-time updates

RisingWave’s subscription feature allows you to receive a continuous stream of updates from a materialized view directly, without needing an external message queue. This includes both existing data in the materialized view when the subscription is created and subsequent changes. You can choose to retrieve the full dataset or only incremental changes from a specific point using a subscription cursor. For details, see Real-time updates via subscriptions.

Key features:

Provides real-time data updates directly from RisingWave.
Allows retrieving full or incremental data using a cursor.
Requires fewer components and less maintenance than external event stores.

Example: Subscribe to a materialized view that tracks website user activity to power a live dashboard, receiving updates directly from RisingWave.
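
A polling loop for the example above might look like the sketch below. It assumes psycopg2 and a local RisingWave at localhost:4566 (user root, database dev). The names user_activity_mv, user_activity_sub, and activity_cur are hypothetical. The CREATE SUBSCRIPTION, DECLARE ... SUBSCRIPTION CURSOR, and FETCH statements reflect the subscription syntax as commonly documented for RisingWave; check the subscription reference for the options your version supports.

```python
"""Poll a RisingWave subscription cursor for changes to a materialized view."""
import time

# Assumed connection settings for a local RisingWave instance.
RW_CONN = {"host": "localhost", "port": 4566, "user": "root", "dbname": "dev"}

# One-time setup, run separately. Names and retention are illustrative.
CREATE_SUBSCRIPTION_SQL = (
    "CREATE SUBSCRIPTION user_activity_sub FROM user_activity_mv "
    "WITH (retention = '1D')"
)
# FULL requests existing data first, then subsequent changes.
DECLARE_CURSOR_SQL = (
    "DECLARE activity_cur SUBSCRIPTION CURSOR FOR user_activity_sub FULL"
)
FETCH_NEXT_SQL = "FETCH NEXT FROM activity_cur"


def poll_changes(fetch_one, handle, idle_seconds=1.0, max_idle_polls=None):
    """Pass each fetched row to `handle`; sleep whenever nothing is pending.

    Stops after `max_idle_polls` consecutive empty fetches, or never if None.
    """
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        row = fetch_one()
        if row is None:
            idle += 1
            time.sleep(idle_seconds)
        else:
            idle = 0
            handle(row)


def follow_subscription(handle, conn_params=RW_CONN):
    """Declare the cursor and feed every change row to `handle`."""
    import psycopg2  # lazy import; requires the psycopg2 package

    conn = psycopg2.connect(**conn_params)
    try:
        with conn.cursor() as cur:
            cur.execute(DECLARE_CURSOR_SQL)

            def fetch_one():
                cur.execute(FETCH_NEXT_SQL)
                return cur.fetchone()  # None when no change is available

            poll_changes(fetch_one, handle)
    finally:
        conn.close()
```

A dashboard backend could call follow_subscription(push_to_clients), where push_to_clients is whatever function forwards a row to connected browsers. Keeping the loop in poll_changes separate from the database code makes it easy to exercise with a fake fetch function.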

Access programmatically via SDK and client libraries

RisingWave provides a Python SDK, risingwave-py, to help you develop event-driven applications. The SDK offers a simple way to access data, subscribe to changes, and define event handlers for tables and materialized views.

Additionally, because RisingWave is compatible with PostgreSQL, you can use standard PostgreSQL drivers and client libraries in various languages to execute SELECT statements from your applications. For the list of available client libraries, see Client Libraries.

Example: Use the Python client library to fetch the latest results from a materialized view and display them in a financial data analysis application.
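
One way to structure the financial example is sketched below, using the psycopg2 PostgreSQL driver rather than risingwave-py. Connection values are assumed local defaults, and the stock_summary_mv view, its columns, and the Quote type are hypothetical. The sketch separates the database call from the row mapping and formatting so the application logic can be checked without a running cluster.

```python
"""Fetch per-symbol results from a materialized view and format them."""
from dataclasses import dataclass
from decimal import Decimal

# Assumed connection settings for a local RisingWave instance.
RW_CONN = {"host": "localhost", "port": 4566, "user": "root", "dbname": "dev"}

# Hypothetical materialized view and columns, for illustration only.
QUOTE_SQL = """
    SELECT symbol, avg_price, volume
    FROM stock_summary_mv
    WHERE symbol = %s
"""


@dataclass(frozen=True)
class Quote:
    symbol: str
    avg_price: Decimal
    volume: int


def to_quotes(rows):
    """Convert raw (symbol, avg_price, volume) tuples into Quote objects."""
    return [
        Quote(symbol, Decimal(str(price)), int(volume))
        for symbol, price, volume in rows
    ]


def format_quotes(quotes):
    """Render each quote as a fixed-width line for display."""
    return [
        f"{q.symbol:<6} {q.avg_price:>10.2f} {q.volume:>10d}" for q in quotes
    ]


def load_quotes(symbol, conn_params=RW_CONN):
    """Query RisingWave for one symbol and return Quote objects."""
    import psycopg2  # lazy import; requires the psycopg2 package

    conn = psycopg2.connect(**conn_params)
    try:
        with conn.cursor() as cur:
            cur.execute(QUOTE_SQL, (symbol,))
            return to_quotes(cur.fetchall())
    finally:
        conn.close()
```

In an application, print("\n".join(format_quotes(load_quotes("AAPL")))) would display the current row for that symbol.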

Isolate serving and streaming workloads

For production environments with demanding workloads, you can configure dedicated nodes for serving and streaming to prevent resource contention. By default, Compute Nodes in RisingWave are hybrid and handle both types of workloads. However, running resource-intensive ad-hoc queries can impact the performance of ongoing streaming jobs. To ensure stability and performance, you can set up dedicated Serving Nodes to handle only ad-hoc queries and Streaming Nodes to handle only stream processing. This separation provides workload isolation, allowing you to scale each component independently based on your specific needs. For detailed instructions, see Set up a dedicated Compute Node.

Choose the right method

From the methods described above, select the one that best fits your needs, considering factors like data access patterns, integration requirements, and team expertise. RisingWave ensures consistency across all methods.
