
What is a streaming database?

A streaming database is a database system designed to continuously process data as it arrives, rather than processing data only when a query is executed. In a streaming database, computations are defined in advance (typically as SQL queries) and results are incrementally updated in real time whenever new data enters the system. RisingWave is a streaming database. You define streaming pipelines using standard PostgreSQL-compatible SQL, and RisingWave continuously maintains the results as materialized views. When you query a materialized view, the result is returned instantly because the computation has already been performed.

Streaming database vs. traditional database

In a traditional OLTP or OLAP database, data is written first and processed later when a user or application sends a query. This “compute on read” model means query latency depends on data volume and query complexity. A streaming database flips this model to “compute on write.” Data is processed as it arrives, and results are stored and ready to serve. Query latency is near-constant regardless of data volume because the heavy computation is done incrementally at ingestion time.

| | Traditional database | Streaming database |
|---|---|---|
| When computation happens | On query (read time) | On data arrival (write time) |
| Query latency | Depends on data volume and query complexity | Near-constant (pre-computed results) |
| Data freshness | Current at query time | Continuously updated in real time |
| Primary interface | Ad-hoc SQL queries | Materialized views + ad-hoc queries |
| Best for | OLTP, ad-hoc analytics | Real-time analytics, monitoring, event-driven apps |
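
The difference between "compute on read" and "compute on write" can be illustrated with a small conceptual sketch. This is plain Python, not RisingWave code, and it says nothing about how RisingWave is implemented internally. The event data, `total_on_read`, and `RunningTotals` are hypothetical names made up for the illustration.

```python
# Hypothetical event stream of (user_id, purchase_amount) pairs.
events = [("alice", 30), ("bob", 20), ("alice", 15)]


def total_on_read(log, user):
    """Compute on read: scan every stored event when the query arrives."""
    return sum(amount for u, amount in log if u == user)


class RunningTotals:
    """Compute on write: keep a stored result and update it per event."""

    def __init__(self):
        self.totals = {}

    def ingest(self, user, amount):
        # Incremental update: constant work per event, done at ingestion time.
        self.totals[user] = self.totals.get(user, 0) + amount

    def query(self, user):
        # Serving a query is a lookup; its cost does not grow with event count.
        return self.totals.get(user, 0)


view = RunningTotals()
for user, amount in events:
    view.ingest(user, amount)

print(total_on_read(events, "alice"))  # 45
print(view.query("alice"))             # 45
```

Both approaches return the same answer. The on-read version does work proportional to the number of stored events for every query, while the on-write version pays a small cost per arriving event and answers queries from the stored result.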

Streaming database vs. stream processing engine

Stream processing engines like Apache Flink, Apache Spark Streaming, and Kafka Streams process data in motion, but they are not databases. Key differences include:

| | Stream processing engine | Streaming database |
|---|---|---|
| Storage | External (Kafka, S3, HDFS) | Built-in persistent storage |
| Query serving | No built-in serving layer | Built-in SQL query serving |
| Interface | Java/Scala API or custom SQL dialect | PostgreSQL-compatible SQL |
| State management | Custom (checkpoints to external storage) | Integrated (Hummock object storage) |
| Deployment | Separate cluster + storage + serving layer | Single system |
| Ad-hoc queries | Not supported | Full SQL support |

A streaming database combines the continuous processing capabilities of a stream processing engine with the storage and query serving capabilities of a database — in a single system.

Streaming database vs. real-time OLAP database

Real-time OLAP databases like ClickHouse, Apache Druid, and Apache Pinot are optimized for fast analytical queries over large datasets. They differ from streaming databases in their approach to freshness and computation:

| | Real-time OLAP database | Streaming database |
|---|---|---|
| Storage engine | Columnar (optimized for scans) | Row-based (optimized for incremental updates) |
| Data freshness | Near real-time (micro-batch ingestion) | True real-time (event-level incremental updates) |
| Materialized views | Periodic refresh (best-effort) | Incrementally maintained, strongly consistent |
| Optimization target | Ad-hoc query performance | Result freshness and continuous computation |
| Best for | Interactive analytics, dashboards over historical data | Real-time monitoring, event-driven applications, streaming ETL |

Streaming databases and OLAP databases are complementary. A common architecture uses a streaming database to process and transform data in real time, then sinks the results to an OLAP database for interactive analytics.

Why RisingWave?

RisingWave is a streaming database designed to reduce the complexity and cost of building real-time data applications:
  • PostgreSQL-compatible SQL — no new language or API to learn.
  • Incrementally maintained materialized views — results are always up to date without manual refresh.
  • Cascading materialized views — build multi-layered streaming pipelines entirely in SQL.
  • Built-in connectors — ingest from Kafka, databases (CDC), object storage, and more; deliver to Kafka, Iceberg, Snowflake, PostgreSQL, and many other systems.
  • Cloud-native architecture — decoupled compute and storage with object storage (S3) backend for cost efficiency.
  • Fully open source — Apache License 2.0 with no resource caps or usage restrictions.
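
The idea behind cascading materialized views in the list above can be sketched conceptually: one incrementally maintained result feeds another, and only changes flow between layers. The Python below is an illustration under that assumption, not RisingWave's implementation. In RisingWave each layer would be defined in SQL as a materialized view; the class names `UserTotals` and `BigSpenderCount` and the threshold value are hypothetical.

```python
class BigSpenderCount:
    """Layer 2: number of users whose total is at least `threshold`."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0

    def on_change(self, old_total, new_total):
        # Adjust the count only when a user crosses the threshold.
        was_big = old_total >= self.threshold
        is_big = new_total >= self.threshold
        self.count += int(is_big) - int(was_big)


class UserTotals:
    """Layer 1: per-user running total that forwards each change downstream."""

    def __init__(self, downstream):
        self.totals = {}
        self.downstream = downstream

    def ingest(self, user, amount):
        old = self.totals.get(user, 0)
        new = old + amount
        self.totals[user] = new
        # Propagate only the change (old -> new), not a full recomputation.
        self.downstream.on_change(old, new)


big_spenders = BigSpenderCount(threshold=40)
totals = UserTotals(downstream=big_spenders)
for user, amount in [("alice", 30), ("bob", 20), ("alice", 15), ("bob", 25)]:
    totals.ingest(user, amount)

print(totals.totals)       # {'alice': 45, 'bob': 45}
print(big_spenders.count)  # 2
```

Each arriving event updates the first layer, and the second layer stays current by consuming that layer's changes rather than rescanning its full contents.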