RisingWave vs. Flink: Feature-by-feature comparison
A comprehensive comparison of features between RisingWave and Apache Flink, covering SQL capabilities, data types, streaming operations, and system functionalities.
This document provides a detailed feature-by-feature comparison between RisingWave (targeting v2.0) and Apache Flink (targeting v1.20). While both systems excel at stream processing, they have different architectural approaches and feature sets resulting from their different designs: RisingWave as a streaming database (offering an integrated, SQL-centric experience) and Flink as a stream processing framework (providing flexibility through integration with external components). This comparison helps you understand their similarities and differences across various features.
Features noted as “RisingWave-specific” highlight capabilities tied to its database architecture, while “Flink-specific” features reflect its framework nature. Version numbers (RW 2.0, Flink 1.20) are targets; some features might have evolved slightly around these versions. For definitive support details, always consult the official Flink and RisingWave documentation.
Fundamental concepts
RisingWave and Flink share the following core stream processing concepts:
- Dynamic Tables: Both represent streams as tables that change over time.
- Continuous Queries: Both execute SQL queries continuously, producing results as input data changes.
- Time attributes: Both support Event Time (timestamps embedded in data) and Processing Time (the system clock time) for operations such as windowing.
- Result Update Modes: Both use Append, Update, and Delete semantics to handle changes, which allows them to correctly process changelog streams and maintain state.
- Deterministic Queries: Both systems aim for deterministic results given the same ordered input events when using event time.
While sharing these fundamentals, their architectures differ: RisingWave stores state internally within its database storage layer, treating streams and materialized views as first-class database objects. Flink typically uses pluggable state backends (such as RocksDB or filesystems), and its connectors manage external storage. This integrated approach in RisingWave simplifies state management compared to Flink’s requirement of configuring and managing external state backends.
Data types
Both systems support a wide range of standard SQL data types. The following table highlights common types and key differences.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
Fixed-Length | |||
Fixed-Length Char | CHAR | Not Supported | Fixed-length character string. |
Fixed-Length Binary | BINARY | Not Supported | Fixed-length binary string. |
Variable-Length | |||
Variable-Length Char | VARCHAR /STRING | VARCHAR /STRING | Flink allows specifying a maximum length; RisingWave does not allow a maximum length specification. |
Variable-Length Binary | VARBINARY /BYTES | BYTEA | Flink uses VARBINARY or BYTES . RisingWave uses the PostgreSQL-compatible BYTEA . |
Numeric | |||
Arbitrary Precision Decimal | DECIMAL | DECIMAL /NUMERIC | Both support arbitrary precision numbers. Flink uses DECIMAL . RisingWave uses the PostgreSQL-compatible DECIMAL or NUMERIC . |
1-Byte Integer | TINYINT | Not Supported | |
2-Byte Integer | SMALLINT | SMALLINT | |
4-Byte Integer | INT | INT | |
8-Byte Integer | BIGINT | BIGINT | |
32-Byte Integer | Not Supported | RW_INT256 | RisingWave-specific large integer. |
Single-Precision Float | FLOAT | REAL | Flink uses FLOAT . RisingWave uses the PostgreSQL-compatible REAL . |
Double-Precision Float | DOUBLE | DOUBLE PRECISION | Flink uses DOUBLE . RisingWave uses the PostgreSQL-compatible DOUBLE PRECISION . |
Temporal | |||
Date | DATE | DATE | Calendar date. |
Time (no timezone) | TIME | TIME | Time of day without timezone. |
Timestamp (no timezone) | TIMESTAMP | TIMESTAMP | Timestamp without timezone. |
Timestamp (with timezone) | TIMESTAMP_LTZ | TIMESTAMPTZ | Flink uses TIMESTAMP_LTZ (local timezone interpretation). RisingWave uses the PostgreSQL-compatible TIMESTAMPTZ (stored as UTC). |
Interval | INTERVAL | INTERVAL | Time interval. |
Structured | |||
Structured Row | STRUCT | STRUCT | |
Array | ARRAY | ARRAY | Ordered collection of elements. |
Map | MAP | MAP | Key-value pairs. |
Multiset | MULTISET | Not Supported | Unordered collection allowing duplicates (Flink). |
JSON Handling | Via Functions | Native JSONB Type | Flink processes JSON primarily using string functions. RisingWave offers a native JSONB type (binary JSON), enabling efficient storage and indexed querying directly via SQL, similar to PostgreSQL. |
Other | |||
Boolean | BOOLEAN | BOOLEAN | True/False. |
Summary: Flink offers fixed-length types and MULTISET
. RisingWave provides native JSONB
, a large integer type (RW_INT256
), and several PostgreSQL-compatible types (REAL
, TIMESTAMPTZ
, BYTEA
, NUMERIC
). Both support the core data types required for streaming analytics. Timezone handling differs: Flink uses TIMESTAMP_LTZ
for local timezone interpretation, while RisingWave uses the PostgreSQL-standard TIMESTAMPTZ
. RisingWave’s native JSONB
support offers significant advantages for handling semi-structured data compared to Flink’s function-based approach.
SQL query capabilities
Common SELECT clauses
Flink SQL and RisingWave both support the standard clauses of a SELECT
query:
WITH
(Common Table Expressions)SELECT
(includingDISTINCT
)FROM
WHERE
GROUP BY
(includingGROUPING SETS
,ROLLUP
, andCUBE
)HAVING
ORDER BY
LIMIT
Key differences in SELECT
DISTINCT ON
: RisingWave supports the PostgreSQLSELECT DISTINCT ON (...)
syntax. Flink 1.20 has limited or no support for this specific syntax.- Complex Subqueries: Both support subqueries (e.g., in
WHERE IN (...)
,WHERE EXISTS (...)
, derived tables inFROM
). Flink’s SQL planner might optimize certain complex or correlated subquery patterns more effectively. - Pattern Matching: Flink provides the
MATCH_RECOGNIZE
clause for Complex Event Processing (CEP) directly within SQL. RisingWave does not supportMATCH_RECOGNIZE
.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
DISTINCT ON | Limited/No | Supported | PostgreSQL-style unique row selection, offering a concise way to get the first row per group, a common requirement not directly supported by standard SQL or Flink’s MATCH_RECOGNIZE . |
MATCH_RECOGNIZE | Supported | Not Supported | SQL standard for Complex Event Processing. |
Windowing operations
Both systems provide essential windowing capabilities for analyzing data over time or rows.
Window types (Table Value Functions)
Flink uses Table Value Functions (TVFs) like TUMBLE
, HOP
, SESSION
, CUMULATE
for windowing aggregations. RisingWave primarily uses standard SQL GROUP BY
with time functions or dedicated window syntax where applicable. Support for TVFs is also being added to RisingWave.
Window Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
Tumbling Window | Supported (TVF) | Supported | Fixed-size, non-overlapping windows. |
Sliding Window (Hop) | Supported (TVF) | Supported | Fixed-size, overlapping windows. |
Session Window | Supported (TVF) | Supported | Windows defined by gaps of inactivity. |
Cumulative Window | Supported (TVF) | Not Supported | Windows expanding from a start time to window_end . |
Late Data Handling | Supported | Supported | Mechanisms to handle data arriving after window close. |
Window Offset | Supported | Supported | Adjusting window start/end times. |
Window functions (OVER clause)
Both support standard SQL window functions using the OVER
clause for calculations across sets of table rows.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
OVER (...) | Supported | Supported | Defines the window specification. |
Ranking | Supported | Supported | RANK , DENSE_RANK , ROW_NUMBER , NTILE . |
Navigation | Supported | Supported | LEAD , LAG . |
Value | Supported | Supported | FIRST_VALUE , LAST_VALUE . |
Note: Flink’s NTILE
is a window function, not directly related to percentile aggregates.
Joins
Both systems support various SQL join types for combining data from multiple streams or tables.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
Regular Joins | |||
INNER JOIN | Supported | Supported | |
LEFT JOIN | Supported | Supported | |
RIGHT JOIN | Supported | Supported | |
FULL JOIN | Supported | Supported | |
State Handling Note | Configurable TTL | Internal Storage with temporal filters and configurable TTL | Flink requires manual configuration of Time-To-Live (TTL) for join state. RisingWave simplifies state management: it uses temporal filters to automatically prune state outside the required time range, and supports configurable TTL for append-only sources, both saving storage space with straightforward configuration. |
Interval Join | Supported | Supported | Joins streams based on time proximity (e.g., t1.time BETWEEN t2.time - INTERVAL '5' SECOND AND t2.time + INTERVAL '5' SECOND ). |
Temporal Join | Joins a stream against a versioned table/changelog source. | ||
Event Time Temporal Join | Supported (FOR ... AS OF ) | Supported (Implicit / TEMPORAL JOIN ) | Flink requires explicit syntax. RisingWave support depends on the source type and query structure. |
Processing Time Temporal Join | Supported (FOR ... AS OF ) | Supported (Lookup Joins) | Flink requires explicit syntax. RisingWave typically achieves this using lookup joins against external tables. |
Window Join | Supported | Supported | Joins aggregated results from windows. |
Lookup Join | Supported | Supported | Joins a stream against an external dimension table (often async). |
Set operations
Both support standard SQL set operations to combine or compare result sets.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
UNION | Supported | Supported | Combines results, removes duplicates. |
UNION ALL | Supported | Supported | Combines results, keeps duplicates. |
INTERSECT | Supported | Supported | Returns common rows, removes duplicates. |
INTERSECT ALL | Supported | Not Supported | Returns common rows, keeps duplicates. |
EXCEPT | Supported | Supported | Returns unique rows from first set not in second. |
EXCEPT ALL | Supported | Not Supported | Returns rows from first set not in second, keeps duplicates. |
IN (Subquery) | Supported | Supported | Checks for membership in a subquery result. |
EXISTS (Subquery) | Supported | Supported | Checks if a subquery returns any rows. |
CORRESPONDING | Not Supported | Supported | Performs set ops on specified columns by name (RisingWave-specific), simplifying queries when column orders differ but names match. |
DDL statements (Data Definition Language)
As a streaming database, RisingWave has a broader set of Data Definition Language (DDL) commands, especially for managing sources, sinks, users, and connections.
Common DDL
Operation | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
CREATE/DROP DATABASE | Supported | Supported | Logical grouping of objects. |
CREATE/DROP VIEW | Supported | Supported | Defines a logical view based on a query. |
CREATE/DROP FUNCTION | Supported | Supported | Defines user-defined functions. |
CREATE/DROP TABLE | Supported | Supported | Flink: Defines schema/connector. RW: Defines schema for source/table. |
CREATE MATERIALIZED VIEW | Supported | Supported | Defines a query whose results are stored. Core concept in RisingWave. |
DROP MATERIALIZED VIEW | Supported | Supported | Removes a materialized view. |
ALTER MATERIALIZED VIEW | Limited/No | Supported | Modifies a materialized view (e.g., rename, set parallelism). Flink support may vary. |
ALTER DATABASE/VIEW/FUNCTION | Supported | Supported | Modifies existing objects (scope varies). |
RisingWave-specific DDL
RisingWave includes the following DDL commands to manage its database-specific entities:
CREATE/DROP/ALTER SOURCE
: Define connections to external data sources (e.g., Kafka, Kinesis).CREATE/DROP/ALTER SINK
: Define destinations for outputting data.CREATE/DROP/ALTER CONNECTION
: Reusable connection configurations for sources/sinks.CREATE/DROP/ALTER SCHEMA
: Organize objects within a database.CREATE/DROP/ALTER USER
: Manage database users.CREATE/DROP SECRET
: Securely store credentials.CREATE/DROP/ALTER INDEX
: Create indexes on materialized views/tables for faster lookups.CREATE/DROP AGGREGATE
: Define custom aggregate functions.ALTER SYSTEM SET
: Modify system-level configuration parameters.
This integrated DDL allows managing the entire data flow (sources, processing, sinks, access) directly through SQL, a key advantage of the streaming database approach.
Flink-specific DDL aspects
CREATE OR REPLACE TABLE
: Flink supports this atomic replacement syntax.ALTER TABLE
: Both systems supportALTER TABLE
, but capabilities differ. Flink’s focus is often on schema evolution via connectors/formats. RisingWave supports various alterations likeADD/DROP COLUMN
,RENAME TO
,OWNER TO
,SET SCHEMA
,SET PARALLELISM
,SET SOURCE_RATE_LIMIT
, andREFRESH SCHEMA
(see RW docs for specifics), providing significant table management via SQL.
DML statements (Data Manipulation Language)
As a database, RisingWave allows standard DML statements to interact with tables and trigger materialized view updates directly, offering a familiar pattern for database users. Data Manipulation Language (DML) statements interact with data in tables or trigger updates to materialized views.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
INSERT INTO ... SELECT ... | Supported | Supported | Inserts data from a query result. Triggers MV updates. |
INSERT INTO ... VALUES ... | Supported | Supported | Inserts explicit rows. Triggers MV updates. |
UPDATE ... SET ... WHERE ... | Supported | Supported | Updates existing rows based on conditions. Triggers MV updates. |
DELETE FROM ... WHERE ... | Supported | Supported | Deletes rows based on conditions. Triggers MV updates. |
Multi-table INSERT | Supported | Not Supported | Flink can insert into multiple tables from a single source query. |
FLUSH | Not Supported | Supported | RisingWave-specific: Ensures visibility of preceding DML changes in subsequent queries within a session (mainly for sinks). |
SET [key = value] | Not Supported | Supported | Modifies session configuration settings. |
Introspection and utility statements
Commands for exploring metadata, explaining queries, and managing sessions.
Common statements
SHOW DATABASES | TABLES | VIEWS | FUNCTIONS | JOBS
: List common objects or entities (supported by both).DESCRIBE <object>
: Shows object metadata (e.g., columns, types). In Flink,<object>
can be a table, view, or catalog. In RisingWave,<object>
can be a table, source, view, sink, or materialized view.EXPLAIN [statement]
: Shows the logical and physical execution plan for a query.USE [database/catalog]
: Sets the current context for queries.SET [key = value]
: Modifies session configuration settings.
RisingWave-specific SHOW commands
Reflecting its database architecture, RisingWave offers significantly more comprehensive SQL commands for system introspection compared to Flink:
SHOW CLUSTER
,SHOW PROCESSLIST
,SHOW PARAMETERS
SHOW CONNECTIONS
,SHOW SOURCES
,SHOW SINKS
,SHOW SCHEMAS
SHOW INDEX
,SHOW MATERIALIZED VIEWS
,SHOW INTERNAL TABLES
SHOW CREATE ...
for various RW objects (Sources, Sinks, MVs, etc.)SHOW CURSORS
,SHOW SUBSCRIPTION CURSORS
Flink-specific SHOW commands
SHOW JARS
: Lists user-uploaded JAR files (relevant for Java UDFs).SHOW PARTITIONS
: Shows partitions for partitioned tables (common in batch/Hive integration).SHOW CURRENT DATABASE
: Flink command. RW usesSELECT current_database()
.
Job management
Both systems allow managing running streaming queries/jobs.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
SHOW JOBS | Supported | Supported | Lists active streaming jobs/pipelines. |
CANCEL JOB [id] | Supported | Supported | Stops a specific running job/pipeline. |
System management and monitoring interface
A key difference lies in how users interact with the system for management and monitoring tasks.
-
RisingWave (SQL-centric): Leverages its database architecture to provide a unified SQL interface for many operational tasks:
- Configuration: Runtime parameters (
SET
/SHOW
) and persistent system-wide parameters (ALTER SYSTEM SET
/SHOW PARAMETERS
) are managed via SQL. - Monitoring: Job status (
SHOW JOBS
), active processes (SHOW PROCESSLIST
), DDL progress, and detailed internal states (via system catalogs) are accessible through SQL commands. - Alteration: Modifying running jobs (e.g., changing parallelism with
ALTER ... SET PARALLELISM
) and stopping jobs (CANCEL JOBS
) are done directly with SQL. - Advantage: Offers a consistent, familiar SQL interface for configuration, monitoring, and management, potentially simplifying operations for users comfortable with databases.
- Configuration: Runtime parameters (
-
Flink (API/UI/Config File-centric): As a framework, Flink relies on a combination of interfaces:
- Configuration: System configuration is primarily done via configuration files (
flink-conf.yaml
) or command-line options. Flink SQLSET
affects session-specific query behavior. - Monitoring: Rich monitoring is available through the Web UI, REST API, and integration with external metrics systems (e.g., Prometheus, Grafana).
- Alteration: Stopping jobs (
flink cancel/stop
) and modifying aspects like parallelism typically involve the CLI, REST API, or UI, often requiring a job restart (potentially with savepoints). - Note: While powerful, requires interacting with multiple different tools and interfaces (SQL client, config files, web browser, CLI, potentially external monitoring dashboards) for full system management.
- Configuration: System configuration is primarily done via configuration files (
Materialized views
While both support the concept, their role differs significantly.
- RisingWave: Materialized Views are a core architectural concept, central to its performance and ease of use. Queries are typically defined as MVs, which RisingWave keeps incrementally, efficiently, and automatically updated. State management is built around these MVs, providing users with consistently fresh results via simple SQL queries.
- Flink: Materialized Views (or Materialized Tables) are an optional feature (experimental/evolving in 1.20). Flink can materialize query results using specific connectors, but it’s not the default way queries are executed or state is managed.
Access control (RisingWave-specific)
As a database system, RisingWave provides standard SQL access control.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
GRANT | Not Supported | Supported | Grants privileges on objects. Provides standard, granular SQL-based security within the system. |
REVOKE | Not Supported | Supported | Revokes privileges from users. Provides standard, granular SQL-based security within the system. |
Flink typically relies on external systems (like cluster managers or catalog providers) for access control.
User-defined Functions (UDFs)
Both systems allow you to extend SQL with custom logic using User-defined Functions (UDFs).
- Supported Languages:
- Flink (1.20): Java, Scala, and Python (primarily via an external Remote Procedure Call (RPC) mechanism).
- RisingWave (2.0): SQL, Python, Java (via Java Native Interface (JNI)), JavaScript, and Rust (the latter via embedded WebAssembly (WASM) or Foreign Function Interface (FFI)).
- Execution Models:
- Flink: Embedded JVM functions (Java/Scala), External RPC functions (Python).
- RisingWave: Embedded SQL functions, Embedded language functions via WASM/FFI/JNI.
- Function Types: Both support Scalar Functions (UDF), Aggregate Functions (UDAF), and Table Functions (UDTF), although the specific implementation details and language support for each type can vary.
- SQL UDFs: RisingWave allows you to define simple scalar functions directly using SQL expressions, offering a very accessible way to encapsulate reusable logic; Flink does not offer this capability.
Summary: Both offer extensibility, but RisingWave provides native SQL UDFs and uses WASM/FFI for several embedded languages, while Flink primarily uses JVM languages and an RPC mechanism for Python.
Built-in functions
Both Flink and RisingWave provide a rich library of built-in functions covering standard SQL categories. Listing every function is impractical here; instead, we highlight general coverage and key differences.
-
Common Coverage: Both systems offer extensive support for:
- Comparison operators (
=
,>
,<
, etc.) and predicates (IS NULL
,BETWEEN
). - Logical operators (
AND
,OR
,NOT
). - Standard arithmetic operators (
+
,-
,*
,/
,%
) and functions (POWER
,SQRT
,ABS
,ROUND
,CEIL
,FLOOR
). - Common string manipulation functions (
CONCAT
,SUBSTRING
,UPPER
,LOWER
,TRIM
,REPLACE
,LENGTH
,POSITION
,LIKE
). - Basic temporal functions (
EXTRACT
,DATE_FORMAT
/TO_CHAR
, casting between date/time types). - Standard aggregate functions (
COUNT
,SUM
,AVG
,MAX
,MIN
,ARRAY_AGG
,STRING_AGG
). - Conditional expressions (
CASE WHEN ... END
,COALESCE
,NULLIF
). - Type casting (
CAST
). - Basic Array and Map functions/operators.
- Set returning functions like
generate_series
.
- Comparison operators (
-
Key Differences & Specific Functions:
- Flink Specific:
- URL parsing:
PARSE_URL
function. - Safe Casting:
TRY_CAST
(returns NULL on failure instead of error). - More extensive built-in time functions:
LOCALTIME
,CURRENT_TIME
, etc. - Potentially more advanced collection functions (e.g.,
ARRAY_TRANSFORM
). - Specific functions related to
MATCH_RECOGNIZE
.
- URL parsing:
- RisingWave Specific:
- Native
JSONB
operators (->
,->>
,#>
, etc.) and functions. - PostgreSQL compatibility functions (e.g.,
pg_typeof
,pg_sleep
). - Encryption functions:
encrypt()
,decrypt()
. TIMESTAMPTZ
handling functions.- Uses the standard SQL
CAST
function, which typically errors on failure during query execution (though behavior within materialized view updates might differ).
- Native
- Flink Specific:
Recommendation: Always consult the official Flink SQL and RisingWave SQL documentation for the most current and comprehensive list of supported built-in functions.
System management & information
- RisingWave: Provides SQL functions and
SHOW
commands specifically designed for managing and monitoring the streaming database system itself, providing information such as cluster status, internal tables, process lists, and system parameters. - Flink: Relies on its framework APIs, metrics system, and external tools (like its Web UI or platform integrations) for system management and monitoring, rather than using built-in SQL functions for these tasks.
Other features
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
Catalogs | Supported | Supported | Manage metadata namespaces (e.g., connecting to external catalogs). |
SQL Client | Supported | Supported | Interactive command-line interface for SQL execution. RisingWave supports psql and any Postgres-compatible SQL client. |
SQL Gateway | Supported | Supported | Service for accepting remote SQL submissions (e.g., via REST/JDBC). |
Conclusion
RisingWave and Flink SQL, while both powerful stream processing tools, cater to slightly different needs and design philosophies, which are reflected in their SQL dialects and features.
- Flink SQL (1.20) is a flexible processing framework with strong Hive integration,
MATCH_RECOGNIZE
for CEP, and specific features likeTRY_CAST
. - RisingWave SQL (2.0) offers a PostgreSQL-compatible, integrated streaming database experience. Key advantages include core materialized views for simplicity and performance, native
JSONB
handling, unified SQL DDL/DML for managing sources, sinks, and access control, and automatic internal state management.
Your choice between them depends on whether you need a standalone, easy-to-manage streaming database (RisingWave) or a flexible stream processing framework requiring integration with external systems for state, storage, and management (Flink).
Was this page helpful?