This document provides a detailed feature-by-feature comparison between RisingWave (targeting v2.0) and Apache Flink (targeting v1.20). While both systems excel at stream processing, they have different architectural approaches and feature sets resulting from their different designs: RisingWave as a streaming database (offering an integrated, SQL-centric experience) and Flink as a stream processing framework (providing flexibility through integration with external components). This comparison helps you understand their similarities and differences across various features.

Features noted as “RisingWave-specific” highlight capabilities tied to its database architecture, while “Flink-specific” features reflect its framework nature. Version numbers (RW 2.0, Flink 1.20) are targets; some features might have evolved slightly around these versions. For definitive support details, always consult the official Flink and RisingWave documentation.

Fundamental concepts

RisingWave and Flink share the following core stream processing concepts:

  • Dynamic Tables: Both represent streams as tables that change over time.
  • Continuous Queries: Both execute SQL queries continuously, producing results as input data changes.
  • Time attributes: Both support Event Time (timestamps embedded in data) and Processing Time (the system clock time) for operations such as windowing.
  • Result Update Modes: Both use Append, Update, and Delete semantics to handle changes, which allows them to correctly process changelog streams and maintain state.
  • Deterministic Queries: Both systems aim for deterministic results given the same ordered input events when using event time.

While sharing these fundamentals, their architectures differ: RisingWave stores state internally within its database storage layer, treating streams and materialized views as first-class database objects. Flink typically uses pluggable state backends (such as RocksDB or filesystems), and its connectors manage external storage. This integrated approach in RisingWave simplifies state management compared to Flink’s requirement of configuring and managing external state backends.

Data types

Both systems support a wide range of standard SQL data types. The following table highlights common types and key differences.

FeatureFlink (v1.20)RisingWave (v2.0)Notes
Fixed-Length
Fixed-Length CharCHARNot SupportedFixed-length character string.
Fixed-Length BinaryBINARYNot SupportedFixed-length binary string.
Variable-Length
Variable-Length CharVARCHAR/STRINGVARCHAR/STRINGFlink allows specifying a maximum length; RisingWave does not allow a maximum length specification.
Variable-Length BinaryVARBINARY/BYTESBYTEAFlink uses VARBINARY or BYTES. RisingWave uses the PostgreSQL-compatible BYTEA.
Numeric
Arbitrary Precision DecimalDECIMALDECIMAL/NUMERICBoth support arbitrary precision numbers. Flink uses DECIMAL. RisingWave uses the PostgreSQL-compatible DECIMAL or NUMERIC.
1-Byte IntegerTINYINTNot Supported
2-Byte IntegerSMALLINTSMALLINT
4-Byte IntegerINTINT
8-Byte IntegerBIGINTBIGINT
32-Byte IntegerNot SupportedRW_INT256RisingWave-specific large integer.
Single-Precision FloatFLOATREALFlink uses FLOAT. RisingWave uses the PostgreSQL-compatible REAL.
Double-Precision FloatDOUBLEDOUBLE PRECISIONFlink uses DOUBLE. RisingWave uses the PostgreSQL-compatible DOUBLE PRECISION.
Temporal
DateDATEDATECalendar date.
Time (no timezone)TIMETIMETime of day without timezone.
Timestamp (no timezone)TIMESTAMPTIMESTAMPTimestamp without timezone.
Timestamp (with timezone)TIMESTAMP_LTZTIMESTAMPTZFlink uses TIMESTAMP_LTZ (local timezone interpretation). RisingWave uses the PostgreSQL-compatible TIMESTAMPTZ (stored as UTC).
IntervalINTERVALINTERVALTime interval.
Structured
Structured RowSTRUCTSTRUCT
ArrayARRAYARRAYOrdered collection of elements.
MapMAPMAPKey-value pairs.
MultisetMULTISETNot SupportedUnordered collection allowing duplicates (Flink).
JSON HandlingVia FunctionsNative JSONB TypeFlink processes JSON primarily using string functions. RisingWave offers a native JSONB type (binary JSON), enabling efficient storage and indexed querying directly via SQL, similar to PostgreSQL.
Other
BooleanBOOLEANBOOLEANTrue/False.

Summary: Flink offers fixed-length types and MULTISET. RisingWave provides native JSONB, a large integer type (RW_INT256), and several PostgreSQL-compatible types (REAL, TIMESTAMPTZ, BYTEA, NUMERIC). Both support the core data types required for streaming analytics. Timezone handling differs: Flink uses TIMESTAMP_LTZ for local timezone interpretation, while RisingWave uses the PostgreSQL-standard TIMESTAMPTZ. RisingWave’s native JSONB support offers significant advantages for handling semi-structured data compared to Flink’s function-based approach.

SQL query capabilities

Common SELECT clauses

Flink SQL and RisingWave both support the standard clauses of a SELECT query:

  • WITH (Common Table Expressions)
  • SELECT (including DISTINCT)
  • FROM
  • WHERE
  • GROUP BY (including GROUPING SETS, ROLLUP, and CUBE)
  • HAVING
  • ORDER BY
  • LIMIT

Key differences in SELECT

  • DISTINCT ON: RisingWave supports the PostgreSQL SELECT DISTINCT ON (...) syntax. Flink 1.20 has limited or no support for this specific syntax.
  • Complex Subqueries: Both support subqueries (e.g., in WHERE IN (...), WHERE EXISTS (...), derived tables in FROM). Flink’s SQL planner might optimize certain complex or correlated subquery patterns more effectively.
  • Pattern Matching: Flink provides the MATCH_RECOGNIZE clause for Complex Event Processing (CEP) directly within SQL. RisingWave does not support MATCH_RECOGNIZE.
FeatureFlink (v1.20)RisingWave (v2.0)Notes
DISTINCT ONLimited/NoSupportedPostgreSQL-style unique row selection, offering a concise way to get the first row per group, a common requirement not directly supported by standard SQL or Flink’s MATCH_RECOGNIZE.
MATCH_RECOGNIZESupportedNot SupportedSQL standard for Complex Event Processing.

Windowing operations

Both systems provide essential windowing capabilities for analyzing data over time or rows.

Window types (Table Value Functions)

Flink uses Table Value Functions (TVFs) like TUMBLE, HOP, SESSION, CUMULATE for windowing aggregations. RisingWave primarily uses standard SQL GROUP BY with time functions or dedicated window syntax where applicable. Support for TVFs is also being added to RisingWave.

Window FeatureFlink (v1.20)RisingWave (v2.0)Notes
Tumbling WindowSupported (TVF)SupportedFixed-size, non-overlapping windows.
Sliding Window (Hop)Supported (TVF)SupportedFixed-size, overlapping windows.
Session WindowSupported (TVF)SupportedWindows defined by gaps of inactivity.
Cumulative WindowSupported (TVF)Not SupportedWindows expanding from a start time to window_end.
Late Data HandlingSupportedSupportedMechanisms to handle data arriving after window close.
Window OffsetSupportedSupportedAdjusting window start/end times.

Window functions (OVER clause)

Both support standard SQL window functions using the OVER clause for calculations across sets of table rows.

FeatureFlink (v1.20)RisingWave (v2.0)Notes
OVER (...)SupportedSupportedDefines the window specification.
RankingSupportedSupportedRANK, DENSE_RANK, ROW_NUMBER, NTILE.
NavigationSupportedSupportedLEAD, LAG.
ValueSupportedSupportedFIRST_VALUE, LAST_VALUE.

Note: Flink’s NTILE is a window function, not directly related to percentile aggregates.

Joins

Both systems support various SQL join types for combining data from multiple streams or tables.

FeatureFlink (v1.20)RisingWave (v2.0)Notes
Regular Joins
INNER JOINSupportedSupported
LEFT JOINSupportedSupported
RIGHT JOINSupportedSupported
FULL JOINSupportedSupported
State Handling NoteConfigurable TTLInternal Storage with temporal filters and configurable TTLFlink requires manual configuration of Time-To-Live (TTL) for join state. RisingWave simplifies state management: it uses temporal filters to automatically prune state outside the required time range, and supports configurable TTL for append-only sources, both saving storage space with straightforward configuration.
Interval JoinSupportedSupportedJoins streams based on time proximity (e.g., t1.time BETWEEN t2.time - INTERVAL '5' SECOND AND t2.time + INTERVAL '5' SECOND).
Temporal JoinJoins a stream against a versioned table/changelog source.
Event Time Temporal JoinSupported (FOR ... AS OF)Supported (Implicit / TEMPORAL JOIN)Flink requires explicit syntax. RisingWave support depends on the source type and query structure.
Processing Time Temporal JoinSupported (FOR ... AS OF)Supported (Lookup Joins)Flink requires explicit syntax. RisingWave typically achieves this using lookup joins against external tables.
Window JoinSupportedSupportedJoins aggregated results from windows.
Lookup JoinSupportedSupportedJoins a stream against an external dimension table (often async).

Set operations

Both support standard SQL set operations to combine or compare result sets.

FeatureFlink (v1.20)RisingWave (v2.0)Notes
UNIONSupportedSupportedCombines results, removes duplicates.
UNION ALLSupportedSupportedCombines results, keeps duplicates.
INTERSECTSupportedSupportedReturns common rows, removes duplicates.
INTERSECT ALLSupportedNot SupportedReturns common rows, keeps duplicates.
EXCEPTSupportedSupportedReturns unique rows from first set not in second.
EXCEPT ALLSupportedNot SupportedReturns rows from first set not in second, keeps duplicates.
IN (Subquery)SupportedSupportedChecks for membership in a subquery result.
EXISTS (Subquery)SupportedSupportedChecks if a subquery returns any rows.
CORRESPONDINGNot SupportedSupportedPerforms set ops on specified columns by name (RisingWave-specific), simplifying queries when column orders differ but names match.

DDL statements (Data Definition Language)

As a streaming database, RisingWave has a broader set of Data Definition Language (DDL) commands, especially for managing sources, sinks, users, and connections.

Common DDL

OperationFlink (v1.20)RisingWave (v2.0)Notes
CREATE/DROP DATABASESupportedSupportedLogical grouping of objects.
CREATE/DROP VIEWSupportedSupportedDefines a logical view based on a query.
CREATE/DROP FUNCTIONSupportedSupportedDefines user-defined functions.
CREATE/DROP TABLESupportedSupportedFlink: Defines schema/connector. RW: Defines schema for source/table.
CREATE MATERIALIZED VIEWSupportedSupportedDefines a query whose results are stored. Core concept in RisingWave.
DROP MATERIALIZED VIEWSupportedSupportedRemoves a materialized view.
ALTER MATERIALIZED VIEWLimited/NoSupportedModifies a materialized view (e.g., rename, set parallelism). Flink support may vary.
ALTER DATABASE/VIEW/FUNCTIONSupportedSupportedModifies existing objects (scope varies).

RisingWave-specific DDL

RisingWave includes the following DDL commands to manage its database-specific entities:

  • CREATE/DROP/ALTER SOURCE: Define connections to external data sources (e.g., Kafka, Kinesis).
  • CREATE/DROP/ALTER SINK: Define destinations for outputting data.
  • CREATE/DROP/ALTER CONNECTION: Reusable connection configurations for sources/sinks.
  • CREATE/DROP/ALTER SCHEMA: Organize objects within a database.
  • CREATE/DROP/ALTER USER: Manage database users.
  • CREATE/DROP SECRET: Securely store credentials.
  • CREATE/DROP/ALTER INDEX: Create indexes on materialized views/tables for faster lookups.
  • CREATE/DROP AGGREGATE: Define custom aggregate functions.
  • ALTER SYSTEM SET: Modify system-level configuration parameters.

This integrated DDL allows managing the entire data flow (sources, processing, sinks, access) directly through SQL, a key advantage of the streaming database approach.

  • CREATE OR REPLACE TABLE: Flink supports this atomic replacement syntax.
  • ALTER TABLE: Both systems support ALTER TABLE, but capabilities differ. Flink’s focus is often on schema evolution via connectors/formats. RisingWave supports various alterations like ADD/DROP COLUMN, RENAME TO, OWNER TO, SET SCHEMA, SET PARALLELISM, SET SOURCE_RATE_LIMIT, and REFRESH SCHEMA (see RW docs for specifics), providing significant table management via SQL.

DML statements (Data Manipulation Language)

As a database, RisingWave allows standard DML statements to interact with tables and trigger materialized view updates directly, offering a familiar pattern for database users. Data Manipulation Language (DML) statements interact with data in tables or trigger updates to materialized views.

FeatureFlink (v1.20)RisingWave (v2.0)Notes
INSERT INTO ... SELECT ...SupportedSupportedInserts data from a query result. Triggers MV updates.
INSERT INTO ... VALUES ...SupportedSupportedInserts explicit rows. Triggers MV updates.
UPDATE ... SET ... WHERE ...SupportedSupportedUpdates existing rows based on conditions. Triggers MV updates.
DELETE FROM ... WHERE ...SupportedSupportedDeletes rows based on conditions. Triggers MV updates.
Multi-table INSERTSupportedNot SupportedFlink can insert into multiple tables from a single source query.
FLUSHNot SupportedSupportedRisingWave-specific: Ensures visibility of preceding DML changes in subsequent queries within a session (mainly for sinks).
SET [key = value]Not SupportedSupportedModifies session configuration settings.

Introspection and utility statements

Commands for exploring metadata, explaining queries, and managing sessions.

Common statements

  • SHOW DATABASES | TABLES | VIEWS | FUNCTIONS | JOBS: List common objects or entities (supported by both).
  • DESCRIBE <object>: Shows object metadata (e.g., columns, types). In Flink, <object> can be a table, view, or catalog. In RisingWave, <object> can be a table, source, view, sink, or materialized view.
  • EXPLAIN [statement]: Shows the logical and physical execution plan for a query.
  • USE [database/catalog]: Sets the current context for queries.
  • SET [key = value]: Modifies session configuration settings.

RisingWave-specific SHOW commands

Reflecting its database architecture, RisingWave offers significantly more comprehensive SQL commands for system introspection compared to Flink:

  • SHOW CLUSTER, SHOW PROCESSLIST, SHOW PARAMETERS
  • SHOW CONNECTIONS, SHOW SOURCES, SHOW SINKS, SHOW SCHEMAS
  • SHOW INDEX, SHOW MATERIALIZED VIEWS, SHOW INTERNAL TABLES
  • SHOW CREATE ... for various RW objects (Sources, Sinks, MVs, etc.)
  • SHOW CURSORS, SHOW SUBSCRIPTION CURSORS
  • SHOW JARS: Lists user-uploaded JAR files (relevant for Java UDFs).
  • SHOW PARTITIONS: Shows partitions for partitioned tables (common in batch/Hive integration).
  • SHOW CURRENT DATABASE: Flink command. RW uses SELECT current_database().

Job management

Both systems allow managing running streaming queries/jobs.

FeatureFlink (v1.20)RisingWave (v2.0)Notes
SHOW JOBSSupportedSupportedLists active streaming jobs/pipelines.
CANCEL JOB [id]SupportedSupportedStops a specific running job/pipeline.

System management and monitoring interface

A key difference lies in how users interact with the system for management and monitoring tasks.

  • RisingWave (SQL-centric): Leverages its database architecture to provide a unified SQL interface for many operational tasks:

    • Configuration: Runtime parameters (SET/SHOW) and persistent system-wide parameters (ALTER SYSTEM SET/SHOW PARAMETERS) are managed via SQL.
    • Monitoring: Job status (SHOW JOBS), active processes (SHOW PROCESSLIST), DDL progress, and detailed internal states (via system catalogs) are accessible through SQL commands.
    • Alteration: Modifying running jobs (e.g., changing parallelism with ALTER ... SET PARALLELISM) and stopping jobs (CANCEL JOBS) are done directly with SQL.
    • Advantage: Offers a consistent, familiar SQL interface for configuration, monitoring, and management, potentially simplifying operations for users comfortable with databases.
  • Flink (API/UI/Config File-centric): As a framework, Flink relies on a combination of interfaces:

    • Configuration: System configuration is primarily done via configuration files (flink-conf.yaml) or command-line options. Flink SQL SET affects session-specific query behavior.
    • Monitoring: Rich monitoring is available through the Web UI, REST API, and integration with external metrics systems (e.g., Prometheus, Grafana).
    • Alteration: Stopping jobs (flink cancel/stop) and modifying aspects like parallelism typically involve the CLI, REST API, or UI, often requiring a job restart (potentially with savepoints).
    • Note: While powerful, requires interacting with multiple different tools and interfaces (SQL client, config files, web browser, CLI, potentially external monitoring dashboards) for full system management.

Materialized views

While both support the concept, their role differs significantly.

  • RisingWave: Materialized Views are a core architectural concept, central to its performance and ease of use. Queries are typically defined as MVs, which RisingWave keeps incrementally, efficiently, and automatically updated. State management is built around these MVs, providing users with consistently fresh results via simple SQL queries.
  • Flink: Materialized Views (or Materialized Tables) are an optional feature (experimental/evolving in 1.20). Flink can materialize query results using specific connectors, but it’s not the default way queries are executed or state is managed.

Access control (RisingWave-specific)

As a database system, RisingWave provides standard SQL access control.

FeatureFlink (v1.20)RisingWave (v2.0)Notes
GRANTNot SupportedSupportedGrants privileges on objects. Provides standard, granular SQL-based security within the system.
REVOKENot SupportedSupportedRevokes privileges from users. Provides standard, granular SQL-based security within the system.

Flink typically relies on external systems (like cluster managers or catalog providers) for access control.

User-defined Functions (UDFs)

Both systems allow you to extend SQL with custom logic using User-defined Functions (UDFs).

  • Supported Languages:
    • Flink (1.20): Java, Scala, and Python (primarily via an external Remote Procedure Call (RPC) mechanism).
    • RisingWave (2.0): SQL, Python, Java (via Java Native Interface (JNI)), JavaScript, and Rust (the latter via embedded WebAssembly (WASM) or Foreign Function Interface (FFI)).
  • Execution Models:
    • Flink: Embedded JVM functions (Java/Scala), External RPC functions (Python).
    • RisingWave: Embedded SQL functions, Embedded language functions via WASM/FFI/JNI.
  • Function Types: Both support Scalar Functions (UDF), Aggregate Functions (UDAF), and Table Functions (UDTF), although the specific implementation details and language support for each type can vary.
  • SQL UDFs: RisingWave allows you to define simple scalar functions directly using SQL expressions, offering a very accessible way to encapsulate reusable logic; Flink does not offer this capability.

Summary: Both offer extensibility, but RisingWave provides native SQL UDFs and uses WASM/FFI for several embedded languages, while Flink primarily uses JVM languages and an RPC mechanism for Python.

Built-in functions

Both Flink and RisingWave provide a rich library of built-in functions covering standard SQL categories. Listing every function is impractical here; instead, we highlight general coverage and key differences.

  • Common Coverage: Both systems offer extensive support for:

    • Comparison operators (=, >, <, etc.) and predicates (IS NULL, BETWEEN).
    • Logical operators (AND, OR, NOT).
    • Standard arithmetic operators (+, -, *, /, %) and functions (POWER, SQRT, ABS, ROUND, CEIL, FLOOR).
    • Common string manipulation functions (CONCAT, SUBSTRING, UPPER, LOWER, TRIM, REPLACE, LENGTH, POSITION, LIKE).
    • Basic temporal functions (EXTRACT, DATE_FORMAT/TO_CHAR, casting between date/time types).
    • Standard aggregate functions (COUNT, SUM, AVG, MAX, MIN, ARRAY_AGG, STRING_AGG).
    • Conditional expressions (CASE WHEN ... END, COALESCE, NULLIF).
    • Type casting (CAST).
    • Basic Array and Map functions/operators.
    • Set returning functions like generate_series.
  • Key Differences & Specific Functions:

    • Flink Specific:
      • URL parsing: PARSE_URL function.
      • Safe Casting: TRY_CAST (returns NULL on failure instead of error).
      • More extensive built-in time functions: LOCALTIME, CURRENT_TIME, etc.
      • Potentially more advanced collection functions (e.g., ARRAY_TRANSFORM).
      • Specific functions related to MATCH_RECOGNIZE.
    • RisingWave Specific:
      • Native JSONB operators (->, ->>, #>, etc.) and functions.
      • PostgreSQL compatibility functions (e.g., pg_typeof, pg_sleep).
      • Encryption functions: encrypt(), decrypt().
      • TIMESTAMPTZ handling functions.
      • Uses the standard SQL CAST function, which typically errors on failure during query execution (though behavior within materialized view updates might differ).

Recommendation: Always consult the official Flink SQL and RisingWave SQL documentation for the most current and comprehensive list of supported built-in functions.

System management & information

  • RisingWave: Provides SQL functions and SHOW commands specifically designed for managing and monitoring the streaming database system itself, providing information such as cluster status, internal tables, process lists, and system parameters.
  • Flink: Relies on its framework APIs, metrics system, and external tools (like its Web UI or platform integrations) for system management and monitoring, rather than using built-in SQL functions for these tasks.

Other features

FeatureFlink (v1.20)RisingWave (v2.0)Notes
CatalogsSupportedSupportedManage metadata namespaces (e.g., connecting to external catalogs).
SQL ClientSupportedSupportedInteractive command-line interface for SQL execution. RisingWave supports psql and any Postgres-compatible SQL client.
SQL GatewaySupportedSupportedService for accepting remote SQL submissions (e.g., via REST/JDBC).

Conclusion

RisingWave and Flink SQL, while both powerful stream processing tools, cater to slightly different needs and design philosophies, which are reflected in their SQL dialects and features.

  • Flink SQL (1.20) is a flexible processing framework with strong Hive integration, MATCH_RECOGNIZE for CEP, and specific features like TRY_CAST.
  • RisingWave SQL (2.0) offers a PostgreSQL-compatible, integrated streaming database experience. Key advantages include core materialized views for simplicity and performance, native JSONB handling, unified SQL DDL/DML for managing sources, sinks, and access control, and automatic internal state management.

Your choice between them depends on whether you need a standalone, easy-to-manage streaming database (RisingWave) or a flexible stream processing framework requiring integration with external systems for state, storage, and management (Flink).