Streambased compared

SQL based stream processors

Using SQL to access real-time data is not a new concept. Stream processors such as KsqlDB and FlinkSQL let you specify your stream processing requirements in SQL-like languages and may at first appear similar to Streambased. However, they differ in two key areas:

Stream processors such as KsqlDB are optimised for continuous queries, whereas Streambased is optimised for ad-hoc batch queries. An ad-hoc query executed in Streambased will typically run 30x-100x faster than its KsqlDB counterpart.
The SQL dialects used by stream processors are more complex than regular SQL: users have to understand streaming concepts such as windowing and grace periods. Streambased runs regular ANSI SQL, including the full set of operations (joins, aggregates etc.), so you execute the same statements you would in any other database.
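To make "regular ANSI SQL" concrete, the sketch below runs an ad-hoc join and aggregate with no windowing or grace-period clauses. It is only an illustration: Python's built-in sqlite3 module stands in for the SQL endpoint, and the orders and customers tables, their columns and the revenue_by_region helper are hypothetical names that do not come from Streambased itself.

```python
import sqlite3

# ASSUMPTION: an in-memory SQLite database stands in for the SQL endpoint.
# The tables and columns below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10, 25.0), (2, 11, 40.0), (3, 10, 15.0);
    INSERT INTO customers VALUES (10, 'EU'), (11, 'US');
""")

# A plain join plus aggregate: no window definitions, no grace periods,
# just the kind of statement you would send to any other database.
AD_HOC_QUERY = """
    SELECT c.region, COUNT(*) AS order_count, SUM(o.amount) AS total_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""


def revenue_by_region(connection):
    """Run the ad-hoc query once and return every result row."""
    return connection.execute(AD_HOC_QUERY).fetchall()


print(revenue_by_region(conn))  # [('EU', 2, 40.0), ('US', 1, 40.0)]
```

The query returns once with a complete result, which is the batch behaviour described above; a continuous query in a stream processor would instead keep emitting updated results as new records arrive.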

Streambased also greatly simplifies the infrastructure needed to provide real-time results. Stream-processing frameworks often require extra real-time streams and intermediate stores in order to achieve their goals. Streambased does not require these.

Analytical Databases and data lakes

Analytical databases are designed for large-volume scans over high-latency data. The likes of Databricks and Snowflake are typically fed by ETL pipelines and, because of this, can lag behind the real-time view of the data.

Streambased focuses on providing the freshest view of the data available to your organisation. To accomplish this, it provides a view over the system where data is created (usually a row-based system like Apache Kafka) rather than over a separate, more analytically focused, column-based store.

For this reason, some large-volume queries will perform better in analytical databases, although many variables are involved, whereas point lookups and queries that require up-to-the-minute information will perform better in Streambased.

The Apache Iceberg table format employed by Streambased I.S.K. means that the low-latency view from Streambased can easily be combined with longer-term analytical storage (such as Parquet files) to provide the best of both worlds.
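The idea of combining a fresh view with longer-term storage can be sketched as a single query spanning two tables. This is a conceptual illustration only: it uses Python's built-in sqlite3 module rather than Apache Iceberg or Streambased, and the events_recent, events_historical and events_all names are hypothetical.

```python
import sqlite3

# ASSUMPTION: SQLite stands in for both storage tiers. In practice the
# historical tier would be long-term files (such as Parquet) and the recent
# tier the low-latency view; all table names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events_historical (event_id INTEGER, event_ts TEXT, amount REAL);
    CREATE TABLE events_recent (event_id INTEGER, event_ts TEXT, amount REAL);
    INSERT INTO events_historical VALUES
        (1, '2024-01-01T00:00:00', 10.0),
        (2, '2024-01-02T00:00:00', 20.0);
    INSERT INTO events_recent VALUES
        (3, '2024-01-03T00:00:00', 5.0);

    -- One logical table over both tiers, so callers query a single name.
    CREATE VIEW events_all AS
        SELECT event_id, event_ts, amount FROM events_historical
        UNION ALL
        SELECT event_id, event_ts, amount FROM events_recent;
""")


def event_summary(connection):
    """Return (row count, total amount) across historical and recent events."""
    return connection.execute(
        "SELECT COUNT(*), SUM(amount) FROM events_all"
    ).fetchone()


print(event_summary(conn))  # (3, 35.0)
```

The caller never needs to know which tier a row came from, which is the "best of both worlds" property described above.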


