Unify your data

Streambased unifies Kafka and Iceberg into a single, continuously accessible data layer, removing the cost, latency, and complexity of traditional data pipelines and enabling faster decisions across the business.

A unified platform for operational and analytical data

Streambased gives every end user access to the full scope of your data, seamlessly delivering real-time data to analytical and AI applications.

It provides:

  • I.S.K. (Iceberg Service for Kafka) - Streambased I.S.K. presents a set of Iceberg tables, each composed of a section of real-time data from Kafka (the “hotset”) and a section of physical Iceberg data (the “coldset”). I.S.K. tables combine these two sections in a way that is completely transparent to clients: each one looks like a regular Iceberg table.

  • K.S.I. (Kafka Service for Iceberg) - Streambased K.S.I. presents Kafka topics composed of a “hotset” section of data served directly from Kafka and a “coldset” section served from Iceberg. Kafka’s partition and offset concepts are mapped from columns in the Iceberg data, allowing Kafka clients to read the combined data as if it were a regular Kafka topic.

  • Streambased Hyperstream - An indexing and acceleration engine for analytical queries.

  • Streambased Slipstream - A monitoring and management UI for Streambased deployments.

  • Streambased MCP server - An implementation of Anthropic's Model Context Protocol standard to allow AI agents to access real-time data.
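
The hotset/coldset idea behind I.S.K. and K.S.I. can be illustrated with a small, self-contained Python sketch. It is a conceptual model only and does not use or represent any Streambased, Kafka, or Iceberg API: the names Record, UnifiedView, scan, and fetch are hypothetical, and the rule for handling records present in both sections (prefer the coldset copy) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class Record:
    """One row; partition and offset are ordinary columns in the coldset."""
    partition: int
    offset: int
    value: str


class UnifiedView:
    """Hypothetical logical view over a coldset and a hotset.

    Nothing is copied or merged ahead of time; both sections are only
    combined when a read is requested.
    """

    def __init__(self, coldset: List[Record], hotset: List[Record]) -> None:
        self._coldset = coldset  # stands in for physical Iceberg data
        self._hotset = hotset    # stands in for data still held in Kafka

    def _cold_high_watermarks(self) -> Dict[int, int]:
        # Highest offset already present in the coldset, per partition.
        marks: Dict[int, int] = {}
        for r in self._coldset:
            marks[r.partition] = max(marks.get(r.partition, -1), r.offset)
        return marks

    def scan(self) -> Iterator[Record]:
        """Table-style read: all coldset rows, then hotset rows not yet in it."""
        marks = self._cold_high_watermarks()
        yield from self._coldset
        for r in self._hotset:
            # Assumption: a record found in both sections is served once,
            # from the coldset.
            if r.offset > marks.get(r.partition, -1):
                yield r

    def fetch(self, partition: int, start_offset: int,
              max_records: int = 100) -> List[Record]:
        """Log-style read: records of one partition in offset order."""
        rows = [r for r in self.scan()
                if r.partition == partition and r.offset >= start_offset]
        rows.sort(key=lambda r: r.offset)
        return rows[:max_records]


cold = [Record(0, 0, "a"), Record(0, 1, "b"), Record(1, 0, "x")]
hot = [Record(0, 1, "b"), Record(0, 2, "c"), Record(1, 1, "y")]
view = UnifiedView(cold, hot)

print([r.value for r in view.scan()])       # ['a', 'b', 'x', 'c', 'y']
print([r.value for r in view.fetch(0, 1)])  # ['b', 'c']
```

Here scan plays the role of a table read (the I.S.K. perspective) and fetch plays the role of a consumer reading from a partition and offset (the K.S.I. perspective). Both are served from the same two sections without any data being moved.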

What sets Streambased apart is:

  • No data movement - Streambased provides logical views on top of the data and does not move or store any data ahead of query time.

  • The freshest view - Data in Kafka is queryable in Iceberg the moment it lands. Dashboards, investigations and ML models always stay in step with the stream.

  • Drastically reduced Kafka costs - Older Kafka data is stored in Iceberg, not in expensive Kafka storage.

As a result, you get:

  • A single source of truth - Both operational and analytical applications access the same data, so there is no opportunity for drift or lag.

  • No ETL - No data transfer ahead of query time means no pipelines to manage and evolve.

  • A single point of governance - Manage permissions, lineage, schema evolution, etc. in one system and have it apply to all downstream users.
