Key Concepts

Architecture

Streambased Enterprise is a distributed processing engine and indexing service. Each node connects to an underlying Apache Kafka instance and builds indexes in a background process run by the Streambased Indexer component.

These indexes can be used directly in your applications via the Streambased consumer, a custom, index-aware implementation of the Apache Kafka consumer. More commonly, they are used by Streambased workers: processing units tasked with performing steps in a query plan (usually a read from Kafka or further processing of data previously read).
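To make the idea of an index-aware read concrete, here is a minimal toy sketch in Python. It is not Streambased's index format or consumer API: the in-memory `partition` list, the `build_index` and `indexed_read` helpers, and the record fields are all hypothetical, chosen only to show why an index lets a reader skip records it does not need.

```python
from collections import defaultdict

# Hypothetical stand-in for one Kafka partition: list position = offset.
partition = [
    {"customer": "alice", "amount": 10},
    {"customer": "bob", "amount": 25},
    {"customer": "alice", "amount": 7},
    {"customer": "carol", "amount": 3},
]


def build_index(records, field):
    """Map each value of `field` to the offsets of the records holding it."""
    index = defaultdict(list)
    for offset, record in enumerate(records):
        index[record[field]].append(offset)
    return dict(index)


def full_scan(records, field, value):
    """Baseline: inspect every record, as a plain consumer would."""
    return [r for r in records if r[field] == value]


def indexed_read(records, index, value):
    """Index-aware read: fetch only the offsets the index points at."""
    return [records[offset] for offset in index.get(value, [])]


customer_index = build_index(partition, "customer")
alice_records = indexed_read(partition, customer_index, "alice")
```

Both read paths return the same records; the indexed path simply touches fewer offsets, which is where the saving comes from on a large topic.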

These workers and their tasks are managed by a coordinator. Any node in a Streambased cluster can act as either worker or coordinator for submitted queries.

Streambased Enterprise has a shared-nothing architecture, with all state persisted to Apache Kafka. Because nodes share no state with each other, the cluster can scale elastically with query load to practically any size.

Schema Registry

A Schema Registry is a service that provides a central repository for storing and managing schemas used by applications. It ensures that data formats are consistent and allows for schema evolution over time. Key benefits include:

  • Schema Versioning: Track changes to schemas over time.

  • Compatibility Checks: Ensure compatibility between producers and consumers of data.

  • Centralised Management: Simplify schema management across distributed systems.
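The versioning and compatibility ideas above can be sketched with a toy in-memory registry. This is an illustration only, not the API of any real Schema Registry: `ToyRegistry`, `is_backward_compatible`, and the dictionary schema format are hypothetical, and the rule used here (a new field must carry a default, an existing field must keep its type) is a simplified version of backward compatibility. Real registries support several compatibility modes and richer type rules.

```python
def is_backward_compatible(old_fields, new_fields):
    """Return True if a reader using new_fields can read data written with old_fields.

    Each argument maps a field name to {"type": str} plus an optional "default".
    Simplified rule: a field in the new schema must either exist in the old
    schema with the same type, or declare a default value.
    """
    for name, spec in new_fields.items():
        if name in old_fields:
            if old_fields[name]["type"] != spec["type"]:
                return False
        elif "default" not in spec:
            return False
    return True


class ToyRegistry:
    """Keeps an ordered list of schema versions per subject."""

    def __init__(self):
        self._versions = {}  # subject -> list of schemas, oldest first

    def register(self, subject, fields):
        """Store a new version and return its 1-based version number."""
        versions = self._versions.setdefault(subject, [])
        if versions and not is_backward_compatible(versions[-1], fields):
            raise ValueError(f"incompatible schema for subject {subject!r}")
        versions.append(fields)
        return len(versions)


registry = ToyRegistry()
v1 = {"id": {"type": "long"}}
v2 = {"id": {"type": "long"}, "note": {"type": "string", "default": ""}}
v3_bad = {**v2, "region": {"type": "string"}}  # new field without a default

first = registry.register("orders-value", v1)
second = registry.register("orders-value", v2)
```

Registering `v3_bad` under the same subject would raise `ValueError`, because data written with `v2` has no `region` value and the new schema offers no default to fall back on.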

JDBC (Java Database Connectivity)

JDBC is a Java API that allows applications to interact with databases. It provides methods for querying and updating data in a database. Key points include:

  • Database Access: Standardises database connectivity for Java applications.

  • SQL Execution: Execute SQL queries and update statements.

  • Cross-Database Compatibility: Works with various relational databases through JDBC drivers.

ODBC (Open Database Connectivity)

ODBC is a standard API for accessing database management systems (DBMS). It is language-agnostic and provides a uniform interface for database interaction. Key points include:

  • Cross-Platform Support: Works with various operating systems and programming languages.

  • Database Independence: Allows applications to interact with multiple DBMS using a common interface.

  • SQL Execution: Execute SQL queries and updates, similar to JDBC.

SQLAlchemy

SQLAlchemy is a Python SQL toolkit and Object-Relational Mapping (ORM) library that provides a flexible and powerful framework for database interaction. Key points include:

  • ORM Support: Map Python classes to database tables, allowing for object-oriented database interaction.

  • Database Abstraction: Provides a consistent interface for interacting with different databases.

  • Query Generation: Simplifies the process of generating complex SQL queries using Python.
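As a minimal sketch of these three points, the example below maps a class to a table, runs against an in-memory SQLite database, and builds a query in Python rather than as a SQL string. It assumes SQLAlchemy 1.4 or later is installed. The `Order` class, the `orders` table, and the sample rows are hypothetical; SQLite is only a stand-in here, and the connection URL for any other database is not shown in this document.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Order(Base):
    """ORM mapping: this Python class corresponds to the `orders` table."""

    __tablename__ = "orders"

    id = Column(Integer, primary_key=True)
    customer = Column(String, nullable=False)
    amount = Column(Integer, nullable=False)


# Database abstraction: the URL selects the backend; the code below is unchanged.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all(
        [
            Order(customer="alice", amount=10),
            Order(customer="bob", amount=25),
            Order(customer="carol", amount=3),
        ]
    )
    session.commit()

    # Query generation: SQLAlchemy compiles this expression to SQL.
    stmt = select(Order).where(Order.amount > 5).order_by(Order.amount)
    big_orders = [(o.customer, o.amount) for o in session.execute(stmt).scalars()]
```

Calling `print(stmt)` shows the SQL that SQLAlchemy generates for the expression, which is a convenient way to check what will be sent to the database.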
