Key Concepts
Architecture
Streambased Enterprise is a distributed processing engine and indexing service. Each node connects to an underlying Apache Kafka instance and creates indexes via a background process in the Streambased Indexer component.
These indexes can be used directly in your applications via the Streambased consumer, a custom index-aware implementation of the Apache Kafka consumer. More commonly, they are used by Streambased workers: processing units tasked with performing steps in a query plan (usually a read from Kafka or further processing of data previously read).
These workers and their tasks are managed by a coordinator. Any node in a Streambased cluster can act as a worker or as a coordinator for submitted queries.
Streambased Enterprise uses a shared-nothing architecture in which all state is persisted to Apache Kafka. Because nodes hold no state of their own, the cluster can scale elastically with query load, without a practical upper bound on its size.
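The coordinator and worker relationship described above can be pictured with a small toy model. This is only an illustrative sketch: the Task, Worker and Coordinator classes, the node names and the round-robin assignment are all hypothetical and are not taken from Streambased itself, which does not document its scheduling here.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Task:
    """One step of a query plan, e.g. a Kafka read or a follow-up transform."""
    step: str
    target: str


@dataclass
class Worker:
    """Hypothetical worker that records the tasks it has run."""
    name: str
    completed: list = field(default_factory=list)

    def run(self, task: Task) -> str:
        # A real worker would read from Kafka or process earlier results here.
        self.completed.append(task)
        return f"{self.name} ran {task.step} on {task.target}"


class Coordinator:
    """Hands the tasks of a query plan to workers in round-robin order (an assumption)."""

    def __init__(self, workers):
        self.workers = list(workers)

    def execute(self, plan):
        # zip stops when the finite plan is exhausted, cycling through the workers.
        return [worker.run(task) for task, worker in zip(plan, cycle(self.workers))]


plan = [
    Task("read", "orders-0"),
    Task("read", "orders-1"),
    Task("filter", "amount > 100"),
]
workers = [Worker("node-a"), Worker("node-b")]
log = Coordinator(workers).execute(plan)
```

Because the workers in this model keep no shared state, adding a third Worker to the list changes only how the tasks are spread, which mirrors the shared-nothing idea in miniature.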
Schema Registry
A Schema Registry is a service that provides a central repository for storing and managing schemas used by applications. It ensures that data formats are consistent and allows for schema evolution over time. Key benefits include:
Schema Versioning: Track changes to schemas over time.
Compatibility Checks: Ensure compatibility between producers and consumers of data.
Centralised Management: Simplify schema management across distributed systems.
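To make versioning and compatibility checks concrete, here is a minimal in-memory sketch. The SchemaRegistry class, the dictionary schema format and the subject name are hypothetical, and the backward-compatibility rule is deliberately simplified (new fields need a default, existing fields keep their type). Real schema registries apply richer, format-specific rules.

```python
class SchemaRegistry:
    """In-memory stand-in: versioned schemas per subject with a compatibility gate."""

    def __init__(self):
        self._subjects = {}  # subject -> list of schemas; list index + 1 is the version

    def register(self, subject, schema):
        versions = self._subjects.setdefault(subject, [])
        if versions and not self.is_backward_compatible(versions[-1], schema):
            raise ValueError(f"incompatible schema for {subject}")
        versions.append(schema)
        return len(versions)

    @staticmethod
    def is_backward_compatible(old, new):
        # Simplified rule: a reader using `new` must still be able to read data
        # written with `old`, so added fields need a default and shared fields
        # must keep the same type. Removed fields are allowed.
        for name, spec in new.items():
            if name in old:
                if old[name]["type"] != spec["type"]:
                    return False
            elif "default" not in spec:
                return False
        return True


registry = SchemaRegistry()
v1 = {"id": {"type": "long"}}
v2 = {"id": {"type": "long"}, "email": {"type": "string", "default": ""}}
bad = {"id": {"type": "string"}}

v1_id = registry.register("orders-value", v1)
v2_id = registry.register("orders-value", v2)

try:
    registry.register("orders-value", bad)
    rejected = False
except ValueError:
    rejected = True
```

The second schema is accepted as version 2 because its new field carries a default, while the third is rejected for changing the type of an existing field.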
JDBC (Java Database Connectivity)
JDBC is a Java-based API that allows applications to interact with databases. It provides methods for querying and updating data in a database. Key points include:
Database Access: Standardises database connectivity for Java applications.
SQL Execution: Execute SQL queries and update statements.
Cross-Database Compatibility: Works with various relational databases through JDBC drivers.
ODBC (Open Database Connectivity)
ODBC is a standard API for accessing database management systems (DBMS). It is language-agnostic and provides a uniform interface for database interaction. Key points include:
Cross-Platform Support: Works with various operating systems and programming languages.
Database Independence: Allows applications to interact with multiple DBMS using a common interface.
SQL Execution: Execute SQL queries and updates, similar to JDBC.
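The connect, execute and fetch pattern that ODBC standardises can be sketched as follows. ODBC is commonly reached from Python through the third-party pyodbc package, which follows Python's DB-API 2.0 interface. So that the sketch runs without any driver installed, the standard-library sqlite3 module (also DB-API 2.0) stands in here. The table and column names are made up for illustration.

```python
import sqlite3

# With an ODBC driver, this line would typically be a pyodbc.connect(...) call
# with a connection string; the rest of the pattern stays much the same.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")
cur.executemany(
    "INSERT INTO events (kind) VALUES (?)",
    [("click",), ("view",), ("click",)],
)
conn.commit()

# Query: parameters are passed separately from the SQL text.
cur.execute("SELECT COUNT(*) FROM events WHERE kind = ?", ("click",))
click_count = cur.fetchone()[0]

# Update: rowcount reports how many rows the statement changed.
cur.execute("UPDATE events SET kind = ? WHERE kind = ?", ("impression", "view"))
updated_rows = cur.rowcount
conn.commit()
conn.close()
```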
SQLAlchemy
SQLAlchemy is a Python SQL toolkit and Object-Relational Mapping (ORM) library that provides a flexible and powerful framework for database interaction. Key points include:
ORM Support: Map Python classes to database tables, allowing for object-oriented database interaction.
Database Abstraction: Provides a consistent interface for interacting with different databases.
Query Generation: Simplifies the process of generating complex SQL queries using Python.
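The three points above can be seen together in a short sketch, assuming SQLAlchemy 1.4 or later is installed. An in-memory SQLite database is used purely as a stand-in so the example is self-contained. The Order class, its columns and the sample rows are hypothetical.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Order(Base):
    """ORM support: a Python class mapped to the `orders` table."""
    __tablename__ = "orders"

    id = Column(Integer, primary_key=True)
    customer = Column(String, nullable=False)
    amount = Column(Integer, nullable=False)


# Database abstraction: only this URL changes when targeting another database.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Order(customer="alice", amount=250),
        Order(customer="bob", amount=40),
        Order(customer="carol", amount=120),
    ])
    session.commit()

    # Query generation: the SELECT is built from Python expressions.
    stmt = (
        select(Order.customer)
        .where(Order.amount > 100)
        .order_by(Order.customer)
    )
    big_spenders = session.execute(stmt).scalars().all()
```

Printing `stmt` shows the SQL that SQLAlchemy generates, which is a convenient way to check what will be sent to the database.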