Step-by-Step Installation

A step-by-step guide from 0 to 1

Step 1: Download example config

To get you up and running, we provide a sample set of config files here. Download these resources into a directory named serverConfig.

This config set contains everything Streambased Enterprise needs to connect to one of our public test clusters hosted in Confluent Cloud.
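
For example, from a shell in your working directory (the exact file names in the sample set are not listed here, so the comment is only a placeholder):

mkdir -p serverConfig
# place the downloaded sample files (e.g. client.properties) into ./serverConfig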

Step 2: Start Streambased Enterprise

A Streambased Enterprise indexer can be started with the following command:

docker run -v ${PWD}/serverConfig:/etc/streambased/etc streambased/streambased-enterprise:latest
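
If you prefer to run the container in the background, a variant such as the following may be used. Note that the published port is an assumption (8080, Trino's default); adjust it to the port your Streambased configuration actually listens on.

docker run -d --name streambased -p 8080:8080 -v ${PWD}/serverConfig:/etc/streambased/etc streambased/streambased-enterprise:latest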

Step 3: Connect to Streambased Enterprise

Streambased exposes an interface that is 100% compatible with the one exposed by the popular Trino query engine. This means connectivity can be established using standard components from the Trino ecosystem, such as the Trino CLI and JDBC driver.

Tutorials on connecting your favorite analytical applications to Streambased can be found here.
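
For example, assuming the container's Trino-compatible endpoint is reachable on localhost:8080 and that the catalog and schema are named kafka and default (all three are assumptions; substitute the values for your deployment), the standard Trino CLI can connect directly:

trino --server http://localhost:8080 --catalog kafka --schema default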

In addition to the base Trino functionality, Streambased supports the following extra session variables to aid interaction with Kafka. These can be set via the SET SESSION directive within a JDBC session; a combined example follows the variable descriptions below.

streambased_connection

Used to pass a JSON map of Apache Kafka connection properties (typically security configs for impersonation). Streambased provides a handy tool to convert Java properties to the correct format.

  • Type: string

  • Default: empty

  • Example:

SET SESSION streambased_connection = '{ "sasl.jaas.config":"org.apache.kafka.common.security.plain.PlainLoginModule required username=''restricted'' password=''restricted-secret'';"}';

Note: Remember to double up any single quotes in your Kafka connection configuration as shown above.

use_streambased

Enable or disable Streambased acceleration for the session.

  • Type: boolean

  • Default: true

  • Example:

SET SESSION use_streambased = true;
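
Putting the two variables together, a typical JDBC session might pass connection properties, enable acceleration, and then query as normal. The query below is purely illustrative and assumes the transactions topic that is configured for indexing in Step 5:

SET SESSION streambased_connection = '{ "sasl.jaas.config":"org.apache.kafka.common.security.plain.PlainLoginModule required username=''restricted'' password=''restricted-secret'';"}';
SET SESSION use_streambased = true;
SELECT count(*) FROM transactions;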

Step 4: Customise to your environment

For most deployments, only client.properties needs to be modified. Simply replace the public Kafka connection details with the appropriate configuration for your Kafka data source. These settings should be provided in Kafka Java client properties format.
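
For illustration, a client.properties for a SASL/PLAIN secured cluster might look like the following. The property names are standard Kafka client configuration; every value shown is a placeholder to replace with your own:

bootstrap.servers=your-broker-host:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="your-api-key" password="your-api-secret";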

You must restart your container for this to take effect.

Step 5: Configure indexing

Streambased indexes Kafka topics to greatly improve the performance of SQL queries run against them. To configure this, apply the following (a consolidated example follows the steps below):

a. List the topics you wish to index in the indexed.topics property, e.g.:

indexed.topics=transactions,payment_terms

b. For each topic, provide an extractor class to tell Streambased how to extract the fields for indexing, e.g.:

indexed.topics.transactions.extractor.class=io.streambased.index.extractor.JsonValueFieldsExtractor
indexed.topics.payment_terms.extractor.class=io.streambased.index.extractor.JsonValueFieldsExtractor

c. [optional] If you only wish to index a subset of fields, you can configure them explicitly, e.g.:

indexed.topics.transactions.fields=timestamp

indexed.topics.transactions.field.timestamp.type=LONG
indexed.topics.transactions.field.timestamp.jsonpath=$.timestamp

For more information on these settings, see the configuration section.
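
Taken together, the indexing section of the configuration for the two example topics would look like this:

indexed.topics=transactions,payment_terms
indexed.topics.transactions.extractor.class=io.streambased.index.extractor.JsonValueFieldsExtractor
indexed.topics.payment_terms.extractor.class=io.streambased.index.extractor.JsonValueFieldsExtractor
indexed.topics.transactions.fields=timestamp
indexed.topics.transactions.field.timestamp.type=LONG
indexed.topics.transactions.field.timestamp.jsonpath=$.timestamp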

You must restart your container for this to take effect.

Step 6: Scale

Streambased Enterprise can scale alongside your Kafka infrastructure to practically unlimited capacity. Please reach out to our team to schedule your free architecture assessment.
