Streambased Documentation

Configuration

The Streambased server is configured via a single configuration file mounted at /etc/streambased/etc/client.properties in the Streambased deployment.

client.properties

This file configures Kafka sources as well as the extraction, transformation and aggregation steps that should be applied to Kafka messages when indexing.

Global configurations

board.size

The number of Kafka offsets to include in an index chunk.

  • Type: integer

  • Example:

    board.size=1000
  • Importance: high

Configuring Sources

A source represents a Kafka cluster that Streambased should interact with. Every source has a name that is used as its prefix in configuration. For example, for a source named someSource the prefix would be sources.someSource.

sources

A comma separated list of names of sources.

  • Type: string

  • Example:

    sources=usEastSource,usWestSource
  • Importance: high

sources.[sourceName].*

A prefixed list of Kafka connection details. Any number of configurations can be added according to the connection requirements of your Kafka cluster.

  • Type: string

  • Example:

    sources.usEastSource.bootstrap.servers=us-east.kafka.com:9092
  • Importance: high

Sources are equivalent to database schemas, allowing you to join topics across Kafka clusters.
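As a rough sketch of how the source prefix works, the fragment below registers two sources and gives each its own connection details. The usWestSource host name is a placeholder, and the two security settings are standard Kafka client properties shown only as an assumption about what a secured cluster might need. Whether your cluster requires them depends on how it is set up.

```properties
# Register two Kafka clusters as sources
sources=usEastSource,usWestSource

# Connection details for usEastSource (prefix: sources.usEastSource)
sources.usEastSource.bootstrap.servers=us-east.kafka.com:9092

# Connection details for usWestSource (placeholder host name)
sources.usWestSource.bootstrap.servers=us-west.kafka.com:9092
# Assumed example: standard Kafka client security settings added under the same prefix
sources.usWestSource.security.protocol=SASL_SSL
sources.usWestSource.sasl.mechanism=PLAIN
```

Every name listed in sources should have at least one matching sources.[sourceName].* entry, since the name alone carries no connection information.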

Configuring Extractors

An extractor reads from a source and builds indexing information from the messages it retrieves. Every extractor has a name that is used as its prefix in configuration. For example, for an extractor named transactionsExtractor the prefix would be field.extractors.transactionsExtractor.

field.extractors

A comma separated list of names of extractors.

  • Type: string

  • Example:

    field.extractors=transactionsExtractor,customersExtractor
  • Importance: high

Extractors must have the following mandatory fields:

field.extractors.[extractorName].class

A fully qualified class name for a class that can extract column data for indexing from Kafka messages.

  • Type: string

  • Example:

    field.extractors.transactionsExtractor.class=io.streambased.index.extractor.JsonValueFieldsExtractor
  • Importance: high

field.extractors.[extractorName].source

The name of the Source this extractor should extract messages from.

  • Type: string

  • Example:

    field.extractors.transactionsExtractor.source=usEastSource
  • Importance: high

field.extractors.[extractorName].topic

The Kafka topic this extractor should extract messages from.

  • Type: string

  • Example:

    field.extractors.transactionsExtractor.topic=someTopic
  • Importance: high

field.extractors.[extractorName].*

Extractors support any number of prefixed extra configurations that are required by the pluggable class configured in .class.

  • Type: string

  • Example:

    field.extractors.transactionsExtractor.fields=someField,someOtherField
  • Importance: high
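Put together, a single extractor definition might look like the sketch below. It only combines the example values from this section. It assumes that usEastSource is defined under sources and that the fields option belongs to the JsonValueFieldsExtractor class, which the examples suggest but do not state.

```properties
# Register the extractor by name
field.extractors=transactionsExtractor

# Mandatory fields: implementing class, source and topic
field.extractors.transactionsExtractor.class=io.streambased.index.extractor.JsonValueFieldsExtractor
field.extractors.transactionsExtractor.source=usEastSource
field.extractors.transactionsExtractor.topic=someTopic

# Extra configuration passed to the class (assumed to be the fields to index)
field.extractors.transactionsExtractor.fields=someField,someOtherField
```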

Configuring Transformers

A transformer represents a function that is applied to a field before it is stored in the index. This can be useful for reducing index size, for example by bucketing values. Every transformer has a name that is used as its prefix in configuration. For example, for a transformer named ageTransformer the prefix would be transformers.ageTransformer.

transformers

A comma separated list of names of transformers.

  • Type: string

  • Example:

    transformers=nameTransformer,ageTransformer
  • Importance: high

Transformers must have the following mandatory fields:

transformers.[transformerName].class

A fully qualified class name for a class that can transform column data for indexing from Kafka messages.

  • Type: string

  • Example:

    transformers.ageTransformer.class=io.streambased.index.transformer.TruncatingTransformer
  • Importance: high

transformers.[transformerName].source

The name of the Source this transformer should apply transformations to.

  • Type: string

  • Example:

    transformers.ageTransformer.source=usEastSource
  • Importance: high

transformers.[transformerName].topic

The Kafka topic this transformer should apply transformations to.

  • Type: string

  • Example:

    transformers.ageTransformer.topic=someTopic
  • Importance: high

transformers.[transformerName].*

Transformers support any number of prefixed extra configurations that are required by the pluggable class configured in .class.

  • Type: string

  • Example:

    transformers.ageTransformer.truncatedToChars=3
  • Importance: high
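A complete transformer definition, assembled from the example values in this section, could be sketched as follows. It assumes that truncatedToChars is an option understood by TruncatingTransformer, as the example implies. No further options are shown because none are documented here.

```properties
# Register the transformer by name
transformers=ageTransformer

# Mandatory fields: implementing class, source and topic
transformers.ageTransformer.class=io.streambased.index.transformer.TruncatingTransformer
transformers.ageTransformer.source=usEastSource
transformers.ageTransformer.topic=someTopic

# Extra configuration passed to the class (assumed: keep the first 3 characters)
transformers.ageTransformer.truncatedToChars=3
```

The name in every prefixed key must match an entry in the transformers list exactly. A misspelled name would presumably leave the option unattached to any transformer.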

Configuring Aggregators

An aggregator extracts and stores common aggregate information that can be used to accelerate aggregate queries. For instance, an aggregator may record the MAX value for a field in a block of Kafka data, meaning that a query for this value does not need to read the data itself. Every aggregator has a name that is used as its prefix in configuration. For example, for an aggregator named customersAggregator the prefix would be aggregators.customersAggregator.

aggregators

A comma separated list of names of aggregators.

  • Type: string

  • Example:

    aggregators=transactionsAggregator,customersAggregator
  • Importance: high

Aggregators must have the following mandatory fields:

aggregators.[aggregatorName].source

The name of the Source this aggregator should read from.

  • Type: string

  • Example:

    aggregators.customersAggregator.source=usEastSource
  • Importance: high

aggregators.[aggregatorName].topic

The Kafka topic this aggregator should read from.

  • Type: string

  • Example:

    aggregators.customersAggregator.topic=someTopic
  • Importance: high

aggregators.[aggregatorName].aggregate.fields

A comma separated list of fields this aggregator should calculate aggregate information for. Note that every combination of this and grouping.fields will be stored.

  • Type: string

  • Example:

    aggregators.customersAggregator.aggregate.fields=amount,balance
  • Importance: high

aggregators.[aggregatorName].grouping.fields

A comma separated list of fields this aggregator should group aggregate information on. Note that every combination of this and aggregate.fields will be stored.

  • Type: string

  • Example:

    aggregators.customersAggregator.grouping.fields=name,country
  • Importance: high
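Combining the mandatory fields, one aggregator definition might look like this sketch, which reuses only the example values from this section and assumes usEastSource is defined under sources.

```properties
# Register the aggregator by name
aggregators=customersAggregator

# Where the aggregator reads from
aggregators.customersAggregator.source=usEastSource
aggregators.customersAggregator.topic=someTopic

# Fields to compute aggregates for, and fields to group them by
aggregators.customersAggregator.aggregate.fields=amount,balance
aggregators.customersAggregator.grouping.fields=name,country
```

Reading "every combination" as each aggregate field paired with each grouping field (an interpretation, not something this page spells out), the two lists above would produce four stored pairings: amount by name, amount by country, balance by name and balance by country. Under that reading, the stored information grows with the product of the two list lengths, so it is worth keeping both lists short.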

General Configuration prefixes

The remaining configurations represent prefixes that are used to determine configurations for external services used by Streambased.

kafkacache.*

A prefix used to pass configurations to the KafkaCache instances used to store Streambased indexing data.

  • Example:

    kafkacache.topic.replication.factor=3