Configuration

The Streambased server is configured via a small set of configuration files mounted into a configuration directory (typically named serverConfig). Each file has a specific role that governs which properties should be placed in it, and where properties would otherwise be duplicated they can be picked up automatically from other configuration files.

In this section we list the required configuration files and detail the properties that can be provided in each. For worked examples, please see the Streambased demos.
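
The four files described below are typically laid out as follows (the directory name serverConfig is conventional rather than required):

    serverConfig/
    ├── catalog/
    │   └── kafka.properties
    ├── client.properties
    ├── config.properties
    └── exchange-manager.properties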

serverConfig/catalog/kafka.properties

This file is used to configure the way in which Kafka messages should map to the table/column structure used by Streambased.

connector.name

A name for the Kafka cluster being connected to.

  • Type: string

  • Example:

    kafka
  • Importance: low

kafka.config.resources

A path to a configuration file that contains connection properties for Kafka. This will typically point to the indexer configuration file (usually client.properties) so that the server and indexer share a connection.

  • Type: string

  • Example:

    /etc/streambased/etc/client.properties
  • Importance: high

kafka.table-description-supplier

Streambased requires a method of mapping Kafka messages to table structures, and this property determines which method is used. Valid values are FILE (use a configuration file) or CONFLUENT (use Confluent Schema Registry).

  • Type: string

  • Example:

    CONFLUENT
  • Importance: high
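
Putting these together, a complete kafka.properties built from the example values above might look like:

    connector.name=kafka
    kafka.config.resources=/etc/streambased/etc/client.properties
    kafka.table-description-supplier=CONFLUENT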

serverConfig/client.properties

This file configures the indexing component.

bootstrap.servers
security.protocol
sasl.mechanism
sasl.jaas.config
...

Any properties used to connect to the underlying Kafka cluster can be provided here without a prefix.

  • Type: string

  • Example:

    bootstrap.servers=kafka1:10092
  • Importance: high
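
For example, a cluster secured with SASL/PLAIN over TLS might be configured with the standard Kafka client properties below (hostnames and credentials are placeholders):

    bootstrap.servers=kafka1:10092
    security.protocol=SASL_SSL
    sasl.mechanism=PLAIN
    sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
        username="<username>" \
        password="<password>";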

indexed.topics

A comma-separated list of topics to be indexed.

  • Type: string

  • Example:

    transactions,payment_terms
  • Importance: high

indexed.topics.[topic name].extractor.class

A fully qualified class name for a class that can extract column data for indexing from Kafka messages.

  • Type: string

  • Example:

    io.streambased.index.extractor.JsonValueFieldsExtractor
  • Importance: high

Note: the following three configurations are specific to JsonValueFieldsExtractor; other extractors may use different configuration.

indexed.topics.[topic name].fields

A comma-separated list of fields within the message to index. This property is optional; if it is not provided, Streambased will attempt to infer the fields to index from the incoming data automatically.

  • Type: string

  • Example:

    accountNo,transactionId
  • Importance: high

indexed.topics.[topic name].field.[field name].type

The data type of the field to be indexed. Valid values are STRING, LONG, BOOLEAN and DOUBLE.

  • Type: string

  • Example:

    STRING
  • Importance: high

indexed.topics.[topic name].field.[field name].jsonPath

The jsonPath expression that describes how to extract field values from the message.

  • Type: string

  • Example:

    $.accountNo
  • Importance: high
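
Bringing the extractor configuration together, indexing two fields of a transactions topic might look like the following (the LONG type for transactionId is an assumption for illustration):

    indexed.topics=transactions
    indexed.topics.transactions.extractor.class=io.streambased.index.extractor.JsonValueFieldsExtractor
    indexed.topics.transactions.fields=accountNo,transactionId
    indexed.topics.transactions.field.accountNo.type=STRING
    indexed.topics.transactions.field.accountNo.jsonPath=$.accountNo
    indexed.topics.transactions.field.transactionId.type=LONG
    indexed.topics.transactions.field.transactionId.jsonPath=$.transactionId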

ranges.topic.name

The Kafka topic on which to persist index information

  • Type: string

  • Example:

    _streambased_ranges
  • Importance: high

Configuration prefixes

The remaining configurations are prefixes used to pass configuration to the external services used by Streambased.

schema.registry.*

A prefix used to pass schema registry configurations.

  • Example:

    schema.registry.schema.registry.url=https://schema-registry:8081
    schema.registry.basic.auth.credentials.source=USER_INFO

consumer.*

A prefix used to pass consumer configurations used by the indexer. These will override any configurations provided above.

  • Example:

    consumer.auto.offset.reset=earliest

kafkacache.*

A prefix used to pass configurations to the KafkaCache instances used to store Streambased indexing data

  • Example:

    kafkacache.topic.replication.factor=3
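
As a sketch, a minimal client.properties for an unauthenticated cluster with a single indexed topic might combine the properties above as follows:

    # connection to the underlying Kafka cluster
    bootstrap.servers=kafka1:10092
    # indexing configuration
    indexed.topics=transactions
    indexed.topics.transactions.extractor.class=io.streambased.index.extractor.JsonValueFieldsExtractor
    ranges.topic.name=_streambased_ranges
    # external services and overrides
    schema.registry.schema.registry.url=https://schema-registry:8081
    consumer.auto.offset.reset=earliest
    kafkacache.topic.replication.factor=3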

serverConfig/config.properties

This file configures the serving element of Streambased Enterprise. Configurations here define how nodes interact with each other and with external clients.

node.id

A unique id identifying this node

  • Type: string

  • Example:

    ffffffff-ffff-ffff-ffff-ffffffffffff
  • Importance: high

node.environment

A name signifying a group of nodes

  • Type: string

  • Example:

    production
  • Importance: high

node.internal-address

A hostname this node should advertise to other nodes.

  • Type: string

  • Example:

    streambased-server1
  • Importance: high

discovery.uri

The HTTP endpoint of a coordinator node with which this node can register itself to establish cluster membership. This should be consistent across all nodes.

  • Type: string

  • Example:

    http://streambased-server:8080
  • Importance: high

coordinator

Whether or not this node is a coordinator. Coordinators handle incoming requests from clients and transform them into tasks for workers.

  • Type: boolean

  • Example:

    true
  • Importance: high
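
For example, a coordinator node in a production environment could be configured using the example values above:

    node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
    node.environment=production
    node.internal-address=streambased-server1
    discovery.uri=http://streambased-server:8080
    coordinator=true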

serverConfig/exchange-manager.properties

This file determines the ways in which workers can share data with each other.

exchange-manager.name

The type of storage used to transfer data. Valid values are hdfs and filesystem.

  • Type: string

  • Example:

    filesystem
  • Importance: high

exchange.base-directories

(filesystem specific) The directories in the storage system that can be used for data exchange.

  • Type: string

  • Example:

    /tmp/exchange-filesystem
  • Importance: high
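
A minimal filesystem-based exchange-manager.properties using the example values above:

    exchange-manager.name=filesystem
    exchange.base-directories=/tmp/exchange-filesystem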
