Connect Jupyter to Streambased

Jupyter notebooks are used for exploratory data analysis, data cleaning, data visualization, statistical modeling, machine learning, and deep learning. Let's bring real-time data into the mix.

Prerequisites:

A running Streambased instance (for instructions, see here)
A running Jupyter deployment that has the following additional packages:
- jupysql
- sqlalchemy-trino
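
Before moving on, it can help to confirm that both packages are importable from the notebook kernel. The following is a minimal sketch, not part of Streambased: the helper `missing_packages` is hypothetical, and the import names (`sql` for jupysql, `sqlalchemy_trino` for sqlalchemy-trino) are assumptions about how those distributions expose their modules.

```python
from importlib.util import find_spec

# Map of pip distribution name -> top-level module it is assumed to install.
REQUIRED = {
    "jupysql": "sql",
    "sqlalchemy-trino": "sqlalchemy_trino",
}

def missing_packages(required):
    """Return the distribution names whose module cannot be found."""
    return [dist for dist, module in required.items() if find_spec(module) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

If anything is reported missing, install it into the same environment that the Jupyter kernel uses.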

Step 1: Create a database engine

In a new notebook, execute the following:

from sqlalchemy.engine import create_engine

engine = create_engine(
    "trino://[server host]:[server port]/kafka",
    connect_args={"http_scheme": "https", "schema": "streambased"},
)

By default, the server port is 8080 and the server host is the name of the host on which the Docker instance was launched.
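
To make the placeholders concrete, here is a small sketch that assembles the connection URL from a host and port. The helper name `streambased_url` is hypothetical, and `localhost` is only an example host; the default port and the `kafka` catalog come from the step above.

```python
def streambased_url(host, port=8080, catalog="kafka"):
    """Build the SQLAlchemy URL for the Trino endpoint (hypothetical helper)."""
    return f"trino://{host}:{port}/{catalog}"

# Example host only; replace it with the host running your instance.
url = streambased_url("localhost")
print(url)
```

The resulting string can be passed to create_engine in place of the placeholder URL, with the same connect_args as above.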

Step 2: Connect to the database

From your notebook, run the following to load the SQL extension and open the previously created engine:

%load_ext sql
%sql engine

Step 3: Run a query

Using the SQL extension we can execute anything we like. Happy querying!

%sql SELECT * FROM demo_transactions
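
The same kind of query can also be issued from plain Python, for example to cap the number of rows returned while exploring. The sketch below is one possible approach rather than the documented method: `preview` is a hypothetical helper that accepts any DB-API 2.0 connection (for the engine from Step 1, `engine.raw_connection()` should provide one). So that the sketch runs without a server, it is demonstrated against an in-memory SQLite stand-in, and the two columns shown are invented for illustration.

```python
import sqlite3

def preview(conn, table, limit=10):
    """Return (column_names, rows) for the first `limit` rows of `table`.

    `conn` is any DB-API 2.0 connection (hypothetical helper).
    """
    # Table names cannot be bound as parameters, so accept simple identifiers only.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {table} LIMIT {int(limit)}")
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()

# Stand-in data so the sketch runs offline; these columns are made up.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_transactions (id INTEGER, amount REAL)")
demo.executemany(
    "INSERT INTO demo_transactions VALUES (?, ?)",
    [(1, 9.5), (2, 20.0), (3, 3.25)],
)
columns, rows = preview(demo, "demo_transactions", limit=2)
print(columns, len(rows))
```

Against Streambased, the call would take the form preview(engine.raw_connection(), "demo_transactions"), assuming the engine from Step 1 is available.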

