Skip to content

Data Visualization with Superset

Matthew Pope edited this page Mar 7, 2021 · 10 revisions

Superset Setup Guide

Apache Superset is a free, kick-ass tool to visualize and explore data. This guide will get you started with a bare-bones setup. Superset is awesome in that it is 'infinity scalable'. However, in this tutorial we're going with local hosting for ease of use. Superset comes with a Postgres database by default, so we're going to lean on that. With that in mind, this tutorial will work for an external Postgres / MySQL database too. We're using version 1.0.1 (released 02/2021), so this might need to be updated for future versions of Superset.

This guide assumes you have Docker and docker-compose installed. I've tested this on Linux and OSX. I could not get this functioning with Windows. From the Superset docs:

Superset is not officially supported on Windows unfortunately. 
The best option for Windows users to try out Superset locally is to install an Ubuntu Desktop VM

Setup Superset

  1. Follow the Superset docker documentation to run the docker-compose locally.
  2. Create the database. Run this from the terminal.
    1. docker exec -i superset_db psql -U superset -c "CREATE DATABASE nba;"
  3. Create a connection for Superset.
    1. Follow these docs.
    2. Use this connection string: postgresql+psycopg2://superset:superset@superset_db:5432/nba.
  4. Load our data.
    1. Clone this repo (or download the .zip).
    2. Modify the scripts/create_postgres.sh file to change the following environment variables.
      1. DB_NAME="nba"
      2. DB_HOST="localhost"
      3. DB_USER=superset
      4. DB_PASSWORD=superset
    3. Run the script!
  5. Follow the regular Superset documentation on how to setup databases and datasets, following the schema provided.

Keep in mind, when you build queries in Superset you shouldn't 'pre-aggregate'. Superset basically accepts a query as a view that it saves outside of our Postgres db, then does it's own aggregation. So make general queries that fetch a ton of rows, then do the SUM, AVG, or whatever inside of Superset.

Happy visualizing!

Clone this wiki locally