forked from Jman4190/nba-sql
Data Visualization with Superset
Matthew Pope edited this page Mar 7, 2021
Apache Superset is a free, kick-ass tool to visualize and explore data. This build will get you started with a bare-bones setup. Superset is awesome in that it is 'infinitely scalable'. However, in this tutorial we're sticking with local hosting for ease of use.
This guide assumes you have Docker and docker-compose installed. I've tested this on Linux and OSX. I could not get it working on Windows. From the Superset docs:
Superset is not officially supported on Windows unfortunately. The best option for Windows users to try out Superset locally is to install an Ubuntu Desktop VM
- Follow the Superset Docker documentation to run Superset locally with docker-compose.
- Create the database. Run this from the terminal.
docker exec -i superset_db psql -U superset -c "CREATE DATABASE nba;"
- Create a connection for Superset.
- Follow these docs.
- Use this connection string:
postgresql+psycopg2://superset:superset@superset_db:5432/nba
- Load our data.
- Clone this repo (or download the .zip).
- Modify the scripts/create_postgres.sh file to change the following environment variables:
DB_NAME="nba"
DB_HOST="superset_db"
DB_USER=superset
DB_PASSWORD=superset
- Run the script!
- Follow the regular Superset documentation on how to set up databases and datasets, using the schema provided.
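
The connection string in the steps above is built from the same values that go into scripts/create_postgres.sh. As a minimal sketch (the helper names `build_superset_uri` and `list_nba_tables` are hypothetical and not part of the repo, and `DB_PORT` is an assumed variable holding the port from the connection string), the pieces fit together roughly like this:

```shell
#!/usr/bin/env bash
# Same values the guide sets in scripts/create_postgres.sh.
DB_NAME="nba"
DB_HOST="superset_db"
DB_USER="superset"
DB_PASSWORD="superset"
DB_PORT=5432  # assumed variable; 5432 is the port in the guide's connection string

# Hypothetical helper: assemble the SQLAlchemy URI to paste into Superset.
build_superset_uri() {
  printf 'postgresql+psycopg2://%s:%s@%s:%s/%s\n' \
    "$DB_USER" "$DB_PASSWORD" "$DB_HOST" "$DB_PORT" "$DB_NAME"
}

# Hypothetical helper: list the tables in the nba database to check the load.
# It is only defined here; call it once the Superset containers are running.
list_nba_tables() {
  docker exec -i superset_db psql -U "$DB_USER" -d "$DB_NAME" -c '\dt'
}

build_superset_uri
```

If you change any of the values in the script, the connection string you give Superset has to change to match. Running `list_nba_tables` after the load script finishes should show the tables from the schema.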
Keep in mind that when you build queries in Superset, you shouldn't 'pre-aggregate'. Superset basically treats a query as a view that it saves outside of our Postgres db, then does its own aggregation on top of it. So write general queries that fetch a ton of rows, then do the SUM, AVG, or whatever inside of Superset.
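
To make the difference concrete, here is a rough sketch of the two query shapes. The table and column names (`player_game_log`, `player_id`, `game_id`, `pts`) are placeholders chosen for illustration; check the schema for the real ones.

```shell
#!/usr/bin/env bash
# Placeholder table and column names; substitute the real ones from the schema.

# Preferred: a row-level query. Save this as the dataset and let Superset
# apply SUM, AVG, etc. as chart metrics.
ROW_LEVEL_QUERY=$(cat <<'SQL'
SELECT player_id, game_id, pts
FROM player_game_log
SQL
)

# Avoid: aggregating in the query itself. Superset would then aggregate
# values that are already aggregated.
PRE_AGGREGATED_QUERY=$(cat <<'SQL'
SELECT player_id, AVG(pts) AS avg_pts
FROM player_game_log
GROUP BY player_id
SQL
)

# Print the query to paste into Superset's SQL editor.
printf '%s\n' "$ROW_LEVEL_QUERY"
```

With the row-level version you can still slice by game, season, or player later, because none of that detail has been collapsed before Superset sees it.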
Happy visualizing!