Awesome Screenshot of Apache Superset, compliments of Apache Incubator.

How to get Apache Superset to connect to Athena

Adam Phillabaum
Published in PayScale Tech
2 min read · Mar 28, 2019

This took me a couple of attempts, so I thought I’d share what I learned about getting Apache Superset to connect to Athena. I’m running this on my local Mac; these are NOT “how to set it up in prod” instructions.

I started with the instructions here: https://superset.incubator.apache.org/installation.html#start-with-docker

git clone https://github.com/apache/incubator-superset/ 
#
# Hidden steps here, to be revealed later in this post
#
cd incubator-superset/contrib/docker
docker-compose run --rm superset ./docker-init.sh
docker-compose up

This stood up the cluster for me, but without the Athena connector I need it is pretty much worthless.

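The trick is that you need to customize your Docker image to suit your needs. PyAthenaJDBC talks to Athena through a JDBC driver, so the image needs a Java runtime as well as the Python package. After you git clone, edit the Dockerfile so that it installs both: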

apt install default-jre
pip install "PyAthenaJDBC>1.0.9"

Add each one where the Dockerfile already has a similar-looking command.

For example, my Dockerfile ends up with these two sections:

RUN apt-get install -y build-essential libssl-dev \
    libffi-dev python3-dev libsasl2-dev libldap2-dev \
    libxi-dev default-jre

and

RUN pip install --upgrade setuptools pip \
    && pip install -r requirements.txt -r requirements-dev.txt \
    && pip install "PyAthenaJDBC>1.0.9" \
    && rm -rf /root/.cache/pip
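
After rebuilding the image, a quick check run inside the container can confirm that both additions took effect. This is a minimal sketch, not part of the original setup: the function name check_athena_prereqs is hypothetical, and it assumes the driver is importable under the module name pyathenajdbc and that the JRE puts a java binary on the PATH.

```python
import importlib.util
import shutil


def check_athena_prereqs(module_name="pyathenajdbc", java_binary="java"):
    """Report whether the Python driver and a Java runtime are visible.

    Both default names are assumptions about how the image was built.
    """
    return {
        # find_spec returns None when the top-level module is not installed
        "driver_installed": importlib.util.find_spec(module_name) is not None,
        # which returns None when the binary is not on the PATH
        "java_on_path": shutil.which(java_binary) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_athena_prereqs().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If either line prints MISSING, the corresponding Dockerfile change probably did not make it into the image that is actually running.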

Once you’re in the UI, this is the connection string (entered as the SQLAlchemy URI) you want:

awsathena+jdbc://<Your-AWS-key>:<Your-AWS-key-secret>@athena.<AWS-Region>.amazonaws.com/?s3_staging_dir=s3://aws-athena-query-results-XXXX-<AWS-Region>
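
AWS secret keys often contain characters such as / and +, which can break a URI if pasted in raw. The sketch below builds the same string with the variable parts URL-quoted. The helper name athena_uri and the sample values are hypothetical, and quoting the staging directory is an assumption based on my reading of the driver's documentation; if the quoted form does not work for you, fall back to the plain form above.

```python
from urllib.parse import quote_plus


def athena_uri(access_key, secret_key, region, staging_dir, schema=""):
    """Build an awsathena+jdbc connection string with quoted credentials.

    Hypothetical helper: it mirrors the string shown above, nothing more.
    """
    return (
        f"awsathena+jdbc://{quote_plus(access_key)}:{quote_plus(secret_key)}"
        f"@athena.{region}.amazonaws.com/{schema}"
        f"?s3_staging_dir={quote_plus(staging_dir)}"
    )


if __name__ == "__main__":
    # Placeholder values only; substitute your own key, secret, region, and bucket.
    print(athena_uri("AKIAEXAMPLE", "abc/def+ghi", "us-west-2", "s3://my-bucket/results/"))
```

Paste the printed string into the UI rather than committing it anywhere, since it embeds your credentials.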

Good luck, and happy Business Intelligencing!
