Apache Superset setup in production

rajesh kumar
Urban Company – Engineering
Dec 1, 2018

Apache Superset is a modern, enterprise-ready business intelligence web application.

The setup consists of four components:

  • Redis
  • Postgres
  • Superset app
  • Superset Celery worker

This article is a step-by-step guide to setting up Apache Superset in production, with separate containers for Redis, Postgres, the Superset app and the Superset Celery worker.

Redis setup

sudo docker pull redis
sudo docker run -p 6379:6379 -d redis redis-server --requirepass myredispassword

Postgres setup

sudo docker pull postgres
sudo docker run -p 5432:5432 -e POSTGRES_USER=superset -e POSTGRES_PASSWORD=mypostgrespassword -e POSTGRES_DB=superset --volume $PWD:/var/lib/postgresql/data -d postgres
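
Before starting Superset, it can help to confirm that the Redis and Postgres containers accept TCP connections. The following is a minimal sketch using only the Python standard library. The function name wait_for_port and its timeout values are illustrative assumptions, not part of Superset or Docker, and the check only shows that the port is open, not that the password is accepted.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage (replace the host with the address of your Docker host):
#   wait_for_port("172.32.11.126", 6379)   # Redis
#   wait_for_port("172.32.11.126", 5432)   # Postgres
```

A False result usually means the container is not running, the port is not published with -p, or a firewall is blocking the connection.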

Superset app setup

  • Create a superset_config.py file in the current directory.
import os

# Note: werkzeug.contrib.cache was removed in Werkzeug 1.0; on newer
# installations RedisCache is provided by the separate cachelib package.
from werkzeug.contrib.cache import RedisCache

MAPBOX_API_KEY = os.getenv('MAPBOX_API_KEY', '')

# Connection settings are passed to the container as environment variables.
REDIS_SERVER_IP = os.getenv('REDIS_SERVER_IP', '')
REDIS_PASSWORD = os.getenv('REDIS_PASSWORD', '')
POSTGRES_SERVER_IP = os.getenv('POSTGRES_SERVER_IP', '')
POSTGRES_USER = 'superset'
POSTGRES_PASSWORD = os.getenv('POSTGRES_PASSWORD', '')

# Redis database 1 holds the cache; database 0 holds the Celery broker and results.
SUPERSET_CACHE_REDIS_URL = "".join(['redis://:', REDIS_PASSWORD, '@', REDIS_SERVER_IP, ':6379/1'])
SUPERSET_BROKER_URL = "".join(['redis://:', REDIS_PASSWORD, '@', REDIS_SERVER_IP, ':6379/0'])
SUPERSET_CELERY_RESULT_BACKEND = "".join(['redis://:', REDIS_PASSWORD, '@', REDIS_SERVER_IP, ':6379/0'])
SUPERSET_SQLALCHEMY_DATABASE_URI = "".join(['postgresql+psycopg2://', POSTGRES_USER, ':', POSTGRES_PASSWORD, '@', POSTGRES_SERVER_IP, ':5432/superset'])

CACHE_CONFIG = {
    'CACHE_TYPE': 'redis',
    'CACHE_DEFAULT_TIMEOUT': 300,
    'CACHE_KEY_PREFIX': 'superset_',
    'CACHE_REDIS_HOST': REDIS_SERVER_IP,
    'CACHE_REDIS_PORT': 6379,
    'CACHE_REDIS_DB': 1,
    'CACHE_REDIS_URL': SUPERSET_CACHE_REDIS_URL
}

# Superset metadata database
SQLALCHEMY_DATABASE_URI = SUPERSET_SQLALCHEMY_DATABASE_URI
SQLALCHEMY_TRACK_MODIFICATIONS = True

# Replace this example value with your own long, random secret.
SECRET_KEY = 'thisISaSECRET_1234'


class CeleryConfig(object):
    BROKER_URL = SUPERSET_BROKER_URL
    CELERY_IMPORTS = ('superset.sql_lab', )
    CELERY_RESULT_BACKEND = SUPERSET_CELERY_RESULT_BACKEND
    CELERY_ANNOTATIONS = {'tasks.add': {'rate_limit': '10/s'}}


CELERY_CONFIG = CeleryConfig

# Results of asynchronous SQL Lab queries are stored in Redis.
RESULTS_BACKEND = RedisCache(
    host=REDIS_SERVER_IP,
    port=6379,
    key_prefix='superset_results',
    password=REDIS_PASSWORD
)
  • Run the following commands. Make sure to replace the Redis and Postgres IP addresses with those of your own hosts.
sudo docker pull amancevice/superset
sudo docker run --detach -p 8088:8088 -e REDIS_SERVER_IP=172.32.11.126 -e REDIS_PASSWORD=myredispassword -e POSTGRES_SERVER_IP=172.32.11.126 -e POSTGRES_PASSWORD=mypostgrespassword --volume $PWD:/etc/superset --volume $PWD:/var/lib/superset --name superset amancevice/superset
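
The configuration file reads every setting with os.getenv(..., ''), so a variable that was forgotten on the docker run command line silently produces a malformed URL such as redis://:@:6379/0. It also concatenates the Postgres password as-is, which breaks the URL if the password contains characters like @ or /. Below is a minimal sketch of a stricter approach using only the standard library. The helper names require_env and postgres_url are hypothetical and not part of Superset. Percent-encoding is shown only for the SQLAlchemy URL; how Redis clients treat percent-encoded passwords depends on the client version, so that part is left unchanged here.

```python
import os
from urllib.parse import quote


def require_env(name):
    """Return the value of an environment variable, or fail loudly if unset or empty."""
    value = os.getenv(name, "")
    if not value:
        raise RuntimeError(f"Environment variable {name} must be set")
    return value


def postgres_url(host, user, password, database, port=5432):
    """Build a SQLAlchemy URL, percent-encoding the user and password."""
    # safe='' makes quote() also encode '/', which is left unescaped by default
    return (
        f"postgresql+psycopg2://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


# Example usage inside superset_config.py (an assumption, not the original config):
#   SQLALCHEMY_DATABASE_URI = postgres_url(
#       require_env("POSTGRES_SERVER_IP"), "superset",
#       require_env("POSTGRES_PASSWORD"), "superset")
```

With this approach a missing variable stops the container at startup with a clear error instead of surfacing later as a connection failure.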

Superset init

sudo docker exec -it superset superset-init

Superset login

  • Log in to Superset.
http://172.32.11.126:8088/
  • Enable ‘Allow Run Async’ for your database under Sources -> Databases.
(Screenshot: Enable ‘Allow Run Async’)
  • If you run a query in the SQL editor now, it stays in the ‘pending’ state, because the Celery worker has not been set up yet.
(Screenshot: Query in pending state)

Superset Celery worker setup

sudo docker run --detach -e REDIS_SERVER_IP=172.32.11.126 -e REDIS_PASSWORD=myredispassword -e POSTGRES_SERVER_IP=172.32.11.126 -e POSTGRES_PASSWORD=mypostgrespassword --volume $PWD:/etc/superset --volume $PWD:/var/lib/superset amancevice/superset celery worker --app=superset.sql_lab:celery_app --pool=gevent -Ofair

Now, if you run the same query in the SQL editor, it is executed by the Celery worker.

(Screenshot: Query executed by Celery worker)
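
To see why the query stayed in the ‘pending’ state until the worker was started: with ‘Allow Run Async’ enabled, the web app only places the query on the broker (Redis) and returns, and the status changes once a worker picks the query up. The following is a toy model of that flow using Python's standard library. It is not Celery's or Superset's actual API, and every name in it is made up for illustration.

```python
import queue
import threading

broker = queue.Queue()   # stands in for the Redis broker
status = {}              # stands in for the query status kept by Superset


def submit(query_id, sql):
    """What the web app does: record the query as pending and enqueue it."""
    status[query_id] = "pending"
    broker.put((query_id, sql))


def worker():
    """What a worker does: take queued queries and mark them finished."""
    while True:
        item = broker.get()
        if item is None:               # sentinel: stop the worker
            broker.task_done()
            break
        query_id, _sql = item
        status[query_id] = "success"   # a real worker would run the SQL here
        broker.task_done()


submit(1, "SELECT 1")
print(status[1])          # no worker is running yet, so this prints: pending

t = threading.Thread(target=worker)
t.start()
broker.join()             # wait until the worker has processed the queue
print(status[1])          # prints: success

broker.put(None)          # tell the worker to exit
t.join()
```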

That’s it. If you found this article helpful, please click the clap 👏 button.
