The Ultimate deployment guide for atoti on AWS with Amazon EC2
How to run and share your atoti BI app
UPDATE: The GIFs and code snippets in this article are based on versions prior to atoti 0.7.0. They may have evolved or become obsolete. Check out the atoti documentation for the latest features.
In a previous article, we covered how to deploy a BI dashboard in AWS using Docker, specifically using the atoti Docker image. In case you haven’t heard of atoti, it is a Python library that allows multidimensional data analysis and comes with dashboarding capability.
In this article, we are going to see how we can set up JupyterLab along with atoti on a virtual machine using AWS Elastic Compute Cloud, also known as EC2.
Objective
The goal of this article is to give anyone access to an atoti development platform in the cloud, where they can create their own notebooks and run a BI web application. The recommended solution would be to implement a JupyterHub. We will show you a simpler solution that uses a shared instance of JupyterLab. One limitation of this solution is that users must take care not to run the same notebook concurrently or restart the kernels of other users.
With a common development platform deployed on AWS using Amazon EC2, your end users can start building dashboards, or even explore the source code of the notebook behind the app.
For more information about atoti and what you can do with it, follow this link.
Prerequisites
We created a free account with the AWS free tier for this article.
In this use case, we used PuTTY to remotely access the Ubuntu server that we are going to create.
Step 1. Launching Ubuntu instance with Amazon EC2
Amazon EC2 is a web service that allows us to boot an Amazon Machine Image (AMI) to configure a virtual machine. The video below takes us through the process of setting up an Ubuntu server with Amazon EC2 and JupyterLab with an atoti tutorial.
Generating Amazon EC2 key pair
A key pair is a set of security credentials that allow us to connect to an Amazon EC2 instance.
Log into AWS and navigate to the Amazon EC2 main page. We can create the key pair from the Amazon EC2 dashboard.
Select the ppk option to download a PuTTY private key file. Be sure to keep this file in a secure location. We will use it to configure PuTTY for SSH into our Amazon EC2 instance.
We can also create the key pair before launching our EC2 instance. This will give us a private key file (*.pem file) that can be used with OpenSSH. However, we will have to convert this file to a ppk file when we use PuTTY.
Launching Amazon EC2 instances
From the AWS left menu bar, navigate to Instances. Click on the “Launch instances” button to start creating the virtual machine.
There are 7 steps to launching the server. We are going to focus on:
Step 1. Choose AMI
Step 2. Choose Instance Type
Step 4. Add Storage
Step 6. Configure Security Group
For those steps that are not mentioned, we will keep to the default configuration.
Step 1. Choose AMI
We shall select the free tier eligible Ubuntu server.
Step 2. Choose Instance Type
Go for the t2.micro instance type which gives us 750 free hours per month for the first year.
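As a quick sanity check on the free tier figure, the arithmetic below shows that 750 hours cover a single instance running around the clock, even in a 31-day month. It assumes the allowance is counted in instance-hours, so two instances running at the same time would consume it twice as fast. The function name is made up for this illustration.

```python
# Hours in the longest possible month (31 days).
HOURS_IN_LONGEST_MONTH = 24 * 31  # 744

# Monthly free tier allowance quoted above.
FREE_TIER_HOURS = 750


def free_hours_left(instances: int, hours_each: float) -> float:
    """Return the free hours remaining after running `instances` for `hours_each` hours each."""
    return FREE_TIER_HOURS - instances * hours_each


print(free_hours_left(1, HOURS_IN_LONGEST_MONTH))  # one instance running all month
print(free_hours_left(2, HOURS_IN_LONGEST_MONTH))  # negative value: allowance exceeded
```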
Step 4. Add Storage
We can only specify the instance store volumes for the EC2 instance at launch, so we should increase the volume size now.
Note that data in an instance store persists only during the lifetime of its associated instance: if the EC2 instance is stopped or hibernated, the data in the instance store is lost.
Although not covered in this article, it is advisable to use more durable data storage such as Amazon EBS for example. We can attach additional EBS volumes to the EC2 instance anytime.
Step 6. Configure Security Group
Depending on whether SSL is configured, we can open the firewall for HTTP or HTTPS. Otherwise, add the rules for port 22 for SSH and port 8888 for the Jupyter server.
In this example we allow all IP addresses to access the EC2 instance by setting the source to Anywhere, but you should restrict access to known IP addresses only.
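To make the effect of the source setting concrete, here is a minimal sketch of how a set of inbound rules decides whether a connection is let through. It is a plain-Python model, not the AWS API: the rule list, the `is_allowed` helper and the address range 203.0.113.0/24 (a documentation range standing in for an office network) are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical inbound rules: (port, allowed source CIDR).
INBOUND_RULES = [
    (22, "203.0.113.0/24"),    # SSH
    (8888, "203.0.113.0/24"),  # Jupyter server
]


def is_allowed(source_ip: str, port: int, rules=INBOUND_RULES) -> bool:
    """Return True if any rule opens `port` to `source_ip`."""
    address = ipaddress.ip_address(source_ip)
    return any(
        port == rule_port and address in ipaddress.ip_network(cidr)
        for rule_port, cidr in rules
    )


print(is_allowed("203.0.113.25", 8888))  # source inside the allowed range
print(is_allowed("198.51.100.7", 8888))  # source outside the allowed range
print(is_allowed("203.0.113.25", 80))    # port that no rule opens
```

Setting the source to Anywhere corresponds to the CIDR 0.0.0.0/0, which matches every IPv4 address.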
Associating Key pair for EC2 instance connection
In the final step to launch the EC2 instance, we will be prompted to select a key pair for connecting to our instance.
This is where we choose the key pair that we have created earlier on. Make sure you still have the private key that was downloaded to the machine.
Instance key information
We can quickly access our EC2 instance from the Launch Status page by clicking on the link highlighted below.
It’s a good idea to create billing alerts as suggested above as we don’t want to be charged unknowingly when the free tier usage is exceeded.
On the instance summary page, we look at a few things:
1. Instance status — it should be running for us to be able to connect to it
2. Public IPv4 address — we need this to connect to the EC2 instance and also, to configure our Jupyter server.
3. Connect — This gives us the instructions for connecting to the EC2 instance.
SSH using PuTTY
Using the information from the instance summary page, we can configure PuTTY to connect to our EC2 instance:
- Enter the instance IP address in the Host Name field.
- Go to Connection > Data and enter the username obtained from the “Connect to instance” page.
- Go to Connection > SSH > Auth, browse and select the ppk file downloaded from our keypair creation earlier on.
Save the instance configuration and click Open to SSH into our EC2 instance.
Step 2. Installing JupyterLab with atoti on Amazon EC2
We will reference the atoti installation guide to install JupyterLab and atoti on the Amazon EC2 instance that we have created:
1 — Install Conda
As recommended, we will install the 64-bit version of Miniconda. We first have to download it to the Amazon EC2 instance. Connect to the instance in PuTTY and run the below commands:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
During the installation, press Enter a few times to go through the license agreement and type yes to agree. Also, press Enter to confirm the location.
Lastly, type yes for the installer to initialize Miniconda3.
We need to close and reopen PuTTY for the changes to take effect.
2 — Set up the conda-forge channel and atoti channel
After reconnecting to the EC2 instance, run the below commands in PuTTY.
conda config --add channels conda-forge
conda config --add channels https://conda.atoti.io
3 — Create a new Conda environment
Run the below command to create a new Conda environment for us to set up atoti and JupyterLab. You can create multiple environments with different conda packages installed for different purposes, so that changes to one environment don’t affect the others.
conda create --name atoti
Enter y to proceed with the environment creation. Follow the instructions to activate the environment:
conda activate atoti
4— Install atoti and JupyterLab
All it takes to install atoti and its companion packages is the below command:
conda install atoti atoti-jupyterlab python
It will take a while for the installation to finish. Don’t forget to enter y to proceed with the installation when prompted.
atoti web application port
When we create an atoti session, session.url will return the link to access the atoti web application. A random port is generated for each session unless pre-defined in the configuration.
To avoid having to open up a range of ports on the firewall, we are going to make use of the Jupyter Server Proxy. This will allow us to run the atoti web application alongside the notebook, with the URL directing to a proxy port.
Hence we only have to open up the firewall for port 8888 instead of a different port per atoti session.
Let’s install it with the below command:
conda install jupyter-server-proxy
Configuring JupyterLab
There are a few configurations that we want to apply to JupyterLab:
- ServerApp.ip: make the notebook server listen on all IP addresses
- ServerApp.open_browser: do not open a browser after starting
- ServerApp.password: use password authentication instead of the default token authentication
- ServerApp.custom_display_url: override the URL shown to users
- ServerApp.root_dir: set it to the work folder where we will store the atoti tutorial
Password generation
Before we start the configuration, let’s generate a hashed password for the web authentication.
ipython
from IPython.lib import passwd
passwd()
Enter password: [Create password and press enter]
Verify password: [Re-enter the password and press enter]
The hashed password will be displayed as shown below:
Exit from ipython.
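For readers curious about what a hashed password is, the sketch below shows the general idea of a salted hash stored as `algorithm:salt:digest`. It is an illustration only: the function names are made up, and the scheme that Jupyter actually uses depends on the installed version and may be different and stronger, so the value placed in the configuration file should still come from the `passwd()` helper.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, algorithm: str = "sha256") -> str:
    """Return 'algorithm:salt:digest' for the given password."""
    salt = secrets.token_hex(6)  # 12 random hexadecimal characters
    digest = hashlib.new(algorithm, (password + salt).encode("utf-8")).hexdigest()
    return f"{algorithm}:{salt}:{digest}"


def verify_password(hashed: str, password: str) -> bool:
    """Recompute the digest with the stored salt and compare it to the stored digest."""
    algorithm, salt, digest = hashed.split(":")
    candidate = hashlib.new(algorithm, (password + salt).encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, digest)


hashed = hash_password("my-password")
print(hashed)                                    # the salt differs on every run
print(verify_password(hashed, "my-password"))    # correct password
print(verify_password(hashed, "wrong-password")) # incorrect password
```

Because only the salt and the digest are stored, the configuration file never contains the password itself.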
Updating configuration file
Run the below commands in PuTTY to create the config profile:
jupyter notebook --generate-config
To start configuring Jupyter, run the following:
cd ~/.jupyter/
nano jupyter_notebook_config.py
Insert the following at the beginning of the configuration file (replace the hashed password, the instance IPv4 address and the port for the Jupyter server with your own values):
c = get_config()

# Notebook config
# Listen on all IPs
c.ServerApp.ip = '*'
# Do not open a browser by default when the server starts
c.ServerApp.open_browser = False
# The hashed password we generated above
c.ServerApp.password = u'<hashed password>'
# Set the port to 8888, the port we opened in the AWS EC2 set-up
c.ServerApp.port = <port for jupyter server>
# Replace the actual URL, including protocol, address, port and base URL, with the given value when displaying the URL to the users
c.ServerApp.custom_display_url = 'http://<instance IPv4 address>:<port for jupyter server>'
# Start up Jupyter in this directory
c.ServerApp.root_dir = 'work'
Exit and save the configuration file.
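If you prefer not to edit the placeholders by hand, the configuration text can be generated from the three values. The sketch below is one way to do that; `render_jupyter_config` is a hypothetical helper written for this article, and the IP address and hash in the example are dummy values.

```python
def render_jupyter_config(hashed_password: str, public_ip: str, port: int = 8888) -> str:
    """Return the text to paste at the top of the Jupyter configuration file."""
    lines = [
        "c = get_config()",
        "c.ServerApp.ip = '*'",
        "c.ServerApp.open_browser = False",
        f"c.ServerApp.password = {hashed_password!r}",
        f"c.ServerApp.port = {port}",
        f"c.ServerApp.custom_display_url = 'http://{public_ip}:{port}'",
        "c.ServerApp.root_dir = 'work'",
    ]
    return "\n".join(lines) + "\n"


# Dummy values: replace them with your own hash and public IPv4 address.
print(render_jupyter_config("<hashed password>", "203.0.113.10"))
```

The printed text can then be pasted into the file opened with nano.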
Downloading atoti tutorial to work directory
Since we have set the root_dir to the work directory, we need to create this folder. We will create it under the home directory and download the atoti tutorial to it:
mkdir ~/work
python -m atoti.copy_tutorial ~/work/tutorial
atoti configuration
We can configure atoti during session creation to point to the EC2 instance IP address instead:
Instead of providing the configuration for each session, we can also create a global configuration for atoti. Run the following commands in PuTTY:
mkdir --parents ~/.atoti
echo "url_pattern: http://<instance IPv4 address>:<port for jupyter server>/proxy/{port}/" > ~/.atoti/config.yml
Replace the instance IPv4 address and port for the Jupyter server in the url_pattern above. Since we are making use of the Jupyter server proxy, we include `/proxy/{port}/` in the URL pattern. The proxy {port} will be randomly assigned unless specified in the atoti configuration. Check out the configuration options available for atoti.
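To illustrate how the pattern turns into the address of a session, here is a small sketch that builds the `url_pattern` value and fills in the `{port}` placeholder. How atoti performs the substitution internally is an assumption here, the helper names are made up, and the IP address and session port are dummy values.

```python
def build_url_pattern(public_ip: str, jupyter_port: int = 8888) -> str:
    """Return the url_pattern value, keeping {port} as a literal placeholder."""
    return f"http://{public_ip}:{jupyter_port}/proxy/{{port}}/"


def session_url(url_pattern: str, session_port: int) -> str:
    """Fill the {port} placeholder with the port of a given session."""
    return url_pattern.format(port=session_port)


pattern = build_url_pattern("203.0.113.10")
print(pattern)                      # value written to the configuration file
print(session_url(pattern, 45123))  # address of a session listening on port 45123
```

Every session URL goes through port 8888 of the Jupyter server, which is why no other port has to be opened on the firewall.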
Step 3. Launching JupyterLab
We are now ready to launch JupyterLab! Because root_dir is set to the relative path work, run the below command from the home directory (`cd ~`), with `&` to start the process in the background:
jupyter lab &
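This way, JupyterLab will still be accessible after we exit the shell. If the process stops when the PuTTY session is closed, start it with `nohup jupyter lab &` instead.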
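Once JupyterLab is running, a quick way to confirm that both the server and the security group rule are in place is to test whether the port accepts TCP connections from your own machine. The sketch below uses only the Python standard library; `is_port_open` is a hypothetical helper, and the host and port are the values chosen earlier in this article.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call, to be run from your own machine:
# is_port_open("<instance IPv4 address>", 8888)
```

A result of False can mean that JupyterLab is not running, that the security group does not allow your IP address, or that the address is wrong.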
To terminate JupyterLab, run `ps -ef | grep jupyter` to find the running Jupyter processes, then use `kill -9` on each PID to kill them.
Step 4. Example of an atoti BI application
Using the 01. Basics.ipynb notebook from the atoti tutorial, we will create our measures and visualize them in the Jupyter notebook.
We can publish the visualization from the notebook to the web application for use in a dashboard, or we can add new widgets to a dashboard:
We can save the dashboard and share it with our peers!