Developing a web app from scratch — Part 1

Published in Data Decoded, Sep 16, 2018

One of the major problems most data science companies face is the gap between the analysis (and/or statistical modelling) and the way its results are presented and used. Yes, you might have built great machine learning models, but if those models cannot be translated into business value, then they’re as worthless as a traffic light in GTA 5.

When it comes to turning a model into something the business can use, such as a dashboard or a scenario planner, most data science companies fall short. They typically outsource the tool-building process to a software company, which is where the gap begins to show. I strongly believe that you cannot make a data science project consumable unless you’ve worked with the data extensively. I believe that the intersection of development and data science is where the magic happens.

In this article series, I will be covering the major aspects of building a generic web app based tool. We will also go over the architecture, advantages and a basic hands-on session so that you can get started quickly without going into too much detail.

These are the topics which we will be covering broadly in this article.

Architecture — what does front-end/back-end/middle-ware mean?
Introduction to Linux (Ubuntu)
Setting up an Ubuntu system for back-end/middle-ware development from scratch
Creating a simple API (a messenger of sorts) which provides some data when asked for, which is then displayed to the user

The front-end aspect of this app will be covered in the next part of this article.

Architecture

A typical web app consists of three layers:

Front-end: This is the user interface layer with which the user interacts directly. It ultimately runs as HTML, CSS and JS, although the actual code is written in the language of the framework being used, e.g. React (a front-end framework) apps are usually written in JSX.
Middle-ware: This layer is generally referred to as the “plumbing” of a system. It does the groundwork of handling requests (in the form of API calls) between the front-end and the back-end, acting as a sort of glue between the UI and the database.
Back-end: Databases and other data stores generally live at this level. MySQL, PostgreSQL and various off-the-shelf pieces of software come to mind when we talk about the back-end.

In the picture above, I’ve drawn a couple of blocks which demonstrate how a three tier web app architecture is analogous to the way a restaurant operates.

A side note on APIs

I’ve used the term ‘API’ multiple times by now and you might be wondering what it means. API stands for ‘Application Programming Interface’.

In simple terms, it is a piece of code (part of the middle-ware) that receives requests and sends responses. For example, if the user requests the age of a certain person, the API will return the age of that person.
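To make the request/response idea concrete, here is a minimal sketch in plain Python with no web framework involved. The PEOPLE dictionary and the get_age_api function are hypothetical stand-ins for a real database and a real API endpoint.

```python
import json

# Hypothetical data store standing in for a real database.
PEOPLE = {"alice": 34, "bob": 27}


def get_age_api(request_params):
    """Receive a request (a dict of parameters) and return a JSON response string."""
    name = request_params.get("name", "").lower()
    if name not in PEOPLE:
        # Unknown person: respond with an error message instead of an age.
        return json.dumps({"error": "person not found"})
    return json.dumps({"name": name, "age": PEOPLE[name]})


print(get_age_api({"name": "Alice"}))
```

A real API does the same thing over HTTP: the request arrives as a URL with parameters, and the response goes back as JSON.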

Calling an API is like clicking on a link and getting some data. Try this: Visit this link.

Voila! You just called an API and got some data as the response. On a web tool, whenever the user clicks on a certain button, an API like the one above is called (visited) and some data corresponding to the user’s request is returned.

That’s all you need to know about APIs. Let’s get our hands dirty!

Introduction to Linux(machines)

While I’m assuming a basic level of familiarity with the Linux(Ubuntu) OS, I will reiterate some key points.

We will be using the Linux shell (Bash) to execute most of the commands. This shell can be accessed by pressing Ctrl + Alt + T. It is similar to the Windows Command Prompt.

Note: Adding sudo before any command gives it administrative privileges, as is required in most cases.

Ideally, three different machines should be running the three layers of the app.
We will set up the database layer (back-end) first and then proceed with the middle-ware.
Shell scripts will be provided for setting up each layer automatically.
However, it is recommended that you go through the tutorial step by step so that you know what’s happening.
I’m assuming that we’re starting with a clean install of Ubuntu 14.04/16.04/18.04 LTS versions.

System Setup

Back-end — PostgreSQL Setup

We will be using PostgreSQL as our database. Setting it up is a fairly straightforward process.

Download the bash script here and execute it, providing the root password when prompted.

Now that PostgreSQL with pgAdmin4 has been installed, let’s start off our configuration by working with PostgreSQL. With PostgreSQL we need to create a database, create a user, and grant the user we created access to the database we created. Start off by running the following command:

$ sudo su postgres

Your terminal prompt should now say “postgres@yourserver”. If this is the case, then run this command to enter the SQL shell

$ psql

Once we’re in the psql shell, let’s create a user who will connect to the database.

CREATE USER sample_user WITH PASSWORD 'password';

ALTER ROLE sample_user SET client_encoding TO 'utf8';
ALTER ROLE sample_user SET default_transaction_isolation TO 'read committed';
ALTER ROLE sample_user SET timezone TO 'UTC';

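Now let’s create a database called ‘sample_db’. (PostgreSQL folds unquoted names to lowercase, so SAMPLE_DB in the command below is stored as sample_db.)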

CREATE DATABASE SAMPLE_DB;

Let’s grant the user we’ve created all permissions to access the database.

GRANT ALL PRIVILEGES ON DATABASE SAMPLE_DB TO sample_user;

Setting up pgAdmin4 for DB administration

Let’s confirm that the database we’ve created is up and running. While this can be done using the command line, I would like to take this as an opportunity to introduce the pgAdmin4 utility, which has a GUI for administering the PostgreSQL database.

Open pgAdmin4 by typing its name in the shell

$ pgadmin4

You should see a screen as shown below

The database server that’s running on your machine hasn’t been added yet. Let’s do that now.

Add your local postgres server to pgAdmin4

As we can see, the user has been created and the database has been added.

However, the database that you created will not be accessible to other users on the LAN/Internet unless you explicitly allow PostgreSQL to do so. Let’s fix this then.

Open the pg_hba file with root permissions

$ sudo gedit /etc/postgresql/10/main/pg_hba.conf

Add the following line to it and save. What we’re doing here is allowing password-authenticated (md5) connections to all databases, for all users, from any IPv4 address.

#Add to pg_hba conf
host   all  all 0.0.0.0/0 md5

We will also have to make a slight change to postgresql.conf, asking the server to listen on all network interfaces rather than only on localhost.

#Add to postgreSQL.conf
listen_addresses = '*'

Restart the PostgreSQL service so that both changes take effect. Now we’re all set in terms of the database. Let’s start with the back-end design and development.

Django — Introduction

Django is a free and open source web application framework, written in Python. A web framework is a set of components that helps you to develop websites faster and easier.

When building websites, you will realize that most of them use very similar components, such as user authentication. Django ships these commonly required components ready-made so that you don’t waste time rebuilding them.

We will be using Django to write the middle-ware layer of our application. Its primary use will be to write APIs which are then called by the front-end. Let’s get started!

Installing Django and setting up the development environment

For developing in Django, we will need:

PyCharm Community Edition — Install it by running the below command

$ sudo snap install pycharm-community --classic

A python virtual environment with some packages installed

This is slightly tricky, and if not done correctly, will lead to multiple issues down the line.

A virtual environment isolates all the packages needed for your Django project inside a folder so that they do not interfere with the global installations of those packages.

Let’s start by creating a project directory called ‘DjangoExample’. We will also create a new virtual environment called ‘venv’.

Setting up the project directory and virtual environment
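A minimal sketch of this step, assuming Python 3 (whose standard library includes the venv module). The function name create_project is hypothetical; the directory names follow the ones chosen above.

```python
import venv
from pathlib import Path


def create_project(project_dir="DjangoExample", env_name="venv", with_pip=True):
    """Create the project directory and a virtual environment inside it."""
    project = Path(project_dir)
    # Create the project folder if it does not exist yet.
    project.mkdir(parents=True, exist_ok=True)
    env_path = project / env_name
    # with_pip=True also installs pip into the new environment.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling create_project() from the directory that should hold the project creates DjangoExample/venv. In Bash, the environment is then typically activated with source DjangoExample/venv/bin/activate.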

A gentle intro to Django

We will be using Django to develop the back-end of our web application. Specifically, we will write APIs in Django which will take some input from the front-end, fetch and process the required data from the database and return it to front-end.

To work with the database, Django uses models. Models are the schema of your database. You define the schema of the database in models.py and migrate(transfer) it to the actual database. Here’s an example of a model.

An example model for a table called ‘BaseTable’ with name, CustomerID and Account

Although we can create a Django project manually using the django-admin command, we will be using this link to create the project and app files automatically. Head to the link and fill in the project name, app name and the schema of the tables that you want.

Creating models.py and the project files automatically

As we can see, the name of the project is Sample_project and the name of this app is sample_app_1. Each project can have multiple apps running under it, for e.g. your e-commerce project could have multiple apps like product_page, shopping_cart, help_page etc. running inside it.

The Django Model Builder will create all the required files for the project. Click on the “Download as project” button and unzip it inside the DjangoExample folder we created earlier using PyCharm.

Open PyCharm and you should now have a directory structure similar to the one shown below.

The outer level contains three items: the venv folder (your virtual environment files), Sample_project (the outermost directory for your project and apps) and the requirements.txt file (the list of Python packages necessary to run the app).
Inside the sample_project directory, we have

manage.py file: this file is used to run a development server on your local computer and perform a few other important functions. We will be covering it in detail later on.
sample_app_1 folder: contains app-specific files like models.py etc.
sample_project folder: contains settings and project-level files which are common across all apps

We will need one more library while pushing our data to the database, so let’s install it and get the setup out of the way. Remember that you will need to install all of these inside the virtual environment you’ve set up. The terminal prompt that you’re using should display the name of your virtual env in brackets as shown below.

# Run this command at the terminal to install dependencies for pyodbc
$ sudo apt-get install unixodbc-dev unixodbc-bin unixodbc python-dev python2.7-dev
# Then install pyodbc by running the following command
$ pip install pyodbc

Checking if everything works

We will run the backend of our app on our local server now to check if everything is in place.

cd to your project directory

$ cd PycharmProjects/DjangoExample/Sample_Project/

If you run ls, you should see manage.py present as below

Manage.py in the root of our project directory

Let’s get the server started for a quick check if our machine setup is complete.

$ python manage.py runserver

You should see the server start and the IP displayed. If you can see a screen similar to the above, we’re all set for the next phase!

Migrating the model to the database

As discussed earlier, the models.py file contains the schema(design of the tables). We will need to create that schema in the database so that we can push the data into the database.

Using the Django Model Builder, we’ve already created a schema for the table “BaseTable”, which contains the name, CustomerID and Account of a client. If you open your models.py file, it should have the following model already written.

Models.py — created using django model builder

Now let’s migrate the model — i.e. in simple terms, create the same schema in the database so that we can add some data into the DB.

Django, by default, uses an internal SQLite database. We will have to explicitly tell Django to use the PostgreSQL database we’ve created.

Let’s open up our project’s settings.py file.

We need to specify the database name, user, password, host (IP) and port. Let’s do all that here. By default, PostgreSQL runs on port 5432.
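As a rough sketch, the relevant fragment of settings.py might look like the following. The database name, user and password are the ones created in the PostgreSQL setup above; the HOST value is an assumption for a database running on the same machine.

```python
# settings.py (fragment): point Django at the PostgreSQL database created earlier.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "sample_db",        # database created with CREATE DATABASE
        "USER": "sample_user",      # role created with CREATE USER
        "PASSWORD": "password",
        "HOST": "127.0.0.1",        # assumption: replace with your database machine's IP
        "PORT": "5432",             # PostgreSQL's default port
    }
}
```

Django’s PostgreSQL backend also needs a PostgreSQL driver such as psycopg2 installed in the virtual environment.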

Now, let’s open up a terminal and migrate the models.

1. Running the makemigrations command with the app name (python manage.py makemigrations sample_app_1) tells Django to get ready for migrations

2. Output of step 1

3. The migrate command (python manage.py migrate) actually creates the schema in the database

Voila! Models in the database are created

Let’s verify that by going to pgAdmin4 and looking at the tables

The basetable we defined in models.py has been created in the database

Now, let’s upload some sample data to the database so that we can access it using the API.

Download the sample csv from here, create a data sets folder inside your project directory and save the csv there.

For uploading the data, we will write the following code in a python notebook or script and execute it. (Change the IP to your machine’s local IP)
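One possible shape for such an upload script is sketched below, under several assumptions. The function name upload_csv, the table name and the sample rows are hypothetical. The function works with any DB-API connection that uses '?' placeholders, which is the style shared by pyodbc and sqlite3, and the demo at the bottom runs against an in-memory SQLite database purely as a stand-in for the PostgreSQL connection.

```python
import csv
import sqlite3
import tempfile


def upload_csv(conn, csv_path, table, columns):
    """Insert every row of a CSV file into `table` through a DB-API connection."""
    # Identifiers are double-quoted so that mixed-case names like CustomerID survive.
    # Table and column names must come from trusted code, never from user input.
    column_list = ", ".join('"{}"'.format(col) for col in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = 'INSERT INTO "{}" ({}) VALUES ({})'.format(table, column_list, placeholders)
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = [tuple(row[col] for col in columns) for row in reader]
    cursor = conn.cursor()
    cursor.executemany(sql, rows)
    conn.commit()
    return len(rows)


# Demo: an in-memory SQLite database and a temporary CSV file as stand-ins.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    'CREATE TABLE "basetable" ("name" TEXT, "CustomerID" INTEGER, "Account" TEXT)'
)
with tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False) as tmp:
    tmp.write("name,CustomerID,Account\nAlice,1,ACC-001\nBob,2,ACC-002\n")
inserted = upload_csv(demo_conn, tmp.name, "basetable", ["name", "CustomerID", "Account"])
print(inserted)
```

Against the real database, the connection would come from pyodbc (or another driver) and the table name would be whatever appears in pgAdmin4 after the migration.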

If we look at the table we created in pgAdmin4, we see that it has been populated.

Woah! Now that everything’s set, let’s start with the API itself.

Writing a simple API

Create a python file called API_<whatever_you_want> inside your app folder.

After creating the API_Sample file, it should look like this.

In this API, we read the basetable values, convert them to JSON objects, and return those JSON objects when the API is called.

That’s it, you’ve written your first API.

Understanding how URLs work in Django

A URL is a web address. You can see a URL every time you visit a website — it is visible in your browser’s address bar. (Yes! 127.0.0.1:8000 is a URL! And syedmisbah.github.io is also a URL).

Every page needs a URL, and your app needs one too.

Let’s open up the project’s urls.py file and see what’s there in it.

Whenever you open your web app, Django searches for a specific pattern(using regex) in the url. In the above case, two patterns are defined.

The first one is the pattern for the admin dashboard. Whenever that pattern is matched, Django will open the administration dashboard (which comes by default with Django). For example, opening 127.0.0.1:8000/admin would open the admin dashboard.
The second pattern is related to the app that we created. Whenever you open 127.0.0.1:8000/Sample_App_1/<some_link>, Django will go to the urls.py file in your app and search for patterns there.

Let’s open the urls.py of your app(Sample_App_1) and edit to include the API we’ve written.

We’ve added two lines:

The first line imports the GetBaseTableAPI from the API_Sample file that we created earlier.
The second line (urlpatterns) defines a pattern for the same. It means that whenever a URL matching 127.0.0.1:8000/Sample_App_1/CalltoAPI is hit (opened), GetBaseTableAPI is called.

Do note that the first part of the URL, i.e. 127.0.0.1:8000/Sample_App_1, comes from the project-level urls.py, while the second part is matched by the app-level urls.py.
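The two-level matching can be illustrated with a small sketch that uses only Python’s re module. This is not Django’s actual implementation, and the pattern lists and the resolve function are hypothetical; it only mimics the idea that the project level consumes a prefix and hands the remainder to the app level.

```python
import re

# App-level patterns: matched against what is left after the project-level prefix.
APP_PATTERNS = [
    (re.compile(r"^CalltoAPI/?$"), "GetBaseTableAPI"),
]

# Project-level patterns: a prefix maps either to a view name or to a nested list.
PROJECT_PATTERNS = [
    (re.compile(r"^admin/"), "admin_dashboard"),
    (re.compile(r"^Sample_App_1/"), APP_PATTERNS),
]


def resolve(path, patterns=PROJECT_PATTERNS):
    """Return the name of the view that would handle `path`, or None."""
    for regex, target in patterns:
        match = regex.match(path)
        if not match:
            continue
        if isinstance(target, list):
            # Hand the unmatched remainder to the nested (app-level) patterns.
            found = resolve(path[match.end():], target)
            if found is not None:
                return found
            continue
        return target
    return None


print(resolve("Sample_App_1/CalltoAPI"))
print(resolve("admin/"))
print(resolve("unknown/"))
```

The paths are written without the host and leading slash, which is also the part of the URL that Django’s patterns are matched against.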

Adding permissions to the project settings

Before the API is accessible over the LAN, we need to add permissions to the Django project settings file. Open the settings.py inside your project folder and add the following line.

ALLOWED_HOSTS = ['*']

Let’s check the API

Start your local server using the terminal

python manage.py runserver

Once the server is up and running, visit this link: 127.0.0.1:8000/Sample_App_1/CalltoAPI

If you’ve done everything correctly up till this stage, you should get the results of the API from the database

That was it! You’ve created the back-end and middle-ware layer of your web application. We will continue with integration of these two layers with the front end in the next article.

The entire codebase for this example is available on github here for reference, if you come across any issues with the code.

Please do let our team @ Decoding Data know your thoughts, questions or suggestions in the comment section below.