Learning How to Build a Web Application

Robert Chang
Apr 13, 2016 · 14 min read

Lessons Learned on Web Development, Data Visualization, and Beyond

Motivation

However, as I took on more empirical work in graduate school, I realized that data visualization was often far more effective at communicating results than LaTeX alone. In deciding what information to convey to my readers, I learned that the art of presentation is far more important than I had originally thought.

Luckily, I have always been a rather visual learner, so beautiful data visualizations always grab my attention and propel me to learn more. I began to wonder how people published their beautiful work on the web; I became a frequent visitor to Nathan Yau’s FlowingData blog; and I continue to be in awe when discovering visualizations like this, this, and this.

After many moments of envy, I couldn’t resist learning how to create them myself. This post is about the journey I took to piece the puzzle together. By presenting the lessons I learned here, I hope to inspire all those who are interested in learning web development and data visualization to get started!

Setting Learning Goals

Learning Goals

  • Learn basic AJAX so you can hit your RESTful API from client-side code. I find jQuery the nicest way to get started with this. Also, don’t get hung up on low-level DOM manipulation in Javascript. Learn Selectors and take advantage of all the hard work jQuery developers have put into the problem.
  • Learn how to approach layout on the web. This means HTML + CSS. Twitter’s Bootstrap framework is a lovely, well-designed starting point.
  • Learn how to build interactive graphics on the client side. D3.js is the most mature way to do this.

My favorite part of his answer is the following two paragraphs:

Finally, a bit on mindset. I like to think of building web applications as another way to express your ideas, just like giving talks, writing papers, or distributing R[Python] packages. A working prototype based on a new algorithm or dataset is often far more compelling than a report or command-line tool, especially to non-data-scientists.

Your web apps don’t have to be completely polished and productionised to be useful in this capacity. But, you do need to communicate your work just as clearly in an interactive app as would in any other medium.

I knew the only way I could learn how these technologies work together was to build something integral, useful, and fun. After some planning, I decided to build a Calendar Visualizer that uses OAuth to connect to my Google Calendar and displays how I allocate and spend my spare time.

The landing page of my web application

Web Development 101 (with Flask)

What is a Web Framework?

  • At the end of the day, all a web application really does is send data (think HTML) back to browsers/clients through the Hypertext Transfer Protocol (HTTP)
  • One of the main challenges of building a web application, then, is figuring out how to process each request and return the right response
  • Web frameworks make these challenges a lot easier because they abstract away much of the lower-level work that we would otherwise have to do
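To get a feel for the lower-level work a framework hides, here is a minimal sketch of a web application written against Python’s standard WSGI interface, with no framework at all. The `/hello` path and the `call` helper are made up for illustration; the point is only that matching the URL, choosing a status, and setting headers all have to be done by hand.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A bare WSGI application: inspect the request, build the response by hand."""
    path = environ.get("PATH_INFO", "/")
    if path == "/hello":  # hypothetical URL, matched manually
        status, body = "200 OK", "<h1>Hello, world!</h1>"
    else:
        status, body = "404 Not Found", "<h1>Not found</h1>"
    payload = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)
    return [payload]


def call(path):
    """Simulate one request without opening a socket; returns (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response)).decode("utf-8")
    return captured["status"], body


if __name__ == "__main__":
    print(call("/hello"))
    print(call("/missing"))
```

A framework such as Flask replaces the `if path == ...` chain, the header bookkeeping, and the byte encoding with a few declarative lines.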

The Big Picture

Image Credit: G.L. Heileman, Coursera MOOC, “Web Application Architectures”

A More Granular Picture

Image Credit: G.L. Heileman, Coursera MOOC, “Web Application Architectures”

Generally speaking, there are three essential layers in a web application:

  • Front-End Layer (Blue): This layer is where technologies such as HTML, CSS, and JavaScript create the look and feel of our application.
  • Application Layer (Green): This is the middle layer where business and presentation logic work together to deliver the response back to the users.
  • Database Storage Layer (Orange): This layer is where data is stored, and it is what enables a data-rich application.

Depending on your interests and goals, you might develop more specific skills in one area than in the others. Given that my goal was to see how everything worked together, I took a breadth-first search approach and learned just enough to see how each layer works. In the following sections, I will dive into each layer in more detail and highlight some of the big ideas and lessons learned.

Application Layer

Routes

When Flask processes an HTTP request, it uses this information to figure out which view it should pass the request to. The view function can then return data in a variety of formats (HTML, JSON, plain text) that Flask will use to create an HTTP response. Let us demonstrate this with an example:
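A minimal sketch of such a view is given here, assuming an application object named `webapp` and a small hard-coded dictionary in place of real user data (both the data and the function name `user` are illustrative):

```python
from flask import Flask

webapp = Flask(__name__)

# Hypothetical data; a real application would read this from a database.
USERS = {"name": "Ada", "favorite_activity": "Running"}


@webapp.route("/user")
def user():
    """View for /user: builds an HTML table by hand and returns it as a string."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td></tr>".format(key, value)
        for key, value in USERS.items()
    )
    return "<table>{}</table>".format(rows)


if __name__ == "__main__":
    # Flask's test client issues a request without starting a server.
    response = webapp.test_client().get("/user")
    print(response.status_code)
    print(response.get_data(as_text=True))
```

Note how the HTML is assembled inline with string formatting; this works, but it quickly becomes hard to read as the page grows.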

A decorator that matches on /user

The decorator webapp.route, upon receiving a request for /user, invokes the view, which returns an HTML table. This is what happens under the hood when a user visits the /user page (see screenshot below).

Big Idea / Lesson Learned: Routes are the fundamental building blocks that enable client-server interaction! To see another simple example of how this works, see Philip Guo’s short but instructive tutorial.

Templates

The most intuitive explanation of templates again comes from Jeff Knupp:

HTML Templating is similar to using str.format(): the desired output is written with placeholders for dynamic values. These are later replaced by arguments to the str.format() function. Imagine writing an entire web page as a single string, marking dynamic data with braces, and calling str.format() at the end. Both Django templates and jinja2, the template engine Flask uses, are designed to be used in this way.
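To make the analogy concrete, here is a tiny sketch using nothing but `str.format()`. The page content and the `render` helper are invented for illustration:

```python
# A whole page written as one string, with braces marking the dynamic parts.
PAGE = "<html><body><h1>Hello, {name}!</h1><p>You have {count} events.</p></body></html>"


def render(name, count):
    """Fill the placeholders, much as a template engine fills a template."""
    return PAGE.format(name=name, count=count)


if __name__ == "__main__":
    print(render("Ada", 3))
```

Template engines build on the same idea and add loops, conditionals, inheritance, and HTML escaping on top.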

This design enables us to create scaffolds for different but similarly structured HTML pages. It makes presentation logic very customizable and reusable. Let’s revisit our view and see how a template can help:

Functionally, this view does exactly the same thing as before: it returns the same HTML table. The only difference is that the function body is now much more readable, because all it does is render HTML. Where is the HTML code then? It is modularized in user.html:

user.html

Notice that this file does not look like our typical HTML page. It is, in fact, templatized:

  • {{ }} represents a placeholder: {{ info }} will be replaced by the data passed in from user_dict.
  • {% %} represents control flow: {% for info in user_dict %} paired with {% endfor %} creates a for loop that generates multiple <td> tags
  • Template Inheritance: Templates can extend or inherit from other templates using {% extends "base.html" %}; they can also include other child templates with {% include "other.html" %}

All these constructs help us write flexible HTML templates, and allow us to separate what to present from how to present it.

Big Idea / Lesson Learned: Templates do not change what is presented to the users, but they make the how much more organized, customizable, and extensible. For more examples, check out this detailed Jinja template documentation.
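The sketch below shows these constructs in a /user view. To stay self-contained it keeps the template in a string and uses Flask’s `render_template_string`; in a real project the same markup would live in templates/user.html and be loaded with `render_template`. The contents of `user_dict` are hypothetical.

```python
from flask import Flask, render_template_string

webapp = Flask(__name__)

# Stand-in for templates/user.html, kept inline so the sketch is self-contained.
USER_TEMPLATE = """
<table>
  {% for info in user_dict %}
  <tr><td>{{ info }}</td><td>{{ user_dict[info] }}</td></tr>
  {% endfor %}
</table>
"""


@webapp.route("/user")
def user():
    """The view only gathers data; all markup lives in the template."""
    # Hypothetical data. Iterating over a dict in Jinja yields its keys.
    user_dict = {"name": "Ada", "favorite_activity": "Running"}
    return render_template_string(USER_TEMPLATE, user_dict=user_dict)


if __name__ == "__main__":
    response = webapp.test_client().get("/user")
    print(response.get_data(as_text=True))
```

The view no longer contains any HTML, so the markup can change without touching the Python code.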

Data Layer + RESTful APIs

Database, SQL, and CRUD operations

With the table created, we can execute SQL statements to populate the table and perform additional CRUD (Create/Read/Update/Delete) operations. When the application needs to query this data, our database is responsible for handing the data from the data layer to the application layer:
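As a rough illustration of these CRUD operations, the following sketch uses Python’s built-in `sqlite3` module against an in-memory database. The table name dim_events comes from this post, but the columns and rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Hypothetical schema for the events table.
conn.execute(
    """
    CREATE TABLE dim_events (
        id INTEGER PRIMARY KEY,
        event_name TEXT NOT NULL,
        event_type TEXT NOT NULL,
        duration_minutes INTEGER NOT NULL
    )
    """
)

# Create: insert two rows.
conn.executemany(
    "INSERT INTO dim_events (event_name, event_type, duration_minutes) VALUES (?, ?, ?)",
    [("Morning run", "Exercise", 45), ("Team sync", "Work", 30)],
)

# Update: change the duration of one event.
conn.execute(
    "UPDATE dim_events SET duration_minutes = 60 WHERE event_name = ?",
    ("Morning run",),
)

# Delete: remove all events of one type.
conn.execute("DELETE FROM dim_events WHERE event_type = ?", ("Work",))

# Read: fetch whatever is left.
rows = conn.execute(
    "SELECT event_name, event_type, duration_minutes FROM dim_events"
).fetchall()
print(rows)
```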

As an example, the show_all_events view needs to display all the events. One way to surface this data is to execute a SQL query inside the view function. The code is simple and readable, but unfortunately problematic:

  • Hardcoding SQL logic in the application code is error-prone, just like hardcoding HTML inline. Often, there will be schema updates, table migrations, or changes in business logic. All these changes could break the query.
  • There could be security concerns. We generally do not want to expose our data models in the application code, because the application could suffer from malicious attacks such as SQL injection.
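The injection risk is easy to reproduce. The sketch below is not the ORM approach discussed in this post; it simply contrasts a query built by string formatting with one that uses bound parameters, again with `sqlite3` and made-up data. The function names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_events (event_name TEXT, event_type TEXT)")
conn.executemany(
    "INSERT INTO dim_events VALUES (?, ?)",
    [("Morning run", "Exercise"), ("Team sync", "Work")],
)


def events_unsafe(event_type):
    """Builds the query by string formatting: user input becomes part of the SQL."""
    query = "SELECT event_name FROM dim_events WHERE event_type = '{}'".format(event_type)
    return conn.execute(query).fetchall()


def events_safe(event_type):
    """Passes user input as a bound parameter: it is only ever treated as a value."""
    query = "SELECT event_name FROM dim_events WHERE event_type = ?"
    return conn.execute(query, (event_type,)).fetchall()


# Input crafted so the WHERE clause becomes: event_type = 'nothing' OR '1'='1'
malicious = "nothing' OR '1'='1"

if __name__ == "__main__":
    print(events_unsafe(malicious))  # the filter is bypassed, every row comes back
    print(events_safe(malicious))    # no event has this literal type, so no rows
```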

ORM and SQLAlchemy

An object-relational mapper (ORM) is a code library that automates the transfer of data stored in relational database tables into objects that are more commonly used in application code. It allows developers to access data from a backend by writing Python code instead of SQL queries.

One of the most popular ORMs used with Flask is SQLAlchemy. Instead of creating the dim_events table in SQLite directly, it allows us to define the same events table in Python as a class:

Defining a Table via SQLAlchemy

More importantly, SQLAlchemy allows us to represent data models as class instances, so interacting with the database from application code becomes much more natural. The example below only uses the all and filter operators (equivalent to SELECT * and WHERE in SQL, respectively), but SQLAlchemy is much more versatile than that!

Let’s see all of this hard work in action when we visit /dbdisplay/Exercise:

Big Idea / Lesson Learned: A database enables data to persist in an application. The proper way to query data in the application code is to leverage an ORM such as SQLAlchemy. To learn more, check out the official documentation & tutorial.
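Here is a minimal sketch of both ideas, defining the table as a class and querying it with all and filter. It uses plain SQLAlchemy (version 1.4 or later is assumed for the `sqlalchemy.orm.declarative_base` import) rather than a Flask extension, and the `Event` class name, columns, and rows are hypothetical.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Event(Base):
    """Maps the dim_events table to a Python class (columns are hypothetical)."""

    __tablename__ = "dim_events"

    id = Column(Integer, primary_key=True)
    event_name = Column(String, nullable=False)
    event_type = Column(String, nullable=False)
    duration_minutes = Column(Integer, nullable=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)  # emits the CREATE TABLE statement for us
session = sessionmaker(bind=engine)()

session.add_all([
    Event(event_name="Morning run", event_type="Exercise", duration_minutes=45),
    Event(event_name="Team sync", event_type="Work", duration_minutes=30),
])
session.commit()

# Roughly: SELECT * FROM dim_events
all_events = session.query(Event).all()

# Roughly: SELECT * FROM dim_events WHERE event_type = 'Exercise'
exercise = session.query(Event).filter(Event.event_type == "Exercise").all()

if __name__ == "__main__":
    print(len(all_events), [e.event_name for e in exercise])
```

Each row comes back as an `Event` instance, so the rest of the application works with attributes such as `e.event_name` instead of raw tuples.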

RESTful API endpoints

A good way to think about RESTful APIs is that they act as functions: functions that take in specific parameters as inputs and output standardized data in a controlled manner. The entire execution of the “function call” happens via HTTP: arguments are passed as part of the URL parameters, and data is returned by the function as an HTTP response.

With tools like SQLAlchemy, building API endpoints is actually not too different from what we have already done. Views take in the URLs and issue specific queries in order to return results based on the parameters. Below are the two views that we have seen before, but slightly modified. Notice that the only thing that really changes is that the data is now returned as JSON.

A list of API endpoints
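A minimal sketch of two such endpoints follows. The URL patterns, function names, and the in-memory `EVENTS` list (standing in for a database query) are assumptions for illustration; `jsonify` is Flask’s helper for building a JSON response.

```python
from flask import Flask, jsonify

webapp = Flask(__name__)

# Hypothetical in-memory data standing in for a database query.
EVENTS = [
    {"event_name": "Morning run", "event_type": "Exercise", "duration_minutes": 45},
    {"event_name": "Team sync", "event_type": "Work", "duration_minutes": 30},
]


@webapp.route("/api/events")
def api_all_events():
    """Returns every event as JSON."""
    return jsonify(events=EVENTS)


@webapp.route("/api/events/<event_type>")
def api_events_by_type(event_type):
    """The URL segment acts as the function argument that filters the data."""
    matches = [e for e in EVENTS if e["event_type"] == event_type]
    return jsonify(events=matches)


if __name__ == "__main__":
    client = webapp.test_client()
    print(client.get("/api/events").get_json())
    print(client.get("/api/events/Exercise").get_json())
```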

Let’s see how things work when we hit these API endpoints:

Big Idea / Lesson Learned: APIs are convenient endpoints for developers to expose proprietary data to the outside world in a controlled manner. The data request is usually specified through parameters in the URL, and the data is returned via HTTP, often as JSON. I highly recommend reading Miguel Grinberg’s long but engaging post to learn more.

Next up, we will see how everything (routes, templates, database, API endpoints) fits together to create what we were after in the first place: data visualizations.

Front End Layer

Creating Interactive Charts with D3.js

While they are educational, I always had little idea how things really work in a real application. In particular, where does the data come from? As I gained more experience, I learned that D3 actually offers a wide range of options for loading data into the browser; one particular method is called d3.json.

A JS example to retrieve data into the browser using d3.json

This makeGraph function takes two arguments: a URL and a callback function (not implemented in the above code snippet).

  • The URL is where we will ask for data
  • The callback function will execute once the data arrives (asynchronously)

The callback function will take that data, bind it to DOM elements, and display the actual visualization in the browser. This is usually the place where we write our D3 visualization code.

Let us walk through a more elaborate example. In my web app, there is a tab called “Calendar View” that allows a user to display her activities in the form of calendar heatmaps:

I tend to run on Saturdays and play basketball on Sundays

In this visualization, each cell represents a single day. The color intensity represents how much time I spent on a particular activity on that day. In the plot above, each highlighted block means that I did some form of exercise on that day. It’s obvious from the chart that my New Year’s resolution was to exercise more regularly in 2016.

Where does it fetch the data, and how does it display this information? How does one construct a calendar? Let’s deconstruct this step by step:

  • First, when a user goes to the ‘Calendar View’ tab, the URL /calendar triggers the view plot_d3_calendar, which then renders the calendar.html template. This is the familiar pattern of routes plus templates.
Calendar View Decorator
  • calendar.html contains everything we need to render the HTML, but an important part of this file is a JavaScript file called calendar.js, which will be executed as part of this rendering. Notice that the HTML file is templatized.
calendar.html
  • In calendar.js, I defined an event listener and a callback function called makeCalendar. When a user clicks on a specific event button, the click event triggers makeCalendar to query the API endpoints for data. The code inside d3.json is responsible for creating the D3 visualization. Notice that we are hitting the API endpoint we built earlier.
The part in calendar.js that renders the calendar visualization
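The server side of these steps can be sketched in a few lines of Flask. The view name plot_d3_calendar and the /calendar URL come from the description above; the API URL, the inline stand-in for calendar.html, and the sample data are assumptions for illustration. The D3 code in calendar.js is not shown.

```python
from flask import Flask, jsonify, render_template_string

webapp = Flask(__name__)

# Stand-in for calendar.html; the real template would also load D3 itself.
CALENDAR_TEMPLATE = """
<h1>{{ title }}</h1>
<div id="calendar"></div>
<script src="/static/calendar.js"></script>
"""

# Hypothetical minutes per day, standing in for a database query.
MINUTES_BY_DAY = {"Exercise": {"2016-01-02": 45, "2016-01-03": 60}}


@webapp.route("/calendar")
def plot_d3_calendar():
    """The route renders the page, which in turn pulls in calendar.js."""
    return render_template_string(CALENDAR_TEMPLATE, title="Calendar View")


@webapp.route("/api/calendar/<event_type>")
def api_calendar(event_type):
    """calendar.js would request this URL and receive JSON to draw."""
    return jsonify(MINUTES_BY_DAY.get(event_type, {}))


if __name__ == "__main__":
    client = webapp.test_client()
    print(client.get("/calendar").get_data(as_text=True))
    print(client.get("/api/calendar/Exercise").get_json())
```

The browser makes two separate requests: one for the page, and a second one, issued by the JavaScript, for the data.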

For each visualization that is rendered, this is essentially what happens under the hood:

https://www.youtube.com/watch?v=-vJXKNODlFQ

Big Idea / Lesson Learned: When a request triggers a view, the view renders the HTML, which in turn loads the JavaScript file in the browser. The D3 code in the JavaScript file issues a query to the API endpoint, and the returned data is bound to actual DOM elements to be shown in the web browser. Routes, templates, databases, and APIs all work together to get this done! To learn more, here is another illustrative example that studies BART data.

Beautifying the UI Using Twitter Bootstrap

“A front-end toolkit for rapidly developing web applications. It is a collection of CSS and HTML conventions. It uses some of the latest browser techniques to provide you with stylish typography, forms, buttons, tables, grids, navigation and everything else you need in a super tiny resource.”

Twitter Bootstrap is extremely powerful because it enables us to upgrade the look and feel of an application very easily. Below is an example where headers and table formatting essentially come for free because of Twitter Bootstrap:

Big Idea / Lesson Learned: If you are interested in design and UI, don’t reinvent the wheel. There is no shortage of layouts, components, and widgets to play with in Twitter Bootstrap. To learn more, I recommend this tutorial.

Summary

For data scientists, there are certainly more lightweight approaches to producing (interactive) data visualizations, using tools such as ggplot2, ggvis, or shiny. But I think learning how a web application works in general also makes one a stronger data scientist. For example, if you ever need to figure out how data is logged in an application, knowing how web applications tend to be built can help you a great deal in navigating the codebase to do data detective work. This knowledge also helps you establish a common language with engineers in more technical discussions.

If you are inspired, there are many more resources written by programmers who are much more qualified than I am on this topic (see here and here).

Start Flasking and keep hacking!

I would like to thank Krist Wongsuphasawat, Robert Harris, Simeon Franklin and Tim Kwan for giving me valuable feedback on my post. All mistakes are mine, not theirs.
