Responsive Photosets

Creating beautiful photosets with jQuery and CSS


Erratum: This writing has been updated as of 16th December 2013, for a minor calculation mistake that led to distortion of aspect ratio when margins are accounted for.

As I was developing functionality for a new personal blog, I ran into an issue — being an avid amateur photographer myself, I would love to display images in a photoset. One of the charms of Tumblr is that the photosets allow you to display multiple images in a certain layout, and I have always wanted to replicate this, although with a slightly different implementation. As a person who have dipped his feet in web design, I am also acutely aware how important fluid or responsive layouts is — so I sought to incorporate this very concept into a flexible photoset.

The requirements

There are some simple requirements to the aforementioned photoset:

  1. Flexibility, such that it fits into a fluid width layout
  2. Customizability, such that one can dictate the layout as desired, preferably in a straight forward, fuss-free manner — and such that it can also be adapted into a WordPress theme function
  3. Equal height images, such that all images fit snugly into a single row, but their widths are adjusted accordingly so that all images on the same row fill the entire width, not more and not less, of its parent container, while preserving their individual aspect ratios.

Now, where do we begin? I searched high and low, but there doesn’t seem to be a satisfactory solution anywhere. While WordPress offers a tiled mosaic display (by enhancing the native [gallery] shortcode) with their JetPack plugin, I did not want all the bells and whistles of the plugin, which bloats the site. Also, I did not need images to span rows (which requires more complicated code work), so the idea of installing JetPack seemed even less tantalizing.

In the end, it seems like I would have to write my own jQuery function — and that screams challenge. The first two requirements are easy to satisfy, but the last one left me in a tizzy… until I had an epiphany on a very boring Tuesday night.


Here is the actual demo, hosted with CodePen, of the technique I will introduce to you in this article.

HTML markup

The HTML markup is rather straight forward:

<div class="photoset">
<div class="photoset-row">
<figure class="photoset-item">
<a href=""><img src="" alt="" title="" /></a>
<figcaption></figcaption>
</figure>
<!-- multiple images per row is possible -->
</div>
    <!-- Additional rows are possible -->
<div class="photoset-row">
<!-- more photoset items -->
</div>
</div>

To those who are doubtful of nesting other block-level elements within the anchor element, this is actually possible (and semantically valid) in HTML5 — provided that there are no interactive content in it. Just remember to declare the right doctype at the start of your file.

The <a> element allows you to link to larger versions of the thumbnail image, or even trigger modal boxes. I will leave these functionality out not only for the sake of brevity, but also because there are not necessary for the core functionality of the photoset.

Styling is rather straight-forward, and I highly recommend using a CSS reset. What is achieve in the CSS code block below can be summarized as follow:

  • Creating general styles for each row,
  • Floating individual items in each row, and
  • Styling the image caption, so that it appears upon hover. This exploits the CSS translate property.
.photoset {
overflow: hidden;
width: 100%;
}
/* Rows */
.photoset .photoset-row {
margin-bottom: .5rem;
overflow: hidden;
width: 150%; /* See comment after code for reason */
}
.photoset .photoset-row:last-child { margin: 0; }
/* Images on each row, known as an "item" */
.photoset .photoset-item {
display: block;
float: left;
margin: 0 .25rem;
}
.photoset .photoset-item:first-child { margin-left: 0; }
.photoset .photoset-item:last-child { margin-right: 0; }
/* Images and captions are contained within <figure> */
.photoset figure {
margin: 0;
overflow: hidden;
position: relative;
width: 100%;
height: 100%;
}
.photoset figcaption {
background-color: rgba(255, 255, 255, .75);
box-sizing: border-box;
font-size: .75rem;
padding: .5rem;
position: absolute;
bottom: 0;
left: 0;
width: 100%;
-webkit-transform: translateY(100%);
transform: translateY(100%);
transition: all .5s ease-in-out;
}
.photoset a:hover figcaption {
-webkit-transform: translateY(0);
transform: translateY(0);
}
.photoset img {
display: block;
max-width: 100%;
transition: all .125s ease-in-out;
}

You may ask why I have chosen to set the width of each row in the photoset to 150% instead of 100%. There is no risk of images running out of the stipulated width, since their final widths will be calculated with respect to the width of the parent container, the photoset itself.

Width exceeding 100% prevents floats from wrapping

By declaring a width exceeding 100%, yet not too taxing on the rendering system (since the browser actually does render a box of said specified width — this is why the -9999px text-indent was frowned upon), we can ensure that floats will not wrap when the browser window is being resized. The additional width beyond 100% allows floats to stay on a single line while our JavaScript function is busy with recalculating the correct image dimensions. I have picked 150% as a comfortable compromise, but you can definitely explore larger percentage values.

jQuery coming to the rescue

Now the fun part begins — making jQuery to perform the third requirement, which is to calculate the widths of each individual image in each row such that they are have equal height, and their combined widths, taking into account the horizontal margins between each images, fills the entire width of the photoset container. Challenging? Definitely. Impossible? I don’t take no for an answer.

Mathematical basis

Before I dump a whole chunk of code on the monitor, I would like to introduce you a walk-through of the mathematical basis behind the calculations — at least for the initiated. This is because it’s quite impossible to grasp the calculations performed later unless I explain them right now.

There is only one variable to be computed.
The rest are relative.

The only variable that has to be determined is the width of one single element. Since we know that the widths are related to a to-be-calculated uniform height by known aspect ratios, they are not exactly hard to calculate, either.

Let’s start with a very simple example: we have two images on a row. One is a square (aspect ratio: 1) and the other in a landscape orientation (aspect ratio: 2). Here is an animation describing how the widths, based on a single variable, can be computed. For the sake of simplicity, we let x denote the width of the image with the smallest aspect ratio.

Simple example of two images. The total width can be easily calculated with known aspect ratios.

And since we know the width of the container, it simple means that x is one-third the width of the container. Let’s say the container is 960px wide, that means the orange box will have a dimension of 320*320, while the green box will have one of 640*320. Not rocket science, right?

I hope that was clear. Now let’s move on to a slightly more complicated example, involving aspect ratios that are fractions. Then again, for ease of calculation and programming — since we can rely on the Math.min.apply() function — we again let x denote the smallest aspect ratio. The basic principles from the previous scenario apply:

A slightly more complicated scenario involving fractional aspect ratios.

This calculation is done strictly within the context of a row, because the uniform height only applies to images on the same row. A new calculation has to be made per row, for which we will use jQuery’s .each() function.

We can see from the above animated examples that the width of the narrowest image (the one with the smallest aspect ratio) can be expressed as a function of the overall width. Since we already know the overall width, we can easily solve for x. This mathematical basis therefore forms the code that we are using later.


This is of course, a simplified scenario. In the real life scenario where we want spacing to be between images, we establish horizontal margins for all image elements, and then remove the right margin for the first image (that is on the left) and the left margin for the last image (that is on the right). However, we let CSS handle these calculations, and provide concessions in our jQuery calculations later to account for there margins.

Wait for DOM to get ready

This is really important — we want to make sure that we bind events to elements that exist at runtime, so we wait for DOM to be ready. Remember, beauty takes time!

The first thing we want to do is to hide the images by forcing zero dimension on both axes, wait for them to load, and then run the calculations when the dimensions are finally available.

// Functionality for calculations
// and event binding can be done on DOM ready
$(function (){
// Rest of the code goes here
});

That should do it. Now we wait for the browser to be done loading the images — thanks Mathew for pointing out that the layout breaks upon initial load with my initial strategy of putting all my code in the DOM ready event (instead of the load event).

Having trouble getting it to load up proper each time, though: refreshes seem to randomly break it?

The reason for waiting a for the load event is because the image dimensions will not be accessible until they have been loaded by the browser.

// Trigger the actual calculations when resources are loaded
$(window).load(function (){
// Trigger resize event to perform calculations
$(window).resize();
});

Store original image dimensions in data objects

The original image dimensions are accessed with the native naturalWidth and naturalHeight properties after the image has been loaded — we then store them in jQuery data objects. They are supported in all modern browsers (that means excluding IE8 and below). Since the images are loaded, we can also safely display them by setting opacity to 1.

// Store original image dimensions
$('.photoset-item img').each(function () {
$(this).load(function (){
$(this)
.data('org-width', $(this)[0].naturalWidth)
.data('org-height', $(this)[0].naturalHeight)
.css({ opacity: 1 });
});
});

In order to support older browsers, you may access original image dimensions by creating a new image object for every <img> element encountered, and then fetching their actual width and height respectively:

 // Store original image dimensions
$('.photoset-item img').each(function () {
var img = new Image();
$(this)
.data('org-width', img.width)
.data('org-height', img.height)
});

The above code will create a new image object per <img> element encountered, and then fetching the natural dimensions of it.

Listen to resize event

When using jQuery to calculate dimensions, one has to remember to listen to the resize event triggered on the window element.

$(window).resize(function (){
// Perform calculation for each row independently
$('.photoset-row').each(function() {
// Rest of the code here
});
});

Even better: throttle or debounce the resize event, so it does not fire a gazillion times when you (or your very curious and amused visitor) resizes the viewport. Paul Irish wrote an amazingly nifty script for that purpose. Alternatively, you can rely on the jQuery throttle/debounce plugin. For performance’s sake, I have elected to use Paul’s script in my CodePen demo.

The rest of the code below goes into the .each() function.


First of all, we make things easy by shortening variables and fetching the parent container’s width:

// Declare some variables
var $pi = $(this).find('.photoset-item'),
cWidth = $(this).parent('.photoset').width();

And then we generate an array containing all the aspect ratios of images in the same row. This can be done by using jQuery’s very handy .map() function, where we iterate through all elements that match the selector. And since we have already created the jQuery data objects storing the original image dimensions, it is just a matter of fetching them and performing a simple arithmetic operation to get the aspect ratio:

// Generate array
var ratios = $pi.map(function() {
var orgWidth = $(this).find('img').data('org-width'),
orgHeight = $(this).find('img').data('org-height');
return orgWidth/orgHeight;
}).get();

Now, we want to fetch the sum of the ratio of widths of all images in that row. This is easily done with a for loop:

// Sum aspect ratios
var sumRatios = 0,
minRatio = Math.min.apply(Math, ratios);
for (var i=0; i<$pi.length; i++){
sumRatios += ratios[i]/minRatio;
}

Also, since we want to make concessions for margins, we sum all horizontal margins of all items in the same row:

// Sum all horizontal margins
var sumMargins = 0;
$pi.each(function (){
sumMargins += parseInt($(this).css(‘margin-left’)) + parseInt($(this).css(‘margin-right’));
});

Finally, we solve for x, which denotes the width of the narrowest image in any photoset row. x is simply derived by dividing the parent container width with the sum of the aspect ratios.

This allows us to calculate the dimensions for each individual photoset item. Given the non-trivial scenario where margins are involved, we will have to subtract the sum of horizontal margins of all child items from the parent width, so that the outer width of the item (CSS width + margins) will be the desired width.

// Calculate dimensions
$pi.each(function (i){
var minWidth = (cWidth-sumMargins)/sumRatios;
$(this).find('img')
.width(Math.floor(minWidth * (ratios[i] / (Math.min.apply(Math, ratios)))))
.height(minWidth / Math.min.apply(Math, ratios));
});

The reason why I have chosen to use Math.floor() is because of possible rounding errors when integral pixel values are eventually computed.

It is safer to underestimate the width of individual items

The maximum difference is technically half a pixel, since we are rounding up/down) such that the row does not overflow, causing floated elements to be pushed onto a new line and breaking the layout.

… and voila, you’re done! A working demo of the code is available on CodePen.

Extending into a WordPress theme function

It is also possible to extend what has been demonstrated above into a function that is inserted into the functions.php file of a WordPress theme. This allows me to use a selected shortcode, say, [photoset], to create the necessary markup.

Basic design principles

The shortcode will accept the following attributes:

  • id — specify the ID of the WordPress image attachment
  • layout—a comma-separated field value indicating the number of images included per row

An example usage will be:

[photoset id="12,15,17,26,46,45" layout="1,3,2"]

This tells the function to produce a photoset with 1 photo on the first row, 3 photos on the second and 2 photos on the third. Of course I could design a whole new GUI from ground-up that resembles that of Tumblr’s photoset feature, but I don’t see the need to do so if one can correctly sum the number of images in their photoset and dictate the layout manually (and correctly). Therefore, it is important that the sum of the comma-separated values in the layout field matches the total number of IDs specified.

The code

The tricky part of the code is simply to use PHP to parse the layout into a machine-friendly form. Also, probably due to bloated code, there is no easy way of accessing all attributes of image attachments in WordPress through a single function — in this case, I would like to get my hands on:

  • A lower resolution image as the thumbnail. Medium is a good size.
  • The original link to a higher resolution image — the Large size or the actual image file would suffice.
  • Metadata of the attachment, like title, caption, description and etc.

Luke over at the WordPress forum proposed a useful way of fetching image metadata — I have adapted part of his code in my WordPress function.

Much more to yearn for

This segment only serves as a demonstration of how to integrate the markup for photoset into a WordPress function accessible via shortcode through the WYSIWYG editor. Following my guilty admission that I am not a WordPress wizard, I am sure that there are many ways out there of which my code can benefit tremendously from supplementation, improvements and adaptations.

There are many ways of accessing attachment resources in WordPress, ranging from the very basic and essential image URLs, and to image metadata, description, caption and more. Unfortunately WordPress has created a handful of different (and royally confusing) functions to access so, and due to varied personal preferences I will leave it up to you, my reader who doubles as an avid programmer and kick-ass designer, to pick your poison.

You may even modify my code such that you can specify custom URLs to be associated with each image by creating and reading from a new attribute, say, “urls”.

Further improvements

I have been humbled by the reception this piece of writing has received since it was published. This has also spurred me to improve on the demo. One thing that was initially placed on the backburned was brought back to the drawing board — of forcing square thumbnails when the screen becomes narrower, and then forcing the floats to clear each other completely (therefore forcing images onto each line) when screen width reaches that of the mobile phone.

In this improved version of responsive photoset (see full demo, too), it is now mobile friendly, and offers a square thumbnail grid when viewed on tablets (where portraits may appear too small).

The major changes in this version include:

  • Use the meta tag to force width to device width and to set initial scale to 1 in the header element
  • Detecting viewport width upon firing of the resize() event
  • Forcing square thumbnails between screen sizes between 480 and 768px
  • Forcing floats to clear when screen sizes dips below 480px.
  • Hiding <img> elements on screen sizes larger than 480, and reassigning source image to the background-image for better control (especially during square thumbnail displays). This allows me to take advantage of the background-size: cover property.

Closing note

Creating a photoset is complicated, but only in a mathematical sense — once you have figured out how a minimum height can be easily derived from the proportions of individual elements as well as the actual width of the parent container, nothing will hold you back.

While I have tried my very best to check and correct for errors in my code, the code might not be efficient or perfect — it is up to your personal discretion to use and adapt my code to your liking. Leave a note if you know there are places where the code can be improved.

Next Story — Setting up SequenceServer
Currently Reading - Setting up SequenceServer

A heavily modified SequenceServer that is running from the Lotus japonicus genomics and proteomics resource website, manuscript in preparation (Mun et al., 2015)

Setting up SequenceServer

I was recently tasked with setting up SequenceServer on a work server in a production environment—the legacy NCBI wwwblast is extremely dated, and highly non-customizable. Deep integration with the API we have developed for a Lotus japonicus genomic and proteomic resource is not possible.

I have encountered several issues with setting up SequenceServer, but have managed to resolve all of them. I am therefore penning this guide for future users who might be interested in installing SequenceServer, but are at a loss of where to start, and what to do when they run into issues.

It is of course not possible to enumerate the installation instructions for all possible server configurations—therefore you should be warned that the SequenceServer I have installed is running off a RHEL7 server with a standard solution stack (Apache, PERL-CGI, PHP and MySQL) installed.

Ruby on Rails

If you have not installed Ruby on Rails on your server, you will have to do that. I strongly recommend using rbenv to install Ruby. It just makes life a lot easier ;) you may also follow this very handy guide for more details.

### Install Ruby using rbenv
$ git clone https://github.com/sstephenson/rbenv.git ~/.rbenv
$ echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bash_profile
$ echo 'eval "$(rbenv init -)"' >> ~/.bash_profile
$ git clone https://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build
$ source ~/.bash_profile

### Install ruby
$ rbenv install 2.1.0
$ rbenv rehash
$ rbenv global 2.1.0
### Update everything
$ gem update
$ gem update --system

Installing SequenceServer and its dependencies

Here, I will go through the installation instructions for the dependencies for SequenceServer and SequenceServer itself.

Dependencies

The gem you need for SequenceServer to work is rails and passenger.

### Install basic dependencies
$ gem install rails
$ gem install passenger
$ passenger-install-apache2-module

If you want to build SequenceServer yourself, you will also need to install rack and rake.

### Install basic dependencies
$ gem install rack rack-test rake

However, if you run into additional issues, you might want to install development dependencies such as Rspec, Rubocop, Capybara, CodeClimate and qmake.

### Install qmake
$ yum install qt-webkit-devel
$ find / -name *qmake
$ ln -s /usr/lib64/qt4/bin/qmake /usr/bin/qmake
### Install other development dependencies
$ gem install rspec rubocop capybara capybara-webkit codeclimate-test-reporter

SequenceServer

As SequenceServer is available as a Ruby gem, you just have to run:

$ gem install SequenceServer

Yes, it is as easy as that—but wait, you are not really done yet. We need to specify where can SequenceServer run from, and we typically do not run it off a port (4567 by default), but off a subdirectory, like /blast/ or the likes. In order to run SequenceServer off a subdirectory, we need to run Passenger at the correct paths—this involves fiddling with the Apache httpd configuration file.

Updating your Apache httpd configuration file

As mentioned before, if you intend to run SequenceServer as a Passenger app in a subdirectory, you will have to give Apache a set of very specific instructions:

  1. Your choice of subdirectory where you intend to serve SequenceServer
  2. The location of your SequenceServer public folder
  3. The user that Passenger should run SequenceServer as

Info 1: Choice of subdirectory

That is very simple. If you want to run SequenceServer from the /blast/ subdirectory of your URL, you will use “/blast”

Info 2: Location of SequenceServer public folder

This is a bit tricky. First, you will have to figure out where is SequenceServer installed. You can do this by running:

$ gem env

This will output something similar to the following in your terminal (emphasis my own):

RubyGems Environment:
- RUBYGEMS VERSION: 2.2.0
- RUBY VERSION: 2.1.0 (2013–12–25 patchlevel 0) [x86_64-linux]
- INSTALLATION DIRECTORY: /home/terry/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0
- RUBY EXECUTABLE: /home/terry/.rbenv/versions/2.1.0/bin/ruby
- EXECUTABLE DIRECTORY: /home/terry/.rbenv/versions/2.1.0/bin
- SPEC CACHE DIRECTORY: /home/terry/.gem/specs
# and more... #

Look at the INSTALLATION DIRECTORY value (see bolded line). Your gems should be installed in the /gems folder located in that directory. For example, the SequenceServer that I have installed is located at:

/home/terry/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/gems/sequenceserver-1.0.4

Info 3: User that Passenger is running SequenceServer with

The annoying thing is that processes spawned by Passenger, by Apache, is not by the user that is running Apache. If you want to check who is running Apache, just use the command:

$ ps aux | egrep '(apache|httpd)'
apache    3932  0.0  0.0 498956 10292 ?        S    Nov15   0:00 /usr/sbin/httpd -DFOREGROUND
apache 3933 0.0 0.0 498956 10292 ? S Nov15 0:00 /usr/sbin/httpd -DFOREGROUND
apache 3934 0.0 0.0 498956 10292 ? S Nov15 0:00 /usr/sbin/httpd -DFOREGROUND
apache 3935 0.0 0.0 498956 10292 ? S Nov15 0:00 /usr/sbin/httpd -DFOREGROUND
apache 3936 0.0 0.0 498956 10292 ? S Nov15 0:00 /usr/sbin/httpd -DFOREGROUND
apache 3979 0.0 0.0 498956 10288 ? S 00:15 0:00 /usr/sbin/httpd -DFOREGROUND

To remedy this, you will need to find out who owns the file config.ru in SequenceServer’s installation directory (which you have obtained in Info 2). You can check it by running:

$ cd /home/terry/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/gems/sequenceserver-1.0.4
$ ls -l config.ru

And it seems that it is me (user terry) not the Apache user (user apache):

-rw-r — r — 1 terry terry 116 Oct 1 16:33 config.ru

Modifying Apache httpd.conf

With all three information on your hand, you are ready to modify your Apache configuration. The location of the configuration file varies from server to server, but mine is located at /etc/httpd/conf/httpd.conf. If you want to play it safe, make a copy of your httpd.conf file and store it somewhere.

Nobody likes playing around with httpd.conf or .htaccess files, because they can be very difficult to understand at times. Therefore, I have made a template of what you should append to your httpd.conf file to get SequenceServer to work. Note that:

  • /path/to/app should refer to the directory of the SequenceServer gem itself (see Info 2)
  • /path/to/app/public should refer to the directory of the public folder in the SequenceServer gem (see Info 2, just append “/public” to end of path)
The template file for httpd.conf

And here is a working example of the portion of my httpd.conf file:

Update SequenceServer’s configuration file

SequenceServer needs some additional information before it can work. The most important two bits are:

  • The location of NCBI BLAST+ executables
  • The location of your BLAST directories

SequenceServer’s configuration file is located in the gem’s directory, named config.ru. In there you can specify it to load a configuration file located anywhere else on your server—as long as the user running Passenger and SequenceServer has access to it.

An example of my config.ru file:

require 'sequenceserver'
SequenceServer.init(:config_file => "/home/terry/.sequenceserver.conf")
run SequenceServer

I simply instructed SequenceServer to load additional configurations from the file “/home/terry/.sequenceserver.conf” upon initialisation. In that file I specify the number of threads SequenceServer should run on, and the location of both the NCBI BLAST+ binaries and BLAST directories.

---
:num_threads: 4
:bin: "/home/terry/bin/.sequenceserver/ncbi-blast-2.2.31+/bin"
:database_dir: "/home/terry/var/blast-db"

Multiple SequenceServer instances?

There might be edge use cases where you will want to have multiple SequenceServer instances. For example, you want to separate what databases the public has access to, and what databases your internal users (within your lab, or your research group for example) have access to. This can be done by:

  1. Having addition rules in your httpd.conf file — you can use IP-based access restriction to limit access to your private SequenceServer instance.
  2. Having a copy of the SequenceServer’s config.ru
  3. Symlinking /public to original gem’s /public folder

Additional Apache httpd configuration

You simply will have to duplicate the rules that make SequenceServer work, namely from the “alias” line to the end of the “</Directory>” block. For example, if I want to serve SequenceServer from both the /blast and /blast-private directories, with IP-based access restriction on the latter, I can do this:

Even though Apache 2.4 accepts legacy Apache allow/deny directives, you should migrate over to the require directive for future compatibility. There is a handy guide on updating directives when upgrading to Apache 2.4.

A different configuration file for each SequenceServer instances

Each SequenceServer instance is opened by a new config.ru file. For convenience’s sake, I have create another SequenceServer folder in the Ruby gems folder, called “sequence-server-1.0.4-private”, for example. In there, I have a new config.ru file—and that is all!

You may load different configurations for your config.ru files. For the public one:

# sequenceserver-1.0.4/config.ru
require 'sequenceserver'
SequenceServer.init(:config_file => "/home/terry/.sequenceserver.conf")
run SequenceServer

…and for the private SequenceServer:

# sequenceserver-1.0.4-private/config.ru
require 'sequenceserver'
SequenceServer.init(:config_file => "/home/terry/.sequenceserver-private.conf")
run SequenceServer

With different .conf file loaded, you can use different BLAST databases, binaries and etc.

Symlinking /public folder to original gem’s /public folder

Remember that since file paths are relative in the SequenceServer installation, you will be serving files in the /public folder from different directories where each config.ru is located in. To overcome this issue, you will have create symlinks to the original /public folder in the SequenceServer gem:

# Go to folder of alternative instance of SequenceServer
$ cd /home/terry/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/gems/sequenceserver-1.0.4-private
# Create symlink to the public folder in actual gem
$ ln -s public ../sequenceserver-1.0.4/public

You may also use .htaccess to redirect requests for files in non-existent /public folder in other SequenceServer instances to the actual files, but I find that quite cumbersome.

Next Story — Preparing for prepared statements
Currently Reading - Preparing for prepared statements

Preparing for
prepared statements

A guide towards migrating away from mysql_ functions to PDO

· · ·

I have decided to pen this guide after encountering a multitude of questions concerning migration away from mysql_ functions on StackOverflow, as well as my personal experience from updating the codebase of dated PHP files in my line of work.

This guide is written under the preface that you have some basic understanding of PHP and object-oriented programming (OOP), but believe me, when I started writing PDO I have no idea how OOP worked. It’s never a bad place or time to start now.


One of the major complaints I see while prowling through StackOverflow is that people are still using mysql_ functions. These functions will be rendered obsolete because of the deprecation of the mysql function in PHP 5.5 and onwards, and marked for removal in PHP 7. In fact, if you attempt to execute a mysql_ function with recent versions of PHP, you will be thrown an exception:

Deprecated: The mysql extension is deprecated and will be removed in the future: use mysqli or PDO instead

Oh no! You say. Why is this happening?

The mysql extension is not insecure…
but simply harder to use securely.
Exploits of a Mom”, an xkcd comic comically addressing the not-so-comical issue of sanitizing user inputs (or rather, there lack thereof).

There are many good reasons to why mysql_ functions are not worthy of future support. A commonly cited reason in support of migrating away from said functions is because of the inherent security flaws in the mysql extension. While not entirely true, using mysql_ functions requires the programmer, i.e. you, to take proactive, conscious steps to prevent attacks and hijacking attempts. In other words, mysql is not insecure, but simply harder to use securely compared to other available extensions, such as mysqli and PDO. The dated extension also does not support a variety of other recent features, which makes it difficult to work with mysql_ functions in the near future.

The reason why the mysql functions have stayed around for so long is for one simple reason — legacy support. Disabling the extension outright will disrupt plenty of dated web services and scripts that still use the mysql extensions, but this is no decent excuse to why you should start looking at alternatives this moment forward.

While suppressing E_DEPRECATED errors or, god-forbid, falling back to an older version of PHP, are quick and dirty tricks to tiptoe around this issue, your code will not be future proof. Since it’s introduction in the 1990s, no new functions have been added to the mysql extension since 2006, and it is not even being actively maintained anymore.

The two best alternatives out there are the mysqli and PDO extension. There is great amount of debate which one is better, but I personally prefer the latter for very simple reasons — its support for other drivers than mysql itself, and that it accepts named parameters. Both are fast, object-oriented and supports stored procedures.

Establishing a database connection

We need several snippets of information in order to establish a database connection. When you define a new PDO object, it will ask you a few questions:

  • What host name is the database located on?
  • Which database are we working with?
  • What are the logic credentials that allows me access to reading and/or writing to the said database?

It is always safer to store these information in a configuration file, which you can use .htaccess (if you are on an Apache server) to block any requests for the file. This will keep your database connection information safe from (external) prying eyes. In this file, you can define named constants that will store these snippets of information, at the disposal of any database connection requests:

Securing your database connection details.

With this information you can craft a database connection out of thin air, pronto! Below I am presenting a barebones database connection, followed by one that also defines a character set (here I am using utf8, but your flavours may vary) and port (defaults to 3306, but good to know):

Connecting to the MySQL database using PDO.

Easy peasy. Variables and objects declared are considered to exist throughout this article.

p/s: A little note for advanced users — if you’re using the try/catch method of retrieving errors, remember to enabled that via setAttribute():

Catching exceptions thrown by PDO with the try/catch regime.

By catching the exception and dealing with it manually prevents PDO from displaying a backtrace, which contains sensitive information about your database. The trace is very useful for troubleshooting in a development setting, but definitely a no-no for production.


Querying

Now the fun part—let’s construct some queries! But where do we start?

Fetching data

Sorry, Miranda Priestly. I really just have to put this gif right here.

Let’s begin with the basics with an extremely reduced example — say you want to execute a MySQL query that will fetch user data from a database. How should you get around to do it?

For example, we want to know how many users from each country we have in our database. We are not supplying any user input at this point, and are only concerned about retrieving information.

Fetching data from MySQL database.

Accepting user inputs

A typing cyborg. Ghost in the Shell by Production I.G.

Remember that mysql 101 rule — always sanitize your user input — nailed hard into your psyche by your programming teacher? Remember the countless times you have to use mysql_real_escape_string()? One of the reason why mysql_ functions are deprecated because the programmer will have to make a conscious effort to sanitize user input. For PDO, prepared statements come to the rescue. The magic behind PDO is that it sanitizes parameters for you during the process of parameterisation.

PDO makes the need to escape or sanitize input redundant.

This what what PDO actually does — unlike a mysql_ function, it keeps the prepared statement and strings separate, and only combines them upon execution. It protects you from injection attacks, which happens when hackers know you are not sanitizing your data (properly).

In PDO, you are offered several options on how to combine your variables as strings into the prepared statement, before the latter is executed. This is done by:

  1. using named placeholders
  2. using positional placeholders
  3. binding parameters (by reference)
  4. binding values

1 + 2: Named or Positional Placeholders

Both named and positional placeholders work in a similar fashion —their corresponding values are passed as an array in the execute() statement. The only differences are that for the former, you use ‘?’ as a placeholder in your statement, while for the latter, you use ‘:placeholderName’ instead. Moreover, positional placeholders require you to order the array items in the order of which the placeholders appear in your statement, while named placeholders rely on object keys to identify the correct placeholders, therefore making order irrelevant.

Named vs positional placeholders.

3 + 4: Binding Values or References

Alternatively, you can choose to bind variables by reference or value before executing the prepared statement. The difference is that binding by reference (hence using bindParam()) will only evaluate the value of the reference upon execution, unlike binding by value (bindValue()) which will bind an immutable value upon calling of the method which prevents further manipulation before execution. In majority of implementations, the difference is not of great significance.

bindParam() vs bindValue().

Along the same vein, you will have to be careful how you bind values or parameters to placeholders in a PHP for loop. Since bindParam() binds the reference and the reference only, you should modify the array elements directly by prepending the ‘&’ sign, i.e.:

However, this argument is moot considering that you can simply pass the entire array to the execute() command without the need of trigger the overhead of a for loop. This is the recommended way of dealing with multiple placeholders.

Further reading: bindParam() vs bindValue().

Placeholders don’t have a place
in the entire prepared statement

Just like how nice jeans are always priced beyond my reach.

Parameterised prepared statements are not an all-encompassing panacea when it comes to security concerns. Given that you can only bind parameters to placeholders within the WHERE clause, if you are accepting user inputs to (1) select tables or columns to query from, or (2) to dictate the grouping, order and limit of their results, performing sanity checks on user inputs is a reality that you cannot, and should not, run away from.

Parameterisation is not a one-size-fits-all solution.

A sensible solution would be to use a whitelist approach for parameters that have a limited range of keywords, like column and table names, for table and columns, and GROUP and ORDER BY clauses; and forced conversions to integers for the LIMIT clause.

One example: if you are allowing the end user to retrieve data from a list of tables, compare that user input against an internal list of tables — if there are any discrepancy, reject any attempts to connect to, or query from, the database:


Advanced examples

Here are some advanced examples that I feel should not be addressed in the main section in the guide simply because it requires basic understanding of constructing prepared statements and binding values or references to placeholders — which we have already conveniently covered in the previous section.

Executing multiple statements

If you want to execute multiple statements that differ only by values, such as inserting multiple rows of data into your database, you can use beginTransaction() and commit():

Executing multiple prepared statements.

Inquiring about IN queries

Sorry for the pun. Jokes (or rather the lack thereof) aside, preparing IN queries is a bit of a challenge in PDO, but it isn’t exceptionally difficult. If you have an array of n size you want to pass into the IN query, you will have to duplicate the positional placeholder (‘?’) n number of times so that each positional placeholder will be assigned a value upon execution of the statement.

For example, if you have an array of values (10, 38, 56, 93) you want to query with, you will have to use the positional placeholders (?, ?, ?, ?) in your WHERE clause. The trick is to generate the correct number of placeholders so that each will be mapped back to your array of values.

With this in mind, you can use any of the following methods for an array of n size to generate the correct placeholders:

  1. Use array_fill() to fill an array with n placeholders, and then implode() it with a comma as the separator.
  2. Use str_repeat() to repeat the placeholder (‘?’) n times, and str_split() it into a array, for which you then implode() using comma as the separator.
  3. Use str_repeat() to repeat the placeholder and its separator (‘?,’) n times, and remove the last comma with substr().
  4. Use str_repeat() to repeat the placeholder and its separator (‘?,’) n-1 number of times, followed by appending by a closing placeholder (‘?’).
Filling placeholders in the IN clause.

Which method is the best? It is up to you. This is where personal style rules over sense, although I personally prefer the last method because it is the simplest (and likely to have the least overhead).

Repeat n-1 times and append is the fastest.

Which method is the fastest? To ask this question you are running the risk of micro-optimisation, and you would be better off spending time optimizing your query instead. However, for the sake of completeness, I have performed three independent tests with 100,000 repeated runs for each method, and verdict is clear: method 4 clenches the title of fast and furious:

Execution time, measured in microseconds, of 100,000 iterations of methods 1 through 4. Values represent arithmetic averages from three independent runs. Error bars represent standard deviation.

Preparing LIKE queries

Sometimes you would want to query your database using the LIKE operator, which allows you to use ‘%’ as a wildcard symbol. The best part about using prepared statements is that they will automatically parse the ‘%’ as a wildcard symbol when appended and/or prepended to your variable:

Formulating LIKE queries using prepared statements and named placedholders.

Counting rows

At times it is desirable to either access the number of rows returned, or to determine if any rows have been returned from the query. This can be done by using rowCount():

Count the number of rows returned

Note: While using FETCH_ALL is a possible alternative, it is memory inefficient, especially for very large datasets.


Closing the connection

By default, the PDO connection is only open when the PHP script is running and will be closed once the script end. Closing the conection is as simple as assigning null to the PDO object, i.e.:

$db = null;

Closing the connection is considered good practice as it frees up resources for other incoming database connections, although not absolutely necessary — PHP scripts typically have a finite execution time, and will be terminated anyway if it exceeds a pre-determined threshold set by the server. Terminating it will close the connection, but that also means that the database connection might be held open for an unnecessary amount of time.

For the interested:


If you have any more useful tips about migrating to PDO, or have spotted any mistakes in my Gists, do let me know! Leave a note, or tweet me at @teddyrised.

Next Story — Writing better AJAX
Currently Reading - Writing better AJAX

Writing better AJAX

A guide towards writing better AJAX calls in jQuery

Illustration: Terry Mun

For awhile, the standard way of making an AJAX call using jQuery is rather simple — by using the $.ajax() function:

Looks easy peasy, but there are several drawbacks to this method:

  1. Excessive nesting. Functions and events dependent on the returned AJAX data has to be wrapped in the success handler, because AJAX is by nature, asynchronous. Along the same line, this reduces the readability of your code.
  2. Difficulty in chaining and compounding. It is difficult and overwhelmingly complex to evaluate outcomes of multiple AJAX requests — we are unable to predict how long does it take for all AJAX requests (each running asynchronously) to return a response.
  3. Deprecation warning. As of v1.8 and above, jqXHR.success(), error and complete callbacks have been officially deprecated.

Promises and deferred objects

Thankfully, promises and deferred objects are implemented natively in the $.ajax() method:

  • .done() as a replacement for .success()
  • .fail() as a replacement for .error()
  • .always() as a replacement of .complete()

As per other native jQuery methods, you can chain them, i.e.:


Chaining works, just as usual

Even better: assign the $.ajax() method to a variable, for which you can chain promise callbacks to. Chaining has always been a hallmark of jQuery, which allows one to reuse a cached selector, for example. In this case, the $.ajax() method returns a promise object natively, which we can use for chaining, too:

See that how the .done() callback is on the same level as the AJAX call itself, without requiring complicated nesting within the call per se? While this advantage may seem trival on a cursory glance, it greatly improves the readability of your code, especially in a production environment where your code is likely to be cluttered with multiple asynchronous calls.

It also allows you to inject code between the AJAX call itself and resolving the promise delivered by the call, allowing for greater flexibility in coding.

In other words, you can see that promises and deferred objects, implemented since jQuery v1.5 onwards, in relation to synchronicity of code execution, is akin to event bubbling in the DOM, listened on by the .on() method.


Multiple AJAX calls

Compounding several AJAX calls has never been made easier with promises. All you have to do is to listen to their status via $.when().


Dependence chain of AJAX requests

You can also chain multiple AJAX request — for example, when the second AJAX call relies on returned data on the first call. Let’s say the first call retrieves the session ID of a user, and we need to pass that value off to a second script. Remember that $.then() returns a new promise, which can be subsequently passed to the $.done() or even another $.then() method.


Modularizing AJAX requests

More often than not, one might want to modularize their code and delegate a single function to make dynamic AJAX requests. How should we invoke the AJAX call individually, and access the promise returned, in this case? It turns out to be beyond simple: simply return the AJAX object after you make a request.

Let’s say we want to make two AJAX calls, which basically uses the same parameters except for the data and url parameters:

No more messy nesting, and no more the need to repeat the $.ajax() method over and over again.


Handling arrays of returned deferred objects

Sometimes you would want to use $.when() on an array of deferred objects. An example situation would be: making a series of AJAX calls (number of calls dynamically changes, perhaps?), and then checking when all of them are done. How does that work?

There are two options:

  1. The easier to understand method would be to construct an empty array, use $.each() to make AJAX calls iteratively and then push the returned promise into the array.
  2. The preferred method (personally) would be to use .map() to construct an object containing returned promises, which we then use .get() to return an array

After that it’s all easy: simply use $.when.apply($, array) to evaluate all AJAX calls performed:

You can see that using promises and deferred object offers an unparalleled advantage over nesting jqXHR.success callbacks when making iterative AJAX calls that should be evaluated as a whole.


Final notes

Of course, the aforementioned promises and deferred objects can also be used for $.get() and $.post() methods, which are basically reduced shorthand functions for the $.ajax() method using the GET or POST methods.

This writing is inspired by my visits to StackOverflow, where occurrence of jQuery-based AJAX-related questions is rather common.


Further reading

Sign up to continue reading what matters most to you

Great stories deserve a great audience

Continue reading