Automatic screen design classification

Translating screen designs into HTML/CSS markup is a time-consuming task for web developers: it follows clear rules but demands a lot of precision. That makes it a promising application for machine learning.

The article

discussed a paper from 2017 with some ideas for generating markup from pixel images. In this post I take a first step towards this goal by using machine learning to classify screen designs. In two examples I train a neural net to distinguish between dark and bright designs, and to separate one-column screens from two-column screens.

Data generation

For training I needed a large number of screen design images whose content fits the classification tasks. I therefore generated the data with some Python scripts, using the library “Beautiful Soup” to parse the HTML and construct parts of the markup, and the library “cssutils” to parse the CSS files. All designs are built with “Bootstrap” (https://getbootstrap.com/) and combine randomly chosen images and placeholder text with some colors and basic markup. Bootstrap’s grid system was used to generate layouts with different numbers of columns.
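To make the generation step more concrete, here is a minimal sketch of how one such page could be assembled. It is not the original script: it uses plain string templates from the standard library instead of Beautiful Soup and cssutils, and the function name make_page, the image file names and the local path bootstrap.min.css are all assumptions made for illustration.

```python
import random

# Hypothetical image files and placeholder text; the original scripts
# picked from their own pool of images and texts.
IMAGES = ["photo_01.jpg", "photo_02.jpg", "photo_03.jpg"]
PLACEHOLDER = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
    "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."
)


def make_page(columns, dark, rng):
    """Return the HTML of one synthetic screen design as a string."""
    # Bootstrap's grid has 12 units per row, so each column spans 12 // columns.
    span = 12 // columns
    theme = "bg-dark text-white" if dark else "bg-light text-dark"
    cells = []
    for _ in range(columns):
        image = rng.choice(IMAGES)
        cells.append(
            f'<div class="col-md-{span}">'
            f'<img class="img-fluid" src="{image}" alt="">'
            f"<p>{PLACEHOLDER}</p>"
            "</div>"
        )
    row = "".join(cells)
    return (
        "<!DOCTYPE html><html><head>"
        # Assumed local copy of the Bootstrap stylesheet.
        '<link rel="stylesheet" href="bootstrap.min.css">'
        f'</head><body class="{theme}">'
        f'<div class="container"><div class="row">{row}</div></div>'
        "</body></html>"
    )


rng = random.Random(42)  # fixed seed so the generated set is reproducible
two_column_dark = make_page(columns=2, dark=True, rng=rng)
one_column_bright = make_page(columns=1, dark=False, rng=rng)
```

Because the class label (dark or bright, one or two columns) is an input of the generator, every image comes with its label for free, which is the main advantage of generating the data over collecting real designs.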

Generated screen design

The command line tool “wkhtmltoimage” (https://wkhtmltopdf.org/) was used to convert the HTML files to PNG files. The screen resolution was set to 1200px by 1200px. The files can be downloaded from https://www.floydhub.com/mosch/datasets/pix2html.
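The conversion can be scripted from Python as well. The following is a sketch only: the exact options used for the dataset are not stated here, so the combination of --width, --height and --format is an assumption, and the helper names build_command and render are made up for this example.

```python
import subprocess


def build_command(html_path, png_path, size=1200):
    """Build the wkhtmltoimage call for one page (assumed options)."""
    return [
        "wkhtmltoimage",
        "--format", "png",
        "--width", str(size),   # screen width in pixels
        "--height", str(size),  # screen height in pixels
        html_path,
        png_path,
    ]


def render(html_path, png_path, size=1200):
    """Run wkhtmltoimage; raises CalledProcessError if the tool fails."""
    subprocess.run(build_command(html_path, png_path, size), check=True)
```

Calling render("design_001.html", "design_001.png") would then write one screenshot, provided wkhtmltoimage is installed and on the PATH. The file names are placeholders.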

One set consists of 540 images, half of them with a bright background and the other half with a dark one.

The second set has 450 images, half of them with one column and the rest with two.

Training the classifiers

The training was done with “Jupyter Notebooks” on “Floydhub” (https://www.floydhub.com). The neural nets were built and optimised with “Keras” (https://keras.io/).

Found 450 images belonging to 2 classes.
Found 90 images belonging to 2 classes.

For training and testing, the data was split into a training and a test dataset, and each image was resized to 300x300 pixels to save computation time. A batch size of 10 was used.
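The numbers in the output above (450 and 90 images) add up to the 540 images of the first dataset, which suggests that one sixth of the images was held out for testing. A small sketch of such a split, under that assumption and with hypothetical file names, could look like this:

```python
import math
import random


def split_by_class(files_by_class, test_fraction, seed=0):
    """Shuffle each class separately and hold out a fraction for testing."""
    rng = random.Random(seed)
    train, test = [], []
    for label, files in sorted(files_by_class.items()):
        shuffled = list(files)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test += [(name, label) for name in shuffled[:n_test]]
        train += [(name, label) for name in shuffled[n_test:]]
    return train, test


# Hypothetical file names: 270 bright and 270 dark designs.
files = {
    "bright": [f"bright_{i:03d}.png" for i in range(270)],
    "dark": [f"dark_{i:03d}.png" for i in range(270)],
}
train, test = split_by_class(files, test_fraction=1 / 6)

# With a batch size of 10, one epoch runs through this many batches.
batches_per_epoch = math.ceil(len(train) / 10)
print(len(train), len(test), batches_per_epoch)  # 450 90 45
```

Splitting per class keeps both sets balanced, so an accuracy of 50% really means guessing.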

The model consists of several “2D-Convolutional” and “MaxPooling” layers followed by a “Dense” layer and a “Sigmoid” unit at the end to predict the probabilities for the binary classification.
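A model of this kind can be sketched in a few lines of Keras. The number of blocks, the filter counts, the 3x3 kernels and the size of the dense layer below are assumptions chosen for illustration, not the values used in the notebooks. The sketch also uses the tensorflow.keras import path, which may differ from the standalone Keras package available at the time. The small helper feature_map_size only does the arithmetic of how a 300x300 input shrinks through the stack.

```python
def feature_map_size(input_size, n_blocks, kernel=3, pool=2):
    """Side length after n_blocks of (unpadded convolution + max pooling)."""
    size = input_size
    for _ in range(n_blocks):
        size = size - kernel + 1  # convolution without padding
        size = size // pool       # max pooling rounds down
    return size


def build_model(input_size=300, filters=(32, 64, 128)):
    """Small CNN for binary classification (layer sizes are assumptions)."""
    # Imported here so the helper above works without TensorFlow installed.
    from tensorflow.keras import layers, models

    model = models.Sequential()
    model.add(layers.Input(shape=(input_size, input_size, 3)))
    for n_filters in filters:
        model.add(layers.Conv2D(n_filters, (3, 3), activation="relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation="relu"))
    # A single sigmoid unit outputs the probability of the positive class.
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


# 300 -> 298 -> 149 -> 147 -> 73 -> 71 -> 35
print(feature_map_size(300, n_blocks=3))  # 35
```

With these assumed sizes the last pooling layer outputs 35 x 35 x 128 values, which are flattened before the dense layer. Adding more blocks would shrink this further and reduce the number of weights in the dense layer.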

Detect the background color

I used the first dataset to test if a trained model can learn to detect dark and bright designs.

Training and testing for some epochs showed that after about 15 steps the model detected the background color perfectly. Let us look at the training curves for the accuracy of the classifier and the loss function.

Detect the number of columns in the layouts

Using the second dataset of generated images, I tried to classify the designs by their number of columns, directly from the raw pixels.

After only 3 epochs the model did a good job. Both tasks therefore seem to be easy for a machine learning model, so it should be possible to try more difficult ones in the future.