AI-based UI Development (AI-UI)

Vijay Betigiri
6 min read · Jan 7, 2018


Artificial Intelligence (AI) is currently one of the most popular topics in the industry, with seemingly endless applications in everything from matchmaking to self-driving cars. The most disturbing prediction we hear about AI is that it will result in massive job losses across industries. Can AI also affect IT jobs? If so, which skills will be impacted? When? How? These are questions every software engineer should be asking.

Creative designers or business users come up with UI (User Interface) ideas for an application or website on a sheet of paper, a whiteboard, or a fancy graphics tablet. It is the job of a UI developer to convert those design ideas/wireframes into a working UI while keeping the creative design intent in mind. This is one of the more complex, time-consuming steps in the software development process. In this article, we will look at an interesting example of applying AI to UI development. We will try to understand it by comparing it with the human learning process and (over)simplifying the technology behind it.

Typical hand-drawn design for a UI

Mimicking our eyes and brain

As children, we learn to observe and label the things around us. The learning happens through feedback from our parents and others. Our brain gets trained to look for patterns, texture, color, and size in an object to identify it. In AI, the Convolutional Neural Network (CNN) is a class of deep neural network that is very effective at recognizing objects in a given image.

The basic idea behind a CNN is to look for shapes or patterns, with the help of various filters, in small parts of the image one at a time. The figure below shows two filters being applied to look for slanted lines. Based on the filter responses, features are extracted. Finally, by voting over the extracted features, the algorithm can conclude which objects are in the image.
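To make the filtering step concrete, here is a toy sketch in Python/NumPy. The patch and filter values are made up for illustration; a real CNN learns its filter values and slides them over every small window of a much larger image.

```python
import numpy as np

# A tiny 3x3 grayscale patch containing a "/" shaped stroke
# (values are illustrative, not from any real image)
patch = np.array([[0, 0, 1],
                  [0, 1, 0],
                  [1, 0, 0]], dtype=float)

# Two illustrative filters, one per slant direction
filters = {
    "slant /": np.array([[0, 0, 1],
                         [0, 1, 0],
                         [1, 0, 0]], dtype=float),
    "slant \\": np.array([[1, 0, 0],
                          [0, 1, 0],
                          [0, 0, 1]], dtype=float),
}

# With a 3x3 patch and 3x3 filters, convolution reduces to one
# dot product per filter; the stronger response is the "vote"
# that this slant direction is present.
for name, f in filters.items():
    response = float(np.sum(patch * f))
    print(name, "->", response)   # "slant /" scores 3.0, the other 1.0
```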

Describing the image

The child starts uttering a single-word label for each identified object, such as ‘ball’. Soon she also learns to identify the relationships between the identified objects and describe them in short sentences such as ‘a red ball and a brown bat are on the lawn’. The learning happens through a cycle of trial and error.

In AI, constructing sentences from the word labels for a given image is the job of LSTM (Long Short-Term Memory) networks. This process is called image captioning.

Below are some examples of AI-based image captioning. More such examples are available at http://cs.stanford.edu/people/karpathy/deepimagesent/

Image captioning is achieved by appending an LSTM network to the CNN discussed earlier. LSTMs are very effective at language-related tasks because of their unique ability to refer back to their previous outputs. An LSTM generates one word at a time. The next word is decided not only by its inputs but also by the words generated before it. For example, in the sentence ‘My name is John.’, you can say ‘John’ only if the earlier three words were ‘My name is’. The sequence of words forms a sentence. Like any other neural network, an LSTM goes through a learning process to build such sentences.
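As a rough illustration of this combination, here is a minimal CNN+LSTM captioning sketch in Keras. It is not the exact architecture behind the examples above; the layer sizes, vocabulary size, and caption length are placeholder assumptions. The model takes an image and the words generated so far, and predicts the next word.

```python
from tensorflow.keras import layers, models

vocab_size = 1000   # assumed vocabulary of word labels
max_len = 20        # assumed maximum caption length

# CNN: encodes the image into a feature vector ("what is in the scene")
image_in = layers.Input(shape=(224, 224, 3))
x = layers.Conv2D(32, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
image_features = layers.Dense(256, activation="relu")(x)

# LSTM: encodes the words generated so far
words_in = layers.Input(shape=(max_len,))
e = layers.Embedding(vocab_size, 256, mask_zero=True)(words_in)
sentence_state = layers.LSTM(256)(e)

# Combine vision and language, then predict the next word
merged = layers.add([image_features, sentence_state])
next_word = layers.Dense(vocab_size, activation="softmax")(merged)

model = models.Model([image_in, words_in], next_word)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

At inference time, the predicted word is appended to the input sequence and the model is called again, one word at a time, until the sentence is complete.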

UI Development Process

Typically, UI development happens through the following steps:

  • Creative designers or business users of the application hand-draw their UI design ideas on a whiteboard, a graphics tablet, or even a piece of tissue paper.
  • A designer then uses a wireframing tool on a computer to recreate the same design. This is a redundant step.
  • UI developers translate the wireframes into working UI code. The developers and designers go through an iterative process until the expected UI is built. This step is a time-consuming and repetitive process.

AI-based UI development

What if the hand-drawn design idea could be directly translated into a working UI? AI can do this. Below is an example.

UI generated with pix2code

In image captioning, AI identifies objects (such as a dog or a horse) in a scene and builds an English sentence describing the objects and their relationships with each other.

In the case of UI code, the UI design is like a scene, but instead of a dog and a horse it contains UI objects such as buttons and sliders. Instead of the English language, the objects are described in UI code. The UI code has a limited vocabulary (such as button, slider), and the relationships between objects are described with a few more words (such as position and hierarchy). Thus UI code generation can be considered a specific use case of image captioning.
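Purely for illustration, a UI "sentence" in such a limited vocabulary might look like the snippet below, held here as a Python string. The token names are made up and are not the exact pix2code DSL.

```python
# A hypothetical UI description: object words plus hierarchy markers.
# (Token names are illustrative, not the exact pix2code DSL.)
ui_description = """
header { btn-active, btn-inactive }
row { label, slider, btn-green }
"""

# The whole "language" fits in a tiny vocabulary, unlike English
vocab = {"header", "row", "label", "slider",
         "btn-active", "btn-inactive", "btn-green", "{", "}", ","}
```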

UI code generation goes through two stages.

Training Stage:

Imagine a child (child_1) learning by looking at many UI images and creating a list of the UI objects in each image. Another child (child_2) learns to read the descriptive code for the same UIs. A third child (child_3) learns to find the relationship between child_1’s and child_2’s learning. Together they learn to observe an image and create the corresponding UI code.

A CNN takes the role of child_1, an LSTM that of child_2, and another LSTM that of child_3. (For a complete technical explanation, refer to the pix2code paper linked at the end of the article.)
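To map the analogy onto an actual network, here is a rough Keras sketch of the three parts working together, similar in spirit to the captioning sketch earlier but with a second LSTM as the decoder. The layer sizes and input shapes are placeholder assumptions, not the values from the paper.

```python
from tensorflow.keras import layers, models

vocab_size = 20     # assumed number of DSL tokens (button, slider, ...)
context_len = 48    # assumed number of previous tokens the model sees

# child_1: a CNN encodes the GUI screenshot
img_in = layers.Input(shape=(256, 256, 3))
v = layers.Conv2D(32, 3, activation="relu")(img_in)
v = layers.MaxPooling2D()(v)
v = layers.Conv2D(64, 3, activation="relu")(v)
v = layers.GlobalAveragePooling2D()(v)
v = layers.Dense(256, activation="relu")(v)
v = layers.RepeatVector(context_len)(v)   # one copy per context step

# child_2: an LSTM encodes the DSL code written so far (one-hot tokens)
ctx_in = layers.Input(shape=(context_len, vocab_size))
l = layers.LSTM(128, return_sequences=True)(ctx_in)

# child_3: a decoder LSTM relates the two and predicts the next token
d = layers.concatenate([v, l])
d = layers.LSTM(256)(d)
next_token = layers.Dense(vocab_size, activation="softmax")(d)

model = models.Model([img_in, ctx_in], next_token)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")
```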

Sampling Stage:

The trained model is now ready to process a hand-drawn GUI drawing. The code context is updated after each prediction to contain the last predicted token. The resulting sequence of DSL tokens is then compiled to the desired target language (e.g. Android, iOS, HTML) using traditional compiler techniques.
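Here is a minimal sketch of that sampling loop, assuming a trained `model` with a hypothetical `predict_next` helper that returns the most likely next DSL token given the image and the context generated so far:

```python
def sample_ui_code(model, image, max_tokens=100):
    """Greedy, token-by-token decoding of a GUI image into DSL tokens."""
    tokens = ["<START>"]
    while len(tokens) < max_tokens:
        # The code context is the sequence generated so far,
        # including the last predicted token.
        # predict_next is a hypothetical helper, not a real library call.
        next_token = model.predict_next(image, tokens)
        if next_token == "<END>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker

# The resulting token sequence would then be handed to a traditional
# compiler targeting Android, iOS, or HTML.
```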

Benefits of AI-UI

  • For designers and developers, an AI-based solution would save critical time early in a project through rapid prototyping, boost iteration cycles, and eventually enable the development of better apps.
  • They will save on all the trivial, repetitive, and redundant tasks.
  • It will also allow designers and developers to focus on what matters most: bringing value to the end users.
  • The entry barrier to building apps will become really low. Learning to use a UI design tool takes time; learning to code takes even more. However, everyone can draw a UI on paper. This could allow your grandma to go from an idea to a working UI running on her phone in a matter of seconds.

Current and future state

As of now, only a few AI-based UI development products (e.g. Uizard) are being developed, and they have not yet reached the maturity to replace human UI developers. Still, they are good assistants for any UI developer. In the coming years, we may see new approaches and improved AI products where this assistant takes over the role of an experienced UI developer. It’s time for UI developers to look at the changing trends and get ready for reskilling.

Many of us may still think that generating UI code from a creative designer’s drawings is fine, but that AI itself cannot come up with its own creative UI designs. We still need artists and creative designers, right? Maybe not! Generative Adversarial Networks (GANs) and Creative Adversarial Networks (CANs) have proven able to generate art, sometimes better than humans. We will discuss this in another article.

References

  1. pix2code: Generating Code from a Graphical User Interface Screenshot by Tony Beltramelli https://arxiv.org/pdf/1705.07962.pdf
  2. Deep Visual-Semantic Alignments for Generating Image Descriptions by Andrej Karpathy, Li Fei-Fei http://cs.stanford.edu/people/karpathy/cvpr2015.pdf
