Designing a chat app to solve the communication problems of the world

There is always a better way.

Published in

UpLabs Success Stories

10 min read · Aug 24, 2017

We are two UX designers, developers and entrepreneurs who want to improve the future of humanity. We call ourselves Wersatiles. UpLabs challenged us to design an improved chat app. We accepted the challenge and designed a solution for better chatting using the Design Thinking framework together with our hackers’ mindset, knowledge and creative problem-solving skills. In this post we want to show the process, the results we got, and the decisions behind them.

Empathize

Communication crisis

Recently we learned that communication is a big problem nowadays. One of us heard the term “communication crisis” mentioned in a very unexpected place: a stream about neurobiology (the stream is in Russian, unfortunately; sorry to those who don’t understand it). Although neither of us could find any other references to the term, we define it as an inability to communicate every piece of information that a person needs or wants to share.

Some may say that such a crisis can’t happen in the information age, because people have so many means of communication that they can tell anyone anything they want in a matter of seconds or minutes. Yes and no. With so many verbal and nonverbal, digital and “traditional” ways to share information, it may seem that people aren’t experiencing any “crisis” at the moment. However, our brains collect and process enormous amounts of information every day, much more than we can communicate by any means. It’s always easier to consume than to create. So we decided to improve this situation at least a little.

Research

As always, our design thinking process started with research to find out what problems people usually face with chat apps. Because of the very limited time frame, we didn’t conduct in-depth interviews as we usually do. Instead, we asked our friends questions and used a simple yet effective observation method.

While going about our everyday tasks, we paid more attention to strangers using their phones to communicate, and recalled similar observations from the past. This stage helped us make a lot of assumptions, which we turned into questions and posed to our friends in order to validate them. Some questions were about chat apps in general, and some were about specific points like voice messages. We made a mistake here by asking questions that were too broad and didn’t produce much valuable feedback. The question “Why don’t you use voice messages?” gave us a lot more than, say, “How do you use voice messages?” We will leave the more open questions for in-person interviews.

Analysis of similar apps

There are so many chat apps nowadays that it seems a new one appears at least every day. Even at the Samsung IT School, where the two of us met and learned to code, many students built some kind of chat as part of their graduation projects. There was no time to analyse every app on the market, so we compared the three that we and our friends use most: Google Hangouts, Telegram and WhatsApp.

The analysis focused on the features each app has and on user reviews. It was quite interesting to see how marketing worked for those apps: some reviews matched the apps’ value propositions almost word for word. For example, Telegram’s message to users is, “Telegram is a messaging app with a focus on speed and security, it’s super-fast, simple and free.” A lot of its reviews emphasized its simplicity and good security. There were also some negative comments about the apps, and we found many of them informative. In addition, a lot of users pointed to problems they face, some of which we had already discovered in the interviews.

Define

Insights

The next step was to get everything in one place. The simplest way to do that is to put every piece of information gathered during the empathy stage on a sticky note and then cluster the notes. It didn’t take long, and soon we understood the main problems people encounter while communicating via chat apps. Most of them were already known to us, but hypotheses are just hypotheses, and they needed to be validated anyway. In the end, we came up with five user stories about the pain points we discovered in chats.

User stories

“I’m not comfortable scrolling through a long conversation list in order to find something.”
“I spend too much time choosing emoji.”
“I can’t type in crowded buses.”
“I can’t listen to voice messages without headphones when people are around.”
“It’s hard to distinguish between voice messages if someone sends a lot of them and they are about the same length.” As one of our interviewees said, “There are no titles on them.”

Ideate

Brainstorming

This is one of the most fun parts of the whole process: creativity and divergent thinking are turned on, and criticism and rejection are turned off. Since our team consists of only two people, the rules for the brainstorming session change a bit, but the general principles stay the same. Sometimes we also use techniques developed specifically for small groups, but this time we had gathered so much information that we could brainstorm without any problems.

Five concepts

We came up with different solutions for every problem and then created a small proof of concept for the ones we found most promising. In total we created five working solutions, each of which helped people communicate faster and more easily… in theory.

Emoji drawing: the user draws an emoji on the screen, and it is recognized and added to the message. This solution has a lot of problems and limitations, such as differentiating between 😊 and 🙂, or adding something more complex like 🤡.
Create conversations from scratch every day and archive old messages to remove clutter. This solution obviously needed more work, and it might even have a negative impact on users.
Message map: a new way to represent messages in the shape of a tree, or a mind map. This removes the problems of linear chats. People’s dialogues and thought processes are not linear, so nonlinear chats could be easier to navigate, and there would be no more forgotten topics or unanswered questions. However, this could get messy…
Every talk can be broken down into smaller chunks or themes. Another attempt to break linearity in chats is to let users chat about different topics in separate threads. Later we discovered that Slack does something similar.
Turning voice messages into text would address our last three user stories. Of all the ideas above, it would also work best as a solution to the communication crisis.

Evaluate

During this stage, our team switched from divergent thinking to an unbiased convergent mode and arranged the solutions on a decision matrix. Ideas were ranked in a diagram by their impact on the communication crisis and by the innovativeness or creativity of the solution, the criteria most useful for this challenge. Implementation effort, one of the standard criteria for a decision matrix, was not considered, as it was not the point of the competition. Still, we ended up with a solution with decent user impact and very low implementation effort, because nowadays any developer can add speech recognition to a mobile app.

Final idea

The decision matrix clearly shows that the best solution for the contest and for the communication crisis is turning voice messages into text. But why does such a small feature make such a big impact?

It takes less time to communicate. There is a speed gap between consuming and creating information, which is why the communication crisis exists. Reading is faster than listening, but speaking is faster than typing. Typing speed for composition is about 19 wpm, while speaking can be as fast as 100 wpm. Reading averages 160 wpm, so if typing is removed from the communication chain, information exchange speeds up considerably.

It works in the situations described in the user stories. People don’t have to listen to voice messages because they can read them. It’s also easy to distinguish between messages because their whole content is visible.
Regular voice messages don’t appear in search results, but voice-to-text messages do! People can find voice messages as easily as regular text ones.
Some people use voice messages more often than text ones. Adding this feature to a chat app does not break their habits: they can still record voice messages in a simple way.
This AI system has a low cost of being wrong: if a word is not recognized, the voice message can still be listened to and understood. It also has a high probability of being right, thanks to all the genius machine learning scientists and researchers in the natural language processing field.
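
To make the speed argument from the first point concrete, here is a minimal sketch that compares three ways of exchanging one message, using the rates quoted above. The 50-word message length is a made-up value chosen only for illustration, and the sketch assumes that listening to a voice message takes as long as speaking it.

```python
# Rates quoted in the post, in words per minute.
TYPING_WPM = 19     # composing by typing
SPEAKING_WPM = 100  # composing by speaking
READING_WPM = 160   # consuming by reading


def exchange_seconds(words: int, compose_wpm: float, consume_wpm: float) -> float:
    """Seconds for one person to compose a message and another to consume it."""
    return 60 * words / compose_wpm + 60 * words / consume_wpm


MESSAGE_WORDS = 50  # hypothetical message length, for illustration only

# Typed and then read.
typed_text = exchange_seconds(MESSAGE_WORDS, TYPING_WPM, READING_WPM)
# Spoken and then listened to (assumption: listening runs at speaking speed).
plain_voice = exchange_seconds(MESSAGE_WORDS, SPEAKING_WPM, SPEAKING_WPM)
# Spoken and then read as recognized text.
voice_to_text = exchange_seconds(MESSAGE_WORDS, SPEAKING_WPM, READING_WPM)

for label, seconds in [
    ("typed text", typed_text),
    ("plain voice", plain_voice),
    ("voice to text", voice_to_text),
]:
    print(f"{label}: about {seconds:.0f} s")
```

Under these assumptions the voice-to-text path is the shortest of the three, because it pairs the faster way to create a message with the faster way to consume it.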

We were very happy to choose this idea because we specialize in designing and developing products that use artificial intelligence to improve the experience of end users. We want to make humans and AI friends.

Prototype

Features

Usually, features are selected in the decision stage of ideation, but our solution is a single feature itself, so it can’t be broken down into smaller features. What can and should be done, however, is to write down how it works and what it does.

One person records a voice message, and another person receives it together with the text that AI recognized in the recording.
A person using the app can listen to the recording in two ways: by tapping the “play” button, or by tapping a specific word, in which case the voice message plays from that word.
If the recording is already playing, tapping a word makes it continue playing from that word.
The timeline shows which word the recording is currently at. It works like subtitles in karaoke.
While the message is playing, the “pause” button can be pressed to pause playback. Next time, the recording resumes from the place where it was paused, and the timeline shows that position.
The timer shows the duration of the recording and turns into the current time indicator when the message is being played.
The certainty of the speech recognition algorithm is represented by the opacity of each word: the lower the probability, the lower the word’s opacity.
This feature should be added to chat apps without removing any existing features like text messages, because voice messages have limitations and can’t always be used.
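
The behaviour listed above could be modelled roughly as follows. This is only a sketch under stated assumptions, not the data model of any real app: it assumes the recognizer returns a start time and a confidence for every word, and the names `Word`, `opacity`, `seek_position`, `current_word_index`, `timer_label` and the `MIN_OPACITY` floor are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float       # seconds from the start of the recording
    confidence: float  # recognizer certainty, from 0.0 to 1.0


MIN_OPACITY = 0.35  # assumed floor so that uncertain words stay legible


def opacity(word: Word) -> float:
    """Lower confidence gives lower opacity, within [MIN_OPACITY, 1.0]."""
    clamped = min(max(word.confidence, 0.0), 1.0)
    return MIN_OPACITY + (1.0 - MIN_OPACITY) * clamped


def seek_position(words: list[Word], tapped_index: int) -> float:
    """Tapping a word starts, or continues, playback from that word."""
    return words[tapped_index].start


def current_word_index(words: list[Word], position: float) -> int:
    """Index of the word to highlight on the timeline, karaoke style."""
    starts = [w.start for w in words]
    return max(bisect_right(starts, position) - 1, 0)


def timer_label(duration: float, position: float, playing: bool) -> str:
    """Total duration while idle, current time while the message plays."""
    seconds = int(position if playing else duration)
    return f"{seconds // 60}:{seconds % 60:02d}"


# Hypothetical recognizer output for a short message.
message = [
    Word("see", 0.0, 0.95),
    Word("you", 0.4, 0.90),
    Word("at", 0.7, 0.60),
    Word("seven", 0.9, 0.30),
]

print(seek_position(message, 2))        # playback position after tapping "at"
print(current_word_index(message, 0.8)) # word highlighted 0.8 s into playback
print([round(opacity(w), 2) for w in message])
```

Keeping a resume position alongside the message would cover the pause behaviour: playback restarts from the stored position, and `current_word_index` gives the word the timeline should show.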

There was an idea to make texts editable after they are sent, but it would undermine the whole idea of faster communication: people would spend time on edits that aren’t necessary.

Deliverables

Finally, Wersatiles could get their hands on making prototypes! Not so fast. One more decision had to be made: what kind of prototype were we going to make? “It would be so cool to create a working app!” At first we considered developing a small Android app that used Firebase or Heroku for message exchange. However, that wasn’t feasible, because there is no easy way to extract the recording from the native SpeechRecognizer API, and we had too little time to search for libraries that could work. (Please tell us if you know an easy way.) So we abandoned this idea.

Static images certainly wouldn’t have worked, so we couldn’t just make mockups in Figma and upload them directly. We usually use InVision for prototyping, but this time we needed an animated timeline, so it didn’t suit us either. We looked at several other interactive prototyping tools, but concluded that a video would have to be recorded anyway, and with a video all the interactivity would be lost. Another option was to set up a website on GitHub, since all prototypes can be converted to SVG, which is easy to animate in HTML5. But UpLabs doesn’t allow direct interaction with a website; it has to be opened in another window. So, in the end, we chose good old Adobe After Effects to bring the screens made in Figma to life.

Prototype!

After writing a scenario for the animation and creating a small fictional dialogue between two friends, we finally opened Figma and the fun continued. Find all the details about this stage of the process on our Behance. Most of the design decisions that were made are described there (with images, storyboards and more!).

Test (?)

The last stage of Design Thinking wasn’t carried out because of the very limited time and money available for this challenge. However, we are interested in continuing to experiment with this idea and explore it further, so that popular chat apps might adopt it in the future if it’s successful.

Final result

We are very happy with how it turned out, and we are glad to share the challenges we went through to make it and to show how intentional everything in this app is. It also proved to us that designing even a small thing takes a lot of effort. We learned a lot from this project, and we hope that you learned something new from this post too.

Here is our submission for the challenge.

Check our Behance for more details on the project.

We are Wersatiles, a team of two young hackers who want to design and develop for humans. Hire us!