Martina Aldasio Pozzi
May 15, 2015

Politecnico di Milano | Design della comunicazione | C1 | Workshop May / 11–15/ 2015 — “Digital Shadow”

THE BRIEF.

We had to choose an innovative, still largely unexplored field of research related to digital footprints, the data that users leave behind on digital services. Then we had to develop our concept around a dataset built by collecting the digital traces left by a member of the group.

THE CONCEPT.

One of us noticed that, using an alternative keyboard app on an Android device, it was possible to retrieve parts of past conversations using only the predictions suggested by the keyboard. Then, when she linked the app to her cloud accounts (including Facebook, G+, Gmail, Twitter, SMS, Evernote), the predictions became more accurate.

After reading SwiftKey’s privacy policy, we discovered that, although the company says the app recognizes password fields and credit card numbers so that it does not record them, the policy also allows user data to be shared.

The company “may share your information (including but not limited to, information from cookies, log files, device identifiers, learned language data/Language Modeling Data and statistics) with businesses that are legally part of the same group of companies as TouchType Limited, or that become part of that group (“Group Companies”). Group Companies may use this information to help provide, understand, and improve our Products (including by providing analytics and data analysis) and Group Companies own services.”

“We may share your information […] with third-party organizations that help us provide our Products (“Service Providers”) such as providers of hosting services or analytics tools. Our Service Providers will only be given access to your information as is reasonably necessary and under appropriate confidentiality terms.”

Therefore, our goal was to make users of this “smart keyboard” aware that their data and conversations are recorded and stored by a company that can share them. The project also aims to analyze the linguistic patterns that SwiftKey records and stores, in terms of how confidential the information typed by the user is.

THE “SMART KEYBOARDS”.

The standard theme of SwiftKey

SwiftKey is an alternative keyboard available for Android devices on Google Play. It records the user’s speech patterns to make typing easier and improve word prediction. It includes smart autocorrection and progressively builds a custom dictionary that can be synchronized across devices. While typing, SwiftKey suggests three words, chosen by an algorithm. The words suggested on the central button are those the user types most often. Before the download, it asks for permission to find accounts on the device, read your text messages (SMS or MMS), read and edit the contents of your USB storage, view Wi-Fi connections, read phone status and identity, receive data from the Internet, full remote access, control vibration, run at startup, and prevent the device from sleeping.

SwiftKey Keyboard learns and memorizes data, contacts and language patterns of each user, for easy, fast typing. Its autocorrect function is based on your personal writing style and you can sync your writing style securely to the cloud and across devices.

We analyzed language patterns generated by a user and stored by the SwiftKey keyboard. Specifically, we put together a database of words, based on the three choices suggested by the keyboard.

OUR PROJECT.

First of all, we needed a dataset. It is possible to access the personal data that SwiftKey holds about you by sending a request together with a check for £10 payable to TouchType Limited. Unfortunately, because of time constraints, we couldn’t obtain this data, so we collected it manually.

We calculated that, by choosing one of the three suggestions six times in a row, we would obtain six-word sentences and 1092 words overall ($\sum_{n=1}^{6} 3^n = 1092$). Unfortunately, we could not go any further, since the seventh “level” alone adds $3^7 = 2187$ words, bringing the total to 3279 ($\sum_{n=1}^{7} 3^n = 3279$). (To calculate the number of elements for each level, we used WolframAlpha.)
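The figures quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming a tree in which every tapped word offers exactly three new suggestions; the function names are ours and are not part of any SwiftKey tool. Note that 1092 and 3279 are cumulative totals, not the size of a single level.

```python
# Number of suggestions the keyboard offers at each tap (from the text above).
BRANCHING = 3


def level_size(level: int) -> int:
    """Words on one level: three choices for every word on the previous level."""
    return BRANCHING ** level


def total_words(depth: int) -> int:
    """Words in the whole tree down to `depth`: the sum of 3**n for n = 1..depth."""
    return sum(level_size(n) for n in range(1, depth + 1))


if __name__ == "__main__":
    # Print level number, words on that level, and the running total.
    for depth in range(1, 8):
        print(depth, level_size(depth), total_words(depth))
```

Each extra level roughly triples the total, which is why a manual collection quickly becomes impractical.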

Since the predictions are often articles and prepositions, we sometimes had to tap the suggestions more than six times to obtain a sentence that made sense.
From time to time we had to delete some predictions, because the more we tapped on a prediction, the more the keyboard suggested that word, repeating previous patterns and leading us to not-so-interesting results.
Moreover, the suggestions changed during the seven hours it took to complete the collection, since the user needed to use her smartphone in the meantime, so the dataset was quite unstable. These were the problems with collecting the data manually.

Dataset

The dataset was designed from the beginning to be loaded into Raw, a data visualization tool developed by DensityDesign Lab, a research lab in the Design Department of the Politecnico di Milano. Among the sixteen charts it offers, we chose the Circular Dendrogram. It was the best way to show how our results branch out and how the words connect to each other. With it, we created an infographic. The results were classified by their level of confidentiality: level 0, marked in blue, for non-confidential sentences and recurring expressions in the user’s lexicon; level 1, marked in orange, for generic references to the user’s life, daily routines and nicknames; level 2, for specific references to the user’s life, names, places and private conversations.
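As an illustration of how such a dataset could be laid out, here is a minimal Python sketch that turns chains of tapped suggestions into a CSV table with one column per level plus a confidentiality column. The column names, the function name and the sample words are all assumptions invented for this example; they are not our real data, and the sketch does not describe how Raw itself works internally.

```python
import csv
import io

# Invented placeholder chains: each entry is a sequence of tapped suggestions
# and the confidentiality level (0, 1 or 2) assigned to it by hand.
SAMPLE_PATHS = [
    (("I", "am", "not", "sure", "about", "it"), 0),
    (("I", "am", "going", "to", "the", "gym"), 1),
    (("I", "will", "call", "you"), 0),
]


def paths_to_csv(paths, depth=6):
    """Return CSV text with columns level_1..level_<depth> and confidentiality."""
    header = [f"level_{n}" for n in range(1, depth + 1)] + ["confidentiality"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for words, confidentiality in paths:
        if confidentiality not in (0, 1, 2):
            raise ValueError(f"unknown confidentiality level: {confidentiality}")
        # Truncate long chains and pad short ones so every row has `depth` words.
        padded = list(words[:depth]) + [""] * (depth - len(words))
        writer.writerow(padded + [confidentiality])
    return buffer.getvalue()


if __name__ == "__main__":
    print(paths_to_csv(SAMPLE_PATHS))
```

In a table like this, the level columns would describe the hierarchy of the dendrogram and the confidentiality column would drive the color.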

To communicate our project online, we shot two short videos and uploaded them to YouTube. They portray, in an ironic way, what could happen if your keyboard and the information it holds fell into someone else’s hands.

VIDEOS.

Episode 1 and Episode 2

These videos, the infographic and some information about the project were put on a website.

THE RESULTS.

A long message generated by using the words suggested by the keyboard

As we partly expected, some “personal” words emerged, but they were not confidential enough to be disturbing. We think this is because six “levels” of suggestions are not enough to form complete sentences. In fact, by going deeper, it is possible to retrieve sentences from past conversations.

IMAGES.

Dandelion | Different levels
Dandelion | Color levels
Dandelion | Zoom
Level of confidentiality

ABOUT US.

Martina Aldasio

I am 25 years old. I’m attending the master’s degree in Communication Design at Politecnico di Milano. I lived and studied in Barcelona and Madrid and I worked at the Italian Culture Institute in London for six months. I am currently living in Madeira, Portugal, working on a landscape and communication design project.

Stefania Baldassarre

I’m a girl from Milan, born twenty-three years ago. I hold a bachelor’s degree in interior design and I’m currently attending the master’s degree at PoliMi. I have been living in Lisbon for three months for an internship.

Alice Dolci

I’m studying Communication Design at Politecnico di Milano. I have lived abroad for the last four years, studying at different universities in different countries, and I’m currently working on a project about landscape and communication design in Lisbon, supported by Universidade Lusìada de Lisboa.

Silvia Riva

I’m a Sicilian girl born twenty-four years ago. I’ve been living in Milano for five years; I’m currently attending the master’s degree at PoliMi and working as a graphic designer in a product design studio.

Arianna Tarabusi

I’m 24 years old. I have been studying Communication Design at Politecnico di Milano for five years and I’m currently attending the master’s degree. At the moment I’m working at a bookbinding company.