Running on OPENRNDR
Visualising machine learning processes
This story is part of a series of case studies related to OPENRNDR, an open source framework for creative coding written in Kotlin and Java 8. OPENRNDR simplifies writing real-time, interactive audio-visual software. The framework is designed and developed with two goals in mind: prototyping, and the development of robust, performant audio-visual applications.
A computational model for natural language
If both machine and man read and write the same word, do they mean the same thing?
Computational models for natural language primarily regard language as a sequence of symbols. As such, the meaning of a word can only be described as a product of its context, that is, of the other words alongside which it appears. This is in contrast to Chomsky’s linguistic theory, which holds that the principles underlying the structure of language are biologically determined in the human mind and hence genetically transmitted.
The key difference between the perception of language by machine and by man is how language is embodied. For the machine, language is a set of computational operations on symbols stored in computer memory; for man, language is an experience of the body.
What artificial neural networks roughly do, and why they are essential to modern-day language computation
Artificial neural networks process numbers: they take numerical input and give numerical output.
In the figure above, the larger circles represent neurons. When a neuron receives a signal, or number, larger than a certain threshold, it emits a number. Between two neurons there is a connection, and a little valve on that connection controls how much of the signal runs to the target neuron.
Now the idea is that these neural networks can be trained to associate input values with output values. Training works by giving the network an input value (here 1,0,0,0,0,0,0,0) and an associated output value (0,1,1,1,1,1,1, which is the inversion of the input). The network outputs 0.431, 0.654 and so on, which means we have to adjust the settings of the valves to make the results match. By repeating this process, the network can learn complex associations.
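The adjust-and-repeat loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than the network shown in the figure: we assume eight inputs and eight outputs, a single layer of connections, and a smooth sigmoid in place of a hard threshold so that the valves (called weights in the code) can be nudged a little at a time. All function names and settings are chosen for illustration only.

```python
import math
import random

def sigmoid(x):
    """Smooth stand-in for the threshold: squashes any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights, biases, inputs):
    """Each output neuron sums its valve-scaled inputs and emits a number."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def train(examples, size, steps=500, rate=0.5, seed=1):
    """Repeatedly show the examples and nudge the valves towards the wanted output."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(size)] for _ in range(size)]
    biases = [0.0] * size
    for _ in range(steps):
        for inputs, targets in examples:
            outputs = forward(weights, biases, inputs)
            for j in range(size):
                # How far output neuron j is from its target value.
                error = outputs[j] - targets[j]
                for i in range(size):
                    weights[j][i] -= rate * error * inputs[i]
                biases[j] -= rate * error
    return weights, biases

# Assumed training set: every one-hot pattern of eight values and its inversion.
SIZE = 8
examples = []
for k in range(SIZE):
    inputs = [1.0 if i == k else 0.0 for i in range(SIZE)]
    targets = [1.0 - x for x in inputs]
    examples.append((inputs, targets))

weights, biases = train(examples, SIZE)
result = forward(weights, biases, examples[0][0])
print([round(v, 2) for v in result])
```

After training, the first output is close to 0 and the remaining outputs are close to 1, so the network has learned the inversion purely from the settings of its weights.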
In the end we have neural networks that encode very complex associations in the settings of the valves. With some tricks neural networks can be used to model language.
Here we see a text generated by a neural network that has been trained on a large body of works by Shakespeare. Notice how the machine has learned the structure of a play. First a character appears, some sentences are said and then another character appears.
The figure above shows a more modern text: a generated Wikipedia article. Since the language is learned at the character level, the network is also able to learn Wikipedia’s markup language. The machine has also learned how to construct links to other, non-existent websites.
What is the meaning of a word? We start from the idea that natural language is a product of the human and its environment: for man, every word can be associated with sensory experiences. A word is a complex association of actions, memories and so on.
Since most software runs on computers, often without any sensors, natural language is reduced to written words only. A word is just a series of symbols. Thus the meaning of a word can only be explained through other words.
Turning words into space
A word embedding is a numerical representation of a set of words in which each word is a point in a high-dimensional space. Such an embedding can be constructed using a neural network and a large body of texts. The neural network is trained to guess the blinded centre word in a short text fragment by repeatedly feeding it fragments from example texts. When the neural network fails to guess the blinded word correctly, it updates the representation of all the words in the fragment so that it becomes more likely to guess the word successfully. By applying the guess-and-update steps a great many times, all the word-points in the space organise into an embedding that contains semantic qualities.
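The guess-and-update procedure can be sketched as follows. This is a toy illustration under our own assumptions, not the model used in the project: the corpus is four made-up sentences, the fragment is one word on either side of the blinded word, and the network is a plain softmax classifier over the averaged context vectors. The function name `train_embedding` and all of its parameters are hypothetical.

```python
import math
import random

def train_embedding(sentences, dim=8, window=1, epochs=100, rate=0.1, seed=0):
    """Guess each blinded centre word from its neighbours and update on every miss."""
    rng = random.Random(seed)
    vocab = sorted({word for sentence in sentences for word in sentence})
    index = {word: i for i, word in enumerate(vocab)}
    # emb holds the word-points; out holds the weights used to make the guess.
    emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]
    out = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for sentence in sentences:
            ids = [index[word] for word in sentence]
            for pos, center in enumerate(ids):
                context = ids[max(0, pos - window):pos] + ids[pos + 1:pos + 1 + window]
                if not context:
                    continue
                # Average the context word-points into one vector.
                hidden = [sum(emb[c][d] for c in context) / len(context) for d in range(dim)]
                # Score every vocabulary word and turn the scores into probabilities.
                scores = [sum(o * h for o, h in zip(row, hidden)) for row in out]
                top = max(scores)
                exps = [math.exp(score - top) for score in scores]
                norm = sum(exps)
                probs = [e / norm for e in exps]
                total -= math.log(probs[center])
                # Push the guess towards the blinded word (softmax gradient).
                grad_hidden = [0.0] * dim
                for w, row in enumerate(out):
                    delta = probs[w] - (1.0 if w == center else 0.0)
                    for d in range(dim):
                        grad_hidden[d] += delta * row[d]
                        row[d] -= rate * delta * hidden[d]
                # Update the representation of every context word in the fragment.
                for c in context:
                    for d in range(dim):
                        emb[c][d] -= rate * grad_hidden[d] / len(context)
        losses.append(total)
    return vocab, emb, losses

# Made-up corpus, for illustration only.
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
    "the dog chases the cat".split(),
]
vocab, embedding, losses = train_embedding(corpus)
print("loss first epoch:", round(losses[0], 2), "last epoch:", round(losses[-1], 2))
```

The loss falls as the guesses improve. With a corpus this small the resulting space carries little meaning; the semantic qualities only emerge with a large body of texts.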
Once a word embedding is found we can query the model to find words that are close to a given word. Here we see a listing of words that are similar to the word ‘typography’. We see a list of words related to typography, but also words that are in fields related to typography.
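Such a query can be sketched as a nearest-neighbour search. The example below assumes cosine similarity as the measure of closeness, which is a common choice for word embeddings, and uses made-up three-dimensional vectors in place of a trained model; the words and numbers are purely illustrative.

```python
import math

# Made-up vectors for illustration; a trained embedding has hundreds of dimensions.
vectors = {
    "typography": [0.9, 0.8, 0.1],
    "typeface": [0.85, 0.75, 0.2],
    "calligraphy": [0.7, 0.9, 0.3],
    "banana": [0.1, 0.0, 0.9],
}

def cosine(a, b):
    """Cosine of the angle between two vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def nearest(word, vectors, count=3):
    """Return the words closest to the given word, most similar first."""
    scored = [
        (other, cosine(vectors[word], vector))
        for other, vector in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:count]

for other, score in nearest("typography", vectors):
    print(other, round(score, 3))
```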
Another quality of the word embedding is its ability to learn word analogies. For example, we can find a direction by taking the line between the words Italy and Rome. If we now start from France instead of Italy and travel in the same direction, we end up at the word Paris. So the language model has implicitly learned a country-to-capital relationship, just from reading a lot of texts.
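In vector terms, the analogy amounts to adding the direction from Italy to Rome to the point for France and looking up the nearest word. The sketch below uses made-up two-dimensional vectors arranged so that the relationship holds; in a trained model the vectors come from the embedding and the match is only approximate.

```python
import math

# Made-up vectors for illustration: the first value loosely stands for the
# country, the second for "being a capital".
vectors = {
    "Italy": [1.0, 0.0],
    "Rome": [1.0, 1.0],
    "France": [2.0, 0.0],
    "Paris": [2.1, 0.9],
    "Germany": [3.0, 0.0],
    "Berlin": [3.0, 1.1],
}

def analogy(a, b, c, vectors):
    """Find the word d such that a is to b as c is to d."""
    direction = [y - x for x, y in zip(vectors[a], vectors[b])]
    target = [x + step for x, step in zip(vectors[c], direction)]
    # The query words themselves are excluded from the candidates.
    candidates = [word for word in vectors if word not in (a, b, c)]
    return min(candidates, key=lambda word: math.dist(target, vectors[word]))

print(analogy("Italy", "Rome", "France", vectors))
```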
The language model is surprisingly good at some relations, like directions, gender-based word forms and the stereotypical food for a given country. We also see that it is somewhat limited in certain other things.
We use the Word2vec model, a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.
Since Word2vec constructs such high-dimensional spaces, and we can’t really look beyond 3 dimensions, we have to find a way to fit all that information into 2 dimensions. The animation above is composed of 3 million word points being fitted into two dimensions using the t-SNE algorithm. What is interesting to see here is that there are some very clear patterns in the data.
Much of this research has been applied to the READ/WRITE/REWRITE project.
USE CASE: Read/Write/Rewrite, 2017
READ/WRITE/REWRITE was an interactive installation by LUST, exhibited at Typojanchi 2017 in Seoul, South Korea, that visualises how a machine can learn to ‘read and write’ by using machine learning applied to natural language in the form of written text. Ongoing research in machine learning was transformed into a tangible interactive installation able to reconfigure itself through different contexts and contents.
The work takes the shape of a cube and visualises the organisation of roughly 3 million English words on the basis of their meaning and semantics. The end result of this process is a landscape of words in which words within a similar context are placed in near proximity. The word contexts are learned from a large body of digital texts that are publicly available, such as news or Wikipedia articles.
The various visual perspectives make it possible to explore word similarities and interrelations that are encoded in the language model in different manners. The examples in this section are simplified illustrations of core visualisation principles applied to each side of the cube.
The interaction of the work is based on visitor proximity. When nobody is near the work, it shows and works on the organisation of the word data; in this mode, words are visualised as anonymous data points. As a visitor approaches the cube, the work switches to a language appropriate to humans. By moving closer to the cube, the visitor zooms in on the language model.
The first screen displays a word using its context, both as a visual effect in which a word manifests from neighbouring data points and as a visualisation of neighbouring words.
The second screen shows a transition from a landscape of data points to a column based writing system in which word proximities based on contextual similarity are maintained.
The third screen treats the language model as individual points; the west screen attempts to treat the language model as a continuous space by guessing what lies between two words.
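One simple way to guess what lies between two words is to walk in a straight line from one word-point to the other and look up the nearest word at each step. The sketch below illustrates that idea with made-up two-dimensional vectors; it is an assumption about how such a guess could be made, not a description of how the installation does it.

```python
import math

# Made-up vectors for illustration only.
vectors = {
    "cold": [0.0, 0.0],
    "cool": [1.0, 0.1],
    "warm": [2.0, 0.0],
    "hot": [3.0, 0.1],
}

def nearest_word(point, vectors):
    """Return the word whose vector lies closest to an arbitrary point in the space."""
    return min(vectors, key=lambda word: math.dist(point, vectors[word]))

def interpolate(start, end, vectors, steps=4):
    """Sample evenly spaced points between two words and name each one."""
    a, b = vectors[start], vectors[end]
    path = []
    for i in range(steps):
        t = i / (steps - 1)
        point = [x + t * (y - x) for x, y in zip(a, b)]
        path.append(nearest_word(point, vectors))
    return path

path = interpolate("cold", "hot", vectors)
print(path)
```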
The installation aimed to make abstract concepts such as machine learning and neural networks accessible and understandable to a wider audience, while addressing themes such as design, the future of typography, and context versus meaning. Machine learning principles will be applied more and more, so it is important to give insight into how these principles work and to offer a counter voice to ‘empty’ terms such as Artificial Intelligence. By addressing this, the project aims to load these themes with cultural significance.
The interaction of the work is based on visitor proximity. When nobody is near the work, it shows and ‘works’ on the organisation of the word data. In this mode, words are visualised as anonymous data points. As a visitor approaches the cube, the work switches to a language appropriate to man. By moving closer to the cube, the visitor zooms in on the language model. Through three visual perspectives it is possible to explore word similarities and interrelations that are encoded in the language model.
Typojanchi 2017, the 5th International Typography Biennale, was a combination of exhibitions, talks and publications with contributions from local and international artists, designers and typographers. The theme was Body and Typography. In comparison to most art biennials and art fairs, Typojanchi is less self-regarding: it tries to interpret the current era and its socio-cultural environments. Together, the different components create a vast range of intersections between visual languages and perspectives: literature, music, movie, city, politics and economy.
More information on typojanchi.org/2017
Design and concept: LUST
Typojanchi Director: Ahn Byunghak
Curator: An Hyoijn, Kim Namoo
Media/Technical support: Multi Tech
Construction, Build up: Gom Design
Translator: Jeong Daye
Photography: Kim Jin Sol
Video: Jeong Moon Ki