Sonifying experimental visual scores with computer vision

Boris Vitázek
15 min read · Jul 24, 2017


— How to shape audio with visual input and audience participation

Zuzana Sabová and me performing at the Pohoda festival, 2017 (original on Flickr)

I would like to tell you about a project that I have been working on, on and off, for the last 5 years. It grew in complexity as time passed, rewarding me in turn with friends, knowledge and the never-ending presence of the inevitable end. I am going to write about:

  • (How I got to know) Milan Adamčiak and his inspiring/playful work
  • Grafofon, a 6-meter-long instrument inspired by analogue experiments but driven by computer vision
  • Technical details of the project, especially its last iteration, the most technically complex but also the most successful
  • What I learned along the way and what I plan for the future

It all started when Zuzana Sabová asked me to work on a project co-authored by 4D gallery and titled “Grafofon”. Situated in Galanta, not far from the capital city, 4D gallery is an agricultural farm turned art oasis, a space focused on sculpture and dedicated to helping artists from different art disciplines to realize their projects.

ARGO, a work we did together in Plzeň for the Blik Blik festival, 2017.

I met Zuzana at the Academy of Fine Arts and Design. Though she was studying traditional graphic techniques at the time, she had started experimenting outside her department’s disciplines, and she needed some help with a project involving Arduino and Pure Data. I was the “intermedia guy” who had just started playing with all of that. Grafofon was our first larger joint project and it kick-started an ongoing collaboration in which we have created many artworks together; lately we’ve been traveling around art festivals, with Zuzana drawing large-scale murals and me animating them through video mapping. You can check her portfolio here. Her involvement in the project was crucial: there would be no Grafofon without her, or without the continuous support of 4D gallery.

The exhibition “Galantské Piesne” photo from event — Transart Communication in 4D Gallery

The exhibition “Galantské Piesne”, to which Zuzana invited me to cooperate, revolved around Milan Adamčiak’s work, including some of his collaborative projects with other artists, me being one of them.

Milan Adamčiak was an older artist, relatively unknown to me. Years of performing and Slovak Borovička had worn him down, and one could easily overestimate his age by 10 years, but that certainly did not affect his drive and creativity.

Milan Adamčiak working on “Vibrácie AA, ZP, GA, UP, …” (2011, 4D Gallery)

Milan studied musicology and became interested in experimental music — in how music relates to visual expression, how rhythms can become words and screams can become music. He corresponded, and later collaborated, with John Cage on the topic, but most importantly kept creating and experimenting. He could do something with just an empty plastic bottle, or sit for hours behind his typewriter, writing tones, punching words, deconstructing rhythms and meanings. As well as performing, he generated an impressive number of visual scores, one of which you can see below.

Milan Adamčiak, Konfigurácie pre veľký orchester (1968)

As part of the Grafofon project, I was tasked with sonifying his visual scores. I was not the first to face this quest; numerous musicians had already interpreted his work in numerous ways, through instruments, voice and dance. The difference is that I never considered myself a musician, though I love to experiment with sound. So I decided to learn from the previous interpretations while bringing in something different, connected to my own domain: the use of a computer to help interpret the visual scores. I also wanted to separate, and separately control, the two layers of interpretation, by humans and by machine, so that I could combine strong emotional expression with efficient algorithms.

Me, Milan Adamčiak and Zuzana Sabova, on the right (2012, Gabriela Zigová)

All these considerations resulted in the concept of the Grafofon, whose name was suggested by Milan himself. Over several sessions we discussed how my proposed system could read the visual scores and how it would all work, since the first output of our work was supposed to be an opening performance for the exhibition “Galantské Piesne”, and both of us needed to understand the system to perform with it.

Milan Adamčiak, Trojrozmerná partitúra č. 3 (1969)

His 3-dimensional visual score (shown above) was a great inspiration. It is simple and fun to play, like most of his work, and for that reason it has stood the test of time. The never-ending loop also gives it a feeling of automation that somehow resembles the approach of a DAW in terms of notation and execution.

Herbert Belar, RCA Synthesizer (“Mark I”), 1955

While working on the first concept, I researched synthesizers and the use of computer vision in sound making. As it turned out, the beginnings of electronic music were driven by the same principles as all computer work: a punch-card-like paper fed data to the synth, which in turn generated the sound.

Direct conversion of image into sound by means of computer vision — an approach derived from the ANS synthesizer.

I was also considering a straight conversion of the visual into sound, taking out as much human interpretation as possible, but this approach did not match the way Milan and I worked. We both wanted some sort of interpretation that allowed a lot of expression. And I wanted to have control over the instrument, to give a well-defined atmosphere to the sound and the overall performance.

First concept of the Grafofon; the space it was first exhibited in greatly influenced its final shape.

To allow the two of us to perform and draw together, I designed a 6-meter-long steel construction resembling industrial equipment such as belt conveyor systems, a nod both to the previous function of the 4D gallery grounds and to the support of the gallery’s team, which included the design. The construction rolled (looped) 13 meters of paper, on which we could draw or place objects. The camera, housed in the raised part of the construction, was “reading” everything on the paper.

The reading is done in vvvv. In the first iteration of the project, it was just one linear spread of pipets (vvvv nodes that sample pixel values) translated into MIDI notes. The image from the camera was filtered through a threshold, and the vertical linear spread of pipets was translated into a “score” of 128 notes from low to high. The sound was made in Reason, in which I mixed various instruments governed by the MIDI signals sent from vvvv.
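The reading described above can be sketched outside of vvvv in a few lines of Python. This is only an illustration, not the original patch: the function name, the default threshold value and the assumption that the bottom of the image corresponds to the lowest note are all mine.

```python
from typing import List, Sequence


def column_to_midi_notes(column: Sequence[int], threshold: int = 128) -> List[int]:
    """Map one vertical column of grayscale pixels to MIDI note numbers.

    `column` runs from the top of the image (index 0) to the bottom, with
    0 = black ink and 255 = white paper. The column is divided into 128
    bands; a band sounds when at least one pixel in it is darker than
    `threshold`. Assumption: the bottom band is note 0, the top band 127.
    """
    height = len(column)
    notes = set()
    for y, value in enumerate(column):
        if value < threshold:  # dark pixel, i.e. something drawn on the paper
            # Flip the axis so that low notes sit at the bottom of the paper.
            band = (height - 1 - y) * 128 // height
            notes.add(band)
    return sorted(notes)


# A white column, 256 pixels tall, with ink at the very top and very bottom.
column = [255] * 256
column[0] = 0
column[255] = 0
print(column_to_midi_notes(column))
```

Calling this once per video frame on the pixel column under the camera gives the set of notes that should currently be held; notes that disappear from the set would be released.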

Grafofon installation in Transit Gallery after the performance. In background: a tape composition by Milan Adamčiak.

There was also a time aspect, introduced by the rotation of the same paper roll: as it kept being drawn on and thus progressively “filled” during the performance, the reading got more complicated and the dynamics more chaotic. So the length of a performance is tied both to the length of the instrument and to the number of “rotations”. Another detail: in this first iteration I didn’t have time to build a motorized pulley, so the position of the paper roll was controlled by a huge hand crank, which had the advantage of letting us move the paper expressively and quickly change the direction or speed of drawing.

Me doing nerdy stuff, Milan drawing on the paper and Zuzana behind the wheel, controlling the tempo of our work.

Because of the fast rolling rhythm and Milan’s expressive drawing style, the paper was full in about 4 minutes, which at the time seemed like a nice performance duration. Little did I know that I would be extending this duration to hours in the next iterations of the Grafofon.

Short demonstration of the contour plugin from vvvv used in Grafofon

The second iteration of the instrument was presented at the Multiplace festival in Bratislava. Every time I performed, I basically rewrote the whole software. This time the camera reading was enhanced by FreeFrame contour detection. I was getting the position of the objects on the belt, as well as their relative size and contour vectors. Alongside the MIDI-keyboard-style pipets, I also had some percussion whose dynamics came from the object size: the bigger the drawing or object that touched a certain position on the vertical axis of the paper, the louder the triggered sample sounded. Unfortunately I have zero documentation from this performance.
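The size-to-dynamics idea can be illustrated with a small Python sketch. The contour detection itself is left out; the sketch assumes it already delivers a normalized vertical position and a relative area per object. The `Blob` type, the number of samples and the `full_area` scaling constant are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Blob:
    """One detected object (hypothetical structure for this sketch)."""
    y: float     # vertical centre, 0.0 (top of paper) to 1.0 (bottom)
    area: float  # size relative to the whole frame, 0.0 to 1.0


def blob_to_hit(blob: Blob, n_samples: int = 4, full_area: float = 0.05) -> Tuple[int, int]:
    """Return (sample index, MIDI velocity) for a detected blob.

    The vertical position selects which percussion sample is triggered,
    and the relative size sets how loud it plays. `full_area` is the
    (assumed) area at which the velocity reaches its maximum of 127.
    """
    sample = min(int(blob.y * n_samples), n_samples - 1)
    velocity = round(127 * blob.area / full_area)
    velocity = max(1, min(127, velocity))  # keep inside the MIDI range
    return sample, velocity


print(blob_to_hit(Blob(y=0.1, area=0.05)))  # large object near the top
print(blob_to_hit(Blob(y=0.9, area=0.01)))  # small object near the bottom
```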

Then Zuzana and I performed at the Biela Noc festival in Košice, Slovakia, in 2014. All went pretty well, aside from some problems with overwhelming crowds pushing on the whole instrument while we played, which needed our constant attention so that they would not tear the paper. At least I got the feeling that people find something interesting in the whole act of drawing being translated into sound.

Milan Adamčiak — Verbálny text 2 (hard to translate, part of its charm is how he uses characteristics of Slovak language)

On 16 January 2017, Milan unfortunately passed away. I am not sure I have ever seen the Slovak art scene mourn someone on this scale. While there were times in his life when he went a bit silent, the last years of his life were full of exhibitions and performances.

We did not become very close, but there was a time when we shared work together, when we sat down and talked about what we were going to do, how it would look, why we were doing it and how we were going to do it. For me that is one of the most intimate kinds of contact I can have with someone. My whole life revolves around creating stuff, and that is how our paths crossed, briefly but intensely: by making something together.

So in 2017, preparations began for a retrospective exhibition of his life’s work, “Adamčiak Začni!” (“Adamčiak, Begin!”). We were asked to do a Grafofon performance during the “Night of Museums and Galleries” event on 20 May 2017.

My working setup in 4D Gallery for performance in SNG.

After 2 years of lying silently in the depths of 4D Gallery, the Grafofon was brought to life once again for a new series of performances. The main hardware problem we still had was getting the cylinders centered, so that the paper would not slip to either side or tear in the process. This is not yet solved, so we need to glue the paper to the plastic fabric used for printing big banners.

First version of the motor: badass power source, but feeble and overheating motor — later switched for a better one.

The biggest advancement in this chapter of the project was motorization. I had been dreaming about this since we made the first version, because a motorized conveyor belt with constant speed lets you play with rhythm and leaves the performers free to focus their attention on the drawing. Zuzana and her father Ladislav Sabo worked on this upgrade while I was punching away on my keyboard, trying to come up with more versatile software that could be upgraded later.

Slovak National Gallery atrium, place for our performance (photo Juraj Starovecký)

The challenge in the performance commissioned by SNG was to make the Grafofon run for several hours, during which it would be subjected to heavy interaction with the public. For this performance I built a software base: OCR for switching sounds, an arbitrary number of lanes used for OSC control, and 8 triggers for samples. There was no quantization or anything fancy, but people still had a lot of fun putting objects on the conveyor belt and discovering their relation to the sound.

Bent wires and coffee beans as sound seeds (photo Juraj Starovecký)

Since the installation was supposed to run for hours, we needed to use objects that would always fall off the conveyor belt, freeing up space for new improvisation. This was achieved by using coffee beans, black beans, bent wire and cut-out text snippets for OCR. I opted for a very simple OCR implementation: the widely known Tesseract, an open source OCR engine developed by engineer Ray Smith, who is currently working for Google, making it possible for your phone to translate that Chinese restaurant menu. vvvv just takes a picture of the text and saves it to disk; I then execute Tesseract on the image file, it writes a TXT file with the recognized result, and I read that file back into vvvv. I do this 2 times per second, which is enough to switch presets for instruments. This became the base for the next iteration, which would once again change and evolve the software side of the project.
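The save, execute, read-back loop can be sketched in Python roughly as follows. It assumes the `tesseract` command-line tool is installed and uses its basic `tesseract <image> <output base>` form, which writes the result to `<output base>.txt`. The preset words and numbers in `PRESETS` and all function names are hypothetical; the real snippets and the vvvv side are not shown in this article.

```python
import subprocess
from pathlib import Path
from typing import List

# Hypothetical mapping from a recognized word to an instrument preset number.
PRESETS = {"BASS": 0, "PADS": 1, "DRUMS": 2}


def tesseract_command(image: Path, out_base: Path) -> List[str]:
    """Build the command line; Tesseract appends '.txt' to the output base."""
    return ["tesseract", str(image), str(out_base)]


def recognise(image: Path, out_base: Path) -> str:
    """Run Tesseract on a saved camera frame and return the recognized text."""
    subprocess.run(tesseract_command(image, out_base), check=True)
    return Path(str(out_base) + ".txt").read_text(encoding="utf-8")


def preset_for(text: str, current: int) -> int:
    """Switch preset only when the text is a known word; otherwise keep the current one."""
    return PRESETS.get(text.strip().upper(), current)


def poll_once(image: Path, out_base: Path, current: int) -> int:
    """One polling step; calling this every 0.5 s matches the 2 reads per second above."""
    return preset_for(recognise(image, out_base), current)
```

Keeping the current preset when nothing recognizable is under the camera avoids the instruments jumping around on OCR noise from beans and wires.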

Airport near Trenčín — my home town, reworked to house the biggest festival in Slovakia.

After multiple years spent organizing the creative zone, Zuzana wanted to do something different for this year’s Pohoda, the biggest music festival in Slovakia. It is by now well known for its excellent organization, huge musical diversity and tons of side activities that make it friendly for families and party-goers alike. She proposed that we do the Grafofon somewhere at the festival, and we did indeed get a spot on a smaller experimental stage. I immediately wanted to do something different this time, something that would fit the festival atmosphere, feature some beats and still be very expressive, while setting some borders for us to move inside. I was born in Trenčín and lived there for the bigger part of my life (a statement that is slowly creeping towards not being true), so I am always happy to return home with my work.

The light box got upgraded with an LED strip that eliminated shadows for the camera.

It was one week before the Pohoda festival when I plunged into the depths of Reaktor, a strange beast of a software.

On the one hand it is an amazing tool for a crazy price: for 200€ you get a coding environment with huge possibilities, a set of instruments that are very creative and ready to go, and on top of that user libraries full of great creations that extend the value of the product. My second choice would be a combo of Ableton and Max for Live, which is painfully costly compared to Reaktor. You don’t get the fluid live environment you get with Ableton, but you do get a powerful tool to sonify your ideas with. Of course it has its downsides. The CPU drain is crazy, which is why I am dividing it into two instances in the vvvv audio engine. The coding environment feels sluggish and ancient, everything about the interface feels a bit off, working with it can be extremely tedious, and the UI and UX could use a lot of work. It is not the path of least resistance, but I think it is worth checking out.

This diagram shows my architecture. The whole thing is controlled from within vvvv, whose main task is to take care of image processing, MIDI routing, sound routing and hosting Reaktor as 2 VST plugins. Huge thanks go, once again, to tonfilm for his amazing contribution of the audio engine for vvvv, which made vvvv an even stronger glue that can hold all of your inputs and outputs together and shape them in the process into anything you want.

The biggest challenge was to make drawing the beats fun while still retaining a big degree of control, not just for me but also for Zuzana. She was taking care of the hardware part of the Grafofon and does not have experience in coding or performing music, so I needed to make a tool that could be used by both of us. For this I made a quantization sequencer: vvvv looks for a drawing, and when it sees something in the right spot, it turns on a flip-flop. This flip-flop waits for the next trigger in the sequence. When that trigger arrives, the sample is played and the flip-flop turns off, waiting for the next drawing.
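The flip-flop logic can be modelled in a few lines of Python. This is a sketch of the principle only, not the vvvv patch: the class and method names are made up, and the camera input is reduced to one boolean per video frame.

```python
from typing import Iterable, List


class QuantizedTrigger:
    """A drawing arms the flip-flop; the next clock tick fires and resets it."""

    def __init__(self) -> None:
        self.armed = False

    def see(self, ink_present: bool) -> None:
        """Call once per video frame with whether ink is at the read position."""
        if ink_present:
            self.armed = True

    def tick(self) -> bool:
        """Call on every sequencer step; returns True if the sample should play."""
        fire = self.armed
        self.armed = False  # wait for the next drawing
        return fire


def run(frames: Iterable[int], frames_per_tick: int) -> List[bool]:
    """Feed camera frames through the trigger, ticking every `frames_per_tick` frames."""
    trigger = QuantizedTrigger()
    hits = []
    for i, ink in enumerate(frames):
        trigger.see(bool(ink))
        if (i + 1) % frames_per_tick == 0:
            hits.append(trigger.tick())
    return hits


# One dot, a gap, then a long line spanning two beats.
frames = [1, 0, 0, 0,  0, 0, 0, 0,  1, 1, 1, 1,  1, 1, 1, 1]
print(run(frames, frames_per_tick=4))
```

In this model a single dot anywhere between two steps is snapped to the next step, while a continuous line re-arms the flip-flop after every step and therefore plays on each of them.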

In the arrangement above, you can see the direct quantization of the drawing. I can draw a longer line to play on every beat, or draw points rhythmically to construct a sparser beat.

This approach gives me huge flexibility in style: I can do anything from quick successions of drums to a rhythmical techno beat, layering various samples by drawing and working with the sequencer.

This principle is applied to 8 samples, with the added bonus of being able to set a different clock division for every sample, so I can weave in more diverse sequences. Big thanks to Michael Lancaster from the Reaktor User Library for posting the LaunchPad mini control, which I was able to modify for my purposes and which made my work much easier.
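Per-sample clock division can be sketched like this. The sketch assumes a single master clock counted in steps and one integer division per sample; the function names and the example divisions are illustrative, not the values used in the performance.

```python
from typing import List


def samples_on_step(step: int, divisions: List[int]) -> List[int]:
    """Indices of the samples whose divided clock ticks on this master step.

    A division of 1 ticks on every step, 2 on every second step, and so on.
    """
    return [index for index, division in enumerate(divisions) if step % division == 0]


def schedule(n_steps: int, divisions: List[int]) -> List[List[int]]:
    """Tick pattern for the first `n_steps` master-clock steps."""
    return [samples_on_step(step, divisions) for step in range(n_steps)]


# Three samples: every step, every 2nd step, every 4th step.
for step, ticking in enumerate(schedule(4, [1, 2, 4])):
    print(step, ticking)
```

Each sample would only consult its drawing-armed flip-flop on the steps where its own divided clock ticks, so the same drawing can produce a dense pattern on one sample and a sparse one on another.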

I was also using a contact microphone attached to the motor of the Grafofon, routed through a few effects that were controlled from the paper, a feature that is only possible with the whole hardware built and ready. Contact microphones are really easy to make yourself.

Grafofon performance, Pohoda festival, 2017.

I really liked people’s reactions: they were laying various things on the conveyor belt, essentially playing with me. It was not overwhelming, and everybody was curious about the sounds generated by their experiments, whether it was the festival program or a sunflower crawling with bugs. This was by far the best interaction with the public, and I think this concept, in concert form, really fits an experimental club environment.

A 30-second full-cycle performance demo (i.e. the paper completes a full rotation in 30 seconds)

I still needed a way to do some drawing on the Grafofon without the complete hardware, so I made an input simulation for myself. It is really fun to play with and I have done the whole documentation with it. For example, the sound in the Pohoda documentation was made on a 2-minute simulated roll. I let it go around a few times, layering controls for the ambient sounds and then the beats. After I am done, I have a feature to save the whole drawing and load it when needed, so I can change something or re-record the audio.

Paper roll layout, designed according to computer vision needs. Movement correction (needed due to paper roll side sliding) is achieved by real-time adjustment of layout elements.

The cool thing is that vvvv makes it super easy to work with VR: by plugging in a few things I can have my simulation laid out in space as well. Though hardware VR controllers are a really different experience compared to a physical pen (in both precision and tool definition), any shortcomings they bring can be compensated for by adjusting drawing parameters in the virtual space.

5 years of work by now, time to think where I am heading with all this.

I found a theme that really interests me and that applies to every iteration of the Grafofon: how to make performing electronic music more interesting. Right now the music scene is dominated by Ableton, modulars and tons of different machines to work with, but often the result is someone standing behind a table, pushing buttons, turning knobs and dragging sliders. Or you have performers who do a lot of dancing and other kinds of show, but I want to make the act of creation itself interesting. I want to go beyond just playing. I think a performance can be inviting but not overwhelming, giving people space either to take a more active part or to just watch from afar and see the process. It can retain its working grit while looking attractive to people. I can turn the whole space into a part of my instrument, or invite people to the table I am working on.

Live performance with SOTE at NEXT festival, Bratislava, 2015 (photo Jana Makroczy)

In live visuals, this need materialized as an interface integrated into the visuals themselves, so I can mingle with the crowd with a gamepad in hand, looking at the same screen people do, playing and occasionally passing the controller along for people to have fun. But that is another story, and another (next) blog post.

Zuzana and visitors playing with the Grafofon installed at the Slovak National Gallery (photo Juraj Starovecký)

In the future I would love to see the Grafofon as part of some long-term installation, and for that purpose rework it into a final form that can be installed easily and economically and made available for visitors to play with. However, the local experimental music and art scenes operate in a mode that puts you, as a local, into “the symbolic payment zone”: budget is reserved for foreign artists, which cuts you off from the resources needed to make such a project sustainable or even minimally easy to transport and install. During the whole project I was supported financially, in terms of materials and helping hands, by the generous 4D gallery; otherwise my budget was zero, making it really hard to spend substantial time on perfecting the system. I now have a clearer idea of what I would like to achieve sound-wise during a live performance, and I will continue working on that, albeit in the same on-and-off fashion. Maybe I will even explore the Grafofon’s possibilities in VR, a medium that is intriguing and offers an easier setup.


4D gallery

Zuzana Sabová and Ladislav Sabo

Miroslav Šimek for help with building the Grafofon construction

Juraj Starovecký for video documentation

Gabriela Zigová for photo documentation

The great tool that vvvv is and the team behind it

tonfilm / Tebjan Halm for the vvvv audio pack

Michael Lancaster for the LaunchPad mini control library for Reaktor

Martin Kaščák for giving us a spot on his stage at Pohoda festival


Amalia Filip — Chaosdroid


Digital archive of Milan Adamčiak’s works

Printed collection of Milan Adamčiak’s works (4 volumes)

Dennik N obituary (in Slovak)

“Adamčiak Začni!” exhibition at the Slovak National Gallery