How we created an immersive Street Walk Experience with a GoPro and JavaScript

Sharing back the making of the web-documentary Pregoneros de Medellín (Colombia)

Thibault Durand
17 min read · May 19, 2015

http://www.pregonerosdemedellin.com

N.B.: This post discusses the making of the interactive part of the web-documentary about the street vendors of Medellín: Pregoneros de Medellín (in English “The Criers of Medellín”). If you do not know it, you should check it out before reading this article: http://www.pregonerosdemedellin.com , and/or watch the trailer: https://vimeo.com/123789400.

From the early days of thinking about the web-doc Pregoneros de Medellín, we wanted to build a virtual street walk, for its narrative value, not for the sake of having a fancy street walk.

We never had to agonize over “What will the users’ experience be? What kind of interactivity can we offer them?”: our story was the experience, and the experience was the story.

And Pregoneros de Medellín’s story is: stroll through the streets, enjoy the urban soundscape, be surprised by the pregones (street vendors’ shouts) and meet the vendors.

In this article I will explain the process of transforming this story into a complete interactive experience built with HTML, CSS, JavaScript, a GoPro, a bike, some microphones and a lot of love ;-).

I kept this article readable for a non-programmer audience: just skip the parts marked TECH in the headline. The code is open source and available on GitHub; please keep in mind that the goal of this article is not to explain the code, but to share the creation process.

Benchmark and Inspiration:

As with any product, before starting you look at what is already out there, to avoid reinventing the wheel. And like everyone else we were fascinated by Google Street View (since this benchmark, nice new projects based on Street View have appeared, for example Night Walk in Marseille, France).

Back then, Google’s imagery wasn’t available in Medellín (it is now), so our first thought was: let’s code our own Street View!

360° imagery exploration:

First draft of “Pregoneros de Medellín”, Google Street View style

We started investigating how you actually capture 360° imagery and how you build a JavaScript client to visualize it.

Technically it was doable, but complex, and without any funding at the time we decided to explore other options.

We were watching a lot of interactive documentary productions and we got an idea from one of them: Defense d’afficher, a documentary about street art.

Defense d’afficher

The idea came from the way they do the transition between two street art pieces:

Defense d’afficher transition

It is a linear video and you can’t control the speed of the street walk, but it inspired the idea of continuity: we didn’t want jumps like in the normal Google Street View experience.

From this video transition we thought: how could we make that interactive? How could the user control this video? From this question came the idea of controlling it with the scroll, but we didn’t know how we could technically do it.

We started scratching our heads for a technical solution and found a project on GitHub: scroll2play.

Scroll2Play:

You can check out this cool hack: http://pwalczyszyn.github.io/scroll2play/ . The experience is quite simple: you scroll and the video moves forward.

Scroll2play demo

At this point we thought that if we could do the same with a video travelling through the streets, we would have our experience!

And so we made the first prototype of what would become the Pregoneros de Medellín interactive experience: http://tdurand.github.io/scrollingvideo (a little country road in my hometown in France, shot with a GoPro on my bike)

First prototype http://tdurand.github.io/scrollingvideo

Scroll2Play helped us set up the technical foundations of the Pregoneros de Medellín app.

On giving up on 360° imagery and setting priorities

At this point the investigation and benchmark phase was over. We had the concept; now we needed to optimize it and see if this thing could really run in browsers, potentially over average connections, and figure out how we would actually shoot it in the streets of Medellín, how we would add the sounds, how the characters would appear, etc.

We decided not to pursue the 360° imagery idea, for several reasons:

  • TECH: Doing the same street walk with 360° imagery would require WebGL (well, you can hack things with CSS 3D or Canvas, but it’s not easy to get good performance); while today it’s supported in almost every browser, two years ago it wasn’t.
  • It would require loading even more assets (and therefore a really fast internet connection) to build the continuous experience, rather than one frame every 10 m like Google Street View. Recently we saw this amazing experience: http://a-way-to-go.com/ ; these guys succeeded in building 360° imagery with a continuous walk. I would love to know more about how they did it.
  • It makes the shooting in the streets complex: you need a 360° rig with 6-8 GoPros (more expensive shooting and post-processing).

How the scrolling animation works (TECH)

The core feature of the scrolling video experience is quite simple:

  • We compute the <body> height of the page depending on the length of the current street, which gives a long page you can scroll.
  • We use requestAnimationFrame, which fires a callback before each repaint of the browser (hopefully at 60 fps, but not always).
  • In this callback we get the current scroll position, which corresponds to the position in the street, and we display the correct frame simply by replacing the src attribute of the <img> tag.

No fancy tech there, just an <img> tag and requestAnimationFrame, which makes the web-doc compatible with a lot of browsers.
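To make this concrete, here is a minimal sketch of that loop. The frame count, pixels-per-frame ratio, element id and URL scheme are assumptions for illustration, not the project’s actual values:

// Minimal sketch of the scroll-driven frame swap (illustrative values).
var FRAME_COUNT = 300;        // frames in the current street
var PIXELS_PER_FRAME = 20;    // scroll distance mapped to one frame

// Make the page tall enough to scroll through every frame.
document.body.style.height = (FRAME_COUNT * PIXELS_PER_FRAME) + 'px';

var img = document.getElementById('streetwalk'); // hypothetical element id
var currentFrame = -1;

function render() {
    var scrollTop = window.pageYOffset || document.documentElement.scrollTop;
    var frame = Math.min(FRAME_COUNT - 1, Math.floor(scrollTop / PIXELS_PER_FRAME));
    if (frame !== currentFrame) { // only touch the DOM when the frame changes
        currentFrame = frame;
        img.src = 'frames/frame-' + frame + '.jpg';
    }
    requestAnimationFrame(render);
}
requestAnimationFrame(render);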

If you head to the code of the web-documentary you will see more complexity: progressive loading, high/low-res images, fetch queues… But the core is this feature, and we built the app on it.

On shooting, trial and error

We did a lot of trial and error before getting the right method; you can have a look at the early footage we shot (with a normal DSLR without a fisheye lens):

You can see that:

  • It’s not wide-angle enough
  • It feels slow: you have the impression you will need to scroll a lot to move forward in the street, and you keep following the same people
  • It feels a bit shaky

We addressed these issues by:

  • Using a GoPro to get the wide angle (we could have used a camera with a fisheye lens, but the GoPro is more compact; you want to be discreet, you don’t want people to stop and stare at the camera)
  • Shooting on a bike to be able to go faster than the people walking in the street, giving the impression of “flying” through the streets.
  • Now that we were on a bike with a GoPro, we needed stabilization, because you can’t possibly hold the camera steady while looking where you are going and avoiding collisions with passers-by… So we bought a gimbal and did some tests

GoPro on the bike

GoPro stabilization on the bike

Pros:
- discreet, POV (point of view) a bit lower

Cons:
- it was not stabilizing enough; the streets of central Medellín aren’t the best in the world, and it jumped a lot

GoPro on the helmet

GoPro stabilization on the helmet

Pros:
- incredibly stable, your body absorbs all the vibrations
- you can control the shot: if you need to steer the bike a bit to the left, you can keep your head straight

Cons:
- the POV is a bit high; we resolved it by lowering the seat of the bike
- it’s less discreet

After the tests, we chose to shoot with the GoPro on the helmet:

Shooting “Pregoneros de Medellín” streetwalk

You can see the final result:

Final streetwalk online experience

Bonus thought

To improve the shooting process, we think that this kind of thing could be really nice (discreet, and you can still go faster than people walking):

Airwheel, great concept

Finding a way to encounter the street vendors on the interactive walk

It’s nice to have the street walk working, but how will the users meet the street vendors?

We were inspired by the web-documentaries Gare du Nord and Highrise, where animated drawings of characters appear so you can click on them. This was our first attempt:

The result was not good. We figured out that it is really hard to play with perspective, and the main problem with this technique was that, unlike in Highrise, the speed of our video depends on the speed of scrolling, which is a really hard variable to work with.

Looking for a better solution, we came back to the simplest one: what if the street vendors were actually in the street when we shot? This solution added some logistical complexity, but it was clearly the best way to go. It’s 100% reality!

The problem became: how to make them clickable? We did some design iterations, and the least intrusive and best UX (user experience) choice turned out to be the arrow sign you can see online:

UX choice to make the street-vendors clickable

3D tracking to position the sign on top of the characters

Screencast: discovering a character

As you can see in the screencast, the sign appears 5-10 meters before the character, and the position of the character is not the same in each frame. For each frame we needed the position of the character in the picture, in order to position the sign on top of him.

That represents more or less 30 frames for each character, so if you do the math it’s 300 frames to process for all the characters of the web-doc. We needed a way to automate the process.

We opened After Effects and used a great feature, the 3D camera tracking. You can track a point (our reference was the head of the street vendor) across frames, and then convert these positions back to 2D space with the excellent plugin 3Dto2D ScreenSpace.

After Effects 3D tracking on GoPro footage

Then with a simple copy-paste you can export these positions, and with a custom script convert the data to JSON (the positions are in percentages):

"characterPosition": {
"87": { //frame number
"left": 58.5453125, //left offset in %
"top": 50.97759259 //top offset in %
},
"88": {
"left": 58.58489583,
"top": 50.99018519
},
"89": {
"left": 58.6171875,
"top": 50.99990741
}

For each frame we could now position the sign with top and left percentage values.
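As an illustration, applying this data could be as simple as the following sketch (the element and function names are hypothetical, not from the project’s source):

// Hypothetical helper: place the arrow sign for the current frame,
// using the percentage offsets exported from After Effects.
// The sign element is assumed to be absolutely positioned inside
// the streetwalk container.
function positionSign(signEl, characterPosition, frame) {
    var pos = characterPosition[frame];
    if (!pos) {                         // no tracking data: hide the sign
        signEl.style.display = 'none';
        return;
    }
    signEl.style.display = 'block';
    signEl.style.left = pos.left + '%'; // percentages scale with the viewport
    signEl.style.top = pos.top + '%';
}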

Sound design:

Investigation

The sound design is a crucial part of the web-documentary; in fact, part of the promise of the experience is: “enjoy the urban soundscape, be surprised by the pregones (street vendors’ shouts)”. We invested a lot of research time in it.

We wanted the user to be immersed in the streets by replicating, as best we could, all the diversity of sounds that exists in the center of Medellín. The other fundamental part of the sound design is that it provides complete continuity when you change streets: while a new street is loading, the sounds keep going.

Technically, we must confess we didn’t have a clue how to do it; we barely knew the Web Audio API, and we were just tinkering with some prototypes.

We quickly made some prototypes with 2 or 3 sounds playing at different volume levels, adjusting these levels depending on the scroll position of the user, but it was basic and we didn’t know how to scale it to more sounds and more streets (80+ streets).

It was with this basic principle in mind that we discovered this amazing project: Sounds of Street View. The guys from Amplifon did an amazing job creating an interface where you can add sound to a Google Street View walk; you should check out the project.

Sounds of Street View Project: www.amplifon.co.uk/sounds-of-street-view

And the best part of this project is that it is open source, so we dug right into the code and found out that these guys were using GPS coordinates to geo-spatialize the sounds and compute the distance between each sound and the position of the user. It seems obvious, but until you have this idea, you are pretty much in the dark.
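For reference, computing a distance from two GPS points is only a few lines of JavaScript (this is the textbook haversine formula, not code from either project):

// Great-circle distance between two GPS points (haversine formula),
// used to relate the listener's position to each sound source.
function distanceMeters(lat1, lon1, lat2, lon2) {
    var R = 6371000; // Earth radius in meters
    var toRad = function (deg) { return deg * Math.PI / 180; };
    var dLat = toRad(lat2 - lat1);
    var dLon = toRad(lon2 - lon1);
    var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) *
            Math.sin(dLon / 2) * Math.sin(dLon / 2);
    return 2 * R * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
}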

With this fundamental concept and a few more from the Sounds of Street View project, we started building our own sound engine and thinking about how to capture the sounds in the streets.

Field work in the center of Medellín, iterating on the recording process

After much trial and error, we came to the conclusion that there would be two kinds of sounds:

  • Ambient sounds, as in the Sounds of Street View project, which fade in/out following a 1 / distance factor.
  • Punctual sounds, which populate the scene and make it more real, and which fade in/out following a 1 / (distance × distance) factor, so they only emerge when you get close.

The ambient sounds give us the background ambience of the streets; they last 1–2 min before looping.

Example of ambient sound

The punctual sounds give us the details: the “pregones” of the street vendors, a musician playing, the sound of a store; they are shorter (5–20 sec).

Example of punctual sound

Recording:

The sound recording was done at the same time as the video recording, to make sure the sound ambience matches what you see in the image (some days there could be a lot of people, cars, open stores…, other days not). We first shot with the GoPro and immediately afterwards captured the sounds of the street.

To record the ambient sounds in stereo we used a Jecklin disk. We recorded ambient sounds roughly every 30 m.

Recording ambient sound with the Jecklin disk

The punctual sounds were recorded in mono and, as we will see later, integrated into the ambience by our sound engine.

Sound editor UI:

Now that we had our methodology, we needed to build an efficient workflow. We created an interface on top of the web-documentary to easily design the sound environment:

Sound editor UI

Within this sound editor you can add sounds, drag & drop to adjust their position, set the dB level, specify a max volume, solo a sound or mute all the others, and choose whether it is ambient or punctual, which affects how the volume of the sound is computed based on the distance of the user from the sound.
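In essence, the editor boils down to producing one entry per sound; a hypothetical entry could look like this (the field names are invented for illustration, see Sounds.js for the real structure):

{
  "file": "pregon-aguacates.mp3", // hypothetical file name
  "type": "punctual",             // or "ambient"
  "lat": 6.25184,                 // GPS position of the sound
  "lng": -75.56359,
  "dbOffset": -3,                 // level adjustment set in the editor
  "maxVolume": 0.8                // volume ceiling
}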

You can see this editor in action in this screencast:

Demo of the Sound Editor

(TECH) You can look at the Sound.js and Sounds.js files on GitHub if you want to dig into the code. The algorithm is quite simple (a sketch follows the list):

  • when the user has moved, we call the updateSounds function, which loops over all the sounds of the street and calls the updateSound function for each of them
  • updateSound takes care of adjusting the volume depending on the position of the user in the street, and for punctual sounds calls updatePan, which uses the Web Audio API panner to place the sound to the left or right of the stereo mix.
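Here is a rough sketch of that loop, with simplified names and structure; it is not the production code, and the real files do more bookkeeping:

// Simplified sketch of the sound update loop.
// Each sound is assumed to carry: position (meters along the street),
// a Web Audio gainNode and pannerNode, a maxVolume and an isAmbient flag.
function updateSounds(sounds, userPosition) {
    sounds.forEach(function (sound) {
        updateSound(sound, userPosition);
    });
}

function updateSound(sound, userPosition) {
    var distance = Math.max(1, Math.abs(sound.position - userPosition));
    var falloff = sound.isAmbient
        ? 1 / distance                // ambient: 1 / d
        : 1 / (distance * distance);  // punctual: 1 / d²
    sound.gainNode.gain.value = Math.min(sound.maxVolume, falloff);
    if (!sound.isAmbient) {
        updatePan(sound, userPosition);
    }
}

function updatePan(sound, userPosition) {
    // Place punctual sounds left or right of the listener with the
    // (legacy) PannerNode.setPosition(x, y, z) call.
    var x = sound.position - userPosition;
    sound.pannerNode.setPosition(x, 0, 1);
}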

In the whole web-documentary we have 329 different recordings. We did 5 days of shooting in the center of Medellín to record both video and audio.

To conclude this part of the article, we should say that we made a directorial choice in the sound design: we emphasized the “pregones” of the vendors, to focus the attention of the user on them. Even if it colors the overall ambience a bit, we wanted to fulfill our promise that you first encounter the street vendors through the sounds.

Iterating on interface design and user testing:

Design team working on the interface

UX methodology

In web-documentary productions, the interface is often mediocre. I think the main reason is that the projects are led by teams that come from the audiovisual industry and do not have the background of designers or front-end developers. And a mediocre interface equals a click on the browser’s close-tab button.

We put a lot of effort into building a simple, efficient and beautiful interface. We did several iterations of the main features, validating with user testing each week during the process. It was basic user testing: we invited friends from different backgrounds to pass by the office and watched them open the web-doc, telling them to say out loud what they were thinking about the interface.

Each screen of the interface was designed to have only one main action:

  • on the home page: click on ENTER (solution: big purple button)
  • beginning the street walk: understand that you need to SCROLL the page (solution: animation)
  • finding the first street vendor: CLICK on him (solution: contextual tooltip and arrow sign)
  • after the click: understand that it UNLOCKS something (solution: animation)
  • closing the video: understand that there is MORE to watch (solution: contextual tooltip)

Home page of Pregoneros de Medellín

From the user-testing feedback we iterated on these screens; we got some right in one iteration, while others took a lot of work.

I will describe the process of designing the “scroll to start” tooltip, whose goal is to make users understand that they need to scroll to go forward in the web-doc.

Iterating on the Scroll to Start tooltip:

This was the first design of the scroll to start tooltip:

First iteration

After doing 2 or 3 user tests, we saw that people were actually moving the entire mouse as the arrow suggested, and when they saw that it didn’t work, they would finally read “Scroll down the page” and eventually understand how it worked.

We came back to Illustrator and worked on a 2nd iteration:

2nd iteration

This was a lot better: 50% of people immediately understood that they needed to scroll, but some were still confused by the drawing; they thought they needed to click (the finger on the mouse was confusing). We also figured out that a lot of people don’t understand the word “scroll”; they need a visual explanation.

We tried to address the issue with the 3rd iteration:

3rd iteration

By adding a side view, we explained that the finger is on the scroll wheel of the mouse, not the click button. This helped a bit, but it was still confusing for some people, so we decided to animate the drawing.

Scroll to start animation

This is the final version. With the animation we made it clear that the gesture was scrolling, not clicking; the results of the user tests were good, so we decided to go with it.

Bonus thoughts:

Understanding that you need to scroll is not enough; sometimes people were actually scrolling, but the wrong way: they were going backward.

They were doing the right gesture, but nothing happened on the screen because at the beginning you can’t go backward. They were stuck because they didn’t understand what they had done wrong. We solved this issue by adding visual feedback:

We couldn’t specify which way you need to scroll to go forward, because some devices have inverted scrolling, like Safari on Mac, so we had to say it was simply the “other way”.
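Detecting that situation is straightforward: a scroll event never fires when the page is already at the top, but a wheel event does. A minimal sketch (showOtherWayHint is a hypothetical helper, not the project’s code):

// When the user is at the very top and still tries to scroll backward,
// the page doesn't move, so we listen to the wheel event instead.
window.addEventListener('wheel', function (e) {
    if (window.pageYOffset === 0 && e.deltaY < 0) {
        showOtherWayHint(); // hypothetical helper showing the "other way" feedback
    }
});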

Touchpad and touch devices:

You can see that we used the mouse metaphor to explain the scroll, but what if the user doesn’t have a mouse? On a laptop or a touch device, for example. Unfortunately we didn’t have the time to implement other animations for these devices.

Note on the interface implementation:

(TECH) Our graphic designer (@nicolemgm) worked with Illustrator, and most of the UI pieces are vector-based so they could be implemented using SVG. SVG is now supported by almost every browser and is a really cool technology for interface design; if you are not familiar with it, I recommend the CSS-Tricks article to learn about it: Using SVG

(TECH) For the animation work we used the excellent GSAP library; I can’t recommend it enough. The syntax is clean and it lets you create complex animations without cluttering the code. We also used CSS animations for some simple state animations (hide/show).
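For a flavor of the syntax, a tween in the TweenMax-era GSAP API looks like this (an illustrative one-liner, not taken from the project’s source):

// Fade the tooltip out and slide it up over 0.4 s.
TweenMax.to(document.getElementById('tooltip'), 0.4, {
    autoAlpha: 0,       // animates opacity and toggles visibility at 0
    y: -10,
    ease: Power2.easeOut
});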

Similar project:

Later on we found this amazing project done back in 2011 by the Hinderling Volkart team: http://360langstrasse.srf.ch/ . I think that if we had found it earlier, we would have spared ourselves a bit of R&D time.

Langstrasse.srf project

(TECH) It’s the same scrolling-video concept, and Severin Klaus wrote a nice article about how it works. For Pregoneros de Medellín we did a less jQuery-heavy implementation, but we did use a great optimization we found thanks to this article: the progressive loading of the images. It’s explained with a nice diagram in the article; the idea is to be a bit more clever when loading the images needed to display the street.

For example, for a street with 300 frames (300 pictures to load), instead of loading the 1st, 2nd, 3rd… frame up to the 300th, we can load the 1st, 5th, 10th, 15th… and progressively fill in all the frames. The advantage is that you can display the street before everything is loaded; there are fewer images at first, but you barely notice it, and you gain 300-400% in loading time. If the guys from the Hinderling Volkart team read this article: many thanks to you, sharing rocks!
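A minimal sketch of that loading order (illustrative only; the production code throttles requests with fetch queues, as mentioned above):

// Load every 5th frame first, then fill in the gaps, so the street
// is usable before all 300 images have arrived.
function progressiveLoad(frameCount, coarseStep, urlForFrame) {
    var order = [];
    var seen = {};
    [coarseStep, 1].forEach(function (step) { // pass 1: coarse, pass 2: the rest
        for (var i = 0; i < frameCount; i += step) {
            if (!seen[i]) { seen[i] = true; order.push(i); }
        }
    });
    order.forEach(function (frame) {
        var img = new Image();
        img.src = urlForFrame(frame);         // the browser caches the frame
    });
}

progressiveLoad(300, 5, function (frame) {
    return 'frames/frame-' + frame + '.jpg';  // hypothetical URL scheme
});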

What we could improve:

Pixelating on scroll

You may notice that when you move forward we display low-resolution pictures of the street walk. These pictures are 500 px wide; on an HD/Retina screen the result is really ugly.

We do this because of the loading time: increasing the quality of the low-res pictures would increase the loading time. This is why we chose such a low resolution.

That said, if you have a fast internet connection (20 Mb/s+), the loading time should be really short (~2-5 sec for a street), and for that kind of bandwidth we could have an HD mode that loads 1000 px wide pictures, for example.

Unfortunately, we didn’t have the time to implement such a feature.

360° on stopping

As discussed at the beginning of the article, we would love to have 360° imagery of the street when you stop, and it is definitely doable with a bit more funding ;-).

Tablet compatibility

Like most web-documentaries, we didn’t have the time (i.e. money) to optimize it correctly for tablets. If you use a recent Android tablet with Chrome or an iPad with Safari it should work, but there are surely some specific bugs we didn’t fix, sorry!

Conclusion:

We have been working on this project for the past two years and we are really proud that it is finally out there.

The goal of this post was to share the creation process of such a project, but there is a lot more to say: the audiovisual production, the distribution and audience of such a website, a more technical post about the use of SVG, the stills/sounds loading, the hosting, the funding, an analysis of the visit statistics… I hope to share more on these subjects in the coming months.

For technical people, the code is on GitHub; I would love to see some hacks based on our work!

Meanwhile you can let me know what you think on Twitter (@tibbb) or by commenting below.

I was responsible for the development of the interactive part of the documentary, but many people on the team worked on this creation process, mainly Ángela (co-director and audiovisual production), Nicole (graphic designer), Carlos (sound engineer), Mario (director of photography, camera and editing), Esau (conceptual development) and Juliana (text writing).
