Design for Another World: Creating a Virtual Concert

Courtney Andree
HCI with Galaan
Nov 19, 2020 · 8 min read

In our creation of an “impossible place” in a 3D environment, we first considered implementing a fictional world, but then realized that in the current period of a global pandemic, going to a concert is just as impossible as going to Narnia or Jupiter. You can experience the world we created here and enjoy the feeling of being on stage, performing at a concert for hundreds of people. While we are happy with the results, there were many ups and downs that ultimately led us along our creative path.

Demo Video

Demo video of our virtual concert.

Brainstorming

Our first goal was to come up with the basic theme for this project. Since the project description seemed to lean toward a fictional place such as another planet or the world of a storybook, we started by toying with ideas along those lines. At some point Travis Scott's virtual Fortnite concert came up, and from then on we were pretty much sold on creating a concert. The three main ways to implement a concert would put the user either on stage as a member of the band, in the crowd watching the performance, or both, by letting them move through the 3D space to switch between the two. In all three of these possible implementations, we saw potential for using both visuals and sound to make the world feel immersive. We then started the research process needed to substantiate our ideas and flesh them out into something complete.

Travis Scott's virtual Fortnite concert.

Research

On the topic of sound, there seemed to be a variety of effects we could use for immersion. In section 4b of this article by Staci Jaime, she states that having audio come from the direction of its source is important for establishing a sense of realism in a virtual world. Since we intended for users of our app to play sound through their laptops' built-in speakers, we didn't think we could effectively implement directional sound, since that requires delivering different levels to each ear through headphones. However, we were confident that we could modify the overall volume of a sound based on the user's proximity to its source. More specifically, we could place the source of our sound inside 3D models of speakers and increase the volume of the music as the user got closer to them.

A binaural audio explanation showing how the direction of sound builds a sense of realism.
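To make the proximity idea concrete, here is a minimal sketch of how distance-based volume could be wired up as a custom A-Frame component. This is illustrative rather than our exact code: the component name, the cutoff distance, and the asset IDs are placeholders, and A-Frame's built-in sound component can also attenuate positional audio on its own.

```html
<script>
  // Hypothetical component: scales the attached sound's volume by the
  // listener's distance to this entity (full volume up close, silent
  // beyond maxDistance).
  AFRAME.registerComponent('proximity-volume', {
    schema: {
      maxDistance: {type: 'number', default: 25}  // cutoff in meters (placeholder value)
    },
    init: function () {
      this.speakerPos = new THREE.Vector3();
      this.listenerPos = new THREE.Vector3();
    },
    tick: function () {
      var camera = this.el.sceneEl.camera;
      var sound = this.el.components.sound;
      if (!camera || !sound) { return; }
      this.el.object3D.getWorldPosition(this.speakerPos);
      camera.getWorldPosition(this.listenerPos);
      var distance = this.speakerPos.distanceTo(this.listenerPos);
      // Fade linearly from full volume at the speaker to silence at maxDistance.
      var volume = Math.max(0, 1 - distance / this.data.maxDistance);
      this.el.setAttribute('sound', 'volume', volume);
    }
  });
</script>

<!-- Attach the component to a speaker model that also carries the sound. -->
<a-entity gltf-model="#speaker-model"
          sound="src: #concert-track; autoplay: true; loop: true"
          proximity-volume="maxDistance: 25"
          position="4 0 -6"></a-entity>
```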

Additionally, we liked the idea of playing the user's voice back with reverb, captured through their computer's microphone, whenever they stood close to the on-stage microphone model, as this would give the feeling of actually singing into a mic on stage. We felt this was a strong use of her point in section 11b, where she mentions that many physical objects already carry an understood meaning for users and should be implemented with the features users expect; in this case, a microphone comes with the expectation of voice playback.

In terms of visuals, we mainly just wanted it to feel like a concert. We decided to include a stage, band members on the stage, and some sort of crowd off-stage to capture the feeling of attending an in-person concert. In this article by Jonathan Ravasz, he discusses how important establishing the environment is. In particular, we kept in mind where the ground level sits in relation to the user's viewpoint, ideally about a human's height below the camera, as well as the need for a non-uniform, “cluttered” walking space, which we would achieve using 3D models of complex terrain and the audience.
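As a reference point, the ground/camera relationship looks roughly like the A-Frame sketch below: the ground plane sits at y = 0 and the camera at roughly eye level (A-Frame's default camera height is about 1.6 m), so the floor reads as about a person's height below the viewpoint. The sizes and color here are arbitrary.

```html
<a-scene>
  <!-- Ground plane at y = 0, rotated to lie flat. -->
  <a-plane rotation="-90 0 0" width="100" height="100" color="#4a4a4a"></a-plane>

  <!-- Explicit camera rig at eye level; A-Frame's default camera behaves similarly. -->
  <a-entity camera look-controls wasd-controls position="0 1.6 0"></a-entity>
</a-scene>
```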

Implementation

Our general methodology for making progress on this project was to work in incremental steps. The first milestone was simply being able to download a model and load it into our code base. We then loaded in a few person models, a stage, and a ground terrain for everything to sit on. One particular point of discussion was how to implement the audience. We first thought of using a large number of human models, but with each new model the scene became increasingly laggy. Our next idea was to place a large screen in front of the stage playing a video of people dancing at a concert. While this was done without too much trouble, it didn't feel as immersive. Luckily, we found an actual model of a crowd, composed of a collection of almost-2D people at varying positions in 3D space.

One of the early stages of our project when we first learned how to implement models.
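For anyone curious, the basic pattern for loading a model in A-Frame looks roughly like this. The file path, id, position, and scale are placeholders rather than our actual assets.

```html
<a-scene>
  <a-assets>
    <!-- Preload the model; "models/stage.gltf" is a placeholder path. -->
    <a-asset-item id="stage" src="models/stage.gltf"></a-asset-item>
  </a-assets>

  <!-- Reference the preloaded asset by id, then position and scale it in the scene. -->
  <a-entity gltf-model="#stage" position="0 0 -10" scale="2 2 2"></a-entity>
</a-scene>
```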

We simultaneously investigated our ideas for sound. A-Frame does not have any documentation on using a computer's microphone, so we had to look to other libraries as alternatives. We found a few, the easiest to use being Pizzicato. Since A-Frame does not play well with other independent scripts, we had to figure out how to embed a Pizzicato mic-recording object inside an A-Frame entity, but once we did this we were able to create the voice echo effect we wanted.
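A simplified sketch of that embedding is below. It assumes Pizzicato's script has already been loaded on the page, and the component name and effect parameters are illustrative rather than the exact values we used.

```html
<script>
  // Wrap a Pizzicato live-input sound in an A-Frame component so it can be
  // attached to the on-stage microphone entity.
  AFRAME.registerComponent('echo-mic', {
    init: function () {
      var voice = new Pizzicato.Sound({ source: 'input' }, function () {
        // Route the microphone back out through a delay so the user's voice
        // comes back as an echo, as if singing into a live mic.
        voice.addEffect(new Pizzicato.Effects.Delay({
          feedback: 0.4,
          time: 0.3,
          mix: 0.5
        }));
        voice.play();
      });
    }
  });
</script>

<!-- Attach the component to the microphone model on stage. -->
<a-entity gltf-model="#mic-model" echo-mic position="0 1 -3"></a-entity>
```

Note that browsers only grant microphone access on secure origins and after the user accepts a permission prompt, which may help explain some of the cross-browser inconsistencies described below.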

Issues

The limiting factor of what we were able to achieve was A-Frame. While it is a fairly easy tool to use, there were many nuanced difficulties we faced. In terms of loading models, some did not work because their file formats, such as .ply and .fbx, appeared to be supported according to the A-Frame documentation yet failed to load in practice. Other formats such as .obj and .gltf seemed to be the right choice, since many models of those types loaded without error, yet some files of the very same format simply would not load and had to be abandoned.

One of the many pages of models we found that looked cool but would not load properly.

Another issue on the topic of visuals was simply the time spent updating and rendering. While Glitch has live updating for the A-Frame world we were creating, any time one of us made a change, it refreshed and fully reloaded the entire project. This unfortunately made group collaboration quite slow. A somewhat viable workaround was copying the project, modifying the size and location of components individually, and then copying the new dimensions and values back into the base code, though this process was only somewhat more efficient than our original direct collaboration.

An example of what it looked like when the scene would slowly reload / possibly never finish loading.

In terms of sound, implementing the ideas described above took much longer than expected because either A-Frame or Glitch would stop outputting the project's sound at random times, without any changes to the code. Realizing that this was the case and that the code itself wasn't broken took some hours. After this, we found that the project's sound worked only in certain web browsers, and which browsers worked differed from machine to machine. Eventually, we were able to integrate the third-party echo-mic effect from the Pizzicato library into an A-Frame component without too much difficulty. Yet even after we reached this point, getting Pizzicato to shape the microphone input into a reverb effect did not work as we had hoped, as there seemed to be some issue with either the library or our understanding of how it works.

The sound library we used to access the microphone.

Additionally, we wanted the sound of the music at the concert to come from the speakers next to the stage, with the volume increasing or decreasing as the user moved toward or away from them. The issue we faced here was that we really liked the large video board, which initially played a video of the audience, as a visual element that seemed to really add to the concert experience. The board was problematic because we couldn't figure out how to turn off its audio, meaning we ended up playing the music through the board alone and not through the speakers by the stage. We ultimately felt that the board added more as a visual asset than the varying music volume would have added as a sound asset.

The video board next to the stage. Billy Ray Cyrus is serenading the crowd with “Old Town Road”.

Lastly, A-Frame would sometimes just not load fully. For Nick in particular, there were times when different parts of the environment simply would not load, even after 10 minutes. This issue was random yet persistent, which meant some of us could only work on a few specific parts of the scene at any given time, based solely on which parts happened to load.

Results

Quantifying our results turned out to be harder than we thought. In terms of the general feeling of immersion, we all believed we used video and sound effectively to create an interactive world. From a research standpoint, we also implemented the points we set out to address: using objects in their intended ways, creating effective terrain, and making use of video and input sound. On the other hand, we were limited by the models we found online, as we did not think creating our own was realistic given our timeframe, and A-Frame itself is somewhat laggy, which affects the sense of immersion. In the end, we successfully created a virtual environment that allowed the user to sing along with the song and explore the concert around them.

A full view of our finished product, featuring the stage, audience, video board, and a Ferris wheel for fun.

Once we figured out how to find and load valid models into our environment, we added elements to effectively mimic a concert. The static objects were the strongest component of our environment, since the music video and microphone only worked on some platforms. Adding these objects let us be creative about which pieces we wanted to include in our concert.

Because the audio only worked on certain computers or in certain browsers, playing the song was the weakest part of the final product. We believe many of the audio issues were caused by problems in Glitch itself. We also had difficulties demonstrating our project in Gather.Town due to conflicts with its audio settings. Without the audio playing, users were simply walking around static models. If we had more time, we would have liked to animate the models to create a more immersive simulation. In the scenario where the audio doesn't play, the crowd jumping up and down or the performers moving on stage would have made the environment more realistic.

Conclusion

This project gave our group an opportunity to really explore virtual reality. Before this, some of us had used a VR headset once or twice, but had never put much thought into all of the little decisions that go into making a program feel immersive. Our creation reflects the design research behind our decisions, the effort that went into creating the user experience we sought, and the incremental, step-by-step design process we followed from the start of this project to its finish.
