How we created an “Analytics Tag” for VRChat and made our world multilingual with AI.

Georgy Molodtsov
Published in FILM XR
10 min read · Apr 2, 2024

…Also how we reached the audience in their native language without speaking that language.

Everyone is talking about AI and its use cases, but for us it was hard to see a practical benefit of AI in our projects beyond the usual “ask ChatGPT how to do this and that”. But in our recent project, the “MormoVerse” VRChat world, we found a real practical use for these tools, one that gave us quite a nice boost in attendance and, even better, left visitors happier because we made something easier for them.

So here is a long story.

Origin of the project

MormoVerse is a VRChat narrative world created by our team as part of the bigger “Kitten Mormitten / Under the Pillow XR” universe we’ve been working on for several years and plan to keep expanding (so if you are a VR or mobile game publisher, or an animation series broadcaster or studio, we have a pitch deck for an animation season and concepts for a VR game and mobile games, please contact us :)

It is not just a “story”, it is my father’s legacy: he wrote the original tale for me and my sister over 30 years ago to help us get along better by giving us a common “helper”, a magical handmade toy, Kitten Mormitten, which the kids in the story (us) decided to create so that he could find their lost toys and “treasures”. My father was a journalist; he published a weekly newspaper with a page dedicated to Kitten Mormitten, and thousands of kids from the region sent in their pictures and stories about Kitten Mormitten’s adventures. Fast forward to 2019, and I decided to bring this story back in VR and dedicate it to my late father, who died too early.

So, our VRChat world is the third stage of what I’ve been doing with Kitten Mormitten.

And this “Phase Three” builds on the two previous ones:

  • “Under the Pillow”, a narrative VR game pilot created in Unity by Feeling Digital Studio and Film XR in 2021. The project was nominated for Cannes XR and the Crystal Owl Awards, and received the VR Days Halo Best Art Game Award:
  • “Kitten Mormitten”, the animation film (Phase Two), created in Unreal Engine 5.2 using the assets from the VR game (Unity -> Unreal, yep), with a big rework of the characters and visuals but still based on the same asset group:

So, basically, we combined and upgraded the world design from the Unity VR game with the narrative structure of the animation film, cutting the film into 4 short episodes. The deal is simple: you arrive in the world, watch a short piece of the film, get a task, go into the virtual world and visit the same places you’ve just seen on the screen, complete a simple interactive quest, come back, watch the next episode, and so on. This is how one of the world hops looks, with me as the creator giving the introduction:

After releasing the VRChat world and being awarded at Raindance Immersive, we started working on optimizing the project for standalone devices, the so-called “Quest optimization”. VRChat recently published an Android app, and all Quest-optimized worlds can be played on a good Android smartphone. As a result, the user base peaks at 100–120K concurrent visitors on weekends.

100K is a good number, but there are also thousands of worlds, and getting your world well visited is almost a full-time job. Participating in the Raindance Festival and receiving an award helped us reach 3.5K visitors in a month, a good amount, with over 650 people adding the world to their “favorites”.

As our VRChat experience is based on an animation film, and during the experience you can watch 4 short episodes, we’ve embedded a video player in the world. For now, VRChat doesn’t provide much analytics about visitors beyond the total visit count and who favorited the world.

And here is the first trick we’ve learned!

Since the videos are hosted on a video platform (Vimeo or YouTube), you can get the analytics from there and see who visited your world.

The Quest version was published on Feb 23, 2024, and that’s when a new wave of visits started: over 5K visits in 16 days (by VRChat’s visit count) from all over the world:

It seems that from Vimeo you can only get data on Downloads and Impressions, but that’s a valuable source anyway, as everyone who comes into the world and clicks the start button creates an impression. It’s similar with YouTube: playback inside VRChat doesn’t count as a view, but it still gives you some numbers on visitors.
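If you prefer pulling these numbers programmatically rather than from the dashboards, here is a minimal sketch using the YouTube Analytics API (Vimeo offers per-video stats through its API as well). The credentials file, the date range, and the per-country breakdown are assumptions for illustration, not the exact queries we ran:

```python
# Sketch: per-country view counts for the videos embedded in the world,
# via the YouTube Analytics API. Assumes you have OAuth client credentials
# (client_secret.json is a placeholder) for the channel hosting the videos.
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/yt-analytics.readonly"]

flow = InstalledAppFlow.from_client_secrets_file("client_secret.json", SCOPES)
credentials = flow.run_local_server()

analytics = build("youtubeAnalytics", "v2", credentials=credentials)

# Views by country since the Quest version was published (Feb 23, 2024).
report = analytics.reports().query(
    ids="channel==MINE",
    startDate="2024-02-23",
    endDate="2024-03-10",
    metrics="views,estimatedMinutesWatched",
    dimensions="country",
    sort="-views",
).execute()

for country, views, minutes in report["rows"]:
    print(f"{country}: {views} views, {minutes} min watched")
```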

One of the reasons, of course, is that our world was Spotlighted by VRChat itself in the “Cross-platform” category, meaning it was suggested to people browsing the worlds menu:

But the data from Vimeo Analytics gave us a really good understanding of who our audience is, knowledge we usually can only guess at while working in VRChat.

Based on the data, we can see that we have quite a lot of visitors from Japan and Korea. That was not surprising, considering the number of tweets we found mentioning our world, especially the one with a video from the world, which pushed up visits from Japan:

The Japanese audience really appreciated the work and the story, but they shared that it was hard to follow the animation film without speaking English well. Having English subtitles (not only the voice track) helped, but still, not everything was clear.

So we decided to make a Japanese version!

It was clear that if we want to please our audience and build stronger relations with them, it is worth trying to offer the experience in their language. In our case, most of the world is built around videos, meaning what we needed to do was re-narrate and re-subtitle the work. To do that, we tested several AI services that translate video while keeping professional audio quality. There are a few of them on the market, so my preliminary tests included:
1) Rask AI
2) HeyGen
3) Silero Bot
4) ElevenLabs

Out of all those services, only Rask AI provided a proper transcript and flexible tools to manually correct each phrase while keeping the rest of the translation untouched.

As we are working with an animation film, we used only the audio track, but both Rask and HeyGen may be interesting for their video lip-sync feature. It works quite well, but only on the web versions of the video files you upload; you can hardly use it in professional production.

My workflow was based on the mastered version of the film, so the audio output was taken from the M&E vocal track. To make the process easier, I copied the track to a new sequence in Premiere Pro and “reverse-engineered” it, cutting each voice onto its own track and numbering the cues in order.

I’ve decided to do two tests.

First — group the cues from one talent together and make one sequence with voices going one after another:

Leaving blank spaces was important: my early tests showed that phrases need to be separated from each other for the AI to recognize them and match them with the originals one by one.

9 minutes of film turned into 4 min 45 sec of speech (including 15–20 sec of pauses). That’s good, because Rask charges by the minute (it takes credits from your account based on the audio length).

The second approach is to keep the files in chronological order but cut the extra gaps where nothing is being said. You can go with this one, just be sure to leave some space between phrases, as some sentences take longer to pronounce in other languages (otherwise the AI will try to rush them).
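As a side note, this assembly step can be scripted. Here is a minimal sketch with pydub, a Python audio library named here for illustration only (we did the actual cutting in Premiere Pro); the file names and padding length are assumptions:

```python
# Sketch: concatenate per-cue WAV files with silence between them, so the
# AI service can recognize each phrase separately. Requires ffmpeg installed.
from pydub import AudioSegment

CUE_FILES = ["cue_01.wav", "cue_02.wav", "cue_03.wav"]  # hypothetical names
GAP = AudioSegment.silent(duration=1500)  # 1.5 s of padding between phrases

track = AudioSegment.empty()
for path in CUE_FILES:
    track += AudioSegment.from_file(path) + GAP

track.export("dubbing_source.wav", format="wav")
print(f"Assembled {len(track) / 1000:.1f} s of audio from {len(CUE_FILES)} cues")
```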

I uploaded the 4:45 track to Rask and let it transcribe and translate it first. And here it is super important to proofread the English transcription, as it is the one that gives you the most control over all other language versions. With the Pro plan you can use one properly marked-up source and keep adding new languages, while with Basic you’d have to upload the same file again for every new language:

The system easily recognized 2 out of 3 voices (kitten Mormitten and the elder sister), so I had to manually assign the third voice to the younger brother (to be fair, his voice is quite similar to Yulia’s and Mormitten’s and sits somewhere in between).

It took 3–4 minutes or so to upload and transcribe the 5-minute file, and around 10 more minutes to translate those pieces of audio into Japanese.

After the first version is generated, you can correct the durations using the visual timeline in the lower part of the window:

After getting the file back, I cut it into pieces matching each cue and placed them in the project; with color coding this was quite easy to do:
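Since Rask keeps each phrase at its original timecode, this slicing step is mechanical, just a list of (start, end) cuts. For illustration, the same pydub sketch in reverse, with made-up file names and timecodes:

```python
# Sketch: slice the translated track back into per-cue clips using the
# timecodes from the transcript. The cue list here is hypothetical.
from pydub import AudioSegment

dubbed = AudioSegment.from_file("dubbing_japanese.wav")

# (cue name, start ms, end ms) -- in practice taken from the transcript
CUES = [("cue_01", 0, 3200), ("cue_02", 4700, 9100), ("cue_03", 10600, 13900)]

for name, start_ms, end_ms in CUES:
    clip = dubbed[start_ms:end_ms]  # pydub slices by milliseconds
    clip.export(f"{name}_ja.wav", format="wav")
```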

Here you can see that the lowest track is the new audio:

As I am planning to do several videos and am not yet in the mastering phase, I decided to use sequences with background sounds like laughs and other vocal reactions plus the AI-translated tracks. I just solo the tracks I need (the character’s sounds + the translation) and then place it in a master video sequence:

The first test was good: at least the sound of the voices was right, and so was the intonation. The tricky part is passing the translation to native speakers for further proofreading and text changes.

Next step: I found a native speaker who corrected the written transcription. She said that the AI didn’t pronounce some syllables in the first take, which is why some phrases came out well while others were unrecognizable.

We made a second take with a properly written translation. For that, I created a Google Spreadsheet into which I copied the data from the Rask interface.

As I’m on the Basic plan, I didn’t have access to SRT export/import, which would have made things slightly faster.
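If you end up with the same spreadsheet workaround, turning its rows into an SRT file yourself takes only a few lines. The CSV layout below (start, end, and text columns) is my assumption about how such a sheet might be exported, not Rask’s format:

```python
# Sketch: build an .srt subtitle file from a spreadsheet exported as CSV.
# Assumed columns: start and end in seconds, plus the translated line.
import csv

def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

with open("cues_ja.csv", newline="", encoding="utf-8") as src, \
     open("episode_ja.srt", "w", encoding="utf-8") as dst:
    for i, row in enumerate(csv.DictReader(src), start=1):
        dst.write(f"{i}\n")
        dst.write(f"{srt_timestamp(float(row['start']))} --> "
                  f"{srt_timestamp(float(row['end']))}\n")
        dst.write(f"{row['text']}\n\n")
```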

Here is another trick: nesting Premiere sequences and then replacing the source files makes it easy to quickly update the edit.

Rask is quite consistent in keeping phrases within their timecode boundaries. This means that when you do a new take with changes, the cues stay in the same place and your previous cuts in the nested sequence still match. Just replace the source file, and you have an updated version.

After the second run we understood that “written Japanese” was not doing the job well, even though it was better than the first run. For the third run we made phonetic corrections, so my spreadsheet tab for the Japanese translation got two extra columns: “Japanese for subtitles” and “Japanese for dubbing”.

The third version was much better, but there were still 8 out of 50 phrases that were hard to use due to mispronunciation. So we reviewed those lines once more: in some places we replaced words with synonyms, in others we tried another phonetic spelling. It took 2 more rounds to get it done! The good news is that Rask doesn’t charge you for those changes, so you can keep on perfecting.

After replacing the audio, I had the final version of the dubbed track, and all that was left for VRChat was to cut those videos into the episodes used in the world and add the subtitles. And, it’s done.
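For the subtitles, one way to hard-burn the SRT into each episode is ffmpeg; here is a sketch driven from Python (the file names are placeholders, and the subtitles filter needs an ffmpeg build with libass):

```python
# Sketch: hard-burn the Japanese subtitles into each episode with ffmpeg.
# File names are placeholders; requires an ffmpeg build with libass.
import subprocess

for ep in range(1, 5):  # the film was cut into 4 short episodes
    subprocess.run([
        "ffmpeg", "-y",
        "-i", f"episode_{ep}_ja.mp4",             # video with the dubbed audio
        "-vf", f"subtitles=episode_{ep}_ja.srt",  # burn in the SRT
        "-c:a", "copy",                           # keep the audio untouched
        f"episode_{ep}_ja_subbed.mp4",
    ], check=True)
```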

Let’s compare the original English starting episode:

With the Japanese one:

We went to Twitter/X, found the people who had posted about our world before the release of the Japanese version, and invited them to visit and watch it in Japanese. The reactions were pretty good:

Creating a multilingual version gave us a huge boost in visits: from 250–270 people daily we grew to 350–400 during the week and over 550 on weekends. The majority are still from the United States, though. But it might also be that by adding the list of available languages to the title of the world (it is now MormoVerse˸ Under the pillow[ENG⁄ JPN⁄ RUS]), we made it clearer to visitors what to expect there (I doubt people read the full description of the world carefully).

Thanks a lot to everyone who contributed, especially to Sana, who volunteered to proofread the translation and correct it for both verbal and written versions!

And please follow deacōnline, the developer behind our VRChat world, who might reveal some secrets about how he programmed the language switching in the world!

And of course, you can try Rask AI yourself to check if it fits you!

See you in MormoVerse! モルモバースで会おう!

P.S. Our next steps:

  1. Release French and Korean versions
  2. Create more mini-games in the VRChat world
  3. Conquer the universe with the kindness of Kitten Mormitten and make him the most beloved Kitten in both real and virtual worlds!


Georgy Molodtsov

XR Director @VRROOM (Oxymore), VR Festival Curator (goEast, VR_SciFest, Tbilisi VR Days), Founder @ Film XR (Raindance-winning "MormoVerse", etc.)