Spatial Mini-apps: The next growth curve for web developers?

Yorkie Neil
10 min read · Oct 12, 2023


YodaOS released its first version in 2019, originally positioning itself as an open-source solution for smart speakers. At that time, I served as the core maintainer of the YodaOS application framework, providing an embedded JavaScript voice application framework for JavaScript developers.

As time passed, smart speakers gradually lost their prominence, and it seemed that the intersection of the voice assistant field and the web ecosystem was not as compelling. Smart speakers with screens also brought the solution back to the Android Open Source Project (AOSP) ecosystem.

After four years of exploring different paths, I returned to Rokid. I still hoped to find the next growth curve that would be interesting and substantial enough to support the livelihood of engineers. This led to the birth of YodaOS JSAR — Spatial Mini-apps.

Over the years, YodaOS evolved from its original role as an intelligent speaker operating system based on Linux to YodaOS Master, an operating system for spatial computing (AR/MR/XR) scenarios, with its technical foundation transitioning from the Linux Kernel to AOSP.

YodaOS JSAR is a set of application frameworks built on this system, specifically designed for web developers. You could say that YodaOS JSAR is an extension of the previous YodaOS JavaScript Application Framework, transitioning from developing voice skills to creating spatial mini programs.

Recently, I was invited to give a short talk at OpenJS World 2023 in Shanghai, where I presented the design of YodaOS JSAR.

This article is a written version of that talk. It aims to give readers a comprehensive understanding of YodaOS JSAR and the thinking behind it.

Spatial Computing and the Meaning of “Space”

To provide some background, let’s first understand what “space” means. In the context of augmented reality (AR), “space” refers to an area in the real world, which can be a plane or a three-dimensional object. Users interact with space through spatial mini-programs to obtain information about that space, such as its size, position, and orientation. In traditional AR application development, a scene is usually used to represent and interact with this space.

from Wolvic Documentation

The illustration above, from Wolvic’s technical documentation, shows a browser designed for a spatial context. The traditional browser tabs have become a series of virtual screens (web pages) surrounding the user, who can switch between web pages or applications simply by turning their head. The menu functions sit at the top and bottom of the virtual screens and are operated with controllers or gestures.

The Rokid AR Studio Screenshot

YodaOS Master follows a similar design. Users can open multiple windows in the space, each corresponding to a web page or an Android application, and interact with the windows and the menu using gestures or rays.

The user’s space can be visualized as shown above: the user is at the center of this space, surrounded by a cylindrical surface. Part of this surface serves as a virtual screen to display web pages or applications, and users can switch between them by simply turning their head.

Spatial Mini-Apps

Readers may already be familiar with traditional mini-apps such as WeChat mini-programs, which let developers ship instantly usable features inside the WeChat app without requiring users to download a separate mobile application. Spatial mini-apps share that idea but have some unique characteristics of their own.

So what exactly are spatial mini-apps? Let’s begin by watching a demonstration video: https://ar.rokidcdn.com/web-assets/pages/yodaos-jsar-demo.mp4

In addition to the screens seen earlier, spatial mini-apps include interactive 3D objects in the space. These objects can be interacted with independently.

To make this possible, spatial mini-apps must possess the following features:

  • Security: Ensuring the safety of these programs.
  • From 2D to 3D: Transitioning from traditional 2D interfaces to three-dimensional spaces.
  • From Window to Space: Adapting from conventional windows to spatial environments.

Why Choose Web Technologies

From a technical perspective, spatial mini-apps don’t necessarily have to be based on web technologies. Scripting languages like Lua or Python could be used as well. However, the decision to use web technologies was driven not only by my passion for web development but also by some specific technical advantages.

Web technologies, which have evolved since the release of HTML 1.0 in 1993, are known for their security and convenience. YodaOS JSAR aims to provide a spatial application framework built around those same qualities, so that developers and users can create and share simple, fun, and convenient “spatial mini-apps” in addition to standalone spatial applications.

The New Trio: XSML, SCSS, and TypeScript

YodaOS JSAR introduces a new trio of technologies corresponding to the familiar HTML, CSS, and JavaScript: XSML, SCSS, and TypeScript.

XSML

XSML (eXtensible Spatial Markup Language) corresponds to HTML. It’s a markup language for spatial content, extending the capabilities of web development into three-dimensional space.

The above is an XSML code example, which will be quite familiar to developers acquainted with HTML. It closely resembles today’s HTML, with differences mainly found in certain tags, such as:

  • <html> becomes <xsml>
  • <body> becomes <space>

Here are some of the new tags introduced in YodaOS JSAR:

  • <mesh> references a 3D model
  • <cube> creates a cube
  • <plane> creates a plane
  • <sphere> creates a sphere
  • <capsule> creates a capsule shape
  • <torus> creates a torus
  • <bound> creates a 3D bounding box, similar to <div>

The lion shown in the illustration above is rendered using XSML. You can access the source code and an online demo through the following links:

For a comprehensive understanding of XSML, you can visit: https://jsar.netlify.app/en-us/manual/latest/basic-concepts/intro-xsml

SCSS

SCSS (Spatial Cascading Style Sheets) corresponds to CSS.

The SCSS syntax is fully inherited from CSS, and its usage is also identical:

  1. Select elements using selectors.
  2. Define styles.

For example:

@material red {
  diffuse-color: #ff2200;
}

#box {
  rotation: 0 0 180;
  position: 0 1 0;
  material: "red";
}

You can see that the settings are no longer the traditional CSS styles but are now focused on 3D space, encompassing attributes like rotation, position, and material. SCSS provides a highly intuitive and natural way to design spatial styles, making it more straightforward compared to using scripts.
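To make the step from declarations to transform values concrete, here is a minimal TypeScript sketch. It is not JSAR's actual parser: it assumes that a value such as `0 0 180` is three space-separated numbers and that rotation is given in degrees, neither of which is confirmed here. The helper names `parseVec3` and `degreesToRadians` are hypothetical.

```typescript
// Hypothetical helpers for illustration; JSAR's real SCSS parser is not shown here.
type Vec3 = { x: number; y: number; z: number };

// Parse a declaration value such as "0 1 0" into a three-component vector.
function parseVec3(value: string): Vec3 {
  const parts = value.trim().split(/\s+/).map(Number);
  if (parts.length !== 3 || parts.some((n) => isNaN(n))) {
    throw new Error(`expected three numbers, got "${value}"`);
  }
  return { x: parts[0], y: parts[1], z: parts[2] };
}

// Assumption: rotation components are Euler angles in degrees.
function degreesToRadians(v: Vec3): Vec3 {
  const k = Math.PI / 180;
  return { x: v.x * k, y: v.y * k, z: v.z * k };
}

// The two spatial declarations from the #box rule above.
const boxTransform = {
  position: parseVec3("0 1 0"),
  rotation: degreesToRadians(parseVec3("0 0 180")),
};
```

A 3D engine would typically consume values in this shape, which is why a declarative style rule can stand in for several lines of script.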

For a complete understanding of SCSS, you can visit: https://jsar.netlify.app/en-us/manual/latest/basic-concepts/intro-scss

TypeScript

With the advent of server-side runtimes like Deno and Bun, which natively support TypeScript, YodaOS JSAR has also chosen TypeScript as its native programming language. It incorporates a TypeScript compiler at runtime and preprocesses TypeScript code when interpreting <script>, executing it within V8.

The reasons for selecting TypeScript primarily include:

  1. Exceptional development experience with TypeScript, especially when combined with Visual Studio Code.
  2. TypeScript is compatible with JavaScript, so supporting TypeScript implies compatibility with JavaScript as well.
  3. In the 3D development domain, where complexity multiplies due to the additional dimension, using JavaScript to maintain code can increase the demands on developers. TypeScript helps lower this threshold.
  4. With the backing of a type system, it becomes easier to write efficient, more general code, and the types provide hints for code generation in large language model (LLM) scenarios.

Through YodaOS JSAR’s script system, you can use both Babylon.js APIs and Web APIs.

Babylon.js is an open-source 3D rendering engine that supports multiple backends, including WebGL, WebGPU, server-side, and native platforms. It offers a range of game scene APIs suitable for developing 3D games and applications.

For example:

const scene = spatialDocument.scene as BABYLON.Scene;

YodaOS JSAR provides a global object called spatialDocument, which is similar to the document object in a web environment. You can use this object to access the current 3D scene.
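As a hedged sketch of what working with that scene might look like, the function below spins a mesh a little on every frame. It relies only on two Babylon.js `Scene` members, `getMeshByName` and `onBeforeRenderObservable`, described through minimal structural types so the snippet stands alone. Whether an XSML element with id `box` becomes a mesh named `box` is an assumption, and `spinMesh` is a hypothetical helper, not a JSAR API.

```typescript
// Minimal structural types covering only the Babylon.js members used below.
interface MeshLike {
  rotation: { y: number };
}
interface SceneLike {
  getMeshByName(meshName: string): MeshLike | null;
  onBeforeRenderObservable: { add(callback: () => void): unknown };
}

// Rotate the named mesh around the Y axis by a fixed step before each frame.
// Returns false when no mesh with that name exists in the scene.
function spinMesh(scene: SceneLike, meshName: string, radiansPerFrame: number): boolean {
  const mesh = scene.getMeshByName(meshName);
  if (!mesh) return false;
  scene.onBeforeRenderObservable.add(() => {
    mesh.rotation.y += radiansPerFrame;
  });
  return true;
}

// Inside a spatial mini-app this might be called as (assumed mesh name):
//   spinMesh(spatialDocument.scene as BABYLON.Scene, "box", 0.01);
```

Note that in Babylon.js a mesh with a `rotationQuaternion` set ignores `rotation`, so models loaded from some formats may need the quaternion updated instead.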

YodaOS JSAR is built on the Babylon.js framework, allowing you to access other Babylon.js capabilities directly through the scene. You can check the support for Babylon.js features here.

Additionally, YodaOS JSAR supports various Web APIs, and it will continue to support more in the future. Here are some of the Web APIs that are currently considered valuable for spatial computing devices:

  • Using the WebXR Device API to handle spatial relationships, obtain input data, and manage the lifecycle.
  • Managing modules using ECMAScript Module.
  • Sending network requests using fetch.
  • Creating timers using the Web timer APIs (such as setTimeout and setInterval).
  • Processing and playing audio using Web Audio.
  • Recognizing user speech and generating speech using Web Speech.
  • Continued support for WebAssembly.

You can check the status of our Web APIs support here.

Implementation Details

In this section, we will introduce some details of implementing the YodaOS JSAR runtime.

Integration with Unity

First, let’s address a question — why integrate with Unity?

Whether on Apple Vision Pro or YodaOS Master, developers currently have essentially one mainstream framework for 3D application development: Unity.

For example, the desktop applications of YodaOS Master are Unity applications. To achieve the goal mentioned earlier of placing spatial mini-apps within a Unity application’s space, YodaOS JSAR has to work together with Unity.

Therefore, YodaOS JSAR’s integration with Unity is a crucial aspect of its design. This integration allows spatial mini-apps to run within Unity applications. This means that Unity developers can integrate YodaOS JSAR into their Unity applications and deploy them to various platforms.

The integration involves Unity and JavaScript components, connected through a YodaOS JSAR Unity plugin. The workflow is as follows:

  1. The Unity application calls the YodaOS JSAR plugin interfaces from C#.
  2. The plugin uses the Node.js Embedder API to start Node.js instances on a non-main Unity thread.
  3. Node.js requests the URLs of the spatial mini-apps to load and parses their XSML, SCSS, and TypeScript.
  4. The parsed result is used to create a Babylon.js Scene, from which the scene data is extracted.
  5. The scene data is sent to Unity C# through the Unity C++ communication channel.
  6. Unity creates 3D objects (GameObjects) from the Babylon.js scene data using the provided APIs.
  7. The two sides are kept synchronized in real time.

By following this process, YodaOS JSAR is able to render the code from spatial mini-programs within Unity and provide interactive capabilities through Unity’s interaction methods.
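Steps 4 and 5 of the workflow can be pictured with a small TypeScript sketch: a scene tree is flattened into an ordered list of "create" messages and serialized for the channel to the host. The shapes `SceneObject` and `CreateMessage`, and the use of JSON as the encoding, are assumptions made for illustration; the actual JSAR data format is not described in this article.

```typescript
// Hypothetical shapes for illustration; the real JSAR wire format is not described here.
type Vec3Tuple = [number, number, number];

interface SceneObject {
  id: string;
  kind: string; // e.g. "cube", "sphere", "mesh"
  position: Vec3Tuple;
  children: SceneObject[];
}

interface CreateMessage {
  type: "create";
  id: string;
  parentId: string | null;
  kind: string;
  position: Vec3Tuple;
}

// Flatten the scene tree depth-first, parents before children, so the host
// can create objects in order without ever meeting an unknown parent id.
function toCreateMessages(node: SceneObject, parentId: string | null = null): CreateMessage[] {
  const own: CreateMessage = {
    type: "create",
    id: node.id,
    parentId,
    kind: node.kind,
    position: node.position,
  };
  let messages: CreateMessage[] = [own];
  for (const child of node.children) {
    messages = messages.concat(toCreateMessages(child, node.id));
  }
  return messages;
}

// A tiny scene: a bounding box containing a cube and a sphere.
const demoRoot: SceneObject = {
  id: "root",
  kind: "bound",
  position: [0, 0, 0],
  children: [
    { id: "box", kind: "cube", position: [0, 1, 0], children: [] },
    { id: "ball", kind: "sphere", position: [1, 1, 0], children: [] },
  ],
};

// Serialized form that a byte-oriented channel to the host could carry.
const wirePayload = JSON.stringify(toCreateMessages(demoRoot));
```

On the receiving side, the host would decode each message and create the matching object under the given parent, which corresponds to step 6.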

Isolation

To ensure program isolation, each spatial mini-app entity in the Unity Runtime operates independently, and each entity has its own data channel.

In the Node.js runtime, each spatial mini-app corresponds to an XSMLDocument, and each document has its own isolated data channel. This design ensures that each mini-app runs in isolation, without interfering with the others.
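The one-channel-per-document idea can be sketched in a few lines of TypeScript. This is an illustration of the isolation property only: the class `DocumentChannels` and its methods are hypothetical names, not part of JSAR, and a real channel would cross the boundary to Unity rather than call a local listener.

```typescript
// Hypothetical sketch; the names below are not JSAR APIs.
type ChannelListener = (payload: string) => void;

class DocumentChannels {
  // A prototype-less object, so ids like "toString" cannot collide with built-ins.
  private listeners: Record<string, ChannelListener> = Object.create(null);

  // Open a dedicated channel for one document; ids must be unique.
  open(documentId: string, listener: ChannelListener): void {
    if (this.listeners[documentId]) {
      throw new Error(`channel already open: ${documentId}`);
    }
    this.listeners[documentId] = listener;
  }

  // Deliver a payload to exactly one document's channel.
  // Returns false when that document has no open channel.
  send(documentId: string, payload: string): boolean {
    const listener = this.listeners[documentId];
    if (!listener) return false;
    listener(payload);
    return true;
  }

  close(documentId: string): void {
    delete this.listeners[documentId];
  }
}

// Two mini-apps, each with its own channel.
const channels = new DocumentChannels();
const receivedByA: string[] = [];
const receivedByB: string[] = [];
channels.open("doc-a", (payload) => receivedByA.push(payload));
channels.open("doc-b", (payload) => receivedByB.push(payload));
channels.send("doc-a", "hello");
```

Because every send names a single document id, a message meant for one mini-app never reaches another's listener.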

Protocol Design of Intermediate Messaging

Another critical aspect of YodaOS JSAR is the design of the intermediate data exchange between Unity and the JavaScript runtime.

At the OpenGL level, a scene consists of three-dimensional coordinate data (and other data derived from those coordinates), shader programs, and transformation matrices. It would therefore be possible to render objects into a Unity scene at the OpenGL level, an approach that offers greater universality and standardization.

However, YodaOS JSAR insists on approaching the problem through scene management for a specific reason. We aim to do more than just rendering; we want to integrate better with the Unity framework, enabling users to interact realistically with objects from spatial apps within the Unity scene. This requires us to extract object descriptions for synchronization, rather than dealing with lower-level vertex data and shaders.
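To illustrate what synchronizing object descriptions (rather than vertices and shaders) could look like, the TypeScript sketch below compares two snapshots of a scene and emits only the changes. The message shapes and the function `diffSnapshots` are hypothetical; the actual JSAR protocol is not specified in this article.

```typescript
// Hypothetical message shapes; the actual JSAR protocol is not specified here.
interface ObjectState {
  position: [number, number, number];
  visible: boolean;
}
type Snapshot = Record<string, ObjectState>;
type SyncMessage =
  | { type: "create"; id: string; state: ObjectState }
  | { type: "update"; id: string; state: ObjectState }
  | { type: "remove"; id: string };

function sameState(a: ObjectState, b: ObjectState): boolean {
  return (
    a.visible === b.visible &&
    a.position[0] === b.position[0] &&
    a.position[1] === b.position[1] &&
    a.position[2] === b.position[2]
  );
}

// Compare two snapshots of object descriptions and emit only what changed:
// creations and updates first (in the order of `next`), then removals.
function diffSnapshots(prev: Snapshot, next: Snapshot): SyncMessage[] {
  const messages: SyncMessage[] = [];
  for (const id of Object.keys(next)) {
    if (!(id in prev)) {
      messages.push({ type: "create", id, state: next[id] });
    } else if (!sameState(prev[id], next[id])) {
      messages.push({ type: "update", id, state: next[id] });
    }
  }
  for (const id of Object.keys(prev)) {
    if (!(id in next)) messages.push({ type: "remove", id });
  }
  return messages;
}
```

Messages at this level name whole objects, so the host can attach its own interaction behavior to each one, which would not be possible with raw draw calls.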

We plan to address the issue of non-standard scene data and hope to achieve this by opening up the data format of the middle part, serving as a new Web Virtual Object Model standard proposal.

Consistency Across Multiple Platforms

YodaOS JSAR aims to provide Web developers with a consistent development experience, and therefore, consistency across multiple platforms is an important feature we aspire to achieve.

Firstly, YodaOS JSAR supports rendering on the following platforms:

  • Unity Runtime for Android
  • Unity Editor
  • Web Browser
  • JSAR DevTools for Visual Studio Code (The VSCode Extension provided by YodaOS JSAR)

For the Unity Runtime and the Unity Editor, consistency is guaranteed by Unity itself.

For the latter two, essentially, they are both web platforms. Initially, there were two options: using Babylon.js on Web browser or compiling through Unity to WebGL. Ultimately, the latter option was chosen for the following reasons:

  • It offers a clearer design, with Unity ensuring consistency across all platforms.
  • Babylon.js and Unity have significant differences in the underlying rendering, making it challenging to guarantee consistency.
  • Babylon.js supports all capabilities on the web, but in YodaOS JSAR, only partial support is required. This approach avoids situations where developers debug on the web but find that it doesn’t work on actual devices.

Thanks to Unity’s excellent performance in rendering across multiple platforms, YodaOS JSAR has achieved the following results:

The End

We hope YodaOS can complement client-side application development for the Node.js community, allowing us to return to the world of “end” (client-side) development with this familiar technology.

Our initial goal remains unchanged. We hope that YodaOS JSAR can provide Web developers in this era of spatial computing with more possibilities and opportunities!

Regarding open source, YodaOS JSAR will eventually operate in an open-source manner. However, the primary task at the moment is to support the needs of upper-level developers and grow the community. Once the community reaches a certain scale, YodaOS JSAR will certainly transition to an open-source model.

You can access the latest documentation and information about YodaOS JSAR at https://jsar.netlify.app/.
