Getting started with visionOS!

Apple’s big bet that the future of mental health is in VR

Piram Singh
9 min read · Nov 4, 2023

Imagine this: you're attending an immersive therapy session, practicing meditation in a vast forest, all from the comfort of your home in Apple's new Vision Pro.

According to respected sources, Apple is looking to invest in building a whole ecosystem for tackling mental health, with the Vision Pro at the center alongside the Apple Watch and the iPhone. You can learn more about this future by clicking here.

To attain this future, we have to start accelerating visionOS project development. So in this piece, I am going to guide you through how to start making those cool projects! I'm going to break down (1) the front-end design principles, (2) the backend setup, (3) some cool example projects, and (4) resources for next steps!

As a quick clarity check for those who don't know: visionOS is the operating system that powers the Vision Pro.

Frontend Breakdown

When developers are tasked with building apps for a new user experience, where the input is changing from tapping a screen to using your eyes and a tap of the fingers, UXers, front-end engineers, and product designers have a lot to think about. I've curated a list of the 4 main principles front-enders need to consider when designing apps for visionOS:

🔭 Field of View

With this new tech, we are no longer limited to a 2D screen; we have the entire world to display an operating system on top of. That leads to the biggest factor you need to consider: where is the user looking at all this information? What is the angle of their head relative to the screen?

Upright vs. Angled Viewing — learn more here

The main question for designers to ask themselves: is the app I'm designing suitable for the user to view at both upright and angled viewing angles?

🔍 Screen Anatomy

With an understanding of the user's POV, we can dive into the look and feel of visionOS, which is determined by 4 main areas:

(1) Screen Material

(2) Screen Shape

(3) Screen Ornaments

(4) Screen Typography

🧱 Screen Material

The basic building block of visionOS's screen design is glass 👇

Glass lets natural light pass through virtual 3D objects and lets users view and interact with their real-life environment behind them! For designers, Apple requires that all content screens keep this design language across applications (make this your starting point for all your design workflows).
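As a rough idea of what that starting point looks like in SwiftUI, here is a minimal sketch of a glass-backed view; the meditation copy is just placeholder content, and the key piece is the glass background modifier:

```swift
import SwiftUI

// A minimal sketch of a visionOS view sitting on the system glass material,
// so the user's room and its lighting stay visible behind the content.
struct WelcomeView: View {
    var body: some View {
        VStack(spacing: 12) {
            Text("Daily Meditation")          // placeholder content
                .font(.largeTitle)
            Text("A 10-minute guided session")
                .foregroundStyle(.secondary)
        }
        .padding(40)
        // The translucent glass backdrop Apple recommends as the
        // default surface for visionOS content.
        .glassBackgroundEffect()
    }
}
```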

🔹 Screen Shape

windows vs. volumes — learn more here

There are 2 types of content screens here: a window and a volume. A window is used for the more conventional applications you are used to on your phone or tablet.

Concept piece for the visionOS App Store

A volume screen would be used to create virtual 3D objects that users can interact with and view from any angle.

A 3D globe being used alongside a window for a new learning experience
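To make the two shapes concrete, here is a hedged SwiftUI sketch of an app that declares both; LessonView, the "Globe" asset name, and the sizes are illustrative placeholders, not taken from any real project:

```swift
import SwiftUI
import RealityKit

// Sketch: one conventional window plus one volume in the same app.
@main
struct LearningApp: App {
    var body: some Scene {
        // Window: the familiar flat, glass-backed content screen.
        WindowGroup(id: "lesson") {
            LessonView()
        }

        // Volume: a bounded 3D region the user can walk around
        // and view from any angle.
        WindowGroup(id: "globe") {
            Model3D(named: "Globe")   // placeholder 3D asset bundled with the app
        }
        .windowStyle(.volumetric)
        .defaultSize(width: 0.6, height: 0.6, depth: 0.6, in: .meters)
    }
}

// Placeholder for the flat lesson content.
struct LessonView: View {
    var body: some View { Text("Today's lesson") }
}
```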

🕹️ Screen Ornaments

To control all of the content elements on screen, Apple has introduced "ornaments," which are essentially the virtual controllers for the screen. Here are the different types of ornaments 👇

Bars

Bars are essentially small panels attached to the left, right, or bottom edge of a window. They can sit along one side of the window or float slightly in front of the app, creating a necessary sense of depth from the main content. Bars either have specific controls mapped to them, like play/pause, or act as a page switcher.
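Here is a small sketch of a bar built with SwiftUI's ornament modifier, assuming a hypothetical video window; the buttons are non-functional placeholders and only show the bottom-edge anchoring and the glass backing:

```swift
import SwiftUI

// Sketch: a bar-style ornament anchored to the bottom edge of its parent
// window, floating on its own glass panel slightly in front of the content.
struct PlayerWindow: View {
    var body: some View {
        VideoCanvas()
            .ornament(attachmentAnchor: .scene(.bottom)) {
                HStack(spacing: 24) {
                    Button("Rewind", systemImage: "gobackward.15") { }
                    Button("Play", systemImage: "play.fill") { }
                    Button("Forward", systemImage: "goforward.15") { }
                }
                .labelStyle(.iconOnly)
                .padding()
                .glassBackgroundEffect()
            }
    }
}

// Placeholder for the window's main content.
struct VideoCanvas: View {
    var body: some View { Color.black.opacity(0.8) }
}
```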

Menus

To create deeper drill-downs, the second family of ornaments to take note of is popover menus, which help users find related actions that branch off from the parent action. Also note that menus are pushed toward the user, sitting higher along the z-axis than the screen behind them.

Sheets

Spatial GPT Chat History

The last family of ornaments is sheets. These are content screens (normally smaller rectangles) stacked on top of each other. They are helpful for looking at chat history, flipping through images, and other use cases.
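Sheets come over from iOS mostly unchanged: the standard SwiftUI sheet presentation stacks a smaller pane in front of the parent window. A quick sketch, assuming a hypothetical chat-history drill-down:

```swift
import SwiftUI

// Sketch: a sheet stacked over the parent window, used here for a
// hypothetical chat-history list.
struct ChatWindow: View {
    @State private var showingHistory = false

    var body: some View {
        Button("Show chat history") {
            showingHistory = true
        }
        .sheet(isPresented: $showingHistory) {
            // On visionOS the sheet appears as a smaller pane layered
            // in front of the window that presented it.
            NavigationStack {
                List(0..<10, id: \.self) { index in
                    Text("Conversation \(index + 1)")   // placeholder rows
                }
                .navigationTitle("History")
            }
        }
    }
}
```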

🔤 Screen Typography

2D vs 3D Text

The last part of the screen anatomy is the system's typography. Apple requires that all text be set in the SF Pro font and kept in 2D.

Why not 3D? Because you are already dealing with a 3D space, keeping basic components like text in 2D makes them easier for people to read.

📐 Spatial Layout

The next design principle is understanding the space around you. When you design for spatial experiences, there are essentially two states the user could be in:

(1) A Shared Space: where multiple app windows are open at once, with content sitting right in front of your eyes and off to the sides near your temples

(2) A Full Space: where a single app takes over the space around you and creates an immersive experience
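Here is a rough sketch of how those two states map to code, assuming a hypothetical meditation app: a normal window lives in the Shared Space, and an ImmersiveSpace scene provides the Full Space. The ids, views, and immersion styles are illustrative choices, not from any real project:

```swift
import SwiftUI

// Sketch: a Shared Space window plus a Full Space scene in one app.
@main
struct TherapyApp: App {
    // Which immersion style the Full Space uses when it is open.
    @State private var immersionStyle: ImmersionStyle = .mixed

    var body: some Scene {
        // Shared Space: this window coexists with other apps' windows.
        WindowGroup {
            HomeView()
        }

        // Full Space: when opened, this scene takes over the user's surroundings.
        ImmersiveSpace(id: "forest") {
            ForestScene()
        }
        .immersionStyle(selection: $immersionStyle, in: .mixed, .full)
    }
}

struct HomeView: View {
    @Environment(\.openImmersiveSpace) private var openImmersiveSpace

    var body: some View {
        Button("Begin forest meditation") {
            // Moves the user from the Shared Space into the Full Space.
            Task { _ = await openImmersiveSpace(id: "forest") }
        }
    }
}

// Placeholder for the immersive 3D content (a RealityKit scene in a real app).
struct ForestScene: View {
    var body: some View { Text("Forest goes here") }
}
```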

🗣️ Input Mechanisms

To control the new content screens and create a more seamless UX, the input mechanisms have been changed to feel more natural and more human. Here's a breakdown of the new input mechanisms in the Vision Pro 👇

(1) 👀 Eye Tracking + 🤏🏼 Hand Movements — your eyes are the main drivers of the system. You see an app, select it by looking at it, tap your fingers together, and ta-da! (There's a small code sketch of this right after the list.)

(2) ⚙️ Digital Crown — the Digital Crown is used to adjust the volume, adjust the level of immersion in a given app, resize content, open accessibility settings, and exit apps, and it acts as a home button for the system!

(3) 📣 Voice — you can use your voice to dictate text system-wide. This works practically anywhere and is a core part of the ease-of-use factor.

(4) ⌨️ Physical Keyboards, 💻 Laptops, & 🖱 Mice — a virtual keyboard is built into the system, but users can also connect physical keyboards, Mac laptops, and mice if they want the tactile feel of real keys and clicks!

(5) 🎮 Game Controllers — for the gamers out there, you can connect your controllers to the system and play all of your favorite video games!
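As promised in (1), here is a small sketch showing how little the look-and-pinch input changes your code: standard SwiftUI controls respond to eye-plus-pinch selection the same way they respond to a tap, so a plain button works as-is (the labels are placeholders):

```swift
import SwiftUI

// Sketch: look at the button and pinch; it fires exactly like a tap.
struct InputDemoView: View {
    @State private var tapCount = 0

    var body: some View {
        VStack(spacing: 20) {
            Button("Start session") {
                tapCount += 1          // triggered by look + pinch
            }
            Text("Pinches so far: \(tapCount)")
        }
        .padding(40)
        .glassBackgroundEffect()
    }
}
```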

Backend Breakdown

Moving on to the backend: here's what you're going to need to get started with development for the new device.

  1. Hardware: a Mac to build and store all your projects on!
  2. Software: the latest Xcode beta with the built-in visionOS SDK. Check here for the latest updates to Xcode (beware though, the beta can slow down your laptop).
  3. Software: Reality Composer Pro to prepare and render virtual 3D objects (this comes bundled with the visionOS SDK).
  4. Software: Unity, for 3D rendering (specifically for game development).
  5. Knowledge: learning Swift and the SwiftUI framework (a minimal starter app is sketched below).
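To make item 5 concrete, here is roughly what the starter app from Xcode's visionOS template boils down to; this is a from-memory sketch, not the exact template code:

```swift
import SwiftUI

// Sketch: the smallest useful visionOS app, one entry point and one window.
@main
struct HelloVisionApp: App {
    var body: some Scene {
        WindowGroup {
            ContentView()
        }
    }
}

struct ContentView: View {
    var body: some View {
        Text("Hello, visionOS!")
            .font(.extraLargeTitle)   // a visionOS-specific text style
            .padding(60)
    }
}
```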

Current visionOS projects

With the SDK rolled out in June of this year, a small but mighty developer community has been using it to create some dope conceptual apps based on those front-end principles. Here are some amazing examples 👇

  1. 🤖 Spatial GPT: With an emphasis on voice integrations, Spatial GPT is what customers need! It is an augmented version of ChatGPT built and designed for spatial computing. It is aimed at the AI geeks, programmers, and other workers who are moving to a work system that is purely screenless and based on AI. The front end builds on voice and shared spaces: Spatial GPT sits in an existing window that supports your ongoing work. While there is no mention of the model powering the AI, the best option for this concept is GPT-4V, the multimodal edition of ChatGPT-4. You can learn more about this concept here.

2. 🎙️ Navi Translation — imagine you are meeting a work buddy from Korea; you don't know the language, but live translations or transcriptions would be helpful. At the bottom of your view are real-time translations. This is Navi Translation. It is so helpful for people who are traveling, people who are hard of hearing, and anyone who wants translations when talking with family members who speak a different language. Focusing on voice, screen ornaments, and cues of depth, the front-end interface stays in front of the person rather than taking up an entire window or volume. While there is no explicit mention of how translations are done, they are probably using a Neural Machine Translation (NMT) algorithm. For the designers: think of the NMT algorithm as a highly proficient translator who can translate and understand multiple languages instantly. You can see the entire project here.

3. 🪂 Immersive Social Experiences — you're scrolling your Twitter feed, and two minutes later you're jumping out of a plane on a skydive. The future of social media is immersive experiences with your friends through the Vision Pro. With these immersive experiences, you're looking at the VR geeks, people who have friends in other countries and want to be together, and possibly those who want to relive nostalgic experiences they can't have anymore. The biggest design principle to take note of here is the use of the full immersive space and the transition from a shared to a full space. While there is no mention of the back-end models and algorithms, here are five things to understand about the back end for this product:

  • Real-Time Messaging Protocol (RTMP) — used for low-latency video
  • Content Delivery Network (CDN) — delivers the video to users
  • Cloud-based scalability model — lets the system take in more data as demand grows
  • Dynamic Adaptive Streaming over HTTP (MPEG-DASH) or HTTP Live Streaming (HLS) — used so the video performs well across different types of networks (slow vs. fast); a small playback sketch follows below
  • Data encryption — to ensure user data is secure

If you would like to learn more about the project, click here.
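To make the HLS item from that list concrete, here is a hedged sketch of playing an adaptive stream with AVKit; the stream URL is a made-up placeholder and nothing here comes from the actual project:

```swift
import SwiftUI
import AVKit

// Sketch: AVPlayer consumes an HLS playlist (.m3u8) and automatically
// adapts video quality to the viewer's network conditions.
struct SkydiveStreamView: View {
    // Placeholder URL; a real app would point at its CDN's playlist.
    private let player = AVPlayer(
        url: URL(string: "https://example.com/skydive/master.m3u8")!
    )

    var body: some View {
        VideoPlayer(player: player)
            .aspectRatio(16 / 9, contentMode: .fit)
            .onAppear { player.play() }
    }
}
```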

🔗 Tutorials & Accounts to look out for

  1. https://www.youtube.com/watch?v=V-mIIcvYrh0 — A good guide to get started with visionOS development
  2. https://www.youtube.com/watch?v=eMA1Vd1nc9M — A step-by-step visionOS app tutorial (it also teaches you how to use Reality Composer Pro)
  3. https://www.youtube.com/watch?v=AQHq9WZPavI — Another step-by-step tutorial
  4. https://twitter.com/visionOS_news — The best place to find visionOS projects; the dev community is pretty active here
  5. https://twitter.com/tracy__henry — Another great person to follow

After all of this research, I am going to start designing my own mental health Spatial GPT app! Follow me to see what happens on this journey!

Here’s a quick summary of what we just learned:

  • Apple is investing a lot of money into solving mental health through VR
  • There are four main design principles to know when designing for visionOS — Field of View, Screen Anatomy, Spatial Layout, & Input Mechanisms
  • When it comes to the backend, you need a Mac, the latest Xcode beta with the visionOS SDK, and knowledge of Swift and SwiftUI
  • There is an awesome dev community for visionOS that is making some sick projects like Spatial GPT & Navi Translator. Go follow them!

Hey, I'm Piram, an aspiring UI/UX designer exploring how AI is changing design. Connect with me on LinkedIn (https://www.linkedin.com/in/piramsingh/) to follow me on this journey!
