What is RealWorld ™?
Lift your head (slowly) and turn it: everything you see is the RealWorld ™.
Version 1 25/03/2019
+The symbol “ͼ” indicates open source.
+This guide does not include solutions that are not yet mature.
+I have included many very short videos that show devices or software in action.
Table of Contents
- 3D engines
- More engines
- Other developer tools
- AR engines
- Neural networks and AR
- Content creation
- File formats
XR covers all of these technologies:
VR. Everything is virtual.
AR. Information is projected onto screens (glasses or simply smartphones).
MR. Information is integrated into both visualization and interaction. (This term is relatively new; previously it was simply called AR.)
Unreal. Free to use, including source code (available on GitHub, where it can be downloaded and compiled), but it is not open source. You only pay a 5% royalty when you publish a game and earn income. It is free for movies, post-production, or any use other than strictly “games/applications”.
Being able to compile versions tailored to your needs is great: the code is clean and lets you apply your own optimizations. Usually, when NVIDIA releases a new devkit it runs on Unreal first, probably because of this flexibility.
It is not necessary to be a programmer, but programming is useful, and in professional environments it is obviously fundamental. Unreal offers a graphical node system for “programming” visually, and it can also be programmed in C++.
Unity3D. Probably one of the engines most requested in job offers, especially for AR. The online asset store is very good for finding the assets or components needed to prototype any application quickly; there is a plugin for everything.
In the long run this is risky, as it depends on external sources: plugin authors can stop maintaining their work, and a plugin may cease to be compatible with a later version of Unity.
Prototypes are not a final product!
Pricing: free under $100k gross revenue; Plus is $35 per seat/month; Pro is $125 per seat/month.
There are many 3D engines; here we highlight some that may be interesting depending on the project. Wikipedia has a complete reference list.
CryEngine. It can probably give you one of the most beautiful digital oceans, along with good vegetation. According to the latest information, they have developed the ability to compute ray-traced reflections on any graphics card, not only on NVIDIA's latest RTX models.
Godot ͼ. A beautiful engine and one of the most serious open-source initiatives.
Internet browsers include WebGL, an API for displaying 3D content.
BabylonJS ͼ. A WebGL game engine from Microsoft; a little heavy to load, but with great performance and many options.
Three.js ͼ. Light and fast, with high quality and flexibility.
ClayGL. Great graphic quality, with some visually fantastic default settings.
A-Frame. Based on Three.js, it is the reference for VR on the web, integrating mobile sensors and Cardboard. From a simple 360º video player to Cardboard or any home-made rig, it is simply fundamental for any VR development in the smartphone browser.
Code editor. Microsoft's Visual Studio Code includes an extension for editing and viewing GLB/glTF files.
Draco ͼ (Google). Compression of 3D models; essential for the web.
WebVR/WebXR. A W3C API for integrating virtual-reality devices and scenes in the browser. Still in development.
In this section I introduce the characteristics of the AR engines. For now, the best quality belongs to Apple's ARKit. Apple manufactures the hardware and develops the software, which matters particularly for AR: it knows the accuracy and calibration of every sensor, from the gyroscope to the camera optics, and can ensure perfect synchronization of all elements. If you have to develop AR for industrial applications that require precision, there is no question.
SLAM (simultaneous localization and mapping). To integrate any kind of AR element, it is necessary to know the location of the device, usually a smartphone. Analyzing the camera image and using the sensors to determine the location and map the environment is fundamental.
Image target. The association between a reference image and a 3D model or other media content: when the camera detects the image, the content is anchored to it.
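As a rough illustration of what a tracker does once an image target is detected, the pinhole-camera model relates the target's known physical size to its apparent size in pixels. This is a minimal sketch; the function name and all numbers are illustrative assumptions, not any SDK's API.

```python
# Hedged sketch: estimating how far away a detected image target is,
# using the pinhole-camera model. All names and values are illustrative.

def target_distance_m(focal_length_px, target_width_m, detected_width_px):
    """Distance ~ focal_length * real_size / apparent_size (pinhole model)."""
    return focal_length_px * target_width_m / detected_width_px

# Example: a 20 cm wide poster seen as 200 px wide by a camera whose
# focal length is 1000 px sits roughly 1 m away.
print(target_distance_m(1000.0, 0.20, 200.0))  # 1.0
```

Real AR SDKs solve the full 6-DoF pose (position and rotation), but the same known-size-versus-apparent-size reasoning is at the core.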
3D object recognition. This feature has two functions: known objects can serve as occluders, and they can trigger the display of information and content that is not in the RealWorld ™.
ARToolKit ͼ. One of the oldest AR toolkits. It uses a pattern contained in a black square: the pattern is associated with a 3D model, and the square is used to estimate the camera position.
AR.js ͼ. A great web solution combining the JavaScript port of ARToolKit 5 with the Three.js 3D engine.
ARKit (Apple). The best AR library on the market. It detects horizontal and vertical planes with a great level of accuracy. iOS only.
ARCore (Google). Google's solution, limited to a smaller number of devices because the hardware manufacturers are not the OS developer, and the objectives of the two differ.
Vuforia (integrated into Unity3D in the latest versions). It is not very expensive and of reasonable quality. It tries to unify ARCore, ARKit, and its own proprietary technology to obtain the best quality each device allows.
Maxst. A proprietary solution, with good results when integrating objects and solving SLAM.
EasyAR. A high-quality solution, including image targets and SLAM.
Wikitude. A little expensive if you want to completely remove the watermark. Its SLAM quality is quite good; it doesn't reach ARKit's level, but it is among the best of the rest.
+Neural Networks and AR.
Neural networks open a door to identifying the environment completely, for example to add content to specific areas of the body. If we want an AR application for trying on trousers, it is essential to know the location of the human body.
For example, DensePose ͼ (Facebook Research) is a neural network that not only identifies the human pose but also fits a surface model that covers the body.
3D content creation: Maya, 3ds Max, Blender ͼ, Cinema 4D.
A 360º video is simply a movie projected onto a sphere (in stereoscopy, two spheres, one per eye). The device you are using determines the rotation of your head (usually via a sensor called a gyroscope) and shows you the corresponding part of the recorded scene.
fbx (Autodesk's interchange format). Used generically; it can include 3D models, textures, and rigs (the internal skeleton that allows character animation). It is one of the most complete and universal formats.
usd ͼ (Universal Scene Description). A file format created by Pixar and also used by Apple, mainly in its USDZ version (compressed as a zip file and including complex textures and materials).
drc (Draco 3D model compressor from Google).
Feel. The sense of touch has two components: volume and texture. To identify any “virtual” object we need devices that, on one hand, limit our movement (volume) and, on the other, produce a haptic response (texture). Obviously, we also need to know the position of the hands.
Our ability to discriminate texture is finer than a millimeter.
Therefore we will classify all gloves and related solutions according to these parameters: position, volume, and texture.
Finger position. The location of the fingers can be measured directly by sensors. The other possibility is inverse kinematics: knowing the location of the fingertips, you can solve for the plane and rotation angles of the finger joints.
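The inverse-kinematics idea can be sketched for the simplest case: two rigid segments in a plane, solved with the law of cosines. This is a minimal sketch of the general technique, not any glove vendor's actual algorithm; segment lengths and the planar simplification are assumptions.

```python
import math

# Hedged sketch: two-bone planar inverse kinematics, the same idea a glove
# runtime could use to recover joint angles from a fingertip position.

def two_bone_ik(x, y, l1, l2):
    """Return (base_angle, joint_angle) so segments l1, l2 reach point (x, y)."""
    d2 = x * x + y * y
    d = math.sqrt(d2)
    assert abs(l1 - l2) <= d <= l1 + l2, "target out of reach"
    # Law of cosines gives the bend at the middle joint...
    cos_joint = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    joint = math.acos(max(-1.0, min(1.0, cos_joint)))
    # ...and the base angle is the direction to the target minus the
    # offset introduced by the bent second segment.
    base = math.atan2(y, x) - math.atan2(l2 * math.sin(joint),
                                         l1 + l2 * math.cos(joint))
    return base, joint
```

For a fully extended finger pointing along the x-axis, both angles come out as zero, which is a quick sanity check on the solver.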
Dexmo. This solution is ideal for detecting volumes; it looks bulky, but it is hygienic for multiple users and the sensation of volume is convincing.
Plexus VR/AR. Very practical gloves, able to identify the location of the fingers, with an LRA (linear resonant actuator) on each fingertip. For positional tracking an extra component is necessary, purchased separately along with the VR headset.
VRfree. An ideal solution for mobility: lightweight and needing no external sensors. Unfortunately, they don't provide any kind of volume or texture feedback.
Hi5. Already on the market. Precision is good, but haptic feedback is limited. Although the material is antibacterial, use by multiple users will not be pleasant (the feeling of moisture from sweat); wearing plastic gloves underneath is even more unpleasant.
Senso VR. A complete solution requiring no external adapters. Basic haptic sensation (no vibration at the fingertips) and very good battery life (an estimated 10 hours).
HaptX. Probably the most complete (and complex) solution. The microfluidic system that generates “bubbles” is spectacular. You can't buy it yet, but it looks like a serious company.
LeapMotion ($74). A practical, well-tested solution; with the driver update (the Orion project) it reaches reasonable accuracy.
It uses two infrared cameras to analyze the hands and identify their position. Compared to gloves, the precision is medium.
The biggest problem is that it gives no feedback at all, which users find strange, especially when they click on a virtual UI.
See. These devices have two parts: visualization and sensors. The sensors locate the position of the headset and the controllers; this is called odometry. Headsets require a high refresh rate (around 90 fps) to avoid dizziness.
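The 90 fps figure implies a hard time budget for the renderer, which a quick calculation makes concrete (the 90 fps target is from the text; the budget arithmetic is just the reciprocal):

```python
# Hedged sketch: the per-frame time budget implied by a 90 fps refresh target.
FPS = 90
budget_ms = 1000.0 / FPS
print(round(budget_ms, 2))  # ~11.11 ms to render each frame, for both eyes
```

Missing that budget means dropped frames, and in VR dropped frames translate directly into the dizziness mentioned above.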
+VR. We will classify VR into three categories: on one hand, headsets connected to devices (consoles or computers); on the other, standalone headsets that integrate the playback device; and finally “inventions”, various solutions that mainly use smartphones as screens.
++VR Connected. The evolution of graphics cards has been incredible, especially in the case of NVIDIA, which pulls the whole industry along.
Connected VR headsets use a cable that carries the video signal plus a USB connection for the odometry information.
Oculus Rift. It has been a long road since 2013. The first commercial system (preceded by Oculus's DK1 and DK2 development kits), its main feature is the tracking (odometry) system, which uses a single sensor “camera”. It is oriented to the desk: it can position the headset, but with many limitations, occlusions, and signal losses.
HTC VIVE. The second great revolution: the “desk” is abandoned. With the HTC Vive, the tracked space grows to approximately 5x5 m. It uses two IR laser base stations (Lighthouse) that determine the location with extreme precision at all times. HTC partnered with Steam, the online video game distribution platform, and as a result the developer ecosystem grew very fast.
Oculus Rift S. Not yet commercially released; it improves the odometry system, which is now closer to SLAM, using five cameras to position itself, much like Windows Mixed Reality.
Oculus Gear VR. Samsung did a joint venture with Oculus, using the graphical capabilities, sensors, and high-quality screens of Samsung's high-end phones; Oculus contributed its expertise in VR development.
The device itself is practically a plastic casing attached to two lenses; the smartphone is mounted inside.
Oculus Go. A completely autonomous solution from Oculus; no smartphone needed. Basically, the headset runs an Android system without a SIM card.
HTC Vive Focus. HTC's product to compete with other standalone systems.
It is important to emphasize the quality of the positioning system.
Differences between odometry systems:
- LED + camera: oriented to the desktop, since the camera must be able to see the LEDs on the headset.
- Spatial (IR lasers): requires two IR laser emitters and lets users walk a detection zone of between 5x5 m and 10x10 m.
- SLAM with integrated cameras: detects variations in the room; it can lag a little, but is generally accurate.
- Inertial: only identifies orientation; you cannot move away from the monitor or walk around.
- Ultrasonic: ultrasonic frequencies are more accurate than SLAM.
All AR headsets have a portable vocation, so by definition all are standalone. To judge the quality of these devices, it is necessary to understand a little about optics and how the devices work.
Any AR system must have:
+ Cameras or sensors to interpret the environment (SLAM).
+ A portable processing system.
+ Controllers or a hand-tracking system.
Screen, stereoscope, and lightfield (one, two, and many views per pixel).
On a normal screen or projection, each pixel usually corresponds to one color; in a stereoscopic projection, each pixel corresponds to two colors, two colored points (we use glasses to separate the left eye from the right). In a lightfield, each point is a tiny screen in itself.
This reveals a basic problem, the amount of information: 24 bits, 48 bits, and 393,216 bits (a 128x128 matrix of 24-bit values) per pixel, respectively.
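The three per-pixel figures can be reproduced with a line of arithmetic each (the 128x128 matrix size is the one stated above; the 24-bit RGB assumption is standard):

```python
# Hedged sketch: reproducing the per-pixel data sizes for a flat screen,
# a stereoscopic display, and a lightfield display.

BITS_PER_COLOR = 24  # 8 bits each for R, G, B

flat = BITS_PER_COLOR                     # ordinary screen: one color per pixel
stereo = 2 * BITS_PER_COLOR              # one color per eye
lightfield = 128 * 128 * BITS_PER_COLOR  # a tiny 128x128 image per "pixel"

print(flat, stereo, lightfield)  # 24 48 393216
```

The jump from 48 to 393,216 bits per pixel shows why lightfield displays are still so hard to build and feed with data.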
One of the advantages of the lightfield is that no focusing system is needed: by default, every image includes “all” possible focal points. Images taken with a Lytro camera are an example of this.
Right now you are looking at this article on your smartphone or computer. Keep looking at the screen: everything around the screen (make an effort, don't move your eyes) is out of focus. The eye can refocus at will, completely independently.
Magic Leap. This focusing capacity is what determines the quality of the displays. Magic Leap offers only two focal planes, near and far, but it is the first “commercial” system that allows this at all (the future Creal3D or LightField Lab may follow). The result is less fatigue from continued use.
HoloLens 2. We jump directly to the latest version: first, you can already order it from the Microsoft Store; second, compared with Magic Leap, HoloLens 2 is better.
Here we have to mention another important feature: the FOV (field of view). AR glasses are practically transparent, but the “digital image” appears only within a specific angle. Magic Leap has a diagonal FOV of 40º, HoloLens 2 of 52º; the difference seems small, but it improves the experience significantly.
Another advantage is the quality of Microsoft's software: with more experience than Magic Leap, it offers from the beginning a much smoother and more finished user (or in this case, developer) experience.