Vulkan on Android 6 — VR Stereo Rendering Part 2: Focal Length and Stereo Camera Modes

Jin-Long Wu
Jan 16, 2019

Photo by Paul Gaudriault on Unsplash

When implementing stereo viewing, we need to specify a value for the focal length. But in common graphics APIs, we don’t find any function with a parameter for it. We might have wondered why, googled it, and found that, yes, we don’t need a parameter specifying focal length at all. Then why do we need focal length in stereo viewing? That’s what we are going to talk about.

Focal Length is Hidden

In modern graphics APIs we have to construct our own transform matrices if we don’t use a library. But in the old days we had APIs such as gluLookAt, gluPerspective, glFrustum, glOrtho, D3DXMatrixLookAt*H, D3DXMatrixPerspectiveFov*H, etc. They may specify the field of view or the boundaries of the image/canvas/near clipping plane of the viewing frustum, but not focal length.
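To make this concrete, here is a minimal Python sketch of the matrix that the gluPerspective documentation describes. The function names and the row-major nested-list layout are my own choices for illustration, and the angle is taken in radians rather than degrees. Notice that the parameters are the vertical field of view, the aspect ratio, and the near/far distances; focal length never appears by name.

```python
import math

def perspective(fov_y, aspect, z_near, z_far):
    """Row-major 4x4 matrix following the gluPerspective formula (fov_y in radians)."""
    f = 1.0 / math.tan(fov_y / 2.0)  # cot(fov_y / 2)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (z_far + z_near) / (z_near - z_far),
         2.0 * z_far * z_near / (z_near - z_far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def to_ndc(matrix, point):
    """Transform a camera-space point (x, y, z) and apply the perspective divide."""
    x, y, z = point
    clip = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix]
    return [c / clip[3] for c in clip[:3]]

m = perspective(math.radians(90.0), 1.0, 1.0, 10.0)
top_near = to_ndc(m, (0.0, 1.0, -1.0))    # top edge of the frustum on the near plane
top_far = to_ndc(m, (0.0, 10.0, -10.0))   # top edge of the frustum on the far plane
```

Both sample points land on the top edge of NDC space (y = 1). The only place the field of view enters is f = cot(fov_y / 2), which is exactly the focal length of a pinhole camera whose image plane has a half-height of 1.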

The Pinhole Camera Model

This great video demonstrates how a pinhole camera works. For a pinhole camera, focal length f is the distance from the pinhole to the image plane. θ is the vertical field of view, also called the angle of view. The aperture is where light enters, creating an image on the film. I’m deliberately using “film size” rather than “image size” since I’m focusing on the structure of a real camera.

Fig. 1

For CG rendering, we can conceptually invert the pyramid so that it sits on the same side as the object. That way the projected image is no longer flipped in every direction.

Fig. 2 Inverted projection image. The blue line is the inverted film.

So now, we return to our familiar viewing frustum. It doesn’t matter whether the inverted-pyramid is behind or is in front of the object.

There are horizontal and vertical fields of view, but APIs like gluPerspective and D3DXMatrixPerspectiveFov*H take the field of view to be vertical.

Fig. 3: from https://en.wikipedia.org/wiki/Field_of_view
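If only a horizontal field of view is known, it can be converted before calling such an API. A minimal sketch, assuming a symmetric frustum and an aspect ratio defined as width / height; the helper name is hypothetical:

```python
import math

def horizontal_to_vertical_fov(fov_x, aspect):
    """Convert a horizontal field of view to a vertical one (both in radians).

    aspect is width / height of the image plane.
    """
    # Half-width and half-height at unit distance are tan(fov / 2),
    # and half-height = half-width / aspect.
    return 2.0 * math.atan(math.tan(fov_x / 2.0) / aspect)

fov_x = math.radians(90.0)
fov_y_square = horizontal_to_vertical_fov(fov_x, 1.0)        # square image
fov_y_wide = horizontal_to_vertical_fov(fov_x, 16.0 / 9.0)   # widescreen image
```

For a square image the two angles coincide, and for a widescreen image the vertical angle is smaller than the horizontal one.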

From Fig. 1, focal length f, film size (height) h, and vertical field of view θ are related by tan(θ / 2) = (h / 2) / f, that is, f = h / (2 tan(θ / 2)).

We can see that focal length is determined by the other two factors. Because the viewing frustum, and therefore θ, is fixed, and the film size is unchanged (recall we just inverted the pyramid inside the camera), the focal length does not change. Film size is unrelated to the image plane onto which points of the object are projected. The main difference is that the image plane fits tightly inside the boundaries of the viewing frustum and changes size with its position along the z-axis. We can therefore place the image plane anywhere along the z-axis: the coordinates on the various image planes are proportional to one another and are later remapped to the same NDC coordinates anyway. Thus we do specify focal length indirectly when using those APIs.
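A small Python sketch of both points: the pinhole relation between film height, field of view, and focal length, and the fact that the position of the image plane does not affect the resulting NDC coordinate. The function names and sample values are assumptions chosen for illustration, with the camera looking down the -z axis.

```python
import math

def focal_length(film_height, fov_y):
    """Pinhole-to-film distance for a given film height and vertical FOV (radians)."""
    return film_height / (2.0 * math.tan(fov_y / 2.0))

def project_to_ndc_y(point_y, point_z, fov_y, plane_distance):
    """Project a camera-space point onto an image plane at z = -plane_distance,
    then normalize by that plane's half-height to get the NDC y coordinate."""
    half_height = plane_distance * math.tan(fov_y / 2.0)  # plane fits the frustum
    y_on_plane = point_y * plane_distance / -point_z       # similar triangles
    return y_on_plane / half_height

fov = math.radians(60.0)
# The same point projected onto three differently placed image planes.
ndc = [project_to_ndc_y(0.5, -4.0, fov, d) for d in (0.1, 1.0, 10.0)]
f_example = focal_length(24.0, math.radians(90.0))
```

All three entries of `ndc` agree up to floating-point rounding, because the plane distance cancels out of the ratio.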

Fig. 4 Three image planes embedded inside the viewing frustum, with image sizes proportional to their z coordinates.

Like the image plane, the near and far planes have nothing to do with focal length either. They are just the depth boundaries of our viewing frustum. However, the perspective and frustum APIs of OpenGL and DirectX assume the image plane is located at the z = -1 plane since it simplifies the matrices they generate.

Stereo Viewing

From the previous article, we know that stereopsis needs two camera views. Three methods are used to make the two views converge at a point.

Parallel

Fig. 5

It just shifts the camera views horizontally, so the views converge at infinity. It is useful for objects at effective infinity, like landscapes (or just use one camera, as I said in the previous article). As a result, we cannot see objects behind the projection plane, which means everything has negative parallax. The non-overlapping parts at the edges of the two views also make it difficult for the eyes to fuse them.

Toe-in

Fig. 6

Toe-in rotates the cameras to converge at a point. It approximates how human eyes work, but it causes vertical parallax, which makes the viewer uncomfortable. The closer to the edges of the views, the more severe the discomfort. It is useful when the objects in the scene are very close to the camera.
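The vertical parallax can be checked numerically. The sketch below is an illustration under my own assumptions: the eyes sit on the x-axis, the scene lies along -z, each camera is rotated about the y-axis to aim at a convergence point on the -z axis, and the image plane is at unit distance. The function name and sample values are hypothetical.

```python
import math

def toe_in_projected_y(point, eye_x, convergence_distance):
    """Projected y of `point` for a camera at (eye_x, 0, 0) that is rotated
    about the y-axis to aim at (0, 0, -convergence_distance)."""
    px, py, pz = point
    # Forward vector from the eye toward the convergence point (x and z parts).
    fx, fz = -eye_x, -convergence_distance
    norm = math.hypot(fx, fz)
    # Depth of the point measured along the rotated viewing direction.
    depth = ((px - eye_x) * fx + pz * fz) / norm
    # A rotation about the y-axis leaves y unchanged, so only depth differs per eye.
    return py / depth

half_separation = 0.032
corner_point = (1.0, 1.0, -2.0)   # toward the edge of the view
center_point = (0.0, 1.0, -2.0)   # on the vertical center line

corner_parallax = (toe_in_projected_y(corner_point, half_separation, 2.0)
                   - toe_in_projected_y(corner_point, -half_separation, 2.0))
center_parallax = (toe_in_projected_y(center_point, half_separation, 2.0)
                   - toe_in_projected_y(center_point, -half_separation, 2.0))
```

The point near the edge ends up at a different height in each eye's image, while the point on the center line does not. With parallel cameras the depth would be -z for both eyes, so the difference would vanish everywhere.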

Off-Axis

Fig. 7

Unlike the parallel stereo camera, it converges at a point by cutting off parts of the frustums, and objects can be placed in front of or behind the projection screen. It gives the most comfortable viewing experience, so it’s the recommended option.

By the way, one additional mode we didn’t talk about is the radial stereo camera. It’s a variation of toe-in that differs in where the cameras are placed: in radial mode, the cameras sit on an arc rather than on a line as in toe-in mode. I wonder what it is used for, because there seems to be little difference between the two modes. However, I couldn’t find much information, and only C4D has this mode.

Focal Length

Fig. 8 Center original frustum and shift-right frustum.

We need to specify focal length since we are working with the lenses of an HMD, and it is required to calculate the amount of frustum shift. Now we are going to find the left and right extents of the truncated frustum for the right eye.

  • OB is the focal length.
  • OA is the distance to near plane.
  • OC is half the interocular distance and is equal to EG.
  • DG is the half horizontal extent before truncating the frustum.
  • DI and DF are left and right extent of the truncated frustum, respectively.

AE is equal to DG since we just shifted the frustum horizontally.

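Using similar triangles, and given that the truncated frustum must cover the same window on the focal plane as the centered one, we know

FG / OC = OA / OB

We want to know what DF is, that is

DF = DG - FG = DG - OC * OA / OB

Similarly, -ID

-ID = -(DG + FG) = -(DG + OC * OA / OB)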

For the left eye, the frustum is almost the same; only the signs of some terms differ. We will leave it to the implementation (next article).
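Until then, here is a rough Python sketch of the extents for both eyes. It is not the implementation from the next article. It assumes the usual off-axis construction, in which each eye's frustum is cut so that it covers the same window on the focal plane, giving a near-plane shift of OC * OA / OB. The function name, parameter names, and sample values are hypothetical.

```python
def off_axis_extents(half_width, near, focal_length, eye_separation, eye):
    """Return (left, right) near-plane extents of one eye's truncated frustum.

    half_width     -- DG, half horizontal extent of the untruncated frustum
    near           -- OA, distance to the near plane
    focal_length   -- OB, distance to the plane where the views converge
    eye_separation -- full interocular distance, so OC = eye_separation / 2
    eye            -- +1 for the right eye, -1 for the left eye
    """
    shift = 0.5 * eye_separation * near / focal_length  # OC * OA / OB
    left = -half_width - eye * shift
    right = half_width - eye * shift
    return left, right

half_width, near, focal, separation = 0.1, 0.1, 2.0, 0.064
right_eye = off_axis_extents(half_width, near, focal, separation, +1)
left_eye = off_axis_extents(half_width, near, focal, separation, -1)

def window_on_focal_plane(extents, eye):
    """Scale near-plane extents out to the focal plane, in world x coordinates."""
    eye_x = eye * 0.5 * separation
    return tuple(eye_x + e * focal / near for e in extents)

right_window = window_on_focal_plane(right_eye, +1)
left_window = window_on_focal_plane(left_eye, -1)
```

With these sample values the right eye gets roughly (-0.1016, 0.0984), and the left eye gets the mirrored pair. Both frustums hit the same window on the focal plane, which is what makes the two views converge there.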
