How to Create a 360 Video Player with OpenGL ES 3.0 and GLKit on iOS

Hanton Yang
8 min read · Mar 1, 2018


360 videos are recordings in which every direction is captured simultaneously. During playback, the viewer controls the viewing direction, much like a panorama (Wikipedia). The format is becoming increasingly popular, appearing in Facebook's News Feed, YouTube's 360 channels, and news apps such as The New York Times and The Wall Street Journal. In this tutorial, you'll learn how to create a 360 video player from scratch using OpenGL ES 3.0 and GLKit. Since the implementation relies heavily on OpenGL, which is cross-platform, the app can potentially be ported to other platforms such as Android, Windows, or even the Web (WebGL). Along the way, you'll learn:

  • How to draw geometries programmatically in iOS with GLKit
  • How to interact with OpenGL geometries
  • How to use video frames as OpenGL textures

Can’t wait to get started? Let’s dive in! :)

Getting Started

Start by checking out the Fisheye project and opening Fisheye.xcodeproj in Xcode. In the Project navigator on the left-hand side of the Xcode window, you'll see the demo.m4v 360 video that you'll eventually display in the app. There are also four folders: Main, Shader, Model, and Extension. You can ignore these for now. Build and run the app; you will see a colorful sphere rotating.

At this stage, you only hear the audio of the 360 video. Don’t worry; you’ll see the video soon.

How the Sphere Is Drawn

OpenGL builds surfaces out of triangles: by connecting vertices you form triangles, and with enough triangles you can approximate a sphere that looks smooth.

In this project, you use the sample code esShapes.c from the OpenGL ES 3.0 Programming Guide, which generates a sphere's vertices, texture coordinates, and indices programmatically. Because esShapes is written in C, you need to import its header file in Go360-Bridging-Header.h to call it from Swift. Check out Sphere.swift in the Model folder:

import GLKit

class Sphere {
    // 1
    var vertices: UnsafeMutablePointer<GLfloat>?
    var texCoords: UnsafeMutablePointer<GLfloat>?
    var indices: UnsafeMutablePointer<GLushort>?
    var vertexCount: GLint = 0
    var indexCount: GLint = 0

    // 2
    init() {
        let sliceCount: GLint = 200
        let radius: GLfloat = 1.0
        vertexCount = (sliceCount / 2 + 1) * (sliceCount + 1)
        indexCount = esGenSphere(sliceCount, radius, &vertices, &texCoords, &indices)
    }
}
  1. Here, you declare variables for the sphere, including its vertex array, texture coordinate array, index array, and the number of vertices and indices.
  2. You generate a sphere with 200 slices and a radius of 1.0 using the esGenSphere function. This function provides all the data (vertices, texture coordinates, indices) needed to draw the sphere in OpenGL ES. This is how you draw a sphere with code!
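
If you're curious what esGenSphere does internally, here is a minimal Swift sketch of the same idea: sweep latitude and longitude angles over the sphere, convert each pair from spherical to Cartesian coordinates, and record a matching texture coordinate. The function below is illustrative only (the real C implementation in esShapes.c also produces indices and differs in detail), so there's no need to add it to the project.

import GLKit

// Illustrative sketch: positions and texture coordinates for a sphere,
// built by sweeping latitude (0...π) and longitude (0...2π).
func makeSphere(slices: Int, radius: Float) -> (positions: [GLfloat], texCoords: [GLfloat]) {
    var positions: [GLfloat] = []
    var texCoords: [GLfloat] = []
    let parallels = slices / 2
    for i in 0...parallels {                       // latitude rings
        let theta = Float(i) * .pi / Float(parallels)
        for j in 0...slices {                      // longitude segments
            let phi = Float(j) * 2 * .pi / Float(slices)
            // Spherical to Cartesian coordinates
            positions += [radius * sin(theta) * sin(phi),
                          radius * cos(theta),
                          radius * sin(theta) * cos(phi)]
            // Equirectangular texture coordinate: u follows longitude, v follows latitude
            texCoords += [Float(j) / Float(slices),
                          Float(i) / Float(parallels)]
        }
    }
    return (positions, texCoords)
}

Note that the loop produces (slices / 2 + 1) * (slices + 1) vertices, which is exactly the vertexCount computed in Sphere's initializer.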

Make the Sphere Interactive

Next, you’ll make the sphere interactive with your fingers.

First, add the following code to VideoViewController.swift:

private var rotationX: Float = 0.0
private var rotationY: Float = 0.0

Declare two new private variables to store the rotation angles along the X and Y axes. Next, add the following method:

override func touchesMoved(_ touches: Set<UITouch>, with event: UIEvent?) {
    // 1
    let radiansPerPoint: Float = 0.005
    let touch = touches.first!
    let location = touch.location(in: touch.view)
    let previousLocation = touch.previousLocation(in: touch.view)
    var diffX = Float(location.x - previousLocation.x)
    var diffY = Float(location.y - previousLocation.y)
    // 2
    diffX *= -radiansPerPoint
    diffY *= -radiansPerPoint
    // 3
    rotationX += diffY
    rotationY += diffX
}

Here’s a step-by-step explanation:

  1. Calculate the touch distance along the X and Y axes.
  2. For every point the user drags, rotate the sphere by 0.005 radians. The value is negated so the rotation direction matches the drag, giving a natural, real-world feel.
  3. The x-axis is horizontal across the screen, and the y-axis is vertical. Therefore, dragging left to right (diffX) rotates around the y-axis (rotationY), and dragging up and down (diffY) rotates around the x-axis (rotationX). An optional refinement to this step is sketched right after this list.
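
One optional refinement, not part of the original project: clamp the vertical rotation so the camera can't flip over the poles. If you want that behavior, you could add a line like the following (hypothetical) at the end of touchesMoved:

// Optional, hypothetical: keep the vertical rotation within ±90 degrees
// so the view never flips upside down past the poles.
rotationX = max(-.pi / 2, min(.pi / 2, rotationX))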

To apply these finger rotations to the sphere’s rotation, add the following code to glkViewControllerUpdate(_ controller:):

// Update the model view projection matrix
renderer?.updateModelViewProjectionMatrix(rotationX, -rotationY)

In Renderer.swift, modify the updateModelViewProjectionMatrix(_:_:) method:

// 1
func updateModelViewProjectionMatrix(_ rotationX: Float, _ rotationY: Float) {
    let aspect = fabs(Float(UIScreen.main.bounds.size.width) / Float(UIScreen.main.bounds.size.height))
    let nearZ: Float = 0.1
    let farZ: Float = 100.0
    let fieldOfViewInRadians = GLKMathDegreesToRadians(fieldOfView)
    let projectionMatrix = GLKMatrix4MakePerspective(fieldOfViewInRadians, aspect, nearZ, farZ)
    var modelViewMatrix = GLKMatrix4Identity
    modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, 0.0, 0.0, -2.0)
    // 2
    modelViewMatrix = GLKMatrix4RotateX(modelViewMatrix, rotationX)
    modelViewMatrix = GLKMatrix4RotateY(modelViewMatrix, rotationY)
    modelViewProjectionMatrix = GLKMatrix4Multiply(projectionMatrix, modelViewMatrix)
}
  1. Add parameters for the finger rotations.
  2. Apply the finger rotations along the X and Y axes to the model view matrix.

Build and run the app; you should see the sphere rotate as you drag on the screen.

So far so good :)

Project the Video onto the Sphere

The next step is to project the video frame onto the sphere model using Equirectangular projection.

Equirectangular Projection
Spherical Mapping
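
The key idea behind equirectangular projection: the video frame's horizontal axis covers the full 360° of longitude, and its vertical axis covers the 180° from pole to pole, so every point on the sphere maps to a normalized (u, v) texture coordinate. The sphere generated earlier already bakes these coordinates in, so the snippet below is just for illustration (the function name is made up):

// Illustrative only: how an equirectangular frame maps onto the sphere.
// Longitude is in 0...2π, latitude in 0...π measured from one pole.
func equirectangularTexCoord(longitude: Float, latitude: Float) -> (u: Float, v: Float) {
    let u = longitude / (2 * .pi)   // 0...1 across the frame's width
    let v = latitude / .pi          // 0...1 across the frame's height
    return (u, v)
}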

Move the Eyes to the Center of the Sphere

In Renderer.swift, modify updateModelViewProjectionMatrix(_:_:) to:

func updateModelViewProjectionMatrix(_ rotationX: Float, _ rotationY: Float) {
    let aspect = fabs(Float(UIScreen.main.bounds.size.width) / Float(UIScreen.main.bounds.size.height))
    let nearZ: Float = 0.1
    let farZ: Float = 100.0
    let fieldOfViewInRadians = GLKMathDegreesToRadians(fieldOfView)
    let projectionMatrix = GLKMatrix4MakePerspective(fieldOfViewInRadians, aspect, nearZ, farZ)
    var modelViewMatrix = GLKMatrix4Identity
    // Comment out this line
    //modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, 0.0, 0.0, -2.0)
    modelViewMatrix = GLKMatrix4RotateX(modelViewMatrix, rotationX)
    modelViewMatrix = GLKMatrix4RotateY(modelViewMatrix, rotationY)
    modelViewProjectionMatrix = GLKMatrix4Multiply(projectionMatrix, modelViewMatrix)
}

Simply comment out the translation line. With the camera no longer pushed back, the eye sits at the origin, which is the center of the sphere. Build and run the app, and you should now be viewing the scene from inside the sphere.

Get Pixel Buffer from the Video Player

In VideoPlayer.swift, add the following method:

private func configureOutput(framesPerSecond: Int) {
    // 1
    let pixelBuffer = [kCVPixelBufferPixelFormatTypeKey as String:
        NSNumber(value: kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange)]
    output = AVPlayerItemVideoOutput(pixelBufferAttributes: pixelBuffer)
    // 2
    output.requestNotificationOfMediaDataChange(withAdvanceInterval: 1.0 / TimeInterval(framesPerSecond))
    avPlayerItem.add(output)
}
  1. Set the pixel buffer format to kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, which represents Bi-Planar Component Y’CbCr 8-bit 4:2:0, video-range (luma=[16,235], chroma=[16,240]).
  2. Update the video player's frame rate to match the GLKViewController's frame rate.

Next, add one more method:
func retrievePixelBuffer() -> CVPixelBuffer? {
    // Returns the pixel buffer of the current video frame.
    let pixelBuffer = output.copyPixelBuffer(forItemTime: avPlayerItem.currentTime(), itemTimeForDisplay: nil)
    return pixelBuffer
}
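
For context, here is a rough sketch of how the rest of VideoPlayer might be wired together. The property names (avPlayer, avPlayerItem, output) follow the snippets above, but the initializer and the rest of the class are assumptions; check the class in the starter project for the real details.

import AVFoundation

class VideoPlayer {
    private let avPlayerItem: AVPlayerItem
    private let avPlayer: AVPlayer
    private var output: AVPlayerItemVideoOutput!

    // Hypothetical initializer; the starter project's may differ.
    init(url: URL, framesPerSecond: Int) {
        avPlayerItem = AVPlayerItem(url: url)
        avPlayer = AVPlayer(playerItem: avPlayerItem)
        configureOutput(framesPerSecond: framesPerSecond)
    }

    func play() {
        avPlayer.play()
    }

    // configureOutput(framesPerSecond:) and retrievePixelBuffer() as shown above.
}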

To pass the video frame to the renderer, open VideoViewController.swift and add the following code to glkView(_:drawIn:) before renderer?.render():

// Retrieve the video pixel buffer
guard let pixelBuffer = videoPlayer?.retrievePixelBuffer() else { return }
// Update the OpenGL ES texture by using the current video pixel buffer
renderer?.updateTexture(pixelBuffer)

Use the 360 Video Frame as a Texture

Add the following variables to Renderer.swift:

var lumaTexture: CVOpenGLESTexture?
var chromaTexture: CVOpenGLESTexture?
var videoTextureCache: CVOpenGLESTextureCache?

Implement the updateTexture(_ pixelBuffer:) method:

func updateTexture(_ pixelBuffer: CVPixelBuffer) {
    // 1
    if videoTextureCache == nil {
        let result = CVOpenGLESTextureCacheCreate(kCFAllocatorDefault, nil, context, nil, &videoTextureCache)
        if result != kCVReturnSuccess {
            print("CVOpenGLESTextureCacheCreate failure \(result)")
            return
        }
    }
    let textureWidth = GLsizei(CVPixelBufferGetWidth(pixelBuffer))
    let textureHeight = GLsizei(CVPixelBufferGetHeight(pixelBuffer))
    var result: CVReturn
    // 2
    cleanTextures()
    // 3
    glActiveTexture(GLenum(GL_TEXTURE0))
    result = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                          videoTextureCache!,
                                                          pixelBuffer,
                                                          nil,
                                                          GLenum(GL_TEXTURE_2D),
                                                          GL_LUMINANCE,
                                                          textureWidth,
                                                          textureHeight,
                                                          GLenum(GL_LUMINANCE),
                                                          GLenum(GL_UNSIGNED_BYTE),
                                                          0,
                                                          &lumaTexture)
    if result != kCVReturnSuccess {
        print("CVOpenGLESTextureCacheCreateTextureFromImage failure \(result)")
        return
    }
    glBindTexture(CVOpenGLESTextureGetTarget(lumaTexture!), CVOpenGLESTextureGetName(lumaTexture!))
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MIN_FILTER), GL_LINEAR)
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MAG_FILTER), GL_LINEAR)
    glTexParameterf(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_S), GLfloat(GL_CLAMP_TO_EDGE))
    glTexParameterf(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_T), GLfloat(GL_CLAMP_TO_EDGE))
    // 4
    glActiveTexture(GLenum(GL_TEXTURE1))
    result = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                          videoTextureCache!,
                                                          pixelBuffer,
                                                          nil,
                                                          GLenum(GL_TEXTURE_2D),
                                                          GL_LUMINANCE_ALPHA,
                                                          textureWidth / 2,
                                                          textureHeight / 2,
                                                          GLenum(GL_LUMINANCE_ALPHA),
                                                          GLenum(GL_UNSIGNED_BYTE),
                                                          1,
                                                          &chromaTexture)
    if result != kCVReturnSuccess {
        print("CVOpenGLESTextureCacheCreateTextureFromImage failure \(result)")
        return
    }
    glBindTexture(CVOpenGLESTextureGetTarget(chromaTexture!), CVOpenGLESTextureGetName(chromaTexture!))
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MIN_FILTER), GL_LINEAR)
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MAG_FILTER), GL_LINEAR)
    glTexParameterf(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_S), GLfloat(GL_CLAMP_TO_EDGE))
    glTexParameterf(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_T), GLfloat(GL_CLAMP_TO_EDGE))
}
  1. Create the CVOpenGLESTextureCache if it does not exist.
  2. Clean up current textures before updating them.
  3. Map the luma plane of the 420v buffer as a source texture.
  4. Map the chroma plane of the 420v buffer as a source texture.

You also need to implement cleanTextures():
private func cleanTextures() {
    if lumaTexture != nil {
        lumaTexture = nil
    }
    if chromaTexture != nil {
        chromaTexture = nil
    }
    if let videoTextureCache = videoTextureCache {
        CVOpenGLESTextureCacheFlush(videoTextureCache, 0)
    }
}

The Fragment Shader

Since the video is encoded in the Y'CbCr color format and OpenGL ES renders in RGB, you need to convert the texture's colors to RGB. You'll use the transform matrix for ITU-R BT.709, the standard for HDTV. A fragment shader is the shader stage that processes each fragment produced by rasterization into a set of colors and a single depth value (OpenGL Wiki); here, you'll use it to perform the color conversion. Modify fragmentShader.glsl as follows:

#version 300 es

precision mediump float;

uniform sampler2D samplerY;
uniform sampler2D samplerUV;

in vec2 textureCoordinate;
out vec4 fragmentColor;

void main() {
    mediump vec3 yuv;
    lowp vec3 rgb;
    // 1
    yuv.x = texture(samplerY, textureCoordinate).r - (16.0 / 255.0);
    yuv.yz = texture(samplerUV, textureCoordinate).ra - vec2(128.0 / 255.0, 128.0 / 255.0);
    rgb = mat3(1.164, 1.164, 1.164,
               0.0, -0.213, 2.112,
               1.793, -0.533, 0.0) * yuv;
    fragmentColor = vec4(rgb, 1);
}

  1. Use the ITU-R BT.709 transform matrix to convert each pixel from YCbCr to RGB for rendering. Note that GLSL's mat3 constructor fills the matrix in column-major order, so the three columns hold the contributions of Y, Cb, and Cr to each RGB channel.

In Shader.swift, add two more variables to the class:

var samplerY = GLuint()
var samplerUV = GLuint()

In the init() method, add the following code:

samplerY = GLuint(glGetUniformLocation(program, "samplerY"))
samplerUV = GLuint(glGetUniformLocation(program, "samplerUV"))

This code looks up the locations of the samplerY and samplerUV uniforms in the compiled shader program so they can be set from Swift. In Renderer.swift, add the following code before glUniformMatrix4fv(shader.modelViewProjectionMatrix, 1, GLboolean(GL_FALSE), modelViewProjectionMatrix.array):

// Point samplerY at texture unit 0 and samplerUV at texture unit 1,
// matching the glActiveTexture calls in updateTexture(_:)
glUniform1i(GLint(shader.samplerY), 0)
glUniform1i(GLint(shader.samplerUV), 1)

Build and run the app. You should now see a 360 video playing, and you can drag to view different angles. Well done!

Where to Go From Here?

If you want to explore OpenGL ES and GLKit on iOS further, start with Apple's OpenGL ES Programming Guide. It provides a comprehensive explanation of how to use OpenGL ES on iOS and covers additional details you should consider before releasing an app that uses it.

Congratulations on creating your 360 video player! To enhance your player, consider adding features such as:

  • Device Motion Mode: watch different areas of the video by moving your phone (sensor fusion); a rough sketch follows below
  • A Metal version of the player
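
As a taste of the first item, here is a rough Core Motion sketch of how device attitude could drive the same rotation variables that touch already updates. This is a hypothetical starting point, not code from the project; a real implementation would fuse motion with touch input and account for interface orientation.

import CoreMotion

// Hypothetical addition to VideoViewController
private let motionManager = CMMotionManager()

private func startDeviceMotion() {
    guard motionManager.isDeviceMotionAvailable else { return }
    motionManager.deviceMotionUpdateInterval = 1.0 / 60.0
    motionManager.startDeviceMotionUpdates(to: .main) { [weak self] motion, _ in
        guard let self = self, let attitude = motion?.attitude else { return }
        // Naive mapping: pitch drives the vertical rotation, roll the horizontal one.
        self.rotationX = Float(attitude.pitch)
        self.rotationY = Float(attitude.roll)
    }
}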

A full tutorial covering these features would be too long for this guide, so stay tuned for future tutorials on 360 video players. I hope you enjoyed this one!
