iOS Seamless Video Chunks
Recently, I created a video recording solution for iOS that recorded video in 5-second chunks and then assembled them into a single video once they were uploaded to the server. Everything appeared to be working correctly until some of our users pointed out that the audio and video fell out of sync while watching the videos. Seeking would bring the audio back in sync, but that isn't really a sensible fix.
My naive approach to splitting the video used an NSTimer: every time it ticked, it created a new AVAssetWriter and switched it out.
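A minimal sketch of that naive rotation, assuming a 5-second interval; the class name, the `rotateWriter` helper, and the upload comment are my reconstruction, not the original code:

```swift
import AVFoundation

// Naive chunking: every 5 seconds, finish the current writer and
// immediately switch to a fresh one. Samples that arrive around the
// switch-over can land on the wrong side of the boundary.
final class ChunkedRecorder {
    private var assetWriter: AVAssetWriter?
    private var chunkIndex = 0
    private var timer: Timer?

    func start() {
        timer = Timer.scheduledTimer(withTimeInterval: 5.0, repeats: true) { [weak self] _ in
            self?.rotateWriter()
        }
    }

    private func rotateWriter() {
        // Finish the old chunk…
        assetWriter?.finishWriting { /* upload the finished chunk */ }
        // …and immediately swap in a new writer for the next chunk.
        chunkIndex += 1
        let url = FileManager.default.temporaryDirectory
            .appendingPathComponent("chunk-\(chunkIndex).mov")
        assetWriter = try? AVAssetWriter(outputURL: url, fileType: .mov)
        // (The audio and video AVAssetWriterInputs would be re-added here.)
    }
}
```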
I knew I was probably missing something, and I became paranoid that I was losing samples that were supposed to go to the old asset writer: we have two streams coming in, the writer gets switched out at some arbitrary point, and the very next sample we receive starts the new writer's session.
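In the capture callback, whichever stream delivers the first sample after the switch ends up defining the new session's start time. A sketch of that, using the standard AVFoundation delegate method; the surrounding class and property are my assumption:

```swift
import AVFoundation

final class Recorder: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    var assetWriter: AVAssetWriter?  // the writer currently being written to

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let writer = assetWriter else { return }
        if writer.status == .unknown {
            // The first sample (audio OR video) to arrive after the
            // switch defines the session's start time.
            writer.startWriting()
            writer.startSession(atSourceTime: CMSampleBufferGetPresentationTimeStamp(sampleBuffer))
        }
        // …append sampleBuffer to the matching AVAssetWriterInput…
    }
}
```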
Apple's documentation regarding the QuickTime movie format notes specifically: "Samples with timestamps before startTime will still be added to the output media but will be edited out of the movie". I interpreted this to mean that after switching to the new asset writer and calling startSession(atSourceTime:) with the timestamp of the first sample from either stream, any sample whose presentation time is earlier than that one would be ignored. I figured that was likely to occur at some point, and came to believe it was causing all of my issues.
To address this, I was afraid I would need a major rework of my implementation: keeping track of old asset writers and continuing to write samples to them, rather than simply switching to the new one. Then I noticed the documentation says the sample will still be added to the output media, just edited out of the movie. I started to wonder what exactly edits out that sample, and whether FFmpeg's concat functionality might do anything to help, when I noticed the following error was output a number of times during concatenation:
Packet with invalid duration -64 in stream 0
Stream 0 was always the video, and I knew that most video players synchronize the audio to the video, so I started to wonder: if I always started the new asset writer on a video sample, might FFmpeg's concat functionality do some magic to help me out? I modified the timer above so that it no longer switched out the asset writer itself, but instead staged it in a new variable.
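A hedged sketch of that change, assuming a `pendingAssetWriter` property alongside the active writer (the names are mine):

```swift
import AVFoundation

final class ChunkedRecorder {
    private var assetWriter: AVAssetWriter?         // currently receiving samples
    private var pendingAssetWriter: AVAssetWriter?  // staged by the timer
    private var chunkIndex = 0

    // Called by the 5-second timer: stage the next writer instead of
    // swapping it in immediately. The active writer keeps receiving
    // samples until a video sample triggers the switch.
    private func stageNextWriter() {
        chunkIndex += 1
        let url = FileManager.default.temporaryDirectory
            .appendingPathComponent("chunk-\(chunkIndex).mov")
        pendingAssetWriter = try? AVAssetWriter(outputURL: url, fileType: .mov)
    }
}
```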
Then, whenever I got a video sample and a new asset writer was ready for me, I would switch it out.
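Something along these lines, under the same assumed property names; the upload comment stands in for whatever the finished chunk's handling was:

```swift
import AVFoundation

final class ChunkedRecorder {
    private var assetWriter: AVAssetWriter?         // currently receiving samples
    private var pendingAssetWriter: AVAssetWriter?  // staged by the timer

    // On each *video* sample: if a staged writer is waiting, finish the
    // old chunk and start the new session at this sample's timestamp,
    // so no video packet ever lands before its session's start time.
    func handleVideoSample(_ sampleBuffer: CMSampleBuffer) {
        if let next = pendingAssetWriter {
            assetWriter?.finishWriting { /* upload the finished chunk */ }
            assetWriter = next
            pendingAssetWriter = nil
            next.startWriting()
            next.startSession(atSourceTime: CMSampleBufferGetPresentationTimeStamp(sampleBuffer))
        }
        // …append sampleBuffer to the video AVAssetWriterInput as usual…
    }
}
```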
That way the video packets would never have a negative presentation timestamp. I crossed my fingers and hoped for the best.
I was pleasantly shocked to hear the audio play back seamlessly!
To verify that FFmpeg was magically solving my issue, I took two files, one of which had audio packets with a negative presentation timestamp, and concatenated them. I then ran ffprobe and counted the packets before and after. Lo and behold, they were all there. FFmpeg has once again saved the day!
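The check can be reproduced with something along these lines; the file names are placeholders, but the concat demuxer and ffprobe's `-count_packets` option are standard:

```shell
# List the chunks for FFmpeg's concat demuxer (placeholder file names).
printf "file 'chunk-1.mov'\nfile 'chunk-2.mov'\n" > list.txt
ffmpeg -f concat -safe 0 -i list.txt -c copy combined.mov

# Count audio packets in each input and in the result; the per-chunk
# totals should sum to the combined total if nothing was dropped.
ffprobe -v error -select_streams a -count_packets \
        -show_entries stream=nb_read_packets -of csv=p=0 chunk-1.mov
ffprobe -v error -select_streams a -count_packets \
        -show_entries stream=nb_read_packets -of csv=p=0 chunk-2.mov
ffprobe -v error -select_streams a -count_packets \
        -show_entries stream=nb_read_packets -of csv=p=0 combined.mov
```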