The Video Miner — A Path to Scaling Video Transcoding
Authored by Livepeer Director of Video Product, Philipp Angele.
A new type of mining is coming that gives cryptocurrency miners who use GPUs for hashing an opportunity to earn additional income.
At Livepeer we aim to create an open marketplace for video processing services, a current need for any developer who wants to do video outside of what YouTube or Facebook offer. Startups trying to add video to their apps suffer from high pricing and have to run on a competitor’s infrastructure just to get started.
The Livepeer open marketplace will change this by allowing cryptocurrency miners to rent out the idle capacity of their GPU mining rigs to those who need video processing.
Since the chips on graphics cards used for video encoding (NVDEC+NVENC) are separate from those used for general-purpose computing and crypto mining (CUDA cores), the two can run side by side with only an approximate 10% increase in power usage.
A single Nvidia GTX 1080 card can currently mine only about $1.50 worth of cryptocurrency per day on its programmable compute units. In the same 24 hours it can perform adaptive bitrate encoding on 48 hours of video using its dedicated video decoding and encoding ASICs (NVDEC+NVENC). As an example, Amazon’s AWS Elemental MediaLive charges $3.00 per hour for this, so a single graphics card could encode video streams worth $144.00 every day (48 hours x $3.00) while adding only about 10% to a miner’s operational costs for power consumption.
- $1.50 for hashing cryptocurrencies
- +$144.00 for video compression
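The comparison above can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted in this post; the constant and function names are illustrative and not part of any Livepeer API.

```python
# Figures quoted in the post; names are illustrative only.
MINING_REVENUE_PER_DAY = 1.50   # USD per day from hashing on one GTX 1080
VIDEO_HOURS_PER_DAY = 48        # hours of adaptive bitrate encoding per card per day
CLOUD_PRICE_PER_HOUR = 3.00     # USD per hour, the AWS Elemental MediaLive example


def daily_encoding_value(video_hours=VIDEO_HOURS_PER_DAY,
                         price_per_hour=CLOUD_PRICE_PER_HOUR):
    """Value of the video one card can encode in a day, at the cloud price."""
    return video_hours * price_per_hour


encoding_value = daily_encoding_value()
ratio = encoding_value / MINING_REVENUE_PER_DAY
print(f"Encoding value per day: ${encoding_value:.2f}")
print(f"Multiple of daily hashing revenue: {ratio:.0f}x")
```

The $144.00 is the price a centralized service would charge for the same work, not what a miner would actually earn in a competitive market.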
The large price difference and the underutilization of spare resources are driving Livepeer to create a trustless marketplace in which anyone can sell their video encoding resources at the price they want. There are mining farms that could potentially encode a multiple of the current combined video processing load of YouTube, Facebook, and Twitch.
The protocol economics in Livepeer’s network are designed to get these resource providers to compete against each other on the actual encoding price, resulting in a very cost-effective solution for developers. There’s a lot of room between the $144.00 a developer would have to pay a centralized service for 48 hours of encoding and the $0.15 of incremental electricity cost that a competitive miner would need to cover, so there’s room for both incremental income for the miner and huge cost savings for the developer.
Since miners won’t have to invest heavily in new infrastructure to participate (although more bandwidth may be required) and can leverage their current deployments, we expect that transcoding can become 20 to 50 times less expensive than it is today, making it affordable for everyone to add video to their application.
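To put the 20 to 50 times estimate in concrete terms, the sketch below applies it to the $144.00 daily example and compares the result with the $0.15 electricity figure. It assumes, as the example above does, that both figures refer to one card encoding 48 hours of video per day; the names are illustrative.

```python
# Figures quoted in the post; names are illustrative only.
CLOUD_PRICE_PER_HOUR = 3.00   # USD per hour of encoded video
VIDEO_HOURS_PER_DAY = 48      # hours of video one card encodes per day
ELECTRICITY_PER_DAY = 0.15    # USD of incremental power cost per card per day


def projected_daily_price(reduction_factor):
    """Daily price of one card's output if transcoding gets N times cheaper."""
    return CLOUD_PRICE_PER_HOUR * VIDEO_HOURS_PER_DAY / reduction_factor


for factor in (20, 50):
    price = projected_daily_price(factor)
    margin = price - ELECTRICITY_PER_DAY
    print(f"{factor}x cheaper: developer pays ${price:.2f}, "
          f"miner keeps ${margin:.2f} after electricity")
```

Even at the 50 times reduction, the projected price stays well above the incremental electricity cost, which is the headroom the paragraph above describes.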
How many GPUs are currently mining?
Just looking at the Ethereum network hashrate of about 250,000 GH/s, this maps to approximately 6,250,000 Nvidia GTX 1080 GPUs, each achieving a hashrate of 40 MH/s. At two streams per card, this volume of GPUs could do adaptive bitrate encoding on 12,500,000 concurrent live streams. Alternatively, AMD 480 cards can hash at 20 MH/s and encode one live stream at slightly lower quality. As we’ll see in the case studies below, this still significantly exceeds the demands of even the biggest video sites.
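The conversion from network hashrate to stream capacity is simple enough to check directly. The sketch below uses only the numbers quoted above, plus the two-streams-per-card rate implied by the earlier example of 48 hours of video per 24 hours; the function name is illustrative.

```python
NETWORK_HASHRATE_GHS = 250_000   # Ethereum network hashrate quoted above, in GH/s


def equivalent_cards(network_ghs, card_mhs):
    """Number of cards needed to produce the network hashrate (1 GH/s = 1000 MH/s)."""
    return network_ghs * 1000 // card_mhs


# GTX 1080: 40 MH/s and two concurrent streams per card.
gtx_cards = equivalent_cards(NETWORK_HASHRATE_GHS, 40)
gtx_streams = gtx_cards * 2

# AMD 480: 20 MH/s and one concurrent stream per card.
amd_cards = equivalent_cards(NETWORK_HASHRATE_GHS, 20)
amd_streams = amd_cards * 1

print(f"GTX 1080: {gtx_cards:,} cards, {gtx_streams:,} streams")
print(f"AMD 480:  {amd_cards:,} cards, {amd_streams:,} streams")
```

Under these figures both card types arrive at the same stream capacity, since the AMD card hashes at half the rate and encodes half as many streams.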
Can you trust a decentralized encoding network?
In the world of CPU-based deterministic transcoders, the Livepeer protocol makes use of Truebit to verify that encoding was performed correctly. To validate a GPU video miner’s work, however, Livepeer is designing a passive processing validation mechanism that aims to cope with the non-deterministic nature of ASIC-based video encoding. This is an open research problem that we are approaching through a three-pronged verification function:
Video and Audio Details
One can check the elementary properties of the video and audio streams, such as codec, frame size, frame rate, GOP, and sample rate, with ffprobe, which is part of the FFmpeg family, to see whether the transcoding job was done properly.
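As an illustration of this kind of check, the sketch below asks ffprobe for a JSON description of a file’s streams and compares selected fields against an expected job specification. It is a minimal example and not Livepeer’s verifier: the `check_streams` helper, the job spec layout, and the sample data are hypothetical, and GOP structure is not covered because it requires frame-level inspection.

```python
import json
import subprocess
from fractions import Fraction


def probe(path):
    """Run ffprobe (must be installed) and return its JSON stream description."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def check_streams(info, expected):
    """Return a list of mismatches between probed streams and the job spec."""
    problems = []
    streams = {s.get("codec_type"): s for s in info.get("streams", [])}
    for kind, spec in expected.items():
        stream = streams.get(kind)
        if stream is None:
            problems.append(f"missing {kind} stream")
            continue
        for key, want in spec.items():
            got = stream.get(key)
            if key == "r_frame_rate":
                # ffprobe reports frame rates as fractions such as "30/1".
                same = got is not None and Fraction(got) == Fraction(want)
            else:
                # ffprobe mixes numbers and numeric strings, so compare as text.
                same = str(got) == str(want)
            if not same:
                problems.append(f"{kind}.{key}: expected {want}, got {got}")
    return problems


# Hand-written stand-in shaped like ffprobe's JSON, not real output.
sample = {"streams": [
    {"codec_type": "video", "codec_name": "h264",
     "width": 1280, "height": 720, "r_frame_rate": "30/1"},
    {"codec_type": "audio", "codec_name": "aac", "sample_rate": "48000"},
]}
job_spec = {
    "video": {"codec_name": "h264", "width": 1280, "height": 720, "r_frame_rate": "30"},
    "audio": {"codec_name": "aac", "sample_rate": 48000},
}
print(check_streams(sample, job_spec))  # an empty list means all checked fields match
```

In practice one would call `check_streams(probe("output.mp4"), job_spec)` on each rendition a transcoder returns. A check like this only confirms the container-level properties, which is why it is one prong among several.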
The quality of the video encoding can be probed with VMAF, a quality estimation tool released by Netflix, which gives an indication of whether a transcoder tried to cut corners in the encoding process.
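One way to obtain a VMAF score is through FFmpeg’s `libvmaf` filter, which is available only when FFmpeg is built with libvmaf support. The sketch below just assembles such a command and applies a pass/fail threshold to a score. The helper names and the threshold value are assumptions for illustration, and the filter expects both inputs at the same resolution, so lower-resolution renditions would need scaling first.

```python
def vmaf_command(distorted, reference, log_path="vmaf.json"):
    """Build an ffmpeg command that scores `distorted` against `reference`.

    The libvmaf filter takes the distorted video as its first input and the
    reference as its second, and can write a JSON log of the scores.
    """
    return [
        "ffmpeg", "-i", distorted, "-i", reference,
        "-lavfi", f"libvmaf=log_path={log_path}:log_fmt=json",
        "-f", "null", "-",
    ]


def meets_quality(score, minimum=80.0):
    """Hypothetical acceptance rule: VMAF scores range from 0 to 100."""
    return score >= minimum


cmd = vmaf_command("transcoded.mp4", "source.mp4")
print(" ".join(cmd))
print(meets_quality(91.5), meets_quality(42.0))
```

The right threshold would depend on the target bitrate and resolution of each rendition, so a single fixed value like the one above is only a placeholder.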
Perceptual Video(+Audio) Hashing
Video hashes based on the entropy of the video can be created to check whether the content was tampered with during transcoding, minimizing the risk of someone changing the video content. We are looking into various open source projects that have attempted something similar, though more for content ownership declaration than for making distributed video encoding tamper-proof.
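To make the idea of a perceptual hash concrete, here is a sketch of the average hash (aHash), a simple and widely known technique that is swapped in purely for illustration. It is not the mechanism Livepeer is designing. A frame is modelled as a 2D list of grayscale values, and all names are illustrative.

```python
def average_hash(frame, size=8):
    """Return a size*size-bit hash of a grayscale frame (2D list of 0-255 ints).

    The frame must be at least `size` pixels in each dimension.
    """
    height, width = len(frame), len(frame[0])
    cells = []
    for by in range(size):
        for bx in range(size):
            # Average the pixels of one block to shrink the frame to size x size.
            y0, y1 = by * height // size, (by + 1) * height // size
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # Each bit records whether a block is brighter than the frame average.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 test frames: a horizontal gradient, a slightly brighter copy
# (standing in for encoder noise), and a vertical gradient (different content).
original = [[x * 16 for x in range(16)] for _ in range(16)]
reencoded = [[min(255, v + 2) for v in row] for row in original]
tampered = [[y * 16 for _ in range(16)] for y in range(16)]

print(hamming_distance(average_hash(original), average_hash(reencoded)))
print(hamming_distance(average_hash(original), average_hash(tampered)))
```

A small distance suggests the same picture survived transcoding, while a large one suggests different content. Real encoders introduce more varied distortions than a uniform brightness shift, so a practical scheme would need a tolerance threshold and would compare many frames.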
Livepeer is looking to cooperate with developers or researchers on this open problem and implementation. Please reach out to us if you feel you could contribute in this area.
A fingerprinting mechanism, which combines the above three forms of validation, will allow trustless video processing across many different types of video encoding chips. Nodes can then compete against each other on performance, quality and price to win available business on the network, and users can trust that their videos were encoded correctly.
The resulting network aims to dramatically lower the encoding price while significantly increasing reliability through cost-efficient redundancy.
Case Study: YouTube
Looking at the biggest video platform, youtube.com, various press releases indicate that over 300 hours of content are uploaded every minute.
This sounds like a lot of video, but for a GPU encoding farm, where video encoding chips are sitting idle, it is actually not as much as it may seem.
If one wanted to encode all this content in realtime, assuming it’s all 1080p30 H264/AAC, it would only take:
300 hours * 60 minutes = 18,000 minutes of content arriving every minute
/ 2 minutes of content processed per minute on each card
= 9,000 graphics cards to process all content in realtime.
Of course YouTube’s transcoder is incredibly fast and encodes the source content much faster than realtime, but this gives a sense of how few GPUs are actually needed to encode all this content. Using twice as many graphics cards for the same task would halve the encoding time.
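The estimate above, including the effect of doubling the number of cards, can be written out as follows. This sketch assumes perfect parallelism across cards and uses only the figures from this case study; the function names are illustrative.

```python
import math

UPLOAD_HOURS_PER_MINUTE = 300   # hours of content arriving each minute
CARD_SPEED = 2                  # minutes of content one card encodes per minute


def cards_for_realtime(upload_hours_per_minute, card_speed=CARD_SPEED):
    """Cards needed to keep up with the incoming content."""
    content_minutes = upload_hours_per_minute * 60
    return math.ceil(content_minutes / card_speed)


def encoding_minutes(content_minutes, cards, card_speed=CARD_SPEED):
    """Wall-clock minutes for `cards` cards to encode the content in parallel."""
    return content_minutes / (cards * card_speed)


cards = cards_for_realtime(UPLOAD_HOURS_PER_MINUTE)
backlog = UPLOAD_HOURS_PER_MINUTE * 60   # one minute's worth of uploads
print(f"Cards needed for realtime: {cards:,}")
print(f"Minutes to encode one minute of uploads: {encoding_minutes(backlog, cards)}")
print(f"With twice the cards: {encoding_minutes(backlog, 2 * cards)}")
```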
Case Study: Twitch
Twitch has public statistics that show 55,000 concurrent livestreams at peak.
To encode all of this, Twitch would need about 27,500 GPUs at two streams per card if they wished to encode every stream. However, they only encode streams from users in their partner program or channels with a minimum viewership of approximately 50 viewers (estimated).
Multi-bitrate transcoding for the long tail (streams with just a few viewers, here estimated at fewer than 50) is currently too expensive. However, with a dramatic drop in the price of video processing, long-tail transcoding would become affordable and many streamers could reach a wider audience at various connection speeds.
Is this a threat to current video businesses?
This is not going to replace the current transcoding market focused on enterprise and premium content, but it enables a new market to be created.
Use cases that are more consumer-oriented become affordable to operate at scale without the need to own infrastructure or to use a competitor’s infrastructure. Many aspiring social video startups have told the story of finding what they thought was immediate success at launch, adding hundreds of thousands of users in the first month, only to end up with multi-million dollar streaming bills that drained their funding before they found a working business model.
For infrastructure providers it should lead to a more competitive market in which each is incentivized to perform better than their competitors, in which redundancy is the norm, and in which transcoding is affordable enough that new startups don’t burn through their capital just because they are successful and many people use their offering.
New challenges for the video miners
Bandwidth is needed to do the job! We estimate about 4–8 Mbps in and 4–8 Mbps out per average transcoding job. A mining rig with 8 cards could perform 16 concurrent stream encodings and so would have to support at least 64 Mbps in and 64 Mbps out, and up to 128 Mbps each way at the higher bitrate (plus n% encoding and network congestion overhead).
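The bandwidth requirement scales linearly with the number of cards, as the sketch below shows. The 64 Mbps figure corresponds to the lower estimate of 4 Mbps per stream; at 8 Mbps per stream the same rig would need 128 Mbps in each direction. The function name is illustrative, and the unspecified congestion overhead is left out.

```python
def rig_bandwidth_mbps(cards, streams_per_card=2, mbps_per_stream=(4, 8)):
    """Return (streams, low_mbps, high_mbps) needed in each direction for a rig."""
    streams = cards * streams_per_card
    low, high = mbps_per_stream
    return streams, streams * low, streams * high


streams, low, high = rig_bandwidth_mbps(cards=8)
print(f"{streams} streams need {low} to {high} Mbps in and the same out")
```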
Bandwidth is priced unequally depending on where in the world you are located. A miner in Seattle with a 1 Gbps Google fiber connection would be able to encode 200 streams on their mining rigs and would only have to pay $100 for the bandwidth for the whole month, while a miner in Dubai may have to pay $500 per 10 Mbps for the month. Per Mbps, the miner in Dubai pays 500 times as much for bandwidth as the one in Seattle. Just as they previously sought out affordable electricity, miners will have to find similar advantages in affordable bandwidth.
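The 500-fold difference comes from normalizing both offers to a price per Mbps. A short sketch using the two example prices above, with an illustrative function name:

```python
def cost_per_mbps(monthly_price_usd, link_mbps):
    """Monthly price of one Mbps of capacity."""
    return monthly_price_usd / link_mbps


seattle = cost_per_mbps(100, 1000)   # $100 per month for 1 Gbps
dubai = cost_per_mbps(500, 10)       # $500 per month for 10 Mbps
ratio = dubai / seattle

print(f"Seattle: ${seattle:.2f} per Mbps per month")
print(f"Dubai:   ${dubai:.2f} per Mbps per month")
print(f"Ratio:   {ratio:.0f}x")
```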
Of course video transcoding is only part of the story. Once an affordable decentralized marketplace for video processing exists, the second financial obstacle for developers is video delivery.
We are working closely with Ethereum’s Swarm team to design prototypes of decentralized content delivery (CDN) for video. More to come on that front soon.
To summarize the Livepeer video mining proposal, the aim is to:
- Merge the cryptocurrency mining and video encoding industries, by leveraging existing idle resources on GPU mining infrastructure.
- Create a competitive marketplace on cost + quality of video encoding services.
- Result in a far cheaper solution that opens up the video market to app and DApp (decentralized application) developers, IoT devices, and long-tail content providers, for whom it was never affordable before.
As an open source community we are very thankful for any participation, especially feedback from industry experts! Thanks to notsofast, Kingdom Mining, Genesis Mining, and some anonymous miners for giving us feedback on mining. Thanks to Colleen Henry, Marc Cymontkowski and Chris Knowlton for feedback on the video encoding strategy.