Graviton: Trusted Execution Environments on GPUs

Frank Wang
Published in Frankly speaking, Feb 18, 2019

This is part of a weekly(-ish) blog series where I discuss my random thoughts as a recovering academic (mostly about research and tech). I am currently an investor at Dell Technologies Capital in Silicon Valley. You can follow me on Twitter and LinkedIn.

This week, I’m going to discuss a cool paper from OSDI 2018 about running trusted execution environments on GPUs. This is work out of Microsoft Research and the University of Lisbon, and the full paper is titled “Graviton: Trusted Execution Environments on GPUs.”

Problem

Accelerators are playing an increasingly pivotal role in the cloud. CPUs alone are no longer the default choice for compute-heavy workloads, and dedicated hardware like GPUs, FPGAs, and custom silicon can deliver 10–100x higher performance.

Cloud privacy, however, is both important and challenging. Customers regularly operate on sensitive data, and data breaches are becoming more frequent and sophisticated. We need strong mechanisms for preserving data privacy in the cloud.

Typically, trusted execution environments (TEEs) like Intel SGX and ARM TrustZone can be used to execute code and process data in isolation from privileged attackers. However, CPU TEEs do not cover applications that offload work to accelerators.

Solution: Graviton

Their solution is Graviton, a TEE on GPUs. It offers the same guarantees as CPU TEEs and provides remote attestation for establishing trust.

Before we move on, here is an overview of how a GPU works.

[Figure: GPU system stack]

The GPU is controlled via a set of commands that are generated by the runtime and fetched by the command processor.

[Figure: GPU execution model]

However, there are a few security issues with GPUs. A malicious OS can tamper with the commands and data. There can also be context isolation violations, i.e., one context accessing the memory of another context.

The goal of Graviton is to protect the confidentiality and integrity of computation and data. The key concept of Graviton is to redefine the interface between hardware and software.

[Figure: Graviton overview]

First, Graviton places hardware primitives in the GPU: remote attestation for establishing trust, context isolation, and secure command submission. Next, there are runtime abstractions built on top of them: secure memory management, secure memory copy, and secure task launch.

How do they provide context isolation? Graviton sets aside protected memory on the GPU, which hosts virtual memory (VM) structures such as page tables, along with code and data; CPU MMIO accesses to this memory are blocked. Virtual memory management is then handled by the command processor, which ensures that contexts use protected memory and that each context has exclusive use of its own memory resources. Finally, to enable secure command submission, Graviton establishes a session key during context creation, so that only the runtime that owns the context can execute tasks in it.

[Figure: Context isolation]
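The "exclusive use of a context's memory resources" rule can be sketched as a small ownership table. This is a toy model under my own assumptions, not Graviton's hardware design: `ProtectedMemory`, `map_page`, and `unmap_page` are hypothetical names, and pages are just integer indices.

```python
class ProtectedMemory:
    """Toy model: the command processor tracks which context owns each page."""

    def __init__(self, num_pages: int):
        # page index -> owning context id, or None if the page is free
        self._owner = [None] * num_pages

    def map_page(self, context_id: int, page: int) -> bool:
        """Map a page only if it is free or already owned by this context."""
        if not 0 <= page < len(self._owner):
            return False
        if self._owner[page] not in (None, context_id):
            return False  # page belongs to another context: refuse the mapping
        self._owner[page] = context_id
        return True

    def unmap_page(self, context_id: int, page: int) -> bool:
        """Release a page; only the owning context may do so."""
        if 0 <= page < len(self._owner) and self._owner[page] == context_id:
            # Assumption: a real design would also clear the page contents here.
            self._owner[page] = None
            return True
        return False
```

The point of the sketch is that the mapping decision is made by a trusted component rather than the OS driver, so a second context asking for a page that the first one owns is simply refused.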

To enable secure memory copy, they ensure that code and data are in plaintext only inside TEEs and are encrypted everywhere outside a TEE, for example in the DMA buffer.

[Figure: Secure memory copy]
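Here is a rough sketch of the "plaintext only inside the TEE" idea: the sender seals data before it lands in the untrusted DMA buffer, and the receiver verifies it before decrypting. The names `seal` and `open_sealed` are hypothetical. The cipher is a toy encrypt-then-MAC construction built from the standard library so that the example runs anywhere; a real system would use a vetted authenticated encryption scheme such as AES-GCM.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: HMAC-SHA256 in counter mode. For illustration only.
    blocks = []
    for i in range((length + 31) // 32):
        blocks.append(hmac.new(key, nonce + i.to_bytes(8, "big"), hashlib.sha256).digest())
    return b"".join(blocks)[:length]


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes):
    """Encrypt, then MAC; the result is what would sit in the DMA buffer."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, tag


def open_sealed(enc_key: bytes, mac_key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Check the tag first, then decrypt; raises ValueError on tampering."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("DMA buffer contents failed the integrity check")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))
```

With two random 32-byte keys, `open_sealed(ek, mk, *seal(ek, mk, data))` returns `data`, while flipping any byte of the ciphertext makes `open_sealed` raise instead of returning corrupted plaintext.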

The performance overhead is moderate, at about 20–35%. For more detailed results, I refer you to the paper.

Secure data management in the cloud is a huge issue, especially with the increased use of accelerated hardware.

If you have questions, comments, future topic suggestions, or just want to say hi, please send me a note at frank.y.wang@dell.com.

Frank Wang

Investor at Dell Technologies Capital, MIT Ph.D in computer security and Stanford undergrad, @cybersecfactory founder, former @roughdraftvc