Lessons learned using OpenResty to securely proxy static asset requests 🔒

Omar Al-Hayderi
PlanGrid Technology
Mar 28, 2018

At PlanGrid, we’re on a mission to be the construction record set for every job site in the world. We’re already well on our way to our goal, meaning lots of plans: hundreds of millions. That means hundreds of petabytes of content to store. Just to put it into context, each sheet that’s uploaded to PlanGrid becomes 10+ files we use for various processing and display purposes. Additionally, every sheet can contain sensitive customer data. Distributing assets in a secure and performant way is pivotal to the user experience we provide.

Tech choices

When building out an asset proxy we explored a variety of options. Here’s the tech that wasn’t the right fit:

AWS authentication:

  • Signed URLs mean no browser caching.
  • Signed Cookies don’t give us file-level granularity for authorization.

Lambda@Edge

  • New and unproven at the time of this implementation
  • Limits on request time and size that didn’t fit our use case

We needed something light and reliable. It had to make outbound HTTP calls for authorization, handle high traffic, and have both an in-memory cache and access to a distributed cache. To satisfy these requirements we implemented an Nginx reverse proxy using a tool called OpenResty.

What is OpenResty?

OpenResty is a distribution of Nginx that bundles in tons of additional modules that optimize it for low-latency, highly reliable web applications. One of these modules is lua-nginx-module, which lets us embed Lua code containing complex logic directly into the Nginx request lifecycle, while its I/O stays non-blocking under the hood.
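For a flavor of what that looks like, here’s a minimal, self-contained example (an illustration, not our production config): the Lua runs inline with the request inside the Nginx config itself.

events {}

http {
    server {
        listen 8080;

        location /hello {
            # Lua executes as part of handling this request; any cosocket
            # I/O performed in here is non-blocking under the hood
            content_by_lua_block {
                ngx.say("Hello from Lua inside Nginx")
            }
        }
    }
}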

How we use OpenResty

The deployment is straightforward. Users get asset URLs from the PlanGrid Digest API; those URLs point to our OpenResty deployment. Authentication, authorization, and auditing are performed on each request, and if all checks pass, the request is proxied to the corresponding AWS asset.

Bird’s-eye view of our content distribution system

We can break down the heavy lifting of the OpenResty proxy into 4 steps (a minimal sketch follows the list):

  1. Authenticate: Given a cookie or Authorization header, retrieve the PlanGrid user the request is being made for, provided their session is valid.
  2. Authorize: Once we have a user, ensure they have access rights to the asset they are requesting.
  3. Audit: Record who accessed what, and when.
  4. Proxy: Route the request to the data store.
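Put together, the request flow looks roughly like the sketch below. The module names (proxy.auth, proxy.audit) and the bucket hostname are placeholders for illustration, not our actual code:

location /assets/ {
    access_by_lua_block {
        local auth  = require "proxy.auth"   -- hypothetical modules,
        local audit = require "proxy.audit"  -- not PlanGrid's real code

        -- 1. Authenticate: resolve the session to a PlanGrid user
        local user = auth.authenticate(ngx.var.http_authorization,
                                       ngx.var.cookie_session)
        if not user then
            return ngx.exit(ngx.HTTP_UNAUTHORIZED)
        end

        -- 2. Authorize: does this user have rights to this asset?
        if not auth.authorize(user, ngx.var.uri) then
            return ngx.exit(ngx.HTTP_FORBIDDEN)
        end

        -- 3. Audit: record who accessed what, and when
        audit.log(user, ngx.var.uri, ngx.now())
    }

    # 4. Proxy: route the request to the data store (simplified)
    proxy_pass https://our-asset-bucket.s3.amazonaws.com;
}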

Lessons Learned

Debugging our Lua code proved to be more of a pain than expected. We used the Test::Nginx package to construct test cases. The tricky part of this framework is that while you can have many assertions in one test case, you have to build an Nginx configuration inside your test code. That means your deployment’s Lua code is tested, but its actual Nginx config is not. It’s up to you to ensure your test case configs closely match what gets deployed, and to smoke test your releases.
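For example, a typical Test::Nginx test file embeds its own location block under the --- config section, entirely separate from the config you actually ship:

use Test::Nginx::Socket 'no_plan';
run_tests();

__DATA__

=== TEST 1: A cool test
--- config
location /some_resource {
    content_by_lua_block {
        -- a real test would require() the Lua module under test here
        ngx.say("expected response")
    }
}
--- request
GET /some_resource
--- response_body
expected response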

The tests also aren’t very verbose. A failure could look like this:

# Failed test 'A cool test - response_body - response is expected (repeated req 0, req 0)'
# at /usr/local/share/perl/5.22.1/Test/Nginx/Socket.pm line 1382.
# got: '<html>
# <head><title>500 Internal Server Error</title></head>
# <body bgcolor="white">
# <center><h1>500 Internal Server Error</h1></center>
# <hr><center>openresty/1.11.2.5</center>
# </body>
# </html>

Huh? All you get out of the box is a dump of the erroneous nginx response. Not very helpful. A trick to get a more verbose stack trace is to add the no_error_log assertion to your test case:

--- request
GET /some_resource
--- response_body
expected response
--- no_error_log
[error]

This gives you more insight into what went wrong, since the failing assertion prints the offending line from error.log. It also means you get the verbose error messages in your test output 👍

# Failed test 'A cool test - pattern "[error]" should not match any line in error.log but matches line "2018/03/26 18:23:11 [error] 4155#4155: *1 lua entry thread aborted: runtime error: <some_error> from file '<a_file>:<line_num>':" (req 0)'

We ran into a tricky issue when trying to connect to our statsd server from our Lua code. The naive approach was to have a statsd connection at the module level and share it across all functions in the module: instantiate the connection once, then use it in any function to post custom metrics. But wait…

lua entry thread aborted: runtime error: attempt to yield across C-call boundary

Yield across a C-call boundary? It turns out Lua’s require function is compiled as a C function in LuaJIT, and requiring a module means evaluating all of its module-level variables and methods. Since instantiating the statsd client requires a yielding I/O operation to connect to the statsd host, the connection cannot be defined at the module level. Or in simpler terms: no module-scoped statsd connections 😭

Encapsulate all connection initializations in functions
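The fix looks roughly like this, using a hypothetical resty.statsd-style client purely for illustration:

local statsd = require "resty.statsd"  -- hypothetical client module

local _M = {}

-- BROKEN: this would run inside require(), which is a C call, and the
-- cosocket connect() needs to yield -> "attempt to yield across C-call
-- boundary"
-- local conn = statsd.connect("statsd.internal", 8125)

function _M.incr(metric)
    -- OK: connect inside a function, during request handling, where
    -- yielding is allowed (and the cosocket stays within one request)
    local conn, err = statsd.connect("statsd.internal", 8125)
    if not conn then
        ngx.log(ngx.ERR, "statsd connect failed: ", err)
        return
    end
    conn:incr(metric)
end

return _M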

We used the lua-resty-redis library to connect to our distributed redis cache. This proved to be more involved than originally thought. We had to follow the same pattern as above to share a connection between functions. Also, Nginx cannot share cosocket objects (redis connections) between different requests, which leaves no intuitive way to do connection pooling for redis. The workaround is to call set_keepalive on the socket we open for redis, which returns it to a per-worker connection pool so that a later request handled by the same worker can reuse it without establishing a new connection to redis.
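In code, the keepalive pattern looks roughly like this (the hostname, timeout, and pool sizes are made up):

local redis = require "resty.redis"

local function query_cache(key)
    local red = redis:new()
    red:set_timeout(1000)  -- connect/send/read timeout, in ms

    -- connect() transparently reuses an idle keepalive connection from
    -- this worker's pool if one is available
    local ok, err = red:connect("redis.internal", 6379)
    if not ok then
        return nil, "failed to connect: " .. err
    end

    local value, err = red:get(key)
    if err then
        red:close()
        return nil, err
    end

    -- return the connection to the per-worker pool instead of closing:
    -- idle for up to 10s, at most 100 pooled connections per worker
    local ok, err = red:set_keepalive(10000, 100)
    if not ok then
        ngx.log(ngx.ERR, "failed to set keepalive: ", err)
    end

    return value
end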

We also ran into issues connecting to our redis caches via SSL: at the time of writing, lua-resty-redis does not expose a method to connect over SSL. To overcome this we had to add custom code to manage the SSL connection ourselves.

Manual handshake & connection pooling workaround code
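A rough sketch of the idea follows. It reaches into lua-resty-redis’s underlying cosocket (the internal sock field, so this is version-dependent) to perform the TLS handshake by hand, and skips the handshake on connections reused from the keepalive pool since those are already encrypted:

local redis = require "resty.redis"

local function connect_ssl(host, port)
    local red = redis:new()
    red:set_timeout(1000)

    local ok, err = red:connect(host, port)
    if not ok then
        return nil, err
    end

    -- only handshake on fresh connections; pooled ones already did it
    if red:get_reused_times() == 0 then
        -- red.sock is the client's internal cosocket: this is the hack.
        -- sslhandshake(false, host, true) disables session reuse, sets
        -- the SNI name, and verifies the server cert (verification
        -- requires lua_ssl_trusted_certificate in the nginx conf).
        local session, err = red.sock:sslhandshake(false, host, true)
        if not session then
            red:close()
            return nil, "ssl handshake failed: " .. err
        end
    end

    -- the caller can still use set_keepalive() to pool this connection
    return red
end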

Conclusion

After wrestling with these network issues and climbing the learning curve, we now have a robust service that is one of the highest-traffic services deployed at PlanGrid. We look forward to leveraging and optimizing our OpenResty deployment further as we grow.

Have experience using OpenResty with Redis in production? Leave a comment below!

PlanGrid is hiring! If these types of infrastructure problems interest you, drop us a line, we’d love to hear from you 😁
