At HAL24K, we benefit a lot from open source software. That is why, to contribute back, we’ve started a program to open-source some of the internal tools and libraries we’ve used to build our platform and machine learning solutions, starting with OOProxy. You can find the code in the repository.
What is OOProxy?
OOProxy is a reverse OpenID and OAuth2 proxy that we use to protect our HTTP-based machine learning APIs. The proxy implements the client-credentials flow with bearer tokens, making it appropriate for machine-to-machine authentication and authorization.
The proxy has very little overhead (1–2 MB of RAM usage under normal operation) and can handle the streaming of very large datasets, making it suitable for shared clusters and machine learning APIs with large datasets.
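To illustrate the client-credentials flow the proxy implements, here is a minimal client-side sketch in Python: the client exchanges its credentials for an access token at the identity provider's token endpoint, then sends the token as a bearer header on every API call, which the proxy validates before forwarding. The URLs and credentials are hypothetical placeholders, not part of OOProxy itself.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoints -- substitute your identity provider's token URL
# and the address of the API sitting behind the proxy.
TOKEN_URL = "https://sso.example.com/oauth2/token"
API_URL = "https://api.example.com/predict"

def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the client-credentials token request (RFC 6749, section 4.4)."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

def build_api_request(access_token: str, payload: dict) -> urllib.request.Request:
    """Attach the bearer token; the proxy checks it before forwarding."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_token_request("my-client", "my-secret")
print(req.data.decode())
```

Sending these requests with `urllib.request.urlopen` (or any HTTP client) completes the flow; the point is that no browser redirects or user interaction are involved, which is what makes the flow suitable for machine-to-machine traffic.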
The API Lake Project
A few months ago, we started building a scalable cloud environment to run authenticated machine learning APIs in the cloud. It enables us and our customers to expose the results of machine learning projects and hook these up to their own systems, so they can classify their own data and receive predictions from their machine learning models.
This environment is built on top of Kubernetes, which enables us to:
- Easily (auto-) scale APIs horizontally
- Assign deployments to specific nodes based on their needs, for example deploying GPU-dependent APIs on GPU nodes
- Swap components around, like our authentication proxy
- Easily add necessary dependencies for the APIs, like databases or Hadoop, to the cluster
Kubernetes also gives us an easy deployment API via Helm. On top of that API we built a set of deployment tools and project templates that make it incredibly easy for our customers to deploy their projects to production-ready systems, even from Jupyter notebooks.
Benefits of using Rust
One of the things we needed for the API Lake was authentication. Most load balancers, like HAProxy or Kong, do have OAuth capabilities, but these are often only available as proprietary extensions or didn’t quite meet our needs. Other reverse OpenID proxies didn’t suit our use case either. Building your own is rarely the first choice for a component like this, but in our case it was necessary.
Most machine learning models have high hardware requirements: they need a lot of memory, and often GPUs.
We certainly could have written our authentication layer in C# or Java, but most virtual machines carry a lot of memory overhead, especially when you have to deploy the service many times. Rust is a new systems programming language that allows you to write safe, low-level, efficient code without the overhead of a garbage collector or a virtual machine, and without the memory-safety issues that you’d get with languages like C++.
The current proxy uses only 1 MB of RAM at rest, which compared to a typical Java or C# service is astonishingly small. This has already saved us from having to add an extra node to our cluster, and will save us more hosting costs in the future.
The single-sign-on server also has a fairly central place in the service graph. To reduce the load the APIs place on this service, we made sure the proxies only fetch keys on startup (and on key rotation), and validate all tokens locally. This works as long as the single-sign-on service publishes key IDs (which can be used to signal key rotation). If your service doesn’t publish key IDs, you can set a key expiry, which prompts the proxy to look for new keys periodically.
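The key-handling strategy above can be sketched as a small cache: keys are fetched once at startup, and re-fetched only when a token arrives carrying an unknown key ID (the signal for key rotation) or when an optional expiry lapses. This is an illustrative Python sketch of that logic under assumed names, not OOProxy's actual Rust implementation.

```python
import time

class KeyCache:
    """Hypothetical sketch of the proxy's key-fetching strategy."""

    def __init__(self, fetch_keys, expiry_seconds=None):
        self.fetch_keys = fetch_keys      # callable returning {kid: public_key}
        self.expiry = expiry_seconds      # optional fallback when kids are absent
        self.keys = fetch_keys()          # fetched once on startup
        self.fetched_at = time.monotonic()

    def key_for(self, kid):
        stale = (self.expiry is not None
                 and time.monotonic() - self.fetched_at > self.expiry)
        if kid not in self.keys or stale:
            # An unknown kid signals key rotation: re-fetch from the SSO server.
            self.keys = self.fetch_keys()
            self.fetched_at = time.monotonic()
        return self.keys.get(kid)         # None -> reject the token

# Simulated SSO key endpoint that rotates in a second key.
published = {"kid-1": "public-key-1"}
cache = KeyCache(lambda: dict(published))
print(cache.key_for("kid-1"))             # served from the startup fetch
published["kid-2"] = "public-key-2"       # SSO rotates in a new key
print(cache.key_for("kid-2"))             # unknown kid triggers a re-fetch
```

Because every token is validated against the cached keys inside the proxy, the single-sign-on server only sees traffic on startup and at rotation, no matter how many requests the APIs handle.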
More to come
In the coming months we’ll open-source more of our internal tooling and (data-science) libraries.
HAL24K is a Data Intelligence scale-up based in San Francisco, Amsterdam and London, delivering operational and predictive intelligence to cities, countries and companies.