The Prefect Hybrid Model
Cloud Convenience; On-Prem Security
UPDATE: With the release of Prefect 2.0 we created a security white paper that discusses our latest security practices and enhancements. Check it out for the most up-to-date information! 🚀
“There’s no way in hell I’m giving you our code or data!”
It was the winter of 2017, and I was sitting in the basement of one of the largest hedge funds in the world. Its CTO, who is now one of our technical advisors, was gently explaining why he would never use a managed workflow service like the one I was proposing for my new company, Prefect.
“Well, what if we gave you an on-prem solution?” I asked.
“I’m not interested in learning how to manage your software,” he replied.
I clarified: “So — you won’t give us your code, and you won’t take our code? How are we supposed to work together?”
“I’m sure you’ll figure something out,” he said.
Well, after two years (and filing a few patents!), I’m proud to announce that we did figure it out. Prefect’s Hybrid Model is a completely new way of delivering workflow software, combining the convenience and cost savings of a fully-managed service with the security and privacy of an on-premise solution.
And with the launch of Prefect Cloud this morning, it’s now available for anyone to try, for free.
The Hybrid Model
At Prefect, we’ve always wanted to offer a managed workflow service. Like most open-core companies, we assumed that would entail running our software on behalf of our customers. Since our software is a framework for building data pipelines and applications, that implicitly requires us to take our customers’ code and execute it.
This creates a trust problem. Our customers must trust us as the recipient of critical technical property. In turn, we must trust that the code we’re receiving is not malicious in nature (even inadvertently). On-prem solutions require the same degree of trust, though the exchange is in the opposite direction. Needless to say, establishing this trust is difficult — and as the hedge fund CTO demonstrated, sometimes impossible.
The Hybrid Model is a zero-compromise solution.
Prefect Cloud is a fully-managed workflow orchestration service. It provides all the features customers require, including a UI, database, team management, permissions, GraphQL API, scheduler, and much more. The one thing it does not do is execute customer code.
We have successfully split the difficult task of orchestrating workflows — in which Prefect has great expertise — from the task of executing code — which our customers can do better than anyone.
In this model, our customers’ code always remains on their private infrastructure. They design, test, and build workflows with our open-source engine, Prefect Core. When a workflow is ready, it is registered with Prefect Cloud. This sends metadata to Cloud that is sufficient to reconstruct a code-less version of the workflow: details like the tasks it contains and their dependency structure; its schedule; information about its runtime environment; etc.
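To make the idea of a code-less registration concrete, here is a minimal sketch in plain Python. It is not the Prefect Core API: `ToyFlow`, `add_task`, `to_metadata`, and the field names in the payload are all hypothetical, chosen only to show that task names, dependencies, a schedule, and an environment label can describe a workflow without shipping any function bodies.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyFlow:
    """Hypothetical stand-in for a workflow; not the Prefect Core API."""
    name: str
    schedule: str                 # assumed here to be a cron string
    environment: str = "local"    # assumed label for the runtime environment
    tasks: Dict[str, Callable] = field(default_factory=dict)
    edges: List[Tuple[str, str]] = field(default_factory=list)  # (upstream, downstream)

    def add_task(self, fn, upstream=()):
        """Record a task and its upstream dependencies by function name."""
        self.tasks[fn.__name__] = fn
        for up in upstream:
            self.edges.append((up.__name__, fn.__name__))
        return fn

    def to_metadata(self) -> dict:
        """Describe the flow's shape; the task functions themselves are left out."""
        return {
            "name": self.name,
            "schedule": self.schedule,
            "environment": self.environment,
            "tasks": sorted(self.tasks),
            "edges": [list(edge) for edge in self.edges],
        }


def extract():
    return [1, 2, 3]


def load(rows):
    return len(rows)


flow = ToyFlow(name="etl", schedule="0 * * * *")
flow.add_task(extract)
flow.add_task(load, upstream=[extract])

# This JSON string is all that would leave the private infrastructure in this sketch.
payload = json.dumps(flow.to_metadata())
print(payload)
```

The payload names the tasks and the edge between them, but nothing in it can be executed; the callables stay in the local `tasks` dictionary.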
Once Cloud has this information, it can begin orchestrating the workflow even though it doesn’t have access to its code. This process begins by placing a Cloud workflow into a Scheduled state. An open-source Prefect Agent, running on a customer’s infrastructure, monitors for work. When it finds a scheduled flow, it begins to execute it. This might take place locally, or in a remote execution cluster; it is entirely up to the customer to define. As tasks pass through Running states and ultimately Succeed, Fail, or Retry, that information is communicated back to Cloud as the source of truth. In this way, multiple concurrent executions of workflows on customer infrastructure can be coordinated through a central broker, all without ever requiring access to code or data.
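The state reporting described above can be sketched as a small state machine. This is an illustration under stated assumptions, not Prefect's implementation: `ToyBroker`, `run_with_retry`, the `ALLOWED` transition table, and the exact state names are hypothetical, loosely modeled on the states named in the text. The point is that the broker only ever receives run identifiers and state names.

```python
from collections import defaultdict

# Hypothetical transition table; the state names are illustrative.
ALLOWED = {
    "Scheduled": {"Running"},
    "Running": {"Success", "Failed", "Retrying"},
    "Retrying": {"Running"},
    "Success": set(),
    "Failed": set(),
}


class ToyBroker:
    """Stand-in for the central source of truth; stores states, never code or data."""

    def __init__(self):
        self.history = defaultdict(list)  # run_id -> ordered list of states

    def schedule(self, run_id):
        self.history[run_id].append("Scheduled")

    def set_state(self, run_id, new_state):
        current = self.history[run_id][-1]
        if new_state not in ALLOWED[current]:
            raise ValueError(f"{current} -> {new_state} is not allowed")
        self.history[run_id].append(new_state)


def run_with_retry(broker, run_id, fn, max_retries=1):
    """Run fn locally and report only state names to the broker."""
    attempts = 0
    while True:
        broker.set_state(run_id, "Running")
        try:
            fn()
        except Exception:
            if attempts < max_retries:
                attempts += 1
                broker.set_state(run_id, "Retrying")
                continue
            broker.set_state(run_id, "Failed")
            return
        broker.set_state(run_id, "Success")
        return


broker = ToyBroker()
calls = {"n": 0}


def flaky():
    # Fails on the first call only, to exercise the retry path.
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient error")


broker.schedule("run-1")
broker.schedule("run-2")
run_with_retry(broker, "run-1", flaky)
run_with_retry(broker, "run-2", lambda: 1 / 0, max_retries=0)
print(dict(broker.history))
```

Two independent runs share one broker here, which is the coordination role the text assigns to Cloud: it can reject an invalid transition or show a run's history without knowing what `flaky` does.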
When we first solved the trust problem, we were pretty excited. At the time, we assumed this would allow us to meet the strict requirements of some of our large financial services customers, but we still planned to offer a fully-managed workflow execution service for other industries. However, after hearing about the hybrid model, every single one of our potential customers immediately opted for it over the managed service. And why wouldn’t they: it’s more secure, more private, more convenient, and lower cost!
Case Study: Working with Regulated Data
In the early days of exploring the hybrid model’s applicability outside finance, we had the good fortune to meet Joe Schmid, CTO of a healthcare company called SymphonyRM. Joe became one of our most enthusiastic Lighthouse Partners, and we worked with him and his team to make rapid improvements to Cloud over the last year.
If you’re curious, Joe independently blogged about how SymphonyRM uses Prefect with Dask, referring to Prefect as “the next gen’s next gen.”
SymphonyRM had been running hundreds of DAGs on Airflow for a few years, and was frustrated not only by Airflow’s limitations but also by having to continuously manage its cumbersome infrastructure. Because their data is bound by strict HIPAA-compliance requirements, finding third-party services that could assist with workflow management was almost impossible.
Therefore, our relationship began with the explicit assumption that SymphonyRM would require an on-prem solution. Their security requirements, coupled with the fact that their typical usage of Prefect involves spinning up massive ad-hoc data science analytics on Dask clusters, made a managed execution service untenable.
We began working with SymphonyRM as the pilot customer for Cloud’s on-premise deployment. The technological requirements weren’t easy, especially for a young company simultaneously trying to bring its “main” product to market.
While we worked, SymphonyRM began testing Cloud with a hybrid setup. We worked with them to carefully outline any and all mechanisms by which their data could end up on our servers, and exposed settings and tools to manage Core-Cloud interactions. Our first hybrid Cloud customer was a Fortune-100 technology company, but it was Joe’s team at SymphonyRM that really ran the early product through the gauntlet.
The technical overhead and costs of introducing an on-premise model so early were higher than we anticipated, and we got within a few weeks of the target implementation date before finally working up the courage to ask Joe if the hybrid model would actually satisfy SymphonyRM’s needs.
Almost immediately, he said: “Sure!”
This was a critical moment for our company: a customer who came to us explicitly requiring an on-premise solution, who told us they couldn’t use competing managed services, was able to quickly adopt our hybrid setup. As a result, we jettisoned all plans to offer non-hybrid Cloud deployments. If SymphonyRM didn’t need them, who would?
Today, the Prefect Hybrid Model is the only way we deliver Prefect Cloud. This allows us to offer the exact same Cloud product to individuals and open-source projects that we do to Fortune-100s and regulated industries. We look forward to working with partners like Joe and SymphonyRM to continue delivering innovations that eliminate negative engineering.
How It Works
The hybrid model is based on a simple premise: execution in your cloud; orchestration in ours. In a few simple steps, you can set up your local deployment of Prefect Core to be fully managed by Prefect Cloud. Instructions from Cloud are executed by Core in your private execution environment, and only state updates are transmitted back. Cloud never receives your code.
This description of the hybrid model is also available on our website.
Step 1: Build Your Flow
Use our open-source Prefect Core library to design, test, and build a workflow.
Step 2: Register Your Flow
Send metadata about your flow (but never code!) to Prefect Cloud. This registers the flow for execution and lets you inspect and interact with it in the Prefect Cloud UI.
Step 3: Run an Agent
Run an open-source Prefect Agent on your infrastructure. The Agent, which ships as part of Prefect Core, monitors Cloud for scheduled work.
Step 4: Schedule Work
Use the Prefect Cloud UI or API to schedule a new run of your flow. This will put the flow into a Scheduled state.
Step 5: Run the Flow
The Agent will detect that a new run has been scheduled and launch the flow on your private infrastructure. Any state changes will be communicated back to Prefect Cloud.
Step 6: Monitor and Manage
In the live-updating Prefect Cloud UI, you can watch progress across all of your flows, no matter how many you have or how often they run.
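The six steps above can be walked through end to end in a toy model. This is a sketch only, using an in-process queue in place of a network API; `ToyCloud`, `ToyAgent`, `register`, `create_run`, `report`, and `poll_once` are hypothetical names and do not correspond to the Prefect Core or Prefect Cloud interfaces.

```python
import queue


class ToyCloud:
    """Hypothetical orchestration side: knows flow names and run states only."""

    def __init__(self):
        self.registered = set()
        self.scheduled = queue.Queue()
        self.states = {}  # run_id -> latest reported state
        self._next_id = 0

    def register(self, flow_name):       # Step 2: metadata only
        self.registered.add(flow_name)

    def create_run(self, flow_name):     # Step 4: schedule work
        if flow_name not in self.registered:
            raise KeyError(flow_name)
        self._next_id += 1
        run_id = self._next_id
        self.states[run_id] = "Scheduled"
        self.scheduled.put((run_id, flow_name))
        return run_id

    def report(self, run_id, state):     # state updates are the only traffic back
        self.states[run_id] = state


class ToyAgent:
    """Hypothetical agent: runs where the code lives and polls for work."""

    def __init__(self, cloud, local_flows):
        self.cloud = cloud
        self.local_flows = local_flows   # flow name -> callable, never uploaded

    def poll_once(self):                 # Steps 3 and 5
        try:
            run_id, flow_name = self.cloud.scheduled.get_nowait()
        except queue.Empty:
            return False
        self.cloud.report(run_id, "Running")
        try:
            self.local_flows[flow_name]()
            self.cloud.report(run_id, "Success")
        except Exception:
            self.cloud.report(run_id, "Failed")
        return True


def nightly_report():                    # Step 1: build the flow locally
    return sum(range(10))


cloud = ToyCloud()
cloud.register("nightly_report")
agent = ToyAgent(cloud, {"nightly_report": nightly_report})
run_id = cloud.create_run("nightly_report")
agent.poll_once()
print(cloud.states)                      # Step 6: monitor from the orchestration side
```

Note the direction of every call: the agent reaches out to `ToyCloud`, and `ToyCloud` never holds a reference to `nightly_report`. A real deployment would replace the queue with authenticated API requests, which this sketch does not attempt to model.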
Eliminating Negative Engineering
When we introduced Prefect, we talked about the difference between positive and negative engineering. Our customers are experts in the positive engineering that drives their businesses; we are dedicated to reducing the negative engineering burden that complicates their efforts.
The hybrid model is the perfect representation of that philosophy: code and execution stay close to our customers, where they belong. Orchestration, error surfacing, scheduling, and monitoring live in Prefect Cloud, where we can deliver a great experience.
The hybrid model is something we’ve been working on for close to two years, often in secret. It represents the best of Prefect innovation. With the launch of Prefect Cloud, anyone in the world can take advantage of our completely new solution to workflow orchestration.
Free, as in Free
Because Prefect customers can bring their own infrastructure, we can offer a full workflow orchestration service that supports unlimited execution at a dramatically different price point than other services. Taking that to its extreme, it’s always been one of our strategic imperatives to offer a fully-featured — and free — workflow management system.
Today, we’re excited to announce that our Scheduler Tier delivers all of the power of Prefect Cloud for up to 3 workflows, completely for free. You can run flows of unlimited complexity unlimited times — without paying us at all. When you’re ready to step up and power your team with Prefect Cloud, we’re always ready to talk: sales@prefect.io.
Happy Engineering!
— The Prefect Team