Back in 2016, our team began researching the technologies that Security Operations Centers (SOCs) were using to protect the broad range of customers they monitor. Security Information and Event Management (SIEM) systems from proprietary vendors were the de facto standard, yet they were increasingly complex, limiting and expensive. We set out to design a system that made use of modern technologies and provided solutions for the growing list of challenges facing industry professionals:
- Ingest large amounts of data quickly and reliably
- Query data rapidly and effectively across multiple data stores (data lakes etc.)
- Run on-premise or in the cloud
- Use a micro-services based architecture running on Docker/Kubernetes
- Provide a mechanism for the rapid development of security intelligence and capabilities
- Support modern practices of security and development teams such as DevSecOps and Threat Modelling
- Scale infrastructure up and down effectively, in line with the current load of the system
- Provide a framework for security automation and remediation
Away we go
With the emergence of Open Source Big Data platforms such as Apache Hadoop and its surrounding ecosystem, it made sense to explore their use in our Cyber Security use case, and so we set out on our journey.
Having designed our target architecture, we set out to build a proof of concept using components such as Apache Kafka, Apache Storm, and Elasticsearch.
Sometimes along your journey to a destination, you find a better path.
A few months into the process, we discovered Apache Metron, a real-time ‘Big Data’ security platform created by Hortonworks that originated from the OpenSOC project created by Cisco. On the surface, Apache Metron met all our requirements and more, and thus our new journey began.
At the time, Metron was in its very early stages of development and was certainly nuanced in many areas. Nonetheless, its target architecture and vision were aligned with ours, and it made sense for us to explore further. Six months into the process, we emerged with battle scars and countless hard lessons learned, but also with another proof of concept that demonstrated the capabilities of the platform.
Apache Metron is heavyweight in every sense. A minimal installation requires 10 chunky servers (m4.xlarge on AWS, approaching $2,000/month), and it is more suited to traditional infrastructure, be it Virtual Machines in the cloud or bare-metal on-premise installations. Although work is currently being done to allow these systems to run on more modern infrastructure such as Kubernetes, our view is that it feels contrived and will struggle to gain adoption in today’s emergent ‘cloud-first’ world.
With its steep learning curve and rigid design, Metron fell short of the agility and efficiency goals we had set ourselves, so we decided to go back to the drawing board.
There were a number of aspects of Apache Metron we liked, and having gone deep into its architecture and codebase, we had an appreciation for its power and capabilities. Equally, we had an understanding of the areas we felt we could improve significantly.
By this time the Serverless bandwagon, made popular by AWS Lambda, was well and truly rolling. We were curious how such a system could work in practice, and whether we could use it as a platform to meet some or all of the goals we had set out.
At this point, it’s worth describing our definition of Serverless to remove any ambiguity that surrounds the technology today. We believe there are three main aspects:
- Pay per execution
- Zero management of servers or any infrastructure
- The ability to scale up on demand and down to zero (yes, zero cost) when idle
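To make the first and third points concrete, here is a minimal sketch comparing an always-on deployment with a pay-per-execution one. All figures are illustrative assumptions, not real pricing: the per-server cost is simply chosen so that ten servers land near the $2,000/month mentioned above, and the per-execution price is hypothetical.

```python
# Illustrative cost model only; every price below is an assumption.

def always_on_monthly_cost(server_count: int, cost_per_server: float) -> float:
    """Fixed cost: servers are billed whether or not any events arrive."""
    return server_count * cost_per_server


def per_execution_monthly_cost(executions: int, price_per_execution: float) -> float:
    """Pay-per-execution cost: proportional to work done, zero when idle."""
    return executions * price_per_execution


def break_even_executions(fixed_monthly_cost: float, price_per_execution: float) -> float:
    """Number of executions per month at which both models cost the same."""
    return fixed_monthly_cost / price_per_execution


# Hypothetical inputs for the comparison.
SERVER_COUNT = 10
COST_PER_SERVER = 200.0          # assumed USD per server per month
PRICE_PER_EXECUTION = 0.000002   # assumed USD per function execution

fixed_cost = always_on_monthly_cost(SERVER_COUNT, COST_PER_SERVER)
idle_cost = per_execution_monthly_cost(0, PRICE_PER_EXECUTION)
busy_cost = per_execution_monthly_cost(50_000_000, PRICE_PER_EXECUTION)
break_even = break_even_executions(fixed_cost, PRICE_PER_EXECUTION)

if __name__ == "__main__":
    print(f"Always-on cost per month:          {fixed_cost:.2f}")
    print(f"Per-execution cost when idle:      {idle_cost:.2f}")
    print(f"Per-execution cost at 50M events:  {busy_cost:.2f}")
    print(f"Break-even executions per month:   {break_even:,.0f}")
```

Under these assumed numbers the always-on bill is the same for an idle month and a busy one, while the per-execution bill starts at zero and only catches up at very high volumes. Real pricing also includes memory, duration and data transfer charges, which this sketch deliberately ignores.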
The opportunity to leverage these benefits was compelling, so we went to work to reimagine how things might look if we rebuilt in Serverless.
A new beginning
At this point there were some obvious questions:
- Does something already exist, either commercially or in the Open Source world, that offers such capabilities?
- Is such a solution economically viable when considering the total cost of ownership?
- Could we use Serverless capabilities to significantly increase efficiencies beyond the tools and platforms available today?
- Is Cloud vendor lock-in inevitable or could we build a platform-agnostic solution that makes use of native cloud constructs to maximise efficiency?
We set out to answer these questions and embarked on a journey that ultimately led us to create Furnace. In the coming weeks, we’ll write about some of our findings, as they certainly make for interesting reading. For now, here is a summary of what we found.
Although Serverless constructs such as AWS Lambda provide the ability to create complex processing pipelines and workflows, the plumbing required is significant and detracts from the ability of teams to simply focus on creating value. Constructing pipelines forces organizations to make architectural decisions that are complex and affect both cost and capability. To be specific, we’re talking about such things as:
- Message routing: on AWS you have Kinesis, SNS, SQS and Kafka, all with pros and cons
- Identity and Access Management (IAM) roles and ensuring the platform is secure
- How to manage infrastructure in a way that is in line with modern DevOps and DevSecOps practices
- What does Continuous Integration and Delivery (CI/CD) look like in a Serverless world?
- How do we optimize our code for reuse and prevent vendor lock-in?
After significant research and development effort, we finally arrived at a design and architecture that we were excited about. Gathering input and feedback from our customers, peers and working groups, we started to assemble the core platform. Today we’re excited to announce Project Furnace to the world as a free and Open Source project.
Over the coming days we’ll be drilling into detail about the platform and how you can use it to build and maintain highly effective processing pipelines within minutes.
Here are some of the feature highlights:
- A platform-agnostic framework that makes use of cloud-native constructs
- Integrated CI/CD, build pipeline and deployment of the required infrastructure
- Modern GitOps methodology that uses Git as a single source of truth, is fully auditable and puts you in complete control of the underlying infrastructure
- Easily create multiple environments (such as development, staging, production) with minimal cost
- Standardized module format making it really easy to create your own capabilities
- Language-agnostic: code modules in any of the major languages
Our initial release targets AWS, Node.js and GitHub, so you’ll need to make sure you have those configured prior to use. We’re really excited to hear your feedback and to see what you build with Furnace.
You can get started with Furnace by visiting GitHub at https://github.com/ProjectFurnace/furnace