Q&A with Watchful: Why Streaming is the Future of Data Processing
In conversation with John, we get a sense of the current challenges in the data landscape, the future of data processing, and some of the more interesting use cases for the Watchful platform.
Give us a quick sense of what Watchful does.
We allow organizations with large amounts of unstructured data to process it at scale, in real time, without hiring an army of highly experienced engineers. We are a lightweight, horizontally scalable, and incredibly flexible filtering, routing, and tagging layer for large enterprises.
Why does this matter now?
We see the world changing along two tracks: there is an exponential increase in the amount of data we are producing, from IoT devices, machine-to-machine traffic, mobile usage, and more users coming online. Yet we are not increasing compute and storage at the same pace.
Basically, we are drowning in data, resulting in an exponential increase in noise in the data. And this noise is expensive to process, store, and transfer at scale.
So there’s a shift, from batch to streaming — from looking at data at rest to doing everything in-flight.
We see the opportunity for Watchful to be the bridge between these two points. We enable organizations to take control of their data in flight, at scale, to make managing the fire hose easy, intuitive, and scalable — essentially rethinking their data architecture. Watchful is like a dynamic routing layer sitting between ingest and egress for downstream services.
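The "dynamic routing layer" idea can be sketched in a few lines. This is a toy illustration of pattern-based routing between ingest and egress; the rules, topic names, and function names are invented for this sketch and are not Watchful's actual API:

```python
import re
from collections import defaultdict

# Hypothetical rule set: pattern -> downstream topic.
ROUTES = {
    re.compile(r"ERROR|FATAL"): "alerts",
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"): "pii-quarantine",  # SSN-like strings
    re.compile(r"login|logout"): "auth-events",
}

def route(record: str) -> list[str]:
    """Return every downstream topic whose pattern matches the record."""
    return [topic for pattern, topic in ROUTES.items() if pattern.search(record)]

def process_stream(records):
    """Fan records out to matching sinks; unmatched records fall through."""
    sinks = defaultdict(list)
    for record in records:
        for topic in route(record) or ["default"]:
            sinks[topic].append(record)
    return sinks

sinks = process_stream([
    "2024-01-01 ERROR disk full",
    "user login from 10.0.0.1",
    "heartbeat ok",
])
# The error line lands in "alerts", the login in "auth-events",
# and "heartbeat ok" falls through to "default".
```

The point of doing this in flight is that downstream services only receive the slice of the fire hose they care about, instead of every consumer filtering the full stream at rest.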
For those of us that are more technically inclined, what is novel about your approach to this problem?
Watchful’s “secret sauce” is the ability to run RegEx at scale, in real time. A useful analogy, albeit a bit inelegant, would be that Watchful is a “distributed grep.” We are able to get HPC levels of pattern-matching performance (on par with FPGAs/GPUs) on commodity x86 server hardware — 2 GB/s per core with linear scalability on a medium-complexity load of 3,000 patterns.
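To make the "distributed grep" analogy concrete, here is a minimal sketch of multi-pattern matching in a single pass over each line. Production engines achieve their throughput with optimized automata rather than a backtracking regex library, so this only illustrates the interface, not the performance technique; the pattern list is invented for the example:

```python
import re

# A toy pattern load; real deployments run thousands of these.
PATTERNS = ["timeout", r"5\d\d", "refused"]

# Named groups let one scan report which pattern produced each hit.
combined = re.compile("|".join(f"(?P<p{i}>{p})" for i, p in enumerate(PATTERNS)))

def scan(line: str) -> list[str]:
    """Return the original patterns that match anywhere in the line."""
    hits = set()
    for m in combined.finditer(line):
        # Only one alternative matches per hit, so lastgroup names it.
        hits.add(PATTERNS[int(m.lastgroup[1:])])
    return sorted(hits)

print(scan("upstream 502: connection refused after timeout"))
# ['5\\d\\d', 'refused', 'timeout']
```

Compiling all patterns into one automaton means throughput stays roughly constant as the pattern count grows, which is what makes per-core linear scaling plausible.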
So, this could have a broad spectrum of use cases — who has the most demand for what you do?
Infrastructure, logging, and monitoring groups dealing with massive amounts of unstructured and semi-structured data, and spending a fortune on infrastructure and engineering hours to scale in production.
Additionally, security groups trying to alert and react to attacks in real time, and companies that are sensitive to leaking information to the public internet (PII, PHI, secrets, API keys, etc.).
Are there any interesting or unique ways companies are using your platform to elevate their business?
Definitely. A large recruiting company is using Watchful to tag large volumes of clickstream data from a number of web properties and train machine learning models to predict user behavior on production data. It is effectively a semi-supervised ML pipeline with Watchful as the automated tagging layer.
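The tagging half of that pipeline can be sketched as rule-based weak labeling: pattern rules attach labels to raw events, and those labels become training signal for a model. The rules and label names below are invented for illustration, not the customer's actual configuration:

```python
import re

# Hypothetical clickstream tagging rules: URL pattern -> weak label.
RULES = [
    (re.compile(r"/checkout|/cart"), "purchase_intent"),
    (re.compile(r"/jobs/\d+"), "job_view"),
    (re.compile(r"search\?q="), "search"),
]

def tag(event_url: str) -> list[str]:
    """Attach every label whose rule matches; untagged events stay unlabeled."""
    return [label for pattern, label in RULES if pattern.search(event_url)]

# Tagged events serve as weak training labels for a downstream model.
events = ["/jobs/4521", "/search?q=data+engineer", "/about"]
labeled = [(e, tag(e)) for e in events]
# [('/jobs/4521', ['job_view']), ('/search?q=data+engineer', ['search']), ('/about', [])]
```

Because the tagging happens in the stream, the model can train on fresh production data instead of waiting for a batch labeling pass.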
A large research laboratory is also using Watchful to significantly reduce the time and cost to process large volumes of open source intelligence data.
Where do you see the future of data processing going, and how does Watchful fit into that?
We see high-performance pattern matching becoming an essential part of the streaming “stack” of tomorrow. If Watchful is successful, we want to be the obvious first choice for engineers building cost-effective, performant, and scalable streaming data pipelines without hiring an army of highly experienced engineers.