We Need Answers, Not Analytics

Jamie Conklin
Published in Astraea.Earth
5 min read · Feb 1, 2021

My journey from building foundational tools to building an analytics platform, and where we need to go next.

Photo by Mark König on Unsplash

Organizations need answers — not data, not analytics, not BI, not databases, and not pretty maps (although pretty maps are really cool too). All too often analytics efforts get bogged down with data management, DevOps, MLOps, data curation, and a slew of other obstacles that stand in the way of doing the analysis required to answer important questions.

Earlier in my career, my team was asked to build “the” predictive analytics platform for the Army’s new cloud. We focused on spatial analytics and, given what the Army does, that was a good place for us to dig in. With the migration to cloud computing, the spatial infrastructure we had come to know and love (e.g. PostGIS) was no longer available. Instead, we had distributed key/value stores. Before we could begin migrating our analytics and building “the platform”, we spent months developing what would become the open-source spatial database GeoMesa.

Luckily, the Army saw the value in the foundational technology we were building and continued the effort. Our work on GeoMesa ultimately paid off. Today, GeoMesa powers numerous production systems managing hundreds of billions of points. While this is mostly a good news story, in retrospect we spent an inordinate amount of time building infrastructure just to get to the point where we could reliably store and access spatial data. Years passed as we matured GeoMesa sufficiently to support production-grade analytics on real-time data feeds. Organizations can’t wait years for information. Once a need is identified, it should be resolved in weeks or months, not years.

Looking back, this experience was exciting and the work was cool. We were building game-changing technology and we saw it through from genesis to deployment into production. Very few organizations have the patience, financial backing, and vision necessary to make the foundational investments required to begin building analytics, let alone manage a production machine learning environment. GeoMesa has changed the landscape for scalable vector data management, but even GeoMesa is a relatively low-level tool that requires a deep understanding of cloud operations to use efficiently. The moral of this story is that, while very cool, GeoMesa by itself is not enough, because organizations need an operational cloud with literally dozens of other services to use it. In short, it will cost too much and take too long for most organizations to realize the value that GeoMesa brings.

I got to be a part of building some really cool technology, but I have yet to see this technology reach its full potential (it’s getting there, but alas, open-source projects can move slowly).

About a year and a half ago I interviewed for a job at Astraea. The team understood that to make tools that are truly useful and that can be widely adopted, they had to go further. You have to meet the users where they are, not where you want them to be. In other words, the user wants, and more to the point of this story needs, to JUST USE IT. I loved Astraea’s vision and excitedly accepted an offer to join their team.

If organizations are not interested in databases, or analytics, or pretty maps, they most certainly are not interested in evaluating, configuring, testing, modifying, and managing the complex infrastructure that may or may not result in an analytic result they can use. Furthermore, scalable tools are complex and require expertise to configure, deploy, and manage. Yet we need answers. So, we need better tools and these tools need to be available in such a way that users can just get in and start analyzing data.

During the interview process, the Astraea team introduced me to an internal term they used… a lot. SLAW. This stands for S*** Load of Annoying Work. The Astraea founders were dealing with the same frustrations I had faced years before. When they started the company, they set out to build information products from satellite imagery. There was tons of data out there, more coming, and literally hundreds of important problems that could be solved using these data. Unfortunately, they ran into obstacle after obstacle trying to get to the point where they could actually analyze the data. Much like the challenges faced by my team supporting the Army, getting to the starting line was too hard. Furthermore, they saw that they were not alone. Other companies and analysts faced the same challenges. Temporarily stymied and frustrated, they discovered their new mission: to remove the barriers that got in the way of actually analyzing the data.

Over the past few years, we have achieved this mission. We have built a suite of tools that make searching, managing, accessing, and analyzing spatial data straightforward and scalable. A new user can sign up, query data, and start analyzing it within 5 minutes. We have built the open-source library RasterFrames, which enables analysts to process imagery at scale using Apache Spark. Learning from GeoMesa, we made RasterFrames accessible by building it into EarthAI Notebook, so an analyst can jump right in and use the tools without spending a minute on DevOps. While we are excited about our progress, we are not stopping here.

We have gone to great lengths to make analysts more productive. Our discovery tool, EarthAI OnDemand, makes discovering analysis-ready data easy. EarthAI Notebook eliminates the SLAW resulting from Python dependency hell: all of the spatial analysis libraries are installed and harmonized. Notebook has direct access to pull data from OnDemand, data query functions built by experienced data scientists, and dozens of examples in the documentation to get you started in minutes.

Having addressed the needs of individual analysts, we are now focusing on building the infrastructure to enable organizations to operate production-grade analytic workflows. We are building a GeoAI Ops platform that will enable users to build production analytics. We are doing this because our clients need Answers, Not Analytics. Our job is to help users get Answers as quickly and easily as possible.

“What are GeoAI Ops?” you ask… Stay tuned for my next article!
