Simplifying Geospatial Data Access
In 2008, a colleague and I spent about six months developing a program for rollout across a large US retail chain. The service we were putting in place was an in-store intervention aimed at a large share of the chain’s customers. As any good product development folks would do, we planned in-store tests and quantified the impact of our proposed program through small-scale interventions. After all, if the program was likely to have minimal impact, it wasn’t going to be worth the cost of rolling out a change nationally.
The data we observed was fantastic. We forecast hundreds of millions in revenue growth based on observed patterns in a few stores. With that strong forecast behind us, we launched. The program was a wild success. We succeeded in delivering hundreds of millions in sales (lucky for us). Unfortunately, the number of hundreds of millions was about one quarter of what we’d forecast (not as lucky).
In our scramble to determine what had gone wrong and why our team wasn’t delivering everything we’d promised, we uncovered the problem. The few geographies we’d tested in and run our simulations against were anomalous relative to the rest of the chain’s stores. Hence, our forecasts were off. Nothing in the company’s data warehouse or in its store descriptions would have told us as much. Nothing in our limited in-store observations gave this fact away. Only data housed outside of the company, in public records and proprietary marketing databases, revealed the issue.
If we’d had access to that data up front, we would have known as much.
Unfortunately, it took our project catching on fire for us to find the budget and time to get to the heart of the issue.
Finding Atlas
That project was a decade ago now. In the time that’s passed, I’ve kept my eyes peeled for some sort of solution to the problem we ran into. It wasn’t that the data to make a better decision didn’t exist. It definitely existed. And it wasn’t that there weren’t good tools to analyze geospatial data. There are fantastic tools to analyze geospatial data. The problem was that for two analysts launching a product, there was no easy way to get access to those tools or perform the analysis.
The solutions were too robust, too complex, and too expensive for anyone but a power user. Sound familiar? It should. The same could have been said of data analytics in the enterprise before companies like Tableau, Qlik, and Microsoft started championing visualization for citizen analysts and forcing everyone in the industry to think about how to give end users access to self-service data and analytic processes.
Needless to say, when we got to know Henning Kollenbroich and Atlas, I got excited. Henning’s whole mission in life is to simplify access to and consumption of geospatial data. Whether it’s data you have in your enterprise systems today, data you need to obtain and clean from your government, or data from trusted third-party sources, geospatial data tends to be big, complex, and hairy to wrangle. Combine that with the reality that users want to consume this data in a variety of systems and you have a thorny problem (for instance: power users want it in ESRI, business analysts want it in Excel, and everyday Joes like me just want a simple web app to get at it; each of which has different integration points and methodologies).
Henning’s idea was to tackle that problem in an elegant way: effectively, build an ETL engine in the cloud for geospatial data. Anything you want to pipe in and munge together, you can. Atlas’ team does all the mapping, all the pipeline management, and manages all the integration points. All you need to do is consume the data that’s relevant.
You can be live in a few minutes and finish what was previously a complex analysis within an hour.
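To make that concrete, here is a minimal sketch of the kind of enrichment step a hosted geospatial pipeline takes off your hands: joining internal store records to public demographic polygons so that test stores can be compared against the rest of the chain. This is just an illustration in Python with geopandas, not Atlas’ product or API, and the file name and column names are hypothetical.

```python
# Hypothetical sketch (not Atlas' API): enrich store-level sales with public
# demographic data via a spatial join.
import geopandas as gpd
import pandas as pd

# Internal data: one row per store with sales and coordinates.
stores = pd.DataFrame({
    "store_id": [101, 102, 103],
    "annual_sales": [4.2e6, 3.1e6, 5.8e6],
    "lon": [-87.63, -122.42, -73.99],
    "lat": [41.88, 37.77, 40.73],
})
stores_gdf = gpd.GeoDataFrame(
    stores,
    geometry=gpd.points_from_xy(stores.lon, stores.lat),
    crs="EPSG:4326",
)

# External data: demographic polygons (e.g. census tracts) carrying the
# attributes the internal warehouse never captured. File name is made up.
tracts = gpd.read_file("census_tracts_with_demographics.geojson")
tracts = tracts.to_crs(stores_gdf.crs)

# Tag each store with the demographics of the tract it falls inside.
enriched = gpd.sjoin(stores_gdf, tracts, how="left", predicate="within")

# A test-store vs. chain-wide comparison is now a groupby away.
# "median_income" is an assumed column in the demographic file.
print(enriched[["store_id", "annual_sales", "median_income"]])
```

Doing this by hand means sourcing the demographic files, cleaning them, reprojecting coordinate systems, and keeping the joins current; that plumbing is precisely what a managed service like Atlas is meant to absorb.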
If you were me a decade ago, Atlas could have let you understand the demographic and psychographic differences across those geographies in no time flat (and avoid projections that were off by hundreds of millions).
A Well-Defined Problem with Sprawling Implications
It may seem like this is a narrow problem. After all, how many people really need to clean geospatial data? To get a sense of scale, it’s worth looking at the market. Today, geospatial information systems generate roughly $9B annually in software spend. And while there are a number of big names in the space (ESRI, Pitney Bowes, Mapbox, etc.), no player has more than 25% share of that vast market. In part that’s because the variety of problems to solve when dealing with geospatial data is so broad.
The market is also smaller than it could be because of that complexity. Think about the number of scenarios where understanding your geography, your proximity to different events, and your own behavior might be relevant:
Sales forecasting. No brainer.
Customer segmentation. Yup.
Assortment planning. No duh.
Trade promotions. Um… sign me up.
Price optimization. Demand does differ by location.
Campaign planning. Why should data-driven marketing only be available online?
Real estate. Would you try it without the data?
Recruitment optimization. Anything to find great candidates.
Facility planning. Please save me from multi-million-dollar mistakes.
The list goes on. And on. Spatial data has implications across business processes and across industries. If only access to it could be unlocked and simplified. And that’s exactly what Atlas is trying to accomplish. I’m excited to see them make it a reality.