Planet’s API-first vision for satellite imagery

Gordon Wintrob
Published in GET PUT POST · Jan 25, 2017 · 11 min read

Welcome to GET PUT POST, a newsletter all about APIs. Each edition features an interview with a startup about their API and ideas for developers to build on their platform.

This edition, I spoke with Chris Holmes and Rachel Holm from Planet. This company is on a mission to launch a fleet of satellites that provide a daily map of the entire globe. We discuss how this is a game changer for monitoring the planet, the company’s API strategy, and some of the challenges of working with satellite data.

Want the latest interviews in your inbox?

What’s the vision for Planet?

We want to create a snapshot of the globe every single day. We call this Mission One and it’s what drives everyone at the company. We take images at a three to five meter resolution, so cars are full pixels and you can’t make out individual people. It’s still a high enough resolution that you can make out buildings and all sorts of human change.

Source: Space

The company was founded by three NASA scientists who wanted to innovate in the space. At NASA, the founders sent a cell phone to space. A phone has most of the parts you need for a satellite — camera, GPS, accelerometer, battery, CPU, and radio.

The core concept behind Planet isn’t a change in rocket technology, but the trend of consumer electronics. There have been trillions of dollars of investment in consumer electronics, which actually dwarfs the spending of satellite technology. The Planet team decided to use the same parts that are in your cell phone to build satellites, as opposed to custom designing every piece. We call this agile aerospace. We draw from software principles to release early and often, but we’re designing hardware and sending it to space.

Over the past five years, we’ve built and launched a stable version 1.0 of the satellite. It’s actually something like build 13.5. We currently have over 70 satellites in space, delivering an entire picture of the world every two weeks, which is an unprecedented amount of data.

This is the technology to let us achieve Mission One and start collecting images essentially every day.

Beyond the development cycle, how is it different?

Instead of chasing resolution to see more with a single picture, we’re going after a high temporal cadence to see things every day. The beauty is that this lets people take action with the data. For example, deforestation monitoring typically focuses on monthly reports to track acres that are damaged. With Planet, a park service can get an alert within a day and go and stop it. We let people not just monitor our world, but improve it for the better.

The idea to get a complete snapshot is unprecedented. If you look at the satellite imagery in Google Maps, for example, it might be a couple months old in major cities. Outside of these dense areas, updates can take years.

Almost every other satellite is a tasking system. This means the highest bidder determines where it looks. They’re not looking everywhere to develop the baseline. When a disaster happens, there will be tons of shots the next day (e.g. in Haiti or Nepal after an earthquake), but no one’s watching the day before. The historical forensics make Planet really different.

How do APIs fit into your strategy?

Traditionally, satellite imagery involved huge images that were hard to work with even on a desktop. People would make requests for hard drives and production would take a week or two. Everyone focused on storing these massive amounts of pixels, rather than making them accessible.

The founders from day one have thought about APIs and providing a platform on the web. We need to get this data out there to have an impact on the world. That’s what attracted me to Planet. My background is in open-source geospatial software, so I joined to help the team leverage all the great open-source projects to make accessible an incredibly valuable dataset.

We expose simple REST interfaces to access the data. On top of that, we've been building GUIs to search, browse, and download satellite imagery. The data is available in minutes instead of days or weeks.

The Planet GUI

Right now, we’re mostly serving enterprise customers and getting them comfortable with an API-centered way of looking at things. Last year, we launched a program targeted at developers called Open California. We’re releasing images of the state of California with two years of archives. This is under a Creative Commons license, so people can use it freely. It’s very rare to have a detailed dataset like this to play with.

How do you segment your customers?

We explored a bunch of different options in the beginning and now we focus on two big verticals: agriculture and government.

Companies in these spaces already know what to do with satellite imagery. It’s a decently steep learning curve to pull information out of the data. Our customer base right now has the expertise to extract that value. The long-term vision is to make working with imagery as easy as any API on the web, and thus be able to reach many more verticals.

Our goal is for this data to impact every industry, and to touch everybody’s life. This will often mean powering an app that doesn’t itself involve imagery, but draws insights from our daily picture of the planet.

Our customers in agriculture focus on the health of plants. We capture red, green, and blue channels and also near-infrared. This is a non-visible band that plugs into a standard index called NDVI (Normalized Difference Vegetation Index) to approximate plant health. Customers can zoom in on areas that need help and then modify their precision agriculture equipment to adjust soil, seeds, fertilizer, or pesticides.
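NDVI itself is a simple ratio of the near-infrared and red bands. A minimal sketch of the per-pixel calculation (reflectance values here are made up for illustration):

```python
# NDVI (Normalized Difference Vegetation Index): healthy vegetation
# reflects strongly in near-infrared and absorbs red light, so NDVI
# trends toward +1 over dense, healthy plant cover.
def ndvi(nir, red):
    """Return NDVI in [-1, 1]; 0 where both bands are zero."""
    denom = nir + red
    if denom == 0:
        return 0.0
    return (nir - red) / denom

# Illustrative reflectance values: a vegetated pixel vs. bare soil.
vegetated = ndvi(nir=0.50, red=0.08)  # close to +1
bare_soil = ndvi(nir=0.30, red=0.25)  # close to 0
```

In practice this runs over every pixel of a raster, but the formula is exactly this per-pixel arithmetic.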

Some good examples in this space are Farmers Edge and FarmLogs.

What’s the government use case?

There are a whole bunch of examples we’ve seen. Many government organizations know how to work with geospatial information and tend to have analysts who dig into the data. Some common examples include food security, forest security, monitoring borders, and verifying land usage.

A lot of the use cases are monitoring — keep an eye on an area of interest and enable users to know if anything has changed. We had USGS’s conflict diamond group evaluate our imagery to see if it could help monitor alluvial diamond mines in Africa. They found the imagery quite useful to see if a mine was active or not.

Another example is establishing base maps of roads, buildings, and other important features. Governments spend lots of money producing these maps and need regular updates to keep them accurate. We’re able to direct their resources to look at the things that have changed instead of rebuilding the entire map.

How do you price your API?

Although we have the vision of a self-service API where we can charge a credit card based on the number of API calls, we started by focusing on enterprise contracts.

We sell access to a bucket of data in your area of interest. A government will typically buy their country and get an unlimited data feed. They can query the catalog for metadata of what we’ve taken for the globe, but only make GET requests to pull down imagery for the region they have purchased.
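The access model described above can be sketched as two separate checks: metadata search is global, while pixel downloads are gated on the purchased region. The endpoint, filter fields, and region codes below are illustrative, not Planet's actual API:

```python
# Hypothetical sketch of the access model: any customer can query
# catalog metadata for the whole globe, but GET requests for imagery
# only succeed inside a purchased area of interest (AOI).
CATALOG_URL = "https://api.example.com/catalog/search"  # illustrative

def build_search_filter(geojson_aoi, start, end):
    """Metadata search filter: may cover any region on the globe."""
    return {
        "geometry": geojson_aoi,
        "acquired": {"gte": start, "lte": end},
    }

def can_download(scene_region, purchased_regions):
    """Imagery downloads are limited to purchased regions."""
    return scene_region in purchased_regions

aoi = {"type": "Point", "coordinates": [-122.42, 37.77]}
query = build_search_filter(aoi, "2016-01-01", "2016-12-31")
```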

How do you measure customer success?

We instrument the whole site, both GUI and API calls, using Segment. This lets different groups in our company use a variety of tools to gain insight into customer activity and success. Currently we use Google Analytics, Mixpanel, and Intercom to help us measure and improve customer success.

Source: Chris Holmes on LinkedIn

We’re still figuring out how to effectively measure an active developer. Often, they’ll just download a bucket of data and work with it. We’ll see a big spike in downloads for a county-sized area or the whole country, and then not a lot of API use. They may load up the imagery into their system and have a ton of use, or they may never touch it again, but we miss out on that insight once the data has been downloaded.

Over the long term, we want to shift people to working with imagery more directly online. A few customers are doing that and making ad-hoc queries through the API. We want to move towards that rather than copying all of the data into an S3 bucket or, even worse, working offline. The storage bill for maintaining your own copy of the data can be tens of thousands of dollars every month.

We have a partner in Brazil moving towards this focus on web APIs. They’re called Santiago & Cintra. They serve customers in Brazil, but run their backend alongside our cluster in US West. Their app can access the data directly, rather than processing it in batch. They’re psyched to be able to deliver that data to their customers faster and they save big money on storage costs.

What apps have you seen from the Open California dataset?

We launched this as early as possible to get feedback from developers. One person built a visualization tool to see the impact of forest fires. Another cool example was using computer vision to detect clouds.

This brings up the issue that there’s a high barrier to working with satellite imagery that we’re trying to bring down. Clouds are a simple concept, but they get in the way of doing other interesting things with the data. Computer vision people don’t know how to work with geospatial data out of the box. There are huge files and they’re formatted in a weird way. It’s not clear how to plug this into a machine learning model.

That’s what we’re working on next — making the data accessible through the API straight into a machine learning algorithm. It’s much easier to get up to speed if you get well-labeled training data.

Why is geospatial data so hard to work with?

I think geospatial has built up its own little world and language. It hasn’t stayed connected to mainstream tech. That being said, there are some harder problems.

Projections are one example. We live in a 3-D world, but have to project that globe onto a 2-D map. You need coordinate transformations to switch between different representations.
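As a concrete example, here is the forward Web Mercator (EPSG:3857) projection, the transformation most web map tiles use, written out from the standard spherical formula:

```python
import math

# Forward Web Mercator projection: map WGS84 lon/lat (degrees) onto
# a flat x/y plane in meters. Web Mercator treats the Earth as a
# sphere of radius equal to the WGS84 semi-major axis.
R = 6378137.0  # meters

def to_web_mercator(lon_deg, lat_deg):
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# San Francisco, roughly lon -122.42, lat 37.77
x, y = to_web_mercator(-122.42, 37.77)
```

Real pipelines go through a projection library rather than hand-written formulas, because they must handle datums, ellipsoids, and hundreds of coordinate reference systems, but each transformation boils down to math like this.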

Source: Wikipedia

The size of the data is another problem. It’s trillions of pixels that are hard to work with. You can’t just put it in Hadoop. Even unstructured data is still text that many algorithms can use. With imagery, you start with bundles of pixels and build meaning from there.

Atmospheric conditions are another challenge that we’re tackling. We provide a cloud map, but it’s not 100% accurate. San Francisco looks blue one day, gray the next, and more reddish another. Developers need to filter that variation out. There’s a bunch of work to calibrate and get a real signal from the ground.

Then, you have to do the math to line the calibrated images up. Buildings and other features need to consistently line up. The current workflows involve a lot of work on a big desktop system. Someone will run through each of those steps semi-manually.

Whenever you look at Google Maps, thousands of images are stitched together to give a consistent view. That process of stitching the images together is mostly done by hand. We’ve done pretty well at automating many of the steps and are now exposing that through the API.

Once all the data is clean and consistent, developers can start to build something useful. For example, tell me when ships are coming into a port and let me integrate that into an app. That’s the vision we’re moving towards. I don’t think there’s anything impossible in the domain but there are a lot of moving pieces.

How do you visualize this imagery data?

We tend to use OpenLayers, which is similar to Leaflet. Leaflet has a very simple surface area with an ecosystem of plug-ins. OpenLayers offers similar functionality, but the philosophy is to bundle everything together.

Heatmaps visualized in OpenLayers

We’ll contribute back to OpenLayers whatever improvements we make. For example, we implemented the NDVI calculation so it can run entirely in the browser in JavaScript. We view the dissemination of usable data and analysis as one of Planet’s core competencies.

In the future, once you’ve solved the data cleaning problems, what are the opportunities for machine vision experts?

It’s all about monitoring — knowing about the state of the world. I think the app developer’s part is to help people take action and feed the data into areas that aren’t even imagery-related.

I visited a paper manufacturer in Brazil. They grow eucalyptus groves to produce the pulp. If there’s a bug infestation or a fire, they have people drive around with radios even if it’s the middle of the night. It shouldn’t be that hard for computer vision to analyze the area and give daily updates on any issues. How can you create useful alerts that fit into their workflow?

Deforestation is another example. We have an impact team that pursues these areas where there’s not a clear market for it yet. We’re very interested if we could alert park rangers to areas that are being actively damaged.

Finally, think about counting date palm trees. Lots of financial services and agriculture companies care about this. They probably track it in spreadsheets and outdated reports that are months old. Ideally, you could give them an accurate feed that’s updated daily.

Anything you can share about the internal APIs that control the satellites?

A traditional tasking satellite will have low-level APIs to direct it in a certain direction. In other words, the API lets a customer steer it.

For Planet, we have a system called Micro Manager that models the area we’re trying to monitor across the fleet of satellites. It’s a traveling salesperson-type problem that is too complex to solve by brute force. We have 70+ satellites that communicate with roughly 15 ground stations, so there are thousands of passes to communicate each day.

Each ground station has to decide which satellites to communicate with and the proper set of instructions. For example, do we upload new commands, download images, or upload new software updates? Deciding what to do is a very complex problem.

We use a simulated annealing algorithm to make these decisions. There’s a heat map of the areas that we have the most customers in. The operators manage these higher-level inputs and don’t issue specific commands like a rotation.
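A toy sketch of the annealing idea described above: assign each ground-station pass one action so the total value of a day's contacts is maximized. The action names and scoring are made up for illustration, and a real scheduler has far more constraints; this only shows the accept/reject structure of simulated annealing:

```python
import math
import random

# Illustrative pass actions; the real command set is more varied.
ACTIONS = ["download_images", "upload_commands", "upload_software"]

def schedule_cost(schedule, value):
    # Lower cost is better; value[i][action] scores pass i's choice.
    return -sum(value[i][a] for i, a in enumerate(schedule))

def anneal(value, steps=5000, temp=1.0, cooling=0.999, seed=42):
    rng = random.Random(seed)
    n = len(value)
    schedule = [rng.choice(ACTIONS) for _ in range(n)]
    cost = schedule_cost(schedule, value)
    for _ in range(steps):
        # Propose changing one pass's action.
        i = rng.randrange(n)
        old = schedule[i]
        schedule[i] = rng.choice(ACTIONS)
        new_cost = schedule_cost(schedule, value)
        # Always accept improvements; accept regressions with a
        # probability that shrinks as the temperature cools.
        if new_cost > cost and rng.random() >= math.exp((cost - new_cost) / temp):
            schedule[i] = old  # reject the move
        else:
            cost = new_cost
        temp *= cooling
    return schedule, -cost
```

The operators' heat map of customer demand would feed into the `value` scores; the annealer then searches for a good global assignment instead of anyone hand-issuing per-satellite commands.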

It’s similar to the move from specific servers to auto-scaling in the cloud. The configuration for the whole system of satellites controls the lower-level APIs.
