“Farming looks mighty easy when your plow is a pencil and you’re a thousand miles from the corn field” — Dwight D. Eisenhower
As the nation’s agricultural journalists and agronomists set out on their annual “boots on the ground” tour of the corn belt, randomly selecting farms to visit and plants to weigh and measure, we have just delivered our first space-based forecast of US corn yield and production (reported on this week by Bloomberg). Using satellite images plus weather observations collected daily over 15+ years, we’ve built a machine learning system in the cloud that watches the progress of the nation’s corn farms and predicts the number of bushels of grain that will come out of the fields in each county, with greater accuracy and speed than traditional survey methods. This technology has the potential to change the way we assess and understand the challenges of food security in an age of resource depletion and global climate change.
From science to startup
Descartes Labs is a venture-backed startup focused on understanding the Earth through satellite imagery. We use machine learning, remote sensing science, and large-scale cloud computing with satellite imagery to explore the past, present and future of human activity and natural resources at global scale. As described by my co-founder Mark Johnson, we’re a small team of scientists and technologists spun out of a government research facility, Los Alamos National Laboratory, descendant of the wartime Manhattan Project, hidden on a secluded mesa in beautiful northern New Mexico.
We opened our doors last December, and since then we’ve been hard at work turning satellite pixels into quantitative data. Fast Company has a nice write-up of our initial efforts to teach our machine learning system to find wheat fields across the US, and mentions our intention to process all the NASA Landsat satellite imagery, going back to the original Landsat 1 satellite launched in 1972. Landsat was an implementation of Eisenhower’s Open Skies initiative, a means to foster world peace through increasing mutual understanding between nations and avoid surprises, and its main mission is to map the world’s natural resources and monitor them for changes indicating threats to our environment.
In fact, not long after the Fast Company story broke we surpassed our technical goal, processing over a petabyte of satellite imagery in under 16 hours using a virtual supercomputer of 30,000 CPU cores, which we conjured in the Google Cloud (as blogged about by the fine folks at Google). While wheat is a marvelous grain, and (along with rice) forms the foundation for feeding 7 billion hungry humans, the biggest prize in agriculture is corn, which goes into ethanol for our cars, feed for our animals and sweet syrup for our densely-caloric beverages, as well as into traditional healthy foods. So we’ve focused our initial efforts on corn, and since the US is the world’s largest producer of this most valuable agricultural commodity, we’ve started our work here at home.
Here be (mathematical) Dragons …
Since there are never enough column inches in a press story to fully describe what we are doing, we’d like to go into just a little bit more detail than usual (feel free to jump to “What’s Next?”, below).
First, and very importantly, we need to emphasize that our models have been optimized to estimate the actual ground conditions of the crops, and no attempt is made to model the survey-based estimates appearing in the USDA National Agricultural Statistics Service August report or other crop forecast products. Instead of conducting paper surveys of ~10,000 farmers per month, or sending out workers to visit ~1,000 fields to sample actual ears of corn, we are using satellite observations of ~100,000 farms per day to estimate yield and production from the visible and infrared spectral and temporal signatures of the plants seen from orbit.
Our methods are based on analysis of over 10 trillion pixels of visible and infrared satellite imagery collected across the United States over 15 years of observations. We use satellite observations of the Earth from the NASA/USGS Landsat TM, ETM+ and OLI sensors and the NASA Aqua/Terra MODIS sensors, collected between January 1, 2004, and the present date, augmented by weather observations from ground stations across the US over 30+ years.
We validate our growing-season models with a historical back-test for the years 2004 to 2014, and make forecasts for this year. Our day 216 (early August) back-tested national-level corn grain yield estimate has a median absolute error of 2.5 bu/ac (bushels per acre). That is, our prediction can be expected to be within 2.5 bu/ac of the final USDA yield estimate in half of the years. This mid-season prediction improves with additional satellite observations over the remainder of the growing season, and reaches an end-of-season national-level yield estimate with a median absolute error of 1.5 bu/ac. Compared to the “gold standard” survey methods at USDA, our historical day 216 (e.g., August 4, 2015) predictions are more accurate than the USDA August yield forecast, our day 248 (September 5, 2015) predictions are more accurate than their September forecast, and the day 280 (end-of-season: October 7, 2015) forecasts are more accurate than their October forecast.
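For readers who want to see how a "median absolute" figure like the ones quoted above is computed, here is a minimal Python sketch. It is not our production code: the function names `median_absolute_error` and `day_of_year_to_date` are hypothetical, and the yield numbers are placeholders invented purely for illustration, not real forecasts or USDA values. The second helper shows how a day-of-year such as 216 maps to a calendar date.

```python
from datetime import date, timedelta
from statistics import median


def day_of_year_to_date(year: int, day_of_year: int) -> date:
    """Convert a 1-based day of year (e.g. 216) to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


def median_absolute_error(predicted, reference):
    """Median of |predicted - reference| across the back-test years."""
    return median(abs(p - r) for p, r in zip(predicted, reference))


# Placeholder national yields in bu/ac, one entry per back-test year.
# These values are made up for illustration only.
predicted = [150.0, 148.5, 153.0, 160.0, 147.0]
reference = [152.0, 147.0, 156.5, 159.0, 149.5]

# Half of the years have an absolute error at or below this value.
print(median_absolute_error(predicted, reference))

# Forecast days mentioned in the text, expressed as 2015 dates.
for doy in (216, 248, 280):
    print(doy, day_of_year_to_date(2015, doy))
```

The median is used here rather than the mean so that a single unusual year does not dominate the summary, which matches the "within 2.5 bu/ac in half of the years" reading given above.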
Faster & Better: Descartes Labs prediction error as a function of day-of-year for US corn yield, compared to USDA NASS/WASDE forecast error.
What’s Next?
Now that our quantitative analysis, based on over a decade of daily satellite and weather observations, shows that we’re faster and more accurate than USDA at estimating US corn yield and production, what’s next? US corn supply is just one term in a global balance of supply and demand, and in our search for an understanding of the world as a single system we naturally want to get to an estimate of global production soon. Surprisingly, the world does not yet have a global, real-time map of production for corn or any other commodity crop. While there is some interesting ongoing work on global forests by the World Resources Institute, resources from water to minerals to our sprawling urban zones could all benefit from continual scientific observation and modeling, and Descartes Labs intends to help bring this new understanding to market.
The implications go well beyond the direct benefits to commodities trading. The economics of corn have driven wheat production in the US onto more marginal farmland across the former prairies and down into the southwestern US, where it is highly dependent on ancient aquifers. These aquifers filled up from rainfall over thousands of years, and we might be on track to effectively drain them in just another 20 years, a much shorter timescale than the one usually associated with concerns over global climate change. If that doesn’t seem like a terribly good situation, we tend to agree.