Gaps in Quantitative Data
Quick Recap: Prasenjit and I are students at Cornell Tech researching a model that lets many farmers with diverse products collaborate to serve institutional buyers like hospitals or schools. I used to be a farmer in Upstate New York. Farmers and dining services managers, we want your data!
In addition to showing that it is possible to optimize matches between institutional food sourcing and local producers, we also need to show that it is worth it. We need to compare local food to non-local food based on dollars and food miles. We’re hoping to show that by buying directly from farmers, buyers can save on both.
Food miles are exactly what they sound like: the number of miles a food travels from where the raw ingredient was grown or raised to where the product is ultimately consumed. We’re using this as a proxy for the carbon footprint of the food we consume.
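To make that concrete, here is a rough sketch of how food miles could be estimated from two sets of coordinates. It assumes straight-line (great-circle) distance, which understates real trucking routes, and the function name and the approximate city coordinates are just illustrations, not part of our actual model.

```python
import math

# Mean Earth radius in miles (a standard approximation).
EARTH_RADIUS_MILES = 3958.8

def food_miles(origin, destination):
    """Great-circle distance in miles between two (lat, lon) points in degrees.

    Assumption: straight-line distance stands in for the real route,
    so this is a lower bound on the miles actually traveled.
    """
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, destination)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine formula
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

# Illustrative, approximate coordinates (not data from our project).
ITHACA_NY = (42.44, -76.50)
SALINAS_CA = (36.68, -121.66)
NEW_YORK_CITY = (40.71, -74.01)

local_miles = food_miles(ITHACA_NY, NEW_YORK_CITY)
non_local_miles = food_miles(SALINAS_CA, NEW_YORK_CITY)
print(f"Local: {local_miles:.0f} miles, non-local: {non_local_miles:.0f} miles")
```

Swapping in road distances would give bigger numbers for both, but the comparison works the same way.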
While searching for data that would let us make consistent assumptions about non-local food, we started poking around the USDA (United States Department of Agriculture) website. Prasenjit was much more hopeful than I was about finding data granular enough to be helpful. My personal experience with the USDA is that they are still not sure if this whole internet thing is gonna take off. In 2014/2015 I had four separate in-person meetings with my local loan officers to get a USDA farm loan.
However, the USDA publishes the Ag Census results by product, by county. This should not have been a surprise, since I filled out the Ag Census in 2017 for my operation. But I was delighted that we could find information as specific as acres of beets by county. Prasenjit pointed out that usually people only get that excited about Beats by Dre.
This is a big deal because we had been planning to rely heavily on our own data collection to determine regional capacities for different products, but that data is always going to be heavily biased towards people whose phone numbers I already have.
Now we can use Ag Census data to compare a product from NY to the same product from CA. We can literally compare apples to apples. At least for food miles.
Now, the next big hurdle in terms of data is deciding how to make generalizations about yields. The Ag Census counts in acres and in dollars, but not in quantities of product.
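One way to think about that generalization is acres times an assumed yield per acre, carried as a range instead of a single number. The sketch below is only that: the function name is hypothetical and the figures are made-up placeholders, not Ag Census values or real yields.

```python
def estimate_quantity_range(acres, low_yield_per_acre, high_yield_per_acre):
    """Turn acreage into a (low, high) quantity estimate.

    Assumption: yield per acre is supplied by us, since the Ag Census
    reports acres and dollars but not quantities of product.
    """
    if acres < 0 or low_yield_per_acre < 0 or high_yield_per_acre < 0:
        raise ValueError("acres and yields must be non-negative")
    if low_yield_per_acre > high_yield_per_acre:
        raise ValueError("low yield must not exceed high yield")
    return acres * low_yield_per_acre, acres * high_yield_per_acre

# Placeholder numbers for illustration only.
low, high = estimate_quantity_range(acres=10, low_yield_per_acre=1000,
                                    high_yield_per_acre=2000)
print(f"Estimated quantity: {low} to {high} lbs")
```

Keeping a low and a high end makes it obvious how much of any result depends on the yield assumption.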
Of course, using the Ag Census data shows what is possible in terms of matching farmers with institutions, not what is probable or actionable. We’re definitely many iterations away from having a model that is real enough to work, but it’s happening.