How Open Data Saved us Half a Million Dollars (at least)

Tyler Kleykamp
Dec 6, 2017

The Center for Land Use Education and Research (CLEAR) at UCONN has long been distributing important data for the State as part of their mission to provide information to land use decision makers and others. Their Land Cover dataset for Connecticut has given us important data on 25 years’ worth of landscape change in Connecticut. The State’s Department of Energy and Environmental Protection has worked with them to provide a variety of imagery, elevation, and natural resource information to the public through CTECO. Most recently they helped us provide high resolution aerial photography and elevation data to the public.

In 2012, the State funded the collection of high resolution aerial photography (often called orthoimagery). We worked with CLEAR then to make it all available online through map services and downloads. I’ve previously suggested that public domain imagery is undervalued by the open data community. With that data now available, the State’s Department of Energy and Environmental Protection funded the development of impervious surface data through CLEAR. These data were a raster that identified buildings, roads, parking lots, and other surfaces, and were intended to support municipalities in complying with new storm water discharge requirements known as “MS4.” CLEAR of course completed this work and, as usual, made all of this data available online. Had things stopped there, this might have been a nice story about how the availability of this imagery helped a state agency develop some data at a reduced cost, though the savings would still be difficult to quantify with a firm dollar figure. However, I wasn’t all that excited about this, as it’s a fairly typical example of the use of imagery, and raster data can be tricky for more casual users of geospatial data.

In November, I was at the CT Data Collaborative’s Conference where a colleague from CLEAR, Emily Wilson, happened to be presenting. We were chatting and catching up, and discussing some ways we might be able to collaborate. At some point, she shared with me that an engineer from ESRI’s Community Basemaps Program picked up this impervious surface data, and extracted building footprints (along with the other data) from it and converted them to vectors, such that each individual building was a unique feature. In addition, they shared the data back to her, without any restrictions on access or use of the data. It didn’t stop there though. The engineer not only extracted the building footprints and converted them to vectors but added data from parcels and address points that we had collected from towns (that had the data and were willing to give it to us) a few years ago. This was an incredibly exciting development.

Image from CLEAR: the raster version overlaid with the outlines from the vector version.
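To give a rough sense of what it means for each building to become "a unique feature," here is a minimal sketch in Python. It is not the method the ESRI engineer used, which I don't know in detail. It assumes a tiny hypothetical grid where `1` marks a building cell and `0` marks everything else, and it only groups touching cells into separate features. Tracing those groups into polygon outlines would be a further step handled by GIS software.

```python
from collections import deque


def label_buildings(grid):
    """Group 4-connected building cells (value 1) into separate features.

    Returns a list of features, each a sorted list of (row, col) cells.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    features = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited building cell.
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                neighbors = ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1))
                for nr, nc in neighbors:
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and grid[nr][nc] == 1 and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            features.append(sorted(cells))
    return features


# Hypothetical 4x5 raster: 1 = building cell, 0 = anything else.
raster = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
buildings = label_buildings(raster)
```

In this toy grid the two blocks of cells never touch, so they come out as two separate features. In a real raster, adjoining buildings would merge into one group, which is one reason extracted footprints need review.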

Building footprints are valuable data for a number of reasons and have broad applicability, with or without additional details. Adding addresses and other property-based information to them only increases this. Last year we requested information on the costs associated with developing this data. We were quoted 40 cents per building, for only the polygons and no additional data. With almost 1.5 million buildings in the resulting dataset, we’ve saved over $500,000 for the building polygons alone. Adding the additional data to the buildings only increases this number.

Intersecting and joining polygons in order to add additional data is an imperfect process. Both the spatial precision of these data and the incomplete nature of the parcel and address data make this a rather messy dataset with numerous issues. Some buildings may be split along parcel lines, resulting in slivers that are incorrectly attributed to a neighboring parcel. CLEAR, while in possession of this data, has decided not to release it, understandably, because there are many issues with its quality and reliability. It still has value though, and therefore we’re providing it as sort of an experimental release. My hope is that Connecticut’s data community will help us make this data better as they work with it and use it. While we look for the best way to make geospatial data more broadly accessible in CT, we’ve also provided access to the other vector data, including the raw building data, as part of our exploration of implementing a “GeoHub.” Of course, CLEAR provides access to it in a variety of ways as well.
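For anyone who wants to experiment with the sliver problem described above, here is a minimal sketch of one common cleanup idea: attribute each building to the parcel that covers most of it, rather than keeping every intersected piece. This is an illustration only, not how the released data was produced. It assumes buildings and parcels are simple boxes given as `(xmin, ymin, xmax, ymax)`, and the parcel names and coordinates are made up. Real footprints are arbitrary polygons and would need a GIS library.

```python
def overlap_area(a, b):
    """Area of the intersection of two axis-aligned boxes (0.0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def assign_parcel(building, parcels):
    """Return the id of the parcel covering most of the building, or None."""
    best_id, best_area = None, 0.0
    for parcel_id, box in parcels.items():
        area = overlap_area(building, box)
        if area > best_area:
            best_id, best_area = parcel_id, area
    return best_id


# Hypothetical parcels sharing a boundary at x = 10.
parcels = {
    "parcel_A": (0.0, 0.0, 10.0, 10.0),
    "parcel_B": (10.0, 0.0, 20.0, 10.0),
}
# The building sits almost entirely on parcel_A but pokes 0.5 units into parcel_B.
building = (4.0, 2.0, 10.5, 8.0)

# A plain intersection yields two pieces, one of them a thin sliver.
pieces = {pid: overlap_area(building, box) for pid, box in parcels.items()}
# Largest-overlap assignment keeps the building whole and picks one parcel.
owner = assign_parcel(building, parcels)
```

Here the piece on `parcel_B` is only a small fraction of the building, so the building is attributed to `parcel_A`. The rule can still pick the wrong parcel when the parcel lines themselves are imprecise, so results deserve a spot check.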

This story also speaks to the value of getting various members of the data community together. Had Emily and I not been at the CT Data Collaborative’s conference, we might not have had this conversation, and thus I may never have learned about this enriched dataset. Connecticut’s data community remains a bit fragmented. Bringing the GIS community into the broader data community is something that needs to happen more.

Written by Tyler Kleykamp

I was the first Chief Data Officer for the State of Connecticut
