EIE is Filling in The Gaps in Buildings Data

Google Earth and Earth Engine
Aug 30, 2022

By Estefania Lahera, Software Engineer, Environmental Insights Explorer

In 2018, Google’s Environmental Insights Explorer (EIE) first launched building emissions data. It was one of the first of its kind: a global dataset with a transparent methodology and clear, distinct boundaries. Since then, we’ve continued expanding our data coverage and refining our estimates.

Today, EIE is thrilled to announce our latest data refresh using our newest model. The released buildings data includes over 4,000 new cities and regions, bringing our total buildings data coverage to over 13,000 cities and regions worldwide. Local governments can now sign up to access these insights in EIE’s Insights Workspace.

We have substantially increased our geographic coverage of the world, filling in many gaps between urban areas. This update covers over 250% more geographic area — from just over 9.5 million square kilometers to over 35 million square kilometers of the world — resulting in over 50% more buildings counted and 40% more building floor area used to estimate emissions. The newly available regions are predominantly suburban areas or large regions, such as states and provinces. For example, the release includes all 50 US states, all 47 Japanese prefectures, and all 20 Italian regions.

The challenge of buildings data

The emissions associated with the energy used to power, heat, and cool buildings are a significant contributor to global warming. When we include indirect emissions from building construction, indirect and direct building emissions account for about 38% of total global energy-related CO2 emissions. The share is even higher in cities and urban areas; according to C40 Cities, 60% of emissions on average come from operating buildings. Improving building energy efficiency, electrifying heating and appliances, and increasing available carbon-free energy are critical solutions for policymakers around the world looking to meet decarbonization goals and prevent the worst effects of climate change. Determining actionable next steps, however, requires a systematic approach.

Reducing building emissions requires tracking progress against a baseline. Collecting, organizing, and manipulating city-scale data to calculate a baseline is non-trivial. When local utility data is available, cities can use it to accurately measure overall energy usage, but that data may not be measured or structured in a way that best supports regional climate action. For example, if a region intersects with or completely contains the service areas of multiple, possibly overlapping utility companies, it’s hard to determine what fraction of each company’s reported energy use actually pertains to the region in question. Even if the boundaries are non-overlapping and match the region in question with no gaps, utility companies may use different calculation methodologies, so combining their potentially disparate estimates may be challenging.

Moreover, even in places where utility data does provide an accurate emissions estimate, planning building interventions would benefit from complementary information, such as the existing building stock. ICLEI USA noted in their technical review that “EIE provides the data for total floor space of residential and non-residential buildings, which can be useful indicators for understanding emissions changes.” When policymakers understand emissions changes, they are better equipped to target specific interventions, such as automated building energy controls, for maximum impact.

With expanded geographic coverage and a globally consistent methodology, EIE’s newest release aims to mitigate these potential gaps and ambiguities and support building emissions intervention plans.

Methodology advancements

EIE’s expanded coverage and updated emissions estimates are possible thanks to improvements in our building detection and height measurement models.

Building heights are a foundational component in our emissions estimates. We calculate emissions by first combining building floor area with energy intensity factors to estimate building energy usage, and then applying emissions factors to calculate the total carbon footprint. In addition to natural gas, we support two non-utility fuels, diesel oil and propane.
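The calculation chain described above (floor area, then energy use, then emissions) can be sketched in a few lines. This is a minimal illustration, not EIE's implementation: the function and class names are hypothetical, every numeric factor is a placeholder, and deriving floor area from footprint and height with a fixed 3 m storey height is an assumption made only for the example.

```python
from dataclasses import dataclass


@dataclass
class FuelUse:
    """Hypothetical per-fuel factors; the numbers used below are placeholders."""
    energy_intensity_kwh_per_m2: float  # annual energy use per m2 of floor area
    emissions_factor_kg_per_kwh: float  # kg CO2e emitted per kWh of this fuel


def estimate_floor_area_m2(footprint_m2: float, height_m: float,
                           floor_height_m: float = 3.0) -> float:
    """Approximate floor area as footprint times an estimated floor count.

    The 3 m storey height is an assumption for illustration only.
    """
    floors = max(1, round(height_m / floor_height_m))
    return footprint_m2 * floors


def estimate_emissions_kg(floor_area_m2: float, fuels: dict[str, FuelUse]) -> float:
    """Floor area -> energy use -> emissions, summed over all fuels."""
    total = 0.0
    for fuel in fuels.values():
        energy_kwh = floor_area_m2 * fuel.energy_intensity_kwh_per_m2
        total += energy_kwh * fuel.emissions_factor_kg_per_kwh
    return total


# Placeholder factors, not EIE values.
fuels = {
    "natural_gas": FuelUse(100.0, 0.2),
    "propane": FuelUse(10.0, 0.25),
}
area = estimate_floor_area_m2(footprint_m2=200.0, height_m=9.0)
emissions = estimate_emissions_kg(area, fuels)
print(f"{area:.0f} m2 of floor area, {emissions:.0f} kg CO2e per year")
```

In this sketch a taller height estimate yields more floors and therefore more floor area, which is one way to see why height coverage matters for the final emissions figure.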

While we are constantly refining our building emissions calculation methodology, this particular data refresh is powered by a novel building height regression model. At its core, the model is a convolutional neural network that uses aerial data, including both imagery and digital elevation model measurements, to calculate height measurements. The model employs a ResNet-50 backbone and was trained against the highest-quality aerial dataset Google has available. Once trained, the model was applied to a lower-resolution but more widely available aerial dataset. Our emissions calculations use both the high-quality height measurements from the training data and the new measurements inferred by the model.

In this new release, a majority of released cities and regions now have over 90% building heights data coverage. By incorporating the new model’s precise, globally consistent building heights, we’ve greatly expanded EIE’s global coverage and improved our estimates in already released cities.

Data quality

The EIE team’s data quality efforts have, behind the scenes, evolved over the past four years in preparation for this release. We reimagined our evaluation process and, with the new infrastructure, expanded our analysis. Our data quality approach considers both external data and internal consistency across a city. For example, we assess buildings per capita with detailed population data, and examine the distribution of building heights in each city. Both the model’s building height data and the emissions estimates using that new data were independently assessed for quality, and only data for cities and regions that met our standards was used and ultimately released.
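As a rough illustration of the two kinds of checks mentioned above, the sketch below combines an external check (buildings per capita against population data) with an internal one (the distribution of building heights). The function name and both thresholds are hypothetical placeholders, not EIE's actual quality standards.

```python
from statistics import median


def passes_quality_checks(building_heights_m: list[float], population: int,
                          per_capita_range: tuple[float, float] = (0.05, 2.0),
                          max_plausible_median_m: float = 60.0) -> bool:
    """Return True when a city's data looks plausible under two simple checks.

    Both thresholds are made-up placeholders for illustration.
    """
    if population <= 0 or not building_heights_m:
        return False
    # External check: compare the building count with population data.
    per_capita = len(building_heights_m) / population
    low, high = per_capita_range
    if not (low <= per_capita <= high):
        return False
    # Internal check: heights must be positive and the median must be plausible.
    if min(building_heights_m) <= 0:
        return False
    return median(building_heights_m) <= max_plausible_median_m


plausible_city = passes_quality_checks([3.0, 6.0, 4.5, 12.0], population=10)
sparse_city = passes_quality_checks([3.0], population=1000)
print(plausible_city, sparse_city)
```

A city that fails either check would simply be held back from release in this sketch, which mirrors the idea that only data meeting the quality bar is published.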

While preliminary insights from the aerial data led to improved buildings data estimates in our July 2021 buildings data release, applying our evolved analysis capacity and quality standards to the full model results gives us the confidence to release the largest increase in buildings data coverage since EIE was first launched.

Going forward

We are excited to share this data with the world, but we know we have more work ahead of us. EIE’s buildings data is a starting point for policymakers to quickly establish inventories and dive into crafting solutions. From insulation retrofitting to targeted heat pump installations, there are many creative initiatives that benefit from emissions and building inventory data. Planning is just the beginning, however, and as EIE continues to improve and grow our data, we hope to support cities throughout their climate action journeys.

If you are interested in what EIE can do for your community, fill out this form to learn more and get in touch with our team. And, if you work with a local government and haven’t signed up for EIE, you can do so here.

Frequently asked questions

For more information, please visit the EIE Methodology page.

My city’s emissions in EIE have changed significantly with this new update. Why is that?

The most likely reason is our improved understanding of your city’s building count, building heights, or building usage. A city’s building stock is dynamic, and when we refresh our data, we work to incorporate the changes and adjust our emissions estimates accordingly. A hypothetical example would be a city that recently closed many of its retail spaces and converted them into office spaces. Another, less common reason is that the city’s boundaries may have changed.

Any new data — whether reflective of recent changes in the physical world, or simply new information to us — can affect emissions estimates.

If EIE only shows residential and nonresidential emissions, how does EIE account for multi-use buildings? Are they considered residential or nonresidential?

Behind the scenes, we actually calculate emissions based on building floor area. A multi-use building’s emissions are split between the residential emissions and nonresidential emissions estimates based on our understanding of the building and floor usage. The allocation takes into account the fraction of the building’s area that is nonresidential versus residential, and the different emissions factors of the different types of establishments found in the building.
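The allocation described above can be sketched as a sum over the establishments in a building, each contributing its floor area times an emissions factor to its own category. The function name, the unit layout, and the per-square-meter factors below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def split_multi_use_emissions(
    units: list[tuple[str, float, float]],
) -> dict[str, float]:
    """Split one building's emissions between the two reported categories.

    Each unit is (category, floor_area_m2, kg CO2e per m2 per year), where
    category is "residential" or "nonresidential". Factors are placeholders.
    """
    totals = {"residential": 0.0, "nonresidential": 0.0}
    for category, floor_area_m2, kg_per_m2 in units:
        totals[category] += floor_area_m2 * kg_per_m2
    return totals


# A hypothetical building: apartments above a restaurant and an office.
building = [
    ("residential", 300.0, 20.0),
    ("nonresidential", 100.0, 60.0),  # restaurant
    ("nonresidential", 100.0, 30.0),  # office
]
split = split_multi_use_emissions(building)
print(split)
```

Here the residential units occupy 60% of the floor area but account for only 40% of the building's emissions, because the nonresidential establishments carry higher placeholder factors.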

Does EIE provide emissions broken down by more specific building types?

We currently only surface the emissions for two overarching categories, residential and nonresidential, per the GPC protocol for reporting building emissions inventories. We are exploring ways to make our buildings data more helpful and actionable to policy planning by leveraging more fine-grained building type classifications.

How does the model determine the height of buildings with a spire? Or buildings that have one tall, high tower but are otherwise short?

Our model calculates multiple “kinds” of height measurements, including the “mean” height of the building, and not just the “max” height. When available, we use the height measurement that best approximates the true area of the building in our emissions estimates.
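To see why the choice of height measurement matters, consider a hypothetical building whose height is sampled at ten points across its footprint, nine on a low roof and one on a spire. The sample values and the 3 m storey height below are assumptions for illustration, not outputs of the EIE model.

```python
from statistics import mean

# Hypothetical height samples (meters): nine on the main roof, one on a spire.
height_samples_m = [10.0] * 9 + [60.0]
FLOOR_HEIGHT_M = 3.0  # assumed storey height, for illustration only

max_height_m = max(height_samples_m)
mean_height_m = mean(height_samples_m)

# Estimated floor counts under each kind of height measurement.
floors_from_max = round(max_height_m / FLOOR_HEIGHT_M)
floors_from_mean = round(mean_height_m / FLOOR_HEIGHT_M)

print(f"max height {max_height_m} m -> {floors_from_max} floors")
print(f"mean height {mean_height_m} m -> {floors_from_mean} floors")
```

Using the maximum would treat the whole footprint as if it were as tall as the spire, while the mean stays much closer to the usable floor area of a mostly low building.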

Why is my state, province or greater area available but not my city?

EIE calculates a region’s building emissions by aggregating over the individual buildings data. Smaller cities are more likely to be affected by data anomalies, and therefore may not meet our quality standards. This is less of a problem in large areas where the majority of the data is solid; thus, anomalies are often insignificant after aggregating.
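The effect of aggregation on anomalies can be shown with simple arithmetic. In the sketch below, the same erroneous record is added to a small city and to a large state; the building counts and tonnage figures are invented for the example.

```python
def anomaly_share(building_count: int, typical_tonnes: float,
                  anomaly_error_tonnes: float) -> float:
    """Size of one bad record's error relative to the region's aggregated total."""
    total_tonnes = building_count * typical_tonnes
    return anomaly_error_tonnes / total_tonnes


# The same 500-tonne error in two hypothetical regions.
small_city = anomaly_share(100, 10.0, 500.0)
large_state = anomaly_share(100_000, 10.0, 500.0)

print(f"small city: {small_city:.2%} of the total")
print(f"large state: {large_state:.2%} of the total")
```

The error is identical in absolute terms, but it is a large fraction of the small city's total and a negligible fraction of the state's.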

Does this mean that previous emissions estimates were wrong?

Not necessarily. Our estimates reflect our best understanding of the world when they are calculated, and we do our best to only release data for cities and regions that we are sufficiently confident in. We released emissions estimates for places where we either had enough data or our models and assumptions were close enough to reality to yield accurate emissions estimates.

One such example is Redland City, Australia, a region in the Brisbane metropolitan area. Although Redland City’s building height data coverage more than tripled in the new release, the emissions estimate increased by only 6%. The improved estimate is very similar to the original because, while the original methodology didn’t have access to as much data, its assumptions and estimates were very close to the new measurements and to reality.
