Generative AI for Zoning Acquisition

Published in Urban AI · 7 min read · Jun 3, 2024

In this series of articles, we share experiments on using Generative AI, in the form of LLMs, to simulate household location choices and urban growth patterns, and to interpret zoning constraints in order to automatically create algorithms that compute capacity in Oakland, CA, USA and Auckland, NZ.

The development of algorithms that take zoning constraints as input has long posed significant challenges, akin to the difficulties inherent in any software system that must process and interpret natural language. For decades, computer scientists have tried to craft solutions capable of overcoming these obstacles, driven by the vast potential that lies in the ability to systematically structure knowledge encapsulated within human-readable texts, such as legal systems and regulations.

In the 1980s, there was a surge of activity around the popularization of the logic programming paradigm, symbolized by the PROLOG language. This development was accompanied by promises of revolutionizing the field of knowledge representation and reasoning, yet these aspirations were not fully realized. However, recent breakthroughs in neural network architectures and their remarkable capacity for natural language processing, as exemplified by large language models, have reignited interest in tackling these long-standing challenges.

Zoning constraints coloring.

Zoning regulations and their associated constraints find widespread application across diverse domains, with one of their most crucial roles being the determination of the construction capacity permitted on a given parcel of land. This computation is important for both urban planning and the real estate sector. However, a key distinguishing factor between these two domains lies in the scale of the study area under consideration. Urban planning typically encompasses vast, expansive regions, necessitating an analysis that spans extensive geographic areas. In contrast, the real estate industry often focuses its attention on a relatively small subset of parcels. As the number of parcels under evaluation increases, the importance of employing robust and efficient methodologies for the automated computation of construction capacities becomes increasingly evident. This is particularly true in the context of land use simulation models, where the sheer magnitude of the study area can result in the need to assess thousands, or even tens of thousands, of individual parcels.

Zoning data presents unique challenges stemming from its inherent lack of structured representation. Across different urban regions, not only do the specific values for constraints such as setbacks and maximum heights vary, but the very attributes that define capacity calculations can differ. Furthermore, certain constraints defy straightforward quantification and necessitate complex methodologies involving the acquisition of supplementary data to enable accurate capacity computations. Examples of such constraints include maximum height restrictions that depend on the average height of buildings within a given block, or geometric restrictions that consider the spatial relationships with neighboring structures. The process of computing development capacities transcends the mere collection of input parameters such as setback lengths. It also entails the discovery and formulation of the sequence of steps that must be undertaken to arrive at the final computation. Adding to the complexity, this computational process can vary significantly across different land use categories, such as residential or commercial zones.
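To make this concrete, here is a minimal sketch of how one category's constraints might look once structured; the attribute names and values are invented for illustration and are not taken from either city's code:

```javascript
// Illustrative only: attribute names and values are hypothetical.
const exampleCategory = {
  category: "R-EXAMPLE",
  setbacks: { frontFt: 15, sideFt: 5, rearFt: 15 },   // simple scalar constraints
  maxLotCoverage: [                                    // value depends on lot size
    { lotUpToSqFt: 4000, coverage: 0.45 },
    { lotUpToSqFt: Infinity, coverage: 0.4 },
  ],
  // Constraints like this one resist simple quantification: they require
  // supplementary data about neighboring buildings before any capacity
  // computation can run.
  maxHeightRule: "average height of buildings on the block",
};
```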

From a software engineering perspective, both the data processing and the method for capacity calculation are relatively trivial problems. However, the gap created by the need to interpret natural language is one of the hardest to bridge. Previous approaches taken by our group centered around the creation of domain-specific languages (DSLs) that could be learned by urban planning professionals. This involved interpreting zoning laws and codifying them in a manner that a custom DSL parser could then convert into computable algorithms. While this initiative proved novel and effective, providing a significant improvement over past methods, it came with significant drawbacks. The development of these DSLs is an inherently complex task, as is the effort required for professionals without software development expertise to learn the DSL and translate the constraints into it. In essence, while successful in accurately representing difficult constraints and enabling automatic capacity computation, this approach necessitated significant human intervention.

Zoning map for Oakland, CA as part of a CityCompass™ land use simulation.

Large language models and other generative AI systems have shown remarkable capabilities in both natural language understanding and source code generation. Given these strengths, we decided to explore an intriguing experiment — feeding zoning documents into these models to see what level of automation could be achieved. We conducted this trial across different regions, including Oakland, a city in the United States, and Auckland in New Zealand. The goal was to assess the models’ ability to interpret and create algorithms that represent the complex rules and regulations contained within zoning documentation.

OAKLAND, USA

We began by extracting the relevant parts of the city's zoning code, focusing on some residential and mixed-use categories (RM and C), given that the LLM we are using has no specific knowledge of this topic. After feeding these excerpts to the model, we could begin to ask questions about density, such as the following:

LLM interpretation of maximum residential density in natural language.

As you can see, the model can read and interpret tables, and summarize information on a specific zoning category, which is good, but still far from a format that an algorithm could understand. We then went a step further, asking the model to tell us about the RM-3 maximum lot coverage, this time not in natural language but as a JSON data object:

JSON output representing maximum lot coverage.
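The screenshot is not reproduced here, but the kind of object the model returns has roughly this shape; the lot-size thresholds and coverage percentages below are placeholders, not the actual RM-3 values:

```json
{
  "zone": "RM-3",
  "constraint": "Maximum Lot Coverage",
  "rules": [
    { "Lot Size": "less than 4,000 sq ft", "Maximum Coverage": "45%" },
    { "Lot Size": "4,000 sq ft or more", "Maximum Coverage": "40%" }
  ]
}
```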

Clearly, an algorithm could take this output, interpret the conditions in the “Lot Size” field, and build a function from them. But what if the LLM can do that too? Well, it can.

LLM generated a function to compute maximum lot coverage.
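The generated code is shown as a screenshot in the original post; a sketch of what such a function looks like, using the same placeholder thresholds as above rather than the actual RM-3 values, would be:

```javascript
// Sketch of the kind of function the LLM produced for maximum lot coverage.
// Thresholds and percentages are placeholders, not the actual values from
// the Oakland Planning Code.
function maxLotCoverage(lotAreaSqFt) {
  if (lotAreaSqFt < 4000) {
    return 0.45; // e.g. 45% coverage for smaller lots
  }
  return 0.4;    // e.g. 40% coverage for larger lots
}

// Usage: maximum buildable footprint for a hypothetical 5,000 sq ft lot.
const footprint = 5000 * maxLotCoverage(5000); // 2,000 sq ft
```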

In short, this generative AI was able to create a JavaScript function that receives the area of a lot and returns the maximum coverage allowed. We also tried setbacks (expressed in terms of parcel area), and we got the following function:

Setbacks computation function automatically generated from the planning code.
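That function, too, appears only as a screenshot; its general shape, with placeholder distances, is roughly the following:

```javascript
// Sketch of a setback function keyed on parcel area. All distances are
// placeholders for illustration, not the actual Oakland requirements.
function setbacks(parcelAreaSqFt) {
  if (parcelAreaSqFt < 5000) {
    return { frontFt: 15, sideFt: 4, rearFt: 15 };
  }
  return { frontFt: 20, sideFt: 5, rearFt: 20 };
}
```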

Finally, we challenged the model by asking for code that returns the maximum allowed height, this time as a function of the slope (a parameter considered in this particular zoning code) and for all four residential / mixed-use categories (rather than a single category, as in the previous examples):

Maximum allowed height as a function of parcel slope and zoning category.
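That answer is again shown as a screenshot; a sketch of its general shape follows, where the category names echo the RM labels used above but the height limits and slope breakpoints are hypothetical:

```javascript
// Sketch only: the category-to-height mapping and the slope breakpoint
// are hypothetical, standing in for the values in the actual code.
function maxAllowedHeight(category, slopePercent) {
  const baseHeightFt = { "RM-1": 30, "RM-2": 30, "RM-3": 35, "RM-4": 40 };
  if (!(category in baseHeightFt)) {
    throw new Error(`Unknown category: ${category}`);
  }
  // Example of a slope adjustment: steeper lots get a small allowance,
  // mirroring the kind of rule found in hillside provisions.
  const slopeAllowance = slopePercent >= 20 ? 5 : 0;
  return baseHeightFt[category] + slopeAllowance;
}
```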

AUCKLAND, NEW ZEALAND

After loading the local zoning documents, we asked the LLM to create a function that checks whether a given building polygon complies with the building coverage constraints. This challenge has the additional complication of requiring polygon area computations, and as can be seen, the resulting algorithm is correct:

Creation of a coverage check function that requires computing polygon area.
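The generated algorithm appears only as a screenshot in the original post. The core of such a check is a polygon-area computation, for example via the shoelace formula, compared against the allowed coverage ratio; a minimal sketch along those lines follows, with the 40% ratio used as a placeholder value:

```javascript
// Sketch of a building-coverage check. Coordinates are assumed to be in a
// planar (projected) system so the shoelace formula yields real areas;
// the 0.4 coverage ratio is a placeholder value.
function polygonArea(coords) {
  // coords: array of [x, y] vertices, first vertex not repeated at the end
  let sum = 0;
  for (let i = 0; i < coords.length; i++) {
    const [x1, y1] = coords[i];
    const [x2, y2] = coords[(i + 1) % coords.length];
    sum += x1 * y2 - x2 * y1;
  }
  return Math.abs(sum) / 2;
}

function compliesWithCoverage(buildingPolygon, parcelPolygon, maxCoverage = 0.4) {
  return polygonArea(buildingPolygon) <= maxCoverage * polygonArea(parcelPolygon);
}
```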

This urban area was chosen for the experiment because its code includes recession planes, which pose a special challenge both for extracting zoning information and for creating algorithms to compute capacities. Several prompts were tested, but it was impossible to obtain a correct function to compute any of the properties of these structures.

How to compute recession planes, from an official guide from Auckland Council.

After this difficulty, we asked the LLM to find a generic algorithm to compute capacity according to the area's constraints.

It is evident that what the model produced is only partial and fails to notice significant restrictions that are present in the area. It might be possible to get better results by carefully applying prompts step by step, but it is clear that the model does not have a comprehensive understanding of the rules.

Failed attempt to compute capacity, taking into account complex constraints in Auckland.

CONCLUSIONS

It is important to understand that all these experiments required significant manual intervention, encompassing the identification and retrieval of relevant documents as well as an iterative process of prompt engineering. There are some recent initiatives trying to automate this process, providing an automated workflow of web scraping and prompt creation. As part of these experiments we tried AgentGPT, with disappointing results — the technology is promising but not yet mature enough for this kind of data.

A significant limitation of current large language models (LLMs) is their inability to interpret and analyze maps and spatial data formats. Beyond processing textual descriptions of zoning categories and overlays, a crucial component of zoning acquisition involves determining the specific zoning designation and applicable overlays for individual parcels. This determination often relies on visual examination of maps containing multiple vector layers, typically provided in formats such as shapefiles or similar geospatial data representations. In instances where parcel categorization is relatively straightforward, because the necessary layers are available in digital formats, this limitation may not significantly impede the performance of large-scale capacity calculation algorithms. Where such layers are not readily available, however, the inability to effectively reason over and integrate spatial data representations constitutes a significant bottleneck, requiring human intervention and expertise to ensure accurate and comprehensive zoning analysis.
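For context, the parcel-categorization step described above amounts to a spatial join between the parcel layer and the zoning polygons. A minimal ray-casting sketch of that operation follows; the feature layout (a `ring` of coordinates and a `zoneCode` attribute) is hypothetical, and in practice this step is usually done with GIS tooling rather than hand-rolled code:

```javascript
// Minimal point-in-polygon test (ray casting) used to assign a zoning code
// to a parcel's representative point. Attribute names are hypothetical.
function pointInRing(point, ring) {
  const [px, py] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses = (yi > py) !== (yj > py) &&
      px < ((xj - xi) * (py - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

function zoneForParcel(parcelPoint, zoningFeatures) {
  const match = zoningFeatures.find(f => pointInRing(parcelPoint, f.ring));
  return match ? match.zoneCode : null;
}
```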

The findings of these experiments suggest that state-of-the-art large language models (LLMs) have the potential to substantially augment zoning acquisition processes, particularly for expansive large-scale projects that require capacity calculations across a vast number of parcels. However, human intervention remains indispensable for certain stages of the workflow, and significant limitations were encountered when handling complex geometric rules. Nonetheless, the rapid pace of advancement in this field indicates that novel refinements may emerge in the near future, potentially within months, that could enable us to overcome the current constraints.

By Federico J. Fernandez, Founder & CEO of Urbanly.

On this topic, you can also look at our report on Generative AI for Urban Governance.
