How to Use Shapefiles as U-Net Training Data to Predict Parking Lot Size from Satellite Images

Taras Skavinskyy
Jul 15, 2023

Background

U-Nets have undoubtedly captivated the field of machine learning computer vision with their architecture. Not only is the architecture visually appealing and easy to understand, but it is also phenomenally effective. Imagine the immense value we could harness by applying this model architecture to solve a wide array of real-life problems.

Problem

However, like many challenges in data science, utilizing U-Nets to tackle a broader range of problems presents a significant hurdle: the scarcity of well-tagged data. Acquiring or creating a meticulously annotated dataset is an arduous and time-consuming task. The level of detail required in these tagged datasets often demands weeks, if not months, of work from specially trained experts who meticulously comb through photos for tagging. Consequently, the upfront cost of data annotation restricts the availability of training datasets for U-Nets, limiting the model’s widespread usage.

Our solution

In this article, we will explore a novel approach to overcome this limitation by leveraging shapefiles, a widely used and well-established data structure, in combination with satellite images. By substituting the scarce tagged photos and masks with this alternative approach, we can extend the applicability of U-Net models to solve hundreds of problems that were previously data-constrained.

Sample result of a U-Net model trained using shapefiles

Shapefiles are commonly employed by Geographic Information System (GIS) professionals to describe geographical features on a map. These files are abundant and relatively easy to create using popular software such as ArcGIS or QGIS. Moreover, the internet hosts numerous repositories with thousands of pre-existing shapefiles. A notable example is the OpenStreetMap project, which houses one of the largest collections of shapefiles.

A specific example of methodology

To illustrate this methodology, let’s consider a parking lot shapefile obtained from OpenStreetMap, containing outlines of thousands of parking lots across the country. By combining this shapefile with satellite imagery of the corresponding areas, we can create composite images and masks suitable for training U-Net models. The subsequent sections of this article will delve into the process of integrating these components, unlocking a multitude of U-Net applications.

By leveraging the abundance of shapefiles and the wealth of information offered by satellite images, we can circumvent the challenge of acquiring meticulously tagged photographs. This innovative approach not only expands the versatility of U-Nets but also empowers researchers and practitioners to address a broader range of real-world problems.

Technical Description

The first step is preparing the data. We will read the data, filter for parking lots, find the center of each spatial object, and use the folium library to render a satellite view of that location.

Read in all the libraries

import folium
import time
import pickle
import os
import matplotlib.pyplot as plt
import imageio as iio
import numpy as np
import geopandas as gpd

## Export the map to an image with Selenium, based on:
## https://nagasudhir.blogspot.com/2021/07/save-folium-map-as-png-image-using.html
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

print(np.version.version)

Create a function to normalize images; we will use it later.

def normalize(img):
    # scale pixel values to the range [-1, 1]
    img_min = img.min()
    img_max = img.max()
    return 2.0 * (img - img_min) / (img_max - img_min) - 1.0

Read in the OpenStreetMap dataset


shapefile_full = gpd.read_file('data/gis_osm_traffic_a_free_1.shp')
shapefile_full.head()
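
As a quick sanity check (a minimal sketch of my own, not part of the original pipeline), you can confirm that the 'fclass' column of this Geofabrik-style OSM extract actually contains the 'parking' class we filter on later:

## count feature classes and confirm parking polygons are present
print(shapefile_full['fclass'].value_counts().head(10))
print(len(shapefile_full[shapefile_full['fclass'] == 'parking']), 'parking polygons')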

At this point, I will take you through each step of preparing the satellite image and the mask file separately for a single location. Later, we will put it all together in a loop to create the whole sample.

  1. Based on our data, we have a polygon geometry for each parking lot. We need to convert it to a point to feed into the folium library.

osm_id = '14574538'
st = time.time()
# get the item out of the dataset
shapefile_parking_one = shapefile_full[shapefile_full['osm_id'] == osm_id]
df = shapefile_parking_one
print(df.crs)
df = df.to_crs(epsg=4326)
## location of the polygon's centroid, as [lat, lon] for folium
location = [df.geometry.centroid.y.mean(), df.geometry.centroid.x.mean()]

2. Now we prepare the main image using folium, based on the blog post linked in the imports above.

m_map = folium.Map(location, width=600, height=600, zoom_start=17, tiles='CartoDB positron')
# add an Esri satellite tile layer on top of the base map
tile = folium.TileLayer(
    tiles='https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}',
    attr='Esri', name='Esri Satellite', overlay=False, control=False).add_to(m_map)

mapFname = 'map.html'
m_map.save(mapFname)
mapUrl = 'file://{0}\\{1}'.format(os.getcwd(), mapFname)
m_map
Raw satellite image

3. Now we will use the same library to add our spatial object to the map and prepare the combined image. I deliberately make it bright red so that I can separate it out into a mask later.

# mask
m = folium.Map(location, width=600, height=600, zoom_start=17, tiles='CartoDB positron')

tile = folium.TileLayer(
    tiles='https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}',
    attr='Esri', name='Esri Satellite', overlay=False, control=False).add_to(m)

style = {"color": "red", "fillOpacity": 0.89}

for _, r in df.iterrows():
    # Without simplifying the representation of each polygon,
    # the map might not be displayed
    sim_geo = gpd.GeoSeries(r['geometry']).simplify(tolerance=.5)
    geo_j = sim_geo.to_json()
    geo_j = folium.GeoJson(data=geo_j, style_function=lambda x: style)
    folium.Popup(r['name']).add_to(geo_j)
    geo_j.add_to(m)

maskFname = 'mask.html'
m.save(maskFname)
maskUrl = 'file://{0}\\{1}'.format(os.getcwd(), maskFname)
m
Satellite image with the shape object applied to it

4. Now we take both the image and the mask and export them using the Selenium Firefox driver. This was a little tricky to set up, but it works fine.

#### export map and mask
##### You can find both of these images in your working directory

# download the gecko driver from here - https://github.com/mozilla/geckodriver/releases

options = Options()
options.headless = True
#### you may need to update this path
options.binary_location = r'C:\Program Files\Mozilla Firefox\firefox.exe'
driver = webdriver.Firefox(options=options)



###Map_export_portion
# use selenium to save the html as png image
driver.get(mapUrl)
# wait for 5 seconds for the maps and other assets to be loaded in the browser
time.sleep(5)
driver.save_screenshot('map.png')

####mask_export_portion
driver.get(maskUrl)
# wait for 5 seconds for the maps and other assets to be loaded in the browser
time.sleep(5)
driver.save_screenshot('mask.png')
driver.quit()

This creates files in your working directory that you can use for your image recognition code.
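
Before building masks, it is worth confirming that both screenshots came out at matching dimensions; the map and mask must line up pixel for pixel, or the training pairs will be silently corrupted. A minimal check of my own (not part of the original pipeline):

## optional check: the map and mask screenshots must align exactly
check_map = iio.imread('map.png')
check_mask = iio.imread('mask.png')
assert check_map.shape == check_mask.shape, 'map and mask screenshots differ in size'
print('screenshot shape:', check_map.shape)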

5. Now we can read the mask image, focus on its red layer, and transform it into a binary mask. The last few lines of this code read in the raw satellite image of the site.

img = iio.imread('mask.png')
img = img[0:600, 0:600]

# split out the color channels (channel 0 is red, 1 is green, 2 is blue)
r = img[:, :, 0]
g = img[:, :, 1]
b = img[:, :, 2]
img_red = r > 150
img_green = g > 100
img_blue = b > 100
# pixels that are bright in both red and green are white-ish, not our red polygon
img_other = (img_red.astype(np.float32) * img_green.astype(np.float32)).astype(bool)
img_mask = (img_red.astype(np.float32) - img_other.astype(np.float32)).astype(bool)

plt.imshow(img_mask)

# import map
img_map = iio.imread('map.png')
img_map = img_map[0:600, 0:600]
img_map = normalize(img_map)
plt.imshow(img_map)
Mask separated from the image to use in U-net

Finally, you can take all five of those steps, combine them, and create a loop that goes through each record and appends the image and the mask to two arrays that you can later feed into the neural network. I also subsetted my data frame to the parking class only and took the first 300 records.

### create a subset
shapefile_subset = shapefile_full[shapefile_full['fclass'] == "parking"].reset_index()
shapefile_subset = shapefile_subset[:300]
len(shapefile_subset)

### running a loop on the above code to download a mask and a picture of every location

img_map_array = []
img_mask_array = []
st = time.time()

options = Options()
options.headless = True
#### you may need to update this path
options.binary_location = r'C:\Program Files\Mozilla Firefox\firefox.exe'
driver = webdriver.Firefox(options=options)

for i in range(len(shapefile_subset)):
    print(i)
    osm_id = shapefile_subset['osm_id'][i]
    # get the item out of the dataset
    df = shapefile_full[shapefile_full['osm_id'] == osm_id]
    df = df.to_crs(epsg=4326)

    ## location of the polygon's centroid, as [lat, lon]
    location = [df.geometry.centroid.y.mean(), df.geometry.centroid.x.mean()]

    # map (the zoom level must match the mask below, or the pairs will not align)
    m_map = folium.Map(location, width=600, height=600, zoom_start=17, tiles='CartoDB positron')
    tile = folium.TileLayer(
        tiles='https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}',
        attr='Esri', name='Esri Satellite', overlay=False, control=False).add_to(m_map)

    mapFname = 'map.html'
    m_map.save(mapFname)
    mapUrl = 'file://{0}\\{1}'.format(os.getcwd(), mapFname)

    # mask
    m = folium.Map(location, width=600, height=600, zoom_start=17, tiles='CartoDB positron')
    tile = folium.TileLayer(
        tiles='https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}',
        attr='Esri', name='Esri Satellite', overlay=False, control=False).add_to(m)

    style = {"color": "red", "fillOpacity": 0.89}

    for _, r in df.iterrows():
        # Without simplifying the representation of each polygon,
        # the map might not be displayed
        sim_geo = gpd.GeoSeries(r['geometry']).simplify(tolerance=.5)
        geo_j = sim_geo.to_json()
        geo_j = folium.GeoJson(data=geo_j, style_function=lambda x: style)
        folium.Popup(r['name']).add_to(geo_j)
        geo_j.add_to(m)

    maskFname = 'mask.html'
    m.save(maskFname)
    maskUrl = 'file://{0}\\{1}'.format(os.getcwd(), maskFname)

    #### export map and mask with Selenium
    driver.get(mapUrl)
    # wait for 5 seconds for the maps and other assets to be loaded in the browser
    time.sleep(5)
    driver.save_screenshot('map.png')

    driver.get(maskUrl)
    time.sleep(5)
    driver.save_screenshot('mask.png')

    ## convert mask into binary mask
    img = iio.imread('mask.png')
    img = img[0:600, 0:600]

    r_ch = img[:, :, 0]
    g_ch = img[:, :, 1]
    img_red = r_ch > 150
    img_green = g_ch > 100
    # drop white-ish pixels that are bright in both red and green
    img_other = (img_red.astype(np.float32) * img_green.astype(np.float32)).astype(bool)
    img_mask = (img_red.astype(np.float32) - img_other.astype(np.float32)).astype(bool)

    ## import map
    img_map = iio.imread('map.png')
    img_map = img_map[0:600, 0:600]
    img_map = normalize(img_map)

    ## optional visual checks: plt.imshow(img_mask); plt.imshow(img_map)

    ### reshape the mask into two channels: background and parking lot
    mask_final = []
    img_mask0 = img_mask * -1 + 1
    img_mask1 = img_mask.astype(int)
    mask_final.append(img_mask0)
    mask_final.append(img_mask1)
    mask_final = np.array(mask_final)
    mask_final = mask_final.transpose([1, 2, 0])

    img_map_array.append(img_map)
    img_mask_array.append(mask_final)

et = time.time()
elapsed_time = et - st
print('loop time:', elapsed_time, 'seconds')
driver.quit()
img_map_array = np.array(img_map_array)
img_mask_array = np.array(img_mask_array)
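
Since each location waits five seconds for tiles to load, building this sample takes a while, so it is worth persisting the arrays to disk. Here is a minimal sketch using the pickle module we imported earlier (the file names are my own choice):

## save the training arrays so the scraping loop only has to run once
with open('img_map_array.pkl', 'wb') as f:
    pickle.dump(img_map_array, f)
with open('img_mask_array.pkl', 'wb') as f:
    pickle.dump(img_mask_array, f)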

This article mostly focuses on creating the dataset above, but to complete the workflow, I added a simple U-Net model below to show how this data can be used.

Build the model

You can switch out the network or the shapefile and adapt this algorithm for your own use. Let's first set up the libraries we will need for this model.

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, concatenate, Conv2DTranspose
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, CSVLogger, TensorBoard

Create the model

## Create the model
def simple_unet_model(n_classes=2, im_sz=600, n_channels=4, n_filters_start=4, growth_factor=2):
    # Creating the network model using the functional API:
    n_filters = n_filters_start
    # 4 input channels because the PNG screenshots load as RGBA
    inputs = Input((im_sz, im_sz, n_channels))
    conv1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(inputs)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    n_filters *= growth_factor  # increase the number of filters going down the U-Net
    conv2 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(pool1)
    n_filters //= growth_factor  # decrease the number of filters going up the U-Net
    upconv = Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv2)
    concat = concatenate([conv1, upconv])
    conv3 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(concat)
    output = Conv2D(n_classes, (1, 1), activation='sigmoid')(conv3)
    model = Model(inputs=inputs, outputs=output)
    # Compiling the model with the Adam optimizer and log loss (aka binary crossentropy)
    model.compile(optimizer=Adam(), loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = simple_unet_model()
model.summary()

Create the holdout split, train the model, and take a look at our performance metrics:

x_train = img_map_array[0:200]
x_test = img_map_array[200:]
y_train = img_mask_array[0:200]
y_test = img_mask_array[200:]
x_train.shape

# Now training the model:
st = time.time()
N_EPOCHS = 50
BATCH_SIZE = 32
# ask Keras to save the best weights (in terms of validation loss) to a file:
model_checkpoint = ModelCheckpoint(filepath='weights_simple_unet.hdf5', monitor='val_loss', save_best_only=True)
# ask Keras to log each epoch's loss:
csv_logger = CSVLogger('log.csv', append=True, separator=';')
# ask Keras to log info in TensorBoard format:
tensorboard = TensorBoard(log_dir='tensorboard_simple_unet/', write_graph=True, write_images=True)
# Fit:
model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=N_EPOCHS,
          verbose=2, shuffle=True,
          callbacks=[model_checkpoint, csv_logger, tensorboard],
          validation_data=(x_test, y_test))

et = time.time()
elapsed_time = et - st
print('training time:', elapsed_time, 'seconds')
Last 5 Training Epochs
3 results visualized

Results

The U-Net model we created achieves high accuracy after only 50 epochs. We compared a few of the images with their masks and predictions. The first thing we noticed is that the masks are not always great at outlining the full parking lot: we were able to turn our shapefiles into masks, but the shapefile we picked is not the best example of parking lot outlines. The predictions are fairly good at highlighting light-colored concrete, so we often misclassify roads as parking lots, which is understandable. With this same methodology and, hopefully, better data, we could create even better predictions for parking lots, or for whatever shapefile you may want to apply this to.
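
For reference, here is a minimal sketch of how such side-by-side comparisons can be produced from the trained model and the holdout arrays above; the plotting layout is my own choice, not part of the original code:

## predict on the holdout set and compare image, true mask, and prediction
preds = model.predict(x_test)

fig, axes = plt.subplots(3, 3, figsize=(12, 12))
for row in range(3):
    # undo the [-1, 1] normalization (approximately) for display
    axes[row, 0].imshow(x_test[row][:, :, :3] * 0.5 + 0.5)
    axes[row, 0].set_title('satellite image')
    axes[row, 1].imshow(y_test[row][:, :, 1])  # channel 1 holds the parking-lot class
    axes[row, 1].set_title('mask')
    axes[row, 2].imshow(preds[row][:, :, 1] > 0.5)  # threshold the predicted probability
    axes[row, 2].set_title('prediction')
plt.show()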

In conclusion, the marriage of shapefiles and satellite images with U-Net models opens up new horizons in computer vision. The combination of readily available geographical data and remote sensing imagery provides a viable alternative to the laborious process of data annotation. As a result, we can now tackle a multitude of challenges that were previously hindered by the scarcity of tagged datasets. With this newfound capability, the potential for U-Nets to revolutionize various domains is boundless, making them an indispensable tool in the arsenal of data scientists and machine learning enthusiasts alike.


Taras Skavinskyy

Data Scientist, Director of Analytics in commercial real estate. Expert in predictive analytics, ML, image recognition, and GIS. History & gaming enthusiast