Tree Structural Skeletonization with Close-Range Panoramic Imagery

Background on adopting open data for roadside tree structural analysis

Yu Kai Him Otto
Forestree
6 min read · Sep 13, 2023

--

A Mobile Mapping System (MMS) produces many panoramic (pano) images from its 360 camera. These captured images are not only useful for VR/AR viewing; the close-range fisheye imagery also provides many details that let us do "remote sensing" up close.

Google Street View is a famous application of panoramic imagery for navigation, mapping and tourism; its images are captured by an MMS mounted on a car or backpack. These 360 images can be fully utilised in urban forestry as well as in mapping. Each panoramic image is composed from two fisheye lenses and captures enough detail for us to perform tree structural analysis in a more systematic way.

In fact, when we start tree structural analysis with these 360 images, trimming, rectification and segmentation are needed in order to derive the structural profile of the tree more accurately and effectively.

Sample of the original trimmed image, the Lang-SAM segmented image and the skeletonised image

Framework and workflow

  • Download the panoramic images from the Google Street View

Google Street View has varying temporal coverage; the year and month are stated in each image, and it is recommended to choose images from the same time and sensor. Street View is also open to public uploads and engagement: the official Google imagery in Hong Kong in 2023 was captured with a "Ricoh Theta S", whereas if you use publicly uploaded data, the camera model and lens calibration should follow the respective camera (such as the Insta360 X).

List of resolutions# that can be downloaded from Google Street View:

  • 13312x6656
  • 6656x3328
  • 3328x1664

# Resolutions vary slightly among the capture sensors

It is recommended to download the highest resolution, but the segmentation will then take longer and be more compute-intensive than at lower resolutions. In the end we may need to resize the pano images to a smaller resolution anyway, so the download resolution depends on your computational power and resources.
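Downscaling before segmentation can be sketched with Pillow; the stand-in image size and filenames here are illustrative, not from the original workflow:

```python
from PIL import Image

def resize_pano(pano, target_width):
    """Downscale a 2:1 panorama while preserving its aspect ratio."""
    w, h = pano.size
    target_height = target_width * h // w        # keep the 2:1 pano ratio
    # LANCZOS resampling preserves fine branch detail better than bilinear
    return pano.resize((target_width, target_height), Image.LANCZOS)

# Hypothetical usage: a blank stand-in for a downloaded 2:1 pano
pano = Image.new('RGB', (2048, 1024))
small = resize_pano(pano, 512)
print(small.size)
```

For a real run you would open the downloaded 13312x6656 file and pick a target width that your GPU memory can handle.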

Sample of the captured pano image from Google Street View
  • Image trimming from the two fisheye lenses back into left and right images

Since the uploaded pano image is a composition of two fisheye lenses, we trim the full image (2x1, width x height) into two images (1x1, width x height). After trimming to half (1/2) of the original width, we can perform rectification with respect to the lens distortion model. Be aware that the middle divider of each trimmed half is not true north; north varies with the captured camera viewpoint. "Street View Download 360" also provides the latitude, longitude, height (relative) and north rotation. The north rotation indicates the true-north direction of the camera viewpoint.

Snapshot from the Street View Download 360
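The trimming step, including a horizontal roll by the reported north rotation, can be sketched as follows; the rotation value and stand-in image are hypothetical:

```python
import numpy as np
from PIL import Image

def trim_pano(pano, north_rotation_deg=0.0):
    """Split a 2:1 equirectangular pano into two square (1:1) halves.

    north_rotation_deg stands in for the 'North rotation' reported by
    Street View Download 360; rolling the pixel columns by that fraction
    of the width aligns the image to true north before the split.
    """
    arr = np.array(pano)
    h, w = arr.shape[:2]
    shift = int(round(north_rotation_deg / 360.0 * w))
    arr = np.roll(arr, -shift, axis=1)           # rotate the view to true north
    left = Image.fromarray(arr[:, : w // 2])     # each half is w/2 x h, i.e. 1:1
    right = Image.fromarray(arr[:, w // 2 :])
    return left, right

# Hypothetical usage with a blank stand-in pano
pano = Image.new('RGB', (1024, 512))
left, right = trim_pano(pano, north_rotation_deg=90.0)
print(left.size, right.size)
```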
  • Image rectification regarding the fisheye lens, field of view and its lens model

The major problem of adopting fisheye imagery is barrel distortion, which can be corrected through rectification.

Sample of the trimmed fisheye image (left) and the rectified fisheye image (right)
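A minimal rectification sketch, assuming the ideal equidistant fisheye model (r = f·θ) rather than a calibrated profile for the Theta S or Insta360 lenses; a production pipeline would use per-lens distortion coefficients:

```python
import numpy as np

def fisheye_to_rectilinear(src, out_size, fish_fov_deg=180.0, out_fov_deg=90.0):
    """Correct barrel distortion by remapping an equidistant fisheye image
    to a rectilinear (pinhole) projection via nearest-neighbour sampling.
    Assumes a square source whose image circle fills the frame."""
    sh, sw = src.shape[:2]
    cy, cx = (sh - 1) / 2.0, (sw - 1) / 2.0
    f_fish = (sw / 2.0) / np.radians(fish_fov_deg / 2.0)      # pixels per radian
    f_rect = (out_size / 2.0) / np.tan(np.radians(out_fov_deg / 2.0))

    # pixel grid of the rectilinear output, centred on the optical axis
    ys, xs = np.mgrid[0:out_size, 0:out_size].astype(float)
    dx = xs - (out_size - 1) / 2.0
    dy = ys - (out_size - 1) / 2.0
    r_rect = np.hypot(dx, dy)
    theta = np.arctan2(r_rect, f_rect)        # angle off the optical axis
    r_fish = f_fish * theta                   # equidistant model: r = f * theta
    scale = np.divide(r_fish, r_rect, out=np.zeros_like(r_rect), where=r_rect > 0)
    sx = np.clip(np.rint(cx + dx * scale), 0, sw - 1).astype(int)
    sy = np.clip(np.rint(cy + dy * scale), 0, sh - 1).astype(int)
    return src[sy, sx]

# Hypothetical usage on a random stand-in fisheye frame
src = np.random.randint(0, 255, (512, 512, 3), dtype=np.uint8)
rect = fisheye_to_rectilinear(src, 256)
print(rect.shape)
```

The 90-degree output field of view is a deliberate choice: a full 180-degree hemisphere cannot be mapped to a single rectilinear image, so each half is rectified to a narrower central view.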
  • Applying feature recognition and semantic image segmentation with Lang-SAM
Sample of the original rectified fisheye image (left) and the Lang-SAM segmented result (right)

Here are the steps and code for using Lang-SAM to automatically retrieve the tree structure from the images.

Clone and install the package

!pip install torch torchvision
!pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

Import the packages

from PIL import Image
from lang_sam import LangSAM
import matplotlib.pyplot as plt
import numpy as np
import os
from glob import glob
from pathlib import Path

Initialize the Lang-SAM model

model = LangSAM()

Create a function for mask extraction that outputs a transparent mask (.png)

def mask_image(image, mask):
    """Apply a binary mask to an image and attach an alpha channel."""
    image_arr = np.array(image)
    mask_arr = np.array(mask)
    masked_arr = np.zeros_like(image_arr)
    masked_arr[mask_arr > 0] = image_arr[mask_arr > 0]   # keep only masked pixels
    alpha = np.ones_like(masked_arr[:, :, 0]) * 255
    alpha[mask_arr == 0] = 0                             # transparent background
    masked_arr = np.dstack((masked_arr, alpha))
    return masked_arr

Run a for loop to perform the Lang-SAM segmentation and masking in batches

images_dir = 'E:/50Batchsample'
batch_size = 50

for root, dirs, files in os.walk(images_dir):
    for i in range(0, len(files), batch_size):
        images = []

        for f in files[i:i + batch_size]:
            path = os.path.join(root, f)
            image = Image.open(path).convert('RGB')
            images.append(image)

        for j, image in enumerate(images):
            # predict with the text prompt "tree" and take the first mask
            masks, boxes, phrases, logits = model.predict(image, "tree")
            mask = masks[0]

            masked_arr = mask_image(image, mask)

            filename = files[i + j]   # i already steps by batch_size
            save_path = os.path.join(images_dir, filename.replace('.jpg', '_masked.png'))

            im = Image.fromarray(masked_arr.astype('uint8'))
            im.save(str(save_path))

            print("Saved", save_path)
  • Adopting colour segmentation with green-brown (leaf-trunk) identification, and skeletonization of the distinguished leaf and trunk results
Flowchart of the colour segmentation with green-brown (for the leaf-trunk) identification
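The green-brown split can be sketched with HSV thresholds on the Lang-SAM output; the hue cut-offs below are illustrative guesses rather than the calibrated values used in the study (note Pillow scales H, S and V to 0-255):

```python
import numpy as np
from PIL import Image

def leaf_trunk_masks(rgba):
    """Split a segmented tree image into leaf (green) and trunk (brown)
    masks using illustrative HSV thresholds."""
    hsv = np.array(rgba.convert('RGB').convert('HSV'))
    h, s = hsv[..., 0], hsv[..., 1]
    alpha = np.array(rgba)[..., 3] if rgba.mode == 'RGBA' else np.full(h.shape, 255)
    tree = alpha > 0                       # only pixels kept by the Lang-SAM mask
    # green hues roughly 60-170 degrees -> about 42-120 on PIL's 0-255 hue scale
    leaf = tree & (h >= 42) & (h <= 120) & (s > 40)
    trunk = tree & ~leaf                   # remaining tree pixels count as trunk
    return leaf, trunk

# Hypothetical usage: a tiny image with green rows (leaf) and brown rows (trunk)
arr = np.zeros((4, 4, 4), dtype=np.uint8)
arr[..., 3] = 255
arr[:2] = [0, 160, 0, 255]      # green -> leaf
arr[2:] = [120, 70, 20, 255]    # brown -> trunk
leaf, trunk = leaf_trunk_masks(Image.fromarray(arr, 'RGBA'))
print(leaf.sum(), trunk.sum())
```

Each boolean mask can then be thinned with a skeletonization routine (for example scikit-image's `skeletonize`) to obtain the leaf and trunk skeletons shown in the results.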

Results

Sample 1, Typical Ficus microcarpa in a park, retrieved leaf and trunk skeletons
Sample 2, Typical Ficus microcarpa along the Park Lane Shopper’s Boulevard (roadside), retrieved leaf and trunk skeletons
Sample 3, Typical Ficus microcarpa along the Park Lane Shopper’s Boulevard (roadside), retrieved leaf and trunk skeletons

The results are generally nice and accurate; still, one defect of the colour-based segmentation is that the detailed branch structure cannot be effectively retrieved by this prototype. The workflow can run effectively at batch scale for leaf and trunk segmentation from pano images downloaded freely online.

Limitation and future enhancement

There are a few limitations arising from the use of close-range panoramic imagery:

  • Only limited angles were rectified from the panoramic imagery

Because the current workflow only trims and rectifies the pano image into the two individual fisheye views, we can only derive the tree structure visible in these pre-processed images. Later on, we should explore methods to reproject the pano images into different orientations (not limited to left and right, but also sky and ground views).

  • Limited structure can be derived: mainly the tree body, while little of the canopy can be skeletonized from the pano images

The camera viewpoint affects the skeletonized result: the farther the viewpoint, the greater the coverage of the tree structure. So when selecting images to process with this algorithm, it is recommended to choose camera points far away from the tree body.

Sample showing that the farther the viewpoint, the greater the tree structure coverage
  • Multiple trees were combined into one skeleton body

The current workflow does not yet include multi-feature masking. Multi-feature masking should be added in the next update in order to segment each tree individually, creating more precise tree structural analysis datasets.
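One direction is simply to keep every mask Lang-SAM returns instead of only the first, writing one transparent PNG per detection. The helper below is an illustrative sketch; the dummy masks stand in for the `masks` array from `model.predict`:

```python
import numpy as np
from PIL import Image

def save_masks_individually(image, masks, out_stem):
    """Write one transparent PNG per predicted mask so each detected tree
    gets its own file instead of collapsing into one skeleton body."""
    image_arr = np.array(image.convert('RGB'))
    paths = []
    for k, mask in enumerate(masks):
        mask_arr = np.asarray(mask).astype(bool)
        alpha = np.where(mask_arr, 255, 0).astype(np.uint8)
        rgba = np.dstack((image_arr, alpha))     # background turns transparent
        path = f"{out_stem}_tree{k}.png"
        Image.fromarray(rgba).save(path)
        paths.append(path)
    return paths

# Hypothetical usage with two dummy masks covering each half of the frame
img = Image.new('RGB', (8, 8), (0, 120, 0))
m1 = np.zeros((8, 8), bool)
m1[:, :4] = True
m2 = ~m1
paths = save_masks_individually(img, [m1, m2], 'sample')
print(paths)
```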

Reference and Acknowledgments

This is a volunteer pilot study from the Forestree team (Remote Sensing and Forestry), used to study close-range photogrammetry, image processing and computer vision.

All pano images were downloaded from Google Street View, and the processed images were created with an algorithm developed by Yu Kai Him Otto.


Student from Hong Kong, studying in Land Surveying and Geo-informatics, PolyU.