Tree Structural Skeletonization with Close-range Panoramic Imagery
Background: adopting open data for roadside tree structural analysis
A Mobile Mapping System (MMS) produces large volumes of panoramic (pano) imagery from its 360° camera. These captured images can be used not only for VR/AR viewing; the close-range fisheye-lens imagery also records many details that let us do a form of "remote sensing".
Google Street View is a well-known application of panoramic imagery for navigation, mapping and tourism; its images are captured by MMS platforms mounted on cars or backpacks. These 360° images can be fully utilised in urban forestry as well as in mapping. Each panorama is composed from two fisheye lenses and captures enough detail for us to carry out tree structural analysis in a systematic way.
In practice, when we start tree structural analysis from these 360° images, trimming, rectification and segmentation are needed in order to derive the structural profile of the tree more accurately and effectively.
Framework and workflow
- Download the panoramic images from Google Street View
Google Street View has varying temporal resolution: the year and month are stated on each image, and it is recommended to choose images from the same time and the same sensor. Street View is also open to public uploads and engagement; the official Google imaging sensor used in Hong Kong in 2023 was the Ricoh Theta S, whereas publicly uploaded data should be rectified with the calibration model of its respective camera (such as the Insta360 X).
Resolutions# that can be downloaded from Google Street View:
- 13312x6656
- 6656x3328
- 3328x1664
# These may differ among capture sensors.
It is recommended to download the highest resolution, but segmentation will then take longer and be more compute-intensive than with a lower resolution. Since the pano images may eventually need to be resized to a smaller resolution anyway, the download resolution depends on your computational power and resources.
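As a minimal sketch of the resizing step (assuming Pillow and a 2:1 equirectangular pano; the target width is a placeholder, not a value from the workflow):

```python
from PIL import Image

def downscale_pano(path, out_path, target_width=6656):
    """Downscale a panoramic image while keeping the 2:1 aspect ratio
    of an equirectangular pano. Returns the saved image size."""
    pano = Image.open(path)
    w, h = pano.size
    if w <= target_width:
        pano.save(out_path)  # already small enough, keep as-is
        return pano.size
    target_height = target_width // 2  # equirectangular panos are 2:1
    resized = pano.resize((target_width, target_height), Image.LANCZOS)
    resized.save(out_path)
    return resized.size
```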
- Trim the stitched image from the two fisheye lenses back into left and right images
Since the uploaded pano image is a composition of two fisheye lenses, we trim the full image (2:1, width:height) into two images (1:1, width:height). After trimming to half (1/2) of the original width, we can rectify each half with respect to the lens distortion model. Be aware that the dividing line between the two halves is not true north; north varies with the camera viewpoint. The "Street View Download 360" tool also provides the latitude, longitude, height (relative) and North rotation, where the North rotation indicates the true-north direction of the camera viewpoint.
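The trimming step above can be sketched as a simple crop, assuming Pillow and a 2:1 stitched pano:

```python
from PIL import Image

def split_pano(pano):
    """Split a 2:1 equirectangular pano (stitched from two fisheye
    lenses) into square left and right halves.

    Note: the dividing line is NOT true north; the per-image
    "North rotation" value must be applied separately."""
    w, h = pano.size
    left = pano.crop((0, 0, w // 2, h))
    right = pano.crop((w // 2, 0, w, h))
    return left, right
```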
- Rectify the images with respect to the fisheye lens, its field of view and its lens model
The major problem with fisheye-lens imagery is barrel distortion, which can be corrected through the process of rectification.
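A minimal NumPy sketch of radial (barrel) distortion correction by inverse mapping; the coefficients k1/k2 are placeholder values, as real ones must come from calibrating the specific camera (e.g. the Ricoh Theta S):

```python
import numpy as np

def undistort_barrel(img, k1=-0.25, k2=0.05):
    """Correct barrel distortion with a simple two-term radial model.

    For each pixel of the OUTPUT (rectified) image we compute where it
    maps to in the distorted input (inverse mapping), then sample with
    nearest-neighbour lookup. k1/k2 are illustrative, not calibrated."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # normalised coordinates relative to the image centre
    xn = (xs - cx) / cx
    yn = (ys - cy) / cy
    r2 = xn ** 2 + yn ** 2
    scale = 1 + k1 * r2 + k2 * r2 ** 2  # radial distortion factor
    src_x = np.clip(xn * scale * cx + cx, 0, w - 1).round().astype(int)
    src_y = np.clip(yn * scale * cy + cy, 0, h - 1).round().astype(int)
    return img[src_y, src_x]
```

In production one would instead calibrate the lens and use a library routine such as OpenCV's `cv2.undistort`; the sketch only illustrates the geometry of the correction.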
- Apply feature recognition and semantic image segmentation with Lang-SAM
The steps and code below use Lang-SAM to automatically retrieve the tree structure from the images.
Install the packages
!pip install torch torchvision
!pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git
Import the packages
from PIL import Image
from lang_sam import LangSAM
import matplotlib.pyplot as plt
import numpy as np
import os
from glob import glob
from pathlib import Path
Initialize the Lang-SAM model
model = LangSAM()
Create a function that extracts the mask and outputs it as a transparent .png
def mask_image(image, mask):
    """Apply a binary mask to an RGB image and return an RGBA array
    whose background is fully transparent."""
    image_arr = np.array(image)
    mask_arr = np.array(mask)
    masked_arr = np.zeros_like(image_arr)
    masked_arr[mask_arr > 0] = image_arr[mask_arr > 0]
    # alpha channel: opaque where the mask is set, transparent elsewhere
    alpha = np.ones_like(masked_arr[:, :, 0]) * 255
    alpha[mask_arr == 0] = 0
    return np.dstack((masked_arr, alpha))
Loop over the images to run the Lang-SAM segmentation and masking in batches
images_dir = 'E:/50Batchsample'
batch_size = 50

for root, dirs, files in os.walk(images_dir):
    files = [f for f in files if f.lower().endswith('.jpg')]
    for i in range(0, len(files), batch_size):
        # load one batch of images
        images = []
        for f in files[i:i + batch_size]:
            path = os.path.join(root, f)
            images.append(Image.open(path).convert('RGB'))
        # segment each image with the text prompt "tree"
        for j, image in enumerate(images):
            masks, boxes, phrases, logits = model.predict(image, "tree")
            if len(masks) == 0:
                continue  # no tree detected in this image
            masked_arr = mask_image(image, masks[0])
            filename = files[i + j]
            save_path = os.path.join(root, filename.replace('.jpg', '_masked.png'))
            Image.fromarray(masked_arr.astype('uint8')).save(save_path)
            print(f"Saved {save_path}")
- Apply colour segmentation with green-brown (leaf-trunk) identification, then skeletonize the separated leaf and trunk results
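A minimal sketch of the green-brown split; the per-channel thresholds are illustrative guesses, not tuned values from the study:

```python
import numpy as np

def classify_leaf_trunk(rgb):
    """Split tree pixels of an RGB array into leaf (green) and trunk
    (brown) boolean masks using simple per-channel rules."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    # leaves: green clearly dominates red and blue
    leaf = (g > r + 10) & (g > b + 10)
    # trunk: red/brown dominates green, and the pixel is not too bright
    trunk = (r > g + 10) & (r > b) & (r < 200)
    return leaf, trunk
```

The resulting trunk mask can then be thinned to a one-pixel-wide skeleton with a standard routine such as `skimage.morphology.skeletonize`.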
Results
The results are generally accurate. One remaining defect of the colour-based segmentation is that detailed branch structure cannot be retrieved effectively with this prototype. Nevertheless, the workflow performs leaf and trunk segmentation effectively at batch scale from pano images downloaded freely online.
Limitation and future enhancement
There are a few limitations arising from the use of close-range panoramic imagery:
- Only limited viewing angles are rectified from the panoramic imagery
Because the current workflow only trims and rectifies the pano image into the two fisheye views individually, the derivable tree structure is limited by these pre-processed images. Later on, we should explore methods to reproject the pano images into other orientations (not only left and right, but also sky and ground views).
- Only limited structure can be derived: mainly the tree body, while little of the canopy can be skeletonized from the pano images
The camera viewpoint affects the skeletonized result: the farther the viewpoint, the higher the coverage of the tree structure. So when selecting images to process with this algorithm, it is recommended to choose camera points that are far away from the tree body.
- Multiple trees were merged into one skeleton body
The current workflow does not yet include multi-feature masking. Multi-feature masking should be added in the next update, so that trees are segmented individually, creating more precise tree structural analysis datasets.
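One possible direction for that update, sketched here under the assumption that the segmentation model returns one binary mask per detected tree instance, is to cut out a separate transparent crop per mask instead of keeping only the first one:

```python
import numpy as np

def mask_per_tree(image_arr, masks):
    """Produce one RGBA cut-out per detected tree, so each tree can be
    skeletonized on its own instead of merging into one body.

    `masks` is a list (or stack) of per-instance binary masks, e.g. all
    the masks returned by a detector rather than just masks[0]."""
    crops = []
    for m in masks:
        m = np.asarray(m) > 0
        rgba = np.zeros((*image_arr.shape[:2], 4), dtype=np.uint8)
        rgba[m, :3] = image_arr[m]
        rgba[m, 3] = 255  # opaque only where this tree is
        crops.append(rgba)
    return crops
```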
Reference and Acknowledgments
This is a volunteering pilot study from the Forestree team (Remote Sensing and Forestry), undertaken to study close-range photogrammetry, image processing and computer vision.
All pano images were downloaded from Google Street View, and the processed images were created with an algorithm developed by Yu Kai Him Otto.