Guide to Processing LAS Point Cloud format data

Shikhar Gupta
Published in Mindkosh AI, Aug 9, 2024
A colored LAS format aerial point cloud from USGS

LAS (LASer) is one of the most widely used formats for storing Lidar point clouds. In fact, it is the standard format for saving point clouds captured by aerial Lidar for applications such as surveying and mapping. Most aerial Lidar software combines GPS, IMU, and laser pulse range data to produce X, Y, and Z point data, along with other possible properties such as elevation, intensity, and classification.

LAS was created by the American Society for Photogrammetry and Remote Sensing (ASPRS) with the aim of reducing point cloud file size through the use of binary data, while retaining information specific to the Lidar nature of the data and, at the same time, not being overly complex.

While the LAS point cloud file format is not compressed, there is an open source project called LASzip which can losslessly compress LAS data, reducing the overall size of the point cloud file.

In this article, we will look at the general structure of a LAS file and how to process it using simple Python code. If you want to process other point cloud formats like PLY and PCD, check out this article that describes how to process point cloud file formats.

General structure of LAS point cloud files

A LAS file can contain up to four types of records. Note that there may be small differences in the format between versions. In order of their appearance in a typical LAS file, these records are:

  • Public header block: A single header block containing the file signature (LASF), metadata identifying the project, the number and type of point data records, dimensions (data attributes) such as x, y, z, range, and intensity, and pointers to the other sections of the file.
  • Variable Length Records: VLRs contain variable types of data including projection information, metadata and user application data. VLRs also contain the Coordinate Reference System of the point cloud, which is necessary to geo-reference the points in the point cloud. VLRs are optional and may not be present in all LAS files.
  • Point Data Records: These records contain the actual point cloud data. Points can be stored in any of the point data record formats defined in the LAS specification (formats 0 through 10 as of LAS 1.4). All point records in a file must be of the same type.
  • Extended Variable Length Records: EVLRs allow a larger data payload than VLRs and can be appended to the end of a LAS file. Like VLRs, these records are optional.
Some of the dimensions as specified in the Public Header Block of a LAS file

Note that different versions of the LAS format can have a few more or fewer dimensions.
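To make the header layout more concrete, here is a minimal sketch that decodes a few fields of the Public Header Block using only Python's standard struct module. The byte offsets assume the 227-byte LAS 1.2 header layout; later versions append more fields, and compressed LAZ files may set extra bits in the point format byte. The function name parse_las_header and the dictionary keys are illustrative choices, not part of any library.

```python
import struct

LAS_1_2_HEADER_SIZE = 227  # bytes; assumes the LAS 1.2 header layout


def parse_las_header(header_bytes: bytes) -> dict:
    """Decode a handful of Public Header Block fields (little-endian)."""
    if len(header_bytes) < LAS_1_2_HEADER_SIZE:
        raise ValueError("need at least 227 bytes of header")
    if header_bytes[:4] != b"LASF":
        raise ValueError("missing LASF file signature")

    # Version major/minor are single bytes at offsets 24 and 25.
    version = struct.unpack_from("<BB", header_bytes, 24)
    # Header size, offset to the point data, and number of VLRs.
    header_size, offset_to_points, num_vlrs = struct.unpack_from(
        "<HII", header_bytes, 94
    )
    # Point format ID, bytes per point record, and the 32-bit point count.
    point_format_id, record_length, point_count = struct.unpack_from(
        "<BHI", header_bytes, 104
    )
    # Scale factors and offsets for x, y, z are stored as doubles.
    scales = struct.unpack_from("<3d", header_bytes, 131)
    offsets = struct.unpack_from("<3d", header_bytes, 155)

    return {
        "version": version,
        "header_size": header_size,
        "offset_to_points": offset_to_points,
        "num_vlrs": num_vlrs,
        "point_format_id": point_format_id,
        "record_length": record_length,
        "point_count": point_count,
        "scales": scales,
        "offsets": offsets,
    }
```

For a real file, you would open it in binary mode, read the first 227 bytes, and pass them to this function. In practice laspy does all of this for you; the sketch is only meant to show where the fields live.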

Processing LAS point cloud data using Python

Here are some code snippets in Python using laspy, a popular library for processing LAS files.

Before you try these, make sure to install both compression backends for laspy (lazrs and laszip) so that it can also open and parse compressed LAZ files.

python -m pip install "laspy[lazrs,laszip]"

Read LAS file and get format details

import laspy

las = laspy.read('example_las_file.las')
print(las.header)

# Output - <LasHeader(1.2, <PointFormat(3, 0 bytes of extra dims)>)>
# Indicates this file uses Point format 3.

# Get point format
point_format = las.point_format
print(list(point_format.dimension_names))

# Inspect a single dimension (index 3 is intensity in point format 3)
print(point_format[3].name)
print(point_format[3].num_bits)
print(point_format[3].kind)

Get Header details

# Get number of points
print(las.header.point_count)

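# Get the offsets for x, y, z (laspy 2.x uses plural attribute names)
print(las.header.offsets)
# Get the scales for x, y, z
print(las.header.scales)

# Get the min and max of the real-world coordinates (scale and offset applied)
print(las.header.mins)
print(las.header.maxs)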

# Get the real-world coordinates of the first point
print(las.points[0].x)
print(las.points[0].y)
print(las.points[0].z)

An important point to note is that the dimensions X, Y, and Z are stored as signed integers, without the scale and offset applied. An offset is a coordinate that shifts the entire point cloud by some vector. A scale, on the other hand, defines the value by which the stored integers must be multiplied to recover the actual global coordinates. The reason for doing this is to reduce the file size: with a scale and offset, the range of the numbers that need to be saved becomes smaller, which allows smaller data types to be used to encode them. This can drastically reduce the LAS file size.
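As a rough illustration of the saving, the sketch below compares the size of one coordinate triple stored as 64-bit doubles with the same triple stored as the 32-bit signed integers that LAS uses for X, Y, and Z. It also estimates how wide a real-world range those integers can cover at a given scale. The names here are hypothetical and chosen only for this example.

```python
import struct

# Size in bytes of one (x, y, z) triple under each encoding.
BYTES_AS_DOUBLES = struct.calcsize("<3d")  # three 64-bit floats
BYTES_AS_INT32 = struct.calcsize("<3i")    # three 32-bit signed integers


def addressable_span(scale: float) -> float:
    """Real-world width covered by all int32 values at the given scale."""
    return (2**32 - 1) * scale


print(BYTES_AS_DOUBLES, BYTES_AS_INT32)
# With a scale of 0.001 (millimetres, if the units are metres),
# the integers still span several thousand kilometres around the offset.
print(addressable_span(0.001))
```

The coordinate fields take half the space, and choosing the offset close to the data keeps the stored integers well inside the 32-bit range.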

Within the points structure, x, y, and z are the original double values. These are calculated as follows:

x = X * x_scale + x_offset
y = Y * y_scale + y_offset
z = Z * z_scale + z_offset
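The relationship between the stored integers and the real coordinates can be sketched in plain Python as a round trip. The helper names raw_to_real and real_to_raw, and the example scale and offset values, are assumptions made for illustration; laspy performs the equivalent conversion internally when you access the lowercase x, y, and z dimensions.

```python
def raw_to_real(raw: int, scale: float, offset: float) -> float:
    """Apply scale and offset to a stored integer coordinate."""
    return raw * scale + offset


def real_to_raw(value: float, scale: float, offset: float) -> int:
    """Invert the transform, checking that the result fits in an int32."""
    raw = round((value - offset) / scale)
    if not -2**31 <= raw < 2**31:
        raise OverflowError("value does not fit in a 32-bit signed integer")
    return raw


# Hypothetical values: centimetre precision, offset chosen near the data.
scale, offset = 0.01, 500000.0
raw_x = real_to_raw(512345.67, scale, offset)   # stored integer: 1234567
real_x = raw_to_real(raw_x, scale, offset)      # back to roughly 512345.67
print(raw_x, real_x)
```

Any detail finer than the scale is lost when rounding to an integer, so the scale effectively sets the precision of the stored coordinates.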

Converting LAS format point cloud to PCD format

Sometimes it can be beneficial to convert point cloud files in LAS format into other lighter point cloud formats like PCD or PLY. While laspy does not support this conversion, we can use another popular Python library for point clouds, open3d, to achieve this.

import laspy
import open3d as o3d
import numpy as np

# Load the LAS file
las = laspy.read('file_name.las')

# Extract the point data (coordinates)
points = np.vstack((las.x, las.y, las.z)).transpose()

# Create an Open3D point cloud object
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)

# Save the point cloud to a PCD file
o3d.io.write_point_cloud('output_file_pcd_file.pcd', pcd)
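To give a feel for what the converted file contains, here is a minimal sketch that writes x, y, z points as an ASCII PCD (version 0.7) file using only the standard library. This is not how open3d writes the file internally; the function name write_ascii_pcd and the sample points are made up for this example.

```python
import io


def write_ascii_pcd(points, stream) -> None:
    """Write (x, y, z) tuples to a text stream as an ASCII PCD v0.7 file."""
    points = list(points)
    header = [
        "# .PCD v0.7 - Point Cloud Data file format",
        "VERSION 0.7",
        "FIELDS x y z",
        "SIZE 4 4 4",        # each field is declared as 4 bytes wide
        "TYPE F F F",        # ...and as floating point
        "COUNT 1 1 1",
        f"WIDTH {len(points)}",
        "HEIGHT 1",          # unorganised cloud: a single row of points
        "VIEWPOINT 0 0 0 1 0 0 0",
        f"POINTS {len(points)}",
        "DATA ascii",
    ]
    stream.write("\n".join(header) + "\n")
    for x, y, z in points:
        stream.write(f"{x} {y} {z}\n")


# Write two made-up points to an in-memory buffer and show the result.
buffer = io.StringIO()
write_ascii_pcd([(1.0, 2.0, 3.0), (4.5, 5.5, 6.5)], buffer)
print(buffer.getvalue())
```

Because this sketch declares the fields as 4-byte floats, a reader may load them at single precision. Large georeferenced coordinates can lose detail that way, so subtracting an offset before writing may be worth considering.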

For full details, see the official LAS file format specification published by ASPRS.
