PDAL at the OSGeo Code Sprint

Two weeks ago, approximately 95 contributors from a wide range of open source geospatial projects convened in Bonn, Germany for the 2018 OSGeo Code Sprint. Of these, about 17 (spanning eight nationalities and ten organizations) work on various point cloud projects. The PDAL development team pushed hard towards a PDAL v1.7 release. A good deal of my focus was on cleaning up documentation, introducing a new filter, and working to solidify a Conda recipe for PDAL.


Documentation

In each of the last seven releases, the number of PDAL stages has increased.

As we approach the century mark, I personally celebrate the fact that we’ve designed a system that is not only flexible in its ability to transcode between various point cloud formats, but has also established itself as a powerful processing engine, with over half of the stages devoted to pipelined filtering operations.

With all of these capabilities, we unfortunately also have a documentation problem on our hands. If we were focused solely on readers and writers, many of which are referenced by their file extension or specification name, we could stick to presenting a monolithic list of stages like readers.las or writers.matlab.

Filters present an interesting problem though. We do not always have a universally recognized acronym for a filter. filters.pmf may be meaningful enough for someone looking for the Progressive Morphological Filter, but not to the majority of readers. The flip side would be to name the filters functionally, like filters.ground for segmenting ground returns. But many algorithms could be used for such a purpose, and PDAL doesn’t currently support hierarchical or nested stage names like filters.ground.pmf.

Furthermore, the roles of readers and writers are relatively clear, but a filter can serve multiple roles. Some filters create new dimensions. Others alter dimensions (but not coordinates), while others move the coordinates themselves. Some cull points, while others split or join point clouds. At this point in time, PDAL makes no distinction between these different filter types. For all of these reasons, and in lieu of a major overhaul of PDAL’s Filter class, improved documentation is perhaps the most straightforward answer to our problem.

With the release of PDAL v1.7, we will begin providing stage descriptions along with stage names in the reader, writer, and filter documentation. An example is shown below.

The filter documentation will be further grouped into several broad categories to capture the intended behavior.

Returns Filter

filters.returns can now be used to split a point cloud into separate point clouds based on ReturnNumber and NumberOfReturns. Users can specify any combination of first, intermediate, last, or only, and filters.returns will create a new point cloud for each subset.

When the NumberOfReturns is exactly 1, a return is considered an only return. When the NumberOfReturns is greater than 1, first returns are identified as those with a ReturnNumber of 1, and last returns are identified as those whose ReturnNumber is equal to the NumberOfReturns. When the NumberOfReturns is greater than 2, intermediate returns are those for which the ReturnNumber is both greater than 1 and less than the NumberOfReturns.
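The rules above can be sketched in a few lines of Python. This is only an illustration of the classification logic as described here, not PDAL's actual implementation; the function names classify_return and split_by_return are hypothetical, and points are reduced to (ReturnNumber, NumberOfReturns) tuples for brevity.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Point = Tuple[int, int]  # (ReturnNumber, NumberOfReturns); a simplification


def classify_return(return_number: int, number_of_returns: int) -> Optional[str]:
    """Label a return using the rules described in the text."""
    if number_of_returns == 1:
        return "only"
    if number_of_returns > 1:
        if return_number == 1:
            return "first"
        if return_number == number_of_returns:
            return "last"
        # Only reachable with three or more returns in the pulse.
        if 1 < return_number < number_of_returns:
            return "intermediate"
    return None  # anything else is treated as invalid in this sketch


def split_by_return(points: Iterable[Point], groups: List[str]) -> Dict[str, List[Point]]:
    """Collect points into one subset per requested group."""
    subsets: Dict[str, List[Point]] = {group: [] for group in groups}
    for return_number, number_of_returns in points:
        label = classify_return(return_number, number_of_returns)
        if label in subsets:
            subsets[label].append((return_number, number_of_returns))
    return subsets


points = [(1, 1), (1, 3), (2, 3), (3, 3), (1, 2), (2, 2)]
print(split_by_return(points, ["first", "last"]))
```

Points that fall outside the requested groups (here, the only and intermediate returns) are simply dropped from the output, which mirrors the idea of getting one point cloud per requested subset.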

The following pdal translate command reads the input LAS file and uses filters.returns to split the point cloud into two separate point clouds. In this case, the point clouds are written to two separate files, output_0.las and output_1.las, which contain the first and last returns respectively.

pdal translate input.las output_#.las returns \
--filters.returns.groups="first,last"
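The same operation can also be written as a PDAL pipeline. The sketch below builds the pipeline JSON with nothing but the Python standard library; the helper name build_returns_pipeline is hypothetical, and the file names are the ones from the command above. I am assuming the object form with a top-level "pipeline" array here.

```python
import json
from typing import Dict, List


def build_returns_pipeline(input_path: str, output_template: str, groups: List[str]) -> Dict:
    """Describe a reader -> filters.returns -> writer pipeline as a dict."""
    return {
        "pipeline": [
            input_path,  # reader is inferred from the file extension
            {
                "type": "filters.returns",
                "groups": ",".join(groups),  # e.g. "first,last"
            },
            output_template,  # '#' is replaced once per output point cloud
        ]
    }


pipeline = build_returns_pipeline("input.las", "output_#.las", ["first", "last"])
print(json.dumps(pipeline, indent=2))
```

Saving the printed JSON to a file such as pipeline.json and running pdal pipeline pipeline.json should then be equivalent to the pdal translate call.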

Conda

PDAL binaries are already available to Linux users via Fedora, Debian, and Alpine. Windows users can install via OSGeo4W, and macOS users via Homebrew. We also maintain Docker images. So why another packaging system?

Conda gives us a common look and feel across platforms. Once the PDAL conda-forge package is finalized in the coming days/weeks, regardless of your OS, you will be able to

conda install -c conda-forge pdal

where -c conda-forge indicates that we are installing from the conda-forge channel.

If you would like to install PDAL in an isolated environment, you can

conda create -n pdalenv
conda install -n pdalenv -c conda-forge pdal

or in one step

conda create -n pdalenv -c conda-forge pdal

In doing so, we can now run multiple versions of PDAL in a clean environment, which is especially important if the user has conflicting dependencies installed on their system or if they are using PDAL alongside Python and multiple Python packages.

Note: While Docker is certainly another option for running PDAL on many different platforms and in an isolated manner, we have found that for many users the barrier to entry is prohibitive.

A more elaborate example: to install PDAL with Python bindings for use inside a Jupyter notebook with the NumPy and SciPy packages, we can be up and running in just three simple steps.

conda create -n pdal-notebook -c conda-forge \
jupyter \
python=3 \
pdal \
python-pdal \
numpy \
scipy
source activate pdal-notebook
jupyter notebook

For more on the topic, please come see my talk at FOSS4G North America.