PDAL at the OSGeo Code Sprint
Two weeks ago, approximately 95 contributors from a wide range of open source geospatial projects convened in Bonn, Germany for the 2018 OSGeo Code Sprint. Of these, about 17 (spanning eight nationalities and ten organizations) were contributors to various point cloud projects. The PDAL development team pushed hard toward a PDAL v1.7 release. Much of my own focus was on cleaning up documentation, introducing a new filter, and working to solidify a Conda recipe for PDAL.
In each of the last seven releases, the number of PDAL stages has increased.
As we approach the century mark, I personally celebrate the fact that we’ve designed a system that is not only flexible in its ability to transcode between various point cloud formats, but has also established itself as a powerful processing engine, with over half of the stages devoted to pipelined filtering operations.
With all of these capabilities, we unfortunately also have a documentation problem on our hands. If we were focused solely on readers and writers, most of which are referenced by their file extension or specification name, we could stick to presenting a monolithic list of stages.
Filters present an interesting problem though. We do not always have a universally recognized acronym to represent some of the filters.
filters.pmf may be meaningful enough for someone looking for the Progressive Morphological Filter, but not to the majority of readers. The flip side would be to name the filters functionally, like filters.ground for segmenting ground returns. But many algorithms could serve such a purpose, and PDAL does not currently support hierarchical or nested stage names like filters.ground.pmf. Furthermore, the roles of a reader and a writer are relatively clear, but a filter can serve multiple roles: some filters create new dimensions, some alter dimensions (but not coordinates), and others move the coordinates themselves. Some cull points, while others split or join point clouds. At this point in time, PDAL makes no distinction between these different filter types. For all of these reasons, and in lieu of a major overhaul of PDAL’s Filter class, improved documentation is perhaps the most straightforward answer to our problem.
The filter documentation will be further grouped into several broad categories to capture the intended behavior.
filters.returns can now be used to split a point cloud into separate point clouds based on NumberOfReturns. Users can specify any combination of first, intermediate, last, or only, and filters.returns will create a new point cloud for each subset.

When the NumberOfReturns is exactly 1, a return is considered an only return. When the NumberOfReturns is greater than 1, first returns are identified as those with a ReturnNumber of 1, and last returns are identified as those whose ReturnNumber is equal to the NumberOfReturns. When the NumberOfReturns is greater than 2, intermediate returns are those for which the ReturnNumber is both greater than 1 and less than the NumberOfReturns.
The pdal translate command below reads the input LAS file and uses filters.returns to split the point cloud into two separate point clouds. In this case, the point clouds are written to two separate files, output_1.las and output_2.las, comprised of first and last returns respectively.
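The grouping rules described above can be sketched in plain Python. This is purely an illustration of the logic, not PDAL's implementation:

```python
def classify_return(return_number, number_of_returns):
    """Classify a point by its ReturnNumber / NumberOfReturns pair.

    Mirrors the rules filters.returns applies: a single-return pulse
    yields an 'only' return; multi-return pulses yield 'first', 'last',
    and (for three or more returns) 'intermediate' returns.
    """
    if number_of_returns == 1:
        return "only"
    if return_number == 1:
        return "first"
    if return_number == number_of_returns:
        return "last"
    return "intermediate"
```

For example, a point with ReturnNumber 2 out of NumberOfReturns 3 is classified as an intermediate return, while 3 of 3 is a last return.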
pdal translate input.las output_#.las returns \
    --filters.returns.groups="first,last"
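The same split can also be expressed as a PDAL pipeline file. This sketch assumes the groups option of filters.returns (consult the stage documentation for exact option names); save it as pipeline.json and run it with pdal pipeline pipeline.json:

```json
{
  "pipeline": [
    "input.las",
    {
      "type": "filters.returns",
      "groups": "first,last"
    },
    "output_#.las"
  ]
}
```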
PDAL binaries are already available to Linux users via Fedora, Debian, and Alpine. Windows users can install via OSGeo4W. And macOS users can install via Homebrew. We also maintain Docker images. So why another packaging system?
Conda gives us a common look and feel across platforms. Once the PDAL conda-forge package is finalized in the coming days/weeks, regardless of your OS, you will be able to
conda install -c conda-forge pdal
The -c conda-forge option indicates that we are installing from the conda-forge channel.
If you would like to install PDAL in an isolated environment, you can
conda create -n pdalenv
conda install -n pdalenv -c conda-forge pdal
or in one step
conda create -n pdalenv -c conda-forge pdal
In doing so, we can now run multiple versions of PDAL in a clean environment, which is especially important if the user has conflicting dependencies installed on their system or if they are using PDAL alongside Python and multiple Python packages.
Note: While Docker is certainly one possible alternative to running PDAL on many different platforms and in an isolated manner, we have found that for many users the barrier to entry is prohibitive.
A more elaborate example: to install PDAL with Python bindings for use inside a Jupyter notebook with Numpy and Scipy packages, we can be up and running in just three simple steps.
conda create -n pdal-notebook -c conda-forge \
source activate pdal-notebook
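Spelled out end to end, the three steps might look like the following. The package set shown here (python-pdal for the Python bindings, plus jupyter, numpy, and scipy) is an assumption about the intended environment; check conda-forge for the current package names:

```shell
# 1. Create an environment with PDAL, its Python bindings, and the notebook stack
#    (python-pdal is assumed to be the conda-forge name of the Python bindings)
conda create -n pdal-notebook -c conda-forge pdal python-pdal jupyter numpy scipy

# 2. Activate the environment
source activate pdal-notebook

# 3. Launch the Jupyter notebook server
jupyter notebook
```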