PDAL at the OSGeo Code Sprint
Two weeks ago, approximately 95 contributors from a wide range of open source geospatial projects convened in Bonn, Germany for the 2018 OSGeo Code Sprint. Of these, about 17 (spanning eight nationalities and ten organizations) were contributors to various point cloud projects. The PDAL development team pushed hard towards a PDAL v1.7 release. Much of my own focus was on cleaning up documentation, introducing a new filter, and solidifying a Conda recipe for PDAL.
Documentation
In each of the last seven releases, the number of PDAL stages has increased.
As we approach the century mark, I personally celebrate the fact that we’ve designed a system that is not only flexible in its ability to transcode between various point cloud formats, but has also established itself as a powerful processing engine, with over half of the stages devoted to pipelined filtering operations.
With all of these capabilities, we unfortunately also have a documentation problem on our hands. If we were focused solely on readers and writers, many of which are referenced by their file extension or specification name, we could stick to presenting a monolithic list of stages like readers.las or writers.matlab.
Filters present an interesting problem though. We do not always have a universally recognized acronym to represent a filter. filters.pmf may be meaningful enough for someone looking for the Progressive Morphological Filter, but not to the majority of readers. The flip side would be to name the filters functionally, like filters.ground for segmenting ground returns. But many algorithms could be used for such a purpose, and PDAL doesn’t currently support hierarchical or nested stage names like filters.ground.pmf. Furthermore, the roles of a reader and a writer are relatively clear, but a filter can serve multiple roles. Some filters create new dimensions. Others alter dimensions (but not coordinates), while others move the coordinates themselves. Some cull points, while others split or join point clouds. At this point in time, PDAL makes no distinction between these different filter types. For all of these reasons, and in lieu of a major overhaul of PDAL’s Filter class, improved documentation is perhaps the most straightforward answer to our problem.
With the release of PDAL v1.7, we will begin providing stage descriptions along with stage names in the reader, writer, and filter documentation. An example is shown below.
The filter documentation will be further grouped into several broad categories to capture the intended behavior.
Returns Filters
filters.returns can now be used to split a point cloud into separate point clouds based on ReturnNumber and NumberOfReturns. Users can specify any combination of first, intermediate, last, or only, and filters.returns will create a new point cloud for each subset.
When the NumberOfReturns is exactly 1, a return is considered an only return. When the NumberOfReturns is greater than 1, first returns are identified as those with a ReturnNumber of 1, and last returns are identified as those whose ReturnNumber is equal to the NumberOfReturns. When the NumberOfReturns is greater than 2, intermediate returns are those for which the ReturnNumber is both greater than 1 and less than the NumberOfReturns.
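The rules above can be sketched in a few lines of plain Python. This is only an illustration of the logic as described here, not PDAL's actual implementation; the function names classify_return and split_by_return are hypothetical, and points are assumed to be simple (ReturnNumber, NumberOfReturns) tuples. Values that match none of the rules (for example, a ReturnNumber of 0) are left unlabeled, since this post does not say how the filter treats them.

```python
def classify_return(return_number, number_of_returns):
    """Label one point as 'only', 'first', 'last', or 'intermediate'.

    Returns None when the values match none of the rules described
    in the text (e.g. zero or inconsistent values).
    """
    if number_of_returns == 1:
        return "only"
    if number_of_returns > 1:
        if return_number == 1:
            return "first"
        if return_number == number_of_returns:
            return "last"
        # Only reachable with a match when number_of_returns > 2.
        if 1 < return_number < number_of_returns:
            return "intermediate"
    return None


def split_by_return(points, groups):
    """Split (ReturnNumber, NumberOfReturns) tuples into one list per group.

    Points whose label is not among the requested groups are dropped.
    """
    subsets = {group: [] for group in groups}
    for point in points:
        label = classify_return(*point)
        if label in subsets:
            subsets[label].append(point)
    return subsets


if __name__ == "__main__":
    sample = [(1, 1), (1, 2), (2, 2), (2, 3)]
    print(split_by_return(sample, ["first", "last"]))
```

Requesting only the first and last groups, as in the command-line example that follows, discards the only and intermediate returns in this sketch.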
The following pdal translate command reads the input LAS file and uses filters.returns to split the point cloud into two separate point clouds. In this case, the point clouds are written to two separate files, output_0.las and output_1.las, which contain the first and last returns respectively.
pdal translate input.las output_#.las returns \
--filters.returns.groups="first,last"
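The same operation can also be expressed as a PDAL pipeline. The sketch below builds the pipeline JSON with Python's standard json module; the helper name returns_pipeline is hypothetical, and the layout assumes PDAL's usual pipeline conventions, where bare filenames let PDAL infer the reader and writer and the # placeholder in the output filename yields one file per point cloud.

```python
import json


def returns_pipeline(src, dst_template, groups):
    """Build a pipeline dict that splits `src` by return group.

    `groups` is a list such as ["first", "last"]; it is joined into the
    comma-separated string used by the `groups` option in the command above.
    """
    return {
        "pipeline": [
            src,  # reader inferred from the file extension
            {"type": "filters.returns", "groups": ",".join(groups)},
            dst_template,  # '#' placeholder: one output file per point cloud
        ]
    }


pipeline_json = json.dumps(
    returns_pipeline("input.las", "output_#.las", ["first", "last"]),
    indent=2,
)
print(pipeline_json)
```

Saved to a file (say, returns.json, a name chosen here for illustration), the output should be runnable with pdal pipeline returns.json.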
Conda
PDAL binaries are already available to Linux users via Fedora, Debian, and Alpine. Windows users can install via OSGeo4W, and macOS users can install via Homebrew. We also maintain Docker images. So why another packaging system?
Conda gives us a common look and feel across platforms. Once the PDAL conda-forge package is finalized in the coming days/weeks, regardless of your OS, you will be able to
conda install -c conda-forge pdal
where -c conda-forge indicates that we are installing from the conda-forge channel.
If you would like to install PDAL in an isolated environment, you can
conda create -n pdalenv
conda install -n pdalenv -c conda-forge pdal
or in one step
conda create -n pdalenv -c conda-forge pdal
In doing so, we can run multiple versions of PDAL, each in its own clean environment, which is especially important if users have conflicting dependencies installed on their system or are using PDAL alongside Python and multiple Python packages.
Note: While Docker is certainly one possible alternative for running PDAL on many different platforms and in an isolated manner, we have found that for many users the barrier to entry is prohibitive.
A more elaborate example: to install PDAL with Python bindings for use inside a Jupyter notebook with the NumPy and SciPy packages, we can be up and running in just three simple steps.
conda create -n pdal-notebook -c conda-forge \
jupyter \
python=3 \
pdal \
python-pdal \
numpy \
scipy
source activate pdal-notebook
jupyter notebook
For more on the topic, please come see my talk at FOSS4G North America.