Editor’s Preface: Obtaining, digitizing, georeferencing and publishing historical Vermont orthoimagery are some of the many things we at VCGI wish there were more time and resources to do. Fortunately, we’ve been lucky to have met Cale Kochenour, a Vermonter and current graduate student in GIS and remote sensing at Penn State. Cale has graciously volunteered not only to georeference some of the digitized old black-and-white VT orthoimagery we’ve collected over the years, but also to document for others the process and lessons learned. Learn more about the imagery program here, and access imagery via the Vermont Open Geodata Portal here.
Over to Cale.
I: Introduction to Vermont’s Historic Imagery
Jenny Bower wrote about Vermont’s historic aerial imagery in a previous post on our Medium page. In the article, Jenny discusses the history of aerial imagery, as well as its acquisition and use by VCGI.
To date, VCGI has published three historic aerial imagery data sets to the Vermont Open Geodata Portal:
- Imagery collected by the U.S. Soil Conservation Service in 1942 at a scale of 1:20,000 (Tile Download)
- Imagery collected by the Vermont Department of Highways in 1962 at a scale of 1:18,000 (Tile Download)
- Imagery collected by the Vermont Department of Highways in 1962 at a scale of 1:6,000 (Tile Download)
II: Overview of the Georeferencing Workflow
The presentation below provides a tutorial for how to georeference raster files such as historical orthoimagery using QGIS, an open-source GIS software package. (A .pdf (19 MB) and a .pptx (70 MB) are also available for download.)
The workflow contains the following modules:
- System Prerequisites
- Environment Setup
- Image Acquisition
- Image Georeferencing
- Quality Assurance and Quality Control
- Wrap Up and File Collection
- Tips and Lessons Learned
I aimed for this tutorial to be useful for all audiences. It will give an inexperienced GIS user the knowledge and skills to georeference raster files. It will also give an experienced GIS user an understanding of the nuances involved with georeferencing historical aerial imagery, and Vermont’s historic imagery in particular. The modules are presented in an order that follows the georeferencing process. However, each module was designed to stand alone, which gives users the flexibility to pick and choose which modules to review, depending on their needs and skills.
Each module is broken down as follows:
System Prerequisites:
- Download and installation of QGIS.
- Installation of the QGIS Georeferencer GDAL plugin.
Environment Setup:
- Selection and confirmation of the QGIS Coordinate Reference System.
- Addition of basemap layers.
Image Acquisition:
- Overview of the historic imagery on the Vermont Open Geodata Portal.
- Dataset tile layout.
- Image download process.
- Web service options.
- Links to historic imagery and metadata.
Image Georeferencing:
- Working with the Georeferencer GDAL plugin.
- Georeferencing transformation settings, configurations, and properties.
- Expected georeferencing results.
Quality Assurance and Quality Control:
- Qualitative and quantitative checks to run post-georeferencing.
- Discussion of the georeferencing transformation.
Wrap Up and File Collection:
- Specifications used to georeference the DCC 1942 data set.
- Georeferenced image format conversion.
- Expected georeferencing output files.
- Links to QGIS and the Georeferencer GDAL plugin documentation.
Tips and Lessons Learned:
- Discussion of items unique to historic aerial imagery.
- Recommendations for control point locations.
- The No Data value in QGIS.
- Historic aerial imagery frame artifacts.
III: Lessons Learned from the Georeferencing Workflow
I learned a number of things while applying the georeferencing process to the historic 1942 imagery, and these lessons may be helpful for those also learning the ropes of georeferencing. These tips should equally apply to the 1962 datasets.
Control Point Locations and the Georeferencing Transformation
Vermont’s historic, analog aerial imagery has not been orthorectified, meaning the imagery contains uncorrected distortions caused by the terrain. Buildings and structures lean away from the image center, and the amount of lean increases as you move farther from the image center. This effect causes the scale to vary across the image. In general, for non-orthorectified imagery, a second-order polynomial provides the most appropriate georeferencing transformation. Where a first-order transformation can only shift, scale, and rotate an image, a second-order polynomial can also warp it in order to fit the defined control points. The following screenshots show the QGIS Georeferencer window and where a user can change the selected georeferencing transformation.
While the second-order polynomial offers this advantage for georeferencing historic aerial imagery, it must be used consciously. A user should choose a transformation after an analysis of the control point locations. If possible, a user should place control points at the corners, edges, and center of an image, and should evenly distribute control points throughout the image. Vermont’s terrain does not always allow for the ideal case. In areas of the state covered by water, forests, open fields, and/or locations with no easily identifiable landmarks, a user may have more difficulty in placing control points. The following screenshots demonstrate images with evenly distributed control points.
Using a second-order transformation without a sufficient control point distribution (i.e. control points missing at the corners, edges, or center, or not evenly distributed) will likely degrade the output image. Enough warping and distortion can spoil the appearance of the georeferenced image and may render it unusable for the intended purpose. In such cases, the user should reconsider the second-order transformation and use a first-order transformation instead to maintain the integrity of the image shape. The following image shows the effect of poor control point distribution combined with a second-order transformation.
Two examples provided below demonstrate how the terrain dictates the georeferencing transformation. The first image, DCC_1942_02–169, shows a capture near Williston, VT.
The image contains road intersections and buildings throughout. Some forested areas are present. However, there are no features that prevent the even distribution of control points throughout the image. Edges, borders, and corners of land plots, while less ideal than road intersections and buildings, can be used for the control points if necessary. A second-order polynomial would be the appropriate transformation for this image.
The second image, DCC_1942_06–158, shows the northern part of Mount Mansfield near Underhill, VT.
The image contains some roads and buildings in the northwest corner. In the southeast corner, the unique and identifiable rock features on the Mount Mansfield Forehead and Adam’s Apple provide some locations for control points. The Sunset Ridge Trail provides a similar opportunity for control points. Across the remainder of the image, along a band from the southwest corner to the northeast corner, a user will struggle to place control points precisely. The thickness and density of the forest make it impractical to identify the same tree in both the 1942 imagery and a modern basemap or image data set. The following image shows parts of the image that could be used for control points (identifiable rock faces) and other parts that should not be used (forested areas).
This scenario presents a case where the use of a second-order georeferencing transformation (with insufficient control points) would alter the image enough that it would lose its aesthetic and usability (as demonstrated in DCC_1942_06–160 above). The terrain dictates that a first-order transform — where the image is shifted, scaled, and rotated, but not warped — would be the most appropriate choice.
Ultimately, a user should choose a georeferencing transformation based on the limitations and nature of the specific image. The right choice varies from image to image, largely as a function of the location and type of features (land cover) the image contains. GIS users should not fret about this decision. The choice of georeferencing transformation is made after the user creates the control points, so multiple transformations can be applied to the same control points, each creating its own georeferenced image. If in doubt, a user may visually compare the output images from different transformations to see which provides the best fit for the specific application of the imagery.
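To make the difference between the two transformations concrete, here is a minimal sketch of a least-squares polynomial fit using NumPy. It is an illustration only, not the code QGIS runs: the function names, the control points, and the map coordinates are all made up. It does show why a second-order fit of this form needs at least six control points where a first-order fit needs three, since each output coordinate has six coefficients instead of three.

```python
import numpy as np


def design_matrix(xy, order):
    """Polynomial terms for each (x, y) pixel coordinate."""
    if order not in (1, 2):
        raise ValueError("this sketch only handles order 1 or 2")
    x, y = xy[:, 0], xy[:, 1]
    cols = [np.ones_like(x), x, y]        # first-order (affine) terms
    if order == 2:
        cols += [x * y, x ** 2, y ** 2]   # extra terms that allow warping
    return np.column_stack(cols)


def fit_transform(pixel_xy, map_xy, order):
    """Least-squares coefficients mapping pixel coordinates to map coordinates."""
    pixel_xy = np.asarray(pixel_xy, dtype=float)
    map_xy = np.asarray(map_xy, dtype=float)
    a = design_matrix(pixel_xy, order)
    if len(pixel_xy) < a.shape[1]:
        raise ValueError(f"order {order} needs at least {a.shape[1]} control points")
    coeffs, *_ = np.linalg.lstsq(a, map_xy, rcond=None)
    return coeffs


def max_residual(pixel_xy, map_xy, order):
    """Largest distance between a control point and where the fit places it."""
    pixel_xy = np.asarray(pixel_xy, dtype=float)
    map_xy = np.asarray(map_xy, dtype=float)
    coeffs = fit_transform(pixel_xy, map_xy, order)
    predicted = design_matrix(pixel_xy, order) @ coeffs
    return float(np.max(np.linalg.norm(predicted - map_xy, axis=1)))


# Made-up control points on a 3 x 3 grid (corners, edges, center).
pixels = [(x, y) for x in (0, 50, 100) for y in (0, 50, 100)]
# Made-up map coordinates with a slight warp built in.
targets = [(1000 + 2 * x + 0.5 * y + 0.001 * x * x,
            5000 + 0.5 * x - 2 * y + 0.002 * x * y) for x, y in pixels]

for order in (1, 2):
    print(order, max_residual(pixels, targets, order))
```

With these invented points, the first-order fit leaves noticeable residuals because it cannot absorb the warp, while the second-order fit follows it closely. The flip side is the one discussed above: those extra terms will bend the image wherever control points are sparse.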
Control Point Categories
The workflow sorts control point locations into three categories (Most Ideal, Less Ideal, and Least Ideal) within the context of Vermont’s historic aerial imagery.
- Most Ideal control point locations exist where minimal or no change in land cover has occurred from the time of image acquisition to the date of the reference image. Locations in this category include road intersections and centerlines, corners of buildings, and landmarks.
- Less Ideal control point locations exist where possible change in land cover has occurred over time, but the location is still identifiable in both the historic and reference images. Locations in this category include land cover borders (open land, forested areas, water bodies), individual trees, bushes, or shrubs, and road edges.
- Least Ideal control point locations exist where change in land cover has most likely occurred, and it is difficult or impractical to determine the same location in both the historic and reference images. Locations in this category include open land, forested areas, and water bodies (but excludes land cover borders).
The terrain and image extent drive the placement of control points, and a user must work within the constraints of the image.
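One rough way to check whether a set of control points covers the corners, edges, and center of an image is to lay a 3 x 3 grid over it and see which cells are still empty. The sketch below is a hypothetical helper, not part of QGIS or the workflow; the function names, the grid size, and the sample coordinates are assumptions chosen for illustration.

```python
def grid_coverage(points, width, height, rows=3, cols=3):
    """Return the set of (row, col) grid cells holding at least one control point."""
    occupied = set()
    for x, y in points:
        if not (0 <= x <= width and 0 <= y <= height):
            continue  # ignore points that fall outside the image
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        occupied.add((row, col))
    return occupied


def empty_cells(points, width, height, rows=3, cols=3):
    """List the grid cells that have no control point yet."""
    occupied = grid_coverage(points, width, height, rows, cols)
    return sorted((r, c) for r in range(rows) for c in range(cols)
                  if (r, c) not in occupied)


# Made-up pixel coordinates on a 900 x 900 image.
clustered = [(10, 10), (50, 80), (120, 40)]
spread = [(x, y) for x in (100, 450, 800) for y in (100, 450, 800)]

print(empty_cells(clustered, 900, 900))
print(empty_cells(spread, 900, 900))
```

Several empty cells, especially along one side or in a corner, suggest that a first-order transformation may be the safer choice for that image.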
Converting Image Format
QGIS outputs georeferenced imagery (using the Georeferencer GDAL plugin) in the GeoTIFF format. To convert to a different image format, a user must use the QGIS Translate (Convert Format) option. Here, the user selects the output coordinate system and desired output image format. Some settings within this process contain quirks, such as when converting from GeoTIFF to JPEG2000, as the workflow discusses. More information can be found at the Geospatial Data Abstraction Library (GDAL) website.
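For batches of images, the same conversion can be scripted with GDAL's gdal_translate command-line tool, which is what the Translate (Convert Format) option wraps. The sketch below only assembles the command. The file names are placeholders, the helper name is made up, and it assumes the local GDAL build includes the JP2OpenJPEG driver for JPEG2000 output, which is not true of every installation.

```python
import shlex


def build_translate_command(src_path, dst_path, driver, srs=None):
    """Assemble a gdal_translate call that converts a raster to another format."""
    cmd = ["gdal_translate", "-of", driver]
    if srs is not None:
        # -a_srs assigns a coordinate system label; it does not reproject pixels.
        cmd += ["-a_srs", srs]
    cmd += [src_path, dst_path]
    return cmd


# Placeholder file names, for illustration only.
cmd = build_translate_command("georeferenced_input.tif",
                              "georeferenced_output.jp2",
                              "JP2OpenJPEG")
print(shlex.join(cmd))
```

To actually run the conversion, the list can be passed to `subprocess.run(cmd, check=True)` on a machine where gdal_translate is on the path.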
RMS Error
The workflow discusses the concept of RMS error within the context of historic imagery. In general, the RMS error calculation presents an effective metric to quantitatively assess how well a user georeferenced an image. However, since the Vermont historic aerial images are not orthorectified, the RMS error value does not have the same diagnostic value as it does for orthorectified imagery. A qualitative check of image registration (comparing the georeferenced image to the reference image) suffices in this scenario.
RMS error provides a measure of the consistency of the transformation but is not a direct measurement of image registration accuracy. The residual value shows the difference between where a user placed a control point on the image and where the georeferencing transformation placed the point on the georeferenced image. Adding more control points to an image should provide more accurate image registration, but may also increase the RMS error. Do not worry about this. The specific value of the RMS error is less important in non-orthorectified imagery. Users should focus more on the order of magnitude of the RMS error (e.g. 5 vs. 50 vs. 500). An RMS error of 5 vs. 6 meters may not make a noticeable difference in the georeferenced image, but 5 vs. 50 meters will.
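As a small worked example, the usual definition of RMS error is the square root of the mean of the squared residuals. The sketch below uses that definition with invented residual values; the figure QGIS reports may be computed or scaled somewhat differently, so treat this as an illustration of the idea, not a reproduction of the plugin's output.

```python
import math


def residual_lengths(predicted_xy, target_xy):
    """Distance between where the fit put each control point and where the user placed it."""
    return [math.hypot(px - tx, py - ty)
            for (px, py), (tx, ty) in zip(predicted_xy, target_xy)]


def rms_error(lengths):
    """Square root of the mean squared residual."""
    return math.sqrt(sum(r * r for r in lengths) / len(lengths))


# Invented residuals, in meters, for five control points.
five_points = [4.0, 5.0, 6.0, 5.0, 4.0]
# The same set after adding a sixth point in a harder-to-fit part of the image.
six_points = five_points + [9.0]

print(rms_error(five_points))
print(rms_error(six_points))
```

The second value is larger even though the extra point may well improve how the image registers in that part of the frame, which is why the order of magnitude matters more than the exact number.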
The No Data Value
The QGIS georeferencing process adds a collar (background) with a pixel value of 0 to the edges of the image in order to form a rectangle. More rotation caused by georeferencing equates to more collar pixels. The collar obstructs the reference imagery below the georeferenced image. This can be an inconvenience. Specifying a No Data value that matches the tone of the collar (value of 0) can hide the collar. However, a user must know that any pixels within the image, not just the collar, with the same tone value will also be hidden.
Since the aerial imagery contains a single band (grayscale), as opposed to a 3-band color image (red, green, blue), there is a likelihood that some features (shadows, trees, frame artifacts, image borders) will also have a tone value of 0. It is worth noting that the bit depth (the number of allowable brightness values in an image) varies throughout the 1942 dataset. The 1942 dataset contains both 8-bit images (256 brightness values, 0–255) and 16-bit images (65,536 brightness values, 0–65,535). The inadvertent removal of non-background pixels has a higher chance of occurring in 8-bit images than it does in 16-bit images. Looking at the georeferenced image histogram (the tone value distribution) shows how many pixels in the image contain values of 0. The histogram provides a relative risk profile for how many pixels a No Data value of 0 would remove. The data user should decide whether to use a No Data value, and which value to use, based on the intended use of the image.
Vermont’s historic aerial imagery also contains frame artifacts and image borders. Prior to georeferencing, a user could crop the image with image editing software to remove the artifacts or boundary. The No Data value could also be used here if the user just wants to temporarily hide these features.
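The histogram check can be approximated in a few lines of code. The sketch below counts how many pixels match a proposed No Data value, using a tiny invented grid of 8-bit values in place of a real raster band; the function names are hypothetical, and a real image would be read with a raster library instead of typed in by hand.

```python
from collections import Counter


def brightness_levels(bit_depth):
    """Number of distinct brightness values available at a given bit depth."""
    return 2 ** bit_depth


def nodata_fraction(rows, nodata=0):
    """Fraction of pixels whose value equals the proposed No Data value."""
    counts = Counter(value for row in rows for value in row)
    total = sum(counts.values())
    return counts[nodata] / total if total else 0.0


# Invented 8-bit pixel values; the zeros stand in for deep shadow.
scan = [
    [12, 40, 0, 200],
    [35, 0, 90, 180],
    [60, 75, 110, 255],
]

print(brightness_levels(8), brightness_levels(16))
print(nodata_fraction(scan, nodata=0))
```

Running a count like this on the image before georeferencing, when it has no collar yet, shows how many real image pixels a No Data value of 0 would hide.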
This article provides an overview of the process to georeference raster files in QGIS and the lessons learned from georeferencing some of Vermont’s historic, analog aerial imagery. I have enjoyed exploring this imagery, and I am grateful for the opportunity to share my experience working with the imagery and documenting the georeferencing process. Let me know if you have questions or comments about the topics discussed.