Copy and Paste Georeference Data using GDAL

Tommy
3 min readJun 18, 2022

--

Recently I am working for a project that requires “copy & paste” georeference data of between identical raster file, varying format (TIFF and PDF) and resolution (150dpi, 600dpi, 1200dpi…).

Since the content and extent of files are identical, it’s better to find a simple one-line command to complete the task, don’t want to worry the projection and transformation. Googling “GDAL copy georeference”, most of the results point to a python script, gdalcopyproj.py.

This script is quite simple and intuitive, using GDAL Python library, copying geotransform and projection data from source file, and set in the destination file. One-line command, nice! (Remember to install GDAL if you have not!)

python gdalcopyproj.py source_file dest_file

Note that the script is quite historical as written in Python 2, which is already end-of-life on 2020–1–1, so first of all, change all“print XX” to “print(XX)” so as to successfully run in Python 3.

This script works well for files of identical pixel dimension (i.e. resolution), except some PDF (I want to copy the georeference data from GeoTIFF to PDF, I am not sure whether GDAL has bug in processing PDF, anyway I failed to paste the data to PDF accordingly). However, it doesn’t work for different resolution, why?

You shall know a little bit about Coordinate Reference System and Spatial Projection first. In my very inaccurate description, the rectangular raster is projected in a rectangular coordinate system (like x,y coordinate system learnt in secondary school maths). The “origin” is imagined to be top-left corner of the raster (h0,k0), it is the coordinates of actual projection system used (e.g. EGSG:27700 for UK, EPSG:2326 for HK). The actual coordinates for any point (h,k) in the raster file coordinates (x,y) is (h0 + x * dx, y0 + y * dy). You may think it is coordinate conversion between computer graphic and geospatial projection.

h = h0 + x * dx
k = k0 + y * dy
where dx and dy are conversion unit, means one pixel represents how many unit of x/y in the geospatial projection.

For a raster with X, Y pixels in its files, the length of easting and northing of raster in the corresponding projection is X * dx (Xdx) and Y * dy (Ydy). Imagine the four corners of the raster…the coordinates for computer graphic and geospatial projection are…

top-left: (0,0) -> (h0, k0)
top-right: (X,0) -> (h0 + Xdx, k0)
bottom-left: (0,-Y) -> (h0, k0 + Ydy)
bottom-right: (X, -Y) -> (h0 + XDx, k0 + Ydy)
Note that: dx > 0, dy < 0
because downward is decreasing for geospatial, but increasing for computer graphic

So… for the raster files of identical extent, the values of (h0, k0, Xdx, Ydy) are the same, but dx and dy vary with resolution. The logic of dx and dy is similar to the unit of dpi / ppi…for the same physical length, pixel increases means one pixel represents shorter physical length.

Now you may understand the dx and dy should be adjusted when raster resolution (pixels) changed.

Inside gdalcopyproj.py, GetGeoTransform is a function to return the tuple containing “h0, k0, dx, dy” that I mentioned above.

dataset.GetGeoTransform()
# (h0, dx, 0, k0, 0, dy)

To consider the function for copying projection to the same raster (identical extent) with different resolution, dx and dy should be modified as:

geotransform_new = list(dataset.GetGeoTransform())# total X-length in projection(Xdx) = dx_old * pixel_x_old
# dx_new = Xdx / pixel_x_new
geotransform_new[1] = geotransform_new[1] * dataset.RasterXSize / dataset2.RasterXSize
# Ydy = dy_old * pixel_y_old
# dy_new = Ydy / pixel_y_new
geotransform_new[5] = geotransform_new[5] * dataset.RasterYSize / dataset2.RasterYSize
if geotransform is not None:
dataset2.SetGeoTransform( tuple(geotransform_new) )
# RasterXSize, RasterYSize are number of pixels in the raster file

Finally, here is the “improved version” of gdalcopyproj.py

--

--