Introduction to Geospatial Data Types, Tools and Programs

Roadside Marker Locations around Burlington, VT

Geospatial data has become an increasingly important subject in the modern world, and knowing what is where has become a driving force both in traditional realms and in the rapidly growing digital one.

Geospatial data usually refers to data that ties latitude/longitude or other reference points to geographic locations, along with attributes that describe features such as images, points, lines or polygons. In a broader sense, any data that includes an address, city, region, state or country is geospatial. Working with this kind of data requires some foundational knowledge, so this article covers the basics of geospatial data first and then explores the file types, tools and programs.

Projections

Map projections are the mathematical methods used to transform the Earth's 3D surface onto a 2D plane. Every approach introduces some distortion, but depending on the need certain projections fit better than others (see http://projectionwizard.org for a great way to determine the best fit).
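As a rough illustration of that distortion trade-off, the sketch below projects a longitude/latitude pair onto the spherical Web Mercator plane used by many web maps and reports how much the projection stretches distances at a given latitude. The function names are illustrative, the Burlington coordinates are approximate, and the formulas assume a sphere with a radius of 6378137 m.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius assumed by Web Mercator

def web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

def scale_factor(lat_deg):
    """How much Mercator stretches distances at this latitude (1.0 at the equator)."""
    return 1.0 / math.cos(math.radians(lat_deg))

# Approximate location of Burlington, VT, used only for illustration
x, y = web_mercator(-73.21, 44.48)
print(round(x), round(y), round(scale_factor(44.48), 2))
```

The scale factor grows with latitude, which is why Mercator maps make areas near the poles look much larger than they are.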

Datums

A datum defines a reference surface and the position of that surface relative to the center of the Earth. It provides a frame of reference for any map file, and even for the printed maps you are used to.

Example of Ellipsoid and Geoid models for Vertical Datums

Horizontal Datums provide a fixed reference point and a reference ellipsoid model, which represents the non-spherical shape of the Earth better than a simple sphere.
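To make the ellipsoid idea concrete, here is a minimal sketch that uses the defining constants of the WGS84 ellipsoid to convert a latitude/longitude/height into Earth-centered X, Y, Z coordinates. The function and constant names are my own; only the two defining numbers (semi-major axis and flattening) come from WGS84.

```python
import math

WGS84_A = 6378137.0                 # semi-major (equatorial) axis in metres
WGS84_F = 1 / 298.257223563         # flattening
WGS84_B = WGS84_A * (1 - WGS84_F)   # semi-minor (polar) axis in metres
WGS84_E2 = WGS84_F * (2 - WGS84_F)  # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, height_m=0.0):
    """Convert geodetic coordinates on the ellipsoid to Earth-centered X, Y, Z metres."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z

# Difference between the equatorial and polar radii, in metres
print(round(WGS84_A - WGS84_B))
```

The polar radius comes out roughly 21 km shorter than the equatorial one, which is the "non-spherical shape" the ellipsoid is meant to capture.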

Vertical Datums are used to describe the elevation or orthometric height of a point either from Mean Sea Level, Tidal data, or from a geoid.
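The relationship between the ellipsoid, the geoid and orthometric height can be sketched in one line: orthometric height is the ellipsoidal height (for example from GPS) minus the geoid height at that spot. The function name and the sample numbers below are hypothetical and are not measurements for any real location.

```python
def orthometric_height(ellipsoidal_height_m, geoid_height_m):
    """Height above the geoid, given height above the ellipsoid and the geoid's offset from it."""
    return ellipsoidal_height_m - geoid_height_m

# Hypothetical values: a receiver reports 30 m above the ellipsoid where
# the geoid sits 28 m below the ellipsoid (geoid height of -28 m)
print(orthometric_height(30.0, -28.0))
```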

Global Datums In an increasingly connected world, the need for a universal perspective led to a global datum.

Datum Shifts

A datum shift is a coordinate transformation (the EPSG and OGC term) or geographic transformation (the Esri term) between datums. The difference between coordinates in different datums can range from very small to hundreds of feet.

The Future of Datums

NAD 83 and NAVD 88 were scheduled to be replaced in 2022 by newer reference frames, although the National Geodetic Survey has since delayed the rollout. The replacements will correct for continental drift and adjust for several other factors.

Geographic Information Systems

A Geographic Information System (GIS) is a system designed to capture, store, manipulate, analyze, and present spatial and geographic data so that relationships, patterns and trends can be understood.

This includes:

  • Topographical Maps
  • Property Boundaries
  • Roads and Rivers
  • Census Data by Town

Vector vs Raster

Raster

Data recorded as a grid of pixels (cells), each holding image or other information

  • Resolution is the pixel size
  • Can display satellite and other photographic information
Example of IR Raster Images
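A minimal sketch of the raster idea: the data is just a grid of values, and a few georeferencing numbers (origin and pixel size) tie each cell to a place on the ground. The grid values, origin and 30 m resolution below are made up for illustration, and the helper assumes a simple north-up raster with no rotation.

```python
def pixel_to_map(col, row, origin_x, origin_y, pixel_width, pixel_height):
    """Return the map coordinate of the top-left corner of a pixel (north-up raster)."""
    x = origin_x + col * pixel_width
    y = origin_y - row * pixel_height  # rows count downward from the top edge
    return x, y

# A tiny 3x4 raster: each cell holds one value, such as elevation in metres (made-up data)
grid = [
    [102, 105, 110, 118],
    [101, 104, 109, 115],
    [99, 102, 106, 111],
]

# Hypothetical georeferencing: top-left corner at (440000, 4930000) with 30 m pixels
x, y = pixel_to_map(col=2, row=1, origin_x=440000.0, origin_y=4930000.0,
                    pixel_width=30.0, pixel_height=30.0)
print(grid[1][2], x, y)
```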

Vector

Scalable data represented geometrically as points, lines, or shapes (polygons)

  • Smaller file sizes
  • Better for depicting boundaries, roads and regional area
  • Can be styled as points, polygon color, line weights
Example of vector file showing regions in Burlington
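By contrast, a vector feature can be sketched as a list of coordinates plus a set of attributes. The example below stores a hypothetical rectangular region (the name, color and coordinates are invented) and computes its planar area with the shoelace formula, assuming the coordinates are in a projected system measured in metres.

```python
def ring_area(ring):
    """Planar area of a closed ring via the shoelace formula (units follow the coordinates)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical feature: geometry (closed ring of x, y pairs) plus descriptive attributes
region = {
    "geometry": [(0.0, 0.0), (400.0, 0.0), (400.0, 300.0), (0.0, 300.0), (0.0, 0.0)],
    "attributes": {"name": "Example Region", "fill": "#3388ff"},
}

print(region["attributes"]["name"], ring_area(region["geometry"]))
```

Because only the corner coordinates are stored, the shape can be drawn at any zoom level without losing sharpness, which is where the smaller file sizes come from.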

FILE TYPES

Shapefile A shapefile is a vector data storage format, developed by Esri with an openly published specification, for storing the location, shape, and attributes of geographic features. It is stored as a set of related files and contains one feature class.

A shapefile consists of a main file (.shp), an index file (.shx), and a dBASE table (.dbf).

Additionally there may be a .prj file, a plain text file that describes the coordinate system and projection in well-known text (WKT) format, and an .xml file that contains the metadata.

Any datums and projections can be used.
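For a peek inside the main file, the sketch below decodes the fixed 100-byte header that starts every .shp file, following the published shapefile specification. To stay self-contained it builds a synthetic header in memory instead of opening a real file; the bounding box values and the function name are invented for the example.

```python
import struct

SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}

def parse_shp_header(header):
    """Decode the fixed 100-byte header at the start of a .shp file."""
    file_code, = struct.unpack(">i", header[0:4])       # big-endian, always 9994
    length_words, = struct.unpack(">i", header[24:28])  # file length in 16-bit words
    version, shape_type = struct.unpack("<2i", header[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    if file_code != 9994:
        raise ValueError("not a shapefile main file")
    return {
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_type, "Other"),
        "length_bytes": length_words * 2,
        "bbox": (xmin, ymin, xmax, ymax),
    }

# Synthetic header for illustration: a polygon file with a made-up bounding box
header = (
    struct.pack(">i", 9994) + b"\x00" * 20 + struct.pack(">i", 50)
    + struct.pack("<2i", 1000, 5)
    + struct.pack("<4d", -73.3, 44.4, -73.1, 44.6)
    + b"\x00" * 32  # Z and M ranges, unused here
)
info = parse_shp_header(header)
print(info)
```

In practice a library such as Fiona or GeoPandas handles this for you; the point is only that the format is a documented binary layout rather than a black box.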

KML (Keyhole Markup Language) XML-based standard primarily used for Google Earth. KMZ (KML-Zipped), a compressed archive containing a KML file, replaced KML as the default Google Earth geospatial format.

KML/KMZ became an international standard of the Open Geospatial Consortium in 2008.

The longitude, latitude components (decimal degrees) are as defined by the World Geodetic System of 1984 (WGS84). The vertical component (altitude) is measured in meters from the WGS84 EGM96 Geoid vertical datum.
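As a small sketch of what that looks like on disk, the snippet below builds a one-placemark KML document with Python's standard library. Note the coordinate order of longitude, latitude, altitude. The placemark name and location are placeholders, and the helper function is my own.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def placemark_kml(name, lon, lat, alt_m=0.0):
    """Build a minimal KML document holding a single point placemark."""
    ET.register_namespace("", KML_NS)  # write KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinates are written as longitude,latitude,altitude
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},{alt_m}"
    return ET.tostring(kml, encoding="unicode")

doc = placemark_kml("Example marker", -73.21, 44.48)
print(doc)
```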

OSM (OpenStreetMap) XML-based file format that contains geographic data in a structured, ordered format. The more efficient, smaller .pbf format (“Protocolbuffer Binary Format”) is a compressed alternative to OSM XML.

QGIS can load native .OSM files but not .PBF. The OpenStreetMap plugin can convert PBF to OSM, which then can be used in QGIS.

Like KML, it uses the World Geodetic System of 1984 (WGS84).
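To show the structure of OSM XML, here is a minimal sketch that parses a tiny hand-written sample: nodes carry the coordinates, ways reference nodes by id, and tags hold the attributes. The sample data is invented and much smaller than a real extract.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the OSM XML layout (not real map data)
OSM_SAMPLE = """<osm version="0.6">
  <node id="1" lat="44.4759" lon="-73.2121">
    <tag k="amenity" v="cafe"/>
  </node>
  <node id="2" lat="44.4765" lon="-73.2110"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="footway"/>
  </way>
</osm>"""

root = ET.fromstring(OSM_SAMPLE)

# Nodes hold the actual coordinates, keyed by id
nodes = {n.get("id"): (float(n.get("lon")), float(n.get("lat"))) for n in root.iter("node")}

# Ways are ordered lists of node references plus key/value tags
ways = {}
for way in root.iter("way"):
    ways[way.get("id")] = {
        "coords": [nodes[nd.get("ref")] for nd in way.iter("nd")],
        "tags": {t.get("k"): t.get("v") for t in way.iter("tag")},
    }

print(ways)
```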

Flat Files Any flat file that contains a country, state, county, city or address is technically geospatial data and can be joined with other types as attributes to the respective geospatial location.
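A minimal sketch of that kind of join, using only the standard library: rows from a CSV are matched to point features on a shared town name. The population figures, coordinates and field names are all made up for the example.

```python
import csv
import io

# Made-up flat file: one row per town
CSV_TEXT = """town,population
Burlington,100
Winooski,40
"""

# Made-up geospatial features keyed by the same town name
features = [
    {"town": "Burlington", "lon": -73.21, "lat": 44.48},
    {"town": "Winooski", "lon": -73.19, "lat": 44.49},
    {"town": "Shelburne", "lon": -73.23, "lat": 44.38},
]

def join_attributes(features, rows, key):
    """Attach the matching row's columns to each feature; unmatched features stay as they are."""
    lookup = {row[key]: row for row in rows}
    return [{**feature, **lookup.get(feature[key], {})} for feature in features]

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
joined = join_attributes(features, rows, key="town")
print(joined)
```

Values read by the csv module arrive as strings, so numeric columns would need converting before any analysis.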

DATA SERVICES — API

GeoJSON GeoJSON supports the following geometry types: Point, LineString, Polygon, MultiPoint, MultiLineString, and MultiPolygon. Geometric objects with additional properties are Feature objects. Sets of features are contained by FeatureCollection objects. http://geojson.org/
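The nesting described above can be sketched with plain Python dictionaries: a Point geometry goes inside a Feature with properties, which goes inside a FeatureCollection. The marker name and coordinates are placeholders, and the helper function is my own.

```python
import json

def point_feature(lon, lat, **properties):
    """Wrap a longitude/latitude pair and its attributes as a GeoJSON Feature."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},  # longitude first
        "properties": properties,
    }

collection = {
    "type": "FeatureCollection",
    "features": [point_feature(-73.21, 44.48, name="Example marker")],
}

text = json.dumps(collection, indent=2)
print(text)
```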

GeoService REST-based service that can return HTML and JSON and, depending on the version of the service, can also return GeoJSON, KML, and AMF. http://resources.arcgis.com/en/help/arcgis-rest-api/index.html

SOAP (Simple Object Access Protocol) XML-based protocol for exchanging structured information in the implementation of web services in computer networks. http://wiki.gis.com/wiki/index.php/SOAP

ArcGIS JavaScript JavaScript-based API for building engaging, beautiful web mapping applications. https://developers.arcgis.com/javascript/

DATA SERVICES — OPENGIS STANDARD FORMATS

WMS (Web Map Service) HTTP interface for requesting geo-registered map images from one or more distributed geospatial databases. The response to the request is one or more geo-registered map images (returned as JPEG, PNG, etc). http://www.opengeospatial.org/standards/wms
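A WMS request is just a URL with a standard set of query parameters, so a rough sketch only needs the standard library. The server address and layer name below are hypothetical; the parameter names follow the WMS 1.3.0 GetMap request, and CRS:84 is used so the bounding box is given in longitude/latitude order.

```python
from urllib.parse import urlencode

def wms_getmap_url(base_url, layer, bbox, width, height, image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL for one layer and a lon/lat bounding box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",        # empty value asks for the server's default style
        "CRS": "CRS:84",     # WGS84 with longitude, latitude axis order
        "BBOX": ",".join(str(v) for v in bbox),  # minx, miny, maxx, maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)

# Hypothetical endpoint and layer name, for illustration only
url = wms_getmap_url("https://example.com/wms", "roads", (-73.3, 44.4, -73.1, 44.6), 800, 800)
print(url)
```

Fetching that URL from a real server would return the rendered map image for the requested extent.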

WCS (Web Coverage Service) for sharing geospatial data stored and managed as a coverage across the web. A WCS provides access to coverage data in forms that are useful for client-side rendering, as input into scientific models, and for other clients. http://www.opengeospatial.org/standards/WCS

WMTS (Web Map Tile Service) A server application for sharing pre-cached map tiles across the web for use as basemaps with predefined content, extent, and resolution. http://www.opengeospatial.org/standards/WMTS

WPS (Web Processing Service) for sharing geoprocessing services across the web for performing dynamic geospatial analytics. http://www.opengeospatial.org/standards/WPS

CSW (Catalog Service for the Web) for sharing geospatial information, typically metadata, stored in XML across the web.

JAVASCRIPT, PLUG AND PLAY AND OTHERS

APIS, SDKS, PLATFORMS, LIBRARIES, ETC

ARCGIS

https://www.arcgis.com ArcGIS offers a unique set of capabilities for applying location-based analysis to your business practices. Gain greater insights using contextual tools to analyze and visualize your data. Then share these insights and collaborate with others via apps, maps, and reports.

  • Professional Standard
  • Enterprise Services
  • Comprehensive Development
  • ArcExplore (requires account creation)

QGIS

http://www.qgis.org QGIS is an Open Source Geographic Information System (GIS) licensed under the GNU General Public License. QGIS is an official project of the Open Source Geospatial Foundation (OSGeo). It runs on Linux, Unix, Mac OSX, Windows and Android and supports numerous vector, raster, and database formats and functionalities.

PYTHON

Python is an open source general-purpose programming language that can be used for data analysis, with a robust selection of geospatial libraries:

  • GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types and builds on some of the libraries below for manipulation and plotting
  • PySAL is an open source, cross-platform library of spatial analysis functions
  • Fiona imports and exports vector data in various formats such as shapefile
  • Rasterio imports and exports raster data in various formats
  • PyProj defines and transforms the datums and projections of spatial data
  • Shapely manipulates and analyzes planar geometric objects
  • Cartopy and Descartes are cartography tools for making maps

ArcPy Python site package for ArcGIS that provides a useful and productive way to perform geographic data analysis, data conversion, data management, and map automation with Python.

R

R is an open source programming language and software environment for statistical computing and has a huge number of spatial data packages. Here are some of the common ones:

  • ggmap extends the plotting package ggplot2 for maps
  • rgdal is R’s interface to the popular C/C++ spatial data processing library GDAL
  • rgeos is R’s interface to the powerful vector processing library GEOS
  • maptools provides various mapping functions
  • dplyr and tidyr are fast and concise data manipulation packages
  • tmap is a newer package for rapidly creating beautiful maps
  • The CRAN Task View on spatial data (cran.r-project.org) provides a general list of spatial packages

R-ArcGIS The R-ArcGIS Community is a community-driven collection of free, open source projects making it easier and faster for R users to work with ArcGIS data, and ArcGIS users to leverage the analysis capabilities of R.

COMMUNITIES

  • The Spatial Community is a Slack-based community of over 1600 geospatial enthusiasts: developers, GIS professionals, students, and hobbyists. http://thespatialcommunity.org
  • OSGeo was created to support the collaborative development of open source geospatial software, and promote its widespread use. http://www.osgeo.org
  • OGC (Open Geospatial Consortium) is an international not for profit organization committed to making quality open standards for the global geospatial community. http://www.opengeospatial.org
  • NEARC North East ArcGIS User Group is an independent, volunteer organization dedicated to helping users of Esri GIS software and hosts two conferences annually. http://www.northeastarc.org
  • VGIS-L Vermont GIS Community Listserve: https://list.uvm.edu/cgi-bin/wa?A0=VGIS-L

Conclusion

There are many more file types, APIs, programs and tools beyond this short list. While it can be intimidating to approach geospatial problems, the power of layering is well worth the effort, and there are many quick tools that allow you to get started with a basic map in minutes.

--

Kendall Fortney
Vermont Center for Geographic Information

Self-taught Data Scientist focused on Python, machine learning and Geospatial Data, with a degree in Art and years of experience in tech in Vermont.