Geofencing in ODK

Jan Schenk
Jun 25, 2017

At ikapadata we are sometimes faced with the task of listing or enumerating households within specific enumeration areas (EAs). This means that we need to ensure that the observations (points) we collect are all located within a predefined area (polygon). So far we have tackled this by combining maps on the enumerators’ devices with automated Slack messages sent to the fieldwork team leaders whenever an enumerator captures a point outside the targeted EA. While this usually works very well, we were looking for a solution that would warn enumerators immediately, within SurveyCTO Collect (or ODK), without having to rely on a working network connection or third-party apps. This is what we came up with.

Requirements

We use SurveyCTO for mobile data collection, but the approach described here should work with any ODK-based system using XLSForm.

You will need a CSV file defining your polygons (e.g. EAs) as a list of nodes with GPS coordinates in long format, i.e. one row per node (here is an example of what it should look like). It needs the following columns:

  • Polygon ID: Serial number (1, 2, 3, …) identifying the polygons
  • Node ID (or “corners”): Serial number (1, 2, 3, …) identifying the nodes tracing the shape of the polygon, in order
  • ID Key: Combination of Polygon ID and Node ID as a unique ID to identify each node (e.g. 3_1 for the first node of polygon 3)
  • Latitude/Longitude of each node in separate columns
  • Number of Nodes (or “corners”): Number of nodes/corners tracing the shape of the polygon (constant within each polygon)
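If your polygons are already available as lists of coordinates, the long format above can also be produced with a few lines of Python. This is only a minimal sketch: the column names follow the Stata script in this post, while the function names `polygon_rows` and `to_csv` and the example square are hypothetical.

```python
import csv
import io


def polygon_rows(polygons):
    """Yield one dict per node in the long format described above.

    `polygons` maps a polygon ID to an ordered list of (latitude, longitude) nodes.
    """
    for id_polygon, nodes in polygons.items():
        # Drop a closing node that merely repeats the first one.
        if len(nodes) > 1 and nodes[0] == nodes[-1]:
            nodes = nodes[:-1]
        for id_corner, (lat, lon) in enumerate(nodes, start=1):
            yield {
                "id_polygon": id_polygon,
                "id_corner": id_corner,
                "id_key": f"{id_polygon}_{id_corner}",
                "p_latitude": lat,
                "p_longitude": lon,
                "n_corner": len(nodes),  # constant within a polygon
            }


def to_csv(polygons):
    """Return the long-format table as CSV text."""
    fields = ["id_polygon", "id_corner", "id_key",
              "p_latitude", "p_longitude", "n_corner"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(polygon_rows(polygons))
    return buffer.getvalue()


# Hypothetical example: one square EA whose last node repeats the first.
example = {
    1: [(-33.90, 18.40), (-33.90, 18.50), (-34.00, 18.50),
        (-34.00, 18.40), (-33.90, 18.40)],
}
print(to_csv(example))
```

The repeated closing node is dropped so that the number of nodes matches the number of distinct corners, which is what the repeat group in the form iterates over.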

If you struggle to get your polygons into the required format you can use the shp2dta command in Stata to convert your shapefile into Stata format and some variant of the following Stata script to get the resulting coordinates.dta file into the correct format (shp2dta creates empty and repeated rows for nodes which we first need to get rid of):

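* Install shp2dta from SSC; "cap" suppresses any error message
cap ssc install shp2dta
* Convert the shapefile into two Stata datasets: attributes and node coordinates
shp2dta using polygons.shp, database(database) coordinates(coordinates) replace
use coordinates, clear
rename _ID id_polygon
* Drop the empty row that shp2dta writes for each polygon
drop if _X == .
* Keep the original node order within each polygon
gen n = _n
sort id_polygon n
by id_polygon: gen id_corner = _n
* The last node repeats the first, so it is not counted and is dropped
by id_polygon: gen n_corner = _N - 1
drop if id_corner == n_corner + 1
* Unique key per node, e.g. "3_1" for the first node of polygon 3
gen id_key = string(id_polygon) + "_" + string(id_corner)
rename _Y p_latitude
rename _X p_longitude
export delimited id_key n_corner p_latitude p_longitude using preload_polygons.csv, replace
exit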

Once you have your CSV file, you can upload it to SurveyCTO as a preloaded server database and attach it to the form you want to use for geofencing.

Form Definition

In the form you will need the following (here is an example of a form which allows you to enter the coordinates of the point manually or capture a geopoint):

Parameters

  • A field asking for the ID of the polygon (ID Polygon)
  • Either a geopoint field or two decimal fields to collect the GPS location of the point (if you use geopoint, you will need to parse it first — see below)
  • A calculate field to define the ID Key of the first node of the selected polygon (only used for the next calculate field)
  • A calculate field to pull the number of nodes for the selected polygon from the preload, to be used as the repeat-count in the repeat group

Calculating Point-in-Poly

A repeat group going through each node of the polygon, with the following fields:

  • A calculate field to define the unique ID Key for the node (required for pulling the other fields from the preload)
  • Calculate fields to pull the lat/long coordinates of the node from the preload
  • A calculate field to define the unique ID Key for the previous node (required for pulling the other fields from the preload)
  • Calculate fields to pull the lat/long coordinates of the previous node from the preload (index()-1, except for the first node where it skips back to the last node)
  • A calculate field where all the magic happens. It is based on the algorithm found here (best to visit the site to understand how this works) and acts as a dummy indicator (0 or 1) for each node.
  • After the repeat group, a dummy calculate field determining whether the point is inside or outside the polygon, based on the sum of the dummy indicators from the repeat group (odd sum: inside; even sum: outside)
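The logic of the repeat group can be sketched outside the form in Python. The linked algorithm is not reproduced in this post, so the sketch below assumes the standard even-odd ray-crossing test, which matches the odd/even rule described above. The names `crossing`, `point_in_polygon` and `preload` are hypothetical, and the `preload` dictionary stands in for the CSV lookups by ID Key.

```python
def crossing(lat, lon, node, prev):
    """Return 1 if a ray from the point towards increasing longitude
    crosses the edge between the previous node and this node, else 0."""
    node_lat, node_lon = node
    prev_lat, prev_lon = prev
    # No crossing unless the two nodes lie on opposite sides of the point's latitude.
    if (node_lat > lat) == (prev_lat > lat):
        return 0
    # Longitude at which the edge passes through the point's latitude.
    edge_lon = node_lon + (lat - node_lat) * (prev_lon - node_lon) / (prev_lat - node_lat)
    return 1 if lon < edge_lon else 0


def point_in_polygon(lat, lon, id_polygon, preload):
    """Mirror the form: loop over the nodes, sum the indicators, test for an odd sum."""
    # The number of nodes is pulled via the ID Key of the first node.
    n_corner = preload[f"{id_polygon}_1"]["n_corner"]
    total = 0
    for index in range(1, n_corner + 1):
        # Previous node is index - 1, except the first node wraps to the last.
        prev_index = index - 1 if index > 1 else n_corner
        node = preload[f"{id_polygon}_{index}"]
        prev = preload[f"{id_polygon}_{prev_index}"]
        total += crossing(
            lat, lon,
            (node["p_latitude"], node["p_longitude"]),
            (prev["p_latitude"], prev["p_longitude"]),
        )
    return total % 2 == 1  # odd sum: inside; even sum: outside


# Hypothetical preload for one square EA (polygon 1 with four nodes).
preload = {
    "1_1": {"n_corner": 4, "p_latitude": -33.90, "p_longitude": 18.40},
    "1_2": {"n_corner": 4, "p_latitude": -33.90, "p_longitude": 18.50},
    "1_3": {"n_corner": 4, "p_latitude": -34.00, "p_longitude": 18.50},
    "1_4": {"n_corner": 4, "p_latitude": -34.00, "p_longitude": 18.40},
}
print(point_in_polygon(-33.95, 18.45, 1, preload))  # point in the middle of the square
print(point_in_polygon(-33.95, 18.60, 1, preload))  # point east of the square
```

Checking a handful of known inside and outside points this way before fieldwork is a cheap test that the node order and ID Keys in the CSV are what the form expects.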

Parse Geopoint

  • One calculate field to get the short version of the geopoint
  • One calculate field to get string-length of the short geopoint, to be used as the repeat-count in the following repeat group
  • A repeat group going through each character in the short geopoint string. It contains a single dummy calculate field, which takes the position index() of the repeat group if that position marks the space (“ “) between the latitude and longitude in the GPS field
  • A calculate field using max() to collect the position of the space in the GPS string from the repeat group
  • Two calculate fields for parsing the GPS field for latitude and longitude using the position of the blank space in the GPS field
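The parsing steps above can be illustrated in Python. This is a minimal sketch that assumes the short geopoint is a string of the form "latitude longitude" with a single space in between; the function name `parse_short_geopoint` is hypothetical.

```python
def parse_short_geopoint(short_geopoint):
    """Split a 'latitude longitude' string the same way the form does."""
    # One value per character, like the repeat group: the 1-based position
    # if the character is a space, otherwise 0.
    positions = [
        index if char == " " else 0
        for index, char in enumerate(short_geopoint, start=1)
    ]
    # max() collects the position of the space.
    space_position = max(positions, default=0)
    if space_position == 0:
        raise ValueError("no space found between latitude and longitude")
    # Everything before the space is the latitude, everything after it the longitude.
    latitude = float(short_geopoint[:space_position - 1])
    longitude = float(short_geopoint[space_position:])
    return latitude, longitude


print(parse_short_geopoint("-33.95 18.45"))
```

In Python a single `split()` would do the same job; the character-by-character loop is kept here only to mirror the repeat group used in the form.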

Restrictions

According to the authors of the algorithm, it does not work with a polygon crossing the International Date Line, so watch out if you use ODK to capture locations of shipwrecks in the Pacific.

Also, the polygons must not be too large. But unless your enumeration areas are larger than a US state, you will be fine.


Jan Schenk

Founder of ikapadata and builder of bridges between technology and humanities.