Perhaps the most ubiquitous type of geographic information is the continuous stream of data produced by GPS devices. Global Positioning System (GPS) devices are now in everything from watches to cars and, of course, smartphones. This means that GPS datasets have the ability to track a large proportion of the world’s population. Although there are privacy concerns, when appropriately anonymized and aggregated, GPS datasets have the potential to help tackle the issues raised in the Transportation chapter of the Geocomputation with R book: to design healthy transport systems in which walking and cycling overtake cars as the main travel modes.

Loading GPS data

The standard format of GPS data is the .gpx file. GPS traces are vector datasets that are well-support by sf (see Chapter 2 of the book), so we’ll use this package to process them:

As with any dataset the first stage is to identify the source. A great source of GPS data is OpenStreetMap (OSM). We’ll use publicly available .gpx files uploaded to OSM as the basis of this tutorial.1 For the purpose of this tutorial we will use a .gpx file uploaded to OSM, that represents travel to the Institute for Geoinformatics, Universität Münster:

The previous code chunk downloads the trace from OSM and queries the ‘layers’ that can be imported (note there are 5, but only ‘tracks’ and ‘track_points’ are available). The following code chunk imports and plots the ‘tracks’ layer, which is a MULTILINESTRING geometry in sf, and only 1 row of data:

Other than the geometry, this object has limited information about the route in the fields. The richest field is ‘Description’, which contains the following summary data:


[1] “factor”

[1] "

Start time: 01/07/2019 10:03
Finish time: 01/07/2019 10:34
Distance: 5.1 km (00:30)
Moving time: 00:21
Average speed: 10.08 km/h
Avg. Speed Mov.: 14.07 km/h
Max. speed: 35.06 km/h
Minimum altitude: 32 m
Maximum altitude: 68 m
Ascent speed: 469.6 m/h
Descent speed: -889.1 m/h
Elevation gain: 139 m
Elevation loss: -113 m
Ascent time: 00:17
Descent time: 00:07


The data can also be imported as points, as follows:

Time series

When did the track take place?

It started from just after 9:00 in the morning and ended just after 09:30. Let’s check if they happened at regular intervals:

plot(p$time, 1:nrow(p))

And the temporal resolution:

difftime(p$time[11], p$time[10])
#> Time difference of 4 secs

Converting points to lines

The plot of points is useful, but it would be better to have a line representation that contains the rich point-level data. This can be done by converting the points to lines, as follows for the the first 2 points:

c = st_coordinates(p)
l1 = sf::st_linestring(c[1:2, ])
lsf = st_sf(st_set_geometry(p[2, ], NULL), st_sfc(l1))
plot(p$geometry[1:2], add = T)

For all points we can do:

points2line_trajectory = function(p) {
  c = st_coordinates(p)
  i = seq(nrow(p) - 2)
  l = purrr::map(i, ~ sf::st_linestring(c[.x:(.x + 1), ]))
  s = purrr::map_dbl(i, function(x) {
    geosphere::distHaversine(c[x, ], c[(x + 1), ]) /
      as.numeric(p$time[x + 1] - p$time[x])
  lfc = sf::st_sfc(l)
  a = seq(length(lfc)) + 1 # sequence to subset
  p_data = cbind(sf::st_set_geometry(p[a, ], NULL), s)
  sf::st_sf(p_data, geometry = lfc)

Let’s test it out:

We now have lots more variables to play with and, unlike with the point dataset, the look good (and like traces) when plotted. Note that in the previous function definition we calculated speed. Plot the resulting ‘speed profile trace’ as follows:

plot(l["s"], lwd = l$s)

  1. For saving your own GPS data, we recommend using an open source Android apps such as Orux, owntracks or osmand. These can then be transferred onto your computer and, if you want to support the community, uploaded to OSM.