Installing spatial R packages on Ubuntu


This post explains how to quickly get key R packages for geographic research installed on Ubuntu, a popular Linux distribution.

A recent thread on the r-spatial GitHub organization alludes to many considerations when choosing a Linux set-up for work with geographic data, ranging from the choice of Linux distribution (distro) to the use of binary vs compiled versions of packages (binaries are faster to install). This post touches on some of these things. Its main purpose, though, is to provide advice on getting R’s key spatial packages up and running on a future-proof Linux operating system (Ubuntu).

Now is a good time to be thinking about your R set-up because changes are in the pipeline and getting set-up (or preparing to get set-up) now could save hours in the future. These imminent changes include:

To keep up with these changes, this post will be updated in late April, when some of the dust has settled. However, the advice presented here should be future-proof. Upgrading Ubuntu is covered in the next section.

There are many ways of getting Ubuntu set up for spatial R packages. A benefit of Linux operating systems is that they offer choice and prevent ‘lock-in’. However, the guidance in the next section should reduce set-up time and improve maintainability (with updates managed by Ubuntu) compared with other ways of doing things, especially for beginners. If you’re planning to switch to Linux as the basis of your geographic work, this advice may be particularly useful. (The post was written in response to colleagues asking me how to set up R on their new Ubuntu computers. If you would like a computer running Ubuntu, check out companies that support open source operating systems and guides on installing Ubuntu on an existing machine.)

By ‘key packages’ I mean the following, which enable the majority of day-to-day geographic data processing and visualization tasks:

The focus is on Ubuntu because that’s what I’ve got most experience with and it is well supported by the community. Links for installing geographic R packages on other distros are provided in section 3.

1. Installing spatial R packages on Ubuntu

R’s key spatial packages can be installed quickly on the latest version of this popular operating system: once the appropriate repository has been set up, binary versions are available through apt, meaning faster install times (only a few minutes, including the installation of upstream dependencies). The following bash commands should install key geographic R packages on Ubuntu 19.10:

# add a repository that ships the latest version of R:
sudo add-apt-repository ppa:marutter/rrutter3.5
# update the repositories so the software can be found:
sudo apt update
# install system dependencies:
sudo apt install libudunits2-dev libgdal-dev libgeos-dev libproj-dev libfontconfig1-dev
# binary versions of key R packages:
sudo apt install r-base-dev r-cran-sf r-cran-raster r-cran-rjava
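
Before or after running the commands above, it can be useful to see which of the system dependencies are still missing. The following is a minimal sketch, assuming a Debian-based system where dpkg is available; the helper names is_installed and missing_deps are made up for illustration and are not part of apt or dpkg.

```shell
#!/bin/sh
# Hypothetical helpers (not part of apt or dpkg): report which of the
# system dependencies named in this post are not installed yet.

# Succeeds if dpkg reports the package as fully installed.
is_installed() {
  dpkg -s "$1" 2>/dev/null | grep -q '^Status: install ok installed'
}

# Print, one per line, each package given as an argument that is missing.
missing_deps() {
  for pkg in "$@"; do
    is_installed "$pkg" || echo "$pkg"
  done
}

missing_deps libudunits2-dev libgdal-dev libgeos-dev libproj-dev libfontconfig1-dev
```

If nothing is printed, all five libraries are already in place; anything that is printed can be passed to sudo apt install.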

To test that your installation of R has worked, try running R in an IDE such as RStudio, or in the terminal by entering R. You should be able to run the following commands without problems:

library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
install.packages("tmap")
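
The same check can be run from the shell without opening an R session, which is handy in set-up scripts. This is a sketch only: r_pkg_ok is a hypothetical helper name, and it relies on Rscript exiting with a non-zero status when library() fails.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if Rscript exists and the named
# R package can be loaded.
r_pkg_ok() {
  command -v Rscript >/dev/null 2>&1 || return 2
  Rscript -e "suppressPackageStartupMessages(library($1))" >/dev/null 2>&1
}

# Check the packages installed with apt in the commands above.
for pkg in sf raster; do
  if r_pkg_ok "$pkg"; then
    echo "$pkg: OK"
  else
    echo "$pkg: not available"
  fi
done
```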

If you are using an older version of Ubuntu and don’t want to upgrade to 19.10 (which can itself be upgraded to 20.04 from the end of April 2020), see the instructions at github.com/r-spatial/sf and the detailed instructions on the blog rtask.thinkr.fr, which contains this additional shell command:

# for Ubuntu 18.04
sudo add-apt-repository ppa:marutter/c2d4u3.5

That adds a repository that ships hundreds of binary versions of R packages, meaning faster install times for packages (see the Binary package section of the open source book R Packages for more on binary packages). An updated repository, called c2d4u4.0 or similar, will be available for Ubuntu 20.04 in late April.
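
The choice of repository by Ubuntu release can be sketched as a small shell function. This is an illustration under stated assumptions: ppas_for_release is a hypothetical name, only the two releases discussed in this post are covered, and it assumes (as ‘additional’ suggests) that 18.04 needs both repositories.

```shell
#!/bin/sh
# Hypothetical helper: map an Ubuntu release to the repositories
# mentioned in this post. Other releases are deliberately not guessed.
ppas_for_release() {
  case "$1" in
    19.10) echo "ppa:marutter/rrutter3.5" ;;
    18.04) echo "ppa:marutter/rrutter3.5"
           echo "ppa:marutter/c2d4u3.5" ;;
    *)     echo "no repository listed in this post for Ubuntu $1" >&2
           return 1 ;;
  esac
}

# VERSION_ID in /etc/os-release holds the release number, e.g. 19.10.
if [ -r /etc/os-release ]; then
  release=$(. /etc/os-release && echo "$VERSION_ID")
  ppas_for_release "$release" || true
fi
```

Each line printed could then be passed to sudo add-apt-repository, followed by sudo apt update.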

If you have issues with the instructions in this post, you can find a wealth of answers on sites such as StackOverflow, the sf issue tracker, and the r-sig-geo and Debian special interest group (SIG) email lists (the latter of which provided input into this blog post, thanks to Dirk Eddelbuettel and Michael Rutter).

2. Updating R packages and upstream dependencies

Linux operating systems allow you to customize your set-up in myriad ways. This can be enlightening but it can also be wasteful, so it’s worth considering the stability/cutting-edge continuum before diving into a particular set-up and potentially wasting time (if the previous section hasn’t already made up your mind).

A reliable way to keep close (but not too close) to the cutting edge on the R side on any operating system is simply to keep your packages up-to-date. Running the following command (or using the Tools menu in RStudio) every week or so will ensure you have up-to-date package versions:

update.packages()

Keeping system dependencies (software that R relies on but that is not maintained by R developers) up-to-date is also important but can be tricky, especially for large and complex libraries like GDAL. On Ubuntu, dependencies are managed by apt, and the following commands will update the ‘OSGeo stack’, composed of PROJ, GEOS and GDAL, if changes are detected in the default repositories (from 18.10 onwards):

sudo apt update # see if things have changed
sudo apt upgrade # install changes
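
To see whether an upgrade actually changed the OSGeo stack, the installed versions can be printed before and after. The sketch below assumes the development packages from section 1 are installed, since gdal-config and geos-config ship with them, and that PROJ can be queried through pkg-config; report_version is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: run a version command if it exists and label
# the result; otherwise say which command was not found.
report_version() {
  name="$1"; shift
  if command -v "$1" >/dev/null 2>&1; then
    echo "$name: $("$@" 2>/dev/null)"
  else
    echo "$name: $1 not found"
  fi
}

report_version GDAL gdal-config --version
report_version GEOS geos-config --version
report_version PROJ pkg-config --modversion proj
```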

The following commands will upgrade to a newer version of Ubuntu. If you are currently running Ubuntu 18.04, and high stability and low set-up times are priorities, it may be worth waiting until the first point release of Ubuntu 20.04 (20.04.1) is released in summer before upgrading (also see instructions here):

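sudo apt update && sudo apt dist-upgrade # bring the current release fully up to date
sudo do-release-upgrade # then move to the next Ubuntu release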

To get more up-to-date upstream geographic libraries than provided in the default Ubuntu repositories, you can add the ubuntugis repository as follows. This is a pre-requisite on Ubuntu 18.04 and earlier but also works with later versions (warning, adding this repository could cause complications if you already have software such as QGIS that uses a particular version of GDAL installed):

sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
sudo apt update
sudo apt upgrade

That will give you more up-to-date versions of GDAL, GEOS and PROJ which may offer some performance improvements. Note: if you do update dependencies such as GDAL you will need to re-install the relevant packages, e.g. with install.packages("sf"). You can revert that change with the following little-known command:

sudo add-apt-repository --remove ppa:ubuntugis/ubuntugis-unstable
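
To confirm whether the repository is currently active, before adding it or after removing it, the apt sources can be inspected. This is a minimal sketch assuming the classic one-line .list format under /etc/apt/sources.list.d; ppa_enabled is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an active (uncommented) "deb" line
# mentioning the given PPA exists in a sources directory.
# $1: PPA owner/name, $2: directory (defaults to /etc/apt/sources.list.d)
ppa_enabled() {
  dir="${2:-/etc/apt/sources.list.d}"
  cat "$dir"/*.list 2>/dev/null | grep '^deb ' | grep -qF "$1"
}

if ppa_enabled ubuntugis/ubuntugis-unstable; then
  echo "ubuntugis-unstable is enabled"
else
  echo "ubuntugis-unstable is not enabled"
fi
```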

If you also want the development versions of key R packages, e.g. to test new features and support development efforts, you can install them from GitHub, e.g. as follows:

remotes::install_github("r-spatial/sf")
remotes::install_github("rspatial/raster")
remotes::install_github("mtennekes/tmaptools") # required for dev version of tmap
remotes::install_github("mtennekes/tmap")

3. Installing geographic R packages on other Linux operating systems

If you are in the fortunate position of switching to Linux and being able to choose the distribution that best fits your needs, it’s worth thinking about which distribution will be user-friendly (more on that soon), performant and future-proof. Ubuntu is a solid choice, with a large user community and repositories such as ‘ubuntugis’ providing more up-to-date versions of upstream geographic libraries such as GDAL.

QGIS is also well-supported on Ubuntu.

However, you can install R and key geographic packages on other operating systems, although it may take longer. Useful links on installing R and geographic libraries are provided below for reference:

  • Installing R on Debian is covered on the CRAN website. Upstream dependencies such as GDAL can be installed on recent versions of Debian, such as buster, with commands such as apt install libgdal-dev, as per the instructions for rocker/geospatial.

  • Installing R on Fedora/Red Hat is straightforward, as outlined on CRAN. GDAL and other spatial libraries can be installed from Fedora’s dnf package manager, e.g. as documented here for sf.

  • Arch Linux has a growing R community. Information on installing and setting-up R can be found on the ArchLinux wiki. Installing upstream dependencies such as GDAL on Arch is also relatively straightforward. There is also a detailed guide for installing R plus geographic packages by Patrick Schratz.

4. Geographic R packages on Docker

The Ubuntu installation instructions outlined above provide an easy and future-proof set-up. But if you want an even easier way to get the power of key geographic packages running on Linux, and have plenty of RAM and HD space, running R on the ‘Docker Engine’ may be an attractive option.

Advantages of using Docker include reproducibility (code will always run the same on any given image, and images can be saved), portability (Docker can run on Linux, Windows and Mac) and scalability (Docker provides a platform for scaling-up computations across multiple nodes).

For an introduction to using R/RStudio in Docker, see the Rocker project.

Using that approach, I recommend the following Docker images for using R as a basis for geographic research:

  • rocker/geospatial which contains key geographic packages, including those listed above
  • robinlovelace/geocompr which contains all the packages needed to reproduce the contents of the book, and which you can run with the following command in a shell in which Docker is installed:
docker run -e PASSWORD=yourpassword --rm -p 8787:8787 robinlovelace/geocompr
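
In that command, -p 8787:8787 publishes port 8787 of the container on port 8787 of the host, in host:container order. If that port is already in use on your machine, only the first number needs to change. The sketch below prints (rather than runs) the command for a chosen password and host port; geocompr_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the docker command from this post
# with a chosen password and host port; the container port stays 8787.
geocompr_cmd() {
  password="$1"
  host_port="${2:-8787}"
  echo "docker run -e PASSWORD=$password --rm -p $host_port:8787 robinlovelace/geocompr"
}

geocompr_cmd yourpassword 8788
```

Assuming the image serves RStudio on the container port, as Rocker-based images usually do, it would then be reachable in a browser on the chosen host port.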

To test the Ubuntu 19.10 set-up recommended above, I created a Dockerfile and an associated image on Docker Hub, which you can try out as follows:

docker run -it robinlovelace/geocompr:ubuntu-eoan
R
library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(raster)
library(tmap) 

The previous commands should take you to a terminal inside the Docker container, where you can try out the Linux command line and R. If you want to use more cutting-edge versions of the geographic libraries, you can use the ubuntu-bionic image (note the more recent version numbers, with PROJ 7.0.0 for example):

sudo docker run -it robinlovelace/geocompr:ubuntu-bionic
R
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
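
When comparing images, it can help to reduce the message printed by sf to just the three version numbers. The following sketch does that with sed; sf_versions is a hypothetical helper name, and the two input lines are the messages shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: pull the three library versions out of the
# "Linking to ..." line that sf prints when it is loaded.
sf_versions() {
  sed -n 's/.*GEOS \([0-9.]*\), GDAL \([0-9.]*\), PROJ \([0-9.]*\).*/GEOS=\1 GDAL=\2 PROJ=\3/p'
}

echo "Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0" | sf_versions
echo "Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0" | sf_versions
```

In practice the input could come from Rscript -e 'library(sf)' 2>&1, on the assumption that the start-up message is written to standard error.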

These images do not currently contain all the dependencies needed to reproduce the code in Geocomputation with R.

However, as documented in issue 476 in the geocompr GitHub repo, there is a plan to provide Docker images with this full ‘R-spatial’ stack installed, building on strong foundations such as rocker/geospatial and the ubuntugis repositories, to support different versions of GDAL and other dependencies. We welcome any comments or tech support to help make this happen. Suggested changes to this post are also welcome, see the source code here.

5. Fin

R is an open-source language heavily inspired by Unix/Linux, so it should come as no surprise that it runs well on a variety of Linux distributions, Ubuntu (covered in this post) in particular. The guidance in this post should get geographic R packages set up quickly in a future-proof way. A sensible next step is to sharpen your system administration (sysadmin) and shell coding skills, e.g. with reference to Ubuntu wiki pages and Chapter 2 of the open source book Data Science at the Command Line.

This will take time but, building on OSGeo libraries, a well set-up Linux machine is an ideal platform to install, run and develop key geographic R packages in a performant, stable and future-proof way.

Be the FOSS4G change you want to see in the world!
