TWO DOGS IN TOILET ELDERLY LADY INVOLVED

dataviz
geospatial
leaflet
r
rvest
sf
Author
Published

April 27, 2018

A pigeon that's stuck inside a chimney breast.

A pigeon does not belong inside a chimney.

tl;dr

Animals get stuck in weird places, just ask the London Fire Brigade. I used the {sf} package to handling some animal rescue spatial data prior to interactive mapping with {leaflet}.

The problem

Sometimes I need to convert coordinate data to latitude and longitude for interactive mapping using the {leaflet} package in R.

I’m going to demo the {sf} package, show how to reproject coordinates, work with points and polygon data, make an interactive map with {leaflet} and do a bonus bit of webscraping.

Special features

The {sf} (‘simple features’) package from Edzer Pebesma has a series of classes and methods for spatial data. The package is becoming popular because simple feature access is a widely-used multi-platform ISO standard and the package supports the popular tidy data paradigm and can be integrated with tidyverse operations. This and more was spelled out by Pebesma in a post from UseR! 2017.

You can read more about anlyses with {sf} in Geocomputation with R by Robin Lovelace, Jakub Nowosad and Jannes Muenchow.

Data: animal rescues

I’ve found an interesting example dataset that contains eastings and northings: animal rescue incidents attended by the London Fire Brigade from the excellent London Data Store. In their words:

The London Fire Brigade attends a range of non-fire incidents (which we call ‘special services’). These ‘special services’ include assistance to animals that may be trapped or in distress.

This is close to my heart because a pigeon fell down our chimney recently. After a brave rescue mission (putting on some rubber gloves and contorting my arms through a tiny vent hole) I think I’m now qualified to join an elite search and rescue team.

Clean

We can read the data directly from the datastore as a CSV file.

suppressPackageStartupMessages(library(tidyverse))
library(janitor, warn.conflicts = FALSE)

csv_path <- "https://data.london.gov.uk/download/animal-rescue-incidents-attended-by-lfb/01007433-55c2-4b8a-b799-626d9e3bc284/Animal%20Rescue%20incidents%20attended%20by%20LFB%20from%20Jan%202009.csv"

rescue <- read_csv(csv_path, show_col_types = FALSE)

# A sample of the rows and columns
rescue %>%
  select(DateTimeOfCall, `IncidentNotionalCost(£)`, AnimalGroupParent, Borough) %>%
  slice_sample(n = 5)
# A tibble: 5 × 4
  DateTimeOfCall      `IncidentNotionalCost(£)` AnimalGroupParent     Borough   
  <dttm>              <chr>                     <chr>                 <chr>     
1 2020-06-14 12:05:00 346                       Bird                  WESTMINST…
2 2022-12-31 18:53:00 728                       Cat                   BRENT     
3 2017-06-16 21:19:00 328                       Cat                   SOUTHWARK 
4 2022-07-17 12:41:00 728                       Unknown - Wild Animal HARROW    
5 2018-04-08 10:36:00 333                       Cat                   WALTHAM F…

So each row is an ‘incident’ to which the brigade were called and there’s 9728 of them recorded in the dataset. There are 31 columns for variables relating to incident, including when it was, what it was and where it was.

Note

This post was first written in April 2018, but was re-rendered with the latest rescue data as of August 2023.

Explore

This post isn’t about the dataset, but it’s worth having a closer look at it. I’ve added an interactive table below so you can search for incidents, creatures, locations and the notional cost of the callout.

Obviously there are plenty of cats up trees, but there’s so much more. Some of the more unique entries are:

DOG WITH JAW TRAPPED IN MAGAZINE RACK

SWAN IN DISTRESS

FERRET STUCK IN LIFT SHAFT

And of course:

TWO DOGS IN TOILET ELDERLY LADY INVOLVED

Data: points

Each incident in our data is a point in space with an X and Y coordinate. It’s currently eastings and northings, but we want to transform it to latitude and longitude.

The sf class and reprojection

A Coordinate Reference System (CRS) is required to project geographic entities – points, polygons, etc – onto a surface. There are many systems for doing this, from local to global, each with its own CRS code. This isn’t a post about geography and projections, but you can read more in the CRS chapter of rspatial.org.

First we convert the our dataframe-class object an ‘sf’ class object using the st_as_sf function, which readies it for spatial analysis. We simply provide arguments that point to the columns containing the coordinates and the CRS code for that projection system.

We can then use the st_transform() function to convert our projection system from eastings and northings to to latitude and longitude.

library(sf, quietly = TRUE)
Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
rescue_sf <- rescue %>% 
  st_as_sf(
    coords = c("Easting_rounded", "Northing_rounded"),
    crs = 27700  # coordinate reference system code for eastings/northings
  ) %>% 
  st_transform(crs = 4326)  # the coord ref system code for latlong

So what happened more specifically? Our eastings and northings were combined with st_as_sf() into a two element list-column called geometry with data type sfc_POINT. Our new sf-class object also contains some metadata detailing the geometry type – POINTS in our case – and the projection system of the coordinate data, which we converted to latlong with the st_transform() function.

Tidyverse manipulation

As mentioned, one advantage of using {sf} is that sf-class objects use the tidy data paradigm that allows for use of the tidyverse. Some users may prefer this relative to the ‘Spatial Points Data Frame’ (SPDF) class that is produced by the {sp} package for points data. SPDFs are an S4 class, which means they have ‘slots’ of data, coordinates, etc.

In the code chunk above I used pipes to pass the rescue dataframe to st_as_sf() and then to st_transform(). You can also use {dplyr} functions like filter() on your sf-class object.

filtered_rescue_sf <- rescue_sf %>%
  filter(
    AnimalGroupParent %in% c("Dog", "Cat", "Bird"),  # just these animals
    DateTimeOfCall > ymd("2017-01-01") &
      DateTimeOfCall < ymd("2017-12-31")  # 2017 only
  )

We can also use the st_coordinates() function to extract the XY (latitude and longitude) coordinates from the column containing our {sf} point geometry. This means we can have separate columns for the latitude and longitude, as well as our list-column.

rescue_coords <- as.data.frame(st_coordinates(filtered_rescue_sf))
filtered_rescue_sf <- bind_cols(filtered_rescue_sf, rescue_coords)

Data: polygons

Read and convert GeoJSON

While we’re messing around with the {sf} package we can investigate polygons by adding in the borders for each of the London boroughs from a GeoJSON file.

First, read Local Authority District (LADs) data directly from the Open Geography Portal and then use the st_as_sf() function to make the conversion from the sp class to the sf class.

suppressPackageStartupMessages(library(geojsonio))

# Download boundary data to temporary location
geojson_path <- "https://services1.arcgis.com/ESMARspQHYMw9BZ9/arcgis/rest/services/Local_Authority_Districts_May_2023_UK_BUC_V2/FeatureServer/0/query?outFields=*&where=1%3D1&f=geojson"
temp_geojson <- tempfile(fileext = ".geojson")
download.file(geojson_path, temp_geojson, quiet = TRUE)

# Read the geojson and convert to sf class
lad_sf <- temp_geojson %>%
  geojson_read(what = "sp") %>% 
  st_as_sf()

unlink(temp_geojson)  # discard temporary file

This means our polygon dataset is a tidy dataframe with the polygon information stored as MULTIPOLYGON type in a list-column with as many elements as required to draw each polygon (e.g. a square requires four sets of XY points that can be joined together). The outcome is very similar to what we had for our points data.

Scrape London boroughs

But which of these LADs are London boroughs? We can extract a vector of the boroughs from Wikipedia using the {rvest} package and use it to filter our data. The CSS selector used in the html_nodes() function below can be extracted using the very handy SelectorGadget tool.

library(rvest, warn.conflicts = FALSE)
library(xml2)

wiki_page <- "https://en.wikipedia.org/wiki/List_of_London_boroughs"

boroughs <- wiki_page %>%
  read_html() %>% 
  html_nodes(css = "td:nth-child(1) > a") %>%
  html_text() %>% 
  tolower()

boroughs_sf <- lad_sf %>% 
  mutate(LAD23NM = tolower(LAD23NM)) %>% 
  filter(LAD23NM %in% boroughs)

head(boroughs_sf$LAD23NM)
[1] "city of london"       "barking and dagenham" "barnet"              
[4] "bexley"               "brent"                "bromley"             

Map it

So now we can plot the data. The addPolygons() function accepts each MULTIPOLYGON in the list-column of our London boroughs dataset. The addMarkers() function accepts the POINTS-type list-column to plot each point at the latitude and longitude specified.

This is a very simple map for now. You can:

  • zoom with the + and - buttons or scroll with your mouse wheel
  • click and drag to move the map
  • click the points to get a pop-up showing more information
  • hover over a borough to highlight it and show a label
library(leaflet)  # for interactive mapping

# put the map together
leaflet(boroughs_sf) %>% 
  addProviderTiles(providers$Wikimedia) %>%
  addPolygons(  # add London borough boundaries
    label = ~tools::toTitleCase(LAD23NM),  # label on hover
    color = "black",  # boundary colour
    weight = 2,  # boundary thickness
    opacity = 1,  # fully opaque lines
    fillOpacity = 0.2,  # mostly transparent
    highlightOptions = highlightOptions(
      color = "white",  # turn boundary white on hover
      weight = 2,  # same as polygon boundary
      bringToFront = TRUE  # overlay the highglight
    )
  ) %>% 
  addAwesomeMarkers( # add incident points
    lng = filtered_rescue_sf$X, lat = filtered_rescue_sf$Y,
    icon = awesomeIcons(
      library = "ion",  # from this icon library
      icon = "ion-android-alert",  # use this icon
      iconColor = "white",  # colour it white
      # colour by animal
      markerColor = case_when(  # different colours for each
        filtered_rescue_sf$AnimalGroupParent == "Dog" ~ "red",
        filtered_rescue_sf$AnimalGroupParent == "Cat" ~ "blue",
        filtered_rescue_sf$AnimalGroupParent == "Bird" ~ "black"
      )
    ),
    popup = ~paste0(  # display this information on click
      "<b>Animal</b>: ", filtered_rescue_sf$AnimalGroupParent, "<br>",
      "<b>Incident</b>: ", filtered_rescue_sf$FinalDescription, "<br>",
      "<b>Date</b>: ", filtered_rescue_sf$DateTimeOfCall, "<br>",
      "<b>Borough</b>: ", filtered_rescue_sf$Borough, "<br>",
      "<b>Notional cost</b>: £", filtered_rescue_sf$`IncidentNotionalCost(£)`
    )
  )


Okay cool, we get a simple map of London with borough boundaries and markers showing incidents in 2017, with a different colour for each of three selected animal groups (red = dog, blue = cat, black = bird).

My main advice? Keep an eye on your pets. And consider covering your chimney.

Environment

Session info
Last rendered: 2023-08-09 23:31:02 BST
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] leaflet_2.1.2    xml2_1.3.5       rvest_1.0.3      geojsonio_0.11.1
 [5] sf_1.0-14        DT_0.28          janitor_2.2.0    lubridate_1.9.2 
 [9] forcats_1.0.0    stringr_1.5.0    dplyr_1.1.2      purrr_1.0.1     
[13] readr_2.1.4      tidyr_1.3.0      tibble_3.2.1     ggplot2_3.4.2   
[17] tidyverse_2.0.0 

loaded via a namespace (and not attached):
 [1] gtable_0.3.3            xfun_0.39               bslib_0.5.0            
 [4] htmlwidgets_1.6.2       lattice_0.21-8          tzdb_0.4.0             
 [7] leaflet.providers_1.9.0 vctrs_0.6.3             tools_4.3.1            
[10] crosstalk_1.2.0         generics_0.1.3          curl_5.0.1             
[13] parallel_4.3.1          proxy_0.4-27            fansi_1.0.4            
[16] pkgconfig_2.0.3         KernSmooth_2.23-22      lifecycle_1.0.3        
[19] compiler_4.3.1          munsell_0.5.0           fontawesome_0.5.1      
[22] snakecase_0.11.0        jqr_1.2.3               htmltools_0.5.5        
[25] class_7.3-22            sass_0.4.7              lazyeval_0.2.2         
[28] yaml_2.3.7              pillar_1.9.0            crayon_1.5.2           
[31] jquerylib_0.1.4         ellipsis_0.3.2          classInt_0.4-9         
[34] cachem_1.0.8            tidyselect_1.2.0        digest_0.6.33          
[37] stringi_1.7.12          fastmap_1.1.1           grid_4.3.1             
[40] colorspace_2.1-0        cli_3.6.1               magrittr_2.0.3         
[43] crul_1.4.0              utf8_1.2.3              e1071_1.7-13           
[46] withr_2.5.0             scales_1.2.1            sp_2.0-0               
[49] bit64_4.0.5             timechange_0.2.0        httr_1.4.6             
[52] rmarkdown_2.23          bit_4.0.5               hms_1.1.3              
[55] evaluate_0.21           knitr_1.43.1            V8_4.3.3               
[58] geojson_0.3.4           rlang_1.1.1             Rcpp_1.0.11            
[61] httpcode_0.3.0          glue_1.6.2              DBI_1.1.3              
[64] selectr_0.4-2           geojsonsf_2.0.3         rstudioapi_0.15.0      
[67] vroom_1.6.3             jsonlite_1.8.7          R6_2.5.1               
[70] units_0.8-2            

Reuse

CC BY-NC-SA 4.0