Quantify colour with magick

Matt Dray (@mattdray)

Walrus with rainbow vomit, of course (via Giphy)

Art of the possible

So what we’re going to do is:

  1. Read in an image
  2. Prepare a set of ‘simple colours’
  3. Map the simple colours to the image
  4. Quantify the colours

There’s also a step 5 that I’m not covering here: create a tool where the user chooses a colour and images are returned in order of dominance for that colour. We could also write this all into a function that takes a folder of images and returns the percentage of each colour in each image.

It’s a kind of ImageMagick

The magick R package is an R interface to ImageMagick, an open-source software suite whose emphasis is image manipulation from the command line. The flexibility of magick can be seen in the vignette.

magick was created and is maintained by Jeroen Ooms, a software engineer and postdoc at rOpenSci, a collective that seeks to develop tools for open and reproducible research.

rOpenSci hosted a workshop from Ooms about working with images in R (presentation slides), which caught my attention. I’ve used some of his code below.

Code

First we need to load our packages.

# Use install.packages("package") if not yet installed
library(dplyr)  # tidy data manipulation
library(tibble)  # tidy tables
library(magick)  # image manipulation
library(knitr)  # nice html tables

Read the test image

I’ve chosen a colourful image to use for our test case: it’s a picture of a bunch of Lego Duplo bricks.[3]

We’ll use image_read() to read the JPEG as an object of class ‘magick’ and then image_scale() to reduce the image size and save some space.

Printing the image also gives us some details of format, dimensions, etc.

# Path to the image
duplo_path <- "https://upload.wikimedia.org/wikipedia/commons/thumb/a/ac/Lego_dublo_arto_alanenpaa_2.JPG/2560px-Lego_dublo_arto_alanenpaa_2.JPG"

# Read as magick object and resize
duplo <- 
  image_read(duplo_path) %>%
  image_scale(geometry = c("x600"))

print(duplo)
##   format width height colorspace matte filesize density
## 1   JPEG   900    600       sRGB FALSE        0   72x72

Prepare simple colours

We’ll map a set of simple colours to the test image. This means that the colours from the test image will be replaced by the ‘closest’ colour from our simple set.
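
To make ‘closest’ a bit more concrete, here is a rough base-R sketch that picks the nearest simple colour by Euclidean distance between RGB values. It is only an illustration: nearest_simple() is a made-up helper, and image_map() hands the real work to ImageMagick, which may well measure closeness differently.

```r
# Simple colour set (same hex values as used in this post)
simple <- c(
  black = "#000000", blue = "#0000ff", green = "#008000",
  red = "#ff0000", white = "#ffffff", yellow = "#ffff00"
)

# Hypothetical helper: name of the simple colour nearest to `hex`,
# measured as squared Euclidean distance between RGB triplets.
# This is an assumption for illustration, not ImageMagick's own method.
nearest_simple <- function(hex, palette = simple) {
  target <- col2rgb(hex)[, 1]   # red, green, blue values (0 to 255)
  pal_rgb <- col2rgb(palette)   # 3 x n matrix, one column per colour
  dists <- colSums((pal_rgb - target)^2)  # distance to each palette colour
  names(palette)[which.min(dists)]
}

nearest_simple("#b22222")  # a darker brick red
nearest_simple("#e8e8f0")  # a pale, reflective highlight
```

Under this distance measure the first call picks red and the second picks white, which mirrors how shiny highlights can end up in the white bucket regardless of the brick's true colour.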

One way to do this is to define our simple colour set and create an image from them. In this case I’m taking just six colours.

# Generate named vector of 'simple' colours
cols_vec <- setNames(
  c("#000000", "#0000ff", "#008000", "#ff0000", "#ffffff", "#ffff00"),
  c("black", "blue", "green", "red", "white", "yellow")
)

Then we can plot squares of these colours, using image_graph() to capture each plot as a magick-class object.[4] My method here is inefficient and not really reproducible, but you can see the output is an image that contains our six colours.

# For each vector element (colour) create a square of that colour
for (i in seq_along(cols_vec)) {
  fig_name <- paste0(names(cols_vec)[i], "_square")  # create object name
  assign(
    fig_name,  # set name
    image_graph(width = 100, height = 100, res = 300)  # create magick object
  )
  par(mar = rep(0, 4))  # set plot margins
  plot.new()  # new graphics frame
  rect(0, 0, 1, 1, col = cols_vec[i], border = cols_vec[i])  # build rectangle
  assign(fig_name, magick::image_crop(get(fig_name), "50x50+10+10")) # crop
  dev.off()  # shut down plotting device
  rm(i, fig_name)  # clear up
}

# Generate names of the coloured square objects
col_square <- paste0(names(cols_vec), "_square")

# Combine magick objects (coloured squares)
simple_cols <- image_append(c(
  get(col_square[1]), get(col_square[2]), get(col_square[3]),
  get(col_square[4]), get(col_square[5]), get(col_square[6])
))

print(simple_cols)
##   format width height colorspace matte filesize density
## 1    PNG   300     50       sRGB  TRUE        0   72x72
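
A possibly tidier route to the same kind of palette image is to skip the plotting device and fill blank canvases directly. The sketch below assumes magick's image_blank() and image_join() behave as documented. The object names squares and simple_cols_alt are my own, and I haven't checked that the result maps identically to the version above.

```r
library(magick)

# Same named vector of 'simple' colours as above
cols_vec <- setNames(
  c("#000000", "#0000ff", "#008000", "#ff0000", "#ffffff", "#ffff00"),
  c("black", "blue", "green", "red", "white", "yellow")
)

# One 50 x 50 canvas per colour, each filled with a single solid colour
squares <- lapply(unname(cols_vec), function(col) {
  image_blank(width = 50, height = 50, color = col)
})

# Join the list into one multi-frame object, then append left to right
simple_cols_alt <- image_append(image_join(squares))

image_info(simple_cols_alt)  # check the dimensions of the combined image
```

Because every square is generated in memory as one flat colour, nothing is written to disk and compressed along the way, so the palette should hold only the six defined colours.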

Map to the image

Now we can apply the simple colour set to the test image using image_map(). I’ve shown the original and mapped images in turn underneath using image_animate(), also from magick.

# Map the simple colours to the image
duplo_mapped <- image_map(
  image = duplo,  # original image
  map = simple_cols  # colours to map on
)

# Alternate between the original and simplified images, one frame per second
image_animate(c(duplo, duplo_mapped), fps = 1)

Great. You can see where the original colours have been replaced by the ‘closest’ simple colours. Note in particular where the more reflective surfaces are mapped to white rather than the actual brick colour. This is okay: the brick may be blue, but we’ve only defined one shade of blue. If a particular shade is closer to white, then so be it.

Quantify the colours

Now we can take this mapped image and quantify how much of the image belongs to each colour. Imagine we’ve broken the image into pixels and are then counting how many belong to each of our six colours.
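
As a toy illustration of that counting, the sketch below builds a made-up 2 x 2 ‘image’ as a raw array and tallies its pixels with base R. It assumes the channels x width x height layout that image_data() returns, and the object names (toy, hex, counts) are hypothetical.

```r
# A made-up 2 x 2 'image': 3 channels (red, green, blue) x 2 wide x 2 high,
# stored as raw bytes. Arrays fill channel-first, so each run of three
# values below is one pixel.
toy <- array(
  as.raw(c(
    255, 0, 0,      # pixel at width 1, height 1: red
    255, 0, 0,      # pixel at width 2, height 1: red
    0, 0, 255,      # pixel at width 1, height 2: blue
    255, 255, 255   # pixel at width 2, height 2: white
  )),
  dim = c(3, 2, 2)
)

# Collapse the three channel bytes of each pixel into one hex string
hex <- paste0("#", apply(toy, 2:3, paste, collapse = ""))

# Count pixels per colour and convert to percentages
counts <- table(hex)
round(100 * prop.table(counts), 1)
```

Two of the four pixels are red, so red accounts for half of this toy image, with blue and white at a quarter each. The function below applies the same idea to every pixel of the mapped photo.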

# Function to count the colours (adapted from Jeroen Ooms)
count_colors <- function(image) {
  data <- image_data(image) %>%  # raw array: channels x width x height
    apply(2:3, paste, collapse = "") %>%  # one hex string per pixel
    as.vector() %>%
    table() %>%  # tally the pixels for each hex value
    as.data.frame() %>%
    setNames(c("hex", "freq"))
  data$hex <- paste0("#", data$hex)  # add prefix to match cols_vec
  return(data)
}

# Dataframe of dominant colours 
duplo_col_freq <- duplo_mapped %>%
  count_colors() %>%
  left_join(
    enframe(cols_vec, name = "name", value = "hex"),  # named vector to tibble
    by = "hex"
  ) %>% 
  arrange(desc(freq)) %>% 
  mutate(percent = 100*round((freq/sum(freq)), 3)) %>% 
  select(
    `Colour name` = name, Hexadecimal = hex, `Frequency of colour` = freq,
    `Percent of image` = percent
  )

duplo_mapped  # see mapped image again

kable(duplo_col_freq)  # quantify colour
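|Colour name |Hexadecimal | Frequency of colour| Percent of image|
|:-----------|:-----------|-------------------:|----------------:|
|red         |#ff0000     |              132134|             24.5|
|white       |#ffffff     |              107847|             20.0|
|black       |#000000     |              103641|             19.2|
|yellow      |#ffff00     |               79751|             14.8|
|green       |#008000     |               64867|             12.0|
|blue        |#0000ff     |               51760|              9.6|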

So red makes up almost a quarter of the image, with white and black just behind. Many of the bricks are red, and much of the shadowed area of the yellow bricks was rendered as red. Black and white account for much of the shadowed and reflective surfaces elsewhere in the image.
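
The steps above could be wrapped into a single function that works through a folder of images, as suggested at the start of the post. What follows is a minimal sketch under several assumptions rather than a finished tool. quantify_folder() and its arguments are hypothetical names, the palette is built with image_blank() instead of the plotting approach used earlier, and the hex codes in cols are assumed to be lowercase six-digit strings.

```r
library(magick)

cols_vec <- c(
  black = "#000000", blue = "#0000ff", green = "#008000",
  red = "#ff0000", white = "#ffffff", yellow = "#ffff00"
)

# Hypothetical helper: percentage of each simple colour per image in a folder
quantify_folder <- function(folder, cols = cols_vec, height = 200) {
  # Palette image built in memory, one solid square per colour
  palette <- image_append(image_join(
    lapply(unname(cols), function(col) image_blank(10, 10, color = col))
  ))
  paths <- list.files(
    folder, pattern = "\\.(jpe?g|png)$", ignore.case = TRUE, full.names = TRUE
  )
  results <- lapply(paths, function(path) {
    img <- image_scale(image_read(path), paste0("x", height))
    mapped <- image_map(img, palette)
    # Ask for red, green and blue only, so each pixel gives a 6-digit hex code
    px <- image_data(mapped, channels = "rgb")
    hex <- paste0("#", apply(px, 2:3, paste, collapse = ""))
    # Count against the full palette so absent colours appear as zero
    counts <- table(factor(hex, levels = unname(cols)))
    data.frame(
      file = basename(path),
      colour = names(cols),
      percent = round(100 * as.vector(counts) / length(hex), 1)
    )
  })
  do.call(rbind, results)
}

# Hypothetical usage, assuming a folder called "images" exists:
# res <- quantify_folder("images")
# reds <- res[res$colour == "red", ]
# reds[order(-reds$percent), ]  # images ranked by how red they are
```

The commented usage lines hint at the ‘step 5’ idea too: filter the result to one colour and sort by percentage to rank images by that colour's dominance.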

A few more examples

Reef fish

By Richard L Pyle from Wikimedia Commons, CC0 1.0.

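|Colour name |Hexadecimal | Frequency of colour| Percent of image|
|:-----------|:-----------|-------------------:|----------------:|
|blue        |#0000ff     |              317133|             49.5|
|black       |#000000     |              214649|             33.5|
|green       |#008000     |               76245|             11.9|
|yellow      |#ffff00     |               13297|              2.1|
|red         |#ff0000     |               10077|              1.6|
|white       |#ffffff     |                8799|              1.4|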

Neon lights in Hong Kong

By Daniel Case from Wikimedia Commons, CC BY-SA 3.0

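|Colour name |Hexadecimal | Frequency of colour| Percent of image|
|:-----------|:-----------|-------------------:|----------------:|
|black       |#000000     |              191565|             71.0|
|green       |#008000     |               23134|              8.6|
|blue        |#0000ff     |               18455|              6.8|
|red         |#ff0000     |               17551|              6.5|
|yellow      |#ffff00     |               10874|              4.0|
|white       |#ffffff     |                8421|              3.1|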

Ladybird

By Elena Andreeva from Wikimedia Commons, CC0 1.0.

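|Colour name |Hexadecimal | Frequency of colour| Percent of image|
|:-----------|:-----------|-------------------:|----------------:|
|white       |#ffffff     |              300366|             54.2|
|blue        |#0000ff     |              117361|             21.2|
|yellow      |#ffff00     |              100809|             18.2|
|green       |#008000     |               27647|              5.0|
|black       |#000000     |                5305|              1.0|
|red         |#ff0000     |                2312|              0.4|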

Session info

devtools::session_info()
## Session info -------------------------------------------------------------
##  setting  value                       
##  version  R version 3.5.1 (2018-07-02)
##  system   x86_64, darwin15.6.0        
##  ui       X11                         
##  language (EN)                        
##  collate  en_GB.UTF-8                 
##  tz       Europe/London               
##  date     2018-12-09
## Packages -----------------------------------------------------------------
##  package    * version date       source        
##  assertthat   0.2.0   2017-04-11 CRAN (R 3.5.0)
##  backports    1.1.2   2017-12-13 CRAN (R 3.5.0)
##  base       * 3.5.1   2018-07-05 local         
##  bindr        0.1.1   2018-03-13 CRAN (R 3.5.0)
##  bindrcpp   * 0.2.2   2018-03-29 CRAN (R 3.5.0)
##  blogdown     0.7     2018-07-07 CRAN (R 3.5.0)
##  bookdown     0.7     2018-02-18 CRAN (R 3.5.0)
##  compiler     3.5.1   2018-07-05 local         
##  crayon       1.3.4   2017-09-16 CRAN (R 3.5.0)
##  curl         3.2     2018-03-28 CRAN (R 3.5.0)
##  datasets   * 3.5.1   2018-07-05 local         
##  devtools     1.13.6  2018-06-27 CRAN (R 3.5.0)
##  digest       0.6.18  2018-10-10 cran (@0.6.18)
##  dplyr      * 0.7.8   2018-11-10 CRAN (R 3.5.0)
##  evaluate     0.10.1  2017-06-24 CRAN (R 3.5.0)
##  glue         1.3.0   2018-07-17 CRAN (R 3.5.0)
##  graphics   * 3.5.1   2018-07-05 local         
##  grDevices  * 3.5.1   2018-07-05 local         
##  highr        0.7     2018-06-09 CRAN (R 3.5.0)
##  htmltools    0.3.6   2017-04-28 CRAN (R 3.5.0)
##  knitr      * 1.20    2018-02-20 CRAN (R 3.5.0)
##  magick     * 1.9     2018-05-11 cran (@1.9)   
##  magrittr     1.5     2014-11-22 CRAN (R 3.5.0)
##  memoise      1.1.0   2017-04-21 CRAN (R 3.5.0)
##  methods    * 3.5.1   2018-07-05 local         
##  pillar       1.3.0   2018-07-14 CRAN (R 3.5.0)
##  pkgconfig    2.0.2   2018-08-16 cran (@2.0.2) 
##  png          0.1-7   2013-12-03 CRAN (R 3.5.0)
##  purrr        0.2.5   2018-05-29 CRAN (R 3.5.0)
##  R6           2.3.0   2018-10-04 cran (@2.3.0) 
##  Rcpp         1.0.0   2018-11-07 CRAN (R 3.5.0)
##  rlang        0.3.0.1 2018-10-25 CRAN (R 3.5.0)
##  rmarkdown    1.10    2018-06-11 CRAN (R 3.5.0)
##  rprojroot    1.3-2   2018-01-03 CRAN (R 3.5.0)
##  stats      * 3.5.1   2018-07-05 local         
##  stringi      1.2.4   2018-07-20 CRAN (R 3.5.0)
##  stringr      1.3.1   2018-05-10 CRAN (R 3.5.0)
##  tibble     * 1.4.2   2018-01-22 CRAN (R 3.5.0)
##  tidyselect   0.2.5   2018-10-11 CRAN (R 3.5.0)
##  tools        3.5.1   2018-07-05 local         
##  utils      * 3.5.1   2018-07-05 local         
##  withr        2.1.2   2018-03-15 CRAN (R 3.5.0)
##  xfun         0.3     2018-07-06 CRAN (R 3.5.0)
##  yaml         2.1.19  2018-05-01 CRAN (R 3.5.0)

  1. Just as well, because I’m colourblind.

  2. There are five named versions of olive drab in R.

  3. Photo by Arto Alanenpää, CC0-BY-4.0 from Wikimedia Commons.

  4. Artefacts introduced during compression of PNGs and JPGs might mean that your set of six colours ends up being more than six. It’s preferable to generate our colour set within R, inside image_graph(), so that we have only our six defined colours.