
A tidyverse functions quiz with {learnr}

Matt Dray (@mattdray)

TL;DR

Can you match the tidyverse function to its package? I used {learnr} to make a ‘tidyquiz’ to test you.

A live version is available at https://mattdray.shinyapps.io/tidyquiz/

To run locally and get the very latest functions:

  1. remotes::install_github("matt-dray/tidyquiz") to install {tidyquiz} (it’s a package!)
  2. library(tidyquiz) to load it
  3. learnr::run_tutorial("tidy", package = "tidyquiz") to open in your browser

The problem

I saw a (probably) tongue-in-cheek tweet recently from Ryan Timpe:

It’s easy enough to get out of this pickle, but maybe there’s a deeper problem? What if the purpose of each tidyverse package isn’t clear enough?1 Is there too much arbitrary jargon in the tidyverse?

Enjoy your existential crisis. Meanwhile, I’ve made a little quiz to see if you can remember whether unnest() is from {dplyr} or {tidyr}2. In fact, it’s an interactive multi-choice test that presents you with a random function from the tidyverse and challenges you to select the correct package.

Step 0: the approach

I wanted:

  1. To get a tidy dataframe of all the tidyverse package-function combos
  2. A user to be presented with an interactive question about one of these tidyverse functions
  3. The ability to generate a new question from within the document
  4. To share this quiz easily, without a server

Read the rest of this post to see how I tackled these. Or, you know, spoilers:

  1. The tidyverse_packages() function from {tidyverse}
  2. The {learnr} package
  3. An actionButton() and Shiny reactivity
  4. You can put a {learnr} quiz in a package and call it from there!

Step 1: package-function combos

The {tidyverse} package is a package that loads packages.3 It’s a convenient way to load the eight core packages of the tidyverse.

library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0       ✔ purrr   0.3.0  
## ✔ tibble  2.1.1       ✔ dplyr   0.8.0.1
## ✔ tidyr   0.8.3       ✔ stringr 1.4.0  
## ✔ readr   1.3.1       ✔ forcats 0.4.0
## ── Conflicts ─────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

But there’s more than these core eight. To access a list of functions for each package, we first need to load all the packages. We can get a character vector of them all with tidyverse_packages().

tidy_pkgs <- tidyverse_packages() %>%  # returns a character vector of package names
  # We can protect ourselves from any rogue characters in this vector
  str_replace("\n", "") %>%  # remove newline
  str_replace(">=", "") %>%  # remove greater than or equal to
  str_replace("[:punct:]", "")  # remove punctuation

tidy_pkgs
##  [1] "broom"      "cli"        "crayon"     "dplyr"      "dbplyr"    
##  [6] "forcats"    "ggplot2"    "haven"      "hms"        "httr"      
## [11] "jsonlite"   "lubridate"  "magrittr"   "modelr"     "purrr"     
## [16] "readr"      "readxl"     "reprex"     "rlang"      "rstudioapi"
## [21] "rvest"      "stringr"    "tibble"     "tidyr"      "xml2"      
## [26] "tidyverse"
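
One caveat on the cleanup step: str_replace() swaps out only the first match in each string, in the same way as base R's sub(), while str_replace_all() behaves like gsub(). That looks to be enough for the names shown above, but a messier vector might need the `_all` variant. Here's a minimal base-R sketch of the difference, using a made-up string rather than a real package name:

```r
# Hypothetical messy string, invented for illustration
messy <- "my.pkg.name"

# sub() replaces the first match only, like stringr::str_replace()
first_only <- sub("[[:punct:]]", "", messy)

# gsub() replaces every match, like stringr::str_replace_all()
every_match <- gsub("[[:punct:]]", "", messy)

first_only
every_match
```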

We can pass this character vector to p_load(). This convenient function from {pacman} installs and loads them all for us.

library(pacman)

p_load(
  char = tidy_pkgs,
  character.only = TRUE  # read elements of character vector
)

Now we can get the functions from each package by mapping over them with {purrr} and {pacman}’s p_functions().

tidy_funs <- tidy_pkgs %>% 
  enframe(name = NULL, value = "package") %>%  # make tibble
  mutate(
    functions = map(
      package,  # for each package...
      ~p_functions(.x, character.only = TRUE)  # ...get the functions within
    )
  ) %>% 
  unnest(functions)  # unpack the listcol elements
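
If you'd rather not depend on {pacman} for this step, a rough base-R equivalent is sketched below. It's an alternative under assumptions rather than what the quiz uses: it relies on getNamespaceExports(), which lists every exported object (not only functions), and it uses three packages that ship with R (stats, utils and tools) as stand-ins so the example runs without the tidyverse installed.

```r
# Stand-in packages that ship with R; swap in tidy_pkgs if they are installed
pkgs <- c("stats", "utils", "tools")

# For each package, list the names it exports
exports_by_pkg <- lapply(pkgs, function(pkg) sort(getNamespaceExports(pkg)))

# Flatten the list into one row per package-export combo,
# which is the shape that map() followed by unnest() produces
pkg_funs <- data.frame(
  package = rep(pkgs, times = lengths(exports_by_pkg)),
  functions = unlist(exports_by_pkg),
  stringsAsFactors = FALSE
)

head(pkg_funs)
```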

Here’s a small sample:

sample_n(tidy_funs, 10)  # random sample
## # A tibble: 10 x 2
##    package   functions             
##    <chr>     <chr>                 
##  1 ggplot2   standardise_aes_names 
##  2 lubridate ehours                
##  3 dplyr     db_begin              
##  4 forcats   fct_anon              
##  5 purrr     chuck                 
##  6 ggplot2   aes_q                 
##  7 magrittr  is_less_than          
##  8 xml2      xml_add_child         
##  9 ggplot2   ggplot_add            
## 10 ggplot2   scale_colour_gradient2

Out of interest we can look at the packages with the most and fewest functions:

count(tidy_funs, package, sort = TRUE) %>% slice(1:5)
## # A tibble: 5 x 2
##   package       n
##   <chr>     <int>
## 1 rlang       468
## 2 ggplot2     456
## 3 dplyr       258
## 4 lubridate   212
## 5 purrr       177
count(tidy_funs, package) %>% arrange(n) %>% slice(1:5)
## # A tibble: 5 x 2
##   package       n
##   <chr>     <int>
## 1 tidyverse     5
## 2 reprex        6
## 3 hms           7
## 4 broom        10
## 5 readxl       13

Another source of confusion might be that some functions exist in multiple packages. How many functions?

count(tidy_funs, functions, sort = TRUE) %>% filter(n > 1) %>% nrow()
## [1] 95
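
To see which names are shared, not just how many, the same idea can be sketched in base R. The data frame below is invented for illustration: the package and function names are placeholders, not real tidyverse exports.

```r
# Toy stand-in for tidy_funs; all names are hypothetical
toy_funs <- data.frame(
  package   = c("pkg_a", "pkg_a", "pkg_b", "pkg_b", "pkg_c"),
  functions = c("foo",   "bar",   "foo",   "baz",   "foo"),
  stringsAsFactors = FALSE
)

# Count how many packages each function name appears in
name_counts <- table(toy_funs$functions)

# Keep the names that appear more than once
shared_names <- names(name_counts[name_counts > 1])

# List the packages involved in each conflict
conflicts <- toy_funs[toy_funs$functions %in% shared_names, ]
conflicts
```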

Okay, we have our dataset, so let’s get quizzical.

Step 2: interactive questions with {learnr}

The {learnr} package helps you turn an R Markdown document into an interactive tutorial with a little help from Shiny. One option is to create a multiple-choice question, which is exactly what we need.

I should say that {learnr} wasn’t really intended for what I’ve done – it’s better suited to longform tutorials – but using it means that I didn’t have to write the logic for a multi-choice quiz question. Shrug.

Having installed the package and started a {learnr}-flavoured R Markdown4 we can create a question inside a code chunk in this form:

quiz(
  caption = "Question 1",
  question(
    text = "What is Pokemon #399?",  # question
    answer("Bidoof", correct = TRUE),  # right answer
    answer("Drifloom"),   # wrong
    answer("Pyukumuku"),  # wrong
    answer("Rayquaza"),   # wrong
    random_answer_order = TRUE  # answers ordered randomly
  )
)

But this example is hardcoded. In our case we want to replace the subject of the question and the answers any time we want to be presented with a new question.

Looks like we’ll need a button for users to press to signal that they want a new question.

Step 3: generate new questions with Shiny

Since {learnr} operates in a Shiny runtime in our R Markdown file, it’s no problem to use Shiny’s actionButton().

actionButton("goButton", "Get Question")  # button

You can press the button in the app to generate a new seed based on the current time and date. The seed is then used to randomly select a new question for the user.

To make this reactive – so that nothing will happen until the button is pressed – we can write Shiny server code in an R Markdown chunk by setting context="server" in the chunk options. So here’s how we get a new seed after clicking:

seed <- eventReactive(
  input$goButton,
  {
    seed_temp <- as.numeric(Sys.time())
    return(seed_temp)
  }
)
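
A fresh seed produces a fresh question because set.seed() makes the next random draw repeatable: the same seed always picks the same row, so the question only changes when the seed does. Here's a small base-R sketch of that idea, where pick_row() is a hypothetical helper and not part of the quiz code:

```r
# Hypothetical helper: pick one row index out of n_rows for a given seed
pick_row <- function(seed, n_rows) {
  set.seed(seed)
  sample.int(n_rows, size = 1)
}

# The same seed gives the same pick every time
same_seed_agrees <- pick_row(123, 1000) == pick_row(123, 1000)
same_seed_agrees

# A time-based seed, as in the quiz, changes as the clock moves on
time_seed <- as.numeric(Sys.time())
pick_row(time_seed, 1000)
```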

Our code then needs to do two things. First, it samples a row from the full dataframe of package-function combos and isolates the name of the function the user will be quizzed on. This code is within eventReactive() and will only trigger when the button has been activated. Second, we use renderText() to take the function name and paste it into a string to create our question.

# Set the reactive element
fun_name <- eventReactive(
  input$goButton,  # on input
  { 
    seed_val <- seed()  # the newly-generated seed value
    set.seed(seed_val)  # seed the sampling with the time-based value
    fun_sample <- sample_n(tidy_funs, 1)  # sample a package-function combo
    fun_name <- select(fun_sample, functions) %>% pull()  # just the function name
    return(fun_name)  # return the function name
  }
)

# Set the output
# Generate a question that includes the sampled function name 
output$fun_name_out <- renderText({
  paste0("The function `", fun_name(), "` is from which tidyverse package?")
})

We can repeat this for getting the right answer and alter the code slightly to generate a few wrong answers. A wrong answer is selected randomly from the dataframe of tidyverse functions, but only once the correct answer and already-selected wrong answers have been removed. I’ve also coded it so that any package that has a function with the same name – a conflict – will also be removed before a ‘wrong’ answer is chosen.
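
The wrong-answer logic might look something like the following. This is a sketch under assumptions rather than the code in {tidyquiz}: pick_wrong_answers() and the toy data frame are hypothetical, and base R is used so that it runs without extra packages.

```r
# Hypothetical helper: choose wrong-answer packages for a given function name
pick_wrong_answers <- function(pkg_funs, fun_name, n_wrong = 3) {
  # Packages containing a function of this name: the right answer plus conflicts
  excluded <- unique(pkg_funs$package[pkg_funs$functions == fun_name])

  # Every other package is a valid wrong answer
  candidates <- setdiff(unique(pkg_funs$package), excluded)
  stopifnot(length(candidates) >= n_wrong)

  # Sampling without replacement means no wrong answer is picked twice
  sample(candidates, size = n_wrong)
}

# Toy stand-in for tidy_funs; all names are invented
toy_funs <- data.frame(
  package   = c("pkg_a", "pkg_b", "pkg_c", "pkg_d", "pkg_e", "pkg_f"),
  functions = c("foo",   "foo",   "bar",   "baz",   "qux",   "quux"),
  stringsAsFactors = FALSE
)

wrong <- pick_wrong_answers(toy_funs, "foo")
wrong
```

Here "foo" lives in both pkg_a and pkg_b, so neither can be offered as a wrong answer and the three picks come from the remaining four packages.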

So rather than the hardcoded example of a multi-choice question in Step 2, our quiz question code will look like this:

quiz(
  caption = "Question ",
  question(
    text = as.character(textOutput("fun_name_out")),
    answer(as.character(textOutput("ans_correct_out")), correct = TRUE),
    answer(as.character(textOutput("ans_wrong1_out"))),
    answer(as.character(textOutput("ans_wrong2_out"))),
    answer(as.character(textOutput("ans_wrong3_out"))),
    random_answer_order = TRUE
  )
)

So now the text outputs will be rendered into the quiz question and this won’t change until the ‘Get Question’ button is clicked.

Actually, that’s sort-of a lie. {learnr} remembers how its users have performed; it saves their progress. To erase this, we need to click ‘Start Over’ from the menu pane to clear that memory.

Get the code

Browse the code on GitHub and leave an issue with thoughts or suggestions.

For example, it could definitely be improved if the user got a set of 10 questions that were graded to give a final mark. Maybe I’ll implement this one day.

For now, give it a go and let me know if you ever find out if drop_na() is in {dplyr} or {tidyr}.5


  1. Seems even Hadley gets it wrong sometimes.↩

  2. Am I tricking you? Is it actually from neither?↩

  3. The meme writes itself. Or rather, you can do it for me.↩

  4. After installing {learnr} you can go to new R Markdown > From Template > Interactive Tutorial.↩

  5. Am I tricking you? Is it actually from neither?↩