rostrum.blog - A tidyverse functions quiz with {learnr}

tl;dr

Can you match the tidyverse function to its package? I used {learnr} inappropriately to hack a ‘tidyquiz’ to test you.

The app isn’t hosted online, but it’s in a package that you can install and run locally with the very latest tidyverse functions:

remotes::install_github("matt-dray/tidyquiz") to install {tidyquiz} (it’s a package!)
library(tidyquiz) to load it
learnr::run_tutorial("tidy", package = "tidyquiz") to open in your browser

The problem

I saw a (probably) tongue-in-cheek tweet from Ryan Timpe:

Hardest part about #rstats package development: remembering which functions are from {dplyr} and which are from {tidyr}.

It’s easy enough to get out of this pickle, but maybe there’s a deeper problem? What if the purpose of each tidyverse package isn’t clear enough?¹ Is there too much arbitrary jargon in the tidyverse?

Enjoy your existential crisis. Meanwhile, I’ve made a little quiz to see if you can remember whether unnest() is from {dplyr} or {tidyr}.² In fact, it’s an interactive multi-choice test that presents you with a random function from the tidyverse and challenges you to select the correct package.

Step 0: the approach

I wanted:

To get a tidy data frame of all the tidyverse package-function combos
A user to be presented with an interactive question about one of these tidyverse functions
The ability to generate a new question from within the document
To share this quiz easily, without a server

Read the rest of this post to see how I tackled these. Or, you know, spoilers:

The tidyverse_packages() function from {tidyverse}
The {learnr} package
An actionButton() and Shiny reactivity
You can put a {learnr} quiz in a package and call it from there!

Step 1: package-function combos

The {tidyverse} package is a package that loads packages.³ It’s a convenient way to load the core packages of the tidyverse (nine as of {tidyverse} 2.0.0, which added {lubridate}).

suppressPackageStartupMessages(library(tidyverse))

But there’s more than just the core packages. To access a list of functions for each package, we first need to load all the packages. We can get a character vector of them all with tidyverse_packages().

tidy_pkgs <- tidyverse_packages()
tidy_pkgs

 [1] "broom"         "conflicted"    "cli"           "dbplyr"       
 [5] "dplyr"         "dtplyr"        "forcats"       "ggplot2"      
 [9] "googledrive"   "googlesheets4" "haven"         "hms"          
[13] "httr"          "jsonlite"      "lubridate"     "magrittr"     
[17] "modelr"        "pillar"        "purrr"         "ragg"         
[21] "readr"         "readxl"        "reprex"        "rlang"        
[25] "rstudioapi"    "rvest"         "stringr"       "tibble"       
[29] "tidyr"         "xml2"          "tidyverse"

Note

I re-rendered this post in August 2023, when more packages had been added to the tidyverse. One of them is the {conflicted} package, which is why I have to namespace-qualify my use of filter() later in this post!

We can pass this character vector to p_load(). This convenient function from {pacman} installs and loads them all for us.

library(pacman)

p_load(
  char = tidy_pkgs,
  character.only = TRUE  # read elements of character vector
)

Now we can get the functions from each package by mapping over them with {purrr} and {pacman}’s p_functions().

tidy_funs <- tidy_pkgs %>% 
  enframe(name = NULL, value = "package") %>%  # make tibble
  mutate(
    functions = map(
      package,  # for each package...
      ~p_functions(.x, character.only = TRUE)  # ...get the functions within
    )
  ) %>% 
  unnest(cols = functions)  # unpack the listcol elements

Here’s a small sample:

sample_n(tidy_funs, 10)  # random sample

# A tibble: 10 × 2
   package    functions          
   <chr>      <chr>              
 1 ggplot2    draw_key_abline    
 2 ggplot2    geom_curve         
 3 cli        pb_percent         
 4 httr       oauth_service_token
 5 conflicted conflict_prefer    
 6 ggplot2    GeomErrorbar       
 7 cli        bg_red             
 8 rstudioapi documentSaveAll    
 9 httr       hmac_sha1          
10 dbplyr     sql_quote
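
If you'd rather not install {pacman}, the same idea can be sketched roughly in base R. This assumes that a package's exported names are a good-enough stand-in for 'its functions' (exports can include non-function objects). It also uses three packages that ship with R as hypothetical stand-ins for tidy_pkgs, so that it runs anywhere. The names pkgs and pkg_funs are illustrative and not from the quiz code.

```r
# Sketch: build a package-function table with base R only.
# `pkgs` is a stand-in for `tidy_pkgs` (assumption: any installed packages work)
pkgs <- c("stats", "utils", "tools")

pkg_funs <- do.call(
  rbind,  # stack the per-package data frames
  lapply(pkgs, function(pkg) {
    exports <- getNamespaceExports(pkg)  # exported names (loads the namespace)
    data.frame(package = pkg, functions = sort(exports))
  })
)

head(pkg_funs)
```

Swapping pkgs for the vector returned by tidyverse_packages() should give a table shaped like tidy_funs, provided those packages are installed.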

Out of interest we can look at the packages with the most and fewest functions:

count(tidy_funs, package, sort = TRUE) %>% slice(1:5)

# A tibble: 5 × 2
  package       n
  <chr>     <int>
1 ggplot2     536
2 rlang       440
3 dplyr       293
4 cli         231
5 lubridate   205

count(tidy_funs, package) %>% arrange(n) %>% slice(1:5)

# A tibble: 5 × 2
  package        n
  <chr>      <int>
1 dtplyr         2
2 conflicted     5
3 tidyverse      6
4 broom          9
5 ragg          10

Another source of confusion might be that some functions exist in multiple packages. How many functions?

count(tidy_funs, functions, sort = TRUE) %>% 
  dplyr::filter(n > 1) %>%
  nrow()

[1] 111
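
To see which functions are shared, and by which packages, something like the following might do. It's a sketch on a made-up toy_funs table, a small stand-in for tidy_funs, so the rows are illustrative only.

```r
library(dplyr)

# Toy stand-in for `tidy_funs` (assumption: same two-column shape)
toy_funs <- tibble::tibble(
  package   = c("dplyr", "tidyr", "purrr", "dplyr", "tibble"),
  functions = c("filter", "unnest", "map", "tibble", "tibble")
)

shared <- toy_funs %>%
  group_by(functions) %>%
  dplyr::filter(n() > 1) %>%  # keep names that appear in more than one package
  summarise(packages = paste(sort(package), collapse = ", "))

shared
```

Here only tibble survives the filter, because it is the only name listed under two packages in the toy data.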

Okay, we have our data set, so let’s get quizzical.

Step 2: interactive questions with {learnr}

The {learnr} package helps you turn an R Markdown document into an interactive tutorial with a little help from Shiny. One option is to create a multiple-choice question, which is exactly what we need.

I should say that {learnr} wasn’t really intended for what I’ve done – it’s better suited to longform tutorials – but using it means that I didn’t have to write the logic for a multi-choice quiz question. Shrug.

Having installed the package and started a {learnr}-flavoured R Markdown,⁴ we can create a question inside a code chunk in this form:

quiz(
  caption = "Question 1",
  question(
    text = "What is Pokemon #399?",  # question
    answer("Bidoof", correct = TRUE),  # right answer
    answer("Drifloom"),   # wrong
    answer("Pyukumuku"),  # wrong
    answer("Rayquaza"),   # wrong
    random_answer_order = TRUE  # answers ordered randomly
  )
)

But this example is hard-coded. In our case we want to replace the subject of the question and the answers any time we want to be presented with a new question.

Looks like we’ll need a button for users to press to signal that they want a new question.

Step 3: generate new questions with Shiny

Since {learnr} operates in a Shiny runtime in our R Markdown file, it’s no problem to use Shiny’s actionButton().

actionButton("goButton", "Get Question")  # button

You can press the button in the app to generate a new seed based on the current time and date. The seed is then used to randomly select a new question for the user.

To make this reactive – so that nothing will happen until the button is pressed – we can write Shiny server code in an R Markdown chunk by setting context="server" in the chunk options. So here’s how we get a new seed after clicking:

seed <- eventReactive(
  input$goButton,
  {
    seed_temp <- as.numeric(Sys.time())
    return(seed_temp)
  }
)

Then our code needs to do two things. First, it samples a row from the full data frame of package-function combos and isolates the name of the function the user will be quizzed on. This code is within eventReactive() and will only trigger when the button has been activated. Second, we use renderText() to take the function name and paste it into a string to create our question.

# Set the reactive element
fun_name <- eventReactive(
  input$goButton,  # on input
  { 
    seed_val <- seed()  # the newly-generated seed value
    set.seed(seed_val)  # seed the random number generator with it
    fun_sample <- sample_n(tidy_funs, 1)  # sample a package-function combo
    fun_name <- pull(fun_sample, functions)  # just the function name
    return(fun_name)  # return the function name
  }
)

# Set the output
# Generate a question that includes the sampled function name 
output$fun_name_out <- renderText({
  paste0("The function `", fun_name(), "` is from which tidyverse package?")
})
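
The button-and-reactive pattern above can also be tried outside {learnr}, in a plain Shiny app. This is a minimal sketch under some assumptions: toy_funs is a made-up stand-in for tidy_funs, the seed and sampling steps are folded into one eventReactive() for brevity, and the app object is built but not launched.

```r
library(shiny)

# Toy stand-in for the post's `tidy_funs` data frame (assumption)
toy_funs <- data.frame(
  package   = c("dplyr", "tidyr", "purrr", "stringr"),
  functions = c("mutate", "unnest", "map", "str_detect")
)

ui <- fluidPage(
  actionButton("goButton", "Get Question"),  # same button as in the post
  textOutput("fun_name_out")
)

server <- function(input, output, session) {
  # Recalculated only when the button is clicked
  fun_name <- eventReactive(input$goButton, {
    set.seed(as.numeric(Sys.time()))  # time-based seed, as in the post
    row <- sample(nrow(toy_funs), 1)  # pick one row index at random
    toy_funs$functions[row]           # the function name for the question
  })

  output$fun_name_out <- renderText({
    paste0("The function `", fun_name(), "` is from which tidyverse package?")
  })
}

app <- shinyApp(ui, server)
# runApp(app)  # uncomment to launch in an interactive session
```

Nothing is shown until the first click, because eventReactive() ignores the button's initial value by default.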

We can repeat this for getting the right answer and alter the code slightly to generate a few wrong answers. A wrong answer is selected randomly from the data frame of tidyverse functions, but only once the correct answer and already-selected wrong answers have been removed. I’ve also coded it so that any package that has a function with the same name – a conflict – will also be removed before a ‘wrong’ answer is chosen.
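
The wrong-answer logic described above could be sketched roughly like this. Both pick_wrong() and toy_funs are hypothetical names for illustration, not code from {tidyquiz}. The sketch also simplifies things by sampling uniformly from the remaining packages rather than from rows of the function table.

```r
# Hypothetical helper: choose `n` wrong packages for the function `fun`
pick_wrong <- function(funs, fun, n = 3) {
  # Packages with a function of this name: the right answer plus any conflicts
  owners <- unique(funs$package[funs$functions == fun])
  # Every other package is a valid wrong answer
  candidates <- setdiff(unique(funs$package), owners)
  sample(candidates, n)  # without replacement, so no repeated wrong answers
}

# Toy stand-in for `tidy_funs` (assumption: same two-column shape)
toy_funs <- data.frame(
  package   = c("dplyr", "tidyr", "purrr", "stringr", "tibble", "dplyr", "readr"),
  functions = c("filter", "unnest", "map", "str_detect", "tibble", "tibble", "read_csv")
)

pick_wrong(toy_funs, "tibble")  # never returns "dplyr" or "tibble"
```

As written, sample() will throw an error if fewer than n candidate packages remain, which shouldn't be a concern with the full tidyverse list.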

So rather than the hard-coded example of a multi-choice question in Step 2, our quiz question code will look like this:

quiz(
  caption = "Question ",
  question(
    text = as.character(textOutput("fun_name_out")),
    answer(as.character(textOutput("ans_correct_out")), correct = TRUE),
    answer(as.character(textOutput("ans_wrong1_out"))),
    answer(as.character(textOutput("ans_wrong2_out"))),
    answer(as.character(textOutput("ans_wrong3_out"))),
    random_answer_order = TRUE
  )
)

So now the text outputs will be rendered into the quiz question and this won’t change until the ‘Get Question’ button is clicked.

Actually, that’s sort-of a lie. {learnr} remembers how its users have performed; it saves their progress. To erase this, we need to click ‘Start Over’ from the menu pane to clear that memory.

Get the code

Browse the code on GitHub and leave an issue with thoughts or suggestions.

For example, it could definitely be improved if the user got a set of 10 questions that were graded to give a final mark. Maybe I’ll implement this one day.

For now, give it a go and let me know if you ever find out if drop_na() is in {dplyr} or {tidyr}.⁵

Environment

Session info

Last rendered: 2023-08-02 22:56:40 BST

R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] xml2_1.3.5          rvest_1.0.3         rstudioapi_0.15.0  
 [4] rlang_1.1.1         reprex_2.0.2        readxl_1.4.2       
 [7] ragg_1.2.5          pillar_1.9.0        modelr_0.1.11      
[10] magrittr_2.0.3      jsonlite_1.8.7      httr_1.4.6         
[13] hms_1.1.3           haven_2.5.2         googlesheets4_1.1.1
[16] googledrive_2.1.1   dtplyr_1.3.1        dbplyr_2.3.2       
[19] cli_3.6.1           conflicted_1.2.0    broom_1.0.5        
[22] pacman_0.5.1        lubridate_1.9.2     forcats_1.0.0      
[25] stringr_1.5.0       dplyr_1.1.2         purrr_1.0.1        
[28] readr_2.1.4         tidyr_1.3.0         tibble_3.2.1       
[31] ggplot2_3.4.2       tidyverse_2.0.0    

loaded via a namespace (and not attached):
 [1] gtable_0.3.3      xfun_0.39         htmlwidgets_1.6.2 gargle_1.5.1     
 [5] tzdb_0.4.0        vctrs_0.6.3       tools_4.3.1       generics_0.1.3   
 [9] fansi_1.0.4       pkgconfig_2.0.3   data.table_1.14.8 lifecycle_1.0.3  
[13] compiler_4.3.1    textshaping_0.3.6 munsell_0.5.0     fontawesome_0.5.1
[17] htmltools_0.5.5   yaml_2.3.7        cachem_1.0.8      tidyselect_1.2.0 
[21] digest_0.6.33     stringi_1.7.12    fastmap_1.1.1     grid_4.3.1       
[25] colorspace_2.1-0  utf8_1.2.3        withr_2.5.0       scales_1.2.1     
[29] backports_1.4.1   timechange_0.2.0  rmarkdown_2.23    cellranger_1.1.0 
[33] memoise_2.0.1     evaluate_0.21     knitr_1.43.1      glue_1.6.2       
[37] DBI_1.1.3         R6_2.5.1          systemfonts_1.0.4 fs_1.6.3

Footnotes

¹ Seems even Hadley gets it wrong sometimes.
² Am I tricking you? Is it actually from neither?
³ The meme writes itself. Or rather, you can do it for me.
⁴ After installing {learnr} you can go to new R Markdown > From Template > Interactive Tutorial.
⁵ Am I tricking you? Is it actually from neither?

Reuse

CC BY-NC-SA 4.0