Generating Mountain Goats lyrics

Matt Dray (@mattdray)

John Darnielle with the green-scaled slipcase for In League with Dragons


The Mountain Goats released In League with Dragons today, their seventeenth studio album.

John Darnielle has written a lot of words across the Mountain Goats’ back catalogue. His lyrics are poetic and descriptive, covering fictional and autobiographical themes that include substance abuse, professional wrestling and cadaver-sniffing dogs.

Can we generate new Mountain Goats lyrics given this rich text dataset? This is a short post to do exactly that using the {spotifyr}, {genius} and {markovifyR} packages for R.


Get lyrics

The {spotifyr} package [1] pulls artist and album information from the music streaming service Spotify, along with some interesting audio features like ‘danceability’ and ‘acousticness’. It also fetches lyrics from Genius via the {genius} package [2].

First get a developer account for the Spotify API. Run usethis::edit_r_environ() and add your client ID and secret in the form SPOTIFY_CLIENT_ID=X and SPOTIFY_CLIENT_SECRET=Y. The get_spotify_access_token() function will add an access token to your environment, which will authenticate each API request.

library(spotifyr)  # install.packages("spotifyr")
access_token <- get_spotify_access_token()
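If the token request fails, a quick check that both variables are visible to the session can help. This is a minimal base-R sketch; missing_env() is a hypothetical helper written for this post, not part of {spotifyr}. Remember that .Renviron is read at startup, so R needs restarting after you edit it.

```r
# Hypothetical helper: return the names of any environment variables
# that are unset or empty in the current R session
missing_env <- function(vars) {
  values <- Sys.getenv(vars)  # "" for anything that is not set
  vars[!nzchar(values)]
}

needed <- c("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET")
not_set <- missing_env(needed)

if (length(not_set) > 0) {
  message("Not set: ", paste(not_set, collapse = ", "))
} else {
  message("Both Spotify credentials are visible to this R session.")
}
```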

The get_discography() function fetches a named artist’s back catalogue, including the lyrics. Beware: the result may include duplicate songs, because Spotify lists regional releases, reissues and deluxe versions separately.

goat_discography <- spotifyr::get_discography("the mountain goats")
dim(goat_discography)
## [1] 399  41

This is a hefty dataframe with 41 columns of data for nearly 400 songs. Let’s simplify the columns and, for fun, look at five random songs and their ‘energy’.

library(dplyr)  # for data manipulation and %>%

goat_disco <- goat_discography %>% 
  ungroup() %>% 
  select(
    album_name, album_release_year,  # album
    track_name, track_number, duration_ms,  # track info
    key_name, mode_name, key_mode, tempo, time_signature,  # music info
    danceability, energy, loudness, mode, speechiness,  # audio features
    acousticness, instrumentalness, liveness, valence,  # audio features
    lyrics
  )

sample_n(goat_disco, 5) %>%
  select(album_name, track_name, energy)  # a sample
## # A tibble: 5 x 3
##   album_name                           track_name                 energy
##   <chr>                                <chr>                       <dbl>
## 1 Bitter Melon Farm                    Star Dusting                0.437
## 2 Transcendental Youth                 Counterfeit Florida Plates  0.714
## 3 Protein Source Of The Future... Now! Going To Malibu             0.803
## 4 Full Force Galesburg                 Minnesota                   0.452
## 5 In League with Dragons               Younger                     0.586
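The duplicate tracks that get_discography() can return could be dropped by keeping one row per song title. The sketch below uses base R on a made-up toy dataframe (the song and album names are placeholders, not real data) and assumes that a matching lower-cased title means the same song, which won’t hold for live versions or for different songs that share a name.

```r
# Toy stand-in for a few columns of the discography; the rows are made up
tracks <- data.frame(
  track_name = c("Song A", "Song A", "song a", "Song B"),
  album_name = c("Album X", "Album X (Deluxe)", "Album X", "Album Y"),
  album_release_year = c(2002, 2012, 2002, 2005),
  stringsAsFactors = FALSE
)

# Sort so the earliest release comes first, then keep the first row
# seen for each lower-cased track name
tracks <- tracks[order(tracks$album_release_year), ]
deduped <- tracks[!duplicated(tolower(tracks$track_name)), ]

deduped$track_name
## [1] "Song A" "Song B"
```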

I’ll be saving this dataframe for some other analysis, but for now we’ll need only the lyrics. The lyrics are stored in a list-column as a separate tibble (data frame) per song.

library(tidyr)  # for unnest()

goat_lyrics <- goat_disco %>%
  filter(lyrics != "NULL") %>%  # remove rows where lyrics weren't collected
  unnest(lyrics) %>%  # unpack the lyrics list-column
  filter(!is.na(lyric)) %>%  # remove empty lyrics
  select(-line) %>%  # unneeded column
  group_by(lyric) %>% slice(1) %>%  ungroup() %>% # remove duplicate lyrics
  pull(lyric)  # convert column to character vector

sample(goat_lyrics, 10)  # a sample
##  [1] "I try to remember to write in the diary"                         
##  [2] "Feel bad about the things we do along the way"                   
##  [3] "I was a red dot blinking on a screen up overhead"                
##  [4] "Been waiting such a long time now, my number's finally coming up"
##  [5] "Drunk on the spirit and high on fumes"                           
##  [6] "To piss off the dumb few that forgave us"                        
##  [7] "Save yourselves"                                                 
##  [8] "Two of the renegades carrying firewood, a third with a lightbulb"
##  [9] "Garden Grove"                                                    
## [10] "But the devil is in the details"
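To see what unpacking a list-column involves, here is a small sketch on made-up data. It uses base R rather than {tidyr} so that it runs without extra packages, so it illustrates the idea and is not a drop-in replacement for unnest(). The column names mirror those used above, and the ‘lyrics’ are placeholders.

```r
# A toy dataframe with a list-column: one small dataframe of lines per song,
# and NULL where no lyrics were collected
songs <- data.frame(
  track_name = c("Song A", "Song B", "Song C"),
  stringsAsFactors = FALSE
)
songs$lyrics <- list(
  data.frame(line = 1:2, lyric = c("first line", "second line"),
             stringsAsFactors = FALSE),
  NULL,
  data.frame(line = 1L, lyric = "only line", stringsAsFactors = FALSE)
)

# Find the songs that actually have lyrics
has_lyrics <- !vapply(songs$lyrics, is.null, logical(1))

# Stack the per-song dataframes, repeating the track name on each line
unpacked <- do.call(rbind, lapply(which(has_lyrics), function(i) {
  data.frame(track_name = songs$track_name[i], songs$lyrics[[i]],
             stringsAsFactors = FALSE)
}))

unpacked$lyric
## [1] "first line"  "second line" "only line"
```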

Generate lyrics

We can use a Markov chain to generate new lyrics based on our dataset. Basically, it picks each next word at random, weighted by how often that word follows the current word (or the current few words) in our input dataset. You can read more about this principle elsewhere.
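To make the principle concrete, here is a toy word-level chain in base R. It’s a sketch only: build_chain() and generate_line() are hypothetical helpers written for this post, the three input lines are made up, and the chain looks back one word where the model later in this post uses a state size of two.

```r
# Three made-up lines stand in for the lyric vector; they are not real lyrics
toy_lines <- c(
  "the wind is cold tonight",
  "the wind is at my back",
  "the road is long tonight"
)

# For each word, store every word that followed it in the input.
# "<END>" marks the end of a line.
build_chain <- function(lines) {
  chain <- list()
  for (line in lines) {
    words <- c(strsplit(line, " ", fixed = TRUE)[[1]], "<END>")
    for (i in seq_len(length(words) - 1)) {
      chain[[words[i]]] <- c(chain[[words[i]]], words[i + 1])
    }
  }
  chain
}

# Walk the chain from a starting word. Followers are stored with repeats,
# so sampling uniformly from them weights by how often each one occurred.
generate_line <- function(chain, start, max_words = 20) {
  out <- start
  current <- start
  while (length(out) < max_words) {
    followers <- chain[[current]]
    if (is.null(followers)) break  # word never seen in the input
    nxt <- followers[sample.int(length(followers), 1)]
    if (nxt == "<END>") break
    out <- c(out, nxt)
    current <- nxt
  }
  paste(out, collapse = " ")
}

chain <- build_chain(toy_lines)
chain[["is"]]
## [1] "cold" "at"   "long"

generate_line(chain, start = "the")  # output varies from run to run
```

Because ‘the’ is followed by ‘wind’ twice and ‘road’ once in the toy input, a generated line is twice as likely to begin ‘the wind’ as ‘the road’.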

The {markovifyR} package [3] is a wrapper for the Python package markovify, which ‘is a simple, extensible Markov chain generator’. You can install markovify at the command line via R’s system() function. {furrr} is also needed.

# system("pip install markovify")
library(markovifyR)  # remotes::install_github("abresler/markovifyR")
library(furrr)  # install.packages("furrr")

Now we can generate the model given all the lyrics.

markov_model <- generate_markovify_model(
    input_text = goat_lyrics,
    markov_state_size = 2L,
    max_overlap_total = 25,
    max_overlap_ratio = 0.7
  )

You can meddle with these controls, but I’ve kept to the suggested defaults for now. Note that ‘overlap’ relates to the likelihood of generating whole sentences that already exist. See markovify for more detail.
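As a rough illustration of the overlap idea, the sketch below rejects a candidate line if it contains a run of words, longer than an allowed length, that appears verbatim in the input. This is my simplified reading of the behaviour and not markovify’s actual code: is_too_similar() is a hypothetical helper, and it checks each input line separately.

```r
# Hypothetical helper: TRUE if the candidate copies too long a run of words
is_too_similar <- function(candidate, corpus,
                           max_overlap_ratio = 0.7, max_overlap_total = 25) {
  words <- strsplit(candidate, " ", fixed = TRUE)[[1]]
  if (length(words) == 0) return(FALSE)

  # Longest verbatim run we tolerate: a share of the candidate's length,
  # capped at an absolute number of words
  allowed <- min(max_overlap_total, round(max_overlap_ratio * length(words)))
  run_length <- min(allowed + 1, length(words))

  # Every run of 'run_length' consecutive words in the candidate
  starts <- seq_len(length(words) - run_length + 1)
  runs <- vapply(starts, function(i) {
    paste(words[i:(i + run_length - 1)], collapse = " ")
  }, character(1))

  # Reject if any such run appears word-for-word in the input
  any(vapply(runs, function(r) any(grepl(r, corpus, fixed = TRUE)), logical(1)))
}

corpus <- c("the wind is cold tonight", "the road is long tonight")
is_too_similar("the wind is cold tonight", corpus)  # an exact copy
## [1] TRUE
is_too_similar("the road is cold tonight", corpus)  # a new mix of both lines
## [1] FALSE
```

Lowering the ratio makes the check stricter: with max_overlap_ratio = 0.4 the second candidate is rejected too, because the three-word run ‘the road is’ comes straight from the input.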

Now to generate a bunch of lines.

goat_speak <- markovify_text(
  markov_model = markov_model,
  maximum_sentence_length = NULL,
  output_column_name = 'goat_speak',
  count = 50,
  tries = 100,
  only_distinct = TRUE,
  return_message = TRUE
)
## goat_speak: Shower room full of promise and potential
## goat_speak: On the floor again
## goat_speak: The singer's locked up in the room fill with steam
## goat_speak: To tell the cameras
## goat_speak: Your hair was dark and there is nobody driving
## goat_speak: You'll be maybe lunging for the fever to break my fall
## goat_speak: Deep in my teeth into your handbag and puled out a place to go
## goat_speak: The color of the night came on, you burst into song
## goat_speak: Feel like the rhythm of the death-dealing physician
## goat_speak: As the windows down
## goat_speak: And I know you see is what you will be saved
## goat_speak: Don't feel like I'm gonna take this rabbit into Malibu
## goat_speak: Children in the desert sand
## goat_speak: I saw something dark and then remember you
## goat_speak: The wind from the hands of children
## goat_speak: The wind behind us like ribbons
## goat_speak: Let's go to Maine right now headed silent down the sails tonight
## goat_speak: You come into the Grand Am
## goat_speak: Orphans in the black walnut tree
## goat_speak: To the rooms with the windows on the dunes
## goat_speak: You were looking at you
## goat_speak: When we meet again?
## goat_speak: Perfectly aware of where our love in clear perspective
## goat_speak: Preacher in the rafters
## goat_speak: Heading to the concrete walls in shadow
## goat_speak: They were on Cherry Red I think I'll stay here
## goat_speak: There's champagne on the wind come through his teeth
## goat_speak: Try not to lose you
## goat_speak: I fear for my safety, you can see that you're wearing a wire
## goat_speak: All the way home from the Hunan Province
## goat_speak: Let's go where everybody knows
## goat_speak: I throw up in the rain falling from the world, I may not ever get the call
## goat_speak: Though we may kiss no more after me
## goat_speak: You had sugar on your hair behind your head someday
## goat_speak: I heard you saying anyway
## goat_speak: When you eased down the Hertzsprung-Russell diagram
## goat_speak: To an afternoon I spent in jail again
## goat_speak: Got out of the refrigerator
## goat_speak: The wind blew through your body breathing on mine
## goat_speak: We had hot caramel sticking to her skin
## goat_speak: See the rain, feel the new found, rich brown, deep, wet ground
## goat_speak: From all away across the bruised earth
## goat_speak: When I got real cold
## goat_speak: Like a desperate strength
## goat_speak: Try not to open up your eyes
## goat_speak: Pull down my throat, fresh blood in our throats
## goat_speak: What it is that I'm aware of where I stand
## goat_speak: When I get the call
## goat_speak: But none of your hair
## goat_speak: Wheels were spitting out sparks, scraping at the stars and knew the secrets within them

Mountain Goats fans will probably recognise a few familiar patterns in the text that emerges.

I ran this function a few times and here are a few outputs that made me laugh (or think):

  • And a bird we would have liked brought the Norman invasion
  • I hope I never liked Morrissey
  • Went and got the case of vodka from a disco in old east Berlin
  • Fresh coffee at sunrise, warm my lips like a dying man
  • But I felt all the Portuguese water dogs?
  • But my love is like a tattoo into my ear
  • I write reminders on my kimono that I could not remember
  • And you brought me a bowl of cooked wild grasses
  • We had hot caramel sticking to her skin
  • And then the special chicken
  • Leann Rimes on the ocean
  • Sunset spilling through your megaphone
  • It’s the most gorgeous cow I’d ever wanted
  • How come there’s peacocks in the face of the rainbow

You can also seed the first word of each generated line. Pick the seed words carefully and you can create a sort-of plausible-sounding stanza.

goat_speak <- markovify_text(
  markov_model = markov_model,
  maximum_sentence_length = NULL,
  output_column_name = 'goat_lyric',
  start_words = c("I", "And", "But", "So"),
  count = 1,
  tries = 100,
  only_distinct = TRUE,
  return_message = TRUE
)
## goat_lyric: I looked toward the same effect on a shiny black plastic tray
## goat_lyric: And the world back on
## goat_lyric: But when you were standing in the dining room
## goat_lyric: So this is what the point is

…or not.

I think John Darnielle probably remains the best generator of Mountain Goats lyrics for now.

Further reading

To learn more about the band:

Session info

## [1] "Last updated 2019-04-30"
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Locale: en_GB.UTF-8 / en_GB.UTF-8 / en_GB.UTF-8 / C / en_GB.UTF-8 / en_GB.UTF-8
## 
## Package version:
##   askpass_1.1       assertthat_0.2.0  base64enc_0.1.3  
##   BH_1.69.0.1       blogdown_0.11     bookdown_0.9     
##   cli_1.1.0         clipr_0.5.0       codetools_0.2-16 
##   compiler_3.5.2    crayon_1.3.4      curl_3.3         
##   digest_0.6.18     dplyr_0.8.0.1     emo_0.0.0.9000   
##   evaluate_0.13     fansi_0.4.0       furrr_0.1.0      
##   future_1.12.0     genius_0.0.1.0    globals_0.12.4   
##   glue_1.3.0.9000   graphics_3.5.2    grDevices_3.5.2  
##   grid_3.5.2        highr_0.8         hms_0.4.2        
##   htmltools_0.3.6   httpuv_1.5.0      httr_1.4.0       
##   jsonlite_1.6      knitr_1.22        later_0.8.0      
##   lattice_0.20-38   listenv_0.7.0     lubridate_1.7.4  
##   magrittr_1.5      markdown_0.9      markovifyR_0.101 
##   Matrix_1.2-16     methods_3.5.2     mime_0.6         
##   openssl_1.2.2     parallel_3.5.2    pillar_1.3.1     
##   pkgconfig_2.0.2   plogr_0.2.0       promises_1.0.1   
##   purrr_0.3.0       R6_2.4.0          Rcpp_1.0.1       
##   readr_1.3.1       reticulate_1.11.1 rlang_0.3.1      
##   rmarkdown_1.12    rstudioapi_0.9.0  rvest_0.3.2      
##   selectr_0.4.1     servr_0.13        spotifyr_2.1.0   
##   stats_3.5.2       stringi_1.4.3     stringr_1.4.0    
##   sys_3.1           tibble_2.1.1      tidyr_0.8.3      
##   tidyselect_0.2.5  tinytex_0.11      tools_3.5.2      
##   utf8_1.1.4        utils_3.5.2       xfun_0.5         
##   xml2_1.2.0        yaml_2.2.0

  1. Charlie Thompson, Josiah Parry, Donal Phipps and Tom Wolff (2019). spotifyr: R Wrapper for the ‘Spotify’ Web API. R package version 2.1.0. https://CRAN.R-project.org/package=spotifyr

  2. Josiah Parry (2019). genius: Easily Access Song Lyrics from Genius. R package version 0.0.1.0. https://CRAN.R-project.org/package=genius

  3. Alex Bresler (2017). markovifyR: R wrapper for the Markovify Python module. R package version 0.101. https://github.com/abresler/markovifyR