R-Ladies organizes a multitude of inspiring and exciting Meetups worldwide. However, due to work, time zones, or other commitments, I can’t attend them all. Fortunately, since the pandemic began, many Meetup hosts have started to record webinars and post them online. This provides a great opportunity to catch up afterward on anything I’ve missed.
In addition to the Meetups, R-Ladies also maintains a variety of YouTube channels. The primary channel, R-Ladies Global, offers the largest collection of videos. Local chapters often have their individual accounts as well. To simplify the process of accessing videos from multiple chapters, I wanted a convenient tool for viewing an aggregated list of videos. I also thought it would be helpful to have the ability to search videos based on presenters, topics, and other criteria.
So, I set out to build a dashboard with {flexdashboard} that would display all this information. {flexdashboard} is built on R Markdown, which allowed me to use R to personalize a dashboard with custom colors and styles. And, of course, I used R to pull in the data. {tuber} provides access to the YouTube Data API v3 with R, a straightforward tool for getting the data I needed.
There were challenges along the way. Since Meetups happen fairly often, I hoped to use GitHub Actions to automatically refresh the dashboard every 24 hours. Unfortunately, I encountered difficulties due to the authentication options provided by {tuber}, which supports YouTube’s OAuth 2.0 authorization process. I reached out to my brother for help since we’d previously collaborated on an API wrapper in R. The project stalled as we tried unsuccessfully to get YouTube’s OAuth 2.0 flow to work with GitHub Actions.
We had a breakthrough when we realized that videos could be accessed relatively painlessly by skipping OAuth 2.0 and using an API key instead. After two pull requests to add API key authentication to {tuber}, we are delighted to announce that the GitHub version of {tuber} now contains functions that work well with GitHub Actions. You can find the finished (and automated!) dashboard here:
https://ivelasq.github.io/rladies-video-feed
The code for the dashboard is on GitHub in case you want to reuse it for another set of channels!
I originally used the {tidyRSS} package to pull the list of videos from YouTube. However, the YouTube RSS feed limits results to the latest 15 videos. The code is still available in the repo if you would like to take a look. This is a good option if you only want the most recent stream of content and would like to avoid setting up any Google credentials. Thanks to the R4DS channel and Tom Mock for pointing me in the right direction!
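If that simpler route suits you, here is a minimal sketch of the RSS approach (YouTube channel feeds follow the `videos.xml?channel_id=` URL format that also appears later in this post; the channel ID shown is R-Ladies Global’s):

```r
library(tidyRSS)

# YouTube publishes a per-channel RSS feed, limited to the latest 15 videos
feed_url <- paste0(
  "https://www.youtube.com/feeds/videos.xml?channel_id=",
  "UCDgj5-mFohWZ5irWSFMFcng" # R-Ladies Global
)

videos <- tidyfeed(feed_url)
```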
The rest of the post describes how we built the dashboard. We learned quite a bit in the process!
As is often the case with anything to do with Google, you must first get credentials. In this example, we were interested in an API key that would work with the YouTube Data API v3.
Now that you have an API key, you will want to save it in your R environment (`.Renviron`) using `usethis::edit_r_environ()` if working interactively, or in your repository secrets if using GitHub Actions. Storing the API key as an environment variable allows you to access it easily without the need to manually input it every time you run your code. GitHub Actions can interact with the API without manual intervention, and storing it as a repository secret keeps the key separate from your source code so it is not exposed to others.
To do this, follow the instructions below:

1. Install {httr2} and the GitHub version of {tuber}:

```r
# install.packages("pak") # Install {pak} if not yet installed
pak::pak(c("httr2", "soodoku/tuber")) # CRAN and GitHub syntax side by side!
```

2. Save your API key in `.Renviron`. Running the code below will prompt you to save the API key. Paste your API key into the pop-up window:

```r
tuber::yt_set_key()
```
Now your YouTube API key is saved as `YOUTUBE_KEY` in `.Renviron`!
Using an API key is considered relatively insecure since the key itself will be included in queries sent over the Internet and thus could be intercepted. If you are considering including an API key in a package or in public source code, we recommend using the secret management functions provided by {httr2}.
If you would like to encrypt your API key, you can do so using {httr2} and a few arguments we built into {tuber}. You can first create a {tuber} package key stored as `TUBER_KEY` with `tuber::yt_set_key(httr2::secret_make_key(), type = "package")` and an encrypted `YOUTUBE_KEY` with `tuber::yt_set_key(httr2::secret_encrypt("YOUR_UNENCRYPTED_API_KEY", key = "TUBER_KEY"))`. To retrieve a decrypted `YOUTUBE_KEY` in your code, you can then run `tuber::yt_get_key(decrypt = TRUE)`.
Please note that encrypting your API key will keep it safe as long as others don’t have your `TUBER_KEY` to decrypt it, but it will still need to be included unencrypted in queries to interact with the YouTube Data API.
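For a concrete picture of the round trip, here is a minimal sketch using {httr2} directly, assuming `TUBER_KEY` has already been created as described above:

```r
# Encrypt the raw API key with the package key stored in the TUBER_KEY env var...
encrypted <- httr2::secret_encrypt("YOUR_UNENCRYPTED_API_KEY", key = "TUBER_KEY")

# ...and decrypt it again when it is time to query the API
httr2::secret_decrypt(encrypted, key = "TUBER_KEY")
```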
Woohoo, you are credentialed! It’s time for the fun part — pulling and cleaning data. I did this in a script called `data-processing.R`.
In addition to {tuber}, here are other packages that will aid you in the cleaning process:
```r
library(tuber) # GitHub version, installed with pak::pak("soodoku/tuber")
library(readr)
library(dplyr)
library(stringr)
library(DT)
```
Create a spreadsheet with the channels you are interested in. I created this manually — I just searched on YouTube for all the R-Ladies channels I could find. To make this workflow as programmatic as possible, I only included the chapter name, thumbnail, and channel ID.
```r
dat <- read_csv("https://raw.githubusercontent.com/ivelasq/rladies-video-feed/main/r-ladies_channels.csv")

head(dat)
#>   chapter             image                                              id   
#>   <chr>               <chr>                                              <chr>
#> 1 R-Ladies Global     https://yt3.ggpht.com/ytc/AKedOLRM4FPiPcPBdFUmYWR… UCDg…
#> 2 R-Ladies Baltimore  https://yt3.ggpht.com/RQifC3qp_7AFeTc48-QV1i4WBNM… UC9U…
#> 3 R-Ladies Philly     https://yt3.ggpht.com/ytc/AKedOLShIKBvPUKqbvm_Cpu… UCPq…
#> 4 R-Ladies STL        https://yt3.ggpht.com/ytc/AKedOLSBA7rlz1yvbIJ7TVN… UCQ7…
#> 5 R-Ladies Sydney     https://yt3.ggpht.com/ytc/AKedOLQnaU-dJbb14j2RE6W… UCkr…
#> 6 R-Ladies Vancouver  https://yt3.ggpht.com/3yf0Zo8-VKffrG-dRT_Gs85xX_x… UCX5…
```
Some YouTube channels have ‘custom IDs’ like RLadiesGlobal. These won’t work in {tuber}; you need the original IDs. The best way I found to get this ID is to click on a video from a channel. Then, scroll down to the description and click the channel name from that video. The original ID will appear in the URL after `/channel/`. A good idea for a future Shiny app would be a way to pull this information from the YouTube API…
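Until then, here is a hypothetical helper along those lines (the `get_channel_id()` function below is mine, not part of {tuber} or the dashboard code) that sketches the lookup with {httr2} and the API’s search endpoint:

```r
library(httr2)

# Hypothetical helper: search the YouTube Data API for a channel by name
# and return the first match's original channel ID
get_channel_id <- function(term, key = Sys.getenv("YOUTUBE_KEY")) {
  resp <- request("https://www.googleapis.com/youtube/v3/search") %>%
    req_url_query(part = "snippet", type = "channel", q = term, key = key) %>%
    req_perform() %>%
    resp_body_json()
  resp$items[[1]]$snippet$channelId
}

get_channel_id("RLadiesGlobal")
```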
Now, create a few more columns with `dplyr::mutate()` that expand the URLs into HTML format:
```r
dat_urls <- dat %>%
  dplyr::mutate(
    feed = paste0("https://www.youtube.com/feeds/videos.xml?channel_id=", id),
    feed_url = paste0("yt:channel:", id),
    channel_image_url = paste0(
      "<img src='", image, "' alt='Hex Sticker for Chapter' width='40'></img>",
      " <a href='https://www.youtube.com/channel/", id, "' target='_blank'>",
      chapter,
      "</a><br>"
    )
  )
```
The {tuber} documentation describes the many functions available to you. Many of these require OAuth 2.0 authentication. Since I had only the channel IDs, I wanted to use `list_channel_videos()` to get a complete list of videos from the channels, which thankfully works with an API key!

For example, for the R-Ladies Global channel, you could run `list_channel_videos()` to get its (currently 165) videos:

```r
tuber::list_channel_videos("UCDgj5-mFohWZ5irWSFMFcng", max_results = 200, auth = "key")
```
What if you want the results for all the channels we have in our spreadsheet? I used a loop for this.
```r
dat_videos <- NULL

for (i in 1:nrow(dat_urls)) {
  tmp <-
    dat_urls[i, ]["id"] %>%
    dplyr::pull() %>%
    tuber::list_channel_videos(
      .,
      part = "snippet",
      config = list("maxResults" = 200),
      auth = "key"
    )
  dat_videos <- dplyr::bind_rows(dat_videos, tmp)
}
```
This loop starts with an empty `dat_videos`, pulls each channel ID from `dat_urls` (which contains the channel IDs), fetches that channel’s videos with `list_channel_videos()`, and binds each result into `dat_videos`.
The arguments in `list_channel_videos()` gave me all the columns I am interested in with `part = "snippet"`. Notice that there’s a default limit for the number of videos pulled from the API. I bumped that up a bit with the `maxResults` argument.
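As an aside, if you prefer iterating without an explicit loop, an equivalent sketch with {purrr} (not what the dashboard script uses) could look like this:

```r
# Equivalent to the loop above: fetch each channel's videos, then row-bind
dat_videos <- purrr::map_dfr(
  dat_urls$id,
  ~ tuber::list_channel_videos(
    .x,
    part = "snippet",
    config = list("maxResults" = 200),
    auth = "key"
  )
)
```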
Then, I brought back the URL info:
```r
dat_join <- dat_videos %>%
  dplyr::left_join(., dat_urls, by = dplyr::join_by(snippet.channelId == id))
```
This results in a lot of information for each video. I cleaned it up a bit so that the data frame contained only the relevant columns in HTML format.
```r
dat_dashboard <- dat_join %>%
  dplyr::mutate(
    video_url = paste0(
      "<a href='https://www.youtube.com/watch?v=", snippet.resourceId.videoId,
      "' target='_blank'>", snippet.title, "</a>"
    ),
    channel_url = paste0(
      "<img src='", image, "' alt='Hex Sticker for Chapter' width='40'></img>",
      "<a href='https://www.youtube.com/channel/", snippet.channelId,
      "' target='_blank'>", chapter, "</a>"
    ),
    date = as.Date(str_sub(snippet.publishedAt, 1, 10))
  ) %>%
  dplyr::arrange(desc(snippet.publishedAt)) %>%
  dplyr::select(date, chapter, channel_url, video_url, channel_image_url)
```
See the final data processing file on GitHub.
You have the YouTube data — time to create a pretty dashboard!
My flexdashboard started as R Markdown files often do: with a YAML header. I specified an orientation (columns) and added the link to the GitHub repository in the navigation bar.
```yaml
---
title: "R-Ladies YouTube Video Feed"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    navbar:
      - { icon: "fa-github", href: "https://github.com/ivelasq/rladies-video-feed", align: right }
---
```
If you’d like your dashboard to have a custom look, the {bslib} package is a great option. It can add different colors and fonts directly in the YAML header. Make sure to add `library(bslib)` in the actual code part of your `.Rmd` file. I used the R-Ladies style guide to fill out the rest of the YAML header:
```yaml
---
title: "R-Ladies YouTube Video Feed"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    navbar:
      - { icon: "fa-github", href: "https://github.com/ivelasq/rladies-video-feed", align: right }
    theme:
      version: 4
      bg: "#FDF7F7"
      fg: "#88398A" # purple
      primary: "#88398A" # purple
      base_font:
        google: "Lato"
---
```
Below the YAML header, add the packages you need. I used `source()` to read the `data-processing.R` script.
```r
library(flexdashboard)
library(bslib)

source("data-processing.R", local = knitr::knit_global())
```
This dashboard has a simple layout: just a sidebar and the main section. I recommend checking out the flexdashboard documentation to see all the layout options available to you.
This code builds out the sidebar section. I created a list of each R-Ladies chapter that I was able to find and arranged them by name. With `htmltools::HTML()`, the dashboard can render the URLs as HTML (the reason for all that manipulation earlier on). We learned that the `.noWS = "outside"` argument is crucial for deploying the dashboard with GitHub Actions. It omits extra whitespace around the HTML, which ensures that the dashboard is committed only when the number of videos has changed (rather than creating spurious commits each time data is pulled, even if the number of videos is unchanged).
````markdown
Channels {.sidebar}
-----------------------------------------------------------------------

The purpose of this dashboard is to provide a running feed of R-Ladies videos posted to YouTube. It is refreshed every 24 hours. Currently, the feed includes these channels:

```{r}
dat_join %>%
  dplyr::arrange(chapter) %>%
  dplyr::distinct(channel_image_url) %>%
  dplyr::pull() %>%
  htmltools::HTML(.noWS = "outside")
```
````
For the main body, I used {DT} to create a table for the information in our clean dataset.
````markdown
Column {data-width=900}
-----------------------------------------------------------------------

### By default, the list is sorted by latest video.

<style>
.dataTables_scrollBody {
    max-height: 100% !important;
}
</style>

```{r}
dat_dashboard %>%
  dplyr::select(-chapter, -channel_image_url) %>%
  DT::datatable(
    colnames = c("Date", "Channel", "Video"),
    filter = "top",
    escape = FALSE, # <1>
    height = "1000", # <2>
    elementId = "dashboard",
    options = list(columnDefs = list( # <3>
      list(className = "dt-middle", targets = "_all")
    ))
  )
```
````
1. `escape = FALSE` renders the URLs within the table as HTML.
2. `height = "1000"` makes it expand to the entire column height.
3. `options` aligns the text within columns (in this case, to be in the middle of the cell for all columns).[^1]

And that’s it! See the final dashboard code on GitHub.
Try it out — search “Shiny” to see any video with Shiny in the title, or “Ecuador” to see all the videos from R-Ladies Ecuador!
Great, we have a completed dashboard! Now, what if we want to keep it updated without manually re-knitting the dashboard? The next step is to create a GitHub Action. GitHub Actions are workflows that you create and configure to automate various tasks within your GitHub repositories. They are event-driven and can be triggered by actions such as code commits, pull requests, issue updates, or scheduled intervals.
GitHub Actions use YAML syntax and are saved as `.yml` or `.yaml` files within a folder called `.github/workflows/`. We can use GitHub Actions to create a bot that refreshes the dashboard on a regular basis. You can see an example workflow in the GitHub repository. H/T to @gvelasq and this amazing example in the r-lib/actions repo for inspiration.
To create a GitHub Action:

1. In your repository, create a folder called `.github`.
2. Within `.github`, add a folder called `workflows`.
3. Within `workflows`, create a file called `rladies-videos-bot.yaml`.
4. Add the workflow code below to `rladies-videos-bot.yaml`.
```yaml
on:
  schedule:
    - cron: '0 0 * * *' # <1>

name: rladies-videos-bot

jobs:
  rladies-videos-bot:
    runs-on: ubuntu-latest
    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
      YOUTUBE_KEY: ${{ secrets.YOUTUBE_KEY }}
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - uses: r-lib/actions/setup-pandoc@v2
      - uses: r-lib/actions/setup-r@v2
        with:
          use-public-rspm: true
      - uses: r-lib/actions/setup-r-dependencies@v2 # <2>
        with:
          extra-packages: soodoku/tuber # <3>
      - name: Render and commit dashboard
        run: |
          Rscript -e 'rmarkdown::render("index.Rmd")'
          git config --local user.name "$GITHUB_ACTOR"
          git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com"
          git add index.html
          git commit -m '[Bot] Render and commit dashboard' || echo "No changes to commit"
          git push origin || echo "No changes to commit"
```
1. The `cron` schedule triggers the workflow every 24 hours, rendering `index.Rmd` and committing the resulting `index.html` file only if there are any changes to commit. Matt Dray’s {dialga} package is helpful in building cron expressions (see the sketch after this list).
2. `r-lib/actions/setup-r-dependencies@v2` installs R packages that are declared in a `DESCRIPTION` file. In our case, since our RStudio project name is not a valid R package name due to the hyphens in rladies-video-feed, we created a `DESCRIPTION` file using `usethis::use_description(check_name = FALSE)` and added the dependencies to `Imports:` using `usethis::use_package()` for each required R package (a sketch of these steps follows the `Imports:` listing below).
3. To have `r-lib/actions/setup-r-dependencies@v2` install the GitHub version of {tuber}, we specified it with the `extra-packages` parameter.
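As a quick illustration of the {dialga} tip from the first callout, here is a sketch (assuming {dialga} is installed and `r2cron()`’s `minutes`/`hours` arguments):

```r
# Build the "at midnight, every day" schedule used in the workflow above
dialga::r2cron(minutes = 0, hours = 0)
#> [1] "0 0 * * *"
```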
The `Imports:` section of our `DESCRIPTION` file looks like this:
```
Imports:
    bslib,
    dplyr,
    DT,
    flexdashboard,
    htmltools,
    httr2,
    jsonlite,
    readr,
    rmarkdown,
    stringr
```
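For reference, a sketch of the {usethis} calls from the second callout:

```r
# Create a DESCRIPTION file even though the project name ("rladies-video-feed")
# is not a valid R package name
usethis::use_description(check_name = FALSE)

# Declare each dependency under Imports: (one call per package)
usethis::use_package("dplyr")
usethis::use_package("flexdashboard")
# ...and so on for the remaining packages
```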
Now that we have the workflow set up, we need to let GitHub know what our YouTube API key is so that it can be accessed each time the workflow is triggered.
1. In your GitHub repository, go to ‘Settings’, then ‘Secrets and variables’ > ‘Actions’.
2. Select ‘New repository secret’.
3. Name the secret `YOUTUBE_KEY` and in the ‘Secret’ textbox, paste your YouTube API key, then select ‘Add secret’.

Now your GitHub repository is configured to run the workflow script. Your GitHub Action is ready to go!
To host the dashboard, I used GitHub Pages. In the GitHub repository for your dashboard, go to ‘Settings’, then ‘Pages’. Choose the branch and folder of your flexdashboard’s `index.html` output file, click ‘Save’, and then you will have a URL to showcase your work.
Here is the final link for this dashboard: https://ivelasq.github.io/rladies-video-feed.
In summary, we have explored the powerful combination of {flexdashboard}, GitHub Actions (via r-lib/actions), and {tuber} to create an automated YouTube dashboard.
It took a while, but we got there! Many thanks to @gvelasq for the support in getting this across the finish line.
Give this tutorial a try and unlock the potential of automated dashboards for your favorite YouTube channels! I’d love to see what you create!
If you made it all the way to the end of this tutorial, I need to plug (as an employee of Posit, formerly RStudio): Posit Connect! Posit Connect is a platform that allows you to publish and share data products like Quarto docs, Shiny apps, and – you guessed it – dashboards made with {flexdashboard}!
Automating reports and dashboards is easy on Posit Connect. The scheduler is built into the interface and can be adjusted to your specifications. Never write another cron job again.
Liked this article? I’d love for you to retoot!
[^1]: I found out about this here: https://stackoverflow.com/questions/35749389/column-alignment-in-dt-datatable