Supporting R Markdown with Jekyll and knitr

Blogging with Jekyll and Markdown is good, but sometimes R Markdown is necessary.

Based on the following two posts, I figured out how to support R Markdown on GitHub Pages. The basic idea is to use knitr to convert R Markdown files into Jekyll-friendly Markdown files.

  1. Blogging with Jekyll and R Markdown using knitr by Andrew
  2. Publishing R Markdown using Jekyll by chepec

The first post is adapted from the second, so they share the same basic idea. I made only minor changes on top of their work; all the credit goes to them.

I also added a few extras, such as automatically rerunning the bash script whenever an R Markdown file changes, even on Windows.

Use R script to call knitr

Here are the steps:

  1. Create a directory called _Rmd at the root level of the Jekyll directory to store all R Markdown files. Also create an R script in _Rmd called render_post.R; it is adapted from the first post and shown below.
  2. Configure the paths for each directory accordingly, for example, posts.path is _posts.
  3. Create an R Markdown post under _Rmd, such as 2017-10-23-test-markdown.Rmd. At the beginning of this file, remember to add proper front matter for Jekyll.
  4. Run KnitPost to convert the files. For convenience, the next section introduces a bash script that converts _Rmd/*.Rmd to _posts/*.md.
# render_post.R
# R script to convert RMarkdown into Jekyll markdown
# Credit: http://brooksandrew.github.io/simpleblog/articles/blogging-with-r-markdown-and-jekyll-using-knitr/

KnitPost <- function(site.path='/pathToYourBlog/', overwriteAll=F, overwriteOne=NULL) {
  if(!'package:knitr' %in% search()) suppressWarnings(library(knitr))

  ## Blog-specific directories.  This will depend on how you organize your blog.
  site.path <- site.path # directory of jekyll blog (including trailing slash)
  rmd.path <- paste0(site.path, "_Rmd") # directory where your Rmd-files reside (relative to base)
  fig.dir <- "assets/Rfig/" # directory to save figures
  posts.path <- paste0(site.path, "_posts") # directory for converted markdown files
  cache.path <- paste0(site.path, "_cache") # necessary for plots
  
  render_jekyll(highlight = "pygments")
  opts_knit$set(base.url = '/', base.dir = site.path)
  opts_chunk$set(fig.path=fig.dir, fig.width=8.5, fig.height=5.25, dev='svg', cache=F, 
                 warning=F, message=F, cache.path=cache.path, tidy=F)   
  
  # setwd(rmd.path) # setwd to base
  
  # some logic to help us avoid overwriting already existing md files
  files.rmd <- data.frame(rmd = list.files(path = rmd.path,
                                full.names = T,
                                pattern = "\\.Rmd$",
                                ignore.case = T,
                                recursive = F), stringsAsFactors=F)
  files.rmd$corresponding.md.file <- paste0(posts.path, "/", basename(gsub(pattern = "\\.Rmd$", replacement = ".md", x = files.rmd$rmd)))
  files.rmd$corresponding.md.exists <- file.exists(files.rmd$corresponding.md.file)
  
  ## determining which posts to overwrite from parameters overwriteOne & overwriteAll
  files.rmd$md.overwriteAll <- overwriteAll
  if(is.null(overwriteOne)==F) files.rmd$md.overwriteAll[grep(overwriteOne, files.rmd[,'rmd'], ignore.case=T)] <- T
  files.rmd$md.render <- F
  for (i in seq_len(nrow(files.rmd))) {  # seq_len() is safe when _Rmd has no Rmd files
    if (files.rmd$corresponding.md.exists[i] == F) {
      files.rmd$md.render[i] <- T
    }
    if ((files.rmd$corresponding.md.exists[i] == T) && (files.rmd$md.overwriteAll[i] == T)) {
      files.rmd$md.render[i] <- T
    }
  }
  
  # For each Rmd file, render markdown (contingent on the flags set above)
  for (i in seq_len(nrow(files.rmd))) {  # seq_len() is safe when _Rmd has no Rmd files
    if (files.rmd$md.render[i] == T) {
      out.file <- knit(as.character(files.rmd$rmd[i]), 
                      output = as.character(files.rmd$corresponding.md.file[i]),
                      envir = parent.frame(), 
                      quiet = T)
      message(paste0("KnitPost(): ", basename(files.rmd$rmd[i])))
    }     
  }
}
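
Steps 1 and 3 above can also be scripted. The following is a minimal shell sketch, not something from the two original posts: new_rmd_post is a hypothetical helper name, and the layout: post front matter assumes your theme provides a layout called post.

```shell
#!/bin/bash
# Sketch: scaffold _Rmd/ and a new R Markdown post with Jekyll front matter.
# new_rmd_post is a hypothetical helper, not part of Jekyll or knitr.
# Usage: new_rmd_post SITE_DIR YYYY-mm-dd SLUG TITLE
new_rmd_post() {
  local site="$1" date="$2" slug="$3" title="$4"
  local file="$site/_Rmd/$date-$slug.Rmd"

  # step 1: make sure the _Rmd directory exists
  mkdir -p "$site/_Rmd"

  # refuse to clobber an existing draft
  if [ -e "$file" ]; then
    echo "already exists: $file" >&2
    return 1
  fi

  # step 3: write the front matter (layout name is an assumption)
  cat > "$file" <<EOF
---
layout: post
title: "$title"
---

Write the post here, with R code in knitr chunks.
EOF

  # print the path of the new file
  echo "$file"
}
```

For example, new_rmd_post . 2017-10-23 test-markdown "Test Markdown" would create _Rmd/2017-10-23-test-markdown.Rmd under the current directory.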

Use bash script to call R script

I created a bash script (called convert_rmd.sh) under the root level of the Jekyll directory, which is adapted from the second blog.

It can convert a specific R Markdown file under _Rmd/ to Jekyll markdown under _posts/:

$ ./convert_rmd.sh _Rmd/YYYY-mm-dd-title.Rmd

Alternatively, it can convert all files under _Rmd/ to _posts/:

$ ./convert_rmd.sh --all

To support Rscript with Cygwin on Windows, I added a platform check at the bottom. Customize it with your own path to Rscript.exe.

Update: I found that Rscript generates different line endings and figures on Unix and Windows. I recommend Unix-style line endings, to be consistent with GitHub Pages. Otherwise, the carriage return (\r) from Windows causes an extra line break in code blocks when they are rendered on GitHub Pages.

#!/bin/bash
# Credit: adapted from https://chepec.se/2014/07/16/knitr-jekyll

function show_help {
  echo "Usage: convert_rmd.sh [filename.Rmd | --all] ..."
  echo "Knit posts, convert Rmd to jekyll blog"
  echo "<filename.Rmd>  convert a specific _Rmd/*.Rmd file to _posts/*.md (overwrite existing md)"
  echo "--all           convert all _Rmd/*.Rmd files to _posts/*.md (overwrite existing md)"
}

if [ $# -eq 0 ] ; then
  # no args at all? show help
  show_help
  exit 0
fi

sitepath="./"
cmd="source('./_Rmd/render_post.R')"
if [ "$1" = "--all" ]; then
  echo "convert all _Rmd/*.Rmd to _posts/*.md"
  cmd="$cmd; KnitPost(site.path='$sitepath', overwriteAll=T)"
else
  rmdfile=$1
  cmd="$cmd; KnitPost(site.path='$sitepath', overwriteOne='$rmdfile')"
fi

# determine Rscript for different platforms; in particular, for Cygwin on Windows
case "$(uname -s)" in
   Darwin|Linux)
     # echo 'Mac OS X or Linux'
     Rscript -e "$cmd"
     ;;
   CYGWIN*|MINGW32*|MSYS*)
     # echo 'Windows'
     /cygdrive/c/'Program Files'/R/R-3.3.0/bin/Rscript.exe -e "$cmd"
     echo "Warning: remember to convert line endings to Unix style before publishing"
     ;;
   *)
     echo 'other OS' 
     ;;
esac
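
The line-ending conversion mentioned in the update above can be done from the same shell. This is a minimal sketch under my own assumptions: to_unix_eol is a hypothetical helper name, and it simply deletes every carriage return with tr, which is acceptable for Markdown output but not for binary files such as PNG figures.

```shell
#!/bin/bash
# Sketch: rewrite files in place with Unix (LF) line endings.
# to_unix_eol is a hypothetical helper name, not part of the original script.
# Usage: to_unix_eol FILE...
to_unix_eol() {
  local f tmp
  for f in "$@"; do
    tmp="$(mktemp)" || return 1
    # delete every carriage return, then copy the result back over the original
    if tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"; then
      rm -f "$tmp"
    else
      rm -f "$tmp"
      return 1
    fi
  done
}
```

One option is to call to_unix_eol _posts/*.md in the Windows branch of the case statement, right after the Rscript.exe line.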

Auto-rerun when .Rmd file changes

It would be cumbersome to rerun the above script manually every time an R Markdown file changes.

Also, if the script reran automatically, we could view the generated HTML in the local browser immediately.

To achieve this, I use a command called when-changed. It is cross-platform and works well on Windows; other possible solutions are nodemon or inotifywait, based on this question.

# install when-changed
$ pip install https://github.com/joh/when-changed/archive/master.zip

# Usage: when-changed FILE -c COMMAND (watch FILE changes and exec COMMAND) 
$ when-changed _Rmd/<filename>.rmd -c bash convert_rmd.sh _Rmd/<filename>.rmd

# assume that Jekyll is also running in a different tab
$ bundle exec jekyll serve --livereload  # Jekyll v3.7+

With this setup, an R Markdown file is automatically converted to _posts/*.md, compiled to HTML, and live-reloaded in the browser (given Jekyll v3.7+).
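
If installing when-changed is not an option, a plain polling loop can approximate the same behaviour. The sketch below is my own fallback, not how when-changed works internally: file_sum and watch_file are hypothetical helper names, and the loop compares cksum output once per second.

```shell
#!/bin/bash
# Sketch: poll a file and run a command whenever its contents change.
# This is a simple polling loop, not a reimplementation of when-changed.

# Print a checksum line for FILE (prints nothing if the file is missing).
file_sum() {
  cksum "$1" 2>/dev/null
}

# Usage: watch_file FILE COMMAND [ARGS...]
# Loops until interrupted with Ctrl-C.
watch_file() {
  local file="$1"
  shift
  local last now
  last="$(file_sum "$file")"
  while true; do
    sleep 1
    now="$(file_sum "$file")"
    if [ "$now" != "$last" ]; then
      last="$now"
      # run the command that was passed after FILE
      "$@"
    fi
  done
}
```

For example, watch_file _Rmd/<filename>.Rmd bash convert_rmd.sh _Rmd/<filename>.Rmd reruns the conversion roughly one second after each save.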

Add figure prefix by setting fig.path for each post

So far, R Markdown files render correctly, including figures generated by ggplot2. However, each figure is named after the code chunk that produced it. For example, a code chunk {r plot_residual} has the chunk name “plot_residual”, so its figures are named “plot_residual-1.svg”, “plot_residual-2.svg”, and so on.

With multiple R Markdown files, two code chunks are likely to share a name, and since all figures are saved under assets/Rfig/, their files would collide. We have to distinguish the figures from different files by adding a prefix to each figure.

The fix is straightforward. Based on Blogging About R Code with R Markdown, Knitr, and Jekyll by Nicole White, we just need to put the following code at the beginning of each post:

# for post 1:
knitr::opts_chunk$set(fig.path='assets/Rfig/title-of-post-1-')
# for post 2:
knitr::opts_chunk$set(fig.path='assets/Rfig/title-of-post-2-')

Since each prefix is distinct, the generated figures have distinct names even when chunk names are duplicated.
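
To avoid typing the prefix by hand, it can be derived from the post's filename. This is only a sketch of one possible naming convention: fig_prefix is a hypothetical helper, and it assumes figures live under assets/Rfig/ as configured in render_post.R.

```shell
#!/bin/bash
# Sketch: derive a per-post figure prefix from an Rmd filename.
# fig_prefix is a hypothetical helper; assets/Rfig/ matches fig.dir in render_post.R.
# Usage: fig_prefix PATH_TO_RMD
fig_prefix() {
  local name
  name="$(basename "$1")"
  # drop the extension (.Rmd or .rmd)
  name="${name%.*}"
  # drop a leading YYYY-mm-dd- date if there is one
  name="$(printf '%s\n' "$name" | sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2}-//')"
  printf 'assets/Rfig/%s-\n' "$name"
}
```

Running fig_prefix _Rmd/2017-10-23-test-markdown.Rmd prints assets/Rfig/test-markdown-, which can then be pasted into the knitr::opts_chunk$set(fig.path=...) call at the top of that post.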

Example

That is all the setup. Here is what an R Markdown file looks like after being converted to a Jekyll page. The code snippets are copied from example-r-markdown.rmd.

Prepare for analyses

set.seed(1234)
library(ggplot2)
library(lattice)

Basic console output

The code chunk's input and output are then displayed as follows:

x <- 1:10
y <- round(rnorm(10, x, 1), 2)
df <- data.frame(x, y)
df
##     x     y
## 1   1 -0.21
## 2   2  2.28
## 3   3  4.08
## 4   4  1.65
## 5   5  5.43
## 6   6  6.51
## 7   7  6.43
## 8   8  7.45
## 9   9  8.44
## 10 10  9.11

ggplot2 plot

ggplot2 plots work well:

qplot(x, y, data=df)

[Figure: plot of chunk ggplot2example]