The original document of this file is provided by and modified by Kenya Amano and Inhwan Ko. Visit the link if you want to see the full version of this file.

This post examines the features of R Markdown using knitr in RStudio 1.3. This combination of tools provides an exciting improvement in usability for reproducible analysis. Specifically, this post:

The post may be most useful if the source code and displayed post are viewed side by side. In some instances, I include a copy of the R Markdown in the displayed HTML, but most of the time I assume you are reading the source and post side by side.

Getting started

To work with R Markdown, if necessary:

To run the basic working example that produced this blog post:

To produce a PDF file, you need a TeX distribution.

  • Easy way: install the tinytex package with install.packages("tinytex"), then run tinytex::install_tinytex().
  • If you want a full TeX distribution: on Mac, install MacTeX; on Windows, install TeX Live.
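Before knitting to PDF, it can help to check whether a LaTeX engine is already available. The following is a minimal sketch in base R, assuming that a PATH lookup is a good enough heuristic; the variable names `engines` and `has_tex` are illustrative and not part of the original post.

```r
# Look for common LaTeX engines on the PATH; an empty string means "not found".
engines <- Sys.which(c("pdflatex", "xelatex", "lualatex"))
has_tex <- any(nzchar(engines))

if (has_tex) {
  found <- names(engines)[nzchar(engines)]
  message("LaTeX engine found: ", paste(found, collapse = ", "))
} else {
  # Nothing is installed here; this only points to the step described above.
  message("No LaTeX engine found; consider running tinytex::install_tinytex().")
}
```

If no engine is reported, the tinytex route described above is usually the quickest fix.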

Prepare for analyses

set.seed(1234)

#install.packages("lattice")
#install.packages("stargazer")
#install.packages("pander")

library(tidyverse)
library(lattice)
library(stargazer)
library(pander)

If you do not specify any chunk options, the warnings and messages produced while loading the packages are shown.

Basic console output

To insert an R code chunk, you can type it manually or just press Chunks - Insert chunks or use the shortcut key. This will produce the following code chunk:

Pressing tab when inside the braces will bring up code chunk options.
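To make the chunk syntax concrete, here is a hedged sketch that writes a tiny R Markdown file from R and prints it back. The chunk header has the form `{r label, option=value}`; the label basicconsole and the option echo=TRUE come from this post, while the file name chunk-demo.Rmd and the surrounding text are hypothetical.

```r
# Build the three-backtick fence programmatically so it can be shown here.
fence <- strrep("`", 3)

# Lines of a minimal R Markdown document with one labelled chunk.
rmd_lines <- c(
  "The mean is computed below.",
  "",
  paste0(fence, "{r basicconsole, echo=TRUE}"),  # chunk header: label, then options
  "x <- 1:10",
  "mean(x)",
  fence                                           # closing fence
)

# Write to a temporary file (hypothetical name) and display the result.
rmd_file <- file.path(tempdir(), "chunk-demo.Rmd")
writeLines(rmd_lines, rmd_file)
cat(readLines(rmd_file), sep = "\n")
```

Everything between the opening and closing fences is run as R code when the document is knit.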

The R code chunk labelled basicconsole is as follows:

```r
x <- 1:10
y <- round(rnorm(10, x, 1), 2)
df <- data.frame(x, y)
df
```

##     x     y
## 1   1 -0.21
## 2   2  2.28
## 3   3  4.08
## 4   4  1.65
## 5   5  5.43
## 6   6  6.51
## 7   7  6.43
## 8   8  7.45
## 9   9  8.44
## 10 10  9.11

The code chunk input and output are then displayed as follows:

x <- 1:10
y <- round(rnorm(10, x, 1), 2)
df <- data.frame(x, y)
df

Inline value can be shown as follows: The average of y is 4.733.
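In the R Markdown source, an inline value is written as a backtick-delimited expression that starts with r. As a sketch, assuming the inline expression is something like round(mean(y), 3) and using the y values printed in the tables later in this post, the number above can be reproduced as follows:

```r
# y as printed in the tables later in this post
y <- c(2.1, 1.52, 2.29, 3.5, 3.37, 4.83, 4.82, 6.66, 8.71, 9.53)

# The value an inline expression such as round(mean(y), 3) would insert
avg_y <- round(mean(y), 3)
avg_y  # 4.733
```

Because y is drawn with rnorm(), the value changes whenever the chunk that creates y is rerun without a fixed seed.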

R Code chunk features

Create Markdown code from R

Frequently used chunk options

| Option | Description |
|---|---|
| include | If FALSE, knitr will run the chunk but not include the chunk in the final document. |
| echo | If FALSE, knitr will not display the code in the code chunk above its results in the final document. |
| error | If FALSE, knitr will not display error messages generated by the code; knitting stops at the first error instead. |
| message | If FALSE, knitr will not display any messages generated by the code. |
| warning | If FALSE, knitr will not display any warning messages generated by the code. |
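The message, warning, and error options each act on a different kind of R condition. The sketch below uses base R only, outside knitr, to show the three kinds side by side; the function `noisy` and the variables `seen`, `value`, and `caught` are hypothetical names chosen for illustration.

```r
# A function that signals a message and a warning, then returns a value.
noisy <- function() {
  message("loading data")      # a message condition
  warning("value looks odd")   # a warning condition
  42
}

# Record which conditions were signalled, then silence them.
seen <- character(0)
value <- withCallingHandlers(
  noisy(),
  message = function(m) {
    seen <<- c(seen, "message")
    invokeRestart("muffleMessage")
  },
  warning = function(w) {
    seen <<- c(seen, "warning")
    invokeRestart("muffleWarning")
  }
)

# An error, unlike the other two, stops evaluation unless it is caught.
caught <- tryCatch(
  stop("cannot continue"),
  error = function(e) conditionMessage(e)
)
```

Messages and warnings leave the result intact, which is why hiding them is usually harmless, whereas an error means the chunk produced no result at all.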

Recommendation for Homework

| Option | HW setting |
|---|---|
| include | TRUE |
| echo | TRUE |
| error | FALSE |
| message | FALSE |
| warning | FALSE |

Echo and Results

The following code hides the command input (i.e., echo=FALSE) and writes the output directly into the document as Markdown (i.e., results='asis').

Here are some dot points

  • The value of y[1] is 2.1
  • The value of y[2] is 1.52
  • The value of y[3] is 2.29

This code includes the command input (i.e., echo=TRUE) and leaves results at its default value, 'markup', so the output is shown as console output:

cat(paste("* The value of y[", 1:3, "] is ", y[1:3], sep="", collapse="\n"))
## * The value of y[1] is 2.1
## * The value of y[2] is 1.52
## * The value of y[3] is 2.29

Message and Warning

A chunk without any options specified shows all warnings and messages:

df %>% 
  summarize_at(vars(y), funs(mean))
## Warning: `funs()` was deprecated in dplyr 0.8.0.
## Please use a list of either functions or lambdas: 
## 
##   # Simple named list: 
##   list(mean = mean, median = median)
## 
##   # Auto named with `tibble::lst()`: 
##   tibble::lst(mean, median)
## 
##   # Using lambdas
##   list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))

With warning=FALSE set on the chunk, the same code does not output warnings:

df %>% 
  summarize_at(vars(y), funs(mean))

So, I recommend having this chunk at the very beginning of your RMarkdown when submitting an assignment:

knitr::opts_chunk$set(include = TRUE, echo = TRUE, error = FALSE, message = FALSE, warning = FALSE)

and modify manually whenever necessary.

Cache analysis

Caching analyses is straightforward. Here is some example code. On the first run on my computer, it took about 10 seconds; on subsequent runs, the code was not rerun.

If you want to rerun cached code chunks, just delete the contents of the cache folder.

```r
for (i in 1:5000) {
    lm((i+1)~i)
}
```
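knitr handles caching for you when a chunk is marked as cached, but the underlying idea can be sketched by hand in base R. This is only an illustration under stated assumptions, not how knitr implements its cache: the helper `cached`, the function `slow_fit`, and the file name slow-fit.rds are hypothetical, and the built-in cars data stands in for a real analysis.

```r
# Hypothetical helper: compute once, save to disk, and reuse on later calls.
cached <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))  # cache hit: skip the computation
  }
  result <- compute()      # cache miss: do the work
  saveRDS(result, path)
  result
}

cache_file <- file.path(tempdir(), "slow-fit.rds")
unlink(cache_file)  # start from an empty cache

# A stand-in for a slow analysis step.
slow_fit <- function() {
  Sys.sleep(0.2)
  coef(lm(dist ~ speed, data = cars))
}

first  <- cached(cache_file, slow_fit)  # runs slow_fit() and saves the result
second <- cached(cache_file, slow_fit)  # read back from disk, no refit
identical(first, second)
```

Deleting the file, like deleting the contents of the cache folder, forces the computation to run again.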

Basic markdown functionality

For those not familiar with standard Markdown, the following may be useful. See the source code for how to produce such points. However, RStudio does include a Markdown quick reference button that adequately covers this material.

Dot Points

Simple dot points:

  • Point 1
  • Point 2
  • Point 3

and numeric dot points:

  1. Number 1
  2. Number 2
  3. Number 3

and nested dot points (2 tabs):

  • A
    • A.1
    • A.2
  • B
    • B.1
    • B.2

Or use LaTeX functions like

Equations

Equations are included by using LaTeX notation and including them either between single dollar signs (inline equations) or double dollar signs (displayed equations). If you hang around the Q&A site CrossValidated you’ll be familiar with this idea.

There are inline equations such as \(y_i = \alpha + \beta x_i + e_i\).

And displayed formulas:

\[\frac{1}{1+\exp(-x)}\]

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

\[ \begin{split} X & = (x+a)(x-b) \\ & = x(x-b) + a(x-b) \\ & = x^2 + x(a-b) - ab \end{split} \]

\[ \mathbf{X} = \left( \begin{matrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ x_{31} & x_{32} \\ \end{matrix} \right) \]

More info: LaTeX wiki

Chris suggests not using dollar signs when writing mathematical notation, because TeX often misinterprets dollar signs when producing the output and returns an error. However, if you are not sure whether you used the right LaTeX code, you can temporarily wrap the notation in dollar signs to preview its output, then remove them before knitting the R Markdown file. Instead, you can wrap your LaTeX mathematical notation in \begin{equation} ~ \end{equation} or \begin{eqnarray} ~ \end{eqnarray}.

\begin{equation} \mathbf{X} = \left( \begin{matrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ x_{31} & x_{32} \\ \end{matrix} \right) \end{equation}

Tables

Tables can be included using the following notation:

| A | B | C |
|---|---|---|
| 1 | Male | Blue |
| 2 | Female | Pink |

Or, if you want to show nice regression tables, you can use stargazer:

Mod1 <- y ~ x 
Res1 <- 
  lm(formula = Mod1,
     data = df)

Mod2 <- y ~ I(x^2)  # I() is needed: in a formula, x^2 alone is the same as x
Res2 <- 
  lm(formula = Mod2,
     data = df)
stargazer(Res1, Res2)
## % Table created by stargazer v.5.2.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu
## % Date and time: Mon, Oct 04, 2021 - 3:36:11 PM
#For html
#stargazer(Res1, Res2, type = "html")
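One detail worth knowing when specifying models like the ones above: inside an R formula, `^` is a formula operator rather than arithmetic, so a squared term is usually wrapped in I(). The following small check uses the built-in cars data rather than the df from this post; the object names are illustrative.

```r
fit_linear <- lm(dist ~ speed, data = cars)
fit_caret  <- lm(dist ~ speed^2, data = cars)     # `^` is formula crossing here
fit_square <- lm(dist ~ I(speed^2), data = cars)  # I() protects the arithmetic

names(coef(fit_caret))   # "(Intercept)" "speed"
names(coef(fit_square))  # "(Intercept)" "I(speed^2)"

# Without I(), the "squared" model is just the linear model again.
same_fit <- isTRUE(all.equal(unname(coef(fit_linear)), unname(coef(fit_caret))))
same_fit
```

Only the I(speed^2) version actually regresses on the squared predictor.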

More info: Cheat Sheet

If you want to create a fancy table from a data.frame, you can use the pander package.

Table <- 
df %>% 
  mutate(z = if_else(y>5, 1, 0)) %>% 
  t()

Table  
##   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## x  1.0 2.00 3.00  4.0 5.00 6.00 7.00 8.00 9.00 10.00
## y  2.1 1.52 2.29  3.5 3.37 4.83 4.82 6.66 8.71  9.53
## z  0.0 0.00 0.00  0.0 0.00 0.00 0.00 1.00 1.00  1.00

With pander

Table  %>% 
  pander(caption ="Fancy Table")
Fancy Table
x 1 2 3 4 5 6 7 8 9 10
y 2.1 1.52 2.29 3.5 3.37 4.83 4.82 6.66 8.71 9.53
z 0 0 0 0 0 0 0 1 1 1