KnitR/Use KnitR

From Wikiversity

Use of KnitR

We explain how to create a knitR document using a simple example. We go through the example part by part and explain each new feature as it appears. A knitR document basically consists of two types of text:

  • Text, which can be formatted with R Markdown, a markup language quite similar to the MediaWiki markup used to format Wikiversity articles
  • R code snippets, which consist of R code that is executed when the document is rendered

The first part of our sample document looks like this:

---
title: "Descriptive Statistics of 10000 dice rolls - a simple KnitR example"
author: "Martin Papke"
date: "22 August 2018"
output: pdf_document
---

At the start of every knitR document, we specify a title and an author. We also have to tell the interpreter in which format knitR should produce the output; here we create a PDF document. Other possible outputs include Word and HTML documents.

Next, we load the packages we need in an R code snippet. Code snippets are separated from the text parts by ```.

 
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(knitr)
library(readr)
library(dplyr)
library(ggplot2)
```

At the beginning of a code snippet, we specify in curly brackets the language used (here: R) and a name for the snippet. We can add extra options such as include=FALSE, which keeps this snippet and its output out of the rendered document (the code is still executed). By default, code snippets are included.
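
The effect of include=FALSE can be checked from an R session. The following is a minimal sketch, assuming the knitr package is installed; the two-snippet document in src and the snippet names hidden and shown are made up for illustration. The source is passed to knitr's knit() function through its text argument, so the knitted Markdown comes back as a string instead of being written to a file.

```r
library(knitr)

# Hypothetical two-snippet document, one line per element.
src <- c(
  "```{r hidden, include=FALSE}",
  "x <- 6 * 7",
  "```",
  "",
  "```{r shown}",
  "x",
  "```"
)

# With text= given and no output file, knit() returns the result as a string.
out <- knit(text = src, quiet = TRUE)
cat(out, sep = "\n")
```

The hidden snippet is still executed, so x is available to the second snippet, but neither its code nor its output appears in out.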

# A simple KnitR example

## Data import
[...]

Headings are marked with # for level 1 and ## for level 2.

## Statistics

Now we can do some statistics 
```{r statistics}
  dicemean <- mean(dice)
  dicemedian <- median(dice)
```
So, the mean of our dice throws is $\bar x = `r dicemean`$ and the median is `r dicemedian`. We 
now count the absolute frequencies of the dice results: 
```{r statistics2}
  dicetable <- table(dice)
```

This part shows that LaTeX markup can be used in the text to present mathematical formulae. We also see how to include the result of an R command in the document, namely by writing `r command`. In this way, the results of calculations can become part of the document, as shown with the mean and the median above.
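
Inline results can also be tried out without a full document. The following is a minimal sketch, assuming the knitr package is installed; the vector throws and the snippet name data are hypothetical and stand in for the dice data of the sample document. It knits a short text with two inline expressions and returns the result as a string.

```r
library(knitr)

# Hypothetical mini document: a hidden snippet defines the data,
# the text line below it uses two inline R expressions.
src <- c(
  "```{r data, include=FALSE}",
  "throws <- c(1, 2, 6, 3)",
  "```",
  "",
  "The mean is `r mean(throws)` and the median is `r median(throws)`."
)

# knit() evaluates the inline expressions and substitutes their values.
out <- knit(text = src, quiet = TRUE)
cat(out, sep = "\n")
```

In the returned text, each `r ...` expression is replaced by the value it evaluates to.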

## Plots
In KnitR, plots can be placed directly into the document; just call the usual R plot command 
```{r plot}
  xy <- data.frame(dicetable)
  ggplot(data=xy, aes(x=dice, y=Freq)) + geom_bar(stat="identity")
```

The plot produced by this ordinary R code is inserted into the output document, as shown above.
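
The same plot can be tried interactively outside of a knitR document. The following is a minimal sketch, assuming ggplot2 is installed. Since the data import is not shown above, the throws are simulated here with sample(); this is an assumption and not the original data source.

```r
library(ggplot2)

# Assumption: simulate 10000 throws of a fair die instead of importing data.
set.seed(42)
dice <- sample(1:6, size = 10000, replace = TRUE)

# Absolute frequencies; data.frame() turns the table into the
# columns dice and Freq.
dicetable <- table(dice)
xy <- data.frame(dicetable)

# Bar chart of the frequencies; geom_col() is shorthand for
# geom_bar(stat = "identity").
p <- ggplot(data = xy, aes(x = dice, y = Freq)) + geom_col()
print(p)
```

The column names dice and Freq used in aes() come from the conversion of the table to a data frame, which is why the plot call in the sample document can refer to them.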

A statistical example - checking for independence

We will reuse our dice data to check whether the even-numbered and the odd-numbered throws are independent. After loading the data as above, we use R's built-in $\chi^2$-test to check for independence by invoking

  # test the contingency table 
  chi <- chisq.test(tbl)

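Now we can check whether the result is (highly) significant by looking at the $p$-value. Independence is the null hypothesis of the $\chi^2$-test, so a small $p$-value is evidence against independence:

 
  # compare the p-value with the usual significance levels
  p <- chi$p.value
  if (p < 0.01) {
    "highly significant evidence against independence"
  } else if (p < 0.05) {
    "significant evidence against independence"
  } else {
    "no significant evidence against independence"
  }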
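
The contingency table tbl is not constructed in the text above. One way it could be built is sketched below, under two assumptions that are not from the original: the throws are simulated with sample(), and "odd" and "even" refer to the position of a throw in the sequence, so each odd-numbered throw is paired with the even-numbered throw that follows it.

```r
# Assumption: simulated throws stand in for the imported dice data.
set.seed(42)
dice <- sample(1:6, size = 10000, replace = TRUE)

# Pair throw 1 with throw 2, throw 3 with throw 4, and so on.
odd  <- dice[seq(1, length(dice), by = 2)]
even <- dice[seq(2, length(dice), by = 2)]

# 6 x 6 contingency table of the paired results.
tbl <- table(odd, even)

# Pearson's chi-squared test; the null hypothesis is independence.
chi <- chisq.test(tbl)
chi$parameter  # degrees of freedom
chi$p.value
```

For a 6 x 6 table the test has (6 - 1) * (6 - 1) = 25 degrees of freedom, which chi$parameter reports alongside the p-value.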