From Wikiversity

KnitR (knitr) is a package for the statistical software R that is commonly used from within RStudio, a graphical user interface for calling commands and scripts in R (see Wikipedia:Knitr for details). If learners can see the R code in the learning document, they can perform the corresponding activities in the statistics software on their own. Furthermore, for research publications in Wikiversity[1] readers can

  • reproduce the results,
  • learn from the methodology,
  • apply the R-code on their own data,
  • check whether the algorithms are appropriate for the experimental design.

[Figure: Knitr integration.png]

Workflow of KnitR

A KnitR document is a standard document, e.g. in Markdown, with R code chunks integrated into it. The code chunks can be regarded as R scripts that

  • load data,
  • perform data processing and
  • create output data (e.g. a descriptive analysis) or output graphics (e.g. a boxplot diagram).

Logical conditions implemented in R can provide text elements for the dynamic report that depend on the result of the statistical analysis. The following text fragment is an example:

   The Wilcoxon signed-rank test was applied as statistical comparison of the two dependent samples above. 
   In this case the calculated P-value was 0.56 and hence greater than the significance level (0.05 by default).
   This implies that "H0: there is no difference between the    
   results in data1 and data2" cannot be rejected. 

Depending on the R results (here 0.56), the text fragments are selected by logical conditions in the R script. If the P-value were lower than the significance level (e.g. 0.045 with the default level of 0.05), another appropriate text fragment would be inserted into the dynamic report. With this workflow, replacing the input data of the statistical or numerical analysis in R produces a reproducible report that applies the same methodology.
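
The selection of a text fragment by a logical condition could be sketched in R as follows. This is a minimal sketch, not the code behind the example report: the function name report_text, the exact wording of the sentences and the simulated samples data1 and data2 are assumptions made for illustration.

```r
# Turn a P-value into a report sentence (hypothetical helper, wording is illustrative).
report_text <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    sprintf(
      "The calculated P-value was %.3f and hence lower than the significance level (%.2f). H0 is rejected.",
      p_value, alpha
    )
  } else {
    sprintf(
      "The calculated P-value was %.3f and hence not lower than the significance level (%.2f). H0 cannot be rejected.",
      p_value, alpha
    )
  }
}

# Hypothetical paired samples, simulated only to make the sketch runnable.
set.seed(1)
data1 <- rnorm(30, mean = 10)
data2 <- data1 + rnorm(30, sd = 0.5)

# Wilcoxon signed-rank test for two dependent samples.
test <- wilcox.test(data1, data2, paired = TRUE)
cat(report_text(test$p.value), "\n")
```

In an R Markdown document the sentence could be placed in the running text with an inline expression such as `r report_text(test$p.value)`, so that new input data automatically select the matching text fragment.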

Wiki to Markdown Conversion with PanDoc

The open-source tool PanDoc has been called the "Swiss army knife" of document conversion. Assume we have a KnitR document of a scientific paper that contains the KnitR code chunks for processing the analysed data.

  • Convert the Markdown document of the paper into a MediaWiki document with the PanDoc online converter:
    • Create a sample document with the knitr package in RStudio and save the R Markdown file with the extension Rmd to your hard drive.
    • Copy the content of your R Markdown document into the PanDoc online converter,
    • select Markdown (pandoc) as input format,
    • select MediaWiki as output format,
    • press the Convert button and analyze the generated MediaWiki syntax of the text.
  • The R code chunks for the analysis of the data (e.g. loaded from a CSV file of a spreadsheet document) are converted into a <code> environment.
  • The converted KnitR document is stored together with the scientific paper in the WikiJournal (e.g. WikiJournal of Medicine). If new data are sampled in the same way, applying the KnitR document to the new data performs the analysis in the same algorithmic way. This KnitR approach contributes to a workflow for Reproducible Science.
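
The same conversion can also be run locally instead of using the online converter. The following is a minimal sketch that assumes the pandoc command-line tool is installed; the helper pandoc_args and the file names paper.md and paper.wiki are hypothetical placeholders.

```r
# Build the pandoc command-line arguments for a document conversion.
# Defaults convert Markdown to MediaWiki markup.
pandoc_args <- function(input, output, from = "markdown", to = "mediawiki") {
  c("--from", from, "--to", to, "--output", output, input)
}

args <- pandoc_args("paper.md", "paper.wiki")

# Run the conversion only if pandoc is installed and the input file exists.
# (A Markdown file can be produced from an Rmd file with knitr beforehand.)
if (nzchar(Sys.which("pandoc")) && file.exists("paper.md")) {
  system2("pandoc", args)
}
```

Swapping the formats, e.g. pandoc_args("page.wiki", "page.md", from = "mediawiki", to = "markdown"), gives the arguments for the opposite direction.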

Workflow for KnitR-Backend connected to Wikiversity/Wikipedia

The following workflow is currently not implemented in Wikiversity. This wiki resource explains the analogy to the KnitR workflow and the future benefits of a KnitR-like implementation for scientific publishing of dynamically generated learning resources in Wikiversity:

==My Section==
This is text in Wikiversity. Now we calculate the correlation of two vectors x and y dynamically. The correlation is 
<dyncalc language="R" src="">
   x <- rnorm(100)
   y <- 3*x + rnorm(100)
   cor(x, y) 
</dyncalc>
Now we create a scatter plot of the data 
<dyncalc language="R" src="">
   {r scatterplot, fig.width=8, fig.height=6}
   plot(x, y)
</dyncalc>
Now we create the text output depending on the analysis of the data.
<dyncalc language="R" src="">
  if (cor(x, y) > 0) {
    print("The correlation of x and y is a positive number.")
  }
</dyncalc>

After the Wikiversity document has been processed by the R backend specified in the src attribute, the result could be:

==My Section==
This is text in Wikiversity. Now we calculate the correlation of two vectors x and y dynamically. The correlation is 0.93612
Now we create a scatter plot of the data 
Now we create the text output depending on the analysis of the data.
The correlation of x and y is a positive number.

The proposed environment for dynamic calculations is placed in a tag environment, just like the math tag for mathematical expressions.

  • The content enclosed in the math tag is rendered for the output.
  • The content enclosed in the dyncalc tag is submitted to a backend for a calculation or for creating a scatter plot.

The enclosed code is real R code that works in R or RStudio. A workflow for R/KnitR can be found in the R tutorial by K. Broman[2]. The R code creates two vectors of random numbers and calculates their correlation. By default, the R code of a Wikiversity page should be processed only after the code has been modified, to limit the server load on the R backend. Another option is that learners download the source via an API and create the KnitR document offline on a mobile device. The Wikiversity community will decide whether this is an option to include in a learning environment for exploring the analysis of data and its interpretation.
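
How a backend might process such tags can be sketched in a few lines of R. This is only an illustration under stated assumptions: the dyncalc tag is a proposal and no such MediaWiki extension exists, and the function render_dyncalc is a hypothetical name. The sketch evaluates every chunk in one shared environment, so later chunks can use variables defined in earlier ones, and it handles text output only (no plots, no sandboxing of the submitted code).

```r
# Replace every <dyncalc ...>code</dyncalc> span by the value of its R code.
# Hypothetical sketch: text results only, no plots, no security sandbox.
render_dyncalc <- function(wikitext, digits = 5) {
  env <- new.env(parent = globalenv())            # shared state across chunks
  pattern <- "(?s)<dyncalc[^>]*>(.*?)</dyncalc>"  # (?s): "." also matches newlines
  repeat {
    m <- regexpr(pattern, wikitext, perl = TRUE)
    if (m == -1) break                            # no tag left
    chunk <- regmatches(wikitext, m)
    code <- sub(pattern, "\\1", chunk, perl = TRUE)
    value <- eval(parse(text = code), envir = env)
    start <- as.integer(m)
    end_pos <- start + attr(m, "match.length")
    wikitext <- paste0(
      substr(wikitext, 1, start - 1),
      format(value, digits = digits),
      substr(wikitext, end_pos, nchar(wikitext))
    )
  }
  wikitext
}

page <- paste0(
  "The correlation is <dyncalc language=\"R\" src=\"\">\n",
  "x <- c(1, 2, 3, 4)\ny <- c(2, 4, 5, 9)\ncor(x, y)\n</dyncalc>."
)
cat(render_dyncalc(page), "\n")
```

A production backend would additionally have to restrict what the submitted code may do, which is outside the scope of this sketch.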

  • The document language will be standard wiki markup, also known as wikitext or wikicode, which consists of the syntax and keywords used by the MediaWiki software to format a page (e.g. used in Wikipedia, Wikiversity, OLAT, ...).
  • R code chunks will be recognized and interpreted by an R backend, or a reference to a versioned R script will be inserted into the wiki document/article; any such reference leads to an update of the wiki article if the data, the script or the document is updated. Diagrams are still, e.g., PNG files in Wikimedia that are imported in the standard way that most authors in the wiki community know. The difference between a standard wiki document and a wiki document with R code chunks is that any update of the data or of the script calls the R script again and creates a new version of the output (diagrams as PNG files, numbers, dynamic text elements). The same concept is used for mathematical formulas in MediaWiki by TexVC[3] and by the Math extension for MediaWiki[4]: the LaTeX sources are parsed and converted into images or MathML that can be displayed in a browser.
  • SageMath is another potential candidate as a backend that performs the numerical and statistical analysis "on the fly" in a learning resource. The benefits are considerable for learners and authors alike: a learning task can be performed in SageMath by the learner, and diagrams and maps for recent events can be shown in the document without authors having to update diagrams, figures and statistical results with the most recent data again and again. The collection of software packages available within SageMath is large, and R is one package in the SageMath package list.
  • PanDoc could be used to convert wiki code to Markdown for processing with the KnitR package.
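
The idea of calling the R script again only when the script has changed could be sketched as follows. This is a minimal sketch under assumptions: the cache chunk_cache and the function run_if_changed are hypothetical names, the checksum covers the code only (not the input data), and results of random-number code would be reused from the cache as well.

```r
# Hypothetical cache: maps the MD5 checksum of a code chunk to its last output.
chunk_cache <- new.env()

run_if_changed <- function(code) {
  tmp <- tempfile(fileext = ".R")
  on.exit(unlink(tmp))
  writeLines(code, tmp)
  key <- unname(tools::md5sum(tmp))   # checksum identifies this version of the code

  cached <- get0(key, envir = chunk_cache, inherits = FALSE)
  if (!is.null(cached)) {
    return(list(output = cached, rerun = FALSE))   # unchanged code: reuse output
  }

  value <- eval(parse(text = code), envir = new.env())
  assign(key, format(value), envir = chunk_cache)
  list(output = format(value), rerun = TRUE)       # new or modified code: recompute
}

first  <- run_if_changed("sum(1:10)")   # evaluated
second <- run_if_changed("sum(1:10)")   # served from the cache
third  <- run_if_changed("sum(1:11)")   # modified code, evaluated again
```

To cover updated data as well, the checksum of the data file could be combined with the checksum of the code before the cache lookup.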

Learning Task

In the previous section the workflow of an integrated KnitR approach was elaborated. Because this concept is not yet implemented as an extension in MediaWiki, the workflow cannot be performed with code chunks for mathematical calculations directly in the MediaWiki of Wikiversity. But it is possible to learn about the workflow in general:

See also


  1. WikiJournal of Medicine - An open access journal with no publication costs. ISSN: 2002-4436. Frequency: continuous. Since: March 2014. Publisher: Wikimedia Foundation.
  2. Karl Broman, KnitR in a Nutshell (accessed 2017/08/14).
  3. Schubotz, M. (2013). Making math searchable in Wikipedia. arXiv preprint arXiv:1304.5475.
  4. Schubotz, M., & Wicke, G. (2014). Mathoid: Robust, scalable, fast and accessible math rendering for wikipedia. In Intelligent Computer Mathematics (pp. 224-235). Springer, Cham.
  5. Quantum Geographic Information System (QGIS) - Open Source Software Package for Linux, Windows, Mac (2017), LTR 2.18.11 (accessed 2017/08/14).

External links