KnitR/Wikiversity Integration

From Wikiversity

Introduction

The following considerations focus on possible use cases for Wikiversity articles that include R code or Octave code for processing data. Transparency and reproducibility of scientific methods can be achieved by retrieving the article from Wikiversity and executing its methods on the collected data on a local computer. Because the data is processed on the local machine, the privacy of the data is preserved. The methods can be applied using the code defined in the scientific paper, e.g. in the WikiJournal of Medicine.

Learning Task

  • Analyze COVID-19 and capacity building for data analysis with Wikiversity articles. Peer-reviewed journal articles are provided as references and as evidence for the applied method. What is an appropriate design for scientific papers published in WikiJournal that allows readers to replicate the analysis with the data collected in their own study?

Workflow for KnitR-Backend connected to Wikiversity/Wikipedia

The following workflow is not currently implemented in Wikiversity. This wiki resource describes possible workflows by analogy with KnitR, and the future benefits of a KnitR-like implementation for scientific publishing of dynamically generated learning resources in Wikiversity:

==My Section==
This is text in Wikiversity. Now we calculate the correlation of two vectors x and y dynamically. The correlation is 
<dyncalc language="R" src="https://r-backend.example.com">
   x <- rnorm(100)
   y <- 3*x + rnorm(100)
   cor(x, y) 
</dyncalc>
Now we create a scatter plot of the data 
<dyncalc language="R" src="https://r-backend.example.com">
   {r scatterplot, fig.width=8, fig.height=6}
   plot(x,y)
</dyncalc>
Now we create the text output depending on the analysis of the data.
<dyncalc language="R" src="https://r-backend.example.com">
  if (cor(x, y) > 0) {
    print("The correlation of x and y is a positive number.")
  }
</dyncalc>

After the Wikiversity document has been processed by the R backend specified in the src attribute, the result could be:

==My Section==
This is text in Wikiversity. Now we calculate the correlation of two vectors x and y dynamically. The correlation is 0.93612
Now we create a scatter plot of the data 
[[File:Scatterplot9923400384204.png]]
Now we create the text output depending on the analysis of the data.
The correlation of x and y is a positive number.

The proposed environment for dynamic calculations is placed in a tag environment, just like the math tag for mathematical expressions.

  • The content enclosed in the math tag is rendered for the output.
  • The content enclosed in the dyncalc tag is submitted to a backend for a calculation or for creating a scatter plot.
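The backend step for the proposed tag can be sketched in R. This is a minimal sketch under stated assumptions, not an existing MediaWiki extension: the function name `render_dyncalc` is hypothetical, the tag name follows the opening tags of the example above, and all chunks of a page are assumed to share one R environment so that later chunks can reuse variables from earlier ones. Plots and chunk options are not handled here.

```r
# Hypothetical sketch of the backend step: evaluate every <dyncalc> chunk in
# a wikitext string and replace the tag with the text the chunk prints.
# Function name and behaviour are assumptions, not an existing extension.
render_dyncalc <- function(wikitext, env = new.env(parent = globalenv())) {
  # (?s) lets "." match newlines; (.*?) captures the code of ONE chunk.
  pattern <- "(?s)<dyncalc[^>]*>(.*?)</dyncalc>"
  repeat {
    m <- regexec(pattern, wikitext, perl = TRUE)[[1]]
    if (m[1] == -1) break                        # no chunk left
    len  <- attr(m, "match.length")
    code <- substr(wikitext, m[2], m[2] + len[2] - 1)

    # Evaluate the expressions one by one in the shared environment and
    # collect what R would have printed at the console.
    out <- capture.output(
      for (expr in parse(text = code)) {
        res <- withVisible(eval(expr, envir = env))
        if (res$visible) print(res$value)
      }
    )

    # Splice the captured output in place of the whole tag.
    wikitext <- paste0(
      substr(wikitext, 1, m[1] - 1),
      paste(out, collapse = "\n"),
      substr(wikitext, m[1] + len[1], nchar(wikitext))
    )
  }
  wikitext
}

# Small deterministic page: the second chunk reuses x from the first one.
page <- paste(
  "The sum is",
  "<dyncalc language=\"R\">",
  "  x <- c(1, 2, 3)",
  "  cat(sum(x))",
  "</dyncalc>",
  "and the mean is",
  "<dyncalc language=\"R\">",
  "  cat(mean(x))   # x is still known from the first chunk",
  "</dyncalc>",
  sep = "\n"
)
rendered <- render_dyncalc(page)
cat(rendered, "\n")
```

Values that are auto-printed rather than written with `cat()` would appear with R's console prefix (for example `[1]`), so a real backend would need its own formatting rules. Evaluating page content with `eval()` is only acceptable in a sandboxed process; this sketch does not address that.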

The enclosed code is real R code that works in R or RStudio. A workflow for R/KnitR can be found in the R tutorial by K. Broman[1]. The R code creates two vectors of random numbers and calculates their correlation. By default, the R code should be processed for Wikiversity pages only after a code modification, to limit the server load on the R backend. Another option is that the learner downloads the source via an API and runs KnitR offline on a mobile device. The Wikiversity community will decide whether this is an option to include in a learning environment for exploring the analysis of data and its interpretation.
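For the offline route, the downloaded wikitext would have to be turned into something KnitR understands. The following is a minimal sketch, assuming the tag syntax of the example above; the helper name `dyncalc_to_rmd` is hypothetical, and the attributes of the opening tag (language, src) are simply dropped.

```r
# Hypothetical helper: rewrite <dyncalc> tags as R Markdown code chunks so
# that the page source can be processed offline with knitr.
dyncalc_to_rmd <- function(wikitext) {
  fence <- strrep("`", 3)                        # a fence of three backticks
  # The opening tag (with any attributes) becomes the chunk header ...
  rmd <- gsub("<dyncalc[^>]*>", paste0(fence, "{r}"), wikitext, perl = TRUE)
  # ... and the closing tag becomes the closing fence.
  gsub("</dyncalc>", fence, rmd, fixed = TRUE)
}

wiki <- paste(
  "Now we create a scatter plot of the data",
  "<dyncalc language=\"R\" src=\"https://r-backend.example.com\">",
  "plot(1:10)",
  "</dyncalc>",
  sep = "\n"
)
rmd <- dyncalc_to_rmd(wiki)
cat(rmd, "\n")
```

The result could be written to an `.Rmd` file and processed with `knitr::knit()`. Wiki markup outside the tags (headings, links) is left untouched by this sketch and would still need to be converted separately.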

  • The document language will be standard wiki markup (also known as wikitext or wikicode), which consists of the syntax and keywords used by the MediaWiki software to format a page (e.g. used in Wikipedia, Wikiversity, OLAT, ...).
  • R code chunks will be recognized and interpreted by an R backend, or a reference to a versioned R script is inserted in the wiki document/article; any such reference triggers a re-run and an update of the wiki article whenever the data, the script or the document is updated. Diagrams, for example, are still PNG files in Wikimedia that are imported in the standard way most authors in the wiki community know. The difference between a standard wiki document and a wiki document with R code chunks is that any update of the data or of the script calls the R script again and creates a new version of the output (diagrams as PNG files, numbers, dynamic text elements). The same concept is used for mathematical formulas in MediaWiki by TexVC[2] and the Math Extension for MediaWiki[3]: the LaTeX sources are parsed and converted into images or MathML that can be displayed in a browser.
  • SageMath is another potential candidate as a backend that performs the numerical and statistical analysis "on the fly" in a learning resource. Both learners and authors benefit: a learning task can be performed in SageMath by the learner, and diagrams and maps for recent events can appear in the document without the need to update diagrams, figures and statistical results by hand with the most recent data again and again. SageMath bundles a large number of software packages, and R is one of the packages on the SageMath package list.
  • Pandoc could be used to convert wiki code to Markdown for processing with the KnitR package.
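The update rule described in the list above (outputs are regenerated whenever the data or the script changes) can be sketched with file checksums. This is only an illustration under assumptions: the function names `file_fingerprint` and `needs_rerun` are hypothetical, and a real backend would more likely rely on wiki revision IDs than on local files.

```r
# Hypothetical sketch: decide whether the outputs of a page must be
# regenerated by comparing MD5 checksums of the script and data files
# with the checksums stored at the time of the last run.
file_fingerprint <- function(files) {
  unname(tools::md5sum(files))
}

needs_rerun <- function(files, stored) {
  # TRUE if any file changed, appeared or disappeared since the last run.
  !identical(file_fingerprint(files), stored)
}

# Demonstration with temporary files standing in for script and data.
script <- tempfile(fileext = ".R")
data   <- tempfile(fileext = ".csv")
writeLines("plot(x, y)", script)
writeLines(c("x,y", "1,2"), data)

stored    <- file_fingerprint(c(script, data))   # state after the last run
unchanged <- needs_rerun(c(script, data), stored)

writeLines(c("x,y", "1,2", "3,4"), data)         # new data arrives
changed   <- needs_rerun(c(script, data), stored)
```

Storing the fingerprint next to the generated PNG files and numbers would let the backend skip the R run when nothing has changed, which also addresses the server-load concern.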
  1. Broman, K. KnitR in a Nutshell. http://kbroman.org/knitr_knutshell/pages/Rmarkdown.html (accessed 2017/08/14).
  2. Schubotz, M. (2013). Making math searchable in Wikipedia. arXiv preprint arXiv:1304.5475.
  3. Schubotz, M., & Wicke, G. (2014). Mathoid: Robust, scalable, fast and accessible math rendering for wikipedia. In Intelligent Computer Mathematics (pp. 224-235). Springer, Cham.