# Simple statistical descriptive Models for the Web

- Formulating a research hypothesis and testing it by means of simple descriptive statistics
- Reading diagrams

## Associated units

- Understand why we selected Simple English Wikipedia as a toy example for modeling the web
- Understand that even a task as simple as counting words involves modeling choices
- Be familiar with the term “unique word token”
- Know some basic tools to count words and documents
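
The modeling choices hidden in word counting can be illustrated with a minimal sketch. The tokenizer below, the `lowercase` switch, and the sample sentence are illustrative assumptions, not the course's own tooling; "unique word token" is taken here to mean a distinct word form.

```python
import re
from collections import Counter

def tokenize(text, lowercase=True):
    """Split text into word tokens.

    Whether to lowercase and how to treat punctuation are modeling
    choices; this sketch simply keeps runs of word characters.
    """
    if lowercase:
        text = text.lower()
    return re.findall(r"\w+", text)

# Hypothetical sample text, not taken from Simple English Wikipedia.
sample = "The cat saw the other cat."

folded = tokenize(sample, lowercase=True)
cased = tokenize(sample, lowercase=False)

total_tokens = len(folded)         # every occurrence counts
unique_folded = len(set(folded))   # "The" and "the" are merged
unique_cased = len(set(cased))     # "The" and "the" stay distinct
word_counts = Counter(folded)      # occurrences per distinct word
```

With case folding the sample has 6 tokens and 4 unique tokens; without it, the same text has 5 unique tokens, so the answer to "how many different words?" depends on the model.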

- Be familiar with some basic statistical objects such as the median, the mean, and histograms
- Be able to relate a histogram to its cumulative distribution function
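
As a small sketch of how these objects relate, the following computes the mean, the median, a histogram, and the empirical cumulative distribution function for one list of numbers. The `doc_lengths` values are made up for illustration and are not measurements from any real corpus.

```python
from collections import Counter
from itertools import accumulate
from statistics import mean, median

# Hypothetical document lengths (in words); invented for illustration.
doc_lengths = [1, 2, 2, 3, 3, 3, 4, 10]

avg = mean(doc_lengths)    # pulled upwards by the single long document
mid = median(doc_lengths)  # robust against that outlier

# Histogram: how often each length occurs.
hist = Counter(doc_lengths)
values = sorted(hist)
counts = [hist[v] for v in values]

# Empirical CDF: running sum of the histogram, normalized to 1.
total = sum(counts)
cdf = [running / total for running in accumulate(counts)]
```

Each CDF entry is the fraction of documents whose length is at most the corresponding entry in `values`, so the last entry is always 1.0.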

- Understand the ongoing, cyclic process of research
- Know what falsifiable means and why every research hypothesis needs to be falsifiable
- Be able to formulate your own research hypothesis

- Understand what a log-log plot is
- Improve your skills in reading and interpreting diagrams
- Know about the word rank / frequency plot
- Be able to transform a histogram or curve into a cumulative distribution function
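
A word rank / frequency plot can be prepared as in the sketch below: sort the word frequencies in descending order, number them by rank, and take the logarithm of both axes to obtain the coordinates of a log-log plot. The function name `rank_frequency` and the toy token list are assumptions made for this example.

```python
import math
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, frequency) pairs, most frequent word first."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(freqs, start=1))

# Toy token list; a real analysis would use a whole corpus.
tokens = "a a a a b b c d".split()

pairs = rank_frequency(tokens)

# Coordinates for a log-log plot: both axes on a logarithmic scale.
loglog = [(math.log10(rank), math.log10(freq)) for rank, freq in pairs]
```

On a log-log plot a power-law relationship between rank and frequency appears as a roughly straight line, which is why this scaling is a common choice for such data.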

- Get a feeling for interdisciplinary research
- Know the Automated Readability Index
- Have a sense of how strongly our findings support our research hypothesis
- Be able to critically discuss the limits of our models
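
The Automated Readability Index is commonly given as 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43. The sketch below applies that formula. How characters, words, and sentences are counted is an assumption of this example (whitespace-separated words, alphanumeric characters, sentences ending in `.`, `!`, or `?`) and may differ from the counting used in the course.

```python
import re

def automated_readability_index(text):
    """Compute the ARI of a text under simple counting assumptions."""
    # Assumption: a sentence ends with '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: words are whitespace-separated chunks.
    words = text.split()
    # Assumption: only letters and digits count as characters.
    characters = sum(1 for c in text if c.isalnum())
    return (4.71 * characters / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)

# Hypothetical sample text with very short words and sentences.
score = automated_readability_index("The cat sat. The dog ran.")
```

Short words and short sentences yield a low score, and for very simple text the index can even become negative, which is one of the limits of the model worth discussing.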

