Web Science/Part2: Emerging Web Properties/Simple statistical descriptive Models for the Web/Number of words needed to understand most of Wikipedia

Number of words needed to understand most of Wikipedia

Learning goals

  1. Understand what a log-log plot is
  2. Improve your skills in reading and interpreting diagrams
  3. Know about the word rank / frequency plot
  4. Be able to transform a histogram or curve into a cumulative distribution function
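
The last two goals can be made concrete with a small sketch. It assumes a toy sentence as a stand-in for a real corpus (the text, the variable names and the helper `words_needed` are illustrative and not taken from the lesson). It builds the word rank / frequency list and accumulates it into a cumulative distribution over ranks.

```python
from collections import Counter
from itertools import accumulate

# Toy text (an assumption for illustration), not real Wikipedia data.
text = "the cat sat on the mat and the dog sat on the log"

# Count how often each distinct word occurs.
counts = Counter(text.split())

# Rank / frequency list: frequencies sorted from most to least frequent word.
frequencies = sorted(counts.values(), reverse=True)
total = sum(frequencies)

# Cumulative distribution over ranks: cdf[k - 1] is the share of all word
# occurrences covered by the k most frequent words.
cdf = [running / total for running in accumulate(frequencies)]


def words_needed(coverage):
    """Return the smallest number of top-ranked words whose share reaches coverage."""
    for rank, share in enumerate(cdf, start=1):
        if share >= coverage:
            return rank
    return len(cdf)


print(frequencies)
print(words_needed(0.5))
```

Plotting rank against frequency with both axes on a logarithmic scale gives the word rank / frequency plot named in goal 3. The cumulative list answers the question in the page title for a chosen coverage level.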

Video

Quiz

1 We saw that more than half of the unique word tokens on Simple English Wikipedia occurred only once. Which of the following statements are true?

If we pick a random word from Simple English Wikipedia, the chance is higher than 50% that it occurs only once.
If we pick 100 random words from Wikipedia, we expect more than 50 of them to occur only once.
If we pick a random word, the chance of getting a word that occurs only once is less than 10%.
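
The question turns on the difference between counting distinct words and counting word occurrences. A minimal sketch on a toy sentence (an assumption for illustration; its numbers say nothing about the actual Wikipedia values) shows how the two shares are computed.

```python
from collections import Counter

# Toy text (an assumption for illustration), not real Wikipedia data.
text = "the cat sat on the mat and the dog sat on the log"

tokens = text.split()      # every word occurrence in the text
counts = Counter(tokens)   # one entry per distinct word

# Words that occur exactly once in the text.
once_only = [word for word, count in counts.items() if count == 1]

# Share of distinct words that occur only once.
share_of_distinct_words = len(once_only) / len(counts)

# Share of all word occurrences that belong to such a word; this is the
# chance of hitting one when picking a random position in the text.
share_of_occurrences = len(once_only) / len(tokens)

print(share_of_distinct_words, share_of_occurrences)
```

The two shares use the same numerator but different denominators, so they can differ widely once a few words are very frequent.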


Discussion