Web Science/Part2: Emerging Web Properties/Simple statistical descriptive Models for the Web

From Wikiversity


Simple statistical descriptive Models for the Web

Formulating a research hypothesis and testing it by means of simple descriptive statistics
Reading diagrams

Counting Words And Documents

Understand why we selected simple English Wikipedia as a toy example for modeling the web
Understand that even a task as simple as counting words involves modeling choices
Be familiar with the term “unique word token”
Know some basic tools to count words and documents
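
As a rough illustration of the goals above, the following sketch counts word tokens and unique word tokens in a small collection of documents. The function names `tokenize` and `count_words` and the two sample sentences are made up for this example. The tokenization rule (lowercasing, keeping apostrophes inside words) is only one possible modeling choice among many, and a different rule would give different counts.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into word tokens.

    Modeling choice (an assumption of this sketch): lowercase the text and
    treat every run of letters, digits, and apostrophes as one word.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def count_words(documents):
    """Return a Counter mapping each unique word token to its frequency."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


# Hypothetical toy corpus of two documents.
docs = ["The cat sat on the mat.", "The dog sat."]
counts = count_words(docs)

total_tokens = sum(counts.values())   # every occurrence of every word
unique_tokens = len(counts)           # number of distinct words
print(total_tokens, unique_tokens)
```

Note that "The" and "the" are merged only because of the lowercasing step. Dropping that step is an equally valid choice that would change the number of unique word tokens.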

Typical length of a document

Be familiar with some basic statistical objects such as the median, the mean, and histograms

Be able to relate a histogram to its cumulative distribution function
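
The relation between these objects can be sketched with a few lines of Python. The document lengths below are invented for illustration, and `histogram` and `cumulative_distribution` are hypothetical helper names. The cumulative distribution function is obtained by summing the histogram bars from left to right and dividing by the total number of documents.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical document lengths, measured in words.
lengths = [3, 5, 5, 8, 20]


def histogram(values):
    """Map each observed value to the number of times it occurs, sorted by value."""
    return dict(sorted(Counter(values).items()))


def cumulative_distribution(hist):
    """Turn a histogram into a CDF: the share of observations <= each value."""
    total = sum(hist.values())
    running = 0
    cdf = {}
    for value, count in hist.items():
        running += count
        cdf[value] = running / total
    return cdf


hist = histogram(lengths)
cdf = cumulative_distribution(hist)

print("mean:", mean(lengths))
print("median:", median(lengths))
print("histogram:", hist)
print("cdf:", cdf)
```

In this toy data a single long document pulls the mean well above the median, which is one reason to look at both values and at the full histogram when asking for the typical length of a document.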

How to formulate a research hypothesis

Understand the ongoing, cyclic process of research

Know what falsifiable means and why every research hypothesis needs to be falsifiable

Be able to formulate your own research hypothesis

Number of words needed to understand most of Wikipedia

Understand what a log-log plot is

Improve your skills in reading and interpreting diagrams

Know about the word rank / frequency plot

Be able to transform a histogram or curve into a cumulative distribution function
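
A minimal sketch of the word rank / frequency idea, assuming word counts are already available as a `Counter`. The counts and the helper names `rank_frequency`, `coverage`, and `words_needed` are invented for this example. Taking the logarithm of both rank and frequency gives the points one would draw on a log-log plot, and the cumulative sum over ranks shows how many of the most frequent words cover a given share of all word occurrences.

```python
import math
from collections import Counter


def rank_frequency(counts):
    """Return (rank, frequency) pairs, with rank 1 for the most frequent word."""
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


def coverage(counts):
    """Return (rank, share) pairs: share of all tokens covered by the top-rank words."""
    total = sum(counts.values())
    running = 0
    result = []
    for rank, freq in rank_frequency(counts):
        running += freq
        result.append((rank, running / total))
    return result


def words_needed(counts, share):
    """Smallest number of top-ranked words that covers at least `share` of all tokens."""
    for rank, covered in coverage(counts):
        if covered >= share:
            return rank
    return len(counts)


# Hypothetical word counts, not taken from any real corpus.
counts = Counter({"the": 50, "of": 25, "web": 15, "graph": 10})

# Points for a log-log plot: log10(rank) against log10(frequency).
log_log_points = [(math.log10(r), math.log10(f)) for r, f in rank_frequency(counts)]

print(rank_frequency(counts))
print(words_needed(counts, 0.8))
```

The `coverage` function is the cumulative distribution of the rank / frequency curve, which is the same histogram-to-cumulative step as before, applied to ranks instead of document lengths.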

Linguists' way of checking the simplicity of text

Get a feeling for interdisciplinary research

Know the Automated Readability Index

Have a strong sense of how well the evidence supports our research hypothesis

Be able to critically discuss the limits of our models
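
The Automated Readability Index is commonly given as 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43. The sketch below computes it under some simplifying assumptions that are not part of the lesson: characters are counted as letters and digits only, words are runs of letters, digits, and apostrophes, and sentences are split at `.`, `!`, and `?`. The function name and the sample texts are made up for illustration.

```python
import re


def automated_readability_index(text):
    """Compute the ARI of a text under the simple counting rules described above."""
    # Assumption: a word is a run of letters, digits, and apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Assumption: sentences end at '.', '!' or '?'; empty fragments are ignored.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: only letters and digits count as characters.
    characters = sum(ch.isalnum() for ch in text)
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    return (4.71 * characters / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)


simple_text = "The cat sat. The dog ran."
complex_text = ("Interdisciplinary investigations necessitate "
                "comprehensive methodological considerations.")

print(automated_readability_index(simple_text))
print(automated_readability_index(complex_text))
```

Short words and short sentences lower the score, so a simpler text should receive a lower index than a more complex one. The counting rules are themselves modeling choices, which is one of the limits worth discussing when the index is used as evidence for a hypothesis.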


Retrieved from "https://en.wikiversity.org/w/index.php?title=Web_Science/Part2:_Emerging_Web_Properties/Simple_statistical_descriptive_Models_for_the_Web&oldid=1636607"