
From Wikiversity

Descriptive statistics of the web graph

Learning goals

  1. Know terms like size and (unique) volume
  2. Be able to count the in-degree and out-degree of web pages
  3. Have an idea of what kind of law the in-degree and out-degree distributions follow
  4. Know that degree is not distributed in a fair way
  5. Know that the Gini coefficient can be used to measure fairness
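
As a rough illustration of goals 2 and 5, the following sketch counts in-degrees and out-degrees from a list of links and computes a Gini coefficient over the resulting degrees. The toy edge list and the helper names `degree_counts` and `gini` are assumptions made for this example, not part of the course material; each link is assumed to appear only once in the list.

```python
from collections import Counter

# Hypothetical toy crawl: each pair (source, target) is one hyperlink.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("c", "a"), ("d", "c"),
]


def degree_counts(edges):
    """Return (in_degree, out_degree) dicts covering every page in the edge list."""
    pages = {page for edge in edges for page in edge}
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    # Counter returns 0 for missing keys, so pages without links get degree 0.
    return ({p: in_deg[p] for p in pages}, {p: out_deg[p] for p in pages})


def gini(values):
    """Gini coefficient of non-negative values; 0 means all values are equal."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Formula on ascending sorted data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


in_deg, out_deg = degree_counts(edges)
print("in-degree:", in_deg)
print("out-degree:", out_deg)
print("Gini of in-degrees:", gini(in_deg.values()))
```

On a real crawl the edge list would come from the extracted hyperlinks, and only links between crawled pages would be visible, so the in-degrees computed this way describe the crawl rather than the whole web.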

Video

Script

The slide deck can be found at File:Descriptive statistics of the web graph.pdf

Quiz

1

Given a random web crawl, which of the following statements would you expect to be true?

The highest in-degree would be smaller than the highest out-degree.
Counting the anchor tags in one HTML document gives the in-degree of the node representing this document.
In-degrees can be counted exactly.
The in-degree and out-degree distributions will take the same values.

2

Which statements about the Gini coefficient are true?

High values mean that the measured distribution is not very equal.
Low values mean perfect equality.
The Gini coefficient can take values between 0 and infinity.
The Gini coefficient can take values between -1 and 1.
The Gini coefficient can take values between 0 and 1.


Further reading

  1. tba

Discussion