Cosine Similarity For Vectorspaces

Learning goals

  1. Be familiar with the vector space model for text documents
  2. Be aware of term frequency and (inverse) document frequency
  3. Have reviewed the definitions of basis and dimension
  4. Realize that the angle between two vectors can be seen as a similarity measure
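
The fourth goal can be illustrated with a small sketch. It assumes a very simple vector space model in which each distinct lowercase word is one dimension and the coordinate is the raw term frequency; inverse document frequency weighting is left out. The function names and the three sample sentences are made up for illustration and do not come from the slides.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Map each lowercase, whitespace-separated token to its count.

    Assumption: tokens are split on whitespace only, with no stemming
    or stop-word removal.
    """
    return Counter(text.lower().split())


def scalar_product(u, v):
    """Sum the products of the coordinates over the shared terms."""
    return sum(u[term] * v[term] for term in u if term in v)


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term vectors."""
    norm_u = math.sqrt(scalar_product(u, u))
    norm_v = math.sqrt(scalar_product(v, v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen here: an empty document is similar to nothing.
        return 0.0
    return scalar_product(u, v) / (norm_u * norm_v)


# Hypothetical sample documents.
doc_a = term_frequencies("the web is a graph")
doc_b = term_frequencies("the web is a network")
doc_c = term_frequencies("cats sleep all day")

sim_ab = cosine_similarity(doc_a, doc_b)  # four of five terms are shared
sim_ac = cosine_similarity(doc_a, doc_c)  # no terms are shared
```

Because term frequencies are never negative, the value computed this way lies between 0 (no shared terms, orthogonal vectors) and 1 (vectors pointing in the same direction), independent of document length.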

Video

Script

The slides can be found at File:Cosine-Similarity-For-Vectorspaces.pdf

Quiz

1 Calculate the scalar product between and .

9
10
14
54

2 What is true about the distance measures on vector spaces?

they can have arbitrarily large distances
the cosine distance can be arbitrarily large
the Euclidean distance is bounded by a value of
they can be transformed into a similarity measure

Further reading

  1. tba

Discussion