Cosine Similarity For Vectorspaces

Learning goals

  1. Be familiar with the vector space model for text documents
  2. Be aware of term frequency and (inverse) document frequency
  3. Have reviewed the definitions of basis and dimension
  4. Realize that the angle between two vectors can be seen as a similarity measure
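
As a minimal sketch of goals 1 and 4, the following Python snippet represents two short documents as term-frequency vectors over a shared vocabulary and computes the cosine of the angle between them. The example documents, the whitespace tokenizer, and the function names `term_frequency_vector` and `cosine_similarity` are assumptions made for illustration; they are not taken from the slides.

```python
import math
from collections import Counter


def term_frequency_vector(text, vocabulary):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention assumed here for all-zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical example documents (already lowercase).
doc_a = "the web is a graph"
doc_b = "the web is a network of documents"

# One dimension per distinct term that occurs in either document.
vocabulary = sorted(set(doc_a.split()) | set(doc_b.split()))

vec_a = term_frequency_vector(doc_a, vocabulary)
vec_b = term_frequency_vector(doc_b, vocabulary)
similarity = cosine_similarity(vec_a, vec_b)

print(vocabulary)
print(vec_a)
print(vec_b)
print(similarity)
```

Because the cosine depends only on the angle between the vectors, repeating a document several times scales its vector but leaves its similarity to other documents unchanged. Weighting by inverse document frequency would rescale the individual dimensions and is left out of this sketch.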

Video

Script

The slides can be found at File:Cosine-Similarity-For-Vectorspaces.pdf

Quiz

1 Calculate the scalar product between and .

9
10
14
54

2 What is true about the distance measures on vector spaces?

they can have arbitrarily large distances
the cosine distance can be arbitrarily large
the Euclidean distance is bounded by a value of
they can be transformed to a similarity measure
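
The behaviour of the two distance measures can be explored numerically. The sketch below assumes the common definitions: Euclidean distance as the length of the difference vector, and cosine distance as one minus the cosine similarity. The function names, the example vectors, and the conversion `1 / (1 + d)` are illustrative choices, not definitions taken from the slides.

```python
import math


def euclidean_distance(u, v):
    """Length of the difference vector u - v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine of the angle between u and v (both non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_to_similarity(d):
    """One possible way (assumed here) to map a distance d >= 0 into (0, 1]."""
    return 1.0 / (1.0 + d)


u = [1.0, 0.0]
for scale in (1, 10, 100):
    # v keeps its direction; only its length changes.
    v = [0.0, float(scale)]
    d_euclid = euclidean_distance(u, v)
    d_cosine = cosine_distance(u, v)
    print(scale, d_euclid, d_cosine, distance_to_similarity(d_euclid))
```

Stretching `v` changes only its length, so the printed Euclidean distance grows with `scale` while the cosine distance stays the same.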




Further reading

  1. tba

Discussion