Jump to content

Machine learning/Unsupervised Learning

From Wikiversity

Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data without labeled responses.

The most common unsupervised learning method is cluster analysis, which is used for exploratory data analysis to find hidden patterns or grouping in data.

  • What other methods are there besides cluster analysis?

Clustering

[edit | edit source]

Problem statement:

  • Given some data, , identify groups (clusters) of data points and assign each point to one of the groups (clusters)
  • The clusters are identified using a measure of similarity which is defined upon metrics such as Euclidean or probabilistic distance.
  • Points inside each cluster has a higher measure of similarity than data in any other cluster
Clustering methods
Method Comment
K-means clustering
  • Partitions data into k distinct clusters based on the distance to the centroid of a cluster
  • Find the cluster centers that minimized the sum of distances for all the data points
Hierarchical clustering Builds a multilevel hierarchy of clusters by creating a cluster tree
Gaussian mixture models Models clusters as a mixture of multivariate normal density components
Self-organizing maps Uses neural networks that learn the topology and distribution of the data

Index

[edit | edit source]