Introduction to Unsupervised Learning

Unsupervised learning is a branch of machine learning that draws inferences from unlabeled data.

Since the examples given to the learner are unlabeled, there is no error or reward signal to evaluate a potential solution. This distinguishes unsupervised learning from supervised learning and reinforcement learning.

Unsupervised learning is closely related to the problem of density estimation in statistics. However, unsupervised learning also encompasses many other techniques that seek to summarize and explain key features of the data. Many methods employed in unsupervised learning are based on data mining methods used to preprocess data.
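To make the link with density estimation concrete, the following is a minimal sketch of a Gaussian kernel density estimator, one standard way of estimating a density from unlabeled samples. The function name `make_kde`, the sample values, and the bandwidth of 0.5 are illustrative assumptions rather than anything prescribed by the text above.

```python
import math


def make_kde(samples, bandwidth):
    """Return a function that estimates the density behind `samples`.

    Hypothetical helper for illustration: places one Gaussian bump of
    width `bandwidth` on every sample and averages them.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centred on each observed sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Unlabeled one-dimensional data (made up for this sketch).
samples = [1.1, 0.9, 1.0, 1.2, 4.8, 5.0, 5.1]
density = make_kde(samples, bandwidth=0.5)

for x in (1.0, 3.0, 5.0):
    print(x, density(x))
```

No labels are involved at any point: the estimate is built from the samples alone, and it is higher near the regions where samples concentrate than in the gap between them. The bandwidth controls how smooth the estimate is and would normally need to be chosen for the data at hand.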

Approaches to unsupervised learning

Approaches to unsupervised learning include:

  • Clustering (e.g., k-means, mixture models, hierarchical clustering)
  • Approaches for learning latent variable models such as
    • Expectation–maximization algorithm (EM)
    • Method of moments
    • Blind signal separation techniques, e.g.,
      • Principal component analysis
      • Independent component analysis
      • Non-negative matrix factorization
      • Singular value decomposition
  • Neural network models, among which the self-organizing map (SOM) and adaptive resonance theory (ART) are commonly used
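
As an illustration of the first approach in the list, clustering, the following is a minimal sketch of k-means using Lloyd's algorithm. The function names, the random initialisation from the data points, the fixed seed, and the sample points are all assumptions made for this example. Practical implementations usually add better initialisation and multiple restarts.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iterations=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups.

    Illustrative sketch: returns (centroids, labels), where labels[i]
    is the index of the centroid assigned to points[i].
    """
    rng = random.Random(seed)
    # Assumed initialisation: k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, labels


# Unlabeled two-dimensional data (made up for this sketch).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

The algorithm never sees a target value for any point. It only alternates between assigning points to the nearest centroid and recomputing centroids, which is why there is no error signal in the supervised sense. The result can depend on the initial centroids, so the grouping found is a local optimum rather than a guaranteed best partition.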