What is Clustering?

Clustering is a Machine Learning technique that involves the grouping  of data point. Given a set of data points, we can use a clustering algorithms to classify each data point into a specific group.

In theory, data points that are in the same group should have similar properties and/or features, while data points in different groups should have highly dissimilar properties and/or features.

Clustering, in Machine Learning known as method of Unsupervised Learning, commonly used for statistical data analysis.

For example, consider t

What is Machine Learning


he case of image recognition for lions and cats. The data sometimes not only spans images of these animals, but also may have animals with similar resemblance. This will present problems in training algorithms. They may recognise the animals wrongly and the learning accuracy goes down significantly.

In order to overcome this, a class of algorithms called unsupervised learning algorithms are used. These algorithms work with data that are relatively new and unknown data in order to learn more. This class is again subdivided into two categories, clustering and association

Some clustering algorithms:

1. K-means Clustering

2. Mean-Shift Clustering

3. Density Based Spatial Clustering of Application with Noise (DBSCAN)

4. Expectation- Maximization(EM) Clustering using Gaussian Mixture Models(GMM)

5. Agglomerative Hierarchical Clustering

6. Fuzzy C-means Algorithm (FCM)

How Machine Learning Can Create a More Meritocratic, Less Biased Job Market