How Is Mean-Shift Clustering Better Than K-Means Clustering?
No need to specify the number of clusters (k) as a hyperparameter!
Clustering is a type of unsupervised machine learning that works with unlabeled data.
Clustering groups similar data points into clusters. During training, the goal is to find a cluster label for each data point. A well-trained clustering model should also be able to assign new, unseen data points from the same domain to the clusters it has identified.
K-Means and Mean-Shift are two popular clustering algorithms. I have already discussed the K-Means algorithm in my article, Hands-On K-Means Clustering, so today I will focus on the Mean-Shift algorithm while still comparing the two.
How Mean-Shift Clustering Works
The Mean-Shift algorithm assigns data points to clusters by iteratively shifting each point towards the mean of the data points that lie within a limited radius around it. This radius is set by the bandwidth hyperparameter. Repeated shifts move each point towards a mode, which in Mean-Shift clustering is a region of highest local density. The algorithm repeats the shifting process until the points stop moving (converge), and points that end up at the same mode are assigned to the same cluster.
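To make the shifting process concrete, here is a minimal pure-Python sketch of Mean-Shift with a flat (uniform) kernel. It is only an illustration, not how any particular library implements the algorithm. The function name mean_shift, the max_iter and tol parameters, the rule of merging converged positions that lie within half a bandwidth of each other, and the toy data are all assumptions made for this example.

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift (illustrative sketch). Returns (modes, labels)."""
    shifted = [tuple(p) for p in points]
    for _ in range(max_iter):
        largest_move = 0.0
        new_positions = []
        for p in shifted:
            # Neighbours come from the original data, within `bandwidth`
            # of the current (shifted) position.
            neighbours = [q for q in points if math.dist(p, q) <= bandwidth]
            # Shift the position to the mean of those neighbours.
            mean = tuple(sum(c) / len(neighbours) for c in zip(*neighbours))
            largest_move = max(largest_move, math.dist(p, mean))
            new_positions.append(mean)
        shifted = new_positions
        if largest_move < tol:  # every position has stopped moving
            break

    # Assumed merge rule: positions within bandwidth / 2 share a mode.
    modes, labels = [], []
    for p in shifted:
        for i, m in enumerate(modes):
            if math.dist(p, m) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            modes.append(p)
            labels.append(len(modes) - 1)
    return modes, labels

# Hypothetical toy data: two tight groups of 2-D points.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
        (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
modes, labels = mean_shift(data, bandwidth=1.0)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Notice that the number of clusters is never passed in. It falls out of how many distinct modes the points converge to, which in turn depends on the bandwidth. In this sketch, a new unseen point could be labeled by finding its nearest mode.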