Measuring Distance Between Data Points in Multidimensional Feature Space

5 types of distance functions (measures) used in machine learning algorithms

Rukshan Pramoditha
6 min read · Apr 6, 2024
Image by Leopictures from Pixabay

Distance plays an important role when discussing the similarity between data points. The smaller the distance between two data points, the more similar they are!

When distance increases, points become dissimilar (Image by author)

Many machine learning algorithms use distance functions to measure the similarity between data points. The number of dimensions doesn't matter here: distance functions work in any number of dimensions. We are only familiar with up to three dimensions, though, and it is hard to imagine anything beyond that!

An algorithm's performance depends heavily on the type of distance function we choose for it. Here, we'll discuss five types of distance functions used in machine learning algorithms. We begin with Euclidean distance, the most popular one!

1. Euclidean Distance

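This is the most commonly used distance function. It measures the shortest distance between two data points: the length of the straight line segment that connects them. In a 2D plane, this is the straight line you would draw between the two points, and the same idea extends to any number of dimensions.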
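
To make the idea concrete, here is a minimal sketch in plain Python. For two points p and q with n coordinates each, the Euclidean distance is the square root of the sum of the squared coordinate differences: d(p, q) = sqrt((p1 - q1)^2 + ... + (pn - qn)^2). The function name euclidean_distance and the sample points are my own choices for illustration, not part of any library.

```python
import math

def euclidean_distance(p, q):
    """Return the straight-line distance between points p and q.

    p and q are sequences of numbers with the same length
    (hypothetical helper written for this article).
    """
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Sum the squared differences along each dimension, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# 2D example: the points form a 3-4-5 right triangle with the axes.
print(euclidean_distance((0, 0), (3, 4)))              # 5.0

# The same function works unchanged in four dimensions.
print(euclidean_distance((0, 0, 0, 0), (1, 1, 1, 1)))  # 2.0
```

In practice you would rarely write this by hand. Python 3.8 and later ships math.dist, which computes the same quantity, and the loop above is only meant to show what happens under the hood.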
