7 Types of Cross-Validation (CV) Techniques You Should Know as a Data Scientist in 2023

With their Python implementations and graphical visualizations

Rukshan Pramoditha
8 min read · Oct 9, 2023

Cross-validation plays an essential role in evaluating machine learning models.

The main purpose of cross-validation is to estimate how well a machine learning model generalizes to unseen data. That estimate helps us detect overfitting and choose models that generalize better.

Overfitting occurs when the model fits the training data too closely but performs poorly on new, unseen data. Such a model tends to memorize the training data, including its noise, and fails to generalize.
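To make the idea of memorization concrete, here is a minimal sketch that uses only the Python standard library. The toy dataset, the 15% label noise, and the `memorizer` function (a 1-nearest-neighbour rule) are assumptions made up for illustration; they are not from any particular library.

```python
import random

# Hypothetical toy data: the true rule is "label = 1 when x > 0.5",
# but 15% of the labels are flipped to simulate noise.
rng = random.Random(0)
xs = [rng.random() for _ in range(400)]
ys = [int(x > 0.5) ^ int(rng.random() < 0.15) for x in xs]

# First 300 samples for training, last 100 held out as unseen data.
X_train, y_train = xs[:300], ys[:300]
X_test, y_test = xs[300:], ys[300:]

def memorizer(x):
    """Return the label of the single closest training point (1-nearest neighbour)."""
    nearest = min(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
    return y_train[nearest]

def accuracy(model, points, labels):
    """Fraction of points for which the model predicts the given label."""
    return sum(model(x) == t for x, t in zip(points, labels)) / len(points)

train_acc = accuracy(memorizer, X_train, y_train)
test_acc = accuracy(memorizer, X_test, y_test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Each training point is its own nearest neighbour, so the training accuracy should be perfect. The accuracy on the held-out points should be noticeably lower because the model has also memorized the noisy labels.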

Most of us are familiar with train-test splits. If we do not have a separate dataset for testing the model, we divide the same dataset into train and test sets.

The train set is used to train the model: the model parameters learn their values from it. The final evaluation is done on the test set, which the model never sees during training.
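A train-test split can be sketched in a few lines of plain Python. The helper name `train_test_split_indices`, the 80/20 ratio, and the toy data below are assumptions for illustration. In practice, scikit-learn's `train_test_split` function does the same job.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle the row indices and split them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]

# Hypothetical toy dataset: 10 samples with one feature and one label each.
X = [[i] for i in range(10)]
y = [i % 2 for i in range(10)]

train_idx, test_idx = train_test_split_indices(len(X), test_fraction=0.2)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
X_test = [X[i] for i in test_idx]
y_test = [y[i] for i in test_idx]

print(len(X_train), len(X_test))  # 8 2
```

Shuffling before splitting matters when the rows are stored in some order (for example, sorted by label), because otherwise the test set may not be representative of the whole dataset.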

Apart from the train and test sets, there is a third set called the validation set.

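The validation set is used to evaluate the model after it has been trained multiple times on the train set with different combinations of hyperparameters. The combination that scores best on the validation set is kept. Therefore, the validation set is used for hyperparameter tuning.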
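The role of the validation set can be sketched as follows, again with the standard library only. The k-nearest-neighbours classifier, the candidate values of `k`, the 60/20/20 split, and the toy data are all assumptions chosen for illustration.

```python
import random

def knn_predict(train_points, train_labels, x, k):
    """Predict the majority label among the k training points nearest to x."""
    order = sorted(range(len(train_points)), key=lambda i: abs(train_points[i] - x))
    votes = sum(train_labels[i] for i in order[:k])
    return 1 if 2 * votes > k else 0

def accuracy(train_points, train_labels, points, labels, k):
    """Accuracy of a k-NN model built on the train set, scored on (points, labels)."""
    hits = sum(knn_predict(train_points, train_labels, x, k) == t
               for x, t in zip(points, labels))
    return hits / len(points)

# Hypothetical toy data: label is 1 when x > 0.5, with 15% of labels flipped.
rng = random.Random(0)
xs = [rng.random() for _ in range(300)]
ys = [int(x > 0.5) ^ int(rng.random() < 0.15) for x in xs]

# Three-way split: 60% train, 20% validation, 20% test.
# The data is already in random order, so slicing is enough here.
X_train, y_train = xs[:180], ys[:180]
X_val, y_val = xs[180:240], ys[180:240]
X_test, y_test = xs[240:], ys[240:]

# The hyperparameter is k. Every candidate is built from the train set only
# and scored on the validation set. Odd values avoid tied votes.
candidate_ks = [1, 5, 15, 31]
val_scores = {k: accuracy(X_train, y_train, X_val, y_val, k) for k in candidate_ks}
best_k = max(val_scores, key=val_scores.get)

# The test set is used once, at the end, with the chosen hyperparameter.
test_score = accuracy(X_train, y_train, X_test, y_test, best_k)
print("validation scores:", val_scores)
print("best k:", best_k, "test accuracy:", round(test_score, 2))
```

Note that the test set plays no part in choosing `k`. If it did, the final test score would no longer be an honest estimate of performance on unseen data.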

