Data Science 365

Bring data into actionable insights.

One-hot Encode Scalar-value Labels for Deep Learning Models

One-hot vector explained in plain English

Rukshan Pramoditha
Published in Data Science 365
4 min read · Jul 30, 2022

(Cover image by author, made with draw.io)

We need to convert scalar-value labels into one-hot vectors before using them as targets in deep learning models.

This is required for multiclass classification models that output one probability per class and are trained with the categorical_crossentropy loss function, because that loss compares the model's output with a target vector of the same length.
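To see why the target has to match the shape of the model's output, here is a minimal sketch of the standard categorical cross-entropy formula in plain Python. It is a simplified illustration, not the Keras implementation; the helper name, the `eps` guard and the three-class example values are assumptions made for this sketch.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted class probabilities."""
    # Both vectors need one entry per class, which is why the target
    # must be a one-hot vector rather than a single integer.
    if len(y_true) != len(y_pred):
        raise ValueError("target and prediction must have the same length")
    # eps guards against log(0); it is an assumption of this sketch.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# Hypothetical 3-class example: the true class is index 1.
y_true = [0.0, 1.0, 0.0]   # one-hot target
y_pred = [0.1, 0.7, 0.2]   # e.g. the softmax output of a model

loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))  # 0.3567
```

Only the probability assigned to the true class contributes to the loss, because every other position in the one-hot target is zero.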

Sparse scalar representation

The values in the label column are usually represented as sparse scalars, i.e. each label is a single integer that identifies the class. For example, the training and test labels in the MNIST digits dataset are integers ranging from 0 to 9, and each integer represents one class label.

The training and test labels are in two separate one-dimensional vectors.
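As a quick illustration of this format, the sketch below uses two small hand-made label lists. The values are hypothetical and are not loaded from MNIST; they only mimic how the labels are laid out.

```python
# Hypothetical sparse scalar labels for a 10-class problem
# (illustrative values, not actual MNIST data).
train_labels = [5, 0, 4, 1, 9, 2]
test_labels = [7, 2, 1]

# Each label is a single integer class index.
print(train_labels[0])    # 5

# The labels form a flat, one-dimensional sequence: one entry per sample.
print(len(train_labels))  # 6

# Distinct classes that appear in this small sample.
classes = sorted(set(train_labels) | set(test_labels))
print(classes)
```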

One-hot representation

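As I explained at the beginning of this article, the sparse scalar representation is not suitable for multiclass classification models that output probabilities per class and use the categorical_crossentropy loss function. So, the scalar-value labels must be one-hot encoded before they are used in such models. A one-hot vector has one element per class, with 1 at the index of the true class and 0 everywhere else. For example, with ten classes, the label 3 becomes [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].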
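The conversion itself is small enough to sketch in plain Python. The `one_hot` helper below is a hypothetical function written for illustration, and the sample labels are made up; in a Keras workflow, the built-in `keras.utils.to_categorical` utility performs the same kind of conversion.

```python
def one_hot(labels, num_classes):
    """Convert integer class labels into one-hot vectors (lists of 0.0/1.0)."""
    encoded = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes})")
        row = [0.0] * num_classes
        row[label] = 1.0  # the single "hot" position marks the class
        encoded.append(row)
    return encoded

# Hypothetical labels for a 10-class problem such as digit recognition.
labels = [2, 0, 9]
encoded = one_hot(labels, num_classes=10)
for label, row in zip(labels, encoded):
    print(label, row)

# Taking the index of the 1.0 in each row recovers the original labels.
decoded = [row.index(1.0) for row in encoded]
```

Each one-dimensional label vector of length n becomes an n × 10 table, with exactly one 1.0 per row.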
