Truncated SVD for Dimensionality Reduction in Sparse Feature Matrices

Discussing how truncated SVD differs from normal SVD

Rukshan Pramoditha
7 min read · Jun 27, 2023
Image by Tim Mossholder from Pixabay

Because most of the values in a sparse feature matrix are zero, it calls for dimensionality reduction techniques that can work directly on the sparse representation, such as Truncated Singular Value Decomposition (Truncated SVD).
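As a rough illustration of the idea, here is a minimal sketch that keeps only the k largest singular values of a sparse matrix. It uses SciPy's svds function, which may not be the tool used elsewhere in this article, and the matrix X_demo, its size, and k = 5 are all made up for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Hypothetical data: 100 samples x 50 features, roughly 5% non-zero values
rng = np.random.default_rng(0)
dense = rng.random((100, 50))
dense[dense < 0.95] = 0.0
X_demo = csr_matrix(dense)

# Keep only the k largest singular values and vectors (k is an arbitrary choice)
k = 5
U, s, Vt = svds(X_demo, k=k)

# Sort the components so the largest singular value comes first
order = np.argsort(s)[::-1]
U, s, Vt = U[:, order], s[order], Vt[order, :]

# Reduced representation: each sample is now described by k values
X_reduced = U * s
print(X_reduced.shape)
```

The sparse matrix is never converted to a dense array here, which is the main reason this approach suits sparse data.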

Sparse representation of a matrix

A feature matrix holds all of the input features and is typically denoted by the variable X. It is the dataset we use to train the model. When most of its values are zero, it is often stored as a sparse matrix to save memory and computing time.
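To get a feel for the memory saving, the sketch below compares the bytes used by a dense array with the bytes used by its CSR counterpart. The 1,000 x 1,000 identity matrix is a made-up example, chosen only because almost all of its entries are zero.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical feature matrix: 1,000 x 1,000 with non-zero values only on the diagonal
X_dense = np.eye(1000)
X_csr = csr_matrix(X_dense)

# A dense array stores every element, zeros included
dense_bytes = X_dense.nbytes

# CSR keeps three arrays: the non-zero values, their column indices, and row pointers
sparse_bytes = X_csr.data.nbytes + X_csr.indices.nbytes + X_csr.indptr.nbytes

print("Dense bytes: ", dense_bytes)
print("Sparse bytes:", sparse_bytes)
```

The exact numbers depend on the data types SciPy picks, but the sparse version should be far smaller whenever most entries are zero.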

The following matrix has many zero elements.

We can convert it to a sparse matrix using the following code.

from scipy.sparse import csr_matrix

X_sparse = csr_matrix(X)  # X is the matrix shown above, as a NumPy array
print(X_sparse)

We get the following output.

The printed sparse representation lists only the non-zero elements, each in the form (row, column) value. For example, (1, 0) 1 means that the value 1 sits in the second row and first column of the matrix (indices start at zero).
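The same reading of (row, column) value can be checked on a small made-up matrix. X_small below is a hypothetical 3 x 4 array, not the matrix from this article; it was chosen so that it also has the value 1 at row index 1, column index 0.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 3 x 4 matrix (not the article's matrix), mostly zeros
X_small = np.array([
    [0, 0, 2, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 3],
])
X_small_sparse = csr_matrix(X_small)

# Row and column indices of the non-zero entries, row by row
rows, cols = X_small_sparse.nonzero()

# Collect each stored entry as (row, column, value)
entries = [(int(r), int(c), int(v))
           for r, c, v in zip(rows, cols, X_small_sparse.data)]
for r, c, v in entries:
    print(f"({r}, {c}) {v}")

# Converting back to a dense array recovers the original matrix
X_back = X_small_sparse.toarray()
```

Only three entries are stored for the twelve cells in the matrix, and toarray() fills the zeros back in when a dense view is needed.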
