Singular Value Decomposition in Deep Learning and Portfolio Management

Ellie Arbab
5 min read · Sep 11, 2023

Singular Value Decomposition (SVD) is a widely used technique from linear algebra. It shows how any matrix can be rewritten as the product of three ‘simpler’ matrices built from eigenvalues and eigenvectors.

SVD is a primary technique in data compression and noise reduction. In this article, we describe its applications in Deep Learning and Portfolio Management, after a brief and intuitive introduction to its inner workings.

Introduction to SVD

Any (m by n) matrix can be thought of as a linear transformation that maps each point in n-D space to a unique point in m-D space. For example, matrix A below maps 3-D space to 2-D space, and matrix B maps 3-D space to 1-D space:
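A = [[1, 0,  2],
     [0, 1, -1]]      (a 2-by-3 matrix: 3-D in, 2-D out)

B = [[1, 1, 1]]       (a 1-by-3 matrix: 3-D in, 1-D out)

(The entries above are illustrative; any matrices of these shapes would serve as examples.)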

Symmetric matrices have orthogonal eigenvectors (they can always be chosen to form an orthonormal basis). They stretch the unit ball along these eigenvectors, by the magnitude of the corresponding eigenvalues. Eigenvectors of non-symmetric matrices are, in general, not orthogonal.

Symmetric matrices are eigendecomposable. That is, any symmetric matrix M with eigenvalues λ_1, …, λ_n and orthonormal eigenvectors u_1, …, u_n can be written as:
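M = λ_1 u_1 u_1' + λ_2 u_2 u_2' + … + λ_n u_n u_n'        (1)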

For example:
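Take the (illustrative) symmetric matrix

M = [[2, 1],
     [1, 2]]

Its eigenvalues are λ_1 = 3 and λ_2 = 1, with orthonormal eigenvectors u_1 = (1, 1)/√2 and u_2 = (1, −1)/√2, so that

M = 3 u_1 u_1' + 1 u_2 u_2'
  = 3 [[0.5, 0.5], [0.5, 0.5]] + 1 [[0.5, −0.5], [−0.5, 0.5]]
  = [[2, 1], [1, 2]]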

Check out Wolfram Alpha, an online tool that generates eigenvalues, eigenvectors, and other properties of a given matrix [2].

Equation (1) expresses M as a sum of projections onto the vector space spanned by its eigenvectors. It can be rewritten in matrix form as:
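M = U Λ U'        (2)

where U = [u_1 … u_n] has the eigenvectors as its columns and Λ = diag(λ_1, …, λ_n).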

Similarly, the action of M, viewed as a linear function of a vector x, can be written as:
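M x = λ_1 (u_1'x) u_1 + … + λ_n (u_n'x) u_n        (3)

In words: M projects x onto each eigenvector and stretches that component by the corresponding eigenvalue.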

On the left-hand side of Figure 1, we see how a symmetric matrix transforms a given vector x in the 2-D plane, and on the right-hand side we see the corresponding decomposition along its eigenvectors.

Figure 1: Symmetric Matrices project vectors along the basis of their (orthogonal) eigenvector space proportional to their eigenvalues [1].

We can arrive at a similar formulation for non-symmetric and non-square matrices with a small tweak. Note that for any matrix M, the product M'M is square and symmetric, hence, following Equation (1):
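M'M = λ_1 v_1 v_1' + … + λ_n v_n v_n'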

Note that λ_1, …, λ_n and v_1, …, v_n are the eigenvalues and eigenvectors of M'M. Before we can define the singular values of M, we need to show that all the λ’s are non-negative:
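λ_i = λ_i v_i'v_i = v_i'(M'M) v_i = (M v_i)'(M v_i) = ‖M v_i‖² ≥ 0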

For a given matrix M, let λ_1, …, λ_n be the eigenvalues of M'M. Then the singular values σ_1, …, σ_n of M are defined as:
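σ_i = √λ_i ,   i = 1, …, n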

We are now prepared to introduce the Singular Value Decomposition (SVD). Any matrix M with singular values σ_1, …, σ_n can be decomposed as:
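M = U Σ V' = σ_1 u_1 v_1' + … + σ_n u_n v_n'

where the v_i (the right singular vectors) are the eigenvectors of M'M, Σ = diag(σ_1, …, σ_n), and u_i = M v_i / σ_i for σ_i > 0 (the left singular vectors, which are eigenvectors of MM').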

And similar to Equation (3):
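M x = σ_1 (v_1'x) u_1 + … + σ_n (v_n'x) u_n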

Figure 2, reproduced from [1], gives a great visualization of the inner workings of SVD.

Figure 2: Step-by-step visualization of the inner workings of SVD [1].
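As a quick sanity check, the decomposition is easy to reproduce numerically. The short NumPy snippet below (the matrix is an arbitrary, illustrative choice) verifies that the singular values equal the square roots of the eigenvalues of M'M, and that the three factors reconstruct M:

import numpy as np

# An arbitrary (illustrative) 2-by-3 matrix: maps 3-D vectors to 2-D vectors.
M = np.array([[ 3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# Thin SVD: M = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(M, full_matrices=False)
print("singular values:         ", S)

# Singular values are the square roots of the eigenvalues of M'M.
eigvals = np.linalg.eigvalsh(M.T @ M)            # ascending order
print("sqrt(eigenvalues of M'M):", np.sqrt(eigvals[::-1][:len(S)]))

# Reconstruct M from the decomposition.
M_rec = U @ np.diag(S) @ Vt
print("max reconstruction error:", np.abs(M - M_rec).max())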

Deep Learning: Targeted Drop-Out

In a recent paper from Google and Rutgers University [3], researchers showed an application of SVD to fine-tuning Stable Diffusion networks. In a nutshell, the idea is to fine-tune each layer's weight matrix only along the directions its SVD identifies, i.e. its singular vectors, rather than updating every individual weight.

This technique speeds up fine-tuning significantly without negatively impacting performance. In other words, by limiting the weights that get updated at each pass to the ‘most important’ directions, we have built a ‘targeted drop-out’ into the network.
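A minimal sketch of the idea (not the paper's actual implementation; the class name and details below are illustrative): perform the SVD of a pretrained layer's weight matrix once, freeze the singular vectors, and train only a shift of the singular values.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralShiftLinear(nn.Module):
    """Linear layer whose pretrained weight W = U diag(S) V' is frozen,
    except for a learnable shift of its singular values (a sketch of the idea in [3])."""

    def __init__(self, weight: torch.Tensor, bias: torch.Tensor = None):
        super().__init__()
        U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
        # Frozen factors obtained from the pretrained weight matrix.
        self.register_buffer("U", U)
        self.register_buffer("S", S)
        self.register_buffer("Vh", Vh)
        self.register_buffer("bias", bias)
        # The only trainable parameters: one shift per singular value.
        self.delta = nn.Parameter(torch.zeros_like(S))

    def forward(self, x):
        # Rebuild the weight with shifted (kept non-negative) singular values.
        W = self.U @ torch.diag(F.relu(self.S + self.delta)) @ self.Vh
        return F.linear(x, W, self.bias)

Fine-tuning then updates one parameter per singular value instead of the full weight matrix, which is where both the speed-up and the ‘targeted’ behaviour come from.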

SVD from Linear Algebra enhances Neural Networks in non-linear model space.

Portfolio Management: Adaptive Beta Estimate

In systematic portfolio construction, we look at covariance and correlation matrices of asset returns.

The eigenvectors of the correlation matrix with the largest eigenvalues identify the main drivers of return in a given portfolio. In other words, the returns of the remaining assets in the portfolio can be approximated as linear functions of these main drivers of return.
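A minimal sketch of this in NumPy, on synthetic (illustrative) return data: eigendecompose the correlation matrix, treat the dominant eigenvector as the main driver of returns, and regress each asset on it to obtain an adaptive beta estimate.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns for illustration: one common "market" driver
# plus idiosyncratic noise, for 5 assets over 500 days.
T, N = 500, 5
market = rng.normal(0.0005, 0.01, size=T)
true_betas = np.array([0.5, 0.8, 1.0, 1.2, 1.5])
returns = np.outer(market, true_betas) + rng.normal(0.0, 0.005, size=(T, N))

# Correlation matrix of asset returns and its eigendecomposition.
corr = np.corrcoef(returns, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)           # ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
print("variance explained by the top factor:", eigvals[0] / eigvals.sum())

# The dominant eigenvector defines the main driver of returns.
w = eigvecs[:, 0] * np.sign(eigvecs[:, 0].sum())  # fix the sign convention
z = (returns - returns.mean(0)) / returns.std(0)  # standardized returns
factor = z @ w                                    # factor return series

# Adaptive beta of each asset with respect to this factor (OLS slope).
f = factor - factor.mean()
betas_hat = (returns - returns.mean(0)).T @ f / (f @ f)
print("estimated betas (up to scale):", betas_hat)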

Recap
SVD allows us to intuitively understand how the linear transformation associated with a matrix stretches the unit ball, in both direction and magnitude, based on its singular values and the corresponding singular vectors.

In Deep Learning, vanishing/exploding gradients are an important concern. A drop-out layer randomly sets some weights to zero and helps curb their impact; however, its random nature means it may zero out the most informative weights. Applying back-propagation only along the axes that the weight matrix's SVD identifies allows for a targeted, and hence more efficient, handling of vanishing/exploding gradients.

In Asset Pricing and Portfolio Theory, we assume the market maintains a growth trend over time, exposure to which is referred to as market β. In the attribution analysis of a multi-asset portfolio, it is important to correctly identify what portion of the portfolio's growth or decline was due to market β and what can be attributed to the assets' idiosyncratic returns. SVD offers an adaptive way to decompose portfolio return into market β and asset alpha.

SVD has long been applied in data compression, noise reduction, and signal processing.

Can you think of other applications for SVD in your specific field?

References

[1] Reza Bagheri, “Understanding Singular Value Decomposition and Its Application in Data Science,” Towards Data Science (Medium), 2020.

[2] Wolfram Alpha Eigenvalue Calculator

[3] Han, Ligong, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, and Feng Yang. “SVDiff: Compact parameter space for diffusion fine-tuning.” arXiv preprint arXiv:2303.11305 (2023).
