
Eigen Intuitions: Understanding Eigenvectors and Eigenvalues
By Peter Barrett Bryan, June 2022



An intuitive basis for understanding all things “eigen”

We often want to transform our data to reduce the number of features while preserving as much variance (i.e., the differences among our samples) as we can. Often, you’ll hear folks refer to principal component analysis (PCA) and singular value decomposition (SVD), but we can’t appreciate how these methods work without first understanding what eigenvectors and eigenvalues are.

“Eigenvector” is a pretty weird word. As with many weird words (think kindergarten), we can blame the Germans. The most useful translation I’ve heard comes from the Coursera course “Mathematics for Machine Learning Specialization”: “eigen” means “characteristic.”

An “eigenvector” is a vector that “characterizes” a linear transform.

Let’s take a look at a couple of vectors under arbitrary linear transforms like scaling, rotation, and shear. Most of our vectors are shifted around. Some, though, point in the same direction before and after a transform.

Let’s take a look at a simple horizontal scaling! We can achieve this with the linear transform matrix [[2, 0], [0, 1]].

Figure 1: Visualizing three vectors through a horizontal scaling. Image by author.

If we plot three unit-length vectors (one at 0°, one at 45°, and one at 90°) and visualize what happens after applying our transform matrix, we see that some vectors remain pointed in the same directions (0° and 90°) as before while others do not (45°). There’s something interesting about the two vectors that remain pointed in the same direction before and after. Under the linear transform, these vectors are just scaled by a scalar term. Our unit vector at 90° is unchanged, i.e., scaled by 1, while our unit vector at 0° is doubled. These vectors are our eigenvectors!
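To make Figure 1 concrete, here is a minimal numpy sketch (an illustration of the same experiment, separate from the manim animation) that applies the transform matrix to the three unit vectors and tests whether each output still points along its input:

```python
import numpy as np

# Horizontal scaling by 2: the transform matrix from Figure 1.
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

for angle_deg in (0, 45, 90):
    theta = np.deg2rad(angle_deg)
    v = np.array([np.cos(theta), np.sin(theta)])  # unit vector at angle_deg
    Av = A @ v
    # v keeps its direction exactly when Av is a scalar multiple of v,
    # i.e. when the 2D "cross product" v[0]*Av[1] - v[1]*Av[0] vanishes.
    same_direction = np.isclose(v[0] * Av[1] - v[1] * Av[0], 0.0)
    print(f"{angle_deg:>2}°: A @ v = {np.round(Av, 3)}, same direction: {same_direction}")
```

The 0° and 90° vectors come back as scalar multiples of themselves; the 45° vector does not.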

The eigenvectors of a linear transform are those vectors that remain pointed in the same directions. For these vectors, the effect of the transform matrix is just scalar multiplication. For each eigenvector, the eigenvalue is the scalar that the vector is scaled by under the transform.

Plotting a bunch of vectors and waiting for an animation to render isn’t a terribly efficient approach. Luckily, all we need to do is formalize the intuitions we’ve already built. Some of the equations look a little intimidating, but they aren’t so bad once we understand where they come from.

While the mathematics extend to matrices of arbitrary dimensionality, we are going to stick to a 2×2 matrix for this demonstration.

Let’s consider a linear transform matrix A. As we saw, the eigenvectors for a matrix are the vectors that are just scaled. These scalars are eigenvalues, and we’ll call them λ. We’ll call our eigenvectors x.

All together now… the eigenvectors x are scaled by our eigenvalues λ when transformed by our matrix A.

We can formalize this with the top equation in Figure 2.

Figure 2: Basic formalization of eigenvectors and eigenvalues. Image by author.
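For readers skimming past the figure, that top equation is the standard eigen-equation (rewritten here in plain LaTeX):

```latex
A\mathbf{x} = \lambda\mathbf{x}
```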

Let’s move some terms around! Subtracting off λx and factoring out x gives us a nice zero-valued equality. To subtract the scalar λ off of the matrix A, we need to multiply it by an identity matrix of the same dimensionality as A.
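Written out, the rearrangement looks like this (the identity matrix I is what lets us subtract the scalar λ from the matrix A):

```latex
A\mathbf{x} - \lambda\mathbf{x} = \mathbf{0}
\quad\Longrightarrow\quad
(A - \lambda I)\,\mathbf{x} = \mathbf{0}
```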

Let’s solve for our eigenvectors and eigenvalues! We aren’t interested in the “trivial” solution to these equations where the x vector is zero-valued. Instead, we want to know when the term (A-λI) squashes some nonzero vector x down to zero, which only happens when (A-λI) is singular. We can check for that by requiring that its determinant (Figure 3) be zero-valued!

Figure 3: Definition of the determinant of a 2×2 matrix. Image by author.
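That 2×2 determinant is the familiar ad - bc rule:

```latex
\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc
```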

Let’s expand our matrix A now, so that we can see each of the values in the matrix (Figure 4).

Figure 4: Expanding the matrix A. Image by author.

Piecing things together, we get the equalities shown in Figure 5.

Figure 5: Checking det(A-λI)=0. Image by author.
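Spelled out in the same a, b, c, d notation, Figures 4 and 5 amount to:

```latex
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},
\qquad
A - \lambda I = \begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix},
\qquad
\det(A - \lambda I) = 0
```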

Using our definition of the determinant from Figure 3, we can substitute in our values.

Figure 6: Substituting in the values of our matrix into the definition of a determinant. Image by author.

Finally, multiplying out terms, we recover a form called the “characteristic polynomial.”

Figure 7: Definition of the “characteristic polynomial.” Image by author.
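Carrying out that substitution and multiplying through gives the characteristic polynomial (a reconstruction of Figures 6 and 7):

```latex
(a - \lambda)(d - \lambda) - bc = 0
\quad\Longrightarrow\quad
\lambda^{2} - (a + d)\lambda + (ad - bc) = 0
```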

That’s as far as we can go in the abstract! Now, let’s apply this “characteristic polynomial” and solve for our eigenvalues (λ).

Figure 8: Sample matrix A. Image by author.

Plugging the values from our matrix into our characteristic polynomial…

Figure 9: Solving the characteristic polynomial for our matrix A. Image by author.
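Judging by the results that follow, the sample matrix in Figure 8 is the horizontal-scaling matrix from Figure 1, so the characteristic polynomial factors cleanly:

```latex
A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}
\quad\Longrightarrow\quad
\lambda^{2} - 3\lambda + 2 = (\lambda - 2)(\lambda - 1) = 0
\quad\Longrightarrow\quad
\lambda = 2 \ \text{or} \ \lambda = 1
```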

We get our eigenvalues (λ). Now, we can substitute our eigenvalues back in to solve for our eigenvectors.

Figure 10: Determining our eigenvectors based on eigenvalues. Image by author.
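Concretely, plugging each eigenvalue back into (A-λI)x = 0 gives:

```latex
\lambda = 2:\quad
\begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} 0 \\ -x_2 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\qquad
\lambda = 1:\quad
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} x_1 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \end{pmatrix}
```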

This is an odd result… at λ = 2, we get [0, -x₂] = 0. Our x₁ seems to have disappeared. What does this mean?

It means that for λ = 2, as long as x₂ is zero, x₁ can equal anything.

  • [5, 0], [1, 0], and [-3, 0] for instance

Similarly, for λ = 1, as long as x₁ is zero, x₂ can equal anything.

  • [0, 2], [0, -1], [0, 8] for instance

We express this invariance by substituting in a placeholder variable t for the terms that can take on any value.

Figure 11: Defining the eigenvectors of our space. Image by author.
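With the placeholder t, the two eigenvector families from Figure 11 read as follows, for any nonzero choice of t (the zero vector is excluded by convention):

```latex
\lambda = 2:\ \mathbf{x} = t\begin{pmatrix} 1 \\ 0 \end{pmatrix},
\qquad
\lambda = 1:\ \mathbf{x} = t\begin{pmatrix} 0 \\ 1 \end{pmatrix}
```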

This is exactly what our visual intuitions showed us! All the horizontal vectors of our space are eigenvectors and they are scaled by the eigenvalue 2. All the vertical vectors of our space are eigenvectors and they are scaled by the eigenvalue 1.
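We can also cross-check the hand-derived answer numerically (a quick sanity check with numpy; np.linalg.eig returns one unit-length representative from each eigenvector family):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are
# unit-length eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)    # [2. 1.]
print(eigenvectors)   # columns are [1, 0] and [0, 1]
```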

I found a lot of value in plotting things for myself. If you want to try, check out the source below!

It is important to give credit where it is most definitely due! While the code in the article is mine, the package used for visualization (manim) is certainly not! The visualization library and the method of explanation are shamelessly stolen from 3blue1brown.

I mentioned it earlier in the article, but I love this Coursera series: Mathematics for Machine Learning Specialization!

