Gram–Schmidt process facts for kids

The first two steps of the Gram–Schmidt process

The Gram–Schmidt process is a clever method in mathematics that helps us "tidy up" a group of vectors. Imagine you have several arrows pointing in different directions. This process helps you change them into a new set of arrows that are all perfectly perpendicular to each other. It can also make them "orthonormal," meaning they are perpendicular and each have a length of exactly one.

This method is often used in a part of math called linear algebra, which deals with vectors and spaces. It's especially useful in Euclidean space, which is like the normal space we live in, where we can measure distances and angles.

The Gram–Schmidt process takes a group of vectors that are "linearly independent" (meaning they don't just point in the same direction or are simple copies of each other). It then turns them into a new group of vectors that are perpendicular and still "cover" the same space as the original vectors.

The method is named after two mathematicians, Jørgen Pedersen Gram and Erhard Schmidt. However, a very famous mathematician named Pierre-Simon Laplace knew about it even before them!

One important use of the Gram–Schmidt process is in something called QR decomposition. This is a way to break down a matrix (a grid of numbers) into two simpler parts: one that is "orthogonal" (like our perpendicular vectors) and one that is "triangular."

How the Gram–Schmidt Process Works
Example in 2D Space
Numerical Stability
Algorithm (Computer Steps)
See also

How the Gram–Schmidt Process Works

The modified Gram-Schmidt process being executed on three linearly independent, non-orthogonal vectors of a basis for R³. This animation shows how the process makes vectors perpendicular one by one.

Let's think about how this process works step by step. Imagine you have a set of original vectors, let's call them v₁, v₂, and so on. The goal is to create a new set of vectors, u₁, u₂, etc., that are all perpendicular to each other.

First, we pick the first original vector, v₁. This will be our first "straightened" vector, so we set u₁ = v₁.

Next, we want to find u₂. We take the second original vector, v₂. We then remove any part of v₂ that points in the same direction as u₁. Think of it like finding the "shadow" of v₂ on u₁ and then subtracting that shadow. What's left will be perfectly perpendicular to u₁. This leftover part becomes u₂.

We use a special tool called a "projection operator" for this. It helps us find the part of one vector that lies along another. The formula for this is: Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): {\displaystyle \operatorname{proj}_{\mathbf{u}} (\mathbf{v}) = \frac{\langle \mathbf{v}, \mathbf{u}\rangle}{\langle \mathbf{u}, \mathbf{u}\rangle}{\mathbf{u}}} Here, $\langle \mathbf{v}, \mathbf{u}\rangle$ is the "inner product" (like a dot product) of the vectors v and u. It tells us how much they point in the same direction.

So, the steps look like this:

u₁ = v₁
u₂ = v₂ - (the part of v₂ that points along u₁)
u₃ = v₃ - (the part of v₃ that points along u₁) - (the part of v₃ that points along u₂)

And so on for all the vectors. Each new u vector is made by taking the original v vector and subtracting all the parts that point along the previously found u vectors. This guarantees that each new u vector is perpendicular to all the ones before it.

Once you have all the u vectors (which are perpendicular), you can make them "orthonormal" by simply dividing each u vector by its own length. This makes each vector have a length of 1. These new vectors are often called e₁, e₂, etc.

If you apply this process to a set of vectors that are not truly independent (meaning some are just combinations of others), the process will give you a zero vector at some point. If you want an orthonormal set, you just skip these zero vectors. The number of non-zero vectors you end up with tells you the true "dimension" of the space your original vectors covered.

Example in 2D Space

Let's try an example with two vectors in a 2D space (like a flat piece of paper). Our original vectors are:

v₁ = (3, 1)
v₂ = (2, 2)

First, we set u₁:

u₁ = v₁ = (3, 1)

Next, we find u₂. We need to subtract the part of v₂ that points along u₁. The projection of v₂ onto u₁ is calculated as: Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): {\displaystyle \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_2) = \frac{\langle \mathbf{v}_2, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}{\mathbf{u}_1} } Let's calculate the dot products:

Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): \langle \mathbf{v}_2, \mathbf{u}_1\rangle = (2 * 3) + (2 * 1) = 6 + 2 = 8
Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): \langle \mathbf{u}_1, \mathbf{u}_1\rangle = (3 * 3) + (1 * 1) = 9 + 1 = 10

So, the projection is: Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): {\displaystyle \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_2) = \frac{8}{10} \begin{bmatrix} 3 \\1 \end{bmatrix} = \begin{bmatrix} 24/10 \\ 8/10 \end{bmatrix} = \begin{bmatrix} 12/5 \\ 4/5 \end{bmatrix} }

Now, we find u₂: Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): {\displaystyle \mathbf{u}_2 = \mathbf{v}_2 - \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_2) = \begin{bmatrix}2\\2\end{bmatrix} - \begin{bmatrix} 12/5 \\ 4/5 \end{bmatrix} = \begin{bmatrix} 10/5 - 12/5 \\ 10/5 - 4/5 \end{bmatrix} = \begin{bmatrix} -2/5 \\ 6/5 \end{bmatrix} }

So, our orthogonal vectors are u₁ = (3, 1) and u₂ = (-2/5, 6/5). We can check if they are perpendicular by calculating their dot product: Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): {\displaystyle \langle\mathbf{u}_1,\mathbf{u}_2\rangle = (3 * -2/5) + (1 * 6/5) = -6/5 + 6/5 = 0} Since the dot product is 0, they are indeed perpendicular!

To make them orthonormal (length 1), we divide each by its length:

Length of u₁ = Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): \sqrt{3^2 + 1^2} = \sqrt{9 + 1} = \sqrt{10}
e₁ = Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): \frac{1}{\sqrt{10}}\begin{bmatrix}3\\1\end{bmatrix}

Length of u₂ = Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): \sqrt{(-2/5)^2 + (6/5)^2} = \sqrt{4/25 + 36/25} = \sqrt{40/25} = \frac{\sqrt{40}}{5} = \frac{2\sqrt{10}}{5}
e₂ = Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): \frac{1}{2\sqrt{10}/5} \begin{bmatrix}-2/5\\6/5\end{bmatrix} = \frac{5}{2\sqrt{10}} \begin{bmatrix}-2/5\\6/5\end{bmatrix} = \frac{1}{\sqrt{10}} \begin{bmatrix}-1\\3\end{bmatrix}

Now we have two orthonormal vectors: e₁ and e₂.

Numerical Stability

When computers perform the Gram–Schmidt process, tiny "rounding errors" can happen. These errors can make the vectors not perfectly perpendicular, especially if there are many vectors or if the original vectors are almost pointing in the same direction. This is called "numerical instability."

To fix this, there's a slightly different version called the modified Gram-Schmidt (MGS) process. It gives the exact same answer if there are no errors, but it handles the small computer errors much better.

Instead of subtracting all the projections at once, the modified version subtracts them one by one. For example, to find u_k, it first subtracts the part along u₁, then from that new vector, it subtracts the part along u₂, and so on. This step-by-step approach helps keep the vectors more accurately perpendicular.

Algorithm (Computer Steps)

Here's how you might tell a computer to do the classical Gram–Schmidt process, using a language like MATLAB:

function U = gramschmidt(V)
    % V is a matrix where each column is an input vector
    % U will be a matrix where each column is an orthonormal vector
    [n, k] = size(V); % n is dimension of vectors, k is number of vectors
    U = zeros(n,k);   % Create an empty matrix for the results

    % First vector is just the first input vector, normalized
    U(:,1) = V(:,1) / norm(V(:,1));

    % Loop through the rest of the vectors
    for i = 2:k
        U(:,i) = V(:,i); % Start with the original vector
        % Subtract projections onto all previously found U vectors
        for j = 1:i-1
            % Calculate the projection and subtract it
            U(:,i) = U(:,i) - (U(:,j)'*U(:,i)) * U(:,j);
        end
        % Normalize the resulting vector
        U(:,i) = U(:,i) / norm(U(:,i));
    end
end

This set of instructions tells the computer exactly how to perform the Gram–Schmidt orthonormalization. It's a very efficient way to do it for many vectors.

Gram–Schmidt process facts for kids

Contents

How the Gram–Schmidt Process Works

Example in 2D Space

Numerical Stability

Algorithm (Computer Steps)

See also