Linear Algebra and Its Applications Chapter 1 - Linear Equations in Linear Algebra
by Arpon Sarker
Edit: Did most of the summary on 29/6/2025
Introduction
These are just my sparse notes, as I was only trying to fill in the gaps from my course last semester.
How do elementary row operations yield the same system of linear equations?
- Row swapping is trivial to prove: the system contains the same equations, just listed in a different order.
- Row scaling just multiplies both sides of an equation by the same nonzero scalar, and it is quite easy to see that this describes the same line. $x+1=y$ is equivalent to $2x+2=2y$ as it still has the same solution set.
- Row addition of another (possibly scaled) row just adds equal values to the LHS and RHS of an equation. For example,
$$
\begin{aligned}
2x + 4y &= 14 \\
5x + 2y &= 19 \\
2x + 4y + 19 &= 14 + 19 \\
2x + 4y + (5x+2y) &= 14 + 19 \\
7x + 6y &= 33
\end{aligned}
$$
The above explanation is from Maths Stack Exchange. It changed my perspective: row addition is just balancing an equation by adding equal quantities to both sides, and if the row being added is scaled first, that step is covered by point 2, which says a scaled equation is just an equivalent line.
Another explanation uses the PLU decomposition: each elementary row operation corresponds to left-multiplication by an invertible elementary matrix, and “inverting” the product of these matrices gives the decomposition, which is just the equivalent operations in matrix form.
Furthermore, each of these linear equations can be seen as a line (in the bivariate case). Using point 3 does change the actual line into a different line, but the solution set of the system is unchanged even though the individual equations differ. Points 1 and 2 don’t suffer from this, as they don’t change the behaviour of the individual equations.
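To make this concrete, here is a minimal numpy sketch (my own illustration, not from the book) showing each elementary row operation as left-multiplication by an invertible matrix that leaves the solution set unchanged:

```python
import numpy as np

# Augmented matrix of the system 2x + 4y = 14, 5x + 2y = 19
M = np.array([[2.0, 4.0, 14.0],
              [5.0, 2.0, 19.0]])

# Each elementary row operation is left-multiplication by an invertible matrix.
swap    = np.array([[0.0, 1.0], [1.0, 0.0]])   # interchange R1 and R2
scale   = np.array([[3.0, 0.0], [0.0, 1.0]])   # replace R1 with 3*R1
replace = np.array([[1.0, 0.0], [1.0, 1.0]])   # replace R2 with R2 + R1

for E in (swap, scale, replace):
    reduced = E @ M                       # apply the row operation
    A, b = reduced[:, :2], reduced[:, 2]
    print(np.linalg.solve(A, b))          # always [3. 2.]: same solution set
```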
Definitions
Two linear systems are equivalent if they have the same solution set, and two matrices are row equivalent if there is a sequence of elementary row operations that transforms one matrix into the other. Elementary row operations are reversible, so this “row equivalence” goes both ways.
Leading Entry: the leftmost nonzero entry of a row
Echelon Form:
- All nonzero rows are above any rows of all zeros
- Each leading entry of a row is in a column to the left of the leading entry of the row below it.
- All entries in a column below a leading entry are zeros.
Reduced Echelon Form (in addition to the echelon form conditions):
- The leading entry in each nonzero row is 1.
- Each leading 1 is the only nonzero entry in its column.
Pivot Position: a position in a matrix that corresponds to a leading 1 in its reduced echelon form. A Pivot Column is a column that contains a pivot position. This definition makes sense because leading entries do not change position during further row reduction, so the leading entries of any echelon form occupy the same positions as the leading 1s of the reduced echelon form.
Basic Variables: variables corresponding to pivot columns
Free Variables: variables whose columns are not pivot columns; each can take any value, giving infinitely many solutions
Linear Combinations: Given vectors $\bold{v}_1, \ldots, \bold{v}_p \in \R^n$ and scalar weights $c_1, \ldots, c_p$, the vector $\bold{y}$ defined by \(\bold{y} = c_1\bold{v}_1 + \ldots + c_p\bold{v}_p\) is called a linear combination of $\bold{v}_1, \ldots, \bold{v}_p$.
Span: $\text{Span}(\bold{v}_1,\ldots,\bold{v}_p) =$ the set of all linear combinations of $\bold{v}_1, \ldots, \bold{v}_p$.
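A quick numerical way to test span membership (a numpy sketch; the vectors are made up for illustration): $\bold{b} \in \text{Span}(\bold{v}_1, \bold{v}_2)$ exactly when appending $\bold{b}$ as an extra column does not increase the rank.

```python
import numpy as np

# Is b in Span(v1, v2)?  Same as asking whether [v1 v2 | b] is consistent.
v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([0.0, 1.0, 1.0])
b  = np.array([2.0, 5.0, 7.0])

V = np.column_stack([v1, v2])
# b is a linear combination of v1, v2 iff appending b does not raise the rank
in_span = np.linalg.matrix_rank(np.column_stack([V, b])) == np.linalg.matrix_rank(V)
print(in_span)  # True: b = 2*v1 + 1*v2
```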
Linear Independence: \(x_1\bold{v}_1 + \ldots + x_p\bold{v}_p = \bold{0}\) has only the trivial solution ($x_i = 0$ for all $i$)
Linear Dependence: when weights $x_1, \ldots, x_p$, not all zero, still satisfy
\[x_1\bold{v}_1 + \ldots + x_p\bold{v}_p = \bold{0}\]
This equation is then called a linear dependence relation.
One-to-one:
if each $\bold{b} \in \R^m$ is the image of at most one $\bold{x} \in \R^n$; that is, $T(\bold{x}) = \bold{b}$ has either a unique solution or none at all. This is a uniqueness question.
Onto:
if each $\bold{b} \in \R^m$ is the image of at least one $\bold{x}\in\R^n$. Equivalently, the range of $T$ is the whole codomain $\R^m$. This is an existence question.
Existence and Uniqueness Questions
Throughout this chapter, two questions recur: existence (“Is the system consistent? Does at least one solution exist?”) and uniqueness (“If a solution exists, is it the only one?”).
Row Reduction Algorithm
- Begin with leftmost nonzero column. This is a pivot column and the pivot position is at the top.
- Select a nonzero entry in the pivot column as a pivot; it may be necessary to interchange rows.
- Use row replacement operations to create zeros in all positions below the pivot.
- Cover the row containing the pivot position and all rows above it. Apply steps 1-3 to the remaining submatrix until there are no more nonzero rows to modify.
To reach reduced echelon form (the backward phase; a code sketch follows this list),
- Beginning with the rightmost pivot and working upward and to the left, create zeros above each pivot. If a pivot is not 1, make it 1 by scaling.
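Here is a minimal Python sketch of this algorithm (my own implementation, not the book's). For brevity it merges the forward and backward phases by clearing entries above and below each pivot in one pass, and it uses the partial pivoting described in the Numerical Notes:

```python
import numpy as np

def rref(A, tol=1e-12):
    """Reduce A to reduced echelon form. This sketch merges the forward and
    backward phases: it clears entries above AND below each pivot as soon
    as the pivot is found."""
    A = A.astype(float).copy()
    m, n = A.shape
    pivot_row = 0
    for col in range(n):                            # scan columns left to right
        # partial pivoting: pick the entry of largest absolute value as pivot
        r = pivot_row + np.argmax(np.abs(A[pivot_row:, col]))
        if abs(A[r, col]) < tol:
            continue                                # no pivot in this column
        A[[pivot_row, r]] = A[[r, pivot_row]]       # interchange rows
        A[pivot_row] /= A[pivot_row, col]           # scale the pivot to 1
        for i in range(m):                          # zero out the rest of the column
            if i != pivot_row:
                A[i] -= A[i, col] * A[pivot_row]
        pivot_row += 1
        if pivot_row == m:                          # no rows left to reduce
            break
    return A

print(rref(np.array([[2.0, 4.0, 14.0],
                     [5.0, 2.0, 19.0]])))
# [[1. 0. 3.]
#  [0. 1. 2.]]
```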
Parametric Descriptions of Solution Sets
We arbitrarily designate the free variables as parameters and express the basic variables in terms of these parameters.
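For example (this reduced augmented matrix is assumed for illustration):

$$
\left[\begin{array}{ccc|c} 1 & 0 & -5 & 1 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{array}\right]
\quad\Rightarrow\quad
\begin{cases} x_1 = 1 + 5x_3 \\ x_2 = 4 - x_3 \\ x_3 \text{ free} \end{cases}
$$

so the general solution is $\bold{x} = \begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 5 \\ -1 \\ 1 \end{bmatrix}$ with the free variable $x_3$ as the parameter.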
Vector Equations and Matrix Equations
A vector equation $x_1\bold{a}_1 + \ldots + x_n\bold{a}_n = \bold{b}$ is the same as the matrix equation $A\bold{x} = \bold{b}$ with $A = [\bold{a}_1 \ \ldots \ \bold{a}_n]$ and $\bold{x} = (x_1, \ldots, x_n)$, which is solved via the augmented matrix $[\bold{a}_1 \quad \ldots \quad \bold{a}_n \quad \bold{b}]$
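In numpy terms (a sketch with made-up numbers), the matrix-vector product is literally this linear combination of the columns:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([2.0, -1.0])

# A @ x is exactly the combination of A's columns with weights from x
as_matrix_product = A @ x
as_combination    = x[0] * A[:, 0] + x[1] * A[:, 1]
print(np.allclose(as_matrix_product, as_combination))  # True
```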
Linear Independence of Matrix Columns
The columns of a matrix A are linearly independent iff the equation $A\bold{x} = \bold{0}$ has only the trivial solution.
A set of one vector is linearly dependent iff it is the zero vector; a set of two vectors is linearly dependent iff one of the vectors is a multiple of the other.
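A numerical way to test this (a numpy sketch; the matrix is made up): the trivial solution is the only solution exactly when the rank equals the number of columns.

```python
import numpy as np

# Columns are linearly independent iff Ax = 0 has only the trivial solution,
# i.e. iff rank(A) equals the number of columns.
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 3.0]])
print(np.linalg.matrix_rank(A) == A.shape[1])  # True: columns independent
```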
Linear Transformations
Think of a matrix $A$ as an object that “acts” on a vector $\bold{x}$, rather than viewing $A\bold{x}$ merely as a linear combination of the columns of $A$.
A transformation (or function or mapping) $T$ from $\R^n$ to $\R^m$ is a rule that assigns each vector $\bold{x} \in \R^n$ to a vector $T(\bold{x}) \in \R^m$. The domain is $\R^n$ and the codomain is $\R^m$. The image of $\bold{x}$ (under the action of $T$) is $T(\bold{x})$ and the set of all images is called the range of $T$.
A transformation $T$ is linear if:
- $T(\bold{u} + \bold{v}) = T(\bold{u}) + T(\bold{v})$
- $T(c\bold{u}) = cT(\bold{u})$
which can be extended to
\[T(c_1\bold{v}_1 + \ldots + c_p\bold{v}_p) = c_1T(\bold{v}_1) + \ldots + c_pT(\bold{v}_p)\]
Two further consequences are $T(\bold{0}) = \bold{0}$ and $T(c\bold{u} + d\bold{v}) = cT(\bold{u}) + dT(\bold{v})$.
The first statement is the superposition principle, which is important in physics and engineering. To show a transformation is linear, it suffices to verify the last identity, since it implies both properties 1 and 2.
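As a quick numerical sanity check (a sketch with made-up random data), the single identity $T(c\bold{u} + d\bold{v}) = cT(\bold{u}) + dT(\bold{v})$ holds for any matrix transformation:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
u, v = rng.standard_normal(3), rng.standard_normal(3)
c, d = 2.0, -0.5

T = lambda x: A @ x  # a matrix transformation is linear

# the single identity T(cu + dv) = cT(u) + dT(v) captures both axioms
print(np.allclose(T(c*u + d*v), c*T(u) + d*T(v)))  # True
```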
Theorems
Theorem 1: Uniqueness of the Reduced Echelon Form
Each matrix is row equivalent to ONLY one reduced echelon matrix.
Theorem 2: Existence and Uniqueness Theorem
A linear system is consistent iff the rightmost column of the augmented matrix is not a pivot column. If it is consistent, then either there are no free variables (a unique solution) or there are free variables (infinitely many solutions).
Theorem 3:
$A\bold{x} = \bold{b}$ has the same solution set as $x_1\bold{a}_1 + \ldots + x_n\bold{a}_n = \bold{b}$, which has the same solution set as the system with augmented matrix $[\bold{a}_1 \ \ldots \ \bold{a}_n \ \bold{b}]$
Theorem 4:
The following statements are logically equivalent for an $m \times n$ matrix $A$:
- For each $\bold{b} \in \R^m$, $A\bold{x}=\bold{b}$ has a solution
- Each $\bold{b} \in \R^m$ is a linear combination of the columns of $A$
- The columns of $A$ span $\R^m$
- $A$ has a pivot position in every row.
Proof that d) implies a): row reduce the augmented matrix,
\[[A \quad \bold{b}] \quad \sim \quad \ldots \quad \sim \quad [U \quad \bold{d}]\]
If statement d) is true, then each row of $U$ contains a pivot position, so there can be no pivot in the augmented column. Therefore the system is consistent for any $\bold{b}$.
Theorem 5:
- $A(\bold{u} + \bold{v}) = A\bold{u} + A\bold{v}$
- $A(c\bold{u}) = c(A\bold{u})$
Theorem 6:
Suppose $A\bold{x}=\bold{b}$ is consistent and let $\bold{p}$ be a solution. Then the solution set is the set of all vectors $\bold{w} = \bold{p} + \bold{v}_h$ where $\bold{v}_h$ is any solution of the homogeneous equation $A\bold{x} = \bold{0}$.
The parametric equation for $A\bold{x} = \bold{b}$ is $\bold{x} = \bold{p} + t\bold{v}$, whereas the solution of $A\bold{x} = \bold{0}$ is $\bold{x} = t\bold{v}$. This shows that the difference is just a translation, described as “the equation of the line through $\bold{p}$ parallel to $\bold{v}$.”
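A numpy sketch of Theorem 6 (the matrix and $\bold{b}$ are made up for illustration): take a particular solution $\bold{p}$, a homogeneous solution $\bold{v}_h$, and check that every $\bold{p} + t\bold{v}_h$ solves $A\bold{x} = \bold{b}$.

```python
import numpy as np

# A has a one-dimensional null space, so Ax = b has a line of solutions
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 7.0]])
b = np.array([6.0, 13.0])

p = np.linalg.lstsq(A, b, rcond=None)[0]       # one particular solution
_, _, Vt = np.linalg.svd(A)
v_h = Vt[-1]                                   # spans the homogeneous solutions

for t in (-1.0, 0.0, 2.5):
    print(np.allclose(A @ (p + t * v_h), b))   # True: p + t*v_h solves Ax = b
```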
Theorem 7: Characterisation of Linearly Dependent Sets
An indexed set $S = \{\bold{v}_1, \ldots, \bold{v}_p\}$ of two or more vectors is linearly dependent iff at least one of the vectors is a linear combination of the others. In fact, if $S$ is linearly dependent and $\bold{v}_1 \neq \bold{0}$, then some $\bold{v}_j$ ($j > 1$) is a linear combination of the preceding vectors $\bold{v}_1,\ldots,\bold{v}_{j-1}$
If some $\bold{v}_j$ is a linear combination of the other vectors, then subtracting it from both sides produces a linear dependence relation. For example, if
\[\bold{v}_1 = c_2\bold{v}_2 + c_3\bold{v}_3\]
then
\[\bold{0} = (-1)\bold{v}_1 + c_2\bold{v}_2 + c_3\bold{v}_3 + 0\bold{v}_4 + \ldots + 0\bold{v}_p\]
Conversely, suppose $S$ is linearly dependent, so there is a dependence relation whose weights are not all zero. Let $j$ be the largest subscript for which $c_j \neq 0$ (if $j = 1$ then $c_1\bold{v}_1 = \bold{0}$ would force $\bold{v}_1 = \bold{0}$, so $j > 1$). Then
\[c_1\bold{v}_1 + \ldots + c_j\bold{v}_j + 0\bold{v}_{j+1} + \ldots + 0\bold{v}_p = \bold{0}\]
\[\bold{v}_j = \left(-\frac{c_1}{c_j}\right)\bold{v}_1 + \ldots + \left(-\frac{c_{j-1}}{c_j}\right)\bold{v}_{j-1}\]
Theorem 8:
If a set contains more vectors than there are entries in each vector, the set is linearly dependent. This is because the matrix $A = [\bold{v}_1 \ \ldots \ \bold{v}_p]$ then has more columns than rows, so there must be a free variable, which gives a nontrivial solution to $A\bold{x} = \bold{0}$.
Theorem 9:
If a set contains the zero vector, then the set is linearly dependent.
Theorem 10:
Let $T: \R^n \rightarrow \R^m$ be a linear transformation. Then there exists a unique matrix $A$ such that \(T(\bold{x}) = A\bold{x}\) for all $\bold{x} \in \R^n$,
where $A = [T(\bold{e}_1) \ \ldots \ T(\bold{e}_n)]$ and $\bold{e}_j$ is the $j$th column of the identity matrix. This matrix is called the standard matrix of the linear transformation $T$
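A sketch of this construction in numpy (the rotation map is my own example): build the standard matrix column by column by feeding $T$ the columns of the identity.

```python
import numpy as np

def standard_matrix(T, n):
    """Build the standard matrix of a linear map T: R^n -> R^m."""
    # column j is T applied to the j-th column of the identity matrix
    return np.column_stack([T(e) for e in np.eye(n)])

# rotation of the plane by 90 degrees, written without any matrix
rotate90 = lambda x: np.array([-x[1], x[0]])

A = standard_matrix(rotate90, 2)
print(A)                                   # [[ 0. -1.]
                                           #  [ 1.  0.]]
print(np.allclose(A @ [3.0, 4.0], rotate90(np.array([3.0, 4.0]))))  # True
```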
Theorem 11:
T is one-to-one iff the equation $T(\bold{x}) = \bold{0}$ has only the trivial solution.
This can be thought of in terms of free variables and linear independence.
Theorem 12:
- T maps $\R^n$ onto $\R^m$ iff the columns of A span $\R^m$
- T is one-to-one iff the columns of A are linearly independent.
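Theorem 12 turns both questions into rank conditions on the standard matrix; a small numpy sketch (the matrix is made up):

```python
import numpy as np

def is_one_to_one(A):
    # columns linearly independent <=> pivot in every column <=> rank = n
    return np.linalg.matrix_rank(A) == A.shape[1]

def is_onto(A):
    # columns span R^m <=> pivot in every row <=> rank = m
    return np.linalg.matrix_rank(A) == A.shape[0]

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])          # 3x2: maps R^2 into R^3
print(is_one_to_one(A), is_onto(A))  # True False
```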
Applications
- Leontief “exchange model” (Economics) - Equilibrium Prices
- Balancing Chemical Equations
- Network Flow (balancing flow in and flow out)
- Cambridge Diet (linear programming problems)
- Electrical Networks (Ohm’s Law and Kirchhoff’s Voltage Law)
- Linear Difference Equation (Recurrence Relation) $\bold{x}_{k+1} = A\bold{x}_k$
Numerical Notes
- Partial Pivoting: a computer program selects as the pivot the entry in a column with the largest absolute value, to reduce roundoff errors in calculations.
- Asymptotic analysis of the forward phase of row reduction for an $n \times (n+1)$ matrix gives $2n^3/3 + n^2/2 - 7n/6$ flops to reach echelon form, and at most $n^2$ flops for the backward phase to reduced echelon form, where a flop is one arithmetic operation on two floating point numbers.
- To optimize a computer algorithm that computes $A\bold{x}$, the calculations should use data stored in contiguous memory locations. Fortran stores a matrix by columns (so compute $A\bold{x}$ as a linear combination of the columns of $A$), while C stores matrices by rows (so compute each entry of $A\bold{x}$ as a dot product of a row of $A$ with $\bold{x}$).
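A small numpy sketch contrasting the two access patterns (illustrative only; numpy's `A @ x` already handles this internally):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([5.0, 6.0])

# Column-oriented (Fortran-friendly): Ax as a linear combination of columns
y_cols = sum(x[j] * A[:, j] for j in range(A.shape[1]))

# Row-oriented (C-friendly): each entry of Ax is a dot product with a row
y_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])

print(np.allclose(y_cols, y_rows))  # True: same result, different memory access
```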