What is a Determinant?
I was watching Gilbert Strang’s 18th lecture in 18.06 Linear Algebra a couple of days ago, and he laid out a theory of determinants that started from a few basic properties and derived all the usual results. However, he provided essentially no motivation for what he was doing. Why these properties? How did anyone ever think of these particular axioms? And more tellingly, what is a determinant, really? I don’t mean the official definition (here quoted from Wikipedia and similar to Strang’s):
If we write an n-by-n matrix A in terms of its column vectors
A = [a_1, a_2, …, a_n]
where the a_j are vectors of size n, then the determinant of A is defined so that
det[a_1, …, b a_j + c v, …, a_n] = b det[a_1, …, a_j, …, a_n] + c det[a_1, …, v, …, a_n]
det[a_1, …, a_j, a_(j+1), …, a_n] = -det[a_1, …, a_(j+1), a_j, …, a_n]
det(I) = 1
where b and c are scalars, v is any vector of size n and I is the identity matrix of size n. These properties state that the determinant is an alternating multilinear function of the columns, and they suffice to uniquely calculate the determinant of any square matrix. Provided the underlying scalars form a field (more generally, a commutative ring with unity), the definition below shows that such a function exists, and it can be shown to be unique.
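(As an aside of my own, not part of the quote: a minimal sympy sketch can check that the familiar 2-by-2 formula really does satisfy all three properties.)

import sympy as sp

a, b, c, d, s, t, v1, v2 = sp.symbols('a b c d s t v1 v2')

def det2(col1, col2):
    # The familiar 2-by-2 formula, written as a function of the two columns.
    return col1[0] * col2[1] - col1[1] * col2[0]

col1, col2, v = (a, c), (b, d), (v1, v2)

# Multilinearity in a column: det[s*col1 + t*v, col2] == s*det[col1, col2] + t*det[v, col2]
mixed = (s * col1[0] + t * v[0], s * col1[1] + t * v[1])
assert sp.expand(det2(mixed, col2) - (s * det2(col1, col2) + t * det2(v, col2))) == 0

# Alternating: swapping the two columns flips the sign.
assert sp.expand(det2(col2, col1) + det2(col1, col2)) == 0

# Normalization: det(I) == 1.
assert det2((1, 0), (0, 1)) == 1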
I can follow the derivation from that, but it doesn’t really explain what a determinant is. And the only alternative I could find in Wikipedia or the readily available textbooks was that it’s the volume of the parallelepiped whose sides are the vectors that form the matrix. Again, that feels like a derived property, not a true definition. However, MathWorld did give me one big hint:
For example, eliminating x, y, and z from the equations
a_1 x + a_2 y + a_3 z = 0
b_1 x + b_2 y + b_3 z = 0
c_1 x + c_2 y + c_3 z = 0
gives the expression
a_1 b_2 c_3 – a_1 b_3 c_2 + a_2 b_3 c_1 – a_2 b_1 c_3 + a_3 b_1 c_2 – a_3 b_2 c_1 = 0
which is called the determinant for this system of equation.
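(A quick sanity check of my own, not part of the MathWorld quote: a few lines of sympy confirm that this six-term expression is exactly the determinant of the coefficient matrix.)

import sympy as sp

a1, a2, a3, b1, b2, b3, c1, c2, c3 = sp.symbols('a1 a2 a3 b1 b2 b3 c1 c2 c3')
M = sp.Matrix([[a1, a2, a3], [b1, b2, b3], [c1, c2, c3]])

# The six-term expression obtained by eliminating x, y, and z above.
expression = a1*b2*c3 - a1*b3*c2 + a2*b3*c1 - a2*b1*c3 + a3*b1*c2 - a3*b2*c1

assert sp.expand(M.det() - expression) == 0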
So here’s the answer: the determinant is the condition under which a set of linear equations has a non-trivial null space. Or, more simply, the determinant is the condition on the coefficients a, b, c… of a set of n linear equations in n unknowns such that they can be solved for the right hand side (0, 0, 0, …0) where at least one of the unknowns (x, y, …) is not zero. Let me prove that:
To make things simpler, let’s start with the 2 by 2 case:
ax + by = 0
cx + dy = 0
Without loss of generality, assume that a != 0. (If a == 0 but c != 0, we can just swap the two equations. And if a == c == 0, then x drops out entirely: ad – bc = 0 automatically, and any (x, 0) with x != 0 is a non-trivial solution, so that case already fits the claim.)
Now eliminate x and y from these equations. From the first equation we get
x = -by / a
Substituting into the second we get
c (-by / a) + dy = 0
implies
(d – bc/a) y = 0
so either y = 0 (trivial solution) or, if y != 0, we have
d – bc/a = 0
which implies
ad – bc = 0
the usual determinant formula for a 2 by 2 matrix. So what this says is that if ad – bc = 0, then given any y we can pick x = -by / a and the equations are solved. That is, we have non-zero solutions (an infinite number of them in fact) for
ax + by = 0
cx + dy = 0
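For instance, taking y = a gives the particular non-zero solution (x, y) = (-b, a). Here is a quick numpy check, using [[1, 2], [2, 4]] as an example matrix of my own choosing with ad – bc = 0:

import numpy as np

a, b, c, d = 1, 2, 2, 4              # ad - bc = 4 - 4 = 0
A = np.array([[a, b], [c, d]])
x = np.array([-b, a])                # the non-zero solution (-b, a)

print(np.linalg.det(A))              # 0.0 (up to rounding)
print(A @ x)                         # [0 0]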
In other words, the matrix A := [[a, b], [c, d]] has a non-trivial null space. In particular, its null space has dimension at least 1. That means that:
- The matrix A has a rank of at most 1. (This follows from the rank-nullity theorem, covered a few lectures back in 18.06: rank + nullity = 2, and the nullity is at least 1.)
- The columns of A are linearly dependent.
- Every linear combination of the columns of A is a scalar multiple of a single column; the column space is at most a line.
In fact, these are three different ways of saying the exact same thing. (And for clarity on these points watch the first few lectures of 18.06.)
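For the same example matrix [[1, 2], [2, 4]], a couple of lines of numpy illustrate all three statements (again a sketch of my own, not from the lecture):

import numpy as np

A = np.array([[1, 2], [2, 4]])       # ad - bc = 0

print(np.linalg.matrix_rank(A))      # 1: the rank is at most 1
print(A[:, 1] - 2 * A[:, 0])         # [0 0]: the second column is twice the first
# So any combination s*col1 + t*col2 equals (s + 2t)*col1: a constant times one column.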
Continuing onward, the equation Ax = b can only be solved when b is a linear combination of the columns of A, but in this case that’s just a line: the span of [a, c]. We can easily pick a vector in the plane that is not on that line. Choose one such vector b’. For definiteness pick b’ = [-c, a]; since a != 0 it is a non-zero vector perpendicular to [a, c], so it cannot be a multiple of [a, c]. If an inverse existed then we could solve the equation Ax = b‘ as x = A⁻¹b‘; but since we can’t solve that equation, no such inverse can exist. Thus when the determinant of A (a.k.a. det(A), a.k.a. ad – bc) == 0, the matrix A is not invertible. Q.E.D.
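To make that last step concrete, here is a small numpy sketch (my own example, not from the lecture): with A = [[1, 2], [2, 4]] and b’ = [-c, a] = [-2, 1], attempting to solve Ax = b’ fails because the matrix is singular.

import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])   # ad - bc = 0; the columns span the line y = 2x
b_prime = np.array([-2.0, 1.0])          # [-c, a]: perpendicular to [a, c], so off that line

try:
    np.linalg.solve(A, b_prime)
except np.linalg.LinAlgError as err:
    print("no inverse, no solution:", err)   # LAPACK reports a singular matrix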
Anyway, that’s where the determinant comes from. And what is the determinant? It’s a condition on the coefficients of a set of n linear equations in n unknowns (represented by the square matrix A) such that the equation Ax = 0 has solutions other than the zero vector. From that all the rest follows. Or at least it does for the 2 by 2 case. I still need to extend this description to cover the n by n case. Nonetheless that is, to me at least, a much more satisfying and intuitive definition than a set of three unmotivated axioms that just happen to magically generate a host of non-obvious properties.
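For what it’s worth, the n by n story is at least easy to poke at numerically (this is only an illustrative sketch, not the promised extension of the argument): build an n by n matrix with a deliberately dependent column, and the determinant vanishes while a non-trivial null vector shows up.

import numpy as np

rng = np.random.default_rng(0)
n = 5

A = rng.standard_normal((n, n))
A[:, -1] = 2.0 * A[:, 0] - A[:, 1]    # force the last column to depend on the first two

print(np.linalg.det(A))               # ~0 (up to rounding)

# The right-singular vector for the smallest singular value spans the (near-)null space.
_, s, Vt = np.linalg.svd(A)
x = Vt[-1]                            # a unit vector with A @ x ~ 0
print(s[-1], np.linalg.norm(A @ x))   # both ~0: a non-trivial solution of Ax = 0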
P.S. If anyone knows a better way to include math in WordPress, please let me know.
Update
Following the suggestion below, I’ve installed MathJax on this blog, and it seems to work well. I am a little concerned though that it may be loading a big JavaScript library on every page even though most posts here don’t require it.
February 12th, 2012 at 6:22 pm
http://www.mathjax.org/ lets you add a single js file and type “$ A = \left\{a, b, \cdots\right\} $”, and it parses it and outputs either a picture or MathML.
February 13th, 2012 at 8:26 am
Thanks for this; “why/what is the determinant” is a topic that’s been on my mind too.
I think the parallelepiped has some significance not mentioned in the usual presentation: it is the figure resulting from the matrix’s linear transformation applied to the unit axis-aligned cube between (0,0,0) and (1,1,1) in an orthonormal basis — or, in other words, { aî + bĵ + ck̂ | a,b,c ∈ [0,1] } — which could be considered the most basic three-dimensional shape. (I assume this also generalizes to arbitrary dimensions.)
Then the cases where the transformation is not invertible = the cases where the parallelepiped is of zero volume = the cases where the determinant is zero. We can start by asking “When is a linear transformation invertible?”, find that it is so if a unit cell is not transformed to a degenerate shape with zero volume, and obtain the determinant as the formula for that volume.
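(A numerical aside on the volume claim, with an arbitrary example matrix: for a 3-by-3 matrix, the volume of the parallelepiped spanned by the columns is the absolute value of the scalar triple product, and it matches |det|.)

import numpy as np

A = np.array([[1.0, 0.5, 0.0],
              [0.0, 2.0, 0.3],
              [0.2, 0.0, 1.5]])             # an arbitrary example matrix

c1, c2, c3 = A[:, 0], A[:, 1], A[:, 2]
volume = abs(np.dot(c1, np.cross(c2, c3)))  # volume of the parallelepiped spanned by the columns

print(volume, abs(np.linalg.det(A)))        # the two numbers agree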
February 13th, 2012 at 4:53 pm
I’ve found Linear Algebra Done Right to be useful in conveying the (usually unspoken) motivations.
February 16th, 2012 at 5:10 pm
Your conclusion is this: “the determinant is the condition under which a set of linear equations has a non-trivial null space.” I find this definition of a determinant unsatisfactory, because it is a boolean condition: either the set of linear equations has a non-trivial null space or it doesn’t. But the definition doesn’t seem to explain why the determinant has its particular numerical value.