Mathematical Proofs A Transition To Advanced Mathematics 3rd Pdf

9,311,722 books books
84,837,646 articles articles
ZLibrary Home
Home
Main Mathematical Proofs: A Transition to Advanced Mathematics
Book cover Mathematical Proofs: A Transition to Advanced Mathematics
Mathematical Proofs: A Transition to Advanced Mathematics

Gary Chartrand, Albert D. Polimeni, Ping Zhang
How much do you like this book?
What's the quality of the file?
Download the book for quality assessment
What's the quality of the downloaded files?
Mathematical Proofs: A Transition to Advanced Mathematics, 2/e, prepares students for the more abstract mathematics courses that follow calculus. This text introduces students to proof techniques and writing proofs of their own. As such, it is an introduction to the mathematics enterprise, providing solid introductions to relations, functions, and cardinalities of sets. KEY TOPICS: Communicating Mathematics, Sets, Logic, Direct Proof and Proof by Contrapositive, More on Direct Proof and Proof by Contrapositive, Existence and Proof by Contradiction, Mathematical Induction, Prove or Disprove, Equivalence Relations, Functions, Cardinalities of Sets, Proofs in Number Theory, Proofs in Calculus, Proofs in Group Theory. MARKET: For all readers interested in advanced mathematics and logic.
The file will be sent to your email address. It may take up to 1-5 minutes before you receive it.
The file will be sent to your Kindle account. It may takes up to 1-5 minutes before you received it.
Please note: you need to verify every book you want to send to your Kindle. Check your mailbox for the verification email from Amazon Kindle.
Most frequently terms

                Chapter 14  Proofs in Ring Theory We have noted that many of the proofs we have seen thus far involve integers and their properties. This was certainly the case in Chapter 11, where we were primarily concerned with additive and multiplicative properties of integers. Many important properties of integers follow from just a very few familiar additive and multiplicative properties of integers. In particular, every three integers a, b, and c satisfy the following: (1) a + b = b + a (3) a + 0 = a (5) a(bc) = (ab)c  (2) (a + b) + c = a + (b + c) (4) a + (−a) = 0 (6) a(b + c) = ab + ac  (14.1)  Properties (1) – (4) tell us that the integers form an abelian group under addition, a fact we observed in Chapter 13. You can probably think of other familiar properties of integers (such as ab = ba), but let's concentrate on the six properties listed above. We saw in Chapter 13 that some of these properties have names. For example, (1) is called the commutative law of addition; while (2) and (5) are the associative laws of addition and multiplication, respectively. Property (6) is called the distributive law. Property (3) states that the integer 0 is the identity under addition; while property (4) tells us that for an integer a, the integer −a is its inverse under addition. Properties (3) and (4) in particular may seem as if they are such basic properties of the integers that they should not even be mentioned. However, it is precisely that these six properties are so basic and natural that makes them important and draws our attention to them. A question now arises: Just what facts about the integers are consequences only of these six properties? An even more basic question is: If we have a nonempty set S of objects (not necessarily integers) for which it is possible to add and multiply every two elements of S (and in each case obtain an element of S) such tha; t properties (1) - (6) are satisﬁed, then what additional properties must S possess? Of course, whatever properties that can be deduced about the elements of S will be properties of the integers as well. In fact, this is the essence of the area of abstract algebra that we are about to encounter (and often of all mathematics). While studying a familiar set of objects, we may discover an interesting fact about this set. But what features of this set led us to this conclusion? And if any other set had these same features, does this interesting fact hold for these sets as well? We are now prepared to explore nonempty sets on which addition and multiplication have been deﬁned that satisfy properties (1) - (6). 14.1 Rings Addition and multiplication of integers are binary operations since each associates an integer with each (ordered) pair of integers. Binary operations in general are discussed in Chapter 13. 1  2  CHAPTER 14. PROOFS IN RING THEORY  In the current context, a nonempty set with one or more binary operations that are required to satisfy certain prescribed properties is referred to as an algebraic structure. Hence we have already seen examples of algebraic structures. Indeed, every group (see Chapter 13) is an algebraic structure. Studying algebraic structures is fundamental to abstract algebra. We mentioned that the familiar operations of addition and multiplication deﬁned on the integers satisfy the six properties listed in (14.1). Other familiar sets of numbers with these operations also satisfy these six properties, including the rational numbers, the real numbers, and the complex numbers. The situation is diﬀerent for the irrational numbers, √ however, √ since addition and multiplication binary operations. For example, 2 and − 2 are √ are √ not even √ √ irrational numbers while 2 · 2 = 2 and 2 + (− 2) = 0 are not. These and other examples suggest a general concept. A set R (this is not the symbol used for the set of real numbers) with two binary operations, one of which is called addition and denoted by + and the other called multiplication and denoted by · (where we often write ab rather than a · b for a, b ∈ R), is called a ring if it satisﬁes the following six properties: R1 Commutative Law of Addition: a + b = b + a for all a, b ∈ R; R2 Associative Law of Addition: (a + b) + c = a + (b + c) for all a, b, c ∈ R; R3 Existence of Additive Identity: There exists an element 0 ∈ R such that a + 0 = a for all a ∈ R; R4 Existence of Additive Inverse: For each a ∈ R, there exists an element −a ∈ R such that a + (−a) = 0; R5 Associative Law of Multiplication: a(bc) = (ab)c for all a, b, c ∈ R; R6 Distributive Laws: a(b + c) = ab + ac and (a + b)c = ac + bc for all a, b, c ∈ R. Notice that property R3 requires the existence of at least one element in R, which implies that every ring is nonempty. Recall that if S is a set with a binary operation ∗ and e ∈ S is an identity for S under ∗, then e ∗ a = a ∗ e = a for all a ∈ S. Since a ring R has two binary operations and an identity element is only required for the operation of addition, we refer to an element 0 speciﬁed in property R3 as an additive identity. The notation 0 for an additive identity is chosen because the integer 0 is an additive identity in Z. In other words, an additive identity in a ring R has the same characteristic as the integer 0 under addition in Z. It is important to realize that when we refer to an additive identity 0 in a ring R, we are referring only to an element in R that we are denoting by 0 and that satisﬁes property R3, namely, a + 0 = a for all a ∈ R. Since property R1 holds in every ring, we also have 0 + a = a. Also, if an algebraic structure (S, ∗) has an identity e, then an element a ∈ S has an inverse b ∈ S if a ∗ b = b ∗ a = e. Each element of a ring R is only required to have this property for the operation of addition. Thus, an inverse of an element a ∈ R with respect to addition is called an additive inverse of a. In Z, an additive inverse of an integer m is its negative −m. For this reason, we use −a to denote an additive inverse of an element a in a ring R. We must keep in mind that an element −a in R stands only for some element in R that satisﬁes property R4, namely, a + (−a) = 0. By property R1, we also know that (−a) + a = 0. Since properties R1 – R4 are required of every ring R, it follows that (R, +) is an abelian group. A ring with binary operations + and · is commonly denoted by (R, +, ·). However, if the two operations involved are clear, then we simply write R. In particular, if we are dealing with a familiar set with standard operations of addition and multiplication (and these are the  3 operations we are using), then we write only the symbol for that set. Thus Z, Q, R, and C are rings. We now look at some other common examples of rings. Result 14.1  The set 2Z of even integers is a ring under ordinary addition and multiplication.  Proof. First we show that ordinary addition and multiplication are binary operations on 2Z. Let a, b ∈ 2Z. Then a = 2x and b = 2y for x, y ∈ Z. Then a + b = 2x + 2y = 2(x + y) and ab = (2x)(2y) = 2(2xy). Since x + y and 2xy are integers, a + b and ab belong to 2Z. Since 2Z ⊆ Z and the binary operations in 2Z are the same as those in Z, properties R1, R2, R5, and R6 are automatically satisﬁed. Moreover, since the integer 0 is even, 0 ∈ 2Z and so 2Z has an additive identity. To show that property R4 is also satisﬁed, let a ∈ 2Z. So a = 2x, where x ∈ Z. Then −a = −(2x) = 2(−x). Since −x ∈ Z, it follows that −a ∈ 2Z. Result 14.2 The set Zn = {[0], [1], [2], · · · , [n − 1]}, n ≥ 2, of residue classes is a ring under residue class addition and residue class multiplication. Proof. It was indicated in Chapter 7 that both residue class addition and multiplication deﬁned by [a] + [b] = [a + b] and [a] · [b] = [ab] are well-deﬁned and so are binary operations in Zn . That properties R1, R2, R5, and R6 are satisﬁed depends only on the corresponding properties in the ring Z. For example, to see that R1 and R2 are satisﬁed, let [a], [b], [c] ∈ Zn . Then [a] + [b] = [a + b] = [b + a] = [b] + [a] and ([a] + [b]) + [c] = [a + b] + [c] = [(a + b) + c] = [a + (b + c)] = [a] + [b + c] = [a] + ([b] + [c]). The proofs of properties R5 and R6 are similar. The residue class [0] is an additive identity in Zn and an additive inverse for [a] is [−a] since [a] + [−a] = [a + (−a)] = [0]. The ring (Zn , +, ·) described in Result 14.2 is commonly called the ring of residue classes modulo n. Result 14.3 The set M2 (R) of 2 × 2 matrices over R is a ring under matrix addition and matrix multiplication.   Proof.  Recall that for A =  are deﬁned by    A+B =  a b c d      and B =  a+e b+f c+g d+h    e f g h    in M2 (R), addition and multiplication   and AB =   An additive identity for M2 (R) is the zero matrix Z =     ae + bg af + bh ce + dg cf + dh 0 0 0 0    .    and an additive inverse for the  −a −b . The veriﬁcation of properties R1, R2, −c −d R5, and R6 depends only on the properties of the ring R. matrix A given above is the matrix −A =  Not only is M2 (R) a ring under matrix addition and matrix multiplication, so too is Mn (R) for each integer n ≥ 2.  CHAPTER 14. PROOFS IN RING THEORY  4  Result 14.4 The set FR = {f : f : R → R} of real-valued functions with domain R is a ring under function addition and function multiplication. Proof.  Recall that for f, g ∈ FR , addition and multiplication are deﬁned by (f + g)(x) = f (x) + g(x) and (f · g)(x) = f (x) · g(x)  for all x ∈ R. The proofs of properties R1, R2, R5, and R6 depend only on properties of the ring R. For example, property R1 follows because (f + g)(x) = f (x) + g(x) = g(x) + f (x) = (g + f )(x) for all x ∈ R and so f + g = g + f ; while property R5 follows because ((f · g) · h)(x) = (f · g)(x) · h(x) = (f (x) · g(x)) · h(x) = f (x) · (g(x) · h(x)) = f (x) · (g · h)(x) = (f · (g · h))(x) for all x ∈ R and so (f · g) · h = f · (g · h). The zero function f0 : R → R deﬁned by f0 (x) = 0 for all x ∈ R is an additive identity since for each f ∈ FR and all x ∈ R, (f + f0 )(x) = f (x) + f0 (x) = f (x) + 0 = f (x) and so f + f0 = f . For f ∈ FR , the function −f ∈ FR deﬁned by (−f )(x) = − (f (x)) for all x ∈ R is an additive inverse for f since for all x ∈ R, (f + (−f ))(x) = f (x) + (−f )(x) = f (x) + (−f (x)) = 0 = f0 (x) and so f + (−f ) = f0 . A less common, though useful, example of a ring is given next. Result 14.5 The set R × R = R2 is a ring under the addition (a, b) + (c, d) = (a + c, b + d) and multiplication (a, b) · (c, d) = (ac, bd). Before proving this result, it is important to know that we are deﬁning a new sum (a, b)+(c, d) in terms of the familiar sums a + c and b + c of two real numbers. Hence + has two diﬀerent meanings here. A similar distinction exists between the product in R2 and the standard product of real numbers. Proof of Result 14.5. Certainly the addition and multiplication deﬁned here are binary operations on R2 . That R2 satisﬁes property R1 follows because addition in R is commutative. Let (a, b), (c, d) ∈ R2 . Then (a, b) + (c, d) = (a + c, b + d) = (c + a, d + b) = (c, d) + (a, b). Let (a, b) ∈ R2 . Observe that (0, 0) ∈ R2 and that (a, b) + (0, 0) = (a + 0, b + 0) = (a, b). Thus (0, 0) is an additive identity in R2 . Moreover, (−a, −b) ∈ R2 and (a, b) + (−a, −b) = (a + (−a), b + (−b)) = (0, 0).  5 Hence (−a, −b) is an additive inverse of (a, b) and properties R3 and R4 hold. We only verify one of the distributive laws for a ring as the argument for the remaining law is similar. Again, let (a, b), (c, d), (e, f ) ∈ R2 . Applying the distributive law for addition and multiplication in R, we have (a, b)[(c, d) + (e, f )] = (a, b)(c + e, d + f ) = (a(c + e), b(d + f )) = (ac + ae, bd + bf ) = (ac, bd) + (ae, bf ) = (a, b)(c, d) + (a, b)(e, f ), establishing this distributive law in R2 . The associative properties R2 and R5 can be established in R2 in a similar manner. Next we show that familiar sets, under unfamiliar binary operations, need not be rings. Example 14.6 For a, b ∈ R, deﬁne addition a      and multiplication  b = a + b − 1 and a      by  b = ab,  where the operations indicated in a + b − 1 and ab are ordinary addition, subtraction, and   multiplication. Then (R, , ) is not a ring.   satisﬁes properties R1-R4, Solution. It was shown in Example 13.4 that the binary operation   that is, (R, ) is an abelian group. Because is ordinary multiplication, property R5 holds as well. However, property R6 is not satisﬁed since for a = b = c = 0, a Therefore, (R,    (b       ,  c) = 0    (−1) = 0  and  (a    b)    (a    c) = 0    0 = −1.  ♦  ) is not a ring.  Let's see what happens when ordinary addition and multiplication of real numbers are reversed. Example 14.7 The set R of real numbers is not a ring when addition ∗ is deﬁned as ordinary multiplication and multiplication ◦ is deﬁned as ordinary addition. Solution. We denote ordinary addition of real numbers by + and ordinary multiplication by · (though we write a · b as ab, as usual). We show that a distributive law fails in (R, ∗, ◦). Let a = b = c = −1. Then a ◦ (b ∗ c) = a + (bc) = (−1) + (−1)(−1) = (−1) + 1 = 0, while (a ◦ b) ∗ (a ◦ c) = (a + b)(a + c) = [(−1) + (−1)][(−1) + (−1)] = (−2)(−2) = 4. Therefore, (R, ∗, ◦) is not a ring.  ♦  Some rings satisfy properties beyond the six properties required of all rings. We have already mentioned that the integers satisfy the familiar property: ab = ba for all a, b ∈ Z. This, of course, is the commutative law of multiplication. Rings are not required to have this property. However, when they do, we give these rings a special name. A ring (R, +, ·) is called a commutative ring if it satisﬁes  CHAPTER 14. PROOFS IN RING THEORY  6  R7 Commutative Law of Multiplication: ab = ba for all a, b ∈ R. A ring (R, +, ·) that does not satisfy the Commutative Law of Multiplication is called a noncommutative ring. While the rings Z, Q, R, C, 2Z, Zn , FR , and R2 are commutative, the ring M2 (R) is noncommutative. For example, if we let   A= then  0 0 2 0    AB =      and B =  0 0 0 4      =  4 0 0 0    0 2 , 0 0   = BA.  Another property that Z possesses, which is basic yet important, is that it contains an integer e with the property that a · e = e · a = a for every integer a. Of course, 1 has this property in Z. In general, a ring (R, +, ·) is called a ring with unity (or a ring with multiplicative identity) if it satisﬁes R8 Existence of Multiplicative Identity: There exists an element 1 ∈ R such that a · 1 = 1 · a = a for all a ∈ R. If (R, +, ·) has an element 1 satisfying property R8, then 1 is called a unity for R. Again, we stress that much care is needed here. When we write 1, we mean only an element of R that satisﬁes property R8, namely, a · 1 = 1 · a = a for all a ∈ R. It does not imply that 1 is the integer 1. Indeed, R itself could be a ring with unity that contains no integers whatsoever. Also, if R is a commutative ring, then to show that some element 1 ∈ R is a unity requires only to show that a · 1 = a for all a ∈ R since a · 1 = 1 · a for all a ∈ R. The rings Z, Q, R, C, Zn , FR , and R2 are rings with unity. The number 1 is a unity for Z, Q, and R, as is 1 = 1 + 0i a unity for C. The residue class [1] is a unity for Zn ; while the constant function f1 : R → R deﬁned by f1 (x) = 1 for all x ∈ R is a unity for FR . Furthermore, the ordered pair (1, 1) is a unity for R2 . The noncommutative ring M2 (R) has a unity as well. Indeed, the 2 × 2 identity matrix   I= is a unity for M2 (R) since   a b c d    1 0 0 1      =  1 0 0 1  1 0 0 1      a b c d      =  a b c d    for all a, b, c, d ∈ R. On the other hand, not all rings have a unity. In particular, the ring 2Z of even integers does not have a unity since the only integer e such that e · a = a for every integer a is e = 1 but 1 ∈ / 2Z. 14.2 Elementary Properties of Rings Despite the fact that there are many diﬀerent kinds of rings, there are properties that all rings have in common. Necessarily, of course, any such properties are consequences of the six deﬁning properties of a ring. We now present some properties that all rings share, beginning with the uniqueness of certain types of elements in rings.  7 The deﬁnition of a ring R guarantees that it contains an additive identity, that is, an element 0 such that a + 0 = a for all a ∈ R. Although the deﬁnition does not specify that there is only one such element, there is, in fact, only one. Also, the deﬁnition of R states that for each a ∈ R, there is an element −a ∈ R such that a + (−a) = 0. Again, there is no indication that each element of R has only one additive inverse, but, in fact, this is the case. Actually, these are consequences of the fact that R is a group under addition (see Theorem 13.9), but we verify these facts here. Theorem 14.8  Let R be a ring. Then  (i) R has a unique additive identity, and (ii) each element in R has a unique additive inverse. Proof. We ﬁrst verify (i). Suppose that both 0 and 0 are additive identities for R. Since 0 is an additive identity, 0 + 0 = 0 . Also, since 0 is an additive identity, 0 + 0 = 0. It then follows by the commutative law that 0 = 0 + 0 = 0 + 0 = 0 and so 0 = 0. Therefore, there is only one additive identity in R and (i) holds. We now verify (ii). Suppose that −x and x are both additive inverses for the element x ∈ R. Then x + (−x) = 0 and x + x = 0. Hence −x = −x + 0 = −x + (x + x ) = (−x + x) + x = 0 + x = x . So each element in R has a unique additive inverse. Proof Analysis Let's revisit the proof of the uniqueness of additive inverses in Theorem 14.8 to see how this proof may have been constructed. We know that x + (−x) = 0 and x + x = 0. Since x + (−x) = 0 = x + x , it follows, by adding −x to the equal elements x + (−x) and x + x , that −x + (x + (−x)) = −x + (x + x ). (14.2) The left side of (14.2) is −x + 0 = −x. Our goal was to show that −x = x , so this suggests starting with −x = −x + 0 = −x + (x + x ) and the remainder of the proof follows quite naturally. The resulting proof given in Theorem 14.8 is certainly much clearer than giving a list of equalities, with no accompanying explanations: x + (−x) = x + x −x + (x + (−x)) = −x + (x + x ) (−x + x) + (−x) = (−x + x) + x 0 + (−x) = 0 + x −x = x . ♦ In view of Theorem 14.8, we can now refer to the additive identity of a ring and the additive inverse of an element in a ring. The additive identity of a ring R is called the zero element of R. Not only are the additive identity and the additive inverse of every element in a ring R unique, but if R has a unity, then this element is unique as well. Theorem 14.9 If R is a ring with unity, then R has a unique unity.  CHAPTER 14. PROOFS IN RING THEORY  8  Proof. Let 1 and 1 be unities in R. Since 1 is a unity, 1 · 1 = 1 · 1 = 1 ; while since 1 is a unity, 1 · 1 = 1 · 1 = 1. Therefore, 1 = 1 · 1 = 1 . A basic fact concerning rings allows us to simplify certain algebraic expressions. Although the next theorem is a consequence of the fact that R is an abelian group under addition (see Theorem 13.7), we provide a proof of this theorem. Theorem 14.10 (Cancellation Law of Addition) (R, +, ·) such that a + b = a + c, then b = c. Proof.  If a, b, and c are elements in a ring  Observe that b = 0 + b = [(−a) + a] + b = (−a) + (a + b) = (−a) + (a + c) = [(−a) + a] + c = 0 + c = c.  Therefore, the Cancellation Law of Addition holds in (R, +, ·). Proof Analysis Another version of the preceding proof begins with a + b = a + c (that is, a + b and a + c represent the same element in R). If the additive inverse −a of a is now added to this element, we obtain −a + (a + b) = −a + (a + c). By the associative law, (−a + a) + b = (−a + a) = c; so 0 + b = 0 + c and thus b = c.  ♦  We have seen that the zero element 0 in a ring R has the property that 0 + 0 = 0. Hence R contains an element c such that c + c = c, namely, c = 0. However, as an immediate consequence of the Cancellation Law of Addition, no other element of R has this property. Corollary 14.11 c = 0.  Let (R, +, ·) be a ring. If c is an element of R such that c + c = c, then  Proof. Since c+c = c, we also have c+c = c+0. Now, canceling c, it follows by Theorem 14.10 that c = 0. Although the deﬁning property of the zero element of a ring (R, +, ·) concerns only one of the two operations, namely addition, it has a property involving multiplication that is probably not unexpected. Theorem 14.12  For every element a in a ring (R, +, ·), a · 0 = 0 · a = 0.  Proof. Since the proofs that a · 0 = 0 and 0 · a = 0 are similar, we only verify the ﬁrst of these. Observe that a · 0 + 0 = a · 0 = a · (0 + 0) = a · 0 + a · 0. The result now follows from Corollary 14.11 (where c = a · 0). We now turn our attention to properties of rings involving additive inverses. At times, a very simple argument for some fact can be given by recognizing that −a represents the unique element which when added to a results in 0. Two examples of this appear in the following theorem.  9 Theorem 14.13  Let (R, +, ·) be a ring and let a, b ∈ R. Then  (i) −(−a) = a (ii) if a = −b, then b = −a. Proof. Since a + (−a) = 0, it follows that a is the additive inverse of −a, that is, a = −(−a). This veriﬁes (i). To establish (ii), let a = −b. Hence a is the additive inverse of b and so a + b = 0. This, however, implies that b is the additive inverse of a and so b = −a. We now consider some results concerning the product of two elements in a ring, at least one of which is an additive inverse. Since the additive inverse is an element that is deﬁned only in terms of addition, it would seem natural that any property concerning such an element that involves multiplication must be a consequence of the distributive laws. (This is exactly what occurred in Theorem 14.12.) Theorem 14.14  Let (R, +, ·) be a ring and let a, b ∈ R. Then (−a) · b = a · (−b) = −(ab).  Proof. To show that (−a) · b = −(ab), it suﬃces to verify that (−a) · b is the additive inverse of a · b. This can be accomplished by showing that a · b + (−a) · b = 0. Observe that a · b + (−a) · b = [a + (−a)] · b = 0 · b = 0. A proof that a · (−b) = −(a · b) is similar. Corollary 14.15  Let (R, +, ·) be a ring and let a, b ∈ R. Then (−a) · (−b) = ab.  Proof. By Theorem 14.14, (−a) · (−b) = a · [−(−b)], and by Theorem 14.13, −(−b) = b. Thus (−a) · (−b) = a · b. In the ring of integers we know that if a, b ∈ Z, then a+(−b) = a−b. We follow this convention in an arbitrary ring. If (R, +, ·) is a ring and a, b ∈ R, then we deﬁne the subtraction of b from a as a − b = a + (−b). In particular, if a = b in R, then we arrive at the seemingly obvious fact that a − b = b − b = b + (−b) = 0. We present a basic fact concerning subtraction. Result 14.16  Let (R, +, ·) be a ring and let a, b, c ∈ R. Then a(b − c) = ab − ac.  Proof. Observe that a(b − c) = a[b + (−c)] = ab + a · (−c). By Theorem 14.14, a(−c) = −(ac), so a(b − c) = ab + [−(ac)] = ab − ac. 14.3 Subrings We have seen that the subset 2Z of Z is a ring when the operations of addition and multiplication used in 2Z are the same as those of Z. Since Z is already a ring, we found that it was relatively easy to prove that 2Z is a ring. We saw that 2Z inherits the properties R1, R2, R5, and R6 of a ring from Z. What we didn't know automatically, and therefore had to verify, was  10  CHAPTER 14. PROOFS IN RING THEORY  that 2Z is closed under addition and multiplication, that the zero element of Z is also in 2Z, and that each element of 2Z has an additive inverse in 2Z. In general, then, it is much easier to prove that a subset S of a known ring R is a ring under the same operations deﬁned on R. This observation leads us to an important concept in the study of rings. Let R be a ring. If S is a subset of R such that S is a ring under the same operations deﬁned on R, then S is called a subring of R. If R contains at least two elements, then R contains at least two subrings, namely R itself and the "zero subring" {0}. We now state exactly what properties need to be veriﬁed to show that a subset of a known ring R is a subring of R. Theorem 14.17 (The Subring Test) A nonempty subset S of a ring R is a subring of R if and only if S is closed under subtraction and multiplication. Proof. If S is a subring of R, then certainly S is closed under subtraction and multiplication. For the converse, let R be a ring and S a nonempty subset of R that is closed under subtraction and multiplication. We show that S itself is a ring. Since S = ∅, there is some element s ∈ S. Because S is closed under subtraction, s − s = 0 ∈ S, that is, the zero element of R belongs to S and so property R3 holds. Now let a ∈ S. Again, since S is closed under subtraction, 0 − a = 0 + (−a) = −a ∈ S and so property R4 holds. This implies that the additive inverse of an element of S also belongs to S. For a, b ∈ S, we know that −b ∈ S and so a − (−b) = a + [−(−b)] = a + b ∈ S. Hence S is closed under addition as well. Now it remains to show that addition is commutative, that addition and multiplication are associative, and that distributive laws hold, namely, properties R1, R2, R5, and R6 hold in S. But all these properties are inherited from R and so hold in S as well. Consequently, to show that a subset S of a ring R is a subring, we need only show that S is nonempty and that S is closed under subtraction and multiplication. We now illustrate how the Subring Test is used by presenting several examples, beginning with a new proof that 2Z is a ring under ordinary addition and multiplication. Result 14.18 The subset 2Z of even integers is a subring of Z. Proof. Since 0 is an even integer, 2Z is nonempty. Let a, b ∈ 2Z. Then a = 2x and b = 2y, where x, y ∈ Z. Observe that a − b = 2x − 2y = 2(x − y) and ab = (2x)(2y) = 2(2xy). Since x − y and 2xy are integers, a − b and ab belong to 2Z. By the Subring Test, 2Z is a subring of Z. Result 14.19 The subset R × {0} = {(x, 0) : x ∈ R} of the ring R × R is a subring of R × R. Proof. Since (0, 0) ∈ R×{0}, the set R×{0} is nonempty. Let a, b ∈ R×{0}. Then a = (x, 0) and b = (y, 0) for some x, y ∈ R. Thus a−b = (x, 0)−(y, 0) = (x−y, 0−0) = (x−y, 0) ∈ R×{0} and a · b = (x, 0) · (y, 0) = (xy, 0) ∈ R × {0}. By the Subring Test, R × {0} is a subring of R × R. The next example concerns a subring of √ the ring of complex numbers. A complex number of the form a + bi, where a, b ∈ Z and i = −1, is called a Gaussian integer. Result 14.20 The set G = {a + bi : a, b ∈ Z} of Gaussian integers is a subring of the ring C of complex numbers.  11 Proof. Since 0 = 0 + 0i ∈ G, the set G is nonempty. Let x, y ∈ G. Then x = a + bi and y = c + di, where a, b, c, d ∈ Z. Observe that x − y = (a + bi) − (c + di) = (a − c) + (b − d)i and xy = (a + bi) · (c + di) = (ac − bd) + (ad + bc)i. Since a − c, b − d, ac − bd, and ad + bc are integers, x − y and xy are Gaussian integers. By the Subring Test, G is a subring of C. The elements that belong to two subrings of a ring R also produce a subring of R. Result 14.21  If S1 and S2 are subrings of a ring R, then S1 ∩ S2 is also a subring of R.  Proof. Since 0 ∈ S1 and 0 ∈ S2 , it follows that 0 ∈ S1 ∩ S2 and so S1 ∩ S2 is nonempty. Let a, b ∈ S1 ∩ S2 . Then a, b ∈ Si for i = 1, 2. Since S1 and S2 are subrings of R, it follows that a − b ∈ Si and ab ∈ Si for i = 1, 2. Hence a − b ∈ S1 ∩ S2 and ab ∈ S1 ∩ S2 . Therefore, by the Subring Test, S1 ∩ S2 is a subring of R. 14.4 Integral Domains Properties possessed by the integers have led us to the concept of a ring as well as two special kinds of rings, namely commutative rings and rings with unity. We have seen that if R is a ring, then a · 0 = 0 · a = 0 for every a ∈ R. This property can be stated in another way: Let a, b ∈ R. If a = 0 or b = 0, then a · b = 0.  (14.3)  Of course, the converse of (14.3) also holds in the ring Z: Let a, b ∈ Z. If a · b = 0, then a = 0 or b = 0.  (14.4)  The implication (14.4) also holds in the ring of real numbers. Indeed, (14.4) is the critical property of real numbers needed for solving many equations. For example, if (x − 3)(x + 2) = 0, where x ∈ R, then x = 3 or x = −2. This leads us to another important concept. A nonzero element a in a ring R is called a zero divisor of R if there exists a nonzero element b in R such that either ab = 0 or ba = 0. Of course, in this case, b is a zero divisor of R as well. Certainly then, the rings Z and R have no zero divisors. Furthermore, 2Z, Q, and C are rings possessing no zero divisors. There are, however, some well-known rings that do have zero divisors. In Z6 , we have seen that [2][3] = [6] = [0]. Since [2] = [0] and [3] = [0], it follows that [2] and [3] are zero divisors in Z6 . The residue class [4] is also a zero divisor in Z6 since [4][3] = [0]. Consider the functions f and g in FR deﬁned as:   f (x) =  1 0  if x ∈ Q if x ∈ I    g(x) =  0 1  if x ∈ Q if x ∈ I  Then (f · g)(x) = f (x) · g(x) = 0 = f0 (x) for all x ∈ R. Hence f · g = f0 , the zero element of FR , but f = f0 and g = f0 . So f and g are zero divisors in FR . In M2 (R), let  CHAPTER 14. PROOFS IN RING THEORY  12   A= Then    AB =  0 1 0 1  0 0 0 0        1 1 . 0 0  and B =      while BA =    0 2 . 0 0  Hence A and B are zero divisors in the noncommutative ring M2 (R). From what we have seen, it is not all that uncommon for a ring R to contain nonzero elements whose product is the zero element of R. Thus, it is useful to distinguish those rings that contain zero divisors from those that do not. Before proceeding further though, we need to address one special kind of ring. A ring R is called trivial if it contains only one element – necessarily, then, the zero element. That is, R is trivial if R = {0}. If R is nontrivial, then it contains at least two elements, and consequently at least one nonzero element. If R is a trivial ring, then certainly a · 0 = 0 · a = a for all a ∈ R since a = 0 is the only element of R. Therefore, if R is trivial, then it contains a unity (namely 0). Obviously, a trivial ring is commutative as well. On the other hand, if R is a nontrivial ring with unity, then it cannot occur that the unity and zero elements are the same. Theorem 14.22  If R is a nontrivial ring with unity 1, then 1 = 0.  Proof. Assume, to the contrary, that 1 = 0. Since R is a nontrivial ring, there is an element a ∈ R such that a = 0. However, then a = a · 1 = a · 0 = 0, which is a contradiction. A nontrivial commutative ring with unity that contains no zero divisors is called an integral domain. Therefore, all of the rings Z, Q, R, and C are integral domains. Not all commutative rings with unity are integral domains, however. For example, we saw that [2] and [3] are zero divisors in Z6 . We also saw that FR possesses zero divisors. Moreover, since (0, 1)·(1, 0) = (0, 0) in R2 , it follows that (0, 1) and (1, 0) are zero divisors in R2 . Therefore, although all of Z6 , FR , and R2 are commutative rings with unity, none is an integral domain. Since an integral domain is required to be a commutative ring with a unity, 2Z is not an integral domain, despite the fact that it is both commutative and contains no zero divisors, as it does not contain a unity. We have seen that every ring satisﬁes the Cancellation Law of Addition. For multiplication, the situation can be quite diﬀerent. There are two possible cancellation laws in this case. Cancellation Laws of Multiplication: Let R be a ring and let a, b, c ∈ R. (1) If ab = ac, where a = 0, then b = c. (2) If ac = bc, where c = 0, then a = b. Of course, if R is a commutative ring, then (1) and (2) say the same thing. In a noncommutative ring, (1) is referred to as the Left Cancellation Law of Multiplication and (2) as the Right Cancellation Law of Multiplication. In the ring Z6 , [3] · [2] = [3] · [4] but [2] = [4]. So the Cancellation Laws of Multiplication fail to hold in Z6 . The Cancellation Laws of Multiplication never fail to hold in rings without zero divisors, however.  13 Theorem 14.23 Let R be a ring. Then the Cancellation Laws of Multiplication hold in R if and only if R contains no zero divisors. Proof. Assume ﬁrst that R is a ring without zero divisors. We only verify the Left Cancellation Law (1) since the proof of (2) is similar. Let a, b, c ∈ R, where a = 0 and ab = ac. Since ab = ac, it follows that ab + (−(ac)) = ac + (−(ac)) and so ab − ac = 0. Thus a(b − c) = 0. Since R contains no zero divisors and a = 0, it follows that b − c = 0 and so b = c. For the converse, assume that R is a ring in which the Cancellation Laws of Multiplication hold. We show that R contains no zero divisors. Let a, b ∈ R such that ab = 0. We show that a = 0 or b = 0. If a = 0, then we have the desired result. So we may assume that a = 0. Hence a · b = 0 = a · 0 and so a · b = a · 0. By the (Left) Cancellation Law of Multiplication, the element a can be canceled in a · b = a · 0, arriving at b = 0. Thus R has no zero divisors. Since a ring R satisfying the Cancellation Laws of Multiplication is equivalent to R containing no zero divisors, we have an immediate consequence of Theorem 14.23. Corollary 14.24 Let R be a nontrivial commutative ring with unity. Then R is an integral domain if and only if the Cancellation Law of Multiplication holds in R. While Z6 is not an integral domain, it is not diﬃcult to show that Z5 is (by constructing a multiplication table for Z5 as in the same manner done for Z6 in Figure 7.1 of Chapter 7). Consequently, some rings Zn are integral domains while others are not. You might have seen a diﬀerence between Z6 and Z5 already, namely, 5 is prime and 6 is not. We are about to see that this is the key observation. In the proof of the next theorem, we will use the fact that if a and b are integers and p is a prime such that p | ab, then p | a or p | b. (This theorem is discussed in detail in Chapter 11. In particular, see Corollary 11.14.) Theorem 14.25 a prime.  For an integer n ≥ 2, the ring Zn is an integral domain if and only if n is  Proof. First, we show that if Zn is an integral domain, then n is a prime. Assume that n is not a prime. Then n = ab for some integers a and b with 1 < a < n and 1 < b < n. Thus [a] = [0] and [b] = [0] in Zn . On the other hand, [a][b] = [ab] = [n] = [0] in Zn . Hence [a] and [b] are zero divisors in Zn and so Zn is not an integral domain. For the converse, assume that n is a prime. We show that Zn is an integral domain. Certainly, Zn is a nontrivial commutative ring with unity. It remains only to show that Zn has no zero divisors. Let [a], [b] ∈ Zn such that [a] · [b] = [0]. Then [a] · [b] = [ab] = [0], which implies that ab ≡ 0 (mod n). Therefore, n | ab. Since n is a prime, it follows by Corollary 11.14 that n | a or n | b; so [a] = [0] or [b] = [0]. Thus Zn contains no zero divisors. 14.6 Fields Initially, we saw that many fundamental properties of integers are shared by other algebraic structures. This led us to the concept of rings. Among the many rings we encountered are Z, 2Z, Q, R, C, Zn , FR , and M2 (R). However, only some of these are commutative rings with unity, namely, Z,Q, R, C, Zn , and FR ; and only some of these are integral domains, namely, Z,Q, R, C, and Zp , where p is a prime. There is a property that Q, R, C, and Zp possess that Z does not, however, which will ﬁnally allow us to distinguish Z from these rings.  14  CHAPTER 14. PROOFS IN RING THEORY  Let a be a nonzero integer. Unless a is 1 or −1, there is no integer b such that ab = 1. On the other hand, if a is a nonzero rational number, then there is always a rational number b such that ab = 1. Indeed, b = 1/a ∈ Q has this property. This discussion leads us to another concept. Let R be a ring with unity 1. A nonzero element a of R is called a unit if there is some element b in R such that ab = ba = 1. In this case, b is called a multiplicative inverse of a. (Of course, b is also a unit with multiplicative inverse a.) We must take care to distinguish between the terms "unit" and "unity" in a ring R. A unity in R is an element 1 ∈ R such a · 1 = 1 · a = a for all a ∈ R. On the other hand, if R is a nontrivial ring with unity 1, then a nonzero element a ∈ R is a unit if a · b = b · a = 1 for some b ∈ R. The unity 1 is always a unit since 1 · 1 = 1. As with additive inverses, multiplicative inverses in a ring are unique. Theorem 14.26 Let R be a nontrivial ring with unity. Then each unit in R has a unique multiplicative inverse. Proof. Let a be a unit in R and suppose that b and c are multiplicative inverses of a. Hence ab = ba = 1 and ac = ca = 1. It then follows that b = b · 1 = b(ac) = (ba)c = 1 · c = c. Therefore, a has a unique multiplicative inverse. For a unit a in a nontrivial ring with unity, we write a−1 for the (unique) multiplicative inverse of a. The only units in Z are 1 and −1 since these are the only integers a for which there is an integer b such that ab = 1. In Q and R, however, all nonzero elements are units. In Z6 , [5] · [5] = [25] = [1], so both [1] and [5] are units. Furthermore, there are no other units in Z6 , as can be seen from the multiplication table (Figure 7.1) in Chapter 7. A nontrivial commutative ring with unity in which every nonzero element is a unit is called a ﬁeld. In addition to Q and R, the ring C of complex numbers is a ﬁeld. Result 14.27  The ring C of complex numbers is a ﬁeld.  Proof. We have already noted that C is a commutative ring with a unity, so we are only required to show that every nonzero complex number is a unit. Let x be a nonzero complex number. Hence x = a + bi, where a, b ∈ R and either a = 0 or b = 0. Thus a2 + b2 = 0. We show that there exists a complex number y = c + di, where c, d ∈ R, such that xy = 1 = 1 + 0i. a −b Let c = 2 and d = 2 and observe that 2 a +b a + b2     a −b + i xy = (a + bi)(c + di) = (a + bi) 2 a + b2 a2 + b2  1 1 = (a + bi)(a − bi) = a2 − b2 i2 a2 + b2 a2 + b2 a2 + b2 = = 1 = 1 + 0i. a2 + b2  a −b + 2 i. 2 +b a + b2 Proof Analysis In the proof of the preceding result, for the nonzero complex number x = a + bi, how did we know to choose y = c + di so that xy = 1 = 1 + 0i? That is, how did we know what the multiplicative inverse of x was? Actually, that was not so diﬃcult. Hence x has a multiplicative inverse, namely, x−1 =  a2  15 Since xy = (a + bi)(c + di) = 1 + 0i, it follows that (ac − bd) + (ad + bc)i = 1 + 0i. Hence ac − bd = 1  (14.5)  ad + bc = 0.  (14.6)  and Multiplying equation (14.5) by a, equation (14.6) by b, and adding, we obtain (a2 + b2 )c = a;  (14.7)  while multiplying equation (14.5) by −b, equation (14.6) by a, and adding, we obtain (a2 + b2 )d = −b.  (14.8)  Solving (14.7) for c and (14.8) for d, we ﬁnd that c=  a2  a −b and d = 2 . 2 +b a + b2  −b a −1 . That c + di is actually x−1 was, of course, Hence a2 +b 2 + a2 +b2 i is the logical choice for x veriﬁed in the proof of Result 14.27. ♦  Fields are actually special kinds of integral domains, as we now show. Theorem 14.28  Every ﬁeld is an integral domain.  Proof. Let F be a ﬁeld. To verify that F is also an integral domain, we need only show that F contains no zero divisors. Let a be a nonzero element of F and let b ∈ F such that ab = 0. Then 0 = a−1 · 0 = a−1 (ab) = a−1 a b = 1b = b. Since b = 0, it follows that a is not a zero divisor. Certainly, the converse of Theorem 14.28 is not true since Z is an integral domain that is not a ﬁeld. However, under a certain restriction, an integral domain is a ﬁeld as well. Theorem 14.29  Every ﬁnite integral domain is a ﬁeld.  Proof. Let D be a ﬁnite integral domain, say D = {a1 , a2 , · · · , an }. To show that D is a ﬁeld, we need only show that every nonzero element of D has a multiplicative inverse. Let a ∈ D, where a = 0, and consider the elements aa1 , aa2 , · · · , aan . If aai = aaj , where 1 ≤ i ≤ n and 1 ≤ j ≤ n, then ai = aj by the Cancellation Law of Multiplication. This implies that the elements aa1 , aa2 , · · · , aan are distinct and are, in fact, all n elements of D. Thus one of these elements is 1 and so aak = 1 for some integer k with 1 ≤ k ≤ n. Hence ak = a−1 and a has a multiplicative inverse. We have seen in Theorem 14.25 that Zn is an integral domain if and only if n is a prime. Theorem 14.29 now gives us the following result. Corollary 14.30  The ring Zn is a ﬁeld if and only if n is prime.  CHAPTER 14. PROOFS IN RING THEORY  16 Exercises for Chapter 14  14.1 Verify that each of the following is a ring by showing that (1) the indicated addition and multiplication are binary operations and (2) the required six properties are satisﬁed. (You may assume that both Z and R are rings under ordinary addition and multiplication.) (a) The set kZ, where k ∈ Z and k ≥ 2, under ordinary addition and multiplication. √ √ (b) The set Z[ 2] = {a + b 2 : a, b ∈ Z} under ordinary addition and multiplication. 14.2 Verify that each of the following is not a ring. (a) The set FR under function addition and function composition. (b) The set Z under the addition deﬁned by a ∗ b = a and ordinary multiplication. (c) The set Z under ordinary addition and the multiplication deﬁned by a ∗ b = a. (d) The set Z under the addition deﬁned by a∗b = min{a, b} and ordinary multiplication. (e) The set Z under ordinary addition and the multiplication deﬁned by a∗b = min{a, b}. 14.3 For a given set S and binary operations ∗ and ◦, determine whether (S, ∗, ◦) is a ring. (a) S = R, a ∗ b = a + b + 1, a ◦ b = ab. (b) S = R+ , the set of positive real numbers, a ∗ b = ab and a ◦ b = ab . 14.4 Let a be an element in a ring (R, +, ·). Complete the proof of Theorem 14.12 by proving that 0 · a = 0. 14.5 Let a and b be elements in a ring (R, +, ·). Complete the proof of Theorem 14.14 by proving that a · (−b) = −(a · b). 14.6 Let R be a ring with unity 1. Use Theorem 14.14 to prove that (−1)a = −a for all a ∈ R. 14.7 Let (R, +, ·) be a ring with the property that a2 = a · a = a for every a ∈ R. (a) Prove that every element in R is its own additive inverse, that is, prove that −a = a for every a ∈ R. [Hint: Consider (a + a)2 .] (b) Prove that R is a commutative ring. [Hint: Consider (a + b)2 .] 14.8 Does there exist an example of a nontrivial ring (R, +, ·), that is, R has at least two elements, such that addition and multiplication in R are the same, namely, a + b = ab for all a, b ∈ R? Justify your answer. 14.9 Verify that each of the following subsets is a subring of the given ring.     a 0 : a, b ∈ R in the ring M2 (R). (a) S = 0 b √ √ (b) S = {a + b 3 2 + c 3 4 : a, b, c ∈ Q} in the ring R. 14.10 Prove that the subset S = {[0], [2], [4]} is a subring of Z6 .  17 14.11 Recall √ that a Gaussian integer is a complex number of the type a + bi, where a, b ∈ Z and i = −1, and that the set G of Gaussian integers is a subring of the ring C of complex numbers. Deﬁne an even Gaussian integer to be a complex number of the type a + bi, where a, b ∈ 2Z. Is the set 2G of even Gaussian integers a subring of G? Justify your answer. 14.12 By Result 14.21, if S1 and S2 are subrings of a ring R, then S1 ∩ S2 is a subring of R. Both 2Z and 3Z are subrings of the ring Z. Give a simple description of the subring 2Z ∩ 3Z in Z. Justify your answer.   14.13 Let S =  a b 0 0    : a, b ∈ R .  (a) Prove that S is a subring of M2 (R). (b) Prove that there is an element E ∈ S such that EA = A for all A ∈ S, but there is an element C ∈ S such that CE = C. (c) Prove that S does not possess a unity. 14.14 Use Theorem 14.23 to prove Corollary 14.24. 14.15 Deﬁne multiplication ◦ on 2Z by a ◦ b = ab/2. Prove that (2Z, +, ◦) is an integral domain, where + is ordinary addition. 14.16 Let R be a commutative ring with unity. (a) Prove that a unit of R is not a zero divisor in R. (b) Determine whether the converse of (a) is true. (c) Prove that if R is a ﬁnite ring and a is not a zero divisor of R, then a has a multiplicative inverse in R. 14.17 Deﬁne addition ∗ and multiplication ◦ on Z as follows: a∗b=a+b−1  and a ◦ b = a + b + ab.  Prove that (Z, ∗, ◦) is a ring with unity and answer the following questions. (a) Is this ring commutative? (b) Is this ring an integral domain? (c) Is this ring a ﬁeld? √ √ 14.18 Show that Z[ 2] = {a + b 2 : a, b ∈ Z} is not a ﬁeld. 14.19 Give an example of a ring that is not a ﬁeld but has a subring that is a ﬁeld. 14.20 Let R be a nontrivial commutative ring with unity. Prove that R is a ﬁeld if and only if for every a, b ∈ R with a = 0, the equation ax = b has a solution x ∈ R. 14.21 Prove that Q[i] = {a + bi : a, b ∈ Q} is a ﬁeld. 14.22 Let (F, +, ·) be a ﬁeld and let a, b ∈ F with a = 0. Show that the equation a · x = b has a unique solution x ∈ F .  CHAPTER 14. PROOFS IN RING THEORY  18  14.23 Give examples of the following (if they exist): (a) a ﬁnite ring (b) an inﬁnite ring (c) a noncommutative ﬁnite ring (d) a noncommutative inﬁnite ring (e) a ring with unity (f) a ring without unity (g) a noncommutative ring with unity (h) a noncommutative ring without unity (i) a ring that is not an integral domain (j) a ﬁnite integral domain (k) an inﬁnite integral domain (") an integral domain that is not a ﬁeld (m) a ﬁnite ﬁeld (n) an inﬁnite ﬁeld 14.24 For the following statement S and proposed proof, either (1) S is true and the proof is correct, (2) S is true and the proof is incorrect, or (3) S is false and the proof is incorrect. Explain which of these occurs. S: Let A = {n ∈ N : n = 0}. Then A is a subring of (Z, +, ·). Proof. Let a, b ∈ S. Then a = 0 and b = 0. Since a − b = 0 − 0 = 0 ∈ A and a · b = 0 · 0 = 0 ∈ A, it follows that A is closed under subtraction and multiplication. By the Subring Test, (A, +, ·) is a subring of (Z, +, ·). 14.25 For the following statement S and proposed proof, either (1) S is true and the proof is correct, (2) S is true and the proof is incorrect, or (3) S is false and the proof is incorrect. Explain which of these occurs. S: Let R be a ring with unity containing at least two elements and let R = {a ∈ R : a − r is a unit for each r ∈ R}. Then R is a subring of R. Proof. Let a, b ∈ R . First, consider a − b and r ∈ R. Then (a − b) − r = a − (b + r). Since a ∈ R and b + r ∈ R, it follows that (a − b) − r is a unit and so a − b ∈ R . Next, consider ab and r ∈ R. Then ab − r = a − (a − ab + r ). Since a ∈ R and a − ab + r ∈ R, it follows that ab − r is a unit. Thus ab ∈ R . By the Subring Test, R is a subring of R.  Chapter 15  Proofs in Linear Algebra A topic you may very well have studied in geometry, calculus, or physics is vectors. You might recall vectors both in the plane R2 = R × R and in 3-space R3 = R × R × R. Often one thinks of a vector as a directed line segment from the origin to some other point. Examples of these (both in the plane and in 3-space) are shown in Figure 15.1.  y  z 4  (4, 3)  3  4  (2, 3, 4) 3  x  y  2 x  (a)  (b)  Figure 15.1: Vectors in the plane and 3-space The vector u in the plane (it is customary to print vectors in bold) shown in Figure 15.1(a) can be expressed as u = (4, 3); while the vector v in 3-space shown in Figure 15.1(b) can be expressed as v = (2, 3, 4). The vectors i = (1, 0) and j = (0, 1) in the plane and i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1) in 3-space will be of special interest to us. 15.1 Properties of Vectors in 3-Space One important feature of vectors is that they can be added (to produce another vector); while another is that a vector can be multiplied by an element of some set, usually a real number (again to produce another vector). In this context, these elements are called scalars. Let's focus on vectors in 3-space for the present. Let u = (a1 , b1 , c1 ) and v = (a2 , b2 , c2 ), where ai , bi , ci (i = 1, 2) are real numbers. The sum of u and v is deﬁned by u + v = (a1 + a2 , b1 + b2 , c1 + c2 ) 1  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  2  and the scalar multiple of u by a scalar (real number) α is deﬁned by αu = (αa1 , αb1 , αc1 ). From this deﬁnition, it follows that u = (a1 , b1 , c1 ) = (a1 , 0, 0) + (0, b1 , 0) + (0, 0, c1 ) = a1 (1, 0, 0) + b1 (0, 1, 0) + c1 (0, 0, 1) = a1 i + b1 j + c1 k. That is, it is possible to express a vector u in 3-space in terms of (and to be called a linear combination of) the vectors i, j, and k in 3-space. Listed below are eight simple, yet fundamental, properties that follow from these deﬁnitions of vector addition and scalar multiplication in R3 : 1. u + v = v + u for all u, v ∈ R3 . 2. (u + v) + w = u + (v + w) for all u, v, w ∈ R3 . 3. For z = (0, 0, 0), u + z = u for all u ∈ R3 . 4. For each u ∈ R3 , there exists a vector in R3 which we denote by −u such that u + (−u) = z = (0, 0, 0). 5. α(u + v) = αu + αv for all α ∈ R and all u, v ∈ R3 . 6. (α + β)u = αu + βu for all α, β ∈ R and all u ∈ R3 . 7. (αβ)u = α(βu) for all α, β ∈ R and all u ∈ R3 . 8. 1u = u for all u ∈ R3 . These properties are rather straightforward to verify, as we illustrate with properties 1, 4, and 6. To verify property 1, observe that u + v = (a1 , a2 , a3 ) + (b1 , b2 , b3 ) = (a1 + b1 , a2 + b2 , a3 + b3 ) = (b1 + a1 , b2 + a2 , b3 + a3 ) = v + u. Here, we used only the deﬁnition of addition of vectors in R3 and the fact that addition of real numbers is commutative. To verify property 4, we begin with a vector v = (b1 , b2 , b3 ) ∈ R3 and show that there is some vector in R3 , which we denote by −v, such that v + (−v) = z = (0, 0, 0). There is an obvious choice for −v, however, namely (−b1 , −b2 , −b3 ). Observe that v + (−b1 , −b2 , −b3 ) = (b1 , b2 , b3 ) + (−b1 , −b2 , −b3 ) = (b1 + (−b1 ), b2 + (−b2 ), b3 + (−b3 )) = (0, 0, 0). Hence, −v = (−b1 , −b2 , −b3 ) has the desired property. We note also that, according to the deﬁnition of scalar multiplication in R3 , (−1)v = ((−1)b1 , (−1)b2 , (−1)b3 ) = (−b1 , −b2 , −b3 ) = −v. We will revisit this observation later.  3 To establish property 6, observe that (α + β)u = (α + β)(a1 , b1 , c1 ) = ((α + β)a1 , (α + β)b1 , (α + β)c1 )) = (αa1 + βa1 , αb1 + βb1 , αc1 + βc1 ) = (αa1 , αb1 , αc1 ) + (βa1 , βb1 , βc1 ) = α(a1 , b1 , c1 ) + β(a1 , b1 , c1 ) = αu + βu. Thus, showing that (α + β)u = αu + βu also depends only on some familiar properties of addition and multiplication of real numbers. Vectors in the plane can be added and multiplied by scalars in the expected manner and, in fact, satisfy properties 1-8 as well. 15.2 Vector Spaces In addition to vectors in the plane and 3-space, there are other mathematical objects that can be added and multiplied by scalars so that properties 1-8 are satisﬁed. Indeed, these objects provide a generalization of vectors in the plane and 3-space. For this reason, we will refer to these more abstract objects as vectors as well. The study of vectors is a major topic in the area of mathematics called linear algebra. A nonempty set V , every two elements of which can be added (that is, if u, v ∈ V , then u + v is a unique vector of V ) and each element of which can be multiplied by any real number (that is, if α ∈ R and v ∈ V , then αv is a unique element in V ) is called a vector space (in fact, a vector space over R) if it satisﬁes the following eight properties: 1. u + v = v + u for all u, v ∈ V . (Commutative Property) 2. (u + v) + w = u + (v + w) for all u, v, w ∈ V . (Associative Property) 3. There exists an element z ∈ V such that v + z = v for all v ∈ V . 4. For each v ∈ V , there exists an element −v ∈ V such that v + (−v) = z. 5. α(u + v) = αu + αv for all α ∈ R and all u, v ∈ V . 6. (α + β)v = αv + βv for all α, β ∈ R and all v ∈ V . 7. (αβ)v = α(βv) for α, β ∈ R and all v ∈ V . 8. 1v = v for all v ∈ V . The elements of V are called vectors and the real numbers in this deﬁnition are called scalars. Hence if u, v ∈ V and α, β ∈ R, then both αu and βv belong to V . Therefore, αu + βv ∈ V . The vector αu + βv is called a linear combination of u and v. We can also discuss linear combinations of more than two vectors. Let u, v, w be three vectors in V and let α, β, γ be three scalars (real numbers). Therefore, αu, βv, and γw are three vectors in V and αu + βv + γw is a linear combination of u, v, and w. We've now encountered a familiar situation in mathematics. Since addition in V is only deﬁned for two vectors, what exactly is meant by αu + βv + γw? There are two obvious interpretations of αu + βv + γw, namely, (αu + βv) + γw (where αu and βv are added ﬁrst, producing the vector αu + βv, which is then added to γw) and αu + (βv + γw). However, property 2 (the associative law  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  4  of addition of vectors) guarantees that both interpretations give us the same vector and consequently, there is nothing ambiguous about writing αu + βv + γw without parentheses. In fact, if v1 , v2 , . . . , vn ∈ V and α1 , α2 , . . . , αn ∈ R, then α1 v1 + α2 v2 + . . . + αn vn is a linear combination of the vectors v1 , v2 , . . . , vn . The element z ∈ V described in property 3 (and used in property 4) is called a zero vector and an element −v in property 4 is called a negative of v. By the commutative property, we also know that z + v = v and (−v) + v = z for every vector v ∈ V . Since V satisﬁes properties 1–4, the set V forms an abelian group under addition (see Chapter 13). Although we have only deﬁned a vector space over the set R of real numbers (and this is all we will deal with), it is not always required that the scalars be real numbers. Indeed, there are certain situations when complex numbers are not only suitable scalars but in fact, the preferred scalars. Other possibilities exist as well. Of course, we have seen two examples of vector spaces, namely, R2 and R3 (with addition and scalar multiplication deﬁned above). More generally, n-space Rn = R × R × . . . × R (n factors) is a vector space where addition of two vectors u = (a1 , a2 , . . . , an ) and v = (b1 , b2 , . . . , bn ) is deﬁned by u + v = (a1 + b1 , a2 + b2 , . . . , an + bn ) and scalar multiplication αu, where α ∈ R, is deﬁned by αu = (αa1 , αa2 , . . . , αan ). We now describe two vector spaces of a very diﬀerent nature. Recall that FR is the set of all functions from R to R, that is, FR = {f : f : R → R}. Therefore, the well-known trigonometric function f1 : R → R deﬁned by f1 (x) = sin x for all x ∈ R belongs to FR . The function f2 : R → R deﬁned by f2 (x) = 3x + x/(x2 + 1) for all x ∈ R also belongs to FR . For f, g ∈ FR and a scalar (real number) α, addition and scalar multiplication are deﬁned by (f + g)(x) = f (x) + g(x) (αf )(x) = α(f (x))  for all x ∈ R,  for all x ∈ R.  For the functions f1 and f2 deﬁned above, (f1 + f2 )(x) = sin x + 3x +  x2  x +1  and  (5f2 )(x) = 15x +  5x . +1  x2  Under these deﬁnitions of addition and scalar multiplication, FR is a vector space, the veriﬁcation of which depends only on ordinary addition and multiplication of real numbers. As an illustration, we verify that FR satisﬁes properties 2-5 of a vector space. First we verify property 2. Let f, g, h ∈ FR . Then ((f + g) + h)(x) = (f + g)(x) + h(x) = (f (x) + g(x)) + h(x) = f (x) + (g(x) + h(x)) = f (x) + (g + h)(x) = (f + (g + h))(x)  5 for all x ∈ R. Therefore, (f + g) + h = f + (g + h). Second we show that FR satisﬁes property 3 of a vector space. Deﬁne the (constant) function f0 : R → R by f0 (x) = 0 for all x ∈ R. We show that f0 is a zero vector for FR . For f ∈ FR , (f + f0 )(x) = f (x) + f0 (x) = f (x) + 0 = f (x) for all x ∈ R. Therefore, f + f0 = f . The function f0 is called the zero function in FR . Next we show that FR satisﬁes property 4 of a vector space. For each function f ∈ FR , deﬁne the function −f : R → R by (−f )(x) = −(f (x)) for all x ∈ R. Since (f + (−f ))(x) = f (x) + (−f )(x) = f (x) + (−f (x)) = 0 = f0 (x) for all x ∈ R, it follows that f + (−f ) = f0 and so −f is a negative of f . Finally, we show that FR satisﬁes property 5 of a vector space. Let f, g ∈ FR and α ∈ R. Then, for each x ∈ R, (α(f + g))(x) = α ((f + g)(x)) = α (f (x) + g(x)) = αf (x) + αg(x) = (αf )(x) + (αg)(x) = (αf + αg)(x) and so α(f + g) = αf + αg. We now consider a special class of real-valued functions deﬁned on R. These functions are important in many areas of mathematics, not only linear algebra. A function p : R → R is called a polynomial function (actually a polynomial function over R) if p(x) = a0 + a1 x + . . . + an xn for all x ∈ R, where n is a nonnegative integer and a0 , a1 , . . . , an are real numbers. The expression p(x) itself is called a polynomial in x. You may recall that if an = 0, then n is the degree of p(x). The zero function f0 is a polynomial function. It is assigned no degree, however. We denote the set of all polynomial functions over R by R[x]. Hence R[x] ⊆ FR . Let f, g ∈ R[x] and let α ∈ R. Then f (x) = a0 + a1 x + . . . + an xn and g(x) = b0 + b1 x + . . . + bm xm , where n and m are nonnegative integers and ai , bj ∈ R for 0 ≤ i ≤ n and 0 ≤ j ≤ m. If we assume, say, that m ≥ n, then the sum f + g is the polynomial function deﬁned by (f + g)(x) = f (x) + g(x) = (a0 + b0 ) + (a1 + b1 )x + . . . + (an + bn )xn + bn+1 xn+1 + . . . + bm xm ; while the scalar multiple αf of f by α is the polynomial function deﬁned by (αf )(x) = α(f (x)) = (αa0 ) + (αa1 )x + . . . + (αan )xn . These deﬁnitions are, of course, exactly the same as the sum of two elements of FR and the scalar product of an element of FR by a real number. Actually, R[x] is itself a vector space over R under the addition and scalar multiplication we have just deﬁned. For example, let f, g ∈ R[x]. Since R[x] ⊆ FR and addition in R[x] is deﬁned exactly the same as in FR , it follows that f + g = g + f ; that is, property 1 of a vector space is satisﬁed. By the same reasoning, property 2 and properties 5-8 are satisﬁed as well. The zero function f0 is in R[x] and we know that f + f0 = f for all f ∈ FR . Hence p + f0 = p for all p ∈ R[x]. So f0 is a zero vector for R[x]. For f ∈ R[x] deﬁned by f (x) = a0 + a1 x + . . . + an xn , we know that −f is given by (−f )(x) = −(f (x)) = (−a0 ) + (−a1 )x + . . . + (−an )xn . Thus −f ∈ R[x] is a negative of f . Thus properties 3 and 4 are satisﬁed as well, and so R[x] is a vector space over R.  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  6  15.3 Matrices Among the best known and most important examples of vector spaces are those concerning matrices. A rectangular array of real numbers is called a matrix. The plural of "matrix" is "matrices". (In general, a matrix need not be an array of real numbers — it can be a rectangular array of elements from any prescribed set. However, we will deal only with real numbers.) Thus a matrix has m rows and n columns for some pair m, n of positive integers and contains mn real numbers, each of which is located in some row i and column j for integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. A matrix with m rows and n columns is said to have size m × n and is called an m × n matrix (read as "m by n matrix"). Hence   √ 1 2 −3/2 B = 0 −.8 4 is a 2 × 3 matrix, while      4 1 9   3 2  C =  0 7 −1 1 is a 3 × 3 matrix. A general m × n matrix A is commonly written as     A =     a11 a21 .. .  a12 a22 .. .  am1 am2  . . . a1n . . . a2n .. .. . . . . . amn      .    Therefore, aij represents the element located in row i and column j of A. This is referred to as the (i, j)-entry of A. In fact, it is convenient shorthand notation to represent the matrix A by [aij ] and to write A = [aij ]. The ith row of A is [ai1 ai2 . . . ain ] and the jth column is   a1j    a2j   . .  .   .  amj For two matrices to be equal, they must have the same size. Furthermore, two m × n matrices A = [aij ] and B = [bij ] are equal, written as A = B, if aij = bij for all integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. That is, A = B if A and B have the same size and corresponding entries are equal. Hence, in order for   A=  2 x −3 1/2 4 0      and  B=  2 4/5 −3 y 4 0    to be equal, we must have x = 4/5 and y = 1/2. For positive integers m and n, let Mmn [R] denote the set of all m × n matrices whose entries are real numbers. If m = n, then the matrices are called square matrices. The set of all m × m (square) matrices whose entries are real numbers is also denoted by Mm [R]. We now deﬁne addition and scalar multiplication in Mmn [R]. Let A, B ∈ Mmn [R], where A = [aij ] and B = [bij ]. The sum A + B of A and B is deﬁned as that m × n matrix [cij ], where cij = aij + bij for all integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. For α ∈ R, the scalar multiple αA of A by α is deﬁned as αA = [dij ], where dij = αaij for all integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. For example, if  7   A= then    A+B =  2 −1 −3 0 4 0      and  5 −10 −1 −2 9 0  B=  3 −9 2 −2 5 0      and  (−2)A =    ,  −4 2 6 0 −8 0    .  Under this addition and scalar multiplication, Mmn [R] is a vector space. As an illustration, we verify that properties 1 and 3-5 of a vector space are satisﬁed in M2 [R]. Let α ∈ R and let   a11 a12 a21 a22  A= Then      a11 a12 a21 a22  A+B =     +      and  =    b11 b12 b21 b22  b11 + a11 b12 + a12 b21 + a21 b22 + a22  B=    =      =  b11 b12 b21 b22    .  a11 + b11 a12 + b12 a21 + b21 a22 + b22  b11 b12 b21 b22      +  a11 a12 a21 a22     = B + A.  This veriﬁes property 1 of a vector space. We see here that verifying property 1 depended only on the deﬁnition of addition of matrices and the fact that real numbers are commutative under addition.   Let Z =    0 0 , often called the 2 × 2 zero matrix. Then 0 0   A+Z =   =  a11 a12 a21 a22 a11 a12 a21 a22      0 0 0 0  +       =  a11 + 0 a12 + 0 a21 + 0 a22 + 0    =A  and so Z is a zero element of M2 [R], thereby verifying property 3.   Next, let −A =  −a11 −a12 −a21 −a22   A + (−A) =    . Consequently,  a11 a12 a21 a22      +  −a11 −a12 −a21 −a22      =  0 0 0 0    = Z,  and so −A is a negative of A. Therefore, property 4 is satisﬁed. We note also that if A is multiplied by the scalar −1, then we obtain   (−1)A = (−1)  a11 a12 a21 a22      =  −a11 −a12 −a21 −a22        = −A.  Finally,   α(A + B) = α  a11 + b11 a12 + b12 a21 + b21 a22 + b22  =  α(a11 + b11 ) α(a12 + b12 ) α(a21 + b21 ) α(a22 + b22 )    CHAPTER 15. PROOFS IN LINEAR ALGEBRA  8   =    αa11 + αb11 αa12 + αb12 αa21 + αb21 αa22 + αb22    a11 a12 a21 a22  = α      +α    =  b11 b12 b21 b22    αa11 αa12 αa21 αa22      +  αb11 αb12 αb21 αb22    = αA + αB.  Under the right set of circumstances, matrices can also be multiplied — although this is, of course, not a requirement for a vector space. Let A = [aij ] be an m × n matrix and B = [bij ] be an n × r matrix, that is, let A and B be two matrices, where the number of columns in A equals the number of rows in B. In this case, we deﬁne the product AB of A and B as that m × r matrix [cij ], where n  cij = ai1 b1j + ai2 b2j + . . . + ain bnj =  aik bkj  (15.1)  k=1  for all integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ r. Hence the (i, j)-entry of AB is obtained from the ith row of A and jth column of B, that is,   [ai1 ai2 . . . ain ]        and  b1j b2j .. .         bnj by multiplying corresponding terms of this row and column and then adding all n products. The expression (15.1) is referred to as the inner product of the ith row of A and the jth column of B. For example, let   A=  1 −3 5 0 −1 0 6 2      and      B=    1 −6 5 2 0 1   . 3 3 2  −6 9 0  Since A is a 2 × 4 matrix and B is a 4 × 3 matrix, the product AB is deﬁned and, in fact, AB = [cij ] is the 2 × 3 matrix, where the six inner products are c11 = 1 · 1 + (−3) · 2 + 5 · 3 + 0 · (−6) = 10 c12 = 1 · (−6) + (−3) · 0 + 5 · 3 + 0 · 9 = 9 c13 = 1 · 5 + (−3) · 1 + 5 · 2 + 0 · 0 = 12 c21 = (−1) · 1 + 0 · 2 + 6 · 3 + 2 · (−6) = 5 c22 = (−1) · (−6) + 0 · 0 + 6 · 3 + 2 · 9 = 42 c23 = (−1) · 5 + 0 · 1 + 6 · 2 + 2 · 0 = 7. Hence    AB =  10 9 12 5 42 7    .  On the other hand, since the matrix B above is a 4 × 3 matrix and A is a 2 × 4 matrix, the product BA is not deﬁned. Certainly, however, if A and B are any two square matrices of the same size, then AB and BA are both deﬁned though they need not be equal. For example, if  9   A= then    AB =  1 2 1 2 2 1 2 1      and  B=    0 1 1 0   , while  BA =    1 2 1 2  ,   .  15.4 Some Properties of Vector Spaces Although we have now seen several diﬀerent vector spaces, there are a number of properties that these vector spaces have in common (in addition to the eight deﬁning properties). Indeed, there are a number of additional properties that all vector spaces have in common. Since vector spaces are deﬁned by eight properties, one might expect, and rightfully so, that any other properties they have in common are consequences of these eight properties. According to property 3, every vector space contains at least one zero vector and by property 4, every vector has at least one negative. We show that "at least one" can be replaced by "exactly one" in both instances. Actually, these are consequences of the fact that every vector space is a group under addition (Chapter 13). We verify these nevertheless. Theorem 15.1  Every vector space has a unique zero vector.  Proof. Let V be a vector space and assume that z and z are both zero vectors in V . Since z is a zero vector, z + z = z . Moreover, since z is a zero vector, z + z = z. Therefore, z = z + z  = z + z = z . As a consequence of Theorem 15.1, we now know that a vector space V possesses only one zero vector z that satisﬁes property 3 of a vector space. Hence we can now refer to z as the zero vector of V . Theorem 15.2  Let V be a vector space. Then every vector in V has a unique negative.  Proof. Let v ∈ V and assume that v1 and v2 are both negatives of v. Thus v + v1 = z and v + v2 = z. Hence v1 = v1 + z = v1 + (v + v2 ) = (v1 + v) + v2 = z + v2 = v2 . Proof Analysis Let's revisit the proof of Theorem 15.2. We wanted to show that each vector v has only one negative. We assumed that there were two negatives of v, namely v1 and v2 . Our goal then was to show that v1 = v2 . We started with v1 . Our idea was to add z to v1 , as this sum is the vector v1 again. Since z can also be expressed as v + v2 , we made this substitution, bringing the vector v2 into the discussion. Eventually, we showed that this expression for v1 was also equal to v2 . There is another approach we could have tried. Since v1 and v2 are both negatives of v, it follows that v + v1 = z and v + v2 = z, that is, v + v1 = v + v2 . If we add the same vector to both v + v1 and v + v2 , we obtain equal vectors (since v + v1 = v + v2 ). A good choice of a vector to add to both v + v1 and v + v2 is a negative of v (either one!). This gives us the following list of equalities: v1 + (v + v1 ) = v1 + (v + v2 ) (v1 + v) + v1 = (v1 + v) + v2 z + v1 = z + v 2 v1 = v2 .  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  10  Although this string of equalities results in v1 = v2 , this is not a particularly well-written proof. However, since our goal is to show that v1 = v2 , this suggests a way to arrive at our goal. We start with v1 (at the bottom of the left column), proceed upward, then to the right, and downward, producing v1 = z + v1 = (v1 + v) + v1 = v1 + (v + v1 ) = v1 + (v + v2 ) = (v1 + v) + v2 = z + v2 = v2 , which is similar to the proof given in Theorem 15.2 (though a bit longer).  ♦  As a consequence of Theorem 15.2, we can now refer to −v as the negative of v. Of course, the zero vector z has the property that z + z = z. However, no other vector has this property. Theorem 15.3 Proof.  Let V be a vector space. If v is a vector such that v +v = v, then v = z.  Since v + (−v) = z, it follows that z = v + (−v) = (v + v) + (−v) = v + (v + (−v)) = v + z = v.  A proof like that given for Theorem 15.3 can be obtained by adding −v to the equal vectors v + v and v and proceeding as we did in the discussion following the proof of Theorem 15.2. Also, see Exercise 15.6(b). We now describe two other properties concerning the zero vector that are consequences of Theorem 15.3. Corollary 15.4  Let V be a vector space. Then  (i) 0v = z for every vector v in V and (ii) αz = z for every scalar α ∈ R. Proof.  First, we prove (i). Observe that 0v = (0 + 0)v = 0v + 0v.  By Theorem 15.3, 0v = z. Next we verify (ii). Observe that αz = α(z + z) = αz + αz. Again, by Theorem 15.3, αz = z. Hence, by Corollary 15.4, 0v = z for every vector v in a vector space and αz = z for every scalar α. That is, if either α = 0 or v = z, then αv = z. We now show that the converse of this statement is true as well. Theorem 15.5  Let V be a vector space. If αv = z, then either α = 0 or v = z.  11 Proof. If α = 0, then, of course, the statement is true. So we may assume that α = 0. In this case, v = 1v =  1 α v= α  1 (αv) = α  1 z = z. α  Another useful property is that the scalar multiple of a vector by −1 is the negative of that vector. Actually, we have observed this earlier with two particular vector spaces but this is true in general. Theorem to Prove  If v is a vector in a vector space, then (−1)v = −v.  Proof Strategy Since v has a unique negative, to show that (−1)v = −v, we need only verify that the sum of v and (−1)v is z. ♦ Theorem 15.6 Proof.  If v is a vector in a vector space, then (−1)v = −v.  Observe that v + (−1)v = 1v + (−1)v = (1 + (−1))v = 0v = z.  Hence (−1)v = −v. 15.5 Subspaces Earlier we saw that FR = {f : f : R → R} is a vector space (under function addition and scalar multiplication). Since the set R[x] of all polynomial functions over R is a subset of FR and the addition and scalar multiplication deﬁned in R[x] are exactly the same as those deﬁned in FR , it was considerably easier to show that R[x] is a vector space. This idea can be made more general. For a vector space V , a subset W of V is called a subspace of V if W is vector space under the same addition and scalar multiplication deﬁned on V . Hence if W is a subspace of a known vector space V , then W itself is a vector space. Since every subspace contains a zero vector, W must be nonempty. As we study vector spaces further, we will see that certain subspaces appear regularly and consequently it is beneﬁcial to have an understanding of subspaces. Furthermore, some sets having an addition and scalar multiplication deﬁned on them are subsets of known vector spaces and can be shown to be vector spaces more easily by verifying that they are subspaces. What is required to show that a subset W of a vector space V is a subspace of V ? Of course, W must satisfy the eight properties required of all vector spaces. In addition, if u, v ∈ W , then u + v must belong to W . This property is expressed by saying that W is closed under addition. Also, if α is a scalar (a real number) and v ∈ W , then αv must belong to W . We express this property by saying that W is closed under scalar multiplication. Property 1 (the commutative property) requires that u+v = v +u for every two vectors u and v in W . However, V is a vector space and satisﬁes property 1. Thus u+v = v+u and W satisﬁes property 1. By the same reasoning, property 2 and properties 5-8 are satisﬁed by W . These properties of W are said to be inherited from V . Hence for a nonempty subset W of a vector space V to be a subspace of V , it is necessary that W be closed under addition and scalar multiplication. Perhaps surprisingly, these requirements are suﬃcient as well for a nonempty subset W of V to be subspace of V .  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  12  Theorem 15.7 (The Subspace Test) A nonempty subset W of a vector space V is a subspace of V if and only if W is closed under addition and scalar multiplication. Proof. First, let W be a subspace of V . Certainly, W is closed under addition and scalar multiplication. For the converse, let W be a nonempty subset of V that is closed under addition and scalar multiplication. As we noted earlier, W inherits properties 1, 2 and 5-8 of a vector space from V . Since W is nonempty and is closed under addition and scalar multiplication, only properties 3 and 4 remain to be veriﬁed. Since W = ∅, there is some vector v in W . Since W is closed under scalar multiplication, it follows by Corollary 15.4(i) that 0v = z ∈ W . Hence W contains a zero vector (namely the zero vector of V ) and property 3 is satisﬁed. Now let w be any vector of W . Again, (−1)w ∈ W . However, by Theorem 15.6, (−1)w = −w ∈ W , and so w has a negative in W (namely the negative of w in V ). Thus property 4 is satisﬁed in W as well. The proof of Theorem 15.7 brought out two important facts. Namely, if W is a subspace of a vector space V , then W contains a zero vector (namely, the zero vector of V ) and for every vector w ∈ W , its negative −w belongs to W as well. Every vector space V (containing at least two elements) always contains two subspaces, namely V itself and the subspace consisting only of the zero vector of V . We now present several examples to illustrate how the Subspace Test (Theorem 15.7) can be applied to show that certain subsets of a vector space are (or are not) subspaces of that vector space. The ﬁrst two examples concern the vector space R3 . Result 15.8  The set W = {(a, b, 2a − b) : a, b ∈ R}  is a subspace of R3 . First observe that W contains all vectors of R3 whose 3rd coordinate is twice the ﬁrst coordinate minus the second coordinate. So for example, W contains (3, 2, 4), taking a = 3 and b = 2, and (0, 0, 0), taking a = b = 0. Of course, if W is to be a subspace of R3 , then it is essential that W contains the zero vector of R3 . Proof of Result 15.8. Since W contains the zero vector of R3 , it follows that W = ∅. To show that W is a subspace of V , we need only show that W is closed under addition (that is, if u, v ∈ W , then u+v ∈ W ) and that W is closed under scalar multiplication (that is, if u ∈ W and α ∈ R, then αu ∈ W ). Let u, v ∈ W and α ∈ R. Then u = (a, b, 2a − b) and v = (c, d, 2c − d), where a, b, c, d ∈ R. Then u + v = (a + c, b + d, 2(a + c) − (b + d)) ∈ W αu = (αa, αb, 2(αa) − (αb)) ∈ W. By the Subspace Test, W is a subspace of R3 . Example 15.9  Determine whether W = {(a, b, a2 + b) : a, b ∈ R}  is a subspace of R3 .  and  13 Solution. Taking a = b = 1, we see that u = (1, 1, 2) ∈ W . Then 2u = (2, 2, 4). Since / W . Since W is not closed under scalar multiplication, 4 = 22 + 2, it follows that 2u ∈ W is not a subspace of R3 . (The subset W of R is not closed under addition either since u+u∈ / W .) ♦ We next consider the vector space FR . We have already mentioned that R[x] is a subspace of FR . Also, the set CR = {f ∈ FR : f is continuous} is a subspace of FR . Indeed, R[x] is a subspace of CR as well. Result 15.10  Let F0 = {f ∈ FR : f (1) = 0}. Then F0 is a subspace of FR .  Hence the function f1 : R → R deﬁned by f1 (x) = x − 1 belongs to F0 , as does the zero function f0 : R → R deﬁned by f0 (x) = 0 for all x. Since F0 contains the zero function, F0 = ∅. Let f, g ∈ F0 and  Proof of Result 15.10. α ∈ R. Then  (f + g)(1) = f (1) + g(1) = 0 + 0 = 0  (αf )(1) = αf (1) = α · 0 = 0.  and  Thus f + g ∈ F0 and αf ∈ F0 . By the Subspace Test, F0 is a subspace of FR . Example 15.11  Determine whether F1 = {f ∈ FR : f (0) = 1}  is a subspace of FR . Solution. Observe that the functions g, h ∈ FR deﬁned by g(x) = x+1 and h(x) = x2 +1 / F1 . belong to F1 . However, (g+h)(x) = g(x)+h(x) = x2 +x+2 and (g+h)(0) = 2, so g+h ∈ Therefore, F1 is not a subspace of FR . ♦ The next example concerns the vector space M2 (R) of 2 × 2 matrices with real entries. Result 15.12    The set W =  a 0 b c    : a, b, c ∈ R  is a subspace of M2 (R). Hence W consists of all these 2 × 2 matrices whose (1, 2)-entry is 0. Thus the zero matrix, all of whose entries are 0, belongs to W . Since W contains the zero matrix, W = ∅. Let A, B ∈ W and  Proof of Result 15.12. α ∈ R. So    A=  a 0 b c      and  B=  d 0 e f    ,  where a, b, c, d, e, f ∈ R. Then   A+B =  a+d 0 b+e c+f      and  αA =  αa 0 αb αc    .  Therefore, A + B and αA belong to W and by the Subspace Test, W is a subspace of M2 (R).  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  14  15.6 Spans of Vectors In Result 15.12 we showed that the set   W =  a 0 b c    : a, b, c ∈ R     a 0 b c  is a subspace of M2 (R). Thus if A ∈ W , then A =  for some a, b, c ∈ R. Observe,  also, that   A =  a 0 b c    = a    1 0 0 0    =   a 0 0 0   +b    0 0 1 0    +   0 0 b 0   +c      0 0 0 c  +  0 0 0 1      .  In A every matrix in W ) is a linear combination of  other words,   (and, consequently,  1 0 0 0 0 0 , , and . Therefore, W is the set of all linear combinations of 0 0 1 0 0 1 these three matrices. This observation illustrates a more general situation. Recall that if V is a vector space, v1 , v2 , . . . , vn ∈ V , and α1 , α2 , . . . , αn ∈ R, then every vector of the form α1 v1 + α2 v2 + . . . + αn vn is a linear combination of the vectors v1 , v2 , . . . , vn . Thus, by taking α1 = α2 = . . . = αn = 0, we see that the zero vector is a linear combination of v1 , v2 , . . . , vn . Also, by taking αi = 1 for a ﬁxed integer i (1 ≤ i ≤ n) and all other scalars 0, we see that each vector vi is a linear combination of v1 , v2 , . . . , vn . We have noted that every linear combination of vectors in V is a vector in V and, of course, the set of all such linear combinations is a subset of V . In fact, more can be said of this subset. Theorem 15.13 Let V be a vector space containing the vectors v1 , v2 , . . . , vn . Then the set W of all linear combinations of v1 , v2 , . . . , vn is a subspace of V . Proof. Since W contains the zero vector of V , it follows that W = ∅. Let u, w ∈ W and let α ∈ R. Then u = α1 v1 + α2 v2 + . . . + αn vn and w = β1 v1 + β2 v2 + . . . + βn vn , where αi , βi ∈ R for 1 ≤ i ≤ n. Then u + w = (α1 + β1 )v1 + (α2 + β2 )v2 + . . . + (αn + βn )vn and αu = (αα1 )v1 + (αα2 )v2 + . . . + (ααn )vn . So both u + w and αu are linear combinations of v1 , v2 , . . . , vn and hence belong to W . Thus by the Subspace Test, W is a subspace of V . For vectors v1 , v2 , . . . , vn in a vector space V , the subspace W of V consisting of all linear combinations of v1 , v2 , . . . , vn is called the span of v1 , v2 , . . . , vn and is denoted by v1 , v2 , . . . , vn . Also, W is referred to as the subspace of V spanned by v1 , v2 , . . . , vn . By Result 15.12,   W =  a 0 b c      : a, b, c ∈ R  =  1 0 0 0     ,  0 0 1 0     ,  0 0 0 1    .  15 We saw in Result 15.8 that W = {(a, b, 2a − b) : a, b ∈ R} is a subspace of of R3 . Since (a, b, 2a − b) = a(1, 0, 2) + b(0, 1, −1), it follows that W is spanned by the vectors (1, 0, 2) and (0, 1, −1), that is, W = (1, 0, 2), (0, 1, −1). We consider another illustration of spans of vectors. Result 15.14 Let f1 , f2 , f3 , g2 and g3 be ﬁve functions in R[x] deﬁned by f1 (x) = 1, 2 f2 (x) = 1 + x , f3 (x) = 1 + x2 + x4 , g2 (x) = x2 , and g3 (x) = x4 for all x ∈ R, and let W = f1 , f2 , f3  and W  = f1 , g2 , g3 . Then W = W  . Since W and W  are sets of vectors (polynomial functions) and our goal is to show that W = W  , we proceed in the standard manner by showing that each of W and W  is a subset of the other. Proof of Result 15.14. First, we show that W ⊆ W  . Let f ∈ W . Then f = af1 + bf2 + cf3 for some a, b, c ∈ R. Hence, for each x ∈ R,       f (x) = a · 1 + b · 1 + x2 + c · 1 + x2 + x4    = (a + b + c) + (b + c) · x2 + c · x4 . Thus, f is also a linear combination of f1 , g2 , and g3 . Consequently, W ⊆ W  . It remains to show that W  ⊆ W . Let g ∈ W  . Then g = af1 + bg2 + cg3 for some a, b, c ∈ R. So, for each x ∈ R,     g(x) = a · 1 + b · x2 + c · x4 = (a − b) · 1 + b · 1 + x2 + c · x4         = (a − b) · 1 + (b − c) · 1 + x2 + c · 1 + x2 + x4 . Hence g is also a linear combination of f1 , f2 , f3 as well and so W  ⊆ W . From what we have seen, if V is a vector space containing the vectors v1 , v2 , . . . , vn , then W = v1 , v2 , . . . , vn  is a subspace of V (that contains v1 , v2 , . . . , vn ). Quite possibly other subspaces of V contain v1 , v2 , . . . , vn as well. Of course, V itself is a subspace of V containing v1 , v2 , . . . , vn . In a certain sense though, W is the smallest subspace of V containing v1 , v2 , . . . , vn . Theorem 15.15 Let V be a vector space containing the vectors v1 , v2 , . . ., vn and let W = v1 , v2 , . . . , vn . If W  is a subspace of V containing v1 , v2 , . . ., vn , then W is a subspace of W  . Proof. Since W and W  are subspaces of V , we need only show that W ⊆ W  . Let v ∈ W . Thus v = α1 v1 + α2 v2 + . . . + αn vn , where αi ∈ R for 1 ≤ i ≤ n. Since vi ∈ W  for 1 ≤ i ≤ n and W  is a subspace of V , it follows that v ∈ W  . Hence W ⊆ W  . There is a consequence of Theorem 15.15 that is especially useful. Corollary 15.16 Let V be a vector space spanned by the vectors v1 , v2 , . . ., vn . If W is a subspace of V containing v1 , v2 , . . ., vn , then W = V .  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  16  Proof. Since W is a subspace of V , certainly W ⊆ V . By Theorem 15.15, V ⊆ W . Thus W =V. To illustrate a number of the concepts and results introduced thus far, we consider an example concerning 3-space. Result 15.17 (i) For the vectors i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1), R3 = i, j, k. (ii) If w1 = (1, 1, 0), w2 = (0, 1, 1), and w3 = (1, 1, 1), then R3 = w1 , w2 , w3 . (iii) Let u1 = (1, 1, 1), u2 = (1, 1, 0), and u3 = (0, 0, 1). Then u1 , u2 , u3  = u1 , u2 . Proof. Let W1 = i, j, k. Since W1 is a subspace of R3 , it follows that W1 ⊆ R3 . We now show that R3 ⊆ W1 . Let v ∈ R3 . So v = (a, b, c), where a, b, c ∈ R. Then v = (a, 0, 0) + (0, b, 0) + (0, 0, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = ai + bj + ck. Hence v is a linear combination of i, j, and k, and so v ∈ W1 . Hence R3 ⊆ W1 . This implies that R3 = i, j, k and (i) is veriﬁed. Next, we verify (ii). Let W2 = w1 , w2 , w3 . To verify that R3 = W2 , it suﬃces to show by Corollary 15.16 and part (i) of this result that each of the vectors i, j, and k belongs to W2 . To show that i, j, and k belong to W2 , we are then required to show that each of i, j, and k is a linear combination of w1 , w2 , and w3 . Since i = (1, 0, 0) = (1, 1, 1) + (−1)(0, 1, 1), it follows that i = 0·w1 +(−1)w2 +1·w3 . Now j = (0, 1, 0) = (1, 1, 0)+(0, 1, 1)+(−1)(1, 1, 1); so j = 1 · w1 + 1 · w2 + (−1)w3 . Finally, k = (0, 0, 1) = (1, 1, 1) + (−1)(1, 1, 0) and so k = (−1)w1 + 0 · w2 + 1 · w3 . Hence R3 = W2 and (ii) is established. Finally, we verify (iii). Let W = u1 , u2  and W  = u1 , u2 , u3 . Since W  contains the vectors u1 and u2 , it follows by Theorem 15.15 that W ⊆ W  . By Corollary 15.16, to prove that W  ⊆ W , we need only show that each of the vectors u1 , u2 , and u3 belongs to W , that is, each of these three vectors is a linear combination of u1 and u2 . This is obvious for u1 and u2 as u1 = 1 · u1 + 0 · u2 and u2 = 0 · u1 + 1 · u2 . Thus it remains only to show that u3 is a linear combination of u1 and u2 . However, u3 = (0, 0, 1) = (1, 1, 1) + (−1)(1, 1, 0) = 1 · u1 + (−1)u2 , completing the proof. 15.7 Linear Dependence and Independence For the vectors u1 = (1, 1, 0) and u2 = (0, 1, 1) in R3 , the vector u3 = (−1, 1, 2) ∈ R3 is a linear combination of u1 and u2 since u3 = (−1, 1, 2) = (−1) · u1 + 2 · u2 = (−1) · (1, 1, 0) + 2 · (0, 1, 1). Therefore, in a certain sense, the vector u3 depends on u1 and u2 in a linear manner. This linear dependence can be restated as (−1) · u1 + 2 · u2 + (−1) · u3 = (0, 0, 0). This kind of dependence plays an important role in linear algebra. Let S = {u1 , u2 , . . . , um } be a nonempty set of vectors in a vector space V . The set S is called linearly dependent if there exist scalars c1 , c2 , . . . , cm , not all 0, such that c1 u1 + c2 u2 + . . . + cm um = z. If S is not linearly dependent, then S is said to be linearly independent. For S = {u1 , u2 , . . . , um }, we also say that the vectors u1 , u2 , . . . , um are  17 linearly dependent or linearly independent according to whether the set S is linearly dependent or linearly independent, respectively. Consequently, the vectors u1 , u2 , . . . , um are linearly independent if whenever c1 u1 + c2 u2 + . . . + cm um = z, then ci = 0 for each i (1 ≤ i ≤ m). We now consider some examples. Example 15.18 Determine whether S = {(1, 1, 1), (1, 1, 0), (0, 1, 1)} is a linearly independent set of vectors in R3 . Solution.  Let a, b, and c be scalars such that a · (1, 1, 1) + b · (1, 1, 0) + c · (0, 1, 1) = (0, 0, 0).  By scalar multiplication and vector addition, we have (a + b, a + b + c, a + c) = (0, 0, 0), arriving at the following system of equations: a+b = 0 a+b+c = 0 a + c = 0. Subtracting the ﬁrst equation from the second, we obtain c = 0. Substituting c = 0 into the third equation, we obtain a = 0. Substituting a = 0 and c = 0 into the second equation, we obtain b = 0. Hence a = b = c = 0 and S is linearly independent. ♦ Example 15.19  Determine whether   S=  2 1 1 0     ,  0 1 1 2     1 1 1 1  ,    is a linearly independent set of vectors in M2 (R). Solution.  Again, let a, b, and c be scalars such that   a  2 1 1 0      +b  0 1 1 2      +c  1 1 1 1      =  0 0 0 0    .  By scalar multiplication and matrix addition, we have   2a + c a+b+c  a+b+c 2b + c      =  0 0 0 0    .  This results in the system of equations: 2a + c = 0 a+b+c = 0 2b + c = 0 where the second equation actually occurs twice. From the ﬁrst and third equations, it follows that c = −2a and c = −2b and so a = b = −c/2. Substituting these values for a  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  18  and b in the second equation gives (−c/2) + (−c/2) + c = −c + c = 0, that is, the second equation is satisﬁed for every value of c. Hence, if we let c = −2, say, then a = b = 1 and   1·  2 1 1 0      +1·  0 1 1 2      + (−2) ·  Consequently, S is a linearly dependent set of vectors.  1 1 1 1      =  0 0 0 0    .  ♦  We now show that a familiar set of polynomial functions is linearly independent. Theorem to Prove For every nonnegative integer n, the set Sn = {1, x, x2 , . . . , xn } is linearly independent in R[x]. Proof Strategy The elements of Sn are actually functions, say Sn = {f0 , f1 , f2 , . . . , fn }, where fi : R → R is deﬁned by fi (x) = xi for 0 ≤ i ≤ n and for all x ∈ R. To show that Sn is linearly independent, we are required to show that if c0 · 1 + c1 x + c2 x2 + . . . + cn xn = 0, where ci ∈ R for 0 ≤ i ≤ n, then ci = 0 for all i. Of course, the question is how to do this. By choosing various values of x, we could arrive at a system of equations to solve. For example, we could begin by letting x = 0, obtaining c0 · 1 + c1 · 0 + c2 · 0 + . . . + cn · 0 = 0, and so c0 = 0. Therefore, c1 x + c2 x2 + . . . + cn xn = 0. Letting x = 1 and x = 2, we have c1 + c2 + . . . + cn = 0 and 2c1 + 22 c2 + . . . + 2n cn = 0. We could actually arrive at a system of n equations and n unknowns, but perhaps this is sounding complicated. On the other hand, from the statement of the theorem, another approach is suggested. Quite often when we see a theorem stated as "for every nonnegative integer n", we think of applying induction. The main challenge to such a proof would be to show that if {1, x, x2 , . . . , xk } is linearly independent, where k ≥ 0, then {1, x, x2 , . . . , xk+1 } is linearly independent. Hence we would be dealing with the equation c0 ·1+c1 x+c2 x2 +. . .+ck+1 xk+1 = 0 for ci ∈ R, 0 ≤ i ≤ k + 1, attempting to show that ci = 0 for all i (0 ≤ i ≤ k + 1). We already mentioned that showing c0 = 0 is not diﬃcult. In order to make use of the induction hypothesis, we need a linear combination of the polynomials 1, x, x2 , . . . , xk . One idea for doing this is to take the derivative of c0 · 1 + c1 x + c2 x2 + . . . + ck+1 xk+1 . ♦ Theorem 15.20 For every nonnegative integer n, the set Sn = {1, x, x2 , . . . , xn } is linearly independent in R[x]. Proof. We proceed by induction. For n = 0, we are required to show that S0 = {1} is linearly independent in R[x]. Let c be a scalar such that c · 1 = 0. Then surely c = 0 and so S0 is linearly independent. Assume that Sk = {1, x, x2 , . . . , xk } is linearly independent in R[x], where k is a nonnegative integer. We show that Sk+1 = {1, x, x2 , . . . , xk+1 } is linearly independent in R[x]. Let c0 , c1 , . . . , ck+1 be scalars such that c0 · 1 + c1 x + c2 x2 + . . . + ck+1 xk+1 = 0,  (15.2)  for all x ∈ R. Letting x = 0 in (15.2), we see that c0 = 0. Now taking the derivatives of both sides of (15.2), we see that c1 · 1 + 2c2 x + 3c3 x2 + . . . + (k + 1)ck+1 xk = 0 for all x ∈ R. By the induction hypothesis, Sk is a linearly independent set of vectors in R[x] and so c1 = 2c2 = 3c3 = . . . = (k + 1)ck+1 = 0, which implies that c1 = c2 = c3 = . . . = ck+1 = 0. Since c0 = 0 as well, it follows that Sk+1 is linearly independent.  19 Proof Analysis Before proceeding further, it is important that we understand the proof we have just given. The proof began by showing that S0 = {1} is linearly independent. What this means is that S0 consists of the single constant polynomial function f deﬁned by f (x) = 1 for all x ∈ R. Let c be a scalar (real number) such that c · f = f0 , where f0 is the zero polynomial function deﬁned by f0 (x) = 0 for all x ∈ R. Thus, for each x ∈ R, (cf )(x) = f0 (x) = 0, that is, (cf )(x) = c · f (x) = c · 1 = 0 = f0 (x) ♦  and so c = 0.  We now consider a result for a general vector space. Result 15.21 If v1 , v2 , and v3 are linearly independent vectors in a vector space V , then v1 , v1 + v2 , and v1 + v2 + v3 are also linearly independent in V . Proof.  Let a, b, and c be scalars such that a · v1 + b · (v1 + v2 ) + c · (v1 + v2 + v3 ) = z.  From this, we have (a + b + c) · v1 + (b + c) · v2 + c · v3 = z. Since v1 , v2 , and v3 are linearly independent, a + b + c = b + c = c = 0, from which it follows that a = b = c = 0 and so v1 , v1 + v2 , and v1 + v2 + v3 are linearly independent. Let S = {v1 , v2 , . . . , vn } be a set of n vectors, where n ∈ N, and let S  be a nonempty subset of S. Then |S  | = m for some integer m with 1 ≤ m ≤ n. Since the order in which the elements of S are listed is irrelevant, these elements can be rearranged and relabeled if necessary so that S  = {v1 , v2 , . . . , vm }. This fact is quite useful at times. Theorem 15.22 Let S be a ﬁnite nonempty set of vectors in a vector space V . If S is linearly independent in V and S  is a nonempty subset of S, then S  is also linearly independent in V . Proof. We may assume that S  = {v1 , v2 , . . . , vm } and S = {v1 , v2 , . . . , vm , vm+1 , . . . , vn }, where then 1 ≤ m ≤ n. If m = n, then S  = S and surely S  is linearly independent. Thus we can assume that m < n. Let c1 , c2 , . . . , cm be scalars such that c1 v1 + c2 v2 + . . . + cm vm = z. However, then, c1 v1 + c2 v2 + . . . + cm vm + 0vm+1 + 0vm+2 + . . . + 0vn = z.  (15.3)  Since S is linearly independent, all scalars in (15.3) are 0. In particular, c1 = c2 = . . . = cm = 0, which implies that S  is linearly independent. We can restate Theorem 15.22 as follows: Let V be a vector space, and let S and S  be ﬁnite nonempty subsets of V such that S  ⊆ S. If S is linearly independent, then S  is linearly independent. The contrapositive of this implication gives us: If S  is linearly dependent, then S is linearly dependent.  20  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  Although we have only discussed linear independence and linear dependence in connection with ﬁnite sets of vectors, these concepts exist for inﬁnite sets of vectors as well. An inﬁnite set of vectors in a vector space V is linearly independent if every ﬁnite nonempty subset of S is linearly independent. Equivalently, an inﬁnite set S of vectors in a vector space V is linearly dependent if some ﬁnite nonempty subset of S is linearly dependent. Every example we have seen of a (ﬁnite) set S of linearly dependent vectors in some vector space V gives rise to an inﬁnite set T of linearly dependent vectors; namely, any inﬁnite subset T of V such that S ⊆ T is linearly dependent. But what is an example of a vector space that contains inﬁnitely many linearly independent vectors? We provide such an example now. Result 15.23  The set T = {1, x, x2 , . . .} is linearly independent in R[x].  Proof. Let S be a ﬁnite nonempty subset of T . Then there is a largest nonnegative integer m such that xm ∈ S. Therefore, S ⊆ Sm = {1, x, x2 , . . . , xm }. By Theorem 15.20, Sm is linearly independent in R[x] and by Theorem 15.22, S is linearly independent. Consequently, T is linearly independent in R[x]. 15.8 Linear Transformations We have seen that many properties of a vector space V , subspaces of V , the span of a set of vectors in V , and linear independence and linear dependence of vectors in V deal with a common concept: linear combinations of vectors. Perhaps this is not unexpected in an area of mathematics called linear algebra. There are occasions when two vectors spaces V and V  are so closely linked that with each vector w ∈ V , there is an associated vector w ∈ V  such that the vector associated with αu + βv in V is αu + βv in V  . Such an association describes a function from V to V  . In particular, a function f : V → V  is said to preserve linear combinations of vectors if f (αu + βv) = αf (u) + βf (v) for all u, v ∈ V and every two scalars α and β. If f : V → V  has the property that f (u + v) = f (u) + f (v) for all u, v ∈ V , then f is said to preserve addition; while if f (αu) = αf (u) for all u ∈ V and every scalar α, then f is said to preserve scalar multiplication. Let z be the zero vector of V  . If f : V → V  preserves linear combinations and u, v ∈ V , then f (u + v) = f (1 · u + 1 · v) = 1 · f (u) + 1 · f (v) = f (u) + f (v) and f (αu) = f (αu + 0v) = αf (u) + 0f (v) = αf (u) + z = αf (u). Hence if f : V → V  is a function that preserves linear combinations, then f preserves addition and scalar multiplication as well. Conversely, suppose that f : V → V  is a function that preserves both addition and scalar multiplication. Then for u, v ∈ V and scalars α and β, f (αu + βv) = f (αu) + f (βv) = αf (u) + βf (v), that is, f preserves linear combinations. Because functions that preserve linear combinations are so important in linear algebra, they are given a special name. Let V and V  be vector spaces. A function T : V → V  is called a linear transformation if it preserves both addition and scalar multiplication, that is, if it satisﬁes the following conditions:  21 1. T (u + v) = T (u) + T (v) 2. T (αv) = αT (v) for all u, v ∈ V and all α ∈ R. There are some points in connection with these conditions that need to be addressed and that may not be self-evident. Condition 1 states that T (u + v) = T (u) + T (v) for every two vectors u and v of V . Hence the addition indicated in T (u + v) takes place in V ; while, on the other hand, since T (u) and T (v) are vectors in V  , the addition indicated in T (u) + T (v) takes place in V  . Also, condition 2 states that T (αv) = αT (v) for every vector v in V and every scalar α. By the same reasoning, the scalar multiplication indicated in T (αv) takes place in V , while the scalar multiplication in αT (v) takes place in V  . From what we have already seen, every linear transformation preserves linear combinations of vectors (hence the name). Let's consider an example of a linear transformation. Result 15.24 The function T : R3 → R2 deﬁned by T ((a, b, c)) = T (a, b, c) = (2a + c, 3c − b) is a linear transformation. Before we prove Result 15.24, let's be certain that we understand what this function does. For example, T (1, 2, 3) = (5, 7), T (1, −6, −2) = (0, 0), while T (0, 0, 0) = (0, 0) as well. We now show that T is a linear transformation. Let u, v ∈ R3 . Then u = (a, b, c) and v = (d, e, f ) for  Proof of Result 15.24. a, b, c, d, e, f ∈ R. Then  T (u + v) = T (a + d, b + e, c + f ) = (2(a + d) + c + f, 3(c + f ) − (b + e)) = (2a + c, 3c − b) + (2d + f, 3f − e) = T (a, b, c) + T (d, e, f ) = T (u) + T (v) and T (αu) = T (α(a, b, c)) = T (αa, αb, αc) = (2αa + αc, 3αc − αb) = α(2a + c, 3c − b) = αT (u), as desired.      a   3 Sometimes the vectors in R are written as "column vectors", that is, as  b  rather c than (a, b, c) or the "row vector" [a b c]. In this case, notice that the linear transformation T : R3 → R2 deﬁned by T (a, b, c) = (2a + c, 3c − b) can be described as           a   a 2 0 1   2a + c   T (a, b, c) = T  b  = ,  b = 0 −1 3 −b + 3c c c       a 2 0 1   , then this linear transformation can be that is, if we let v =  b  and A = 0 −1 3 c deﬁned in terms of the matrix A, namely,  T (v) = Av.  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  22  In general, if A is an m × n matrix, then the function T : Rn → Rm deﬁned by n T (u) = Au for an n × 1 column  vector u ∈ R is a linear transformation. For example,     1 −2 a c   ,v= , and α ∈ R, consider the 3 × 2 matrix A =  3 −1 . For u = b d 2 5   T (u + v) = T  a+c b+d                 1 −2  a + c − 2b − 2d   a+c   =  3 −1  =  3a + 3c − b − d  b+d 2 5 2a + 2c + 5b + 5d          a − 2b c − 2d a c     +T =  3a − b  +  3c − d  = T b d 2a + 5b 2c + 5d  = T (u) + T (v) and   T (αu) = T  αa αb               1 −2  αa − 2αb   αa   =  3 −1  =  3αa − αb  αb 2 5 2αa + 5αb      a − 2b a   = α  3a − b  = αT = αT (u). b 2a + 5b  Thus, T : R2 → R3 is a linear transformation. The proof for a general m × n matrix is similar. As another illustration of a linear transformation, we consider a well-known function from R[x] to itself. Result 15.25  The function D (for diﬀerentiation) from R[x] to R[x] deﬁned by D(c0 + c1 x + c2 x2 + . . . + cn xn ) = c1 + 2c2 x + . . . + ncn xn−1  is a linear transformation. Proof. Let f, g ∈ R[x], where f (x) = a0 + a1 x + a2 x2 + . . . + ar xr and g(x) = b0 + b1 x + b2 x2 + . . . + bs xs and, say, r ≤ s. Then D(f (x) + g(x)) = D ((a0 + a1 x + . . . + ar xr ) + (b0 + b1 x + . . . + bs xs ))   = D (a0 + b0 ) + (a1 + b1 )x + . . . + (ar + br )xr + br+1 xr+1 + . . . + bs xs    = (a1 + b1 ) + . . . + r(ar + br )xr−1 + (r + 1)br+1 xr + . . . + sbs xs−1   =      a1 + 2a2 x + . . . + rar xr−1 + b1 + 2b2 x + . . . + sbs xs−1    = D(f (x)) + D(g(x)) and    D(αf (x)) = D αa0 + αa1 x + αa2 x2 + . . . + αar xr    = αa1 + 2αa2 x + . . . + rαar xr−1 = α(a1 + 2a2 x + . . . + rar xr−1 ) = αD(f (x)). Since D preserves both addition and scalar multiplication, it is a linear transformation. There is a special kind a function from a vector space to itself that is always a linear transformation.  23 Result 15.26 Let V be a vector space over the set R of real numbers. For c ∈ R, the function T : V → V deﬁned by T (v) = cv is a linear transformation. Proof.  Let u, w ∈ V . Then T (u + w) = c(u + w) = cu + cw = T (u) + T (w);  while, for α ∈ R, T (αu) = c(αu) = (cα)(u) = (αc)(u) = α(cu) = αT (u). Therefore, T is a linear transformation. For c = 1, the function T deﬁned in Result 15.26 is the identity function; while for c = 0, the function T maps every vector into the zero vector. Consequently, both of these functions are linear transformations. We now look at functions involving other vector spaces. For a function f ∈ FR and a real number r, we deﬁne the function f + r by (f + r)(x) = f (x) + r for all x ∈ R. Example 15.27 Let r be a nonzero real number. Prove or disprove: The function T : FR → FR deﬁned by T (f ) = f + r is a linear transformation. Solution. Let f, g ∈ FR . Observe that T (f + g) = (f + g) + r, while T (f ) + T (g) = (f + r) + (g + r) = (f + g) + 2r. Since r = 0, it follows that T (f + g) = T (f ) + T (g). Therefore, T is not a linear transformation. ♦ Example 15.28  Let T : M2 (R) → M2 (R) be a function deﬁned by   a b c d  T      =  ad 0 0 bc    .  Prove or disprove: T is a linear transformation.    Solution. Since T  2  1 1 1 1    and 2T    =T  1 1 1 1  T is not a linear transformation. Example 15.29        =2  2 2 2 2 1 0 0 1      =     =    4 0 0 4  2 0 0 2    ,  ♦  The function T : M2 (R) → M2 (R) is deﬁned by   T  a b c d      =  Prove or disprove: T is a linear transformation.  a a c c    .  CHAPTER 15. PROOFS IN LINEAR ALGEBRA  24   a1 b1 c1 d1  Solution. Let   T  a1 b1 c1 d1      +     ,  a2 b2 c2 d2  a2 b2 c2 d2    ∈ M2 (R) and α ∈ R. Then      = T   =  a1 + a2 a1 + a2 c1 + c2 c1 + c2    = T    a1 + a2 b1 + b2 c1 + c2 d1 + d2  a1 b1 c1 d1      =      +T  a1 a1 c1 c1  a2 b2 c2 d2      +    a2 a2 c2 c2    ;  while    T  α  a1 b1 c1 d1      = T   = α  αa1 αb1 αc1 αd1  a1 a1 c1 c1        =   = αT    αa1 αa1 αc1 αc1  a1 b1 c1 d1    .  Since T preserves both addition and scalar multiplication, T is a linear transformation.  ♦  15.9 Properties of Linear Transformations An important property of linear transformations is that the composition of any two linear transformations (when the composition is deﬁned) is also a linear transformation. This fact has an interesting consequence as well. Theorem 15.30 Let V, V  , and V  be vector spaces. If T1 : V → V  and T2 : V  → V  are linear transformations, then the composition T2 ◦T1 : V → V  is a linear transformation as well. Proof.  For u, v ∈ V and a scalar α, observe that (T2 ◦ T1 )(u + v) = T2 (T1 (u + v)) = T2 (T1 (u) + T1 (v)) = T2 (T1 (u)) + T2 (T1 (v)) = (T2 ◦ T1 )(u) + (T2 ◦ T1 )(v)  and (T2 ◦ T1 )(αv) = T2 (T1 (αv)) = T2 (αT1 (v)) = αT2 (T1 (v)) = α(T2 ◦ T1 )(v). Therefore, T2 ◦ T1 is a linear transformation. As an example of the preceding theorem, let T1 : R3 → R2 and T2 : R2 → R3 be deﬁned by T1 (a, b, c) = (a + 2b − c, 3b + 2c) and T2 (a, b) = (b, 2a, a + b). Then T2 ◦ T1 : R3 → R3 is given by (T2 ◦ T1 )(a, b, c) = T2 (T1 (a, b, c)) = T2 (a + 2b − c, 3b + 2c) = (3b + 2c, 2a + 4b − 2c, a + 5b + c).  25 From what we mentioned earlier, T1 and T2 can also be deﬁned by               a   a 0 1   1 2 −1   a     a = 2 0  . T1  b  =  b  and T2 0 3 2 b b c c 1 1  Interestingly enough,              a a 0 1      1 2 −1   (T2 ◦ T1 )  b  =  2 0   b , 0 3 2 c 1 1 c  that is, the composition T2 ◦ T1 can be obtained by multiplying the matrices that describe T1 and T2 . Therefore, if we represent the linear transformations T1 and T2 by matrices A1 and A2 , respectively, then the matrix that represents T2 ◦ T1 is A2 A1 . This also