Copyright © 2010 Joseph Mack
v20100901, released under GPL-v3.
This course is to explain Euler's identity: e^(iπ) = -1.
Material/images from this webpage may be used, as long as credit is given to the author, and the url of this webpage (http://www.austintek.com/complex_numbers) is included as a reference.
|Much of the material here comes from "An Imaginary Tale, The Story of √-1", Paul J. Nahin 1998, Princeton University Press, ISBN 0-691-02795-1.|
i=√-1 is a relatively new concept in math. Without it, wave functions and much of modern physics and math cannot be understood.
An example: what pair of numbers have a sum of 2, and a product of 1  ?
Well what pair of numbers have a sum of 2, and a product of 2? This problem had no solution till recently. When mathematicians find problems which have a solution for some range of numbers, but not for other quite reasonable looking numbers, they assume that something is missing in mathematics, rather than that the problem has no solution for those numbers. The solution to this problem came with the discovery that i=√-1 behaved quite normally under the standard operations (+-*/). It took some time for i to be accepted as just another number, since there was initially no obvious way to represent it geometrically. In the initial confusion, numbers using i were called imaginary (or impossible) to contrast them with real numbers. The name imaginary is still in use and is most unfortunate: imaginary numbers are no less real than numbers which represent less than nothing.
The solution: (1+i, 1-i). Add these numbers and multiply them to see the result.
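Here's a minimal sketch of that check, assuming Python (which writes i as j). The variable names are made up for illustration.

```python
# Candidate pair from the text: 1+i and 1-i (Python spells i as j).
p = 1 + 1j
q = 1 - 1j

pair_sum = p + q          # the imaginary parts cancel
pair_product = p * q      # (1+i)(1-i) = 1 - i^2 = 2

print("sum     =", pair_sum)
print("product =", pair_product)
```

Both results come out with a zero imaginary part, i.e. the pair has a real sum of 2 and a real product of 2.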
Some nomenclature: 1 is a real number (we're not going to differentiate integers and floating point numbers here). i=√-1 is an imaginary number. 1+i is a complex number (it has real and imaginary parts). Often the terms imaginary and complex are used interchangeably, when it's not required for the listener to differentiate the two.
In computing terminology, the name "real numbers" differentiates them from "imaginary numbers". Computers can calculate with complex numbers (e.g. in electrical circuits), but no-one computes on imaginary numbers alone. The terminology "real number" is not a great one. Currently real numbers are implemented in floating point (as against fixed point) format and often the term "floating point number" is used for "real numbers". If you're not working with imaginary numbers, then there are reals and integers. If you're working with complex numbers, then there are real and there are imaginary numbers. (If you're working with complex numbers on a computer, you'll be doing electrical circuits and you won't have integers anywhere.) Despite the importance of clear communication, no better scheme has been proposed, and everyone has learned to work with this broken nomenclature.
As mentioned earlier, the equation
x^2-2x+1 = 0
has two solutions, both x=1 (i.e. the function has two roots, both 1). The similar equation
x^2-2x+2 = 0
has no solution for x ∈ ℜ (the symbol for the set of real numbers). The Fundamental Theorem of Algebra (http://en.wikipedia.org/wiki/Fundamental_theorem_of_algebra) says that a polynomial with real coefficients,
ax^n+bx^(n-1)...+yx +z = 0
has n roots, implying that the polynomial can be factorised into
(x-n_1)(x-n_2)...(x-n_n) = 0
This theorem was generally accepted to be true from the early days, but took much time to prove. Without imaginary numbers, functions like x^2-2x+2 had no (real) roots. With the arrival of imaginary numbers, we now know that the roots are 1+i, 1-i. However no-one knew what to do with imaginary numbers, till Wessel found a geometric interpretation for them.
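Here's a small sketch, assuming Python and its standard cmath module, that finds the roots of both quadratics with the usual quadratic formula. The function name quadratic_roots is made up for illustration; the only change from the school formula is that the square root is allowed to be complex.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*x^2 + b*x + c = 0, allowing complex results."""
    # cmath.sqrt accepts a negative argument and returns an imaginary result.
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

print(quadratic_roots(1, -2, 1))   # x^2-2x+1: both roots are 1
print(quadratic_roots(1, -2, 2))   # x^2-2x+2: the roots are 1+i and 1-i
```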
Wessel wasn't a mathematician, he was a surveyor.
|George Washington was also a surveyor. It was a good career for a technical person back then (you needed to be able to do trig, and use logs), just like electronics was for my generation and computer programming is now.|
|In the following diagrams, "0" is the origin. We will not differentiate a point at 1, and a line starting at 0 and ending at 1. The line represents a vector of length 1. The math here applies to both a point and a line. Imaginary numbers are used for vectors and all operations here will be diagrammed assuming they are vectors.|
Wessel said that if we have a line of length 1 like this
.........0-------->........
         0        1
and we multiply the line by i then, with the current knowledge of i, we don't know what we've got anymore. But say we multiply that line by i a second time. Now we've multiplied the original line by -1 and we have
.<-------0.......
-1       0
a line pointing in the opposite direction. Let's say we multiply by i a third time; again we have no idea what we've got. But let's say we multiply by i a fourth time, we've multiplied by +1 and we're back to our original line.
.........0-------->........
         0        1
Wessel's contribution was to guess what happened in the places where we didn't have a clue what had happened. Wessel said that the line had rotated into another (the 2nd) dimension, i.e. by 90°. Wessel then used 2-D space for his diagrams. Here's Wessel's line of length i.
         ^ i
         |
         |
         |
         |
.........0.........
         .        1
         .
         .
         .
         .
In agreement with the Cartesian convention (for the location of the 1st, 2nd, 3rd, 4th quadrant), multiplication by i rotates the line in the anticlockwise direction by 90°, i.e. a +ve rotation is anticlockwise. Here's Wessel's idea of -i.
         .
         .
         .
         .
         .
.........0.........
         |        1
         |
         |
         |
         v -i
Wessel said that if you labelled the 2-D Cartesian axes as real and imaginary then you could have a geometric interpretation of i.
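Here's a minimal sketch of the four multiplications by i, assuming Python's built-in complex type (i is written 1j). The names are made up for illustration.

```python
# Start with the line of length 1 along the real axis.
vector = 1 + 0j

# Multiply by i four times, recording where the vector points after each step.
powers = []
for step in range(1, 5):
    vector = vector * 1j      # one anticlockwise quarter turn
    powers.append(vector)
    print(step, "multiplication(s) by i:", vector)
```

The four results are i, -1, -i and 1 (some zeros may print with a minus sign, which is a floating point detail): a quarter turn each time, back to the start after four steps.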
Wessel's discovery was nice and interesting, but then Wessel went on to show that his interpretation of i worked with the normal operators +-*/. To show addition, here's Wessel's geometric representation of the algebraic sum 1+i.
         .i       ^ (1+i)
         .        |
         .        |
         .        |
         .        |
         .        |
.........0-------->.........
                  1
In the following diagrams, the box which holds an ascii char is rectangular, rather than square. Thus the slope of the ascii "/" is about 60° (rather than 45°). As well the angled arrowhead doesn't exactly look like an arrowhead, so you'll have to recalibrate the diagrams in your head.
When you add vectors, you join them up head-to-tail. Let's say you move 1m to the east and then 1m to the north (use the Wessel diagram above for 1+i). If you move 1m to the east, you move to the head of the 1m vector pointing to the east; then if you move 1m to the north, you start from the head of the east vector and head along the north vector by 1m. You've moved √2m to the north-east. The sum of vectors (called the resultant, short for resultant force, when the vector is a force) is the vector from the first tail to the last head.
|I'll use the resultant of the addition of the vectors from here, rather than showing the individual real and imaginary components.|
Here's the resultant of adding 1+i.
                _
         .i    /| (1+i)
         .    /
         .   /
         .  /
         . /
         ./
.........0........
               1
Example: You're travelling north at 1m/sec and the wind is coming from the west at 1m/sec, where does the wind appear to come from (i.e. if you held up a flag, what direction would it point)?
If you were facing N and not moving, the wind would hit the left of your head at 1m/sec. If you start moving N at 1m/s, then you will feel a wind from the N at 1m/sec. The resultant wind blowing past your head is the vector sum of those two winds. (In the diagram "()" is your head.)
stationary

1m/sec from W --->()

------
when moving

                    | 1m/sec from N
                    |
                    |
                    |
                    |
                    v
1m/sec from W ---->()

------
resultant

                    |\    1m/sec from N
                    | \
                    |  \
                    |   \
                    |    \
                    v    _|
1m/sec from W ----->()

------
another way of writing the resultant
(vector addition is commutative, i.e. you can do it in any order)

1m/sec from W ----->
              \     | 1m/sec from N
               \    |
                \   |
                 \  |
                  \ |
                  _|V
                    ()
The resultant is the vector from the west, followed (head to tail) by the vector from the north (or added in the reverse order). The resulting vector starts at the beginning of the vector from the west, and ends at the head of the vector from the north (or the joining of the two vectors in the reverse order). The flag would see the wind coming from the NW at √2m/sec.
Here again is the sum of 1+i as a vector
         .i    - (1+i)
         .    /|
         .   /
         .  /
         . /
         ./
.........0.......
         .    1
         .
         .
         .
         .-i
Draw similar vector diagrams for -1+i, -1-i, 1-i  .
These examples have all been addition of real and imaginary numbers. Here's some more examples. (I'm restricting them to ones easily representable by ascii art; you can do others yourself.) At the top of the diagram is the algebraic result. Below is Wessel's geometric representation.
(1-i) + (1+i) = 2

         .i
         .
         .
         .
         .
         .     1      2
.........0............._.
         .\           /|
         . \         /
         .  \       /
         .   \     /
         .    \   /
         .-i  _\|/
To subtract a vector, you draw it in the opposite direction, thus adding the -ve of the vector. Do the following subtraction (1+2i)-(2+i)
(1+2i)-(2+i)=-1+i

(1+2i)
                 .2i    -
                 .     /|
                 .    /
                 .    /
                 .   /
                 .   /
                 .i /
                 .  /
                 . /
                 . /
                 ./
     -2    -1    ./    1     2
.................0.............

(2+i)
                 .2i
                 .
                 .
                 .
                 .
                 .           _
                 .i         /|
                 .        /
                 .      /
                 .    /
                 .  /
     -2    -1    ./    1     2
.................0.............

-(2+i)
                 .2i
                 .
                 .
                 .
                 .
                 .
                 .i
                 .
                 .
                 .
                 .
     -2    -1    .     1     2
.................0.............
               / .
             /   .
           /     .
         /       .
       /         .
    |/_          .-1i
                 .
                 .
                 .
                 .
                 .
                 .-2i

(1+2i)-(2+i)=-1+i
                 .2i    -
                 .   / /|
                 . /  /
                /.    /
              /  .   /
            /    .   /
         |/_     .i /
                 .  /
                 . /
                 . /
                 ./
     -2    -1    ./    1     2
.................0.............
Do the following both algebraically and geometrically: (2+i)-(1-i), (-1+i)-(1-i), 1+i+i^2+i^3.
As we saw above, Wessel's initial insight was a geometric representation of multiplication by i. Let's do i*(1+i).
i*(1+i) = i-1 = -1+i
Geometrically, multiplication by i is (an anticlockwise) rotation by 90°
(1+i)
         .i    -
         .    /|
         .   /
         .  /
         . /
         ./
.........0.......
         .    1
         .
         .
         .
         .-i

i(1+i)=(i-1)=(-1+i)
  -      .i
  |\     .
    \    .
     \   .
      \  .
       \ .
.........0.......
  -1     .    1
         .
         .
         .
         .-i
By inspection, you can see that the geometric i*(1+i) is (-1+i).
What is (1+i)*(1-i)?
(1+i)*(1-i) = 1+i-i+1     (this is 1^2-i^2)
            = 2
Geometrically (1+i)*(1-i) is the addition of a real*vector + imaginary*vector i.e. 1*(1-i)+i*(1-i). You're adding a vector (or multiple of it) to the same vector (or multiple of it) that's been rotated 90°.
(1-i)
         .i
         .
         .
         .
         .
         .
.........0.......
         .\   1
         . \
         .  \
         .   \
         .    \
         .-i  _\|

i*(1-i)=1+i
         .i    _
         .    /|
         .   /
         .  /
         . /
         ./
.........0.......
         .    1
         .
         .
         .
         .-i

(1+i)*(1-i)=1*(1-i)+i*(1-i)
           =(1-i)+(1+i)
           =2
         .i    _
         .    /|\
         .   /   \
         .  /     \
         . /       \
         ./         \
.........0.........._\|
         .    1     2
         .
         .
         .
         .-i
Do the following algebraically and geometrically: (1+i)*(1+i); (-1+i)*(-1-2*i).
In Wessel's scheme, multiplication of a vector by a vector is addition of (a multiple of) the original vector (the real part) to (a multiple of) rotated copies of the original vector (the imaginary part).
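Here's a minimal sketch of that recipe, assuming Python's built-in complex type. The function name multiply_wessel and the second test value (2+3i) are made up for illustration; the function builds the product from a scaled copy of the vector plus a scaled copy of the vector rotated by 90°, and the result is compared with Python's own multiplication.

```python
def multiply_wessel(z, v):
    """Multiply v by z = a+bi as a*v plus b*(v rotated anticlockwise by 90 degrees)."""
    scaled = z.real * v             # the real part of z just scales v
    rotated = z.imag * (1j * v)     # multiplying by i rotates v by 90 degrees
    return scaled + rotated         # head-to-tail addition of the two pieces

example_1 = multiply_wessel(1 + 1j, 1 - 1j)
example_2 = multiply_wessel(2 + 3j, 1 - 1j)

print(example_1, (1 + 1j) * (1 - 1j))
print(example_2, (2 + 3j) * (1 - 1j))
```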
|I can't get long division to work for complex numbers.|
For complex numbers, no-one has come up with a geometric method of division using the Wessel diagram. However the Wessel diagram can be used for imaginary or real (i.e. not complex) divisors. Since division is the inverse of multiplication, then to divide, we multiply by the inverse. The inverse of an operation is the operation that undoes the original operation. Since multiplication by i is a rotation by 90°, division by i is a -90° rotation. Thus division by i is the same as multiplication by -i.
What is (1+i)/i?
version 1

(1+i)/i = 1/i + 1

what is 1/i?
1*i is 1 rotated by i.
1/i is 1 rotated by the inverse of i, i.e. clockwise by 90deg.
1/i = -i

combining terms
(1+i)/i = 1-i

----
version 2

(1+i)/i

multiply by i/i
        = i(1+i)/-1
        = -i(1+i)
        = 1-i

original vector (1+i)
         .i    -
         .    /|
         .   /
         .  /
         . /
         ./
.........0.......
         .    1
         .
         .
         .
         .-i

vector rotated by -i
(1+i)/i=(1-i)
         .i
         .
         .
         .
         .
         .
.........0.......
         .\   1
         . \
         .  \
         .   \
         .    \
         .-i  _\|
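Here's a quick sketch, assuming Python's built-in complex type, that checks both routes to (1+i)/i: dividing by i directly, and multiplying by -i (a clockwise quarter turn). The variable names are made up for illustration.

```python
z = 1 + 1j

by_division = z / 1j      # divide by i directly
by_rotation = z * -1j     # multiply by -i, i.e. rotate by -90 degrees

print(by_division, by_rotation)
```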
How about division by a complex divisor e.g. what is 1/(1-i)? We do the first step of division algebraically. The method used is to find the conjugate of the divisor.
The conjugate is the complex number which when multiplied by the divisor gives a real (not necessarily the unit real). We know that (a+b)(a-b)=a^2-b^2. The imaginary version of this is (a+bi)(a-bi)=a^2+b^2. For any complex number a+bi, the conjugate is a-bi.
Can you see that the conjugate is the mirror image (in the real axis) of the vector?
Here's the example worked by multiplying top and bottom by the conjugate of (1-i)
               (1+i)
1/(1-i) =    ----------
             (1-i)(1+i)

        =    (1+i)/2
Do this one yourself (1+i)/(1-i)  . The answer shows that 1+i and 1-i are orthogonal (at right angles). Can you see this?
Find a general formula for the inverse of a+bi in the form c+di  . No-one tries to remember this formula. Unless you're using it a lot, it's better to derive it from scratch each time.
|On the Wessel diagram it should be possible to show that the product of a vector and its complex conjugate is real, but I haven't figured out how to do it. It's simple enough algebraically, but to do it geometrically, you have to be able to construct a length a*b, which I don't know how to do with a pair of compasses and a ruler.|
The Wessel diagram doesn't help a whole lot with division, but at least you can use it to diagram the process.
The magnitude (or absolute value) of a number is the +ve version of its value and is written this way (with the pair of | symbols)
a = -2
b =  2

magnitude
|a| = 2
|b| = 2
If you need a symbol for a complex number, without explicitly referring to its components, you use this convention
z = a + bi
You'll then do your operations on z.
What is the magnitude of z? In complex numbers, the length of the vector (i.e. the distance of the tip of the vector from the origin) is called its magnitude, modulus or absolute value. I'll be using the term magnitude, because that's the term used, when I learned complex numbers (not because it's a better term - I assume all terms are equally valid). The magnitude of the vector has no sign (it's always positive, no matter which direction it's pointing).
z   = a + bi
|z| = sqrt(a^2+b^2)
Let's look at the formula for 1/z again.
z   = a+bi
|z| = sqrt(a^2+b^2)
1/z = a/(a^2+b^2) + (-b/(a^2+b^2))*i
    = (a-bi)/|z|^2
If you want to divide by z, you multiply by the conjugate and divide by the square of the magnitude. This insight probably doesn't get you much; you're still going to have to find it algebraically.
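Here's a minimal sketch of that rule, assuming Python's built-in complex type. The function name inverse and the test value 3+4i are made up for illustration.

```python
def inverse(z):
    """Return 1/z as conjugate(z) divided by the square of the magnitude."""
    magnitude_squared = z.real ** 2 + z.imag ** 2
    return z.conjugate() / magnitude_squared

# The worked example from the text: 1/(1-i) = (1+i)/2
print(inverse(1 - 1j))

# A number times its inverse should be 1 (up to floating point rounding).
z = 3 + 4j
print(z * inverse(z))
```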
The angle of a vector is the one it makes with the real axis (the axis with angle=0). This angle is called the angle of the vector (if you're dealing with math) or the phase (if you're dealing with waves or periodic functions). The symbol for the angle is usually ϑ (theta). In math and calculus, the angle is measured in radians (not degrees). Radians are more convenient for calculus, as a whole lot of dimensionless constants go to 1. The full circle is 2π radians. 90° is π/2 (pronounced "π on 2") radians.
|A math joke: You're being served pie by a mathematician. He says "how much?". You say "π on 2 (or π on 3) please". Because you said "on" he knows you asked in radian measure. (If you'd said "half a pie please", he would know that you wanted something different.) Others will think you're asking for twice as much pie as you've really asked for.|
In the diagram below, by the definition of tan(): tan(ϑ)=b/a, hence ϑ=tan^-1(b/a)=arctan(b/a).
         .i
         .   _
         .  /| (a+bi)
         . /
         ./theta
.........0.........
-1       .        1
         .
         .
         .
         .-i
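Here's a small sketch, assuming Python's standard math and cmath modules, that computes the magnitude and angle of a vector. The value 3+4i is a made-up example; cmath.polar() returns the (magnitude, angle) pair in one call, with the angle in radians.

```python
import cmath
import math

z = 3 + 4j    # hypothetical example vector a+bi with a=3, b=4

magnitude = math.sqrt(z.real ** 2 + z.imag ** 2)   # sqrt(a^2+b^2)
angle = math.atan(z.imag / z.real)                 # arctan(b/a), fine here since a>0

# The library routine gives the same pair directly.
polar_magnitude, polar_angle = cmath.polar(z)

print(magnitude, angle, math.degrees(angle))
print(polar_magnitude, polar_angle)
```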
Example: what is the magnitude and phase/angle of the vectors; (1+i), (1-i), -1, -i? (give the phase in radians.)
Show that doubling the angle of a unit vector is the same as squaring it.
Start with the unit vector z=(a+bi)/sqrt(a^2+b^2) (the vector has magnitude=1, tan(ϑ)=b/a).
You'll need this trig identity, the tangent half angle formula (http://en.wikipedia.org/wiki/Tangent_half-angle_formula), one of many half angle identities.
Trig identities are a staple of trig and calculus and are used to change one trig function into another more easily manipulated function (we'll see more examples below). In any conversion, the main problem is trivial bookkeeping mistakes (at least till you've had lots of practice and almost know the answer off by heart). Here you'll wind up with lots of terms in (b/a). To have a cleaner page (so you can spot mistakes) let t=b/a=tan(ϑ), then
tan(2ϑ) = 2t/(1-t^2)
What is tan(90°)?
Some of the trig identities can be derived geometrically (in this identity, the (1-t^2) denominator comes from the Pythagorean formula). You can derive them more simply using Euler's equation (below).
For the vector with phase=2ϑ use Z=A+Bi. It will have phase 2ϑ, where tan(2ϑ)=B/A is given by the half angle formula. To solve for A,B, you need a 2nd relationship for A,B. What is it  ?
Here's my solution  .
What is QED (http://en.wikipedia.org/wiki/Q.E.D.)? It's an abbreviation for the Latin phrase quod erat demonstrandum (which was to be demonstrated). The phrase is written in its abbreviated form at the end of a mathematical proof or philosophical argument, to signify that the last statement deduced was the one to be demonstrated; the abbreviation thus signals the completion of the proof. The humorous rendition in English is "quite easily done".
There's something sloppy in my proof. Can you spot it (did you do it correctly and then think that I knew more than you did and then "correct" your version)? Hopefully in an exam marks would be deducted for missing this. Here is the missing piece  . Are there two possibilities for the rotated vector? Go back over your proof, and find Z for the -ve version of A. Here is the answer  .
There aren't unique angles for the inverse of the trig functions sin(), cos(), tan(); you have to know which quadrant the vector is in to get an unambiguous angle. Thus the pair A,B=1,1 and A,B=-1,-1 both have the same tan() (as do the pair A,B=-1,1 and A,B=1,-1), but lie in opposite quadrants. For simple calculations, the usual inverse of tan() (which may be called any of atan(), arctan(), tan^-1()) avoids this problem by returning -π/2<ϑ<π/2 and ignores the other solutions. If you have the sign of both components x,y (or r,i) it's possible to disambiguate angles which have the same tan(), and return the correct ϑ. The inverse tan function that takes two arguments is atan2(), written here as atan2(x,y) (most programming languages take the arguments in the opposite order, atan2(y,x)). The function looks at both the sign of x,y and the ratio y/x. (You can feed the actual values for x,y or you can feed scaled pairs like (1,tan(ϑ)) or (-1,tan(ϑ)) as arguments to atan2(), to get the angle and the correct quadrant). Thus
atan2() returns -π<ϑ≤π: atan2(-1,-1) would return -3π/4 and not 5π/4, while atan2(1,-1) would return -π/4 and not 7π/4.
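Here's a minimal sketch of the difference between the one-argument and two-argument inverse tan, assuming Python's standard math module. Note that Python's math.atan2() takes its arguments in the order (y, x), the reverse of the (x,y) order used in the text; the wrapper names below are made up for illustration and keep the text's (x,y) order.

```python
import math

def angle_from_ratio(x, y):
    """One-argument inverse tan: only sees the ratio y/x (x must not be 0)."""
    return math.atan(y / x)

def angle_from_components(x, y):
    """Two-argument inverse tan: sees the signs of both x and y."""
    return math.atan2(y, x)    # Python's argument order is (y, x)

# (1,1) and (-1,-1) lie in opposite quadrants but have the same ratio y/x.
for x, y in [(1, 1), (-1, -1), (1, -1), (-1, 1)]:
    print((x, y),
          "atan:", math.degrees(angle_from_ratio(x, y)),
          "atan2:", math.degrees(angle_from_components(x, y)))
```

The one-argument version gives the same angle for each opposite pair; the two-argument version separates them.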
How did we get into the situation where we got two angles? The question asked for the vector with angle=2ϑ. We answered a different question: we looked for the angle whose tan() was the same as that of twice the angle. This is a reasonable first attempt, but using this method we get two vectors, facing in opposite directions, and only one of them is the correct one. Can we pick between the two vectors? Symbolically it's not possible to tell which is the vector at twice the angle and which vector is facing the opposite direction to twice the angle. As well, the double angle isn't necessarily numerically double the angle, as angles are modulo 2π: e.g. if ϑ=5π/4, then 2ϑ=π/2. It would be possible to determine the correct vector in a numerical example, but you can't use those for proofs.
The exercise asked to show
z^2 = Z
i.e. that z^2 and Z were the same; if you square the vector, then the result has double the angle. AND if you double the angle then you square the vector (i.e. the relationship works in both directions). What we've found (so far) is
z^2 ⇒ Z
i.e. that z^2 implies Z; if you square the vector, then the result has double the angle, but we haven't shown the reverse (that if you double the angle then you square the vector). The problem is that if we double the angle using the tan half angle formula, we get two vectors, only one of which is correct.
Is there a fix for this? If you're the first person here, you've done all the grunt work, guessed that z^2=Z, and done the algebraic proof, you don't want to give a talk where you conclude z^2⇒Z, but not z^2=Z, only to have someone come up to you at the end of the talk and give you what, in retrospect, is an obvious proof of the part you missed. Mathematicians are always impressed by an extension to a proof. Math is full of conjectures, statements which seem reasonable, but for which no-one can think of a proof. Many of these are hundreds of years old. Proving or disproving a conjecture, even a minor one, is always a landmark in math. Conversely, missing an obvious extension of a proof is regarded as a blunder. Sure, the first part of your proof only took an afternoon, while the extra bit took 6 months, but you keep quiet about it till you've either got it, or you know for sure that you can't do it or you can't prove that it's not true.
|For someone who stuck with it and finally got the answer, see the story of Andrew Wiles' solution to Fermat's Last Theorem. Wiles' original proof had an error picked up during review for publication. He took quite a while to find an alternate proof (at least a year). During that time there were calls for Wiles to publish the partial proof, by people who wanted to be the one to cross the finish line with the missing last bit of the proof. Wiles had to endure the selfishness of people purporting to be his peers. The world is full of people like this. Expect to encounter them as a result of your work.|
The current problem is the ambiguity in the inverse tan() function. We could try another approach, but even if you do, you should exhaust the inverse tan() method as well.
I looked at some numerical examples (n*π/8, n*π/6) without getting any insight. Everyone else handles the ambiguity with the atan2() approach, where we record the sign of A,B. Let's see how it works. If we build a table of all the possible signs of t matched up with all the possible signs of A,B how many entries will we need  ? Here's a table of the possible combinations of t,A,B, showing the correct and incorrect Z.
--------------z-------------  ----z^2----  ---------------------Z---------------------
                                                  ----correct=Z----  ---incorrect=-Z---
z               t     theta   z^2   phase  B/A    A    B    2theta   A    B    2theta
( 1+0i)         0     0       1     0      0      1    0    0        -1   -0   180
( 1+i)/sqrt(2)  1     45      0+i   90     inf    +0   +1   90       -0   -1   270
( 0+1i)         inf   90      -1    180    -0     -1   0    180      1    -0   360
(-1+i)/sqrt(2)  -1    135     0-i   270    -inf   +0   -1   270      -0   1    90
(-1+0i)         -0    180     1     0      -0     1    -0   0        -1   0    180
(-1-i)/sqrt(2)  1     225     0+i   90     inf    0    +1   90       -0   -1   270
( 0-i)          -inf  270     -1    180    0      -1   0    180      1    -0   0
( 1-i)/sqrt(2)  -1    315     0-i   270    -inf   0    -1   270      -0   1    90
This takes a bit of time to wade through, although you probably have a good idea in your head of what's going on. z rotates 0 to 2π while z^2 rotates twice through 0 to 2π. At the same time the solution we want for Z (let's call it +Z even though it has -ve magnitude half the time) also rotates twice from 0 to 2π while the incorrect solution for Z (let's call it -Z with the same proviso) rotates twice from π to π.
The way to make the proof work in both directions is to know the quadrant of z. This requires knowing both a,b. It would be nice if you could use t, but t is ambiguous - t has the same value in two (opposite) quadrants. Then calculate the expected quadrant for Z. Then using the tan half angle formula, calculate the vector rotated through 2ϑ and select the A,B pair in the expected quadrant. If you'd attempted this proof before atan2() had been discovered, you would have had to invent it.
This is a messy proof and requires ad hocery (i.e. manual fiddling and handling of individual cases, all of which is easily handled by a computer). No-one will like this proof, but everyone will agree that you've shown how to make it work in both directions i.e. you have shown under the appropriate circumstances that z^2=Z.
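Here's a numerical spot check (not a proof), assuming Python's standard cmath and math modules: for unit vectors at 45° steps around the circle, the squared vector is compared with the unit vector built at twice the angle. cmath.phase() uses the two-argument inverse tan internally, so the quadrant is handled for us. The function name doubled_angle_vector is made up for illustration.

```python
import cmath
import math

def doubled_angle_vector(z):
    """Return the unit vector whose angle is twice the angle of z."""
    theta = cmath.phase(z)              # quadrant-aware angle of z, in radians
    return cmath.rect(1.0, 2 * theta)   # unit vector at angle 2*theta

largest_difference = 0.0
for degrees in range(0, 360, 45):
    z = cmath.rect(1.0, math.radians(degrees))    # unit vector at this angle
    difference = abs(z * z - doubled_angle_vector(z))
    largest_difference = max(largest_difference, difference)
    print(degrees, z * z, doubled_angle_vector(z))

print("largest difference:", largest_difference)
```

The differences are at the level of floating point rounding, including at 45°, 135°, 225° and 315°.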
|When I did this proof in high school, we didn't handle the extra case. It was ignored or not mentioned. I didn't find out about atan2() till later in my life. I don't know what else we could have been doing that was so important that we were allowed to pass in an incorrect proof, without understanding its limitations.|
Why did I make a big deal about exploring the -ve root of A, which revealed an extra solution to Z? You've probably been mocked in class, by your classmates, for returning √4=±2. These people don't want to use their brains and want you to follow suit, to justify their position. There are always two square roots of a (non-zero) number. You have to go check what they both tell you. If you don't, you could wind up giving the following in a presentation.
"We now have the first samples of our patented squaring chip coming off the line. These have arrived on time and on budget at a cost of $1M. There's a minor problem though; they only work for half the range of inputs. The other half of the range gives a completely different answer. It took a while to figure out, but our high school summer intern noticed that we hadn't checked both square roots at an intermediate step in our calculations. We have a fix though; with a small change in the mask, we can make a chip that handles the other half of the input range. We'll have to make another chip as well, to decide which of our two squaring chips gets to handle the calculation. This should only put us back two months, which isn't too bad, considering the slippage in our other targets for this project."
History is replete with examples of (missed or taken) opportunities like this.
There's something else sloppy too, but you probably won't notice it, unless you've done calculus or are familiar with limits. At one stage in the proof you replaced (1-t^2)/(1-t^2) with 1. This is fine unless t^2=1, in which case you're replacing 0/0 with 1. 0/0 is undefined and you must leave (1-t^2)/(1-t^2) in the expression to let people know that the result blows up at t^2=1. The derivation as written has holes at ϑ=45°,135°,225° and 315°, i.e. it can't tell you A+Bi for the 2ϑ vector for those values of ϑ. A reasonable argument is that the limit of (1-t^2)/(1-t^2) as t^2 approaches 1 is 1, but you have to state it for the proof to be complete.
Back to the original question: what would have happened to the comparison between the rotated version and the squared version, if the original vector had not been of unit length  ?
Why did I ask you to show that the vector, rotated through 2ϑ, was the same as the vector squared, rather than asking if you recognised the rotated vector as anything obvious and let you figure it out? Let's say you're the first one in an area (say the Wessel diagram had just been invented), this relationship wouldn't necessarily be obvious. You'd have to play around for a while, possibly being misdirected through mistakes in your calculations. You may have the correct answer for the vector rotated through 2ϑ, but you mightn't initially recognise (1-t^2+2ti) in the numerator as (1+ti)^2. You might resort to handworked examples to look for hints as to the behaviour of the vector rotated by 2ϑ. Eventually you'd guess that the squared and double angle rotated vectors were the same. Then you'd calculate both of them separately (as we did here) and confirm that they're the same. It's only then that you'd see a simpler path to the proof. So I got you to do the problem the way that the first person probably did it, by guessing the answer. Guessing answers is a well trodden path to solving problems. Finding the proof often comes only after you know the answer.
The Fundamental Theorem of Algebra says that an nth order polynomial, with real coefficients a_0..a_n
a_n x^n + a_(n-1) x^(n-1) + ... + a_1 x + a_0 = 0
has n roots. As it turns out, the roots are on the real axis, or are complex conjugate pairs (a±bi). For n>4 there is no general formula for the roots in terms of radicals (formulas like the quadratic formula exist for n=3 and n=4, but they are much messier).
The name Fundamental Theorem of Algebra means that it's THE fundamental piece of knowledge in algebra. With its standing as the basis of algebra, a student should hear about it in the first weeks of their algebra education. Despite many years of math education, and a mathematically oriented career, I didn't hear of the Fundamental Theorem of Algebra (FTA) till late in my adulthood and then only as a result of my own reading. I now regard this state of affairs as a reflection on the education system. If I'd known about it earlier, it would have helped bring together imaginary numbers, trig and differential equations. Of course if they don't want you to understand any of this stuff and leave you floundering in pools of unrelated knowledge, feeling stupid about your inability to understand math, then it's essential that you don't find out about the Fundamental Theorem of Algebra. (Check the index of your Algebra text books - is the FTA in there? No? Then you aren't studying algebra.)
Everyone was amazed to find a geometric interpretation of imaginary numbers and to find that they obeyed the same rules as real numbers. Did Wessel find anything that couldn't be done algebraically? Let's look at the roots of x^n=1.
The FTA says that the equation x^2=1 has two roots (in this case, square roots). The FTA doesn't tell you their value, or how to find them, but only that they exist and that there are exactly two of them. We know their values (x=+1,-1) from other methods. Where do the two square roots of 1 lie on the Wessel diagram  ? What angles do these two vectors make with the (+ve) real axis in the Wessel diagram  ?
If you haven't already noticed, confirm algebraically that i is a 4th root of 1, i.e. i=1^(1/4). How many 4th roots of 1 are there and what are they  ? Where do the 4 4th roots lie on the Wessel diagram  ? What angles do these four vectors make with the +ve real axis in the Wessel diagram  ?
How many cube roots of 1 does the FTA say must exist  ? What angle do you guess they make with the +ve real axis  ? Locate them on the Wessel diagram  ? Write the cube roots of 1 as complex numbers (if it helps, sin(30°)=0.5)  ? Cube each of your contenders to confirm that they are the cube roots.
The FTA (which hadn't been proven back then) said that there must be 3 cube roots of 1, but without the Wessel diagram, there was no way to find them. The Wessel diagram showed the location of the roots of x^n=1, for any n, and showed there were n roots for at least one polynomial. There was some hope that the FTA was true.
How many roots are there of x^n=1 and where do they lie on the Wessel diagram  ? The roots are sometimes called deMoivre numbers. The equation x^n=1 is called the cyclotomic equation (which I believe is Greek for "circle cutting"). Gauss used this equation to construct regular polygons with only a pair of compasses and a ruler (the only aids allowed in classical Greek geometry).
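Here's a minimal sketch, assuming Python's standard cmath and math modules, that places n unit vectors evenly around the circle (at angles 2πk/n) and checks numerically that each one, raised to the nth power, gives 1. The function name roots_of_unity is made up for illustration.

```python
import cmath
import math

def roots_of_unity(n):
    """Return n unit vectors spaced evenly around the circle, starting at 1."""
    return [cmath.rect(1.0, 2 * math.pi * k / n) for k in range(n)]

for n in (2, 3, 4):
    print("n =", n)
    for root in roots_of_unity(n):
        # root**n should be 1, up to floating point rounding
        print("   root:", root, " root^n:", root ** n)
```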
Using the Wessel diagram, locate √i. The √i has a real component. This is somewhat of a surprise if you don't know about the Wessel diagram. However there are only two points on the unit circle that are only imaginary (i, -i) and they're already taken, so √i is going to have to have a real component.
|Notice we have to keep track of the length of the vector. The vector 1+i has a length (via Pythagoras) of √2. In the real number system, if we want to square a number and have the result=1, then the number we're squaring has to have a length of 1 too (e.g. the two square roots of 1 are 1 and -1, both of length 1). Check that the length of (1+i)/√2 is 1.|
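Here's a quick numerical check of the note above, assuming Python's standard cmath and math modules. The variable names are made up for illustration.

```python
import cmath
import math

candidate = (1 + 1j) / math.sqrt(2)    # the vector at 45 degrees, scaled to length 1

length = abs(candidate)                # should be 1
squared = candidate * candidate        # should be i
library_root = cmath.sqrt(1j)          # the principal square root of i, for comparison

print(length, squared, library_root)
```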
How many square roots of -i are there and where are they on the Wessel diagram  ?
Wessel published his work in 1799, in the Memoires of the Danish Academy of Science, a publication only read by the members of the Academy. His paper wasn't noticed for another 100yrs, by which time all of Wessel's discoveries had been rediscovered and credited to other people. The work by the others was all good and there is no hint that any work was misappropriated. The Wessel diagram was (re)discovered later by Argand. Argand and Wessel lived within a short distance of each other and never knew each other in their lifetimes. When I was taught imaginary numbers, the Wessel diagram was called the Argand diagram, though it had been 50yrs since the historical record had been corrected. It would seem that correctness is not a high priority in the teaching of math.
(see student's t distribution http://en.wikipedia.org/wiki/Student's_t-distribution)
In statistics, there is a probability distribution called "student's distribution" (I remember it as having a small "s"; it seems to be an uppercase "S" now) or the "t distribution". The derivation of the t-distribution was first published in 1908 by William Sealy Gosset while he worked at a Guinness Brewery in Dublin. Due to proprietary issues, the paper was written under the pseudonym Student (i.e. he wasn't allowed to publish under his own name), otherwise it would surely have bankrupted Guinness. Gosset is dead now and Guinness can't sue mathematicians who correctly attribute the distribution to Gosset (rather than "student"), so there is no reason (apart from sloth and the low priority accorded to correctly crediting people for their work) that the record has not been set straight.
|FIXME. Not done yet - come back later.|
Polynomials that the FTA addresses have real coefficients. They (can) have complex arguments and they return complex values e.g. y=f(x), for f(x)=x^2+2x+2 gives y=(1-2i) for x=(-2+i). Thus the plot of f(x) in the complex plane is complex (i.e. the result has magnitude and phase, or has r,i components). The result is a complex surface: you will need to plot y which is complex, in the complex plane of r,i. You need to plot in 4-D. If your problem/experiment has complex results, then you'll probably need to plot them to understand what's going on. Even if you can see what's going on in your head, other people may not. Your money is coming from people (if from taxes, ultimately from the not terribly well informed consumer), who can't visualise in 4-D, so you'll need to display your results to them, otherwise they won't know what a great job you're doing. Displaying a projection of a 4-D object on a 2-D piece of paper (or screen) is not a solved problem, but various (sometimes ad hoc) lashups handle it.
A synoptic weather map shows the magnitude and phase of air movement (in English: speed and direction of the wind) at any level (usually near the ground). The magnitude and phase are shown by the size and direction of arrows drawn along isobars (lines of constant pressure). With wind, most of the information is in the phase. The lack of dependence on the magnitude can be seen by creating a map with all arrows artificially drawn with the same length. The magnitude can be (at least roughly) inferred. Unlike the weather, in other situations there is no obvious connection between the magnitude and the phase, in which case you can try plotting the phase and magnitude on separate graphs. If you're looking at something new, you may have to make up a way of representing your output. In wikipedia, the imaginary parts of complex functions are displayed by color.
Let's look at the complex plane for f(x)=x^2+2x+2 to see if we can figure out a way to plot it that reveals the shape of the surface. Since the function can output complex numbers, let's relabel it z=f(r,i) where f(r,i)=f(x). We need the complex axes r,i which are in the plane of the paper, and z which is out of the plane of the paper and has complex values. As we've seen, the roots of the polynomial y=x^2+2x+2 are complex (they're -1-i, -1+i). This means there are no roots for ℜ x (x real). Here's a table of values of z=f(r,i) for ℜ x.
r      -4   -3   -2   -1    0    1    2
z      10    5    2    1    2    5   10
diff      -5   -3   -1    1    3    5
Since the r axis is horizontal in the Wessel diagram, I've listed the z values (real in this table) horizontally, with r increasing from left to right.
There's nothing particularly dramatic here: the minimum of f(r,i) is 1 and occurs at -1+0i. The function z=f(x)=f(r,i) for ℜ x is a parabola. (check parabola http://en.wikipedia.org/wiki/Parabola, for some neat photos of parabolas in the physical world.) You can check that it's a parabola by taking the difference between successive points; these values are listed in the lower part of the box ("diff"), where you can see that the differences are linear. (I used these differences to check my hand calculations; it's easy to make mistakes and you need a check.) The differences are linear, as the derivative of a parabola is a straight line.
Which way does the parabola point? If we're sitting in the +ve z region, are we looking at the convex or the concave side of the vertex  ?
We know that y=f(x)=f(r,i) has roots (-1-i),(-1+i), so it seems that something interesting happens along the line r=-1. Here are values for y=f(x)=f(r,i), where r=-1 and i is varied.
r, i      z   diff
-1-3i    -8
-1-2i    -3     5
-1- i     0     3
-1        1     1
-1+ i     0    -1
-1+2i    -3    -3
-1+3i    -8    -5
Since i is the vertical axis on the Wessel diagram, I've drawn this table vertically. Notice that z is real for all i (it appears that ℜ z occurs along the axis of the roots, although I don't know why yet). Notice again that the differences between points are linear. Is y=f(x)=f(r,i) a parabola in the i dimension too  ? If we're in the +ve z region, are we looking at the convex or concave side of the vertex  ? Are the parabolas along the two axes pointing in the same direction  ? If we're observing from the +ve z region, we're looking at a saddle shaped surface.
Here's the table combining the two previous tables, showing z along the lines r=-1 and i=0.
   r     -4   -3   -2   -1    0    1    2
   i
 -3i                    -8
 -2i                    -3
 -1i                     0
  0i     10    5    2    1    2    5   10
  1i                     0
  2i                    -3
  3i                    -8
This table shows clearly that the parabolas face in opposite directions: The parabola at r=-1 is convex from the +z region, while the parabola for i=0 is concave from the +z region. We now have z values along the lines r=-1 and i=0. All of these values are real. Next we fill in the rest of the r,i plane.
I worked out enough of the surface by hand to figure out what was going on, and then wrote this code to calculate the surface from scratch  . You'll have to comment out a couple of different parts of the code to get it to do the 2nd or the 4th order polynomial.
Here's the output. The top block is the complex value; the 2nd block is the r component and the 3rd block is the i component. The rectangular table above is a subset of this table.
   r        -4        -3        -2        -1         0         1         2
   i
complex
   3   (1-18j)  (-4-12j)   (-7-6j)   (-8+0j)   (-7+6j)  (-4+12j)   (1+18j)
   2   (6-12j)    (1-8j)   (-2-4j)   (-3+0j)   (-2+4j)    (1+8j)   (6+12j)
   1    (9-6j)    (4-4j)    (1-2j)        0j    (1+2j)    (4+4j)    (9+6j)
   0   (10+0j)    (5+0j)    (2+0j)    (1+0j)    (2+0j)    (5+0j)   (10+0j)
  -1    (9+6j)    (4+4j)    (1+2j)        0j    (1-2j)    (4-4j)    (9-6j)
  -2   (6+12j)    (1+8j)   (-2+4j)   (-3+0j)   (-2-4j)    (1-8j)   (6-12j)
  -3   (1+18j)  (-4+12j)   (-7+6j)   (-8+0j)   (-7-6j)  (-4-12j)   (1-18j)
real
   3       1.0      -4.0      -7.0      -8.0      -7.0      -4.0       1.0
   2       6.0       1.0      -2.0      -3.0      -2.0       1.0       6.0
   1       9.0       4.0       1.0       0.0       1.0       4.0       9.0
   0      10.0       5.0       2.0       1.0       2.0       5.0      10.0
  -1       9.0       4.0       1.0       0.0       1.0       4.0       9.0
  -2       6.0       1.0      -2.0      -3.0      -2.0       1.0       6.0
  -3       1.0      -4.0      -7.0      -8.0      -7.0      -4.0       1.0
imaginary
   3     -18.0     -12.0      -6.0       0.0       6.0      12.0      18.0
   2     -12.0      -8.0      -4.0       0.0       4.0       8.0      12.0
   1      -6.0      -4.0      -2.0       0.0       2.0       4.0       6.0
   0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
  -1       6.0       4.0       2.0       0.0      -2.0      -4.0      -6.0
  -2      12.0       8.0       4.0       0.0      -4.0      -8.0     -12.0
  -3      18.0      12.0       6.0       0.0      -6.0     -12.0     -18.0
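In case the original code isn't to hand, here's a minimal sketch (not the code referred to above; the names f and table are mine) that recalculates the real and imaginary blocks for f(x)=x^2+2x+2 over the same range of r and i. It relies on Python's built-in complex type.

```python
def f(x):
    """The polynomial from the text: f(x) = x^2 + 2x + 2."""
    return x * x + 2 * x + 2

r_values = range(-4, 3)          # r = -4 .. 2, left to right
i_values = range(3, -4, -1)      # i = 3 .. -3, top to bottom

def table(part):
    """Return rows of the chosen part ('real' or 'imag') of z = f(r + i*j)."""
    rows = []
    for i in i_values:
        rows.append([getattr(f(complex(r, i)), part) for r in r_values])
    return rows

for part in ("real", "imag"):
    print(part)
    for i, row in zip(i_values, table(part)):
        # adding 0.0 turns a possible -0.0 into 0.0 so it prints cleanly
        print(f"{i:3d} " + " ".join(f"{v + 0.0:6.1f}" for v in row))
```

To try another polynomial, only the body of f() needs to change.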
The r component is a set of parabolas (I'm working on getting a diagram) which face in opposite directions along the two axes. The i component is a twisted sheet, with nodes along the r axis (i=0) and along the line r=-1.
sin() and cos() are trigonometric relations between the sides of a right triangle. See trigonometry (http://en.wikipedia.org/wiki/Trigonometry) and trigonometric functions (http://en.wikipedia.org/wiki/Trigonometric_functions). (defn: trigonometry: from Greek trigonon "triangle" + metron "measure")
Any triangle can be subdivided into right triangles, allowing calculation of positions and angles in any number of dimensions. Trigonometry is used in surveying, celestial navigation (sextants) and astronomy. The height of Mt Everest was first calculated from a set of observations taken 100 miles away, during the great survey of India (1800's). Their estimate of the height was 29,002'. The current accepted height is 29,029'.
Although sin() and cos() are defined in introductory classes for 0°<angle<90°, the definitions can be easily extended for all angles. Look at the red and green graph of sin() and cos() in either of the wiki entries above. sin() and cos() can represent the position of the tip of a propellor (say on a motor boat) as the boat moves through the water (the path of the propellor tip tracing a helix through the water). sin() and cos() are offset from each other by 90°. Depending on where you set your origin, you'll see sin() when you look at the helix (or propellor) from the side and cos() when you look from above.
sin() and cos() appear in the physical world where some quantity goes through a periodic (rotating) motion. Processes described by sin(),cos() are often processes where energy is stored in two forms (or dimensions) and the energy is shifted from one form (or dimension) to the other and back again.
Example: As viewed from behind, assume the propellor on a motor boat rotates anticlockwise as the boat moves away from the viewer. Assume at t=0 the tip of a marked blade of the propellor is at the 12 o'clock position. Another viewer, off to the right of the first viewer sees the boat move from his left to right. To the 2nd viewer, is the tip of the marked blade described by a sin() or cos() function  ?
In this section we bring sin(),cos() into the description of complex numbers.
Euler knew the following relationships (these fall easily out of calculus)
sin(x) = x - x^3/3! + x^5/5! ...
cos(x) = 1 - x^2/2! + x^4/4! ...
e^x    = 1 + x + x^2/2! + x^3/3! + x^4/4! + x^5/5! ...
Using these expressions, Euler found an expression for e^(ix) in terms of sin() and cos(). What was it  ? This equation had been observed by others, but only regarded as a curiosity. Euler found out what to do with it. Let's see what Euler's equation means.
Instead of using x as the argument, write Euler's equation in terms of angle (theta), ϑ: i.e. e^(iϑ)=cos(ϑ)+i*sin(ϑ). The right hand side is of the form (a+ib), i.e. it's a vector, with a=cos(ϑ), b=sin(ϑ). Its magnitude is 1 (i.e. it's of unit length - can you prove this algebraically?) and its phase is ϑ (why is its phase ϑ ?).
          .i
          .   _
          .   /|
          .  /
          . /theta
 .........0.........
-1        .        1
          .
          .
          .-i
Thus e^(iϑ) is the unit vector at the angle ϑ. To rotate the vector, you change ϑ. It turns out that math using exponentials is simpler than using the two argument vector notation z=a+ib, or z=cos(ϑ)+i*sin(ϑ).
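Here's a rough numerical check of Euler's equation, as a sketch only. It sums the first 30 terms of the series for e^x with x=iϑ (the function name exp_series and the choice of 30 terms are mine) and compares the result with cos(ϑ)+i*sin(ϑ).

```python
import cmath
import math

def exp_series(z, terms=30):
    """Sum the first `terms` terms of 1 + z + z^2/2! + z^3/3! + ..."""
    total = 0j
    term = 1 + 0j            # current term z^k / k!, starting with k = 0
    for k in range(terms):
        total += term
        term *= z / (k + 1)  # next term: multiply by z/(k+1)
    return total

theta = 1.0                  # radians; any modest angle will do
lhs = exp_series(1j * theta)
rhs = complex(math.cos(theta), math.sin(theta))

print(lhs, rhs)                      # the two sides of Euler's equation
print(abs(lhs))                      # magnitude of the vector
print(cmath.phase(lhs), theta)       # phase of the vector and the angle we put in
```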
|In math (and calculus), angles are in radians, not degrees. A rotation of 90° is a rotation of π/2 radians.|
Example: What is i*e^(iϑ)? Multiplying by i is a rotation by π/2 radians, so we should expect i*e^(iϑ)=e^(i(ϑ+π/2)). (short cuts are useful sometimes). Let's do it a little more formally. We multiplied the vector by i. Find an angle ϑ for which the right hand side of Euler's equation is i. i.e. find the angle ϑ for which cos(ϑ)=0, sin(ϑ)=1  . Thus
i*e^(iϑ) = e^(iπ/2)*e^(iϑ) = e^(i(ϑ+π/2)).
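A quick check of this rotation with Python's cmath module, as a sketch (the angle 0.3 is an arbitrary choice of mine):

```python
import cmath
import math

theta = 0.3                                   # an arbitrary angle in radians
v = cmath.exp(1j * theta)                     # unit vector at angle theta
rotated = 1j * v                              # multiply by i
expected = cmath.exp(1j * (theta + math.pi / 2))

print(cmath.phase(rotated) - cmath.phase(v))  # change in angle, compare with pi/2
print(abs(rotated - expected))                # distance between the two results
```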
What is [(1+i)/√2]*e^(iϑ) in exponential form  ?
Let's go back to the motor boat with one marked blade on the propellor. The blade is of unit length. We're behind the boat looking at the propellor rotating in the +ve (anticlockwise) direction. The propellor is spinning at the origin, in the r,i plane, about the z axis. The r axis is horizontal and the i axis is vertical. We're looking at the propellor from the +ve z region. At any instant the angle of the propellor is given by ϑ. Everything about the position of the propellor is described by e^(iϑ). When ϑ=0, the propellor is pointing along the r axis (i.e. to the right), when ϑ=π/2 the propellor is facing along the i axis (i.e. up), when ϑ=π the propellor is facing along the -ve r axis (i.e. to the left) and when ϑ=-π/2 (or 3π/2) the propellor is facing along the -ve i axis (i.e. down).
Let's move to the right of the boat. We're on the r axis looking back towards the origin. From here we see the projection of the propellor blade on the (vertical) i axis. We don't see the propellor when it's pointing along the r axis (we say that the projection of the r axis on the i axis is 0). The projection of the propellor's position on the i-axis is i*sin(ϑ). From Euler's equation, the imaginary part of e^(iϑ) (written ℑe^(iϑ)) is i*sin(ϑ). We still do our operations on e^(iϑ) (i.e. we let the propellor rotate), but when we want the view from the right, we take the imaginary part of e^(iϑ).
Instead of looking from the right, let's look from above the motor boat. We're looking down the i axis back to the origin. Now we're seeing the projection of the propellor along the r axis. If we want to see this projection, we do our math on the position of the propellor (i.e. we rotate it), and then look at the real part of the position (written as ℜe^(iϑ)), which is cos(ϑ).
Using the above relationships (being able to calculate projections), e^(iϑ) describes the position of the propellor from the 3 orthogonal axes and hence from any position in 3-D space.
If we're looking along one of the r,i axes, we only see the projection of e^(iϑ) on the other axis. We only see the sin() (or cos()) function. We can't tell that there's a propellor and it's rotating; all we see is the trace of the tip of the marked blade. If we move to the other axis and see a sinewave, we have to know its phase relative to the sinewave on the first axis (e.g. we have a clock), before we know what's going on. If the phase is offset by 90° we know that something is rotating in a circular motion in the r,i plane. If the phases are the same then something is moving in a sinewave in a plane at 45° to the r axis (it has ϑ=45°).
To tell that the propellor is rotating, we need to look along the z axis, or to know both components (projections along the axes) and their relative phase. (In the case of the propellor, the two components are 90° out of phase or in quadrature.)
What is the value of e^(iϑ) for ϑ=π  ? This is Euler's famous identity, described by Richard Feynman (among others) as "the most remarkable formula in math". If you write e^(iπ)+1=0 you add 0 into the identity, giving a relationship between exponentials, trigonometry, imaginary numbers, negative numbers and zero, tying together previously unrelated areas of math.
Euler's identity is the bridge between geometry and algebra.
Euler's identity itself is not at all profound; it says that the unit vector pointing along the -ve real axis has a magnitude of -1. (Now you can finally see why this is not taught in high school. It's beyond any high school student.)
Let's have some fun with Euler's equation (for one or two of these, you need to know that ln(a*b)=ln(a)+ln(b)).
You may have been told that you can't take the log of a negative number. Using Euler's identity find the natural log of -1 (ln(-1)).  . The log of a negative number is imaginary.
What is the log of an imaginary number? Try ln(i)  ? The log of an imaginary number is imaginary. If the log of a -ve number is imaginary, would you expect ln(i) to be imaginary too  ? Do you know why  ?
What is ln(-i)  ?
What is ln(1+i)  ? The log of a complex number is complex. It looks like the log of complex numbers of magnitude=1 are imaginary.
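Once you've worked these out by hand, you can compare with what Python's cmath.log() returns. This is only a sketch; cmath.log() gives one value of the log (the principal value, with the angle between -π and π), and adding any multiple of 2πi gives another.

```python
import cmath

for z in (-1, 1j, -1j, 1 + 1j):
    w = cmath.log(z)          # principal value of the natural log of z
    # real part of w is ln(magnitude of z), imaginary part is the angle of z
    print(z, abs(z), w)
```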
Using Euler's equation, what is i^i? This result was not obvious to me. Here's a hint  . Here it is  . i^i is real. In this proof, I used e^(iπ/2)=i. However e^(i(π/2+2nπ))=i for any integer n. Thus i^i has many values (all real); i^i=e^(-π/2)*e^(-2nπ) (for any integer n).
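Here's a small sketch to check this numerically. Python's ** operator on complex numbers returns only one of the values (the n=0 one); the other values are computed here directly from the exponent, which is my own choice of method.

```python
import cmath
import math

principal = 1j ** 1j                  # the value Python returns for i^i
print(principal, math.exp(-math.pi / 2))

# other values, from writing i as e^(i(pi/2 + 2 n pi)) and raising it to the power i
for n in range(-1, 3):
    value = cmath.exp(1j * 1j * (math.pi / 2 + 2 * n * math.pi))
    print(n, value.real)
```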
|This has been proven. I can't prove it: I'm just going to state it.|
Real numbers are not complete i.e. you cannot do all math operations with real numbers - e.g. sqrt(-1) is not real. However complex numbers are complete i.e. you can do all math operations on complex numbers and the result will be a complex number. No-one is off searching for extensions to complex numbers in the way that Wessel needed to extend real numbers into another dimension to solve quadratic equations. If you go off exploring quantum gravity, you're not likely to run into a brick wall because you're lacking the type of numbers to represent quantities in your new equations. It's possible you may need another number system that intersects with complex numbers (like Wessel's imaginary numbers intersect with the real axis), but so far no-one is attributing the problems to lack of types of numbers.
You are now familiar with two 2-D systems of axes
The only real difference between the two systems is that Wessel's system has a built in operator i to rotate a vector. If you're throwing balls up in the air and watching their path, and you don't need to rotate a vector, then you use the Cartesian system. If you want to rotate a vector (rotating systems, like waves or propellors), then you use Wessel's system.
There's one minor problem with Wessel's system: it has a medieval nomenclature (real, imaginary). There's nothing imaginary about the imaginary axis anymore than the real axis is real. "real" has another meaning in the physical world, of things that exist in it, while "imaginary" has a meaning that the object doesn't exist in the real world, but only in the part of your brain that can create images of physical objects independently of the physical object. However in Wessel's system "real" just means the coordinate along the horizontal axis. If the propellor is pointing up, you can kick it, touch it, photograph it, even though its position, in the nomenclature of complex numbers, is imaginary. Try pointing a 6yr old child to the propellor in the up position and telling them that blade is imaginary. Any other nomenclature would be better; e.g. we could label the axes r,i, give everyone the rules for using the operator i, and ask everyone never to say the words "real" or "imaginary" again. The problem then, of whether imaginary numbers are real or not, would disappear. If the problem disappears with relabelling, then the problem is not with the math or the understanding of it, but in the nomenclature.
NEW: Gauss once said (although I can't find a reference to it) something like "if we'd called the +ve x-axis direct, the -ve x-axis indirect and the imaginary axis transverse, it would have saved us all this bother."
|I'll be using the nomenclature r axis, i axis to help make the Wessel diagram more (cough) real. I'm not going to wait for the rest of the world to realise that it's time to do something sensible here. Be aware, when you talk to others, that they won't necessarily know what the r,i axes are. Be prepared to explain it to them.|
So far, using Euler's representation of a complex number, ϑ has been a fixed number. In a (sine) wave of 1kHz, ϑ will rotate from 0-2π 1000 times/sec. Thus the equation for ϑ for a wave of frequency f Hz is ϑ=2πft. In wave equations f is always associated with 2π so the frequency 2πf is shortened to ω (the angular frequency, in units of radians/sec). Thus in a wave, ϑ=ω*t and the notation for the vector rotating at ω radians/sec is e^(iωt).
Euler's equation shows that a rotating vector is the sum of two sinewaves, one along the r axis and the other along the i axis. The two sinewaves are 90° out of phase (they're in quadrature).
|If you're not already familiar with the following, confirm that two sine waves, with a 90° phase difference, one driving the r-axis and the other driving the i-axis, give a rotating vector. (You only need to check using the values for multiples of 90°. For angles other than multiples of 90°, you'll need Pythagoras to confirm that the length of the vector is constant.)|
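If you'd rather let the computer do the confirming, here's a minimal sketch. It drives the r axis with cos(ωt) and the i axis with sin(ωt) (the same wave, 90° later) for the 1kHz example, and prints the resulting vector at eight steps through one period. The function name rotating_vector and the step size are my choices.

```python
import cmath
import math

f = 1000.0                      # frequency in Hz (the 1kHz wave from the text)
omega = 2 * math.pi * f         # angular frequency, radians/sec

def rotating_vector(t):
    """Drive the r axis with cos(wt) and the i axis with sin(wt)."""
    r = math.cos(omega * t)
    i = math.sin(omega * t)     # the same wave as cos(), delayed by 90 degrees
    return complex(r, i)

for eighth in range(9):
    t = eighth / (8 * f)        # eight equal steps through one period
    z = rotating_vector(t)
    same = abs(z - cmath.exp(1j * omega * t)) < 1e-9   # agrees with e^(iwt)?
    print(f"t={t:.6f}s  r={z.real:+.3f}  i={z.imag:+.3f}  length={abs(z):.3f}  {same}")
```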
All waves have components in quadrature. Sound has pressure and velocity. If you were plotting a sound wave on the Wessel diagram, you would label the axes p,v rather than r,i. When a sound wave arrives at the ear, or a microphone, it's converted into an electrical signal by a pressure dependent transducer. Only the pressure information is transduced. The velocity information is lost. On the Wessel diagram, you'd only have one component of the rotation.
|All microphones (and ears) are pressure transducers. There is no such thing as a velocity microphone i.e. a microphone that responds to the velocity of the air molecules. The only near item is a laser microphone that looks at the doppler shift of (usually artificial) fog in air. These microphones are used in wind tunnels to detect turbulent flow, and not for audio. A velocity microphone would need to be able to watch the movement of air molecules. They don't exist (at least yet). Ribbon microphones are often called velocity microphones: they aren't - this is marketing. Only the pressure component of the sound wave is available for amplification. Once the audio signal arrives at the ear (or the microphone), the quadrature information is lost.|
It is possible for an amplifier to process both the pressure and velocity information (quadrature amplifiers have two identical amplifiers). But once the wave has arrived at the audio amplifier (or the wetware), only the pressure component remains and the information necessary for quadrature is gone. Thus audio amplifiers only have one channel (rather than two) and only process the pressure component. This isn't a problem; the electrical signal can be processed by the brain or amplified by an audio amplifier just fine. In an audio amplifier, the output signal powers a speaker which moves (has velocity) and pushes the air (creates pressure), creating a new wave. In some parts of a radio frequency transmitter or receiver, resonant circuits recreate the quadrature information again, but in general only one component is transferred to the next stage for amplification or processing. The antenna, which interfaces with the electromagnetic wave in space, regenerates the quadrature information (just as does the speaker for an audio amplifier).
A mechanical analogy for the transducer (microphone, speaker, antenna) is the piston in an engine, which moves linearly, while coupled to the crankshaft which rotates. The position of the crankshaft is described by e^(iωt) while the position of the piston is described by sin() or cos(). The crankshaft is the rotating part of the wave, while the piston handles pressure. A V-8 engine has quadrature; there are two banks of pistons offset in phase by 90°.
|I said previously (when looking at the projection of a rotating propellor) that you can't conclude from the sinewave projection of the propellor, that something is in rotation. While you can generate sinewaves mathematically, in a physical system, a sinewave always originates with an oscillator, which needs both quadrature components to work (i.e. generate the oscillation). Thus when seeing a sinewave in a physical system, you can conclude that somewhere there had been a wave with quadrature.|
We're now going to derive the equations which allow us to convert between a wave (with quadrature) and its components. Remember e^(iϑ)=cos(ϑ)+i*sin(ϑ).
First let's do some elementary transformations.
What happens to a vector at angle ϑ if you invert the polarity of the component on the i axis? First do this in your head (or pen and paper)  . Next we do it using Euler's equation. The goal is to restore the equation for the wave to the canonical Euler form of e^(i*angle)=cos(angle)+i*sin(angle). (in case you don't already know; cos(x)=cos(-x), sin(x)=-sin(-x)) Watch how the original form of Euler's equation is restored.
new wave with sin() component inverted:
cos(ϑ) - i*sin(ϑ)
Make the sign of the sin() term +ve. This requires making the angle of the sin() term -ve. The angles of both components must be the same, so next make the angle of the cos() term -ve (this doesn't change the sign of the cos() term.)
cos(ϑ) - i*sin(ϑ) = cos(ϑ) + i*sin(-ϑ) = cos(-ϑ) + i*sin(-ϑ) = e^(i(-ϑ)) = e^(-iϑ)
Conclusion: changing the sign of the sin() component (the i component), gives a vector which rotates in the -ve direction.
Now let this new vector rotate at ω radians/sec. What physically is happening when a wave has a -ve frequency?
new wave with sin() component having inverted polarity:
cos(ωt) - i*sin(ωt) = cos(-ωt) + i*sin(-ωt) = e^(i(-ω)t)
The vector of -ve frequency is rotating in the -ve direction. Maybe it's a propellor operating in reverse, or it's a propellor with the blades set so that the boat moves forward when the propellor shaft rotates in the -ve direction. For whatever reason, the shaft is rotating in the -ve direction.
Here's one for you to do. Change the polarity of the component on the r axis. Identify the change (do it in your head first) (in case you don't already know; cos(x+π)=-cos(x), sin(x+π)=-sin(x))  .
Do something with (1),(2) from the previous exercise to get a formula for sin(ϑ) in terms of exponentials  .
Now we have a formula for sin(ϑ), and a formula for cos(ϑ) in terms of only exponentials.
e^ix  = cos(x) + i*sin(x)    (1)
e^-ix = cos(x) - i*sin(x)    (2)

(1)+(2)
          (e^ix + e^-ix)
cos(x) =  --------------
                2

(1)-(2)
          (e^ix - e^-ix)
sin(x) =  --------------
                2i
These formulae are usually left with the denominators as written, since any algebra using them will give answers in terms of sin(),cos() and will need to keep their own denominators.
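A quick numerical check of the two formulae, as a sketch (the function names cos_from_exp and sin_from_exp and the test angle are mine):

```python
import cmath
import math

def cos_from_exp(x):
    """(e^ix + e^-ix) / 2"""
    return (cmath.exp(1j * x) + cmath.exp(-1j * x)) / 2

def sin_from_exp(x):
    """(e^ix - e^-ix) / 2i"""
    return (cmath.exp(1j * x) - cmath.exp(-1j * x)) / 2j

x = 0.7                                   # an arbitrary angle in radians
print(cos_from_exp(x), math.cos(x))       # imaginary part of the first value is 0
print(sin_from_exp(x), math.sin(x))
```

The results come back as complex numbers whose imaginary part is zero, which is what you'd expect from adding (or subtracting) two counter rotating vectors.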
We can now convert between waves (with quadrature) and their cos(),sin() components. We had to use two counter rotating waves to do it. This turns out not to be a problem. Waves are all independent, so any system that will handle one wave will handle any number (at least till you send them through a non-linear medium - we'll address this in a later section).
|This is about quadrature and is just for fun.|
The position of a shaft can be used as input to another device e.g. the rotation of the volume control on a TV or audio amplifier. Usually the shaft is attached to a mechanical device, such as a variable resistor (rheostat). In cases where the shaft is rotated frequently (the tuning dial on a ham radio receiver, the ball in a computer mouse), and a mechanical device would wear out quickly, the position of the shaft is decoded optically (see rotary encoder http://en.wikipedia.org/wiki/Rotary_encoder). There are two types of encoders
In the latter type of shaft encoder, a disk is mounted on the shaft. Around the circumference of the disk are two rings of alternating dark and light bands. The light and dark bands are offset by 90° from each other (the two bands are in quadrature, see quadrature encoder http://en.wikipedia.org/wiki/Rotary_encoder#incremental_rotary_encoder). A sensor monitors the transition from light to dark in each band and remembers the new and old state (light/dark) (look at the diagram in the wiki with the A channel and the B channel). In combination with the state of the other sensor, you can determine the direction of rotation of the shaft. The output of the encoder is a string of pulses for each movement up/down or left/right. Some other device sets the position when the encoder is first turned on (often the turn on position is assigned the value of "0" or the last known position).
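Here's a sketch of how the direction can be decoded from the two channels. It isn't taken from any particular encoder or driver; the order of (A, B) states I've called "forward" is an assumption, and a real device may be wired the other way round. The names FORWARD_SEQUENCE, step and count are made up for this example.

```python
# Order of (A, B) states for one direction of rotation (assumed here to be
# "forward"); the reverse direction steps through the same list backwards.
FORWARD_SEQUENCE = [(0, 0), (0, 1), (1, 1), (1, 0)]

def step(old, new):
    """Return +1 for a forward step, -1 for a reverse step, 0 otherwise."""
    if old == new:
        return 0
    delta = (FORWARD_SEQUENCE.index(new) - FORWARD_SEQUENCE.index(old)) % 4
    if delta == 1:
        return +1
    if delta == 3:
        return -1
    return 0   # both channels changed at once: a missed step, ignored here

def count(states):
    """Add up the steps over a list of successive (A, B) sensor readings."""
    position = 0
    for old, new in zip(states, states[1:]):
        position += step(old, new)
    return position

forward = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(count(forward))            # prints 4
print(count(forward[::-1]))      # prints -4
```

Because the two bands are 90° apart, only one channel changes at a time, and which one changes first tells you the direction.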
In a problem to find the maximum distance of a projectile, launched at angle ϑ to the horizontal, you had to maximize sin(ϑ)*cos(ϑ). There is a large range of trigonometric functions like sin(ϑ)*cos(ϑ); sums, differences, products, sums of products, divisors, half angle and double angle formulae (see trigonometric identities http://en.wikipedia.org/wiki/List_of_trigonometric_identities) and it's difficult to derive simplifications for them. In high school, we just learned them off by heart. Most of these identities aren't derived from geometric constructs, but from calculus, from deMoivre's equation and from Euler's equation.
Let's use Euler's Equation to simplify sin(ϑ)*cos(ϑ). Change the two trig functions to the exponential form, then see if you can recognise anything simpler (trigonometrically) in the output  . The maximum value of sin(ϑ)*cos(ϑ) occurs when sin(2ϑ)/2 has its maximum value, which occurs when 2ϑ=90° i.e. when ϑ=45°.
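A brute force check of the conclusion, as a sketch. It scans launch angles from 0° to 90° in 1° steps for the largest value of sin(ϑ)*cos(ϑ), then compares the product with sin(2ϑ)/2 at a few angles (the function name product and the sample angles are mine).

```python
import math

def product(theta):
    """sin(theta)*cos(theta), theta in radians"""
    return math.sin(theta) * math.cos(theta)

# scan 0..90 degrees in 1 degree steps and pick the angle with the largest product
best = max(range(0, 91), key=lambda deg: product(math.radians(deg)))
print(best, product(math.radians(best)))

# compare with sin(2*theta)/2 at a few angles
for deg in (10, 30, 45, 80):
    theta = math.radians(deg)
    print(deg, product(theta), math.sin(2 * theta) / 2)
```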
Prove the tan half angle formula
|I've never used deMoivre's formula (theorem). In high school, I was expected to learn the trig identities by heart, to save me the trouble of understanding their derivation. I didn't know that deMoivre was part of the chain.|
In a previous section you found that the n roots of x^n=1 were spread at equal intervals around the unit circle in the Wessel diagram (you were able to find √i this way). This means that the operation which rotates the real axis by 135° (e^(i3π/4)) is the same as the operation of rotating the real axis by 45° (e^(iπ/4)) three times. The equivalence of these two operations is the algebraically trivial equality
e^(i(nϑ)) = (e^(iϑ))^n ( = e^(niϑ))
From Euler's equation you also know
e^(niϑ) = cos(nϑ) + i*sin(nϑ)
Combining these two equations gives deMoivre's formula
[cos(ϑ) + i*sin(ϑ)]^n = cos(nϑ) + i*sin(nϑ)
which has been derived here by inspection of the geometry depicted in the Wessel diagram (rather than by hard work, the way deMoivre did it).
deMoivre's formula works for -ve integer n too (as you would expect from the Wessel diagram)
[cos(ϑ) + i*sin(ϑ)]^(-1) = cos(-ϑ) + i*sin(-ϑ)
= cos(ϑ) - i*sin(ϑ)
and also for 1/n (this is the inverse operation; i.e. the rotation which is 1/3rd of 3π/4 is π/4). Here's the (trivial) proof.
[cos(ϑ) + i*sin(ϑ)]^(1/n) = cos(ϑ/n) + i*sin(ϑ/n)
raising both sides to the nth power
cos(ϑ) + i*sin(ϑ) = [cos(ϑ/n) + i*sin(ϑ/n)]^n
= cos(ϑ) + i*sin(ϑ)
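Here's a sketch that tries deMoivre's formula on the 135° example, for n=3, n=-1 and the power 1/3 (the function name unit is mine). Note that Python's ** returns only one of the three cube roots; for this angle it's the one at ϑ/3, but for angles outside -π to π it may be a different one.

```python
import math

def unit(theta):
    """cos(theta) + i*sin(theta)"""
    return complex(math.cos(theta), math.sin(theta))

theta = 3 * math.pi / 4           # the 135 degree rotation used in the text

print(unit(theta / 3) ** 3, unit(theta))        # three 45 degree steps
print(unit(theta) ** -1, unit(-theta))          # n = -1
print(unit(theta) ** (1 / 3), unit(theta / 3))  # the power 1/n, with n = 3
```

Each line prints a pair of values that should agree to within rounding.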
For our 2-D projectile problem, we needed the trig identity sin(2x)=2sin(x)*cos(x). This is the sin double angle formula. You can find this using deMoivre's formula. You're unlikely to find it directly (i.e. "use deMoivre's formula to find the sin double angle identity"), but if you were deMoivre and you started messing around with your newly discovered formula, the identity would eventually appear amongst your results. Here's how it's done.
cos(2x) + i*sin(2x) = [cos(x) + i*sin(x)]^2
expanding [cos(x) + i*sin(x)]^2
[cos(x) + i*sin(x)]^2 = cos^2(x) - sin^2(x) + 2i*sin(x)*cos(x)
cos(2x) + i*sin(2x) = cos^2(x) - sin^2(x) + 2i*sin(x)*cos(x)
If two vectors are identical, then their real and imaginary parts are identical (there's nothing magical about this; it's obvious). Equating the imaginary parts
sin(2x) = 2sin(x)*cos(x)
By equating the real parts of the above equation, you can find the cos double angle formula. The immediate result you get has both sin and cos terms in it. This formula is quite good enough for calculation purposes, but for aesthetics, you look to see if you can find cos(nϑ) in terms of only cos(ϑ) (and no other trig function). Eliminate the sin^2() term using a trig identity you know (hint  ). Here's my answer  .
Test your formula with x=30°.
You should remember (know)
It's also helpful to know
but you should be able to derive this from Pythagoras and a right triangle with two 45° angles. sin(60°)=cos(30°)=√3/2=0.866 is useful too, but you should be able to derive this from sin(30°) and Pythagoras if you've momentarily forgotten. I've never needed tan(30°), tan(60°).
You can use deMoivre's formula to find expressions for trig_function(nx) (where n=2,3...∞) in terms of trig functions of (x). Derive an expression for cos(3ϑ) which has only terms in cos(ϑ). As a side benefit, from the same calculation find an expression for sin3(ϑ). Here's the result  . If you want to test your deMoivre-fu, here's some more trig identities (http://en.wikipedia.org/wiki/List_of_trigonometric_identities).
|This section is based on material from Feynman's Lectures on Physics|
Here we find a value for 10^i.
First a re-cap on complex numbers:
|I can't prove any of these statements. Make sure you know that you're taking this information on hearsay and that (at least yet) you can't prove it yourself. Make sure you pick up the proofs somewhere along the line.|
|Algebraic numbers are the type of number that are the roots of polynomial equations. As we've found here, complex numbers are all that's needed for algebraic numbers. (Do you know why?  ) I take it then, that all numbers can be made from algebraic numbers.|
In case you're wondering what other sorts of numbers there are besides algebraic numbers, here's the complete bestiary of number types
irrational numbers ; numbers which cannot be expressed as the quotient of two integers e.g. √2 (if you don't know the proof of the irrationality of √2 let me know).
The Greeks were geometers and regarded integers in a mystical way and thought that the order of the universe depended on integers. They were most alarmed to discover the existence of irrational numbers and naturally tried to keep it quiet. The discovery was rated as a state secret and (according to legend) Hippasus (the likely discoverer of irrational numbers) was murdered for revealing the discovery (see Square root of 2 http://en.wikipedia.org/wiki/Square_root_of_2).
Presumably Hippasus approached an exhausted farmer behind his ox plowing his acre of field (an acre being the area you could plow in one day with one ox) and said
"psst, √2 is irrational".
We can only imagine what would have happened next, but it would have meant the end of the earth as we know it. For fear of equally calamitous consequences, modern states keep state secrets (such as that their troops shoot and kill children and newspaper photographers from helicopters) and will send you to jail (or worse) for divulging such secrets, while awarding medals or vacations in resorts (e.g. Lt. Calley) to the killers.
The Greeks used the word "incommensurable" (unmeasurable) rather than "irrational". Unfortunately the word irrational is now commonly used in psychology, a development the Greeks couldn't have anticipated.
transcendental numbers; numbers which aren't algebraic; e.g. e, π, e^π, 2^√2, i^i; sin(x), cos(x), tan(x) for algebraic x≠0; ln(x) for algebraic x≠0,1.
|While transcendental numbers aren't algebraic, they can be described by the sum of an infinite series of algebraic numbers (usually rational numbers; e.g. e = 1/0! + 1/1! + 1/2! + ... + 1/n! + ...).|
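As a quick illustration of the series for e in the note above, here is a small Python sketch (the function name `e_partial_sum` is hypothetical). It adds up the first few rational terms 1/k! and compares the partial sum with the library value of e.

```python
import math

def e_partial_sum(n_terms):
    """Sum the first n_terms of 1/0! + 1/1! + 1/2! + ..."""
    total = 0.0
    term = 1.0                 # 1/0!
    for k in range(n_terms):
        total += term
        term /= (k + 1)        # turn 1/k! into 1/(k+1)!
    return total

for n in (1, 2, 5, 10, 20):
    # the error shrinks very quickly as terms are added
    print(n, e_partial_sum(n), abs(e_partial_sum(n) - math.e))
```

Each partial sum is a rational number; only the limit of the infinite series is transcendental.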
Before looking at 10^i, in the general case what is 10^(r+is)? 10^(r+is) = 10^r * 10^(is).
This formulation allows us to remove the real part, which we understand, leaving us with 10^(is). By inspection of 10^i, it's not obvious what it is, whether there's a connection to e^i, and if there is, what it is. As a start, find a value for s for which we know 10^(is)  . This didn't get us very far; i*0=0 and gives a real exponent. Still we know for some value of s that 10^(is) has a magnitude of 1 (this is the principle of first noodling around, trying to find out small pieces of information, looking for a path to a hypothesis that's worth proving).
Since a complex number can represent any number, it's possible to write
10^i = x + iy
leaving us to find x,y.
Having accepted that 10^i is the complex number above, what is 10^(-i)  ? From their representation in the complex plane, we can see that 10^i and 10^(-i) are complex conjugates. Multiply the two conjugates together and see what you get  . What does this tell you about the location of (x,y) (hint  )  . This tells us that 10^i is a vector in the complex plane of unit length, just as is e^(iϑ). We don't have to find x,y separately anymore. We just have to find the argument (angle) of 10^i.
Let's use Euler's representation of a unit vector in the complex plane, in which the only unknown is the angle.
10^i = cos(ϑ) + i sin(ϑ)
This may look like just a lot of tricks manipulating math. What does it mean? Let's go back to e^i. What is the (approximate) value of e^i (or equivalently, what is its location on the complex plane, or its magnitude and argument)  ?
Now look at 10^i.
10^i = e^(2.303i) = e^(i2.303) = cos(2.303) + i sin(2.303)
10^i is the unit vector with an argument of ln(10) ≅ 2.303 radians (= 132°). Here's 10^i
[Diagram: 10^i drawn in the complex plane as a unit vector from the origin, at an angle of 132° from the positive real axis (in the upper left quadrant).]
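You can confirm the location of 10^i with a few lines of Python. This is only a sketch; it relies on Python's built-in complex power, which evaluates 10^i as e^(i*ln(10)), and the variable names are mine.

```python
import cmath
import math

z = 10 ** 1j                       # 10^i
w = 10 ** (-1j)                    # 10^(-i)

magnitude = abs(z)                 # length of the vector
angle_rad = cmath.phase(z)         # argument in radians
angle_deg = math.degrees(angle_rad)

print(z)                           # real and imaginary parts (x, y)
print(magnitude, angle_rad, angle_deg)
print(z * w)                       # product of the two conjugates
```

The magnitude comes out as 1 (to rounding), and the argument agrees with ln(10) from `math.log(10)`.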
10^(iϑ) is a unit length vector which rotates around the origin, as a function of ϑ, just as does e^(iϑ). The only difference is that in the case of 10^(iϑ), the vector rotates ln(10)=2.303 times faster than does the vector for e^(iϑ). The vector for e^(iϑ) rotates once around the origin for ϑ going from 0 to 2π, while 10^(iϑ) rotates 2.303 times.
The base e is used to describe rotating vectors, as the period of rotation of the vector is the same as the period of ϑ. You won't find 10^(iϑ) used anywhere. We only looked at 10^(iϑ) here to show why e is special as the base of the Euler representation of vectors.
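To see the difference in rotation rate, the sketch below (Python; the function `turns` and its step count are assumptions made for illustration) steps ϑ from 0 to 2π and counts how many revolutions the vector base^(iϑ) makes, by adding up the small changes in its argument.

```python
import cmath
import math

def turns(base, theta_end=2 * math.pi, steps=10000):
    """Count revolutions of base**(i*theta) as theta runs from 0 to theta_end."""
    total = 0.0
    prev = cmath.phase(base ** 0j)
    for k in range(1, steps + 1):
        theta = theta_end * k / steps
        cur = cmath.phase(base ** (1j * theta))
        step = cur - prev
        # cmath.phase wraps at +/-pi; undo the jump so the angle keeps accumulating
        if step > math.pi:
            step -= 2 * math.pi
        elif step < -math.pi:
            step += 2 * math.pi
        total += step
        prev = cur
    return total / (2 * math.pi)

print(turns(math.e))   # revolutions of e^(i*theta)
print(turns(10))       # revolutions of 10^(i*theta)
```

Only for base e does one revolution of the vector match ϑ going from 0 to 2π.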
What do you have to do to 10^i to make it a vector of length a  ?
We've used propellers to illustrate the position of the vector (e^(iϑ)) as a function of the independent variable ϑ (the rotation angle of the drive shaft). What if there is a set of gears between the drive shaft and the propeller, so that the propeller rotates at a different rate to the drive shaft?
Write the Euler format version of a unit length vector that has a period of π (i.e. the drive shaft rotates by half a revolution=π, as the vector rotates one revolution=2π )  .
Write a unit length vector in Euler format that has a period of 4π (i.e. the drive shaft rotates by two revolutions=4π, as the vector rotates one revolution=2π)  .
In quantum mechanics, particles can have spin. Some particles have integer spin (0,1..n) and some particles have half spins (1/2, 3/2, 5/2). The names of quantum mechanics parameters are somewhat arbitrary and the names assigned in the early days, like spin, were derived while trying to understand the spectra of atoms/molecules and have more to do with spectroscopy than properties of the quantum parameters. Modern naming is even more arbitrary. Quarks are described by flavour, charm, beauty. How arbitrary is the name "spin"? Is there any spin associated with spin? Does an electron spin about its axis, like the earth does before the sun? It's not likely. No-one can see a subatomic particle, much less see it spin. I'm not clear on spin; the people who taught me quantum mechanics didn't understand it. Since it was too much trouble for them to learn it, and there was no requirement on the part of the university that we learn anything in the years we spent there poring over our books, we got the Reader's Digest version of quantum mechanics. Sometimes we were told that the spin of an electron caused a magnetic field (presumably like electrons moving through coils in a magnet). There's some truth here; atomic particles behave differently in a magnetic field according to their spin. Sometimes we were told that spin had to do with symmetry. An object with a spin of 2 would look the same when rotated by half a turn. This was easy to understand; a two bladed propeller would have a spin of 2. If this explanation of spin was true, then an object of spin 1/2 would have to be rotated twice to return to its original orientation. This made no sense; what would the spin 1/2 particle look like after one rotation? I've now decided that spin has more to do with the period of Euler like equations. You can think of spin as the gear ratio between some rotating driving force and the parameter you measure (whatever that means).
I'm telling you this to prepare you for higher education; the half dozen or so US universities I've been associated with aren't any better at teaching than the Australian one I attended.
I'm most familiar with complex numbers in the context of electrical circuits. Here I will show the functions of some electrical circuits in terms of what they do to waves.
There are two types of signals/waves I'll talk about
Be aware of whether you're dealing with a (quadrature) wave or a sinewave; the math is a little different (with the quadrature wave, you need to keep track of both components or use Euler exponents). I will attempt to make clear whether I'm talking about a sinewave or a quadrature wave. Eventually you will be able to discern the type of wave from the context; Is it propagating through a medium, which can store two different forms of energy? Is it in exponential format, or as sin()? In the text, if I don't care which form the wave is in, I'll call it a "wave".
Many media are linear i.e. quantities such as pressure, velocity, voltage, current, electric field, magnetic field add. If you put two batteries in series, the voltages add; one of the batteries doesn't affect the voltage added by the other battery. If two people speak in a room, each voice is heard separately; the sound from one person's voice is not changed/corrupted by a second person's voice. We can see an object independently of whether another object is producing (or reflecting) light nearby. This is because the waves we detect are subject to superposition (each one adds to the others). In mathematical terms we're saying
sin(ω1*t) + sin(ω2*t) = sin(ω1*t) + sin(ω2*t)
e^(iω1*t) + e^(iω2*t) = e^(iω1*t) + e^(iω2*t)
i.e. in a linear medium, nothing about the waves changes. In a linear medium, waves are propagated independently of each other, and hence the individual waves can be separated at the destination, say by a frequency selective filter (for light, this would be a piece of coloured glass). If we use a pressure sensitive detector or voltmeter, at any instant we'll see the sum of the two terms, but both waves are still propagating separately. In electrical circuits, the signals can be represented as the sum of independent sine waves, each with its own frequency.
Electrical signals are sent through amplifiers. In an (electronic) amplifier the signals are sinewaves. For reasons which will soon become clear, we want the amplifier to be linear too. The purpose of an amplifier is to make the electrical signal stronger. If the input signal is sin(ωt), then the output will be A.sin(ωt) (where the amplification, A>1). If, as here, A is a constant, then the output is a perfect (linear) copy of the input, i.e. Vout/Vin=A (we will see shortly what happens if A is not a constant). In this case, the amplifier is called a linear amplifier and has these two useful properties
i.e. you get only an amplified version of the input signal. As you will see, the linearity of an amplifier is an important parameter of an amplifier. We will explore these two properties of a linear amplifier.
Here we explore a useful non-linearity; multiplication. There are electrical circuits that have two inputs, where the output is the product of the two inputs (i.e. the circuit multiplies the two inputs). A multiplier is also known as a product detector, as its output is the product of the two inputs. Some media are non-linear in a way that gives multiplication (air is non-linear at high pressure - explosions and shock waves, and at low sound pressure - you can't reduce the pressure below 0 - instead you get cavitation). A multiplier generates waves (frequencies) not present in the input. Here we'll see how the extra waves are generated and examples of how multiplication is used.
If the input to an electrical circuit (or other detector, such as the eye or ear) is the product of two signals (e.g. sin(ωa*t).sin(ωb*t) or e^(iωa*t)*e^(iωb*t)), then it's convenient to use a trig identity to turn the product of two signals into the sum of two signals. We will see how to do this transformation here. As to why we transform the new signal into the sum of two signals, I don't have a good answer, beyond that it works. For more examples of places that we do this and an attempt (but not a complete answer) to explain why it's valid, see dualities.
One place multiplication is used is to modulate a radio carrier with information. Radio (or light) can be used to carry information (audio, TV, data). The information, called the baseband signal or modulating signal, occupies a band of frequencies. For audio this band of frequencies is about 20Hz-20kHz (for TV, this is 0-5MHz; for a computer monitor it's 0-10MHz for a low res monitor up to 0-200MHz for a high res monitor).
The baseband signal (also called "the modulation") is impressed upon (or modulates) a (higher frequency) radio (or light) wave. This radio wave has no information and is called the carrier. Why do we modulate a carrier, rather than just sending our audio signal straight up the transmitting antenna? There are several reasons why sending the audio straight to the antenna won't work, but the main one is that all radio signals would be transmitted on the same frequency and a receiver wouldn't be able to differentiate one signal from another. The radio spectrum would sound the same as a room with dozens of people all speaking at once. There is plenty of radio spectrum and a different carrier frequency is chosen for each transmitter, to allow the receiver to select the signal it wants to receive. How this works should become obvious in this section.
At the transmitter, how do we modulate the carrier? Assume the modulating signal is 1kHz and the carrier is 1MHz. Here's the carrier plotted in the frequency domain (i.e. with f on the x axis) (the "x"s are ascii art for a single column/bar). Note f=0 (on the x-axis) is way off the left of the graph.
[Figure: the carrier in the frequency domain. y-axis: output; x-axis: f, MHz, with ticks at fc=1.0 and fc+fa=1.001 (f=0 is off the page to the left). A single column at fc.]
Here's some audio (I made up the spectrum) plotted in the frequency domain, that we'll use to modulate the above carrier. Note that the scale of the x-axis is in kHz (and not MHz as for the carrier above).
[Figure: made-up audio spectrum in the frequency domain. y-axis: output; x-axis: fa, kHz, with ticks at 0, 10 and 20. Columns of varying height across the 0-20kHz band.]
Since all waves (both sine and Euler) travel independently, we can represent the baseband frequencies by a single (representative) sine wave, of frequency fa (a for audio). If we want to model the whole audio signal, we just add a whole bunch of separate (representative) sine waves. For discussion here let's show the whole audio spectrum by a representative sinewave of 1kHz
[Figure: the representative audio sinewave in the frequency domain. y-axis: output; x-axis: fa, kHz, with ticks at 0, 10 and 20. A single column at 1kHz.]
The simplest way (mathematically) to impress the signal onto the carrier (modulate the carrier), is to multiply the modulating signal (here the 1kHz audio sinewave) (fa) and the carrier (fc).
modulated signal = e^(iωc*t) . e^(iωa*t)
Since the modulation is the sum of many sinewaves, the actual modulated signal would be
modulated signal = e^(iωc*t) . [ e^(iωa1*t) + e^(iωa2*t) .. e^(iωan*t) ]
= e^(i(ωc+ωa1)*t) + e^(i(ωc+ωa2)*t) .. e^(i(ωc+ωan)*t)
At the receiving end, you multiply the received signal by the -ve frequency of the carrier, recovering the modulation.
demodulated signal = [ e^(i(ωc+ωa1)*t) + e^(i(ωc+ωa2)*t) .. e^(i(ωc+ωan)*t) ] . e^(-iωc*t)
= e^(iωa1*t) + e^(iωa2*t) .. e^(iωan*t)
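The modulate/demodulate round trip can be sketched numerically. The Python below is an illustration only: the three audio tones in `AUDIO` are made up, the function names are mine, and frequencies are in Hz so that ω=2πf.

```python
import cmath
import math

F_C = 1.0e6                        # carrier frequency, Hz (1MHz, as in the text)
AUDIO = (300.0, 1000.0, 3000.0)    # hypothetical audio tones, Hz

def wave(f, t):
    """Exponential (quadrature) wave e^(i*2*pi*f*t)."""
    return cmath.exp(2j * math.pi * f * t)

def baseband(t):
    """The modulation: a sum of exponential audio waves."""
    return sum(wave(fa, t) for fa in AUDIO)

def modulate(t):
    """Multiply the baseband signal by the carrier."""
    return wave(F_C, t) * baseband(t)

def demodulate(signal, t):
    """Multiply by the -ve frequency of the carrier."""
    return signal * wave(-F_C, t)

t = 1.234e-4                       # an arbitrary instant, seconds
shifted = sum(wave(F_C + fa, t) for fa in AUDIO)
print(abs(modulate(t) - shifted))                   # each tone moved up by fc
print(abs(demodulate(modulate(t), t) - baseband(t)))  # round trip recovers the audio
```

Both differences are zero to within rounding: multiplication by the carrier adds fc to every audio frequency, and multiplication by the negative frequency takes it off again.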
Here's the output of the modulator plotted in the frequency domain (note that the origin of the x-axis, with f=0, is way off the page to the left and the scale is in MHz) showing columns corresponding to modulating audio from 20Hz-20kHz.
[Figure: output of the modulator in the frequency domain. y-axis: output; x-axis: f, MHz, with ticks at fc=1.0, fc+fa=1.001, 1.002 ... 1.020 (f=0 is off the page to the left). The audio spectrum appears as columns of varying height between fc and fc+20kHz.]
So as to not clutter the diagram, we show the above diagram again, this time with only one frequency, our representative 1kHz sine wave of frequency fa, after it has modulated the carrier.
[Figure: output of the modulator for the representative 1kHz sinewave. y-axis: output; x-axis: f, MHz, with ticks at fc=1.0 and fc+fa=1.001 (f=0 is off the page to the left). A single column at fc+fa.]
At the receiving end, to pick up the signal, you tune your receiver to the band of modulation frequencies. Then you multiply the received signal by the -ve frequency, recovering the modulation.
recovered signal = e^(i(ωc+ωa)*t) . e^(-iωc*t)
= e^(iωa*t)
|With quadrature multiplication, the output is a single frequency being the sum of the two input frequencies (either of fc, fa could be a negative frequency).|
There are two (technical) problems with this scheme
In the early days of radio, the circuitry necessary to do quadrature multiplication was beyond the capabilities of the technology.
Quadrature information isn't available for the audio you will use as modulation (at least you can't get the quadrature information from a microphone).
For a narrow band of frequencies it's easy to generate quadrature information. Examples of narrow band frequencies are
However hi-fi audio is wideband; it extends from 20Hz-20kHz, a frequency range of 1000:1, and no circuit was available to turn the audio into a quadrature wave. Even now it's still not possible to generate quadrature information for this range of frequencies. It took till the 1930's to work out how to create the quadrature channel for audio from the single audio pressure channel. It was possible to generate the quadrature signal for only a 10:1 frequency range. The range chosen was 300-3000Hz, giving the maximum intelligibility for low quality audio. The work was done by AT&T who wanted to fit more phone calls onto their copper cables. Copper metal was expensive, laying cable was also expensive.
|In the 1960's the copper metal in AT&T's phone cables was worth more than the company.|
No-one expected phone calls to be hi-fi (the phones themselves were lo-fi), so the scheme of delivering a phone call with restricted frequency range (300-3000Hz) audio worked well. At the same time, the reduced bandwidth audio allowed AT&T to pack more phone calls onto the cable. The wires from each house (or business) only carried one audio signal, but once the signals got to the exchange, AT&T wanted to pack as many as it could onto the trunk cables to be forwarded to the next exchange.
Here's a possible spectrum of audio from a phone call, filtered to pass 300-3000Hz, showing the signal as a function of frequency at a particular instant (I just made up the spectrum). Because of filtering, there is no audio below 300Hz or above 3kHz. Note that the frequency on the x-axis has changed just to show audio frequencies.
[Figure: made-up spectrum of a filtered phone call. y-axis: output; x-axis: f, kHz, with ticks at 0, 1, 2, 3, 4 and 5. Columns of varying height between 300Hz and 3kHz only.]
The way AT&T put more phone calls onto their copper cables, was by stacking the phone calls (frequency wise) on top of each other and sending the combined signal over the trunk cables. To do this, the quadrature audio signals from each phone call were multiplied by quadrature carriers at multiples of (say) 5kHz, giving this frequency multiplexed signal below, showing 5 phone calls, stacked at 5kHz intervals, being transmitted over one cable. (note the horizontal axis is compressed compared to the graph above, which has only a single audio signal).
[Figure: the frequency multiplexed signal. y-axis: output; x-axis: f, kHz, with ticks at 0, 5, 10, 15, 20 and 25. Five phone call spectra stacked at 5kHz intervals.]
|There's nothing much being said in the 4th phone call.|
At the receiving end, the phone calls are broken out (demodulated) for local delivery, by multiplying by the reverse set of frequencies (multiples of 5kHz).
|Copper cable has been replaced by much higher capacity fibre optic cable. It turns out that most of the cost of laying fibre cable is salaries and cost of digging equipment and not the actual cable. It doesn't cost much more to lay 100 cables than 1, so much of the installed fibre is dark, (i.e. not carrying any signals), waiting for future customers. Now with optical fibre cheap and bandwidth cheap, phone calls are digitally encoded hi-fi. This has allowed people to access the internet through phone lines, first through low speed modems (up to 56kbps) and now through DSL (80kBps and greater). The phone company isn't interested in sending hi-fi voice. The bandwidth is needed for digital data. Voice signals are sent over the same lines and because the equipment is designed to handle higher bandwidth, the voice is hi-fi as a side effect.|
This was how the phone company did it. They were a monopoly and regulated as a utility. They had plenty of money and used it to do some of the best research in the 20th century inventing Unix, the transistor and information theory (Claude Shannon). But AT&T got greedy and wanted even more money. The citizens revolted and AT&T was broken up by govt decree. Now instead of a govt regulated utility providing service (and research) for its citizens, we have a multitude of private companies providing the minimum service possible, on mutually incompatible equipment (to ensure vendor lock-in with the customers), while the regulatory arm of the govt looks the other way in the interests of "fostering innovation".
The phone company now has plenty of bandwidth and the modulation scheme described above has been replaced by wide band digital methods. Ham radio operators need to minimize their use of bandwidth and find the intelligibility of restricted bandwidth audio (300-3000Hz) quite sufficient. The modulation scheme was adopted by ham radio operators in the 1950's, where it's called single side band (SSB). (I'll explain a side band shortly.) With only one person talking (the ham radio operator), only one voice signal modulates the carrier (there is no frequency multiplexing of voice signals).
In a ham radio receiver, SSB is demodulated in a slightly different way from the phone company's method. Here we'll see how it's done, but first we need to understand the math involved. The method used involves multiplying sinewaves, rather than multiplying exponential waves.
|For the input signals to the modulators/detectors, instead of a sinewave, we could use cos(ωt) (or sin(ωt+ϑ), where ϑ is a fixed phase offset), but it will only change the phase of the output signal. Here we're only interested in recovering the correct frequency of the original modulating signal. Human ears aren't sensitive to phase, so the audio recovered from the product detector can be phase shifted with respect to the original modulating signal. The insensitivity of human ears to phase can be confirmed by playing middle C (physicists frequency = 256Hz, λ=4') on a piano and then moving backwards or forwards over a range of 4', while the note continues to play. The note will sound the same. If this weren't true, then we wouldn't be able to have a concert, where the members of the audience are all listening to different phases of the musical tones.|
|The obvious question then is whether elephant's ears are phase sensitive (see are elephant ears linear? in the section on Logrithms and Slide Rules in http://www.austintek.com/python_class/python_class.leftovers.html#austintek_python_programming.slide_rule).|
Trig identities allow you to change the multiplication of waves into the sum of waves. We can look up (or remember) all the trig identities, but here we'll use Euler's equation to derive the one we want. Let's derive sin(a)*sin(b) for a wave (where we'll replace a,b with ωa*t, ωb*t).
|You can have a -ve frequency with sin() or cos(), but you don't get anything new. sin(-a)=-sin(a); cos(-a)=cos(a). The most you get is the -ve of the signal (equivalent to a phase shift). -ve frequencies only make sense in the exponential form, when you have quadrature. If you have only the sin() component, you can't tell if the propeller is rotating backwards or forwards or even if a rotation is involved.|
Let's multiply two sinewaves.
sin(ωc*t)*sin(ωa*t) = [(e^(iωc*t) - e^(-iωc*t))/(2i)] * [(e^(iωa*t) - e^(-iωa*t))/(2i)]
= - ½[(e^(i(ωc+ωa)*t) + e^(-i(ωc+ωa)*t))/2] + ½[(e^(i(ωc-ωa)*t) + e^(-i(ωc-ωa)*t))/2]
= - ½cos[(ωc+ωa)*t] + ½cos[(ωc-ωa)*t]
v. impt: Multiplication of sinewaves produces the sum and the difference frequencies.
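A numerical check of the derivation above, as a Python sketch (the helper names are mine). `euler_sin` builds sin() from the two exponentials exactly as in the first line of the derivation, and the product is compared with the sum and difference form at a few arbitrary instants.

```python
import cmath
import math

def euler_sin(x):
    """sin(x) written as (e^(ix) - e^(-ix)) / (2i)."""
    return (cmath.exp(1j * x) - cmath.exp(-1j * x)) / 2j

def product(wc, wa, t):
    """sin(wc*t) * sin(wa*t), built from the Euler form."""
    return euler_sin(wc * t) * euler_sin(wa * t)

def sum_and_difference(wc, wa, t):
    """-1/2 cos[(wc+wa)*t] + 1/2 cos[(wc-wa)*t]"""
    return -0.5 * math.cos((wc + wa) * t) + 0.5 * math.cos((wc - wa) * t)

wc, wa = 2 * math.pi * 1.0e6, 2 * math.pi * 1.0e3   # 1MHz carrier, 1kHz audio
for t in (0.0, 3.3e-7, 1.7e-4):
    print(abs(product(wc, wa, t) - sum_and_difference(wc, wa, t)))
```

The printed differences are at the level of rounding error, and the imaginary part of the product is zero, as it must be for the product of two real sinewaves.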
|Be prepared to handle any trig identity with Euler's equation. You should be able to do them on the back of an envelope in an elevator between floors. You'll rarely be required to do so; in an elevator you'll just need to remember that the products (of sinewaves) are the sum and difference (frequencies).|
You multiplied two sinewaves and got the sum of two coswaves. As you'll find out shortly, if you multiply two coswaves, you also get the sum of two coswaves; if you multiply a sinewave and a coswave, you'll get the sum of two sinewaves. While this may seem an oddity of nature at first glance, you can predict the result of multiplying sin() and cos() from knowing
- that the numerator on the RHS (sum) side always can be reassembled into two terms, which are either sin()s or cos()s.
- that the denominator on the RHS will have a 2i for every sin() on the LHS and a 2 for every cos() on the LHS.
If there's 1 sin() term on the LHS, then the denominator for all terms on the RHS will have 2i and all terms on the RHS will be sin(). Otherwise if there are 0 or 2 sin() terms on the LHS, then the RHS will have a real denominator and all terms will be cos().
Note: the terms on the RHS sometimes have a negative sign in front of them, so you still have to do the full calculation to determine the RHS. The info here just allows you to do a sanity check on your result.
Here's the diagram showing the sum and difference frequencies for our input frequencies fc, fa .
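[Figure: the sum and difference frequencies. y-axis: output; x-axis: f, MHz, with ticks at fc-fa=0.999, fc=1.0 and fc+fa=1.001 (f=0 is off the page to the left). A column marked "l" at fc-fa and a column marked "u" at fc+fa.]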
The output is two signals, offset from the carrier frequency fc , by fa . These two signals are called sidebands, as they are on either side of the carrier. I've labelled them "u" and "l" for upper (USB) and lower sideband (LSB). These two signals contain the audio (modulation). The upper sideband has the original audio with the frequencies in the same order as the original audio (20Hz audio becomes 1MHz + 20Hz; 20kHz audio becomes 1MHz + 20kHz). The lower sideband has the original audio with the frequencies in the reverse order to the original audio (20Hz audio becomes 1MHz - 20Hz = 0.999980MHz; 20kHz audio becomes 1MHz - 20kHz = 0.980MHz). For the LSB, the -ve direction of the modulating frequencies comes from the -ve sign of the modulation frequencies in the difference frequency; cos[(ωc-ωa)*t].
|Multiplication is used at the receiver to restore the original audio. To recover the modulation, the received signal is multiplied by another signal generated at the receiver. If instead we repeated the multiplication process done at the transmitter on just the transmitted sum and difference frequencies, we don't get the desired result; we get (fc+fa) ± (fc-fa) = 2fc, 2fa . Multiplication of two sinewaves is not its own inverse.|
The signal at this stage has two sidebands, occupying twice the bandwidth of the original audio. We could transmit this signal; it's called double sideband (DSB) (and this is occasionally done by hams). At the receiver, we put both sidebands through a product detector, multiplying by a signal with the frequency of the original carrier, giving us our audio signal (after filtering out the unwanted frequencies).
Since one of the goals of SSB is reduced bandwidth, and only one sideband is needed at the receiver to recover the audio, one of the sidebands is removed at the transmitter, restoring our transmitted signal to the same bandwidth as the original audio. (how this is done we'll see below). It doesn't matter which sideband is removed - both contain the same audio information. Depending which sideband is removed, you're transmitting lower sideband (LSB) or upper sideband (USB).
The (received) modulated signal now is sin[(ωc+ωa)*t] (USB) or sin[(ωc-ωa)*t] (LSB). At the receiving end, the SSB signal (or frequency multiplexed phone signal, in the case of the phone company) is demodulated by a product detector. One of the inputs to the product detector is the received signal, the other is a locally (i.e. in the receiver) generated signal of the same frequency as the original carrier (this local oscillator is called the beat frequency oscillator - BFO). There is no quadrature information, so the signals are represented by sin(ωt) rather than the exponential e^(iωt).
Here's the mathematical operation performed by the product detector in an SSB receiver
output = input1*input2
modulated signal is sin[(ωc+ωa)*t] (USB) or sin[(ωc-ωa)*t] (LSB).
The BFO signal is sin(ωc*t).
demodulator output (USB) = sin[(ωc+ωa)*t]*sin(ωc*t)
= - ½cos[(2ωc+ωa)*t] + ½cos(ωa*t)
The output from the product detector is the sum and difference frequencies 2fc+fa, fa. Here's a diagram of the output of the product detector (the inputs are at fc and USB fa+fc). Note that the x-axis has a different scale to the previous diagram.
[Figure: output of the product detector. y-axis: output; x-axis: f, MHz, with ticks at fa=0.001, fc=1.0, fc+fa=1.001 and 2fc+fa=2.001. One column at fa and one column at 2fc+fa.]
The lower frequency is the recovered modulation at 1kHz, while the higher frequency (at 2.001MHz) is filtered out leaving only the audio we want. (Filtering is easy to do, when the ratio of frequencies (2fc+fa)/fa is large; here ≅2000:1.)
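Here is a sketch of the product detector in Python, to show where the output energy lands. Everything about the sampling is an assumption made for the illustration: an 8MHz sample rate and 1ms of samples, chosen so that all the frequencies of interest fall on exact 1kHz steps. The function `amplitude` projects the sampled output onto a single frequency (a one-bin Fourier transform).

```python
import cmath
import math

FS = 8.0e6      # assumed sample rate, Hz
N = 8000        # 1ms of samples, so frequency resolution is 1kHz
F_C = 1.0e6     # carrier and BFO frequency, Hz
F_A = 1.0e3     # audio tone, Hz

def detector_output(n):
    """Product of the received USB signal and the BFO at sample n."""
    t = n / FS
    usb = math.sin(2 * math.pi * (F_C + F_A) * t)   # received signal
    bfo = math.sin(2 * math.pi * F_C * t)           # local oscillator
    return usb * bfo

SAMPLES = [detector_output(n) for n in range(N)]

def amplitude(f):
    """Amplitude of the component of SAMPLES at frequency f."""
    acc = sum(x * cmath.exp(-2j * math.pi * f * n / FS)
              for n, x in enumerate(SAMPLES))
    return 2 * abs(acc) / N

for f in (F_A, F_C, F_C + F_A, 2 * F_C + F_A):
    print(f, round(amplitude(f), 6))
```

The only components of any size are at fa and at 2fc+fa, each with amplitude ½; nothing is left at the input frequencies fc and fc+fa.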
To receive the SSB signal, you tune the receiver for maximum signal, then if the signal is LSB, you adjust the frequency of the BFO to be on the high side (frequency wise) of the received signal. If the signal is USB, you adjust the frequency of the BFO to be on the low side (frequency wise) of the received signal. How do you know which sideband you're getting? There are conventions as to which to use, so in any band of frequencies you already know. However not everyone sticks to the convention. If you have the BFO on the wrong side of the sideband, then the frequencies of the audio will be inverted about the mid frequency of the audio and you won't recover intelligible audio. You then move the BFO to the other side (frequency wise) of the signal, when you'll hear intelligible audio.
When we multiply two exponential waves, the output is only one (exponential) wave of frequency fc+fa , while when we multiply two sinewaves, the output is two sinewaves at the sum frequency fc+fa , and the difference frequency fc-fa . Why is this? It all falls out from trig identities. Let's do the math.
We've already found the result (above) of the multiplication of two sinewaves (we get the sum and difference frequencies).
Next, let's multiply a sinewave of f=fa (our audio), by an exponential wave of f=fc (our carrier). (This doesn't get us anything particularly interesting by itself; it's a step to multiplying two exponential waves.)
|It's relatively complicated (electrically) to get the quadrature signals for a wideband signal like audio. For a zero bandwidth signal like the carrier (or BFO), it's easier. For the carrier, you operate the oscillator at 4fc and use flip-flops to divide the frequency by 4. The flip-flops produce 4 independent signals at f=fc with phase = 0°, 90°, 180° and 270°.|
Here we'll do the multiplication mathematically, using the Euler formula for an exponential wave.
output = sin(ωa*t)*e^(iωc*t)
This is about as far as we can take multiplication (i.e. we don't get anything new or useful here).
Now we multiply an exponential and sinewave the way it's done in an electrical circuit. (We're going to get the same result as in the previous section, but a little more information about the phase of the output signal.) Electrical circuits aren't a medium for propagating an (exponential) wave; instead the component sinewaves (the cos() and the sin() components) are fed separately (they're real signals), to two separate identical circuits. Here are the inputs to our circuits, where cr is the r component of the exponential wave of frequency c and ci is the i component of the exponential wave of frequency c.
modulating sinewave (audio, goes to both multipliers) Sa = sinewave_a = sin(ωa*t)
r component of carrier: Ecr = exponential_wave_cr = cos(ωc*t)
i component of carrier: Eci = exponential_wave_ci = sin(ωc*t)
For each of the two components of the Euler exponential, we need a multiplier (modulator, detector) circuit (total of two multipliers). Call them Mr, Mi for the r and i components respectively. Each multiplier has two inputs and one output.
[Diagram: the two multipliers, each with two inputs and one output. Mr: inputs sin(fa.t) and cos(fc.t); output 1/2*[+sin[f(a+c).t] + sin[f(a-c).t]]. Mi: inputs sin(fa.t) and sin(fc.t); output 1/2*[-cos[f(a+c).t] + cos[f(a-c).t]].]
Here's the output Or of multiplier Mr
Or = Sa*Ecr = sin(ωa*t) * cos(ωc*t)
Use Euler-fu to change the product of two sinewaves to the sum of two sinewaves (remember the output will have the sum and difference frequencies). Here's my answer  .
Do the same for Oi for Mi.
Oi = Sa*Eci = sin(ωa*t) * sin(ωc*t)
|sin(a).sin(b) has already been done for you above, but do it for yourself here anyhow. Notice the -ve sign in front of the cos() term for the sum of the frequencies. Unfortunately it's hard to remember this -ve sign - you just have to calculate the sum and difference terms the hard way.|
Here's my answer  .
Here's the result in tabular form
Table 1. sinewave*exponential wave
|multiplier||audio input||carrier input||output|
|Mr||sin(ωa*t)||cos(ωc*t)||+ ½sin[(ωa+ωc)*t] + ½sin[(ωa-ωc)*t]|
|Mi||sin(ωa*t)||sin(ωc*t)||- ½cos[(ωa+ωc)*t] + ½cos[(ωa-ωc)*t]|
There are two copies of the carrier, one for each multiplier. The two copies have a phase difference of 90° (this was one of our requirements). How does this show up in the table above  ?
Look for a similar phase relationship between the resulting sidebands (i.e. what is the phase relationship between the two USBs, between the two LSBs?)  . This was the "slightly more info" I mentioned above.
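The two rows of the table can be checked against the direct product sin(ωa*t)*e^(iωc*t) with a short Python sketch (the function names are mine). The output of Mr should be the real part of the product and the output of Mi the imaginary part.

```python
import cmath
import math

def mr_output(wa, wc, t):
    """Row Mr of the table: sin(wa*t)*cos(wc*t) as a sum of two sinewaves."""
    return 0.5 * math.sin((wa + wc) * t) + 0.5 * math.sin((wa - wc) * t)

def mi_output(wa, wc, t):
    """Row Mi of the table: sin(wa*t)*sin(wc*t) as a sum of two coswaves."""
    return -0.5 * math.cos((wa + wc) * t) + 0.5 * math.cos((wa - wc) * t)

def direct_product(wa, wc, t):
    """sin(wa*t) * e^(i*wc*t), done as one complex multiplication."""
    return math.sin(wa * t) * cmath.exp(1j * wc * t)

wa, wc = 2 * math.pi * 1.0e3, 2 * math.pi * 1.0e6   # 1kHz audio, 1MHz carrier
for t in (1.0e-7, 2.5e-5, 6.1e-4):
    from_table = complex(mr_output(wa, wc, t), mi_output(wa, wc, t))
    print(abs(from_table - direct_product(wa, wc, t)))
```

The differences are at rounding level, so the two real multipliers together carry the same information as the single complex multiplication.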
We're looking at sinewaves, so we don't quite have a vector. However we have r and i components for each of the USB, and the LSB. Let's assume that we're seeing the components of vectors and figure out what the vectors are doing. Because of the factor ½ in front of each term, the length of the vector(s) is ½. The vectors are rotating, so we can't show them on a diagram, but we can look at their phase at any particular instant. Let's assume for a particular instant t that ωat=30°, ωct=30°. The USB and LSB have no particular phase relationship to each other (they're different frequencies). Instead let's look at the individual sidebands.
The (r,i) components of the USB are (½sin(30°+30°)=½*0.866, -½cos(30°+30°)=-½*0.5). (Sanity check; what's the length of this vector  ?) Here's the USB vector.
[Diagram: the USB vector on the r,i axes. It points into the 4th quadrant (+ve r, -ve i), 30° below the +ve r axis.]
The (r,i) components of the LSB are (½sin(30°-30°)=½*0.0, ½cos(30°-30°)=½*1.0). (Sanity check; what's the length of this vector  ?) Here's the LSB vector (it points along the i axis).
[Diagram: the LSB vector on the r,i axes. It points along the +ve i axis.]
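If you'd rather let the computer do the arithmetic for the two sideband vectors, here's a small sketch (standard math module only; the variable names are chosen for this illustration). It builds the (r,i) components from the Mr and Mi rows of the table at the instant ωat=30°, ωct=30°, then finds each vector's length and angle.

```python
import math

wa_t = math.radians(30)   # audio phase at the chosen instant
wc_t = math.radians(30)   # carrier phase at the chosen instant

# (r, i) components: r from the Mr row of the table, i from the Mi row
usb = (0.5 * math.sin(wa_t + wc_t), -0.5 * math.cos(wa_t + wc_t))
lsb = (0.5 * math.sin(wa_t - wc_t), 0.5 * math.cos(wa_t - wc_t))

usb_length = math.hypot(usb[0], usb[1])
lsb_length = math.hypot(lsb[0], lsb[1])
usb_angle = math.degrees(math.atan2(usb[1], usb[0]))
lsb_angle = math.degrees(math.atan2(lsb[1], lsb[0]))

print("USB", usb, "length", usb_length, "angle", usb_angle)
print("LSB", lsb, "length", lsb_length, "angle", lsb_angle)
```

Try other instants: the lengths stay the same while the angles change, which is what you'd expect from rotating vectors.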
|Why we've drawn these two diagrams won't become obvious till we get to the next section, where we multiply two exponential waves.|
Let's multiply two exponential waves e^(iωa*t), e^(iωc*t). In this section, we'll use the Euler method. (You will need a few trig identities; to save class time, you can just look them up at trig identities, sosmath http://www.sosmath.com/trig/Trig5/trig5/trig5.html or trig identities, wikipedia http://en.wikipedia.org/wiki/List_of_trigonometric_identities ). Here's the product
output = [cos(ωat)+isin(ωat)] * [cos(ωct)+isin(ωct)]
Here's my result
|A whole bunch of terms (which I didn't show) cancel out. When you get to the electrical method for the multiplication, each of the signals will be present and will electrically cancel out.|
There is something in common with all the terms that cancelled out. What is it  ?
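Here's a numerical check of the product, as a sketch rather than a derivation. It assumes Python's standard cmath module; the helper exp_wave and the two frequencies (1kHz audio, 1MHz carrier) are chosen for this illustration. The product of the two exponential waves is compared with a single exponential wave at the sum frequency.

```python
import cmath
import math

def exp_wave(omega, t):
    """Exponential wave e^(i*omega*t) = cos(omega*t) + i*sin(omega*t)."""
    return cmath.exp(1j * omega * t)

wa = 2 * math.pi * 1.0e3   # audio, rad/s (illustrative value)
wc = 2 * math.pi * 1.0e6   # carrier, rad/s (illustrative value)

worst = 0.0
for k in range(50):
    t = k * 1.7e-7
    product = exp_wave(wa, t) * exp_wave(wc, t)
    sum_frequency_wave = exp_wave(wa + wc, t)
    worst = max(worst, abs(product - sum_frequency_wave))
print("largest mismatch:", worst)
```

Only the sum frequency survives; the product of two sinewaves gave both the sum and the difference.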
It's convenient to be able to change between LSB and USB at the flick of a switch. From your introduction to multiplying exponential waves, what do you have to do to produce the LSB instead of the USB  ? A negative frequency requires quadrature, which we don't have with sinewaves. However we can produce the electrical equivalent of negative frequency for either of the carrier or the audio (it doesn't matter which). In the next section we'll see how this is done both in the Euler (mathematical) sense and electrically.
In the electrical version of multiplying two exponential waves, we need 4 multipliers (we have two Euler terms in each of the two exponential waves). The bottom row of multipliers is the same as the two multipliers described previously (they take the sin() or i component of the audio signal). The multipliers have descriptors rc,ic,ra,ia for the r,i (cos,sin) components of the carrier and for the r,i (cos,sin) components of the audio respectively.
            --------                                  --------
cos(fa.t)-->|      |   1/2*               cos(fa.t)-->|      |   1/2*
            |Mrc,ra|-> [+cos[f(a+c).t]                |Mic,ra|-> [+sin[f(a+c).t]
cos(fc.t)-->|      |    +cos[f(a-c).t]]   sin(fc.t)-->|      |    -sin[f(a-c).t]]
            --------                                  --------

            --------                                  --------
sin(fa.t)-->|      |   1/2*               sin(fa.t)-->|      |   1/2*
            |Mrc,ia|-> [+sin[f(a+c).t]                |Mic,ia|-> [-cos[f(a+c).t]
cos(fc.t)-->|      |    +sin[f(a-c).t]]   sin(fc.t)-->|      |    +cos[f(a-c).t]]
            --------                                  --------
Each of the multipliers is handling one of the r,i components for the audio or carrier. With two multipliers (as was done previously), we had only one pair of multipliers to determine the phase of the USB, LSB. With 4 multipliers, we can do it two ways (again assume at the instant ωct=30°, ωat=30°)
As I mentioned earlier, it's convenient to be able to change between LSB and USB at the flick of a switch. Starting with the Euler form of an exponential wave (or guesswork), determine which of these operations will produce the -ve frequency of the particular exponential wave you're operating on (which could be the carrier or the audio) and hence which operation will result in producing the opposite sideband (you've done some of these operations before).
If we change the sign of the frequency for the audio from fa to -fa then our sum and difference frequencies are (fc -fa) and (fc +fa), the LSB and the USB respectively (note the change in sign of the audio frequency). If instead we change the sign of the frequency for the carrier from fc to -fc then our sum and difference frequencies are (-fc +fa) = -(fc -fa) and (-fc -fa) = -(fc +fa), still the LSB and the USB respectively (note the change in sign of the frequency of both sidebands).
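The sideband swap can be tried out numerically. This is a minimal sketch using Python's standard cmath module; exp_wave, the frequencies and the instant t are illustrative choices. It forms the product with +fa and with -fa, and also shows that inverting the i component of the audio (taking the complex conjugate) is the same as negating its frequency.

```python
import cmath
import math

def exp_wave(f, t):
    """Exponential wave of frequency f (Hz) at time t (s)."""
    return cmath.exp(2j * math.pi * f * t)

fa, fc = 1.0e3, 1.0e6   # audio and carrier, Hz (illustrative values)
t = 3.1e-7              # an arbitrary instant

usb = exp_wave(fa, t) * exp_wave(fc, t)                   # +ve audio frequency
lsb = exp_wave(-fa, t) * exp_wave(fc, t)                  # -ve audio frequency
lsb_by_inversion = exp_wave(fa, t).conjugate() * exp_wave(fc, t)

print(abs(usb - exp_wave(fc + fa, t)))   # product sits at fc+fa
print(abs(lsb - exp_wave(fc - fa, t)))   # product sits at fc-fa
print(abs(lsb - lsb_by_inversion))       # conjugate == negative frequency
```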
In an electrical circuit, we'll need 4 multipliers and two sets of quadrature signals.
|In ham radio, since the two inputs to the product detector (the received signal and the BFO) are narrow band (percentage wise), it's easy to generate quadrature forms of both narrow band signals. In the receiver, if we'd generated the quadrature form of both signals, we could have recovered the modulating audio without generating the extra frequency 2*fc+fa. It turns out that filtering out the product is simpler than generating the quadrature form of both signals, so the simpler (one channel) product detector is used.|
Although the SSB modulation scheme is simple mathematically, the technology for it came much later. The first practical modulation scheme was amplitude modulation (AM).
In amplitude modulation (AM), the carrier wave (say at 1MHz) is modulated by the audio signal, which is usually limited to 250Hz-5kHz (or so). For the moment we'll assume that we're modulating with a single tone, fa while the carrier is fc. The circuit used to amplitude modulate (AM) a radio signal does the following operation on its two inputs.
output = (1+sin(ωat))*sin(ωct)
|Active devices need a supply voltage to power them. Because of this, the input and output signals must stay between the ground (0V) and the supply voltage, which can be +ve or -ve. Thus inputs to and outputs from a real circuit can't go through 0V. They have to be offset to be always +ve or -ve. Since sinewaves oscillate about their equilibrium value (usually 0V), an offset must be added before it can be processed by a circuit. At the output, the offset voltage is removed. In the Wessel representation, a vector representing the input (or output) will rotate about a point in a quadrant, not ever hitting the axes. Because of these offsets, the modulator is not presented with the waves as shown in the above equation, but after all the offsetting and unoffsetting is done, the equation above (which does happen to have an offset of "1") describes the multiplication.|
In AM, two signals are multiplied. The audio signal is offset by 1 allowing the carrier to appear in the output. The carrier carries phase information and is used (or helps depending on the demodulation circuit used) to recover the modulation (here audio) at the receiver. Because of the offset of 1, the amplitude of the carrier is changed to 2 when sin(ωat)=1 and to 0 when sin(ωat)=-1. Modulation changes the envelope of the carrier (see the diagram of an AM modulated carrier in AM http://en.wikipedia.org/wiki/Amplitude_modulation).
Use Euler's equation to find the output as the sum of frequencies.
output = (1+sin(ωa*t)).sin(ωc*t)
= sin(ωc*t) + sin(ωa*t).sin(ωc*t)
Handle the product term with Euler's equation.
= sin(ωc*t) - (1/2)cos[(ωa+ωc)*t] + (1/2)cos[(ωa-ωc)*t]
The carrier appears in the output along with two new frequencies (called sidebands) displaced from the carrier by the modulating frequency. In this case (carrier = 1MHz, modulating frequency = 1kHz) the sidebands are at a frequency of 1MHz±1kHz or 1.001MHz and 0.999MHz. The two sidebands carry the voice information. The carrier has no modulation information, but does have the phase.
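Here's a sketch that checks the AM result numerically for the example in the text (carrier 1MHz, modulating tone 1kHz). It uses only the standard math module; the function names am_product and am_sum are invented for this illustration.

```python
import math

fa = 1.0e3   # modulating (audio) frequency, Hz
fc = 1.0e6   # carrier frequency, Hz

def am_product(t):
    """The modulator output written as a product."""
    return (1 + math.sin(2 * math.pi * fa * t)) * math.sin(2 * math.pi * fc * t)

def am_sum(t):
    """The same output written as the carrier plus two sidebands."""
    carrier = math.sin(2 * math.pi * fc * t)
    upper = -0.5 * math.cos(2 * math.pi * (fa + fc) * t)
    lower = 0.5 * math.cos(2 * math.pi * (fa - fc) * t)
    return carrier + upper + lower

sidebands = (fc - fa, fc + fa)
worst = max(abs(am_product(k * 1.3e-7) - am_sum(k * 1.3e-7)) for k in range(1000))
print("sidebands at", sidebands, "Hz; largest mismatch", worst)
```

The two forms agree to rounding error, and each sideband has half the amplitude of the carrier.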
Here's the output of the modulator plotted in the frequency domain (i.e. f on the x axis) (the "x"s are ascii art for a single column/bar). Note that the height of each of the sidebands is half that of the carrier (i.e. much of the power of the transmitter is the carrier, which doesn't carry any voice information). (Note: the spacing on the x-axis has changed again.)
       |
       |            x
       |            x
output |            x
       |     x      x      x
       |     x      x      x
       |     x      x      x
       |_____________________
           fc-fa    fc   fc+fa
           0.999    1.0  1.001    f, MHz
|The new frequencies here are called side bands (they are bands of frequencies produced on each side of the carrier). They can be called products because they are produced by a multiplication process. Multiplication produces new frequencies which are the sum and differences of the original frequencies. (Addition and subtraction of waves produces no new frequencies at all - waves are independent.)|
Not all amplifiers are linear. The amount of amplification can vary with frequency or it can vary with amplitude. This amplifier has a non-linear amplitude response like this
[Diagram: transfer curve (output on the y-axis against input on the x-axis) of the flat-topping amplifier. The input sine wave is drawn under the x-axis and the flat-topped output to the right of the graph.]
Under the x-axis is an input sine wave (time running vertically) and to the right of the graph, is the output (time running horizontally), being the input mapped through the non-linear transfer curve. The amplifier saturates for high amplitude input. Sine waves come out with flat tops. The amplifier produces distortion. (This effect is used deliberately to produce the fuzz effect in electric guitar amplifiers.)
A linear amplifier has a transfer function output = A*input + B. In the diagram above, if A was constant, then the line would be straight. The line wouldn't necessarily go through the origin (this would require B=0), producing a voltage offset at the output, which can easily be removed by a capacitor.
From calculus, we know that arbitrary curves such as the transfer function above can be represented by a function like
y = a + bx + cx^2 + dx^3 ..
Let's ignore the offset voltage (i.e. make a=0). Let's also ignore the first order term (the bx): it produces the original wave multiplied by b. We're interested in signals appearing in the output that aren't amplified versions of signals at the input (these are called distortions). These all come from higher order terms in the equation above.
The transfer curve of the flat-topping amplifier above is symmetrical about the 0V line, and has no even order terms (2nd, 4th...). It has only odd order (3rd, 5th..) distortions.
|Sinewaves have +ve and -ve excursions. If you cube (or raise to any odd order power) the sinewave you still produce symmetrically +ve and -ve excursions. If you square (or raise to an even power), only +ve excursions are possible. Thus symmetrical distortions are due to odd order harmonics in the output (or produced by odd order distortions), while asymmetric output is due to even order harmonics in the output (or produced by even order distortions). In amplifiers, which are powered by single sided power supplies (0V together with, usually, a +ve voltage) and where the input signal cannot swing symmetrically about a zero voltage, even order distortions dominate.|
Let's look at the 2nd order term (it's a parabola). An amplifier with 2nd order distortions has a transfer curve that looks like this
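[Diagram: transfer curve (output on the y-axis against input on the x-axis) of an amplifier with 2nd order distortion. The input sine wave is drawn under the x-axis and the distorted output to the right of the graph.]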
Excursions in one direction are flattened, while excursions in the other direction become peaked.
Let's make all coefficients unity (b is the amplification; c is the 2nd order distortion). (In any real amplifier, b will be some large number e.g. 10^3-10^6 and c will be 1% or less of b giving 1% or less distortion.) Our transfer function then is
y = x + x^2
This amplifier is non-linear
For input = sin(ω*t) our amplifier's output is
output = sin(ω*t) + sin^2(ω*t)
For the moment, let's ignore the sin(ω*t) term. It's our input and it appears unchanged in the output. We're interested in what appears in the output that wasn't in the input (i.e. distortion). The sin^2(ω*t) term is the 2nd order distortion added to the output. To find what the distortion looks like, we have to turn the output into the sum (rather than product - or square) of waves. Let's use Euler's equations again.
sin^2(ω*t) = [(e^(iω*t) - e^(-iω*t))/(2i)]^2
= -(1/4) [e^(i2ω*t) - 2 + e^(-i2ω*t)]
= 1/2 - (1/2) [(e^(i2ω*t) + e^(-i2ω*t))/2]
= 1/2 - (1/2)cos(2ω*t)
Ignoring any constant (DC) offset, the 2nd order distortion of a single note is a sinewave at twice the frequency of the note (freq = freq + 1*freq), i.e. the note's second harmonic (first overtone). The 2nd order distortion is an octave above the note. This note is harmonious and pleasant to listen to. Most people will realise that the timbre of the note has changed, but it will still be judged as pleasant. Fuzz on an electric guitar for a single note doesn't sound fuzzy.
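A quick numerical check of the squared sinewave, as a sketch using only the standard math module (the function names are made up for this illustration). Squaring a sinewave gives a constant offset of 1/2 plus a cosine at twice the frequency.

```python
import math

def squared_sinewave(x):
    """The 2nd order term, sin^2(x)."""
    return math.sin(x) ** 2

def offset_plus_octave(x):
    """Constant offset plus a wave at twice the frequency."""
    return 0.5 - 0.5 * math.cos(2 * x)

worst = max(abs(squared_sinewave(0.1 * k) - offset_plus_octave(0.1 * k))
            for k in range(200))
print("largest mismatch:", worst)
```

The constant offset is a DC voltage, which (as mentioned earlier) is easily removed by a capacitor; what you hear is the wave at twice the frequency.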
Let's see what happens when we put two waves (of different frequencies) through this amplifier. The extra output (added distortion) will be (leaving out intermediate steps)
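[sin(ω1*t)+sin(ω2*t)]^2 = sin^2(ω1*t) + sin^2(ω2*t) + 2sin(ω1*t)sin(ω2*t)
The squared terms give the octave of each note (as for the single note above). The new part is the cross term
2sin(ω1*t)sin(ω2*t) = - cos[(ω1+ω2)*t] + cos[(ω1-ω2)*t]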
For every pair of frequencies (notes) you get another pair of notes, which have frequencies the sum and difference of the input frequencies. (These extra notes are called intermodulation products or "intermod".) Here is the distorted output in the frequency domain (the "x"s are ascii art for a single column/bar).
       |
       |          x    x
       |          x    x
output |          x    x
       |    x     x    x     x
       |    x     x    x     x
       |    x     x    x     x
       |_____________________
          f1-f2   f1   f2  f1+f2    freq
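Here's a sketch that expands the square of two added sinewaves and checks it numerically. It assumes only the standard math module; the function names and the example notes (440Hz and 660Hz) are chosen for this illustration. Besides the sum and difference frequencies, the full expansion also contains a constant offset and the octave of each note.

```python
import math

def squared_sum(a, b):
    """2nd order term for two notes: [sin(a) + sin(b)]^2."""
    return (math.sin(a) + math.sin(b)) ** 2

def expanded(a, b):
    """The same thing written as a sum of waves."""
    offset_and_octaves = 1.0 - 0.5 * math.cos(2 * a) - 0.5 * math.cos(2 * b)
    intermod = math.cos(a - b) - math.cos(a + b)
    return offset_and_octaves + intermod

def second_order_frequencies(f1, f2):
    """Frequencies (other than DC) produced by squaring two notes."""
    return sorted({abs(f1 - f2), f1 + f2, 2 * f1, 2 * f2})

worst = max(abs(squared_sum(0.37 * k, 0.91 * k) - expanded(0.37 * k, 0.91 * k))
            for k in range(200))
print("largest mismatch:", worst)
print(second_order_frequencies(440, 660))
```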
The extra notes may have no harmonious musical relation to the input frequencies. The distortion sounds terrible to most people. The more notes you have, the more pairs of intermod products you have and the worse the output sounds (you need to strike two strings on a guitar to get fuzz). With modern electronic equipment, the audio amplifier can be made arbitrarily accurate. The problem in reproducing sound is the mechanical parts; e.g. speakers, microphones (pickups for vinyl records have gone). Mechanical parts have mass and they can't follow the sound perfectly.
|Amplifiers whose transfer curve is 2nd order can be used as multipliers.|
If you have 3 different notes at the input, you'll get 3^2=9 notes in the output. Here's the extra notes you'll get for playing the chord CEG in an amplifier with 2nd order distortion (frequencies taken from frequencies of musical notes http://www.phy.mtu.edu/~suits/notefreqs.html). Where the intermod products are near a note, I've used the name of that note. Where the intermod product is half way between two notes, I've called it "X". The notes you play are written in upper case in the middle horizontal band of the diagram. The intermod products are lower case, with the difference frequencies below and with the sum frequencies above. There are 3 columns, for each of the pairs CG,CE,EG. Only the first pair produces distortion products that are harmonious (there is little fuzz effect with only the tonic and dominant). The other pairs produce disharmonious intermod products.
freq    notes
        CG    CE    EG
722                 x
653     e5
591           d5
392     G4          G4
330           E4    E4
261     C4    C4
130     c3
70            c#2
62                  b2
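The intermod frequencies for the chord can be reproduced with a few lines of code. This is a sketch, assuming the whole-number note frequencies shown in the middle band of the table (261, 330, 392 Hz); the dictionary and variable names are made up for this illustration, and itertools.combinations from the standard library supplies the pairs.

```python
from itertools import combinations

# played notes, Hz (rounded values, as in the table)
chord = {"C4": 261, "E4": 330, "G4": 392}

# for each pair of notes: (difference frequency, sum frequency)
intermod = {}
for (name1, f1), (name2, f2) in combinations(chord.items(), 2):
    intermod[name1 + name2] = (abs(f2 - f1), f1 + f2)

for pair, (difference, total) in intermod.items():
    print(pair, "difference", difference, "Hz, sum", total, "Hz")
```

The sums match the table. The differences come out within a hertz or so of the table's values, presumably because of rounding in the note frequencies.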
Here's the frequency domain graph for the output of an amplifier with 2nd order distortion, playing the chord CEG.
[Diagram: frequency domain graph of the output for the chord CEG. The x-axis is marked with note names (a to g) for octaves 2, 3 and 4 and beyond. The three played notes are full height bars; the intermod products are half height bars.]
The tempered musical scale has notes spaced at multiples of 2^(1/12). A non-linear amplifier produces (extra) output at frequencies that are the sum or difference frequencies. It's most unlikely that these intermod products will coincide with a harmonious note. The output will be distorted.
To get the sum and difference frequencies, you need to pass the waveform through a non-linear device. It turns out that the 2nd order (parabolic) non-linearity is the best at doing this, but any non-linear device will have a 2nd order non-linearity. When someone says "non-linear" with respect to a device or transmitting medium, sum and difference frequencies will be generated.
FIXME Only mathematics knows the product of two trigonometric functions. The physical world only knows sine waves and their amplitude. You can drive a speaker or antenna with functions like sin^2(ϑ) but the receiver (the ear or a radio receiver) hears only the sum of sine waves.
This section doesn't bear directly on complex numbers, but the topic came up in the discussion of complex numbers. It needs to be explained somewhere, and this is as good a place as any.
In previous sections I've presented functions like sin(x).cos(x) and said that by using a trig identity, you can treat the function as if it were ½sin(2x). How do you know whether to use sin(x).cos(x) or ½sin(2x)?
The short answer is that they're the same; you can use whichever form is most convenient. The equality sign "=" guarantees that both are the same under all circumstances. There are many places in the world where there are several functionally identical ways of looking at something, but where the equivalence is not obvious. Let's look at some of them, starting with a pair where the equivalence is obvious, working towards sets of models where the equivalence is not obvious.
Here's the Pythagorean formula
c^2 = a^2 + b^2 (c is the hypotenuse; a, b are the other two sides)
I could say to you that the square of the hypotenuse is the same as the sum of the squares of the other two sides. You might say "how can that be? The two other sides go off in other directions. They're in no way comparable to this single side." Yet it's known that they're the same.
If I said that one side of this equation had a value of 25, you wouldn't say "well is this the square of the hypotenuse or is it the sum of the squares of the two sides?". Because of the "=" sign, you know that they're the same.
|Here we're talking about waves moving through a medium; these waves have quadrature, unlike the sinewaves in an electrical circuit, or the impulses in nerves, where only one of the quadrature components exists.|
With sound or electromagnetic waves, we think of waves in the frequency domain. Thus a voice or music, or the signal coming out of a transmitter, or the light emitted from a light source, is treated as the sum of an independent set of waves of different frequencies, each with their own amplitude. Thus you can sequentially play two notes on a piano and then play them together and everyone recognises that the two notes are independent; you can play them together or separately and they don't interact. You can't tell the difference between two notes coming out of one speaker, or each of these notes coming out of two separate speakers.
|Let's move to sinewaves, i.e. a component of a quadrature wave.|
The trig identities (some of which can be proved geometrically, and - I think - all of which can be proved by Euler's equation) say that the sum of two sinewaves is the product of another two sinewaves. "Which are they?" you might say. The non-obvious answer is that they're the same. I can show you the circuit which does the multiplication and if you tell me the inputs, then knowing the trig identities, I can tell you what will come out. I know that sin(ωat)*sin(ωbt) = -(1/2)cos[(ωa+ωb)*t] +(1/2)cos[(ωa-ωb)*t] . The sum of two sinewaves is the same as the product of another pair of sinewaves.
Is the speaker being driven by the sum of two sinewaves or the product of another pair of sinewaves? It's the same thing.
The media for propagating light (space time) and for propagating sound (air) are linear. Light arrives after travelling 13Gyr across the universe unaccompanied by sum and difference frequencies. Sound is a pressure/velocity wave. If you change the pressure of air quickly, its temperature will change. If the change in pressure associated with the sound is small enough that the temperature doesn't change (true for normal sounds), then air is linear; sounds arrive unaccompanied by sum and difference frequencies.
How do we know if we have an exponential wave with quadrature (there must be a simple name for this, but I don't know it, let's call it a wave) or a sinewave? (So we have wave or sinewave.)
In general, electrical circuitry doesn't have the quadrature information. The factor which decides whether the energy is transmitted as a wave or a sinewave is the scale of the medium (brain, electrical circuit) compared to the wavelength. If the medium is comparable in size (this means about λ/4) or bigger, then the energy will have to be a wave (and have quadrature). If the medium is smaller, then the energy will be transmitted as one of the components only. A musical instrument has to be of a size comparable to the wavelength of the music created. A violin is too small to emit the notes normally made by a bass. A piccolo cannot make the notes of a tuba. Normal home speakers are small for the wavelength of sound they're generating (middle C has a wavelength of 4' - a speaker for this note would have to be of comparable size). As a result, home stereo speakers are inefficient (1-5%; a 100W amplifier only produces 1W of sound and the rest heats up the speaker magnet). For radio waves, the antennas are λ/4 or λ/2 in length.
For a wave, quadrature is needed to propagate through the medium. However quadrature is not needed to generate the wave (e.g. at the speaker).
The only possible place to see two waves interact in a non-linear medium is light in a non-linear crystal or when two waves hit an atom. Non-linear crystals are used to double the frequency of lasers, generating green light from infra-red. The only case I know of where two waves of different frequencies interact is with two photon absorption of light. This is needed for many lasers, but the absorption of the two photons occurs sequentially. It's not like the two photons multiply in a non-linear medium.
This section is about the duality of the sum/difference and the product of waves (and sinewaves). While there are plenty of examples of the duality for sinewaves, I don't have a good example of waves interacting in a non-linear medium. (It may not happen.) I don't know a lot about this and I haven't heard much talk of it.
In another section, you learn about forces acting on balls (you throw them up in the air and watch them fall back to earth). Forces are a familiar part of our world; you hit a nail with a hammer, you feel forces acting on you as a car accelerates (changes speed), you have to push objects to get them to move. Forces and acceleration, linked by the equation F=ma, are the basis of Newtonian mechanics and are governed by Newton's 3 laws of motion. Newtonian mechanics has been a great success: you can calculate the position of a ball, the period of a pendulum or vibrating string, or the period of masses connected by springs, the orbits of the planets and the tides.
There are two other formulations of mechanics, one due to Lagrange and the other due to Hamilton. They are based on momentum (p=mv) and energy (K.E.+P.E.). There are no forces in the worlds of Lagrange and Hamilton, only changes in momentum and energy. Instead there is the principle of action. A ball (or pendulum, or masses coupled by springs or bodies moving according to the acceleration of gravity), moves along a path such that action is stationary (stationary - a word from calculus). A path is stationary with respect to action if a small change in the path does not change the action. (Action does not have to be a minimum.) In Newtonian mechanics the energy of a pendulum is stationary if the bob is down or up (inverted).
The mechanics of Newton, Lagrange and Hamilton are equivalent.
Lagrangian and Hamiltonian mechanics are more complicated to set up and so for the usual simple cases, Newtonian mechanics is used. Newtonian mechanics can be taught without any understanding of calculus. For more complicated systems, Newtonian mechanics becomes unwieldy, while Lagrangian and Hamiltonian mechanics do not get appreciably more complicated (i.e. Newtonian mechanics does not scale well.) Lagrangian and Hamiltonian mechanics are reserved for systems for which calculus is required. Hamiltonian mechanics is best known for its use in quantum mechanics. Hamiltonian mechanics is also well remembered by college students for classes in it taught by lecturers who have no understanding of it.
(see Lagrangian Mechanics http://en.wikipedia.org/wiki/Lagrangian_mechanics) In 1948, Feynman invented the path integral formulation, extending the principle of least action to quantum mechanics for electrons and photons. In this formulation, particles travel every possible path between the initial and final states; the probability of a specific final state is obtained by summing over all possible trajectories leading to it. In the classical regime, the path integral formulation cleanly reproduces Hamilton's principle, and Fermat's principle in optics.
Feynman shared the Nobel Prize with Julian Schwinger and Sin-Itiro Tomonaga for his result. Freeman Dyson then showed that Schwinger's formulation was equivalent to that of Feynman, a result judged only slightly less important than that of Feynman and Schwinger. Dyson is another of the geniuses from the 20th century. Unlike most of his peers, Dyson turned his genius onto the problems faced by society and freely exposed the incompetent. (From books by Dyson, you find that Dyson came to understand Feynman's theory as a result of a hair raising car ride from Princeton to Los Alamos with Feynman driving and non-stop talking.)
I'll not be doing any Hamiltonian or Lagrangian mechanics here (It would be useful to return after you've done more calculus). The point of this section is to give you more examples of dualities.
The popular press and (at least my) university professors, will tell you that subatomic particles and light sometimes behave as particles and sometimes as waves. "Well, are they particles or waves?" you quite reasonably ask. The answer you'll get, showing they have no idea, is "both" or "either" or "it depends on the experiment", or "it depends on the detector". How does the particle/wave know that there's a diffraction grating up ahead and that it should start behaving like a wave, or that there's an atom up ahead which it's going to run into and eject an electron so it should start behaving like a particle? It doesn't know and doesn't need to know. The professors won't let on that they don't know and leave you feeling stupid that everyone understands but you.
Here's the answer: they're the same thing. Both formulations are equivalent. (I can't prove this and I haven't seen a proof, but it's got to be true.) Which formulation you use doesn't depend on the detector, it depends on which way you analyse the experiment. It just happens to be easier to analyse a diffraction grating using wave theory, and it's easier to analyse the photoelectric effect using particle theory. I expect it's possible to analyse both experiments using the other theory, but I haven't seen it. There must be experiments where both analyses work, but I haven't seen them. I assume they aren't given to students, as it would take the mystery out of subatomic physics.
[Diagram: r,i axes with the points (-1+i) and (1+i) marked; a vector is drawn from the origin to (-1+i), in the 2nd quadrant.]
[Diagram: r,i axes; a vector is drawn from the origin to (-1-i), in the 3rd quadrant.]
[Diagram: r,i axes; a vector is drawn from the origin to (1-i), in the 4th quadrant.]
(1+i)/(1-i) = (1+i)^2 / [(1+i)(1-i)]
= 2i/2
= i
1/(a+bi) = (a-bi) / [(a+bi)(a-bi)]
= (a-bi)/(a^2+b^2)
= [1/(a^2+b^2)] * (a-bi)
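Python has complex numbers built in (it writes i as j), so both results can be checked directly. This is just a sketch; the function name reciprocal and the test value 3+4i are made up for this illustration.

```python
# (1+i)/(1-i), using Python's built-in complex type
quotient = (1 + 1j) / (1 - 1j)

def reciprocal(a, b):
    """1/(a+bi), computed as (a-bi)/(a^2+b^2)."""
    return complex(a, -b) / (a * a + b * b)

print(quotient)
print(reciprocal(3, 4), 1 / (3 + 4j))
```

The second line prints the same value twice: once from the formula and once from Python's own complex division.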
The starting unit vector z=(a+bi)/sqrt(a^2+b^2) has phase ϑ, where tan(ϑ)=b/a. For convenience, let t=b/a. The initial vector then is z=(1+ti)/sqrt(1+t^2).
The square of the initial vector (by inspection) is z^2=(1-t^2+2ti)/(1+t^2)
The final unit vector Z=A+Bi has phase 2ϑ, where tan(2ϑ)=B/A. The problem is to find A,B.
from the tan double angle formula: tan(2ϑ)=2t/(1-t^2)
∴ B/A=2t/(1-t^2) (1)
from Pythagoras: A^2+B^2=1 (2)
substituting (1) in (2) (and rearranging)
A^2=[(1-t^2)/(1+t^2)]^2, so A=(1-t^2)/(1+t^2) and, from (1), B=2t/(1+t^2)
At this stage you should check that A,B satisfy (1),(2). Substituting your values for A,B into the formula for Z gives
Z=(1-t^2+2ti)/(1+t^2)
Thus Z=z^2. QED
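The proof can be spot-checked with numbers. Here's a minimal sketch using Python's built-in complex type and the standard math module; the function names and the sample values of t are chosen for this illustration. It builds z from t, squares it, and compares with A+Bi where A=(1-t^2)/(1+t^2), B=2t/(1+t^2).

```python
import math

def unit_vector(t):
    """z = (1+ti)/sqrt(1+t^2), the unit vector with tan(phase) = t."""
    return complex(1, t) / math.sqrt(1 + t * t)

def doubled_angle_vector(t):
    """Z = A+Bi with A=(1-t^2)/(1+t^2), B=2t/(1+t^2)."""
    return complex((1 - t * t) / (1 + t * t), 2 * t / (1 + t * t))

worst = 0.0
for t in (0.0, 0.5, 1.0, 2.0, -3.0):
    z = unit_vector(t)
    worst = max(worst, abs(z * z - doubled_angle_vector(t)))
print("largest mismatch between z^2 and Z:", worst)
```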
In your proof you found A^2=[(1-t^2)/(1+t^2)]^2. There are two values for A: A=±(1-t^2)/(1+t^2), not just the single A=(1-t^2)/(1+t^2)
For A -ve, B is also -ve. The simultaneous change of sign of both variables has no effect on tan(2ϑ)=B/A (i.e. we have the same tan() for both solutions). Do we have the same angle? tan(ϑ+π)=tan(ϑ) i.e. two angles, 180° apart, have the same tan(). We can't tell from this if we have two different angles; we could have two vectors with the same angle (if so, which angle?), or two vectors with the same tan(), but with different angles. However with A,B of the opposite sign, it's clear that the extra vector is pointed in the opposite direction. Our procedure has found two vectors, one the correct one and the other facing the opposite direction. This other vector is a spurious product of our algebra. The problem at this stage is that we can't tell which one is the spurious one. Sure, for specific examples we can tell which is the square, but this doesn't help us in the symbolic representation.
The phase of z is 0 to 2π. Thus the phase of Z is 0 to 4π. Each quadrant has its own combination of sign of A,B so we will need 8 entries, one for each quadrant of Z. As a check, see that z in the first quadrant can produce a Z in both the first and 2nd quadrant. Thus we'll have to separately analyse z with phase 0->π/4 from the vector with phase from π/4 to π/2.
The rotated version would have still had the same length. The squared version would have had a length/magnitude the square of the original vector's length.
x=+1,-1
[Diagram: the two roots drawn as vectors from the origin along the r axis, one to +1 and one to -1.]
The FTA says there are 4. They are 1, i, i^2=-1, i^3=-i. (Multiply each one out 4 times, if you're not convinced.)
1,-1,i,-i
[Diagram: the four roots drawn as vectors from the origin along the axes, to 1, i, -1 and -i.]
Find the 4th power of each one by rotating each root through 4 times its phase/angle.
[Diagram: three vectors drawn from the origin: one along the +ve r axis to 1, one into the 2nd quadrant and one into the 3rd quadrant.]
1, (-1+i√3)/2, (-1-i√3)/2
There are n roots (by the FTA). They lie on the circumference of the unit circle (the circle of unit radius) at a spacing of (360/n)°.
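Here's a sketch that generates the n roots at a spacing of (360/n)° around the unit circle and confirms that each one, raised to the nth power, gives 1. It uses the standard cmath module; the function name roots_of_unity is made up for this illustration.

```python
import cmath
import math

def roots_of_unity(n):
    """The n roots of x^n = 1, spaced 360/n degrees apart on the unit circle."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

for n in (2, 3, 4, 8):
    roots = roots_of_unity(n)
    worst = max(abs(root ** n - 1) for root in roots)
    print(n, "roots, largest error in root^n:", worst)

print(roots_of_unity(3))
```

The last line prints the three cube roots; compare them with 1, (-1+i√3)/2, (-1-i√3)/2.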
Find the vector operation which when done twice (or the vector, which when multiplied by itself), gives the vector 0+i.
It's half way between the vector (0+i) and the +ve real axis.
The vector at 45° when multiplied by itself gives a vector at 90°. There's a minor problem; if you start with the 45° vector 1+i, the resultant has a length=2. You need to start with a shorter vector. How short does it have to be?
√i = ±(1+i)/√2
 There are two square roots of a number i.e. there are two lines half way between 0+i and the +ve real axis (they're in opposite directions, one being the -ve of the other).
[Diagram: the two square roots of i drawn as vectors from the origin: sqrt(i)=(1+i)/sqrt(2) pointing into the 1st quadrant and sqrt(i)=-(1+i)/sqrt(2) pointing into the 3rd quadrant.]
two (by the FTA, there are always two square roots). They're at 90° to √i. Do you know why?
±1; ±(1+i)/√2; ±i; ±(-1+i)/√2. You have to multiply each one 8 times to be guaranteed of getting 1.
#! /usr/bin/python3
# root_equation.py (C) Caspar Wessel 2010 wessel@danish_academy.edu
# Iterates through r,i finding complex values for polynomials.
# (Updated to Python 3: print is now a function.)

def polynomial_2nd(point):
    #x^2+2x+2=0
    result = point*point + 2*point + 2
    return result
# polynomial_2nd()-----------------

def polynomial_4th(point):
    #x^4-1=0
    result = point*point*point*point - 1
    return result
# polynomial_4th()-----------------

def polynomial(point):
    #uncomment one of these calls
    result = polynomial_2nd(point)
    #result = polynomial_4th(point)
    return result
# polynomial()-----------------

#for 4th order (uncomment these, and comment out the 2nd order values)
#r_start = -4
#r_end = 5
#r_increment = 1
#i_start = 4
#i_end = -5
#i_increment = -1

#for 2nd order
r_start = -4
r_end = 3
r_increment = 1
i_start = 3
i_end = -4
i_increment = -1

def print_table(title, part):
    # one row for each value of i, one column for each value of r;
    # part() picks out what to show from the value of the polynomial
    print(title)
    for i in range(i_start, i_end, i_increment):
        output_string = str(i) + "\t"
        for r in range(r_start, r_end, r_increment):
            c = complex(r, i)
            output_string += str(part(polynomial(c))) + "\t"
        print(output_string)
# print_table()-----------------

#header row: the values of r
output_string = "r" + "\t"
for r in range(r_start, r_end, r_increment):
    output_string += str(r) + "\t"
print(output_string)
print("i")
print()
print_table("complex", lambda value: value)
print()
print_table("real", lambda value: value.real)
print()
print_table("imaginary", lambda value: value.imag)
# root_equation.py -----
cos(). The position of the tip of the marked blade starts (at t=0) at its maximum value.
known equations
i*sin(x) = ix - ix^3/3! + ix^5/5! ...
cos(x) = 1 - x^2/2! + x^4/4! ...
e^ix = 1 + ix - x^2/2! - ix^3/3! + x^4/4! + ix^5/5! ...
Euler's identity
e^ix = cos(x) + i*sin(x)
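The series can be summed numerically to see that it really does land on cos(x) + i*sin(x). The following is a minimal sketch in Python 3; the function name exp_series(), the 30-term cutoff and the test angle x = 1.2 are arbitrary choices for this illustration.

```python
import math

def exp_series(z, terms=30):
    """Sum the first `terms` terms of 1 + z + z^2/2! + z^3/3! + ..."""
    total = 0j
    term = 1 + 0j              # current term z^n/n!, starting at n = 0
    for n in range(terms):
        total += term
        term = term * z / (n + 1)
    return total

x = 1.2                                            # test angle in radians
series_value = exp_series(1j * x)                  # e^(ix) from the power series
euler_value = complex(math.cos(x), math.sin(x))    # cos(x) + i*sin(x)

print("series:", series_value)
print("Euler :", euler_value)
```

The two printed values should agree to rounding error. Calling exp_series(1j*math.pi) gives a value very close to -1.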
Euler's equation e^(iϑ) = cos(ϑ) + i*sin(ϑ)
we want ϑ such that e^(iϑ) = i
This requires that cos(ϑ) = 0 and sin(ϑ) = 1
For this to be true ϑ=π/2
e^(iπ) = -1
∴ ln(-1) = iπ
e^(iπ/2) = i
∴ ln(i) = iπ/2
i=sqrt(-1), so ln(i)=ln(-1)/2
You could find the exponential form of -i or you could find ln(i^3) or you could find ln(-1*i).
The answer: ln(-i) = i*3π/2
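Python's cmath module can be used to check these logs. One thing to watch: cmath.log() returns the principal value, with the angle between -π and π, so for -i it gives -iπ/2 rather than i*3π/2. The two differ by a full turn (i*2π) and describe the same vector. A minimal sketch (Python 3; the variable names are made up for this illustration):

```python
import cmath
import math

ln_minus_1 = cmath.log(-1)      # principal value: i*pi
ln_i = cmath.log(1j)            # principal value: i*pi/2
ln_minus_i = cmath.log(-1j)     # principal value: -i*pi/2

# The value i*3*pi/2 is one full turn (i*2*pi) away from the principal value
three_quarter_turn = 1j * 3 * math.pi / 2

print("ln(-1) =", ln_minus_1)
print("ln(i)  =", ln_i, " ln(-1)/2 =", ln_minus_1 / 2)
print("ln(-i) =", ln_minus_i)
# Both logs of -i exponentiate back to -i
print(cmath.exp(three_quarter_turn), cmath.exp(ln_minus_i))
```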
(1+i)/√2 = e^(iπ/4)
∴ (1+i) = √2*e^(iπ/4)
∴ ln(1+i) = ln(√2) + iπ/4
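The same recipe (find the length and angle of the vector, then take ln(length) + i*angle) can be tried in Python. This is a sketch only, assuming Python 3; cmath.polar() returns the length and angle, and the names ln_from_polar and ln_direct are invented for this example.

```python
import cmath
import math

z = 1 + 1j
r, theta = cmath.polar(z)                      # length and angle of the vector 1+i
ln_from_polar = complex(math.log(r), theta)    # ln(r) + i*theta
ln_direct = cmath.log(z)                       # Python's own complex log

print("length =", r, " angle =", theta)
print("ln(1+i) from polar form:", ln_from_polar)
print("ln(1+i) from cmath.log :", ln_direct)
```

The length should come out as √2 and the angle as π/4.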
The vector will start on the +ve r axis. Because the i component is inverted in polarity, it will first assume the value -1 and the vector will move to the -ve i axis (the down position). The r component hasn't changed in phase or polarity, so the vector will next move to the -ve r axis. Next the inverted i component will move the vector to the +ve i axis. The vector is rotating in the -ve direction.
invert polarity of cos() component:
new wave = -cos(ϑ)+i*sin(ϑ)
fix -ve sign of cos() component. This requires changing the angle, which in turn changes the sign of the sin() component.
= cos(ϑ+π) - i*sin(ϑ+π)
You should recognise this as e^(-i(ϑ+π)). However, let's finish the derivation in the formal manner; change the sign of the sin() term by negating the angle. This requires you to change the angle in the cos() term to match.
= cos(-(ϑ+π)) + i*sin(-(ϑ+π)) = e^(-i(ϑ+π))
Conclusion: the vector rotates in the -ve direction, with a phase offset of π i.e. the vector starts pointing along the -ve r axis and rotates in the -ve direction.
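The conclusion can be checked numerically. The sketch below (Python 3; both function names are invented for this illustration) compares -cos(ϑ)+i*sin(ϑ) with e^(-i(ϑ+π)) at a set of angles, and lists where the vector points at each quarter turn of ϑ.

```python
import cmath
import math

def inverted_cos_wave(theta):
    """-cos(theta) + i*sin(theta): the wave with its cos() component inverted."""
    return complex(-math.cos(theta), math.sin(theta))

def negative_rotation(theta):
    """e^(-i(theta + pi)): -ve rotation with a phase offset of pi."""
    return cmath.exp(-1j * (theta + math.pi))

# Compare the two forms at 12 angles around the circle
angles = [k * math.pi / 6 for k in range(12)]
max_diff = max(abs(inverted_cos_wave(t) - negative_rotation(t)) for t in angles)
print("largest difference:", max_diff)

# Positions for theta = 0, pi/2, pi, 3pi/2
quarter_turns = [inverted_cos_wave(k * math.pi / 2) for k in range(4)]
for z in quarter_turns:
    print("%+.1f %+.1fi" % (z.real, z.imag))
```

The quarter-turn positions come out as -1, i, 1, -i: the vector starts on the -ve r axis and steps round in the -ve direction.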
The i components cancel. You're left with twice the r component (the cos() component).
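+ve rotating wave:
e^(ix) = cos(x) + i*sin(x) (1)
-ve rotating wave:
e^(-ix) = cos(-x) + i*sin(-x) (2)
We can't easily add terms on the right hand side, since the angles are different. Change the angles in (2) to be the same as in (1).
e^(-ix) = cos(x) - i*sin(x) (3)
(1) + (3)
e^(ix) + e^(-ix) = 2cos(x)
(1) - (3)
e^(ix) - e^(-ix) = 2i*sin(x)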
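Adding and subtracting the +ve and -ve rotating waves gives the exponential forms cos(x) = (e^(ix) + e^(-ix))/2 and sin(x) = (e^(ix) - e^(-ix))/(2i). Here is a small numerical check, offered as a sketch (Python 3; the angle x = 0.7 and the variable names are arbitrary choices for this example).

```python
import cmath
import math

x = 0.7                          # test angle in radians
pos_wave = cmath.exp(1j * x)     # +ve rotating wave, e^(ix)
neg_wave = cmath.exp(-1j * x)    # -ve rotating wave, e^(-ix)

cos_from_waves = (pos_wave + neg_wave) / 2      # i components cancel
sin_from_waves = (pos_wave - neg_wave) / (2j)   # r components cancel

print(cos_from_waves, math.cos(x))
print(sin_from_waves, math.sin(x))
```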
sin(x)*cos(x) = (e^ix + e^-ix)(e^ix - e^-ix)/(4i)
= (e^i2x - e^-i2x)/(2*2i)
= sin(2x)/2
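A numerical check of the product is easy to write. The sketch below (Python 3; product_from_exponentials() is a name made up for this illustration) multiplies the exponential forms of cos(x) and sin(x) and compares the result with sin(x)*cos(x) and with sin(2x)/2.

```python
import cmath
import math

def product_from_exponentials(x):
    """(e^ix + e^-ix)(e^ix - e^-ix)/(4i), the exponential form of sin(x)*cos(x)."""
    pos, neg = cmath.exp(1j * x), cmath.exp(-1j * x)
    return (pos + neg) * (pos - neg) / (4j)

xs = [k * 0.1 for k in range(63)]       # 0 to about 2*pi
err_vs_product = max(abs(product_from_exponentials(x) - math.sin(x) * math.cos(x)) for x in xs)
err_vs_half_sin2x = max(abs(product_from_exponentials(x) - math.sin(2 * x) / 2) for x in xs)

print(err_vs_product, err_vs_half_sin2x)
```

Both errors should be at rounding level, which confirms the factor of 1/2 in sin(x)*cos(x) = sin(2x)/2.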
No. The proof required cancelling (1+t^2)/(1+t^2). You can't do this for t=i. There is a pin-hole in the domain of t at t=i.
The Wessel diagram and hence Euler's equation is only defined for real ϑ. If you use the exponential form of tan(), equating it to i, you find that t=i for e^(iϑ)=0 i.e. for a 0 length vector. This in turn gives a value for e^(-iϑ)=∞ which means this method of finding a solution of t=i is nonsense.
There's an identity involving sin^2() and cos^2() that is derived directly from the Pythagorean formula.
Equating real terms
cos(2x) = cos^2(x) - sin^2(x)
cos^2(x) + sin^2(x) = 1
cos(2x) = 2cos^2(x) - 1
cos(3ϑ) + i*sin(3ϑ) = [cos(ϑ) + i*sin(ϑ)]^3
= cos^3(ϑ) - 3cos(ϑ)sin^2(ϑ) + i[3cos^2(ϑ)sin(ϑ) - sin^3(ϑ)]
equating real parts and using a well known trig identity
cos(3ϑ) = cos^3(ϑ) - 3cos(ϑ)sin^2(ϑ)
= 4cos^3(ϑ) - 3cos(ϑ)
equating imaginary parts and using a well known trig identity
sin(3ϑ) = 3cos^2(ϑ)sin(ϑ) - sin^3(ϑ)
= 3sin(ϑ) - 4sin^3(ϑ)
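The multiple angle formulas can be checked by cubing (or squaring) the unit vector directly. A minimal sketch, assuming Python 3; the test angle ϑ = 0.4 and the variable names are arbitrary.

```python
import math

theta = 0.4                               # test angle in radians
c, s = math.cos(theta), math.sin(theta)
unit_vector = complex(c, s)               # cos(theta) + i*sin(theta)

cubed = unit_vector ** 3                  # should be cos(3*theta) + i*sin(3*theta)
cos3_formula = 4 * c ** 3 - 3 * c         # 4cos^3 - 3cos
sin3_formula = 3 * s - 4 * s ** 3         # 3sin - 4sin^3

squared = unit_vector ** 2                # should be cos(2*theta) + i*sin(2*theta)
cos2_formula = 2 * c ** 2 - 1             # 2cos^2 - 1

print(cubed, cos3_formula, sin3_formula)
print(squared, cos2_formula)
```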
Polynomials can be factorised into terms like (x-a), (x^2+a) or (x^2+ax+b). When equated to 0, the roots of the factors are all in the complex plane.
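As an illustration, here are the roots of the two polynomials used in root_equation.py, found with the quadratic formula. This is a sketch in Python 3; quadratic_roots() is a helper written for this example, and cmath.sqrt() is used so that a negative discriminant gives a complex result rather than an error.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0, allowing complex results."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

# x^2 + 2x + 2 is a factor of the form (x^2+ax+b)
roots_2nd = quadratic_roots(1, 2, 2)

# x^4 - 1 = (x - 1)(x + 1)(x^2 + 1): two (x-a) factors and one (x^2+a) factor
roots_4th = [1, -1] + list(quadratic_roots(1, 0, 1))

print("x^2+2x+2:", roots_2nd)
print("x^4-1   :", roots_4th)
```

The first polynomial has roots -1±i, and the second has roots ±1 and ±i.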
The location of (x,y) (and hence 10^i) is on the circumference of a circle of radius 1; i.e. 10^i is a vector of unit length.
e^(iϑ) = cos(ϑ) + i*sin(ϑ)
(10)^i = e^(iϑ) = cos(ϑ) + i*sin(ϑ)
∴ 10^i = (e^ϑ)^i
There's a couple of equivalent ways of looking at this
|ln(10)=2.303 crops up all the time when you collect data in decimal for exponential processes. Expect to see it more.|
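Python will evaluate 10^i directly, so the claim that it is the unit vector with angle ln(10) radians can be checked. A minimal sketch (Python 3; the variable names are invented for this example):

```python
import cmath
import math

theta = math.log(10)                  # ln(10), the angle in radians
ten_to_i = 10 ** 1j                   # 10^i evaluated directly
unit_vector = cmath.exp(1j * theta)   # e^(i*ln(10))
length = abs(ten_to_i)

print("ln(10) =", theta)
print("10^i   =", ten_to_i)
print("length =", length)
print("angle  =", math.degrees(cmath.phase(ten_to_i)), "degrees")
```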
e^(iϑ) is the unit vector with argument (angle) ϑ. In this example, we have ϑ = 1 radian (i.e. ϑ ≅ 57°). Thus e^i is the unit vector with argument = 1 radian. (60° is close enough to a radian for a back of the envelope calculation, but 57° ≅ 1 radian is a useful number to remember).
Here's e^i (approximately)
                _
         .i    /| e^i
         .    /
         .   /
         .  /
         . /
         ./ 57deg
.........0........
                 1
This vector returns to its original orientation while the drive shaft rotates by π=180°.
|We could write this vector as (e^2)^(iϑ) = 7.389^(iϑ) but this would hide what's going on.|
This function returns to its original orientation when the object rotates by 4π=720°.
|We could write this vector as (e^½)^(iϑ) = 1.6487^(iϑ) but this would hide what's going on.|
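The two repeat angles can be checked numerically. In this sketch (Python 3; the function names double_rate() and half_rate() are invented for this illustration) e^(i2ϑ) is back at its starting value after ϑ has advanced by π, while e^(iϑ/2) needs 4π and is pointing the opposite way at 2π.

```python
import cmath
import math

def double_rate(theta):
    """e^(i*2*theta): turns twice for each turn of theta."""
    return cmath.exp(2j * theta)

def half_rate(theta):
    """e^(i*theta/2): turns once for every two turns of theta."""
    return cmath.exp(0.5j * theta)

print(double_rate(0), double_rate(math.pi))          # same orientation
print(half_rate(0), half_rate(2 * math.pi))          # opposite orientation
print(half_rate(0), half_rate(4 * math.pi))          # same orientation
print(math.exp(2), math.exp(0.5))                    # the bases e^2 and e^(1/2)
```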
sin(ω_a t)*cos(ω_c t) = [(e^(iω_a*t) - e^(-iω_a*t))/(2i)] * [(e^(iω_c*t) + e^(-iω_c*t))/2]
= + ½ [(e^(i(ω_a+ω_c)*t) - e^(-i(ω_a+ω_c)*t))/(2i)] + ½ [(e^(i(ω_a-ω_c)*t) - e^(-i(ω_a-ω_c)*t))/(2i)]
= + ½sin[(ω_a+ω_c)*t] + ½sin[(ω_a-ω_c)*t]
sin(ω_a t) * sin(ω_c t)
= - ½cos[(ω_a+ω_c)*t] + ½cos[(ω_a-ω_c)*t]
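Both multiplier outputs can be checked against their sum and difference frequency forms. This is a sketch only (Python 3): the 1 kHz audio tone and 10 kHz carrier are hypothetical values chosen to keep the numbers small, and max_error() is a helper written for this example.

```python
import math

w_a = 2 * math.pi * 1000.0      # hypothetical audio frequency, rad/s
w_c = 2 * math.pi * 10000.0     # hypothetical carrier frequency, rad/s
times = [k * 1e-5 for k in range(200)]   # 2 ms of samples

def max_error(lhs, rhs):
    """Largest difference between two functions of t over the sample times."""
    return max(abs(lhs(t) - rhs(t)) for t in times)

# sin(w_a t)*cos(w_c t) = 1/2 sin[(w_a+w_c)t] + 1/2 sin[(w_a-w_c)t]
err_sin_cos = max_error(
    lambda t: math.sin(w_a * t) * math.cos(w_c * t),
    lambda t: 0.5 * math.sin((w_a + w_c) * t) + 0.5 * math.sin((w_a - w_c) * t))

# sin(w_a t)*sin(w_c t) = -1/2 cos[(w_a+w_c)t] + 1/2 cos[(w_a-w_c)t]
err_sin_sin = max_error(
    lambda t: math.sin(w_a * t) * math.sin(w_c * t),
    lambda t: -0.5 * math.cos((w_a + w_c) * t) + 0.5 * math.cos((w_a - w_c) * t))

print(err_sin_cos, err_sin_sin)
```

Each product contains only the sum and difference frequencies: the upper and lower sidebands.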
The carrier input to M_r is cos(ω_c t). The carrier input to M_i is sin(ω_c t).
cos() and sin() of the same angle are offset by 90°.
The phase of the two upper sidebands is offset by 90° (sin(), cos()). The phase of the two lower sidebands is offset by -90° (sin(), -cos()).
length of vector ½*0.866,-½*0.5 = ½√(3/4+1/4) = ½
length of vector ½*0.0,½*1.0 = ½√(0+1) = ½
cos(ω_(a+c) t) + i*sin(ω_(a+c) t) = e^(iω_(a+c) t)
they're all LSB terms, i.e. terms in ω_(a-c) t.
multiply by the negative frequency (of one of the audio or the carrier).
invert the r term
I_(+ve freq) = e^(iωt) = cos(ωt) + i*sin(ωt)
inverting the polarity of the r term (negating the angle of the cos() term)
I_(-r) = cos(-ωt) + i*sin(ωt)
= cos(ωt) + i*sin(ωt) = I_(+ve freq)
There is no change in the signal (cos() is an even function)
inverting the polarity of the i term
I_(+ve freq) = e^(iωt) = cos(ωt) + i*sin(ωt)
inverting the i term
I_(-i) = cos(ωt) - i*sin(ωt)
= cos(ωt) + i*sin(-ωt)
= cos(-ωt) + i*sin(-ωt) = e^(-iωt) = I_(-ve freq)
the signal rotates in the opposite direction, producing the opposite sideband (sin() is an odd function).
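Inverting the i term is the same as taking the complex conjugate, and this can be checked numerically. In the sketch below (Python 3) the rotation rate of one turn per second is a hypothetical value and invert_i_term() is a helper written for this example.

```python
import cmath
import math

w = 2 * math.pi                          # hypothetical: one rotation per second
times = [k / 16 for k in range(17)]      # one full rotation in 16 steps

def invert_i_term(z):
    """Flip the polarity of the i component, leaving the r component alone."""
    return complex(z.real, -z.imag)

# Compare the inverted +ve frequency wave with the -ve frequency wave
max_diff = max(abs(invert_i_term(cmath.exp(1j * w * t)) - cmath.exp(-1j * w * t))
               for t in times)

# Angle of each vector a short time after t=0
dt = 0.01
angle_original = cmath.phase(cmath.exp(1j * w * dt))
angle_inverted = cmath.phase(invert_i_term(cmath.exp(1j * w * dt)))

print(max_diff, angle_original, angle_inverted)
```

The original vector's angle has increased after the time step and the inverted one's has decreased by the same amount: the same speed in the opposite direction.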
exchanging the carrier input signals between multipliers.
I_(+ve freq) = e^(iωt) = cos(ωt) + i*sin(ωt)
exchange the carrier inputs
I_(c_reversed_inputs) = sin(ω_c t) + i*cos(ω_c t)
restore the Euler format: restore r term (you need a cos(): use the identity sin(x) = cos(x - π/2))
= cos(ω_c t - π/2) + i*cos(ω_c t)
restore the i term (you need a sin(): use the identity cos(x) = sin(x + π/2))
= cos(ω_c t - π/2) + i*sin(ω_c t + π/2)
make the angles match (sin(x + π/2) = sin(-(x - π/2))), then write ωt for ω_c t - π/2
= cos(ωt) + i*sin(-ωt)
= cos(-ωt) + i*sin(-ωt)
= I_(-ve freq)
the signal rotates in the opposite direction, producing the opposite sideband (sin() is an odd function).
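The exchanged-input signal sin(ω_c t) + i*cos(ω_c t) can be compared numerically with a -ve frequency wave that has a phase offset of π/2, i.e. e^(-i(ω_c t - π/2)). This is a sketch (Python 3); the carrier rate of one turn per second is hypothetical and the function names are invented for this illustration.

```python
import cmath
import math

w_c = 2 * math.pi                        # hypothetical carrier: one rotation per second
times = [k / 16 for k in range(17)]

def exchanged_inputs(t):
    """sin(w_c t) + i*cos(w_c t): carrier inputs swapped between the multipliers."""
    return complex(math.sin(w_c * t), math.cos(w_c * t))

def negative_frequency(t):
    """e^(-i(w_c t - pi/2)): -ve rotation starting from the +ve i axis."""
    return cmath.exp(-1j * (w_c * t - math.pi / 2))

max_diff = max(abs(exchanged_inputs(t) - negative_frequency(t)) for t in times)

start_angle = cmath.phase(exchanged_inputs(0))       # pi/2: pointing along +ve i
later_angle = cmath.phase(exchanged_inputs(0.01))    # slightly less than pi/2

print(max_diff, start_angle, later_angle)
```

The angle decreases with time, so the vector rotates in the -ve direction.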
exchanging the polarity of both r (or i) terms.
exchanging the polarity of both r terms:
From a previous example, changing the polarity of the r term of one of the inputs does not change the sign of the frequency. We can assume that changing the polarity of the r component of both inputs has no effect on the sign of the frequency of the output.
exchanging the polarity of both i terms:
From a previous example, changing the polarity of the i term changes the sign of the frequency. If we do this to both inputs, then both input frequencies have the opposite sign. Here's our original multiplication
I_orig = e^(iω_c t) * e^(iω_a t) = e^(i(ω_c+ω_a)t)
which gives the USB output.
Here's the result of multiplying the -ve frequencies of both inputs
= e^(-iω_c t) * e^(-iω_a t)
This is a USB signal of -ve frequency. (The listener won't be able to tell that the signal has -ve frequency, but will have to adjust the BFO for the LSB.)
I_c = e^(iω_c)
I_(c -ve) = -e^(iω_c)
This is the same as adding the angle π to a vector. It doesn't change the direction of rotation.
= sin(ω_c*t) + [(e^(iω_a*t) - e^(-iω_a*t))/(2i)] * [(e^(iω_c*t) - e^(-iω_c*t))/(2i)]
= sin(ω_c*t) - (1/2)[(e^(i(ω_a+ω_c)*t) + e^(-i(ω_a+ω_c)*t))/2] + (1/2)[(e^(i(ω_a-ω_c)*t) + e^(-i(ω_a-ω_c)*t))/2]