AustinTek homepage | Linux Virtual Server Links | AZ_PROJ map server |

Teaching Computer Programming to High School students: An introductory course using Python as the high level language: Intro, Primitive Data types

Joseph Mack

jmack (at) wm7d (dot) net
AustinTek home page

v20090511, released under GPL-v3.


Table of Contents

1. Getting files
2. Software, Hardware and the Operating System (OS)
3. binary numbers, the bit (b)
3.1. Number systems with bases !=10
3.2. Using bits to represent numbers
3.3. Giving Credit
3.4. The byte (B)
3.5. The word
3.6. binary addition
3.7. Algorithm and Order of an algorithm demonstrated using binary addition
3.8. Overflow
3.9. Binary Multiplication
3.10. Binary Subtraction
3.11. bc
3.12. Hexadecimal
3.13. Base 256
3.14. Integer Division
3.15. Review: binary
4. Primitive Data Types
4.1. Primitive Data Type: Integer
4.2. Arithmetic with Long Numbers
4.3. Negative Integers
4.4. Range of Integers
4.5. Integer Arithmetic in Python
4.6. Largest/Smallest Integer in Python
4.7. Primitive Data Type: Characters, ASCII table
4.8. Primitive Data Type: Real Numbers
4.9. Primitive Data Type: Strings
4.10. Is it a string or number?
4.11. Other primitive data types
4.12. Review: primitive data types
5. Other Languages

Abstract

Class lessons for a group of 7th graders with no previous exposure to programming (as of Aug 2008, now 8th graders - in the US 7th, 8th grade are called Middle School). The student are doing this after school on their own time and not for credit. My son's school didn't want me to teach the class using any of the school facilities, as they thought it would compete with a class in Java given to advance placement math 12th graders. However I could use the school facilities, if I didn't teach anything which would fullfil a requirement (which I assume meant anything practical). So the class is at my home and is free. Since this is a hobby activity for the kids, I don't ask them to do homework. As they get further into the class and take on projects, I'll be quite happy for them to work on the projects in their own time, but will let them decide whether/when to do this.

Note
Originally all this material was in one webpage, but google stopped indexing it when it got to about 1M of html. I've now split up the material into several sections in the hopes that google will start indexing it again. Splitting up the document is a real pain: it's much easier to link to parts of the same document than to targets in other documents.
Note

The notes here are being written ahead of the classes. I've marked boundaries of delivered classes with "End Lesson X". Each class is about an 90mins, which is about as much as the students and I can take. After a class I revise the material I presented to reflect what I had to say to get the points across, which isn't always what I had in the notes that the students saw. Material below on classes that I've given will be an updated version of what I presented. Material below on classes I haven't given, are not student tested and may be less comprehensible.

The big surprise to me is that when you're presenting new material, the students all say "yeah, yeah we got that" and want you to go on. However when you ask them to do a problem, they haven't a clue what to do. So most of my revisions to the early material were to add worked problems. I also spend a bit of time at the start of the class asking the students to do problems from the previous week's class.

I've found that when asking the students to write a piece of code that some of them (even after almost a year) can't turn a specification into code. Instead I have to say "initialise these variables, use a loop to do X, in the loop calculate Y and print out these values, at the end print out the result". Usually these steps will be given in a list in the notes here. Once I'd shown the students how to initialise and write loops, I had expected them to be able to parse a problem into these steps without my help. However some of them can't and I changed my teaching to reflect this.

The kids bring laptops do to the exercises and they display this page by dhcp'ing and surfing to my router where this page is also stored.

Students are using Mac OS, Windows XP and Linux. For WinXP, the initial lessons used notepad and the windows python. For material starting at the Babylonian square root, I changed the WinXP student over to Cygwin (still using notepad).

In later sections I started having the kids do a presentation on what they'd learned. The primary purpose of this was to accustom the kids to talking in front of groups of people. They also had to organise the material, revealing how much they'd remembered and understood it. The ??? covered a body of material large enough that it took several months of classes and homework to assemble the presentation. For the next section of work, I'll have them do presentations on smaller sections of the material, and then have them present on the whole section at the end. I've hidden most of the answers to questions in footnotes (so they won't see the answers easily during class). However later, when the students are scanning the text to construct their presentations, that it's hard to find what they're looking for. I don't know what to do about that.

Material/images from this webpage may be used, as long as credit is given to the author, and the url of this webpage is included as a reference.

1. Getting files

  • You will need a working installation of python. If you don't already have python on your machine, go to python download (http://www.python.org/download/) and download and install a version of python for your machine.
  • To write programs, you will need a programming ??? (not needed for the first part of the class).
  • You will need some standard unix utilities e.g. bash shell and bc if you want to do some of the exercises. bash is the default shell for Linux and is used to retreive/manipulate/input/output information from the computer about its state/condition. bash is run within a terminal (e.g. xterm or a console). bc does arithmetic in different bases (e.g. binary and hexadecimal).

    • Linux has bash and bc by default.
    • Mac OS/X Jaguar has tcsh by default, Panther has bash by default. Here's an article on installing and running bash on Mac OS (http://www.macdevcenter.com/pub/a/mac/2004/02/24/bash.html).
    • To run these unix utilities on windows, you will need Cygwin (http://www.Cygwin.com/) plus a few files (the basic install just installs Cygwin). Start with setup.exe (at bottom, "Install or update now!") and select a download site. You will get an initial menu from which you choose your download/install. The menu is not particularly obvious. You will need bash from "shells", bc from "math" or "util".

      Cygwin is a unix like environment for windows. It's designed for people who know unix and who are forced to work on windows machines. In this class you can choose your OS, so if you're working under windows, you'll be doing so because that's what you want. In this case you should use the version of python for windows. (It would be possible to use the cygwin version of python instead of the native windows version of python and do all of the class within cygwin).

    If you're at a terminal and don't know your shell, type

    echo $SHELL
    /bin/bash
    
  • Python Tutorial on-line (http://docs.python.org/tut/), Python Tutorial in several formats (http://docs.python.org/download.html).
  • LiveWires python teaching course (http://www.livewires.org.uk/python/home). I will be using some of the course as examples.
  • LiveWires worksheets (http://www.livewires.org.uk/python/worksheets)
  • LiveWires package (http://www.livewires.org.uk/python/package)
  • pygames (http://www.pygame.org/download.shtml)
  • David Handy (who I find lives close by me) and is a member of Triangle ZPUG (http://starship.python.net/mailman/listinfo/triangle-zpug) has written Computer Programming is Fun (http://www.handysoftware.com/cpif/) to teach python to home schoolers (i.e. kids about the age of my class). I only knew about the LiveWires course when I started this course. David's book has graphics and audio libraries which will add some fun for kids, who don't want to start with hard boiled code and provably correct constructs.
  • While looking for clues on validating user input I found CS 107:Computing, Robots, and Python (http://www.cs.usfca.edu/~wolber/courses/107/)

2. Software, Hardware and the Operating System (OS)

A computer can be logically divided into

  • hardware - the physical parts of a computer, that you can thump/kick. These include, cpu, ethernet cards, harddisks, memory.
  • software/program - a set of instructions that tell the hardware what to do. software can be in lots of places

    • on media (harddisk, floppy disk, flash stick)
    • in the computer's memory (when the software is running)
    • built into ROM (read only memory) in hardware.

      software built into hardware that is burned into a ROM and which can't be changed (or changed easily) is callled firmware. e.g. a harddisk uses programs in its ROMs to allow it to read/write to disk.

    in many formats

    • in text form which needs to be compiled before it runs on the computer
    • in binary form which will run directly on a computer.

A particularly important piece of software is the operating system (OS). examples are Linux, MacOS, Windows. The purpose of the OS is to virtualise the hardware.

virtualise: make all hardware appear the same to the user, programs, no matter what piece of hardware is being used underneath.

If you use a harddisk that's IDE, SATA, scsi made by any manufacturer, of any size, the OS will present the harddisk as a storage accessed by the same instructions. The instructions will be different for each OS, but once you've picked your OS, the instructions for accessing the harddisk will be the same no matter what harddisk is in the machine.

3. binary numbers, the bit (b)

Computers have millions of pieces of hardware (memory/registers) that are in one of two states

  • up/down (N/S) (magnetism, harddisk)
  • switch on/off
  • voltage high/low
  • current high/low

These states are represented by 0/1

You don't have to know what the hardware is doing or even what the hardware technology is, or whether a 0 is represented by high or low voltage. You (or the computer) will just be told that the particular piece of hardware is in the 0 state or the 1 state.

Some hardware maintains its state without power e.g.

  • floppy disks
  • harddisks
  • flash memory

Most hardware looses its state when switched off e.g.

  • RAM (random access memory in the computer)

how much memory is in a typical hard disk, flash disk, floppy disk, RAM? [1]

Since there are only two states (two = bi), the state is represented by a binary digit (called a bit). A bit then either is 0 or 1.

3.1. Number systems with bases !=10

The number system we're used to is called decimal. The base of the decimal system is 10. Number systems used in computing are

  • base 10: decimal - for input and output to users
  • base 2: binary - the representation used for numbers in a computer
  • base 16: hexadecimal - a more convenient representation of binary for humans.
  • base 256: (one byte) used for assigning internet addresses.

3.2. Using bits to represent numbers

bit, wikipedia (http://en.wikipedia.org/wiki/Bit).

Computers crunch numbers. We (people) use the decimal number system. Computers only have bits and use these bits to make binary numbers. There are no decimal numbers inside a computer. Binary numbers are transformed by software to a decimal representation on the screen for us.

Before launching into binary numbers, lets refresh our memories on the positional nature of the representation of decimal numbers.

102=1*100  + 0*10   + 2*1
   =1*10^2 + 0*10^1 + 2*10^0

The value represented by each digit in the number 102 depends on it's position. The "1" represents "100". If the "1" was in the rightmost position it would represent "1". Each time a digit moves 1 place to the left, it increases the number it represents by a factor of the base of the number system, here 10. The left most digits represent the biggest part of the number.

binary numeral system, wikipedia (http://en.wikipedia.org/wiki/Binary_numeral_system).

In a binary number, the base is two, so the number prepresented by each position increases by a factor of 2 as you move left in the number. In binary, you need two numbers to represent all the available values. Here are the two numbers and their decimal equivalents.

binary decimal
   0  =  0
   1  =  1

Here's a binary number. What does it represent?

1101 = 1*2^3 + 1*2^2 + 0*2^1 + 1*2^1
     = 1*8   + 1*4   + 0*2   + 1*1
     = 13 decimal

As with decimal, the left most digit carries the most value (in 11012, the left most digit represents 810).

leading zeroes behave the same as for decimal numbers

00=0
01=1

Here's some more binary numbers

  binary decimal
      10=2
      11=3  (2+1)
     100=4
     101=5  (4+1)
    0101=5
00000101=5
    1001=9  (8+1)
    1011=11 (9+2,8+3))
   11011=27 (16+11, 24+3)
   11111=31

what is 10112 in decimal? [2]

Let's go the other way, decimal to binary

what is 710 in binary? Do it by stepwise division.

Start with the power of 2 just below your number.
Take a guess. 2^3=8. This is bigger than 7.
Try next one down. 2^2=4. This is less than 7. Start there

7/4=1, remainder=3

do it again, the power of 2 just below 3 is 2

3/2=1, remainder=1

with the remainder being 0 or 1, we're finished

7=1*2^2 + 1*2^1 + 1^2^0

7 decimal = 111 binary

In the above exercise, would anything have gone wrong if you'd started dividing by 8 rather than dividing by 4? No, you would have got the answer 710=01112. The leading zero has no effect on the number, so you would still have got a right answer, you just would have done an extra step.

what is 1510 in binary? [3]

Here's python code to convert decimal to binary (http://www.daniweb.com/code/snippet285.html).

Here's bash code to convert binary to decimal. If you aren't already in a bash shell, start one up in your terminal with the command /bin/bash. On windows, click on the Cygwin icon to get a bash prompt. (comments in bash start with #). The code here is from Bash scripting Tutorial (http://www.linuxconfig.org/Bash_scripting_Tutorial) in the section on arithmetic operations. (Knowing that bash can convert binary to decimal, I found this code with google using the search terms "convert binary decimal bash").

declare -i result	#declare result to be an integer
result=2#1111		#give result the decimal value of 1111 to base 2
echo $result		#output the decimal value to the screen
15
# same code in one line
declare -i result;result=2#1111; echo $result
15

3.3. Giving Credit

In the previous section, I showed bash code I found on the internet. When writing a new program, you'll often need some functionality that's already been coded up (after 50yrs of computing, there's not much that hasn't aleady been coded up) that's available in books or on the internet. Books on any computer language will have lots of worked examples. It's often faster to find debugged and working code, than it is to write it yourself from scratch. You're expected to do this and everyone does it. For a class exercise, you may have to nut it out yourself, but when it comes to writing working code, you borrow when you can. When you use someone else's code, you should document where you got it.

  • it's the right and honourable thing (TM) to do
  • it will be easier to find the original author, if you need to find out more about the code later
  • people will be happy to send you code when you ask them for it.
  • unless you're known to be a superhuman coding fiend, no-one will believe you wrote it all yourself and your credibility will be zero.
  • you don't want people who are paying for your coding output, to think you were stupid enough to write it all yourself when working and tested code for that function was already available.

3.4. The byte (B)

byte, wikipedia (http://en.wikipedia.org/wiki/Byte).

It turns out to be convenient (hardware wise) to manipulate bits 8 at a time. 8 bits = a byte(B). Most PCs (in 2008) are 32 bit machines, meaning that the CPU manipulates 4 bytes (32 bits) at a time. Since these machines are running at somewhere between 100MHz and 2GHz, then they are doing between 400 and 8000 million byte operations/sec.

some 1Byte numbers expressed in decimal

byte     decimal
00001000=8
00001111=15
00010000=16
00100000=32
01000000=64
10000001=129
11111111=255

1 Byte can represent integers from 0-255

3.5. The word

Bytes are always 8 bits. However data is shifted around according to the bus width of a computer. A 32 bit computer has 32 bit registers and 32 lines for addressing and fetching data. It can transfer data and instructions 4 bytes at a time. A term from the HPC (high performance computing or supercomputers) world, where 64 bit computing has been the standard for 30 yrs, the term word describes the width/size of a piece of data/instruction. In the HPC world, there are words of all sorts of lengths, including 128-bit. A 32-bit computer has a 32-bit (4 byte) word size.

3.6. binary addition

Let's do some binary addition: The rules are similar to decimal. Here's the addition rules.

	0+0=0
	1+0=1
	1+1=0 with carry, 1+1=10

here's the addition rules in table form.

addition                carry
+ | 0 1              + | 0 1
-------              -------
0 | 0 1              0 | 0 0
1 | 1 0              1 | 0 1
Note

1+1=0 with a carry.

1+1!=10 (!= not equal).

The extra leftmost digit, the "1" (as in "1+1=10") becomes the carry digit. It's handled separately by the computer. If you want it, you have to go find it. The equivalent table in decimal for a one digit computer would show that 8+7=5 (and not 15)

worked example, working from right to left, one bit at a time

 111		(what demical numbers are being added here?)
 010 +
 ---
  01
 1   carry

 001
1    carry

1001

What is 1001+00112 [4] ? What decimal numbers are represented?

Note
End Lesson 1

3.7. Algorithm and Order of an algorithm demonstrated using binary addition

Algorithm, wikipedia (http://en.wikipedia.org/wiki/Algorithm).

Algorithm: A predetermined sequence of instructions (events) that will lead to a result.

e.g.the algorithm of right-to-left addition will lead to the sum of two numbers.

e.g.putting your books into your school bag in the morning will lead to them all being at school with you the next morning.

What country do we get the word "algorithm"? What does "al" mean? What other words start with "al" used in the same way? [5]

Some algorithms are better (faster, more efficient, scale better) than others. While computers are fast enough that even a bad algorithm can process a small number of calculations, it takes a good algorithm to process large numbers of calculations. Computers are expensive and it's only worth using a computer if you're doing large numbers of calculations. We're always interested in how an algorithm scales with problem size. So we're always interested in the goodness of the algorithm being used.

The measure of goodness of an algorithm is the rate at which the time for it's execution increases as the problem size increases (i.e. how the algorithm scales). If you double the size of the problem, does the execution time not change (independant of problem size), double (linear in problem size) or quadruple (scales as the square of the problem size)?

This change in execution time with problem size is called the Order of the algorithm.

  • O(1): the execution time is always the same (1), i.e.is independant of problem size
  • O(n): the execution time is proportional to the problem size.
  • O(n2): the execution time goes up as the square of the problem size (O(n2)) algorithms are too slow to be used for large problems).

Speed (Order) of the addition operation:

addition one bit at a time takes 8 steps for addition of 2 bytes. The Order of addition for addition, one step at a time, is O(n) i.e.time taken is proportional to n=number of bits.

There's a faster way: add the bits in parallel and handle the carries in a 2nd step

 111
 010
 ---
 101 add
 10  carry
 ---
1001

Addition done in parallel takes 2 steps, no matter how long the numbers being added (you need to increase the amount of hardware, but you only have to pay for the hardware once and the extra cost can be amortised over many addition operations).

The Order of parallel addition is O(1) i.e.is proportional to 1 i.e.addition takes the same time independant of the number of bits being added. The parallel addition algorithm is much faster than the stepwise mechanism.

Note

amortise: If one process is faster/better than another, but the faster process is more expensive but only has to be paid for once, and you can use the process as many times as you want, then you are are said to be amortising the extra cost over a the life of the machine.

e.g. cost of stepwize adder=$10. cost of parallel adder=$20. If you're going to be doing 1015 additions before you retire the machine, the amortised extra cost of the parallel adder is 10-14$/addition. Most people will accept this extra cost, because of the increased speed will save them money for each run of the program.

The most common place that amortisation is used in computing is in the cost of writing a program. Writing a program is expensive; you have to pay the programmers salary, benefits, heating/AC, electricity and buy computers and lease a premises to do this. A program may cost $1k-$100M to write. However if the program is run millions of times, the cost/run for the end user may be insignificant compared to the cost of paying their staff to run the program. In this case the costs of the programmer's time is said to be amortised over the number of times the program will be run. Because writing programs is so expensive, you only write programs that are going to be run many times.

Here's the time/steps for the two types of addition for different problem sizes

	Stepwise	Parallel
1	2		2
2	4		2
3	6		2
4	8		2
.
.
16	32		2

We don't really care what the constant of proportion is, i.e.we don't care if each step takes 1usec, 1msec or 1sec, only how the time to completion scales with problem size. We know that if we scale the problem by some large number (e.g.106), that the constant of proportionality will be swamped by the problem size. Let's say that parallel addition took 8 steps instead of 2. Here's the new times for addition.

bits	Stepwise	Parallel
1	2		8
2	4		8
3	6		8
4	8		8
.
.
16	32		8

We only need to get to numbers of length 8 bits to be ahead with parallel addition.

3.8. Overflow

say we have a 4bit computer, what is

 1010
 0110+
 ----
 0000 with 1 carry

? When you do this, a digit rolls off the left hand end.

This is called overflow Since you only have a 4bit computer, you'll get an erroneous answer. In some circumstances you'll get an error message on overflow and in other situations you won't. Since overflow is part of the design of a computer, it is expected and is not neccessarily regarded as an error. (e.g. a drinking glass will only hold so much fluid. If you put in more fluid, it will overflow. People accept overflow as part of the design of a drinking glass.)

With addition, you have to anticipate getting a number that is 1 bit longer than the two numbers being added, i.e. you'll get a 1 bit overflow.

3.9. Binary Multiplication

the rules are similar to decimal - here's the 1 times table (there's no carry on binary multiplication)

* | 0 1
-------
0 | 0 0
1 | 0 1

3.9.1. left shifting (and overflow), multiplying by 10

Multiplication by 1,10,100 left shifts the digits by 0,1,2...n places (i.e. by adding a '0' on the righthand end). This is the same whether you're working in decimal or binary. Left shifting is a fast operation in a computer. When the computer detects that you're multiplying by a power of 2, it left shifts the number, rather than using the multiply hardware.

Note
both in binary and decimal
  1*10=  10
 11*10= 110 
110*10=1100

what numbers are represented in each case?

Left shifting produces overflow. Assume a 4bit computer

 1100*10 =1000 (not 11000)
 1100*100=0000 (not 110000)

If a 0 overflows, you get the correct result.

3.9.2. multiplication by other numbers

Computers do multiplication the same way we do, one digit at a time, but add the results in parallel.

what is

 1010
   11 x
 ----
 1010
1010
 ----
11110 addition
 0000 carry
-----
11110

again

    1010
    1101x
    ----
    1010
   0000
  1010
 1010
 -------
 1110010
   1     carry (a couple of rounds)
 -------
100000010

the carries are done in parallel
multiplication is fast and is O(1).

what is 10102x01102? [6]

what is 11002x10012? [7]

3.10. Binary Subtraction

It is not neccessary to have a separate set of hardware for subtraction. You could in principle add the negative of the number, but we don't have negative numbers (hardware only knows about 0-255 and has no concept of positive or negative numbers). (Software can interpret certain values of a byte as being negative, but we haven't got that far yet.) However we don't need negative numbers for substraction. Instead we add the complement (for binary, it's the twos complement: the twos complement instruction is fast - one clock cycle).

3.10.1. Tens complement

Let's look at subtraction by adding the complement using a decimal example.

Let's say we want to do

9-3=6

and we have a 1 decimal digit (dit) computer. Here's the addition table for a 1 dit computer.

+ | 0 1 2 3 4 5 6 7 8 9
-----------------------
0 | 0 1 2 3 4 5 6 7 8 9
1 | 1 2 3 4 5 6 7 8 9 0
2 | 2 3 4 5 6 7 8 9 0 1
3 | 3 4 5 6 7 8 9 0 1 2
4 | 4 5 6 7 8 9 0 1 2 3
5 | 5 6 7 8 9 0 1 2 3 4
6 | 6 7 8 9 0 1 2 3 4 5
7 | 7 8 9 0 1 2 3 4 5 6
8 | 8 9 0 1 2 3 4 5 6 7
9 | 9 0 1 2 3 4 5 6 7 8
Note
remember, because we have a one digit machine, 9+9=8. (9+9!=18, the 1 is lost by overflow.)

If we're going to do the substraction 9-3=6, by adding some other number, what number do we add to 9 to get 6? Looking at the addition table, we find we have to add 7.

9-3=6 what we want
9+?=6 what we're looking for
9+7=6 the answer from looking up the addition table above

The ten's complement of 3 then is 7.

What if we want to subtract 3 from any other number, say 8? If we want to do 8-3=5, by adding a number to 8, on looking at the addition table, we have to add 7. So whenever we want to subtract 3, instead we add 7.

8-3=5 what we want
8+?=5 what we're looking for
8+7=5 the answer from lookup up the addition table above.

We find the ten's complement of 3 is 7 no matter what number we subtract 3 from. Making the complement of a number only depends on the number, not what we subtract it from.

What is the tens complement of 8?

9-8=1
9+?=1
9+2=1

the complement of 8 is 2.

What's the complement of 9?

9-9=0
9+?=0
9+1=0

the complement of 9 is 1.

What is the complement of 0,4? [8]

Overflow isn't an advantage or a disadvantage; it's just part of the design of a computer. Since we have overflow, we can use it to do subtraction by addition of the complement, rather than having to build a subtractor into the hardware.

Here's the decimal (10s or tens) complement table

number 	complement
0	0
1	9	
2	8
3	7
4	6
5	5
6	4
7	3
8	2
9	1	

The complement is the number you have to add to get a sum of 0. Note that the sum is not really 0; it's 10 but the left digit is lost through overflow. The complement of a number then is

complement=(base of the system, here 10)-(the number).

But in a 1 dit computer you don't have two digits to make a 10. Instead if you want the ten's complement of 7, you ask the computer to come in 3 places from the biggest number (here 9), i.e.8,7,6 giving 6 and then you add 1 giving 7.

Note
Subtracting from 10 or marching in from 9 are the same to you, but if you're wiring up a computer, you can't subtract from 10, but you can count in from 9.

Summary: if we're going to do 9-3, we add 10 to the -3, giving +7. The answer we'll get by doing 9+7 will be 10 more than what we want. However the extra 10 will be lost through overflow (subtracting 10 from the result), giving us the correct answer.

what we wanted 
9-3=6

what the computer calculated (it added 10 to both sides)
9+(-3+10)=10+6

the answer the computer gave us
6

3.10.2. Twos complement

What is the (twos) complement of the 2 bit number 012?

find some number that when added to 012 gives 002 (just start poking in numbers till you get the required answer).

 01	#decimal 1
 11	#decimal 3
 ----
 00
Note: overflow is needed to get 00

See any connection between 1 and its complement 3, in a 2 bit system [9] ?

A 4 bit number system has base 16. See any connection between the values of the number-complement pairs and a 4 bit number system in the following examples? Using brute force, what's the twos complement of the 4 bit numbers

  • 0110 [10] ?
  • 1011 [11] ?
  • 1001 [12] ?
  • 1100 [13] ?

let's try an example in a regular 8 bit byte. Using brute force, what's the twos complement of 01000110? (for labelling, let's use the words minuend, subtrahend and difference.)

 01000101 subtrahend
 10111011 complement
 --------
 00000000

How do you make the complement in binary?

Following the decimal examples (above), to get the complement, you count in from the end number (1 or 0) by 1 number (i.e. you flip the bits), shown here

 01000101 original subtrahend
 10111010 bit flipped subtrahend
 10111011 known twos complement
Note
By looking at the bit flipped number and the twos complement, you can see that you have to add 1 (as is done for the decimal example).
 10111011 bit flipped subtrahend + 1

binary complement=(bit flipped subtrahend + 1)

we've found the complement (the -ve + the base number of the system)

 01000101 subtrahend (decimal 69)
 10111011 complement, (decimal 187)

What's the sum of the 8 bit subtrahend and its complement [14]

with the complement, we can do the subtraction.

 10000000  minuend (decimal 128)
 01000101- subtrahend (decimal 69)
 --------

 10111010 bit flipped subtrahend 
 10111011 bit flipped subtrahend +1 = complement of 69

Do the subtraction by adding the two's complement

 10000000 minuend (decimal 128)
 10111011 complement of decimal 69
---------
 00111011 difference (left bit overflows on an 8 bit computer) 

result:
 binary      decimal 
 10000000    128
 01000101-    69-
 -------     ---
 00111011     59

using the twos complement to do subtraction, what is

  •  01101101
     01100011-
     --------
    
    [15] ?
  •  10110110
     10001111-
     --------
    
    [16] ?

Note
End Lesson 2. Some kids didn't get the material on the complement and didn't complete the excercises. I added more exercises and started Lesson 3 at the beginning of binary subtraction.

3.11. bc

Most people can only do 4bits of binary in their head. You either go to hexadecimal (below) or use a binary calculator. Fire up a terminal and try a few examples (you can recall the last line with the uparrow key and edit it without having to type in the whole line again).

bc (basic calculator?) is a general purpose calculator. Using a terminal, try some standard arithmetic e.g.(+-/*).

echo "3*4"  | bc
12

bc does all input and output in decimal, until you tell it otherwise.

  • You change the output base using obase
  • You change the input base using ibase. After you've run this command, all following input will be read using the new ibase.

Here's a few binary examples.

#input will be in binary, output is decimal since you haven't changed output
echo "ibase=2; 1100-101" | bc
7

#with obase explicitly set (not needed if obase is 10)
echo "obase=10;ibase=2; 1100-101" | bc
7

#same problem, output in binary
echo "obase=2;ibase=2; 1100-101" | bc
111

#convert decimal to binary
echo "obase=2; 17" | bc
10001

#other examples:
echo "obase=10;ibase=2; 1100+101" | bc
17

echo "obase=2;ibase=2; 1100+101" | bc
10001

Exercises: Hint - the number(s) you're processing are in the last instruction on the line. Before you run the instruction, figure out the base for the input and for the output and then decide whether you need to set obase and/or ibase.

  • convert 3210 to binary [17]
  • convert 1012 to decimal [18]
  • convert 1001*102 to decimal [19]
  • convert 32+610 to binary [20]

The normal order is obase/ibase. What happens if you reverse the order of obase and ibase without changing their values?

normal order
echo "obase=10;ibase=2;01110001" |bc
113

inverted order
echo "ibase=2;obase=10;01110001" |bc
1110001

bc defaults to decimal input. In the normal order, bc interprets obase as 1010 and ibase as 210 (i.e. binary). The input will be intepreted as binary and output will be in decimal. In the inverted order, obase says that all further input will be interpreted as base 210 (i.e. binary). Thus the obase value is 102 (210), i.e. the answer will be in binary.

As long as you know what you're doing, you can use obase,ibase in any order. To minimise suprises, use obase first, leaving the input decimal, then input the value for ibase in decimal.

3.12. Hexadecimal

For the length of numbers used in a computer, binary is cumbersome. Unless you really want to know the state of a particular bit, you use hexadecimal (a number system with base 16), which uses 1 symbol for 4 bits, and runs from 0..f (or 0..F)

binary  hex  decimal
0000 	0        0
0001 	1        1
0010 	2        2
.
.
1000 	8        8
1001 	9        9
1010 	a or A  10
1011 	b or B  11
1100 	c or C  12
1101 	d or D  13
1110 	e or E  14
1111 	f or F  15

When input to a computer is ambiguous as to its value, hexadecimal is represented as "OxF" or "Fh" (preceded by "Ox" or postceded by "h").

Here's conversion of hexadecimal to decimal using bash

declare -i result	#declare result to be an integer
result=16#ffff		#give result the value in decimal of hex ffff
echo $result		#echo $result to the screen
65535

#or all in one line
declare -i result;result=16#ffff; echo $result
65535

using bc

echo "obase=16;ibase=16; F+F" | bc
1E
Note
End Lesson 3. Spent some time in the first half of the class going through the twos complement exercises which I added after lesson 2. I asked the kids to try the following exercises for homework. They didn't do them so I started with the decimal/binary/hex table above and then worked them through the exercises below, at the start of the class.

Using any method

  • convert 1010 to hexadecimal [21]
  • convert 10112 to hexadecimal [22]
  • make up the hex addition table and the carry table [23]
  • using this hex addition and carry table, give the result of adding "F+F" on a 4 bit computer [24] and an 8 bit computer [25]
  • find the sum of cd+0e (both hex) in hex for a 1 byte computer. Do it with bc and then using the table you just derived. [26]
  • give the hex complement of the hex numbers: 1,6,3A [27]
  • what is FD01-EF56 in hex [28] ?

3.13. Base 256

Note
Base 256 logically belongs here, but since you don't need it to start programming, and the introductory part of this course is long enough, I'll do it some time later. The material is at ???

3.14. Integer Division

Note
Integer division logically belongs here, but since you don't need it to start programming, and the introductory part of this course is long enough, I'll do it some time later. The material is at ???

3.15. Review: binary

Do calculations with pencil and paper (you can use a computer to check your answers later when you're done). Write out any steps required for calculations, as if you were doing an exam.

  • What is "bit" an abbreviation of?
  • convenient names are used for number systems with bases of 2,8,10 and 16. What are these 4 number systems called?
  • convert these decimal numbers to binary and to hexadecimal: 2,7,8,11
  • how many bits in a byte? how many different numbers can be represented by a byte?
  • what is the word size for standard home PCs, for computers used for mathematical modelling?
  • what is the result of this binary addition on a computer that has 1 byte registers.
    0101
    1101+
    ----
    
  • If you accept the possibility of loosing a bit to overflow, what width registers do you need to accept the result of addition of the numbers in two single byte registers?
  • The algorithm for addition on a computer is of order O(1). This means that if you double the width (size) of the registers from 32 to 64 bits, the number of steps for the addition is going to be what (the half, same, double, 2^32 times)?
  • Compared to other operations (e.g. division, square root) addition is cheap/expensive?
  • what is the result of this multiplication on a 16 bit computer.
    00000101
    00001101*
    --------
    
  • If you accept the possibility of loosing a bit to overflow, what width registers do you need to accept the result of multiplication of the numbers in two single byte registers?
  • What is the 10's complement of 7 on a computer which contains registers of only a single decimal digit? Use the 10's complement to do the following operation on a single decimal digit computer.
    8
    7-
    -
    
    and this operation
    3
    4-
    -
    
    Why isn't this 2nd result -ve?
  • On a 4 bit computer what is the two's complement of 0110?
  • Do the following subtraction on a 4 bit computer, using the 2's complement method
    1001
    0110-
    ----
    
  • What is the output of the following commands at the command prompt?
    # echo "obase=10;ibase=2;0110"| bc 
    # echo "ibase=2;obase=10;0110"| bc
    
  • convert 1310 to hexadecimal.
  • convert 11102 to hexadecimal.
  • convert efH to binary, decimal.
  • What is a0H+77H on a 4 bit computer?
  • Do this in your head and by using the 16's complement.
    ed01
    ae80-
    ----
    

answers: [29]

4. Primitive Data Types

Primitive type, wikipedia (http://en.wikipedia.org/wiki/Primitive_type).

Bytes hold numbers 0-25510, 00000000-111111112, 00-FFh It's all the computer is ever going to have. We need to use these bytes to represent things more useful/familiar to us.

Using bytes of 0-255, languages implement a set of primitive data types (and provide operators to manipulate the primitive data types).

  • integers:e.g. 42, 1024, -100
  • characters: e.g. 'a','Z','0',' '

    Note
    This explanation of the difference between '0' and 0 was later in the lesson, but the students immediately protested that '0' was a number and not a character.

    What's the difference between the integer 0 and the character '0'?

    • the integer 0:.

      If represented by a single byte, it will be 00000000. You can do arithmetic operations (e.g. multiply, add, subtract and divide) with the integer 0.

    • the character/symbol '0':.

      Has particular shape. It's represented by the byte 30h. When the computer needs to draw/print this character on a screen, the byte 30h is sent to the screen/printer, where the hardware knows to draw a symbol of the right shape to be a zero. The computer is not allowed to do arithmetic operations (e.g. add, multiply, subtract or divide) on the character '0'. However the computer can test the variable holding the character '0' to see whether it represents a decimal digit (number), hexadecimal digit, punctuation, letter and if a letter, whether it's upper or lower case.

    In situations where the computer doesn't know whether 0 is a number or character, you have to explicitly write '0' and/or "0" (depending on the language) for the character, while 0 is used for the number.

    To add to the confustion, the word "number" is used to mean both a numerical quantity and the characters which represent it. Context will indicate which is meant.

    I will be talking about the ASCII character set, ASCII, wikipedia (http://en.wikipedia.org/wiki/ASCII), which is useful for simple text in (US) English. An attempt at a universal character set, see Unicode, wikipedia (http://en.wikipedia.org/wiki/Unicode).

    Early in the days of computing, the US Govt decided to only buy computers that used the same character set and it mandated ASCII. Until then, manufacturers all used different hexadecimal representations of characters. Because ASCII was required for computers bought by the USGovt from the early days of computing, all manufacturers supported ASCII. ASCII is still the only guaranteed way of exchanging information between two computers. Usually if one computer wants to send the value 3.14159 to another computer, it is sent as a series of characters (string) and transformed into a number at the receiving end. (There is no agreed upon convention for exchanging numbers.) Thus e-mail and webpages all use ASCII. Many computer peripherals (e.g. temperature sensors) send their data as a string of ascii characters (terminated by a carriage return), which is then turned into a number within the computer.

    see big government does work.

    Note
    The US Govt could have set standards for exchange of numbers too, but it didn't, so numbers are exchanged between computers by ASCII.
  • real numbers: e.g. -43.0, 3.14159, 98.4

    Floating point numbers, wikipedia (http://en.wikipedia.org/wiki/Floating_point).

  • boolean: e.g. true, false (these are the only two allowed values) (most languages don't have booleans, you have to fake it).

    Boolean datatype, wikipedia (http://en.wikipedia.org/wiki/Boolean_datatype). Boolean logic in computer science, wikipedia (http://en.wikipedia.org/wiki/Boolean_logic_in_computer_science).

  • strings: e.g. "happy birthday", "my birthday is 1 Jan 2000".

    String (computer science), wikipedia (http://en.wikipedia.org/wiki/String_%28computer_science%29).

4.1. Primitive Data Type: Integer

Programs don't usually do much arithmetic with integers. Integers are used as counters in loops and to keep track of the position in an executing program. Integers do come from digital sensors: e.g. images from digital cameras, digital audio, digital sensors. However most data, by the time it arrives at the computer, is reals.

In a 32 bit computer, an integer has a range of 0-4294967295 (232, this number is referred to, somewhat inaccurately as 4G, but we've all accepted what it means - it's the 32 bit barrier).

#in bash
#binary
declare -i result;result=2#11111111111111111111111111111111; echo $result
4294967295

#hexadecimal
declare -i result;result=16#ffffffff; echo $result
4294967295

4.2. Arithmetic with Long Numbers

Numbers needing more bits than the machine's register size are called Long (or long), e.g. a 64 bit number on a 32 bit machine. Arithmetic on long numbers needs at least two steps, each of 32-bit numbers, and requires an "add with carry" (ADC) instruction (found on all general purpose computers designed to be able to load any code the user wants to install, but not found on special purpose computers exclusively running preinstalled code such as cell phones, routers, automotive computer chips). Here's how addition of long numbers works. Let's assume a 2bit computer and we want to add a 4bit number.

0010
1011+
----
????

First split the problem into pieces managable by the hardware (here 2 bits) giving us the right hand half (the least significant bits) and the left hand half (the most significant bits).


LH         RH       
00         10         
10+        11+
--         --
??         ??  

Next a word about addition and carry: When doing addition by hand, there is never a carry for the rightmost digit, but a computer has a carry bit for the rightmost bit which is set to 0 at the start of addition.

RH
10
11+
----
?? sum
?0 carry

step 1: right column, add two digits + carry digit. The carry to the 2nd column is 0.
RH
10
11+
--
?1 sum
00 carry

step 2: left column, add two digits + carry digit. There is overflow
RH
10
01+
--
01 sum
00 carry (with overflow)

The computer has a FLAGS register (32-bits in a 32 bit computer), which holds, in each bit, status information about the executing program, including whether the previous instruction overflowed, underflowed or set a carry.

The addition above overflowed, but the computer doesn't know if the bit is required for Long addition, in which case the overflow is really a carry. The computer stores the overflow bit in the carry bit in the flags register just in case. If the computer is doing a Long addition, the next step will ask for the carry bit. If the computer isn't doing a Long addition, then then the carry bit will be ignored (and will be lost).

Here's what the calculation looks like now (only the state of the carry bit is shown in the FLAGS register). The computer will first add the right most digits in its 2bit registers, using the regular add (ADD) instruction, which only adds the two numbers and the information setup in the carry input to the adder.

before 1st addition

LH         RH        FLAGS
00         10         ?
10+        11+
--         --
??         ?? sum 
?0         ?0 carry

after 1st addition

LH         RH        FLAGS
00         10         1
10+        11+
--         --
??         01 sum 
?0         00 carry

Because of the overflow, the FLAGS register is now 1. The computer has been told that it's doing the 2nd step in a Long addition. It uses the "add with carry" (ADC) instruction, which transfers the carry bit in the FLAGS register to the adder, and then does a normal addition.

2nd addition. first step, copy carry bit from FLAGS to carry input for LH

LH         RH        FLAGS
00         10         1
10+        11+
--         --
??         01 sum 
?1         00 carry

2nd step, add digits and carry digits for LH numbers

LH         RH        FLAGS
00         10         1
10+        11+
--         --
11         01 sum 
01         00 carry

we now read out the sum digits

11         01

giving the required answer of 1101

You can chain addition to any precision (on a 32-bit computer, to 64, 96, 128-bits...) Standard calculations rarely need more than 64 bits, but some people want to calculate π to billions of places and this is how they do it.

Long arithmetic is slower than regular arithmetic. You don't ask for Long operations unless you know you need them.

Note
End Lesson 4

4.3. Negative Integers

If we wanted negative integers, how would we do it? Pretend you're a 1 byte computer and you need to represent -1. You can do this by finding out the number which added to 1 gives 0.

00000001
????????+
--------
00000000

The answer is 111111112, 25510or FFh (note: computers need overflow to work).

00000001
11111111+
--------
00000000

You've seen the computer version of -ve numbers before. They're called what [30] ? They are the (-ve of the number + the base) (in a 1 byte computer the base is 256).

What is -210 in binary, hexadecimal? [31]

How do we know whether 255 should be interpreted as -1 or 255?

The level of primitive data types, is one level above a byte. Your program keeps a record of the primitive data type that each particular byte represents. When you write your program, your code will have descriptors stating whether this integer will have +ve only values (called an unsigned int) or both +ve and -ve values (called a signed int). Some programming languages will have already decided that you'll be using a signed int and you won't have any choice.

If you have a signed int, then integers with high order bit=1 are -ve while those with high order bit=0 are +ve.

binary     hexadecimal decimal
00000000       00         0
00000001       01         1
.
.
01111111       7F       127
10000000       80      -127
10000001       81      -126
.
.
11111100       FC        -4
11111101       FD        -3
11111110       FE        -2
11111111       FF        -1

Linux runs on 32 bit computers. What values is represented by 32x1's (or FFFFFFFF) in Linux bash. On a 32 bit machine we might expect this to be -1.

declare -i result;result=16#ffffffff; echo $result
4294967295

This is not a negative number. What's going on? Let's try a 64-bit number (just for reference, the biggest number that can be represented by 64 bits is 264=18,446,744,073,709,551,616=18.45*1018).

declare -i result;result=16#ffffffffffffffff; echo $result
-1
declare -i result;result=16#7fffffffffffffff; echo $result
9223372036854775807
declare -i result;result=16#8000000000000000; echo $result
-9223372036854775808

bash in Linux rolls over to -ve numbers half way through the 64 numbers: the integers in bash on Linux on 32 bit machines are 64-bit signed integers.

Note
see long_numbers for 64-bit math on 32 bit machines.

4.4. Range of Integers

Here is the range of values that various sized integers (int) can represent. To give an idea of the relative sizes, the table shows the time (stored as seconds) represented by that integer.

			8bit		16bit		32-bit		64-bit	
unsigned		0-255		0-65535		0-4294967296	0-18446744073709551616
signed			-/+127		-/+32767	-/+2147483647	-/+9223372036854775808

unsigned time		4mins		18hrs		136yrs		584,942,417,355yrs
signed time		2mins		9hrs		68yrs		292,471,208,677yrs

You'll see these numbers often enough that you'll start to remember them. For the moment be prepared to see numbers pop up that you've not previously seen much of.

The Y2.038K problem. In 1999 the computer world was in a flurry: programmers who'd prepresented years in their dates using 2 digits, realised that the year 00 would represent the year 1900. Paychecks would not be processed, elevators would stop working at midnight, trapping thousands (if not millions) of innocent people, and planes would fall out of the sky (except Japanese planes). A bigger calamity could not be imagined. The world's bureaucrats heroically spent millions$ of taxpayer's and consumer's money to prevent certain disaster. On 01 Jan 2000, none of the predicted misfortunes occured, for which we must thank the selfless and unacknowledged taxpayers and consumers of the world.

Unix represents time as a signed 32-bit integer of seconds, starting 1 Jan 1970, this date itself a major blunder [32] . If in Jan 2038, you're still using a 32 bit OS (not likely for desktop machines, but quite possible for embedded devices sitting in computer controlled machinery, which rarely need 32 bits, much less 64 bits), Unix time will overflow in Jan 2038. If in Jan 2038, your computer controlled refrigerator stops ordering food, it will be because the refrigerator is asking for food to be delivered in 1970. Jan 2008 was a good time to take out a 30yr loan for 250,000$; your monthly payments would be -1,600$ (a comment from slashdot in Jan 2008. http://it.slashdot.org/article.pl?sid=08/01/15/1928213).

A 32 bit computer can generate how many different different integers [33] ? This computer then is capable of generating that many different integers. If you use the integers to be labels or addresses, you can label that many different items.

A 32 bit computer addresses its memory in bytes using address lines in the hardware. The computer has to read/write a byte as one unit; it can't address individual bits in a byte in memory - it has to read or write the whole byte. Once the computer has read the byte into a register, then each bit within the byte is separately addressable. What is the maximum number of bytes a 32 bit computer can address, and how much memory is this [34] ?

Not too long ago, microprocessors had 4kbytes of memory. Now for many applications, computers need more than 4Gbytes of memory. These applications all have to be run on 64-bit computers. But if you wanted to increase the amount of memory available to 32-bit computers, how could you do it [35] ?

In fact most data and instructions in a 32-bit computer are 32 bits and are fetched 32 bits at a time, so changing the addressing to 32 bits would not be a big change. A char would now have to be represented by the byte of interest, followed (or preceded) by 3 empty bytes. The instructions that work on chars would have to mask off the 3 empty bytes. Since chars (in most programs) are a small fraction of the data held in memory, these unused bytes would not cause too much wasted space in memory.

I don't know why 32 bit computer manufacturer's didn't go to 32 bit addressing. Possible reasons might be

  • Compatibility is a big factor in the commodity market; unless of course you're Apple, who brings out new hardware and software every 3yrs, discontinues support of the current hardware and tells everyone how fortunate they are to be able to buy the new hardware and software. This selects for grateful and well-heeled customers. For the rest of the commodity market, everything has to be backward compatible, back to the clay tablet. The new architecture would break old code. Most people using PCs are running applications on the desktop (business/office applications), and don't have the source code for the programs they run; they run proprietary binaries and would have to buy new versions of their programs if they upgraded to 32 bit addressed hardware.
  • Most of a desktop machine's activities are character oriented; typing into a wordprocessor, display on a screen, sending characters or mouse clicks to machines on the internet (all of which have to be sent one character at a time in an IP packet, and have to await a reply packet before being displayed on your screen). (In contrast, an HPC machine doesn't interact with the slow user, keyboard, or screen; it runs for hours, days or weeks and then dumps its output as files, to be poured over later by the programmer/user.)
  • 64 bit addressing is available if you need it. 64 bit machines have been around for 30yrs or so. People needing large amounts of memory are crunching large amounts of data. They have more money and aren't running desktop applications. They're either using programs they wrote themselves (i.e. they have the source code, which they can recompile) or are prepared to pay for 64 bit versions of standard programs. They are also prepared to pay for 64 bit machines.
  • 64 bit addressing is available for the desktop too. The hardware for the commodity market (PCs) is changing over to 64 bit about the same time as the arrival of desktop programs that handle large amounts of data.

Harddisks read/write data to/from independantly addressable blocks (i.e. the computer cannot address individual bits or bytes within a block; it has to address and then read or write the block as one indivisible unit). Let's assume a blocksize of 512bytes. If the computer wants 1 bit off the harddisk, the computer has to know the address of the block, read the whole 512bytes into registers, manipulate the 1 bit and the write the 512byte block back to disk. What's the maximum number of blocks a 32 bit computer can address on a harddisk and how much can be stored on a disk with blocksize=512bytes [36] ?

If you wanted to put a bigger disk on a 32 bit computer, how would a harddisk manufacturer do it [37] ?

Harddisk manufacturers originally started with block size of 512 bytes and have incremented the size of the blocks continuously over the years. I don't know why hard disk manufacturers can change the blocksize whenever they want and not have programs fail, while at the same time microprocessor manufacturers have not been able change from 8bit addressing to 32 bit addressing. Possible reasons might be

  • While the 32 bit barrier for memory was always a long way off and programs never had to change this size, harddisk manufacturers set specs for their disks that were regularly exceeded almost the next year. You'd think the harddisk manufacturers would set a spec that would last 10yrs or so, but they didn't. OS writers were continually virtuallising out new harddisk hardware, while not having to change the word size for fetching from memory.

My guess as why memory addressing stayed constant at the 1 byte granularity over decades, while disk block size (granularity) increased from 512 bytes to 8192 bytes, is that going to larger block size forgoes the chance of addressing a 512 byte block, which doesn't cause any problems (you don't ship disk blocks off the local machine), but choosing 32-bit addressing forgoes the chance to address a byte, which you do want to be able to move around (including sending to other machines or peripherals).

IPv4 internet addressing uses a 32 bit unsigned integer (IP) to uniquely identify each network device (e.g. no two devices/computers can have the came IP). How many devices can be simultaneously on the internet using an IPv4 IP [38] ?

The internet game "World of Warcraft" (reported in slashdot, http://games.slashdot.org/games/08/01/19/1321241.shtml) uses a 32 bit integer to store the amount of gold that players earn. Some players have reached that limit and can no longer receive gold from other players. If these players have 1c of gold for each bit, what is their wealth in $ [39] .

The world's telephone system carries voice data in 8bit form i.e. it converts your voice into bytes, each byte representing the amplitude of your voice at any instant. How many different amplitude levels can be expressed using 1byte [40] ? Since the phone system uses 8bits, it's simple to send bytes from computer data across phone lines. Hifi audio is usually 12bits (how many levels is this? [41] ) which has less noise than 8bit audio.

Digital cameras, and computer monitors use 8 bits to represent the intensity of each of the 3 primary lights of a picture; red, green and blue. This turns out to be more levels than the eye can differentiate, but not much more (the eye doesn't see the edge between one intensity and the next and only sees a continuous change in color).

A color picture from an 8 bit digital camera can have how many different colors [42] ?

Matte print (e.g. books, newspapers) can only reproduce about 100 levels of light, but glossy print (e.g. in fine books and magazines) can reproduce about 180 levels, which is why expensive advertisements are run in glossy magazines.

The human eye can accomodate a range of light from nightime to midday on a cloudless day, a range of 107 in intensity (I think). The eye can see features in a face in shadow and in the face of a person standing next to them in full sun, but an 8 bit digital camera will, according to the exposure, only see the face of one, while the other will be washed out (either dark, or light). To help in situations of high contrast, expensive digital cameras record 12bits, allowing range compression and expansion. These photos are post-processed to reduce the range to 8bits for display (or printing) but keeping constrast in both the light and dark areas (i.e. you can see the features of a face in shadow and a face in bright light in the same photo).

What other things could we use 32-bits for: How many -ve numbers could we have [43] ? How many prime numbers could we address [44] ?

Note
End Lesson 5. At the start of the next class, I went back over the number of items that a 32 bit computer could represent. The students had forgotten the number of integers that could be represented by 32 bits (not only the 4G value, but the concept of there being a limit associated with 32 bits). I went through the number of integers, computers on the internet, blocks on a hard disk etc again. The seemed to remember the concept after a few examples. My partner reminded me that you have to tell a student 3 times before you can expect them to start to remember a fact.

4.5. Integer Arithmetic in Python

Now that we're at the level of primitive data types, we can use a language like python.

fire up python; you will be running python in immediate (interactive) mode, where it interprets each instruction one at a time, and returns to the prompt. (In normal operation a program keeps executing till it has to wait, say for a keystroke from you.)

  • in a terminal (Linux, Mac, Cygwin on windows) type python.
  • on windows; start-programs-python-python(commandline)

You will get the python ">>>" prompt

Python 2.4.3 (#1, Apr 22 2006, 01:50:16) 
[GCC 2.95.3 20010315 (release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 
Note
The following examples are based on Chapter 1 of the LiveWires tutorial "Introducing Python".

try a few subtractions, multiplications and divisions on your own.

>>> 12 + 13
25

>>> 123456 * 3
370368

How about this one?

>>> 7/3
2	#7,3 are integers. You're asking to do integer arithmetic. You get an integer answer.
>>> 

>>> 7%3
1	#'%' is modulo (remainder)
>>> 

>>> 7.0/3
2.333333333333333 #as long as one of the numbers is real, the answer will be promoted to real

you'll learn about real numbers soon.

4.6. Largest/Smallest Integer in Python

Note
This went over the kid's heads, so I skipped to the next section (they don't need to know this right now)

Most programs (including python) use the machine's native libraries (e.g. math, string) (which are usually written in C). (No-one writes a library when a well tested one is already available.) The size (number of bits) for various primitive types in python will then depend on the native libraries. The documentation for python says there are two types of integers (see Numeric Types http://docs.python.org/lib/typesnumeric.html).

  • plain integers: (called "int" in most languages)

    the size depends on the native libraries. We would expect on a 32 bit PC for plain integers to be 32-bit (you don't always guess right: bash on Linux uses 64 bit integers).

  • long (or Long) integers (a number followed by "L"):

    numbers which are bigger than plain integers and have unlimited precision (the machine will use enough bits to handle whatever you throw at it). (Most languages restrict the number of bits you can have).

The Python documentation doesn't tell you the sizes for these two types of integers for any particular platform: you're supposed to be able to work it out yourself. What's the largest plain integer that python can represent? (for likely numbers, look at the table in integer_range) i.e. is it 32 or 64 bit? You won't have to remember the range of integers in python, but you'll need to understand enough about a computer to figure it out. You also should not be surprised if numbers become Long when they become big enough. (In the following, remember 65536 fills 2 bytes. For compactness, I'll use hexademical to illustrate what's happening.)

Python has no trouble representing any size integers. Here are some integers from 16-256bits (Long integers end with "L").

>>> 65536-1			#16bit FFFF
65535
>>> 65536*65536-1		#32-bit FFFFFFFF
4294967295L
>>> 65536*65536*65536*65536-1	#64-bit FFFFFFFFFFFFFFFF
18446744073709551615L
>>> 65536*65536*65536*65536*65536*65536*65536*65536-1	#128-bit  (32 Fs)
340282366920938463463374607431768211455L
>>> 65536*65536*65536*65536*65536*65536*65536*65536*65536*65536*65536*65536*65536*65536*65536*65536-1 #256bit
115792089237316195423570985008687907853269984665640564039457584007913129639935L

From the above output, the 16bit number 65536 is a plain integer, but the 32 bit number is Long. To calculate 65536*65536-1, we would first have had to calculate the intermediate result 65536*65536 which would be a Long (needs 33bits). If you subtract 1 from a Long, you still have a Long (even if it could be represented as a plain), so FFFFFFFF could be a plain integer, but we wouldn't have found out doing it this way. Let's look around for a 32 bit limit. Remember that only half of the integer range is used in signed integers, so let's look at half of a 32-bit number.

#here we would need 33 bits to handle the multiplication overflow 
#so we already know that the answer will be a L number.
>>>65536*65536-1	# FFFFFFFF
4294967295L
#what's half of 65536?
>>> 65536/2 		# 8000       
32768
#The result will be 80000000H, which if a signed integer will be a -ve number
#It looks like python promotes the integer to a L
>>> 32768*65536		# 80000000
2147483648L
#Let's get below 80000000H to 7FFF0000H. 
Yes it's a plain integer
>>> 32767*65536
2147418112
#Let's try 7FFFFFFFH. Yes it's a plain integer.
>>> 32767*65536+65535
2147483647
#just checking 80000000H again, this time approaching from below.
#It's L
>>> 32767*65536+65536
2147483648L

The largest plain integer represented by python is 7FFFFFFFh or 2147483647.

This process above, of poking numbers (or different piece of code) into a program to see what it does is called noodling. It's a good way to learn.

What's the -ve most plain integer in python [45]

Python uses a signed 32-bit integer to represent plain integers.

4.7. Primitive Data Type: Characters, ASCII table

We need to represent the characters in the alphabet, so the computer can type them on the screen and receive them from the keyboard. We need upper and lower case (52) + numbers (10), plus some punctuation and printer/screen control characters (move up/down/left/right, carriage return, line feed, end of file).

abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789
,. !@#$%^&*()-_[]{};:'"<>;/?`~

This is more than 64, but less than 127. This number of characters requires 7 bits. A regular 8 bit byte is used with the top 127 generally unused. The mapping between the 256 possibilities in a byte and the symbols displayed above, as mandated by the USGovt, is called ASCII.

A table of ascii characters and their binary/decimal/hexadecimal equivalents is at wiki, ASCII (http://en.wikipedia.org/wiki/ASCII). The table of printable characters (http://en.wikipedia.org/wiki/ASCII#ASCII_printable_characters). shows that in ASCII, the characters are in alphabetical order.

Note
unlike some other character sets e.g.EBCDIC http://en.wikipedia.org/wiki/EBCDIC originally devised as an extension of Binary Coded Decimal (BCD) http://en.wikipedia.org/wiki/Binary-coded_decimal needed to handle money.

A table which better illustrates the hexadecimal organisation of ASCII is ASCII Chart and Other Resources (http://www.jimprice.com/jim-asc.shtml#table). (A slightly fancier table ASCII Table and Unicode Characters http://ascii-table.com/).

The numbers are 3hex+number. This allows easy conversion of a character representing a number into a number (you mask off the left 4 bits and add the right 4 bits into the output number).

bash converts the hexadecimal representation of a character to its ascii symbol using this command

echo $'\x41'
A
Note
The "41" is hex and the 'A' output is a char (not a hex number). The rest of the command is obscure bash magic.

although you're never likely to have the octal representation of a char, bash converts the octal representation of a character to its ascii symbol using this command

echo $'\41'
!
Note
418=84+1=3310=21h
echo $'\x21'
!

How many letters down the alphabet is the character 'B' (try it at the prompt) [46] ? How many letters down the alphabet is the character represented by hex '51' [47] ? Knowing that the hex for 'A' is 41h, figure out the hex for 'Z' and then try it [48]

To change between upper and lower case, the 6th bit in the byte is flipped. What change in value (as an integer) does flipping the 6th bit represent [49] ?

echo $'\x5A'
Z

echo $'\x7A'
z

In a program. to differentiate a character from a number or variable do this

'c'	char
 c      the variable named c #better to use a longer descriptive name, eg computer_name
 7      the number 7
'7'	the character 7

Computers can scan text to test which characters are letters (A-Z,a-z), which are numbers (0-9) and which are punctuation. The computer can match characters (e.g. is the character an 'A' or a '9'?).

Note

Every keystroke on your keyboard is a character. If you type "3.14159" on the keyboard, the computer accepts it as a series of characters. If you want this to be a real, then you have to explicitely tell the computer to convert the string of characters into a number. If the computer asks you to input a number at the keyboard, your keystrokes will first be put into a string buffer and later your program will have to convert the string to a number.

If you have "3.14159" displayed on the screen and swipe it with your mouse and put it into a buffer, it will be in your buffer as a string of characters.

All normal input and output on a computer is characters and strings e.g. keyboard, screen, printer. (Some programs exchange data as binary, but you have to set that up.)

4.8. Primitive Data Type: Real Numbers

Note
The representation of real numbers take a bit of explaining. You don't need to understand how the are represented to use them (we'll do that later - look in ???)

"real" numbers (also called a "floating point" numbers) in computing are numbers like 3.14159 - anything with a decimal point. You do arithmetic on them.

>>> 3.0*6.0
18.0

You can mix integers and reals - the computer handles it for you, promoting the integer to a real.

>>> 3.0*6
18.0

Be careful how you mix integers and reals. The computer first evaluates (7/2), not knowing that it will next do an evaluation of a real.

>>> 7/2*5.0
15.0

A minor rearrangement of this code gives

>>> 5.0*7/2
17.5

Minor editing of this code makes a big difference in the output (one is correct and one is not). Code where a minor edit (like rearranging the order of a multiplication, which should not change the result) gives a different answer. Such code is called fragile code. Someone (maybe you), years later, could be working on your code and not see the code bomb and will rearrange the line to trigger the bomb.

You should practice safe programming. When mixing integers and reals, explicitely promote the integers to reals, and don't expect the computer to do it for you. Don't rely on rules of precedence too much. Use brackets to make code clear for the reader. This is how the code should be written.

>>> (5.0*7.0)/2.0
17.5
>>> (7.0/2.0)*5.0
17.5

4.9. Primitive Data Type: Strings

a string is a series of characters delineated by a pair of double or single quote characters; e.g. "happy birthday!", "1600 Pennsylvania Ave", "temperature=32F, pressure=29.90in", 'I am "late"'

In principle it's possible to operate on strings as arrays of characters, but strings are the dominant form of input and output on a computer (all computers and people can read them), so all languages have instructions to search, match, read and write strings.

In situations where enormous amounts of data are involved, which will never be read by a human and only ever read by another computer (mapping, photos, MRI data), then data is written in some compact binary (and likely platform specific) form. You'll still need a team of programmers to write the code to allow each new generation of computer to read and write that format.

Note
End Lesson 6

4.10. Is it a string or number?

Until you get used to the rules, and familiar with the error messages, you will have to check each time what you have. Some interconversions are done without asking and others have to be invoked explicitely.

Here's some operations on numbers

>>> variable=3.14159
>>> 3*variable		#since you can multiply, it must be a number, the integer is automatically promoted to real
9.4247699999999988

			#how does python know 3.14159 is a number and not a string?
			#see below for variable=3.14159q	

>>> print variable	#you can print numbers
3.14159

			#the + operator joins strings
	                #but you can't print a string and number at the same time using a +
			#the error message is helpful
>>> print "the number is " + variable
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot concatenate 'str' and 'float' objects

			#you have to turn the number into a string
>>> print "the number is " + repr(variable)
the number is 3.1415899999999999

			#you print numbers using a ','
>>> print "the number is", variable
the number is 3.14159

Here's some operations on strings

>>> variable="3.14159"
>>> 3*variable 
'3.141593.141593.14159'	#same string 3 times. 
	                #not real useful, and probably not what you want.
	                #no other language has this. 
	                #if you really need to do this, 
	                #use a construction common to all languages
	                #or no-one will be able to maintain your code.
			#
			#what would have happened if you'd done the following?
			#variable="3.14159 " 
			#or
			#variable="3.14159q"
			#or
			#variable=3.14159q	

>>> 3.0*variable	#the error message is not helpful
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: can't multiply sequence by non-int

>>> 3.0*float(variable) #float() converts a string to a real 
9.4247699999999988

>>> print variable      #you can print a string.
3.14159

Note the unhelpful error message resulting from the command 3.0*string. You should be prepared for error messages to be wrong and have to go to google to find out the real problem (don't expect the right answer to be available there either, it usually will be, but you'll still have to figure some of them out yourself). The interpreter knows that it has a string and not a number and could have told you. Unfortunately this is part of computing - it's always been this way. I wonder if the messages are designed to raise the cost of entry to being a programmer. The error messages from gcc, the GNU C compiler, have improved dramatically in the last 10yrs.

Note
One of the students commented that the error messages were like having life boats that don't work.
  • print out the product of the numbers 3.0 and 4.0 [50]
  • print out the product of the number 3 and the number represented by the string "4.0". [51]
  • print out the string "here is the result" followed by the + sign, followed by the product of the numbers 3.0 and 4.0. [52]

4.11. Other primitive data types

The data types described so far are found in most languages. Others are common, but on knowing these four, the new ones will be easy to use when we need them.

What primitive data types do we know now [53] ?

4.12. Review: primitive data types

  • how many different numbers can be represented on a 32 bit computer?
  • what instruction is necessary for a computer to be able to do operations on numbers whose width is arbitary multiples of width of the computer's registers (i.e. what instruction is needed to do 64 bit and 128bit operations on a 32 bit computer)? How does the instruction allow arbitary width operations?
  • a computer with 1 byte registers has the bit pattern FDh in a register. What decimal number does this register represent if the number is an unsigned int, a signed int?
  • What can you say about a signed int whose first digit is 1, 0?
  • you have a +ve signed int. What is the relationship between the -ve of the number and the complement?
  • what is the largest/smallest unsigned int, signed int you can represent on an 8 bit, 16 bit and 32 bit computer?
  • a 32 bit computer addresses memory byte wise (i.e. it reads or writes the whole byte as one unit). Since a 32 bit computer can address 232≅4G different items, how much memory (RAM) (in bits) can a 32 bit computer address?
  • a computer (depending on the disk size) addresses a disk in blocks of 512 bytes. How big a disk can a 32 bit computer address?
  • Regular consumer digital cameras use 8 bits to represent each of the red, green and blue channels. How many different colors can be represented with a consumer digital camera? Photo processing software (e.g. Photoshop) can work in 16 and 24bit mode.
    Note
    16 bit code for GIMP was written in the early days but not accepted by the developers (a major blunder in my estimation). The 16 bit code was used for other projects and has returned in filmGIMP, but not regular GIMP.
    How many colors are available in 16 bit mode? Who uses 16 bit colors, if that many levels aren't recordable by a camera? You can take several exposures of a high contrast scene (e.g. images with shade and full sun, or astronomy), where some part of the range of the image is exposed correctly while the rest is over or under exposed. You can then superimpose the photos, selecting by exposure at each pixel, to compress the exposure. In this way the brightly lit spots are reduced in exposure, while the shaded spots are increased in exposure (see high dynamic range imaging http://en.wikipedia.org/wiki/High_dynamic_range_imaging).
  • what is the result of the following operations in python (is true for almost all languages)
    7+2
    7-2
    7*2
    7/2
    7%2
    
  • how much storage is required for a char?
  • a byte of memory representing a char holds the bit pattern 75h. What char is it?
    Note
    You can look up an ascii table on the internet, or convert it at the bash prompt.
  • You hit the 7 key on your keyboard. The computer receives which of these
    • '7'
    • 7
  • What's the difference between the '7' (the char 7) and 7 (an int)?
  • what is the result of the following operations in python (is true for almost all languages)
    7.0+2
    7-2.0
    7*2.0
    7.0/2
    7%2.0
    
  • Binary operations of int and reals give the expected result (the int is promoted to a real). The following is a ternary operation and is fragile code. What result does it give? Make a guess as to what the coder probably was trying to do, show the correct code and answer.
    7/5*10.0
    
  • As a result of asking for user input you execute the following code
    user_input="4.0"
    
    write code that prints to the screen "the result is " followed by the product of user_input and the int 3 (one extra line of code is enough). Did you output a real or a string (you can do it either way, but make sure you know which you've output)? Try doing it the other way as well. See if you can do it in your head before doing it at the python prompt.

Answers [54]

5. Other Languages

The style of programming that Python does is called imperative: you tell the computer to do something, then you tell it to do something else ... and so on. The other style, in languages like Prologue, is called logic based, where you give the computer a bunch of rules and a bunch of data and say "go see if you can find anything new". Logic based languages are used to derive new mathematical proofs, or to build Expert Systems (http://en.wikipedia.org/wiki/Expert_system).

Since all computers do the same thing i.e. perform calculations, test and branch on conditions, iterate over a set of data; then the imperative languages will have instructions for these actions and they'll all look much the same (there's only so many ways of asking if x==0). As a result, most languages are pretty similar, despite the noise from the adherents of each language. If you can do something in one language, then you can likely do it in another language.

The main difference between languages is the gratuitous changes in syntax for any particular functionality. The only purpose of this is to make each language appear different (like selling cars). (It turns out there's a lot more ways of asking if (x==0) than you'd ever want to know about. I'd be happy for just one.)

Footnotes:


[1]

  • harddisks: started at 10MBytes (1985) and now are 500GBytes. Transfer rate is the rotational speed of the disk (4,500-10,000rpm) * the number of platter surfaces (1-5 or so) and hasn't changed much in that time. The density of bits has increased enormously in that time, accounting for a large increase in disk capacity, but not speed.
  • floppy disks: 1.44MBytes. floppy disks are not used much anymore and some computers come without them.
  • flash memory: These started out at several MBytes. Now flash sticks are 2-4GBytes for $40.
  • RAM: Started at 640kBytes at $1/kByte. Now computers have 64M-8GBytes at $1/10MBytes.

[2] 11012 = 1110

[3] 1510 = 11112

[4]

 1001
 0011+
 ----
    0
   1 carry
   00
  1  carry
  100
 ----
 1100

[5] alchemy, algebra, alkali

"al"="the". It's Arabic. During the European Dark Ages (about 600-1600AD), Europe reverted to a primitive, uneducated and illiterate society. The Arabic world at the time was active and progressive towards learning (chemistry, mathematics, astronomy) and kept alive knowledge from the Greek civilisation that would have been otherwise lost by translating manuscripts into Arabic. (The Greeks knew that the world was round and had a good estimate of its diameter. Europe didn't rediscover the roundness of the earth till modern scientific times.) The modern times of Europe began with the translation of Arabic texts into the European languages.

[6]

    1010
    0110x
    ----
    0000
   1010
  1010
 0000
 -------
 0111100
	 no carries
 -------
 0111100

[7]

    1100
    1001*
    ----
    1100
   0000
  0000
 1100
 -------
 1101100
	 no carries
 -------
 1101100

[8] 0,6

[9] A two bit computer has a number system with base=4. The number and its complement add to 4.

[10]

  0110	#decimal 6
  1010  #decimal 10
  ----
  0000

[11]

  1011	#decimal 11
  0101  #decimal  5
  ----
  0000

[12]

 1001
 0111
 ----
 0000

[13]

 1100
 0100
 ----
 0000

[14] 25610, 1000000002. 256 is the base of an 8 bit system.

[15]

problem
 01101101  minuend
 01100011- subtrahend
 --------

find complement
 01100011 subtrahend

 10011100 bit flipped subtrahend
	1+ add 1
 --------
 10011101 complement of subtrahend

addition of complement
 01101101  minuend
 10011101+ complement of subtrahend
 --------
 00001010  difference

 decimal 109-99 becomes 109+157=266
 266 becomes 10 after overflowing by 256. 

[16]

problem 
 10110110  minuend
 10001111- subtrahend
 --------

find complement of subtrahend
 01110000 bit flipped subtrahend
	1+ add 1
 --------
 01110001 complement of subtrahend

do addition of complement
 10110110  minuend
 01110001+ complement of subtrahend
 --------
 00100111  difference


 decimal 182-143 becomes 182+113=295
 295 becomes 39 after overflowing by 256. 

[17] output is what base? input it what base?

echo "obase=2; 32" | bc
100000

[18]

echo "ibase=2; 101" | bc
5

[19] there's less chance for mistakes if you explicitly set obase and ibase whenever you're doing conversions. The time lost debugging a coding error is always more than the time it took to for the extra typing. As well the code is easier for someone else to read - if you allow a default setting, then the reader doesn't know whether you really wanted the default, or if you just forgot to say set obase.

echo "obase=10;ibase=2; 1001*10" | bc
18

[20]

echo "obase=2;ibase=10; 32+6" | bc
100110

[21] looking up the list at the start of the section; 1010=Ah.

using bc

echo "obase=16;ibase=10; 10" | bc
A

[22] looking up the list above; 10112 = 1110 = bh

using bc

echo "obase=16;ibase=2; 1011" | bc
B

[23]


hex addition table                       carry table
+ | 0 1 2 3 4 5 6 7 8 9 A B C D E F      | 0 1 2 3 4 5 6 7 8 9 A B C D E F
-----------------------------------    
0 | 0 1 2 3 4 5 6 7 8 9 A B C D E F     0| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 | 1 2 3 4 5 6 7 8 9 A B C D E F 0     1| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2 | 2 3 4 5 6 7 8 9 A B C D E F 0 1     2| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
3 | 3 4 5 6 7 8 9 A B C D E F 0 1 2     3| 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
4 | 4 5 6 7 8 9 A B C D E F 0 1 2 3     4| 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
5 | 5 6 7 8 9 A B C D E F 0 1 2 3 4     5| 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
6 | 6 7 8 9 A B C D E F 0 1 2 3 4 5     6| 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
7 | 7 8 9 A B C D E F 0 1 2 3 4 5 6     7| 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1
8 | 8 9 A B C D E F 0 1 2 3 4 5 6 7     8| 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
9 | 9 A B C D E F 0 1 2 3 4 5 6 7 8     9| 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
A | A B C D E F 0 1 2 3 4 5 6 7 8 9     A| 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
B | B C D E F 0 1 2 3 4 5 6 7 8 9 A     B| 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
C | C D E F 0 1 2 3 4 5 6 7 8 9 A B     C| 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
D | D E F 0 1 2 3 4 5 6 7 8 9 A B C     D| 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
E | E F 0 1 2 3 4 5 6 7 8 9 A B C D     E| 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
F | F 0 1 2 3 4 5 6 7 8 9 A B C D E     F| 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

this took a bit of thinking by the students. Make sure they do the whole table. A common mistake was to fill the bottom right triangle with "0"s. Some students didn't know the result of "F+1" (G?). They also didn't understand that the carry for the bottom right triange was 1, so I added the carry table to the question.

[24] E

[25] 1E

[26] 1 byte = 2 hexadecimal numbers, so the answer will have 2 hexadecimal numbers and any carry beyond that will be lost.

using the list above, along with standard addition rules;

 cd
 0e+
 --
  b
 1  carry
 --
 db

Note the "0" was irrelevent; you get the same answer if you add "e" instead of "0e".

Note there was no carry beyond the byte boundary.

using bc (which requires hex to be upper case)

echo "obase=16;ibase=16; CD+E" | bc
DB

[27] F, A, CF

The trivial way of doing this, of finding the number that you need to add to get 0 works well for single digit numbers: F+1=0, so F and 1 are complements; 6+A=0 so A and 6 are complements.

However this doesn't work for the situations when you really need it; for multidigit numbers. You should instead count in from the highest number here F. Here's how it works for 3A.

A: count in A steps from F: EDCBA98765 - ans: 5
3: count in 3 steps from F: EDC        - ans: C

3A - subtrahend
C5 - bit flipped version of 3A 
 1+
--
C6 - complement of 3A

checking
3A
C6+
--
00 (with overflow)

[28] using bc
echo "obase=16; ibase=16; FD01-EF56" | bc
DAB

subtracting using the complement

FD01
EF56-
----

Find hex complement of EF56
step 1: flip bits 0<->F, 1<->E, 2<->D...
10A9
step 2: add 1
10A9
   1+
----
10AA

now add the complement
FD01
10AA
---- 
0DAB

[29]

  • binary digit
  • binary, octal, decimal and hexadecimal
    Note
    The octal number system hasn't been used since the '70's (thank goodness; it's been replace by hex). You have to know that this number system exist, as many utilities still accept octal input and unless you take precautions, your input will be interpreted in octal. :-(
  • 2,7,8,11 decimal -> 10,111,1000, 1011 binary; 0x2 (or 2H, or 2h), 7h, 8h, Bh.
  • 4 bits. A byte can represent 256 different numbers (e.g. -128 to 127, 0-255, 256 reals of any value you choose).
  • 32 bits, 64 bits (in 2009, home PCs are available with 64 bit words).
  • 0101
    1101+
    ----
    0010
    

    5+13=2

    Note
    a bit was lost from overflow
  • the same.
  • cheap
  • 00000101
    00001101*
    ----
    00000101
    00010100
    00101000+
    --------
    01000001
    

    5*13=65

  • 2 byte registers. Integer multiplication of arbitary sized integers will give answers whose width is twice that of the registers. In practice you never need to do this; you'll use reals, or if you really need double width integers, you'll use special software routines.
  • Add the 10's complement of the number.

    8
    7-
    -
    Find the 10's complement of 7.
    10's complement of 7 is the 7th number down, starting at 9:
    9,8,7,6,5,4,3. 10's complement of 7 is 3. 
    Add the 10's complement of 7
    
    8
    3+
    --
    1
    

    Add the 10's complement of 4

    3
    4-
    -
    The 10's complement of 4 is 6 (count in 4 digits, starting at 9)
    Add the 10's complement of 4
    
    3
    6+
    --
    9
    

    There are no -ve numbers on a single digit decimal computer. Due to overflow, 9 is the same as -1. The computer has to be told whether to interpret the number as positive (in which case it's 9) or negative (in which case it's -1).

  • 1001
    0110-
    ----
    
    calculating 2's complement 
    0110 original number
    1001 invert digits
    1010 add 1
    
    The subtraction becomes
    1001
    1010+
    ----
    0011
    

    Note
    in decimal, 9-6 becomes 9+10. On a 4 bit computer 9+10=3
  • # echo "obase=10;ibase=2;0110"| bc 
    6
    # echo "ibase=2;obase=10;0110"| bc
    110
    

  • 1310=dH
  • 11102=eH
  • efH=111011112=23910
  • a0H+77H=17H
    Note
    one digit is lost on overflow
  • ed01
    ae80-
    ----
    
    1.	#in your head
    
       1	#no carry
      8     #carry
     e	#d-f, carry
    3	#e-b
    3e81	#
    
    
    2.	#16's complement method
    
    517f	#flip bits of the subtrahend ae80 (f<->0,e<->1,d<->2,c<->3,b<->4,a<->5,...)
       1+	#add 1
    ----
    5180	#16's complement
    
    ed01
    5180+	#add 16's complement
    ----
    3e81
    
    
    3. bc
    
    #bc requires U.C. for hex input
    # echo "obase=16;ibase=16;ED01-AE80" | bc
    3E81
    

[30] the complement

[31] 11111110, FEh

[32] 01 Jan 1970 was the start of a decade (not particularly useful), year and month. Unfortunately it was a thursday (not the start of a week). Unix programmers, who have to tabulate data by the week on machines with Unix time, have to subtract out the extra days to thursday. A better year would have been 1967 (or 1950 if you wanted to start on a decade).

[33] 232 = 4294967296 = 4G

You can choose these 4G of different integers to represent +ve numbers of 0-4294967296, or +/-ve numbers in the range -/+2147483647.

[34] 4G of different memory sddresses, 4G * 1byte = 4Gbyte of memory = 32Gbits

Note
no-one measures memory in bits. This was just a question to make you think.

[35] You could do your addressing in 32-bit (4 byte) blocks. This would increase the amount of addressable memory to 16G.

[36] 4Gblocks * 512bytes = 2TBytes. (512bytes is 0.5kbyte)

Harddisks are approaching this limit (you can now - Jan 2008 - buy a 1Tbyte harddisk).

[37] Make the blocksize bigger. Harddisks haven't used blocks of 512bytes for quite a while now. Current hard disks use blocks of 4096 (or 8192) bytes. This means that a file of size 1byte occupies 4096 bytes on the disk and that all files occupy multiples of 4096 bytes, no matter what the actual file size. If your disk has many small files, then a lot of space on the disk will be unused (this is similar to the problem of representing a char with 32 bits as mentioned above). This is just life and is the price of having a large disk. Noone really cares about the lost space, since harddisks are cheap, especially compared to the cost of a programmer.

[38] 4294967296

[39]

echo "obase=10;ibase=16; FFFFFFFF" | bc
4294967295 (in cents)

Their wealth is $42,949,672.95

[40] 256

[41] 4096

[42] 8 bits in each of 3 colors is 2563=16777216

[43]

4G. You'd have them run from 0 to -4294967296. Noone has had a need for 4G of -ve numbers, so it hasn't been done yet, but if you needed them, then you could do it.

[44]

4G. You store them in an array on a disk. Since most of them would be quite long, you'd need a large harddisk.

[45]
>>> 65536/2	#I don't remember 8000 hex in decimal
32768
>>> _		#_ means recall the last output (up arrow recalls the last command)
32768
>>> _ * -1	#make it negative
-32768
>>> _ *65536	#-80000000 is a plain integer
-2147483648
>>> _ -1	#-80000001 is Long	
-2147483649L
>>> 
? The negative most plain integer is -80000000h, 10000000000000002, -214748364810. Any number more negative is a Long.

[46]

It's the next character

echo $'\x42'
B

[47]

10h or 1610 (this is 'Q').

Note
One student had problems with counting 1610 places down the alphabet. Did he count the 'A' as letter 1 or letter 0. I said it didn't matter as long as he stepped 16 letters down the alphabet. He got to 'P'. I didn't want to spend any time on the fencepost problem and left it at that (see Off-by-one error. http://en.wikipedia.org/wiki/Off-by-one_error).

[48]

'Z' is 2510 letters away from 'A'. If you changed 41 hex to 51 hex, then you're 1610 letters along the alphabet. You need to move another 9 positions along the alphabet. Using hex addition 51h+9h=5Ah

Note
Some students said 51hex+9h=60h and were perplexed by the result. Students who marched down the alphabet and got to 59h='Y' were surprised to find that their next try of 60h gave an '`'
echo $'\x5A'
Z

[49]

1000002 = 25 = 2*24 = 20hex. Flipping the 6th bit is equivalent to adding/subtracting 20h.

[50]

>>> print 3.0*4.0 
12.0

[51]

>>> print 3*float("4.0")
12.0

[52]

>>> print "here is the result " + repr(3.0*4.0)
here is the result 12.0

[53]

There are 4: integer, real, char, string.

[54]

  • 232≅4G
  • add with carry (often called ADC). The instruction saves the overflow bit (usually in a FLAG) (it does other things as well, but it must save the overflow bit).
  • unsigned int FDh=25310. signed int FDh=-310
  • signed int: if 1st digit is 1, is -ve; if 1st digit is 0 is +ve (or 0) (i.e. not -ve).
  • the complement and the -ve are the same. Both when added to the original number give 0.
  • see integer_range. You need to remember these numbers, or at least recognise the ones upto 32 bits.
  • 232bytes=232+4bits≅64Gbits
  • 232blocks≅512*4Gbytes≅2TBytes.
  • 8 bit sensors, for each of 3 colors, allow for 224=16777216 colors (a number you'll need to remember if you get into image processing). 16 bit image manipulation allows 248=280Tcolors.
  • 7+2 = 9
    7-2 = 5
    7*2 = 14
    7/2 = 3
    7%2 = 1
    

    7/2 was the only real question here. The other questions were padding.

  • 1 byte
  • # echo $'\x75'
    u
    
  • '7' (the char 7)
  • a char is a glyph of arbitary shape which can be entered by hitting the keyboard, or displayed on a screen. The char is assigned a meaning by convention e.g. the char 'a' is the first letter in the alphabet. An int is a counting number. The int 7 represents seven objects. Seven objects can be represented by 1112,710,7h or '7'.
    Note
    No-one is going to get legalistic about this and ask for a cast iron definition (presumably there is one but I don't know it). You just have to be clear that 01112 and '7' are different, even though there's a convention linking them.
  • In binary operations with int and real, the ints are promoted to reals.

    7.0+2 = 9.0
    7-2.0 = 5.0
    7*2.0 = 14.0
    7.0/2 = 3.5
    7%2.0 = 1.0
    
  • Answer for code as given is 10.0. The mostly likely correct code and answer is
    (7.0/5.0)*10.0
    14
    
    another possibility is
    7.0/(5.0*10.0)
    0.14
    
    The original coder should be kicked in the butt.
  • This outputs a real.

    >>> user_input="4.0"
    >>> print "the result is ", (3*float(user_input))
    the result is  12.0
    

    This outputs a string (note the '+' which concatenates strings in python).

    >>> user_input="4.0"
    >>> print "the result is " + repr(3*float(user_input))
    the result is 12.0
    

AustinTek homepage | Linux Virtual Server Links | AZ_PROJ map server |