BigNums

Volume Number: 8
Issue Number: 3
Column Tag: Lisp Listener
Arbitrarily Large Bignums
Three easy substrates
By André van Meulebrouck, Chatsworth, California
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
"If we could count the stars, we should not weep before them." (George Santayana)
Introduction
In LISP, numbers of type fixnum are the analog to type int in conventional languages. In MacScheme, a fixnum is represented in 30 bits as a two's complement number. (The top 2 bits are used for tag information, leaving 30 bits to play with.) Thus, the highest number representable as a fixnum is 536,870,911. If bigger numbers are needed (e.g., to calculate the national debt), bignums must be used.
Bignums are integers that can be arbitrarily large, being limited only by available memory. Herein, bignums will be represented as dynamic lists.
Scheme implementations are not required to support bignums [Rees et al., 1986]; however, MacScheme provides them.
Nonetheless, we will implement them herein, both as an exploration of algorithms and LISP coding, and to implement them in a more general fashion.
Our specific goal will be to compute the factorial of 120. Beyond that, the rest is left to the reader.
Here is what the desired example looks like using MacScheme's bignums (which one gets automatically when a number of type fixnum exceeds its size boundary).
MacScheme Top Level
>>> (define fact
(lambda (n)
(if (zero? n)
1
(* n (fact (1- n))))))
fact
>>> (fact 120)
6689502913449127057588118054090372586752746333138029810295671352301633557244962989366874165271984981308157637893214090552534408589408121859898481114389650005964960521256960000000000000000000000000000
An aside: a function called time-it will be used later to determine a function's execution time. In the case of fact (above), it took about 0.1 seconds on a Macintosh IIcx with 8 megabytes of RAM in MacScheme+Toolsmith version 2.0.
An Approach to the Problem
Since our bignums will be dynamic lists, and dynamic lists come for free in LISP, we'll just use LISP lists. For example, we'll represent 123 like this: (1 2 3). At least as far as the user-level digit (power) ordering is concerned. But is that the best digit ordering for internal purposes? LISP is good at dealing with the head of a list, but not so good at dealing with the end of a list, and since numbers can be any length, when adding (1 2 3) to (9 1 3 5 2 1 5 3 6), what internal digit ordering would be most desirable? How about having the car (head) of each list represent the lowest digit (the digit times 10 to the 0th power)? That way, the cars of any two lists, regardless of the sizes involved, would always represent digit positions of the same power, and by cdring (cdr is a function that returns the rest of a list after the first element has been excluded) through them we could get to the higher digit places in perfect synchrony. (Note also that this choice of digit ordering obviates the need to normalize bignums before and after arithmetic operations.)
So, the first thing needed is a routine to convert from the user's digit ordering to our chosen internal digit ordering: normalize. To convert back to the user's digit ordering: unnormalize. (Note that the definitions of normalize and unnormalize will turn out to be nothing more than Scheme's reverse function, which reverses the elements of a list.)
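Since both conversions reduce to list reversal, they can each be written in one line; a minimal sketch:

```scheme
;; Internal order puts the ones digit at the head of the list, so
;; converting between the user's ordering and ours is just reversal.
(define normalize reverse)
(define unnormalize reverse)

(normalize '(1 2 3))    ; the user's 123 becomes the internal (3 2 1)
```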
Substrate One
We will be building up the routines we need by building up substrates. Let's start on the first substrate and see how many more will be needed once it is in place. This will be called the big substrate.
It might prove a good tack to start by implementing only unsigned bignums, because even after other substrates are made, it could be that some functions consuming these substrates will want simply unsigned bignums, and thus can call directly into the lower, and presumably faster, first substrate. More importantly, it seems theoretically pleasing, since many formal systems, such as primitive recursion, have only non-negative integers. In pure λ-calculus there are no numbers whatsoever; however, they can be bootstrapped, and when they are, it's the nonnegative integers that get derived first. So, let's make the first substrate handle only unsigned bignums.
Let's say we want to be able to represent numbers in bases other than 10. Let's say base 16,384 (why that's such a magical number will be explained later). The highest digit we can have in base 16,384 is 16,383! Yikes! We need a way to delimit digits which require more than one token. I chose angle brackets. Below is the factorial of 120 in base 16,384 (excerpted from the test suite). Note how the trailing zeros (which represent 8 digit positions) are not in angle brackets because they don't need to be.
>>>
;
(show-big foo)
5<9718><3586><10713><1404><3426><4947><9968><4456><15225><11647><7568><1257><7813><16381><15446><15340><6446><7087><1518><3762><12424><6353><12398><3716><16165><14012><15018><6126><504><12001><15793><3811><4956><11758><6872><658><228><6753><12016>00000000
#t
In the case of base 16, I kept with the convention of using ASCII characters: 10 = a, 11 = b, ..., 15 = f. This scheme is continued until the digits get so large that the set of reasonable ASCII characters we have to draw on gets exhausted.
The interesting thing about representing numbers in different bases is that the higher the base, the faster the computations can be done. However, to convert from a higher base to a lower one (as is done in this article's code) takes time, so there are tradeoffs. Also, if these substrates were used to make big floats (arbitrarily accurate floating point numbers), we could make good use of different bases: it is sometimes desirable to represent certain numbers in one base versus another to avoid infinite division problems (e.g., 1/3 when floated gives a repeating digit in base 10 but not in base 3).
The routine to make bignums in the first substrate is make-big, which normalizes its argument and optionally converts it from any base to any other base. Conversely, bignums can be displayed in human-readable form using show-big (as in the above example).
The Code
A note on style: In keeping with Scheme tradition, all pure predicates (functions which answer a question by returning canonical true: #t, or canonical false: #f) end in an interrogative mark (e.g., no-digits?). Impure predicates (functions which answer a question with other than #t or #f) I chose to end in double interrogatives (e.g., base-10??).
The program's data structuring was done abstractly. Abstract data structuring is a technique for making the reading, writing, and modifying of code easier, especially in big, complicated software systems. Reading one's own code is like a dog eating its own vomit. Reading someone else's code is like a dog eating another dog's vomit (grodymax!). Consider a program to keep track of widgets. Let's say you keep track of widgets by putting them in an array. If every time you access a widget you make an explicit reference to the array that houses them, modifying and understanding the code could be difficult later, because the understanding and conceptualization of data structures is spaghetti-coded in with the rest of the program. That means if you wanted to change data structures, you'd have to do major surgery on the program and have a more intimate understanding of the code than you'd probably want to. To ameliorate such difficulties, one can (here's that ever so important word again) abstract out the data structuring from the rest of the program code! For instance, one could write a function make-widget-data-structure, and create accessors for it like get-widget. If all references to data structures go through such abstract functions, modifying the data structuring is much easier. Not only that, but writing code in the first place is easier as well. For a good discussion of abstract data structuring, see [Abelson et al., 1985].
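A tiny illustration of the idea (the widget names here are hypothetical, not from the article's code):

```scheme
;; Constructor and accessors hide the concrete representation (a list here).
(define (make-widget name cost) (list name cost))
(define (widget-name w) (car w))
(define (widget-cost w) (cadr w))

;; Callers never touch car/cadr directly, so changing the representation
;; (say, to a vector) means rewriting only the three definitions above.
(define w (make-widget 'sprocket 42))
(widget-cost w)    ; => 42
```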
No macros were used because I think macros represent weaknesses in current languages that need to be fixed, and they cause problems: If you update a macro, you must recompile all consumers of that macro to get the effect of the change, which introduces "order matters" issues that I find unpalatable. Also, macros aren't first class (which means you can't return them, nor pass them as arguments, which of course means you can't map them, etc.). Furthermore, they don't show up in the debugger (since they are gone after compilation). Nonetheless, if you convert the data constructors and accessors (etc.) to macros, the timings will show goodly speed improvements.
Speed improvements could also be had in the code by getting rid of defaulted parameters at the big and bignum substrates. These are redundant and are only in the code to allow users to more conveniently explore the different substrates. Also, remove-leading-zeros gets called redundantly; perhaps there is some way to weed out some of that redundancy, perhaps not (left to the Department of Redundancy Department; I opted for correctness over efficiency).
A note on running the code of this article. The code resides in three files: bignums-big.sch, bignums-bignum.sch, and bignums-object.sch. The file bignums.sch contains a loader for the code that is akin to a sort of primitive defsystem (MacScheme doesn't provide a defsystem feature, but on systems that do provide it, it's analogous to Unix's make facility). In the file bignums.sch, tweak the argument to the function load-bignum-system to the pathname where you placed the three files of code on your Macintosh (i.e., what folder you placed them in). For me, this was:
(load-bignum-system
 "Mr.Disk:Applications:Word:Writing:MT/bignums:")
Then, evaluate all the forms in your tweaked version of the file bignums.sch. Next, you will need to open the file bignum-tests.sch, be sure "Copy to transcript" on the Command menu is checked, then select "Eval Window" from the Command menu, quickly selecting the transcript window with the mouse pointer in order to watch the test suites run.
Addition
(Addend + augend = sum)
To perform addition is simple: Just add the car of one bignum to the car of another, then repeat the process on the cdrs of the respective lists. The only glitch is handling carries. But what exactly is a carry? When two digits are added, if the result is greater than or equal to 10, we have to pass along that overage when we call our add routine on the cdrs of the lists so that the next digits added can add it in.
How do we find out what the overage (carry) is? The overage for us will look like yx, where y represents the tens digit and x represents the ones digit. More formally: y * 10^1 + x * 10^0, or more simply stated: y * 10 + x. With the result so represented, our job is clear: We want to peel off y (since y is the overage). How do we do it?
A digression that will answer that question is in order here. Since we are talking about base conversions, we may as well look at the algorithm for converting bases. Basically, we just divide by the base continually, collecting all the remainders. Here's how 15 would be converted to base 3.
Figure 1
The quotient from the first division becomes the dividend of the next division. When we reach a dividend that is less than the base, we stop. 15 in base 3 is 120. If we cons each remainder, as we get it, into the recursive calls of the base conversion routine, we would collect: (0 2 1) which is nothing more than our internal representation for 120 (how convenient!).
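In Scheme, the divide-and-collect loop can be sketched like this (a sketch of my own; the article's actual routine is fix-base-10->big-base-n, whose argument conventions may differ):

```scheme
;; Convert a nonnegative fixnum to a list of digits in the given base,
;; lowest digit first (the internal ordering described above).
(define (fix->digits n base)
  (if (< n base)
      (list n)                              ; final quotient: stop
      (cons (remainder n base)              ; collect this remainder...
            (fix->digits (quotient n base)  ; ...and keep dividing
                         base))))

(fix->digits 15 3)    ; => (0 2 1), i.e. 120 in base 3
```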
So, you can see that a base conversion routine could allow us to break out digit positions exactly as we need. We simply call such a routine with the result from the current digit position. The routine used for this purpose is fix-base-10->big-base-n. (Question for the reader: Could we instead employ a routine that treats numbers as token strings, and simply pull off the overage using string or symbol manipulation rather than arithmetic? Could we employ such a scheme for base 10? How about other bases?)
>>> (fix-base-10->big-base-n 32 10)
(2 3)
To get the result that should go in the current digit position, we apply car to the result from fix-base-10->big-base-n, and pass the cdr of it to the next recursive call of the digit adder routine.
(Note: if you pick a base bigger than *biggest-base*, you'll be using MacScheme's bignums! (And what do you suppose would happen if MacScheme didn't have bignums of its own?) *biggest-base* was chosen to ensure that all digit-by-digit intermediate operations done to implement add, mul, div, and sub have closure within the realm of MacScheme fixnums (i.e., produce numbers which are small enough to be representable as MacScheme fixnums). The routine find-biggest-base determines what the biggest base you can use is. To express numbers in bases bigger than *biggest-base*, you would need to make a meta substrate which consumes the three substrates herein to do its intermediate operations!)
The numerical arguments to big-add might be lists of different lengths. I chose nested if tests, testing for the end of the first list, then the other, because that most accurately reflects my thinking while coding the algorithm.
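A hedged sketch of that addition walk (the names are mine; the article's big-add peels the carry off with fix-base-10->big-base-n, whereas this sketch uses quotient and remainder directly):

```scheme
;; Add two internal-order digit lists in the given base; carry starts at 0.
(define (digits-add a b base carry)
  (cond ((and (null? a) (null? b))
         (if (zero? carry) '() (list carry)))       ; leftover carry: new digit
        ((null? a) (digits-add '(0) b base carry))  ; pad the shorter list
        ((null? b) (digits-add a '(0) base carry))
        (else
         (let ((sum (+ (car a) (car b) carry)))
           (cons (remainder sum base)               ; digit for this position
                 (digits-add (cdr a) (cdr b) base
                             (quotient sum base)))))))  ; pass the overage along

(digits-add '(3 2 1) '(9 8) 10 0)    ; 123 + 89 = 212, i.e. (2 1 2)
```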
Getting Carried Away
A concern: Note that big-add makes the assumption that there will never be more overage than one digit's worth when two digits are added. In other words, the following assumption is implicit: (<=? (length new-rem) 2). Is this a valid assumption for all cases in all bases? It would seem okay for base 10: 9 + 9 = 18, which requires only two digits. Still, it would be nice to prove this assumption for the general case. Basically, we need to show that b - 1 (which is the biggest digit we can get in any base) plus b - 1 is <= (b - 1) * b + b - 1 (which is the biggest quantity we can get in two digits' worth in any base).
b - 1 + b - 1 <=? (b - 1) * b + b - 1
2*b - 2 <=? b^2 - b + b - 1
2*b - 2 <=? b^2 - 1
2*b <=? b^2 + 1
b^2 - 2*b + 1 >=? 0
(b - 1) * (b - 1) >=? 0
(b - 1) * (b - 1) >= 0 for any integer b (QED)
Indeed, the statement is true for any integer b. Since we're only interested in positive, non-zero bases, and since any non-zero positive integer makes the inequality true, we're thoroughly safe. In other words, we can rest assured that adding any two digits in any base will give only results that can be represented in two digits.
While we're at it, let's do the same proof for multiplication, wherein we need to prove that (b - 1) * (b - 1) <= (b - 1) * b + b - 1. This seems likely for base 10, since 9 * 9 = 81, which is well less than 99 (99 being the biggest number we can represent in two digits in base 10).
(b - 1) * (b - 1) <=? (b - 1) * b + b - 1
b^2 - 2*b + 1 <=? b^2 - 1
-2*b + 1 <=? -1
-2*b <=? -2
b >=? 1
b >= 1 for any counting number (QED)
So for multiplication, as long as the base is greater than or equal to one, we're okay! (Exercise for Zippy: Explain base one!)
Subtraction
(Minuend - subtrahend = difference)
Now we consider subtraction, which you might expect to be much harder. We implemented addition in big-add the same way we do it by hand. Lets look at how we humans do subtraction, then see if we can code a computer algorithm the same way.
When the minuend digit is greater than or equal to the subtrahend digit, the difference is trivial to compute because there are no borrows to worry about.
When the minuend digit is less than the subtrahend digit, we have to borrow. This is where coding an algorithm could get hairy. Consider 1000000 - 1. Most people borrow through all the zero digits of the minuend. Yuk! This is contrary to LISP's strength at dealing with the head of a list, because it might require us to look ahead quite a ways in a given list if we need to do a borrow.
A trick: Optimized Subtraction
Let's look at an example in human readable (as opposed to our internal) form. In the case of 1000000 - 1, we would want to convert (1 0 0 0 0 0 0) to (0 9 9 9 9 9 10). But if you think about it, a borrow from the minuend is the same as a carry added to the subtrahend. So, the conversion from (1 0 0 0 0 0 0) to (0 9 9 9 9 9 10) can be simulated, with the same effect, by converting the minuend to this instead: (1 0 0 0 0 0 10), as long as we convert the subtrahend from (1) to (1 1)! However, after the first digit subtraction, we have to borrow again. No problem; we just repeat the process every time we need to borrow. The advantage to doing the borrow this way is that it fits the LISP style of car and cdr, and it allows us to get everything done in one trip through the list with no look-aheads whatsoever. Furthermore, it's much simpler: not much more difficult than the carry was for the add algorithm. In fact, not only is it easier for LISP, it's also easier for humans! Not only do you not have to look ahead more than one digit for a borrow, but you're never borrowing anything other than 10. (In the customary method, when borrowing, you are sometimes borrowing 10, and sometimes 9.) These schemes are mathematically equivalent. In what follows, consider c to be the minuend digit from which a borrow was taken, and d to be the subtrahend digit.
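A sketch of the trick in Scheme (my own hedged version of the idea; like the article's first-substrate subtraction, it assumes the minuend is at least as large as the subtrahend):

```scheme
;; Subtract internal-order digit lists (a - b) in base 10, assuming a >= b.
;; Instead of borrowing from later minuend digits, we carry a 1 into the
;; next subtrahend digit, exactly as described above.
(define (digits-sub a b borrow)
  (if (null? a)
      '()
      (let ((x (car a))
            (y (+ borrow (if (null? b) 0 (car b))))
            (rest (if (null? b) '() (cdr b))))
        (if (>= x y)
            (cons (- x y) (digits-sub (cdr a) rest 0))             ; no borrow
            (cons (+ (- 10 y) x) (digits-sub (cdr a) rest 1))))))  ; borrow

;; 1000000 - 1 in one pass, no look-ahead (remove-leading-zeros would
;; strip the final zero from the result).
(digits-sub '(0 0 0 0 0 0 1) '(1) 0)    ; => (9 9 9 9 9 9 0)
```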
(c - 1) - d (Given. i.e., how most people do it.)
c + (-1 - d) (Associative Law.)
c + (-d - 1) (Commutative Law.)
c - (d + 1) (Distributive Law.)
I was introduced to this trick by a CU professor [Haddon, 1982] who presented it as a trick for humans doing subtraction, especially in other bases such as 16, 8, and 2. He also presented a second optimization, and although it doesn't matter for our implementation of bignums herein, I will show it anyway.
A trick for humans: Optimization Two
After a borrow, subtract from 10 before adding. More tutorially: After a borrow, most people add the borrowed quantity to the minuend digit that needed the borrow. Then the subtrahend digit is subtracted from the aforementioned sum. Let's say that c is the minuend digit that needed the borrow, and d is the subtrahend digit. We just borrowed a 10. Normally, people do this: (10 + c) - d, and I'm suggesting this instead: (10 - d) + c. These are mathematically the same.
(10 + c) - d (Given.)
10 + (c - d) (Associative Law.)
10 + (-d + c) (Commutative Law.)
(10 - d) + c (Associative Law.)
It is customary to double check one's answer by adding the difference one obtains to the subtrahend to see if their sum is equal to the minuend of the subtraction problem. Note what happens when we try that after doing subtraction in the optimized style: The tick marks from the borrows (of the subtraction) turn out to be right where the carries need to be! In fact, one can do the double check in one's head without actually writing anything! Another nice creature feature: no digits got stroked through! In the usual style of subtraction, if a digit gets borrowed from, it gets stroked through and one less than the digit gets penciled in above it.
Figure 2
Figure 3
Above: The same subtraction problem done using the customary method. Note that in the optimized form of subtraction, both the minuend and the subtrahend could get numbers penciled in above digits, but the only penciled in digits possible are 10 and 1. In the customary method, only the minuend can get penciled in numbers above a digit, and possible penciled in numbers can span a range of different values (Question to the reader: What is the possible range?). Worse yet, the penciled in numbers can get modified, as in the case of the 3 in the minuend which first became 2 then 12.
Rationale for subtracting via the optimized method: (10 - d) + c is much easier for humans to compute than (10 + c) - d. This is the case in any base, but most especially in higher numbered bases like base 16 and beyond. Here's why: One has to memorize fewer subtractions. Let's compute exactly how many operations are required in base 10 for both methods of subtraction.
The first case is where no borrow is required. Let's omit the cases n - 0 = n and n - n = 0, since they are degenerate cases which can be computed trivially by viewing them as special cases.
If we have a 9 in the minuend digit, we have to know how to subtract 8, 7, ..., 2, 1 from it. That's a total of 8 subtractions. If we have an 8, we have to know how to subtract 7, 6, ..., 2, 1 from it. That's a total of 7 subtractions. In the case of 2, we can only subtract 1 from it, and in the case of 1, we've arrived at the degenerate case we're not counting. As you can see, the sum of all the subtractions we have to memorize is 8 + 7 + ... + 1. Reversing that, we are just summing the first n counting numbers, wherein n = 8. There is a formula for computing this: n * (n + 1) / 2. (The derivation of the formula is quite cute, but that would be too much of a diversion.)
So, for n = 8, we have 8 * 9 / 2 = 36, which means we'd have to memorize 36 cases when subtracting with no borrows.
When there is a borrow, using the usual style of subtraction: 9 never causes a borrow, 8 will if the subtrahend digit is 9 (thus the subtraction needed here is 18 - 9 = 9), 7 will if the subtrahend digit is 8 or 9, 0 will if the subtrahend digit is 1 through 9. The pattern is a forward summing of the integers from 1 to 9, so 9 * 10 / 2 = 45.
When there is a borrow using the optimized style of subtraction, one need only know how to subtract from 10: 10 - 9, 10 - 8, ..., 10 - 1. A total of 9 subtractions. After doing the subtraction from 10, one has to do a trivial addition wherein c + d <= 10 - 1. This is truly a trivial addition when compared with what one must know for general addition: c + d <= (10 - 1) + (10 - 1) = 2 * 10 - 2, which is certainly more additions than are required in c + d <= 10 - 1.
To compare: The conventional style requires 36 + 45 = 81 memorized subtractions, whereas the optimized style requires only 36 + 9 = 45. 81 versus 45? 45 + 45 * x = 81; 45 * x = 81 - 45; x = (81 - 45) / 45 = .8, which means the conventional style requires one to memorize 80 percent more subtractions! Holy cow, no wonder Zippy can't subtract!
Other concerns when subtracting.
Note that leading zeros can arise during subtraction. These could be ignored, but accumulating too many could be inefficient, besides which everyone likes to see bignums without leading zeros. Therefore, the function remove-leading-zeros is used.
Removing leading zeros forces another interesting issue: what is the representation for zero? The logical choice is ( ), the empty list; this makes things work out nicely when recursing.
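One plausible definition (the article ships its own remove-leading-zeros; this is just a sketch). Remember that in internal order the user's leading zeros sit at the tail of the list:

```scheme
;; Strip zeros from the high end of an internal-order digit list.
;; An all-zero number collapses to ( ), our representation for zero.
(define (remove-leading-zeros n)
  (if (null? n)
      '()
      (let ((rest (remove-leading-zeros (cdr n))))
        (if (and (null? rest) (zero? (car n)))
            '()                       ; a zero with nothing above it: drop it
            (cons (car n) rest)))))

(remove-leading-zeros '(3 2 1 0 0))    ; => (3 2 1)
(remove-leading-zeros '(0 0 0))        ; => ( )
```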
By the way, list numbers can be used in conjunction with these bignums to allow numberless bignums (in which case bignums would be lists of lists but showing that is beyond my game plan for this article).
Numbers Love to be Complemented
There are different ways of doing subtraction, based on different representations of negative numbers. The most popular of these are the complement schemes. I rejected the use of complement schemes herein because they rely on static limits, which would have added some complexities; specifically, numbers of different lengths would have had to be sign extended in some manner.
Deriving complements from optimized subtraction
Let me motivate two complement systems by showing how they are similar to the optimized form of subtraction shown herein.
WIBNI (Wouldn't It Be Nice If) we had greater consistency in our subtraction algorithm?
If you recall, our subtraction algorithm goes digit by digit and distinguishes two cases for x - y (where x and y represent digits):
Case one: the subtrahend (y) is greater than the minuend (x),
Case two: the subtrahend is less than or equal to the minuend.
Case one requires a borrow, and the optimized method of subtraction performs x - y as though it was (10 - y) + x.
Case two is normal subtraction.
WIBNI we could handle both cases the same way?
Well, we cant get rid of case one, but can we massage case two into an instance of case one?
Absolutely! If we have the equation x - y = z (again, where x and y represent digits), there is no reason why that equation can't be massaged into something else, as long as we do the same thing to both sides.
Since case one does x - y as though it is (10 - y) + x, our goal is clear: we want to massage the LHS (Left Hand Side) of x - y = z into (10 - y) + x, then see what the resulting RHS (Right Hand Side) looks like.
x - y = z [Given.]
-y + x = z [Commutative Law.]
10 - y + x = z + 10 [Addition Theorem of Equality.]
(10 - y) + x = z + 10 [Associative Law.]
So, we can do both case one and case two the same way, (10 - y) + x, as long as we subtract 10 from the result. Let's try it by running through case one and case two with specific examples.
Case one: 2 - 6. x = 2, y = 6. (10 - y) + x = z + 10; (10 - 6) + 2 = z + 10; 4 + 2 = z + 10; 6 = z + 10; z = 6 - 10; z = -4. Note that when we had z = 6 - 10, we couldn't simply peel off the tens digit because it wasn't there (or we could consider the tens digit as being 0). This means we had a deficit.
Case two: 6 - 3. x = 6, y = 3. (10 - y) + x = z + 10; (10 - 3) + 6 = z + 10; 7 + 6 = z + 10; 13 = z + 10; z = 13 - 10; z = 3. Note how easy 13 - 10 is to perform; it amounts to simply throwing away the tens digit position.
There is a correlation between the result in the tens digit and the sign of the result, and it would be nice to represent sign information, especially since we can create complements that look like positive quantities. It would be nice if we could encapsulate sign information into each number so we don't have to mentally keep track of what's negative and positive. Perhaps the result in the tens digit can be used as a sign. However, it seems backward; we'd really like to have a 0 in the tens digit on results that are positive. That way positive numbers don't have to be altered for sign information; only negative numbers need be altered.
So, let's say a 0 in the tens digit means positive. Now, the question becomes what to do for negatives. How about appending a 1 for negative numbers? That means adding 10 for a sign (because we want the sign in the tens digit place), after doing a complement. Let's try it.
Case one: 2 - 6. 10 + (10 - 6) + 2 = z + 10 + 10; 10 + 6 = z + 20; 16 = z + 20; z = -4.
Case two: 6 - 3. 10 + (10 - 3) + 6 = z + 10 + 10; 10 + 13 = z + 20; 23 = z + 20; 3 = z.
So, as you can see, at the z + 20 = ? stage, we can tell the sign. If the result at that point has a 1 in the tens digit, we know the result is negative; if a 2, we know it's positive. But that's not exactly what we want. We'd rather have a positive result be positive without any special effort, so we want a 0 in the tens digit. How can we achieve that? Well, there is no positive number we can add to 1 to get 0. However, 1 + 9 = 10, so if we use 9 as the negative sign, that's much closer to what we want. That means (in our case) we want to add 90 to the result of the complement process. Let's try it.
Case one: 2 - 6. 90 + (10 - 6) + 2 = z + 10 + 90; 94 + 2 = z + 100; 96 = z + 100; z = -4.
Case two: 6 - 3. 90 + (10 - 3) + 6 = z + 10 + 90; 97 + 6 = z + 100; 103 = z + 100; z = 3.
Aha! That does the trick. At the stage where we had a result on the LHS, all we had to do was look at the tens digit to tell the sign. For the positive result, we had a 0 there, and a 9 for the negative result. That's what we want. In the case of the positive result, we had a digit beyond the tens digit; we can simply ignore that (the justification for ignoring it is clear: ignoring it is the equivalent of subtracting the 100 from each side; this is just a shortcut allowed us when we use 9 as the negative sign). Nifty!
Note one requirement of this system: since we prepended sign information to the number itself, we had to decide on static limits that the number can take on!
Now, we want to consider examples of multiple digit numbers and try this sign system on them. To do that I will digress and show a different system, then tie the new system into the previous system.
Nine's complement
Instead of subtracting 10 - digit, we could do something even simpler: 9 - digit. That's even fewer subtractions that we have to memorize, and it allows us to handle multiple digits quite easily. We simply need a 9 for each digit. For instance, 4398 would require 9999. So, we could do this: 9999 - 4398 = 5601.
Going back to our original equation we used to justify complement systems, it would look like this: x - y = z; nines - y + x = z + nines. So, if we had 67062 - 4398 = z, then: 9999 - 4398 + 67062 = z + 9999; 5601 + 67062 = z + 9999; 72,663 = z + 9999; z = 72,663 - 9999; z = 62,664. This is nine's complement!
Ten's complement
It would be nice to find a shortcut for the step wherein we subtract out the 9999 from the result to get the final z. Since 9999 = 10000 - 1, we could do this: z = 72,663 - (10000 - 1); z = 72,663 - 10000 + 1; z = 62,663 + 1 = 62,664.
We could go a step beyond the above, however: x - y = z; (nines + 1) - y + x = z + (nines + 1); ((nines - y) + 1) + x = z + (nines + 1) [Commutative and Associative Laws]. Note that if you have a bunch of nines and you add 1 to it, you wind up with 10 to some power. This is ten's complement! An alternative way of computing a ten's complement is to do a nine's complement, then add 1. (This brings us around full circle. At first, we started out wanting to simplify our optimized form of subtraction to merge the 2 separate cases it uses to handle digits into one unifying case. We did that for digits by essentially coming up with ten's complement for digits; then we did nine's complement, and showed how nine's complement plus one gives us ten's complement for the general case of arbitrary numbers of digits. There are other ways of motivating and arriving at these conclusions, but this is how I chose to do it.)
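The arithmetic of the last few paragraphs can be condensed into a sketch (plain fixnums for clarity, not the article's digit lists; the names are mine):

```scheme
;; Nine's complement: subtract from a field of nines of the given width.
(define (nines-complement n width)
  (- (- (expt 10 width) 1) n))

;; Ten's complement: nine's complement plus one.
(define (tens-complement n width)
  (+ (nines-complement n width) 1))

;; 67062 - 4398 via nine's complement, as in the text:
;; add the complement, then subtract the nines back out.
(- (+ (nines-complement 4398 4) 67062) 9999)     ; => 62664

;; Or via ten's complement: add, then drop the 10^width.
(- (+ (tens-complement 4398 4) 67062) 10000)     ; => 62664
```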
Using a numeral for the sign token
Here's yet another shortcut. Before, we figured out the sign and dealt with it as a separate entity, and did complements based on the number of actual digits (excluding the sign digit). Instead of doing that, we can just pretend the sign digit is one more digit to be complemented. Let's go back to an old example: 33 - 54. Let's actually put the positive sign on: 033 - 054. Now, go from there mechanically, without thinking of the sign as being any different from any other digit: 033 + ((999 - 054) + 1) = z + (999 + 1); 033 + (945 + 1) = z + (999 + 1); 033 + 946 = z + (999 + 1); 979 = z + (999 + 1). The 9 in the sign digit of 979 tells us the result is negative; complementing back, 999 - 979 + 1 = 021, so z = -21.
Sign extension
If we have a complemented number, and want to add it to a number that has more digits, we need only sign extend the complemented number to match the number of digits in the number we want to add it to. In other words: -03 = 97; -003 = 997; -0003 = 9997; in general, -03 = 9...97, etc. (Convince yourself as to why this is so.)
Note that in base two this process degenerates into simple operations that computers can do quickly and easily. Instead of nine's complement, we do a one's complement, which is cheap and easy. Most computers have two's complement operations wired in. We use the highest order bit for the sign bit.
Closure
One thing that our complement system must consider is closure. For instance, if we have -03 - 08, wherein the leading zero represents the sign digit, that operation goes like this: -03 - 08 = -03 + -08 = 99 - 03 + 1 + 99 - 08 + 1 = 96 + 1 + 91 + 1 = 97 + 92 = 189. Then we throw away the digit beyond the sign digit, leaving 89. What's that? Something's wrong! The sign digit changed. We know that a negative plus a negative must yield a negative, and it didn't. This condition is underflow: the result could not be represented within the static limits we represented the operands in. We can detect overflow/underflow by checking the signs of the operands and the sign of the result. If the signs of the operands are different, we know that the result is closed under addition (i.e., we can't have overflow/underflow). If the signs are the same, we don't have closure under addition, and must check to be sure that the sign of the result is the same as the sign of the operands. If not, an overflow/underflow error should be raised. Of course, if we can represent numbers dynamically, overflow is not a problem for positives. Question for the reader: can we solve this problem for negatives by using dynamic lists and adding on a 0 digit beyond the sign digit before complementing, then throwing away the highest digit of the result?
Range of nine's complement and ten's complement
What kind of a range can be gotten from ten's complement? The truth table can be written out by enumerating the range of the numerals first with a positive sign, then with a negative sign. After doing so, the negative numbers can be complemented to see what they represent. Notice that 0 has a sign (since its representation has a 0 in the sign digit), but in reality 0 is neither positive nor negative.
As can be seen above, the number of values representable in n numerical digits is 2 * 10^n, because there are 10^n representable values with a positive sign, and that same number of values is representable with a negative sign. Notice, however, that only 10^n - 1 of those are positive while 10^n are negative (zero takes up one of the positively-signed representations)!
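The count is easy to verify by brute force; this Python snippet enumerates every value representable with one sign digit plus one numeric digit (n = 1, names mine):

```python
def value(digits):
    """Decode a ten's-complement digit list, sign digit first."""
    n = int("".join(map(str, digits)))
    return n if digits[0] == 0 else n - 10 ** len(digits)

values = sorted(value([s, d]) for s in (0, 9) for d in range(10))
print(values)   # -10 .. 9: twenty values, i.e. 2 * 10^1
```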
Figure 4
Two representations of zero
Note that nine's complement gives two representations for 0! Note also how that changes the range (left to the reader).
Exercise for the reader
Are bignums simpler when implemented as they are herein? Or would they be simpler if implemented using the ten's complement scheme? Faster? Slower?
Note: I didn't include the implementation of bignums using ten's complement because I didn't want to bother. Er...uh...I mean...I wanted to leave that as an exercise for the reader (yeah, yeah, that's the ticket!).
Multiplication
(Multiplicand * multiplier = product)
Despite all the mathematical knowledge we have, the human method of multiplying is still used in computer algorithms: multiply, shift, add. (Open ended question for the Überprogrammer: Is there a better way? What do you think of Booth's algorithm?) Notice that in base 2, this process degenerates into shift and add.
I used two routines to aid multiplication. First, I wrote a routine to multiply a bignum by a digit. Shifting is easy-since the lower positions are at the head of the list, we can shift it over by consing (pushing) zeros into it! The bignum add routine was already written, so all that was needed after the support functions were written, was the routine to call them all and organize them.
One other entirely optional feature I added was a check to reorder the arguments if necessary, to assure that the multiplicand has more digits in it than the multiplier. I performed no tests to see if this is actually an optimization or not-it could be that the overhead of determining which argument has more digits outweighs the potential gains from such an optimization (if any). However, I consider it a conceptual optimization (even if it doesn't pan out when implemented on a computer) because it mimics the heuristic a human would use. The test used to compare the number of digits is less-digits?, which doesn't care about digit positions numerically (only symbolically) and does short-circuit (lazy-like) logic: as soon as the end of either number is reached, a conclusion can be drawn as to whether the first argument to less-digits? has fewer digits than the second. If Scheme's length function had been used, both arguments would have to be completely traversed.
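The multiply/shift/add recipe, on little-endian digit lists like the ones used internally here, can be sketched in Python like so (names are mine, not the article's):

```python
def big_add(a, b):
    """Add two little-endian digit lists."""
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        da = a[i] if i < len(a) else 0
        db = b[i] if i < len(b) else 0
        carry, low = divmod(da + db + carry, 10)
        out.append(low)
    return out + [carry] if carry else out

def mul_digit(big, d):
    """Multiply a little-endian digit list by a single digit."""
    out, carry = [], 0
    for x in big:
        carry, low = divmod(x * d + carry, 10)
        out.append(low)
    return out + [carry] if carry else out

def big_mul(multiplicand, multiplier):
    product = []
    for shift, d in enumerate(multiplicand):
        # shifting is just consing zeros onto the head of the list
        product = big_add(product, [0] * shift + mul_digit(multiplier, d))
    return product

# 123 * 45 = 5535; lowest digit first, so 123 is [3, 2, 1]
print(big_mul([3, 2, 1], [5, 4]))   # [5, 3, 5, 5]
```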
Division
(Dividend / divisor = quotient + remainder)
The Scheme code herein for big-div is by far the hairiest of the arithmetic operations.
Basically, the division algorithm used herein is akin to that done by a human, including having to guess at each digit of the quotient. The guessing algorithm is the most interesting part of the code. For each quotient digit, an intermediate dividend must be selected. This is done by pushing the next digit of the master dividend into any remainder left over from subtracting the previous quotient digit times the divisor from the previous intermediate dividend. If the divisor is greater than the current intermediate dividend, then the quotient digit for that intermediate dividend is 0 and the process continues by recursing on the rest of the master dividend's digits. If the divisor is not greater, then there are two possible cases for the intermediate dividend:
1) The divisor and the intermediate dividend have the same number of digits (leading zeros excluded). In this case, we are guaranteed to get a quotient digit guess that is greater than or equal to what the current quotient digit should be by simply finding the quotient of the first digit of the intermediate dividend divided by the first digit of the divisor. The reason why is that any digits in the positions not being looked at in the divisor can only serve to increase the value of guess times divisor, which in turn can only serve to make the guess too big rather than too small. The digits not being considered in the intermediate dividend can't possibly matter enough to ever make our guess too small-indeed, for those digits to be of enough consequence to give us an underestimate in our guess, they would have had to be big enough to force the digits being considered to be bumped up by 1, but obviously they weren't! (Sounds rhetorical, until you think about it.)
2) The number of digits in the intermediate dividend is one greater than the number of digits in the divisor. In this case we can't make a reasonable guess by looking at first digits alone. We need to find the quotient of the first two digits of the intermediate dividend and the first digit of the divisor. Just as above, this will give a quotient digit guess that is greater than or equal to the correct digit. Since the bandwidth of a fixnum is enough to accommodate two of our bignum digits (the value of *biggest-base* was picked deliberately to assure this), we can stuff two digits into one fixnum and use MacScheme's quotient function just as we did in case 1 above. If the result is 10 or greater, the guess should be 9, because that's the biggest number we can represent (in base 10) in one digit; otherwise, we just take the guess as it is.
Let me state a more formal proof of why the above cases will always give quotient digit guesses that are greater than or equal to the correct quotient digits.
Given qa / xp, wherein q, a, x, and p stand for variables that occupy digit positions, and b is the base value of the positions that q and x reside at. In other words, qa is worth q * b + a, and xp is worth x * b + p. The quotient digit is picked by quotient(q, x).
Let's consider the case of p = 0. The claim that the quotient digit guesses are greater than or equal to the correct digits can be expressed like this: remainder(q, x) * b + a < x * b.
Let's consider two cases of remainders: one wherein the largest possible remainder is generated, and the other wherein the smallest possible remainder is generated.
Case one: (x - 1) * b + a < x * b; x * b - b + a < x * b; a < b.
Case two: 0 * b + a < x * b; a < x * b.
As can be seen, by starting with a symbolic expression of the claim, and massaging it, we arrive at true statements for both case one and case two. This reasoning works wherein a and p stand for no digits or any number of digits. Also, by allowing q to stand for two digits considered as one conglomerate digit, this reasoning can be applied to both cases of picking quotient digits; the case wherein the current dividend has the same number of digits as the divisor, and the case wherein the current dividend has one more digit than the divisor (and again, those are the only cases that are possible wherein the divisor is less than or equal to the dividend, leading 0s excluded).
After a quotient digit guess is arrived at, it is multiplied by the divisor and the product compared to the current dividend to see if it is greater. If it is, then 1 is subtracted from the guess, and the new guess is checked. When the correct quotient digit is found, it is multiplied by the divisor, and that result is subtracted from the current dividend to give a remainder which will be used by the next recursive call.
Note that it's good that the quotient digit guesser never guesses too low. Low guesses are more expensive to deal with: you have to multiply the quotient digit by the divisor, subtract the product from the current dividend, and then compare the resulting remainder to the divisor, whereas if it is known that a guess can only be too big, we can compare the product to the current dividend directly and skip the subtraction! That saves a lot of time. When I first had a guesser that could go high or low, the times for division were appreciably slower.
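One guess-and-correct step can be sketched in Python (most-significant-first digits here for readability, and plain machine integers for the comparisons; this illustrates the guessing strategy, not the actual big-div code, and all names are mine):

```python
def guess_digit(dividend, divisor, base=10):
    """Guess a quotient digit that can only ever be too HIGH, never too low:
    same length -> first digit over first digit; one digit longer ->
    first two digits over first digit, clamped to base - 1."""
    if len(dividend) == len(divisor):
        g = dividend[0] // divisor[0]
    else:  # dividend is exactly one digit longer
        g = (dividend[0] * base + dividend[1]) // divisor[0]
    return min(g, base - 1)

def div_step(dividend, divisor):
    """Produce one quotient digit and the remainder for the next step."""
    d = int("".join(map(str, dividend)))
    v = int("".join(map(str, divisor)))
    g = guess_digit(dividend, divisor)
    while g * v > d:       # guesses only ever need adjusting downward
        g -= 1
    return g, d - g * v

print(div_step([1, 7, 2], [8, 9]))   # (1, 83): 172 = 1 * 89 + 83
```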
Note that the recursive process of big-div stops only when the dividend is eq? to the symbol done. Rationale: just because the dividend is empty (i.e., is big-zero) does not mean we're done-we still have a quotient digit from a previous call that needs to be pushed into the result, even in the case where the master dividend started off being big-zero (in which case we've got a result of 0 and a remainder of whatever the dividend was).
Substrate Two
The purpose of the second substrate is to provide signed bignums. The sign is attached by consing it onto the head of the list representing a bignum. This only need be done for negatives (that way, the numbers created by the first substrate are compatible with this substrate-they don't need to be modified). After the sign frobbing is done, substrate two calls out to substrate one functions to do the rest. Simple! And it doesn't alter any numbers created by substrate one: attaching a sign does not side effect the original number.
Substrate Three
The purpose of the third substrate is to allow numbers in different bases to be participants in the same arithmetic operation. This requires two things: 1) the ability to tag base information onto a number, and 2) the ability to coerce numbers from one base to another.
In the case of base coercion, the participant with the highest base is the most contagious. This is done because higher bases result in faster computation times. Note that 0 and 1 are the same in every base! Therefore, these two special numbers get assigned a special base called all.
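Base coercion itself is simple when sketched with a machine integer as the intermediate form (a real bignum coercion would have to stay in bignum arithmetic, so treat this Python snippet as illustration only; little-endian digit lists as in substrate one, names mine):

```python
def to_int(digits, base):
    """Decode a little-endian digit list."""
    return sum(d * base ** i for i, d in enumerate(digits))

def from_int(n, base):
    """Encode an integer as a little-endian digit list."""
    return [n] if n < base else [n % base] + from_int(n // base, base)

def coerce(digits, from_base, to_base):
    return from_int(to_int(digits, from_base), to_base)

# the higher base is "contagious": lift the base-10 operand to base 16
print(coerce([5, 5, 2], 10, 16))   # 255 becomes [15, 15], i.e. 0xFF
```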
In the case of base information, that could be attached as one more piece of information like a sign. However, that seems like a bit of a kludge. A more appealing approach is to resort to object oriented programming. Viewing numbers as objects nicely meets the demands created by devising truly general numbers.
Here's the original example done in the highest substrate (taken from the test suites).
>>>
;
;;; bignum-object tests.
;
(define foo
(time-it
(bignum-object-fact (make-bignum-object '(1 2 0)))))
available space: 80152
time: 131.15
foo
>>>
;
(foo show-number)
6689502913449127057588118054090372586752746333138029810295671352301633557244962989366874165271984981308157637893214090552534408589408121859898481114389650005964960521256960000000000000000000000000000
#t
Substrate N
These substrates, in true Hollywood style, could have sequels. Substrate n could have arithmetic functions that take 0, 1, or many arguments, and do type overloading. By type overloading I mean that the arithmetic operations could be called simply add, mul, div, and sub, and such operators could accept arguments of any type (thus the type slot in the substrate three objects). This would make the coercion functions quite tricky if, say, one had a truly generic math system with type complex, type polynomial, type bignum, type rational, etc.
The objects could become more sophisticated too: a class hierarchy could be introduced: the functions that make objects could make use of inheritance and slot defaults. For instance, a rational number could inherit characteristics from the integer/bignum class.
Base n
To generalize all the discussions in this article to the general case of base n, simply replace 10 with the base and 9 with the base minus 1.
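In code, that substitution is a single parameter. A small Python sketch of base's complement for an arbitrary base (names mine):

```python
def complement(digits, base):
    """Base's complement: (base - 1)'s complement each digit, then add 1.
    With base = 2 this is exactly two's complement."""
    out = [(base - 1) - d for d in digits]
    i, carry = len(out) - 1, 1
    while i >= 0 and carry:
        carry, out[i] = divmod(out[i] + carry, base)
        i -= 1
    return out

print(complement([0, 5, 4], 10))   # 054 -> 946, as in the subtraction examples
print(complement([0, 1, 1], 2))    # 011 -> 101, i.e. -3 in 3-bit two's complement
```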
Acknowledgments
Thanks to John Koerber for allowing ideas to be bounced off of him (and we all know how painful that can be...); and for suggesting many changes to earlier drafts of this article, with the interests of readability, beginners, and non-LISPers at heart.
Thanks to Henry Baker for making comments on an earlier draft.
The code is unrefereed. Infelicities/bugs mea culpa.
References
[Abelson et al, 1985] Harold Abelson and Gerald Jay Sussman with Julie Sussman. Structure and Interpretation of Computer Programs. MIT Press, Cambridge, Massachusetts, USA, 1985.
[Auvil, 1979] Daniel L. Auvil. Intermediate Algebra. Addison-Wesley Publishing Company, 1979.
[Haddon, 1982] Bruce K. Haddon. Course: CS 453 Assembly Language Programming. Colorado University at Boulder, Spring 1982.
[Osborne, 1976] Adam Osborne. An Introduction to Microcomputers. Adam Osborne & Associates, Incorporated, Berkeley, California, 1979.
[Rees et al, 1986] Jonathan Rees and William Clinger (editors). Revised3 Report on the Algorithmic Language Scheme; AI Memo 848a. MIT Artificial Intelligence Laboratory, Cambridge, Massachusetts, USA, September 1986.
[Roth, 1979] Charles H. Roth, Jr.. Fundamentals of Logic Design second edition. West Publishing Company, 1979.
. . .
MacScheme is put out by Lightship Software, P.O. Box 1636, Beaverton, OR 97075 USA. Phone: (415) 694-7799.
The Code
;
(define print
(lambda (n)
(display n)
(newline)))
;
(define base-10??
(lambda (base)
(if (null? base)
10
(car base))))
;
(define to-base??
(lambda (bases)
(if (or (null? bases)
(null? (cdr bases)))
10
(cadr bases))))
;
(define normalize reverse)
;
(define unnormalize reverse)
;
(define bigify-digits list)
;
(define no-digits? null?)
;
(define first-digit car)
;
(define rest-digits cdr)
;
(define last-digit
(lambda (n)
(first-digit (last-pair n))))
;
(define push-digit cons)
;
(define big-zero ())
;
(define big-zero?
(lambda (n)
(cond ((no-digits? n)
#t)
((and (number? (first-digit n))
(zero? (first-digit n)))
(big-zero? (rest-digits n)))
(else
#f))))
;
(define big-one '(1))
;
; detokenize and normalize could be combined to save one
; trip through the argument list. I prefer the
; generality of not having them combined--tokenization
; is not a time critical routine.
;
(define detokenize
(lambda (x)
(if (no-digits? x)
big-zero
(push-digit (if (number? (first-digit x))
(first-digit x)
(- (char->integer
(string-ref
(symbol->string
(first-digit x))
0))
87))
(detokenize (rest-digits x))))))
;
(define show-big
(lambda (big)
(if (big-zero? big)
(print "0")
(let ((first-thing (first-digit big)))
(if (or (null? first-thing)
(pair? first-thing))
(begin
(show-big first-thing)
(display "remainder ")
(show-big (rest-digits big)))
(let ((start-value (- (char->integer #\a)
10)))
(begin
(for-each
(lambda (x)
(cond ((<? x 10) ; a
(display x))
((<? x 36) ; z
(display
(make-string
1
(integer->char
(+ start-value
x)))))
(else ; out of tokens
(display "<")
(display x)
(display ">"))))
(unnormalize big))
(newline))))))))
;
(define find-biggest-base
(lambda ()
(letrec
((iter
(lambda (base bits-per-digit)
(let ((base-minus-one
(- base 1)))
(if (not (fixnum? (* base-minus-one
base-minus-one)))
(if (odd? bits-per-digit)
(quotient base 2)
(quotient base 4))
(iter (* 2 base) (+ bits-per-digit
1)))))))
(iter 2 1))))
;
(define *biggest-base* (find-biggest-base))
;
(define fix-base-10->big-base-n
(lambda (n . base)
(let ((base (base-10?? base)))
(letrec
((recur
(lambda (n)
(if (<? n base)
(bigify-digits n)
(push-digit (remainder n base)
(recur (quotient n
base)))))))
(recur n)))))
;
(define compare-digits
(lambda (first second)
(if (no-digits? first)
(if (no-digits? second)
'=
'<)
(if (no-digits? second)
'>
(compare-digits (rest-digits first)
(rest-digits second))))))
;
(define less-digits?
(lambda (first second)
(eq? (compare-digits first second) '<)))
;
(define more-digits?
(lambda (first second)
(eq? (compare-digits first second) '>)))
;
(define same-number-of-digits?
(lambda (first second)
(eq? (compare-digits first second) '=)))
;
(define big-add
(lambda (addend augend . base)
(let ((base (base-10?? base)))
(letrec
((recur
(lambda (addend augend rem)
(if (no-digits? addend)
(if (no-digits? augend)
rem
(recur augend addend rem))
(if (no-digits? augend)
(if (no-digits? rem)
addend
(recur addend rem ()))
(let
((new-rem
(fix-base-10->big-base-n
(+ (first-digit addend)
(first-digit augend)
(if (no-digits? rem)
0
(first-digit rem)))
base)))
(push-digit
(first-digit new-rem)
(recur
(rest-digits addend)
(rest-digits augend)
(rest-digits new-rem)))))))))
(recur addend augend ())))))
;
(define big-sub
(lambda (minuend subtrahend . base)
(let ((base (base-10?? base)))
(letrec
((recur
(lambda (minuend subtrahend borrow)
(if (no-digits? minuend)
(if (no-digits? subtrahend)
(if (=? borrow 1)
(error
"needed to borrow more than I had")
())
(error
"(>? subtrahend minuend) => #t"
minuend
subtrahend))
(if (no-digits? subtrahend)
(if (zero? borrow)
minuend
(recur minuend
(bigify-digits borrow)
0))
(let ((dig1 (first-digit minuend))
(dig2 (+ (first-digit
subtrahend)
borrow)))
(if (<? dig1 dig2)
(push-digit
(+ (- base dig2)
dig1)
(recur (rest-digits minuend)
(rest-digits
subtrahend)
1))
(push-digit
(- dig1 dig2)
(recur (rest-digits minuend)
(rest-digits
subtrahend)
0)))))))))
(remove-leading-zeros
(recur minuend subtrahend 0))))))
;
(define remove-leading-zeros
(lambda (n)
(if (or (no-digits? n)
(not (zero? (last-digit n))))
n
(letrec
((peel-off-zeros
(lambda (n)
(cond ((no-digits? n)
big-zero)
((zero? (first-digit n))
(peel-off-zeros (rest-digits n)))
(else
n))))
(recur
(lambda (n user-n)
(if (no-digits? user-n)
big-zero
(push-digit
(first-digit n)
(recur (rest-digits n)
(rest-digits
user-n)))))))
(recur n
(peel-off-zeros (unnormalize n)))))))
;
(define big-*-digit
(lambda (big digit . base)
(let ((base (base-10?? base)))
(cond
((zero? digit)
big-zero)
((=? digit 1)
big)
(else
(letrec
((recur
(lambda (big rem)
(if (no-digits? big)
rem
(let
((new-rem
(fix-base-10->big-base-n
(+ (* digit
(first-digit big))
(if (no-digits? rem)
0
(first-digit rem)))
base)))
(push-digit
(if (no-digits? new-rem)
0
(first-digit new-rem))
(recur (rest-digits big)
(rest-digits new-rem))))))))
(recur big '(0))))))))
;
(define big-mul
(lambda (multiplicand multiplier . base)
(let ((base (base-10?? base)))
(if (less-digits? multiplier multiplicand)
(big-mul multiplier multiplicand base)
(letrec
((recur
(lambda (multiplicand multiplier
shift-amount)
(if (no-digits? multiplicand)
big-zero
(big-add (big-right-shift
(big-*-digit
multiplier
(first-digit
multiplicand)
base)
shift-amount)
(recur (rest-digits
multiplicand)
multiplier
(1+ shift-amount))
base)))))
(recur multiplicand multiplier 0))))))
;
(define big-right-shift
(lambda (big n)
(if (zero? n)
big
(push-digit 0 (big-right-shift big
(1- n))))))
;
(define big-fact
(lambda (n . base)
(let ((base (base-10?? base)))
(letrec
((recur
(lambda (n)
(if (big-zero? n)
big-one
(big-mul n
(recur (big-sub n
big-one
base))
base)))))
(recur n)))))
;
(define last-digit?
(lambda (big)
(no-digits? (rest-digits big))))
;
(define big-compare?
(lambda (first second predicate?)
(letrec
((iter
(lambda (first second)
(if (no-digits? first)
(if (no-digits? second)
#f
#t)
(if (no-digits? second)
#f
(if (last-digit? first)
(if (last-digit? second)
(predicate?
(first-digit first)
(first-digit second))
#t)
(iter (rest-digits first)
(rest-digits second))))))))
(iter first second))))
;
(define equal-digit-big-compare?
(lambda (first second predicate?)
(letrec
((iter
(lambda (first second)
(if (no-digits? first)
             (predicate? 0 0) ; all digits equal: #t for =?, #f for <? and >?
(if (= (first-digit first)
(first-digit second))
(iter (rest-digits first)
(rest-digits second))
(if (predicate? (first-digit first)
(first-digit second))
#t
#f))))))
(iter (unnormalize first) (unnormalize second)))))
;
(define big-<?
(lambda (first second)
(let ((first (remove-leading-zeros first))
(second (remove-leading-zeros second)))
(case (compare-digits first second)
((<) #t)
((>) #f)
((=)
(equal-digit-big-compare? first second <?))))))
;
(define big->?
(lambda (first second)
(let ((first (remove-leading-zeros first))
(second (remove-leading-zeros second)))
(case (compare-digits first second)
((>) #t)
((<) #f)
((=)
(equal-digit-big-compare? first second >?))))))
;
(define big-=?
(lambda (first second)
(let ((first (remove-leading-zeros first))
(second (remove-leading-zeros second)))
(case (compare-digits first second)
((>) #f)
((<) #f)
((=)
(equal-digit-big-compare? first second =?))))))
;
(define big-<=?
(lambda (first second)
(not (big->? first second))))
;
(define big->=?
(lambda (first second)
(not (big-<? first second))))
;
(define big-div
(lambda (dividend divisor . base)
(let ((base (base-10?? base))
(dividend (unnormalize dividend))
(divisor (remove-leading-zeros
divisor)))
(let ((user-divisor
(unnormalize divisor)))
(letrec
((find-next-quotient-digit
(lambda (dividend)
(let ((dividend
(remove-leading-zeros
dividend)))
(if
(big->? divisor dividend)
(push-digit 0 dividend)
(letrec
((guess-next-quotient-digit
(lambda ()
(let ((dividend
(unnormalize
dividend)))
(let
((dig1-divisor
(first-digit
user-divisor))
(dig1-dividend
(first-digit dividend)))
(if
(and
(same-number-of-digits?
dividend
user-divisor)
(<=?
dig1-divisor
dig1-dividend))
(quotient
dig1-dividend
dig1-divisor)
(let
((guess
(quotient
(+
(*
base
dig1-dividend)
(first-digit
(rest-digits
dividend)))
dig1-divisor)))
(if (>=? guess base)
(- base 1)
guess)))))))
(iter
(lambda (guess)
(let ((subtrahend
(big-*-digit
divisor
guess base)))
(if
(big->? subtrahend
dividend)
(iter (- guess 1))
(push-digit
guess
(big-sub
dividend
subtrahend base)))))))
(iter
(guess-next-quotient-digit)))))))
(iter
(lambda (dividend rem result)
(if
(eq? dividend 'done)
(push-digit
(remove-leading-zeros
result)
(remove-leading-zeros rem))
(let* ((new-quotient-digit
(find-next-quotient-digit
rem))
(new-rem
(cdr new-quotient-digit)))
(if
(no-digits? dividend)
(iter
'done
new-rem
(push-digit
(first-digit
new-quotient-digit)
result))
(iter
(rest-digits dividend)
(push-digit
(first-digit dividend)
new-rem)
(push-digit
(first-digit
new-quotient-digit)
result))))))))
(iter (rest-digits dividend)
(bigify-digits
(first-digit dividend))
()))))))
;
'done