MacForth Calc
Volume Number: | | 9
|
Issue Number: | | 4
|
Column Tag: | | Jörg's Folder
|
Object Programming in MacForth
Porting the simple calculator to MacForth
By Jörg Langowski, MacTech Magazine Regular Contributing Author
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
Those of you who have been with us from the very beginning, V1#1, will remember that this column started out as a Forth column, and that my first programming examples were in MacForth. At the time when the Macintosh was introduced, this was one of the few development systems for the rest of us, and you see, it has stayed around all the time.
In fact, MacForth is the only Forth system for the Macintosh that is still commercially supported; you have read about others here, but they are in the public domain (like Mops and Yerk) or almost public domain (like Mach2). Since numerous calls to the readership for MacForth article submissions were of no avail, Ive finally decided to write an example program myself to bring MacForth back into the memory of the MacTech readers and show how nicely you can write object-oriented code in MacForth.
So here were back to our roots: MacTutor is back to MacTech, and J.L. is back to MacForth.
Why bother?
Those of you who would never touch Forth with a ten-foot reverse Pole (although Im only 511), please dont turn to the next article immediately. Ill give you some basic notions that may help you to understand the example.
A Forth program consists of a sequence of words. Each word is a routine that works by taking parameters off a stack and eventually leaving results there. Any number can be considered as just a word that leaves its value on the stack. Forth development systems normally accept words interactively or from a file and execute them immediately. All words are separated by blanks. If you type
3
there will be a 3 on top of the stack. The sequence
3 4 +
will put the numbers 3 and 4 on the stack, and then compute their sum, which is left on the stack. All in all, Forth works like an HP calculator, using reverse Polish notation.
How about writing programs? There are two words, : (colon) and ; (semicolon), which start and end definitions. A definition looks like this:
: test 3 4 + ;
This defines the new word test which simply executes the summation of 3 and 4 and leaves the result on the stack. Thus, the first word after the colon is the name of the newly defined word, and the remaining sequence of words is not executed, but compiled into a dictionary and by some means or other linked to the name which is kept in a vocabulary. When the newly defined word is then executed, it will behave as if one had entered the initial sequence of words in the definition.
Forth programs are typically written by splitting up the program task into very small units and coding each of them into a word definition, just if one was to write a C program by coding lots of small routines and assemble them into larger ones. A very structured approach, which is easy to do with Forth because the low-level definitions are compiled immediately (by typing them in or loading them from a file), and they can be tested interactively before using them in the higher-level routines. This interactive, incremental compiling is one of the great strengths of Forth.
This concludes my short sales pitch for Forth. Before we jump right into the middle of things with our example, some small Forth words that youll come across in the listing:
! (exclamation mark) takes two values from the stack, an address (top) and a number (underneath). It stores the number into the 32-bit long word at address. c! stores the value into the byte at address.
@ (at sign) takes the address from top of stack and gets the 32-bit longword stored there. c@ gets the 8-bit value at address.
dup duplicates the value on top of the stack, drops it, and exchanges the two top values.
+, -, *, / are operators that do exactly what you would expect to the two top values on the stack.
if then brackets code that is conditionally executed if the top of the stack is non-zero.
Other control structures: do loop, case endcase, begin until, begin again. Details about their functioning can be found in any good book on Forth. I recommend Starting Forth by Leo Brodie, published by Forth, Inc. But now to the example.
An arithmetic expression parser in Forth
For the example, I decided to convert the calculator program that I presented in V9#1 from C++ to Forth (I know there are some people wholl try to kill me for this). It is not as ridiculous as it seems: The original program was written as an example for object programming, and youll see that the object oriented extensions contained in MacForth make it relatively easy to port the code. In fact, the only places where I really had to pay attention to bugs were the routines which input and parsed a string from the keyboard, and entering and retrieving names from the symbol table. The whole object-oriented structure of the program worked almost immediately the way I typed it in.
The calculator works just like the C++ example: it reads in a line with an arithmetic expression, analyzes it and constructs a syntax tree representation from it, and finally evaluates that syntax tree and prints the result (in order to understand this months example it helps to get out your old V9#1 copy and compare to the C++ code). In the C++ example, we defined classes for the different objects that make up the syntax tree and constructed an instance of one of those classes for each node of the tree. Those nodes could be: the operators +,-,*,/ and =, which each connect two other nodes, the unary minus, and the end nodes, numbers and variables.
To reconstruct the C++ example, lets first see how object-oriented programming is achieved in MacForth. MacForth provides so-called acting elements or actels, which contain their own private data structure and to which messages can be sent; so these are objects. An actel is defined first through its data structure; MacForth provides possibilities for structure definitions. You can see this at the beginning of the listing. As an example, the structure corresponding to a dyadic operator is defined by
structure dyadheader
minElemHeaderSize +
handle:+left
handle: +right
structure.end
This definition does not create the object; it only defines a word dyadheader which will leave the total size of the structure on the stack when executed, and other words (+left, +right) that will add a number to the top of stack equal to the offset for the corresponding element in the structure. So you may define
create my.dyad dyadheader allot
which will allocate space in the dictionary for one copy of this structure, my.dyad, and then access the two fields of the structure by writing
my.dyad +left
which leaves the address of the left field of the structure on the stack. Remark on the side: the field name in MacForth structure definitions is not local to the structure, but visible globally. Thus, you cannot define two different structures which have a field with the same name at different offsets. This is one limitation that one has to live with.
You see that there is some space created at the beginning of the structure (of length minElemHeaderSize) which contains fields for internal use. In the current implementation, this size is 16 bytes. The first longword of these reserved bytes containes a machine instruction that jumps to the method selector of the class, and the last longword contains the total size of the header.
The messages that one sends to MacForth actels are again Forth words, which by convention start with >>. There are a couple of predefined messages, the ones interesting to us are >>New (make a new object on the heap), >>Discard (remove the object from the heap), >>Empty (fill the objects data space with zeros) and >>Room (resize the object). New messages are created by writing e.g.
1000 message >>eval
1001 message >>set
as in the example. This assigns the message numbers 1000 and 1001 to the messages >>eval and >>set. Every message must have a unique number, thats how they are distinguished. When a message is sent to an Actel, actually the message number is passed to a selector, which then calls the appropriate method. The selector, e.g. for the plus operator, is defined as follows:
1 selector: dyad.msgs
ElemPanic
dyad::New
dyad::Discard
drop
inherited
;selector
1000 selector: plus.msgs
dyad.msgs
plus::Eval
;selector
First, a selector defines the basic methods used in dyadic operators. They are assigned increasing message numbers, starting with 1. The first word ElemPanic after the selector name dyad.msgs is the routine which is called if the message number is outside the range of the selector. Since this is the lowest level of methods, the routine called here will complain that the message number was not implemented. The next words correspond to message numbers 1 to 4, or >>New, >>Discard, >>Empty and >>Room. >>New will call the method dyad::New (the two colons are simpky my convention from C++; you may give the method any name you like), >>Discard calls dyad::Discard, >>Empty simply drops the top stack item (which is the handle of the object to which the message was sent), this means it does nothing; and >>Room is not handled by this class.
The selector plus.msgs assigns the method specific to the plus operator, its evaluation plus::eval, to the message number 1000, >>eval. This will be the highest level selector that is stored in the header of the plus operator objects and which is first executed when a plus object receives a message. If it is =1000, plus::Eval is called. If not, the next lower level selector, dyad.msgs, is invoked.
How are methods written? The calling sequence for a method is <parameters> <object> <message>. The message word calls the selector corresponding to the class of the object, which then calls the appropriate method. At that moment, the objects handle is on the top of the stack, and any parameters underneath. The method must dereference the object handle, access its fields if necessary and do its things using the parameters and the object fields. As an example, look at the plus::Eval method again:
: plus::Eval ( handle -- value )
h@ dup +left @ >>eval swap +right @ >>eval +
This method will get the handle to the left side of the plus operator (stored in the left field), which is an object, send it the message >>eval, then do the same thing to the right hand side, and finally add the results and leave the sum on the stack.
This column has not enough space for explaining all the methods in detail; with some intuition and Forth thinking, or better a MacForth manual, you will easily see the parallels between the code here and the C++ example. What is important to mention here is that for each class you need to define one instance of an object by hand, that is create its data structure on the heap. This is the parent instance; further instances of the same class are then made simply by copying the parent object. The parent objects of all the classes are made in the middle of the example code, where the selectors are also defined; look at the words make.dyad etc. The jump to the highest-level method selector is also stored in the parent object by ElemAction!.
In the next part of the code, the token scanning routine from the C++ example is recreated. Non-Forthers who are still reading, you may skip this; suffice it to say that each invocation of the word scan leaves the next token from the input stream on the stack, which can then be stored in the global variable token. For the Forthers, it should be easy to verify that this code is a typical example of the mess you have to deal with when you are doing input/output in Forth
After all these preliminaries, the port of the main parser code from C++ to Forth is quite easy. We have to rewrite the three routines that parse an expression, a term and a factor. All these routines can call each other recursively, so we have to use deferred words to implement them. First, we define the routine names:
defer factor
defer term
defer expr
then we code the routines do.factor, do.term and do.expr, and finally we assign these routines to the deferred words:
token.for do.factor IS factor
token.for do.term IS term
token.for do.expr IS expr
The routines itself are a straight translation from C++; you may verify this from the example in V9#1. So you see, other than the strange message passing mechanism and the reverse Polish notation, Forth is just like C++! :-)
Of course, I could have written this example in Mops or Yerk as well; you may do that as an exercise for yourself. The code will be probably shorter in Mops, since many of the things that we did here explicitly (such as creating the initial parent objects) are done more or less implicitly in other object-oriented systems. But then, MacForth doesnt claim to be a full object-oriented language, only a Forth with object-oriented extensions. You can see that these extensions go actually quite far, and provide a very good means to understand what is actually going on when a routine is selected through a method table.
At the end, let me say that I would have taken much more time doing this example if it werent for the incremental compiling of Forth in general and the excellent debugging facilities of MacForth. If one couldnt analyze exactly what each little routine is doing to the stack at each point (which is possible with MacForth), one would get lost immediately even in a small piece of code like this.
Next time (two months from now) back to C++.
Listing: Calculator example in MacForth
anew --calc--
structure dyadheader
minElemHeaderSize +
handle:+left
handle: +right
structure.end
structure unaryheader
minElemHeaderSize +
handle:+operand
structure.end
structure numheader
minElemHeaderSize +
long: +value
structure.end
structure IDheader
minElemHeaderSize +
31 string: +name
structure.end
\
\ the symbol table is implemented as a vocabulary
\ use CONTEXT as template for symtab (see below)
\
global symtab
\
\ messages
\
1000 message >>eval
1001 message >>set
\
\ methods
\
: dyad::New ( left\right\handle -- handle )
dyadheader swap ElemNew
\ create space for new dyad, handle on stack
dup h@ locals| selfaddr self |
( stack now: left\right )
selfaddr +right ! selfaddr +left !
self
;
: dyad::Discard ( handle -- )
dup h@ +left @ >>Discard
dup h@ +right @ >>Discard
to.heap
;
: plus::Eval ( handle -- value )
h@ dup +left @ >>eval swap +right @ >>eval +
;
: minus::Eval ( handle -- value )
h@ dup +left @ >>eval swap +right @ >>eval -
;
: times::Eval ( handle -- value )
h@ dup +left @ >>eval swap +right @ >>eval *
;
: divide::Eval ( handle -- value )
h@ dup +left @ >>eval swap +right @ >>eval /
;
: equals::Eval ( handle -- value )
h@ dup +right @ >>eval swap +left @ >>set
;
: unary::New ( operand\handle -- handle)
unaryheader swap ElemNew
dup h@ locals| selfaddr self |
selfaddr +operand !
self
;
: unary::Discard ( handle -- )
dup h@ +operand @ >>Discard
to.heap
;
: uminus::Eval ( handle -- value )
h@ +operand @ >>eval negate
;
: num::New ( value\handle -- handle )
numheader swap ElemNew
dup h@ locals| selfaddr self |
selfaddr +value !
self
;
: num::Discard ( handle -- )
to.heap
;
: inumber::Eval ( handle -- value )
h@ +value @
;
: ID::New ( name\count\handle -- handle )
IDheader swap ElemNew
dup h@ locals| selfaddr self count |
count selfaddr +name c! \ store count byte
( name ) selfaddr +name 1+ count cmove
\ store string data
selfaddr +name symtab >>findname
0= if
0 selfaddr +name symtab >>AddEntry
0= abort" Symbol table full"
then
self
;
: ID::Discard ( handle -- )
to.heap
;
\ unfortunately, a MacForth vocabulary structure
\ takes only 16-bit entries...
\ so we're limited to 16 bit integers
: ID::Eval ( handle -- value )
h@ +name symtab >>findname
dup if symtab >>GetValue then
;
: ID::Set ( value\handle -- value )
h@ locals| selfaddr value |
selfaddr +name symtab >>findname
\ this is the only way to modify the value
?dup 0= abort" Tried to set a non-existing variable"
symtab >>RemoveEntry drop
value selfaddr +name symtab >>AddEntry
0= abort" Symbol table full"
value
;
\
\ variables for parent instances of classes
\
global PLUS
global MINUS
global TIMES
global DIVIDE
global EQUALS
global UMINUS
global INUMBER
global ID
\
\ make parent instances, define message selectors
\
: make.dyad ( -- dyadhandle )
dyadheader makehandle locals| handle |
dyadheader handle h@ 12 + !
handle
;
1 selector: dyad.msgs
ElemPanic
dyad::New
dyad::Discard
drop
inherited
;selector
1000 selector: plus.msgs
dyad.msgs
plus::Eval
;selector
make.dyad -> PLUS
token.for plus.msgs PLUS elemaction!
1000 selector: minus.msgs
dyad.msgs
minus::Eval
;selector
make.dyad -> MINUS
token.for minus.msgs MINUS elemaction!
1000 selector: times.msgs
dyad.msgs
times::Eval
;selector
make.dyad -> TIMES
token.for times.msgs TIMES elemaction!
1000 selector: divide.msgs
dyad.msgs
divide::Eval
;selector
make.dyad -> DIVIDE
token.for divide.msgs DIVIDE elemaction!
1000 selector: equals.msgs
dyad.msgs
equals::Eval
;selector
make.dyad -> EQUALS
token.for equals.msgs EQUALS elemaction!
: make.unary ( -- unaryhandle )
unaryheader makehandle locals| handle |
unaryheader handle h@ 12 + !
handle
;
1 selector: unary.msgs
ElemPanic
unary::New
unary::Discard
drop
inherited
;selector
1000 selector: uminus.msgs
unary.msgs
uminus::Eval
;selector
make.unary -> UMINUS
token.for uminus.msgs UMINUS elemaction!
: make.num ( -- numhandle )
numheader makehandle locals| handle |
numheader handle h@ 12 + !
handle
;
1 selector: num.msgs
ElemPanic
num::New
num::Discard
drop
inherited
;selector
1000 selector: inumber.msgs
num.msgs
inumber::Eval
;selector
make.num -> INUMBER
token.for inumber.msgs INUMBER elemaction!
1 selector: ID.low.msgs
ElemPanic
ID::New
ID::Discard
drop
inherited
;selector
1000 selector: ID.msgs
ID.low.msgs
ID::Eval
ID::Set
;selector
: make.ID ( -- IDhandle )
IDheader makehandle locals| handle |
IDheader handle h@ 12 + !
handle
;
make.ID -> ID
token.for ID.msgs ID elemaction!
\ root token of the syntax tree
global root
\ token being scanned
global token
\ temp string storage
create tstring 80 allot
variable tlength
\ input scanned will go to PAD
\ restriction: one line per expression
\ extending this is left as an exercise to the reader
\ we don't use WORD because we don't want to have to use
\ blanks as delimiters everywhere
\
255 constant maxpad
variable chars.input
: fillpad
pad 1+ maxpad expect
span @ chars.input !
1 pad c!
;
\ gets a new character from standard input into PAD
\ reads new line if end of line encountered
: cget ( -- char )
pad c@ chars.input @ >
if fillpad then
pad c@ pad + c@
pad c@ 1+ pad c!
;
\ backspaces pointer to PAD
\ no action if already at beginning
: cputback ( char -- )
pad c@ dup 1 > if 1- pad c! else drop then
drop
;
: ?digit ( char -- true or false)
dup ascii 0 < not swap ascii 9 > not and
;
: ?alfa ( char -- true or false)
locals| ch |
ch ascii a < not ch ascii z > not and
ch ascii A < not ch ascii Z > not and or
;
: ?anum ( char -- true or false)
dup ?digit swap ?alfa or
;
: syntax.error -1 abort" Syntax Error" ;
\ scan the remaining digits of a number
: scan.number ( ch -- )
tstring c!
1 tlength !
80 1 do \ we don't expect more than 80 digits
cget dup ?digit not if cputback leave then
tstring i+ c!
i 1+ tlength !
loop
;
\ scan the remaining chars of an identifier
: scan.anum ( ch -- )
tstring c!
1 tlength !
80 1 do \ we don't expect more than 80 chars
cget dup ?anum not if cputback leave then
tstring i+ c!
i 1+ tlength !
loop
;
\
\ constants
\
128 dup constant IDT
1+dup constant INT
1+dup constant EOLN
1+constant BAD
: scan ( -- token )
0 locals| c |
begin
cget -> c
c case
ascii ( ascii + range.of c endof
ascii / of c endof
ascii - of c endof
ascii = of c endof
13 of EOLN endof
bl of 0 endof \ just continue scan
dup ?anum if
dup ?digit if
c scan.number INT swap
else
c scan.anum IDT swap
then
else
BAD swap
then
endcase
?dup until
;
\
\ parser routines for expression, term, factor
\
: get.token.name ( -- name\count )
tstring tlength @
;
: get.token.value ( -- value )
bl tstring tlength @ + c!
tstring 1- number
;
defer factor
defer term
defer expr
: do.factor ( -- factor-handle )
0 locals| root |
token
case
IDT of
get.token.name ID >>New -> root
scan dup -> token
ascii = = IF
scan -> token
root expr EQUALS >>New -> root
THEN
endof
INT of
get.token.value INUMBER >>New -> root
scan -> token
endof
ascii ( of
scan -> token
expr -> root
token ascii ) = not IF syntax.error THEN
scan -> token
endof
ascii - of
scan -> token
factor UMINUS >>New -> root
endof
syntax.error
endcase
root
;
token.for do.factor IS factor
: do.term ( -- term-handle )
factor locals| root |
begin
token
case
ascii * of
scan -> token
root term TIMES >>New -> root
0 \ go on
endof
ascii / of
scan -> token
root term DIVIDE >>New -> root
0 \ go on
endof
1 swap \ terminate
endcase
until
root
;
token.for do.term IS term
: do.expr ( -- expr-handle )
term locals| root |
begin
token
case
ascii + of
scan -> token
root term PLUS >>New -> root
0 \ go on
endof
ascii - of
scan -> token
root term MINUS >>New -> root
0 \ go on
endof
1 swap \ terminate
endcase
until
root
;
token.for do.expr IS expr
: calc
0 locals| root |
begin
cr ." Enter expression:" cr
fillpad
scan -> token
expr -> root
token BAD = if syntax.error then
root 0= not if
cr cr ." Result = " root >>Eval . cr
root >>Discard
then
again
;
: setup.everything
1000 context @ >>New -> symtab
make.dyad -> PLUS
token.for plus.msgs PLUS elemaction!
make.dyad -> MINUS
token.for minus.msgs MINUS elemaction!
make.dyad -> TIMES
token.for times.msgs TIMES elemaction!
make.dyad -> DIVIDE
token.for divide.msgs DIVIDE elemaction!
make.dyad -> EQUALS
token.for equals.msgs EQUALS elemaction!
make.unary -> UMINUS
token.for uminus.msgs UMINUS elemaction!
make.num -> INUMBER
token.for inumber.msgs INUMBER elemaction!
make.ID -> ID
token.for ID.msgs ID elemaction!
;
: go
setup.everything
calc
;