APL
Volume Number: 1
Issue Number: 12
Column Tag: APL Adventures
A Beginner's look at Mac APL
By Allyn Weaks, Seattle, WA.
I've wanted to learn APL ever since I heard that it could do integrals in one line and matrix inversions in a single symbol. Until PortaAPL for the Mac came along, I never had access to a computer with a full implementation. I'm hardly an expert at this language, having played with it for all of a month now, so please forgive any minor blunders. This month I'll start with a description of APL, enough of a tutorial to get you started defining simple functions (including an overview of the syntax, data types, and some of the simpler functions), and a brief review of PortaAPL for the Mac. I'll spend as few articles as possible in straight APL tutorial mode before starting in on Mac-specific features and bigger programs.
APL, A Programming Language, was developed by Kenneth Iverson at IBM in the early 60's, which makes it one of the older computer languages. It is more of a mathematical notation that can be interpreted by a computer than a traditional computer language, and handles data in groups (arrays) rather than just in small pieces.
Though it's a natural language for scientific and mathematical problems, in the U.S. it seems to be used mostly as a business language. Almost all of the textbooks currently in print are slanted towards business. Insurance companies in particular like it, because APL is ideal for statistical programs. In Europe and Canada it caught on more quickly and is used a lot in the scientific community for data analysis.
Since it is interpreted, APL runs slower than most compiled languages. But it runs much faster than other interpreted languages such as Basic. The programs are much shorter, so APL spends most of its time working on the problem, not interpreting the commands. You also get the advantages of an interpreter - interactive writing and debugging. Functions are defined by name as in Forth or Logo, and you can define local variables, so it's easy to define new functions as you go along.
APL is extremely powerful. It's recursive and self-modifying, with dynamic allocation of data storage. It does calculations on entire arrays at once, not just element by element. This power has its disadvantages, the primary one being memory limitations. Arrays require lots of memory. On a 512k Mac, the largest two dimensional array of integers you can deal with is about 225 by 225, which isn't big enough for many problems. Sometimes you have to use a less elegant algorithm to solve a problem. But then, who has ever had enough memory?
The peculiar character set APL uses (⍳, ⍴, ∇, ⍝, ⎕, etc.) has given it the reputation of being unreadable. Pascal programmers have been heard to say that APL programmers hand each other sections of code and say 'Bet you can't figure out what this does!'. But APL isn't really harder to read than other languages once you know what the symbols mean. And since APL can work on groups of data all at once, instead of just a little at a time, the code is much more compact. You can see the whole thing at once, instead of getting lost manipulating indices in the middle of triply nested loops.
APL is very different from Fortran, C, Forth, or Lisp. The symbols, syntax, and natural algorithms all take some getting used to. The first hurdle is the keyboard layout. Unshifted characters give upper-case letters; shifted characters and numbers give the APL symbols. You'll need to keep the keyboard map handy for reference at first, though most of the symbols are easy to find once you know the mnemonics (e.g. ⍳ (iota) is shift-I, ⍴ (rho) is shift-R). If you can't stand it, there is an ASCII mode that substitutes keywords and ASCII punctuation for the symbols, but if you plan to read standard APL books, or trade programs with other APL users, it's better to do it right. The real notation is also more compact and easily read - the symbols stand out.
The next difficulty is the syntax. There is no hierarchy of operators - all APL expressions are evaluated from right to left. If you need to change the order of operation, you must use parentheses. For example, the expression 5-6-2 is equivalent to 5-(6-2) = 1 in APL, not (5-6)-2 = -3 as in most languages. And 5×6-2 equals 5×(6-2) = 20, not 28.
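If you're coming from another language, it may help to see the rule modelled outside of APL. What follows is only a rough Python sketch: the name `apl_eval`, the token-list input, and the letter "x" standing in for the times sign are all my own assumptions, and it handles nothing beyond a flat chain of scalar dyadic functions without parentheses.

```python
import operator

# Dyadic scalar functions, keyed by the symbol used in the expression.
# A tiny illustrative subset; "x" stands in for the APL times sign.
FUNCS = {"+": operator.add, "-": operator.sub, "x": operator.mul}


def apl_eval(tokens):
    """Evaluate a flat list such as [5, "-", 6, "-", 2] the APL way.

    There is no precedence: each function takes everything to its
    right as its right argument, so evaluation runs right to left.
    """
    result = tokens[-1]
    # Walk leftwards over (left value, function) pairs.
    for i in range(len(tokens) - 2, 0, -2):
        func = FUNCS[tokens[i]]
        left = tokens[i - 1]
        result = func(left, result)
    return result


print(apl_eval([5, "-", 6, "-", 2]))  # 5-(6-2) gives 1
print(apl_eval([5, "x", 6, "-", 2]))  # 5x(6-2) gives 20
```

The loop never looks at which function it is applying before deciding the order, which is the whole point: position alone determines grouping.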
The terminology is a bit different too. What is usually called an operator, such as + or -, is a function in APL, because it returns a value. An operator is a special thing that systematically modifies the effect of a function. Functions operate on data, and operators operate on functions. Operators are one of the things that make APL so powerful - many mathematical ideas such as the inner product (given two vectors (a, b, c) and (x, y, z), the standard inner product is ax+by+cz, a scalar) can be generalized to any functions, not just the sum of the products.
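To make the idea of an operator a little more concrete, here is a hedged Python sketch of a generalized inner product on two vectors. The name `inner_product` and its argument order are assumptions of mine for illustration; it models only the vector case.

```python
from functools import reduce
import operator


def inner_product(f, g, left, right):
    """Combine two equal-length vectors pairwise with g, then fold with f.

    The fold runs right to left, in keeping with APL's evaluation order.
    """
    if len(left) != len(right):
        raise ValueError("vectors must have the same length")
    pairs = [g(a, b) for a, b in zip(left, right)]
    return reduce(lambda acc, x: f(x, acc), reversed(pairs))


# The standard inner product: the sum of the products.
print(inner_product(operator.add, operator.mul, [1, 2, 3], [4, 5, 6]))  # 32

# Same skeleton, different functions: the smallest of the pairwise sums.
print(inner_product(min, operator.add, [1, 2, 3], [4, 5, 6]))  # 5
```

Nothing in the skeleton cares that the two functions happen to be plus and times, which is why the same operator serves for many different calculations.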
Functions can be niladic, monadic, or dyadic. Niladic functions take no arguments - some system commands such as )OFF are niladic. Monadic functions take one argument, and are called by the sequence FUNCTION ARGUMENT, such as the factorial of N (!N). Dyadic functions, which take two arguments, are called as ARGUMENT FUNCTION ARGUMENT, as in 5 + 3. This is true of user defined functions as well as the built in primitives.
People who insist on strongly typed languages can stop here and read the Pascal article instead. The standard data types are boolean, character, integer, and floating point, but these distinctions are less important than the structure of the data into scalars, vectors, and larger arrays. With a few exceptions, an argument to any function can be a scalar, vector, or array, and it is irrelevant whether the data is a character string, an integer, or a set of boolean values. A scalar is a single number or character, without dimension or length. A vector is a set of data that forms a one dimensional array; a matrix has two dimensions. Arrays can have up to 63 dimensions in PortaAPL, though you'll never use that many - even with only two one-bit elements along each dimension, you'd need a billion gigabytes to hold the array. All elements of an array must have the same data type, so if you put a single floating point number in a large array, every element will be floating point. Type conversions are taken care of automatically and transparently at the interpreter's discretion, though there are ways to force a conversion to a particular type. On the Mac, boolean values are stored as bits, characters as bytes, integers as 4 byte words, and floating point as 8 byte long words. A character string is a vector of characters; a character matrix is a set of strings, one per row.
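The storage figures are easy to check with a little arithmetic. This Python sketch assumes only the sizes quoted in this article (one bit per boolean, four bytes per integer); the variable names are mine.

```python
# Storage sizes quoted in the text for the Mac implementation.
BITS_PER_BOOLEAN = 1
BYTES_PER_INTEGER = 4

# A 63-dimensional boolean array with length 2 along every dimension.
elements = 2 ** 63
total_bytes = elements * BITS_PER_BOOLEAN // 8
gigabytes = total_bytes / 2 ** 30
print(f"{gigabytes:.3g} gigabytes")  # 1.07e+09, i.e. about a billion

# The 225 by 225 integer matrix mentioned earlier in the article.
matrix_bytes = 225 * 225 * BYTES_PER_INTEGER
print(matrix_bytes)  # 202500 bytes, just under 198k
```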
In addition to dividing functions into classes according to how many arguments they take, you can divide them into three groups according to their effect. Scalar functions change the data in an array, but not the shape or size of the array. Restructuring functions change the shape or dimension of an array, but not the data values. Mixed functions change both the data and the shape, and will have to wait for a later article. Scalar functions can operate on arrays, not just scalars, but they operate element by element. For example, you can add two vectors of the same length:
2 4 6 11 8 + 7 43 1 6 31
9 47 7 17 39
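For comparison, the same element-by-element behaviour can be sketched in Python. The helper name `scalar_apply` is hypothetical, and it covers only two vectors of equal length, as in the example above.

```python
import operator


def scalar_apply(f, left, right):
    """Apply a dyadic function element by element to two equal-length vectors."""
    if len(left) != len(right):
        raise ValueError("length error: vectors differ in length")
    return [f(a, b) for a, b in zip(left, right)]


print(scalar_apply(operator.add, [2, 4, 6, 11, 8], [7, 43, 1, 6, 31]))
# [9, 47, 7, 17, 39]
```

In APL the loop is implied by the function itself; here it has to be spelled out with `zip`.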
Iota (⍳) and rho (⍴) are two of the most important restructuring functions. Iota in its monadic form creates a vector of integers in ascending order from a scalar: ⍳6 gives the vector 1 2 3 4 5 6. Monadic rho gives the length of each dimension of an array: ⍴ 1 3 5 7 is 4. Dyadic rho creates an array with the dimensions given by the left argument, filled with the values given in the right argument. The right argument is repeated as many times as necessary to fill the array:
2 4 ⍴ 1 2 3
1 2 3 1
2 3 1 2
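Here is a rough Python model of these two functions, in case the cycling behaviour of dyadic rho isn't obvious. The names `iota`, `shape`, and `reshape` are my own, and the sketch deliberately covers only vectors and two-dimensional results.

```python
from itertools import cycle, islice


def iota(n):
    """Monadic iota: the integers 1..n as a list."""
    return list(range(1, n + 1))


def shape(vector):
    """Monadic rho on a vector: its length (only the vector case is modelled)."""
    return len(vector)


def reshape(dims, values):
    """Dyadic rho for a two-dimensional result, cycling through values."""
    rows, cols = dims
    stream = cycle(values)
    # Each row takes the next `cols` items from the endlessly repeating stream.
    return [list(islice(stream, cols)) for _ in range(rows)]


print(iota(6))                     # [1, 2, 3, 4, 5, 6]
print(shape([1, 3, 5, 7]))         # 4
print(reshape((2, 4), [1, 2, 3]))  # [[1, 2, 3, 1], [2, 3, 1, 2]]
```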
Some of the scalar functions will look deceptively familiar to programmers. + and - do what you expect, but * means the exponential or power, not multiply, and / is not divide, but something more complicated - it can be either a restructuring function (compress) or an operator (reduction). When used as the reduction operator, f/ inserts the function f between each element of a vector:
+/ 1 2 3 4
10
∨/ 1 0 0 1
1
-/ 1 2 3 4
¯2
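Because of the right-to-left rule, reduction with subtraction groups as 1-(2-(3-4)), which is why the last result is negative two rather than negative eight. A small Python sketch of reduction, with the hypothetical name `apl_reduce`, shows the grouping explicitly:

```python
from functools import reduce
import operator


def apl_reduce(f, vector):
    """Insert f between the elements and evaluate right to left."""
    return reduce(lambda acc, x: f(x, acc), reversed(vector))


print(apl_reduce(operator.add, [1, 2, 3, 4]))  # 10
print(apl_reduce(operator.or_, [1, 0, 0, 1]))  # 1
print(apl_reduce(operator.sub, [1, 2, 3, 4]))  # 1-(2-(3-4)) = -2
```

For functions like plus and or, the direction makes no difference; for minus it gives the alternating sum.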
Fig. 1 APL Table of Functions
There is a floating point benchmark that's become popular on usenet: how long it takes to calculate the harmonic series to 10000. This is the sum of 1/i from i=1 to 10000. Most languages take 5 or so lines to write this program; APL takes 9 characters, including the 10000. No looping is needed. Start by noticing that you want to do something with an index that ranges from 1 to 10000. Well, there's a function, iota, that will generate a vector with all those values. Then you need to take the reciprocal of each value, so just use the monadic ÷. To take the sum of the vector, use the reduction operator with addition. So the whole thing turns into +/÷⍳10000. How long does it take? 49 seconds, compared to 38 seconds for Aztec C, and 69 seconds for Megamax C. APL won't always do this well compared to compilers; this set of functions is extremely efficient in APL. Notice also the amount of memory used for a vector of 10000 4-byte integers.
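For anyone who wants to check the answer on another machine, the three steps translate directly. This is a Python sketch rather than the benchmark program itself, and the function name `harmonic` is just a convenient label.

```python
def harmonic(n):
    """Sum of 1/i for i = 1..n, built in the same three steps as the APL."""
    indices = range(1, n + 1)                 # iota: 1 2 3 ... n
    reciprocals = [1.0 / i for i in indices]  # monadic divide on each element
    return sum(reciprocals)                   # plus reduction


print(round(harmonic(10000), 4))  # 9.7876
```

The intermediate list is the Python counterpart of the vector APL builds; at 4 bytes per integer, the index vector alone is 40000 bytes.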
To turn this into a function so you can run it for any number of terms, enter the editor by typing ∇ Z←HARMONIC N <ret>. This line will appear as line zero at the top of the screen. Any variable appearing on line zero is a local variable; all others are global. If you don't assign the function to a dummy variable (Z in this case), you won't be able to call it from another function later. Next, starting with line one, type in your program. At some point, the final value of the program must be assigned to the same dummy variable as in line zero. When you're finished, the screen should look like this:
∇Z←HARMONIC N
[1] Z←+/÷⍳N
∇
You can move the cursor with the mouse. The lamp symbol (⍝) can be used at the beginning of a line to put in comments. Use command-z or the EDIT menu to exit the editor. To define a dyadic function, put one argument in front of the function name, and one after. DICE will roll N dice with M sides each:
∇Z←N DICE M;A;B
[1] ⍝ D&D dice roller - rolls N M-sided dice
[2] A←N⍴M
[3] B←?A
[4] Z←+/B
[5] ⍝ This can also be written on one line as:
[6] ⍝ Z←+/?N⍴M or Z←+/B←?A←N⍴M
∇
Line 2 sets up a vector with the same number of elements as there are dice and with each element equal to the number of sides of the die. Line 3 rolls the dice: the scalar function ?R chooses a random element of the vector (⍳R), and if you give ? a vector as an argument it will find a random number for each element. Line 4 just adds up all of the independent rolls. What sort of changes need to be made to roll several dice with different numbers of sides? None! M can be a vector containing the sides of each die. Make sure the number of elements agrees with the value of N, however, or you will get the wrong result. There's at least one way to take care of this without changing the function, and I'll leave the finding of it as a puzzle.
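A Python rendering of the same three steps might look like the sketch below. The name `dice` and the optional `rng` parameter are assumptions of mine (the latter only so the rolls can be made repeatable), and the comments paraphrase the APL lines in plain ASCII.

```python
from itertools import cycle, islice
import random


def dice(n, m, rng=random):
    """Roll n dice and return the total.

    m is either one integer (every die has m sides) or a sequence giving
    the number of sides of each die.
    """
    m_values = [m] if isinstance(m, int) else list(m)
    sides = list(islice(cycle(m_values), n))    # A <- N rho M
    rolls = [rng.randint(1, s) for s in sides]  # B <- ?A
    return sum(rolls)                           # Z <- +/B


print(dice(3, 6))            # three six-sided dice: somewhere from 3 to 18
print(dice(3, [4, 6, 20]))   # one each of d4, d6, d20: from 3 to 30
```

As in the APL version, nothing special is needed for mixed dice; the reshape step simply passes a vector of sides through when its length already matches n.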
In addition to the built in functions, APL has a number of system functions and system commands that are fairly standard from one implementation to another. System commands are prefaced with a right parenthesis, such as )OFF to exit to the Finder, )SAVE FILENAME to save the workspace, and )COPY FILENAME to add a workspace to the current workspace. System functions, or quad functions, start with the quad symbol ⎕. These can be used to inquire about or set various system variables, such as the precision of results with ⎕PP, or to manipulate files. There are also quad versions, ⎕OFF, ⎕COPY and some others, of the system commands so that they can be called from inside of functions.
PortaAPL is written and distributed by Portable Software in Cambridge MA (60 Aberdeen Avenue, Cambridge, MA 02138; (617) 547-2918). A 512k Mac is required. The latest version is 2.0m. The price is $275 for the interpreter, some very clear documentation, and several workspaces with sample programs and functions to access parts of the toolbox. Upgrades are available for $25. Best of all, the interpreter isn't copy protected, so you can run it easily from a hard disk.
Portable Software has been writing APL interpreters for several years and for several different computers including Vax and the IBM PC. PortaAPL is a full IBM mainframe implementation, with lots of nice extensions. In fact, the main reference manual provided is the official IBM APL manual. The differences from the IBM version are an improved screen oriented editor, and the workspace id format, which is system dependent. (Workspace id is a fancy term for program file name.) Extensions include ASCII mode, access to the Mac file system (though not the resource fork), and a way to write assembly language functions. The keyboard is a standard APL keyboard, with extensions for the Mac. Overstruck characters can be typed with the option key instead of key1-backspace-key2.
Another nice feature is a built in Datamedia 1520 APL terminal emulator. If you have access to APL on another computer, you can call up and work from your Mac. The Mac allows an improved keyboard set up, so you can send lowercase characters if you need to by holding down the option key. Unfortunately, there is no file transfer, so you can't send programs back and forth with it.
The program is written in C, so it isn't as compact or as fast as it could be. It's 180k, and allows a work area of about 221k. Integers are 4 bytes and floating point numbers are 8 bytes with a range of +/- 1.79E308. Floating point calculations use the SANE package. The largest allowed function is 9999 lines; the maximum length of a symbol is 77 characters and underscores are allowed in symbol names. 16 files can be opened at once.
Editing functions and character matrices is acceptable, and certainly an improvement over the standard APL editor. (Try reading through some of the descriptions of editing a function in one of the books listed at the end!) The cursor is moved with the mouse, and there are menu and keyboard commands to insert and delete lines. Because of the nice fonts on the Mac, overstruck characters are typed in with the option key, so the editor can be in insert mode all of the time. Cut and paste aren't available, which can be a nuisance.
An excellent DRILL program is provided, that randomly generates APL expressions at 5 levels for you to evaluate. Since it evaluates the answer you give it and compares that to the evaluation of the problem, you can add to the challenge by coming up with expressions that are equivalent, but use different functions.
One of the workspaces that comes with the package is called Goodies. It has lots of useful stuff in it, including peek and poke, a sound function for playing simple tunes, mouse functions, printer i/o redirection, and functions to let you access the modem port. There are also routines that copy text or graphics from the clipboard to the screen, or text from the screen to the clipboard.
As of 2.0m, PortaAPL supports much of quickdraw and menus, but not windows, resources, or controls. Desk accessories are available from inside the interpreter. Although the EDIT menu doesn't show the standard cut, copy, and paste functions, command-X, C, and V are functional when desk accessories are active. The calls to quickdraw and the menu manager are straightforward. To use them, you need to include the appropriate workspace. The quickdraw workspace includes functions for line drawing, rectangles, ovals, arcs, text fonts and scroll regions. Not all of quickdraw is supported - getpixel, regions, pictures, copybits, and icons are all missing. The lack of getpixel is going to make it much harder for me to do screen dumps to my non-Imagewriter printer. Menus are easily created by assigning a character matrix containing the items and the functions they run to a menu id number, then installing the menu. The interrupts are handled by APL if you are at the command level. From inside of programs, there is a function ⎕GETKEY that lets you ask for the appropriate events. Windows aren't supported, but you can define regions for input and output.
All in all, PortaAPL is a nice job. I haven't found any bugs yet while putting in sample programs from various sources. There is one thing to beware of - you must remember to save your work before you exit, because APL won't remind you. It would be nice to have complete access to the toolbox, but that will come eventually - if not from Portable Software, then from anyone who is willing to do a bit of assembly language programming. It's expensive compared to Basic, but it's much more fun.
I hope this has been enough to get you started. I recommend that you buy or borrow an APL text book and work through plenty of examples, as well as the Drill program. Next month, I'll get into loops and branches, the other operators, and a few ways to generate prime numbers.
A Programming Language by Kenneth E. Iverson
John Wiley & Sons 1962
An Introduction to APL by S. Pommier
Cambridge University Press 1984
APL: An Interactive Approach by Gilman & Allen J. Rose
John Wiley & Sons 1983
Structured Programming in APL by D. Geller & D. Freedman
Winthrop 1976