RISC on Mac II
Volume Number: 6
Issue Number: 9
Column Tag: Programmer's Forum
The Mac II On Steroids
By Paul Zarchan, Cambridge, MA
The Mainframe Potential Of The Mac
Introduction and Background
The 68000-based Macintosh was introduced in 1984, and its processing power remained virtually unchanged for approximately 3 years. A dramatic speed increase came with the introduction of the 68020-based Mac II in 1987. Ordinary applications such as word processing ran 4 times faster on the Mac II because of its higher clock rate (16 MHz vs 8 MHz) and increased number of bits (32 bits vs 16 bits), while numerically intensive programs ran 10 times faster because of the addition of the 68881 math coprocessor. In fact, for number crunching programs written in FORTRAN, a $5,000 Mac II ran nearly at the speed of a VAX 11/780 - a minicomputer costing $250,000 (Ref. 1).
Since 1987 there has not been a dramatic improvement in Macintosh running speeds. The introduction of the 68030-based Macintosh only slightly increased the speed of the 68020-based Mac II whereas higher clock rates have gradually accelerated speeds of the original Mac II by up to a factor of 3. Although a factor of 3 is not insignificant, it is not commensurate with the expectations of the microcomputer user community nor is it adequate for many mainframe-based scientific and engineering applications.
What's New?
Much has been written about the wall facing all microcomputers. Physics appears to place an upper limit of 100 to 150 MHz on achievable clock rates with silicon. Does that mean the best we can see in the future for the Mac is a mere threefold increase in speed? Fortunately the answer is no! For scientific and engineering applications written in FORTRAN, the Mac II can be made up to 30 times faster - not in the near future but right now! In other words, the Mac II can be given the number crunching capability of a mainframe.
A special board based on Motorola's new 88000 RISC architecture is available from Tektronix, and an 88000 FORTRAN compiler is available from Absoft, giving the Macintosh II a mainframe speed capability. The board, known as the RP88 Coprocessor Board, can be installed in a NuBus slot in approximately 2 minutes, and the FORTRAN compiler works in the MPW environment. Calculation-intensive programs are written and compiled in the 68020 Macintosh environment but executed (by double-clicking an icon on the screen) on the 88000. Data generated by the 88000-based program can be viewed on the screen and/or written to a file for viewing later. More advanced users can actually have portions of a program, such as the Macintosh interface, running on the 68020 and sophisticated algorithms running on the 88000.
Although RISC boards have been around for some time on a variety of hardware platforms, the Tektronix contribution is different in two important respects. First the extraordinary power of RISC can now easily be exploited from a high order language by engineers and scientists for plain vanilla code. C and FORTRAN compilers for the 88000 can not only be ordered but they are actually available. Secondly, we still have all the advantages that the Macintosh has to offer. In fact, when operating under MultiFinder it is possible, without additional programming, to have an 88000-based program running simultaneously with a 68020-based application, without loss of speed in either application.
What Is RISC? (Refs. 2, 3)
RISC is an acronym for reduced instruction set computer. It is a style of computer architecture that advocates shifting complexity from hardware and program run time to software and program compile time. At the heart of RISC are two important concepts:
Most instructions are effectively executed in a single machine cycle
Only those features that measurably affect performance are implemented in hardware
Apparently the first RISC machine was the IBM/801 minicomputer built in 1979. This computer, which was not a commercial product, had very fast memory and fixed format instructions that could execute in a single clock cycle. The IBM RT PC workstation was a commercial product introduced in 1986 based on the 801 technology. However the original RT was a failure commercially. One of the possible reasons for its lack of success was the absence of high level language support.
Today one only has to read the ads of scientific/engineering magazines to see that there are many RISC products in the microcomputer/workstation world. In this article we shall not attempt to compare one product versus another but merely show that the RISC product available for the Mac II yields an astounding leap in performance.
How Fast Is The 88000-Based Mac?
The whetstone benchmark, devised in England by H. Curnow and B. Wichmann and published in the Feb. 1976 issue of Computer Journal (Ref. 4), is an attempt to cover a typical mix of floating-point operations. This benchmark contains linear arrays and addition, subtraction, multiplication, division, and transcendental operations. Many computer manufacturers have rated their machines in terms of thousands of whetstones per second, or kwhet/s. Higher whetstone ratings mean more powerful machines. Table 1, based on the results of Reference 5, presents single precision and double precision whetstone ratings for several computing platforms including the 88000-based Mac II. In addition, the cost of the host computer is included in the Table to provide a sobering perspective. Here we consider cost to be the platform purchase price only. This neglects the cost of the many individuals required to operate and maintain the larger machines. In fact, the cost of this small army of technicians usually far exceeds the machine's purchase price!
Table 1 Whetstone Ratings For A Variety Of Computers
We can see from Table 1 that although the original Mac II is very fast, the addition of the 88000 RISC board speeds up the Mac II by a factor of 23 for single precision whetstones and a factor of 13 for double precision whetstones when the default compiler optimization is used. Much higher whetstone ratings for the 88000-based Mac can be achieved by using additional compiler optimization options. However these higher whetstone ratings (approximately a factor of 2 higher) are not indicative of general performance gains in a variety of applications.
Generally higher cost computers yield faster performance. However, Table 1 shows that cost is not always commensurate with the performance. For example, a VAX 11/780 is only 1.5 times as fast as a Macintosh II (double precision whetstones) and yet is 50 times more expensive. An IBM/3090 is 33 times faster than a Macintosh II and is 1000 times more expensive.
A 20 MHz 88000 Tektronix board with 8 Megabytes of memory costs $12,000 (less expensive versions are available too) and the Absoft 88000 FORTRAN compiler costs $2,000. Therefore the total cost of an 88000-based Mac is approximately $19,000 ($12,000 + $2,000 + $5,000). The Table indicates that the 88000-based Mac runs 2.4 times slower than the IBM 3090 supercomputer at 260 times less cost when the default compiler optimization is used. Although the 88000-based Mac is nearly 4 times more expensive than a conventional Mac II, it is from 13 to 23 times more powerful!
If we normalize the computer performance information of Table 1, as measured by whetstones per second, to the computer purchase price, we can generate "bang for the buck" information as was done in Ref. 5. More bang for the buck means that the computer yields a higher whetstone rating for less cost. Figure 1 presents this cost effectiveness information for single and double precision whetstones. The figure clearly shows that the 88000-based Mac (when the default compiler optimization is used) is more than two orders of magnitude more cost effective than super mini or mainframe computers and from 3 to 6 times more cost effective than a conventional Mac II. Most importantly, mainframe power is now available in a desktop microcomputer at very reasonable cost!
Figure 1 RISC Significantly Improves Cost Effectiveness of Mac II
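The "3 to 6 times" figure can be checked from the numbers quoted in the text alone. The short Python sketch below is only an illustration of that arithmetic: the variable and function names are my own, and it uses the quoted speed-up factors (13 and 23) and purchase prices rather than the whetstone ratings of Table 1.

```python
# Prices (US dollars) and speed-up factors as quoted in the article text.
MAC_II_COST = 5_000
RP88_BOARD_COST = 12_000
FORTRAN_88000_COST = 2_000


def relative_cost_effectiveness(speedup, cost, base_cost):
    """Speed-up per dollar, relative to the base machine (hypothetical helper)."""
    return speedup / (cost / base_cost)


risc_mac_cost = MAC_II_COST + RP88_BOARD_COST + FORTRAN_88000_COST
single = relative_cost_effectiveness(23, risc_mac_cost, MAC_II_COST)
double = relative_cost_effectiveness(13, risc_mac_cost, MAC_II_COST)

print(risc_mac_cost, round(single, 1), round(double, 1))  # 19000 6.1 3.4
```

Dividing each speed-up by the cost ratio of 3.8 gives roughly 6 for single precision and 3.4 for double precision, which matches the range read from Figure 1.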
How Fast Is The 88000-Based Mac On Actual Programs?
Whetstone benchmarks are meaningless unless they reflect how the computer will perform on actual programs. If a computer has a whetstone rating 20 times higher than that of another computing platform, the expectation is that normal (as written by non-computer professionals) FORTRAN programs will run 20 times faster on the more powerful computer. In the case of the 88000-based Mac we shall see that the whetstone rating is actually an underestimate of how powerful this enhanced microcomputer actually is.
A Monte Carlo program, whose source code is presented in Listing 1, was taken from Reference 6. This program simulates a missile-target engagement and involves the numerical integration of differential equations and a random input error source. Sets of fifty Monte Carlo runs are required to accumulate accurate statistics on performance as a function of flight time. Data from each Monte Carlo set (corresponding to a particular flight time) is post-processed, and the mean and standard deviation of each set are computed and written to a file. A glance at Listing 1 also shows how uniformly distributed random numbers are generated and how computer running time is calculated with the Absoft 88000 FORTRAN compiler.
__________________________________________________________
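      DIMENSION Z(1000)
      INTEGER RUN
      INTEGER*4 m(4),random
C     Record the starting count; it is divided by 60 at the end
      CALL times(m)
      ntim=m(1)
      OPEN(1,STATUS='NEW',FILE='DATFIL')
      VC=4000.
      XNT=96.6
      VM=3000.
      XNP=3.
      TAU=1.
      RUN=50
  106 CONTINUE
C     One Monte Carlo set of RUN runs for each flight time TF
      DO 60 TF=1,10
      Z1=0.
      DO 20 I=1,RUN
C     Scale the random integers by 2.1475e9 to set the start time
C     TSTART and the sign COEF
      K=random()
      SUM=K/2.1475e9
      TSTART=TF*SUM
      K1=random()
      PZ=K1/2.1475e9
      PZ=PZ-.5
      IF(PZ.GT.0.)THEN
         COEF=1.
      ELSE
         COEF=-1.
      ENDIF
      Y=0.
      YD=0.
      T=0.
      H=.01
      S=0.
      XNC=0.
      XNL=0.
C     Integrate with step H until the flight time TF is reached
   10 IF(T.GT.(TF-.0001))GOTO 999
      IF(T.LT.TSTART)THEN
         XNT=0.
      ELSE
         XNT=COEF*96.6
      ENDIF
C     Save the state, then evaluate the derivatives (first pass)
      YOLD=Y
      YDOLD=YD
      XNLOLD=XNL
      STEP=1
      GOTO 200
C     Euler step, then evaluate the derivatives again (second pass)
   66 STEP=2
      Y=Y+H*YD
      YD=YD+H*YDD
      XNL=XNL+H*XNLD
      T=T+H
      GOTO 200
C     Average the two passes (second-order Runge-Kutta)
   55 CONTINUE
      Y=.5*(YOLD+Y+H*YD)
      YD=.5*(YDOLD+YD+H*YDD)
      XNL=.5*(XNLOLD+XNL+H*XNLD)
      S=S+H
      GOTO 10
C     Derivative evaluation shared by both passes
  200 CONTINUE
      TGO=TF-T+.00001
      RTM=VC*TGO
      XLAMD=(RTM*YD+Y*VC)/(RTM**2)
      XNC=XNP*VC*XLAMD
      XNLD=(XNC-XNL)/TAU
      YDD=XNT-XNL
      IF(STEP-1)66,66,55
C     End of run: store the final Y and update the running mean
  999 CONTINUE
      Z(I)=Y
      Z1=Z(I)+Z1
      XMEAN=Z1/I
   20 CONTINUE
C     Post-process the set: standard deviation about the mean
      SIGMA=0.
      Z1=0.
      DO 50 I=1,RUN
      Z1=(Z(I)-XMEAN)**2+Z1
      IF(I.EQ.1)THEN
         SIGMA=0.
      ELSE
         SIGMA=SQRT(Z1/(I-1))
      ENDIF
   50 CONTINUE
      WRITE(9,*)TF,SIGMA,XMEAN
      WRITE(1,*)TF,',',SIGMA,',',XMEAN
   60 CONTINUE
      CLOSE(1)
C     Elapsed running time
      CALL times(m)
      ztim=(m(1)-ntim)/60.
      WRITE(9,*)ztim
      PAUSE
      END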
_____________________________________________________________
Listing 1 Monte Carlo Program FORTRAN Source Code
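For readers without a FORTRAN compiler at hand, the structure of Listing 1 can be sketched in Python. This is a rough transcription rather than the author's code: the function names are my own, Python's random module stands in for the Absoft random() routine, and Python computes in double precision, so the numbers will not match a single precision FORTRAN run digit for digit.

```python
import math
import random

VC, XNP, TAU, H = 4000.0, 3.0, 1.0, 0.01   # constants taken from Listing 1


def derivatives(t, tf, y, yd, xnl, xnt):
    """Right-hand sides evaluated at label 200 of Listing 1."""
    tgo = tf - t + 0.00001
    rtm = VC * tgo
    xlamd = (rtm * yd + y * VC) / rtm ** 2
    xnc = XNP * VC * xlamd
    xnld = (xnc - xnl) / TAU
    ydd = xnt - xnl
    return ydd, xnld


def miss_distance(tf, tstart, coef):
    """One run: second-order Runge-Kutta integration up to flight time tf."""
    y = yd = xnl = t = 0.0
    while t <= tf - 0.0001:
        xnt = 0.0 if t < tstart else coef * 96.6
        y_old, yd_old, xnl_old = y, yd, xnl
        ydd, xnld = derivatives(t, tf, y, yd, xnl, xnt)
        # First pass: Euler step (label 66)
        y += H * yd
        yd += H * ydd
        xnl += H * xnld
        t += H
        ydd, xnld = derivatives(t, tf, y, yd, xnl, xnt)
        # Second pass: average with the saved state (label 55)
        y = 0.5 * (y_old + y + H * yd)
        yd = 0.5 * (yd_old + yd + H * ydd)
        xnl = 0.5 * (xnl_old + xnl + H * xnld)
    return y


def monte_carlo_set(tf, runs=50, rng=random):
    """Mean and standard deviation of the final y over one set of runs."""
    misses = []
    for _ in range(runs):
        tstart = tf * rng.random()                      # uniform start time
        coef = 1.0 if rng.random() - 0.5 > 0.0 else -1.0  # random sign
        misses.append(miss_distance(tf, tstart, coef))
    mean = sum(misses) / runs
    sigma = math.sqrt(sum((z - mean) ** 2 for z in misses) / (runs - 1))
    return mean, sigma


rng = random.Random(1)   # fixed seed so the sketch is repeatable
for tf in (1.0, 2.0, 3.0):
    mean, sigma = monte_carlo_set(tf, rng=rng)
    print(tf, sigma, mean)
```

The two GOTO 200 passes of the listing become two calls to derivatives(), which makes the predictor and averaging steps of the integration easier to see.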
Table 2 compares the compile and running times using Absoft Version 2.3 FORTRAN for the Mac II and Absoft 88000 FORTRAN for the 88000-based Mac (using the default compiler optimization). In this table "compile" actually means compile, assemble and link. In other words, it is the time the user must wait after making a source code change to get an executable program. We can see that for this example the 88000-based Mac run time was 26 times faster than the Mac II. However, the compile times for the 88000 compiler are much higher. Apparently the price paid for dramatic increases in run time speed using RISC is a significant increase in the time needed to compile the source code.
Table 2 - RISC Yields Faster Run Times At Expense of Longer Compile Times
In general I have found that my applications using single precision arithmetic run from 20 to 30 times faster with the RISC board, while my double precision applications run from 10 to 20 times faster. The major annoyance with the 88000-based Mac is the much slower compile times with the 88000 FORTRAN (I was spoiled by Absoft's very fast compiler for the Mac II). Applications which consist of a few hundred lines of code take from 1 min to 4 min to generate executable code, whereas applications of more than 1000 lines take from 5 min to 15 min to compile, assemble and link. Making separate files for each program subroutine seems to speed up compilation on subsequent recompiles. However, the method that seems to work best for me is to develop the program under 68020 Absoft FORTRAN Version 2.3 and then to recompile under Absoft 88000 FORTRAN.
Is It Necessary To Learn MPW?
Although the 88000 board is easy to install and the ensuing performance gains breathtaking, the documentation leaves something to be desired. The initial documentation release had no FORTRAN examples and did not even tell you how to compile and execute a simple program. Some of the information provided was even scary. For example, the instructions for installing FORTRAN are: "The files listed above have been given to you on a tar formatted tape..." After searching frantically for the tape and drive I decided to call Tektronix for help. Fortunately they were pleasant and very helpful. In case future documentation releases are not more explicit, here is a step-by-step procedure for compiling and executing a program for the 88000.
The 88000 FORTRAN compiler runs in the MPW environment. After the source code is written using the MPW editor and named program.f (in this example whet.f), one pulls down the Build menu and clicks on "Create RP88 ..." as shown in Fig. 2.
Figure 2 - Step 1 In Using The 88000 FORTRAN Compiler
Next the user types in the name of the program output (i.e., the double-clickable icon) and clicks on the files button as shown in Fig. 3.
Figure 3 - Step 2 In Using The 88000 FORTRAN Compiler
A list of files will appear as shown in Fig. 4. The user double-clicks on the files of interest. After all the files are selected, the user clicks on done.
Figure 4 - Step 3 In Using The 88000 FORTRAN Compiler
In step 4 the user clicks on the CreateMake88 button.
Figure 5 - Step 4 In Using The 88000 FORTRAN Compiler
Finally the user pulls down the Build menu for the last time and clicks on Build...
Figure 6 - Step 5 In Using The 88000 FORTRAN Compiler
A dialog box comes up and the user types in the program name (if it does not already appear) and then clicks on OK.
Figure 7 - Step 6 In Using The 88000 FORTRAN Compiler
If there is a compilation error, the MPW worksheet will indicate the error and line number. Selecting the line and hitting the enter key will automatically take you to the offending line in the source code. If there are no errors, the MPW worksheet eventually indicates that the whole process is completed. At this time the user types in host88, a space and then the name of the program (in this case host88 whet) and hits the enter key. This command automatically launches the 88000-based program.
General Comments
I have used the RP88 and the 88000 FORTRAN compiler for approximately 3 months. The product allows me to tackle problems which were previously beyond my reach. I would highly recommend this product to any scientist or engineer who must do time-consuming number crunching problems or to any individual currently wasting money on excessive mainframe charges. When I first told a colleague about this product he actually thought nitrogen bottles and superconductivity were involved in achieving mainframe speeds with a microcomputer.
At work, skeptics became convinced of the utility of this product when we ported a mainframe covariance analysis program that uses double precision arithmetic. The program took 6 hrs to run on a Mac II. Only one line of code had to be modified to work with the 88000 FORTRAN compiler. In the first attempt, the program ran in 20 min with the 88000. We saw that the 88000 bottleneck was excessive writing to the screen (this was originally done in the 68020 version of the code just to let the user know that the program was alive). In writing to the screen, the 88000 must communicate with the 68020, causing the 88000 to spend a great deal of time waiting. By writing the data to a file (for viewing later) and eliminating writing to the screen when using the 88000 compiler, we cut the run time down to 10 min. In addition, with MultiFinder we can make batch runs in the background while using the 68020 portion of the Mac for other productive work.
Current pricing information on the Tektronix RP88 can be obtained from Tektronix, PO Box 500, MS 50-662, Beaverton, Oregon 97077 (800-TEK-WIDE ext. 8800). Information on the Absoft 88000 FORTRAN compiler can be obtained from Absoft, 2781 Bond Street, Rochester Hills, MI 48309 (313-853-0095).
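As a quick check on what those run times imply, the arithmetic can be written out in a few lines of Python. This is only a sketch under the assumption that the quoted figures are total wall-clock run times; the variable names are my own.

```python
# Run times quoted above, in minutes.
mac_ii_min = 6 * 60          # 68020 Mac II
first_attempt_min = 20       # 88000, still writing progress to the screen
file_output_min = 10         # 88000, results written to a file instead

speedup_first = mac_ii_min / first_attempt_min
speedup_final = mac_ii_min / file_output_min
# Share of the first 88000 run that went to screen output and waiting
screen_share = (first_attempt_min - file_output_min) / first_attempt_min

print(speedup_first, speedup_final, screen_share)  # 18.0 36.0 0.5
```

In other words, half of the first 88000 run was spent on screen output, and removing it doubled the speed-up over the Mac II from 18 to 36.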
References
1 Zarchan, P., "New Mac Workstation Potential," MacTutor, Vol. 3, March 1987, pp. 15-21.
2 Hennessy, J., "VLSI RISC Processors," VLSI Systems Design, Oct. 1985, pp. 22-32.
3 Robinson, P., "How Much of a RISC," BYTE, Vol. 12, April 1987, pp. 143-150.
4 Curnow, H. J., and Wichmann, B. A., "Synthetic Benchmark," Computer Journal, Vol. 19, Feb. 1976, pp. 43-49.
5 Zarchan, P., "Benchmarks Re-Visited," MacTutor, Vol. 3, Sept. 1987, pp. 78-80.
6 Zarchan, P., Tactical and Strategic Missile Guidance, Vol. 124, Progress in Astronautics and Aeronautics, AIAA, Washington, DC, 1990.