Aug 97 Challenge
Volume Number: 13 (1997)
Issue Number: 8
Column Tag: Programmer's Challenge
Programmer's Challenge
by Bob Boonstra, Westford, MA
Stratego®
In recognition of Deep Blue's chess victory over Garry Kasparov (and I would have said "In celebration of...", except that I have mixed feelings about the outcome), we return to our series of board game tournament Challenges. This month, you will be writing a program to play the game Stratego. Stratego is a board game played on a rectangular 10x10 grid. Each player begins the game with 40 pieces, one of which is a Flag. The object of the game is to find and capture the opponent's flag.
The pieces provided to each player at the beginning of the game, in order of descending rank, are as follows (quantity in parentheses): Marshall (1), General (1), Colonels (2), Majors (3), Captains (4), Lieutenants (4), Sergeants (4), Miners (5), Scouts (8), and Spy (1). In addition, each player is given Bombs (6) and a Flag (1). At the start of play, each player places his pieces in the four rows nearest his side of the board such that their rank, which is visible from one side of a piece but not from the other, is hidden from the opponent.
Players alternate making moves, with Red moving first, then Blue. Each turn consists of either moving a piece into an open square, or striking an opponent's piece in an adjacent square. Except for the Scout, Flag, and Bomb pieces, each piece can move one square forward, backward, or sideways (but not diagonally) on a given turn. Flags and Bombs cannot move at all. Scouts can move any number of open squares forward, backward, or sideways (but again, not diagonally).
If a piece is moved into an adjacent square occupied by an opponent, the move is referred to as a "strike", which results in the lower ranking piece being removed from the board, and the higher ranking piece moving into the square formerly occupied by the losing piece. When both pieces involved in a strike are of equal rank, both pieces are removed. A Marshall defeats a General, a General defeats a Colonel, etc. The Spy is the lowest ranking piece, and is defeated by any other piece, except that a Spy defeats a Marshall when the Spy initiates the strike. Any piece except a Miner striking a Bomb is removed (and the Bomb remains in its original position). When a Miner strikes a Bomb, the Bomb is removed and the Miner moves into the square formerly occupied by the Bomb. The Flag and the Bomb cannot move and cannot initiate a strike. The game ends when a player strikes the opponent's Flag.
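The strike rules above can be condensed into a single function. The following is only a sketch: the SketchRank enum, the StrikeOutcome struct, and the ResolveStrike name are hypothetical and are not part of the Challenge interface, and the Bomb and Flag codes are assumptions added so that every rule can be expressed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rank codes: a lower value means a higher rank, following
   the descending order given in the rules. rBomb and rFlag are
   assumptions made for this sketch. */
typedef enum {
    rMarshall = 1, rGeneral, rColonel, rMajor, rCaptain,
    rLieutenant, rSergeant, rMiner, rScout, rSpy,
    rBomb, rFlag
} SketchRank;

typedef struct {
    bool attackerRemoved;
    bool defenderRemoved;
    bool victory;
} StrikeOutcome;

StrikeOutcome ResolveStrike(SketchRank attacker, SketchRank defender)
{
    StrikeOutcome out = { false, false, false };

    /* Bombs and Flags cannot move, so they can never initiate a strike. */
    assert(attacker != rBomb && attacker != rFlag);

    if (defender == rFlag) {
        out.victory = true;              /* striking the Flag ends the game */
    } else if (defender == rBomb) {
        if (attacker == rMiner)
            out.defenderRemoved = true;  /* only a Miner removes a Bomb */
        else
            out.attackerRemoved = true;  /* the Bomb stays in place */
    } else if (attacker == rSpy && defender == rMarshall) {
        out.defenderRemoved = true;      /* Spy wins only when it strikes */
    } else if (attacker == defender) {
        out.attackerRemoved = true;      /* equal rank: both are removed */
        out.defenderRemoved = true;
    } else if (attacker < defender) {
        out.defenderRemoved = true;      /* higher rank wins */
    } else {
        out.attackerRemoved = true;
    }
    return out;
}
```

Note that ResolveStrike(rMarshall, rSpy) removes the Spy, while ResolveStrike(rSpy, rMarshall) removes the Marshall: the Spy's advantage depends on who initiates the strike.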
Your code will be competing in a round-robin tournament against other Challenge entries. The prototype for the code you should write is:
#define kBoardSize 10
typedef enum { kUnknown=0,
kMarshall=1,kGeneral,kColonel,kMajor,kCaptain,
kLieutenant,kSergeant,kMiner,kScout,kSpy
} PieceRank;
typedef enum {kRed=1,kBlue=2} PlayerColor;
typedef struct PieceType {
PieceRank thePieceRank; /* rank of a piece */
PlayerColor thePieceColor; /* color of a piece */
} PieceType;
typedef PieceType Board[kBoardSize][kBoardSize];
/* Used to provide test code with board configuration. Red starts
in rows 0..3, Blue starts in rows 6..9 */
/* Squares [4][2], [4][3], [4][6], [4][7] and [5][2], [5][3], [5][6], [5][7] are water
and cannot be occupied */
typedef struct PiecePosition {
long row; /* 0..9 */
long col; /* 0..9 */
} PiecePosition;
typedef struct MoveResult {
PieceType attacker; /* after a strike, returns rank of attacker */
PieceType defender; /* after a strike, returns rank of defender */
Boolean attackerRemoved; /* true after a strike against a piece of equal
or greater rank, or against a bomb when the
attacker is not a Miner */
Boolean defenderRemoved; /* true after a strike by a piece of equal or
greater rank, or against a bomb when the
attacker is a Miner, or against a Marshall by a
Spy */
Boolean victory; /* true after a strike against the Flag */
Boolean legalMove; /* true unless you
- move into an occupied square, or
- move or strike in a direction other than forward, backward, or
sideways, or
- move more than one square (except Scouts), or
- move a Bomb or a Flag, or
- move into Water, or
- strike a square not occupied by an opponent, or
- make any other illegal move */
} MoveResult;
void PositionPieces(
void *privStorage, /* 1MB of preinitialized storage for your use */
PlayerColor playerColor, /* you play red or blue, with red playing first */
Board *theBoard /* provide the initial position of your pieces */
);
typedef void (*ReportYourMove)(
/* Callback to inform test code of move and get results */
PiecePosition *moveFrom, /* piece you are moving or using to strike */
PiecePosition *moveTo, /* destination square or square being struck */
Boolean strike, /* false indicates a move, true indicates a strike */
MoveResult *results /* returns identity of struck piece and other info */
);
typedef void (*GetOpponentMove)(
/* Callback to get results of opponent's last move */
PiecePosition *moveFrom,/* piece opponent moved or used to strike */
PiecePosition *moveTo, /* destination square or square struck */
Boolean *strike, /* false indicates a move, true indicates a strike */
MoveResult *results /* returns identity of struck piece and other info */
);
Boolean /* TRUE claims victory or illegal opponent move */ MakeAMove(
void *privStorage, /* 1MB of storage from PositionPieces */
PlayerColor playerColor, /* you play red or blue, with red playing first */
GetOpponentMove *GetMove, /* callback used to find out about opponent's last move */
ReportYourMove *ReportMove /* callback used to make a move */
);
Your PositionPieces routine will be called at the beginning of each game so that you provide the initial position of your pieces. If your playerColor is kRed, you should occupy rows 0..3 of the board, otherwise you should occupy rows 6..9. You should set theBoard[row][col].thePieceColor to your playerColor and theBoard[row][col].thePieceRank to the PieceRank you are placing on each square. As indicated in the typedef for theBoard, eight squares in the middle rows of the board are water and cannot be occupied by pieces of either side.
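As an illustration of filling in theBoard, here is a minimal sketch of a PositionPieces routine that lays the pieces out in a fixed order, starting from the row nearest the player's own edge. The declarations are repeated so the sketch compiles on its own. The kBomb and kFlag constants are assumptions (the enum as printed above does not list them), and the layout itself is deliberately naive and has no strategic merit.

```c
#include <assert.h>

#define kBoardSize 10

typedef enum { kUnknown=0,
    kMarshall=1,kGeneral,kColonel,kMajor,kCaptain,
    kLieutenant,kSergeant,kMiner,kScout,kSpy,
    kBomb,kFlag        /* ASSUMED: not in the enum as printed */
} PieceRank;
typedef enum {kRed=1,kBlue=2} PlayerColor;
typedef struct PieceType {
    PieceRank thePieceRank;
    PlayerColor thePieceColor;
} PieceType;
typedef PieceType Board[kBoardSize][kBoardSize];

void PositionPieces(void *privStorage, PlayerColor playerColor,
                    Board *theBoard)
{
    /* Piece counts from the rules; they total 40. */
    static const struct { PieceRank rank; int count; } kSetup[] = {
        {kFlag,1}, {kBomb,6}, {kSpy,1}, {kScout,8}, {kMiner,5},
        {kSergeant,4}, {kLieutenant,4}, {kCaptain,4}, {kMajor,3},
        {kColonel,2}, {kGeneral,1}, {kMarshall,1}
    };
    int numKinds = (int)(sizeof(kSetup) / sizeof(kSetup[0]));
    int square = 0;   /* 0..39, counted outward from the player's back row */
    int i, n;

    (void)privStorage;   /* unused in this sketch */

    for (i = 0; i < numKinds; i++) {
        for (n = 0; n < kSetup[i].count; n++) {
            int depth = square / kBoardSize;   /* 0 = back row */
            int row = (playerColor == kRed) ? depth : (kBoardSize - 1 - depth);
            int col = square % kBoardSize;
            /* theBoard is a pointer to the array, so dereference first */
            (*theBoard)[row][col].thePieceRank = kSetup[i].rank;
            (*theBoard)[row][col].thePieceColor = playerColor;
            ++square;
        }
    }
    assert(square == 40);   /* every piece was placed exactly once */
}
```

Because theBoard has type Board *, squares are addressed as (*theBoard)[row][col] in this sketch. Red fills rows 0..3 and Blue fills rows 9 down to 6, so the Flag lands on each player's back row.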
Note that theBoard is used only to pass your initial position to the test code - you may choose your own data structure to keep track of your pieces and the knowledge gained about your opponent's pieces as the game progresses. PositionPieces will be provided with 1MB of preallocated storage pointed to by privStorage. This storage will be initialized to zero before PositionPieces is called and will persist between moves. You should not allocate any additional dynamic storage beyond that provided by privStorage. Small amounts of static storage are permitted.
The normal move sequence consists of finding out via the GetMove callback where your opponent moved on the previous turn, determining your next move, and finding out the results of that move via the ReportMove callback. Your MakeAMove code should include something like this:
if (!firstMove || (playerColor==kBlue)) {
(*GetMove)(&opponentMoveFrom,&opponentMoveTo,
&opponentStrike,&opponentResult);
}
/* respond to opponent move, and calculate your move */
(*ReportMove)(&myMoveFrom,&myMoveTo,myStrike,&myResult);
/* process results of move */
You provide ReportMove with the location from which and to which you are moving (moveFrom and moveTo, respectively), and set strike to TRUE if your move strikes one of your opponent's pieces. The callbacks will populate a MoveResult data structure to reflect the results of your opponent's last move or your current move. Flags are set to indicate whether the attacker or defender (or both) are removed as the result of an attack, and PieceType fields are populated when an attack reveals information about a piece. The key to winning a game is properly interpreting the movement and tactics of your opponent, deciding when and where to strike, and using the information gained in those strikes to best advantage. MakeAMove should return TRUE when you claim victory or when the callback indicates that an illegal move has been made.
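To make the call sequence concrete, here is a sketch of a complete MakeAMove that keeps its "first move" state in privStorage, which works because that storage is zeroed before PositionPieces is called. Several things here are assumptions made for illustration: the MyState layout, the simplified MoveResult (piece identity fields omitted), the Boolean typedef standing in for the Mac OS type, the placeholder move, and the two stub callbacks that play the role of the test code.

```c
#include <assert.h>

typedef unsigned char Boolean;         /* stand-in for the Mac OS type */
typedef enum { kRed = 1, kBlue = 2 } PlayerColor;
typedef struct PiecePosition { long row; long col; } PiecePosition;
typedef struct MoveResult {            /* piece identity fields omitted */
    Boolean attackerRemoved, defenderRemoved, victory, legalMove;
} MoveResult;
typedef void (*ReportYourMove)(PiecePosition *moveFrom, PiecePosition *moveTo,
                               Boolean strike, MoveResult *results);
typedef void (*GetOpponentMove)(PiecePosition *moveFrom, PiecePosition *moveTo,
                                Boolean *strike, MoveResult *results);

/* Hypothetical layout of the start of privStorage. */
typedef struct MyState { long movesMade; } MyState;

Boolean MakeAMove(void *privStorage, PlayerColor playerColor,
                  GetOpponentMove *GetMove, ReportYourMove *ReportMove)
{
    MyState *state = (MyState *)privStorage;
    PiecePosition oppFrom, oppTo, myFrom, myTo;
    Boolean oppStrike;
    MoveResult oppResult, myResult;

    assert(privStorage != 0);

    /* Red has no opponent move to fetch on the very first turn. */
    if (state->movesMade > 0 || playerColor == kBlue) {
        (*GetMove)(&oppFrom, &oppTo, &oppStrike, &oppResult);
        if (!oppResult.legalMove)
            return 1;                  /* report an illegal opponent move */
    }

    /* Placeholder move: a real entry would choose from its own board model. */
    myFrom.row = (playerColor == kRed) ? 3 : 6;
    myFrom.col = 0;
    myTo.row = (playerColor == kRed) ? 4 : 5;
    myTo.col = 0;
    (*ReportMove)(&myFrom, &myTo, 0, &myResult);

    ++state->movesMade;
    return myResult.victory;           /* TRUE claims victory */
}

/* Hypothetical stubs standing in for the test code's callbacks. */
static int gGetCalls, gReportCalls;

static void StubGetMove(PiecePosition *from, PiecePosition *to,
                        Boolean *strike, MoveResult *results)
{
    from->row = 6; from->col = 4; to->row = 5; to->col = 4;
    *strike = 0;
    results->attackerRemoved = results->defenderRemoved = 0;
    results->victory = 0; results->legalMove = 1;
    ++gGetCalls;
}

static void StubReportMove(PiecePosition *from, PiecePosition *to,
                           Boolean strike, MoveResult *results)
{
    (void)from; (void)to; (void)strike;
    results->attackerRemoved = results->defenderRemoved = 0;
    results->victory = 0; results->legalMove = 1;
    ++gReportCalls;
}
```

Since the prototype passes pointers to the callback pointers, a caller would store StubGetMove and StubReportMove in variables of type GetOpponentMove and ReportYourMove and pass the addresses of those variables.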
The Challenge will be scored by a tournament where each entry plays against each other entry an even number of times, half playing first and half playing second. Another fair tournament schedule might be used if a large number of entries are received. For each game, the winner will earn game points for the victory, minus an amount based on the execution time used to compute the moves, as follows:
game points = 10 - execution time in seconds
The loser will earn zero points and will not be penalized for the execution time expended in defeat. No points will be awarded for a tie. The player with the most total points for all games in the tournament will be the winner. This will be a native PowerPC Challenge, using the latest CodeWarrior environment. Solutions may be coded in C, C++, or Pascal.
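The per-game scoring rule can be written as a short function. This is only a sketch: the GamePoints name and its parameters are hypothetical, and because the rules above do not say whether a slow winner's score is clamped at zero, the sketch applies the formula as written.

```c
#include <assert.h>

/* Points earned by one player for one game.
   won is nonzero only for the winner; a loss or a tie scores zero. */
double GamePoints(int won, double executionSeconds)
{
    assert(executionSeconds >= 0.0);
    if (!won)
        return 0.0;                    /* no time penalty in defeat */
    return 10.0 - executionSeconds;    /* formula as stated, not clamped */
}
```

Under this rule a win that takes 2.5 seconds of execution time earns 7.5 points, and the same 2.5 seconds spent in a losing game costs nothing.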
Stratego is ® 1960 in the U.S. Patent Office and is © 1961 by the Milton Bradley Company.
Three Months Ago Winner
The Equation Evaluator Challenge posed in May required contestants to reproduce part of the functionality of Apple's Graphing Calculator. The problem was to parse an input equation and then evaluate it for combinations of up to three variables. Selecting a winner proved to be difficult, as none of the six entries I received proved to be completely correct. One of the nearly correct entries needed only a trivial code change to be completely correct, met all of the accuracy requirements, was among the fastest, and was the smallest of the top entries. Congratulations to Mark Day (Saratoga, CA) for submitting the winning entry, based on correctness, speed, and size.
Mark parses the input equation into a binary tree of TokenNode data structures, generates a sequence of PowerPC instructions in the CodeGen routine, and then executes those instructions to evaluate the input equation over the required set of values. As permitted by the problem statement, the CallGeneratedCode routine makes use of a small amount of assembly language to execute the generated code. The key to code generation is the subroutine CodeGenTree, which recursively produces the code corresponding to a given TokenNode, beginning with the root node. One thing to take note of as you look at the winning solution is the technique used to call library functions: a set of enums (e.g., kFuncSin) corresponds to an entry in the global function table gFunctions, which is accessed via a dedicated register (regFunctionBase) by the CallFunction code. Also note the use of MakeDataExecutable prior to execution of the generated code.
Turlough O'Connor also generates PowerPC instructions to solve the problem, but he demonstrates how to call the generated code without direct use of assembly language:
codeBlock = NewPtr(kCodeBlockSize);
// pageStart round up to 4K boundary
pageStart = (Ptr) (((UInt32)codeBlock + 0xFFF) & ~0xFFF);
gPC = (UInt32 *)pageStart;
prologStart = gPC;
// [SNIP] - generate prolog starting at pageStart, and code starting at funcStart
MakeDataExecutable(pageStart,
(UInt32)gPC - (UInt32)pageStart);
typedef struct {
void *proc;
void *rtoc;
} TVector;
typedef void (*GenFunc)(const Values *xP, const Values *yP,
const NValues *nP, Results *wP,
const ConstFuncTable *constTable);
TVector ourVec = {funcStart, 0};
GenFunc ourFunc = (GenFunc)&ourVec;
(ourFunc)(xP, yP, nP, w, &gConstFuncTable);
Three of the remaining solutions (Willeke Rieken, Ernst Munter, and John Nevard) took an approach that did not involve generating PowerPC code. After parsing the input equation, these solutions evaluated the equation directly using a virtual machine of one sort or another. While this approach was generally slower, one of the solutions was competitive and could have won had it not suffered from an accuracy problem in a test case involving a truncated Taylor series expansion of a trig function.
I evaluated the entries using 17 test cases, designed to execute in comparable amounts of time, and summed the execution times to produce the final score. The test cases included combinations of arithmetic operations, trigonometric functions, hyperbolic functions, square roots, logarithms, and exponentiation. One test case included the truncated Taylor series mentioned above. Several test cases used polar coordinates as the equation input. Two of the entries, including the winner, had problems with polar coordinates: the winner calculated the angle as atan2(x,y) instead of atan2(y,x), while another entry used the atan function instead of the atan2 function. The former was easily corrected, while the latter was not. Three other entries did not meet the relative accuracy requirement specified in the problem, and one incorrectly processed some of the test cases. Only the last entry was disqualified.
The table below lists for each entry the total execution time for all test cases, the severity of the errors in the solution, code and data sizes, and the programming language used. The number in parentheses after the entrant's name is the total number of Challenge points earned in all Challenges to date prior to this one.
Name | Time | Errors | Code | Data | Language |
Mark Day | 289 | atan2(x,y) | 7328 | 1364 | C/Asm |
Willeke Rieken | 284 | accuracy | 16796 | 1779 | C++ |
Turlough O'Connor (7) | 284 | atan | 13172 | 19283 | C++ |
Ernst Munter (242) | 341 | accuracy | 8904 | 794 | C++ |
John Nevard (17) | 418 | accuracy | 4520 | 2126 | C++ |
M. N. | 239 | correctness | 16356 | 1799 | C++ |
Here is Mark's winning solution (with some formatting changes and deletions to reduce length - see the ftp site for the full code listing):
Evaluate.c
© 1997 Mark Day
// Contains routines to parse an equation and drive the
// evaluation of that equation over a set of input values.
#include <ctype.h>
#include <string.h>
#include <math.h>
#undef HUGE_VAL // fp.h redefines this
#include <fp.h>
#include <Memory.h>
#include <OSUtils.h>
#include "Evaluate.h"
ulong gVariableFlags;
struct TokenLookup {
int token;
int which;
char *string;
};
typedef struct TokenLookup TokenLookup;
TokenLookup gTokenLookup[] = {
tokLeftParen, kLeftParen, "(",
tokRightParen, kRightParen, ")",
tokAddOp, kAdd, "+",
tokAddOp, kSubtract, "-",
tokMulOp, kMultiply, "*",
tokMulOp, kDivide, "/",
tokPowerOp, kExponent, "^",
tokRootOp, kFuncSquareRoot, "\\",
tokFactorial, kFuncFactorial, "!",
tokFunction, kFuncLogN, "ln",
tokFunction, kFuncLogTen, "log",
// Hyperbolic functions need to come first or else they
// would parse as the normal functions plus an extra
// "h" character (which makes for a syntax error)
tokFunction, kFuncSinh, "sinh",
tokFunction, kFuncCosh, "cosh",
tokFunction, kFuncTanh, "tanh",
tokFunction, kFuncSech, "sech",
tokFunction, kFuncCsch, "csch",
tokFunction, kFuncCoth, "coth",
tokFunction, kFuncSin, "sin",
tokFunction, kFuncCos, "cos",
tokFunction, kFuncTan, "tan",
tokFunction, kFuncSec, "sec",
tokFunction, kFuncCsc, "csc",
tokFunction, kFuncCot, "cot",
tokFunction, kFuncArcSin, "asin",
tokFunction, kFuncArcCos, "acos",
tokFunction, kFuncArcTan, "atan",
tokFunction, kFuncArcSec, "asec",
tokFunction, kFuncArcCsc, "acsc",
tokFunction, kFuncArcCot, "acot",
tokFunction, kFuncAbs, "abs",
tokVariable, kPi, "p", tokVariable, kE, "e",
tokVariable, kR, "r", tokVariable, kTheta, "t",
tokVariable, kX, "x", tokVariable, kY, "y",
tokVariable, kZ, "z", tokVariable, kN, "n",
tokAssign, kEquals, "=", tokInvalid, kInvalidToken, "",
};
//
// These are single precision versions of functions, not
// found in either math.h or fp.h
//
static float Factorial(float x) { return gamma(x+1.0); }
static float Secant(float x) { return 1.0 / cosf(x); }
static float CoSecant(float x) { return 1.0 / sinf(x); }
static float CoTangent(float x) { return 1.0 / tanf(x); }
static float ArcSecant(float x) { return acosf(1.0/x); }
static float ArcCoSecant(float x) { return asinf(1.0/x); }
static float ArcCoTangent(float x) { return atanf(1.0/x); }
static float HypSecant(float x) { return 1.0 / coshf(x); }
static float HypCoSecant(float x) { return 1.0 / sinhf(x); }
static float HypCoTangent(float x) { return 1.0 / tanhf(x); }
typedef float (*FloatFunc)(float); FloatFunc gFunctions[] = {
(FloatFunc) powf, fabsf, sqrtf, Factorial, expf, logf, log10f, sinf, cosf,
tanf, Secant,CoSecant, CoTangent, asinf, acosf, atanf, ArcSecant,
ArcCoSecant, ArcCoTangent, sinhf, coshf, tanhf, HypSecant, HypCoSecant,
HypCoTangent, (FloatFunc) atan2f
};
//
// Return the next token in the equation. Adjust the
// pointer to point just after the token.
//
int GetToken(char **equation, Constant *value)
{
int token = tokInvalid;
char *s = *equation;
int i, width;
TokenLookup *p;
//
// skip leading white space
//
while (*s == ' ' || *s == '\t' || *s == '\n') ++s;
if (*s == '\0') return tokInvalid;
//
// See if *s matches one of the token strings
//
p = gTokenLookup;
while (p->token != tokInvalid) {
char *tokenString = p->string;
size_t len = strlen(tokenString);
if (!strncmp(s, tokenString, len)) {
token = p->token; value->which = p->which;
s += len; break;
} else {
++p;
}
}
//
// Update the current position within the string.
//
*equation = s;
//
// Try to parse a numeric constant.
//
if (token == tokInvalid) {
token = GetNumberToken(equation, value);
}
return token;
}
//
// GetNumberToken recognizes numeric constants of the forms
// [0-9]+.[0-9]* or [0-9]*.[0-9]+
//
int GetNumberToken(char **equation, Constant *value)
{
char *s;
char c; // keep the next character in a register
long digits = 0; // used to determine whether we've seen any digits yet
float val; // the value of the constant
float divisor; // keeps track of place value of float digits
int token; // the kind of number we found
token = tokInvalid; // assume we didn't find a number
s = *equation; // point to start of string
c = *s; // get first char
//
// Gather the integer part of the constant
//
val = 0.0;
while (isdigit(c)) {
val = val*10.0+(c-'0'); // shift in the next digit
++digits; // remember we found a digit
c = *(++s); // get next character
}
//
// If we found an integer part, prepare to return it.
//
if (digits) {
value->f = val;
token = tokFloat;
}
//
// Now let's see if there is a trailing decimal point
// and optional digits. If so, then this is a floating
// point constant.
//
if (c == '.') {
c = *(++s); // skip over decimal point
divisor = 1.0; // haven't seen fraction part yet
while (isdigit(c)) {
val = val*10.0+(c-'0'); // shift in next digit
divisor *= 10; // remember to fix place value
++digits; // we saw another digit
c = *(++s); // get next character
}
//
// If there were digits before or after the decimal
// point, then we have a valid float. Otherwise,
// we only saw a decimal point.
//
if (digits) {
value->f = val / divisor; // this fixes the place value
token = tokFloat;
} else {
// We saw only a decimal point. Point us back at the decimal point.
--s;
}
}
*equation = s; // point beyond what we found
return token; // return what we found
}
TokenNode *NewTokenNode(TokenNode *left, TokenNode *right, int token)
{
TokenNode *node;
node = (TokenNode *) NewPtr(sizeof(TokenNode));
node->left = left; node->right = right;
node->token = token;
return node;
}
void FreeTree(TokenNode *root)
{
if (root) {
FreeTree(root->left); FreeTree(root->right);
DisposePtr((Ptr) root);
}
}
/*
A simple parser for our expression grammar. The grammar
looks like:
expression -> term ( ADD_OP term )*
term -> factor ( [ MULTIPLY_OP ] factor )*
factor -> SQUARE_ROOT factor
| signed EXPONENT factor
| signed
signed -> MINUS signed
| PLUS signed
| gamma
gamma -> simple_value ( FACTORIAL )*
simple_value-> NUMBER
| VARIABLE
| LEFT_PAREN expression RIGHT_PAREN
| FUNCTION LEFT_PAREN expression RIGHT_PAREN
statement-> VARIABLE EQUALS expression
where (...)* means repeated zero or more times, and
[...] means optional.
*/
TokenNode *ParseExpression(char **s)
{
char *temp;
TokenNode *left, *right, *node;
int token;
Constant value;
node = NULL;
// Get the first term
left = ParseTerm(s);
if (left == NULL)
return left;
node = left;
// Get successive terms, separated by tokAddOp's
do {
temp = *s;
token = GetToken(&temp, &value);
if (token == tokAddOp) {
*s = temp;
right = ParseTerm(s);
if (right) {
node = NewTokenNode(left, right, token);
node->value.which = value.which;
left = node; // this makes it left-associative
}
}
} while (token == tokAddOp);
return node;
}
TokenNode *ParseTerm(char **s)
{
char *temp;
TokenNode *left, *right, *node;
int token;
Constant value;
node = NULL;
left = ParseFactor(s);
if (left == NULL) return left;
node = left;
//
// Get successive factors. They may be separated by
// tokMulOp's, or they may be adjacent (implying
// multiplication). If we find a tokAddOp, then we
// have to unwind; otherwise, an expression like
// "3 - 7" would be parsed as "3 * (-7)".
//
do {
temp = *s;
token = GetToken(&temp, &value);
switch (token) {
// Explicit multiply/divide
case tokMulOp:
*s = temp;
right = ParseFactor(s);
if (right) {
node = NewTokenNode(left, right, token);
node->value.which = value.which;
left = node; // this makes it left-associative
}
break;
// Explicit add/subtract; don't do implicit multiply.
// Use the single factor only.
case tokAddOp:
node = left;
break;
// Implicit multiply or single factor
default:
// Try to find a second factor
temp = *s;
right = ParseFactor(s);
if (right) {
// Got one, so do implicit multiply
token = tokMulOp; // pretend there was an explicit multiply
node = NewTokenNode(left, right, token);
node->value.which = kMultiply;
left = node; // make it left-associative
} else {
// Nope, just a single factor
*s = temp; // put back the tokens ParseFactor gobbled
node = left;
}
break;
}
} while (token == tokMulOp);
return node;
}
// ParseFactor, ParseNegative, ParseFactorial deleted for brevity
// See ftp site for complete code listing
TokenNode *ParseSimpleValue(char **s)
{
char *temp;
TokenNode *node, *left;
int token;
Constant value;
left = NULL; node = NULL;
temp = *s;
token = GetToken(s, &value);
switch (token) {
case tokFloat:
left = NewTokenNode(NULL, NULL, token);
left->value = value;
++gNumConstants;
break;
case tokVariable:
left = NewTokenNode(NULL, NULL, token);
left->value = value;
switch (value.which) {
case kPi: gVariableFlags |= kPiMask; break;
case kE: gVariableFlags |= kEMask; break;
case kR:
// using r implies using x and y since r is computed from x and y
gVariableFlags |= kRMask + kXMask + kYMask;
break;
case kTheta:
// using theta implies using x and y since theta is computed from x & y
gVariableFlags |= kThetaMask + kXMask + kYMask;
break;
case kX: gVariableFlags |= kXMask; break;
case kY: gVariableFlags |= kYMask; break;
case kN: gVariableFlags |= kNMask; break;
}
break;
case tokFunction:
node = NewTokenNode(NULL, NULL, token);
node->value.which = value.which;
//
// If the function was abs(), then convert to a
// special token so it can be evaluated more
// efficiently.
//
if (value.which == kFuncAbs) node->token = tokAbs;
token = GetToken(s, &value);
if (token != tokLeftParen) { // Missing left parenthesis
node = NULL;
}
// Fall into processing of parenthesized expression
case tokLeftParen:
left = ParseExpression(s);
token = GetToken(s, &value); // swallow right parenthesis
break;
case tokRightParen:
// A right parenthesis means we're doing
// a recursive parse of an expression. In that
// case, pretend like we hit the end of string.
*s = temp; // back up so caller finds right parenthesis
break;
case tokInvalid: break;
default: // An unexpected token was seen
break;
}
//
// If it was a function call, then point to the argument.
//
if (node == NULL) node = left;
else node->left = left;
return node;
}
TokenNode *ParseStatement(char **s)
{
TokenNode *node, *left, *right;
int token;
Constant value;
//
// Get the variable
//
token = GetToken(s, &value);
if (token == tokVariable) {
left = NewTokenNode(NULL, NULL, token);
left->value = value;
if (value.which == kZ)
gVariableFlags |= kFunctionOfY; // "z=" implies yP is valid input
}
//
// Get the assignment token
//
token = GetToken(s, &value);
if (token != tokAssign) { // Missing "="
return NULL;
}
//
// Get the expression
//
right = ParseExpression(s);
//
// Build the node
//
if (right) {
node = NewTokenNode(left, right, token);
node->value.which = value.which;
} else {
node = NULL;
}
return node;
}
//
// Here's the main entry point for the challenge.
//
void Evaluate(
char *equation, // null-terminated equation to evaluate
const Values *xP, // input values for x
const Values *yP, // input values for y
const IntValues *nP, // input values for n
Results w[]) // preallocated storage for equation values
{
char *s = equation;
TokenNode *root;
Values nFloatValues;
//
// Set up the floating point values of n
//
nFloatValues.first = nP->first;
nFloatValues.delta = nP->delta;
nFloatValues.howMany = nP->howMany;
//
// Parse the input equation
//
gVariableFlags = 0;
root = ParseStatement(&s);
//
// Generate code to evaluate the equation
//
CodeGen(root->right);
MakeDataExecutable(gCodeBase, 32768);
CallGeneratedCode(gCodeBase+4, xP, yP, nP,
&nFloatValues, w,
gConstants, gFunctions,
2.718281828459, 3.1415926535898);
//
// Free up the memory we allocated.
//
DisposePtr((Ptr) gCodeBase);
gCodeBase = NULL;
gNextInstruction = NULL;
FreeTree(root);
if (gConstants) {
DisposePtr((Ptr) gConstants);
gConstants = NULL;
}
}
CodeGen.c
// Contains various routines used to produce PowerPC object code
// (in memory) to evaluate an equation at a range of input values.
#include <Memory.h>
#include "Evaluate.h"
// Some simple macros for producing the PowerPC instructions in the generated
// code. The order of the arguments matches the order you'd write those arguments
// in assembly language source.
#define op_add(rD, rA, rB) (0x7C000214 + \
(rD<<21) + (rA<<16) + (rB<<11))
#define op_addi(rD, rA, simm) (0x38000000 + \
(rD<<21) + (rA<<16) + (simm & 0xFFFF))
#define op_b(offset) (0x48000000 + (offset & 0x03FFFFFC))
#define op_bctr() (0x4e800420)
#define op_bl(offset) (0x48000001 + (offset & 0x03FFFFFC))
#define op_blr() (0x4e800020)
#define op_bne(offset) (0x40820000 + (offset & 0xFFFF))
#define op_cmplwi(rA, uimm) (0x28000000 + \
(rA<<16) + (uimm & 0xFFFF))
#define op_fabs(frD, frB) (0xFC000210 + (frD<<21)+(frB<<11))
#define op_fadds(frD, frA, frB) (0xEC00002A + (frD<<21) + \
(frA<<16) + (frB<<11))
#define op_fdivs(frD, frA, frB) (0xEC000024 + (frD<<21) + \
(frA<<16) + (frB<<11))
#define op_fmadds(frD, frA, frC, frB) (0xEC00003A + (frD<<21) + \
(frA<<16) + (frB<<11) + (frC<<6))
#define op_fmr(frD, frS) (0xFC000090 + (frD<<21) + (frS<<11))
#define op_fmuls(frD, frA, frC) (0xEC000032 + (frD<<21) + \
(frA<<16) + (frC<<6))
#define op_fneg(frD, frB) (0xFC000050 + (frD<<21) + (frB<<11))
#define op_fsubs(frD, frA, frB) (0xEC000028 + (frD<<21) + \
(frA<<16) + (frB<<11))
#define op_lfs(frD, d, rA) (0xC0000000 + (frD<<21) + \
(rA<<16) + d)
#define op_lwz(rD, d, rA) (0x80000000 + (rD<<21) + \
(rA<<16) + d)
#define op_mflr(rD) (0x7c0802a6 + (rD<<21))
#define op_mtctr(rS) (0x7c0903a6 + (rS<<21))
#define op_mtlr(rS) (0x7c0803a6 + (rS<<21))
#define op_nop() (0x60000000)
#define op_stfs(frS, d, rA) (0xD0000000 + (frS<<21) + \
(rA<<16) + d)
#define op_stfsu(frS, d, rA) (0xD4000000 + (frS<<21) + \
(rA<<16) + d)
#define op_stwu(rS, d, rA) (0x94000000 + (rS<<21) + \
(rA<<16) + d)
#define op_subi(rD, rA, simm) op_addi(rD, rA, -simm)
/*
Some missing optimizations:
* Loop unrolling (** probably the most important opportunity **)
* Common sub-expression elimination
* Example: It's usually faster to compute both cos(x) and sin(x)
in a single operation than computing them separately.
The trick is recognizing the opportunity; CSE is a start.
* Take advantage of trig identities
* Redundant load/store
* Sometimes moves values to and from non-volatile storage when the
intervening operations don't change volatile registers.
* Doesn't combine multiply and add into a single instruction.
* Negation should be combined with previous fmul or fabs instruction.
*/
ulong regUsage[32]; // non-zero means register[x] is currently in use
ulong numMemoryVars; // number of temporaries not in registers
ulong gNumConstants;
float *gConstants = NULL;
ulong *gCodeBase,*gNextInstruction;
InitCodeGen
void InitCodeGen(ulong *codeStart)
{
int i;
gCodeBase = codeStart;
gNextInstruction = codeStart;
for (i=0; i<32; i++)
regUsage[i] = 0; // mark registers as not in use (i.e. available)
// put constants in overflow register pool
numMemoryVars = gNumConstants;
// For variables that are used, mark them
//
if (gVariableFlags & kNMask) regUsage[regN] = 1;
if (gVariableFlags & kXMask) regUsage[regX] = 1;
if (gVariableFlags & kYMask) regUsage[regY] = 1;
if (gVariableFlags & kRMask) regUsage[regR] = 1;
if (gVariableFlags & kThetaMask) regUsage[regTheta] = 1;
if (gVariableFlags & kEMask) regUsage[regE] = 1;
if (gVariableFlags & kPiMask) regUsage[regPi] = 1;
//
// Allocate memory for constants
//
gConstants = (float *) NewPtr(gNumConstants * sizeof(float));
gNumConstants = 0;
}
AllocateRegister
// AllocateRegister: reserve a non-volatile register or memory location
// to hold some value.
//
static ulong AllocateRegister(void)
{
int i;
//
// Try allocating a real register.
//
for (i=regFreeBase; i<regMax; i++) {
if (regUsage[i] == 0) {
++regUsage[i];
return i;
}
}
//
// Didn't work. Need to use memory.
//
return regMemory + numMemoryVars++;
}
UseRegister
// UseRegister: Mark a register as in use. If the given register
// is a non-volatile register (not a memory location), then increment
its usage count. This is similar to AllocateRegister, but when
// you already have the register number.
//
static void UseRegister(ulong which)
{
if (which >= regFreeBase && which < regMax) {
++regUsage[which];
}
}
// FreeRegister, StoreRegister, NonVolatileRegister have been
// deleted for brevity. See ftp site for full code listing.
LoadRegister
// LoadRegister: Make sure the source register/memory is in a real
// register. If not, move it to the destination (which is a real
// register) via a lfs instruction.
//
static ulong LoadRegister(ulong source, ulong destination)
{
ulong displacement;
if (source >= regMemory) {
displacement = (source-regMemory)*4;
*(gNextInstruction++) =
op_lfs(destination, displacement, regVariableBase);
return destination;
} else {
return source;
}
}
TemporaryResult
// TemporaryResult: Generate a (real) register number to be a destination register
// for some instruction. The destination has already been determined and is in
// "which".
// If "which" is memory, then use a volatile register, which will be stored later.
//
static ulong TemporaryResult(ulong which)
{
if (which >= regMemory) return fpTempResult;
else return which;
}
StoreResult
// StoreResult: the second part of TemporaryResult. If TemporaryResult was passed
// a memory location, then this function takes care of actually storing the result
// (after it has been put in the temporary volatile register).
//
static void StoreResult(ulong which)
{
if (which >= regMemory) {
StoreRegister(which, fpTempResult);
}
}
MoveRegister
//MoveRegister: Move register to memory, register to register, or memory to register.
//
static void MoveRegister(ulong destination, ulong source)
{
if (source >= regMax) { // Source is memory.
if (destination < regMax)
LoadRegister(source, destination);
return;
}
if (destination >= regMax) {
// Destination is in memory, so generate a stfs instruction
StoreRegister(destination, source);
} else {
// Source and destination are registers, so generate fmr instruction
*(gNextInstruction++) = op_fmr(destination, source);
}
}
CallFunction
// CallFunction: Call a function. Put the descriptor pointer into R12 and
// use the function call glue at 0 to jump to the desired routine.
//
static void CallFunction(ulong which)
{
long displacement;
//
// Load the pointer to the desired function into r12
//
displacement = which*4;
*(gNextInstruction++) =
op_lwz(regFunctionGlue, displacement, regFunctionBase);
//
// Generate a bl instruction to the glue at the start of the generated code.
//
displacement = (ulong) gCodeBase - (ulong) gNextInstruction;
*(gNextInstruction++) = op_bl(displacement);
}
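The backward branch to the glue can be sketched as follows. OffsetToCodeBase and EncodeBl are hypothetical helpers, not the article's op_bl; the encoding assumed is the standard PowerPC I-form branch (primary opcode 18, LK bit set so the link register receives the return address). The offset is relative to the bl instruction itself, which is why the listing computes it before advancing gNextInstruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset from the word about to be emitted back to word 0 of the buffer. */
static int32_t OffsetToCodeBase(size_t wordsEmitted)
{
    return -(int32_t)(wordsEmitted * sizeof(uint32_t));
}

/* Hypothetical encoder: bl with a relative, word-aligned byte offset. */
static uint32_t EncodeBl(int32_t byteOffset)
{
    assert((byteOffset & 3) == 0);
    assert(byteOffset >= -0x02000000 && byteOffset < 0x02000000);
    return (18u << 26) | ((uint32_t)byteOffset & 0x03FFFFFCu) | 1u;
}
```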
CodeGenTree
/*
Inputs:
root (sub)tree to generate code for
resultRegister desired register for result (0 = any)
Result:
Register containing result. If resultRegister was non-zero, then
this will be the same as resultRegister.
*/
int CodeGenTree(TokenNode *root, int resultRegister)
{
char *opname;
ulong temp, temp2, arg1, arg2;
int result;
if (root == NULL) {
return 0;
}
switch (root->token) {
case tokFloat:
// Stuff the constant in memory in the first part of the overflow register area.
gConstants[gNumConstants] = root->value.f;
// If the result needs to go into a specific register, then load that register from
// memory. Otherwise, return an overflow register number (which will get
// loaded into a temporary register just before being used).
if (resultRegister) {
// Explicitly move value to a given register
result = resultRegister;
LoadRegister( regMemory + gNumConstants,
resultRegister);
} else { // Tell caller where the value is so they can fetch it
result = regMemory + gNumConstants;
}
gNumConstants++;
return result;
case tokAddOp:
temp = CodeGenTree(root->left, 0);
temp = NonVolatileRegister(temp);
// make sure it's in a non-volatile reg.
temp2 = CodeGenTree(root->right, 0);
// so we can evaluate second arg
FreeRegister(temp);
FreeRegister(temp2);
if (resultRegister) result = resultRegister;
else result = AllocateRegister();
arg1 = LoadRegister(temp, fpTemp1);
arg2 = LoadRegister(temp2, fpTemp2);
if (root->value.which == kAdd) {
*(gNextInstruction++) =
op_fadds(TemporaryResult(result), arg1, arg2);
} else {
*(gNextInstruction++) =
op_fsubs(TemporaryResult(result), arg1, arg2);
}
StoreResult(result);
return result;
case tokMulOp:
temp = CodeGenTree(root->left, 0);
temp = NonVolatileRegister(temp);
// make sure it's in a non-volatile reg.
temp2 = CodeGenTree(root->right, 0);
// so we can evaluate second arg
FreeRegister(temp);
FreeRegister(temp2);
if (resultRegister) result = resultRegister;
else result = AllocateRegister();
arg1 = LoadRegister(temp, fpTemp1);
arg2 = LoadRegister(temp2, fpTemp2);
if (root->value.which == kMultiply) {
*(gNextInstruction++) =
op_fmuls(TemporaryResult(result), arg1, arg2);
} else {
*(gNextInstruction++) =
op_fdivs(TemporaryResult(result), arg1, arg2);
}
StoreResult(result);
return result;
case tokPowerOp:
// Evaluate second argument, put in any register.
temp = CodeGenTree(root->right, 0);
// make sure it's in a non-volatile reg.
temp = NonVolatileRegister(temp);
// Evaluate first argument, put into fpArg1.
CodeGenTree(root->left, fpArg1);
// Move second argument to fpArg2. If the first argument didn't require a
// function call, we could have put second argument directly into fpArg2
MoveRegister(fpArg2, temp);
FreeRegister(temp);
// Call power function
CallFunction(kFuncPower);
// Return result in desired register
temp = fpResult;
if (resultRegister && resultRegister != temp) {
MoveRegister(resultRegister, temp);
temp = resultRegister;
}
return temp;
case tokNegate:
temp = CodeGenTree(root->left, 0);
FreeRegister(temp);
if (resultRegister) result = resultRegister;
else result = AllocateRegister();
arg1 = LoadRegister(temp, fpTemp1);
*(gNextInstruction++) =
op_fneg(TemporaryResult(result), arg1);
StoreResult(result);
return result;
case tokAbs:
temp = CodeGenTree(root->left, 0);
FreeRegister(temp);
if (resultRegister) result = resultRegister;
else result = AllocateRegister();
arg1 = LoadRegister(temp, fpTemp1);
*(gNextInstruction++) =
op_fabs(TemporaryResult(result), arg1);
StoreResult(result);
return result;
case tokFunction:
// Evaluate the argument (root->left), putting it in fp1.
temp = CodeGenTree(root->left, fpArg1);
// Call the function
CallFunction(root->value.which);
// Put the result in the desired location
temp = fpResult;
if (resultRegister && resultRegister != temp) {
MoveRegister(resultRegister, temp);
temp = resultRegister;
}
return temp;
case tokVariable:
switch (root->value.which) {
case kPi: temp = regPi; break;
case kE: temp = regE; break;
case kR: temp = regR; break;
case kTheta: temp = regTheta; break;
case kX: temp = regX; break;
case kY: temp = regY; break;
case kN: temp = regN; break;
default: return 0; // unknown variable; should never get here
}
if (resultRegister && resultRegister != temp) {
MoveRegister(resultRegister, temp);
temp = resultRegister;
}
UseRegister(temp); // balance FreeRegister when this value is used
return temp;
default:
// Should never get here.
return 0;
}
// Should never get here.
return 0;
}
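The overall strategy of CodeGenTree is a post-order walk: generate both operands, then the operator, and return the register holding the result. The sketch below shows that shape in miniature. It is not the article's code: Node, Emitter, Gen, and GenExample are hypothetical, the output is textual pseudo-assembly rather than instruction words, and the register allocator simply never reuses a register.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct Node {
    char op;                      /* '+', '-', '*', '/', or 0 for a constant leaf */
    float value;                  /* used only when op == 0 */
    const struct Node *left, *right;
} Node;

typedef struct {
    char text[256];               /* emitted pseudo-assembly */
    size_t used;
    int nextReg;                  /* trivial allocator: registers are never reused */
} Emitter;

static void Emit(Emitter *e, const char *line)
{
    size_t n = strlen(line);
    assert(e->used + n < sizeof e->text);
    memcpy(e->text + e->used, line, n + 1);
    e->used += n;
}

/* Post-order walk; returns the register that holds the subtree's value. */
static int Gen(Emitter *e, const Node *n)
{
    char line[48];
    const char *mnemonic = "fadds";
    int a, b, r;

    if (n->op == 0) {
        r = e->nextReg++;
        snprintf(line, sizeof line, "const f%d,%g\n", r, n->value);
        Emit(e, line);
        return r;
    }
    a = Gen(e, n->left);
    b = Gen(e, n->right);
    r = e->nextReg++;
    switch (n->op) {
    case '-': mnemonic = "fsubs"; break;
    case '*': mnemonic = "fmuls"; break;
    case '/': mnemonic = "fdivs"; break;
    }
    snprintf(line, sizeof line, "%s f%d,f%d,f%d\n", mnemonic, r, a, b);
    Emit(e, line);
    return r;
}

/* Generates pseudo-assembly for (2 + 3) * 4 and returns the text. */
static const char *GenExample(void)
{
    static Emitter e;
    static const Node two = {0, 2.0f, NULL, NULL};
    static const Node three = {0, 3.0f, NULL, NULL};
    static const Node four = {0, 4.0f, NULL, NULL};
    static const Node sum = {'+', 0.0f, &two, &three};
    static const Node product = {'*', 0.0f, &sum, &four};

    e.used = 0;
    e.text[0] = '\0';
    e.nextReg = 1;
    Gen(&e, &product);
    return e.text;
}
```

The real CodeGenTree differs in that it frees operand registers before allocating the result, which lets the result reuse an operand's register.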
CodeGenRTheta
// If r or theta (t) is used to evaluate the equation, generate some code
// to set up their values based on the values of x and y.
//
static void CodeGenRTheta(void)
{
if (gVariableFlags & kRMask) {
// r = sqrtf(x*x + y*y)
*(gNextInstruction++) = op_fmuls(fpArg1, regX, regX);
*(gNextInstruction++) = op_fmadds(fpArg1, regY, regY,
fpArg1);
CallFunction(kFuncSquareRoot);
MoveRegister(regR, fpResult);
}
if (gVariableFlags & kThetaMask) {
// JRB correction - theta = atan2(y,x), not atan2(x,y)
MoveRegister(fpArg1, regY);
MoveRegister(fpArg2, regX);
// end correction
CallFunction(kFuncAtan2);
MoveRegister(regTheta, fpResult);
}
}
CodeGenStoreResult
// Generate code to store one equation result plus the variables
// used to produce that result. Assumes that the wP register (regResults) points
// to 4 bytes BEFORE the location to start storing (so we can use the
// store-with-update instructions).
//
static void CodeGenStoreResult(ulong result)
{
// Store equation result
*(gNextInstruction++) = op_stfsu(result, 4, regResults);
// Store x
*(gNextInstruction++) = op_stfsu(regX, 4, regResults);
if (gVariableFlags & kFunctionOfY) {
// Store both y and n
*(gNextInstruction++) = op_stfsu(regY, 4, regResults);
*(gNextInstruction++) = op_stwu(regNInteger, 4,
regResults);
} else { // Skip over y and store n
*(gNextInstruction++) = op_stwu(regNInteger, 8,
regResults);
}
}
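The store-with-update pattern can be emulated in portable C as shown below. This is a sketch under assumptions: the 16-byte record layout {result, x, y, n} is inferred from the offsets in CodeGenStoreResult, and StoreWordUpdate, FloatBits, and StoreRecord are hypothetical names. Each store first advances the pointer and then writes, so the pointer must start 4 bytes before the first record.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Emulates stfsu/stwu: advance the pointer by offset bytes, then store one word. */
static uint8_t *StoreWordUpdate(uint8_t *p, uint32_t word, int offset)
{
    assert(offset == 4 || offset == 8);
    p += offset;
    memcpy(p, &word, sizeof word);
    return p;
}

/* Reinterprets a float as its 32-bit pattern without violating aliasing rules. */
static uint32_t FloatBits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Stores one record; p points 4 bytes before it and ends 4 bytes before the next. */
static uint8_t *StoreRecord(uint8_t *p, float result, float x, float y,
                            int32_t n, int functionOfY)
{
    p = StoreWordUpdate(p, FloatBits(result), 4);
    p = StoreWordUpdate(p, FloatBits(x), 4);
    if (functionOfY) {
        p = StoreWordUpdate(p, FloatBits(y), 4);
        p = StoreWordUpdate(p, (uint32_t)n, 4);
    } else {
        p = StoreWordUpdate(p, (uint32_t)n, 8);   /* skip over the y slot */
    }
    return p;
}
```

A caller would reserve 4 bytes of padding ahead of the first record so that the starting pointer stays inside the buffer.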
CodeGenLoopInit
// Generate code to initialize the loop variables. Load the first value, load the count
// (howMany), then branch to the test part of the loop (after the body of the loop).
// Since we don't know where the test will be, we just skip an instruction word and
// will insert the branch when we generate the test.
//
static void CodeGenLoopInit(ulong variableFlags, ulong
**xBranch, ulong **yBranch, ulong **nBranch)
{
// Generate the loop for x first
if (variableFlags & kXMask) {
*(gNextInstruction++) = op_lfs(regX, 0, regXValues);
// x = xP->first
*(gNextInstruction++) = op_lwz(regXCount, 8, regXValues);
// xCount = xP->howMany
*xBranch = gNextInstruction;
// remember branch address
*(gNextInstruction++) = op_nop();
// will be patched later
}
// Now generate the loop for y, if function was of the form "z=..."
if ((variableFlags & kFunctionOfY) && (variableFlags & kYMask)) {
*(gNextInstruction++) = op_lfs(regY, 0, regYValues);
// y = yP->first
*(gNextInstruction++) = op_lwz(regYCount, 8, regYValues);
// yCount = yP->howMany
*yBranch = gNextInstruction;
*(gNextInstruction++) = op_nop();
// will be patched later
}
// And lastly, n. N is a bit different since we maintain
// both the integer and floating point values simultaneously.
if (variableFlags & kNMask) {
*(gNextInstruction++) = op_lfs(regN, 0, regNValues);
// n = nFloat->first
*(gNextInstruction++) = op_lwz(regNInteger, 0, regNIntValues);
// nInteger = nP->first
*(gNextInstruction++) = op_lwz(regNCount, 8, regNIntValues);
// nCount = nP->howMany
*nBranch = gNextInstruction;
*(gNextInstruction++) = op_nop();
// will be patched later
}
}
CodeGenLoopEnd
// Generate the iteration step and test/branch for the variable loops.
// Just before we generate the test/branch instructions, patch up the
// branch instructions inserted by CodeGenLoopInit.
//
// The general form generates something like:
// iteration: x += xP->delta;
// xCount--;
// test: if (xCount != 0) goto loop_body
//
// where loop_body is the instruction following the forward branch that gets patched.
//
static void CodeGenLoopEnd(ulong variableFlags, ulong *xBranch, ulong *yBranch, ulong *nBranch)
{
ulong offset;
// Generate the part for n. Remember that we must increment both the integer
// and floating point values of n.
if (variableFlags & kNMask) {
*(gNextInstruction++) = op_lfs(fpTemp1, 4, regNValues);
// get nFloatValues->delta
*(gNextInstruction++) = op_fadds(regN, regN, fpTemp1);
// increment n (float)
*(gNextInstruction++) = op_lwz(regIntTemp1, 4, regNIntValues);
// get nP->delta
*(gNextInstruction++) =
op_add(regNInteger, regNInteger, regIntTemp1);
// increment n
*(gNextInstruction++) = op_subi(regNCount, regNCount, 1);
// nCount--
offset = (int) gNextInstruction - (int) nBranch;
*nBranch = op_b(offset);
// patch branch at start of loop
*(gNextInstruction++) = op_cmplwi(regNCount, 0);
// nCount == 0?
offset = (int) (nBranch+1) - (int) gNextInstruction;
*(gNextInstruction++) = op_bne(offset);
// if not, branch to loop body
}
// Next comes y, if a loop was generated for it (i.e. equation was "z=...")
if ((variableFlags & kFunctionOfY) && (variableFlags & kYMask))
{
*(gNextInstruction++) = op_lfs(fpTemp1, 4, regYValues);
// get yP->delta
*(gNextInstruction++) = op_fadds(regY, regY, fpTemp1);
// increment y
*(gNextInstruction++) = op_subi(regYCount, regYCount, 1);
// yCount--
offset = (int) gNextInstruction - (int) yBranch;
*yBranch = op_b(offset);
// patch branch at start of loop
*(gNextInstruction++) = op_cmplwi(regYCount, 0);
// yCount == 0?
offset = (int) (yBranch+1) - (int) gNextInstruction;
*(gNextInstruction++) = op_bne(offset);
// if not, branch to loop body
}
// Lastly comes x.
//
if (variableFlags & kXMask) {
*(gNextInstruction++) = op_lfs(fpTemp1, 4, regXValues);
// get xP->delta
*(gNextInstruction++) = op_fadds(regX, regX, fpTemp1);
// increment x
*(gNextInstruction++) = op_subi(regXCount, regXCount, 1);
// xCount--
offset = (int) gNextInstruction - (int) xBranch;
*xBranch = op_b(offset);
// patch branch at start of loop
*(gNextInstruction++) = op_cmplwi(regXCount, 0);
// xCount == 0?
offset = (int) (xBranch+1) - (int) gNextInstruction;
*(gNextInstruction++) = op_bne(offset);
// if not, branch to loop body
}
}
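The backpatching used by CodeGenLoopInit and CodeGenLoopEnd can be sketched on a plain array of instruction words. EncodeB, EncodeBne, and EmitLoopSkeleton are hypothetical helpers; nops stand in for the loop body, the iteration step, and the compare. The encodings assumed are the standard PowerPC b (opcode 18) and bne on cr0 (opcode 16, BO = 4, BI = 2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { kNop = 0x60000000 };   /* ori r0,r0,0 */

static uint32_t EncodeB(int32_t off)
{
    assert((off & 3) == 0);
    return (18u << 26) | ((uint32_t)off & 0x03FFFFFCu);
}

static uint32_t EncodeBne(int32_t off)
{
    assert((off & 3) == 0 && off >= -0x8000 && off < 0x8000);
    return (16u << 26) | (4u << 21) | (2u << 16) | ((uint32_t)off & 0xFFFCu);
}

/* Emits a loop skeleton with bodyWords nops as the body; returns words emitted. */
static size_t EmitLoopSkeleton(uint32_t *code, size_t bodyWords)
{
    uint32_t *next = code;
    uint32_t *branch = next;            /* remember where the forward branch goes */
    size_t i;

    *next++ = kNop;                     /* placeholder, patched below */
    for (i = 0; i < bodyWords; i++)
        *next++ = kNop;                 /* loop body */
    *next++ = kNop;                     /* iteration step */
    *branch = EncodeB((int32_t)((next - branch) * 4));        /* forward to test */
    *next++ = kNop;                     /* stands in for cmplwi */
    *next = EncodeBne((int32_t)(((branch + 1) - next) * 4));  /* back to body */
    next++;
    return (size_t)(next - code);
}
```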
CodeGen
// Build up a function that loops over all the variable input values, evaluates
// the function, and stores the results.
//
// If the loops were written in C, they'd look like this:
// for (x=xP->first, xCount=xP->howMany; // loop initialization
// xCount != 0; // loop condition
// x+=xP->delta, xCount--) // iteration step
// {
// evaluate and store the result
// }
//
void CodeGen(TokenNode *root)
{
ulong result;
ulong *generatedCode;
ulong *xBranch, *yBranch, *nBranch;
generatedCode = (ulong *) NewPtr(32768);
if (generatedCode != NULL) {
InitCodeGen(generatedCode);
//
// Generate the pointer glue for making indirect function calls
//
*(gNextInstruction++) = op_lwz(regR0, 0, regFunctionGlue);
// lwz r0, 0(r12)
*(gNextInstruction++) = op_lwz(regTOC, 4, regFunctionGlue);
// lwz r2, 4(r12)
*(gNextInstruction++) = op_mtctr(regR0); // mtctr r0
*(gNextInstruction++) = op_bctr(); // bctr
//
// Generate the function prolog - save the link register
//
*(gNextInstruction++) = op_mflr(regSavedLR);
// mflr r15
//
// Generate loop initialization for any variables that _are_
// used to evaluate the function
//
CodeGenLoopInit(gVariableFlags, &xBranch, &yBranch, &nBranch);
//
// If r or theta are used to evaluate the equation, then compute them
// from x and y.
//
CodeGenRTheta();
//
// Generate the code to evaluate the equation. The equation can be put
// in any register. But make sure it ends up in a real register (not memory).
//
result = CodeGenTree(root, 0);
result = LoadRegister(result, fpResult);
//
// Generate loop initialization for any variables that _are not_ used
// to evaluate the function. This just causes the result to be stored
// several times in a row, without re-evaluating it.
//
CodeGenLoopInit(gVariableFlags ^ (kXMask+kYMask+kNMask),
&xBranch, &yBranch, &nBranch);
//
// Generate the code to store one result
//
CodeGenStoreResult(result);
FreeRegister(result);
//
// Generate the iteration and test parts of the variable loops
//
CodeGenLoopEnd(gVariableFlags ^ (kXMask+kYMask+kNMask),
xBranch, yBranch, nBranch);
CodeGenLoopEnd(gVariableFlags, xBranch, yBranch, nBranch);
//
// Generate the function epilog - restore link register and return
//
*(gNextInstruction++) = op_mtlr(regSavedLR);
// mtlr r15
*(gNextInstruction++) = op_blr();
// blr
}
}
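The control flow that the generated code follows for one variable can be written in C with explicit gotos, which makes the role of the patched forward branch visible: the loop is entered at the test, so a count of zero runs the body zero times. LoopValues is a hypothetical stand-in for the Values record, with a field order inferred from the offsets 0, 4, and 8 used in the listing; RunLoop is likewise illustrative only.

```c
#include <assert.h>

/* Field order inferred from the lfs/lwz offsets in the listing (an assumption). */
typedef struct {
    float first;
    float delta;
    long howMany;
} LoopValues;

/* Stores each visited value in out and returns how many were stored. */
static int RunLoop(const LoopValues *v, float *out, int capacity)
{
    float x = v->first;           /* loop initialization */
    long count = v->howMany;
    int stored = 0;

    goto test;                    /* the patched forward branch */
body:
    assert(stored < capacity);
    out[stored++] = x;            /* "evaluate and store the result" */
    x += v->delta;                /* iteration step */
    count--;
test:
    if (count != 0)               /* cmplwi / bne */
        goto body;
    return stored;
}
```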
CallGeneratedCode.c
// Contains a glue routine used to call the PowerPC code produced
// by CodeGen.c
//
#include "Evaluate.h"
typedef struct {
unsigned long savedSP;
unsigned long savedCR;
unsigned long savedLR;
unsigned long reserved1;
unsigned long reserved2;
unsigned long savedTOC;
} Linkage;
typedef struct {
Linkage linkage;
long arguments[8]; // set aside maximum room for called functions
long savedGPR[20]; // r13-r31 (last location is not used)
double savedFPR[18]; // fp14-fp31
} StackFrame;
enum { StackFrameSize = sizeof(StackFrame) };
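To see what offsets the assembly below works with, the frame can be restated with fixed-width types and checked with offsetof. This is a sketch, not the original declaration: Linkage32, StackFrame32, and SavedFprOffset are hypothetical names, and the figures assume 4-byte integers and 8-byte doubles as on 32-bit PowerPC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t savedSP, savedCR, savedLR, reserved1, reserved2, savedTOC;
} Linkage32;

typedef struct {
    Linkage32 linkage;        /* 24 bytes */
    int32_t arguments[8];     /* 32 bytes of parameter area */
    int32_t savedGPR[20];     /* 80 bytes, r13-r31 plus one unused slot */
    double savedFPR[18];      /* 144 bytes, fp14-fp31 */
} StackFrame32;

/* Byte offset from SP of the save slot for savedFPR[index]. */
static size_t SavedFprOffset(int index)
{
    assert(index >= 0 && index < 18);
    return offsetof(StackFrame32, savedFPR) + (size_t)index * sizeof(double);
}
```

Under these assumptions the frame is 280 bytes and fp14 is saved at offset 136 from the new stack pointer.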
CallGeneratedCode
// Saves registers and sets up parameter area for generated code. The generated code
// may save the link register in the linkage area. Since the generated code never uses
// the TOC register itself, it is sufficient to save and restore it here in case the
// generated code calls to a different fragment.
//
asm float CallGeneratedCode(
register void *addr, // first instruction to execute
register const Values *xValues,
register const Values *yValues,
register const IntValues *nIntValues,
register const Values *nValues,
register Results *resultsPtr,
register float *constants, register void *functions,
register float eValue, register float piValue)
{
//
// Save our return address and TOC
//
mflr r0
stw r2,Linkage.savedTOC(SP)
stw r0,Linkage.savedLR(SP)
//
// Make new stack frame
//
stwu SP,-StackFrameSize(SP)
//
// Save non-volatile FPRs
//
stfd fp14,StackFrame.savedFPR[0](SP)
stfd fp15,StackFrame.savedFPR[1](SP)
stfd fp16,StackFrame.savedFPR[2](SP)
stfd fp17,StackFrame.savedFPR[3](SP)
stfd fp18,StackFrame.savedFPR[4](SP)
stfd fp19,StackFrame.savedFPR[5](SP)
stfd fp20,StackFrame.savedFPR[6](SP)
stfd fp21,StackFrame.savedFPR[7](SP)
stfd fp22,StackFrame.savedFPR[8](SP)
stfd fp23,StackFrame.savedFPR[9](SP)
stfd fp24,StackFrame.savedFPR[10](SP)
stfd fp25,StackFrame.savedFPR[11](SP)
stfd fp26,StackFrame.savedFPR[12](SP)
stfd fp27,StackFrame.savedFPR[13](SP)
stfd fp28,StackFrame.savedFPR[14](SP)
stfd fp29,StackFrame.savedFPR[15](SP)
stfd fp30,StackFrame.savedFPR[16](SP)
stfd fp31,StackFrame.savedFPR[17](SP)
//
// Save non-volatile GPRs
//
stw r13,StackFrame.savedGPR[0](SP)
stw r14,StackFrame.savedGPR[1](SP)
stw r15,StackFrame.savedGPR[2](SP)
stw r16,StackFrame.savedGPR[3](SP)
stw r17,StackFrame.savedGPR[4](SP)
stw r18,StackFrame.savedGPR[5](SP)
stw r19,StackFrame.savedGPR[6](SP)
stw r20,StackFrame.savedGPR[7](SP)
stw r21,StackFrame.savedGPR[8](SP)
stw r22,StackFrame.savedGPR[9](SP)
stw r23,StackFrame.savedGPR[10](SP)
stw r24,StackFrame.savedGPR[11](SP)
//
// Pass input parameters to generated code
//
fmr fp30,eValue
fmr fp31,piValue
mr r13,constants
mr r14,functions
mr r16,xValues
mr r17,yValues
mr r18,nValues
mr r19,nIntValues
subi r20,resultsPtr,4 // set up results-4
//
// Call function at addr
//
mtctr addr
bl GoToCTR // this sets up LR to return to us
//
// Restore GPRs
//
lwz r13,StackFrame.savedGPR[0](SP)
lwz r14,StackFrame.savedGPR[1](SP)
lwz r15,StackFrame.savedGPR[2](SP)
lwz r16,StackFrame.savedGPR[3](SP)
lwz r17,StackFrame.savedGPR[4](SP)
lwz r18,StackFrame.savedGPR[5](SP)
lwz r19,StackFrame.savedGPR[6](SP)
lwz r20,StackFrame.savedGPR[7](SP)
lwz r21,StackFrame.savedGPR[8](SP)
lwz r22,StackFrame.savedGPR[9](SP)
lwz r23,StackFrame.savedGPR[10](SP)
lwz r24,StackFrame.savedGPR[11](SP)
//
// Restore FPRs
//
lfd fp14,StackFrame.savedFPR[0](SP)
lfd fp15,StackFrame.savedFPR[1](SP)
lfd fp16,StackFrame.savedFPR[2](SP)
lfd fp17,StackFrame.savedFPR[3](SP)
lfd fp18,StackFrame.savedFPR[4](SP)
lfd fp19,StackFrame.savedFPR[5](SP)
lfd fp20,StackFrame.savedFPR[6](SP)
lfd fp21,StackFrame.savedFPR[7](SP)
lfd fp22,StackFrame.savedFPR[8](SP)
lfd fp23,StackFrame.savedFPR[9](SP)
lfd fp24,StackFrame.savedFPR[10](SP)
lfd fp25,StackFrame.savedFPR[11](SP)
lfd fp26,StackFrame.savedFPR[12](SP)
lfd fp27,StackFrame.savedFPR[13](SP)
lfd fp28,StackFrame.savedFPR[14](SP)
lfd fp29,StackFrame.savedFPR[15](SP)
lfd fp30,StackFrame.savedFPR[16](SP)
lfd fp31,StackFrame.savedFPR[17](SP)
//
// Unwind stack frame
//
addi SP,SP,StackFrameSize
//
// Restore TOC and return
//
lwz r0,Linkage.savedLR(SP)
lwz r2,Linkage.savedTOC(SP)
mtlr r0
blr
GoToCTR:
bctr
}
Evaluate.h
// Prototypes deleted - see code on ftp site
/*-----------------------------------
These are types and function prototypes used internally
-----------------------------------*/
extern ulong gVariableFlags;
extern ulong gNumConstants; // Number of constants found/stored
extern float *gConstants; // Array of constant values from equation
extern ulong *gCodeBase; // Address of start of generated code
extern ulong *gNextInstruction;
// Where next generated instruction will be stored
union Constant {
float f; // a float constant
long which; // which one of a group (kVariable, kAddOp, etc.)
};
typedef union Constant Constant;
struct TokenNode {
int token;
Constant value;
struct TokenNode *left;
struct TokenNode *right; // unused for unary operators, functions
};
typedef struct TokenNode TokenNode;
//
// Return the next token in the equation. Adjust the pointer to
// point just after the token.
//
enum {
tokFloat, tokLeftParen, tokRightParen,
tokAddOp, tokMulOp, tokPowerOp,
tokNegate, tokAbs, tokRootOp,
tokFactorial,
tokFunction, // ln, sin, tan, etc.
tokVariable, // x, y, r, t, etc.
tokAssign, // =
tokInvalid
};
// Indices for the various functions
enum {
kFuncPower, kFuncAbs, kFuncSquareRoot,
kFuncFactorial, kFuncExp, kFuncLogN,
kFuncLogTen, kFuncSin, kFuncCos,
kFuncTan, kFuncSec, kFuncCsc,
kFuncCot, kFuncArcSin,
kFuncArcCos, kFuncArcTan,
kFuncArcSec, kFuncArcCsc,
kFuncArcCot, kFuncSinh, kFuncCosh,
kFuncTanh, kFuncSech, kFuncCsch,
kFuncCoth, kFuncAtan2
};
// Values of "which" for misc. tokens
enum {
kFloat, kInteger, kLeftParen,
kRightParen, kAdd, kSubtract,
kMultiply, kDivide, kExponent,
// "variables"
kPi, kE, kR, kTheta, kX, kY, kZ, kN,
kEquals, kInvalidToken
};
// Constants for bits in gVariableFlags
enum {
kRMask = 0x0001, // r was used in the equation
kThetaMask = 0x0002, // t was used in the equation
kXMask = 0x0004, // x was used in the equation
kYMask = 0x0008, // y was used in the equation
kNMask = 0x0010, // n was used in the equation
kEMask = 0x0020, // e was used in the equation
kPiMask = 0x0040, // pi was used in the equation
kFunctionOfY= 0x0080 // function was of the form "z="
};
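CodeGen passes gVariableFlags ^ (kXMask+kYMask+kNMask) to generate loops for the variables that the equation does not use. The sketch below shows why the XOR works: it inverts only the x, y, and n bits and leaves every other flag, including kFunctionOfY, unchanged. The mask values are copied from the enum above; UnusedLoopFlags and CheckLoopCoverage are hypothetical helpers.

```c
#include <assert.h>

enum { kXMask = 0x0004, kYMask = 0x0008, kNMask = 0x0010, kFunctionOfY = 0x0080 };
enum { kLoopMask = kXMask + kYMask + kNMask };

/* Flags with the x, y, and n bits inverted: set bits now mark unused variables. */
static unsigned long UnusedLoopFlags(unsigned long variableFlags)
{
    return variableFlags ^ kLoopMask;
}

/* Checks that each loop variable is covered by exactly one of the two passes. */
static void CheckLoopCoverage(unsigned long variableFlags)
{
    unsigned long used = variableFlags & kLoopMask;
    unsigned long unused = UnusedLoopFlags(variableFlags) & kLoopMask;

    assert((used & unused) == 0);
    assert((used | unused) == kLoopMask);
}
```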
// The various integer and floating point registers used
enum {
regR0 = 0, // used for function calls
regTOC = 2, // TOC is pointer to globals
regFunctionGlue = 12,
regVariableBase = 13, // points to (temporary) values stored in memory
regFunctionBase = 14, // points to array of function pointers
regSavedLR = 15, // we save caller's LR here
regXValues = 16, regYValues = 17,
regNValues = 18, regNIntValues = 19,
regResults = 20, regNInteger = 21,
regXCount = 22, regYCount = 23,
regNCount = 24,
fpResult = 1, // results always in fp1
fpArg1 = 1, // first fp argument in fp1
fpArg2 = 2, // second fp argument in fp2
fpTemp1 = 3,
fpTemp2 = 4,
fpTempResult = 5,
regIntTemp1 = 4, // a volatile integer register
regFreeBase = 14, // first non-volatile register
regMax = 32, // number of fp registers
regN = 25, regX = 26, regY = 27,
regR = 28, regTheta = 29, regE = 30,
regPi = 31,
regMemory = 100 // fake register number for memory locations
};