Jan 96 Challenge
Volume Number: 12
Issue Number: 1
Column Tag: Programmer's Challenge
Programmer's Challenge
By Bob Boonstra, Westford, Massachusetts
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
Sliding Tiles
You have all probably seen small versions of the puzzle that is the basis for this month's Challenge: a 4-by-4 grid of interlocking tiles, with one empty tile among the 16 cells allowing the puzzle to be scrambled by sliding adjacent cells into the empty location. This month the Challenge is to write code that will unscramble a larger version of the Sliding Tiles puzzle.
The prototype for the code you should write is:
typedef Boolean /*legalMove*/ (*MoveProc)(
                        /* Callback procedure to move tile at */
  long tileToMoveRow,   /* these coordinates into the location */
  long tileToMoveCol    /* of adjacent empty tile */
);

void SolveTiles(
  long *tiles,          /* pointer to array of tiles where */
  long numRows,         /*   tile (row,col) is at */
  long numCols,         /*   *(tiles + row*numCols + col) */
  MoveProc MakeMove     /* Callback procedure to move a tile */
);
You will be given a pointer tiles into an array of tile values, the number of rows and columns in the puzzle (numRows and numCols, respectively), and the address of a callback procedure MakeMove used to tell my test code about the moves you make to solve the puzzle. The tiles array will be initialized with the values 0..numRows*numCols-1, in an order scrambled by the calling routine. The value 0 represents the empty tile.
Your code should make a sequence of calls to MakeMove and return when the puzzle is solved. Each MakeMove call exchanges the empty tile with the indicated adjacent tile. The puzzle is solved when you have moved each tile into its proper location: moving the tile with value i into location tiles[i] (i.e., row=i/numCols and col=i%numCols).
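The solved condition described above is easy to check directly. The following is a minimal sketch, not part of the Challenge interface: IsSolved is a hypothetical helper name, and it returns a plain int rather than the Toolbox Boolean so that it compiles without the Mac headers.

```c
/* Hypothetical helper (not part of the Challenge interface).
   Returns 1 when every tile value i sits at index i, which puts it
   at row = i/numCols and col = i%numCols; returns 0 otherwise. */
int IsSolved(const long *tiles, long numRows, long numCols)
{
    long count = numRows * numCols;
    long i;

    for (i = 0; i < count; i++) {
        if (tiles[i] != i) {
            return 0;   /* tile i is not in its home location */
        }
    }
    return 1;
}
```

A test harness could call a check like this after SolveTiles returns to confirm that the puzzle really was solved.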
The callback routine will be something like the code provided below:
static long gNumRows,gNumCols;    /* initialized by test code */
static long gEmptyRow,gEmptyCol;  /* initialized by test code */
static long *gTiles;              /* initialized by test code */

#define TileValue(tiles,row,col) *(tiles+(row)*gNumCols+(col))
#define OutOfRange(val,num) (((val)<0) || ((val)>=(num)))

static Boolean MakeMove(long tileToMoveRow,long tileToMoveCol)
{
  long diff;

  /* Reject coordinates outside the puzzle. */
  if (OutOfRange(tileToMoveRow,gNumRows)) return false;
  if (OutOfRange(tileToMoveCol,gNumCols)) return false;

  /* The tile must share a row or a column with the empty cell ... */
  if (tileToMoveRow == gEmptyRow) {
    diff = tileToMoveCol - gEmptyCol;
  } else if (tileToMoveCol == gEmptyCol) {
    diff = tileToMoveRow - gEmptyRow;
  } else {
    return false;
  }
  /* ... and be exactly one cell away from it. */
  if ((diff != -1) && (diff != 1)) return false;

  /* Slide the tile into the empty cell; its old cell becomes empty. */
  TileValue(gTiles,gEmptyRow,gEmptyCol) =
    TileValue(gTiles,tileToMoveRow,tileToMoveCol);
  gEmptyRow = tileToMoveRow;
  gEmptyCol = tileToMoveCol;
  TileValue(gTiles,gEmptyRow,gEmptyCol) = 0;
  return true;
}
As an example, given the initial conditions:
long tiles[] = {1,4,0,3,5,2};
SolveTiles(tiles,2,3,MakeMove);
you could generate the following moves:
MakeMove(1,2);
MakeMove(1,1);
MakeMove(0,1);
MakeMove(0,0);
to transform the puzzle like this:
1 4 0  ==>  1 4 2  ==>  1 4 2  ==>  1 0 2  ==>  0 1 2
3 5 2       3 5 0       3 0 5       3 4 5       3 4 5
It turns out that half of the possible permutations of the values 0..numRows*numCols-1 are illegal in that they cannot be reached from the solved state. The calling routine will provide a legal starting state, so you don't have to worry about the puzzle being unsolvable.
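The reason only half of the permutations are legal is a parity argument. Every move swaps the empty tile with a neighbor, which flips the parity of the permutation and changes the empty tile's distance from its home cell by exactly one. A position can be reached from the solved state only when those two parities agree, and for puzzles with at least two rows and two columns that condition is also sufficient. The sketch below tests it for the solved state used in this Challenge (empty tile at index 0). IsSolvable is a hypothetical helper, not something the test code is known to use, and its quadratic inversion count is meant for illustration rather than speed.

```c
/* Hypothetical helper (not part of the Challenge interface).
   Returns 1 if the position can be unscrambled to the solved state
   (value i at index i, empty tile 0 at index 0), else 0.
   Assumes numRows >= 2 and numCols >= 2. */
int IsSolvable(const long *tiles, long numRows, long numCols)
{
    long count = numRows * numCols;
    long inversions = 0;   /* pairs that are out of order */
    long emptyIndex = 0;   /* where the value 0 currently sits */
    long distance, i, j;

    for (i = 0; i < count; i++) {
        if (tiles[i] == 0) {
            emptyIndex = i;
        }
        for (j = i + 1; j < count; j++) {
            if (tiles[i] > tiles[j]) {
                inversions++;
            }
        }
    }

    /* Row plus column distance of the empty tile from cell (0,0). */
    distance = emptyIndex / numCols + emptyIndex % numCols;

    /* Legal exactly when both counts are even or both are odd. */
    return ((inversions ^ distance) & 1) == 0;
}
```

For the example position {1,4,0,3,5,2} there are six inversions and the empty tile is two cells from home, so both counts are even and the position is legal, which agrees with the four-move solution shown above.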
The number of moves you make to solve the puzzle is not an explicit criterion in determining the winner, but the winner will be determined by total execution time, including the time used by the callback routine, as we did in the Master MindReader challenge a few months back. Note that you are not permitted to optimize the callback routine - its purpose is to provide a fixed time penalty for each move your solution routine makes.
This will be a native PowerPC Challenge, scored using the Symantec 8.0.3 compiler. Good luck. Email me with any questions, or, better yet, join the Programmer's Challenge Mailing List.
Mailing List Reminder
Many Challenge readers have already joined the Programmer's Challenge Mailing List announced last month. The list is being used to distribute the latest Challenge, provide answers to questions about the current Challenge, and discuss suggestions for future Challenges. The Challenge problem is posted to the list sometime between the 20th and the 25th of the month.
To subscribe to the list, send a message to autoshare@mactech.com with the SUBJECT line sub challenge YourName, substituting your real name for YourName. To unsubscribe from the list, send a message to autoshare@mactech.com with the SUBJECT line unsub challenge.
Two Months Ago Winner
Congratulations to Eric Lengyel (Blacksburg, VA) for submitting the fastest entry to the EnclosingBounds Challenge. The problem was to find the smallest rectangle enclosing all of the non-white pixels in a PixMap. Eight of the 13 entries submitted worked correctly, but Eric's solution was significantly faster than the others. This is Eric's second victory in three months, following his first-place finish in the September Reversible Scrambling Algorithm Challenge.
The winning solution uses a clever technique to minimize the number of comparisons required to find the enclosing rectangle. Rather than test each pixel to determine if it is non-white, Eric logically ORs the values for all pixels in a row (for the indexed color cases), taking advantage of the fact that white is always represented by a zero value. A single comparison then determines whether that row contains only white pixels. Working separately from the top and bottom of the selection rectangle identifies the top and bottom rows of the enclosing rectangle. A similar technique applied to columns finds the left and right boundaries of the rectangle. For the direct (32-bit) color case, the approach is similar, except that pixel values in a row or column are logically ANDed, taking advantage of the fact that white is represented by the value 0x00FFFFFF.
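The row test at the heart of this technique can be sketched in a few lines. The version below is a simplified illustration, not Eric's code: RowIsAllWhite is a hypothetical name, it uses the fixed-width uint32_t type instead of long, it copies each word with memcpy so that the row does not need to be word aligned, and it handles leftover bytes one at a time where the winning entry uses mask tables.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper illustrating the OR-accumulate idea.
   Returns 1 if every byte in row[0..numBytes-1] is zero (white in an
   indexed pixel map, as described above), else 0. */
int RowIsAllWhite(const unsigned char *row, size_t numBytes)
{
    uint32_t accumulator = 0;
    size_t i = 0;

    /* OR whole 32-bit words together; no comparison inside the loop. */
    for (; i + 4 <= numBytes; i += 4) {
        uint32_t word;
        memcpy(&word, row + i, sizeof word);
        accumulator |= word;
    }

    /* Fold in any bytes left over after the last whole word. */
    for (; i < numBytes; i++) {
        accumulator |= row[i];
    }

    /* One comparison decides the whole row. */
    return accumulator == 0;
}
```

The design point is that the per-pixel work is a load and an OR, with the single comparison deferred until the end of the row.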
Here are the times and code sizes for each of the correct entries. Numbers in parentheses after a person's name indicate that person's cumulative point total for all previous Challenges, not including this one.
Name                    time    time    time    test    code    data
                       1-bit   8-bit  32-bit    time    size    size
Eric Lengyel (20)         13      66     272     340    1608     320
Ernst Munter (100)        22      96     326     427    2980      32
Miguel Cruz Picao         34     110     476     593    3328      44
John Sweeney              75     145     502     659    4416     624
Bill Karsh (78)           54     135     517     662    1600       8
Tom Saxton               146     170     560     758    1044     132
Chris Rudolph            514     289     973    1354    1420       8
P.L.                    6197    4672    5384   11181     656      24
The times listed above were all achieved using the Metrowerks CodeWarrior 7 compiler. Running the winning entry with code generated by the Symantec and MrC compilers (with all speed optimizations enabled in each case) gave some interesting results, with the MrC code executing in 2/3 to 3/4 of the time required by the others:
Compiler (version)       time    time    time
                        1-bit   8-bit  32-bit
MrC / MPW (1.0f2)          10      52     183
Metrowerks C (1.3.2)       13      66     272
Symantec (8.0.3)           17      75     292
An investigation of the generated code provides some insight into these numbers. CodeWarrior generates the following code for one of the inner loops in the winning solution:
for (i = 0; i < numWholeWords; i++)
00000064: 7D274B78 mr r7,r9
00000068: 38A00000 li r5,0
0000006C: 48000014 b *+20 ; $00000080
{
accumulator |= *(long *) k;
k += 4;
}
00000070: 80070000 lwz r0,0(r7)
00000074: 38A50001 addi r5,r5,1
00000078: 7C630378 or r3,r3,r0
0000007C: 38E70004 addi r7,r7,4
00000080: 7C052000 cmpw r5,r4
00000084: 4180FFEC blt *-20 ; $00000070
By comparison, MrC generates the following longer, but faster code:
for (i = 0; i < numWholeWords; i++)
00F8 006C 48000018 b $+0x0018 ; 0x00000084
00FC 0070 X 4E800020 blr
0100 0074 31290001 addic r9,r9,1
0104 0078 7D4A3814 addc r10,r10,r7
0108 007C X 7C093000 cmpw r9,r6
010C 0080 X 4080FFF0 bge $-0x0010 ; 0x00000070
0110 0084 X 40990028 ble cr6,$+0x0028 ; 0x000000AC
0114 0088 X 7D0903A6 mtctr r8 ; CTR = 9
0118 008C X 2C080001 cmpwi r8,1
011C 0090 X 4181000C bgt $+0x000C ; 0x0000009C
0120 0094 X 38600001 li r3,1
0124 0098 X 7C6903A6 mtctr r3 ; CTR = 9
0128 009C X 318AFFFC subic r12,r10,4
{
accumulator |= *(long *) k;
k += 4;
}
012C 00A0 846C0004 lwzu r3,0x0004(r12)
0130 00A4 7C6B5B78 or r11,r3,r11
0134 00A8 X 4200FFF8 bdnz $-0x0008 ; 0x000000A0
Notice that the inner loop is 6 instructions in the CodeWarrior version but only 3 instructions in the MrC code. The key to the difference is the use of the mtctr, lwzu, and bdnz instructions. The mtctr instruction loads the special-purpose CTR register, which the bdnz instruction decrements and tests, branching when CTR is nonzero (similar to what the DBRA instruction does on 68K machines). The bdnz instruction replaces 3 instructions generated by CodeWarrior. The lwzu instruction loads a value from memory, but also stores the effective address back into the register used for the indirect memory access, replacing 2 CodeWarrior instructions. Reading disassembled compiler-optimized PowerPC code takes a little practice, but it can provide some insight into what the compiler is doing to you (or for you). Those interested in learning more are referred to the many PowerPC articles in past issues of MacTech, including a two-part series by Bill Karsh in August and September of 1994.
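For readers who want to experiment, the loop can also be written in C so that its shape matches what MrC produced: a guard for the empty case, followed by a loop that counts down to zero. This is only a sketch under assumptions. OrWords is a hypothetical name, uint32_t stands in for the 32-bit long of the original, and nothing here guarantees that a particular compiler will emit mtctr, lwzu, or bdnz for it.

```c
#include <stdint.h>

/* Hypothetical helper: OR together numWords 32-bit words.
   The guard plus count-down do/while mirrors the structure of the
   MrC output above (test once, then decrement and branch if nonzero). */
uint32_t OrWords(const uint32_t *words, long numWords)
{
    uint32_t accumulator = 0;

    if (numWords > 0) {
        do {
            accumulator |= *words++;   /* load a word and advance */
        } while (--numWords != 0);     /* count down to zero */
    }
    return accumulator;
}
```

Comparing the disassembly of this version with the original for loop is a quick way to see whether your own compiler needs the hint.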
Top Contestants of All Time
Here are the Top Contestants for the Programmer's Challenges to date, including everyone who has accumulated more than 20 points. The numbers below include points awarded for this month's entrants.
Rank  Name              Points    Rank  Name               Points
 1.   [Name deleted]      176     11.   Mallett, Jeff         44
 2.   Munter, Ernst       110     12.   Kasparian, Raffi      42
 3.   Gregg, Xan           81     13.   Vineyard, Jeremy      42
 4.   Karsh, Bill          80     14.   Lengyel, Eric         40
 5.   Larsson, Gustav      67     15.   Darrah, Dave          31
 6.   Stenger, Allen       65     16.   Landry, Larry         29
 7.   Riha, Stepan         51     17.   Elwertowski, Tom      24
 8.   Goebel, James        49     18.   Lee, Johnny           22
 9.   Nepsund, Ronald      47     19.   Noll, Robert          22
10.   Cutts, Kevin         46
There are three ways to earn points: (1) scoring in the top 5 of any Challenge, (2) being the first person to find a bug in a published winning solution, or (3) being the first person to suggest a Challenge that I use. The points you can win are:
1st place   20 points     5th place              2 points
2nd place   10 points     finding bug            2 points
3rd place    7 points     suggesting Challenge   2 points
4th place    4 points
Here is Eric's winning solution:
EnclosingBounds
Copyright © 1995 Eric Lengyel
/*
This algorithm is based on the following idea. Assuming that we are going to have to
check many rows or columns which don't contain any non-white pixels, it is faster to
combine all of the pixels in a row or column and look at the end result than it is to
check each pixel individually. This is done by ORing entire rows or columns
together for 1-bit and 8-bit deep pixel maps and ANDing entire rows or columns
together for 32-bit deep pixel maps. The two different methods are necessary
because for 1-bit and 8-bit pixel maps, white is represented by zeros and for 32-bit
pixel maps, white is represented by ones.
The mask tables below are used with 1-bit and 8-bit deep pixel maps. They are
needed when the left or right side of the selection rectangle is not word aligned.
*/
long LeftMask1[32] =
{0xFFFFFFFF, 0x7FFFFFFF, 0x3FFFFFFF, 0x1FFFFFFF,
0x0FFFFFFF, 0x07FFFFFF, 0x03FFFFFF, 0x01FFFFFF,
0x00FFFFFF, 0x007FFFFF, 0x003FFFFF, 0x001FFFFF,
0x000FFFFF, 0x0007FFFF, 0x0003FFFF, 0x0001FFFF,
0x0000FFFF, 0x00007FFF, 0x00003FFF, 0x00001FFF,
0x00000FFF, 0x000007FF, 0x000003FF, 0x000001FF,
0x000000FF, 0x0000007F, 0x0000003F, 0x0000001F,
0x0000000F, 0x00000007, 0x00000003, 0x00000001};
long RightMask1[32] =
{0x80000000, 0xC0000000, 0xE0000000, 0xF0000000,
0xF8000000, 0xFC000000, 0xFE000000, 0xFF000000,
0xFF800000, 0xFFC00000, 0xFFE00000, 0xFFF00000,
0xFFF80000, 0xFFFC0000, 0xFFFE0000, 0xFFFF0000,
0xFFFF8000, 0xFFFFC000, 0xFFFFE000, 0xFFFFF000,
0xFFFFF800, 0xFFFFFC00, 0xFFFFFE00, 0xFFFFFF00,
0xFFFFFF80, 0xFFFFFFC0, 0xFFFFFFE0, 0xFFFFFFF0,
0xFFFFFFF8, 0xFFFFFFFC, 0xFFFFFFFE, 0xFFFFFFFF};
long LeftMask8[4] =
{0xFFFFFFFF, 0x00FFFFFF, 0x0000FFFF, 0x000000FF};
long RightMask8[4] =
{0xFF000000, 0xFFFF0000, 0xFFFFFF00, 0xFFFFFFFF};
long DirectWhite = 0x00FFFFFF; // Value of white pixel
// in 32-bit map.
EnclosingBounds
void EnclosingBounds(PixMapHandle pm,
Rect selection, Rect *enclosingRect)
{
PixMapPtr map;
long pixelSize, rowBytes, accumulator,
leftMask, rightMask, baseAddr,
leftSide, rightSide, topSide, bottomSide,
numWholeWords, needLeftMask, needRightMask,
i, j, k, l, m;
map = *pm;
/* Compute position of selection rectangle relative to upper-left corner of pixel map. */
leftSide = selection.left - map->bounds.left;
rightSide = selection.right - map->bounds.left;
topSide = selection.top - map->bounds.top;
bottomSide = selection.bottom - map->bounds.top;
/* Check validity of selection rectangle. */
if ((rightSide <= leftSide) || (bottomSide <= topSide))
{
enclosingRect->left = enclosingRect->right =
enclosingRect->top = enclosingRect->bottom = 0;
return;
}
/* Determine characteristics of pixel map. */
rowBytes = map->rowBytes;
if (rowBytes >= 0) pixelSize = 1; // BitMap
else pixelSize = map->pixelSize; // PixelMap
rowBytes &= 0x3FFF; // Strip flags
baseAddr = (long) map->baseAddr;
/* Handle 1-bit and 8-bit deep pixel maps with same chunk
of code. 32-bit deep pixel map handled separately. */
if (pixelSize != 32)
{
/* Move baseAddr over to the first column of the selection rectangle, still keeping it
word aligned. Then determine what masks are needed for leftmost and rightmost
words in the selection and how many whole words there are in between. */
if (pixelSize == 1)
{
baseAddr += (leftSide >> 5) << 2;
leftMask = LeftMask1[leftSide & 0x1F];
rightMask = RightMask1[(rightSide - 1) & 0x1F];
numWholeWords = (rightSide >> 5) -
((leftSide + 31) >> 5);
}
else
{
baseAddr += leftSide & 0xFFFC;
leftMask = LeftMask8[leftSide & 3];
rightMask = RightMask8[(rightSide - 1) & 3];
numWholeWords = (rightSide >> 2) -
((leftSide + 3) >> 2);
}
/* Set flags indicating what masks are in use. If the left and right boundaries of the
selection fall within the same word, then take the intersection of the left and right
masks and only consider one column of words. */
needLeftMask = (leftMask + 1 != 0);
needRightMask = (rightMask + 1 != 0);
if (numWholeWords < 0)
{
leftMask &= rightMask;
needRightMask = 0;
}
/* Find first row with a non-white pixel by ORing the
whole row together and checking for a non-zero result. */
j = topSide;
accumulator = 0;
m = baseAddr + j * rowBytes; // Top-left corner
do
{
k = m;
if (needLeftMask)
{
accumulator |= (*(long *) k) & leftMask;
k += 4;
}
for (i = 0; i < numWholeWords; i++)
{
accumulator |= *(long *) k;
k += 4;
}
if (needRightMask)
{
accumulator |= (*(long *) k) & rightMask;
}
if (accumulator != 0) break;
m += rowBytes;
} while (++j < bottomSide);
if (j == bottomSide) // Whole selection is white
{
enclosingRect->left = enclosingRect->right =
enclosingRect->top = enclosingRect->bottom = 0;
return;
}
topSide = j;
/* Find last row with a non-white pixel. */
j = bottomSide - 1;
accumulator = 0;
m = baseAddr + j * rowBytes; // Bottom-left corner
do
{
k = m;
if (needLeftMask)
{
accumulator |= (*(long *) k) & leftMask;
k += 4;
}
for (i = 0; i < numWholeWords; i++)
{
accumulator |= *(long *) k;
k += 4;
}
if (needRightMask)
{
accumulator |= (*(long *) k) & rightMask;
}
if (accumulator != 0) break;
m -= rowBytes;
} while (--j >= topSide);
bottomSide = j + 1;
/* Find leftmost column containing a non-white pixel. */
accumulator = 0;
m = baseAddr + topSide * rowBytes;
l = 0;
if (needLeftMask)
{
k = m;
j = topSide;
do
{
accumulator |= (*(long *) k) & leftMask;
k += rowBytes;
} while (++j < bottomSide);
if (accumulator != 0) goto leftFound;
l += 4;
}
for (i = 0; i < numWholeWords; i++)
{
k = m + l;
j = topSide;
do
{
accumulator |= *(long *) k;
k += rowBytes;
} while (++j < bottomSide);
if (accumulator != 0) goto leftFound;
l += 4;
}
if (needRightMask)
{
k = m + l;
j = topSide;
do
{
accumulator |= (*(long *) k) & rightMask;
k += rowBytes;
} while (++j < bottomSide);
}
/* When we get here, we have narrowed down the leftmost non-white pixel to a single
word. The value in the accumulator will tell us the exact column of the pixel. We
then move baseAddr over to the last column of the selection rectangle (word
aligned). */
leftFound:
if (pixelSize == 1)
{
leftSide = (leftSide & 0xFFFFFFE0) + (l << 3);
while (accumulator >= 0)
{
leftSide++;
accumulator <<= 1;
}
baseAddr = (long) map->baseAddr +
(((rightSide - 1) >> 5) << 2);
}
else
{
leftSide = (leftSide & 0xFFFFFFFC) + l;
while ((accumulator & 0xFF000000) == 0)
{
leftSide++;
accumulator <<= 8;
}
baseAddr = (long) map->baseAddr +
((rightSide - 1) & 0xFFFC);
}
/* Find rightmost column containing a non-white pixel. */
accumulator = 0;
m = baseAddr + topSide * rowBytes;
l = 0;
if (needRightMask)
{
k = m;
j = topSide;
do
{
accumulator |= (*(long *) k) & rightMask;
k += rowBytes;
} while (++j < bottomSide);
if (accumulator != 0) goto rightFound;
l += 4;
}
for (i = 0; i < numWholeWords; i++)
{
k = m - l;
j = topSide;
do
{
accumulator |= *(long *) k;
k += rowBytes;
} while (++j < bottomSide);
if (accumulator != 0) goto rightFound;
l += 4;
}
if (needLeftMask)
{
k = m - l;
j = topSide;
do
{
accumulator |= (*(long *) k) & leftMask;
k += rowBytes;
} while (++j < bottomSide);
}
rightFound:
if (pixelSize == 1)
{
rightSide = ((rightSide + 31) & 0xFFFFFFE0) - (l << 3);
while ((accumulator & 1) == 0)
{
rightSide--;
accumulator >>= 1;
}
}
else
{
rightSide = ((rightSide + 3) & 0xFFFFFFFC) - l;
while ((accumulator & 0x000000FF) == 0)
{
rightSide--;
accumulator >>= 8;
}
}
}
/* Now for the code which handles 32-bit deep pixel maps. For direct pixels white is
ones, unlike indexed pixels where white is zeros. We will use the same technique,
but we will have to AND the rows and columns together. We don't have to worry
about left and right masks - in 32-bit deep pixel maps every pixel is word aligned. */
else
{
baseAddr += leftSide << 2;
numWholeWords = rightSide - leftSide;
/* Find first row. */
j = topSide;
accumulator = DirectWhite;
m = baseAddr + j * rowBytes;
do
{
k = m;
i = 0;
do
{
accumulator &= *(long *) k;
k += 4;
} while (++i < numWholeWords);
if (accumulator != DirectWhite) break;
m += rowBytes;
} while (++j < bottomSide);
if (j == bottomSide) // All white pixels
{
enclosingRect->left = enclosingRect->right =
enclosingRect->top = enclosingRect->bottom = 0;
return;
}
topSide = j;
/* Find last row. */
j = bottomSide - 1;
accumulator = DirectWhite;
m = baseAddr + j * rowBytes;
do
{
k = m;
i = 0;
do
{
accumulator &= *(long *) k;
k += 4;
} while (++i < numWholeWords);
if (accumulator != DirectWhite) break;
m -= rowBytes;
} while (--j >= topSide);
bottomSide = j + 1;
/* Find leftmost column. */
accumulator = DirectWhite;
m = baseAddr + topSide * rowBytes;
l = 0;
i = 0;
do
{
k = m + l;
j = topSide;
do
{
accumulator &= *(long *) k;
k += rowBytes;
} while (++j < bottomSide);
if (accumulator != DirectWhite) break;
l += 4;
} while (++i < numWholeWords);
leftSide += l >> 2;
/* Find rightmost column. */
baseAddr = (long) map->baseAddr +
(rightSide << 2) - 4;
accumulator = DirectWhite;
m = baseAddr + topSide * rowBytes;
l = 0;
i = 0;
do
{
k = m - l;
j = topSide;
do
{
accumulator &= *(long *) k;
k += rowBytes;
} while (++j < bottomSide);
if (accumulator != DirectWhite) break;
l += 4;
} while (++i < numWholeWords);
rightSide -= l >> 2;
}
/* Return enclosing rectangle in the pixel map's local coordinates. */
enclosingRect->left = leftSide + map->bounds.left;
enclosingRect->right = rightSide + map->bounds.left;
enclosingRect->top = topSide + map->bounds.top;
enclosingRect->bottom = bottomSide + map->bounds.top;
}
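The mask tables at the top of the listing follow a regular pattern, so each entry can be reproduced with a single shift. The sketch below shows that pattern; it is an observation about the table values, not a suggestion that the winning entry should compute them at run time, since a table lookup avoids the work inside a timed routine. The four function names are hypothetical, and uint32_t is used in place of long so that the values stay 32 bits wide on any platform.

```c
#include <stdint.h>

/* Hypothetical helpers that reproduce the table entries by shifting. */

/* 1-bit maps: keep the pixels from bit position i (0..31) rightward. */
uint32_t LeftMask1At(unsigned i)
{
    return (uint32_t)0xFFFFFFFFu >> i;
}

/* 1-bit maps: keep the pixels from the left edge through position i. */
uint32_t RightMask1At(unsigned i)
{
    return (uint32_t)0xFFFFFFFFu << (31u - i);
}

/* 8-bit maps: keep the bytes from byte position i (0..3) rightward. */
uint32_t LeftMask8At(unsigned i)
{
    return (uint32_t)0xFFFFFFFFu >> (8u * i);
}

/* 8-bit maps: keep the bytes from the left edge through position i. */
uint32_t RightMask8At(unsigned i)
{
    return (uint32_t)0xFFFFFFFFu << (8u * (3u - i));
}
```

For example, LeftMask1At(1) yields 0x7FFFFFFF and RightMask8At(0) yields 0xFF000000, matching LeftMask1[1] and RightMask8[0] in the listing.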