Dec 97 Challenge
Volume Number: 13 (1997)
Issue Number: 12
Column Tag: Programmer's Challenge
by Bob Boonstra, Westford, MA
Clueless Crosswords
A couple of months ago, Bob Noll sent me a Challenge suggestion involving a crossword puzzle variant published by one of his local newspapers. The difficulty in using that suggestion was deciding how to provide the clues to the puzzle in a usable form. I thought about providing some sort of thesaurus, but giving simple synonyms as clues didn't seem to capture the crossword spirit. Then it occurred to me that clues serve only to make the crossword easy to solve, an advantage that certainly wouldn't be needed by our skilled Challenge readers or by the code they write. So the Challenge this month is to write code that will solve a crossword puzzle without clues.
The prototype for the code you should write is:
#define kMaxSize 32
typedef char Puzzle[kMaxSize][kMaxSize];
void Crossword(
Puzzle thePuzzle, /* return solved puzzle here */
char *dictionary[], /* array of words to choose from */
long puzzleSize, /* number of rows/cols in puzzle */
long dictSize /* number of words in dictionary */
);
Your Crossword routine will be provided with thePuzzle, where cell thePuzzle[row][col] will be initialized to a zero if you are to fill in that cell, or initialized to 0xFF if the cell is blacked out. The first puzzleSize rows and columns constitute the puzzle; the remaining cells in the Puzzle array are padding and should be ignored. You are to fill in the empty cells of thePuzzle with words from the dictionary provided, such that each uninterrupted sequence of blank cells, both horizontal and vertical, forms a word from the dictionary. The dictionary will contain all of the words needed to solve thePuzzle, but, to make things more interesting, it will also contain extra words not needed to solve thePuzzle. The dictionary may contain only a few extra words, or it may contain as many as 10 extra words for each word used in thePuzzle. Words in the dictionary are not guaranteed to be in any order, and any word might be used more than once in thePuzzle.
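A natural first step is to enumerate the uninterrupted runs of fillable cells, since each run must end up holding one dictionary word. The following is a minimal sketch of that step only, not a solver. The Slot type, the FindSlots name, and the kMinRun threshold are hypothetical; in particular, skipping single-cell runs is an assumption that the problem statement does not spell out.

```c
#define kMaxSize 32
typedef char Puzzle[kMaxSize][kMaxSize];

#define kBlackCell ((char)0xFF)  /* blacked-out marker from the problem statement */
#define kMinRun 2                /* assumption: single-cell runs are not words */

/* Hypothetical helper type: one run of fillable cells. */
typedef struct {
    int row, col;   /* first cell of the run */
    int length;     /* number of cells in the run */
    int across;     /* 1 = horizontal, 0 = vertical */
} Slot;

/* Collects horizontal runs first, then vertical runs.
   Returns the number of slots written to slots[]. */
static int FindSlots(Puzzle p, long n, Slot slots[], int maxSlots)
{
    int count = 0;
    int across, i, j;

    for (across = 1; across >= 0; across--) {
        for (i = 0; i < n; i++) {
            int start = 0;  /* first cell of the current run */
            for (j = 0; j <= n; j++) {
                /* the edge of the grid ends a run just like a black cell */
                int blocked = (j == n) ||
                    (across ? p[i][j] : p[j][i]) == kBlackCell;
                if (!blocked)
                    continue;
                if (j - start >= kMinRun && count < maxSlots) {
                    slots[count].row = across ? i : start;
                    slots[count].col = across ? start : i;
                    slots[count].length = j - start;
                    slots[count].across = across;
                    count++;
                }
                start = j + 1;
            }
        }
    }
    return count;
}
```

Grouping the dictionary by word length would then let each slot consider only the candidates that can fit it.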
Each puzzle is guaranteed to have at least one solution using the dictionary provided. You will be guaranteed 20MB of memory for your code, static data, and any dynamically allocated memory. You may use more memory if it is available, but your code should detect and respond to memory allocation failures if you ask for more than the guaranteed amount of memory. As always, you must deallocate any dynamically allocated memory before returning.
This will be a native PowerPC Challenge, using the latest CodeWarrior environment. Solutions may be coded in C, C++, or Pascal. The Challenge winner will be the entry that provides a correct solution to a set of crossword puzzles in the minimum time. Thanks to Bob Noll for suggesting a crossword puzzle problem -- he wins two Challenge points for the suggestion.
Three Months Ago Winner
Congratulations to Ludovic Nicolle (St-Nicolas, Quebec) for submitting the winning entry to the Image Detector Challenge. The problem was to find all occurrences of a given pattern bitmap in a sequence of background bitmaps, subject to a specified allowable error rate. Five people submitted entries, and all but one of them performed correctly in my tests.
Both the winning entry and the close second place entry by Ernst Munter used 64K of static memory to look up the number of bits set in a given 16-bit value, as part of calculating the number of mismatched bits. Ludo gained some advantage by optimizing the detection routine DetectCenter, used to examine most of the background image, to process 16 potential matches in parallel and thus take advantage of the many registers available in the 604 processor. He also used an interesting technique to trick the compiler into keeping other local variables in registers, by declaring them to be parameters. Finally, Ludo notes in his commentary that the branch prediction capability of the 604 caused him to select a different code construct than the one that worked best on a 601-based machine.
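The lookup-table idea and the 16-way parallelism can be illustrated with a small sketch. It is not taken from either entry: sBitsOn, BuildBitsOn, and CountMismatches16 are hypothetical names, the table is filled at run time rather than stored as static data, and only a single 16-bit pattern word is compared. The sketch counts the mismatched bits between one pattern word and the background at 16 consecutive bit offsets.

```c
static unsigned char sBitsOn[65536];  /* hypothetical stand-in for a 64K bit-count table */

/* Fill the table once with a plain bit-counting loop. */
static void BuildBitsOn(void)
{
    unsigned long v;
    for (v = 0; v < 65536UL; v++) {
        unsigned long x = v;
        unsigned char n = 0;
        while (x) {          /* clear the lowest set bit until none remain */
            x &= x - 1;
            n++;
        }
        sBitsOn[v] = n;
    }
}

/* bgHi and bgLo are two adjacent 16-bit background words (bgHi is leftmost).
   counts[k] receives the number of masked mismatches when the pattern word
   is aligned k bits to the right of the start of bgHi. */
static void CountMismatches16(unsigned short pat, unsigned short mask,
                              unsigned short bgHi, unsigned short bgLo,
                              unsigned char counts[16])
{
    unsigned long window = ((unsigned long)bgHi << 16) | bgLo;
    int shift;

    for (shift = 0; shift < 16; shift++) {
        /* slide a 16-bit view across the 32-bit window */
        unsigned long bg = (window >> (16 - shift)) & 0xFFFFUL;
        counts[shift] = sBitsOn[(pat ^ bg) & mask];
    }
}
```

Each pattern and mask word is fetched once and reused for all 16 alignments, which is the property that reduces memory traffic in the winning entry.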
The table below lists the execution time, code size, data size, and programming language for each entry. The number in parentheses after the entrant's name is the total number of Challenge points earned in all Challenges to date prior to this one.
Name                     Time    Code    Data   Language
Ludovic Nicolle (28)    13.74    8606   65860   C
Ernst Munter (290)       5.96    5276   65896   C++
Greg Cooper (54)        87.04    2420    2045   C
ACC Murphy (30)        252.71    2596     120   C++
A.H.                   431.30    4076     164   C
Looking Back to May - Equation Evaluator
Ron Avitzur, author of the Graphing Calculator that inspired the May Equation Evaluator Challenge, wrote to mention a technique that was not used by any of the entries in that Challenge. Ron pointed me to http://symbolicnet.mcs.kent.edu/areas/cr, which discusses a technique called Chains of Recurrence for optimizing function evaluation. The technique can provide speedups of a factor of 50 when evaluating functions over a range of uniformly spaced points. The web page allows you to enter an equation and then demonstrates the improvement. Thanks for the tip, Ron!
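For polynomials, the core idea of evaluating over uniformly spaced points reduces to forward differencing: once the starting value and its differences are known, each further point costs only additions. The sketch below shows this for a quadratic. It is a simplified illustration, not the general technique described on the page Ron cites, and the function name EvalQuadraticCR is hypothetical.

```c
/* Evaluates f(x) = a*x*x + b*x + c at x0, x0+h, ..., x0+(n-1)*h
   using two additions per point after a one-time setup. */
static void EvalQuadraticCR(double a, double b, double c,
                            double x0, double h, int n, double out[])
{
    double f  = (a * x0 + b) * x0 + c;           /* f(x0) */
    double d1 = a * h * (2.0 * x0 + h) + b * h;  /* f(x0+h) - f(x0) */
    double d2 = 2.0 * a * h * h;                 /* constant second difference */
    int i;

    for (i = 0; i < n; i++) {
        out[i] = f;
        f  += d1;   /* advance the value by the first difference */
        d1 += d2;   /* advance the first difference by the second */
    }
}
```

Because each value is built from the previous one, floating-point rounding error can accumulate over long ranges, so a real implementation would need to weigh the speedup against that drift.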
... and Back to July - Disambiguator
Ernst Munter wrote to point out a low-memory bug in his winning entry for the July Disambiguator Challenge that did not show up in my tests. Ernst wrote to say that "the published program works fine when tested with a generous allowance for private storage, as originally specified, i.e. at least 21 bytes/word + 1 byte per character of each word. However, the program was designed to require only about half as much memory. Unfortunately this has resulted in a bug which will occur with a findstring of "*" (return all words) because the function SendAll() will start scanning from pageGroup[0]. But pageGroup[0] was aliased to nextPage, just a temporary pointer variable, and never properly initialized. If memory was plentiful, the memory pointed to by pageGroup[0] would still be 0, and nothing bad happens, but otherwise it is likely to contain some data and result in an unmapped memory exception." Ernst provided the following replacement function to fix this bug:
ulong SendAll(CCC* matchList[],ulong minLen) {
// Sends all words >= minimum length from all pages
ulong numMatch=0;
/* insert the following line to fix bug */
if (minLen<1) minLen=1;
for (int len=MIN(31,minLen);len<32;len++) {
Page* page=pageGroup[len];
while (page) {
numMatch=page->SendAll(matchList,numMatch);
page=page->next;
}
}
return numMatch;
}
Top 20 Contestants
Here are the Top Contestants for the Programmer's Challenge. The numbers below include points awarded over the 24 most recent contests, including points earned by this month's entrants.
Rank Name Points
1. Munter, Ernst 210
2. Cooper, Greg 61
3. Lewis, Peter 57
4. Gregg, Xan 53
5. Nicolle, Ludovic 48
6. Boring, Randy 41
7. Murphy, ACC 34
8. Mallett, Jeff 30
9. Larsson, Gustav 27
10. Antoniewicz, Andy 24
11. Picao, Miguel Cruz 21
12. Day, Mark 20
13. Higgins, Charles 20
14. Lengyel, Eric 20
15. Studer, Thomas 20
16. Saxton, Tom 17
17. Gundrum, Eric 15
18. Hart, Alan 14
19. O'Connor, Turlough 14
20. Karsh, Bill 12
Here is Ludo's winning solution:
ImgDetector.c
© 1997 Ludovic Nicolle
/*
General issues:
The code uses 3 functions, DetectLeft/Center/Right to find matches on each side and in the middle. The same three functions are used for the top/center/bottom on each column.
Algorithmic considerations:
The fundamental operation for each bit of the mask at any particular location on the image is the following, referred to as the xor/and later in this discussion:
mismatch = (pattern ^ background) & mask;
If mismatch == 1, you have a bit mismatch.
Since it was assumed the code would more naturally be called with (noise < 0.5) than greater, it made sense to count the mismatches. The code actually subtracts each mismatch from the maximum allowed. (The variable names are maxBad for the maximum and remBad for the remaining allowed bad bits at any time in the process.)
The first natural step of optimization was to treat the data in parallel, either 8, 16 or 32 bits at a time. I chose to work on a 16-bit basis. After the entire xor/and operation has been done on 16 bits, the number of 1's in the 16 bits is loaded from a 64K static array (named gBitsOn) containing 2^16 chars (this method proved to be statistically much faster than anything else).
Since we needed to "move" the mask and pattern in parallel over the background, I chose to move the background instead, so I had only one variable to shift each time.
In the DetectLeft/Right functions, the code reflects only those considerations, with a special treatment for the border when it is not flush.
The DetectCenter function uses a more sophisticated algorithm derived from the simple one described above. While written in pure portable C code, it is really optimized for the register-rich PowerPC architecture.
It uses 16 registers to treat 16 "remBad" in parallel. Each time a short is loaded for the pattern and for the mask, it is used to xor/and with 16 different positions of the background. Reducing the number of loads from memory maximizes the throughput of the function. Another register, called remFlag, is used to globally monitor the status of the 16 remBads, using a simple bit array.
To achieve the best performance, all 16 local remBads had to be in registers. Intermediate versions having only 10 to 15 remBads in registers performed worse than the all-in-regs one. To free as many registers as possible, the DetectCenter function was tweaked as much as possible:
- since the pattern and mask had no padding rowBytes (rowBytes was still a multiple of 2 of course, but not necessarily of 4), the internal ptrs didn't need intermediate line ptrs (they are used on the Left/Right)
- the line index of the mask/pattern is a float, using the otherwise unused floating point unit of the processor,
- many local variables were placed as dummy parameters of the function (the ImageDetect calls use nil/0 as the 6 first parameters) to fool the PowerPC Compiler which follows the PowerPC binary architecture for allocating parameters and local variables.
Many parameters were put as stack parameters in order to free enough space for the remBads and other inner-loop variables.
As a last note on this optimization issue, I want to say this version of the code uses an if (remBadxx) statement before each xor/and. On my PPC 601, this was less efficient than blind execution of the xor/and followed by the check to see if the remBad was now under 0. On the PPC 604, my tests showed that this version was faster, probably due to one or both of the following factors:
- branch prediction/execution is faster on the 604,
- the 604/memory speed ratio being greater than with my 601, the load from the gBitsOn array incurs a greater time penalty than the time taken for branching.
*/
/*
InitTarget
InitTarget jobs are:
- to count the number of bits inside the pattern,
- to build an array containing the number of bits to remove when the mask is placed partially outside the backgroundImage,
- to find how many white rows and cols are on the left/right/top/bottom of the mask and squeeze the pattern and mask accordingly
- to make local copies of the pattern and mask that have no extra padding rowBytes,
- to zero any mask bits on the right end of those rows if they are not a multiple of 16.
ImageDetect determines the number of bad bits allowed with the noise threshold, then determines the number of rows that must be tested on the top/bottom border. It also sets a few globals about the locations.
It then repeatedly calls DetectLeft/Center/Right for the top. It then calls the three functions once again for the middle (vertically speaking). Finally, they are called repeatedly again on the bottom.
*/
typedef unsigned short ushort;
typedef unsigned long ulong;
/* the gBitsOn array is in the BitsOn.c file which must be linked into the project. */
extern unsigned char gBitsOn[256*256];
/* Globals */
BitMap gPattern;
BitMap gMask;
ushort gmkWidth;
ushort gmkHeight;
Point gmkSkip;
Point *gLocations;
long gLocationsCt;
long gMaxLocations;
ulong gmkTotalBits;
ulong *gmkBitsCache;
void InitTarget(
BitMap pattern, /* image to be detected */
BitMap mask /* bits in image that we care about */
);
long /* numFound */ ImageDetect(
BitMap backgroundImage, /* find the target image in backgroundImage */
Point locations[], /* return topLeft of matching locations here */
long maxLocations, /* max number of locations to return */
float noise /* allow this fraction of mismatched bits */
);
void CleanUp(void); /* deallocate any memory allocated by InitTarget */
InitTarget
void InitTarget(
BitMap pattern, /* image to be detected */
BitMap mask /* bits in image that we care about */
)
{
ulong mkSize;
short height, width;
short minRowBytes;
short lineIdx, colIdx;
short mkTopSkip, mkBotSkip, mkRightSkip, mkLeftSkip;
Rect srcRect, destRect;
ulong *mkBitsCache;
ushort *mkLinePtr, *mkCurPtr;
ulong bitsCt;
Ptr mkBaseAddr;
short offSet;
/* Calculating mask size */
width = mask.bounds.right - mask.bounds.left;
height = mask.bounds.bottom - mask.bounds.top;
/* Minimizing rowBytes and copying bitmaps structures
and setting mask/pattern topleft to (0,0) */
minRowBytes = 2 * ((width + 15) / 16);
mkSize = height * minRowBytes;
gMask.rowBytes = minRowBytes;
gMask.bounds.top = 0;
gMask.bounds.left = 0;
gMask.bounds.bottom = height;
gMask.bounds.right = width;
gMask.baseAddr = NewPtr(mkSize);
if (width % 16)
{ /* Zeroing the last short of each row */
offSet = minRowBytes - 2;
(Ptr)mkLinePtr = gMask.baseAddr + offSet;
for (lineIdx = 0; lineIdx < height; lineIdx++)
{
(*mkLinePtr) = 0;
(Ptr)mkLinePtr += minRowBytes;
}
}
CopyBits(&mask, &gMask, &mask.bounds,
&gMask.bounds, srcCopy, nil);
/* filling lines cache, removing empty top lines and
counting empty top lines. */
(Ptr)mkBitsCache = NewPtr(sizeof(ulong) *
height * width);
gmkBitsCache = mkBitsCache;
(Ptr)mkLinePtr = gMask.baseAddr;
bitsCt = 0;
mkTopSkip = 0;
for (lineIdx = 0; lineIdx < height; lineIdx++)
{
if (mkBitsCache)
mkBitsCache[(lineIdx - mkTopSkip) * width] =
bitsCt;
mkCurPtr = mkLinePtr;
(Ptr)mkLinePtr += minRowBytes;
for (; mkCurPtr < mkLinePtr; mkCurPtr++)
bitsCt += gBitsOn[*mkCurPtr];
if (bitsCt == 0)
mkTopSkip++;
}
gmkTotalBits = bitsCt;
/* counting empty bottom lines */
mkBotSkip = 0;
lineIdx = height - mkTopSkip - 1;
for (lineIdx = height - mkTopSkip - 1; (lineIdx > 0) &&
(mkBitsCache[lineIdx * width] == bitsCt);
lineIdx--)
mkBotSkip++;
height -= (mkTopSkip + mkBotSkip);
mkBaseAddr = gMask.baseAddr + mkTopSkip * minRowBytes;
/* filling the bits cache*/
if (gmkBitsCache)
{
for (colIdx = 1; colIdx < width; colIdx++)
{
ushort mkMask;
ulong freeBitsCt;
short colModulo;
(Ptr)mkLinePtr = mkBaseAddr +
(height - 1) * minRowBytes;
mkLinePtr += (colIdx - 1) / 16 ;
colModulo = (colIdx - 1) % 16;
mkMask = 0xFFFF << (15 - colModulo);
freeBitsCt = 0;
for (lineIdx = height - 1; lineIdx >= 0;
lineIdx--)
{
freeBitsCt +=
gBitsOn[(*mkLinePtr) & mkMask];
bitsCt = freeBitsCt +
mkBitsCache[lineIdx * width +
colIdx - 1 - colModulo];
mkBitsCache[lineIdx * width + colIdx] =
bitsCt;
(Ptr)mkLinePtr -= minRowBytes;
}
}
}
/* calculating left and right empty cols */
mkLeftSkip = 0;
for (colIdx = 1;
(colIdx < width) && (mkBitsCache[colIdx] == 0);
colIdx++)
mkLeftSkip++;
mkRightSkip = 0;
for (colIdx = width - 1;
(colIdx > 0) && (mkBitsCache[colIdx] == bitsCt);
colIdx--)
mkRightSkip++;
/* restructuring mkBitsCache according to the new width if it's smaller. */
if ((mkRightSkip + mkLeftSkip) > 0)
{
short oldWidth = width;
width -= (mkRightSkip + mkLeftSkip);
minRowBytes = 2 * ((width + 15) / 16);
for (lineIdx = 0; lineIdx < height; lineIdx++)
for (colIdx = 0; colIdx < width; colIdx++)
mkBitsCache[lineIdx * width + colIdx] =
mkBitsCache[lineIdx * oldWidth +
colIdx + mkLeftSkip];
}
/* initializing the pattern BitMap. the allocated space
will be used for mask or pattern data. */
gPattern.rowBytes = minRowBytes;
gPattern.bounds.top = 0;
gPattern.bounds.left = 0;
gPattern.bounds.bottom = height;
gPattern.bounds.right = width;
mkSize = minRowBytes * height;
gPattern.baseAddr = NewPtr(mkSize);
if ((mkTopSkip > 0) || (mkLeftSkip > 0) ||
(minRowBytes < gMask.rowBytes))
{ /* I must recopy the mask bits */
srcRect.top = mkTopSkip;
srcRect.left = mkLeftSkip;
srcRect.bottom = mkTopSkip + height;
srcRect.right = mkLeftSkip + width;
destRect.top = 0;
destRect.left = 0;
destRect.bottom = height;
destRect.right = width;
if (width % 16)
{ /* Zeroing the last short of each row */
offSet = minRowBytes - 2;
(Ptr)mkLinePtr = gPattern.baseAddr + offSet;
for (lineIdx = 0; lineIdx < height; lineIdx++)
{
(*mkLinePtr) = 0;
(Ptr)mkLinePtr += minRowBytes;
}
}
CopyBits(&gMask, &gPattern, &srcRect, &destRect,
srcCopy, nil);
/* switching bits data between gMask and gPattern */
(Ptr)mkLinePtr = gMask.baseAddr;
gMask = gPattern;
gPattern.baseAddr = (Ptr)mkLinePtr;
}
srcRect = pattern.bounds;
srcRect.top += mkTopSkip;
srcRect.bottom -= mkBotSkip;
srcRect.left += mkLeftSkip;
srcRect.right -= mkRightSkip;
CopyBits(&pattern, &gPattern, &srcRect,
&gPattern.bounds, srcCopy, nil);
gmkWidth = width;
gmkHeight = height;
gmkSkip.v = mkTopSkip;
gmkSkip.h = mkLeftSkip;
}
CleanUp
void CleanUp(void) /* deallocate any memory allocated by InitTarget */
{
if (gPattern.baseAddr)
DisposePtr(gPattern.baseAddr);
if (gMask.baseAddr)
DisposePtr(gMask.baseAddr);
if (gmkBitsCache)
DisposePtr((Ptr)gmkBitsCache);
}
Boolean DetectLeft(
BitMap* picture,
long maxBad,
short mkRowBytes,
short picRowBytes,
short picTop,
short picTopLow,
short mkTop,
short mkBottom,
short mkWidth);
Boolean DetectCenter(
ushort *patCurPtr,
ushort *mkCurPtr,
ushort *picLinePtr,
ushort *picCurPtr,
ulong picLong,
long dumbReg,
long mkRowBytes,
long picRowBytes,
long mkTop,
long picTop,
long maxBad,
long picTopLow,
long picWidth,
long mkBottom,
BitMap* picture);
Boolean DetectRight(
BitMap* picture,
long maxBad,
short mkRowBytes,
short picRowBytes,
short picTop,
short picTopLow,
short mkTop,
short mkBottom,
short picWidth,
short mkWidth);
ImageDetect
long /* numFound */ ImageDetect(
BitMap backgroundImage, /* find the target image in backgroundImage */
Point locations[], /* return topLeft of matching locations here */
long maxLocations, /* max number of locations to return */
float noise /* allow this fraction of mismatched bits */
)
{
ulong maxBad;
short picWidth, picHeight;
ulong *mkBitsCache;
short mkRowBytes;
short picRowBytes;
short mkLine, outTop, outBottom;
long totalBits;
gLocations = locations;
gLocationsCt = 0;
gMaxLocations = maxLocations;
if (gmkBitsCache == nil)/*cannot work without cache */
return 0; /*but this could be patched */
if (noise < 0.0 || noise >= 1.0)
return 0; /* not supposed to happen */
if (gmkTotalBits == 0)
return 0; /* there are no bits in the mask */
maxBad = gmkTotalBits * noise;
picWidth = backgroundImage.bounds.right -
backgroundImage.bounds.left;
picHeight = backgroundImage.bounds.bottom -
backgroundImage.bounds.top;
if ((picWidth < gmkWidth) ||
(picHeight < gmkHeight))
return 0; /* not supposed to happen */
mkBitsCache = gmkBitsCache;
mkRowBytes = gMask.rowBytes;
picRowBytes = backgroundImage.rowBytes;
/* scan top */
mkLine = 1;
while ((mkLine < gmkHeight) &&
(mkBitsCache[mkLine * gmkWidth] <= maxBad))
mkLine++;
for (outTop = mkLine - 1; outTop >= 1; outTop--)
{
if (DetectLeft(&backgroundImage, maxBad,
mkRowBytes, picRowBytes,
backgroundImage.bounds.top,
backgroundImage.bounds.top + 1,
outTop, gmkHeight, gmkWidth))
return gLocationsCt;
if (DetectCenter(nil, nil, nil, nil, 0, 0,
mkRowBytes, picRowBytes, outTop,
backgroundImage.bounds.top,
maxBad - mkBitsCache[outTop * gmkWidth],
backgroundImage.bounds.top + 1,
picWidth, gmkHeight, &backgroundImage))
return gLocationsCt;
if (DetectRight(&backgroundImage, maxBad,
mkRowBytes, picRowBytes,
backgroundImage.bounds.top,
backgroundImage.bounds.top + 1,
outTop, gmkHeight, picWidth, gmkWidth))
return gLocationsCt;
}
/* scan middle left */
if (DetectLeft(&backgroundImage, maxBad,
mkRowBytes, picRowBytes,
backgroundImage.bounds.top,
backgroundImage.bounds.bottom - gmkHeight + 1,
0, gmkHeight, gmkWidth))
return gLocationsCt;
/* scan middle center */
if (DetectCenter(nil, nil, nil, nil, 0, 0,
mkRowBytes, picRowBytes, 0,
backgroundImage.bounds.top, maxBad,
backgroundImage.bounds.bottom - gmkHeight + 1,
picWidth, gmkHeight, &backgroundImage))
return gLocationsCt;
/* scan middle right */
if (DetectRight(&backgroundImage, maxBad,
mkRowBytes, picRowBytes,
backgroundImage.bounds.top,
backgroundImage.bounds.bottom - gmkHeight + 1,
0, gmkHeight, picWidth, gmkWidth))
return gLocationsCt;
/* scan bottom */
totalBits = gmkTotalBits;
mkLine = 1;
while ((mkLine < gmkHeight) && ((gmkTotalBits -
mkBitsCache[(gmkHeight - mkLine) * gmkWidth])
<= maxBad))
mkLine++;
for (outBottom = 1; outBottom < mkLine; outBottom++)
{
if (DetectLeft(&backgroundImage, maxBad,
mkRowBytes, picRowBytes,
backgroundImage.bounds.bottom - gmkHeight +
outBottom,
backgroundImage.bounds.bottom,
0, gmkHeight - outBottom, gmkWidth))
return gLocationsCt;
if (DetectCenter(nil, nil, nil, nil, 0, 0,
mkRowBytes, picRowBytes, 0,
backgroundImage.bounds.bottom - gmkHeight +
outBottom,
maxBad - totalBits +
mkBitsCache[(gmkHeight - outBottom) * gmkWidth],
backgroundImage.bounds.bottom, picWidth,
gmkHeight - outBottom, &backgroundImage))
return gLocationsCt;
if (DetectRight(&backgroundImage, maxBad,
mkRowBytes, picRowBytes,
backgroundImage.bounds.bottom - gmkHeight +
outBottom,
backgroundImage.bounds.bottom,
0, gmkHeight - outBottom, picWidth, gmkWidth))
return gLocationsCt;
}
return gLocationsCt;
}
DetectLeft
Boolean DetectLeft(
BitMap* picture,
long maxBad,
short mkRowBytes,
short picRowBytes,
short picTop,
short picTopLow,
short mkTop,
short mkBottom,
short mkWidth)
{
ushort *picBasePtr, *picLinePtr, *picCurPtr;
ushort *patBasePtr, *patLinePtr, *patCurPtr;
ushort *mkBasePtr, *mkLinePtr, *mkCurPtr,
*mkEndOfLinePtr;
short mkLine;
ushort outLeft;
ulong picLong;
ushort picShift;
long remBad;
ushort mkout16s;
for (;picTop < picTopLow; picTop++)
{
outLeft = 1;
do
{ /* calculating remBad */
if ((mkTop) || // we're out by the top
(mkBottom == gmkHeight)) // we are fully in
// vertically
remBad = maxBad -
gmkBitsCache[mkTop * mkWidth + outLeft];
else
remBad = maxBad - gmkTotalBits +
gmkBitsCache[mkBottom * mkWidth + outLeft] -
gmkBitsCache[outLeft];
if (remBad < 0)
break; // this causes the do{} to stop
mkout16s = outLeft / 16;
(Ptr)mkBasePtr = gMask.baseAddr +
mkTop * mkRowBytes + mkout16s;
(Ptr)patBasePtr = gPattern.baseAddr +
mkTop * mkRowBytes + mkout16s;
(Ptr)picBasePtr = picture->baseAddr +
(picTop - picture->bounds.top) * picRowBytes;
picShift = outLeft & 0x0F;
/* left border matching */
if (picShift)
{
ushort mkMask;
mkMask = 0xFFFF >> picShift;
mkLinePtr = mkBasePtr++;
patLinePtr = patBasePtr++;
picLinePtr = picBasePtr++;
for (mkLine = mkTop; mkLine < mkBottom;
mkLine++)
{
picLong = *picLinePtr;
(Ptr)picLinePtr += picRowBytes;
remBad -= gBitsOn[
((*patLinePtr) ^ (picLong >> picShift)) &
(*mkLinePtr) & mkMask];
(Ptr)mkLinePtr += mkRowBytes;
(Ptr)patLinePtr += mkRowBytes;
}
}
if (remBad < 0)
continue;
/* regular pattern matching*/
mkLinePtr = mkBasePtr;
(Ptr)mkEndOfLinePtr = gMask.baseAddr +
(mkTop + 1) * mkRowBytes;
patLinePtr = patBasePtr;
picLinePtr = picBasePtr;
/*for each line in the mask */
for (mkLine = mkTop; mkLine < mkBottom; mkLine++)
{
mkCurPtr = mkLinePtr;
patCurPtr = patLinePtr;
picCurPtr = picLinePtr;
picLong = *picCurPtr++;
if (picShift)
{
picLong <<= 16;
picLong |= *picCurPtr++;
}
/* for each 16s of the mask inside the pict */
for (; mkCurPtr < mkEndOfLinePtr; mkCurPtr++)
{
remBad -= gBitsOn[((*patCurPtr++) ^
(picLong >> picShift)) & (*mkCurPtr)];
picLong <<= 16;
picLong |= *picCurPtr++;
}
if (remBad < 0)
break;
(Ptr)mkLinePtr += mkRowBytes;
(Ptr)mkEndOfLinePtr += mkRowBytes;
(Ptr)patLinePtr += mkRowBytes;
(Ptr)picLinePtr += picRowBytes;
}
if (remBad >= 0)
{
if (gLocationsCt < gMaxLocations)
{
gLocations[gLocationsCt].v =
picTop - mkTop - gmkSkip.v;
gLocations[gLocationsCt++].h =
picture->bounds.left - outLeft
- gmkSkip.h;
} else
return true;
}
} while (++outLeft < mkWidth);
}
return false;
}
#define AddCenterLocation(rem, shift) \
if (rem >= 0) \
{ \
if (gLocationsCt < gMaxLocations) \
{ \
gLocations[gLocationsCt].v = \
picTop - mkTop - gmkSkip.v; \
gLocations[gLocationsCt++].h = \
picture->bounds.left + shift + \
(picWidth - picRemain) \
- gmkSkip.h; \
} else \
return true; \
}
DetectCenter
Boolean DetectCenter(
ushort *patCurPtr,
ushort *mkCurPtr,
ushort *picLinePtr,
ushort *picCurPtr,
ulong picLong,
long dumbReg,
long mkRowBytes,
long picRowBytes,
long mkTop,
long picTop,
long maxBad,
long picTopLow,
long picWidth,
long mkBottom,
BitMap* picture
)
{
short picRemain;
float mkLineStart;
float mkLine;
mkLineStart = mkBottom - mkTop - 0.5;
picWidth -= gmkWidth;
for (; picTop < picTopLow; picTop++)
{
picRemain = picWidth;
do
{
register ushort remFlag;
register long
remBad00, remBad01, remBad02, remBad03,
remBad04, remBad05, remBad06, remBad07,
remBad08, remBad09, remBad10, remBad11,
remBad12, remBad13, remBad14, remBad15;
/* initializing the remBads and remFlag */
remFlag = 0xFFFF;
remBad00 = maxBad; remBad01 = remBad00;
remBad02 = remBad00; remBad03 = remBad01;
remBad04 = remBad00; remBad05 = remBad01;
remBad06 = remBad00; remBad07 = remBad01;
remBad08 = remBad00; remBad09 = remBad01;
remBad10 = remBad00; remBad11 = remBad01;
remBad12 = remBad00; remBad13 = remBad01;
remBad14 = remBad00; remBad15 = remBad01;
switch (picRemain)
{
case 0: remBad01 = -1;
case 1: remBad02 = -1;
case 2: remBad03 = -1;
case 3: remBad04 = -1;
case 4: remBad05 = -1;
case 5: remBad06 = -1;
case 6: remBad07 = -1;
case 7: remBad08 = -1;
case 8: remBad09 = -1;
case 9: remBad10 = -1;
case 10: remBad11 = -1;
case 11: remBad12 = -1;
case 12: remBad13 = -1;
case 13: remBad14 = -1;
case 14: remBad15 = -1;
}
(Ptr)mkCurPtr =
gMask.baseAddr + mkTop * mkRowBytes;
(Ptr)patCurPtr =
gPattern.baseAddr + mkTop * mkRowBytes;
/* regular pattern matching*/
(Ptr)picLinePtr = picture->baseAddr +
(picTop - picture->bounds.top) * picRowBytes;
picLinePtr += (picWidth - picRemain) / 16;
/*for each line in the mask */
for (mkLine = mkLineStart; mkLine > 0.0; mkLine--)
{
picCurPtr = picLinePtr;
(Ptr)picLinePtr += mkRowBytes;
picLong = *picCurPtr++;
/* for each 16s of the mask */
for (; picCurPtr <= picLinePtr; )
{
ulong mkLo, patLo;
picLong <<= 16;
picLong |= *picCurPtr++;
mkLo = (*mkCurPtr++);
patLo = (*patCurPtr++);
switch (picRemain)
{
default:
case 15:
if (remBad15 < 0)
remFlag &= 0x7FFF;
else remBad15 -= gBitsOn[(patLo ^
(picLong >> 1)) & mkLo];
case 14:
if (remBad14 < 0)
remFlag &= 0xBFFF;
else remBad14 -= gBitsOn[(patLo ^
(picLong >> 2)) & mkLo];
case 13:
if (remBad13 < 0)
remFlag &= 0xDFFF;
else remBad13 -= gBitsOn[(patLo ^
(picLong >> 3)) & mkLo];
case 12:
if (remBad12 < 0)
remFlag &= 0xEFFF;
else remBad12 -= gBitsOn[(patLo ^
(picLong >> 4)) & mkLo];
case 11:
if (remBad11 < 0)
remFlag &= 0xF7FF;
else remBad11 -= gBitsOn[(patLo ^
(picLong >> 5)) & mkLo];
case 10:
if (remBad10 < 0)
remFlag &= 0xFBFF;
else remBad10 -= gBitsOn[(patLo ^
(picLong >> 6)) & mkLo];
case 9:
if (remBad09 < 0)
remFlag &= 0xFDFF;
else remBad09 -= gBitsOn[(patLo ^
(picLong >> 7)) & mkLo];
case 8:
if (remBad08 < 0)
remFlag &= 0xFEFF;
else remBad08 -= gBitsOn[(patLo ^
(picLong >> 8)) & mkLo];
case 7:
if (remBad07 < 0)
remFlag &= 0xFF7F;
else remBad07 -= gBitsOn[(patLo ^
(picLong >> 9)) & mkLo];
case 6:
if (remBad06 < 0)
remFlag &= 0xFFBF;
else remBad06 -= gBitsOn[(patLo ^
(picLong >> 10)) & mkLo];
case 5:
if (remBad05 < 0)
remFlag &= 0xFFDF;
else remBad05 -= gBitsOn[(patLo ^
(picLong >> 11)) & mkLo];
case 4:
if (remBad04 < 0)
remFlag &= 0xFFEF;
else remBad04 -= gBitsOn[(patLo ^
(picLong >> 12)) & mkLo];
case 3:
if (remBad03 < 0)
remFlag &= 0xFFF7;
else remBad03 -= gBitsOn[(patLo ^
(picLong >> 13)) & mkLo];
case 2:
if (remBad02 < 0)
remFlag &= 0xFFFB;
else remBad02 -= gBitsOn[(patLo ^
(picLong >> 14)) & mkLo];
case 1:
if (remBad01 < 0)
remFlag &= 0xFFFD;
else remBad01 -= gBitsOn[(patLo ^
(picLong >> 15)) & mkLo];
case 0:
if (remBad00 < 0)
remFlag &= 0xFFFE;
else remBad00 -= gBitsOn[(patLo ^
(picLong >> 16)) & mkLo];
}
}
(Ptr)picLinePtr -= mkRowBytes;
(Ptr)picLinePtr += picRowBytes;
if (remFlag == 0)
break;
}
if (remFlag)
{
AddCenterLocation(remBad00, 0);
AddCenterLocation(remBad01, 1);
AddCenterLocation(remBad02, 2);
AddCenterLocation(remBad03, 3);
AddCenterLocation(remBad04, 4);
AddCenterLocation(remBad05, 5);
AddCenterLocation(remBad06, 6);
AddCenterLocation(remBad07, 7);
AddCenterLocation(remBad08, 8);
AddCenterLocation(remBad09, 9);
AddCenterLocation(remBad10, 10);
AddCenterLocation(remBad11, 11);
AddCenterLocation(remBad12, 12);
AddCenterLocation(remBad13, 13);
AddCenterLocation(remBad14, 14);
AddCenterLocation(remBad15, 15);
}
picRemain -= 16;
} while (picRemain >= 0);
}
return false;
}
DetectRight
Boolean DetectRight(
BitMap* picture,
long maxBad,
short mkRowBytes,
short picRowBytes,
short picTop,
short picTopLow,
short mkTop,
short mkBottom,
short picWidth,
short mkWidth)
{
ushort *picBasePtr, *picLinePtr, *picCurPtr;
ushort *patBasePtr, *patLinePtr, *patCurPtr;
ushort *mkBasePtr, *mkLinePtr, *mkCurPtr,
*mkEndOfLinePtr;
short mkLine;
ushort outRight;
ulong picLong;
short picShift; /* signed: may be negative before the += 16 adjustment below */
long remBad;
ushort mkout16s;
ushort mktotal16s;
ushort mkModulo;
ushort picModulo;
ushort flushModulo;
mkModulo = mkWidth & 0x0F;
(Ptr)mkBasePtr = gMask.baseAddr +
mkTop * mkRowBytes;
(Ptr)patBasePtr = gPattern.baseAddr +
mkTop * mkRowBytes;
for (; picTop < picTopLow; picTop++)
{
outRight = 1;
do
{ /* calculating remBad */
if ((mkTop) || // we're out by the top
(mkBottom == gmkHeight)) // we are fully in
// vertically
remBad = maxBad - gmkTotalBits +
gmkBitsCache[(mkTop + 1) * mkWidth -
outRight] -
gmkBitsCache[mkTop * mkWidth];
else
remBad = maxBad - gmkTotalBits +
gmkBitsCache[mkWidth - outRight] +
gmkBitsCache[mkBottom * mkWidth] -
gmkBitsCache[(mkBottom + 1) * mkWidth -
outRight];
if (remBad < 0)
break; // this causes the do{} to stop
/* setup */
mktotal16s = (mkWidth + 15) / 16;
mkout16s = (outRight +
((mkWidth - outRight - 1) & 0x0F)) / 16;
(Ptr)picBasePtr = picture->baseAddr +
(picTop - picture->bounds.top) *
picRowBytes;
picModulo = picWidth & 0x0F;
flushModulo = (mkWidth - outRight) & 0x0F;
picShift = flushModulo - picModulo;
if (picShift < 0)
picShift += 16;
/* right border matching */
if (flushModulo > 0)
{
ulong mkMask;
mkLinePtr = mkBasePtr + mktotal16s -
(mkout16s + 1);
patLinePtr = patBasePtr + mktotal16s -
(mkout16s + 1);
picLinePtr = picBasePtr + (picWidth / 16 - 1);
mkMask = 0xFFFF << (16 - flushModulo);
for (mkLine = mkTop;
mkLine < mkBottom; mkLine++)
{
picLong = *picLinePtr;
picLong <<= 16;
if (picModulo)
picLong = *(picLinePtr + 1);
picLong = picLong >> picShift;
remBad -= gBitsOn[((*patLinePtr) ^ picLong)
& (*mkLinePtr) & mkMask];
(Ptr)mkLinePtr += mkRowBytes;
(Ptr)patLinePtr += mkRowBytes;
(Ptr)picLinePtr += picRowBytes;
}
}
if (remBad < 0)
continue;
/* Regular pattern matching */
mkLinePtr = mkBasePtr;
patLinePtr = patBasePtr;
picLinePtr = picBasePtr + (picWidth / 16) -
(mktotal16s - mkout16s);
mkEndOfLinePtr = mkLinePtr +
mktotal16s - mkout16s;
if (flushModulo)
mkEndOfLinePtr--;
/* for each line in the mask */
for (mkLine = mkTop; mkLine < mkBottom; mkLine++)
{
mkCurPtr = mkLinePtr;
patCurPtr = patLinePtr;
picCurPtr = picLinePtr;
picLong = *picCurPtr++;
if (picShift)
{
picLong <<= 16;
picLong |= *picCurPtr++;
}
/* each short in the mask inside the pict */
for (; mkCurPtr < mkEndOfLinePtr;
mkCurPtr++)
{
remBad -= gBitsOn[((*patCurPtr++) ^
(picLong >> picShift)) & (*mkCurPtr)];
picLong <<= 16;
picLong |= *picCurPtr++;
}
if (remBad < 0)
break;
(Ptr)mkLinePtr += mkRowBytes;
(Ptr)mkEndOfLinePtr += mkRowBytes;
(Ptr)patLinePtr += mkRowBytes;
(Ptr)picLinePtr += picRowBytes;
}
if (remBad >= 0)
{
if (gLocationsCt < gMaxLocations)
{
gLocations[gLocationsCt].v =
picTop - mkTop - gmkSkip.v;
gLocations[gLocationsCt++].h =
picture->bounds.right + outRight -
mkWidth - gmkSkip.h;
} else
return true;
}
} while (++outRight < mkWidth);
}
return false;
}
BitsOn.c
/*
an array of 65536 char giving the number of 1's
in any 16 bit number.
*/
unsigned char gBitsOn[256*256] = {
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
/*
[ you get the idea ... most of the rest deleted ]
*/
11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15,
8, 9, 9,10, 9,10,10,11, 9,10,10,11,10,11,11,12,
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13,
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13,
10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14,
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13,
10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14,
10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14,
11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15,
9,10,10,11,10,11,11,12,10,11,11,12,11,12,12,13,
10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14,
10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14,
11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15,
10,11,11,12,11,12,12,13,11,12,12,13,12,13,13,14,
11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15,
11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15,
12,13,13,14,13,14,14,15,13,14,14,15,14,15,15,16
};
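Because the listing above elides most of the initializer, the table cannot be compiled exactly as printed. One way to obtain the same values is to compute them, for instance with the recurrence below, which reuses the count already stored for the index shifted right by one bit. This is only a sketch: FillBitsOn is a hypothetical name and is not part of Ludo's entry, which stores the table as static data.

```c
/* Fills table[0..65535] so that table[i] is the number of 1 bits in i.
   The count for i is the count for i with its low bit dropped,
   plus that low bit. */
static void FillBitsOn(unsigned char table[65536])
{
    unsigned long i;

    table[0] = 0;
    for (i = 1; i < 65536UL; i++)
        table[i] = (unsigned char)(table[i >> 1] + (i & 1));
}
```

The same routine could be run once at startup to fill an uninitialized array, or used in a small tool that prints the initializer for BitsOn.c.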