Aug 93 Challenge
Volume Number: 9
Issue Number: 8
Column Tag: Programmer's Challenge
Programmer's Challenge
By Mike Scanlin, MacTech Magazine Regular Contributing Author
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
REPLACE ALL
Have you ever done a Replace All operation in some program and thought that it was taking more time than it should to do the job? Well, this month you'll have your chance to show those text-based applications how to do it right. The goal is to write a case-sensitive find-and-replace function.
The prototype of the function you write is:
long ReplaceAll(sourceHndl,
replaceHndl, targetHndl)
Handle sourceHndl;
Handle replaceHndl;
Handle targetHndl;
The sourceHndl contains the text to search for. The replaceHndl contains the text to replace the source text with, once found, and the targetHndl contains the text to look in. You should replace each occurrence of the sourceHndl text in targetHndl with the replaceHndl text. You can get the size of each piece by using GetHandleSize, and you can resize targetHndl as necessary with normal Macintosh Memory Manager calls.
For example, suppose targetHndl contained this text:
Perhaps this modern sorcery especially attracts those who believe in
happy endings and fairy godmothers.
with this sourceHndl text:
fairy godmothers
and this replaceHndl text:
pink elephants
After a call to ReplaceAll, targetHndl would contain:
Perhaps this modern sorcery especially attracts those who believe in
happy endings and pink elephants.
The return value from ReplaceAll is the actual number of substitutions made (1 in this example). You must find an exact match before doing a replacement (i.e., "cat" does not match "Cat"). The maximum size of each Handle on entry to ReplaceAll will be 255 bytes for sourceHndl and replaceHndl, and 65535 bytes for targetHndl. There is no maximum output size for targetHndl. You should grow or shrink targetHndl as necessary so that it's exactly the correct size when ReplaceAll returns. If you run out of memory while trying to grow targetHndl, then return -1 instead of the number of substitutions. The text you will be searching through will be mixed-case English text containing punctuation.
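To make the required behavior concrete, here is a minimal, unoptimized sketch of the same semantics in portable C. It is not a contest entry and does not use Handles: the name ReplaceAllBuf, its parameters, and the use of malloc/free in place of Memory Manager calls are all assumptions made for illustration. It also assumes that matches are taken left to right and do not overlap, which the puzzle statement does not spell out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for ReplaceAll that works on a
 * malloc'd buffer instead of a Handle. On success *target
 * is replaced by a buffer of exactly the right size and the
 * number of substitutions is returned; on allocation
 * failure -1 is returned and *target is left untouched.
 */
long ReplaceAllBuf(const char *src, size_t srcLen,
                   const char *rep, size_t repLen,
                   char **target, size_t *targetLen)
{
    const char *in = *target;
    size_t inLen = *targetLen;
    size_t i, o, outLen;
    long count = 0;
    char *out;

    if (srcLen == 0 || srcLen > inLen)
        return 0;

    /* Pass 1: count non-overlapping, case-sensitive matches. */
    for (i = 0; i + srcLen <= inLen; ) {
        if (memcmp(in + i, src, srcLen) == 0) {
            ++count;
            i += srcLen;
        } else {
            ++i;
        }
    }
    if (count == 0)
        return 0;

    /* The exact output size is known before any copying. */
    outLen = inLen - (size_t)count * srcLen
                   + (size_t)count * repLen;
    out = malloc(outLen ? outLen : 1);
    if (out == NULL)
        return -1;

    /* Pass 2: copy, substituting at each match. */
    for (i = 0, o = 0; i < inLen; ) {
        if (i + srcLen <= inLen &&
            memcmp(in + i, src, srcLen) == 0) {
            memcpy(out + o, rep, repLen);
            o += repLen;
            i += srcLen;
        } else {
            out[o++] = in[i++];
        }
    }
    assert(o == outLen);

    free(*target);
    *target = out;
    *targetLen = outLen;
    return count;
}
```

Counting first means the buffer is sized exactly once. A competitive entry would presumably replace the byte-by-byte scan with something faster and resize the Handle in place.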
TWO MONTHS AGO WINNER
Congratulations to Bob Boonstra (Westford, MA) for his fast and small entry in the "Where In The World" challenge. Tying Bob for speed but not for code size is Jeff Mallett (Hickory, NC). Jeff's solution was a little faster than Bob's for some cases and a little slower than Bob's for others. I couldn't find a pattern, so I went to the second criterion, code size, to pick the winner.
Here are the sizes and times for the 16 entries that yielded correct results:
Name               bytes  ticks
Bob Boonstra         358     55
Jeff Mallett         728     55
Stepan Riha          514     78
Jim Bumgardner       454     86
Jeff Tupper          624     94
Russ LaValle         530    101
Jan Bruyndonckx      402    112
David Salmon         638    133
Ted Krovetz          250    136
Patrick Breen        350    137
Ricky Schrieber      286    145
Thomas Studer        488    164
Stuart McIntosh      390    230
Robert Fisher        396    268
David Rand           302    420
Bob Menteer         1224    506
Since the puzzle stated that exact matches would be passed to FindCity two-thirds of the time, it is reasonable to do an exact-match search first, before trying to find the five closest matches. Bob makes this first pass by comparing the first 4 bytes of the target string with the first 4 bytes of each element in the cities array. Only if those 4 bytes match exactly (which includes the length byte) does he continue to check the remaining parts of the string (again, 4 bytes at a time through clever typecasting).
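The idea of comparing length-prefixed strings one 4-byte word at a time can be sketched in portable C as follows. This is an illustration only, not Bob's code: the names MakePStr, Load32 and PStrEqual are hypothetical, and memcpy is used to load each word instead of casting to a long pointer so that the sketch does not depend on alignment. Unlike the first-word test in the listing, it also compares the length byte on its own first, so it never relies on what lies beyond the end of a short string.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a length-prefixed (Pascal)
 * string in a 256-byte buffer, as a stand-in for Str255. */
void MakePStr(unsigned char dst[256], const char *src)
{
    size_t n = strlen(src);
    assert(n <= 255);
    dst[0] = (unsigned char)n;
    memcpy(dst + 1, src, n);
}

/* Load 4 bytes as one word without alignment assumptions. */
static uint32_t Load32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Return 1 if the two Pascal strings are identical. */
int PStrEqual(const unsigned char *a, const unsigned char *b)
{
    size_t total = (size_t)a[0] + 1; /* length byte + chars */
    size_t i = 0;

    if (a[0] != b[0])
        return 0;

    /* Whole 4-byte words; the first one includes the
     * length byte. */
    while (i + 4 <= total) {
        if (Load32(a + i) != Load32(b + i))
            return 0;
        i += 4;
    }
    /* Remaining 0 to 3 bytes, one at a time. */
    while (i < total) {
        if (a[i] != b[i])
            return 0;
        ++i;
    }
    return 1;
}
```

Because the length byte is part of the first word, most non-matching cities are rejected by a single 32-bit comparison.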
Once Bob detects that he doesn't have an exact match, he calculates a closeness-of-match number for every entry in the cities array. But he eliminates some of the work by first comparing the difference in length bytes between the target string and the current element. If it's greater than his current largest difference (of the 5 closest he's keeping track of), then he skips the element completely and moves on to the next. This is a neat trick that takes advantage of the way the closeness-of-match number was defined. I like it.
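As a rough illustration of that pruning, the sketch below computes the closeness number the way the listing appears to: the difference in lengths plus the number of mismatched characters over the shorter length. The names ToPStr and PStrDistance and the cutoff parameter are assumptions for this example. Giving up as soon as the running distance reaches the cutoff (rather than exceeds it) mirrors the >= test in the listing, since a tie with the current worst candidate would not displace it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: build a length-prefixed (Pascal)
 * string in a 256-byte buffer. */
void ToPStr(unsigned char dst[256], const char *src)
{
    size_t n = strlen(src);
    assert(n <= 255);
    dst[0] = (unsigned char)n;
    memcpy(dst + 1, src, n);
}

/*
 * Distance between two Pascal strings: length difference
 * plus mismatches over the common prefix length. Returns
 * -1 as soon as the distance reaches cutoff, so hopeless
 * candidates cost almost nothing.
 */
int PStrDistance(const unsigned char *a,
                 const unsigned char *b, int cutoff)
{
    int lenA = a[0], lenB = b[0];
    int common = lenA < lenB ? lenA : lenB;
    int dist = lenA > lenB ? lenA - lenB : lenB - lenA;
    int i;

    /* The length difference alone may already be too big. */
    if (dist >= cutoff)
        return -1;

    for (i = 1; i <= common; i++) {
        if (a[i] != b[i] && ++dist >= cutoff)
            return -1;
    }
    return dist;
}
```

Since the distance can only grow as characters are compared, the length difference is a lower bound that is available before any characters are examined.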
Here's Bob's winning solution:
/* FindCity by Bob Boonstra */

#define kMax      5
#define kSentinal 0x035A5A5A
#define low3Bytes 0x00FFFFFF

typedef struct distRec {
  short theDist;
  unsigned short theIndx;
} distRec;

Boolean FindCity(cityNames, cityToFindNamePtr,
                 closestMatches)
Str255 cityNames[];
Str255 *cityToFindNamePtr;
unsigned short closestMatches[];
{
  distRec theRec[kMax];
  unsigned char *curCityP, *city2Find;
  unsigned short curIndx;
  register short maxDist;

  city2Find = *cityToFindNamePtr;
  curCityP = cityNames[0];
  /*
   * Scan for an exact match.
   */
  {
    register long toFind3;
    curIndx = 0;
    toFind3 = *(long *)city2Find & low3Bytes;
    do {
      if ( *(long *)curCityP == *(long *)city2Find ) {
        /*
         * If first 4 characters match, look at the
         * rest in chunks of 4 characters.
         */
        register short charsLeft;
        register unsigned char *s1 = curCityP+4;
        register unsigned char *s2 = city2Find+4;
        if ( (charsLeft = *curCityP-7) >= 0 ) {
          do {
            if ( *(long *)s1 != *(long *)s2 )
              goto nextOne;
            s1 += 4; s2 += 4;
          } while ( (charsLeft-=4) >= 0 );
        }
        /*
         * If all chunks of 4 characters match, look at
         * the rest individually.
         */
        if (charsLeft+=4) {
          do {
            if (*s1++ != *s2++)
              goto nextOne;
          } while (--charsLeft);
        }
        /*
         * Exact match found. Return index of match.
         */
        closestMatches[0] = curIndx;
        return(1);
        /*
         * Process the next city. Exit if it is greater in
         * alphabetic order (based on 1st 3 characters,
         * w/o length byte).
         * Sentinel will force exit if necessary.
         */
nextOne: ;
      }
      ++curIndx;
      curCityP+=sizeof(Str255);
    } while ( (*(long *)curCityP & low3Bytes) <= toFind3 );
  }
  /*
   * Initialize distance structure for 5 closest matches.
   */
noMatch:
  {
    register distRec *p = &theRec[0];
    (++p)->theDist = (++p)->theDist = (++p)->theDist =
      (++p)->theDist = (p)->theDist = maxDist = 255;
  }
  /*
   * Loop thru cityNames to find closest matches.
   */
  curCityP=cityNames[0];
  curIndx=0;
  do {
    register short curDist;
    /*
     * Calculate dist between cityToFind and
     * cityNames[curIndx].
     */
    {
      register unsigned char *s1,*s2;
      register short lng;
      /* initialize dist to length difference */
      s2 = city2Find;
      lng = *s2++;
      if ((curDist = *(s1=curCityP) - lng) < 0) {
        curDist = -curDist;
        lng = *s1;
      }
      /*
       * Move to next city if distance exceeds the distance
       * of 5 other cities.
       * Increment dist for each nonmatching char.
       * Unroll for a little extra speed.
       */
      ++s1;
      do {
        if (*s1++ != *s2++)
          if (++curDist >= maxDist) goto nextCity;
        if (!(--lng)) break;
        if (*s1++ != *s2++)
          if (++curDist >= maxDist) goto nextCity;
        if (!(--lng)) break;
        if (*s1++ != *s2++)
          if (++curDist >= maxDist) goto nextCity;
        if (!(--lng)) break;
        if (*s1++ != *s2++)
          if (++curDist >= maxDist) goto nextCity;
      } while (--lng);
    }
    /*
     * This city is closer than at least one of the five
     * currently closest cities. Store the distance and
     * the index in the proper place.
     * theRec[0].theIndx is the closest match, and
     * theRec[0].theDist is the associated distance.
     */
    {
      register distRec *q=theRec+kMax-1;
      maxDist=curDist;
      if (curDist >= (q-1)->theDist) goto storeIt;
      *q = *(q-1);
      maxDist = q->theDist; --q;  /* [3]-->[4] */
      if (curDist >= (q-1)->theDist) goto storeIt;
      *q = *(q-1); --q;           /* [2]-->[3] */
      if (curDist >= (q-1)->theDist) goto storeIt;
      *q = *(q-1); --q;           /* [1]-->[2] */
      if (curDist >= (q-1)->theDist) goto storeIt;
      *q = *(q-1); --q;           /* [0]-->[1] */
storeIt:
      q->theDist = curDist;
      q->theIndx = curIndx;
    }
    /*
     * Process next city.
     */
nextCity:
    curCityP+=sizeof(Str255);
    ++curIndx;
    /*
     * Exit when sentinel is found.
     */
  } while (*(long *)curCityP != kSentinal);
  /*
   * Return indices of 5 closest cities.
   */
  {
    register unsigned short *p = closestMatches;
    register unsigned short *q = &theRec[0].theIndx;
    *p++ = *q; q+=2;
    *p++ = *q; q+=2;
    *p++ = *q; q+=2;
    *p++ = *q; q+=2;
    *p = *q;
  }
  return(0);
}
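The bookkeeping near the end of the listing, which keeps the five best candidates sorted by ascending distance, is unrolled by hand for speed. For readers who want to see the same idea without the gotos, here is a loop-based sketch. The names Candidate, InitBest and InsertBest are hypothetical, and this version makes no attempt to match the timing of the original. As in the listing, a new candidate that ties an existing one is placed after it.

```c
#include <assert.h>

#define kKeep 5

typedef struct {
    int dist;        /* closeness number (smaller is closer) */
    unsigned index;  /* position in the cities array */
} Candidate;

/* Fill the table with a distance no real city can reach. */
void InitBest(Candidate best[kKeep])
{
    int i;
    for (i = 0; i < kKeep; i++) {
        best[i].dist = 255;
        best[i].index = 0;
    }
}

/*
 * Insert a candidate if it beats the current worst entry,
 * keeping best[] sorted by ascending distance. Returns the
 * worst distance still in the table, which the caller can
 * use as the cutoff for the next city.
 */
int InsertBest(Candidate best[kKeep], int dist,
               unsigned index)
{
    int i = kKeep - 1;

    if (dist >= best[i].dist)
        return best[i].dist;

    /* Shift worse entries down one slot. */
    while (i > 0 && best[i - 1].dist > dist) {
        best[i] = best[i - 1];
        --i;
    }
    assert(i >= 0 && i < kKeep);
    best[i].dist = dist;
    best[i].index = index;
    return best[kKeep - 1].dist;
}
```

Feeding the return value back in as the cutoff for the next distance calculation gives the same effect as maxDist in the listing: the bar for later cities rises as better candidates are found.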
The Rules
Here's how it works: Each month there will be a different programming challenge presented here. First, you must write some code that solves the challenge. Second, you must optimize your code (a lot). Then, submit your solution to MacTech Magazine (formerly MacTutor). A winner will be chosen based on code correctness, speed, size and elegance (in that order of importance) as well as the postmark of the answer. In the event of multiple equally desirable solutions, one winner will be chosen at random (with honorable mention, but no prize, given to the runners-up). The prize for the best solution each month is $50 and a limited-edition "The Winner! MacTech Magazine Programming Challenge" T-shirt (not to be found in stores).
In order to make fair comparisons between solutions, all solutions must be in ANSI-compatible C (i.e., don't use Think's Object extensions). Only pure C code can be used. Any entries with any assembly in them will be disqualified. However, you may call any routine in the Macintosh toolbox you want (i.e., it doesn't matter if you use NewPtr instead of malloc). All entries will be tested with the FPU and 68020 flags turned off in THINK C. When timing routines, the latest version of THINK C will be used (with "ANSI Settings" plus "Honor register first" and "Use Global Optimizer" turned on), so beware if you optimize for a different C compiler. All code should be limited to 60 characters wide. This will aid us in dealing with e-mail gateways and page layout.
The solution and winners for this month's Programmer's Challenge will be published in the issue two months later. All submissions must be received by the 10th day of the month printed on the front of this issue.
All solutions should be marked "Attn: Programmer's Challenge Solution" and sent to Xplain Corporation (the publishers of MacTech Magazine) via snail mail or, preferably, e-mail - AppleLink: MT.PROGCHAL, Internet: progchallenge@xplain.com, CompuServe: 71552,174 and America Online: MT PRGCHAL. If you send via snail mail, please include a disk with the solution and all related files (including contact information). See page 2 for information on "How to Contact Xplain Corporation."
MacTech Magazine reserves the right to publish any solution entered in the Programming Challenge of the Month and all entries are the property of MacTech Magazine upon submission. The submission falls under all the same conventions of an article submission.