Programmer's Challenge

by Bob Boonstra, Westford, MA

Image Locator

Imagine yourself with a collection of satellite images and the task of finding a particular item in those images. Rather than look for a needle in this image haystack manually, you might call on your PowerMac to narrow down the choices for you. We'll enlist the aid of our Programmer's Challenge participants to help you do that job quickly. The Challenge this month is to detect the presence of a target pattern inside a larger background image. Because the background has been detected by an imperfect sensor, there is noise present in the background image. Your code will need to detect the target in the background despite this noise.

The prototype for the code you should write is:

#define topLeft(r)    (((Point *) &(r))[0])

void InitTarget(
  BitMap pattern,    /* image to be detected */
  BitMap mask      /* bits in image that we care about */
);

long /* numFound */ ImageDetect(
  BitMap backgroundImage,  /* find the target image in backgroundImage */
  Point locations[],    /* return topLeft of matching locations here */
  long maxLocations,    /* max number of locations to return */
  float noise        /* allow this fraction of mismatched bits */
);

void CleanUp(void);  /* deallocate any memory allocated by InitTarget */

Image location will take place in two steps. First, your InitTarget routine will be called with the target pattern that you will be looking for. Next, the ImageDetect routine will be called multiple times, each time with a different background image and an associated noise threshold. ImageDetect should locate all occurrences of the target in the background, allowing for mismatched bits up to the noise threshold, and return the locations of the pattern matches. Finally, the CleanUp routine will be called to allow you to deallocate any memory allocated by InitTarget.

InitTarget will be called with two BitMaps that describe the target pattern to be detected. The pattern BitMap identifies bits that should be set to 1, and the mask BitMap describes the bits that you care about (1s and 0s). Any bits not in the mask are not part of the target image, and the corresponding values in the background image are to be ignored. InitTarget should process the target image as desired and allocate memory to remember it.
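
One way to read the pattern/mask convention: a bit counts as a mismatch only if it is set in the mask and the background bit differs from the pattern bit. The following is a minimal sketch, assuming each image row has already been packed into 32-bit words; the names PopCount32 and CountMismatches are hypothetical and are not part of the Challenge interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Count the set bits in a 32-bit word (portable, no compiler intrinsics).
static int PopCount32(std::uint32_t w) {
  int n = 0;
  while (w) {
    w &= w - 1;   // clear the lowest set bit
    n++;
  }
  return n;
}

// Hypothetical helper: number of masked bits where the background differs
// from the pattern. All three arrays hold numWords words; bits that are not
// set in the mask are ignored.
int CountMismatches(const std::uint32_t* background,
                    const std::uint32_t* pattern,
                    const std::uint32_t* mask,
                    std::size_t numWords) {
  assert(background && pattern && mask);
  int mismatches = 0;
  for (std::size_t i = 0; i < numWords; i++)
    mismatches += PopCount32((background[i] ^ pattern[i]) & mask[i]);
  return mismatches;
}
```

The XOR marks every differing bit, and the AND with the mask discards the bits we do not care about.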

ImageDetect will then be called multiple times (5-10 on average for each call to InitTarget). You should locate each occurrence of the target image in backgroundImage and return the coordinate in the background of topLeft(pattern.bounds) in the locations array. The noise parameter describes the fraction of target bits where the backgroundImage is allowed to differ from the target and still be considered a match. Up to noise times the number of 1s in the mask, rounded down, bits may be mismatched. Normally, locations will be large enough to hold all of the matches found, but you should not return more than the maxLocations matches for which storage has been allocated. The pattern matches may be returned in any order. If the maxLocations limit is exceeded, the choice of which matches to report is yours. ImageDetect should return the number of matches found.
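
To make the noise rule concrete, here is a brute-force sketch of the matching criterion. It is an illustration only, not a competitive solution: it assumes images are stored as rows of '0'/'1' characters rather than as QuickDraw BitMaps, and the names Grid, Location, AllowedMismatches, and FindMatches are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::string> Grid;                  // rows of '0'/'1' (assumed stand-in for a BitMap)
typedef std::pair<std::size_t, std::size_t> Location;   // (row, column) of a match's top-left corner

// Up to noise * (number of 1s in the mask), rounded down, bits may mismatch.
long AllowedMismatches(float noise, long maskOnes) {
  assert(noise >= 0.0f && maskOnes >= 0);
  // Truncation equals rounding down for non-negative values.
  return static_cast<long>(noise * static_cast<float>(maskOnes));
}

// Test every possible top-left position; stop once maxLocations are stored.
std::vector<Location> FindMatches(const Grid& background, const Grid& pattern,
                                  const Grid& mask, float noise,
                                  std::size_t maxLocations) {
  std::vector<Location> found;
  if (pattern.empty() || background.size() < pattern.size() ||
      background[0].size() < pattern[0].size())
    return found;
  const std::size_t rows = pattern.size(), cols = pattern[0].size();

  long maskOnes = 0;
  for (std::size_t r = 0; r < rows; r++)
    for (std::size_t c = 0; c < cols; c++)
      if (mask[r][c] == '1') maskOnes++;
  const long allowed = AllowedMismatches(noise, maskOnes);

  for (std::size_t top = 0; top + rows <= background.size(); top++) {
    for (std::size_t left = 0; left + cols <= background[0].size(); left++) {
      long bad = 0;
      // Give up on this position as soon as the noise budget is exceeded.
      for (std::size_t r = 0; r < rows && bad <= allowed; r++)
        for (std::size_t c = 0; c < cols; c++)
          if (mask[r][c] == '1' &&
              background[top + r][left + c] != pattern[r][c])
            bad++;
      if (bad > allowed) continue;
      if (found.size() == maxLocations) return found;   // storage exhausted
      found.push_back(Location(top, left));
    }
  }
  return found;
}
```

With a 3-bit mask and a noise value of 0.5, for example, one mismatched bit is tolerated (3 * 0.5 = 1.5, rounded down to 1).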

Other information: The bounds rectangle for the pattern and the mask will be identical. All bits set in the pattern will also be set in the mask (but not the converse). The backgroundImage will typically be the size of a large monitor (e.g., 1024x768, or 1600x1200).

This will be a native PowerPC Challenge, using the latest CodeWarrior environment. In keeping with tradition, September is assembly language month here at the Programmer's Challenge. Solutions may be coded in PowerPC or 68K assembly language, C, C++, or Pascal.

Finally, we should note that the Programmer's Challenge began its sixth year last month. During that time, the Challenge has changed development environments, moved from 68K to PowerPC, and expanded its selection of languages. We appreciate the participation of our readers, without which the Challenge would not be possible. Happy belated birthday, Programmer's Challenge.

Three Months Ago Winner

The June Challenge was to implement a Turing Machine, a finite state machine augmented with an infinite amount of external storage. Twenty people submitted entries, and 17 of those worked correctly. Congratulations to Ernst Munter (Kanata, Ontario) for submitting the fastest solution and returning to the Challenge winner's circle.

The key to success in this Challenge was being able to quickly find the rule that applied to the current machine state and the current input symbol. A variety of techniques were used to find the applicable rule. Hashing was used by many of the faster entries. Ernst uses either hashing or a simple lookup table, depending on memory availability. Others sorted the rules and used a binary search. The slower solutions typically used a brute force approach of simply searching linearly through the rule set.
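
For readers unfamiliar with the sort-and-search approach mentioned above, the following sketch shows one way it might look. It does not reproduce any entrant's code. The SortedRule structure is a hypothetical stand-in that borrows the field names used in the winning listing, and SortRules and FindRule are illustrative names.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the Challenge's rule record.
struct SortedRule {
  unsigned long oldState, inputSymbol, newState, outputSymbol;
  int moveDirection;
};

// Order rules by (oldState, inputSymbol) so they can be binary searched.
static bool RuleLess(const SortedRule& a, const SortedRule& b) {
  if (a.oldState != b.oldState) return a.oldState < b.oldState;
  return a.inputSymbol < b.inputSymbol;
}

void SortRules(std::vector<SortedRule>& rules) {
  std::sort(rules.begin(), rules.end(), RuleLess);
}

// Returns the rule for (state, symbol), or 0 if no such rule exists.
// The vector must already have been sorted with SortRules.
const SortedRule* FindRule(const std::vector<SortedRule>& sorted,
                           unsigned long state, unsigned long symbol) {
  SortedRule key = {state, symbol, 0, 0, 0};
  std::vector<SortedRule>::const_iterator it =
      std::lower_bound(sorted.begin(), sorted.end(), key, RuleLess);
  if (it == sorted.end() || it->oldState != state || it->inputSymbol != symbol)
    return 0;
  return &*it;
}
```

Each lookup costs on the order of log2(numRules) comparisons, which is better than a linear scan but slower than a hashed or direct index that usually finds the rule in one probe.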

I used two types of test cases to stress the solutions. The first case involved a Turing Machine of approximately 2300 rules that sorted an input tape with an alphabet of 30 symbols and tape lengths of about 100 symbols. This case required over 113,000 Turing Machine state changes. The second test case was a Universal Turing Machine. A UTM is an interesting creature. Its input tape has two parts: an encoded version of the rules (program) for another Turing Machine, which it is to execute, and the input tape for that emulated program. The tape also contains an area where the Universal TM maintains the state for the program being emulated. The UTM operates by looking up the rule (or program instruction) that applies given the current state of the machine being emulated, remembering that instruction while it moves to the current input for the emulated machine, and then executing that instruction. The Universal Turing Machine I used operated on a binary alphabet and consisted of 184 rules, operating on an input tape that described a simple unary addition machine. This test case required just under 240,000 state changes to execute.

The table below lists for each entry the execution times in milliseconds for the sort test case and the Universal Turing Machine case, total execution time, code and data sizes, and the programming language used. The number in parentheses after the entrant's name is the total number of Challenge points earned in all Challenges to date prior to this one.

Name	Time1	Time2	Total Time	Code	Data	Language
Ernst Munter (246)	28.1	29.3	57.6	920	8	C++
Russ Webb	30.9	33.0	64.2	1696	140	C
Devon Carew	33.4	35.6	79.5	976	28	C
Gary Beith (24)	39.6	39.9	86.9	592	32	C
Mason Thomas (4)	47.9	40.5	89.1	740	8	C
Kevin Cutts (57)	42.4	43.9	89.7	620	32	C++
Simon Holmes à Court	44.1	42.3	90.0	744	32	C++
Juerg Wullschleger	48.5	50.4	99.7	476	8	C
Daniel Harding	96.0	63.0	159.7	2612	406	C++
Zach Thompson	93.2	64.8	161.3	1076	48	C++
Gregory Cooper (54)	107.8	84.6	192.7	668	40	C
Graham Herrick	137.0	107.2	244.8	820	16	C
Andy Scheck (17)	3239.0	158.3	3397.0	212	8	C++
Charles Higgins (20)	3238.0	167.8	3406.0	276	8	C
David Whitney	4174.0	272.3	4449.0	19800	2745	C++
Bjorn Davidsson (6)	6565.0	165.6	6731.0	224	8	C++
Terry Noyes	6737.0	198.3	6936.0	200	8	C
R.B.				2736	99	C
S.A.				840	448	C
W.R.				1148	8	C++

Top 20 Contestants

Here are the Top Contestants for the Programmer's Challenge. The numbers below include points awarded over the 24 most recent contests, including points earned by this month's entrants.

Rank	Name	Points
1.	Munter, Ernst	196
2.	Gregg, Xan	83
3.	Cooper, Greg	54
4.	Lengyel, Eric	40
5.	Boring, Randy	37
6.	Lewis, Peter	32
7.	Mallett, Jeff	30
8.	Murphy, ACC	30
9.	Larsson, Gustav	27
10.	Antoniewicz, Andy	24
11.	Nicolle, Ludovic	21
12.	Picao, Miguel Cruz	21
13.	Brown, Jorg	20
14.	Day, Mark	20
15.	Gundrum, Eric	20
16.	Higgins, Charles	20
17.	Slezak, Ken	20
18.	Studer, Thomas	20
19.	Karsh, Bill	19
20.	Nevard, John	19

Here is Ernst's winning solution:

Turing.cp © 1997 Ernst Munter

Problem Statement: Implement the engine for a Turing Machine, a state machine which, at each step, reads a symbol from a tape, consults a rule which is a function of the current state and the symbol. The rule specifies a new state, a symbol to output, and the direction in which to move the tape, or to halt.
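
As an illustration of the problem statement (not part of Ernst's solution), here is a minimal engine that finds each rule by plain linear search, together with a small unary addition rule set. The DemoRule type, the DemoDir encoding, and the function names are all assumptions made for this sketch; the real types come from turing.h, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum DemoDir { kDemoLeft = -1, kDemoHalt = 0, kDemoRight = 1 };  // assumed encoding

struct DemoRule {   // hypothetical stand-in for the Challenge's rule record
  unsigned long oldState, inputSymbol, newState, outputSymbol;
  DemoDir move;
};

// Starts in state 0. Returns true if the machine halts, false if no rule
// applies or the head runs off either end of the tape.
bool RunMachine(const std::vector<DemoRule>& rules,
                std::vector<unsigned long>& tape, std::size_t head) {
  assert(head < tape.size());
  unsigned long state = 0;
  for (;;) {
    const DemoRule* rule = 0;
    for (std::size_t i = 0; i < rules.size(); i++)   // brute-force rule lookup
      if (rules[i].oldState == state && rules[i].inputSymbol == tape[head]) {
        rule = &rules[i];
        break;
      }
    if (rule == 0) return false;
    tape[head] = rule->outputSymbol;
    state = rule->newState;
    if (rule->move == kDemoHalt) return true;
    if (rule->move == kDemoLeft) {
      if (head == 0) return false;
      head--;
    } else {
      if (head + 1 == tape.size()) return false;
      head++;
    }
  }
}

// Unary addition: two runs of 1s separated by a single 0 become one run.
std::vector<DemoRule> UnaryAddRules() {
  static const DemoRule r[] = {
    {0, 1, 0, 1, kDemoRight},  // skip over the first number
    {0, 0, 1, 1, kDemoRight},  // replace the separator with a 1
    {1, 1, 1, 1, kDemoRight},  // skip over the second number
    {1, 0, 2, 0, kDemoLeft},   // reached the trailing blank, step back
    {2, 1, 2, 0, kDemoHalt},   // erase the surplus 1 and halt
  };
  return std::vector<DemoRule>(r, r + sizeof(r) / sizeof(r[0]));
}
```

Starting at the left end of the tape 1 1 1 0 1 1 0 (3 + 2 in unary), this machine halts with 1 1 1 1 1 0 0 on the tape. The linear search in the inner loop is exactly the cost that an index is meant to remove.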

Solution

I first try to build a lookup table as an index into the rules array. But this may require an "unreasonable" amount of memory.

First, I scan the rules to determine the amount of table memory required for a simple lookup index. If this appears to be too much, I go to plan B: a hashed index.

The hash table uses linear open addressing: when the table is built and an index location is needed which is already in use we have a collision. To resolve it, I scan linearly through the index array until a free location is found. The size of the index array is larger than the number of rules, so a free location will always be found.
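
The insertion step can be sketched in isolation as follows. This is a simplified illustration, not the code from the listing: it stores positions rather than rule pointers, probes with a step of 1, and uses a made-up hash function. The names Key, HashKey, and BuildIndex are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Key { unsigned long state, symbol; };   // hypothetical (state, symbol) pair

// Made-up hash for illustration; mask is tableSize - 1 for a power-of-2 table.
std::size_t HashKey(const Key& k, std::size_t mask) {
  return (k.state * 31u + k.symbol) & mask;
}

// Build an index of positions into keys[]; -1 marks a free slot.
std::vector<long> BuildIndex(const std::vector<Key>& keys,
                             std::size_t tableSize) {
  assert(tableSize > keys.size());               // a free slot always exists
  assert((tableSize & (tableSize - 1)) == 0);    // power of 2, so "& mask" wraps
  std::vector<long> index(tableSize, -1);
  const std::size_t mask = tableSize - 1;
  for (std::size_t i = 0; i < keys.size(); i++) {
    std::size_t addr = HashKey(keys[i], mask);
    while (index[addr] != -1)                    // collision: try the next slot
      addr = (addr + 1) & mask;
    index[addr] = static_cast<long>(i);
  }
  return index;
}
```

Because the table is strictly larger than the number of keys, the probing loop is guaranteed to terminate.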

Optimization of hash table lookup

The majority of rules will hash to unique index addresses.

Rules which hash to the same value can be seen as a sequence of index table entries, with the primary location containing the first rule address to be found.

When the Turing Machine is executing, any colliding rule that is encountered will have its index moved to the primary index location, on the assumption that it will be used again, and will then be found more quickly.
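
The promotion step can be sketched as a lookup that swaps a colliding entry into its primary slot. As with the previous examples, this is a simplified illustration under stated assumptions (positions instead of rule pointers, a probe step of 1, a made-up hash); Key, HashKey, and LookupAndPromote are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Key { unsigned long state, symbol; };   // hypothetical (state, symbol) pair

// Made-up hash for illustration; mask is tableSize - 1 for a power-of-2 table.
std::size_t HashKey(const Key& k, std::size_t mask) {
  return (k.state * 31u + k.symbol) & mask;
}

// Returns the position in keys[] of the wanted key, or -1 if it is absent.
// index holds positions into keys[] (-1 = free slot) and is modified: an
// entry found after a collision is swapped into the primary slot.
long LookupAndPromote(const std::vector<Key>& keys, std::vector<long>& index,
                      const Key& wanted) {
  assert(!index.empty() && (index.size() & (index.size() - 1)) == 0);
  const std::size_t mask = index.size() - 1;
  const std::size_t primary = HashKey(wanted, mask);
  std::size_t addr = primary;
  // The probe counter guards against looping forever on a completely full table.
  for (std::size_t probes = 0;
       probes < index.size() && index[addr] != -1; probes++) {
    const Key& k = keys[static_cast<std::size_t>(index[addr])];
    if (k.state == wanted.state && k.symbol == wanted.symbol) {
      if (addr != primary)
        std::swap(index[addr], index[primary]);   // promote the last-used entry
      return index[primary];
    }
    addr = (addr + 1) & mask;                     // collision: try the next slot
  }
  return -1;
}
```

The displaced entry stays within the same unbroken run of occupied slots, so it can still be found by later probes; it simply takes one or more extra steps.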

Assumptions

A minimum amount of table memory of 8K entries is always provided. But for larger rule sets, an index table that will occupy about 1/2 to 3/4 the amount of memory taken by the rules array itself may be allocated.

The memory allocated for the simple lookup table will be the number_of_states * number_of_symbols, rounded up to a power of 2, but not more than 8K entries, or 2 * number_of_rules, whichever is larger.

If the memory required for the simple index would exceed those rules, for example if there are a lot of holes in the symbol/state space, the hashed index is used.

The memory allocated for the hash index array will be the larger of 8K entries or 2 * number_of_rules, rounded up to the nearest power of 2.

For example, a rule set of up to 2K rules may result in a 32K byte index; a rule set of 50,000 rules which occupy 1M of rules memory may get an index table of 512K bytes.
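
The two figures in this example can be checked with a few lines of arithmetic. The sketch below mirrors the sizing rule for the hash index (at least 8K entries, at least 2 * number_of_rules, doubling from 8K so the result is a power of 2). The 4-byte entry size is an assumption that corresponds to 32-bit pointers, and the names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kMinEntries = 1u << 13;   // 8K entries (MIN_BITS = 13 in the listing)
const std::size_t kExpansion  = 2;          // EXPANSION in the listing

// Number of entries in the hash index for a given rule count.
std::size_t HashIndexEntries(std::size_t numRules) {
  std::size_t n = kMinEntries;
  while (n < kExpansion * numRules)
    n *= 2;
  return n;
}

// Size of the index in bytes; bytesPerEntry is 4 for 32-bit pointers (assumed).
std::size_t HashIndexBytes(std::size_t numRules, std::size_t bytesPerEntry) {
  assert(bytesPerEntry > 0);
  return HashIndexEntries(numRules) * bytesPerEntry;
}
```

For 2K rules the minimum of 8K entries applies, giving 8192 * 4 = 32K bytes. For 50,000 rules, 2 * 50,000 = 100,000 is rounded up to 131,072 entries, giving 512K bytes.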

#include <stdio.h>

#include <stdlib.h>
#include <string.h>
#include "turing.h"

enum {    // constants controlling min size of index
  EXPANSION  = 2,
  MIN_BITS   = 13,
  MIN_SIZE   = 1L<<MIN_BITS };
  
typedef const TMRule* TMRulePtr;

Boolean TuringMachine(
  const TMRule theRules[],
  ulong numRules,
  ulong *theTape,
  ulong tapeLen,
  long rwHeadPos,
  TMMoveProc ReportMove
)  {
  TMRulePtr* index;
  ulong    state=0;
  ulong    symbol;
  int    direction;
  ulong*  tape=theTape+rwHeadPos;
  ulong*  tapeEnd=theTape+tapeLen;

  ulong    mask;
  TMRulePtr rule=theRules;

// The function contains 2 very similar sections,
// one section uses a plain lookup table for an index,
// the other uses a hash table.

// Try to construct a collision-free index of rule addresses

// compute table size
  ulong   maxState=rule->oldState;
  ulong   maxSym=rule->inputSymbol;
  ulong   minSym=rule->inputSymbol;
  rule++;
  for (int i=1;i<numRules;i++) {
    if (maxState<rule->oldState) maxState=rule->oldState;
    if (maxSym<rule->inputSymbol) maxSym=rule->inputSymbol;
    else
    if (minSym>rule->inputSymbol) minSym=rule->inputSymbol;
    rule++;
  }
  ulong numSyms=maxSym-minSym+1;
  ulong numStates=maxState+1;
  ulong numIndex=(numStates)*(numSyms);

  if ((numIndex<numStates)               // overflow
    || ((numIndex > MIN_SIZE)
    && (numIndex > EXPANSION*numRules)))  // too large
      goto try_hash;

// increase size to the next power of 2
  ulong dummy;    // no initializer: "goto try_hash" may not jump past an initialized declaration
  dummy=1;
  while (numIndex) {
    numIndex>>=1;
    dummy<<=1;
  }
  numIndex=dummy;

// Allocate the table memory
  index=(TMRulePtr*)malloc(numIndex*sizeof(TMRulePtr));

// Always expect to get the memory, but just in case ...
  if (index==0)
    return FALSE;

// All unused index locations will remain 0
  memset(index,0,numIndex*sizeof(TMRulePtr));
  mask=numIndex-1;

// Scan the rules and populate the index array
  rule=theRules;
  for (int i=0;i<numRules;i++) {
    ulong addr=
      mask & (rule->oldState*numSyms+rule->inputSymbol);
    index[addr]=rule++;
  }
// Using the collision-free index table:
// Loop until the tape halts or we fail on error
  do {
    symbol=*tape;
    ulong addr=mask & (state*numSyms+symbol);
    rule=index[addr];
    if (rule == 0)    // illegal symbol, no rule
      break;
    symbol=rule->outputSymbol;
    state=rule->newState;
    direction=rule->moveDirection;

    ReportMove(symbol,state,MoveDir(direction));
    *tape=symbol;

    if (direction==kHalt) {
      free(index);
      return TRUE;
    }

    tape+=direction;
  } while ((tape>=theTape) && (tape<tapeEnd));
  free(index);
  return FALSE;

try_hash:
// Section 2
// If we get here, we could not make a simple table
// and have to go with a hash table, collisions are possible

// Find table size >= minimum size
  numIndex=MIN_SIZE;
  while (numIndex < EXPANSION*numRules) {
    numIndex*=2;
  }

// Allocate the table memory
  index=(TMRulePtr*)malloc(numIndex*sizeof(TMRulePtr));

// Always expect to get the memory, but just in case ...
  if (index==0)
    return FALSE;

// All unused index locations will remain 0
  memset(index,0,numIndex*sizeof(TMRulePtr));

  mask=numIndex-1;
  ulong   hFactor=1 | (numIndex/numStates);
  long     hDelta=1 | (hFactor>>1);

// Scan the rules and populate the index array
  rule=theRules;
  for (int i=0;i<numRules;i++) {
    ulong addr=
      mask & (rule->oldState*hFactor+rule->inputSymbol);

// if primary location is not empty: find next free location
    while (index[addr])   
      addr=mask & (addr+hDelta);
    index[addr]=rule++;
  }

// Using the hash index table:
// Loop until the tape halts or we fail on error.
// This loop is the same as the loop in the first section
// except we have to check for possible collisions with each rule.
  do {
    symbol=*tape;
    ulong addr=mask & (state*hFactor+symbol);
    rule=index[addr];
    if (rule == 0)          // illegal symbol, no rule
      break;

// check if we have the right rule, or a collision
    if ((symbol != rule->inputSymbol)
      || (state != rule->oldState)) {
      const TMRule* rule0=rule;
      ulong addr0=addr;

      do {          // resolve the collision
        addr=mask & (addr+hDelta);
        rule=index[addr];
        if (rule == 0) {      // could not find rule
          free(index);
          return FALSE;
        }
      } while ((symbol != rule->inputSymbol)
        || (state != rule->oldState));
      index[addr]=rule0;        // move last-used rule
      index[addr0]=rule;        // up in chain
    }

// now we have the correct rule
    symbol=rule->outputSymbol;
    state=rule->newState;
    direction=rule->moveDirection;

    ReportMove(symbol,state,MoveDir(direction));

    *tape=symbol;
    if (direction==kHalt) {    // normal stop
      free(index);
      return TRUE;
    }

    tape+=direction;
  } while ((tape>=theTape) && (tape<tapeEnd));

  free(index);
  return FALSE;
}