Programmer's Challenge

by Bob Boonstra, Westford, MA

Costas Arrays

A Costas array of order N is an NxN array of 1s and 0s satisfying two constraints. First, the array must have exactly N 1s and N*(N-1) 0s, with exactly one 1 in each row and column. Second, no two lines between pairs of 1s may have exactly the same length and the same slope. So, for example, there are exactly 12 Costas arrays of order 4:

		1000  0001  0010  0010  1000  0001
		0010  1000  1000  0100  0001  0100 
		0001  0010  0100  0001  0100  1000 
		0100  0100  0001  1000  0010  0010 
 
		0100  0100  1000  0001  0100  0010
		0010  0001  0100  0010  1000  0001
		1000  0010  0001  1000  0010  0100
		0001  1000  0010  0100  0001  1000
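
The two constraints can be checked mechanically. The following is a minimal C++ sketch, not part of the Challenge materials. It assumes the array is described as a permutation, where the hypothetical vector colOfRow gives the column of the single 1 in each row; the function name isCostas is likewise invented for this illustration. Two line segments have the same length and slope exactly when their displacement vectors (row difference, column difference) are equal, so the sketch simply looks for a repeated displacement.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// colOfRow[r] is the column of the single 1 in row r (hypothetical layout).
// Returns true if the arrangement satisfies both Costas constraints.
bool isCostas(const std::vector<int>& colOfRow) {
    const std::size_t n = colOfRow.size();

    // First constraint: exactly one 1 per column, i.e. a permutation of 0..n-1.
    std::vector<bool> used(n, false);
    for (int c : colOfRow) {
        if (c < 0 || static_cast<std::size_t>(c) >= n || used[c]) return false;
        used[c] = true;
    }

    // Second constraint: every displacement vector between two 1s is distinct.
    // With the row difference kept positive, equal vectors mean equal
    // length and equal slope.
    std::set<std::pair<int, int> > seen;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            std::pair<int, int> d(static_cast<int>(j - i),
                                  colOfRow[j] - colOfRow[i]);
            if (!seen.insert(d).second) return false;  // repeated vector
        }
    }
    return true;
}
```

Reading the first array above with column 0 on the left gives the permutation {0, 2, 3, 1}, for which isCostas returns true. The identity permutation {0, 1, 2, 3} fails because the displacement (1, 1) repeats.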

So why would one care about Costas arrays? Because of the asymmetries imposed by the two constraints, Costas arrays make ideal waveforms for certain sensor applications, reducing ambiguity in interpreting radar and sonar returns. The mathematics are interesting in other ways. For example, according to the CRC Standard Mathematical Tables, the number C(n) of Costas arrays of order n increases as a function of n until n==16, after which it decreases, at least until n==23:

n	C(n)	n	C(n)
1	1	2	2
3	4	4	12
5	40	6	116
7	200	8	444
9	760	10	2160
11	4368	12	7852
13	12828	14	17252
15	19612	16	21104
17	18276	18	15096
19	10240	20	6464
21	3536	22	2052
23	872	24	>=1

The number of Costas arrays of order 24 and greater is the subject of continuing research. Your Challenge is to aid these researchers by writing efficient code to enumerate Costas arrays.

The prototype for the code you should write is:

#if defined(__cplusplus)
extern "C" {
#endif

long /* number of arrays */ EnumerateCostas(
	int n, 							/* enumerate Costas arrays of order n */
	long *costasArrays		/* preallocated storage for returning your results */
	/* row r of array k is in costasArrays[k*n + r], r,k are origin 0 */
);

#if defined(__cplusplus)
}
#endif

Your EnumerateCostas routine will be asked to enumerate all Costas arrays of order n and return the results in the preallocated costasArrays. Each cell of an array is represented by one bit. The bits for row r of the k-th Costas array are to be returned in costasArrays[k*n + r], r=0..n-1, k>=0. The cell representing column c, c=0..n-1, is the bit in 1<<c. EnumerateCostas must produce all valid Costas arrays, with no duplicates, and return the number of arrays produced.
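
To make the output format concrete, here is a naive reference sketch of the prototype above. It is only an illustration under stated assumptions and not a competitive entry: it tries every permutation with std::next_permutation, so its running time grows factorially and it is practical only for small n. The helper table named seen is an invention of this sketch.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Brute-force sketch: test every permutation of columns and store each
// Costas array as n row bitmasks, with column c represented by bit 1<<c.
extern "C" long EnumerateCostas(int n, long *costasArrays) {
    if (n < 1 || n > 32) return 0;

    std::vector<int> col(n);            // col[r] = column of the 1 in row r
    std::iota(col.begin(), col.end(), 0);

    long count = 0;
    do {
        // seen[dr][dc + n - 1] records that displacement (dr, dc) was used.
        std::vector<std::vector<bool> > seen(
            n, std::vector<bool>(2 * n - 1, false));
        bool ok = true;
        for (int i = 0; i < n && ok; ++i) {
            for (int j = i + 1; j < n; ++j) {
                int dr = j - i;
                int dc = col[j] - col[i] + n - 1;
                if (seen[dr][dc]) { ok = false; break; }
                seen[dr][dc] = true;
            }
        }
        if (ok) {
            // Row r of array number 'count' goes to costasArrays[count*n + r].
            for (int r = 0; r < n; ++r)
                costasArrays[count * n + r] =
                    static_cast<long>(1UL << col[r]);
            ++count;
        }
    } while (std::next_permutation(col.begin(), col.end()));
    return count;
}
```

Called with n equal to 4 and a buffer of at least 48 longs, this sketch returns 12, matching the table above. A serious entry would need to prune partial permutations as soon as a displacement repeats rather than testing complete ones.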

Testing will be constrained to arrays of order 32 or less. The winner will be the solution that correctly enumerates the Costas arrays in the minimum time.

This will be a native PowerPC Challenge, using the latest CodeWarrior environment. Solutions may be coded in C, C++, Pascal, or Java.

Three Months Ago Winner

Congratulations to our top two leaders in the points contest for finishing first and second in the September Playfair Challenge. Ernst Munter (Kanata, Ontario) took first place with a solution that was by far the fastest, and Tom Saxton took second place.

The Playfair Decipher Challenge asked contestants to decrypt an encoded phrase based on knowing the dictionary of words used in the message and some information about how the encoding is done. Encoding is based on a keyword from the dictionary, which is used to create a 5x5 encoding substitution matrix for the letters of the alphabet. The encoding matrix maps pairs of plaintext letters into pairs of encoded text. Complicating factors include the fact that the letters 'I' and 'J' are encoded as the same character. The encoding technique also inserts separator characters ('X' or 'Z') to separate repeated letters in a pair, in order to prevent double letters from mapping into an easily detected encoded letter pair. For more information on the problem, consult the September issue of MacTech, or check out the online version at <www.mactech.com/progchallenge/>.
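
The separator rule can be sketched in a few lines of C++. This is an illustration only: the function name prepareDigraphs is hypothetical, the input is assumed to be uppercase 'A' to 'Z', and the choices to fold 'J' into 'I' before pairing, to use 'Z' only when the repeated letter is 'X', and to leave a lone trailing letter unpadded are assumptions of this sketch rather than quotations from the problem statement.

```cpp
#include <cstddef>
#include <string>

// Splits plaintext into letter pairs and inserts a separator whenever both
// letters of a pair would be identical. Doubles that straddle a pair
// boundary are left alone.
std::string prepareDigraphs(const std::string& plain) {
    std::string out;
    std::size_t i = 0;
    while (i < plain.size()) {
        char a = (plain[i] == 'J') ? 'I' : plain[i];  // 'J' shares a cell with 'I'
        out += a;
        ++i;
        if (i == plain.size()) break;  // lone trailing letter: not padded here
        char b = (plain[i] == 'J') ? 'I' : plain[i];
        if (a == b) {
            // Same letter twice in one pair: emit a separator instead.
            // The second letter is not consumed and starts the next pair.
            out += (a == 'X') ? 'Z' : 'X';
        } else {
            out += b;
            ++i;
        }
    }
    return out;
}
```

With this sketch, "BLEEP" becomes "BLEXEP" because the two E's fall into the same pair, while "BEEP" is unchanged because its E's fall into different pairs.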

Ernst's solution thoroughly analyzes the dictionary during the initialization phase. The SetupDecoders routine creates a Decoder matrix for each potential keyword in the dictionary, minus duplicate matrices resulting from similar keywords. The initialization of the CodeBreaker structure creates a set of LinkedNodes based on the first three letters of the words in the dictionary. It also registers all letter triplets (trigraphs) that occur at the beginning or the end of a dictionary word. The decoding routine loops through all of the words in the dictionary, trying each of them as a decoding keyword. The trigraphs calculated during initialization are used by the DecodeCipher routine to prune the decoding process. If decoding is successful to this point, the CodeBreaker::Spell routine recursively matches the potentially decoded text to dictionary words. The code is complicated but fast.
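
The idea behind trigraph pruning can be shown with a much simplified sketch. It is not Ernst's code: the TrigraphTable type and its member names are invented here, it checks every sliding window of three letters rather than working on decoded pairs, and it ignores the I/J and filler-letter handling. It does borrow the layout of one 32-bit set per letter pair, with a bit for each possible third letter plus flags for pairs that can start or end a word.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

const std::uint32_t kEndFlag = 1u;          // bit 0: pair can end a word
const std::uint32_t kStartFlag = 1u << 27;  // bit 27: pair can start a word

struct TrigraphTable {
    // t[a][b] has bit c set if letters a,b,c occur consecutively in a word.
    std::array<std::array<std::uint32_t, 27>, 27> t{};

    static int idx(char c) { return c - 'A' + 1; }  // 'A'..'Z' -> 1..26

    void addWord(const std::string& w) {
        const std::size_t n = w.size();
        if (n == 0) return;
        if (n == 1) {  // a one-letter word can follow or precede anything
            for (int i = 1; i <= 26; ++i) {
                t[i][idx(w[0])] |= kEndFlag;
                t[idx(w[0])][i] |= kStartFlag;
            }
            return;
        }
        t[idx(w[0])][idx(w[1])] |= kStartFlag;
        t[idx(w[n - 2])][idx(w[n - 1])] |= kEndFlag;
        for (std::size_t i = 0; i + 2 < n; ++i)
            t[idx(w[i])][idx(w[i + 1])] |= 1u << idx(w[i + 2]);
    }

    // Necessary (not sufficient) test that text is a run of dictionary words.
    bool plausible(const std::string& text) const {
        const std::size_t n = text.size();
        if (n < 2) return true;
        if (!(t[idx(text[0])][idx(text[1])] & kStartFlag)) return false;
        if (!(t[idx(text[n - 2])][idx(text[n - 1])] & kEndFlag)) return false;
        for (std::size_t i = 0; i + 2 < n; ++i) {
            int a = idx(text[i]), b = idx(text[i + 1]), c = idx(text[i + 2]);
            bool inside = (t[a][b] & (1u << c)) != 0;    // abc within a word
            bool abEnds = (t[a][b] & kEndFlag) != 0;     // word ends after ab
            bool bcStarts = (t[b][c] & kStartFlag) != 0; // word starts at bc
            if (!(inside || abEnds || bcStarts)) return false;
        }
        return true;
    }
};
```

With only "THE" and "CAT" registered, plausible accepts "THECAT" and rejects "THEXQZ" without ever consulting the word list. Text that passes still has to be confirmed by a real word match, since the test is only a filter.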

Tom Saxton's solution also analyzes the dictionary during initialization, creating a tree structure. It creates the decoding matrix during the decoding process, not during initialization. Tom's word matching algorithm is recursive in concept, but iterative in implementation. Overall, Tom's solution took somewhat more than twice as long as the winning solution.
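
Tom's source is not reproduced here, so the following C++ sketch only illustrates the general technique of a word match that is recursive in concept but iterative in implementation. It replaces recursion with an explicit stack of frames. The function name segment, the Frame type, and the linear scan of the dictionary are all assumptions of this sketch; a real entry would use an index such as the tree described above instead of scanning every word.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Tries to split text into dictionary words by backtracking with an explicit
// stack. On success, words holds the split and the function returns true.
bool segment(const std::string& text,
             const std::vector<std::string>& dictionary,
             std::vector<std::string>& words) {
    struct Frame {
        std::size_t pos;   // offset in text where the next word must start
        std::size_t next;  // next dictionary index to try at this offset
    };
    std::vector<Frame> stack;
    Frame root = {0, 0};
    stack.push_back(root);
    words.clear();

    while (!stack.empty()) {
        std::size_t top = stack.size() - 1;
        if (stack[top].pos == text.size()) return true;  // all text consumed

        bool advanced = false;
        while (stack[top].next < dictionary.size()) {
            const std::string& w = dictionary[stack[top].next++];
            if (!w.empty() &&
                text.compare(stack[top].pos, w.size(), w) == 0) {
                // Word matches here: "recurse" by pushing a new frame.
                Frame f = {stack[top].pos + w.size(), 0};
                words.push_back(w);
                stack.push_back(f);
                advanced = true;
                break;
            }
        }
        if (!advanced) {
            // No word fits at this offset: "return" to the caller frame,
            // which resumes with its next candidate word.
            stack.pop_back();
            if (!words.empty()) words.pop_back();
        }
    }
    return false;
}
```

Given the dictionary {"A", "AB", "CD", "BCD"}, the text "ABCD" is split into "A" and "BCD", and "ABC" is rejected. Without memoization the worst case is exponential, which is one reason a pruning step before the word match pays off.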

The third-place solution by Lad (last name unknown) was nearly as fast as the second-place solution. It was unique in that it was submitted as a library rather than as source code. Normally that would have been a disqualification, but since the September Challenge, in keeping with tradition, allowed solutions to be coded in assembly language, I decided to score the results.

The table below lists, for each of the five solutions submitted, the total execution time, code size, data size, and programming language. It also indicates whether a solution completed all of the test cases correctly. As usual, the number in parentheses after the entrant's name is the total number of Challenge points earned in all Challenges prior to this one.

Name	Time (msec)	Errors	Code Size	Data Size	Language
Ernst Munter (507)	493	no	8052	5580	C++
Tom Saxton (118)	1059	no	6984	443	C
Lad	1132	no	2796	170	Unknown
Rishi Gupta	57503	no	11844	1919	C++
R. B.	90773	yes	4828	638	C++

Top Contestants

Listed here are the Top Contestants for the Programmer's Challenge, including everyone who has accumulated 10 or more points during the past two years. The numbers below include points awarded over the 24 most recent contests, including points earned by this month's entrants.

Rank	Name	Points
1.	Munter, Ernst	227
2.	Saxton, Tom	116
3.	Maurer, Sebastian	70
4.	Boring, Randy	66
5.	Rieken, Willeke	51
6.	Heithcock, JG	39
7.	Shearer, Rob	34
8.	Brown, Pat	20
9.	Hostetter, Mat	20
10.	Mallett, Jeff	20
11.	Jones, Dennis	12
12.	Hart, Alan	11
13.	Hewett, Kevin	10
14.	Murphy, ACC	10
15.	Selengut, Jared	10
16.	Smith, Brad	10
17.	Strout, Joe	10
18.	Varilly, Patrick	10

There are three ways to earn points: (1) scoring in the top 5 of any Challenge, (2) being the first person to find a bug in a published winning solution, or (3) being the first person to suggest a Challenge that I use. The points you can win are:

1st place	20 points
2nd place	10 points
3rd place	7 points
4th place	4 points
5th place	2 points
finding bug	2 points
suggesting Challenge	2 points

Here is Ernst's winning Playfair solution:
Playfair.cp
Copyright © 1999 Ernst Munter

/*  
	September 6, 1999.
  Submission to MacTech Programmer's Challenge for September 99.
  Copyright © 1999, Ernst Munter, Kanata, ON, Canada.
  
  				"Playfair Decipher"
  				
  Version 2
  
Problem Statement
---------
Given a dictionary, a cipher text, and the encoding method, break the code and return the deciphered plaintext.

Solution
----
My strategy is to decode the ciphertext with each possible keyword until a plain text results which is accepted by the spell checker.

In InitPlayfair() the dictionary is scanned twice.

First, the set of decoders, one for each keyword, is constructed.  Because similar keywords can result in identical coding matrices, redundant matrices are discarded where the keywords are within a certain distance of each other in the dictionary.

Secondly, a spell check tree is built which indexes all words in the  dictionary that are distinct in the first 5 characters (stem).  Longer words are checked by indexing to the stem, and scanning the group of words which share the same stem directly in the dictionary.  

In addition, all trigraphs (sequences of 3 chars) in the dictionary words and digraphs marking start and end of each word are collected.  

In DecodePlayfair() we use each of the keywords in turn to decode the cipher text into a tentative plain text.  If during the plain text construction any illegal trigraphs are encountered, the decoding breaks off, and the next keyword decoder is selected.

If decoding passes the trigraph check (about 1 in 20 to 1 in 50), the text is submitted to the spell checker which recursively tries to match the text to dictionary words until the whole plain text has passed the check.   The spell checker builds a stack of dictionary indices matching the found words.  The first successfully checked plain text is returned to the caller by copying the indexed dictionary words into the final plain text.

Complications
-------
There are two sources of complications for the spell check.

The letters 'J' and 'I' are encoded as if they are the same.  The decoding step always yields 'I'.  The spell checker allows this aliasing with  additional tree branches (alias nodes).  Nevertheless, there can be true  ambiguity which cannot be resolved.  For example the dictionary words  "JON" and "ION" cannot be distinguished.

The encoder inserts 'X' (or 'Z' in some cases) if the two characters of a  letterpair are the same ("BLEEP" becomes "BLEXEP").  But it does not insert  'X' between pairs, thus "BEEP" remains "BEEP".  In addition, there are words which might be ambiguous, e.g. SEES and SEXES, both of which are in the dictionary.  The spell checker also provides alias nodes for such cases.

My spell checker is built on the simplifying assumption that no double  letters occur in the raw decoded text.  'X' is always expected between  double letters.  This also improves the efficiency of trigraph checking. To make this assumption true, the decoder inserts 'X' between letter pairs  if necessary, and marks any 'X' which occurs as the first character in a  letter pair as a "hard" X, because this 'X' is never a filler 'X'.  

Limits and Assumptions
-----------
CipherText and the dictionary contain only words using 'A' - 'Z'.

The dictionary is sorted.

The program uses the standard (64K) program stack and allocates a word  stack on the heap for up to 400 words in the resulting plain text. 

The program uses a recursive function which takes up to 144 bytes on the  call stack and recurses 1 level per word found.  The standard stack size  of 64K is only sufficient for plain text messages of about 300 words.

NOTE: 	If it is desired to decode longer texts than about 300 words,
		please change the constant "kMaxWords" accordingly and increase 
		the call stack allocation by 144K for every 1000 words.

The array parameter "char* plainText" in DecodePlayfair() is assumed to be of sufficient size to hold the resulting plain text PLUS an allowance for a slightly larger intermediate raw plain text (50% extra will cover even pathological cases). 

Version 2 changes
---------
Fine tuning for speed (15% gain), but no logical changes.  Techniques:
- use of unsigned char instead of char to avoid extend-signbit instruction
- reordering of tests in Legal() to favour most frequent case
- review of need of masking of characters.  1 << c, where c is an
  uppercase character.  This works without mask because shift is modulo 64.
*/

#include "Playfair.h"
#include <string.h>

typedef unsigned char uchar;
typedef unsigned long ulong;
typedef unsigned short ushort;
typedef const char* Cptr;

static const char** dict;
static int 	numDecoders;

enum {
	kMaxWords 	= 400,	// Increase for longer texts, also increase stack
	kMaxNodes	= 27*32*32,
	kDepth		= 16
};

enum {
	kXFLAG		= 1,
	kJFLAG		= 1<<27,
	kZFLAG		= 1<<28,
	kAll		= 0x07FFFFFF,
	
	kStart		= 27,
 	kStartOfWord= 1<<kStart,
	kEndOfWord	= 1,
	
	kJbit		= (1 << (31 & 'J')),
	kCharSet	= 0x07FFFFFE & ~kJbit
};

struct Rule
// Rule implements the encoding rule in a static look up table
static struct Rule {
	short	LUT[25][32];
	Rule()
	{
		int spot1=0;
		for (int r1=0;r1<5;r1++)
		for (int c1=0;c1<5;c1++,spot1++)
		{
			int spot2=0;
			for (int r2=0;r2<5;r2++)
			for (int c2=0;c2<5;c2++,spot2++)
			{
				int vr1,vr2,vc1,vc2;
				if (r1==r2)
				{
					vr1=r1;
					if (c1) vc1=c1-1;else vc1=4;
					vr2=r2;
					if (c2) vc2=c2-1;else vc2=4;
				}
				else if (c1==c2)
				{
					vc1=c1;
					if (r1) vr1=r1-1;else vr1=4;
					vc2=c2;
					if (r2) vr2=r2-1;else vr2=4;
				}
				else
				{
					vr1=r1;vc1=c2;
					vr2=r2;vc2=c1;
				}
				LUT[spot1][spot2]=
					((vr1*5+vc1) << 8) 
					+(vr2*5+vc2);
			}
		}
	}
} R;
	
/******* Trigraphs and set of related functions **********/

static ulong trigraph[27][32];

Register
inline void Register(ulong c)
// Single character words 	
 	{
 		for (int i=1;i<=26;i++)
 		{
 			trigraph[i][c] |= kEndOfWord;
 			trigraph[c][i] |= kStartOfWord;
 		}
 	}
inline void Register(ulong c1,ulong c2)
// Two character words 
 	{
 		trigraph[31 & c1][31 & c2] |= kEndOfWord | kStartOfWord;
 	}
inline void Register(ulong c1,ulong c2,ulong c3)
// Three character words 
 	{
 		trigraph[31 & c1][31 & c2] |= kStartOfWord | (1 << c3);
 		trigraph[31 & c2][31 & c3] |= kEndOfWord;
 	}	
inline void Register(ulong c1,ulong c2,ulong c3,ulong c4)
// Four character words 
 	{
 		trigraph[31 & c1][31 & c2] |= kStartOfWord | (1 << c3);
 		trigraph[31 & c2][31 & c3] |= 1 << c4;
 		trigraph[31 & c3][31 & c4] |= kEndOfWord;
 	}	

RegisterHead
inline void RegisterHead(ulong c1,ulong c2,ulong c3,ulong c4,ulong c5)
 	{
 		trigraph[31 & c1][31 & c2] |= kStartOfWord | (1 << c3);
 		trigraph[31 & c2][31 & c3] |= 1 << c4;
 		trigraph[31 & c3][31 & c4] |= 1 << c5;
 	}	

RegisterTail
inline void RegisterTail(ulong c1,ulong c2,const uchar* w)
 	{
 		ulong c3=31 & *w;
 		while (c3)
 		{
 			trigraph[31 & c1][31 & c2] |= 1 << c3;
 			c1=c2;c2=c3;c3=*++w;
 		} 
 		trigraph[31 & c1][31 & c2] |= kEndOfWord;	
 	}	

RemoveDoubleLetters 	
static void RemoveDoubleLetters()
// Removes all double letter cases from the trigraphs and replaces
// them with the escape sequences using 'X' or 'Z' as appropriate	
	{
		for (int c1=1;c1<=26;c1++)
		{
			ulong set=trigraph[c1][c1];
			if (0==set) continue;
			int sub;
			if (c1==(31 & 'X')) sub=31 & 'Z'; else sub=31 & 'X';			
			trigraph[c1][sub] |= 
				(set & kStartOfWord) | (1 << c1);
			if (set & kEndOfWord)
				trigraph[sub][c1] |= kEndOfWord;
				
			trigraph[c1][c1] = 0;
			for (int c2=1;c2<=26;c2++)
			{
				set=trigraph[c1][c2];
				if (set & (1 << c2))
				{
					if (c2==(31 & 'X')) sub=31 & 'Z'; else sub=31 & 'X';	
					trigraph[c1][c2]=set & (~(1 << c2)) | (1 << sub);
				}
			}
		}
	}
	
LegalStart	
inline bool LegalStart(ulong cpair)
{
	return (trigraph[31 & (cpair>>8)][31 & cpair] & kStartOfWord);
}

LegalEnd
inline bool LegalEnd(ulong cpair)
{
	return (trigraph[31 & (cpair>>8)][31 & cpair] & kEndOfWord);
}

Legal
inline bool Legal(ulong cp1,ulong cp2)
{
	return (
		(trigraph[31 & (cp1>>8)][31 & cp1] & 		// ABC.?
			(kEndOfWord | (1 << (31 & (cp2 >> 8))))) ||
// note: mask needed here because cp2.hi might be lower case 'x'
//		 however, it costs nothing here since >>8 and &31 is one instr.
			
		(trigraph[31 & cp1][31 & (cp2 >> 8)] & 		// ?.BCD
			(kStartOfWord | (1 << (cp2)))) ||
			
		(LegalEnd(cp1) && LegalStart(cp2)) 			// AB.CD
	);
}

struct Decoder
static struct Decoder{
	uchar	matrix[25];
	uchar	spot[27];	
#if KWD	
	const char*	kwd;	// useful in debugging
#endif	

	void Init(const char* keyword)
	{		
// Sets up the unique matrix and its inverse (=spot) for one keyword 	
#if KWD	
		kwd=keyword;
#endif						
		const	char* kp=keyword;
		uchar	line[25];
		uchar*	lp=line;
		
		uchar 	nextC=*kp;
		ulong	mask=kCharSet;	// make sure there is no 'J'
			
		// write unique letters in a line	
		for (;;)
		{
			uchar c=nextC;
			nextC=*++kp;
			if (c==0) break;
			
			if (c=='J') c='I';
			int bit=1 << c;
						
			if (mask & bit) {
				mask &= ~bit;
				*lp++ = c;
			}
		}
		int numUnique=lp-line;
	
		// write remaining letters
		for (int c='A';c<='Z';c++)
		{
			if ((mask & 2)) *lp++=c;
			mask >>= 1;
		}
		
	
		// transpose line into matrix
		uchar* mp=matrix;
		int k=0;
		for (int i=0;i<numUnique;i++) {
			for (int j=i;j<25;j+=numUnique) {
				int c=line[j];					
				*mp++ = c;
				spot[31 & c]=k++;
			}	
		}
	}
	
	bool SameMatrix(Decoder* dp)
	{
// Compares matrices to help in eliminating unneeded duplicates	
		ulong* a=(ulong*)matrix;
		ulong* b=(ulong*)(dp->matrix);
		if (a[0] != b[0]) return false;
		if (a[1] != b[1]) return false;
		if (a[2] != b[2]) return false;
		if (a[3] != b[3]) return false;
		if (a[4] != b[4]) return false;
		if (a[5] != b[5]) return false;
		return true;
	}
	
	ulong Decode(ulong cipherPair)
	{
// Decodes one character pair
// Hard 'X' is identified by lower case 'x'	
		int spot0=spot[31 & (cipherPair>>8)];
		int spot1=spot[31 & cipherPair];
		int spotPair=R.LUT[spot0][spot1];
		uchar c0=matrix[spotPair >> 8];
		if (c0=='X') c0='x';			// "hard" X
		uchar c1=matrix[31 & spotPair];
		return (c0 << 8) | c1; 
	}
	
	int DecodeCipher(const char* cipherText,char* plainText)
	{
// Decodes the entire cipherText into a tentative plaintext
// Inserts 'X' or 'Z' between double letters at pair boundaries
// 		to accommodate spell checker method	
		const char* ct=cipherText;
		char* pt=plainText;										
		
		ulong cipherPair=*((ushort*)ct);ct+=2;	
		ulong plainPair1=Decode(cipherPair);
		
		if (0==LegalStart(plainPair1)) {						
			return 0;
		}	
		
		*((ushort*)pt)=plainPair1;pt+=2;
		
			
		for (;;ct+=4)
		{
			ulong cipherWord=*((ulong*)ct);					
			if ((cipherWord & 0xFF000000) == 0)
				break;
			
			ulong plainPair0=Decode(cipherWord >> 16);
			if (0==(0x1F & ((plainPair0 >> 8) ^ plainPair1)))
			{
				if ('X'==(0x5F & plainPair1)) *pt++ = 'Z';
				else *pt++ = 'X';
			} else if (!Legal(plainPair1,plainPair0)) 		
				return 0;
				
			*((ushort*)pt)=plainPair0;pt+=2;
			if ((cipherWord & 0x0000FF00) == 0) 
				break;
			
			plainPair1=Decode(cipherWord);
			if (0==(0x1F & ((plainPair1 >> 8) ^ plainPair0)))
			{
				if ('X'==(0x5F & plainPair0)) *pt++ = 'Z';
				else *pt++ = 'X';
			} else if (!Legal(plainPair0,plainPair1)) 	
				return 0;
			
			*((ushort*)pt)=plainPair1;pt+=2;
		}
		pt[0]=0;										
		return pt-plainText;	
	}
	
} *D;

// The spell checker tree is a three level hierarchy:
//		Node -> List -> Stem -> dictionaryWord

// Nodes are in a 3-dimensional table accessed by the first 3 characters 
// A node contains pointers to up to 26 lists ,
// A list contains up to 26 stems ,
// A stem contains a pointer to a dictionary word,
//		and the number of words in the group with the same stem.

// Shorter words are referenced by using the 0-index at each level.

struct Stem
struct Stem {
	const 	char** word;	// a dictionary word
	int 	numWords;	// number of words with common stem
	
	void Add(ulong w)
	{
		if (word==0) {
			word=dict+w;
			numWords=1;
		}	
		else numWords++;				
	}
	
	bool InsertedX(const uchar c,const uchar* pt,int len) const 
	{
		uchar c0=pt[len-1];
		if ( (c0==pt[len+1]) &&
			 ( (c=='X') || ((c=='Z') && (c0=='X')) ) )
		{
			return true;
		} 
		return false;	 
	}
	
	bool JtoI(const uchar c1,const uchar c2) const
	{
		return ((c1=='J') && (c2=='I'));
	}
	
	const char* MatchTail(const uchar* pt,int & len,int numX,
		int curWord) const
// Matches string starting at the 5th letter - numX
// numX is the number of inserted X (or Z) in the first 5 letters	
	{
		if (curWord < numWords)
		{
			const char* wp=word[curWord];
			const char* temp=wp+5-numX;
			len=5;
			uchar c1;	
			while (0 != (c1 = *temp)) {
				uchar c2=pt[len];
				
				if (31 & (c1 ^ c2))
				{
					 if (InsertedX(c2,pt,len))
					 {
					 	temp--;		// skipped X, repeat comparison
					 } else if (!JtoI(c1,c2))
					 	break;
				}
				len++;	
				temp++;	
			}
			if (0==*temp)
				return wp;
		}
		return 0;	
	}
};

struct LinkedNode
// Lists and Nodes are dynamically allocated.
// They are descended from LinkedNode and maintained in a linked list.
// This is to allow them to be deleted when TermPlayfair() is called.
struct LinkedNode {
	LinkedNode*		link;
	LinkedNode(LinkedNode* lk):link(lk){}
}; 

struct List
struct List:LinkedNode {
	Stem	group[27];
	List*	alias;	
	ulong	xjFlag;
	
	List(LinkedNode* lk,ulong flag):
		LinkedNode(lk),alias(0),xjFlag(flag)
		{memset(group,0,sizeof(group));}
		
	void Add(ulong c,ulong w){group[c].Add(w);}
		
	const char* GetListLeader() const
	{
		ulong n=group[0].numWords;
		if (n)
			return group[0].word[0];
		return (const char*)n;	//== 0; cast saves 2 instructions!
	}
};

struct Node
struct Node:LinkedNode {
	List*	list[27];
	Node*	alias;
	ulong	xjFlag;
	Node(LinkedNode* lk,ulong flag):
	LinkedNode(lk),alias(0),xjFlag(flag)
	{memset(list,0,sizeof(list));}
		
	void Add(ulong c,List* lp){list[c]=lp;}
	const char* GetNodeLeader() const
	{
		List* lp=list[0];
		if (lp)
		{
			if (lp->group[0].numWords){
				return lp->group[0].word[0];
			}	
		}
		return (const char*)lp;	// == 0; cast saves 6 instructions!!
	}
};

struct CodeBreaker
static struct CodeBreaker {
	LinkedNode*	nodeList;
	Node*		base[kMaxNodes];
	Cptr		wordStack[kMaxWords];
	Cptr*		stackPtr;
	int			cache;
	
	CodeBreaker(const char *words[],long numDictionaryWords)
		:nodeList(0),cache(0),stackPtr(wordStack)
	{
// The constructor creates the index tree from the dictionary words			
		dict=words;
		SetupDecoders(numDictionaryWords);
		memset(base,0,sizeof(base));
		for (ulong w=0;w < numDictionaryWords;w++)
		{
			const 	char* word=dict[w];
			ulong 	first=*((const ulong*)word);
			ulong	c0=31 & (first >> 24),
					c1=31 & (first >> 16),	
					c2=31 & (first >>  8),	
					c3=31 & first,
					c4=31 & word[4];
							
// Change all 'J' to 'I' as we analyse the letters	
			ulong nodeJflag=0;		
			if (c0==(31 & 'J'))
			{
				 c0=(31 & 'I');
				 nodeJflag=kJFLAG;
			}	 
			if (c1==0)		
			{
				List* lp=NewList(0,0,w);		// only a 1-letter word
				Node* np=NewNode(0);				
				base[32*32*c0]=np;
				np->Add(0,lp);
				Register(c0);
				continue;
			} 
			
// At least 2 letters:
			ulong nodeXflag=0;	
			if (c1==(31 & 'J')) 
			{
				 c1=(31 & 'I');
				 nodeJflag |= kJFLAG<<1;
			}	 
			
			if (c0 == c1)
			{
				c4=c3;c3=c2;c2=c1;
				if (c0 == (31 & 'X')) {
					c1 = 31 & 'Z';
					nodeXflag |= kZFLAG;
				} else c1 = 31 & 'X';
				nodeXflag |= kXFLAG;
			}
			
			if (c2==0)
			{
				List* lp=NewList(0,0,w);	// only a 2-letter word
				Node* np=NewNode(0);
				base[32*(32*c0+c1)+c2]=np;
				np->Add(0,lp);
				Register(c0,c1);
				continue;
			}
			
// At least 3 letters:
			if (c2==(31 & 'J')) 
			{
				 c2=(31 & 'I');
				 nodeJflag |= kJFLAG<<2;
			}	  
			if (c1 == c2)
			{
				c4=c3;c3=c2;
				if (c1 == (31 & 'X')) {
					c2 = 31 & 'Z';
					nodeXflag |= kZFLAG;
				} else c2 = 31 & 'X';
				nodeXflag |= kXFLAG;
			}
			
			
			Node* np=base[32*(32*c0+c1)+c2];
			if (np==0)
				base[32*(32*c0+c1)+c2]=
				np=NewNode(nodeXflag | nodeJflag);
			else np=GetAlias(np,nodeXflag | nodeJflag);
			
			if (c3==0)
			{
				List* lp=NewList(0);
				np->Add(0,lp);	// only a 3-letter word
				lp->Add(0,w);				
				Register(c0,c1,c2);
			} else
// more than 3 letters			
			{		
				ulong listXflag=0,listJflag=0;
				ulong numX=1 & nodeXflag;
				if (c3==(31 & 'J')) 
				{
					 c3=(31 & 'I');
					 listJflag = kJFLAG;
				} 
				if (c2 == c3)
				{
					c4=c3;
					if (c2 == (31 & 'X')) {
						c3 = 31 & 'Z';
						listXflag |= kZFLAG;
					} else c3 = 31 & 'X';
					listXflag |= kAll;	// all stems in list affected
					numX++;
				} 
			
				List* lp=np->list[c3];
				if (lp==0) 
				{
					lp=NewList(listXflag);
					np->Add(c3,lp);	
				}		
				
				if (c4==(31 & 'J')) 
					c4=(31 & 'I');	// no need for flag; J follows I
				if (c3 == c4)
				{
					if (c3 == (31 & 'X')) {
						c4 = 31 & 'Z';
						listXflag |= kZFLAG;
					} else c4 = 31 & 'X';
					listXflag |= 1 << c4;	// only [c4] stem affected 
					numX++;
				} 
				
				lp=GetAlias(lp,listXflag | listJflag);
				
				lp->Add(c4,w);		
				
// Register all trigraphs from this word and stem				
				if (c4==0) Register(c0,c1,c2,c3);
				else
				{	
					RegisterHead(c0,c1,c2,c3,c4);
					RegisterTail(c3,c4,(uchar*)word+5-numX);
				}	
			}		
					
		} // end of loop through the dictionary	
		
// Fixup trigraphs		
		RemoveDoubleLetters();
	}
	
	~CodeBreaker()
	{
// Destructor releases all dynamically allocated nodes and lists	
		LinkedNode* np=nodeList;
		while(np)
		{
			LinkedNode* next=np->link;
			delete np;						
			np=next;
		}
		delete [] D;
	}
	
	void SetupDecoders(long numDictionaryWords)
	{
// Scans the dictionary and allocates 1 decoder per potential keyword	
 		D=new Decoder[numDictionaryWords];		
 		Decoder* dp=D;
 		const char** wp=dict;
 		for (int i=0;i<numDictionaryWords;i++,wp++)
 		{
 			dp->Init(*wp);
 			Decoder* dx=dp-1;	
 			int k=kDepth;
// Scans backwards to discover perhaps identical matrices, and assigns
// this matrix only if it is unique.
// (Example GUESS and GUESSES yield the same matrix, no need to keep both) 			
 			for (;dx>=D;dx--,k--){			
 				if (k<=0) break;
 				if (dp->SameMatrix(dx))
 					goto cont;	
			}
 			dp++;	
 			cont:;		
 		}
		numDecoders=dp-D;			
 	}
 	
// Allocation of lists and nodes, and linkage in chain for memory mgmnt 	
	Node*	NewNode(ulong xFlag){
		Node* np=new Node(nodeList,xFlag);
		nodeList=np;								
		return np;
	}
	List*	NewList(ulong flag){
		List* lp=new List(nodeList,flag);
		nodeList=lp;								
		return lp;
	}
	List*	NewList(int index,ulong flag,int w){
		List* lp=NewList(flag);			
		lp->Add(index,w);				
		return lp;
	}
	
// Alias nodes and lists are chained off the first node or list
// with the same 3- or 4-letter key.	
	Node* GetAlias(Node* np,const ulong flag) 
	{
		if (np->xjFlag == flag) return np;
		if (np->alias==0) 
		{
			np->alias=NewNode(flag);
			return np->alias;
		} else return GetAlias(np->alias,flag);	
	}
	List* GetAlias(List* lp,const ulong flag)
	{
		if (lp->xjFlag == flag) return lp;
		if (lp->alias==0) 
		{
			lp->alias=NewList(flag);
			return lp->alias;
		} else return GetAlias(lp->alias,flag);
	}
	
	bool Spell(const uchar* pt,int tailLen,const uchar lastChar,int numWords);
// Recursive, not inlined, defined below. 
	
	void Push(const Cptr wp) {*stackPtr++ = wp;}
	
	void CopyStack(char* text)
// Copies the word stack of solution words back into plaintext
// This overwrites the decoded plaintext with dictionary words
// and automatically takes care of I/J equivalents and X/Z fillers.	
	{
		while (stackPtr > wordStack) {
			Cptr wp=*--stackPtr;
			uchar c=*wp;
			while (c) {
				*text++=c;
				c=*++wp;	
			}
		} 
		*text=0;
	}
	
	void Cache(Decoder* lastUsed)
	{
// Builds a cache of recently used decoders by moving lastUsed forward	
		Decoder* nextCache=D+cache;
		if (nextCache < lastUsed) 
		{
			Decoder temp=*nextCache;
			*nextCache=*lastUsed;
			*lastUsed=temp;
			cache++;
		}
	}
} *C;


// 			CodeBreaker::Spell
//  Returns true if the string in pt[] of length tailLen can be parsed
//	into dictionary words, allowing for J/I substitutions and X/Z fillers.	

//	On each recursion it attempts to match the head of pt[] with a
//	dictionary word, and if successful, stacks a pointer to the word
//	and calls itself with the shortened string until the end of pt[] is
//	reached.

//	Note: tailLen and wordlengths refer to decoded plain text and
//	words before filler 'X' and 'Z' characters are removed.	
	bool CodeBreaker::
	Spell(const uchar* pt,int tailLen,
				const uchar lastChar,int numWords)
	{
	
		if (--numWords <=0)	// runtime check against stack overflow
		{
			numWords++;
			return false;
		}	
		
		const char* wp;			
		
//	Get Node matching the first 3 characters
		ulong pt0123=*((ulong*)pt);	
		Node* np=base[32*(32*
			(31 & (pt0123 >> 24)) +
			(31 & (pt0123 >> 16))) +
			(31 & (pt0123 >> 8))];
			
		if (np)
		{	
			ulong hardX=pt0123 & 0x20202000;
		
			if (hardX && 
				(1 & np->xjFlag) &&
				(0==(kZFLAG & np->xjFlag))
				) np=np->alias;
		
			while (np) {
//	Get list matching the first 4 characters	
				List* lp=np->list[31 & pt0123];
				hardX=0x20 & (31 & pt0123 | pt[4]);
			
				if (hardX && lp 
					&& (1 & lp->xjFlag) 
					&& (0==(kZFLAG & lp->xjFlag))
					) lp=lp->alias;
			
				while (lp) {
//	Get stem matching the first 5 characters
					Stem* gp=&lp->group[31 & pt[4]];
			
					if (gp->numWords) {
//	Explore all words matching the first 5 characters starting with 
//	the last word in the group (hoping to catch	longer words first)		
						int curWord=gp->numWords;
						int numX = 		// make this a function
							(1 & np->xjFlag) + 
							(1 & (lp->xjFlag >> pt[4]));
						
						do {
							int len;
							wp=gp->MatchTail(pt,len,numX,--curWord);
							if (wp)
							{
								if ((len==tailLen) 
								|| (Spell(pt+len,tailLen-len,pt[len-1],numWords)))
								{
									Push(wp);
									numWords++;
									return true;	
								}	
							}
						} while (curWord > 0);
					}
				
// Try the 4-letter word
					wp=lp->GetListLeader();
					if (wp)
					{ 
						if ((4==tailLen) 
						|| (Spell(pt+4,tailLen-4,pt[3],numWords)))
						{
							Push(wp);
							numWords++;
							return true;	
						}	
					}
					lp=lp->alias;
				}
				
// No luck so far, try the 3-letter word
			
				wp=np->GetNodeLeader();
				if (wp)
				{ 
					if ((3==tailLen) 
					|| (Spell(pt+3,tailLen-3,pt[2],numWords))){
						Push(wp);
						numWords++;
						return true;	
					}	
				}
				np=np->alias;
			}	// end of tests where first 3 letters match	
		}
		
// Try the 2-letter word
		np=base[32*(32*(31 & pt[0])+(31 & pt[1]))+0];
		if (np)
		{
			wp=np->GetNodeLeader();
			if (wp)
			{ 
				if ((2==tailLen) 
				|| (Spell(pt+2,tailLen-2,pt[1],numWords))){
					Push(wp);
					numWords++;
					return true;	
				}	
			}
		}
		
// Try the 1-letter word
		np=base[32*(32*(31 & pt[0])+0)+0];
		if (np)
		{
			wp=np->GetNodeLeader();
			if (wp)
			{ 
				if ((1==tailLen) 
				|| (Spell(pt+1,tailLen-1,pt[0],numWords))){
					Push(wp);
					numWords++;
					return true;	
				}	
			}		
		}
		
// Finally try if this is a filler 'X' or 'Z' between words
		if ((pt[0]=='X') || 
			((0==(31 & (lastChar ^ 'X'))) && (pt[0]=='Z'))) 
		{
			if (tailLen==1){
				numWords++;
				return true;
			}	
			if (0==(31 & (lastChar ^ pt[1])) 
			&& Spell(pt+1,tailLen-1,0,numWords))
			{
				// nothing to stack, just carry on unwinding	
				numWords++;
				return true;
			}	
		}
		
// Spellcheck failed: no word matches at this offset in plaintext	
		numWords++;	
		return false;
	}
	

InitPlayfair
void InitPlayfair(
  const char *words[],		/* dictionary words */
  long numDictionaryWords		/* number of null-terminated words in dictionary */
) 
{
 	C = new CodeBreaker(words,numDictionaryWords);	// all that's needed				
}

DecodePlayfair
void DecodePlayfair(
  const char *cipherText,	/* null-terminated text to decode */
  char *plainText						/* null-terminated decoded text */
)
{
// Decodes cipherText into plainText. 
// If no solution is found, plainText will be a gibberish string
// of the same length as cipherText, or possibly a bit longer.

	if ((0==*cipherText) || (strlen(cipherText) & 1))
// Cannot handle empty or odd-length cipher text strings
		return;
		
 	Decoder* dp=D;
 	Decoder* dEnd=D+numDecoders;									
 	for (;dp<dEnd;dp++)
 	{	
 		int len=dp->DecodeCipher(cipherText,plainText);
 		
		if (len && C->Spell((uchar*)plainText,len,0,kMaxWords)) 
		{
			C->CopyStack(plainText);
			C->Cache(dp);
 			break;
 		}	
 													
 	}
}

TermPlayfair
void TermPlayfair(void)
{
	delete C;		
}
