XCMD Sort
Volume Number: 4
Issue Number: 8
Column Tag: HyperChat
XCMD Cookbook: Sorting Routines
By Donald Koscheka, Apple Computer, Inc.
In the last two issues we covered XCMD programming basics. First we looked at the interface between HyperCard and XCMDs, and then we discussed the callbacks and how they can make your XCMD programming a little easier. This month, I present an XCMD that adds a useful feature to HyperCard: the ability to sort lines in a text field.
This is a good example of XCMD programming since it extends HyperCard by adding a new feature. In addition to adding features, XCMDs are a useful way of speeding up a process that may have been written in HyperTalk. Sorting lines of text is certainly something that could be accomplished in HyperTalk. But if you have a lot of lines in the field, the HyperTalk script may be too slow. Enter the XCMD!
Before we can write the sort program, we must decide what we want to sort and what form the data will take. Let's define a line as a string of text that is terminated by a newline character. Next, we'll want to be able to sort both alphabetic strings and numeric strings. We need the numeric sort because strings of characters are sorted by ASCII value rather than numeric value, so the string "10" would sort ahead of "9".
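To see why the numeric sort has to be a separate case, here is a minimal sketch in portable C (not the MPW Pascal of the listing). The function names compare_as_text and compare_as_number are made up for this illustration and do not appear in LineSort.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two lines by character value. Returns -1, 0 or 1. */
int compare_as_text(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* Compare two lines by the decimal number they spell. Returns -1, 0 or 1. */
int compare_as_number(const char *a, const char *b)
{
    long x = strtol(a, NULL, 10);
    long y = strtol(b, NULL, 10);
    return (x > y) - (x < y);
}
```

Given the lines "10" and "9", the text comparison puts "10" first because the character "1" has a lower ASCII value than "9", while the numeric comparison puts 9 first.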
Once we decide on the form and value of the data to sort, we can decide what we need to pass to the XCMD from HyperCard. Obviously, we'll need the lines. Parameter 1 will contain a handle to the field's text, and parameter 2 will tell the XCMD whether to sort by ASCII value (alphanumeric sort) or by numeric value (decimal sort). Parameter 3 tells the XCMD to sort in ascending or descending order. Actually, LineSort only sorts in ascending (smallest to largest) order. There really is no need to sort from largest to smallest; you can simply report the lines backwards to get descending order. I've left this as an exercise for the reader, since it involves little more than reversing the order of the FOR loop in two of the routines, SetNewText and SetNewNums.
When LineSort is complete, we'll return the sorted list in the parameter block's returnValue. By making this an XFCN we can invoke it using the following format:
Put LineSort( card field "my list", "alpha" ) into card field "my list"
The actual process of sorting can be broken down into the following steps:
(1) LineStart: Create an array of offsets into the list marking the start of each line.
(2) SortText: Sort the list of text by comparing lines using some sorting algorithm.
(3) SetNewText: Report the sorted list back to Hypercard.
(Note: the process is identical for numeric sorting. Because the numeric sort is quite a bit simpler in implementation, we'll concentrate this discussion on the text sort.)
How we accomplish these three tasks is quite another story. Sorting lines of text is not as straightforward as sorting integers or characters, since each line can be of arbitrary length. Before performing the sort, we need to calculate where in the large chunk of text each line starts. We create an array, LineStart, that records the offset of the start of each line and the length of that line. Consider the list:
Margaret
Colleen
Donald
The first line in the list has an offset of 0 and a length of 8. The next item in the list has an offset of 9 and a length of 7. The third item has an offset of 17 and a length of 6. If the numbers don't seem to add up, it's because the linestart array accounts for the newline character that terminates every entry in the list! (In the listing, GetLineStarts actually counts that newline as part of each line, so the stored lengths are one greater than the figures shown here.)
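The offsets and lengths quoted above can be reproduced with a short sketch in portable C (not the MPW Pascal of the listing). The name find_line_starts and the fixed-size output array are assumptions made for brevity; the listing grows a handle with SetHandleSize instead. Lengths here exclude the newline, to match the figures in the text, and empty lines are recorded rather than skipped.

```c
#include <stddef.h>

#define NEWLINE '\r'   /* $0D, the line terminator used in the listing */

typedef struct {
    long start;  /* offset of the line's first byte */
    long size;   /* length of the line, not counting the newline */
} LineElem;

/* Fill out[] with one entry per line of the null-terminated text.
   Returns the number of lines recorded (at most max_lines). */
size_t find_line_starts(const char *text, LineElem *out, size_t max_lines)
{
    size_t count = 0;
    long pos = 0;

    while (text[pos] != '\0' && count < max_lines) {
        long begin = pos;

        /* scan to the end of the current line */
        while (text[pos] != '\0' && text[pos] != NEWLINE)
            pos++;

        out[count].start = begin;
        out[count].size = pos - begin;
        count++;

        if (text[pos] == NEWLINE)
            pos++;   /* step over the terminator */
    }
    return count;
}
```

Run against the three names above, each followed by a newline, this yields the pairs (0,8), (9,7) and (17,6).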
Creating the linestart array pays handsome dividends. When it comes time to sort the list, we don't have to move the strings around in memory. Rather, we simply rearrange the linestart array to reflect the sort order. For example, the linestart array for the above list looks like this before the sort:
[0,8]
[9,7]
[17,6]
and like this after the sort:
[9,7]
[17,6]
[0,8]
Reporting the sorted list back to HyperCard now becomes a simple task. SetNewText extracts N elements from the linestart array, where N is the number of elements in the array. We determine this number by dividing the size of the array by the size of one element. For each element, the first value is the offset of the first byte in the string and the second value is the number of bytes to copy onto that line. Each line is already terminated with a newline, so building the output string is simply a matter of concatenating the strings in the order prescribed by LineStart. If you want to perform a descending sort, build the output list starting with the last element in the array and work your way backwards, one element at a time! This is good practice for the beginner, so I haven't included the sort order part of the code.
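The concatenation step, including the backwards walk for a descending sort, might look like the following sketch in portable C (not the MPW Pascal of the listing). The name build_sorted_text, the descending flag and the use of malloc in place of a Memory Manager handle are all assumptions for illustration. As in the listing, each size here includes the trailing newline.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    long start;  /* offset of the line's first byte */
    long size;   /* bytes to copy, including the trailing newline */
} LineElem;

/* Concatenate the lines in the order given by elems[0..count-1].
   If descending is nonzero, walk the array from the last entry to the first.
   Returns a malloc'd, null-terminated string (caller frees), or NULL. */
char *build_sorted_text(const char *text, const LineElem *elems,
                        size_t count, int descending)
{
    size_t total = 0, used = 0, i;
    char *out;

    for (i = 0; i < count; i++)
        total += (size_t)elems[i].size;

    out = malloc(total + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < count; i++) {
        const LineElem *e = descending ? &elems[count - 1 - i] : &elems[i];
        memcpy(out + used, text + e->start, (size_t)e->size);
        used += (size_t)e->size;
    }
    out[used] = '\0';   /* tack a null on the end of the new text */
    return out;
}
```

Because the already-sorted array is only read, the same array serves both orders; nothing has to be sorted twice.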
I glossed over the actual sorting of the linestart array for two reasons. First, it looks a lot scarier than it is. Second, you may want to replace my sorting algorithm with one of your own. I use a Shell sort because it is one of the easier sorting methods to follow. If you're programming in MPW C, you should consider replacing SortText and SortNums with Apple's QuickerSort. Greg Kimberly of Apple Computer, Inc. demonstrated an XCMD to me that uses QuickerSort. It was at least three times faster than my Shell sort implementation!
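As an illustration of swapping in a different algorithm, the sketch below sorts an array of long integers with the standard C library's qsort. This is a stand-in only: it is not Apple's QuickerSort, whose interface is not shown in this article, and the names compare_longs and sort_nums are made up for the example.

```c
#include <stdlib.h>

/* Ordering function for qsort: negative, zero or positive result. */
static int compare_longs(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Sort nums[0..count-1] into ascending order using the library sort. */
void sort_nums(long *nums, size_t count)
{
    qsort(nums, count, sizeof nums[0], compare_longs);
}
```

The point is that the caller only sees "sort this array in place", so the algorithm behind that interface can change without touching the code that builds the array or reports the result.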
The Shell sort works by comparing each element in the array with the element that is jump elements away from it, where jump starts out at half the number of elements. If the later element is smaller than the earlier one, we swap the two and move on to test the next pair. We keep repeating this pass until no more elements can be swapped, then halve the jump size and start the comparison process all over again. The last round runs with a jump of 1, which leaves the array fully sorted.
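The loop structure just described can be sketched on a plain array of long integers. This is portable C rather than the MPW Pascal of the listing, and the name shell_sort is an assumption; the control flow mirrors the WHILE, REPEAT and FOR loops of SortNums.

```c
#include <stddef.h>

/* Sort a[0..count-1] into ascending order with a Shell sort:
   compare elements `jump` apart, swap those out of order, repeat the pass
   until nothing moves, then halve the jump. */
void shell_sort(long *a, size_t count)
{
    size_t jump = count;

    while (jump > 1) {
        int done;

        jump /= 2;
        do {
            size_t m;

            done = 1;
            for (m = 0; m + jump < count; m++) {
                size_t n = m + jump;

                if (a[m] > a[n]) {   /* out of order: exchange the pair */
                    long swap = a[m];
                    a[m] = a[n];
                    a[n] = swap;
                    done = 0;
                }
            }
        } while (!done);
    }
}
```

The early passes with a large jump move elements long distances cheaply, so the final pass with a jump of 1 usually has little left to do.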
The Toolbox call IUMagString compares the two string elements and returns -1 if string 1 is less than string 2, 0 if they're equal and 1 if string 1 is greater (the swap condition). IUMagString takes two string pointers and the length of each string as input. The length is easy; that's simply the second element of each linestart array entry. We can create a pointer to each line by adding the offset element of each array entry to the starting address of the text. Since the text is referenced via a handle (passed to SortText as hField), we lock down the handle to make sure that this starting address doesn't change on us during the sort.
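Putting the comparison and the linestart array together gives the sketch below, again in portable C rather than the MPW Pascal of the listing. The function compare_lines is a simplified stand-in that compares raw byte values; IUMagString itself follows the system's international sorting rules, so its ordering can differ. Both function names are assumptions, and a plain C string replaces the locked handle.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    long start;  /* offset of the line's first byte */
    long size;   /* length of the line in bytes */
} LineElem;

/* Byte-wise compare of two length-delimited strings: -1, 0 or 1. */
int compare_lines(const char *a, const char *b, long a_len, long b_len)
{
    long shorter = a_len < b_len ? a_len : b_len;
    int r = memcmp(a, b, (size_t)shorter);

    if (r == 0)   /* common prefix matches: the shorter string sorts first */
        return (a_len > b_len) - (a_len < b_len);
    return (r > 0) - (r < 0);
}

/* Shell-sort the line-start records; the text itself never moves. */
void sort_line_starts(const char *text, LineElem *elems, size_t count)
{
    size_t jump = count;

    while (jump > 1) {
        int done;

        jump /= 2;
        do {
            size_t m;

            done = 1;
            for (m = 0; m + jump < count; m++) {
                LineElem *lo = &elems[m];
                LineElem *hi = &elems[m + jump];

                /* pointer to each line = base of text + stored offset */
                if (compare_lines(text + lo->start, text + hi->start,
                                  lo->size, hi->size) == 1) {
                    LineElem temp = *lo;
                    *lo = *hi;
                    *hi = temp;
                    done = 0;
                }
            }
        } while (!done);
    }
}
```

Applied to the Margaret, Colleen, Donald list, the records end up in the order (9,7), (17,6), (0,8), matching the sorted array shown earlier.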
The numeric sorting scheme is very similar to the alpha scheme, only easier because the size of each element is fixed at 4 bytes (the size of a LongInt). If you have trouble reading the SortText routine, try starting with the numeric code in SortNums.
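The two ends of the numeric path, turning lines into integers and integers back into lines, can be sketched as follows in portable C (not the MPW Pascal of the listing). The names parse_line, get_nums and set_new_nums are assumptions, fixed-size buffers replace the listing's handles, and the hand-rolled parser stands in for the StrToNum callback without any overflow checking.

```c
#include <stddef.h>
#include <stdio.h>

#define NEWLINE '\r'   /* $0D, the line terminator used in the listing */

/* Read an optionally signed decimal number at the start of a line. */
static long parse_line(const char *p)
{
    long sign = 1, value = 0;

    if (*p == '-') {
        sign = -1;
        p++;
    }
    while (*p >= '0' && *p <= '9') {
        value = value * 10 + (*p - '0');
        p++;
    }
    return sign * value;
}

/* Store one long per line of text in out[]. Returns the count stored. */
size_t get_nums(const char *text, long *out, size_t max_nums)
{
    size_t count = 0;
    const char *p = text;

    while (*p != '\0' && count < max_nums) {
        out[count++] = parse_line(p);
        while (*p != '\0' && *p != NEWLINE)   /* skip to the end of the line */
            p++;
        if (*p == NEWLINE)
            p++;
    }
    return count;
}

/* Write the numbers back as NEWLINE-terminated lines, null-terminated.
   Returns the bytes written, or 0 if buf is too small or count is 0. */
size_t set_new_nums(const long *nums, size_t count, char *buf, size_t buf_size)
{
    size_t used = 0, i;

    if (buf_size > 0)
        buf[0] = '\0';
    for (i = 0; i < count; i++) {
        int n = snprintf(buf + used, buf_size - used, "%ld%c", nums[i], NEWLINE);

        if (n < 0 || (size_t)n >= buf_size - used)
            return 0;
        used += (size_t)n;
    }
    return used;
}
```

Between these two calls the array holds fixed-size values, so the sort can compare and swap elements directly without any offsets or lengths.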
Next month: File I/O. You can take it with you!
{*************************}
{* File: LineSort.p*}
{* *}
{* Sorts lines of text (delimited by *}
{* newline and null terminated). *}
{* Returns the sorted container. *}
{*-------------------------------- *}
{* In: paramPtr=pointer to the XCMD *}
{* Parameter Block *}
{* *}
{* params[1] = handle to the text *}
{* params[2] = handle to sort type *}
{* (ALPHA, NUMERIC)*}
{* params[3] = Sort Order *}
{* (ASCENDING, DESCENDING)*}
{* *}
{* Defaults : ALPHA, ASCENDING*}
{* Out: returnValue = handle to the*}
{* sorted data *}
{* *}
{*-------------------------------- *}
{* © 1988, Donald Koscheka*}
{* All Rights Reserved *}
{*-------------------------------- *}
{*************************}
(*************************
BUILD SEQUENCE
pascal LineSort.p
link -m ENTRYPOINT -rt XFCN=65535
-sn Main=LineSort
LineSort.p.o
{Libraries}Interface.o
{PLibraries}Paslib.o
-o {xcmds}testxcmds
*************************)
{$S LineSort }
UNIT Donald_Koscheka;
{----------INTERFACE------------}
INTERFACE
USES
MemTypes,QuickDraw,OSIntf,ToolIntf,
PackIntf,HyperXCmd;
PROCEDURE EntryPoint(paramPtr:XCmdPtr);
{--------IMPLEMENTATION--------}
IMPLEMENTATION
{$R-}
CONST
NULL = $00;
NEWLINE= $0D;
ALPHA = $00;
NUMERIC= $01;
ASCEND = $00;
DESCEND= $01;
LESSTHAN = -1;
EQUALTO= 0;
GREATERTHAN= 1;
TYPE
Str31 = String[31];
LinePtr = ^LineElem;
LineHand = ^LinePtr;
LineElem = PACKED RECORD
Start: LongInt;
Size : LongInt;
END;
numPtr = ^LongInt;
numHand= ^numPtr;
PROCEDURE LineSort(paramPtr:XCmdPtr);
FORWARD;
{------------EntryPoint------------}
PROCEDURE EntryPoint(paramPtr: XCmdPtr);
BEGIN
LineSort(paramPtr);
END;
{----------LineSort----------------}
PROCEDURE LineSort(paramPtr:XCmdPtr);
(*************************
* Main code segment follows
*************************)
VAR
SortType : INTEGER;
SortOrder: INTEGER;
LineStart: LineHand;
hNums : numHand;
NewField : Handle;
SortStr: Str255;
{$I XCmdGlue.inc }
(************************)
(*** Alpha Sorting Routines***)
(************************)
FUNCTION GetLineStarts(hField:Handle)
: LineHand;
(**************************
* Given a pointer to a block of text,
* scan for line-terminators (NEWLINE
* | NULL) and fill out a dynamically
* allocated array of line starts
* information.
*
* The line starts array contains
* the offset in the text to the start
* of the line as well as the length
* of the line in bytes.
*
* Offsets are used to allow the record
* to remain valid across relocation
* of the basic text.
*
* In: Pointer to a block of text,
* null terminated.
*
* Out: Handle to an array of
* linestart indices.
*
*************************)
VAR
done : BOOLEAN;
arraySize: LongInt;
startText: LongInt; { Base pointer of the text }
sizeText : LongInt; { length of the current run}
lineStart: LineHand; { array of lineStarts pointers}
txStart: Ptr; { pointer to the input text}
txPtr : Ptr; { pointer into input text (FIELD)}
lineArray: LinePtr; { Pointer into lineStart array }
BEGIN
lineStart := NIL; {*** returned as-is when there is no text to scan ***}
IF (hField<>NIL)AND(hField^^<>NULL)THEN
BEGIN
HLock( hField );
txPtr := hField^;
startText := ORD( txPtr );
lineStart := LineHand(NewHandle(0));
done := FALSE;
WHILE NOT done DO
BEGIN
txStart := txPtr;
ScanToReturn( txPtr );
IF txPtr^ = NULL THEN
BEGIN
txPtr^ := NEWLINE;
done := TRUE;
END;
txPtr := Pointer( ORD(txPtr) + 1);
sizeText := ORD(txPtr)-ORD(txStart);
IF sizeText > 1 THEN
BEGIN
{*** point to next record in linestarts array ***}
arraySize := GetHandleSize( Handle(LineStart) );
SetHandleSize( Handle(LineStart),
arraySize + sizeOf( lineElem ));
lineArray := Pointer( ORD(lineStart^) + arraySize );
{*** Put away offset and length of line ***}
WITH lineArray^ DO
BEGIN
Start:= ORD(txStart) - startText;
Size := SizeText;
END;
END;
END; {*** While ***}
HUnlock( hField );
END;
GetLineStarts := lineStart;
END;
PROCEDURE SortText( hField:Handle );
(************************
* Given a handle to a run of text,
* sort the lines of text pointed to and
* rearrange the array accordingly.
*
* Text is sorted by rearranging the line
* starts array.
*
* In: Pointer to array of linestarts
*linestarts, linecount (global)
* Out: sorted array of linestarts.
************************)
VAR
done : BOOLEAN;
jump,len1,len2: INTEGER;
n,m,lineCount : LongInt;
str1, str2 : Ptr;
tempElem : LineElem;
elem1, elem2 : LinePtr;
BEGIN
LineCount:= GetHandleSize( Handle(lineStart) );
LineCount:= LineCount DIV sizeOf( LineElem );
jump := lineCount;
HLock( Handle(lineStart) );
HLock( hField );
WHILE jump > 1 DO
BEGIN
jump := jump DIV 2;
REPEAT
done := TRUE;
FOR m := 0 to ( lineCount - jump - 1 ) DO
BEGIN
n := m + jump;
{*** Calculate the offsets of the two elements ***}
elem1 := LinePtr( ORD(LineStart^) +
(n * sizeof( LineElem )) );
str1 := Pointer( ORD( hField^ ) + elem1^.Start) ;
len1:= INTEGER( elem1^.Size );
elem2 := LinePtr( ORD(LineStart^) +
(m * sizeof( LineElem )) );
str2 := Pointer( ORD( hField^ ) +
elem2^.Start) ;
len2 := INTEGER( elem2^.Size );
IF IUMagString( str2, str1, len2, len1 )
= GREATERTHAN THEN
BEGIN
tempElem.Start := elem1^.Start;
tempElem.Size := elem1^.Size;
elem1^.Start := elem2^.Start;
elem1^.Size:= elem2^.Size;
elem2^.Start := tempElem.Start;
elem2^.Size:= tempElem.Size;
done := FALSE;
END;
END; {*** FOR loop ***}
UNTIL done;
END; {*** WHILE jump > 1 ***}
HUnlock( Handle(lineStart) );
HUnlock( hField );
END;
FUNCTION SetNewText( hField: Handle;
Dir : INTEGER ): Handle;
(*****************************
* Given a pointer to a linestarts array, and
* a corresponding block of text, rearrange
* the text to match the line starts in the
* array. This is useful after doing a sort
* since the linestarts array will be in order
* but the text won't be.
*
* return a run of null-terminated text
* with each line NEWLINE-terminated.
*
* In: Handle to text array
*Handle to linestarts array.
*
* Out: Handle lines of newline terminated text.
****************************)
VAR
LineNum,
LineCount,
oldSize: LongInt;
NewText: Handle;
StartOfLine: Ptr;
NextLine : LinePtr;
BEGIN
LineCount:= GetHandleSize( Handle(lineStart) );
LineCount:= LineCount DIV sizeOf( LineElem );
NewText:= NewHandle( 0 );
{*** SetHandleSize can move memory, so lock both blocks ***}
{*** while we hold pointers into them ***}
HLock( Handle(lineStart) );
HLock( hField );
FOR LineNum := 0 to LineCount-1 DO
BEGIN
NextLine := LinePtr( ORD( LineStart^ ) +
(LineNum * sizeOf( LineElem )) );
StartOfLine := Pointer( ORD( hField^ ) +
NextLine^.Start );
{*** add length of new line to output text ***}
oldSize := GetHandleSize( NewText );
SetHandleSize( NewText, oldSize + NextLine^.Size );
{*** move the line into the array ***}
BlockMove( StartOfLine,
Pointer( ORD( NewText^ ) + oldSize),
NextLine^.Size );
END;
HUnlock( Handle(lineStart) );
HUnlock( hField );
{*** Tack a NULL on the end of the new text ***}
oldSize := GetHandleSize( newText );
SetHandleSize( newText, oldSize + 1 );
StartOfLine := Pointer( ORD( newText^ ) + oldSize );
StartOfLine^ := NULL;
SetNewText := NewText;
END;
(***************************)
(*** Number Sorting Routines ***)
(***************************)
FUNCTION GetNums( hField : Handle ): numHand;
(***************************
* Given a handle to a block of text, scan
* for line-terminators (NEWLINE | NULL) and
* fill out a dynamically allocated array of
* long integers, one for each line.
*
* Out: Handle to an array of longInt.
*
***************************)
VAR
done : BOOLEAN;
arraySize,
theNum : LongInt;
txStart: Ptr;
txPtr : Ptr;
numArray : numPtr;
hNum : NumHand;
tStr : Str255;
BEGIN
hNum := NIL; {*** returned as-is when there is no text to scan ***}
IF ( hField <> NIL ) AND ( hField^^ <> NULL ) THEN
BEGIN
HLock( hField );
txPtr := hField^;
hNum := NumHand( NewHandle( 0 ) );
done := FALSE;
WHILE NOT done DO
BEGIN
txStart := txPtr;
ScanToReturn( txPtr );
IF txPtr^ = NULL THEN
done := TRUE
ELSE
txPtr^ := NULL;
ZeroToPas( txStart, tStr );
theNum := StrToNum( tStr );
txPtr := Pointer( ORD(txPtr) + 1);
arraySize := GetHandleSize( Handle( hNum ) );
SetHandleSize( Handle( hNum ), arraySize + sizeOf( longInt ));
numArray := Pointer( ORD( hNum^) + arraySize );
numArray^:= theNum;
END; {*** While ***}
HUnlock( hField );
END;
GetNums := hNum;
END;
PROCEDURE SortNums( theNums : NumHand );
(*********************
* Given a pointer to an array of longints,
* sort the numbers and rearrange the array
* accordingly.
*
* In: Handle to an array of longInts;
* uses linecount and lineNum;
* Out: sorted array of longInts.
********************)
VAR
done : BOOLEAN;
jump,len1,len2 : INTEGER;
n,m ,swap,
lineCount: LongInt;
num1,num2: numPtr;
BEGIN
LineCount := GetHandleSize( Handle(theNums) ) DIV sizeOf( LongInt );
jump := lineCount;
WHILE jump > 1 DO
BEGIN
jump := jump DIV 2;
REPEAT
done := TRUE;
FOR m := 0 to ( lineCount - jump - 1 ) DO
BEGIN
n := m + jump;
{*** Calculate the offsets of the two elements ***}
num1 := Pointer(ORD(theNums^) + (n * sizeof( LongInt )) );
num2 := Pointer(ORD(theNums^) + (m * sizeof( LongInt )) );
IF num2^ > num1^ THEN
BEGIN
swap := num1^;
num1^:= num2^;
num2^:= swap;
done := FALSE;
END;
END; {*** FOR loop ***}
UNTIL done;
END; {*** WHILE jump > 1 ***}
END;
FUNCTION SetNewNums( hNum: numHand; Dir : INTEGER ): Handle;
(********************
* Given a handle to an array of
* longints, convert them back to
* strings and put them away in
* a new handle as text
*
* In: Handle to num array
*
* Out: Handle lines of newline
* terminated text.
*********************)
VAR
index, intCount,
oldSize, numLen : LongInt;
nexNum : NumPtr;
NewText: Handle;
tempPtr: Ptr;
tempStr: Str255;
BEGIN
NewText:= NewHandle( 0 );
intCount := (GetHandleSize( Handle(hNum) ) DIV sizeOf( LongInt ));
FOR index := 0 to intCount-1 DO
BEGIN
nexNum := NumPtr( ORD( hNum^ ) +
(index * sizeOf( LongInt )) );
tempStr := NumToStr( nexNum^ );
numLen := LongInt( ORD( tempStr[0] ) );
{*** add length of new line to output text ***}
oldSize := GetHandleSize( NewText );
SetHandleSize( NewText, oldSize + NumLen + 1 );
{*** move the line into the array ***}
TempPtr := Pointer( ORD( @tempStr ) + 1 );
BlockMove( TempPtr , Pointer(ORD( NewText^)+ oldSize),NumLen);
TempPtr := Pointer(ORD( NewText^) + oldSize + NumLen);
TempPtr^ := NEWLINE;
END;
{*** Tack a NULL on the end of the new text ***}
oldSize := GetHandleSize( newText );
SetHandleSize( newText, oldSize + 1 );
tempPtr := Pointer( ORD( newText^ ) + oldSize );
TempPtr^ := NULL;
SetNewNums := NewText;
END;
{******* Main Block for LineSort *******}
BEGIN
NewField := NIL;
WITH paramPtr^ DO
IF (params[1] <> NIL) AND (params[1]^^ <> NULL) THEN
BEGIN
SortType := ALPHA;
SortOrder := ASCEND;
IF params[2] <> NIL THEN
BEGIN
ZeroToPas( params[2]^, sortStr );
IF StringEqual('NUMERIC', sortStr) THEN
SortType := NUMERIC;
END;
IF params[3] <> NIL THEN
BEGIN
ZeroToPas( params[3]^, sortStr );
IF StringEqual('DESCENDING', sortStr ) THEN
SortOrder := DESCEND;
END;
CASE SortType OF
ALPHA:
BEGIN
LineStart := GetLineStarts( params[1] );
SortText( params[1] );
newField := SetNewText( params[1], SortOrder );
DisposHandle( Handle( LineStart ) );
END;
NUMERIC:
BEGIN
hNums := GetNums( params[1] );
SortNums( hNums );
newField := SetNewNums( hNums, SortOrder );
DisposHandle( Handle( hNums ) );
END;
END; {*** CASE SortType OF ***}
END;
paramPtr^.returnValue := newField;
END;
END.