Browser
Volume Number: 7
Issue Number: 8
Column Tag: C Workshop
A Fast Text Browser
By Allen Stenger, Gardena, CA
The Browser is a Think C program for viewing and searching (but not modifying) text files. It is an example of using the Think Class Library to build a useful application with only a modest effort. The Browser uses a fast search algorithm invented by Boyer and Moore. Running on a Mac Plus, the program can open a 100 Kbyte text file in 3 seconds and can search it in 0.4 seconds.
Simplifying Assumptions
I often need to look up things in text-only files; e.g., to look up a topic in the Macintosh Technical Notes index or to look at a program source file. Borland's NotePad+ (part of SideKick) is very handy for this, but it is based on TextEdit. This makes it slow, and incapable of handling large files (greater than 28K). I needed something fast and capacious.
Because of the limited use to which I would put it, I could make several simplifying assumptions. First, it would only read files, so I didn't have to worry about memory management on a changing file. Second, the files are already formatted into lines, so I could ignore text-wrap and write each paragraph (i.e., string terminated by a Return) on a single line (as long as I had a way to scroll horizontally through any extra-long lines). Third, I never extracted any text from these files, so I didn't have to worry about highlighting selections or indeed about text selection at all. This requires a slightly non-Macish way of doing repeated searches. I defined "Find" as "Find from the top of the file" and "Find Again" as "Find from the last found position."
Mapping into TCL
The Think Class Library takes care of most of the user interface details. I (being so lazy) started with that -- specifically with the Starter application. (Symantec includes a TinyEdit example based on Starter, but it uses TextEdit and so did not meet my performance criteria.)
Although this is nominally object-oriented programming, objects play little explicit role and the Browser is more an example of software reuse. The Starter application already does much of what I wanted, and it was organized so that I could easily make the desired changes. The one visible advantage of objects themselves in this application is the ability to open multiple documents with no special programming - each document is a separate object and does not interfere with the others.
(Easing software reuse may be the greatest contribution that object-oriented programming will make: the structure of OOP programs makes it easy to modify them without destroying already-working code in the process.)
The TCL is not a complete class library, such as is provided with Smalltalk implementations. It is a user interface class library and deals with all the details of the Macintosh user interface: menus, windows, scrolling, etc. You can define your own class libraries built on or separate from TCL, through the OOP features of Think C.
If (like me) you have avoided relocatable blocks because of the perceived difficulties and if (like me) you make lots of mistakes using them, you will have a lot of trouble with TCL. All the instance variables are in a relocatable block, and you have to be extra careful either to lock the block (the handle's name is self) before doing anything that might move memory or else to copy the variables you need into local, non-relocatable variables (the recommended technique, called "shadowing" by Symantec). Realize that memory may be moved not just through obvious trap calls such as NewHandle but also by calling a subroutine (if the subroutine is in another, non-loaded segment and LoadSeg is called to bring it in). This makes the shadowing of string variables a little tricky, since the obvious way to copy to a local variable is to use strcpy -- a subroutine call. This implies that in general you should not define function parameters to be strings.
Shadowing is the clumsiest and most error-prone aspect of the Think object implementation. Unfortunately I can't think of any better way to do it, without giving up the memory-management benefits of relocatable blocks.
The Browser, like the Starter, defines subclasses of the three classes CApplication, CDocument, and CPanorama. The center of attention is CBrowserDoc, the subclass of CDocument. It handles everything involving creating and destroying documents and searching. CBrowserPane, a subclass of CPanorama (not of CPane as the name suggests), has the relatively easy job of updating the window when something changes in it. The TCL takes care of window sizing, scroll bar manipulation, etc., and sends itsMainPane (the single instance of CBrowserPane for this document) a message to update the window. The message is defined in Panorama coordinates, which are both the coordinates relative to the document itself and the QuickDraw local coordinates. This makes it easy for itsMainPane to figure out what part of the document needs to be rewritten and where it goes in the window. CBrowserApp defines the application itself, and deals mostly with opening documents. Browser is the main program, and is identical with Starter except for the names; it creates an instance of CBrowserApp and starts it running.
The TCL translates menu selections into commands. A command passes through a chain of command of objects until it reaches an object that knows how to process it. Every object in the chain of command is an instance of CBureaucrat and as such has an instance variable defining its supervisor; if the object cannot process a command it passes it explicitly to its supervisor. For the Browser the chain starts at the instance of CBrowserPane, chains upward to the instance of CBrowserDoc, and ends with the instance of CBrowserApp. If no object in the chain can process a command, it is lost. (You can simulate the effect of a menu selection by calling DoCommand with the desired command; CBrowserApp::StartUpAction does this to open a file when the application begins running. You also can invent your own commands, not tied to a menu, if for some reason you want to send controls along the chain of command.)
The processing within an object also goes through a chain, but this one is a class inheritance chain. For example, the document object is an instance of CBrowserDoc, which has a DoCommand method that handles Find and Find Again. Any other commands are passed explicitly (by a call to inherited::DoCommand) to its superclass, CDocument, which handles commands dealing with open documents (e.g., Close). CDocument passes any leftover commands explicitly to its superclass, CDirector, which then passes any remaining commands explicitly to its superclass, CBureaucrat. This is the end of the line for an instance of CBureaucrat: the DoCommand method in CBureaucrat passes all commands reaching it explicitly up to the object's supervisor.
Another example is the pane object. It is an instance of CBrowserPane, which has no DoCommand method. Its superclass, CPanorama, also has no DoCommand method, so the object eventually inherits its DoCommand from CBureaucrat, which always sends the command to the supervisor.
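The supervisor chain described above can be imitated in plain C. This is only a minimal sketch and not the TCL's code: the Bureaucrat struct, the command numbers, and the canProcess functions are hypothetical stand-ins, and the class inheritance chain inside each object is flattened into a single function per object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command and object identifiers, for illustration only. */
enum { CMD_ABOUT = 1, CMD_FIND = 2, CMD_BOGUS = 99 };
enum { WHO_NOBODY = 0, WHO_PANE, WHO_DOC, WHO_APP };

typedef struct Bureaucrat Bureaucrat;
struct Bureaucrat {
    int id;                       /* which object this is */
    Bureaucrat *supervisor;       /* next link in the chain of command */
    int (*canProcess)(long cmd);  /* nonzero if this object handles cmd */
};

static int paneCanProcess(long cmd) { (void)cmd; return 0; }  /* handles nothing */
static int docCanProcess(long cmd)  { return cmd == CMD_FIND; }
static int appCanProcess(long cmd)  { return cmd == CMD_ABOUT; }

/* The chain: pane -> document -> application. */
static Bureaucrat app  = { WHO_APP,  NULL, appCanProcess  };
static Bureaucrat doc  = { WHO_DOC,  &app, docCanProcess  };
static Bureaucrat pane = { WHO_PANE, &doc, paneCanProcess };

/* Pass cmd up the chain; return the id of the object that processed it,
   or WHO_NOBODY if the command fell off the end and was lost. */
int doCommand(Bureaucrat *obj, long cmd)
{
    assert(obj != NULL);
    while (obj != NULL) {
        if (obj->canProcess(cmd))
            return obj->id;
        obj = obj->supervisor;
    }
    return WHO_NOBODY;
}

/* Commands enter the chain at the pane, as in the Browser. */
int dispatchFromPane(long cmd)
{
    return doCommand(&pane, cmd);
}
```

In this sketch dispatchFromPane(CMD_FIND) stops at the document and dispatchFromPane(CMD_ABOUT) travels on to the application, while an unknown command reaches the end of the chain and is dropped.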
Search Algorithm
This program uses a string-search algorithm invented by Robert S. Boyer and J Strother Moore. The algorithm itself is simple, but it was derived by some very clever reasoning.
Suppose we want to find the string "onion" in a file that begins "Colored cartoons of giant onions". The common approach would be to align the search string and the file string and compare left-to-right until a mismatch is found, then shift the search string right one and repeat. So we would start with
onion
Colored cartoons of giant onions
and observe that "o" and "C" don't match. We would shift "onion" right one to get
 onion
Colored cartoons of giant onions
where now the two "o"s match but then "n" and "l" don't match, so we shift again. This continues until we have shifted all the way to "onions", one position at a time.
Boyer and Moore observed that we can speed up the search by extracting some more information from the failures. Their first innovation is to compare right-to-left, so we would be matching the final "n" of "onion" against the "r" of "Colored", which again fails. So far this is no improvement, but we can observe that not only does "n" not match "r", but there is no "r" anywhere in the search string, so we can immediately shift it right by its length rather than by one, bypassing all the hopeless intermediate searches.
     onion
Colored cartoons of giant onions
Repeating, we see that the "a" of "cartoons" occurs nowhere in "onion", so we can shift right another 5 characters.
          onion
Colored cartoons of giant onions
If "a" had occurred in the search string, we could not have shifted by the search string's length, but we could shift to align the rightmost occurrence of "a" with its occurrence in the file string. This process is the inner loop of the Boyer-Moore search algorithm. The decision whether a character occurs in the search string, and if so its rightmost position in the search string, is made by a lookup table (called delta0) prepared at the beginning of the search.
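A table of this kind might be built as in the following sketch. The name delta0 follows the text, but the layout is an assumption (the contents of the author's Search.c are not shown here): each entry holds the distance the search string may move right when the given file character is lined up with the last character of the search string. The function names buildDelta0 and delta0For are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ALPHABET 256   /* one table entry per possible character */

/* Fill shift[c] with the distance the search string may move right when
   file character c is aligned with the LAST search-string character. */
void buildDelta0(const char *pat, size_t shift[ALPHABET])
{
    size_t patlen = strlen(pat);
    size_t i;

    assert(patlen > 0);
    for (i = 0; i < ALPHABET; i++)
        shift[i] = patlen;               /* absent: shift by full length */
    for (i = 0; i < patlen; i++)         /* later (rightmost) entries win */
        shift[(unsigned char)pat[i]] = patlen - 1 - i;
}

/* Convenience wrapper: the shift for a single character. */
size_t delta0For(const char *pat, char c)
{
    size_t shift[ALPHABET];

    buildDelta0(pat, shift);
    return shift[(unsigned char)c];
}
```

For "onion" this gives a shift of 5 for "r" and "a" (absent from the search string), 2 for "i", 1 for "o", and 0 for "n", where 0 means the last characters already agree and the right-to-left comparison should begin.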
Continuing, we see this time that there is a match of the two "n"s, so we move (left) to the next characters, which are two "o"s and also match. The next comparison fails ("i" against "o"). Our inner loop method does not help here; it suggests that we should shift the search string left one to align the second "o" of "onion" with the first "o" of "cartoons", but we already know there are no matches to the left of the current position. What can we do?
One method that will always work is merely to shift the search string another position to the right (we know there is no match at the current position or to the left). But Boyer and Moore have added a further refinement to wring even more information out of the failed matches. In the present case we know that the position being examined in the file string has "on" preceded by something other than "i". If the search string also has "on" preceded by something other than "i", we should shift the search string right to align this with its occurrence in the file string. For most search strings there would be no reoccurrence, so we could shift by the length of the search string; but here "on" does indeed reoccur, preceded by nothing, so we shift to align the first "on" of "onion":
             onion
Colored cartoons of giant onions
The amount of shift is also pre-calculated for each trailing string of the search string and is stored in the lookup table delta2. The location of the reoccurrence is called the "rightmost plausible reoccurrence" (RPR). For example, the RPR of "on" is the first "on" of "onion", but the trailing string "n" has no RPR since there is no "n" preceded by something other than "o". If there is no RPR the delta2 table commands a shift by the length of the search string.
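The RPR rule can be checked with a brute-force sketch like the one below. It is an illustration under assumptions rather than the author's table-building code: it computes the shift for one mismatch position on demand instead of filling a table, it reports the distance the search string moves (the original paper's delta2 counts the movement of the file pointer instead), and, following Boyer and Moore, it lets a reoccurrence hang partly off the left end of the search string. The names plausible and delta2Shift are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Does the trailing string pat[s..m-1] reoccur starting at index k
   (which may be negative), preceded by something other than pat[s-1]?
   Positions that fall off the left end of pat match anything. */
static int plausible(const char *pat, long m, long s, long k)
{
    long t;

    for (t = 0; s + t < m; t++)
        if (k + t >= 0 && pat[k + t] != pat[s + t])
            return 0;
    return k - 1 < 0 || pat[k - 1] != pat[s - 1];
}

/* Shift of the search string after a mismatch at pat[j], given that
   pat[j+1..m-1] has already matched the file string. */
long delta2Shift(const char *pat, long j)
{
    long m = (long)strlen(pat);
    long s = j + 1;              /* start of the matched trailing string */
    long k;

    assert(j >= 0 && j < m);
    for (k = s - 1; k > s - m; k--)      /* rightmost candidate first */
        if (plausible(pat, m, s, k))
            return s - k;
    return m;                            /* no RPR: shift by full length */
}
```

With "onion", a mismatch at the "i" (index 2, trailing string "on") yields a shift of 3, and a mismatch at the second "o" (index 3, trailing string "n") yields a shift of 5, in agreement with the shifts in the worked example.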
We can continue the search as before. The "o" of "of" does not match the "n" of "onion", and we can shift right one to align the "o"s.
              onion
Colored cartoons of giant onions
Now the "n" and the "f" do not match, and there are no "f"s in "onion", so we shift right 5 to get
                   onion
Colored cartoons of giant onions
Now the "n"s match but the "o" and the "a" don't, and the RPR rule tells us we can again shift 5.
                        onion
Colored cartoons of giant onions
Now the "n" and the "i" don't match, and we shift 2 to align the "i"s.
                          onion
Colored cartoons of giant onions
This is finally the match we are looking for, and the algorithm merely compares character-by-character to confirm this.
The original Boyer-Moore algorithm deals only with exact (case-sensitive) matches, but in most Macintosh work we want case-insensitive matches. The algorithm is easily modified for this: construct delta0 and delta2 by considering uppercase and lowercase the same, then do the character-by-character compares as case-insensitive. The speed of the lookups is unaffected by this, and they take nearly all the running time of the algorithm. The algorithm does slow down somewhat because there are now more false partial matches than if we had stuck to case-sensitive, but even so it is nearly as fast as the case-sensitive version.
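The case-insensitive modification can be sketched as follows. To keep the example short it uses only a delta0-style table and no delta2 table, which is the simplified variant usually credited to Horspool and not the full algorithm used by the Browser. The table gets the same entry for the uppercase and lowercase forms of each search-string character, so the lookup itself needs no case conversion. The function name findNoCase and its parameters are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define ALPHABET 256

static int fold(char c) { return tolower((unsigned char)c); }

/* Return the offset of the first case-insensitive occurrence of pat in
   text[0..textLen-1] at or after offset start, or -1 if there is none. */
long findNoCase(const char *text, long textLen, const char *pat, long start)
{
    long shift[ALPHABET];
    long m = (long)strlen(pat);
    long i, pos;

    assert(m > 0 && start >= 0);
    for (i = 0; i < ALPHABET; i++)
        shift[i] = m;                     /* absent: shift by full length */
    for (i = 0; i < m - 1; i++) {         /* last character is excluded so */
        unsigned char c = (unsigned char)pat[i];  /* every shift is >= 1   */
        shift[tolower(c)] = m - 1 - i;    /* same entry for both cases */
        shift[toupper(c)] = m - 1 - i;
    }

    pos = start;
    while (pos + m <= textLen) {
        /* compare right-to-left, ignoring case */
        for (i = m - 1; i >= 0 && fold(text[pos + i]) == fold(pat[i]); i--)
            ;
        if (i < 0)
            return pos;                   /* whole search string matched */
        pos += shift[(unsigned char)text[pos + m - 1]];
    }
    return -1;
}
```

Calling findNoCase again with start set to one past the previous result gives the "Find Again" behavior described earlier.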
Notes
The Think C debugger gives no visibility into the heap, and so is not much help in finding heap problems. Jasik Designs' The Debugger is excellent for solving heap problems. Turn on Heap Check to check for heap corruption; you also may step through the program (e.g., by every Memory Manager trap), dumping the heap at each stop to check for incorrect types or sizes. The Debugger also has an excellent interface into the Think C project file that will show you the C source alone or interspersed with the assembly language code. This lets you see where you are in the program better than MacsBug does. It is also good for examining the quality of the generated code - I used it for tuning Search.c and LineCount.c.
If you get mysterious errors with the Think C debugger on but not with it off, you are almost certainly the victim of heap problems and you should check all your instance variables.
One drawback of the TCL is that it develops only applications. I would have liked the Browser to be a desk accessory (like NotePad+), but I could not find a convenient way to do this. It seems that it would not be hard to alter the TCL to produce DAs; the biggest changes would be to change the resource number references from constants defined in Constants.h to calculated values, and to feed events into CSwitchboard::ProcessEvent from the DA's control routine rather than let it pull them from the event queue itself. I did not want to alter the TCL, and I could not see how to do this just by overriding existing classes. If DAs do not become extinct with the advent of System 7.0, perhaps Symantec could put these changes in an upgrade to the TCL.
Each object disposes of itself so be sure to do all necessary cleanup for the class before issuing inherited::Dispose(). This cleanup includes disposing of any handles and any objects created by the object. For example, the pane (itsMainPane) is disposed of by CBrowserDoc::Dispose().
The panorama units are used to quantize the amount of scrolling, e.g. into whole numbers of lines or characters. The Browser sets the units to one line vertically and one character horizontally; the TCL scrolls only in multiples of these amounts. The arguments to CPanorama::ScrollTo are in panorama units, since it deals with scrolling. The arguments to the Draw function (defined in CPane and overridden in CBrowserPane) are in QuickDraw local coordinates, since it deals with a window and not specifically with scrolling.
One user-interface detail not handled by the TCL is proper updating of a window if part of it has been invalidated just before a scroll. This happens in the Browser when doing a Find, since the Find dialog box comes up in front of the Browser window and invalidates part of it. The Toolbox ScrollRect routine does not translate the invalid region (since it doesn't know about it), and the TCL generates a Draw() call for the old invalid region (which is now in a different place in the window). This is corrected by issuing a call to CWindow::Update() just after the modal dialog to redraw the area under the dialog. If this is omitted, a white space that is the scroll of the dialog box area is left in the window (unless this area is scrolled totally off-screen).
For Further Reading
1. Robert S. Boyer and J Strother Moore, "A Fast String Searching Algorithm." Communications of the ACM, v. 20, no. 10, pp. 762-772 (October 1977). Describes the Browser's search algorithm. Also has much empirical and theoretical analysis of the algorithm's performance. (ACM = Association for Computing Machinery)
2. Donald E. Knuth, James H. Morris, and Vaughan R. Pratt, "Fast Pattern Matching in Strings." SIAM Journal on Computing, v. 6, no. 2, pp. 323-350 (June 1977). This describes another fast search algorithm, which was developed at the same time as the Boyer-Moore algorithm and uses some of the same ideas. It compares left-to-right rather than right-to-left: moving backward in the file string is messy if the file is too large to fit in memory, and one consideration in developing the algorithm was to avoid this. This is a fairly theoretical article, full of finite-state automata. (SIAM = Society for Industrial and Applied Mathematics)
3. John A. Nairn, "All About Scrolling Windows." MacTutor, v. 5 no. 4 (April 1989), also reprinted in The Best of MacTutor v. 5 pp. 69-87. Everything you ever wanted to know about scrolling, and then some. The TCL takes care of ordinary everyday scrolling for you, so you usually don't have to worry about it, but this article shows how to do any imaginable scrolling and includes a scrolling library in C (not TCL). This library uses a binary search to find the line corresponding to a position in the file: speed is important, since it performs the search for each character typed into the file. CBrowserDoc::FindIt uses a linear search: it is only called after a string search, and the string search time and scrolling time tend to swamp the line search time, making the speedup from a binary search unnoticeable.
4. Mark McBride, "Modeless Search and Replace." MacTutor, v. 5 no. 5 (May 1989), also reprinted in The Best of MacTutor v. 5 pp. 344-353. A simple text editor in Pascal, with emphasis on the Find/Replace functions. Uses Munger for searches.
5. Enrico Colombini, "Designing an Object with THINK C." MacTutor, v. 6 no. 8 (August 1990), pp. 68-76. Develops a custom control by defining new subclasses of the TCL. The Browser did not have to define any subclasses beyond those defined in Starter.
6. Mark B. Kauffman, "TCL OOPs Introduction." MacTutor, v. 6 no. 9 (September 1990), pp. 84-86. An extremely simple example using TCL, just enough to get you started. This one has no user interaction; it just draws some shapes and then halts.
7. Denis Molony, "HeapLister." MacTutor, v. 6 no. 10 (October 1990), pp. 56-76. A very elaborate example using TCL, showing off most of its features. Defines several subclasses of TCL classes. Includes a window split into three panes.
Source Listings
This program is based on Symantec's Starter application. Most of the Starter comments were stripped out to reduce the size of the published listing. The original Starter sources are copyright © 1990 Symantec Corporation.
(The other segments are the same as Starter.Π.)
/******************************************************/
/* Browser.h */
/* General Browser includes */
/* Written in Think C version 4.0.2 */
/* Allen Stenger January 1991 */
/******************************************************/
/***********************************/
/* Search menu equates */
/***********************************/
/* menu number for the Search menu */
#define MENUsearch 20
/* commands for the Search menu */
#define cmdFind 2000L
#define cmdFindAgain 2001L
/* dialog number of Search dialog */
#define dlogSearch 1000
/* item numbers in Search dialog */
#define dlogOK 1
#define dlogCancel 2
#define dlogText 3
/***********************************/
/* Resource IDs */
/***********************************/
#define WINDBrowser 500 /* WIND template */
#define dlogAbout 1001 /* About dialog */
/******************************************************/
/* Browser.c */
/* This is the main program for the Browser; it */
/* creates an instance of the application and starts */
/* it running. */
/* Written in Think C version 4.0.2 */
/* Based on Starter.c */
/* Allen Stenger January 1991 */
/******************************************************/
#include "CBrowserApp.h"
extern CApplication *gApplication;
void main()
{
gApplication = new(CBrowserApp);
((CBrowserApp *)gApplication)->IBrowserApp();
gApplication->Run();
gApplication->Exit();
}
/******************************************************/
/* CBrowserApp.h */
/* Class definition for Browser application */
/* Written in Think C version 4.0.2 */
/* Based on CStarterApp.h */
/* Allen Stenger January 1991 */
/******************************************************/
#define _H_CBrowserApp
#include <CApplication.h>
struct CBrowserApp : CApplication {
/* No instance variables */
/* methods - same function as CStarterApp */
/* except as noted */
void IBrowserApp(void);
void SetUpFileParameters(void);
void DoCommand(long theCommand);
/* DoCommand processes the "About " command */
/* CreateDocument is deleted, since we only open */
/* existing documents */
void OpenDocument(SFReply *macSFReply);
void StartUpAction(short numPreloads);
/* StartUpAction asks for the first file to open */
};
/******************************************************/
/* CBrowserApp.c */
/* Written in Think C version 4.0.2 */
/* Based on CStarterApp.c */
/* Allen Stenger January 1991 */
/******************************************************/
#include "CBartender.h"
#include "CBrowserApp.h"
#include "CBrowserDoc.h"
#include "Commands.h"
#include "Browser.h"
extern OSType gSignature;
extern CBartender *gBartender;
void CBrowserApp::IBrowserApp(void)
{
CApplication::IApplication(4, 20480L, 2048L);
}
void CBrowserApp::SetUpFileParameters(void)
{
inherited::SetUpFileParameters();
sfNumTypes = 1;
sfFileTypes[0] = 'TEXT';
gSignature = '????';
}
void CBrowserApp::DoCommand(long theCommand)
{
short itemHit; /* which item in dialog selected */
DialogPtr theDialogP; /* pointer to About dialog */
switch (theCommand) {
case cmdAbout:
theDialogP = GetNewDialog(dlogAbout, NULL, -1L);
ModalDialog(NULL, &itemHit);
DisposDialog(theDialogP);
break;
default: inherited::DoCommand(theCommand);
break;
}
}
void CBrowserApp::OpenDocument(SFReply *macSFReply)
{
CBrowserDoc *theDocument;
theDocument = new(CBrowserDoc);
theDocument->IBrowserDoc(this, FALSE);
/* we never print */
theDocument->OpenFile(macSFReply);
}
void CBrowserApp::StartUpAction(short numPreloads)
{
FlushEvents(everyEvent, 0);
DoCommand(cmdOpen);
}
/******************************************************/
/* CBrowserDoc.h */
/* Classes for Browser documents */
/* Written in Think C version 4.0.2 */
/* Based on CStarterDoc.h */
/* Allen Stenger January 1991 */
/******************************************************/
#define _H_CBrowserDoc
#include <CDocument.h>
#include <CApplication.h>
struct CBrowserDoc : CDocument {
/* instance variables */
/* **itsDataH and **itsLineStartsH are shared with */
/* CBrowserPane, which uses them to update the */
/* window contents and to control scrolling */
char **itsDataH;/* handle to itsFile's data */
short itsLineCt; /* number of lines in **itsDataH */
long **itsLineStartsH;
/* Handle to line starts table - defined by */
/* (*itsLineStartsH)[i] = */
/*offset to start of line i */
/* for i=0 to i=itsLineCt-1, and as the size of */
/* **itsDataH for i=itsLineCt (or 1 if no data) */
long itsFindOffset;
/* offset of last found instance of search string */
/* in **itsDataH, or -1 if none */
/* Methods - same function as CStarterDoc, except */
/* as noted */
void IBrowserDoc(CApplication *aSupervisor,
Boolean printable);
void Dispose(void);
void FindIt(long offset);
/* Internal function to find next occurrence of */
/* string. Search begins "offset" into the text. */
/* If found, the window is scrolled to put the */
/* found line at the top, and itsFindOffset is */
/* updated to the offset to the found substring. */
void DoCommand(long theCommand);
/* DoCommand processes the Search menu */
void UpdateMenus(void);
/* UpdateMenus processes the Search menu - Find */
/* is always enabled, and Find Again is enabled */
/* if a search string has been entered in Find */
void OpenFile(SFReply *macSFReply);
void BuildWindow(Handle theData);
};
/******************************************************/
/* CBrowserDoc.c */
/* Written in Think C version 4.0.2 */
/* Based on CStarterDoc.c */
/* Allen Stenger January 1991 */
/******************************************************/
#define DOTIMING 0
/* do timing measurements if true */
#include <Global.h>
#include <Commands.h>
#include <CApplication.h>
#include <CBartender.h>
#include <CDataFile.h>
#include <CDecorator.h>
#include <CDesktop.h>
#include <CError.h>
#include <CPanorama.h>
#include <CScrollPane.h>
#include "CBrowserDoc.h"
#include "CBrowserPane.h"
#include "Browser.h"
#include "Search.h"
#include "LineCount.h"
extern CApplication *gApplication;
extern CBartender *gBartender;
extern CDecorator *gDecorator;
extern CDesktop *gDesktop;
extern CBureaucrat *gGopher;
extern OSType gSignature;
extern CError *gError;
void CBrowserDoc::IBrowserDoc(CApplication *aSupervisor,
Boolean printable)
{
CDocument::IDocument(aSupervisor, printable);
itsDataH = NULL;
itsLineStartsH = NULL;
itsLineCt = 0;
itsFindOffset = -1;
}
void CBrowserDoc::Dispose()
{
if (itsDataH != NULL) DisposHandle(itsDataH);
if (itsLineStartsH != NULL)
DisposHandle(itsLineStartsH);
if (itsMainPane != NULL) itsMainPane->Dispose();
inherited::Dispose();
}
void CBrowserDoc::FindIt(long offset)
{
Point scrollToPoint;
/* point in panorama coordinates to scroll to */
int i;/* loop control */
#if DOTIMING
long startTime, endTime;
float elapsedTime;
/* time to search and scroll to right line (secs) */
long startLineTime, endLineTime;
float elapsedLineTime;
/* time to find right line given the offset (secs)*/
#endif
#if DOTIMING
startTime = TickCount();
#endif
if ( GoFind(itsDataH, &offset) ) {
#if DOTIMING
startLineTime = TickCount();
#endif
for (i=1; (*itsLineStartsH)[i]<=offset; i++)
;
#if DOTIMING
endLineTime = TickCount();
elapsedLineTime = (endLineTime - startLineTime)
/ 60.0;
#endif
scrollToPoint.v = i - 1;
scrollToPoint.h = 0;
((CBrowserPane *)itsMainPane)->
ScrollTo(scrollToPoint, TRUE);
itsFindOffset = offset;
}
else SysBeep(1); /* not found */
#if DOTIMING
endTime = TickCount();
elapsedTime = (endTime - startTime) / 60.0;
#endif
}
void CBrowserDoc::DoCommand(long theCommand)
{
Boolean aSearchStringGiven;
/* Find dialog got string */
switch (theCommand) {
case cmdFind:
aSearchStringGiven = GetSearchString();
itsWindow->Update();
/* update window to correct damage */
/* from Find dialog */
if (aSearchStringGiven) {
itsFindOffset = -1;
FindIt( itsFindOffset + 1 );
}
break;
case cmdFindAgain:
FindIt( itsFindOffset + 1 );
break;
default: inherited::DoCommand(theCommand);
break;
}
}
void CBrowserDoc::UpdateMenus()
{
inherited::UpdateMenus();
gBartender->EnableCmd(cmdFind);
if ( HaveSearchString() )
gBartender->EnableCmd(cmdFindAgain);
gBartender->DisableCmd(cmdNew);
gBartender->DisableCmd(cmdSaveAs);
}
void CBrowserDoc::OpenFile(SFReply *macSFReply)
{
CDataFile *theFile;
Handle theData;
Str63 theName;
OSErr theError;
short theLineCt; /* shadow variable */
long **theLineStartsH; /* shadow variable */
theFile = new(CDataFile);
theFile->IDataFile();
theFile->SFSpecify(macSFReply);
itsFile = theFile;
theError = theFile->Open(fsRdPerm);
if (!gError->CheckOSError(theError)) {
Dispose();
return;
}
gApplication->RequestMemory(FALSE, TRUE);
theFile->ReadAll(&theData);
/* Reset canFail to FALSE for default */
/* memory-error handling. */
gApplication->RequestMemory(FALSE, FALSE);
if (theData == NULL) {
gError->CheckOSError(MemError());
Dispose();
return;
}
itsDataH = theData;
/* save handle to data in instance variable */
DoLineCount( itsDataH, &theLineCt, &theLineStartsH );
itsLineCt = theLineCt;
itsLineStartsH = theLineStartsH;
BuildWindow(theData);
itsFile->GetName(theName);
itsWindow->SetTitle(theName);
itsWindow->Select();
theError = itsFile->Close();
itsFile = NULL;
}
void CBrowserDoc::BuildWindow (Handle theData)
{
CScrollPane *theScrollPane;
CBrowserPane *theMainPane;
itsWindow = new(CWindow);
itsWindow->IWindow(WINDBrowser, FALSE, gDesktop, this);
theScrollPane = new(CScrollPane);
theScrollPane->IScrollPane(itsWindow, this,
10, 10, 0, 0,
sizELASTIC, sizELASTIC,
TRUE, TRUE, TRUE);
theScrollPane->FitToEnclFrame(TRUE, TRUE);
theMainPane = new(CBrowserPane);
theMainPane->IBrowserPane(theScrollPane, this,
0, 0, 0, 0,
sizELASTIC, sizELASTIC,
itsDataH, itsLineCt, itsLineStartsH);
itsMainPane = theMainPane;
itsGopher = theMainPane;
theMainPane->FitToEnclosure(TRUE, TRUE);
theScrollPane->InstallPanorama(theMainPane);
gDecorator->PlaceNewWindow(itsWindow);
itsWindow->Zoom(inZoomOut); /* open full-screen */
}
/******************************************************/
/* CBrowserPane.h */
/* Classes for Browser panes */
/* Written in Think C version 4.0.2 */
/* Based on CStarterPane.h */
/* Allen Stenger January 1991 */
/******************************************************/
#define _H_CBrowserPane
#include <CPanorama.h>
struct CBrowserPane : CPanorama {
/* instance variables */
char **itsDataH; /* handle to text - */
/*passed from and owned by CBrowserDoc instance */
short itsLineCt;
/* number of lines in **itsDataH */
short itsLineHeight;
/* height of a line in this window (pixels) */
long **itsLineStartsH;
/* line starts table - passed from and owned by */
/* CBrowserDoc instance - see CBrowserDoc for */
/* definition */
/* Methods - same function as CStarterPane except */
/* as noted */
void IBrowserPane(CView *anEnclosure,
CBureaucrat *aSupervisor,
short aWidth, short aHeight,
short aHEncl, short aVEncl,
SizingOption aHSizing,
SizingOption aVSizing,
/* added parameters follow */
char **theDataH, /* data of file */
short theLineCt,/* line count of file */
long **theLineStartsH);
/* line start table of file */
void Draw(Rect *area);
};
/******************************************************/
/* CBrowserPane.c*/
/* Written in Think C version 4.0.2 */
/* Based on CStarterPane.c */
/* Allen Stenger January 1991 */
/******************************************************/
#include "CBrowserPane.h"
#define FONTNUMBER 4 /* Monaco */
#define FONTSIZE 9
void CBrowserPane::IBrowserPane(CView *anEnclosure,
CBureaucrat *aSupervisor,
short aWidth, short aHeight,
short aHEncl, short aVEncl,
SizingOption aHSizing,
SizingOption aVSizing,
/* added parameters */
char **theDataH,
short theLineCt,
long **theLineStartsH)
{
FontInfo theFontInfo; /* from GetFontInfo */
Rect panoRect;/* panorama rectangle */
short thisLineLength, /* length of current and */
maxLineLength; /* longest line in */
/* **itsDataH */
long i;/* loop control */
itsDataH = theDataH;
itsLineCt = theLineCt;
itsLineStartsH = theLineStartsH;
CPanorama::IPanorama(anEnclosure, aSupervisor,
aWidth, aHeight,
aHEncl, aVEncl, aHSizing, aVSizing);
Prepare();
TextFont(FONTNUMBER);
TextSize(FONTSIZE);
GetFontInfo( &theFontInfo );
itsLineHeight = theFontInfo.ascent +
theFontInfo.descent +
theFontInfo.leading;
/* get maximum line length - needed for horizontal */
/* scrolling control */
maxLineLength = 1;
for (i=0; i<itsLineCt; i++) {
thisLineLength = (*theLineStartsH)[i+1] -
(*theLineStartsH)[i];
if (thisLineLength > maxLineLength)
maxLineLength = thisLineLength;
}
/* set up extents of scrolling controls (panorama) */
SetRect(&panoRect,0,0,maxLineLength,itsLineCt);
SetScales(theFontInfo.widMax,itsLineHeight);
SetBounds(&panoRect);
}
void CBrowserPane::Draw(Rect *area)
{
short firstLine,lastLine;
short i;/* loop control */
/* Get first and last line numbers to draw. */
/* We will draw an extra line at the top so that */
/* if we scroll upward the descenders will be */
/* available for scrolling. */
firstLine = (*area).top / itsLineHeight;
if (firstLine < 1) firstLine = 1;
lastLine = 1 + (*area).bottom / itsLineHeight;
if (lastLine > itsLineCt) lastLine = itsLineCt;
for (i=firstLine; i<=lastLine; i++) {
MoveTo(0,i*itsLineHeight);
DrawText(*itsDataH+(*itsLineStartsH)[i-1],0,
(*itsLineStartsH)[i]-
(*itsLineStartsH)[i-1]);
/* note that we do not expand tabs - this could */
/* be added if desired */
}
}
/******************************************************/
/* LineCount.h */
/* Not a class - utility that counts lines in the */
/* file and builds the line starts table. Separated */
/* from CBrowserDoc for ease of testing and tuning. */
/* See CBrowserDoc for definition of line starts */
/* table. */
/* Written in Think C version 4.0.2 */
/* Allen Stenger January 1991 */
/******************************************************/
#define _H_LineCount
/* count lines and build line starts table */
void DoLineCount(
Handle textHandle,/* handle to text */
short *theLineCtP, /* output - line count */
long ***theLineStartsHP ); /* output - line */
/* starts table */
/******************************************************/
/* LineCount.c */
/* Written in Think C version 4.0.2 */
/* Allen Stenger January 1991 */
/******************************************************/
#include "LineCount.h"
#define DOTIMING 0 /* set to 1 if timing */
/* measurements desired */
void DoLineCount( Handle textHandle,
short *theLineCtP,
long ***theLineStartsHP )
{
register
char *dp = *textHandle;
/* character pointer in loop */
long offset = 0;/* = dp - *textHandle */
long dataSize;/* size of **textHandle */
long theLineStartsSize;
/* allocated number of */
/* entries in theLineStarts */
long **localLineStartsH; /* shadow variable */
short localLineCt;/* shadow variable */
char saveChar;/* saves last char of file */
#if DOTIMING
long startTime,
endTime;
float elapsedTime;/* time to build table (sec)*/
#endif
#if DOTIMING
startTime = TickCount();
#endif
theLineStartsSize = 2;
localLineStartsH =
(long **) NewHandle(theLineStartsSize *
sizeof(**localLineStartsH));
(*localLineStartsH)[0] = 0;
dataSize = GetHandleSize(textHandle);
if (dataSize == 0) {/* special case - file is empty */
localLineCt = 1;
(*localLineStartsH)[1] = 0; /* one line of length zero */
}
else { /* normal case - file not empty */
/* This search uses a "sentinel"; the last */
/* character of the file is replaced with \r, so */
/* that the check for end-of-file can be moved out*/
/* of the inner loop and into the outer linecount */
/* loop. This speeds up the search. */
localLineCt = 0;
saveChar = *(*textHandle+dataSize-1);
*(*textHandle+dataSize-1) = '\r'; /* sentinel */
while (offset != dataSize) {
while (*dp++ != '\r')
; /* skip to next \r */
localLineCt++;
offset = dp - *textHandle;
if (localLineCt >= theLineStartsSize) {
/* add more space to localLineStartsH */
theLineStartsSize += 1000;
SetHandleSize( localLineStartsH,
theLineStartsSize *
sizeof(**localLineStartsH) );
dp = offset + *textHandle;
}
(*localLineStartsH)[localLineCt] = offset;
}
*(*textHandle+dataSize-1) = saveChar; /* restore */
SetHandleSize( localLineStartsH,
(localLineCt+1) * sizeof(**localLineStartsH) );
/* shrink to needed size */
}
/* return variables */
*theLineCtP = localLineCt;
*theLineStartsHP = localLineStartsH;
#if DOTIMING
endTime = TickCount();
elapsedTime = (endTime - startTime) / 60.0;
#endif
}
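The sentinel technique in DoLineCount does not depend on the Memory Manager. As a rough illustration, here is a minimal sketch in standard C that builds the same kind of line starts table with malloc and realloc. The name build_line_starts, the doubling growth policy, and the requirement that the buffer be non-empty and writable are assumptions of this sketch, not part of the Browser.

```c
#include <stdlib.h>

/* Hypothetical portable analogue of DoLineCount. Builds a table of    */
/* line-start offsets for a buffer whose lines end in '\r'. Returns a  */
/* malloc'd table of (*lineCt + 1) entries, or NULL if allocation      */
/* fails. Assumes len > 0 and that text is writable.                   */
long *build_line_starts(char *text, long len, long *lineCt)
{
    long cap = 16;                      /* allocated entries */
    long n = 0;                         /* lines found so far */
    long *starts = malloc((size_t) cap * sizeof *starts);
    long *grown;
    char *p = text;
    char saved;

    if (starts == NULL) return NULL;
    starts[0] = 0;

    saved = text[len - 1];              /* plant the sentinel */
    text[len - 1] = '\r';

    while (p - text != len) {
        while (*p++ != '\r')
            ;                           /* no end-of-buffer test here */
        n++;
        if (n >= cap) {                 /* grow the table */
            grown = realloc(starts, (size_t) (cap * 2) * sizeof *starts);
            if (grown == NULL) {
                text[len - 1] = saved;
                free(starts);
                return NULL;
            }
            starts = grown;
            cap *= 2;
        }
        starts[n] = p - text;           /* start of the next line */
    }

    text[len - 1] = saved;              /* restore the last character */
    *lineCt = n;
    return starts;
}
```

For the 8-byte buffer "ab\rcde\rf" this yields three lines with the table 0, 3, 7, 8, and the final character is restored to 'f' on return.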
/******************************************************/
/* Search.h */
/* Not a class - utility routines for string */
/* searching. Separated from CBrowserDoc for ease */
/* of testing and tuning. */
/* Written in Think C version 4.0.2 */
/* Allen Stenger January 1991 */
/******************************************************/
#define _H_Search
/* Find next occurrence of search string */
Boolean GoFind(
Handle theTextH, /* Handle to text */
long *offsetP); /* offset into **theTextH to */
/* start search, also returned */
/* as the offset where string */
/* is found */
/* returns TRUE if found, FALSE if not found */
/* Get search string from user and store. Entering a */
/* null string resets the saved search string. */
Boolean GetSearchString( void );
/* returns TRUE if user supplies non-empty string, */
/* FALSE if user cancels or supplies empty string */
Boolean HaveSearchString( void );
/* returns TRUE if non-empty search string has been */
/* entered */
/******************************************************/
/* Search.c */
/* Uses the Boyer-Moore search algorithm. Reference: */
/* "A Fast String Searching Algorithm," Robert S. */
/* Boyer and J Strother Moore, CACM v. 20 no. 10 */
/* (October 1977), pp. 762-772. */
/* Their original algorithm is modified here to do */
/* case-insensitive searches. */
/* Written in Think C version 4.0.2 */
/* Allen Stenger January 1991 */
/******************************************************/
#define DOCOUNT 0
/* count number of passes in loops */
#define DOTIMING 0 /* measure time of searches */
#include <string.h>
#include <ctype.h>
#include "Browser.h"
#include "Search.h"
#define LARGE 100000000L
/* LARGE is picked to be larger than any possible */
/* file size. It is a flag in delta0 indicating */
/* that the index character is in the pattern. */
/* These are the Boyer-Moore tables, which tell how */
/* far the pattern may be shifted for the next trial */
/* comparison. delta0 is indexed by unsigned char, */
/* and delta2 is indexed by position in the pattern. */
static long delta0[256];
static short delta2[256];
char searchString[256] = ""; /* starts out empty */
/* saves the search string */
/******************************************************/
/* internal functions */
/******************************************************/
/* prototypes for internal functions */
/* Calculate the Boyer-Moore delta0 and delta2 tables.*/
/* Tables are calculated assuming upper and lower case*/
/* are the same. The search pattern is the global */
/* variable searchString, and the tables are stored */
/* in the global variables delta0 and delta2. */
static void GetDelta0( void );
static void GetDelta2( void );
/* Boyer-Moore search method - returns TRUE if string */
/* found, FALSE otherwise. */
static Boolean /* returns whether found */
BMSearch( char *string, /* target string */
long stringlen,/* length of *string */
long *offsetP); /* output - where found */
/* end prototypes */
static void GetDelta0( void )
{
short i; /* loop control */
char *pat = searchString;/* local copy */
long patlen = strlen(pat); /* local constant */
for (i=0; i<256; i++) delta0[i] = patlen;
for (i=0; i<patlen; i++) {
delta0[ (unsigned char) tolower(searchString[i]) ] =
patlen - 1 - i;
delta0[ (unsigned char) toupper(searchString[i]) ] =
patlen - 1 - i;
}
delta0[ (unsigned char)
tolower(searchString[patlen-1]) ] = LARGE;
delta0[ (unsigned char)
toupper(searchString[patlen-1]) ] = LARGE;
}
static void GetDelta2( void )
{
#define SameCharAtPos(a, b) ( \
(tolower(pat[a]) == tolower(pat[b])) \
||(toupper(pat[a]) == toupper(pat[b])) )
short i;/* index in trial string for rpr */
short j;/* rpr(j) now being calculated */
short k;/* trial for rpr */
short currentRPR;/* latest good rpr(j) */
char *pat = searchString;/* local copy */
short patlen = strlen(pat); /* local constant */
Boolean unifies; /* TRUE if pat[j+1]..pat[patlen-1]*/
/* unifies with pat[k]..pat[k+patlen-j-2] */
for (j=0; j<=patlen-1; j++) {
/* Calculate rpr(j). NOTE: our rpr is one less */
/* than the Boyer-Moore rpr, because our arrays */
/* start at 0 and theirs at 1. The delta2 values */
/* are the same, although indexed beginning at 0. */
currentRPR = j + 1 - patlen;
for (k=j+2-patlen; k<=j; k++) {
/* check for reoccurrence at pat[k] */
unifies = TRUE;
for (i=0; i<=patlen-2-j; i++) {
if ( (k+i>=0) && !SameCharAtPos(j+1+i,k+i))
unifies = FALSE;
}
if ( unifies &&
( (k<1) || !SameCharAtPos(k-1,j) ) )
currentRPR = k; /* found rightmore k */
}
delta2[j] = patlen - currentRPR;
}
}
static Boolean
BMSearch(register char *string,
register long stringlen,
long *offsetP)
{
register long
i;/* index into string of current pointer */
register short
j;/* index into pat for char-by-char */
char *pat = searchString;/* local copy */
short patlen = strlen(pat); /* local constant */
register long
d0jump,
d2jump; /* sizes of jumps indicated by */
/* delta0, delta2 */
#if DOCOUNT
/* number of times through loops */
long whileTRUECount = 0; /* while(TRUE) loop */
long whileICount = 0; /* while(i) loop */
long whileJCount = 0; /* while(j) loop */
#endif
i = patlen - 1;
if (i >= stringlen) return( FALSE );
while (TRUE) {
#if DOCOUNT
whileTRUECount++;
#endif
/* inner loop of Boyer-Moore algorithm */
while( (i += delta0[ (unsigned long)
(unsigned char) string[i] ])
< stringlen ) {
#if DOCOUNT
whileICount++;
#endif
};
if (i<LARGE) return( FALSE );
i -= (LARGE + 1);
j = patlen - 2;
/* character-by-character comparison */
while ( (j>=0) &&
(tolower(string[i]) == tolower(pat[j])) ) {
#if DOCOUNT
whileJCount++;
#endif
--j;
--i;
}
if (j < 0) {
/* success - whole pattern matched */
*offsetP = i + 1;
return( TRUE );
}
/* failure - only part of pattern matched - get */
/* shifts indicated by delta0 (single-character */
/* mismatch) and by delta2 (next plausible */
/* reoccurrence) and take the larger. */
d0jump = delta0[ (unsigned long)
(unsigned char) string[i] ];
if (d0jump == LARGE) d0jump = 0;
d2jump = delta2[j];
i += (d0jump > d2jump) ? d0jump : d2jump;
if (i >= stringlen) return( FALSE );
/* shifted past end of text - do not read string[i] */
}
}
/******************************************************/
/* External functions */
/******************************************************/
Boolean GoFind( Handle theTextH, long *offsetP)
{
long newOffset; /* where match found */
Boolean matchFound;/* whether match found */
#if DOTIMING
long startTime, endTime;
float elapsedTime; /* time for GoFind (seconds) */
#endif
#if DOTIMING
startTime = TickCount();
#endif
/* note - BMSearch always starts from the beginning */
/* of the string, so we have to offset the text and */
/* add relative offsets */
matchFound = BMSearch(
*theTextH + *offsetP,
GetHandleSize(theTextH) - *offsetP,
&newOffset );
if (matchFound) *offsetP += newOffset;
#if DOTIMING
endTime = TickCount();
elapsedTime = (endTime - startTime) / 60.0;
#endif
return (matchFound);
}
Boolean GetSearchString( void )
{
short itemHit; /* which item in dialog selected */
DialogPtr theDialogP; /* pointer to modal dialog */
short itemType; /* for GetDItem */
Handle itemHandle; /* for GetDItem */
Rect box;/* for GetDItem */
theDialogP = GetNewDialog(dlogSearch, NULL, -1L);
CtoPstr(searchString);
GetDItem(theDialogP, dlogText, &itemType,
&itemHandle, &box);
SetIText(itemHandle, searchString);
SelIText(theDialogP, dlogText, 0, 32767);
itemHit = dlogText;
while ( !( (itemHit == dlogOK) ||
(itemHit == dlogCancel)) )
ModalDialog(NULL, &itemHit);
if (itemHit == dlogOK) {
GetDItem(theDialogP, dlogText, &itemType,
&itemHandle, &box);
GetIText(itemHandle, searchString);
}
PtoCstr(searchString);
/* convert back to C whether entered or not */
DisposDialog(theDialogP);
if (strlen(searchString) != 0) {
GetDelta0();
GetDelta2();
}
return( (itemHit == dlogOK) &&
(strlen(searchString) != 0) );
}
Boolean HaveSearchString( void )
{
return (strlen(searchString) != 0);
}
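For readers who want to experiment with the case-insensitive shift table outside Think C, the following is a minimal sketch in standard C. It is deliberately simpler than the listing above: it uses only a single-character shift table (Horspool's simplification of Boyer-Moore), with no delta2 table and no LARGE flag. The name find_nocase and its parameters are hypothetical. The from parameter plays the role of *offsetP in GoFind.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: case-insensitive search using only a           */
/* single-character shift table (Horspool's simplification). Returns   */
/* the offset of the first match at or after `from`, or -1 if none.    */
long find_nocase(const char *text, long textlen,
                 const char *pat, long from)
{
    long shift[256];
    long patlen = (long) strlen(pat);
    long i, j;
    unsigned char c;

    if (patlen == 0 || from < 0 || textlen - from < patlen)
        return -1;

    /* Characters not in the pattern allow a full-length shift. */
    for (i = 0; i < 256; i++) shift[i] = patlen;

    /* Both cases of each pattern character (except the last   */
    /* position) get the distance to the end of the pattern.   */
    for (i = 0; i < patlen - 1; i++) {
        c = (unsigned char) pat[i];
        shift[tolower(c)] = patlen - 1 - i;
        shift[toupper(c)] = patlen - 1 - i;
    }

    i = from + patlen - 1;  /* text index under last pattern char */
    while (i < textlen) {
        /* compare right to left, ignoring case */
        for (j = 0; j < patlen; j++) {
            if (tolower((unsigned char) text[i - j]) !=
                tolower((unsigned char) pat[patlen - 1 - j]))
                break;
        }
        if (j == patlen) return i - patlen + 1;
        i += shift[(unsigned char) text[i]];
    }
    return -1;
}
```

Calling it with from = 0 corresponds to Find, and calling it again with from set to one past the previous result corresponds to Find Again. Dropping delta2 keeps the sketch short, though it gives smaller shifts on repetitive patterns.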
*******************************************************
*Browser.Π.r *
*Browser resource definitions *
*To create the resource file, make a copy of *
*Starter.Π.rsrc and rename it Browser.Π.rsrc. Then *
*run this file through RMaker to add Browser's *
*unique resources. *
*Written in RMaker version 2.2 *
*Allen Stenger January 1991 *
*******************************************************
/QUIT
!Browser.Π.rsrc
TYPE DLOG
Search,1000 ;; ID = dlogSearch in Browser.h
Search
200 40 320 280
Visible NoGoAway
0
0
1000 ;; DITL ID = dlogSearch in Browser.h
TYPE DLOG
About,1001;; ID = dlogAbout in Browser.h
About Browser
40 40 240 400
Visible NoGoAway
0
0
1001 ;; DITL ID = dlogAbout in Browser.h
TYPE DITL
Search,1000 ;; ID = dlogSearch in Browser.h
4
button ;; item number = dlogOK in Browser.h
80 32 100 92
OK
button ;; item number = dlogCancel in Browser.h
80 150 100 210
Cancel
editText;; item number = dlogText in Browser.h
45 12 67 230
staticText;; (not referenced in program)
9 13 30 228
Search for what string?
TYPE DITL
About,1001;; ID = dlogAbout in Browser.h
2
button ;; (not referenced in program)
176 8 196 68
OK
staticText
2 2 168 360
Fast Text Browser\0D\0D++
Written by Allen Stenger, January 1991.\0D++
Written in Think C and Think Class Library.\0D++
Portions copyright (c) by Symantec Corporation.
* This MBAR overrides the one from Starter, to enable
* the Search menu.
TYPE MBAR = GNRL
,1
.I
4
1
2
3
20 ;; Search menu, ID = MENUsearch in Browser.h
* This MENU overrides the Apple menu from Starter, to
* change the "About" name to Browser.
TYPE MENU
Apple,1 (4)
\14
About Browser #256 ;; cmd = cmdAbout from Commands.h
(-
TYPE MENU
Search,20 (4) ;; ID = MENUsearch in Browser.h
Search
(Find #2000/F ;; cmd = cmdFind in Browser.h
(Find Again#2001/A ;; cmd = cmdFindAgain in Browser.h