High-Performance ACGIs in C
Ken Urquhart
Asynchronous Common Gateway Interface (ACGI) programs allow Macintosh HTTP servers to do
external processing tasks ranging from custom HTML forms processing to
controlling hardware devices. ACGIs are usually written in AppleScript (which
limits them to handling only one server request at a time). High-performance
ACGIs, ones that are capable of handling multiple simultaneous requests, need
to be written in a high-level language like C. The resulting ACGI will work
with any HTTP server that supports the WebSTAR WWW Apple event suite.
Now that you've got your HTTP server up and running on your Macintosh, people
are flocking to your Web site by the thousands. The only problem is that you've
written all of your Asynchronous Common Gateway Interface programs (ACGIs) in
AppleScript and their performance is leaving much to be desired. You know you
should be writing your ACGIs in C for speed, but you think that will be a lot
of work.
Well, have I got news for you! A full-blown, multithreaded, high-performance
ACGI program for use with Macintosh HTTP servers is easier to write than you
think. If you've worked through one of the introductory Macintosh programming
books, you already know just about everything you need to.
When all is said and done, an ACGI is little more than a simple, Apple
event-aware application that knows how to process Apple events in threads. Most
of the work is concentrated in decoding the Apple event parameters that make up
each server request. Hopefully you won't feel so overwhelmed by ACGIs written
in C (or any other high-level language) after you've read this article, and you
can get on with using them to hot-rod your Web site!
I've made writing an ACGI easier for you by providing a generic ACGI program,
which accompanies this article on this issue's CD and develop's Web site. I
designed the program (which I'll be referring to as an ACGI "shell") in such a
way that you can create your own ACGIs just by customizing a handful of
routines. The messy details of accepting multiple requests from an HTTP server,
and then handling each request in its own thread of execution, are taken care
of for you. The program even relieves you of the burden of URL-decoding the
post and search arguments (including breaking up all of the name=value pairs
and translating them from the ISO 8859-1 (Latin-1) character encoding used by most
browsers into the standard Macintosh Roman encoding).
I've also provided a rich set of convenience routines that perform the
following tasks:
- give you easy access to all the arguments and parameters that make up a
server request
- help you compose your HTML replies
- get and set various ACGI performance-tuning parameters
- allow you to gently turn away new requests when your ACGI is very busy
- gracefully shut down the ACGI if the need arises
I've tried to provide enough support to make it possible for you to forget
most of the details of interacting with an HTTP server and concentrate on
writing the code needed to implement your custom form processing.
The ACGI shell program, compiled under CodeWarrior as a PowerPC application
with no optimizations, takes up a little under 42K on disk (not including
custom code that you must add to process your requests). Memory requirements
are dictated by the number of concurrent requests you want to handle and how
much stack space you allocate to each running thread. In a typical case, the
shell should provide uniform response to about five to ten concurrent requests
in a 1 MB memory footprint.
WHAT'S AN ACGI?
Before I can tell you what an ACGI is, I need to explain what a CGI is. This
requires a bit of background on what HTTP servers are all about.
WHAT'S A CGI?
HTTP servers are designed to do one thing and to do it very well: respond to
requests from Web browsers. If the request is for a file that resides somewhere
in the server's directory tree, the server locates the file, reads its
contents, and then sends the information back to the browser. Other requests
such as image map or form processing are handed off to auxiliary programs that
communicate with the server by using the Common Gateway Interface (CGI)
protocol. When the server receives a request that must be handled by a CGI
program, the server starts up the CGI (if it wasn't already running) and passes
it the request. The CGI is responsible for parsing and decoding the request
parameters, processing them, and then composing the HTML response. The server
takes care of returning the response to the requesting browser.
Being a computer program, a CGI can readily interact with databases,
transaction processing systems, or even connected serial devices to process a
given request. So CGIs allow your Web site to serve up a wide variety of
dynamic information.
The structure of a CGI program is dictated by the HTTP server and by the
operating system. The first Macintosh HTTP server was MacHTTP, written by Chuck
Shotton. He used Apple events for server/CGI communication and defined a
special event suite (WWW) for this purpose. He later extended this
suite, adding several more parameters, when he wrote WebSTAR -- the commercial
version of MacHTTP. His suite has become the de facto standard for server/CGI
interaction on the Macintosh. As such, you can be sure that most other
Macintosh HTTP servers will support it.
WebSTAR-like servers use custom Apple events to communicate with CGIs and can
call them either synchronously or asynchronously.
- Synchronous calls require the server to suspend processing while it waits
for the Apple event reply from the CGI.
- Asynchronous calls allow the server to send the request to the CGI and
then continue processing other connections while the CGI does its
work.
Asynchronous calls are almost always preferable for a popular Web
site that's receiving several connection requests a second.
SO NOW WILL YOU TELL ME WHAT AN ACGI IS?
An ACGI is a CGI that's called asynchronously by the HTTP server (you're
surprised to hear this?). Furthermore, when an ACGI is written to handle each
request in a separate thread of execution (enabling it to deal with multiple
requests simultaneously), it's referred to as a threaded ACGI.
To write a threaded ACGI for the Macintosh, you need to understand the
following:
- how Web browsers send CGI requests to HTTP servers
- how a Macintosh HTTP server uses the WWW Apple event suite to
pass these requests along to an ACGI
- how an ACGI can arrange to process each Apple event in a separate
thread of execution
- how to extract the URL-encoded data from the Apple events so that the
ACGI can process it
While it would be just about impossible to describe each of these points in
detail in one short article, I do provide brief overviews as I talk about the
functions of the ACGI shell.
For more information on writing a threaded ACGI, refer to the book Planning and
Managing Web Sites on the Macintosh: The Complete Guide to WebSTAR and MacHTTP,
which covers this topic in detail and is a good general reference. Chapters 10
through 15 provide a wealth of information, especially Chapter 13, "Writing CGI
Applications," and Chapter 15, "Developing CGIs in C."
Like other threaded ACGI solutions (described in "Other Techniques for
Developing a Threaded ACGI"), my technique uses cooperative threads as opposed
to preemptive threads. This allows you to call any Toolbox routine you want
when you're carrying out your form processing. Preemptive threads currently
have many Toolbox calling restrictions (see the article "Concurrent Programming
With the Thread Manager" in develop Issue 17).
OTHER TECHNIQUES FOR DEVELOPING A THREADED ACGI
Processing Apple events in threads has been dealt with by several authors, and
there are a variety of solutions available.
The first solution was presented by Steve Sisak in late 1994 in his MacTech
Magazine article "Adding Threads to Sprocket." His AEThreads library allows you
to choose which Apple events to process in threads and gives you complete
control over all thread creation parameters.
A second, rather different approach can be found in the source code for the
Mail Tools ACGI written by John Norstad (available at
http://charlotte.acns.nwu.edu/mailtools/techinfo.html).
Greg Anderson, in his article "Futures: Don't Wait Forever" in develop Issue
22, presented a third solution involving a predispatch Apple event handler that
transparently threaded all Apple events.
John O'Fallon described a fourth method in his MacTech article "Writing CGI
Applications in C." In 1996, Grant Neufeld came up with a fifth solution in
conjunction with his CGI framework in his MacTech article "Threading Apple
Events."
Not wishing to break with this long tradition, the program described in this
article presents yet a sixth variation on the theme.
THE STRUCTURE OF THE ACGI SHELL
Just as there are many ways of writing a Macintosh application, there are many
ways to write an ACGI shell. I've taken the simplest possible approach and
avoided using an application framework like MacApp or PowerPlant. My ACGI shell
is written in plain C and consists of three logically separate code sections:
- a main program that receives Apple event requests from an HTTP server and
processes them in separate threads of execution
- the set of customizable request-processing routines
- a set of convenience routines that simplify accessing the request data,
composing HTML response pages, and controlling the runtime behavior of the
ACGI
The code is split into two source files (acgi.c and www.c), two
include files (acgi.h and www.h), and one resource file (acgi.rsrc). The main
application and the convenience routines are located in acgi.c, while the
routines that you'll need to customize are in www.c. The include file acgi.h
contains the public prototypes for the convenience functions you can call from
www.c, while the include file www.h contains the function prototypes and data
structure definitions used by routines in both source files.
THE ROUTINES YOU NEED TO CUSTOMIZE
The file www.c contains six routines that you'll need to customize to implement
your own custom form processing. Four routines are called exactly once by the
main program while the ACGI is running. A fifth routine is called at idle time
in the main event loop, while the last one is called to process each HTTP
request.
WWWGETLOGNAME
When the ACGI starts up, one of the first things the main program does is open
a log file to write progress messages to. It gets the name of the file by
calling this routine:
char *WWWGetLogName(void);
Customizing WWWGetLogName allows you to specify the name of the log file. All you typically
need to do is write something like this:
char *WWWGetLogName(void)
{
    return "acgi.log";
}
The one gotcha here is that I've used ANSI file I/O routines to simplify the
program code. So you must always be sure to return a valid ANSI filename (a
plain filename no more than 31 characters long with no full or partial Macintosh
file path prepended to it). Note that some Macintosh ANSI libraries will allow
filenames prefixed by partial paths as long as the total length of the string
is no longer than 255 characters.
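A quick sanity check on the name you return can save a confusing startup failure. The sketch below is not part of the ACGI shell; IsPlainLogName and kMaxPlainNameLen are hypothetical names. It assumes the usual Macintosh limit of 31 characters for a filename and treats any colon as evidence of a full or partial path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define kMaxPlainNameLen 31   /* assumed Macintosh filename limit */

/* Hypothetical helper: returns 1 if name is a plain filename
   (1 to 31 characters, no ':' path separator), 0 otherwise. */
static int IsPlainLogName(const char *name)
{
    size_t len;

    assert(name != NULL);     /* the caller must pass a real string */
    len = strlen(name);
    if (len == 0 || len > kMaxPlainNameLen)
        return 0;
    /* A colon separates path components on the Macintosh, so its
       presence means a full or partial path was supplied. */
    return strchr(name, ':') == NULL;
}
```

You could call a check like this on the string returned by your WWWGetLogName while debugging and remove it once the name is settled.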
WWWGETHTMLPAGES
After the log file is opened, the main program will ask you to build four HTML
error pages that are returned to the HTTP server when one of these general
errors occurs:
- The ACGI is declining (refusing) to process requests.
- The ACGI is too busy to handle a new request.
- The ACGI has run out of memory.
- The ACGI has run into an unexpected problem.
The routine you use to construct your pages is as follows:
void WWWGetHTMLPages(Handle refused, Handle tooBusy,
Handle noMemory, Handle unexpectedError);
The main program passes in four handles. Each handle contains a standard HTTP
response header, and you're responsible for appending whatever HTML text you
want for the error pages. This allows you to control the "look and feel" of the
error messages returned by your ACGI. Perhaps the simplest approach here is to
put the HTML error pages into text files located in the same directory as your
ACGI and then append them to the handles with the convenience routine
HTMLAppendFile:
void WWWGetHTMLPages(Handle refused, Handle tooBusy,
        Handle noMemory, Handle unexpectedError)
{
    HTMLAppendFile(refused, "acgiRefused.html");
    HTMLAppendFile(tooBusy, "acgiTooBusy.html");
    HTMLAppendFile(noMemory, "acgiNoMemory.html");
    HTMLAppendFile(unexpectedError, "acgiUnexpected.html");
}
Other convenience routines allow you to read the text from string and text resources,
so you have some flexibility here. The idea behind WWWGetHTMLPages is to allow
you to create your HTML error pages early in the initialization phase so that
they'll always be available for use.
WWWINIT
After the main program has completed its initialization steps, you're given a
chance to carry out any private initialization you need to do before beginning
form processing. This might include calling the ACGI runtime-tuning routines,
initializing your own global variables, reading resources into memory, building
HTML template pages, or opening connections to external databases and other
computers. The prototype is
OSErr WWWInit(void);
If you run into problems during your initialization, simply return a nonzero code.
The main program checks the return code and immediately quits to the Finder
when the code is nonzero.
If you have no special initialization to do, you could write this routine as
follows:
OSErr WWWInit(void)
{
    return (noErr);
}
WWWQUIT
When the main program exits its main event loop, it calls this next routine to
give you one last chance to clean up after yourself (close files, database
connections, and so on):
void WWWQuit(void);
If you don't need to do any cleaning up, you can write something as simple as this:
void WWWQuit(void) { }
WWWPERIODICTASK
The main program allows you to carry out idle-time processing by calling the
following routine at the end of each pass through the main event loop:
OSErr WWWPeriodicTask(void);
This is where you'd place code to check that connections to other computers are
still alive or carry out any background processing initiated by previous server
requests. If you have no idle-time processing, you could write the following:
OSErr WWWPeriodicTask(void)
{
    return (noErr);
}
The main program checks the return code from this routine and, if the code is
nonzero, quits to the Finder (after trying to gracefully abort all currently
running threads).
WWWPROCESS
The last routine you must customize is the one that processes a server request:
OSErr WWWProcess(WWWRequest request);
When the HTTP server sends the ACGI a request through an Apple event, the main
program creates a new thread and passes the Apple event data into the thread.
The thread extracts the request data from the Apple event and packs it into a
private data structure. The thread then calls WWWProcess, passing a pointer to
the private data structure in the request parameter. You extract information
from the data structure with the convenience routines (described later).
If you need to abort the processing of a request, you can return one of the
four error codes errWWWRefused, errWWWTooBusy, errWWWNoMemory, and
errWWWUnexpected. These cause the corresponding HTML error pages that you built
in the routine WWWGetHTMLPages to be returned to the server.
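The relationship between the four error codes and the four pages can be pictured as a simple lookup. The sketch below is only an illustration and is not taken from the shell. The numeric values given to the error codes are placeholders (the real values come from the shell's include files), OSErr is a stand-in typedef, PageForError is a hypothetical name, and the filenames are the ones used in the WWWGetHTMLPages example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* strcmp, for callers that compare page names */

typedef short OSErr;  /* stand-in for the Mac OS type */
enum { noErr = 0 };

/* Placeholder values only: the real codes are defined by the shell. */
enum {
    errWWWRefused    = 1001,
    errWWWTooBusy    = 1002,
    errWWWNoMemory   = 1003,
    errWWWUnexpected = 1004
};

/* Hypothetical lookup: which error page corresponds to a code
   returned by WWWProcess. Returns NULL for codes this sketch does
   not know about. */
static const char *PageForError(OSErr err)
{
    assert(err != noErr);   /* only ask for a page after a failure */
    switch (err) {
    case errWWWRefused:    return "acgiRefused.html";
    case errWWWTooBusy:    return "acgiTooBusy.html";
    case errWWWNoMemory:   return "acgiNoMemory.html";
    case errWWWUnexpected: return "acgiUnexpected.html";
    default:               return NULL;
    }
}
```

In your own WWWProcess you never do this lookup yourself; you just return the appropriate code and the main program sends back the matching page.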
THE MAIN PROGRAM
As mentioned previously, the main program is a simple Macintosh application --
simpler than most of the programs described in introductory Macintosh
programming books. It's important to remember that an ACGI is meant to interact
with HTTP servers, not live users. It doesn't need any windows, complex menus,
or even an About box. Its purpose in life is to respond to Apple events and not
mouse clicks or keystrokes.
Furthermore, you cannot assume that a human will always be watching the server
screen, ready to react to dialog boxes or alerts. If an ACGI runs into trouble,
it should try to recover as best it can and keep going. For example, if a
required external database shuts down, an ACGI might return an "out of service"
response to each request until the database comes back online. If an ACGI runs
out of memory, it might simply quit and allow the HTTP server to launch a fresh
copy of it the next time a request comes in. Hopefully, that would cure the
problem in the short term.
An efficient, low-overhead ACGI is therefore a windowless, Apple event-aware
program that posts no alerts or dialogs. It implements only the Apple and File
menus. For simplicity, the About item in the Apple menu does nothing except
show the name of the ACGI (although there's nothing to stop you from
implementing an About box if you want to). The File menu contains the single
item Quit. A log file is used to record all informational, error, and debugging
messages.
As shown in Listing 1, the main program starts by calling ACGIInit to set
itself up. Then it runs the main event loop, calling ACGIEvent to process each
new event, until the global gDone flag is set and all threads have completed.
The program then cleans up after itself by calling ACGIQuit.
Listing 1. The ACGI main program
// Include files and function prototypes
...
static Boolean        gDone = false;           // Set when it's time to quit
static unsigned long  gThreads = 0;            // Number of running threads
static long           gThreadSleep = 4;        // Sleep while threads run
static long           gIdleSleep = 0x7FFFFFFF; // Sleep when no threads run
static long           gWNEDelta = 8;           // Ticks of thread time per pass

void main(void)
{
    EventRecord    theEvent;
    long           sleep;
    unsigned long  nextWNE;

    ACGIInit();
    // Keep going until asked to quit and all threads have completed.
    while (!gDone || gThreads > 0) {
        if (gThreads > 0)
            sleep = gThreadSleep;
        else
            sleep = gIdleSleep;
        if (WaitNextEvent(everyEvent, &theEvent, sleep, nil))
            ACGIEvent(&theEvent);
        // Give the threads time to run before the next WaitNextEvent.
        nextWNE = TickCount() + gWNEDelta;
        do {
            YieldToAnyThread();
        } while (TickCount() <= nextWNE);
        ACGIPeriodicTask();
    }
    ACGIQuit();
}
THREADS AND THE MAIN EVENT LOOP
The presence of threads affects the main event loop shown in Listing 1 in three
ways. First, the loop doesn't exit as long as there are active threads. This
ensures that all threads processing HTTP server requests complete their work
before the ACGI shuts down. Second, there are two different sleep times for
WaitNextEvent: gThreadSleep when threads are running and gIdleSleep when
they're not. We need idle time to give the threads a chance to run. This means
we should use a rather small value for sleep when gThreads is greater than 0.
On the other hand, when there are no outstanding requests, we should set sleep
to a large value to avoid wasting CPU time. The exception to this rule is when
you have periodic tasks, in which case you should call ACGISetSleeps in WWWInit
to set gIdleSleep to get the idle time you need.
Third, there's the inner loop that repeatedly calls YieldToAnyThread. This
routine causes the Thread Manager to turn control over to the oldest running
thread. This thread keeps control until it too calls YieldToAnyThread to turn
control over to the next running thread. This continues until the newest thread
calls YieldToAnyThread and control returns to the main event loop (see
"Concurrent Programming With the Thread Manager" in develop Issue 17).
It's important to call YieldToAnyThread frequently inside your
request-processing code, usually after you complete a logical step in your
processing and at least once every 1 to 2 ticks of the Macintosh clock (1 tick
= 1/60th of a second). Don't bother putting your calls to YieldToAnyThread inside
a timed loop as we did in the main event loop. Just call it often throughout
your code: it's a very low overhead call. The secret to uniform response time
to all requests is not to allow any one thread to hog the CPU.
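To see why frequent yielding evens out response times, consider a toy model in portable C. It does not use the Thread Manager at all: each "thread" is just a counter of work units, and a turn ends (the equivalent of calling YieldToAnyThread) after at most chunk units. Task, RunCooperatively, and the unit of "work" are all hypothetical and exist only for this illustration.

```c
#include <assert.h>

typedef struct Task {
    long remaining;    /* work units still to do */
    long finishedAt;   /* elapsed units when the task completed, or -1 */
} Task;

/* Runs the tasks round-robin. On each turn a task does at most
   chunk units of work and then gives up control. Returns the total
   number of work units processed. */
static long RunCooperatively(Task tasks[], int count, long chunk)
{
    long now = 0;
    int  active = 0;
    int  i;

    assert(chunk > 0);
    for (i = 0; i < count; i++) {
        tasks[i].finishedAt = (tasks[i].remaining > 0) ? -1 : 0;
        if (tasks[i].remaining > 0)
            active++;
    }
    while (active > 0) {
        for (i = 0; i < count; i++) {
            long step;

            if (tasks[i].remaining <= 0)
                continue;
            step = (tasks[i].remaining < chunk) ? tasks[i].remaining
                                                : chunk;
            tasks[i].remaining -= step;
            now += step;
            if (tasks[i].remaining == 0) {
                tasks[i].finishedAt = now;
                active--;
            }
        }
    }
    return now;
}
```

With a 100-unit request followed by a 10-unit request, a chunk of 5 lets the short request finish after 20 units of elapsed work. A chunk of 100 (the long request never yields) makes the short request wait until 110.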
YieldToAnyThread is enclosed in a timed loop to give threads enough time to do
useful work when running on a Power Macintosh. Currently, there's a context
switch from native PowerPC mode to 680x0 emulation mode when WaitNextEvent is
called. In addition, historical reasons guarantee that WaitNextEvent always
waits at least 1 tick before it returns. Calling YieldToAnyThread only once per
pass through the main event loop means that threads would get time only once
every 1/60th of a second and a lot of useful CPU time would be wasted in mode
switches. The timed loop could result in a thousandfold performance increase --
without noticeably affecting other applications -- for ACGIs running
compute-bound threads that frequently yielded.
THE INITIALIZATION ROUTINE ACGIINIT
ACGIInit carries out seven distinct steps to get the ACGI going:
- Initialize the Toolbox.
- Get the name of the log file by calling WWWGetLogName and then open it.
- Check to see that both Apple events and the Thread Manager are present.
- Set up the menu bar.
- Install the Apple event handlers.
- Call WWWGetHTMLPages to build the four generic HTML error pages.
- Call WWWInit to initialize your processing environment.
If ACGIInit runs into trouble, it calls ACGIFatal to write an error message to the
log file and quit. If you run into trouble in WWWInit you should write a
meaningful error message to the log with ACGILog and return a nonzero result
code. ACGIInit will write the code to the log and then quit.
THE LOGGING ROUTINES ACGILOG AND ACGIFATAL
Two routines that write zero-terminated strings to the log -- ACGILog and
ACGIFatal -- are shown in Listing 2. In these routines, gLog is an ANSI FILE*
variable that's local to the source file acgi.c. It points to the open log
file.
Listing 2. Logging routines
void ACGILog(char *msg)
{
    DateTimeRec  dt;
    ThreadID     theThread;

    if (gLog == NULL)
        return;
    GetTime(&dt);
    GetCurrentThread(&theThread);
    fprintf(gLog, "%4d/%02d/%02d\t%02d:%02d:%02d\t%010lu\t%s\n",
        dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second,
        theThread, msg);
    fflush(gLog);
}

void ACGIFatal(char *reason)
{
    if (gLog != NULL) {
        ACGILog(reason);
        ACGILog("That was a fatal error...shut down.");
    }
    ExitToShell();
}
ACGILog prefixes each message with the date and time and the ID number of the
thread it was called in. The items are tab-separated so that you can later
import the log into a spreadsheet and sort it by date, time, or thread ID. This
can be useful when you're trying to debug an ACGI or gather statistics based on
the messages you wrote into the log during processing. ACGIFatal calls ACGILog
to write its message to the log and then quits the program immediately without
waiting for running threads to complete. It's meant to be called only from
within ACGIInit.
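If you want to see exactly what a log line looks like before importing the log into a spreadsheet, the format string from Listing 2 can be exercised in portable C. In this sketch, LogStamp and FormatLogLine are hypothetical stand-ins for DateTimeRec and ACGILog, the thread ID is passed in rather than obtained from the Thread Manager, and the line is written to a buffer instead of the log file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the date and time fields of DateTimeRec. */
typedef struct LogStamp {
    int year, month, day, hour, minute, second;
} LogStamp;

/* Formats one tab-separated log line with the same format string as
   ACGILog in Listing 2. Returns the value returned by snprintf. */
static int FormatLogLine(char *buf, size_t size, const LogStamp *t,
                         unsigned long threadID, const char *msg)
{
    assert(buf != NULL && t != NULL && msg != NULL);
    return snprintf(buf, size,
        "%4d/%02d/%02d\t%02d:%02d:%02d\t%010lu\t%s\n",
        t->year, t->month, t->day, t->hour, t->minute, t->second,
        threadID, msg);
}
```

Each line therefore has four tab-separated fields: date, time, a ten-digit zero-padded thread ID, and the message.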
PERIODIC TASKS AND THE TERMINATION ROUTINE
ACGIPeriodicTask runs periodic tasks by calling your WWWPeriodicTask routine
and then checking for a nonzero result code (in which case it writes the code
to the log and sets gDone to true). The termination
routine ACGIQuit is the last routine called by the main program. It shuts down
processing by calling your WWWQuit routine and then closes the log.
EVENT HANDLING IN THE MAIN EVENT LOOP
Since an ACGI is basically a simple Macintosh application with no windows, no
About box, and only the Apple menu and File menu (which supports the single
item Quit), you don't have to worry about activate and update events, and
suspend/resume events only need to set the cursor to an arrow. Keystrokes are
important only if they're Command-key equivalents that might represent a menu
selection. This limited event handling is carried out entirely in the routine
ACGIEvent and its small support routine DoMenu (for menu and Command-key
handling). ACGILog is used to report any errors that are encountered.
ACGIEvent doesn't need to do any special processing at this level to handle
threaded Apple events. It just calls AEProcessAppleEvent like any other
application. Details of the threading process are hidden away in the Apple
event handler that's called in response to HTTP server requests.
APPLE EVENT SUPPORT IN THE ACGI
The ACGI must support the four core Apple events and the custom event sent by
HTTP servers and must be able to process HTTP events in threads. Here are the
details of how the ACGI shell implements the required Apple events and the
threading of the server requests.
SUPPORTING CORE APPLE EVENTS
Any application that supports Apple events must support the four core events
(Open Application, Open Document, Print Document, and Quit Application), as
well as any custom Apple events needed for communication with other programs.
Because the ACGI doesn't have any documents, doesn't do any printing, and does
all the application initialization before accepting the first Apple event, it
can deal with the four core events with the single handler HandleAECore:
#define kQuitCoreEvent   1
#define kOtherCoreEvent  0

static pascal OSErr HandleAECore(AppleEvent *event,
        AppleEvent *reply, long refCon)
{
    if (refCon == kQuitCoreEvent)
        gDone = true;
    return (noErr);
}
The ACGI sets the handler reference constant, refCon, to kOtherCoreEvent for the
'oapp', 'odoc', and 'pdoc' events and to kQuitCoreEvent for the 'quit' event.
When the handler is called, it simply returns noErr if the refCon is
kOtherCoreEvent and sets gDone to true if the refCon is kQuitCoreEvent.
THREADING HTTP SERVER REQUESTS
The WWW Apple event class defines a single event ID ('sdoc') to pass
requests to ACGI programs. This is the event that the ACGI shell responds to.
To handle multiple server requests at once, the ACGI must process each request
in its own thread of execution.
This leads to some complications in the code because the Apple Event Manager
was designed to have only one event active at any given time. To process
multiple Apple events in threads, the ACGI will have to suspend each new Apple
event in the main thread of execution, put each suspended event into its own
thread for processing, and then let each thread resume its suspended Apple
event at the end of processing so that replies are returned to the HTTP
server.
The one catch here is that when an event is suspended, the pointers to the
event and reply data structures become invalid. The ACGI must therefore make
copies of the event and reply data structures (and not just the pointers)
before suspending an event. These copies of the AEDescs are passed into the
thread for processing.
So, the processing flow for threading HTTP server requests is as follows:
- ACGIInit makes HandleSDOC the handler for HTTP server requests.
- The main event loop (running in the main thread) receives an HTTP
server request and calls AEProcessAppleEvent as usual.
- HandleSDOC (also running in the main thread) receives the Apple
event.
- If there are too many threads running or the ACGI is refusing
connections, the handler immediately returns an HTML page indicating that the
server request cannot be processed. Otherwise, the handler allocates a handle
called params to hold copies of the Apple event and its reply. Note that the
complete data structures must be copied, not just the pointers to them, because
the pointers become invalid when the event is suspended.
- HandleSDOC creates a new thread and passes params into it. If the
thread cannot be created, params is disposed of and the error code is returned.
- HandleSDOC increments the count of running threads and then suspends
the current Apple event and returns. The main event loop is now free to accept
another server request.
- The main event loop regains control and calls YieldToAnyThread
almost immediately. Each processing thread is given time to run, and control
eventually passes to the new thread.
- The new thread begins life by calling SDOCThread. This routine makes
local copies of the suspended Apple event and its reply and then disposes of
the params handle that was passed to it by HandleSDOC.
- SDOCThread extracts parameters from the Apple event, URL-decodes
them, and then calls WWWProcess to process the server request. WWWProcess calls
YieldToAnyThread frequently to give time to other threads and to allow the main
thread to accept new Apple events. When WWWProcess finishes, it returns a
handle containing the HTML response page.
- SDOCThread places the response into its copy of the Apple event reply
and then resumes execution of the suspended event. The event in this thread is
now considered complete. You're guaranteed that no other Apple event will be
"current" at this time because HandleSDOC suspends each new event before any of
the processing threads are given time to run.
- The thread decrements the global counter gThreads and then returns
(causing the thread to be disposed of).
With this processing flow as a guide, the associated code practically writes
itself. The HandleSDOC routine is shown in Listing 3.
Listing 3. Handling HTTP server requests
static unsigned long  gMaxThreads = 10;
static Boolean        gRefusing = false;
static long           gThreadStackSize = 0;
static ThreadOptions  gThreadOptions
                        = kCreateIfNeeded | kFPUNotNeeded;

typedef struct AEParams {
    AppleEvent  event;
    AppleEvent  reply;
} AEParams;

void SDOCThread(void *threadParam);
OSErr ACGIReturnHandle(AppleEvent *reply, Handle h);

pascal OSErr HandleSDOC(AppleEvent *event, AppleEvent *reply,
        long refCon)
{
    AEParams**  params;
    ThreadID    newThreadID;
    OSErr       err;

    // [1] Too many threads already running?
    if (gThreads >= gMaxThreads)
        return (ACGIReturnHandle(reply, gHTMLTooBusy));

    // [2] Should we handle this request?
    if (gDone || gRefusing)
        return (ACGIReturnHandle(reply, gHTMLRefused));

    // [3] OK to run...make copies of event and reply.
    params = (AEParams**) NewHandle(sizeof(AEParams));
    if (params == nil)
        return (errAEEventNotHandled);
    (*params)->event = *event;  // Copy the data structures...
    (*params)->reply = *reply;  // ...not just the pointers to them!

    // [4] Create the thread, passing in the copies of event and
    // reply.
    err = NewThread(kCooperativeThread,
        (ThreadEntryProcPtr) SDOCThread, (void*) params,
        gThreadStackSize, gThreadOptions, nil, &newThreadID);
    if (err != noErr) {
        DisposeHandle((Handle) params);
        return (err);
    }

    // [5] Increment the count of running threads and then suspend
    // the current event so that we can accept new events.
    gThreads++;
    return (AESuspendTheCurrentEvent(event));
}
Global variables guide the actions of HandleSDOC. The maximum number of
concurrent processing threads is controlled by gMaxThreads. You can get and set
this value with the convenience routines ACGIGetMaxThreads and
ACGISetMaxThreads. If gRefusing is true, the handler will return the HTML page
stored in gHTMLRefused and not process the event (you build this page in your
custom routine WWWGetHTMLPages). You set gRefusing by calling ACGIRefuse. If
you're really concerned about heap fragmentation, you might want to create a
pool of preallocated threads during initialization with the number of
threads in the pool equal to gMaxThreads. Threads are recycled into the pool,
limiting fragmentation. This is the approach taken by Grant Neufeld in his ACGI
framework (see "Threading Apple Events" in the April 1996 issue of MacTech
Magazine).
The globals gThreadStackSize and gThreadOptions give you control over how
threads are created. The convenience routines ACGIGetThreadParams and
ACGISetThreadParams allow you to get and set their values. The default stack
size of 0 causes the Thread Manager to allocate a 24K stack to each thread.
(Thread creation options are described in detail in "Concurrent Programming
With the Thread Manager" in develop Issue 17.)
If your WWWProcess routine (or any routine that it calls) uses a lot of stack
space for local variables, you might have to increase the thread stack size.
You should do this in your WWWInit routine. You'll know if you're running out
of stack space in your ACGI because your server computer will usually lock up
when a running thread's stack overflows the heap space allocated to it. So
remember, if your server keeps freezing up or bombing, and you don't think your
code is the problem, try increasing the stack size allocated to your threads
and then increase the ACGI memory allocation by roughly the increase in stack
size multiplied by your chosen value of gMaxThreads.
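The rule of thumb in the last sentence is simple arithmetic, shown here as a small helper. ExtraMemoryNeeded is a hypothetical name, and the figures in the note that follows are just an example, not a recommendation.

```c
#include <assert.h>

/* Rough extra memory (in bytes) to add to the ACGI's allocation when
   the per-thread stack grows from oldStackBytes to newStackBytes and
   up to maxThreads threads can run at once. */
static long ExtraMemoryNeeded(long oldStackBytes, long newStackBytes,
                              long maxThreads)
{
    assert(newStackBytes >= oldStackBytes);
    assert(maxThreads > 0);
    return (newStackBytes - oldStackBytes) * maxThreads;
}
```

For example, doubling the stack from 24K to 48K with gMaxThreads left at 10 calls for about 240K of additional memory (245,760 bytes).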
The Thread Manager has routines that check how much stack space a given thread
is using. You could therefore write a debugging macro that logs the stack space
remaining before calling YieldToAnyThread. This could be useful in isolating
where the problem is after the crash -- but it wouldn't actually stop the
thread from exhausting its stack space because that happens between
yields.*
HTTP REQUEST PROCESSING
Each thread created by HandleSDOC won't start running until the main event loop
calls YieldToAnyThread. When it's time for a new thread to run, the Thread
Manager saves the state of the thread that just yielded, sets up the new
thread's environment, and then calls SDOCThread. This routine is where all the
real work of the ACGI takes place -- and where your custom processing routine
WWWProcess is invoked.
SDOCThread is the longest and most complicated routine in the ACGI. It's
responsible for extracting all request parameters, URL-decoding the search and
post arguments, packing the parameters into a WWWRequest data structure,
calling WWWProcess to process the request, placing the HTML response page into
the server reply, and then resuming the Apple event to send the reply back to
the server.
Before looking at the code, it's a good idea to go over exactly what's packed
into the 'sdoc' Apple event. A client browser asks the server to run an ACGI
either by referencing the ACGI's URL directly or by submitting an HTML form that
specifies the ACGI as its action.
A direct reference is just the URL of the ACGI:
http://www.test.com/test.acgi
To invoke an ACGI as the action of a form, you need to write HTML code like this:
<FORM METHOD=GET ACTION="http://www.test.com/test.acgi">
...form input items...
</FORM>
or similar code for METHOD=POST. In both cases, you can supply extra arguments to
the ACGI by adding them to the end of the URL like this:
http://www.test.com/test.acgi$path_args?search_args
The path arguments are everything between the dollar sign ($) and the question
mark (?), while the search arguments are everything following the question
mark. The order of the $ and the ? is important. If you put the ? before the $,
everything following the ? (including the $ and what comes after it) is
considered part of the search arguments.
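The splitting rule is easy to express in code. In practice the HTTP server does this work and hands the ACGI the pieces as separate Apple event parameters, so the sketch below is only meant to make the rule concrete. SplitURLArgs and CopyRange are hypothetical names.

```c
#include <string.h>

/* Copy the bytes in [begin, end) into dst as a C string, truncating to fit. */
static void CopyRange(char *dst, size_t dstLen,
                      const char *begin, const char *end)
{
    size_t n = (size_t) (end - begin);

    if (dstLen == 0)
        return;
    if (n > dstLen - 1)
        n = dstLen - 1;
    memcpy(dst, begin, n);
    dst[n] = '\0';
}

/* Split a URL into its path arguments (between $ and ?) and its search
   arguments (after ?). A $ that appears after the ? belongs to the search
   arguments, so only the text before the ? is scanned for a $. */
static void SplitURLArgs(const char *url,
                         char *path, size_t pathLen,
                         char *search, size_t searchLen)
{
    const char *urlEnd   = url + strlen(url);
    const char *question = strchr(url, '?');
    const char *pathEnd  = question ? question : urlEnd;
    const char *dollar   = memchr(url, '$', (size_t) (pathEnd - url));

    CopyRange(path, pathLen, dollar ? dollar + 1 : pathEnd, pathEnd);
    CopyRange(search, searchLen, question ? question + 1 : urlEnd, urlEnd);
}
```

Either output is left as an empty string when the corresponding argument is absent from the URL.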
When you're using forms, you can specify a method of either GET or POST. All of
your form's input variables are URL-encoded. If you specify GET, the input
variables are tacked onto the end of the search arguments; if you use POST,
they're placed into a separate parameter called the post arguments and sent
separately.
URL encoding isn't particularly fancy. All it means is that the input field
names and field values are written out as name=value pairs, and all such pairs
are placed into one long parameter with each pair separated from the next by an
ampersand (&). All spaces in the original input variables are replaced by
plus signs (+) and any special characters are replaced by their ISO-8859
Latin-1 hexadecimal equivalents in the form %xx (where xx represents the two
hex digits identifying the character).
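To make the encoding rules concrete, here is a minimal encoder for a single field name or value. The browser normally does this for you, so treat it as an illustration only. URLEncode is a hypothetical name, and the set of characters passed through unchanged (letters, digits, '-', '_', and '.') is an assumption of this sketch; browsers differ in the details.

```c
#include <ctype.h>
#include <stddef.h>

/* URL-encode one field name or value into out (at most outLen bytes,
   including the terminating zero). Returns the number of characters
   written. Encoding stops early if out is too small. */
static size_t URLEncode(const char *in, char *out, size_t outLen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    if (outLen == 0)
        return 0;
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char) *in;

        if (c == ' ') {
            if (n + 1 >= outLen) break;
            out[n++] = '+';                 /* spaces become plus signs */
        } else if (isalnum(c) || c == '-' || c == '_' || c == '.') {
            if (n + 1 >= outLen) break;
            out[n++] = (char) c;            /* safe characters pass through */
        } else {
            if (n + 3 >= outLen) break;
            out[n++] = '%';                 /* everything else becomes %xx */
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0F];
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the value "Ken Urquhart" encodes as "Ken+Urquhart", and "a&b=c" encodes as "a%26b%3Dc", which keeps the & and = inside a value from being mistaken for separators.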
Any or all of these arguments (if present), along with a series of parameters
that describe the client browser and the server, are placed into the 'sdoc'
Apple event and sent to the ACGI by the HTTP server. Each parameter is
identified in the Apple event by a 4-character keyword. The ACGI passes
these keywords to the Apple Event Manager to extract the various
parameters.
For a full description of the keywords, refer to Planning and Managing Web
Sites on the Macintosh: The Complete Guide to WebSTAR and MacHTTP, Chapters 13
and 15.*
The five most important keywords to be aware of are as follows:
- kPathArgsKeyword -- the parameter that contains the path arguments (the
text between the $ and the ?)
- kSearchArgsKeyword -- the search arguments (everything after the ?)
- kPostArgsKeyword -- the post arguments
- kUserAgentKeyword -- the name and version of the browser that made the
request
- kMethodKeyword -- the name of the method (such as GET, POST, or ACTION)
by which the ACGI was called
The path, search, and post arguments hold the data that makes up a request. The
browser name lets you decide which HTML features you might want to include in
your response page. For example, you might not want to use the latest HTML
features of Netscape Navigator in your response page if the browser name says
that the client is an old version of Mosaic that doesn't understand tables and
frames.
Most of the code in SDOCThread (excerpted in Listing 4) deals with extracting
parameters from the event and then breaking up the search and post arguments
into name=value pairs.
Listing 4. The SDOCThread routine
static void SDOCThread(void *threadParam)
{
WWWRequestRecord request;
Size spaceNeeded, responseSize;
OSErr err;
// [1] Copy event and reply to local storage.
AEParams** params = (AEParams**) threadParam;
AppleEvent event = (*params)->event;
AppleEvent reply = (*params)->reply;
DisposeHandle((Handle) params);
// [2] Initialize request structure.
memset(&request, 0, sizeof(request));
// [3] Allocate storage for params/args.
spaceNeeded = ACGIParamSize(&event);
request.storage = NewHandleClear(spaceNeeded);
if (request.storage == nil) {
char msg[128];
sprintf(msg, "SDOCThread: no storage memory: %lu bytes.",
spaceNeeded);
ACGILog(msg);
err = ACGIReturnHandle(&reply, gHTMLNoMemory);
gDone = true;
goto Done;
}
HLockHi(request.storage);
// [4] Copy params/args into position.
err = ACGICopyArgs(&event, &request);
if (err != noErr) goto Done;
// [5] Decode URL-encoded search and post arguments.
if (strlen(*request.storage + (long) request.searchArgs) > 0) {
err = ACGIURLDecode(
*request.storage + (long) request.searchArgs,
&request.searchNum, &request.searchNames,
&request.searchValues);
if (err != noErr) goto Done;
}
if (strlen(*request.storage + (long) request.postArgs) > 0) {
err = ACGIURLDecode(*request.storage + (long) request.postArgs,
&request.postNum, &request.postNames,
&request.postValues);
if (err != noErr) goto Done;
}
HUnlock(request.storage);
// [6] Allocate HTML response.
request.response = NewHandleClear(gHTTPHeaderLen);
if (request.response == nil) {
gDone = true;
err = ACGIReturnHandle(&reply, gHTMLNoMemory);
goto Done;
}
BlockMoveData(gHTTPHeader, *request.response, gHTTPHeaderLen);
// [7] Call the custom processor.
err = WWWProcess(&request);
// [8] Put the response into the reply and resume the Apple
// event.
Done:
if (request.storage != nil) DisposeHandle(request.storage);
if (request.searchNames != nil)
DisposeHandle(request.searchNames);
if (request.searchValues != nil)
DisposeHandle(request.searchValues);
if (request.postNames != nil) DisposeHandle(request.postNames);
if (request.postValues != nil) DisposeHandle(request.postValues);
// Guard against a nil response handle before asking for its size.
responseSize = (request.response != nil)
? GetHandleSize(request.response) : 0;
if (err == noErr && request.response != nil
&& responseSize > gHTTPHeaderLen)
err = ACGIReturnHandle(&reply, request.response);
else
switch (err) {
case errWWWNoMemory:
err = ACGIReturnHandle(&reply, gHTMLNoMemory);
break;
case errWWWRefused:
err = ACGIReturnHandle(&reply, gHTMLRefused);
break;
case errWWWTooBusy:
err = ACGIReturnHandle(&reply, gHTMLTooBusy);
break;
case errWWWUnexpected:
err = ACGIReturnHandle(&reply, gHTMLUnexpectedError);
break;
default:
err = ACGIReturnHandle(&reply, gHTMLUnexpectedError);
break;
}
if (request.response != nil) DisposeHandle(request.response);
// [9] Put error code into the Apple event (if needed).
if (err != noErr) {
long errorResult = err; // Must be long integer.
AEPutParamPtr(&reply, keyErrorNumber, typeLongInteger,
&errorResult, sizeof(long));
}
// [10] Resume the event, decrement running thread count, write
// to the log.
AEResumeTheCurrentEvent(&event, &reply,
(AEEventHandlerUPP) kAENoDispatch, 0);
gThreads--;
ACGILog("Done.");
return;
}
The only item passed to your custom WWWProcess routine is a pointer to the
WWWRequestRecord. You access the items stored in the record using the
convenience routines that are defined later.
EXTRACTING PARAMETERS FROM THE APPLE EVENT
The routines ACGIParamSize and ACGICopyArgs repeatedly call the Apple Event
Manager to get the size and the text of each parameter. ACGICopyArgs moves each
successive parameter into the request.storage handle in the WWWRequestRecord
data structure (see acgi.h). It also places the offset of each parameter,
relative to the start of the handle, into the corresponding pointer variables in
request. Because most parameters are only 10 to 100 bytes in length, it seemed
far more efficient to pack them all into a single handle. This avoids the
overhead of making multiple calls to the Memory Manager to allocate one handle
for each parameter and then making multiple calls to HLock and HUnlock when
manipulating the parameters during processing.
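Here is a portable sketch of the packing scheme. It isn't the shell's code: malloc stands in for the Memory Manager, and the names PackedParams, PackParams, and PackedGet are hypothetical. What it shows is the key idea of recording offsets rather than addresses, so that the strings can still be found after the block moves.

```c
#include <stdlib.h>
#include <string.h>

#define NUM_PARAMS 3   /* illustrative; the real record holds many more */

typedef struct {
    char  *storage;              /* one block holding every string */
    size_t offset[NUM_PARAMS];   /* start of each string within storage */
} PackedParams;

/* Pack all values into a single allocation. A NULL value is stored as a
   zero-length string. Returns 1 on success, 0 if memory runs out. */
static int PackParams(PackedParams *p, const char *const values[NUM_PARAMS])
{
    size_t total = 0, pos = 0;
    int i;

    for (i = 0; i < NUM_PARAMS; i++)      /* first pass: measure */
        total += strlen(values[i] ? values[i] : "") + 1;

    p->storage = malloc(total);
    if (p->storage == NULL)
        return 0;

    for (i = 0; i < NUM_PARAMS; i++) {    /* second pass: copy */
        const char *v = values[i] ? values[i] : "";
        size_t n = strlen(v) + 1;

        memcpy(p->storage + pos, v, n);
        p->offset[i] = pos;
        pos += n;
    }
    return 1;
}

/* Turn an offset back into a pointer at the moment it's needed. */
static const char *PackedGet(const PackedParams *p, int i)
{
    return p->storage + p->offset[i];
}
```

The two passes mirror the split between ACGIParamSize (measure everything) and ACGICopyArgs (copy everything into place).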
All parameters are stored as text strings, even the connection ID (a long
integer). Missing or empty parameters are stored as zero-length strings so that
the ACGI can handle requests from HTTP servers that only partially implement
the full WWW Apple event suite (there's no guarantee a given server
program will pass your ACGI all the parameters defined in the suite). You can
get the numeric value of any parameter by calling the convenience routine
HTTPGetLong.
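Since every parameter arrives as text, getting a number out means parsing a string and rejecting anything that isn't one. The sketch below shows one way to do that with the standard library. ParseLong is a hypothetical name, and this is not necessarily how HTTPGetLong is implemented.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Convert a parameter string to a long. Returns false for empty strings,
   trailing junk, or out-of-range values, and leaves *out untouched. */
static bool ParseLong(const char *text, long *out)
{
    char *end;
    long  v;

    if (text == NULL || *text == '\0')
        return false;               /* a missing parameter is an empty string */
    errno = 0;
    v = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return false;               /* overflow, no digits, or trailing text */
    *out = v;
    return true;
}
```

A server port of "80" parses cleanly, while an empty string (a parameter the server didn't supply) or a value like "80x" is reported as not being an integer.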
DECODING URL-ENCODED POST ARGUMENTS
The search and the post argument strings are URL-decoded by the routine
ACGIURLDecode following the prescription outlined in Chapter 13 of Planning and
Managing Web Sites on the Macintosh: The Complete Guide to WebSTAR and MacHTTP.
The routine begins by counting all of the name=value pairs in the given string
by looking for & separators. Two handles are then allocated to hold the
char* pointers. The string is then scanned, and the offset of each argument
name and its associated value are recorded in the arrays. Finally, the routine
ACGIDecodeCStr is called to convert each name=value pair from ISO-8859 Latin-1
encoding to the standard Macintosh Roman encoding. The conversion table used by
the popular Netscape Navigator browser is employed here for compatibility. If
you want to substitute another 256-character translation table, you'll need to
replace the ID=1000 'xlat' resource located in the resource file acgi.rsrc.
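The scanning and decoding steps can be sketched in portable C as follows. This is a simplified illustration rather than the shell's ACGIURLDecode: it uses caller-supplied arrays instead of counting the pairs and allocating handles, it stores pointers instead of offsets, and it leaves out the Latin-1 to Macintosh Roman translation entirely. The names HexVal, DecodeInPlace, and SplitPairs are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Value of one hex digit; the caller has already checked isxdigit. */
static int HexVal(int c)
{
    return isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
}

/* Decode '+' and %xx sequences in place. The result is never longer than
   the input, so no extra storage is needed. */
static void DecodeInPlace(char *s)
{
    char *out = s;

    while (*s != '\0') {
        if (*s == '+') {
            *out++ = ' ';
            s++;
        } else if (*s == '%' && isxdigit((unsigned char) s[1])
                             && isxdigit((unsigned char) s[2])) {
            *out++ = (char) (HexVal((unsigned char) s[1]) * 16
                             + HexVal((unsigned char) s[2]));
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Split "a=1&b=2" in place into decoded names and values. Returns the
   number of pairs found (at most max). A pair with no '=' gets an empty
   value. */
static int SplitPairs(char *args, char *names[], char *values[], int max)
{
    char *p = args;
    int   n = 0;

    while (p != NULL && *p != '\0' && n < max) {
        char *amp = strchr(p, '&');
        char *eq;

        if (amp != NULL)
            *amp = '\0';                   /* terminate this pair */
        eq = strchr(p, '=');
        if (eq != NULL) {
            *eq = '\0';                    /* terminate the name */
            values[n] = eq + 1;
        } else {
            values[n] = p + strlen(p);     /* points at the terminator */
        }
        names[n] = p;
        DecodeInPlace(names[n]);
        DecodeInPlace(values[n]);
        n++;
        p = (amp != NULL) ? amp + 1 : NULL;
    }
    return n;
}
```

The pairs are split before they're decoded, so an encoded & or = inside a value (%26 or %3D) can never be mistaken for a separator.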
CONVENIENCE ROUTINES
There are three sets of convenience routines that allow you to extract
parameters from a server request, build your HTML response page, and fine-tune
the runtime performance of the ACGI.
PARAMETER AND ARGUMENT EXTRACTION ROUTINES
Seventeen routines, identified by the prefix "HTTP," can be used to extract
parameters or post and search arguments from the WWWRequestRecord that's passed
to the WWWProcess routine. The enumeration WWWParameter contains the name by
which an individual parameter must be referenced:
typedef enum WWWParameter {
p_path_args = 0,
p_username,
p_password,
p_from_user,
p_client_address,
p_server_name,
p_server_port,
p_script_name,
p_content_type,
p_referer,
p_user_agent,
p_action,
p_action_path,
p_method,
p_client_ip,
p_full_request,
p_connection_id
} WWWParameter;
Following are descriptions of the routines.
Boolean HTTPLockParams(WWWRequest r);
Locks down the request parameters. Several items in the WWWRequestRecord are stored
as handles and must be locked down before the ACGI can access them.
HTTPLockParams locks the items down for you and HTTPUnlockParams (below)
releases them. It might be a good idea to unlock your parameters before calling
YieldToAnyThread.
Convenience routines that return const char* pointers to parameters implicitly
call HTTPLockParams to lock down the WWWRequestRecord before they return the
pointers. Note that the request record remains locked when the routines return.
The routines that copy parameters and arguments into the character strings you
pass in will lock the request record while they're copying the information and
then unlock it before they return (but only if the data structure wasn't
already locked on entry).
void HTTPUnlockParams(WWWRequest r);
Unlocks the request parameters.
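The locking rules can be modeled with a simple flag. In the sketch below a Boolean stands in for the handle's lock state, and all the names (DemoRecord, DemoLock, DemoUnlock, DemoGet, DemoCopy) are hypothetical. It also assumes that the lock routine reports whether the record was already locked, which is one plausible reading of the Boolean result of HTTPLockParams but isn't confirmed here.

```c
#include <stdbool.h>
#include <string.h>

typedef struct {
    bool        locked;   /* stand-in for the lock state of the handles */
    const char *text;     /* stand-in for a stored parameter */
} DemoRecord;

/* Lock the record and report whether it was already locked (assumption). */
static bool DemoLock(DemoRecord *r)
{
    bool wasLocked = r->locked;

    r->locked = true;
    return wasLocked;
}

static void DemoUnlock(DemoRecord *r)
{
    r->locked = false;
}

/* Pointer-returning style: the record must stay locked so that the
   pointer remains valid after the routine returns. */
static const char *DemoGet(DemoRecord *r)
{
    DemoLock(r);
    return r->text;
}

/* Copying style: lock only for the duration of the copy, and restore
   the caller's lock state on the way out. */
static void DemoCopy(DemoRecord *r, char *dst, size_t dstLen)
{
    bool wasLocked;

    if (dstLen == 0)
        return;
    wasLocked = DemoLock(r);
    strncpy(dst, r->text, dstLen - 1);
    dst[dstLen - 1] = '\0';
    if (!wasLocked)
        DemoUnlock(r);
}
```

The practical upshot is that after any call in the pointer-returning style you're responsible for unlocking the record yourself, ideally before you yield to other threads.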
const char *HTTPGetParam(WWWRequest r, WWWParameter par);
Gets a pointer to one of the parameter strings. This leaves r locked.
Boolean HTTPGetLong(WWWRequest r, WWWParameter par, long *i);
Gets the integer value of a parameter. The result is returned in i. The routine
returns false if the parameter is not an integer.
Boolean HTTPCopyParam(WWWRequest r, WWWParameter par, char *result, long len,
long *actualLen);
Copies the parameter text into the character variable result. The length of
result is in len; the actual length of the parameter is returned in actualLen.
The routine returns false if the parameter identifier par is invalid.
long HTTPGetNumSrchArgs(WWWRequest r);
long HTTPGetNumPostArgs(WWWRequest r);
Gets the number of search or post arguments.
Boolean HTTPGetSrchArgAt(WWWRequest r, long index, char *name,
long nameLen, long *actualNameLen, char *value, long valueLen,
long *actualValueLen);
Boolean HTTPGetPostArgAt(WWWRequest r, long index, char *name,
long nameLen, long *actualNameLen, char *value, long valueLen,
long *actualValueLen);
Gets a search or post argument by absolute position. index is between 1 and the
total number of such arguments. name receives the name of the argument at
position index, and value receives the value. The lengths of the character
arrays name and value are in nameLen and valueLen. The actual lengths of the
items are returned in actualNameLen and actualValueLen. The routine returns
false if index is out of range.
Boolean HTTPGetSrchArgCount(WWWRequest r, char *name,
long *numValues);
Boolean HTTPGetPostArgCount(WWWRequest r, char *name,
long *numValues);
Gets the number of search or post arguments that have the field name name. The
number is returned in numValues. The routine returns false if there's no search
or post argument called name.
const char *HTTPGetMultipleSrchArg(WWWRequest r, char *name,
long index);
const char *HTTPGetMultiplePostArg(WWWRequest r, char *name,
long index);
Tries to get the instance index of a multivalued search or post argument. The
routine returns an empty string if index is out of range or if name doesn't
exist. The routine leaves r locked on exit. index starts at 1.
Boolean HTTPGetLongMultipleSrchArg(WWWRequest r, char *name,
long index, long *i);
Boolean HTTPGetLongMultiplePostArg(WWWRequest r, char *name,
long index, long *i);
Gets the integer value of the instance index of a multivalued search or post
argument called name. The routine returns the value in i, and returns false if
index is out of range or the argument is not an integer. index starts at 1.
Boolean HTTPCopyMultipleSrchArg(WWWRequest r, char *name, long index,
char *value, long len, long *actualLen);
Boolean HTTPCopyMultiplePostArg(WWWRequest r, char *name, long index,
char *value, long len, long *actualLen);
Copies the contents of the instance index of a multivalued search or post
argument called name. The routine returns text in value. The length of the value
string is in len; the actual length of the value string is returned in
actualLen. The routine returns false if index is out of range or name doesn't
exist. index starts at 1.
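Multivalued arguments turn up whenever a form has several fields with the same name (a group of checkboxes, for instance). The lookup behind the "multiple" routines can be sketched as a linear scan over the decoded name and value arrays. CountArg and GetMultipleArg are hypothetical names, and the case-sensitive comparison is an assumption of this sketch.

```c
#include <string.h>

/* Count how many arguments carry the given field name. */
static long CountArg(const char *const names[], long count, const char *name)
{
    long i, n = 0;

    for (i = 0; i < count; i++)
        if (strcmp(names[i], name) == 0)
            n++;
    return n;
}

/* Return the value of the index-th argument called name (index starts
   at 1), or an empty string if there is no such instance. */
static const char *GetMultipleArg(const char *const names[],
                                  const char *const values[],
                                  long count, const char *name, long index)
{
    long i, seen = 0;

    for (i = 0; i < count; i++)
        if (strcmp(names[i], name) == 0 && ++seen == index)
            return values[i];
    return "";
}
```

A typical caller asks for the count first and then loops from 1 up to that count, fetching each instance in turn.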
HTML PAGE COMPOSITION ROUTINES
There are ten routines, all prefixed with "HTML," to help you compose the HTML
response pages. The routines that allow you to append different types of data
to the response page are shown in Table 1; the handle to the response page is
obtained by calling HTMLGetResponseHandle.
Handle HTMLGetResponseHandle(WWWRequest r);
Gets the handle to the HTML response page.
OSErr HTMLClearPage(Handle r);
Clears the current response page (except for the HTTP header) and starts over.
Table 1. Routines that append data to the HTML response page
Routine, followed by the data it appends
OSErr HTMLAppendHandle(Handle r, Handle h);
- Contents of handle h
OSErr HTMLAppendTEXT(Handle r, long iTEXTResID);
- TEXT resource with ID iTEXTResID
OSErr HTMLAppendString(Handle r, long iSTRResID);
- STR resource with ID iSTRResID
OSErr HTMLAppendIndString(Handle r, long iSTRResID, long index);
- String at location index in STR# resource with ID iSTRResID
OSErr HTMLAppendFile(Handle r, char *localFileName);
- Contents of the file localFileName
OSErr HTMLAppendCString(Handle r, char *cString);
- C string cString
OSErr HTMLAppendPString(Handle r, StringPtr pString);
- Pascal string pString
OSErr HTMLAppendBuffer(Handle r, char *buffer, long len);
- Text buffer of length len
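All of the append routines come down to the same operation: grow the response and copy bytes onto the end. Here is a portable sketch of that idea, with realloc standing in for resizing a handle. The DemoPage type and the Demo routines are hypothetical and aren't the shell's implementation.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *data;        /* the response page so far */
    size_t len;         /* bytes currently in use */
    size_t headerLen;   /* length of the HTTP header at the front */
} DemoPage;

/* Append n bytes to the page. Returns 1 on success, 0 if memory runs
   out (in which case the page is left unchanged). */
static int DemoAppendBuffer(DemoPage *p, const char *buf, size_t n)
{
    char *grown;

    if (n == 0)
        return 1;
    grown = realloc(p->data, p->len + n);
    if (grown == NULL)
        return 0;
    memcpy(grown + p->len, buf, n);
    p->data = grown;
    p->len += n;
    return 1;
}

/* A C string's length comes from its terminating zero. */
static int DemoAppendCString(DemoPage *p, const char *s)
{
    return DemoAppendBuffer(p, s, strlen(s));
}

/* A Pascal string's length is stored in its first byte. */
static int DemoAppendPString(DemoPage *p, const unsigned char *ps)
{
    return DemoAppendBuffer(p, (const char *) ps + 1, ps[0]);
}

/* Start the page over, keeping only the header. */
static void DemoClearPage(DemoPage *p)
{
    p->len = p->headerLen;
}
```

Keeping the header length in the record is what lets the clear operation throw away the page body while preserving the HTTP header, which matches the behavior described for HTMLClearPage.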
ACGI RUNTIME-TUNING ROUTINES
There are 13 routines that allow you to fine-tune the runtime behavior of the
ACGI without having to modify the code in acgi.c or directly set global
variables.
void ACGIShutdown(void);
Shuts down the ACGI as soon as all current threads are finished.
Boolean ACGIIsShuttingDown(void);
Tests whether the ACGI is shutting down.
Boolean ACGIRefuse(Boolean refuse);
Sets whether to accept or reject requests.
unsigned long ACGIGetRunningThreads(void);
Gets the number of active threads.
unsigned long ACGIGetMaxThreads(void);
void ACGISetMaxThreads(unsigned long newThreads);
Gets or sets the maximum number of threads allowed to run at the same time.
void ACGIGetSleeps(long *whenThreads, long *whenIdle);
void ACGISetSleeps(long whenThreads, long whenIdle);
Gets or sets the sleep settings.
long ACGIGetWNEDelta(void);
void ACGISetWNEDelta(long newDelta);
Gets or sets the time between calls to WaitNextEvent.
void ACGIGetThreadParams(Size *stack, ThreadOptions *options);
void ACGISetThreadParams(Size stack, ThreadOptions options);
Gets or sets the thread stack size and creation options.
const char *ACGIGetHTTPHeader(void);
Gets a pointer to the standard HTTP header.
CUSTOMIZABLE ROUTINES
The six customizable routines allow you to adapt the ACGI shell to suit your
needs. I've supplied simple, straightforward samples of the routines in the
file www.c.
The default version of the WWWProcess routine is shown in Listing 5. It returns
a page that displays all of the HTTP server request parameters in a nicely
formatted table. Note the use of the YIELD macro here. It provides a convenient
way of yielding to other threads and automatically aborting should the ACGI
signal that it wants to quit.
Listing 5. The default version of WWWProcess
#define YIELD() { YieldToAnyThread(); \
if (ACGIIsShuttingDown()) \
return (errWWWRefused); }
OSErr WWWProcess(WWWRequest request)
{
Handle r = HTMLGetResponseHandle(request);
char s[1024], name[512], value[512];
long len, i, n, iName, iValue;
Boolean gotOne;
OSErr err;
// Build a table to display the WebSTAR request parameters.
err = HTMLAppendPString(r,
"\p<HTML><HEAD><TITLE>ACGI</TITLE></HEAD>\r\n");
YIELD();
err = HTMLAppendCString(r,
"<BODY><H1>ACGI Parameters</H1><TABLE BORDER=0>");
YIELD();
err = HTMLAppendCString(r,
"<TR><TD ALIGN=RIGHT NOWRAP><B>Path arguments:</B></TD><TD>");
YIELD();
if (HTTPCopyParam(request, p_path_args, s, 1023, &len))
err = HTMLAppendCString(r, s);
YIELD();
... // and so on, for all the other parameters
// Now show all the search arguments.
err = HTMLAppendCString(r,
"</TD></TR><TR><TD ALIGN=RIGHT NOWRAP VALIGN=TOP>"
"<B>Search Arguments:</B></TD><TD>");
YIELD();
n = HTTPGetNumSrchArgs(request);
if (n > 0) {
for (i = 1; i <= n; i++) {
gotOne = HTTPGetSrchArgAt(request, i, name, 511, &iName,
value, 511, &iValue);
if (gotOne) {
if (i > 1)
err = HTMLAppendCString(r, "<BR>");
err = HTMLAppendCString(r, name);
err = HTMLAppendCString(r, " = ");
err = HTMLAppendCString(r, value);
}
YIELD();
}
}
else
err = HTMLAppendCString(r, "(none)");
... // and similarly for the post arguments
err = HTMLAppendCString(r,
"</UL></TD></TR></TABLE>\r\n</BODY>\r\n</HTML>\r\n");
return (err);
}
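One thing to watch with a brace-delimited macro like YIELD is that writing YIELD(); between an if and its else won't compile, because the semicolon after the closing brace ends the if statement. If you need to yield in that position, a common C idiom is to wrap the macro body in do { ... } while (0). The sketch below demonstrates this with stand-ins so that it compiles anywhere: the stubbed YieldToAnyThread and ACGIIsShuttingDown, the OSErr typedef, the value given to errWWWRefused, and the ProcessSteps routine are all placeholders for illustration.

```c
#include <stdbool.h>

typedef short OSErr;                            /* stand-in for the Mac OS type */
enum { noErr = 0, errWWWRefused = -1 };         /* placeholder error value */

static bool gShuttingDown = false;
static int  gYieldCount   = 0;

/* Stubs standing in for the Thread Manager call and the shell routine. */
static void YieldToAnyThread(void)   { gYieldCount++; }
static bool ACGIIsShuttingDown(void) { return gShuttingDown; }

/* The do/while(0) wrapper makes the macro behave like one statement. */
#define YIELD() do { YieldToAnyThread();              \
                     if (ACGIIsShuttingDown())        \
                         return (errWWWRefused); } while (0)

/* Hypothetical processing loop that yields only when asked to. */
static OSErr ProcessSteps(int steps, bool yieldEachStep)
{
    int i;

    for (i = 0; i < steps; i++) {
        if (yieldEachStep)
            YIELD();        /* legal here only because of do/while(0) */
        else
            continue;
    }
    return noErr;
}
```

Used as a statement on its own line, as in Listing 5, the brace-only form works fine. The wrapper matters only when the macro is the unbraced body of an if that has an else.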
OVER TO YOU
That's about it for writing threaded, high-performance ACGIs in C. I bet you
thought it was a lot more difficult than this, didn't you?
A threaded ACGI written in a high-level language offers a significant
performance increase compared to an equivalent ACGI written in AppleScript. If
you've been using AppleScript exclusively to do your HTML form processing, I
hope this article will whet your appetite to try something a bit more daring.
It's time to kick your Web site into high gear and move it over into the fast
lane!
REFERENCES
- "Futures: Don't Wait Forever" by Greg Anderson, develop Issue 22.
- "Concurrent Programming With the Thread Manager" by Eric Anderson and
Brad Post, develop Issue 17.
- MacTech Magazine articles: "Threading Apple Events" by Grant Neufeld,
Vol. 12, No. 4 (April 1996); "Writing CGI Applications in C" by John O'Fallon,
Vol. 11, No. 9 (September 1995); and "Adding Threads to Sprocket" by Steve
Sisak, Vol. 10, No. 12 (December 1994). MacTech Magazine articles can be found
at http://web.xplain.com/mactech.com/magazine/features/articlearchives.html.
- Planning and Managing Web Sites on the Macintosh: The Complete Guide to
WebSTAR and MacHTTP by Jon Wiederspan and Chuck Shotton (Addison-Wesley, 1995).
Chapter 13, "Writing CGI Applications," and Chapter 15, "Developing CGIs in C,"
are available in the file "Developing CGIs.pdf" included in the WebSTAR
documentation at ftp://ftp.starnine.com/pub/docs/webstar_doc.sea.hqx.
KEN URQUHART received his Ph.D. in physics in 1989 and has been dividing his
time between physics and computer science ever since. Ken's work has taken him
and his wife from North America to Japan and back again. Their cats (who travel
with them wherever they go) have been extremely good sports about international
travel. Ken's pretty sure the cats understand English perfectly well -- they're
simply choosing to ignore him unless they want food, body heat, or the litter
box cleaned.*
Thanks to our technical reviewers Kevin Arnold, Steve Sisak, and Michelle
Wyner.*