CGIs and WebStar 2
Volume Number: 11
Issue Number: 8
Column Tag: Internet Development
CGI Applications and WebSTAR
Make your World Wide Web server sing and dance.
By Jon Wiederspan, jonwd@tjp.washington.edu
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
Last month I walked you through the basics of creating a CGI application ("CGI Applications and WebSTAR: Have some fun with your World Wide Web server"). The result was a framework that can be used to create almost any CGI application you can think of. If you're like me, though, having a framework is terribly frustrating because it takes you right to the brink of being able to do what you want and then stops. This month I'll push you right over the edge as we plunge headlong into a real-world application of CGI programming.
Before we get into the meat of a new script, though, there are several preliminary topics to cover. I will cover how to use existing OSAXes to speed up your AppleScripts, how to get useful error replies, how to parse and decode the information that is passed to the CGI application, and the basics of how FORM elements work. Once all of that is covered you will be ready to tackle this month's project, a CGI that allows people to leave comments about your site (or whatever you want). Because I have already crammed a ton of information into this article, I will not be repeating anything from last month's article. If you have not read it recently, you might want to give it a quick review or at least keep it close at hand while you read this one.
This article is about writing CGI applications in AppleScript. When I refer to the script, I am talking about the AppleScript code that will be compiled to become the CGI application. When I refer to the CGI application, I am talking about the final product. Until the script is properly compiled and saved (see last month) it is not a CGI application. Just so nobody gets confused.
Gathering the Ingredients
AppleScript has one great drawback and that is speed. As your scripts get larger you will start measuring their running time in minutes instead of seconds. That is one of the prices you commonly have to pay for an easy language and it is fine for many uses. It is totally unacceptable for a CGI application, though. WebSTAR will only wait so long before it just gives up and returns an error to the user (that is determined by the timeout value that you set for WebSTAR).
The answer to the speed problem is an OSAX (Open Scripting Architecture eXtension). OSAXes are code extensions to AppleScript that provide new commands. These commands can speed up your scripts greatly by doing in one command what would otherwise take many lines of AppleScript. An example could be an OSAX caseConvert that would convert all lower case characters to upper case in a text string. There are hundreds of OSAXes available on the Internet and various online services (see Uniform Resource Locator for some good sites to check).
For this script I will be using four OSAXes:
ScriptTools OSAX v. 1.3.1 - This is a suite of OSAXes that provide new commands for file management, sound handling, and more. We will only be using the File IO OSAX, but I recommend installing the rest as well because of their usefulness. (freeware)
Tokenize OSAX - This OSAX is part of the ACME Script Widgets collection. It takes in a string of text and a list of one or more text delimiters. It breaks the text string up at the delimiters, deletes the delimiters, and returns the intervening text chunks as list items. As an example, using the delimiters {":", "/"}, the text string "http://www.uwtc.washington.edu/Computing/WWW/" would be returned as the list {"http", "www.uwtc.washington.edu", "Computing", "WWW"}. (shareware, $10)
DecodeURL OSAX - This OSAX takes in a string of text and decodes it according to the guidelines in the URL specification. (freeware)
DePlus OSAX - This OSAX takes in a string of text and replaces all occurrences of the plus (+) character with a space. This is needed to take care of information returned by Mosaic or Netscape Navigator. (freeware)
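To make the Tokenize behavior described above concrete, here is a minimal sketch that reproduces it with AppleScript's own text item delimiters. It is not the OSAX itself: the handler name tokenizeText is hypothetical, the sketch assumes a version of AppleScript that accepts a list of several delimiters (older versions use only the first one), and it assumes that empty chunks between adjacent delimiters are dropped, as the example list suggests.

```applescript
-- Hypothetical stand-in for the Tokenize OSAX, for illustration only.
on tokenizeText(theText, delimiterList)
	-- Split the text at every delimiter, restoring the old setting afterwards.
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to delimiterList
	set rawItems to text items of theText
	set AppleScript's text item delimiters to oldDelims
	
	-- Drop the empty chunks left between adjacent delimiters (assumed behavior).
	set tokens to {}
	repeat with anItem in rawItems
		if (contents of anItem) is not "" then
			set end of tokens to (contents of anItem)
		end if
	end repeat
	return tokens
end tokenizeText

tokenizeText("http://www.uwtc.washington.edu/Computing/WWW/", {":", "/"})
```

With those assumptions the last line yields the same four items as the example in the Tokenize description.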
Installing OSAXes
To install these OSAXes, put them in a folder called Scripting Additions in the Extensions folder in your System Folder. You cannot simply drag them over the System folder, as you can with other system extensions. AppleScript loads OSAXes dynamically, so it should not be necessary to restart your computer after adding new OSAXes. I still recommend that you do restart if possible, though, to reduce the chances of problems. Note: remember to take the OSAXes out of their folders before moving them to the System folder.
All four of these OSAXes are available at ftp://ftp.first.com/pub/Mac/prog/as/. Tokenize is part of an archive called ACME Script Widgets 2.0. It is a good idea to download these OSAXes and have them installed in your system before you read any further. That way you will have the documentation on hand (I don't have room here to cover specific commands) and be able to jump right into the demonstration script.
Error Handling
Script Editor does a grammar and syntax check on your script every time you compile it (save it as an application or as a compiled script or click on the Check Syntax button). That doesn't guarantee that the script will run without errors, though. It only means that every command you have used is known to AppleScript and seems to have the proper information provided for it. There is no way the Script Editor can know about other problems that might occur at runtime, such as a timeout (if the script takes too long to finish) or a failure to pass the proper information back to WebSTAR. In addition, since AppleScript doesn't type variables until they are used at runtime, you could easily have the wrong type of information being passed to a command, such as a variable containing a list instead of text.
Any one of these situations could result in an error in your CGI application. Unfortunately, the error information that you get back from WebSTAR and the operating system is not very useful. It may consist of an "informative" statement like "CGI application failed" and an error code number, or you may just get a blank page. In addition, the error may cause your CGI application to post an error notification on the server (a blinking icon in the menubar or a dialog box) and the CGI won't work again until you acknowledge the notification.
Luckily, AppleScript contains a construction for handling errors so they won't tie up your server. The compound statement looks like this:
try
    [your code goes here]
on error errMsg number errNum
    [error handling code goes here]
end try
If an error occurs in the code between the try and on error statements, execution will pass to the on error handler. AppleScript always includes two pieces of information for the handler, the description of the error (errMsg) and the error number (errNum). This information allows you to take different actions depending on the source of the error. Remember to keep the code in the error handler simple, since an error can also occur there, in which case the script will stop.
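As a small illustration of taking different actions depending on the error number, here is a sketch that raises an error on purpose and then inspects it. The handler riskyStep, the variable outcome, and the error number 100 are all hypothetical and exist only for this example.

```applescript
-- Hypothetical stand-in for real CGI work; it fails on purpose.
on riskyStep()
	error "Sample failure" number 100
end riskyStep

try
	riskyStep()
	set outcome to "No error"
on error errMsg number errNum
	-- Branch on the error number to choose a response.
	if errNum is 100 then
		set outcome to "Handled custom error: " & errMsg
	else
		set outcome to "Unexpected error " & errNum & ": " & errMsg
	end if
end try
outcome
```

Note that the handler does nothing more than build a string, in keeping with the advice to keep error-handling code simple.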
The first thing we will do is take last month's code and add the try construction. For optimum protection, the try statement should be the very first line inside the WWWΩsdoc event handler and the on error handler should be the last thing. The main thing we want to do in the error handler is to return information to the user so that they will know (1) what happened and (2) what to do about it, and so WebSTAR won't be left waiting for a reply. The following code will do just that.
on error errMsg number errNum
    set return_page to http_10_header ¬
        & "<HTML><HEAD><TITLE>Error Page</TITLE></HEAD>" ¬
        & "<BODY><H1>Error Encountered!</H1>" & return ¬
        & "An error was encountered in this script." & return
    set return_page to return_page ¬
        & "<H3>Error Message</H3>" & return & errMsg & return ¬
        & "<H3>Error Number</H3>" & return & errNum & return ¬
        & "<H3>Date</H3>" & return & (current date) & return
    set return_page to return_page ¬
        & "<HR>Please notify the webmaster at " ¬
        & "<A HREF=\"mailto:webmaster@your.domain.name\">" ¬
        & "webmaster@your.domain.name</A>" & " of this error." ¬
        & "</BODY></HTML>"
    return return_page
end try
This code is pretty straightforward. First it builds an HTML text document to return and stores it in return_page, then it returns return_page to WebSTAR. Notice that the first thing added to return_page is the http_10_header. I can't say often enough how important it is to return a proper header for the client (you may think I can, but I really can't). The HTML document tells the user what the error message was, the identification number of the error, and when the error occurred. Then, at the bottom of the page, it gives the mail address to send the information to (you will want to change this address, of course, to the real address for the webmaster of your site). If the user has a mailto-capable client, they can mail the information right from this error page.
Other Common Errors
The try construction won't handle all errors, though. It only catches those that cause the CGI application to stop execution without crashing the entire server. Here are the most common remaining problems that you might encounter.
1) The majority of problems are with incorrectly installed OSAXes. These usually result in errors when you try to compile or check the script. The first thing you should do when AppleScript complains about a command not found or an event not available is to re-install the OSAXes and restart your computer (the one the CGI application is running on).
2) The DecodeURL OSAX does not process 8-bit (international or double-byte) text correctly. The CGI application will run but may return garbage or altered text instead of the international characters. There is a modified DecodeURL OSAX from Motoyuki Tanaka that correctly processes international text. The OSAX is available from his home page at http://mtlab.ecn.fpu.ac.jp/guruguru/others/DecodeURL_8bits.hqx.
3) It is important to remember that all processing for this CGI ends when the information is returned to the user. A CGI application is launched when WebSTAR sends the proper Apple event to it and it ends when the reply to that event is sent back by a return statement. Some people think they can save processing time by returning information quickly to the user, then processing the rest of the information. Nope.
4) WebSTAR will not wait around forever for a reply from your CGI application. The amount of time it will wait (in seconds) is defined by the timeout value setting for the server. If you suspect that your CGI application will take a long time to process the information, remember to increase this setting. If WebSTAR times out, it will return the -1701 error code, meaning that no reply was returned from the CGI application.
Improved Error Reporting
There are many ways that the error handler could be improved, of course. You could use the error number to provide a more informative error message for the user or you could return the contents of some key variables to show their intermediate values. You could also log the information to disk, or use another script (one that works) to automatically send a mail message to yourself about the error so you don't have to count on the user doing so. Be creative. Go wild. The world's your oyster.
Figure 1: A sample form page
Forms
Now it is time to talk about forms, by which I mean HTML documents that contain FORM elements. Forms allow you to gather input from your users by providing text fields, checkboxes, lists, and other items which the user can manipulate. Every form also contains a button so users can send the information in the form to the server. Forms are very important because they are really the only way to get extensive user interaction with your Web pages. The search interface is too limited for getting much information and maps are even more so. Figure 1 shows what a form page might look like.
The selection in Text 1 shows how the form portion in Figure 1 looks as HTML text. This sample displays one type of input, the text field, but there are others such as radio buttons, checkboxes, scrolling lists, and popup menus. There is not room here to cover how to write the forms themselves, but any good HTML book will cover them in detail. Instead, I will focus on two topics: how a form page interacts with a CGI application (general) and how the form data is encoded before it is sent to a CGI application (specific). I'm assuming in the following sections that you are already familiar with interactions between a WWW client, WebSTAR, and CGI applications, as outlined in my last article.
<FORM ACTION="/cgi/Guestbook.acgi" METHOD=post>
<P>Your name:<INPUT TYPE="text" NAME="name"
VALUE="" MAXLENGTH=40>
<BR>Your comment:<TEXTAREA NAME="comment" ROWS=4 COLS=60></TEXTAREA><BR>
<INPUT TYPE="submit" NAME="S" VALUE="Add to Guestbook">
<INPUT TYPE="reset" NAME="R" VALUE="Clear this Form">
</FORM>
Text 1: Sample form section of an HTML document
Processing Forms
Every form page has to have at least one SUBMIT-type input defined. This presents a button (for graphical browsers) that allows the user to send the information from the form fields off for processing by a CGI application. When the button is clicked by the user, all of the information in every field is packaged together into one large piece of form data to be sent off for processing. The very first tag in the form section defines where the form data will go and what method will be used to send it. The tag looks something like this:
<FORM ACTION="/cgi/Guestbook.acgi" METHOD=post>
The METHOD defines how the data in the form fields will be sent (in this case, using the POST method) and the ACTION provides a URL to the CGI application that will receive and process the data. In most cases, like this one, the CGI application will be on the same server as the form page that calls it, so the URL is just the path (relative to WebSTAR) and file name. You can also link to CGI applications on other servers by specifying a complete URL. That is a good way to distribute the processing load over several computers, or a good way to make someone angry, depending on whether you have permission to do it or not.
This brings up a good use for last month's example. That basic CGI is perfect for testing whether you are using the correct path to call the CGI application or not. Put it where you would normally put the CGI that you plan to use to process the form and give it the same name. Since it doesn't do any special processing or use any OSAXes, it is practically guaranteed to work if you call it correctly. Another easy way to tell if you are using the correct path is to check whether your CGI gets launched or not. First make sure the CGI application is not running, then bring up the form page and submit some data. Now check and see if the CGI application is running. If it is, then you used the correct path and the problem is somewhere else.
Form Data Encoding
If you look at the form in Figure 1 for a minute you should notice one problem: how do you pass all of that information to the CGI application? There are only two fields there, but imagine if there were a dozen or more. Well, there are lots of ways it could be done, but the people who wrote the CGI specification chose the following method:
First, for every item in the form (meaning every field, checkbox, etc.) the name of the item is concatenated with the data in the item using an equals (=) character to delimit the two. Thus you have a string of text that looks like name=data. Then, each of these strings is concatenated into one long string using the ampersand (&) to delimit each pair. The final string looks like this:
name1=data1&name2=data2&....&namen=datan
You can see that this introduces a problem, namely what if there are = or & characters in the data itself? How do you differentiate between those and the delimiters? The CGI specification solves this by specifying that any occurrences of these characters in the data are to be encoded as %XX, where XX is the hexadecimal equivalent of the character. Using this method, each = becomes %3D and each & becomes %26. This is known as URL encoding because it is also part of the specification for encoding special characters that might occur in a URL. With URL encoding there are many special characters that are encoded, such as + (%2B), <space> (%20), and % (%25). In addition, client software is allowed to encode other characters if it is deemed necessary to ensure their transmission in data.
There is one small problem with this encoding. There are two big players in the Web scene who do not strictly follow this pattern, those being NCSA (Mosaic) and Netscape (Netscape Navigator). Early on, Mosaic decided to encode all occurrences of a space as + instead of %20. Netscape followed suit with their browser, presumably because much of the original Mosaic team was involved and tended to make the same decisions the second time through. Since many commercial browsers are based on the Mosaic (or Enhanced Mosaic) code, this problem is spreading.
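The encoding rules described in this section can be sketched in a few lines of plain AppleScript. This is only an illustration of what a client does before sending form data, not code from any browser: the handler names replaceText and encodeFormValue are hypothetical, only the five characters named above are encoded, and the strict %20 form is used for spaces rather than the + variant.

```applescript
-- Hypothetical helper: replace every occurrence of one string with another.
on replaceText(theText, searchString, replacementString)
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to searchString
	set theItems to text items of theText
	set AppleScript's text item delimiters to replacementString
	set newText to theItems as text
	set AppleScript's text item delimiters to oldDelims
	return newText
end replaceText

-- Hypothetical helper: URL-encode the characters discussed in the text.
on encodeFormValue(theValue)
	-- Encode % first so the escapes added below are not encoded again.
	set encoded to replaceText(theValue, "%", "%25")
	set encoded to replaceText(encoded, "=", "%3D")
	set encoded to replaceText(encoded, "&", "%26")
	set encoded to replaceText(encoded, "+", "%2B")
	set encoded to replaceText(encoded, " ", "%20")
	return encoded
end encodeFormValue

-- Build a name=data string for two sample fields.
set formData to "name=" & encodeFormValue("R&D Team") ¬
	& "&comment=" & encodeFormValue("1 + 1 = 2")
```

After this runs, formData contains name=R%26D%20Team&comment=1%20%2B%201%20%3D%202, so the only unencoded = and & characters left in the string are the delimiters.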
Processing Form Data
With this information, you are now ready to make use of the data that users submit from your form pages. There are two types of information passed to a CGI application by WebSTAR. The first information is external, meaning that it is received from the WWW client as part of a form page submission, or a map click, or something similar. This information may be in one or more of the variables path_args, http_search_args, or post_args, depending on the method used, and should be URL encoded. The other information is internal and is generated by WebSTAR directly. This includes all of the other variables such as referer, script_name, and method and is not encoded. There is also one special variable full_request which contains the entire request as it was sent from the WWW client to WebSTAR. Some of the text in this variable will be URL encoded.
Because all of the data in the external information is encoded, you will need to decode the information before you can use it. In addition, if the data in one of the variables came from a form page (and I'm assuming here that it did) then the data must be parsed into individual fields and the field data separated from the field names. The following script fragment will do both functions: parsing and decoding. In this example, the script processes the information in the post_args variable, which is where the data would be put from a form using the POST method for submission. The same process works for http_search_args if the form uses the GET method.
set postarglist to tokenize (dePlus post_args) ¬
    with delimiters {"&"}
set oldDelim to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"="}
repeat with currpostarg in postarglist
    set currname to first text item of currpostarg
    if currname = "name" then
        set messageName to (Decode URL ¬
            (last text item of currpostarg))
    else if currname = "comment" then
        set messageBody to (Decode URL ¬
            (last text item of currpostarg))
    else if currname = "S" then
        -- ignore it. That's the Submit button.
    else
        -- generate an error to report the unknown field
        error ("Unknown field in post_args: " & currname) ¬
            number 100
    end if
end repeat
set AppleScript's text item delimiters to oldDelim
The first line of this fragment makes use of two OSAXes to do the bulk of the processing. There are actually three commands on this line. The first, dePlus post_args, is in parentheses to indicate that the result of that command is to be the input for the surrounding command. dePlus quickly scans the entire block of text passed in post_args and converts all occurrences of + to a space. The second command, tokenize (...) with delimiters {"&"}, takes the text output of dePlus and converts it to a list with each item representing the text between ampersand (&) characters. Note that the ampersands are removed as part of this process. The result is a list in which each element is a name=data pair. The third command is an assignment, set postarglist to ..., which assigns the list output from Tokenize to the variable postarglist.
The rest of the script fragment processes the list one item at a time, looking at the name portion of the item, and then assigning the data portion of the item to the proper variable based on the name. Although you could use the Tokenize OSAX again to do this, I have chosen to do it with AppleScript's text item delimiters instead to show you the alternative. Since there should be only one occurrence of the delimiter in the item text, the speed should be about the same. First the script saves the current text item delimiter settings (to be restored later), then it uses a repeat loop to look at each list item in turn. Each time through the loop, the script extracts the name (first text item of currpostarg) and uses a series of if...then statements to see what the name is. When a name match is found, the data (last text item of currpostarg) is extracted, passed to DecodeURL to convert all of the hexadecimal codes back to the original characters, then assigned to a variable for use later on in the script.
If you want this script to handle more fields or fields with different names, this is the section to change. To handle more fields, you need to add more else if statements to the script, one for each new field. The statement should assign the new field's data to a variable or concatenate it to an existing variable somehow. If you want to handle different fields, you need only change the text string that is used to match the field name. This script is set so that the default action, if the name of the field doesn't match what you were expecting, is to generate an error message. The line that does this (error ("Unknown field in post_args: " & currname) number 100) causes processing to jump to the error handler, telling it what the unexpected field name was. The number 100 was chosen at random and has no meaning here except to make sure that a number is passed.
Normally, you would extract the information from the post_args variable by reversing exactly the procedure that was used to create it, meaning that you would parse the name=data pairs, then extract the data, and then do all of the decoding (including converting + to space). You could do that, but as it turns out it is safe to run DePlus on the entire post_args at once because the + character is not used as any kind of delimiter. This can save a significant amount of processing time on a large form.
If you are reading information from post_args that is not from a form (or from http_search_args or path_args), you can decode the entire data with the one line set <variable> to Decode URL (dePlus post_args). Note that it is necessary to do the + to space conversion before decoding the hexadecimal encoded characters. If you reverse that process you might decode some more + characters, which would then incorrectly be converted to spaces.
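To see why the order of the two steps matters, here is a sketch that performs both steps in plain AppleScript and runs them in each order on the same input. None of this is the DePlus or DecodeURL OSAX: the handlers replaceText, hexValue, and decodeURLText are hypothetical stand-ins, the decoder assumes that every % is followed by two valid hexadecimal digits, it treats each %XX as a single character code (so it is no better at multi-byte text than the original OSAX), and it relies on the character id form, which is an assumption about the AppleScript version in use.

```applescript
-- Hypothetical helper: replace every occurrence of one string with another.
on replaceText(theText, searchString, replacementString)
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to searchString
	set theItems to text items of theText
	set AppleScript's text item delimiters to replacementString
	set newText to theItems as text
	set AppleScript's text item delimiters to oldDelims
	return newText
end replaceText

-- Hypothetical helper: convert two hexadecimal digits to a number.
on hexValue(hexPair)
	-- Both cases are listed so the lookup works whether or not case is ignored.
	set hexDigits to "0123456789ABCDEFabcdef"
	set total to 0
	repeat with aChar in (characters of hexPair)
		set digitIndex to offset of (contents of aChar) in hexDigits
		if digitIndex > 16 then set digitIndex to digitIndex - 6
		set total to total * 16 + (digitIndex - 1)
	end repeat
	return total
end hexValue

-- Hypothetical helper: turn each %XX back into the character it encodes.
on decodeURLText(encodedText)
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "%"
	set pieces to text items of encodedText
	set AppleScript's text item delimiters to oldDelims
	set decoded to item 1 of pieces
	repeat with i from 2 to (count of pieces)
		set piece to item i of pieces
		if (length of piece) ≥ 2 then
			set decoded to decoded & (character id (hexValue(text 1 thru 2 of piece)))
			if (length of piece) > 2 then set decoded to decoded & (text 3 thru -1 of piece)
		else
			-- Not a complete escape; pass it through unchanged.
			set decoded to decoded & "%" & piece
		end if
	end repeat
	return decoded
end decodeURLText

-- The user typed "a+b c"; the client sent the + as %2B and the space as +.
set rawValue to "a%2Bb+c"
set rightOrder to decodeURLText(replaceText(rawValue, "+", " "))
set wrongOrder to replaceText(decodeURLText(rawValue), "+", " ")
{rightOrder, wrongOrder}
```

Converting + to space first recovers a+b c. Decoding first turns %2B into a literal + that the second step then wrongly converts, giving a b c.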
That is all of the preliminary information. You are now fully equipped for the real world example.
A Comment Page: Divide and Conquer
Now you are ready for some real fun! This script will use the script in last month's article as a base, add the features discussed above, show you even more features, and use the results to do something we all want: provide a page where people can leave nice comments about our work. In addition, this script will display some tricks for getting better response speed from your CGI applications. By response speed I mean the speed with which the CGI application returns data back to the user. This is the only speed that matters when you are writing a CGI application.
Speed Tricks
The first trick is to cache the page in memory. Any CGI that needs to read data from a disk file is going to be slowed down by the disk access. If you can read the data once and cache it into memory you can significantly speed up processing time. This comes at a slight price, of course, because your CGI will use more memory to cache the pages. Most pages are relatively small, though (<32K), so this shouldn't be much of a problem until you try to handle dozens of pages or more. For this example we will read the entire page in when the CGI application is launched so it will be ready when the first request comes in. Since all of this processing makes the CGI application a bit slow to launch, I recommend that you set it to launch at the same time that WebSTAR does so it is ready for the first request.
The second trick is to put off writing data to disk until idle time. Writing information out to disk is even more time consuming than reading it in. Writing to the end of a file is not too bad because it always takes constant time. You just position yourself at the end of the file, then extend it with the new data. In a comment page, though, it is preferable to have the newest comments at the top so people see something new immediately when they visit the page. This requires that you insert the new text into the beginning portion of the file and move all of the rest of the file to make space. Doing this takes longer and longer as the file increases in size.
We will get around this by keeping all of the information in memory and only writing it to disk when the CGI application is idle. It will still take longer to write the data as the file grows in size, but you are using spare time so it doesn't matter much. In addition, you can save up several entries before writing to disk so you write much less often. Of course, this requires that your server be at least somewhat stable or else you lose all of the information cached in memory when it crashes. The more stable your server is, the longer you can go between writes and the bigger speed improvement you will see.
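The deferred-write idea can be sketched on its own, separately from the guestbook script. The following is a minimal sketch under several assumptions: it is saved as a stay-open application so that it receives idle events, it uses the standard file commands (open for access, write, close access) rather than the File IO OSAX, it writes to a file in the temporary items folder, and every name in it (cachedPage, hasUnsavedComments, addComment) is hypothetical.

```applescript
-- Hypothetical names throughout; a stay-open application is assumed.
property cachedPage : ""
property hasUnsavedComments : false

on addComment(commentHTML)
	-- Fast path: change the copy in memory and mark it as unsaved.
	set cachedPage to commentHTML & return & cachedPage
	set hasUnsavedComments to true
end addComment

on idle
	-- Slow path: touch the disk only when there is something new to save.
	if hasUnsavedComments then
		set guestPath to (path to temporary items as text) & "Guestbook.html"
		set fileRef to open for access file guestPath with write permission
		try
			set eof fileRef to 0 -- discard the old contents
			write cachedPage to fileRef
			close access fileRef
			set hasUnsavedComments to false
		on error
			-- Always release the file, even if the write failed.
			close access fileRef
		end try
	end if
	return 60 -- ask for the next idle event in about 60 seconds
end idle
```

Several calls to addComment between idle events result in a single write, which is where the saving comes from. A longer return value trades more speed for more data at risk if the server crashes.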
File I/O OSAX
There are several ways to read from and write to disk files in AppleScript, depending on what version of the operating system and AppleScript you are using. I won't be able to cover all of these in this article, so I have chosen to use the ScriptTools package by Mark Alldritt. This suite of several OSAXes includes one for file I/O called File IO OSAX. This OSAX provides commands to open and close a file, read in data, write data, get the size of a file, and get your current position in a file (where you will read from or write to next). The documentation is excellent, so I will leave the exact syntax to your own study.
The File IO OSAX reads data from disk one line at a time (a line being all of the text up to the next carriage return). By default it limits a line to 1024 bytes, but that can be increased to the limit of the available memory. Since each line requires a disk access and the overhead on this is significant, you can cause the script to run more quickly by keeping your text file to as few lines as possible. HTML text gets its formatting from tags and not from carriage return or linefeed characters, so it does not hurt the presentation of the document to have all of the text on one line. Here is what the HTML text file I use for this script looks like:
<HTML><HEAD><TITLE>Jon's GuestBook</TITLE></HEAD><BODY><H1>
Welcome!</H1>I don't know why these things are so popular,
but they are. Leave your pearls of wisdom here.<P><I>If you
want to format your comment you will need to use HTML tags.
Please do not link to large graphics.</I><HR>
<P><FORM ACTION="/cgi/Guestbook.acgi" METHOD=POST><BR>Your
Name:<INPUT TYPE="text" NAME="name" VALUE="" SIZE=20
MAXLENGTH=40><P>Your Comment:<TEXTAREA NAME="comment"
ROWS=3 COLS=50></TEXTAREA><BR><INPUT TYPE="Submit"
VALUE="Add to Guestbook"><INPUT TYPE="reset" NAME="S"
VALUE="Clear this Form"></FORM><HR>
<!-- DIVIDE_HERE -->
<P><B>Mike Hon</B> (rcwusr.bp.com.) <I>Wednesday, May 31, 1995
7:30:58 AM</I><BR>V.impressive. Hope to see you speak at
WebEdge II.
<HR><A HREF="/JonWiederspan/JonW.html">[Return to Jon's Home
Page]</A><BR><A HREF="/cgi/Guestbook.acgi?view">[Reload the
guestbook]</A></BODY></HTML>
This is actually only four lines, as shown by the indentation. That means that I only have to access the disk four times to read everything in. I have left one user comment in to show how I also save each comment on a single line so the file does not grow too quickly. If you really want to save some accesses in a large file you could probably cram several comments on each line, but then you're talking about a seriously ugly text file.
One line of the file is used as a marker to indicate where the new comments should be inserted. The marker is embedded in an HTML comment so it is never displayed by the client software. When the script reads the file into memory, it divides the file so that everything from the start of the file to and including the marker is in the top portion (header_data) and everything after the marker is in the bottom portion (footer_data). This allows new comments to easily be added by prepending them to the footer_data and then concatenating the two pieces back together to reform the file.
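The divide-and-rejoin step can be tried on a small piece of text that is already in memory. This sketch is not the article's file-reading code: it splits with text item delimiters instead of reading line by line, the sample page text is made up, and it assumes the marker line appears exactly once.

```applescript
-- The marker line, including the carriage return that ends it.
set marker to "<!-- DIVIDE_HERE -->" & return

-- Made-up page text standing in for the file contents.
set pageText to "<H1>Welcome!</H1>" & return & marker ¬
	& "<P>Older comment" & return

-- Split the page at the marker (assumed to occur exactly once).
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to marker
set pageParts to text items of pageText
set AppleScript's text item delimiters to oldDelims

-- Everything up to and including the marker goes in the top portion.
set header_data to (item 1 of pageParts) & marker
set footer_data to item 2 of pageParts

-- A new comment is prepended to the bottom portion, then the page is rejoined.
set footer_data to "<P>Newer comment" & return & footer_data
set rebuiltPage to header_data & footer_data
```

In rebuiltPage the newer comment sits directly below the marker and above the older one, which is the newest-first ordering the comment page wants.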
The code that reads the file into memory is not in a handler. Instead, it is at the very beginning of the script. This causes it to be executed immediately upon the launch of the CGI application. This can be a problem with some CGI applications because of the way WebSTAR communicates with them. When WebSTAR receives a request for a CGI application that is not currently running, it first tells the application to launch itself, then it sends the CGI request. If the CGI application has a lot of processing to do at launch, it will likely start processing the CGI request before the startup processing finishes (it can run both processes at the same time) which can result in incomplete results or even failure for the first request. In these cases it is especially important to make sure that the CGI is launched before or immediately after WebSTAR launches so it is ready to process any possible customer requests.
Reading The File Into Memory
Here is the section of code that handles reading the file into memory. The first line puts the name of the file on disk into a variable (guestFile). This variable is shared as a global variable and used by all routines that read from or write to this file.
set guestFile to "Macintosh HD:MacHTTP Server:Guestbook.html"
The next step is to read the file into memory. The script opens the file, then reads line by line, adding each line to the header_data until a line is encountered that contains the marker (<!-- DIVIDE_HERE -->). Notice how each line needs to have a carriage return added because the OSAX strips that character out when it reads from the file. After the marker is found, the script reads the rest of the lines, this time adding them to the footer_data. Since we're not looking for a marker at this point the script saves a little time by adding the file data directly to footer_data.
-- open the file for reading
set fileRefNum to open file (guestFile as alias)
-- initialize storage for the first part of the file
set header_data to ""
-- read every line until the marker is detected
repeat
    set currLineData to read file fileRefNum
    set header_data to header_data & currLineData & return
    if currLineData = "<!-- DIVIDE_HERE -->" then
        exit repeat
    end if
end repeat
-- initialize storage for the rest of the file
set footer_data to ""
-- read in the rest of the lines
repeat
    set footer_data to footer_data & (read file fileRefNum) ¬
        & return
end repeat
-- you're done with the disk file for now
close file fileRefNum
This entire section of the script is wrapped in a try...on error construction so that any errors that occur will go to the error handler shown. The first thing the error handler does is close the file, since you will always want to do that so the file is not damaged in the event that the CGI application crashes. It then tests to see what type of error occurred. The only error the script specifically handles is the one that indicates the end of the file has been reached, at which point the script processing stops and waits for another event to be sent. It would be a good idea to also test for other common errors such as a non-existent or busy file.
try
    [code to read file into memory goes here]
on error errMsg number errNum
    close file fileRefNum
    if errMsg = "End of file error." then
        -- we're all done reading data.
        -- ignore the error
    else
        return OK_header & "<TITLE>ERROR!</TITLE>" ¬
            & "<H1>Error Notice!</H1><B>Error Number:</B> " & errNum ¬
            & "<BR><B>Error Message:</B> " & errMsg
    end if
end try
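Comparing the text of errMsg works, but testing errNum is usually less fragile. The sketch below maps a few standard Mac OS File Manager result codes (-39 end of file, -43 file not found, -47 file busy, -49 file already open for writing) to short messages. The handler name describeFileError is hypothetical, and it is an assumption that the file OSAX you use passes these standard numbers through unchanged, so check its documentation before relying on this.

```applescript
-- Hypothetical helper: map a File Manager error number to a short message.
-- Assumes the file-reading OSAX reports standard Mac OS error numbers.
on describeFileError(errNum)
    if errNum = -39 then
        return "end of file"
    else if errNum = -43 then
        return "file not found"
    else if errNum = -47 then
        return "file is busy"
    else if errNum = -49 then
        return "file is already open for writing"
    else
        return "unexpected error " & errNum
    end if
end describeFileError

try
    -- simulate the error raised when the guestbook file is missing
    error "File not found." number -43
on error errMsg number errNum
    set problem to describeFileError(errNum)
end try
```

In the guestbook script, the else branch of the error handler could call a helper like this to build a friendlier error page.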
Adding New Comments
The CGI event handler loop is pretty much the same as I described earlier. The form page has only two fields, so processing is very quick. The data from the name field is put into the commentName variable and the data from the comment field is put into commentBody. The main difference is that this handler actually does something with the variables. The first statement below adds the two variables (along with the user's IP address and the current date) to the existing comments at the start of footer_data. The second builds return_page by concatenating the HTTP header, header_data, and footer_data, and that result is what gets returned to WebSTAR.
set footer_data to "<P><B>" & commentName & "</B> (" ¬
    & client_address & ") <I>" & (current date) & "</I><BR>" ¬
    & commentBody & return & footer_data
set return_page to OK_header & header_data & footer_data
Saving To Disk
One reason this CGI application is able to respond so quickly to the user is that the data is not saved to disk until later, when there is idle time. The idle handler is run whenever an idle event is sent to the CGI application. Idle events are sent when the application itself does not receive an event or do any processing for a specified period of time (it doesn't matter if the rest of the computer is busy). The value returned from the idle handler tells the system how long to wait (in seconds) before sending another idle event.
When an idle event is received, the following code is run to save the current comments to disk.
on idle
    global guestFile
    global header_data
    global footer_data
    set fileRefNum to open file (guestFile as alias)
    write file fileRefNum text header_data & footer_data
    close file fileRefNum
    -- wait 30 minutes for the next idle event
    return 1800
end idle
A quit handler runs the same code when the CGI application quits. That ensures that you don't lose any comments added since the last idle event.
Further Improvements
There are several ways that this script could be improved. Some of them were in my original script but had to be left out to fit in one article. For example, you could add a variable that tells whether any new comments have been added or not. That way you could run the idle handler a little more often and test whether information needs to be written to disk. The idle and quit handlers should also have try...on error constructions added in case there is a problem opening or writing to the file.
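A minimal sketch of those two improvements follows. It assumes the globals guestFile, header_data, and footer_data from Listing 1, and it swaps in the Standard Additions file commands (open for access, set eof, write, close access) in place of the file OSAX used in the listing. The dataChanged property and the saveIfChanged handler are hypothetical names; the CGI event handler would need to set dataChanged to true each time it adds a comment.

```applescript
-- Hypothetical flag: the CGI event handler would set this to true
-- after inserting a new comment into footer_data.
property dataChanged : false

-- Hypothetical shared handler, callable from both idle and quit.
on saveIfChanged()
    global guestFile, header_data, footer_data
    if not dataChanged then return
    try
        set fileRefNum to open for access file guestFile with write permission
        set eof fileRefNum to 0 -- discard the old contents
        write (header_data & footer_data) to fileRefNum
        close access fileRefNum
        set dataChanged to false
    on error
        -- make sure the file is closed; the flag stays set,
        -- so the save is attempted again on the next idle event
        try
            close access file guestFile
        end try
    end try
end saveIfChanged

on idle
    saveIfChanged()
    return 300 -- check again in five minutes
end idle

on quit
    saveIfChanged()
    continue quit
end quit
```

Because the idle handler does nothing when no comments have arrived, it can run far more often than every 30 minutes without adding disk activity.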
This script works with a dynamic page, meaning one that has information added to it by the user. This method of reading a page into memory and using a marker to insert data can be especially powerful with static pages, though. There the page itself does not change, so no information needs to be written to disk. If you want to add a hit counter to a page, or show the current date, or insert a nice hello message with the client's machine name, this is the way to do it in AppleScript. If you use a counter, of course, you will want to save it as a script property so you don't have to restart it at zero every time the CGI application quits.
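The hit counter idea can be sketched in plain AppleScript, with no OSAX required. The marker text (<!-- COUNTER_HERE -->), the handler names replaceMarker and buildPage, and the sample page are all assumptions made for illustration. In a real CGI the page text would be read from disk once at startup, as in the guestbook script.

```applescript
-- Kept as a script property, as the text above suggests,
-- so the count is not reset to zero on every launch.
property hitCount : 0

-- Hypothetical helper: replace every occurrence of marker in pageText.
on replaceMarker(pageText, marker, newText)
    set oldDelim to AppleScript's text item delimiters
    set AppleScript's text item delimiters to {marker}
    set pieces to text items of pageText
    set AppleScript's text item delimiters to {newText}
    set resultText to pieces as string
    set AppleScript's text item delimiters to oldDelim
    return resultText
end replaceMarker

-- Hypothetical page builder for a static page held in memory.
on buildPage(pageTemplate)
    set hitCount to hitCount + 1
    return replaceMarker(pageTemplate, "<!-- COUNTER_HERE -->", hitCount as string)
end buildPage

set pageTemplate to "<P>You are visitor number <!-- COUNTER_HERE -->.</P>"
set firstPage to buildPage(pageTemplate)
```

The same replaceMarker handler would serve for the current date or the client's machine name: only the marker and the replacement text change.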
Other Cool Uses for CGI Applications
The list of possibilities for using CGI applications with WebSTAR is nearly endless because you can use AppleScript not only to do custom processing but also to link WebSTAR with almost any other Macintosh application. Here's a short list of some of the cooler stuff that you will find already in use on various WebSTAR sites:
Interface with Eudora to receive e-mail from Web pages or to automatically mail out files, or send mail directly using the TCP/IP Scripting OSAX.
Interface with applications like Butler, 4D, FileMaker, TR-WWW, HyperCard, and AppleSearch to make databases and text documents searchable via Web pages. You can also allow users to update database information with this interface.
Provide Web pages that allow you to monitor and control network applications like ListSTAR, WebSTAR, and TribeLink's SNMP software.
Control hardware to provide live pictures of remote locations.
Make pages that change depending on the client's IP address or the WWW browser type.
Closing
So, now you know everything that I know. Well, not really, but at least you know everything I know about CGI applications. Well, not even that actually, but you do know enough to get started on some really fun projects of your own. In closing there are four points that I want to leave you with.
First, buy an AppleScript book. Whether you are a novice programmer or have ten years' experience, you will benefit from one of the several fine books on the market now. This article is only intended to get you over the first hurdle, that of how to get the information from WebSTAR and put it into a useful form. Everything from here on out is AppleScript.
Second, if you plan to continue using AppleScript heavily (and there are lots of fun things you can do with it besides CGI applications), I also recommend that you investigate one of the fine Script Editor replacements that are now available. The three I know of - ScriptWizard, ScriptDebugger, and Scripter - are all tremendous improvements and will save you hours of time both in debugging and in creating new scripts.
Third, make use of the resources that are available on the various networks. You can subscribe to the MacScripting mailing list, or just search past archives of the list, or check out the scripting section of your favorite online service. Any one of these will put you in touch with hundreds, if not thousands, of people who have already gone through the same problems you will encounter. At least one of them will probably be willing to help you avoid the trouble areas. Of course, people will be more likely to help you if you read the book first to learn about the most common mistakes.
Fourth, be a source of help. If you have benefited from this article, or from others who answer your questions, pass that help along to someone else who may be just starting out. And be gentle in correcting beginners' mistakes. You never know when the person you help might turn out to be your future employer.
Late Breaking OSAX News...
As I was finishing this article, two new OSAXes were released that would greatly simplify the process of extracting the information from forms in CGI applications. The first product is Parse Post Args OSAX by Wayne Walrath. This OSAX is a new addition to the ACME Script Widgets collection and is available free to currently registered users of the ACME Script Widgets. The second product is the Parse CGI OSAX by Document Directions. This OSAX is shareware ($10) and available at http://marquis.tiac.net/software/home.html. Both of these OSAXes perform basically the same function, combining the features of Tokenize, DecodeURL, and DePlus to create an easy and much faster interface for working with form data. They differ primarily in the way they hand information back to the user and neither is a clear winner at this point.
Listing 1: WriteComment.txt
-- this is a line termination indicator
set crlf to (ASCII character 13) & (ASCII character 10)
-- standard header for returning file data
set OK_header to "HTTP/1.0 200 OK" & crlf & "Server: WebSTAR/1.0 ID/ACGI" ¬
    & crlf & "MIME-Version: 1.0" & crlf & "Content-type: text/html" & crlf ¬
    & crlf
-- this is the guestbook text/html file
set guestFile to "Macintosh HD:MacHTTP Server:Guestbook.html"
-- this code is run on startup to read the file into memory
set fileRefNum to open file (guestFile as alias)
try
    -- initialize storage for the first part of the file
    set header_data to ""
    -- read every line until the marker is detected
    repeat
        set currLineData to read file fileRefNum
        set header_data to header_data & currLineData & return
        if currLineData = "<!-- DIVIDE_HERE -->" then
            exit repeat
        end if
    end repeat
    -- initialize storage for the rest of the file
    set footer_data to ""
    -- read in the rest of the lines
    -- (this loop ends when the read raises an end-of-file error)
    repeat
        set footer_data to footer_data & (read file fileRefNum) & return
    end repeat
    -- you're done with the disk file for now
    close file fileRefNum
on error errMsg number errNum
    -- if anything goes wrong you still want to close the file
    close file fileRefNum
    if errMsg = "End of file error." then
        -- we're all done reading data.
        -- ignore the error
    else
        return OK_header & ¬
            "<TITLE>ERROR!</TITLE><H1>Error Notice!</H1><B>Error Number:</B> " & ¬
            errNum & "<BR><B>Error Message:</B> " & errMsg
    end if
end try
-- ********************************************************
-- THIS IS THE APPLE EVENT HANDLER TO
-- ACCEPT EVENTS FROM WEBSTAR
-- ********************************************************
on «event WWWΩsdoc» path_args ¬
    given «class kfor»:http_search_args, «class post»:post_args, ¬
    «class meth»:method, «class addr»:client_address, ¬
    «class user»:username, «class pass»:password, ¬
    «class frmu»:from_user, «class svnm»:server_name, ¬
    «class svpt»:server_port, «class scnm»:script_name, ¬
    «class refr»:referer, «class Agnt»:user_agent, ¬
    «class ctyp»:content_type
    try
        global header_data
        global footer_data
        global OK_header
        -- parse post_args into a list of name=data items
        -- and convert + to space
        set postarglist to tokenize (dePlus post_args) ¬
            with delimiters {"&"}
        -- process each item to extract the data and decode it
        set oldDelim to AppleScript's text item delimiters
        set AppleScript's text item delimiters to {"="}
        repeat with currpostarg in postarglist
            set currname to first text item of currpostarg
            if currname = "name" then
                set commentName to (Decode URL ¬
                    (last text item of currpostarg))
            else if currname = "comment" then
                set commentBody to (Decode URL ¬
                    (last text item of currpostarg))
            else if currname = "S" then
                -- ignore it. That's the Submit button.
            else
                -- generate an error to report the unknown field
                error ("Unknown field in post_args: " & currname) ¬
                    number 100
            end if
        end repeat
        set AppleScript's text item delimiters to oldDelim
        -- insert the new comment at the start of the footer
        -- then build the text page to return to the client
        set footer_data to "<P><B>" & commentName & "</B> (" ¬
            & client_address & ") <I>" & (current date) & "</I><BR>" ¬
            & commentBody & return & footer_data
        set return_page to OK_header & header_data & footer_data
        -- return the text data
        return return_page
    on error errMsg number errNum
        -- OK_header is the HTTP header defined at the top of this listing
        set return_page to OK_header ¬
            & "<HTML><HEAD><TITLE>Error Page</TITLE></HEAD>" ¬
            & "<BODY><H1>Error Encountered!</H1>" & return ¬
            & "An error was encountered in this script." & return
        set return_page to return_page ¬
            & "<H3>Error Message</H3>" & return & errMsg & return ¬
            & "<H3>Error Number</H3>" & return & errNum & return ¬
            & "<H3>Date</H3>" & return & (current date) & return
        set return_page to return_page ¬
            & "<HR>Please notify the webmaster at " ¬
            & "<A HREF=\"mailto:webmaster@your.domain.name\">" ¬
            & "webmaster@your.domain.name</A>" & " of this error." ¬
            & "</BODY></HTML>"
        return return_page
    end try
end «event WWWΩsdoc»
-- The idle event is run at idle time to save the data to disk.
on idle
    global guestFile
    global header_data
    global footer_data
    set fileRefNum to open file (guestFile as alias)
    write file fileRefNum text header_data & footer_data
    close file fileRefNum
    return 1800 -- wait 30 minutes to save information
end idle
-- The quit handler saves the data to disk before quitting
on quit
    global guestFile
    global header_data
    global footer_data
    set fileRefNum to open file (guestFile as alias)
    write file fileRefNum text header_data & footer_data
    close file fileRefNum
    continue quit
end quit