Scripting Nisus Writer Express
Volume Number: 19 (2003)
Issue Number: 8
Column Tag: Scripting
Scripting Nisus Writer Express
Writing perl macros to add functionality
by David Linker
New opportunities with Mac OS X
Although we have all read about how the change from OS 9 to OS X has challenged developers as they adapt their programs to either Carbon or Cocoa, the addition of BSD has added a number of powerful tools that can be accessed from or added to applications. An example of this is the powerful word-processing program Nisus Writer, which is especially well suited for writing in foreign languages, and has a very powerful macro programming language. The current version runs in Classic, but the programmers are had at work on a Cocoa version, and have released a new version, which they call Nisus Writer Express. This version is available for evaluation as a free download, at <http://www.nisus.com/Express/>.
A perl by any other name
One of the interesting design decisions that the programmers made was that they incorporated the ability to run perl scripts as a macro language. Perl is an interpreted language that is now included on every Macintosh as part of OS X. The name "perl" comes from "Practical Extraction and Report Language", and in the words of the documentation "Perl is a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information." It has been around since 1987, and is a favorite of many for processing text files. It has a rich set of text processing capabilities, many useful libraries, and an active community of users. As such, it is an excellent choice for a macro language.
In order to try out the capabilities of this new type of macro extensibility, and learn a little about perl in the process, I decided to add back a simplified version of a feature of the Classic version of Nisus that is currently lacking in Nisus Writer Express, namely the mail merge function.
In the process of implementing this functionality, we can learn about how the programmers of Nisus Writer have connected perl and Nisus, and also learn a little about perl programming.
Making a Macro
Assuming you have installed the Nisus Writer Express beta, we can start making a perl macro. Start up Nisus Writer Express, which we will call NWE from now on. After it has started up, select "New Perl Script" from the Macro menu. In the window that comes up, type in the following:
#!/usr/bin/perl
#Nisus Macro Block
#source front
#destination new
#End Nisus Macro Block
use strict;
use warnings;
print 'Mail merge coming soon!';
Save this by choosing "Save as Macro..." from the "Macro" menu. In the dialog that comes up, you will need to choose "Nisus Perl Macro" from the file format pop-up, and the "UTF-8" choice from the text encoding. You should choose the name, which should be "Mail Merge.pl". Note that the extension should be filled in automatically if you have chosen the perl macro format. Check that the destination for the file is the /Users/<user_home>/Library/Application Support/Nisus Writer Express/Macros, where "<user_home>" represents where your user directory is, and save the file.
Now, we will try running the file before we dissect it. Choose "Mail Merge" from the macro menu. A new window should open up, and the line "Mail merge coming soon!" should be written in it.
Now, we will look at how this works. The first point to understand is that any line that starts with a pound sign is considered a comment, and is ignored by the perl interpreter. This means that the first line is a comment, but this is interpreted by the shell as instructions on where to find the interpreter for the script, which in this case is perl. In the case of a Nisus perl macro, this is not actually necessary, since NWE knows that a macro with the ".pl" extension is to be executed as perl, but it is good form to include it since all other perl scripts outside of NWE will need this line to function properly.
The next four lines are special comments that communicate with NWE to help set up the environment in which the macro runs. These take the form of a number of optional commands. The first line, "Nisus Macro Block", defines the beginning of these instructions. The command "source" defines where standard input will come from. The options are to take it from the front window (front), the window below the front window (next), the clipboard (clipboard), or no input (none). The next line is the command "destination", which defines where standard output will go, which includes the same options as the "source" command, with one additon, which opens a new window for output, with the "new" option. If no other option is specified, the output is appended to the end of the current contents. If you want to replace the current contents, you add the "replace" option. The "End Nisus Macro Block", ends the Nisus commands section.
The next two lines are commands to the perl compiler, which control how it handles ambiguous situations. The first, "use strict", invokes some error checking, such as making sure that variables are declared before they are used. This can be very helpful, since if you don't use this option and you type a variable name incorrectly, it will just create a new variable, introducing a hard to find error. The second reports warnings, which include use of uninitialized variables, dropped variables, using closed or undefined file handles, among others. These commands are not necessary, but are often recommended to learners of perl, like myself, in order to enforce good programming practices.
Finally we get to the action statement, which is a simple print statement that prints a string.
Merging mail
We are now ready to write our first version of the mail merge function. We will try to duplicate the original mail merge function of Nisus Writer. There were two files that had to be created. There was a template document and the merge file. The template document had all of the common text, with the variable text replaced by keywords surrounded by "international quote marks", or " and ", also known as a left and right guillemot (which is also the name of a bird. Go figure!). You can type the """ by pressing the option key, and then the backslash, or "<option>-\". The """ can be typed by pressing "<option>-<shift>-\". The merge document in its simplest form consists of comma-delimited text, with the first line containing the keywords, and subsequent lines the text to be substituted for each keyword, with each line representing a different copy of the template.
With this in mind, we will create a new version of the macro, which will imbed the template in the macro itself, and read the merge file from the front window. Reading in a template is more complex, and we will deal with that later. The output will be put in a new window. With that in mind, type in the following and save it.
#!/usr/bin/perl
#Nisus Macro Block
#source front
#destination new
#End Nisus Macro Block
use strict;
use warnings;
my @template = ("Dear "firstname",\n",
"Thank you for the "gift" you gave me.\n");
my @mergefile = <STDIN>;
my $line = ''; # Holder for temp strings
foreach $line (@mergefile) {
chomp ($line);
}
my @keywords = split(/,/,$mergefile[0]);
foreach $line (@keywords) {
$line = '"'.$line.'"';
}
for (my $i=1; $i<@mergefile;$i++) {
my @copy = @template;
my @subst = split(/,/,$mergefile[$i]);
for (my $j = 0; $j < @subst;$j++) {
for (my $k = 0; $k<@copy; $k++) {
$copy[$k] =~ s/$keywords[$j]/$subst[$k]/g;
}
}
print @copy;
print "\n\n";
}
Now we need to create the merge document. Open a new window, and type the following:
firstname,gift
Joe,stuffed moose
Bob,old barrel
If you want, you can save this file as a text file.
Now, with that file as the foreground file, run your macro from the macro menu. You should get the following output:
Dear Joe,
Thank you for the stuffed moose you gave me.
Dear Bob,
Thank you for the old barrel you gave me.
Let's see how this works. The beginning and set-up are the same as before. We then create an array of strings that holds the template. The "my" keyword defines the following variable to be local to the enclosing block. A variable that starts with a "$" is a scalar, or single variable, which can hold a number or string. A variable that starts with a "@", on the other hand, is an array, which can also hold numbers or strings. In this case, we are declaring a local array variable named @template, and assigning two strings to this array. Note that the strings are surrounded by double quotes rather than the single quotes we used in our first version. If we use single quotes, the strings are used exactly as they are typed. When we use double quotes, the enclosed string is interpreted by a process called "interpolation" in the perl documents. Any included variable names are evaluated, and the result placed in the string. Also, special sequences are replaced. We use this characteristic using the "\n" combination to indicate to insert a carriage return after each line.
Next, we create another local array variable, and use it to read from STDIN, the standard input, which has been set to read from the front window. We use a trick in perl, which is that we can read an entire file with one statement into an array, and it will be automatically split into lines. Each line will still have the carriage return at the end, which can mess us up if we don't remove it. We do this in the foreach loop, using the chomp function, which removes the last character of a string. The foreach function is like a for loop but without explicitly indexing every element in the array. In this case, the variable $line is assigned to each of the elements in the array @mergefile in order, and changing $line will result in changing the array element. This frees us from the indexing details.
We then need to find the keywords from the first line of mergefile array. To do this, we use the split function. It takes two arguments. The first consists of a delimiter, followed by what sequence of characters signifies a split point, followed by the delimiter again. In our case, the comma is signified as our split character The second argument is a string which is to be split. The function returns an array containing the fragments that resulted from the split, in this case, the keywords. We then add the international quotes to each of the keywords, so that we will find the keywords only when they are contained within the international quotes.
We are then ready to process the merge file data. This is done with a set of nested for loops. Note that if we use an array identifier in a context which calls for a single (scalar) variable, it returns the number of elements in the array. In order to access array elements, we use the array name preceded by a "$", and followed by square brackets enclosing the index, which starts with zero. The first for loop is repeated with each line of the merge data, and creates the array of strings that will replace the keywords. In it, we make a copy of the template to work on, and split the string with the merge data in it. We next have loops for each substitution, and for each line in the template. In the inside of the loop, we perform the actual substitution, using the "s" command. The =~ means "operate on the variable to the left, and then replace it with the result on the right". The portion between the first two slashes is the string to match, while the second portion is what will replace it. Each of these portions is treated as if it were a string in double quotes, meaning that it is interpolated. This means that variables are replaced with their values. The "g" at the end means to repeat the substitution as many times as possible.
Adding Flexibility and Style
The version we have lacks flexibility, because to change the template text we have to change the macro. It would be better if the macro read the template text from a file. A perl macro in Nisus can read data from a file, from the front window, from the next window, or the clipboard. We could put the template in one of these locations, but we would need to be certain that either the clipboard contains the correct data, or that we have the correct windows open and in the correct order. In order to keep things simpler and more reliable, we will instead use a file for the template in combination with the front window as the merge file.
A second deficiency is the lack of the ability to use different fonts and styles. Perl is designed for plain text, and when it is reading from the window within Nisus it only extracts the text, without any font or style information. One way we can convert all of the style and font information to plain text is to save the file as RTF, which is a text representation including all the page setup, font and style information. If we do this, we can turn our previous decision to save the template as a text file to our advantage.
A deficiency of using perl this way is that we can't have user input to a running perl script. We would like to be able to specify where the merge template file is, but if we include it in the merge data file, it would be different than the syntax of the previous version of Nisus uses for its mail merge function, which we are trying to emulate. Instead, we will us a built-in feature of the NWE implementation of perl to find the merge template in the same location as the merge file.
The NWE allows us to get the full path to the file associated with the frontmost window in the second element of a special array, as $ARGV[1] . This array is normally used in perl to access command line options, and if there is no front window, or the front window has not been saved, the array will be empty. If I have a file named "test.rtf" in the Documents folder of my home directory (dtlinker), the path will be "/Users/dtlinker/Documents/test.rtf". We use this string to construct the file name and path for the template and output files.
So, enter the following in Nisus Writer Express, and save to your Documents folder as "Merge template.rtf", in RTF format. Note that the keyword for the firstname is in bold, and the keyword for the gift is in italic and in a larger font size, to demonstrate font style changes. When you change these, be careful to include the international quotes on either side when you change the size or style.
Dear "firstname",
Thank you for the "gift" you gave me.
We also need to decide where to send the output. The output will be an RTF file, which means that if we just send it to a window, we will get all of the unintelligible format specifications. We could do this, and then save the file with an ".rtf" suffix, and then open it again to see the changes, but this would add extra steps. Theoretically, we could alternatively use the Nisus perl macro directive "#Send Text as RTF", and get the formatting interpreted correctly in the window, but when I tried this I ran into erratic behavior. So, we will save the output directly to a file with the ".rtf" suffix, to avoid these problems. This will also avoid the extra step of saving and reopening the window with the results.
So, the plan is to read the template from an RTF file, process it using the merge file which is located in the front window, and then save the output to another RTF file. There is just one other change that we need to make. When we save a keyword of the format ""keyword"" as RTF, the international quotes are altered to the special codes, "\'c7" and "\'c8". This means that what we are going to look for is a string"\'c7keyword\'c8".
Now, things start to get complicated. The character "\", or backslash is special in perl, and is the escape character, which means that the next character is the be interpreted in a special way. So, in order to get a backslash, we need to use two in a row. Now, recall that the string in the substitution command s is interpolated. This means that we need two backslashes in a row to get one in the search string. Since the search string is interpolated, and double quoted strings are interpolated, in order to get a single backslash we need to have four backslashes in a row!
Bells and whistles
There are a few more final details we need to take care of. The first is error detection and handling. The main errors that can occur are when we open a file, or when we try to find the path associated with the front window. We can trap these errors by using the return value of the open function, and the die function. This will send the subsequent message to STDERR, or the standard error output, and then exit the program. For perl embedded in new, STDERR is output to a modal dialog.
Another trick that adds a nice feature is the ability send a command to the shell from within a perl script. This is done by enclosing the command with backwards slanted single quotes, located at the upper left of the keyboard on my iBook. The last line of this macro opens the resulting output file in NWE using this technique.
The final version is listed below.
#!/usr/bin/perl
# Mail Merge.pl
# A mail merge macro for Nisus Writer Express
# written by David T. Linker
# 7/1/03 Version 1.1
# dtlinker@mac.com
#
# Uses template and merge file format of Nisus Classic
#
# Save the this file in the macro folder in the folder
# <user home>/Library/Application Support/Nisus Writer/Macros
#
# Make a template file, with the keywords surrounded by
# guillemots, like this: "keyword". You can get these characters
# by using <option>-\ and <option>-<shift>-\. Save the template
# file as RTF, with the last part of the name before the extension
# being " template".
#
# Make a merge data file, with the first line consisting of the
# keywords separated by commas, then the subsequent lines with
# the text to substitute separated by commas. Save the file to the
# same folder as the template file, with the same name as the template
# file, but without the word "template".
#
# Leave this file open as the front window, and run the macro.
# The output will be opened in a new window, and saved to the
# same directory as the other files.
#
# Known bugs:
# Ill-formed RTF output
# Does not handle accented characters in merge data
#
# Valuable suggestions and additions by Kino, of the Nisus list
#Nisus Macro Block
#source front
#End Nisus Macro Block
use strict;
use warnings;
# Is there a front window file, and has it been saved?
if (@ARGV<1) { # No arguments means no merge file
die "Front window not saved, or no front window\n";
}
# Now, make the name of the input and output files
my $merge_template_file = $ARGV[1]; # Get file path
$merge_template_file =~ s/\..{0,3}$//; # Remove extension
my $merge_output_file = $merge_template_file.' OUTPUT.rtf';
$merge_template_file = $merge_template_file.' template.rtf';
# Read from file at root
open (TEMPLATE,'<',$merge_template_file)
or die "Unable to open $merge_template_file";
my @template = <TEMPLATE>;
close(TEMPLATE);
# Read merge data from front window
my @mergefile = <STDIN>;
foreach my $line (@mergefile) {
chomp ($line);
}
# Extract the keywords
my @keywords = split(/,/,$mergefile[0]);
foreach my $line (@keywords) {
$line= "\\\\'c7".$line."\\\\'c8";
}
# Open the output file
open (OUTFILE,'>',$merge_output_file)
or die "Unable to create $merge_output_file";
#Do the actual merge
for (my $i=1; $i<@mergefile;$i++) {
my @copy = @template; # make a local copy to alter
my @subst = split(/,/,$mergefile[$i]);
for (my $j = 0; $j < @subst;$j++) {
for (my $k = 0; $k<@copy; $k++) {
$copy[$k] =~ s/$keywords[$j]/$subst[$j]/g;
}
}
print OUTFILE @copy; # Output the result
if ($i<@mergefile-1) {
print OUTFILE '\page'; # Insert a page break between copies
}
}
close (OUTFILE);
# Let's look at the result
`open -a "Nisus Writer Express" "$merge_output_file"`;
To use this macro, you should make the merge data document and save it, leaving it as the front window. If it has the name "Merge", with any kind of extension, make sure that you have your template stored as "Merge template.rtf" in the same folder as the merge data document. If you then run the macro and it runs successfully, there will be no messages, but you will have a file named "Mail Merge OUTPUT.rtf" in the same folder as the other files, and open as the front window.
Future improvements
This version accomplishes all of our original goals, which were to implement the basic functionality of mail merge, and learn some perl in the process. There are a number of improvements that could be added, and deficiencies addressed. Some of these are:
- We could add error checking for input data formats of the merge template and merge data.
- The output is not syntactically correct RTF, which leads to some format errors when the result is opened in other applications. We could fix the RTF by actually parsing the RTF and only changing the text, and adding page breaks between copies.
- Accented text in the merge data is not handled properly, since extended ASCII, which is read from the merge data file, is handled by special codes in RTF, which is the format for the template and output files. Adapting the macro to use RTF for the merge data file as well would fix this.
- We could implement other features of the mail merge language, which included conditionals and include files, for example.
Resources
There are a number of resources that I found useful in working on this, and which may prove useful to those who wish to write perl macros.
There were two tutorials that I found to be very helpful. The most basic, which I used most extensively, was at the University of Leeds site, at:
<http://www.comp.leeds.ac.uk/Perl/start.html>
A more detailed tutorial that I also used was found at:
<http://www.ebb.org/PickingUpPerl/pickingUpPerl_toc.html>
Full documentation of perl is available on your Macintosh by using the Terminal and typing "man perl" on the command line. This accesses the main documentation guide, which also has links to the other documentation. A very useful online resource for the perl manual pages, documentation and function descriptions, among other things, is found at:
<http://www.perldoc.com/>
One of the documentation pages that may be of interest is "perlembed", which describes "how to embed perl in your C program". In addition to the documentation, this site also contains links to many other sites of interest to perl programmers. One that I found quite intriguing is CPAN, or "Comprehensive Perl Archive Network":
<http://www.cpan.org/>
This contains a vast library of library routines for perl. One set that I looked at for this article was the modules to parse RTF files, but these would require independent installation to be used as is, and I felt that this would be more complex than I wanted for this demonstration. There are even routines for accessing URL's over the internet!
The Nisus Writer Express perl macro usage is documented in the help file, accessed by the Help menu, then clicking on "Macro Menu", then on "More about Nisus Writer Express Macros". Additional information is included in the release notes for Nisus Writer Express, which are accessible from the download page at Nisus.
Finally, an extremely valuable resource for those who use Nisus Writer is the Dartmouth Nisus list:
<http://listserv.dartmouth.edu/archives/nisus.html>
Members on this list are very helpful and knowledgeable about Nisus and a wide range of other topics. One of the list members, Kino, was particularly helpful with suggestions and techniques that were used in this macro, including the error checking on file opening, and the technique of executing a shell command to open the result file in a Nisus window.
I hope that this will spur your interest in perl in general and writing perl macros for Nisus Writer Express in particular. Maybe you will even be inspired to embed perl functionality into an application you are developing. Have fun!
David is a collector of computer languages, having programmed in Algol-60, Fortran IV, Pascal, Basic, Lisp, Forth, Logo, Modula-2, java, and various flavors of assembly. He has also written some C, but tries not to admit it. He is currently teaching himself Objective-C, and scripting under Mac OS X. You can reach him at dtlinker@mac.com.