June 96 - MPW Tips And Tricks: Scripted Text Editing

Tim Maroney

The MPW Shell contains a full-strength, high-speed text editor with scripting capabilities. It's nothing to write love letters with, because it's targeted at the ASCII format of compiler source files, but it provides the power to automate complex and repetitive tasks in ASCII text. The key to the system lies in a few editing-related commands, together with its regular expressions and selection expressions.

REGULAR EXPRESSIONS

In the MPW Shell, any search command can take one of two kinds of arguments. The first is a plain string, which matches exactly its contents and nothing else, using a simple character-by-character match. The other is a regular expression, which is a pattern that can be recognized by a finite state machine. You can't parse programming languages with regular expressions, but you can use them to recognize many patterns, including wildcards, repeating sequences, and sets of characters. Regular expressions are bracketed with either slashes or backslashes, for searching forward or backward respectively. So, for instance, the regular expression \wombat\ would search backward from the current location for the string "wombat".

There are about 20 special constructs within regular expressions, all of which are cryptically described when you execute the command line "Help Patterns" within the MPW Shell. I'll mention some of the more useful ones here. The wildcard characters are the question mark (?) and the equivalence symbol (≈, Option-X). The question mark matches any one character except the end of a line, while the equivalence symbol matches any number of such characters. For instance, /w?mb≈t/ would match "wombat" as well as "wambiklort" and "wymbt", but not "wafkambiliot", nor "wkmb" at the end of a line. Restricted sets of symbols can be given in brackets; for instance, you can search for alphanumeric characters with the pattern [a-zA-Z0-9]. The reverse of a set can be specified with the "not" symbol (¬, Option-L); for instance, /[¬a-z]/ finds any character except a lowercase letter. The start of a line can be specified with the bullet symbol (•, Option-8) and the end of a line with the infinity symbol (∞, Option-5).

    These keyboard shortcuts are for American QWERTY keyboards. Other keyboards have different layouts. For instance, on a direct neural interface keyboard, think "blue wildebeest" and raise your right ear to type the bullet symbol.

Repeating patterns can be specified in three ways. Following any pattern with a plus sign (+) means one or more instances of that pattern; for instance, the regular expression /[0-9]+/ would match any sequence of digits. An optional repeating pattern can be similarly specified with an asterisk (*), which means zero or more repetitions. The rarely seen double angle brackets can be used to specify exactly how many repetitions of a pattern are allowed. They're typed as Option-backslash («) and Option-Shift-backslash (») and enclose a single number to mean exactly that many repetitions, or two numbers separated by a comma to specify a minimum and maximum number of repetitions, or a single number followed by a comma to mean at least that many repetitions. For instance, the pattern /[a-zA-Z]«3,7»/ would find all strings composed of alphabetical characters and from three to seven letters long.
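
If you know a modern regex dialect better than MPW's, the wildcard, set, and repetition constructs described so far map roughly onto Python's re module. The sketch below is only an analogy for comparison, not MPW syntax; the constant names and the matches helper are my own illustrative choices.

```python
import re

# Rough Python equivalents of the MPW patterns discussed above.
# This is an analogy, not MPW syntax: MPW's single-character wildcard
# becomes ".", and its "any string" wildcard becomes ".*". By default
# neither crosses a line boundary in Python.
WOMBAT = re.compile(r"w.mb.*t")              # MPW: /w?mb≈t/
NOT_LOWER = re.compile(r"[^a-z]")            # MPW: /[¬a-z]/
DIGITS = re.compile(r"[0-9]+")               # MPW: /[0-9]+/
WORD_3_TO_7 = re.compile(r"[a-zA-Z]{3,7}")   # MPW: /[a-zA-Z]«3,7»/


def matches(pattern, text):
    """Return True if the compiled pattern occurs anywhere in text."""
    return pattern.search(text) is not None
```

With these definitions, matches(WOMBAT, "wymbt") is true and matches(WOMBAT, "wafkambiliot") is false, mirroring the examples in the text.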

There are a number of ways of "escaping" special characters when you want to look for something that has special meaning within regular expressions, such as a question mark or plus sign. You can escape any character with the lowercase delta (∂, Option-D), or use single or double quotes to escape strings. To find the string "wombat+", for instance, you'd need to escape the plus sign: /wombat∂+/.
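
The same idea exists in most regex dialects. As a point of comparison only (this is Python, not MPW), a backslash plays the role of MPW's escape character, and the standard re.escape function builds the escaped form of an arbitrary string for you. The variable names are illustrative.

```python
import re

# Python analogue of escaping a metacharacter: the backslash makes the
# plus sign literal instead of meaning "one or more".
literal_plus = re.compile(r"wombat\+")

# re.escape produces the escaped form of any string automatically.
escaped = re.escape("wombat+")
```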

Finally, one of the most useful constructs consists of a tagged regular expression. This allows you to associate a number between 0 and 9 with a pattern that's matched, referring to it later with the "registered" symbol (®, Option-R) followed by a digit. This is very handy when you're doing replacements. For instance, you can replace any angle-bracketed string with a parenthesized string with the following command, which would turn "<wombat>" into "(wombat)":

Replace /<([¬<>]*)®1>/ (®1)

This searches for any number of characters (except angle brackets) that are between angle brackets, assigns them the number 1, and then replaces the angle brackets with parentheses. Note that the syntax of tagged patterns requires the pattern to be parenthesized.
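
Tagged patterns correspond to what other regex dialects call capture groups. Here is a rough Python rendering of the same replacement, offered as an analogy and not as MPW syntax; the function name angle_to_paren is hypothetical.

```python
import re


def angle_to_paren(text):
    """Turn <name> into (name), keeping the text between the brackets."""
    # The parenthesized group captures the bracketed text, much like
    # tag 1 in the MPW pattern; \1 in the replacement refers back to it.
    return re.sub(r"<([^<>]*)>", r"(\1)", text)
```

Note that re.sub replaces every occurrence in the string it's given, so angle_to_paren("a <b> c <d>") converts both bracketed strings in one call.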

SELECTION EXPRESSIONS

Many editing commands (such as Replace) can take selection expressions as well as regular expressions. Selection expressions provide more ways to select text than the string matching provided by regular expressions. Common selection expressions include the following:
  • The bullet symbol (•), meaning the start of a file.

  • The infinity symbol (∞), meaning the end of a file.

  • The current selection, denoted by § (Option-6). This might have been selected with the mouse or by a Find command. § by itself indicates the selection in the target window (which I'll explain later), while pathname:§ means the selection in the file indicated by the pathname.

  • A line number, specified simply as a number.

  • The name of a marker, specified by the Mark command.

  • A range between two selection expressions, separated by a colon (:).

The above expressions require no special delimiters (they're not directional like regular expressions). Regular expressions are actually a kind of selection expression and are delimited by slash or backslash characters as usual.
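
One way to picture selection expressions is as functions that each produce a (start, end) pair of character offsets into the file. The Python sketch below models a few of the expressions listed above under that assumption. The function names are hypothetical, and treating a line's selection as including its trailing newline is a simplification of mine, not something taken from the MPW documentation.

```python
def start_of_file(text):
    """Empty selection before the first character."""
    return (0, 0)


def end_of_file(text):
    """Empty selection after the last character."""
    return (len(text), len(text))


def line(text, n):
    """Select line n (1-based), including its trailing newline if any."""
    lines = text.splitlines(keepends=True)
    start = sum(len(item) for item in lines[: n - 1])
    return (start, start + len(lines[n - 1]))


def span(first, second):
    """Range covering both selections and everything between them."""
    return (min(first[0], second[0]), max(first[1], second[1]))
```

In this model, span(start_of_file(text), end_of_file(text)) covers the whole text, and span(line(text, 1), line(text, 2)) covers the first two lines.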

Some character-skipping variants of these options are also provided, such as the position immediately after the selection, denoted by following a selection expression with an uppercase delta (∆, Option-J). These are useful in dealing with context; for instance, you may want to select a string when it's followed by another character, but not include the following character in the selection. (An example is given later in the Subword script.) Text emitted by a program like a table generator may be in a known format, such as a columnar arrangement, in which case skipping a certain number of characters will take you to the selection you need.
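
The "select it only when something follows, but leave that something out" idea has a close cousin in modern regex dialects: the lookahead. The Python sketch below is an analogy and not MPW syntax; the function name and the choice of an opening parenthesis as the following character are illustrative assumptions.

```python
import re


def name_before_paren(text):
    """Return the (start, end) span of the first name followed by "("."""
    # (?=\() checks that "(" comes next without including it in the match.
    match = re.search(r"[A-Za-z_]+(?=\()", text)
    return match.span() if match else None
```

For the string "call foo(bar)", the span covers "foo" and stops just short of the parenthesis.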

Again, the MPW Shell will give you a terse summary of selection expressions when you execute the command line "Help Selections". I'm not going to list all the minor variants here, but feel free to while away the hours in rapturous contemplation of their mysteries on your own.

EDITING COMMANDS

The most common editing commands are two that you probably use already: Find and Replace. Dialogs that stand in for these commands are built into the MPW Shell and accessible from the Find menu. You can give any selection expression as a search pattern in either of these dialogs by clicking the Selection Expression radio button instead of the default Literal button. The same commands are the basis of most editing scripts. As tools, Find and Replace take a selection expression as their primary argument. Don't confuse Find and Search! The Search command puts out its results as text, while Find actually changes the selection. In addition, Search takes a pattern -- that is, a regular expression -- while Find takes any selection expression. For example, to go to the start of a file in a script, you could give the command "Find •", but not "Search •".
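
To make the Find versus Search distinction concrete, here is a small Python sketch of the two behaviors: one function reports matching lines as text and changes nothing, the other returns a new selection. It's an analogy only. The function names, the (start, end) selection model, and the plain list-of-lines output are my assumptions, not the actual output format of the MPW Search command.

```python
import re


def search(pattern, text):
    """Like Search in spirit: report the matching lines, change nothing."""
    return [item for item in text.splitlines() if re.search(pattern, item)]


def find(pattern, text, selection=(0, 0)):
    """Like Find in spirit: return the next match as a new selection.

    The search starts at the end of the current selection; the result is
    a (start, end) pair of offsets, or None if nothing matches.
    """
    match = re.compile(pattern).search(text, selection[1])
    return match.span() if match else None
```

Calling find repeatedly and feeding each result back in as the selection walks through the matches one at a time, which is how a script would navigate with it.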

Find is the basic navigation command in most editing scripts. For instance, you can simulate the Select All command in the Edit menu like so:

Find •:∞  # select from start to end of target

The commands File and Open, along with the variables Target and Active, determine the files your scripts will work on. "File" is actually an alias for the real command name, Target. The File command opens a file and makes it the target window -- the window behind the frontmost window. The target window is an important notion in MPW. It exists so that you can use the Worksheet window to type commands that affect another window; since the Worksheet would be in front, the window being affected would need to be behind the Worksheet. During scripting, you may prefer to use the Open command, which opens a file and makes it the frontmost window. The target window is referred to as {Target} in scripts, while the frontmost window is called {Active}. Editing commands work on the target window if you don't specify a window explicitly.

The Line command may also be used for navigation: it selects the numbered line in the target window and then brings that window to the front. You probably know this command already if you use compilers in the MPW Shell, since they put out error messages in this form:

File "gwork.c"; Line 418 # Syntax error

Executing this command takes you to the line in your code where the error was detected.

The Position command returns the current position in the target window, as a line number, a character range, or both. The position could be saved to a variable for later use as follows, using the backquote mechanism to execute a command and insert its output inline:

Set SavedLineNumber `Position -l`

There are dozens of commands pertaining to text editing in the MPW scripting language. Help on all of them is available in the MPW Shell. The usual Macintosh text-editing menu commands are available in the MPW scripting language, including New, Open, Close, Save, Revert, Print, and the standard Edit menu commands.

StreamEdit is a standalone editing tool that's rich and strange enough to deserve its own column. It's a structured search and replacement language based on the UNIX® command sed.

Some simpler standalone editing tools are provided. Sort has a rich function set and can be used for many text-editing tasks. Canon takes a file of search and replace strings and applies them to a file. It's used to automate terminology changes, such as the work that was done to make the Mac OS API use fewer acronyms and abbreviations when the new Inside Macintosh books were written. Translate, like the UNIX command tr, maps characters onto other characters.
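
For a feel of what character mapping and dictionary-driven renaming look like in code, here is a short Python sketch of both ideas. It is not how Translate or Canon are implemented. The function names are hypothetical, and the assumption that replacements apply only to whole identifiers is mine.

```python
import re


def translate(text, source, target):
    """Map each character of source to the character at the same index
    in target, in the spirit of tr."""
    return text.translate(str.maketrans(source, target))


def canonize(text, dictionary):
    """Replace whole identifiers that appear as keys in dictionary."""
    def swap(match):
        word = match.group(0)
        return dictionary.get(word, word)

    # Assumption: an identifier is a letter or underscore followed by
    # letters, digits, or underscores.
    return re.sub(r"[A-Za-z_][A-Za-z_0-9]*", swap, text)
```

With a dictionary of {"OldName": "NewName"}, canonize rewrites "OldName(x)" but leaves "OldNameX" alone, because the longer identifier isn't in the dictionary.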

Text indentation can be handled with four tools: Adjust, Align, Entab, and Format. Adjust shifts a line to the right or left by a specified number of spaces. Align sets the margin of a range of selected lines to the margin of the first selected line. Entab converts runs of spaces to tabs, and Format sets the column width used for tabs in a text document, as well as other settings like font and size. (These settings are saved in a resource in the file, which many ASCII text editors can recognize.)

Text-editing scripts often create temporary files, split single files into multiple files, and perform other file-related tasks. MPW provides commands to help you manage files. It has commands corresponding to almost all Finder operations, such as Duplicate, Move, Delete, and NewFolder. There are also some specialized file commands: FileDiv splits a file into multiple files based on a byte or line count or on embedded form feed characters inserted during a previous editing pass; Catenate does the opposite, joining files together.
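
Splitting by line count and joining the pieces back together are simple enough to sketch. The Python below works on strings in memory, not on files, and is meant only to illustrate the idea behind FileDiv and Catenate. The function names are hypothetical, and nothing here reflects how the real tools name or write their output files.

```python
def split_by_lines(text, lines_per_piece):
    """Split text into pieces of at most lines_per_piece lines each."""
    lines = text.splitlines(keepends=True)
    return [
        "".join(lines[i:i + lines_per_piece])
        for i in range(0, len(lines), lines_per_piece)
    ]


def catenate(pieces):
    """Join the pieces back together in order."""
    return "".join(pieces)
```

Because the line endings are kept with each line, catenate(split_by_lines(text, n)) gives back the original text for any positive n.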

A text-editing script often takes search and substitution text as parameters on the command line. A few commands related to parameters are worth a quick mention here. Echo is handy for concatenating parameters with other text. Quote is similar to Echo but adds quote marks as needed to preserve the word breaks in its parameters. MPW scripting requires quotes around any string that is meant to be a single parameter but contains spaces (which would break the string into multiple parameters). Echo puts out its arguments in a way that allows them to be broken up, while Quote preserves the original word breaks by inserting quotes.

Echo "Richard Loves Pat"
Richard Loves Pat
Quote "Bill Loves Everyone"
'Bill Loves Everyone'
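
The same distinction shows up in other scripting environments. As a comparison, here is a Python sketch using the standard shlex module, which follows POSIX shell quoting rules and not MPW's; the echo and quote function names are just illustrative stand-ins.

```python
import shlex


def echo(*params):
    """Join parameters with spaces; word breaks inside them are lost."""
    return " ".join(params)


def quote(*params):
    """Join parameters, quoting each so it survives being re-split."""
    return " ".join(shlex.quote(p) for p in params)
```

Splitting the result of quote("Bill Loves Everyone") with shlex.split gives back one parameter, while splitting the result of echo("Richard Loves Pat") gives three.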

AN EXAMPLE SCRIPT

Here's a script I've found useful for some years. It's called Subword and it replaces a word by another string everywhere it occurs in the target window.

Set Sep "[¬a-zA-Z_0-9]"  # word separators
Find • "{Target}"  # start at top of file
Replace -c ∞ ∂
   "∆/{Sep}{1}{Sep}/!1:∆/{Sep}/" ∂
   "{2}" "{Target}"

The selection in this Replace command is probably about as clear as the U.S. tax code, so allow me to explain. The ∆ (uppercase delta) in front of a selection means the position just before that selection. The !1 moves that position one character forward, which here steps past the leading separator. The colon denotes everything between the selections (inclusively). So this pattern says, in a nutshell, select the pattern in the first parameter ({1}) when it's bracketed by separators, but exclude the separators.
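
If the selection expression is still hard to picture, a rough Python analogue of Subword may help. It's a sketch of the same idea, not a translation of the script: lookarounds stand in for the separator positions, and the names IDENT_CHAR and subword are hypothetical.

```python
import re

# Characters that are NOT word separators (the complement of Sep).
IDENT_CHAR = "[A-Za-z_0-9]"


def subword(text, word, replacement):
    """Replace word wherever it stands alone as a whole word."""
    pattern = re.compile(
        "(?<!" + IDENT_CHAR + ")"   # no identifier character just before
        + re.escape(word)            # the word itself, taken literally
        + "(?!" + IDENT_CHAR + ")"  # no identifier character just after
    )
    # A function replacement keeps backslashes in replacement literal.
    return pattern.sub(lambda match: replacement, text)
```

One difference worth knowing: the lookarounds also succeed at the very start and end of the text, where there is no separator character at all, whereas the MPW pattern asks for an actual separator on each side.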

Normally I don't use this script directly. I incorporate it into other scripts as a utility. The bulk of the work of converting between similar languages like Pascal and C can be done by an editing script, for example. Subword can be used to convert keywords, as could Canon. I use another script which is essentially Subword without the separators for changing symbols like equality operators.

Scripts to preconvert between Pascal and C can be found on this issue's CD. They don't generate compiler-ready text, but I've found that they facilitate a manual conversion at the rate of hundreds of lines per hour, allowing source bases in the thousands of lines to be accurately translated in a day or three. So the next time you're faced with a dull text-processing task, look over the tools MPW gives you, and see whether you can save yourself a few days of tedious manual labor!

TIM MARONEY recently changed his Apple badge color from green to white: he's gone from contract programming to a technical leadership role developing user interface software. Tim entertains himself in a variety of ways, such as straining his surgically altered eyeballs on the small print of obscure footnotes and collectible trading card games, and contorting his limbs in yogic asanas. He designed the iron crystal that now resides at the core of the earth and contributed significant ideas to the original (now obsolete) implementation of Planck-scale gravitational phenomena in the universe.

Thanks to Dave Evans, Scott Fraser, Arno Gourdol, and Alex McKale for reviewing this column.

 
