
Worldscript
Volume Number:9
Issue Number:10
Column Tag:Worldscript

Related Info: Script Manager

Writing for WorldScript

Essential info about Apple’s Script Manager

By Gary Crandall, DataPak Software

About the author

Gary Crandall (better known as “Gar”) has been developing software for the Macintosh since 1983 and has personally generated over one million lines of code for the 68000-based processor.

Frustrated with TextEdit’s limitations and speed, Gary has been evolving text processing technology for DataPak Software. The most recent result is Word Solution Engine v2.1, a “WorldScript” technology that breaks all barriers of TextEdit.

For the Macintosh, the term script refers to a specific class of characters whose behavior is different from other classes of characters. Put more simply, each unique alphabet on this planet requires a different script, such as Eastern versus Western alphabets.

If you are not already familiar with this aspect of the Macintosh, a script can be confused with language, or some other localization process. But language and scripts are distinctly different.

For example, all Western and European countries use the same script (Roman) even though they speak many diverse languages. On the Macintosh, the same set of fonts works for English as well as French, German, Italian, Spanish, etc., since all such languages use the same basic alphabet.

Until recently, the ASCII set as we know it has been sufficient since we have generally excluded non-Roman countries from our software galleries. In a growing new age of worldwide economics and an “ever shrinking” globe, the time and demand for additional character sets has arrived.

Unfortunately, the necessary transition can be far more involved than it sounds. Most of us have locked into the “standard” ASCII character set so thoroughly in our software design that converting to some new system will not only force a virtual rewrite, it will force our way of thinking to change.

The Script Manager

The purpose of this article is to offer some insight into an otherwise difficult topic: Apple’s Script Manager. First, let’s clarify some terminology to avoid confusion.

Regardless of what System you are operating, the Script Manager is the portion of the Toolbox that specifically handles non-Roman scripts (non a-z character sets). You can look at Script Manager as an extension of QuickDraw’s text handling functions. This was true in System 6 and is (still) true in System 7.

Some confusion has set in, however, with the release of System 7 and the term WorldScript. By and large, the term WorldScript merely means that the user can switch between multiple scripts without restarting the machine. While this might be an excellent enhancement for multilingual applications, the term WorldScript is, from the perspective of a developer, more of a promotional term than a technical one.

In System 6, for example, you were generally restricted to a single script. If you ran a Japanese product, you would boot a Kanji System; to revert to a Roman environment, you would re-start from a “blessed” System folder for Roman text, and so on.

With the newer System software, the user is allowed to switch scripts with the same ease as switching fonts or point sizes. Thus, the term “WorldScript.” But that is an enhancement for the user, not the developer; from the programmer’s viewpoint, little has changed. You still need to understand and use the same Script Manager functions, for the most part, as if you were writing for System 6 even if you intend to operate in a WorldScript environment.

Getting Started

This could be personal preference, but I would strongly recommend studying the original Script Manager in Inside Mac Volume 5 before even considering the “WorldScript” information in Volume 6.

For one thing, I find Inside Macintosh Vol 6 generally confusing and convoluted, filled with unnecessary “information” while lacking the basic, important information you really need.

Furthermore, you will eventually realize that the System 7 “enhancements” to Script Manager won’t help you much - or at the very least you will discover the most useful functions are available (and described more clearly) in Volume 5. Again, this is my preference but I am speaking from many months of experience working in this area.

Old Habits

If you decide to go WorldScript with your software, you will need to break some old habits to guarantee compatibility for all present (and future) scripts.

I am using “WorldScript,” in this case, in its true sense: if the user has the ability to choose any script, then you need to be prepared for all the related nuances. If you are programming only for System 6 (where the user is generally stuck with one script), you could get away with a more hard-coded approach (building a “Kanji version,” “Hebrew version” or some other specialized modification of your code). But for true WorldScript, you need to respond correctly to all potential scripts within the same software.

For starters, you can no longer assume a set of 256 single-byte characters. In Kanji and other similar scripts, a character can be represented by two bytes. You might not think your software will be affected by this, but I have found many unexpected surprises where a single-byte character was assumed.

One subtle case I experienced was an accounting program in which the letters “A,” “L” and “E” represented “Assets,” “Liabilities” and “Expenses.” Of course, there was a resource which defined these letters just in case the product was used in Europe - in which case the “A,” “L” and “E” could be changed to some other localized abbreviations.

What never occurred to me was that the product could not possibly be localized for Japan: a Kanji abbreviation requires two bytes per character. Unfortunately, there were hundreds of places in the code that examined a single byte to determine the account type. So much for “smart” localization!
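To make the single-byte trap concrete, here is a minimal sketch in portable C (not Toolbox code) that counts characters rather than bytes. The helper names are hypothetical, and the lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are an assumption modeled on Shift-JIS style encodings; a real application would ask Script Manager instead of hard-coding ranges.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: Shift-JIS style lead-byte ranges. Hypothetical helper,
   standing in for a question you would normally put to Script Manager. */
static int is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Counts characters (not bytes) in a buffer of byte_len bytes. */
static size_t count_characters(const unsigned char *text, size_t byte_len)
{
    size_t i = 0;
    size_t count = 0;

    assert(text != NULL || byte_len == 0);

    while (i < byte_len) {
        /* A lead byte consumes its trail byte too, if one is present. */
        if (is_lead_byte(text[i]) && i + 1 < byte_len)
            i += 2;
        else
            i += 1;
        count++;
    }
    return count;
}
```

Any code that compares `text[i]` against a one-letter code such as “A” is really comparing a byte, and in a double-byte script that byte may be only half of a character.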

Another bad habit you will need to break is the left-to-right assumption. Now, everybody knows that left is left and right is right... but not true in Arabic, Hebrew and other similar languages!

Not only does text seem to draw from right to left in an Arabic environment, but everything else will be “backwards” as well, at least from our Western point of view. It is all done with mirrors, literally: the left side of your document becomes the right side; left margins are really right margins; “align left” changes its meaning to “align right,” and so on. The bottom line is that you need to stop thinking in terms of left sides and right sides and think instead of the “side of origin.” Under a Roman convention, the side of origin happens to be the left side; other environments originate on the right side.
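One way to build the “side of origin” habit into code is to stop writing “align left” and write “align to origin” instead. The following is a small sketch in portable C; the type and function names are hypothetical and nothing here comes from the Toolbox.

```c
#include <assert.h>

/* Hypothetical direction flag; a real application would derive this
   from the System direction or from the script being drawn. */
typedef enum { DIR_LEFT_TO_RIGHT, DIR_RIGHT_TO_LEFT } LineDirection;

/* Returns the x coordinate of the left edge of a line that is
   text_width pixels wide and aligned to the side of origin
   between the left and right margins. */
static int origin_aligned_left_edge(int left, int right,
                                    int text_width, LineDirection dir)
{
    assert(right >= left);

    if (dir == DIR_RIGHT_TO_LEFT)
        return right - text_width;  /* origin is the right margin */
    return left;                    /* origin is the left margin */
}
```

With margins at 10 and 210 and a 50-pixel line, the left edge lands at 10 for a left-to-right line and at 160 for a right-to-left line; the calling code never mentions “left” or “right” alignment at all.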

At the very least, you should design your dialogs and alerts to look nice for either direction. If you have a WorldScript system handy, go into the Control Panel which changes the System direction, and you will see what I am talking about. When you change the System direction to right-to-left, the Dialog Manager will flip all the controls around... UGLY! That is, unless you fix them up to look good in either direction.

Script Manager Pitfalls

The information I am about to give could save you countless hours of grief. I can state this with confidence after completing a new WorldScript revision to Word Solution Engine. There are some very vital and crucial facts missing from both Volume 5 and 6 - and even the Tech Notes - which can throw you for a spin unless you know them in advance.

The first missing fact is the exact relationship and behavior between the current font and the Script Manager functions that return information about characters.

The following point is not made in any Script Manager documentation I have seen, yet it is vital to most functions: you must have the correct script font set in the current GrafPort to receive the correct answer(s) from Script Manager functions.

By “correct script font” I mean the font for which the text will be drawn. If you are asking Script Manager to give you information on text that will be drawn in Kanji, you had better set a Kanji font to be the current font. Otherwise, almost every function that returns information about a character will return the wrong answer.

For example, there is a Script Manager function called CharByte which you can use to determine if a given byte of text is a single-byte character, or if it is the first half or the second half of a double-byte character. When you look over this function, you will be led to believe that Script Manager can determine the right answer through some magical character decoding of any arbitrary piece of text. Wrong!

The only way Script Manager can give you the correct response to CharByte - at least in a WorldScript environment in which mixed scripts are possible - is to know what font the character is intended for. This is because all 256 values of a byte are “legal” characters for Roman fonts, whereas for Kanji and other double-byte fonts many of those values denote the first (or second) half of a character. Hence, CharByte cannot possibly know which is which without knowing what font the character is intended for.

If you are mixing scripts together (e.g., Roman and Kanji within the same text stream), your problem is a bit more complicated because you need to know what font to set before asking Script Manager to tell you about the character(s). Rather strange, but true: in order to find out what type of characters are in a string of text, you have to already know what type of characters are in the text - or something along that line.

Fortunately, however, in the case of mixed scripts, there is a work-around to the “know before you know” situation. I have found that by setting the “worst case font” I usually get the right answer from CharByte and other similar functions. By worst-case I mean the font which is the most non-Roman. If you know that a piece of text could contain, say, a Kanji character, setting a Kanji font has a tendency to work for all characters in the text even if some of them are Roman.

I found the worst-case-font solution works particularly well for mixed directions, i.e., mixing Roman with a right-to-left script such as Arabic. With such a potential mix, setting the current font to a left-to-right font (Roman) will almost always return bad information, whereas claiming the whole thing is Arabic (even if it is not) will work more consistently.

But the supreme (and correct) solution, if you want to handle mixed scripts with 100% accuracy, is to know what scripts are present and to set the appropriate font for each character interrogation.
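Knowing what scripts are present usually means keeping a table of runs alongside the text. Here is a minimal sketch of such a table in portable C; the types and the lookup function are hypothetical and are not part of Script Manager. The idea is that you look up the run that covers a byte offset, set that run's font in the current GrafPort, and only then ask Script Manager about the byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical script tags for illustration only. */
typedef enum { SCRIPT_ROMAN, SCRIPT_JAPANESE, SCRIPT_ARABIC } ScriptKind;

/* One contiguous stretch of bytes that belongs to a single script. */
typedef struct {
    size_t     start;   /* byte offset where the run begins */
    size_t     length;  /* length of the run in bytes */
    ScriptKind script;  /* script (and therefore font) for the run */
} ScriptRun;

/* Returns the run covering the given byte offset, or NULL if none does. */
static const ScriptRun *find_run(const ScriptRun *runs, size_t run_count,
                                 size_t offset)
{
    size_t i;

    assert(runs != NULL || run_count == 0);

    for (i = 0; i < run_count; i++) {
        if (offset >= runs[i].start &&
            offset < runs[i].start + runs[i].length)
            return &runs[i];
    }
    return NULL;
}
```

A linear search is enough for a sketch; an engine with long documents would more likely keep the runs sorted and use a binary search.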

The next pitfall you might run into is the character offset mistake.

The function mentioned above, CharByte, and another function called CharType both require a pointer to some text and a character position (offset) into that text for which you want information.

If you are anything like me, you will have a tendency to get lazy and always pass “zero” for the character position, and instead just point to a character. Wrong!

Certain scripts, particularly the double-byte variety, require that Script Manager examine a series of bytes in order to determine the characteristic of the byte you are asking about. Unless it can see what bytes come before the character you want to know about, it can return the wrong answer.

For example, the most common use of CharByte is to determine which “half” of a character a given byte of text represents for a double-byte string of text. Suppose you want to know whether the byte at offset 10 of a string is the first or second half of a character - or if it is a single-byte character. If you do not pass “10” for the offset (and instead you merely advance a pointer to that byte and pass zero for the offset), Script Manager is not given a chance to examine the preceding bytes to make the correct decision. This point is not made anywhere in the documentation and can be a major cause of unexplained bugs. Here are some examples:

Wrong way:

Boolean IsCharOddByte (char the_char)
    // Returns TRUE if the_char is the second half of a double-byte Kanji character.
    // Unreliable: a lone byte at offset zero gives Script Manager no context.
{
    if (CharByte(&the_char, 0) > 0)
        return TRUE;
    else
        return FALSE;
}

Right way:

/* 1 */

Boolean IsCharOddByte (Ptr txt, short char_position)
    // Returns TRUE if the byte at char_position is the second half
    // of a double-byte Kanji character. txt points to the START of the text.
{
    if (CharByte(txt, char_position) > 0)
        return TRUE;
    else
        return FALSE;
}

Both of the examples above attempt to determine if a byte is the second half of a double-byte Kanji character; the first one won’t always work.

The first example will often fail because a single byte, seen in isolation, can “look” like a Roman character; furthermore, an offset of zero tells Script Manager that the byte cannot possibly be the second half of a double-byte character.

The second example works because it gives the Script Manager a chance to examine the sequence of bytes leading up to the one you are asking about.

Note: BOTH examples will fail if the font in the current GrafPort is not a Kanji font.
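The reason the offset matters can be shown without the Toolbox at all. The sketch below is portable C with hypothetical names; it is not how CharByte is implemented, only an illustration of the same problem. It assumes Shift-JIS style lead-byte ranges (0x81-0x9F and 0xE0-0xFC), and its return values simply follow the sign convention the examples above rely on (positive means second half).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes: positive means "second half". */
enum { BYTE_SINGLE = 0, BYTE_FIRST = -1, BYTE_LAST = 1 };

/* Assumption: Shift-JIS style lead-byte ranges. */
static int is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Classifies text[offset] by walking from the start of the buffer,
   so every byte is judged in the context of what precedes it. */
static int classify_byte(const unsigned char *text, size_t byte_len,
                         size_t offset)
{
    size_t i = 0;

    assert(text != NULL && offset < byte_len);

    while (i < byte_len) {
        if (is_lead_byte(text[i]) && i + 1 < byte_len) {
            if (offset == i)
                return BYTE_FIRST;
            if (offset == i + 1)
                return BYTE_LAST;
            i += 2;     /* skip both halves of the character */
        } else {
            if (offset == i)
                return BYTE_SINGLE;
            i += 1;
        }
    }
    return BYTE_SINGLE; /* not reached while offset < byte_len */
}
```

Take the three bytes 0x83, 0x41, 0x41. Asked about offset 1 from the start of the text, the function reports the second half of a double-byte character. Point directly at that same byte and pass offset 0, and it reports a single-byte character, because 0x41 on its own is indistinguishable from a Roman “A.”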

The Infamous Position “Flip”

One area that can drive you mad - if you are not prepared for it - is the usage of MeasureJust for a right-to-left script such as Hebrew or Arabic (MeasureJust is a function that returns consecutive character positions, in pixels, of a given block of text).

You would think that given the same font, style, point size, and even the same text, MeasureJust returns the exact same answer every time.

Wrong! MeasureJust will return a different answer depending on the writing direction setting of the System (right-to-left or left-to-right). Furthermore, the difference(s) will not be what you would expect. Let’s do a quick experiment to illustrate what happens:

    short  char_positions[5]; // Holds MeasureJust answers (text length + 1 entries)

    MeasureJust(text_ptr, 4, 0, (Ptr) char_positions);

The small code sample above is attempting to measure text_ptr which is a series of, say, four Arabic characters. The result will be placed in the char_positions array which will be (supposedly) the physical pixel positions for each character in text_ptr. Here are the various results you will receive:

(1) If System is set for right-to-left, char_positions might look like this:

22, 16, 11, 6, 0

(2) If System is set for left-to-right, char_positions would look like this:

0, 16, 11, 6, 22

What you will conclude is the text is “drawn” differently depending on the System writing direction - but that too is incorrect: Arabic text is drawn the same (right-to-left) regardless of the writing direction. So you will conclude that MeasureJust was designed specifically to drive you crazy.

One solution to this inconsistency is to check for a situation where the two ends are “flipped” as in the second example above. Here is some code that demonstrates how to fix the problem:

/* 2 */

void ReturnActualCharPositions (Ptr text, short text_length, short *array)
{
    GrafPtr cur_port; // Used to check current font
    short   script;   // Used to check current script

    MeasureJust(text, text_length, 0, (Ptr) array);

    // So far so good, but I will need to flip the two ends in
    // the array if right-to-left script but left-to-right System

    if (TESysJust() >= 0) { // If left-to-right System
        GetPort(&cur_port);
        script = Font2Script(cur_port->txtFont); // Gets current script
        if (GetScript(script, smScriptRight)) { // If right-to-left script
            short temp;

            // Swap the first and last entries of the array
            temp = array[0];
            array[0] = array[text_length];
            array[text_length] = temp;
        }
    }
}
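If you want to check the swap logic away from a Macintosh, the same fix can be sketched in portable C with the two Toolbox queries replaced by plain parameters. The function name and its two flag parameters are hypothetical; in the real routine those answers come from the System direction and from the script of the current font.

```c
#include <assert.h>
#include <stddef.h>

/* positions holds char_count + 1 entries, laid out the way the
   article describes the MeasureJust results. The two flags stand in
   for the System direction and script direction queries. */
static void unflip_ends(short *positions, short char_count,
                        int system_is_ltr, int script_is_rtl)
{
    assert(positions != NULL && char_count >= 0);

    /* Only a right-to-left script measured under a left-to-right
       System comes back with its two ends exchanged. */
    if (system_is_ltr && script_is_rtl) {
        short temp = positions[0];

        positions[0] = positions[char_count];
        positions[char_count] = temp;
    }
}
```

Fed the left-to-right result from the experiment above (0, 16, 11, 6, 22) with both flags set, the array comes back as 22, 16, 11, 6, 0, which matches the right-to-left result.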

Now & The Future

If you are anything like me, you find it not only difficult to confront “change,” but even more difficult to face an alphabet or language that you can’t even read!

However, regardless of your particular present and future text applications, the days of super-localization (made only for U.S. and Europe) are numbered - if not already obsolete. Our Japanese market alone is potentially huge and can no longer be ignored, if for no other reason than economics. Like it or not, you will need to confront WorldScript sooner or later, hopefully sooner: as in any other endeavor in business (or life), the more you know about a subject the more effective you will be.

 

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.