Talking
Volume Number: 1
Issue Number: 12
Column Tag: Assembly Language Lab
The Talking Mac
By Dan Weston, The NerdWorks, Salem, OR
In the May, 1985, issue of the Macintosh Software Supplement Apple released a package of tools and code units collectively called MacinTalk 1.1. With these tools programmers can make their Macintosh programs talk without any additional hardware. In this article I'll explain the general workings of MacinTalk and develop a small application program in assembly language that will show you how to use the main features of MacinTalk in your own programs.
Overview of MacinTalk
The MacinTalk system's most basic component is a driver that contains several procedures available to your programs. The driver is contained in a file called 'MacinTalk', and this file must be on the same volume as any application that wishes to use the MacinTalk driver. The most basic function of the driver is to convert ASCII strings of phonetic codes into speech. You can also use another part of the driver to convert standard English text into phonetic codes which can then be spoken by the driver. Furthermore, there are parts of the driver that you can use to control the rate of speaking and the pitch.
Beyond the actual driver procedures that you will be using in your programs, there are a few tools that are useful while you are preparing a program that will use speech. The program 'Speech Lab' allows you to enter English text in one window and then hear the MacinTalk speech and see the phonetic translation in another window. This program is very useful for learning the tricks of the phonetic code system used by MacinTalk. For example, the English sentence "This is a test." is translated into the phonetic string "DHIHS IHZ AH TEHST.#". Speech Lab can also be used to pre-translate strings that your program will speak when the strings are known ahead of time. It is more efficient, both in time and memory, to feed phonetic strings directly to the MacinTalk driver than to rely on translation at run time. Also, if you pre-translate you will be able to fine-tune the phonetics, because the translation is not always perfect.
The translation of English to phonetics is governed by hundreds of phonetic and grammatical rules contained in the MacinTalk driver, but these rules will not get every word right. Another program in the MacinTalk 1.1 package is 'Exception Edit'. This program allows you to create a special file of tricky words and their correct phonetic translations. Exception Edit lets you experiment with the phonetic strings until you get them right, and then save those translations for later use. A file created by Exception Edit can be automatically loaded and used by passing its name when the MacinTalk driver is opened, as shown in a later section of this article.
Fig. 1 Program Output
The MacinTalk Driver
There are seven procedures in the MacinTalk driver that your program can call. They are listed briefly below.
FUNCTION SpeechOn(ExceptionsFile: Str255; VAR theSpeech: SpeechHandle): SpeechErr
This function opens up the driver and initializes the values for speed and pitch. If you pass a null string for ExceptionsFile, then the translation of English to phonetics will follow the standard rules. If you pass a valid file name for ExceptionsFile, then that file, which must have been created by Exception Edit, will be used to help guide translation. If you pass the string 'noReader' for ExceptionsFile, then the driver will be opened but it will only be able to receive phonetic input and it will not be able to translate English to phonetics.
PROCEDURE SpeechOff(theSpeech: SpeechHandle)
This procedure closes the driver and deallocates any storage that it has been using.
FUNCTION MacinTalk(theSpeech: SpeechHandle; Phonemes:Handle): SpeechErr
This is the workhorse of the driver. This is where phoneme code strings are converted to speech. The handle to the phonemes should refer to a string of ASCII phonemes without a length byte.
FUNCTION Reader(theSpeech: SpeechHandle; EnglishInput: Ptr; InputLength: LongInt; PhoneticOutput: Handle): SpeechErr
This is where English strings are translated into phonetic strings that can then be fed to MacinTalk. The Ptr to EnglishInput should not point to a length byte of a Str255. Point to the first character instead. The Handle for PhoneticOutput can start out as a zero length Handle, and Reader will dynamically grow the Handle to fit the output.
PROCEDURE SpeechRate(theSpeech: SpeechHandle; theRate:INTEGER)
This sets the rate at which words are spoken, in words/min. The rate must be between 85 and 425 words/min.
PROCEDURE SpeechPitch(theSpeech: SpeechHandle; thePitch: INTEGER; theMode: FOMode)
This sets the baseline pitch, in Hz, and sets the pitch mode, either natural or robotic.
PROCEDURE SpeechSex(theSpeech: SpeechHandle; theSex:Sex)
This is not implemented in MacinTalk 1.1.
The glue which calls the various procedures in the driver is contained in the file SpeechASM.Rel, also available in the Software Supplement. Make sure that you include SpeechASM.Rel in the link file for your application so that the driver routines will be available to your code. Also, you must XREF the individual routines that you wish to use. See the listings of CheapTalk.ASM and CheapTalk.LINK for examples.
CheapTalk: a simple speech application example
The software supplement contains the source code for a very short example program that shows how to use the speech driver. As usual, it is in Pascal, so we assembly language programmers have to muddle along and figure things out ourselves. In order to learn the system myself, and to provide a clear example of the main features of MacinTalk, I have written CheapTalk, a dialog based application that speaks pre-translated text stored in a resource file and also translates and speaks user input at run time. CheapTalk opens a dialog and speaks the static message one time. Then it waits for the user to type English text into an edit text box in the dialog. Hitting return or pressing a 'Say it' button will translate the English text into phonemes and then say it.
This application will show you how to open and close the driver, and how to use MacinTalk and Reader from assembly language. It does not use the procedures that control the speed or pitch, but I imagine that you can figure those out for yourselves.
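Setting the speaking rate should follow the same stack-based pattern as the other driver calls. The fragment below is only a sketch based on the SpeechRate declaration given earlier, not code from CheapTalk. It assumes that SpeechASM.Rel exports a glue routine named SpeechRate that is called like the others, the local label @5 is arbitrary, and the rate of 150 words/min is a value picked purely for illustration.

```asm
        XREF    SpeechRate              ; assumed glue entry in SpeechASM.Rel

;PROCEDURE SpeechRate(theSpeech: SpeechHandle; theRate: INTEGER)
        TST.W   speechOK(A5)            ; did the driver open?
        BEQ     @5                      ; no, leave the rate alone
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals handle
        MOVE.W  #150,-(SP)              ; illustrative rate, must be 85..425
        JSR     SpeechRate              ; a PROCEDURE, so no result to pop
@5                                      ; continue here
```

Because SpeechRate is declared as a PROCEDURE, no space is reserved for a result before the parameters are pushed, unlike the FUNCTION calls to SpeechOn, MacinTalk, and Reader.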
In my discussion of the code, listed in listing 1 as CheapTalk.ASM, I will concentrate on the parts pertinent to MacinTalk, and leave many of the details of the shell to speak for themselves.
Making the connection to SpeechASM.Rel
Toward the beginning of CheapTalk.ASM, notice the XREF statements necessary for the linker to establish the connection between our routine calls and the SpeechASM.Rel code that we link with our code.
XREF SpeechOn ; open driver
XREF MacinTalk ; speak phonetic string
XREF Reader ; translate English to phonetics
XREF SpeechOff ; close driver
The linker control file is listed in listing 2 as CheapTalk.LINK. SpeechASM.Rel is a code file which contains the glue routines necessary to call the individual procedures contained in the driver. SpeechASM.Rel does not contain the actual speech routines, just short procedures to call the appropriate section of the MacinTalk driver. All the routines of the speech driver expect their parameters on the stack.
Setting up the global variables for speech
Next, notice the global variable 'theSpeech', defined as a long word to hold the handle to the speech globals that will be allocated when the driver is opened. We only have to define a variable to hold the handle; the opening routine will allocate the necessary storage for the speech globals. Other globals that we need to define include a word-length flag that we use to show whether the driver was successfully opened, a 256-byte block to hold an English string, and a handle which will be used for phonetic output from Reader.
theSpeech DS.L 1 ; handle to speech driver globals
speechOK DS.W 1 ; our flag to show if driver opened
theString DS.B 256 ; keep our English string here
phHandle DS.L 1 ; handle to phonetic string
If you look at CheapTalk.ASM you will see that there are several other global variables defined to use as VAR parameters associated with maintaining the dialog box.
Opening the driver
When we call SpeechOn to open the driver, we specify the null string (a string with length 0, which we define in the static variable area at the end of the code) for the ExceptionsFile so that the Reader will translate English to phonetics using the default rules. If we had a specific exceptions file that we had created with Exception Edit, then we could pass in that file name so that that exception file would be used. We also pass the address of our global variable, theSpeech, so that it can be updated to hold the handle to the speech globals which will be allocated by the open routine.
; assume that driver will open all right, set our flag to TRUE
MOVE.W #1,speechOK(A5) ; set flag to TRUE
; now open driver to use default rules for translation
;FUNCTION SpeechOn(ExceptionsFile:Str255;
; VAR theSpeech:SpeechHandle): SpeechErr
CLR.W -(SP) ; space for result
PEA NULL ; defined at end of code
PEA theSpeech(A5) ; VAR theSpeech, handle for speech globals
JSR SpeechOn ; jump to open routine
MOVE.W (SP)+,D0 ; get result code
BEQ @1 ; branch if open OK
; if driver open not successful then clear speechOK flag
; to prevent further use of invalid driver
MOVE.W #0,speechOK(A5) ; set flag to FALSE
; you could also display an error dialog here
@1 ; branch to this point if open is successful
You can see how the result code is checked after SpeechOn to see if the driver was opened successfully. In the event of a non-zero result, implying a problem with the opening, we set the speechOK flag to 0 and continue on with the program. All other parts of the program which use the speech driver first check the speechOK flag to make sure that there is a valid driver to work with.
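If a program speaks only pre-translated phonemes, the driver can be opened without the Reader by passing 'noReader' for ExceptionsFile, as described in the list of driver procedures. The following is a sketch of that variation, not code from CheapTalk: the labels noRdr and @6 are names made up here, and the string is laid out as a Pascal string with a leading length byte, since ExceptionsFile is declared as a Str255.

```asm
        MOVE.W  #1,speechOK(A5)         ; assume the open will succeed
        CLR.W   -(SP)                   ; space for result
        PEA     noRdr                   ; ExceptionsFile = 'noReader'
        PEA     theSpeech(A5)           ; VAR theSpeech
        JSR     SpeechOn                ; open driver without the Reader
        MOVE.W  (SP)+,D0                ; get result code
        BEQ     @6                      ; zero means the driver opened
        MOVE.W  #0,speechOK(A5)         ; open failed, clear our flag
@6

; this line belongs with the static data at the end of the code
noRdr   DC.B    8,'noReader',0          ; length byte, 8 characters, pad byte
```

The trailing zero is only there to keep whatever follows the string on an even address. A driver opened this way cannot translate English, so a program using it should call MacinTalk only and never Reader.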
Speaking pre-translated speech
The static message in our dialog box is "This is a talking dialog demonstration." There is a phonetic translation of that string kept in the resource file as a resource of type PHNM. The translation was done using Speech Lab, and the resulting phonetic string put into the RMaker source file, listed in listing 3 as CheapTalk.R. I created the PHNM resource type for RMaker so that the phonetic string would not have a length byte. As a general strategy you can translate the static message of any dialog into a PHNM resource with the same resource ID number as the dialog. That way, it is easy to display the dialog and speak the message together.
When the PHNM resource is loaded into memory by GetResource, you get a handle to the phoneme string that you can pass to MacinTalk to recite. Remember, no length byte on phonetic strings! Generally, you should pre-translate any strings that you know at assembly time so as not to waste time and memory translating at run time, and also to ensure higher quality speech by testing and refining the phonetic strings. Look at the following code to see how the PHNM resource is retrieved and then fed to MacinTalk.
; first check our flag to make sure that driver is open
TST.W speechOK(A5)
BEQ @2 ; driver not valid
; branch around speech stuff
; driver valid, go ahead and speak
;FUNCTION
; GetResource(theType:ResType;ID: INTEGER): Handle
CLR.L -(SP) ; space for result
MOVE.L #'PHNM',-(SP); resource type PHNM
MOVE.W #theDialog,-(SP) ; use same ID# as dialog
_GetResource
MOVE.L (SP)+,A0 ; handle to phoneme string
;FUNCTION MacinTalk(theSpeech:SpeechHandle;
; Phonemes:Handle):SpeechErr
CLR.W -(SP) ; space for result
MOVE.L theSpeech(A5),-(SP) ; speech global handle
MOVE.L A0,-(SP) ; handle to phonemes
JSR MacinTalk ; say it
MOVE.W (SP)+,D0 ; get result code
@2 ; branch to here to avoid speaking with invalid driver
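GetResource returns a NIL handle when the resource cannot be found, so a more defensive version of the fragment above might test the handle before calling MacinTalk. This variation is a suggestion rather than part of CheapTalk.ASM. It pulls the result into D0 instead of A0 because moving a value into an address register does not set the condition codes, and it would take the place of the lines between the speechOK test and the @2 label.

```asm
        CLR.L   -(SP)                   ; space for result
        MOVE.L  #'PHNM',-(SP)           ; resource type PHNM
        MOVE.W  #theDialog,-(SP)        ; use same ID# as dialog
        _GetResource
        MOVE.L  (SP)+,D0                ; handle to phonemes, Z set if NIL
        BEQ     @2                      ; no PHNM resource, stay silent
        CLR.W   -(SP)                   ; space for result
        MOVE.L  theSpeech(A5),-(SP)     ; speech global handle
        MOVE.L  D0,-(SP)                ; handle to phonemes
        JSR     MacinTalk               ; say it
        MOVE.W  (SP)+,D0                ; get result code
@2                                      ; skip to here if nothing to say
```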
Translating English to Phonetics and then Speaking
After saying the static dialog message upon opening, the program waits for the user to enter English text in the edit text window of the dialog. The program watches the results of ModalDialog until the 'Say it' button is pushed, at which point it uses GetDItem and GetIText to get the current English text of the edit text item. That text, which is a Str255, is fed into Reader to translate it into a phonetic string. Please notice that when we pass the English text into Reader we skip over the length byte at the head of the Str255. We do, however, use the length byte, after coercing it to a long word, as the length input to Reader. The Handle which we use to hold the phonetic output of Reader is initially associated with a zero length block, but Reader grows the block automatically to fit the output. Look at this code fragment which feeds the English string to Reader. (Assume that the string has already been placed in the variable 'theString' by calls to GetDItem and GetIText.)
; set up an empty handle first for Reader to fill with phonemes
;FUNCTION NewHandle(logicalSize: Size): Handle
; logicalSize => D0, Handle => A0
MOVEQ #0,D0 ; set up empty handle
_NewHandle
MOVE.L A0,phHandle(A5) ; save Handle for later
;FUNCTION Reader(theSpeech:SpeechHandle;
; EnglishInput:Ptr;
; InputLength:LongInt; PhoneticOutput:Handle): SpeechErr
CLR.W -(SP) ; space for result
MOVE.L theSpeech(A5),-(SP) ; speech globals
PEA theString+1(A5) ; Ptr to string, skip length
CLR.L D0 ; clear out D0
MOVE.B theString(A5),D0 ; put length byte in D0
MOVE.L D0,-(SP) ; use LongInt for length
MOVE.L phHandle(A5),-(SP) ; handle we just allocated
JSR Reader ; do translation
MOVE.W (SP)+,D0 ; get result
Once we have used Reader to translate the English text into a phonetic string, we pass the handle to the phonemes to MacinTalk, much as we did earlier, to hear it spoken. Here is the code which speaks the translation and then deallocates the handle which held the phonetic string. It is important to deallocate this handle after the phonemes are spoken to avoid cluttering up memory with old sayings.
;FUNCTION MacinTalk(theSpeech: SpeechHandle;
; Phonemes: Handle):SpeechErr
CLR.W -(SP) ; space for result
MOVE.L theSpeech(A5),-(SP) ; speech globals
MOVE.L phHandle(A5),-(SP) ; handle to phonemes
JSR MacinTalk ; say it
MOVE.W (SP)+,D0 ; get result
; deallocate handle
;PROCEDURE DisposHandle(h: Handle)
; h => A0
MOVE.L phHandle(A5),A0 ; where phonemes are
_DisposHandle
This process can be generalized to other situations where you want to translate arbitrary English text into speech. Just get a pointer to the first character of the text, get the length of the text, allocate an empty handle, and feed it all to Reader. The phonetic output of Reader can then be handed to MacinTalk to recite.
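One way to package that recipe is as a subroutine. The sketch below is not part of CheapTalk: SayText is a name invented here, and the choice of A0 for the text pointer and D0 for the length is an assumption made for illustration. It reuses the theSpeech, phHandle, and speechOK globals from CheapTalk, and unlike the fragments above it skips the speech if NewHandle returns NIL or Reader returns a non-zero result.

```asm
; SayText (hypothetical helper): translate and speak English text
; entry: A0 = Ptr to first character, D0.L = length of text in bytes
SayText
        TST.W   speechOK(A5)            ; is the driver open?
        BEQ     @2                      ; no, return without speaking
        MOVEM.L D0/A0,-(SP)             ; save length and Ptr across the trap
        MOVEQ   #0,D0                   ; ask for a zero-length block
        _NewHandle                      ; Handle => A0
        MOVE.L  A0,phHandle(A5)         ; save Handle for later
        MOVEM.L (SP)+,D0/A0             ; recover length and Ptr
        TST.L   phHandle(A5)            ; did we get a handle?
        BEQ     @2                      ; NIL, give up
        CLR.W   -(SP)                   ; space for Reader result
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals
        MOVE.L  A0,-(SP)                ; Ptr to first character
        MOVE.L  D0,-(SP)                ; length as LongInt
        MOVE.L  phHandle(A5),-(SP)      ; Handle for phonetic output
        JSR     Reader                  ; do translation
        MOVE.W  (SP)+,D0                ; get result
        BNE     @1                      ; translation failed, skip speech
        CLR.W   -(SP)                   ; space for MacinTalk result
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals
        MOVE.L  phHandle(A5),-(SP)      ; handle to phonemes
        JSR     MacinTalk               ; say it
        MOVE.W  (SP)+,D0                ; get result
@1      MOVE.L  phHandle(A5),A0         ; where the phonemes are
        _DisposHandle                   ; always release the phoneme block
@2      RTS
```

Speaking the Str255 held in theString would then look something like this:

```asm
        LEA     theString+1(A5),A0      ; first character, past length byte
        MOVEQ   #0,D0                   ; clear all of D0
        MOVE.B  theString(A5),D0        ; length byte becomes a LongInt
        BSR     SayText
```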
Closing the driver
We merely make a call to SpeechOff with theSpeech as input to close up the driver and deallocate the memory used by it. Generally, MacinTalk will use at least 20K of memory, plus dynamic buffers equal to about 800 bytes/second of uninterrupted speech (usually less than 10 seconds). In addition, Reader uses 10K plus a buffer to hold the translated text.
;PROCEDURE SpeechOff(theSpeech: SpeechHandle)
MOVE.L theSpeech(A5),-(SP) ; handle to speech globals
JSR SpeechOff ; close it up
Putting it all together
Listings 1, 2, and 3 show the assembler source file, the linker control file, and the RMaker source file. You should assemble CheapTalk.ASM, then link it with CheapTalk.LINK. One thing to notice about the output file from the linker is that it is not a functional application until it is combined with the necessary resources by RMaker. Since Link output files are normally application type files, CheapTalk.LINK assigns a file type of 'CODE' so that the resulting output file will not have the characteristic diamond-shaped icon. The final step of the program development is to run CheapTalk.R through RMaker to create the DLOG, DITL, and PHNM resources and combine them in one application file with the output file from the linker. The output of RMaker, Cheap Talk, will be an independent application program which can be moved to any disk and run as long as the driver file, MacinTalk, is also on that disk.
Summary
This discussion has been rather superficial. You are encouraged to study the source code and steal whatever parts of it you find useful for your own applications. All parts of the MacinTalk system are available in the Software Supplement or in the DL8 area of the Mac Developers interest group (PCS-7) on CompuServe, including the MacinTalk 1.1 documentation that Apple provides. This documentation is a good place to learn more about the phonetic symbols that MacinTalk uses and some of the finer points of the available routines. You should also be aware that there is a licensing fee if you distribute programs that use MacinTalk 1.1, so contact Apple before you start shipping disks with MacinTalk on them.
Fig. 2 Program files
; CheapTalk.ASM
; A short program to demonstrate how to
; use Macintalk 1.1 from assembly language
; This program displays a dialog and speaks
; the written message in the dialog
; It also will speak English strings written
; into an edit text box in the dialog
; copyright August 1985
; Dan Weston
; This program uses subroutines from the file SpeechASM.rel
; You must include that file in your link file list
; and XREF the particular routines here
; You must also have the file 'MacinTalk' on the same volume
; as this application program
XREF SpeechOn ; open driver
XREF MacinTalk ; say something
XREF Reader ; translate English to phonemes
XREF SpeechOff ; close the driver
theDialog EQU 1 ; resource ID # of dialog
sayitbutton EQU 1 ; item # for 'say it '
quitbutton EQU 2 ; item # for 'quit'
usertext EQU 3 ; item # for edit text box
INCLUDE Mactraps.D
; --------------- Global Variables -------------------
theSpeech DS.L 1 ; handle to speech driver globals
speechOK DS.W 1 ; our flag to show if driver open
theString DS.B 256 ; VAR for GetIText
phHandle DS.L 1 ; handle to phonetic string
ItemHit DS.W 1 ; VAR for modal dialog
theType DS.W 1 ; VAR for GetDItem
theItem DS.L 1 ; VAR for GetDItem
theRect DS.W 4 ; VAR for GetDItem
; --------------- Initialization ----------------------
BSR InitManagers ; at end of source file
; -------------- Open the Speech Driver ----------------
; Open speech driver to use default rules
; assume that driver will open all right, set our flag to TRUE
MOVE.W #1,speechOK(A5) ; set flag to TRUE
CLR.W -(SP) ; result
PEA NULL ; defined at end of source code
PEA theSpeech(A5) ; VAR theSpeech
JSR SpeechOn ; jump to open routine
MOVE.W (SP)+,D0 ; check result
BEQ @1 ; branch if ok
; If driver open not successful then clear speechOK flag
; to prevent further use of invalid driver
MOVE.W #0,speechOK(A5)
; You could also put an error dialog here
@1 ; branch to this point if open is successful
;--------------- Get the Dialog from the Resource file --
CLR.L -(SP) ;Clear Space For DialogPtr
MOVE #theDialog,-(SP) ; Resource #
CLR.L -(SP) ;Storage Area on heap
MOVE.L #-1,-(SP);Above All Others
_GetNewDialog ;Get New Dialog
MOVE.L (SP)+,D6 ;Save DialogPtr In D6
;PROCEDURE SetPort (gp: GrafPtr)
MOVE.L D6,-(SP) ;Move Dialog Pointer To Stack
_SetPort ;Make It The Current Port
; usually you would not use DrawDialog, but we need to draw
; the dialog contents once before saying them, then go to
; Modal dialog which will draw the contents again
;PROCEDURE DrawDialog(dp:DialogPtr)
MOVE.L D6,-(SP)
_DrawDialog
;------------------- Speak pre-translated speech -------
; now Say the static text item which has been pre-translated
; into a phoneme string with the same ID as the dialog
; first, check our flag to make sure that driver is open
TST.W speechOK(A5)
BEQ @2 ; driver not valid
; branch around speech stuff
; driver valid, go ahead and speak
CLR.L -(SP) ; space for result
MOVE.L #'PHNM',-(SP); resource type PHNM
MOVE.W #theDialog,-(SP) ; use same ID as dialog
_GetResource
MOVE.L (SP)+,A0 ; handle to phoneme string
CLR.W -(SP) ; space for result code
MOVE.L theSpeech(A5),-(SP) ; speech global handle
MOVE.L A0,-(SP) ; phonemes, from above
JSR MacinTalk ; say it
MOVE.W (SP)+,D0 ; get result code
@2 ; branch to here to avoid speaking with invalid driver
;------------------- Dialog loop ------------------
; now process the dialog
dialogloop
;PROCEDURE ModalDialog (filterProc: ProcPtr;
; VAR itemHit: INTEGER)
CLR.L -(SP) ;default filter proc
PEA ItemHit(A5) ;Item Hit Data
_ModalDialog
; see which button was pushed
CMP.W #quitbutton,ItemHit(A5) ; quit button?
BEQ closeit
CMP.W #sayitbutton,ItemHit(A5) ; say it?
BEQ sayit
; none of the above
BRA dialogloop ; go around again
;----------------- Translate English to Phonetics and speak ------
sayit
; first, check our flag to make sure that driver is open
TST.W speechOK(A5)
BEQ @3 ; driver not valid
; branch around speech stuff
; driver valid, go ahead and speak
; get the current text in the edit text box
MOVE.L D6,-(SP) ; we saved DialogPtr here
MOVE.W #usertext,-(SP) ; the edit text item
PEA theType(A5) ; VAR type
PEA theItem(A5) ; VAR item
PEA theRect(A5) ; VAR box
_GetDItem
;PROCEDURE GetIText(item:Handle;VAR text: Str255)
MOVE.L theItem(A5),-(SP) ; result of GetDItem
PEA theString(A5) ; VAR text
_GetIText
; now feed the text into reader to translate it into phonemes
; set up an empty handle first for Reader to fill with phonemes
;FUNCTION NewHandle(logicalSize: Size): Handle
; logicalSize => D0, Handle => A0
MOVEQ #0,D0 ; set up empty handle
_NewHandle
MOVE.L A0,phHandle(A5) ; save Handle for later
CLR.W -(SP) ; space for result
MOVE.L theSpeech(A5),-(SP) ; speech globals
PEA theString+1(A5) ;Ptr to string, skip length byte
CLR.L D0 ; clear out D0
MOVE.B theString(A5),D0 ; put length byte in D0
MOVE.L D0,-(SP) ; use LongInt for length
MOVE.L phHandle(A5),-(SP) ; we just allocated this
JSR Reader ; do translation
MOVE.W (SP)+,D0 ; get result
; now feed the phonemes to MacinTalk
;FUNCTION MacinTalk(theSpeech:SpeechHandle;Phonemes:Handle)
; :SpeechErr
CLR.W -(SP) ; space for result code
MOVE.L theSpeech(A5),-(SP) ; speech globals handle
MOVE.L phHandle(A5),-(SP) ; handle to phonemes
JSR MacinTalk ; say it
MOVE.W (SP)+,D0 ; get result code
; deallocate handle and loop back for more
; PROCEDURE DisposHandle (h: Handle)
; h => A0
MOVE.L phHandle(A5),A0 ; this is where the phonemes are
_DisposHandle
@3 ; branch to here to avoid speaking with invalid driver
BRA dialogloop
;------------------ Close up shop -----------------------
closeit
;PROCEDURE CloseDialog (theDialog: DialogPtr);
MOVE.L D6,-(SP) ;Get Dialog Pointer To Close
_CloseDialog ;Close Window
; first, check our flag to make sure that driver is open
TST.W speechOK(A5)
BEQ @4 ; driver not valid
; branch around speech stuff
; driver valid, go ahead and close it
; PROCEDURE SpeechOff(theSpeech: SpeechHandle)
MOVE.L theSpeech(A5),-(SP) ; handle to speech globals
JSR SpeechOff ; close it up
@4 ; branch to here to avoid closing invalid driver
_ExitToShell ;Return To Finder
;--------------- Initialize Managers Subroutine ----------
InitManagers
;PROCEDURE InitGraf (globalPtr: QDPtr);
PEA -4(A5) ;Space Created For Quickdraw's Use
_InitGraf ;Init Quickdraw
_InitFonts ;Init Font Manager
_InitWindows ;Init Window Manager
;PROCEDURE InitDialogs (restartProc: ProcPtr);
CLR.L -(SP) ; NIL restart proc
_InitDialogs ;Init Dialog Manager
;PROCEDURE TEInit
_TEInit
_InitCursor ; set arrow cursor
RTS ; end of InitManagers
;--------------------- Static Data -----------------------------
NULL DC.W 0 ; null string
/OUTPUT CheapTalkCode
; Since this code file will not run successfully until it has been
; joined with the resources by RMaker, set its file type so
; that it cannot be mistakenly run from the desktop.
; Link output files are usually of type APPL
/TYPE 'CODE' 'LINK'
; link our code, CheapTalk, with the glue for the speech driver
; routines
CheapTalk
SpeechASM
$
* CheapTalk.R
* create the application Cheap Talk
* First define all the resources, and then include the code
* output file name, File type, file creator
MDS2:Cheap Talk
APPLCHTK
* dialog resource is a vanilla dialog
* make it pre-loaded (4) to speed things up
Type DLOG
,1 (4)
60 100 260 400
Visible NoGoAway
1
0
1
* DITL resource for dialog has one static text item,
* one edit text item,
* and two buttons: 'Say it' and 'Quit'
* The 'Say it' button is item #1 so that hitting return is
* the same as clicking 'Say it'
* make it pre-loaded (4) to speed things up
Type DITL
demo,1 (4)
4
Button
170 200 190 250
Say it
Button
170 50 190 100
Quit
EditText
40 30 150 270
Enter English text here
StaticText Disabled
10 30 30 290
This is a talking dialog demonstration
* PHNM resource is defined by us to be a string without length
* byte. It is a phonetic translation of the static text in the DITL
* of the same resource #
* make it pre-loaded (4) to speed things up
Type PHNM = GNRL
demo,1 (4)
.S
DHIH9S, IHZ AH TAO4KIHNX DAY6AELAA1G DIH1MUNSTREY5SHUN #
* now include the code produced by the linker
INCLUDE MDS2:CheapTalkCode