Feb 00 Getting Started
Volume Number: 16 (2000)
Issue Number: 2
Column Tag: Getting Started
Speech Channels
by Dan Parks Sydow
Getting a speech-capable program ready for the use of different voices
In last month's Getting Started article we started in on the topic of speech in Macintosh programs. There you saw how your program can verify that the user's Mac is able to output speech, and you saw how your program can use the SpeakString() function to speak the text of one string. That was a good start, but if your program is to make good use of speech there are a couple of other speech-related issues to cover. You'll want your program to have the ability to speak more than a single string (without having to repeatedly call SpeakString()). This month we see how that can be done. You'll also want to give your program the power to speak in different voices. In last month's code we relied on the default voice - whatever voice is currently set for use on the user's machine. But the user has a wealth of different voices stored on his or her Mac - and your program can make use of any one of them! This month we'll see how to make use of speech channels. An application-allocated speech channel is a prerequisite for speaking in a different voice - so armed with this new speech channel information you'll be ready for next month's Getting Started article where we wrap up the topic of speech by creating an application that speaks in different voices.
Speech Basics Review
It's the Speech Manager - along with a speech synthesizer (that's a part of the Mac system software), the Sound Manager, and a Mac's audio hardware - that makes speech possible. If a Mac program is to speak, it should include the Speech.h universal header file to ensure that the compiler recognizes the speech-related Toolbox functions.
#include <Speech.h>
Before attempting to speak, your program should verify that the user's machine will be able to output the sound (refer to last month's article if you need an explanation of the following snippet).
OSErr   err;
long    response;
long    mask;

err = Gestalt( gestaltSpeechAttr, &response );
if ( err != noErr )
    DoError( "\pError calling Gestalt" );

mask = 1 << gestaltSpeechMgrPresent;
/* Parenthesize the bit test: == binds more tightly than & */
if ( ( response & mask ) == 0 )
    DoError( "\pSpeech Manager not present" );
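The presence test comes down to checking a single bit of the Gestalt response. The following sketch isolates that bit test in plain C so it can be tried without the Toolbox. The names kSpeechMgrPresentBit and AttributeBitIsSet() are hypothetical stand-ins, and the bit number 0 is an assumption that should be checked against the Gestalt constants in your own universal headers.

```c
#include <assert.h>

/* Hypothetical stand-in for gestaltSpeechMgrPresent. The value 0 is an
   assumption; confirm it against your own universal headers. */
enum { kSpeechMgrPresentBit = 0 };

/* Returns 1 if the given bit is set in a Gestalt-style response, else 0. */
int AttributeBitIsSet( long response, int bitNumber )
{
    long mask;

    assert( bitNumber >= 0 && bitNumber < 31 );
    mask = 1L << bitNumber;

    /* The inner parentheses matter: == binds more tightly than &, so
       an unparenthesized test would compare mask to 0 first. */
    return ( ( response & mask ) != 0 );
}
```

A caller would pass the response value filled in by Gestalt() along with the bit number of the attribute of interest.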
To speak a single string of text, use the SpeakString() function. Pass this function a Pascal-formatted string and SpeakString() turns the characters that make up the string into speech that emits from the user's Mac. Follow a call to SpeakString() with repeated calls to SpeechBusy() to ensure that time is provided for the full text of the string to be spoken.
OSErr   err;

err = SpeakString( "\pThis is a test." );
if ( err != noErr )
    DoError( "\pError attempting to speak a phrase" );

/* SpeechBusy() returns the number of channels still speaking */
while ( SpeechBusy() > 0 )
    ;
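Because SpeechBusy() reports how many speech channels are still synthesizing, rather than a plain true or false, a wait loop is safest when it tests for a count greater than zero. The sketch below models that behavior in plain C with a mock in place of the real Toolbox call. MockStartSpeech(), MockSpeechBusy(), and WaitUntilSilent() are hypothetical names, and the "three polls per channel" timing is invented purely for illustration.

```c
#include <assert.h>

/* Mock state standing in for the Speech Manager (illustration only). */
enum { kPollsPerChannel = 3 };
static short gActiveChannels = 0;   /* channels still "speaking"          */
static int   gPollsLeft      = 0;   /* polls until the next one finishes  */

/* Pretend that the given number of channels has just begun speaking. */
void MockStartSpeech( short channels )
{
    gActiveChannels = channels;
    gPollsLeft      = kPollsPerChannel;
}

/* Mock of SpeechBusy(): returns the count of channels still speaking.
   Every kPollsPerChannel calls, one channel finishes. */
short MockSpeechBusy( void )
{
    short busy = gActiveChannels;

    if ( gActiveChannels > 0 && --gPollsLeft == 0 )
    {
        gActiveChannels--;
        gPollsLeft = kPollsPerChannel;
    }
    return busy;
}

/* Polls until no channel is speaking; returns how many polls were busy.
   Testing for > 0 keeps the loop correct even when the count exceeds 1. */
long WaitUntilSilent( void )
{
    long polls = 0;

    while ( MockSpeechBusy() > 0 )
        polls++;
    return polls;
}
```

With two channels active, a loop that compared the result to true would exit at once, because 2 is not equal to 1. The greater-than-zero test waits for both channels to finish.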
Speech Channels
Whenever a Mac program speaks, a speech channel is involved. A speech channel is a data structure that holds descriptive information about speech that is to "pass through" that specific channel. The most important property that the channel keeps track of is the voice to use. If you want your program to speak in two different voices (perhaps a conversation between two people is going on), then you'd create two speech channels - each channel specifying a different voice. Each time your program speaks, you'd designate which channel is used, basing the decision on which voice is appropriate for the words that are about to be spoken.
Last month you learned about SpeakString() - the easy-to-use Toolbox function that speaks the text comprising a single string. In last month's article we didn't need to cover speech channels because when a call is made to SpeakString(), the Speech Manager takes care of the allocation of a speech channel. The Speech Manager then uses that channel to speak, and disposes of that same channel when speaking has completed. Because the Speech Manager takes care of all this behind the scenes, SpeakString() is very simple to use. There's a drawback to using SpeakString() though - the programmer lacks control of the speech channel. Because of this, a specific voice can't be selected. Instead, SpeakString() always uses the system default voice. Here you'll see how to create a new speech channel and then use a different Toolbox function - SpeakText() - to specify that this new channel be used when your program speaks.
Use the Toolbox function NewSpeechChannel() to create a new speech channel. This routine allocates memory for a SpeechChannelRecord. This is the data structure that holds information about a speech channel. NewSpeechChannel() finishes by returning to your program a SpeechChannel - a pointer to the new speech channel record.
SpeechChannel channel;
OSErr err;
err = NewSpeechChannel( nil, &channel );
The first NewSpeechChannel() argument is a pointer to a voice specification data structure. As you'll see in next month's Getting Started article, this data structure corresponds to the voice that is to be used for speech generated through this one speech channel. To simply use the system default voice, pass nil as the first argument (as shown here).
The second argument is the address of a SpeechChannel variable. A SpeechChannel is a pointer to a speech channel record - the data structure that holds the information about a speech channel. After NewSpeechChannel() allocates the speech channel record, this variable references the new record.
The Toolbox should have no problem allocating a new speech channel, but you'll want to prepare for the worst. If NewSpeechChannel() returns an OSErr value other than noErr, you can use some standard error-catching routine like our own DoError(), or you can opt to call the Toolbox function DisposeSpeechChannel() to dispose of the problematic channel yourself. You might also want to set the SpeechChannel variable to nil to indicate to your program that the variable doesn't hold a valid reference to a speech channel.
if ( err != noErr )
{
    err = DisposeSpeechChannel( channel );
    channel = nil;
}
Your primary use of a speech channel will be in calls to the Toolbox function SpeakText(). Unlike SpeakString(), which accepts just a string of text to speak, SpeakText() accepts a speech channel and a buffer of text to speak. Here's a typical call to SpeakText():
OSErr    err;
Str255   str = "\pHere's a test of SpeakText.";

err = SpeakText( channel, (Ptr)(str + 1), str[0] );
if ( err != noErr )
    DoError( "\pError attempting to speak a phrase" );

/* SpeechBusy() returns the number of channels still speaking */
while ( SpeechBusy() > 0 )
    ;
The first SpeakText() argument is a speech channel. Here we aren't making good use of the speech channel - but next month, we will.
The second argument is a pointer to the area of memory that holds the text to speak. In the above example I've used a string to represent the text. Your program could also append strings together in a buffer, or read text from an external file to a buffer. A variable of type Str255 uses its first byte to hold the length (in bytes) of the string, and the remaining bytes to hold the characters that make up the string. If we just used str as the second argument, SpeakText() would attempt to start speaking using the value in the first byte of str - which is the length byte, not a character. By using str + 1, we tell SpeakText() to move to the second byte of str and start speaking from there. Finally, because SpeakText() requires a generic pointer to a buffer, the result needs to be typecast to type Ptr. Collectively, (Ptr)(str + 1) makes up the second argument.
The third SpeakText() argument is the number of bytes that should be used from the buffer. In the case of a Str255 variable, the number of bytes making up the string is held in the first byte of the variable. So the first element (byte [0]) in the Str255 variable represents the number of buffer bytes holding characters to speak.
As you did for SpeakString(), end with repeated calls to SpeechBusy() to provide ample time for the Mac to speak the words in the SpeakText() buffer.
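To make the Pascal string layout concrete, here is a minimal sketch in plain C, written so that it can be compiled without the Toolbox. PStr255 is a local stand-in for the Toolbox Str255 type, and CStringToPascal() and AppendPascalText() are hypothetical helpers, not Toolbox routines. The sketch shows the length byte at index 0, the characters starting at index 1, and one way several strings might be appended into a single text buffer.

```c
#include <assert.h>
#include <string.h>

/* Local stand-in for the Toolbox Str255 type: one length byte followed
   by up to 255 characters. Defined here only so the sketch compiles. */
typedef unsigned char PStr255[256];

/* Hypothetical helper: fills dest with a Pascal-style copy of a C
   string, truncating at 255 characters. */
void CStringToPascal( const char *source, PStr255 dest )
{
    size_t length = strlen( source );

    if ( length > 255 )
        length = 255;
    dest[0] = (unsigned char)length;      /* byte 0 holds the length    */
    memcpy( dest + 1, source, length );   /* characters start at byte 1 */
}

/* Hypothetical helper: appends the characters of a Pascal string (not
   its length byte) to a text buffer. Returns the new number of bytes
   used, or the old count unchanged if the string would not fit. */
long AppendPascalText( char *buffer, long used, long capacity,
                       const PStr255 str )
{
    long length = str[0];

    assert( used >= 0 && used <= capacity );
    if ( length > capacity - used )
        return used;
    memcpy( buffer + used, str + 1, (size_t)length );
    return used + length;
}
```

The buffer address and the final byte count produced this way correspond to the second and third arguments that SpeakText() expects.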
SpeechChan
This month's program is SpeechChan. This simple, non-menu-driven program provides a working example of how to create and make use of a speech channel. When run, SpeechChan speaks the phrase "We've successfully opened a speech channel." After speaking those words the program quits.
Creating the SpeechChan Resources
Start by creating a new folder named SpeechChan in your CodeWarrior development folder. Launch ResEdit and create a new resource file named SpeechChan.rsrc. Specify that the SpeechChan folder serve as the resource file's destination. This resource file will hold only two resources. As shown in Figure 1, you'll need just an ALRT and a DITL resource for this project.
Figure 1. The SpeechChan resources.
The resource file will hold one alert resource - ALRT 128. Corresponding to this ALRT is DITL 128. Together these two resources define the program's error-handling alert. If your version of the SpeechChan program doesn't commit any serious errors, then the program won't make use of these resources.
Creating the SpeechChan Project
Create a new project by starting up CodeWarrior and choosing New Project from the File menu. Use the MacOS:C_C++:MacOS Toolbox:MacOS Toolbox Multi-Target project stationery for the new project. Uncheck the Create Folder check box before clicking the OK button. Name the project SpeechChan.mcp, and specify that the project's destination be the existing SpeechChan folder.
Now add the SpeechChan.rsrc resource file to the project. Remove the SillyBalls.rsrc file. The ANSI Libraries folder can stay or go - as is the case for most of our example projects, we aren't making use of any ANSI C libraries, so you can remove them from the project if you wish.
If you plan on making a PowerPC version (or fat version) of the SpeechChan program, be sure to add the SpeechLib library to the PowerPC targets of your project. As mentioned last month, you'll want to choose Add Files from the Project menu and work your way over to this library. You'll find it in the Metrowerks CodeWarrior:MacOS Support:Libraries:MacOS Common folder. If you can't find the library in that folder, use Sherlock to search your hard drive for it. When you add the library to the project CodeWarrior displays a dialog box asking you which targets to add the library to. Check the two PPC targets. In Figure 2 you see how your project will look with the SpeechLib library added to it.
Figure 2. The SpeechChan project.
Now create a new source code window by choosing New from the File menu. Save the window, giving it the name SpeechChan.c. Choose Add Window from the Project menu to add the new empty file to the project. Remove the SillyBalls.c placeholder file from the project window. At this point you're ready to type in the source code.
If you want to save yourself a little typing, connect to the Internet and visit MacTech's ftp site at ftp://ftp.mactech.com/src/mactech/volume16_2000/16.02.sit. There you'll find the SpeechChan source code file available for downloading.
Walking Through the Source Code
SpeechChan.c starts with the inclusion of the Speech.h file. If you compile the source file and you receive a number of undefined function errors, then you almost certainly forgot to include this universal header file.
/********************** includes *********************/
#include <Speech.h>
After the #include comes a single constant. The constant kALRTResID holds the ID of the ALRT resource used to define the error-handling alert.
/********************* constants *********************/
#define kALRTResID 128
Next come the program's function prototypes.
/********************* functions *********************/
void ToolBoxInit( void );
SpeechChannel OpenOneSpeechChannel( void );
void DoError( Str255 errorString );
The main() function of SpeechChan starts with the declaration of several variables. The first three variables, err, response, and mask, are used in the determination of whether speech generation is possible on the user's Mac. The other two variables, str and channel, are used in the generation of speech.
/********************** main *************************/
void main( void )
{
    OSErr           err;
    long            response;
    long            mask;
    Str255          str = "\pWe've successfully opened a speech channel.";
    SpeechChannel   channel;
After the Toolbox is initialized the previously discussed speech-related tests are made:
    ToolBoxInit();

    err = Gestalt( gestaltSpeechAttr, &response );
    if ( err != noErr )
        DoError( "\pError calling Gestalt" );

    mask = 1 << gestaltSpeechMgrPresent;
    /* Parenthesize the bit test: == binds more tightly than & */
    if ( ( response & mask ) == 0 )
        DoError( "\pSpeech Manager not present" );
Now we get down to business. First, a speech channel is allocated and a reference to it is saved in local variable channel. Note that this local variable could have been a global variable - the choice is dependent on your programming style. The application-defined function OpenOneSpeechChannel() is described just ahead.
    channel = OpenOneSpeechChannel();
    if ( channel == nil )
        DoError( "\pError opening a speech channel" );
With a valid speech channel, we can make a call to SpeakText(). The arguments here match the ones described in this article's previous SpeakText() example snippet.
    err = SpeakText( channel, (Ptr)(str + 1), str[0] );
    if ( err != noErr )
        DoError( "\pError attempting to speak a phrase" );

    /* SpeechBusy() returns the number of channels still speaking */
    while ( SpeechBusy() > 0 )
        ;
When we're finished speaking, we dispose of the speech channel. Unlike a call to SpeakString(), SpeakText() required our efforts in creating the speech channel. We made it, so we need to throw it out.
    err = DisposeSpeechChannel( channel );
    if ( err != noErr )
        DoError( "\pError disposing speech channel" );
}
ToolBoxInit() remains the same as previous versions.
/******************** ToolBoxInit ********************/
void ToolBoxInit( void )
{
    InitGraf( &qd.thePort );
    InitFonts();
    InitWindows();
    InitMenus();
    TEInit();
    InitDialogs( nil );
    InitCursor();
}
OpenOneSpeechChannel() calls NewSpeechChannel() to create a new speech channel. The OpenOneSpeechChannel() routine is called from main(), so that's where the reference to the newly created channel gets returned.
/*************** OpenOneSpeechChannel ****************/
SpeechChannel OpenOneSpeechChannel( void )
{
    SpeechChannel   channel;
    OSErr           err;

    err = NewSpeechChannel( nil, &channel );
    if ( err != noErr )
    {
        err = DisposeSpeechChannel( channel );
        channel = nil;
    }

    return ( channel );
}
DoError() is unchanged from prior versions. A call to this function results in the posting of an alert that holds an error message. After the alert is dismissed the program ends.
/********************** DoError **********************/
void DoError( Str255 errorString )
{
    ParamText( errorString, "\p", "\p", "\p" );
    StopAlert( kALRTResID, nil );
    ExitToShell();
}
Running SpeechChan
Run SpeechChan by choosing Run from CodeWarrior's Project menu. After the code is compiled, CodeWarrior runs the program. After the program speaks a single phrase, the program ends. If the program successfully builds and appears to run and quit without speaking, then there's a good chance you have the speaker volume on your Mac set to 0!
Till Next Month...
Last month you saw how your program can speak the text in a single string. Here you saw how your program can speak a greater amount of text by way of a buffer. You also saw how to create a speech channel. Next month we'll wrap up the topic of speech by discussing how you can specify which voice - male, female, young, old, even robotic - you want your program to speak in. Until then you can study up on speech generation by looking at the Sound volume of Inside Macintosh...