Threading
Volume Number: 10
Issue Number: 11
Column Tag: Essential Apple Technology
Related Info: Process Manager
Threading Your Apps
Tying it all together
By Randy Thelen, Apple Computer, Inc.
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
About the Author
Randy Thelen - Randy (sometimes known as Random) is the kind of Apple engineer who just keeps coming up with wacky ideas that just might work. In his spare time, he's been playing with Threads way too much, and was last seen listening to They Might Be Giants while excitedly showing off fast, color, bit-shuffling code.
Tying it all together
The Thread Manager is a system software extension which allows applications to have multiple threads of execution. With multiple threads of execution you can easily move the processing of relatively lengthy operations into the background, thus creating a more responsive application for your users. In this article we'll learn what the terminology is, we'll explore the programming model you'll want to employ to make best use of threads, and we'll examine the application programming interface (API).
The Thread Manager extension version 2.0.1 (68K and PowerPC based threads) is available for distribution with your application from Apple for $50 (through APDA) and it is built into System 7.5. Further, Apple's future O/S endeavors will be threaded. If you employ a threaded model in your current application structure, it will carry over transparently (or with only minor API changes) to the next generation of system software.
Terminology
A few words from the man. Without using any names (Eric Anderson), there's this Smart Guy over here at Apple who was instrumental in actually packaging the Thread Manager for shipment. He wrote the following paragraphs for developers (I felt it fitting to include them in this article for you):
The Thread Manager is the current MacOS solution for lightweight concurrent processing. Multithreading allows an application process to be broken into simple subprocesses that proceed concurrently in the same overall application context. Conceptually, a thread is the smallest amount of processor context state necessary to encapsulate a computation. Practically speaking, a thread consists of a register set, a program counter, and a stack. Threads have a fast context switch time due to their minimal context state requirement and operate within the application context which gives threads full application global access. Since threads are hosted by an application, threads within a given application share the address space, file access paths and other system resources associated with that application. This high degree of data sharing enables threads to be lightweight and the context switches to be very fast relative to the heavyweight context switches between Process Manager processes.
An execution context requires processor time to get anything done, and there can be only one thread at a time using the processor. So, just like applications, threads are scheduled to share the CPU, and the CPU time is scheduled in one of two ways. The Thread Manager provides both cooperative and preemptive threads. Cooperative threads explicitly indicate when they are giving up the CPU. Preemptive threads can be interrupted and gain control at (most) any time. The basis for the difference is that there are many parts of the MacOS and Toolbox that can not function properly when interrupted and/or executed at arbitrary times. Due to this restriction, threads using such services must be cooperative. Threads that do not use the Toolbox or OS may be preemptive.
Cooperative threads operate under a scheduling model similar to the Process Manager, wherein they must make explicit calls for other cooperative threads to get control. As a result, they are not limited in the calls they can make as long as yielding calls are properly placed. Preemptive threads operate under a time slice scheduling model; no special calls are required to surrender the CPU for other preemptive or cooperative threads to gain control. For threads which are compute-bound or use MacOS and Toolbox calls that can be interrupted, preemptive threads may be the best choice; the resulting code is cleaner than if partial results were saved and control then handed off to other threads of control.
That said, let's make it clear early on that preemptive threads are not supported on the PowerMacintosh at this time and therefore developers are strongly encouraged to use cooperative threads. (In reality, this hasn't posed much of a problem for most developers, given the number of restrictions for preemptive threads.)
Programming Model
In this section, we'll discuss how to structure your program. Here's a block diagram of a basic application:
WNE Loop is, of course, that block of code which you cycle through more rapidly than GetCaretTime() ticks expire. Right? If not, that's one of the first things we'll learn about the Threads programming model. It's actually possible to cycle through your event loop more quickly, giving your customer a more responsive computer.
A thread is, of course, the center piece of this article.
Yielding is the process of giving up the CPU for some period of time. As Eric mentioned in his paragraphs, preemptive threads are switched out by a timer interrupt; they need no explicit yield call. Cooperative threads do yield: they call YieldToAnyThread(). When we examine the API, we'll see this call. For now, let's just remember that yielding is the process of asking the Thread Manager to find the thread of execution which should execute next.
The circular shapes represent the looping nature of a thread. As it turns out, not all threads loop. Some follow some series of steps: A, B, C, ..., etc. Those threads, if they take a long time, should be making asynchronous I/O calls with yield calls where SyncWait would normally execute. For example,
/* 1 */
AsyncFileManagerCall( &pb);
while( pb.ioResult > 0)
    YieldToAnyThread();
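The polling pattern above can be tried out in portable C. The following is only a sketch under stated assumptions: `DemoParamBlock`, `DemoYieldOnce` and `DemoAsyncCallWithYield` are hypothetical names invented for this illustration, the "I/O" is simulated by a countdown, and no real File Manager or Thread Manager call is made. The one detail taken from the Toolbox is the convention that ioResult stays positive while a request is pending and becomes zero (noErr) or negative (an error) on completion.

```c
#include <assert.h>

/* Hypothetical stand-in for a File Manager parameter block. */
typedef struct {
    volatile short ioResult;    /* > 0 while pending, 0 on success, < 0 on error */
    int            passesLeft;  /* simulation only: yields left until completion */
    short          finalResult; /* simulation only: result reported when done */
} DemoParamBlock;

/* Simulated yield: each call stands for "other threads ran for a while".
   After the configured number of calls, the pretend I/O completes. */
static void DemoYieldOnce(DemoParamBlock *pb)
{
    if (pb->passesLeft > 0 && --pb->passesLeft == 0)
        pb->ioResult = pb->finalResult;
}

/* Starts a pretend asynchronous call, then yields until it completes.
   Returns the final ioResult. */
short DemoAsyncCallWithYield(DemoParamBlock *pb, int passes, short finalResult)
{
    assert(passes > 0 && finalResult <= 0);

    pb->ioResult    = 1;            /* request is now "in progress" */
    pb->passesLeft  = passes;
    pb->finalResult = finalResult;

    while (pb->ioResult > 0)        /* same test as the snippet above */
        DemoYieldOnce(pb);

    return pb->ioResult;
}
```

Calling `DemoAsyncCallWithYield(&pb, 3, 0)` spins through three simulated yields and returns 0; passing a negative final result shows that the loop also ends when the request fails, so the caller still has to inspect the returned value.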
In our block diagram, we find the two steps most programs have: the initialization and the WaitNextEvent loop. The WNE loop looks something like this (code snippet from Sprocket, courtesy of Dave Falkenburg - thanks Dave, let's do lunch real soon):
/* 2 */
Boolean     gDone = false;
Boolean     gMenuBarNeedsUpdate = false;
long        gRunQuantum = GetCaretTime();   /* ticks between WaitNextEvent calls */
long        gSleepQuantum = 3;              /* sleep value passed to WaitNextEvent */
RgnHandle   gMouseRegion = nil;
Boolean     gHasThreadManager = false;

void MainEventLoop(void)
{
    EventRecord     anEvent;
    unsigned long   nextTimeToCheckForEvents = 0;

    while (!gDone)
    {
        /* Give the other threads a turn on every pass */
        if (gHasThreadManager)
            YieldToAnyThread();

        if (gMenuBarNeedsUpdate)
        {
            gMenuBarNeedsUpdate = false;
            DrawMenuBar();
        }

        /* Throttle: only call WaitNextEvent once the run quantum has expired */
        if ((gRunQuantum == 0) ||
            (TickCount() > nextTimeToCheckForEvents))
        {
            nextTimeToCheckForEvents = TickCount() + gRunQuantum;
            (void) WaitNextEvent( everyEvent, &anEvent,
                                  gSleepQuantum, gMouseRegion);
            HandleEvent(&anEvent);
        }
    }
}
Immediately we find two things: PowerMacintosh WaitNextEvent smarts and support for the Thread Manager. The WNE smarts is simply a mechanism for throttling the frequency with which WaitNextEvent is called on a PowerMacintosh. (The issue here, if you're not already familiar with it, is that WNE invokes a context switch from PowerPC code to 68K emulation and if the application calls WNE too frequently then performance goes into the proverbial toilet.)
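To get a feel for how much the throttle saves, here is a rough, portable sketch of the same test run against a simulated clock. Everything in it is an assumption made for illustration: `CountEventCalls` is a hypothetical helper, the clock simply advances one tick every `loopsPerTick` passes, and a counter stands in for the WaitNextEvent call.

```c
#include <assert.h>

/* Hypothetical stand-in for the tick counter type; not a Toolbox type. */
typedef unsigned long DemoTicks;

/* Counts how many times the event call would run over `iterations`
   passes through the loop, with a simulated clock that advances one
   tick every `loopsPerTick` passes. */
unsigned long CountEventCalls(unsigned long iterations,
                              unsigned long loopsPerTick,
                              DemoTicks runQuantum)
{
    DemoTicks     now;
    DemoTicks     nextTimeToCheck = 0;
    unsigned long calls = 0;
    unsigned long i;

    assert(loopsPerTick > 0);

    for (i = 0; i < iterations; i++) {
        now = i / loopsPerTick;                 /* simulated TickCount() */
        if (runQuantum == 0 || now > nextTimeToCheck) {
            nextTimeToCheck = now + runQuantum;
            calls++;                            /* WaitNextEvent would go here */
        }
    }
    return calls;
}
```

With 1000 passes at 10 passes per tick, a quantum of 0 makes the event call on every pass (1000 calls), while a quantum of 5 makes it 17 times. Because the comparison is a strict `>`, the call happens every quantum + 1 ticks rather than every quantum ticks.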
The Thread Manager support is, as you can see, trivial. (There was, one would hope, the appropriate check for the presence of the Thread Manager. We'll see that code in a couple of pages.)
The Toolbox trap YieldToAnyThread() uses trap number $ABF2 (which goes by the name ThreadDispatch, and gets a selector in D0). If we glance back at our block diagram, we see that the Toolbox trap YieldToAnyThread() calls into the Thread Manager. It yields the CPU to one of the other threads of execution within the program context. Each thread is then responsible for calling YieldToAnyThread() frequently enough that two things will happen: first, the insertion point, if you've got one, will blink with reasonable consistency; second, events (mouse downs, etc.) will be handled pretty quickly.
With regard to the insertion point blinking, the rule of thumb here really varies: if you've got an arbitrary number of threads (potentially greater than GetCaretTime()'s smallest value), you'll want to yield before a tick expires; if you've only got a few threads, you may want to time your processing using the same kind of TickCount() > someQuantum algorithm as what Dave's done above.
Regarding events, there is something very important you should be familiar with. The Thread Manager will check to see if any events are pending for the application each time a thread yields. If there are events (or there is an event), the thread that gets time next is the main thread. If the main thread then yields without processing the event (or all events), another thread is executed, but upon the next yield, the main thread will then get time again. It's an algorithm that ensures two things: first, the main thread gets time often enough to handle incoming events as quickly as threads yield, and second, events are not starved (threads are always hungry for CPU time).
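The rule just described can be written down as a toy model. This is not the Thread Manager's scheduler, only a sketch of the behavior stated above: `DemoScheduler` and `DemoNextThread` are hypothetical names, the main thread is assumed to be index 0, and plain round-robin order among threads is an assumption of the model (the real scheduler makes no such promise).

```c
#include <assert.h>

enum { kDemoMainThread = 0 };   /* index of the main thread in this model */

typedef struct {
    int threadCount;    /* total number of threads, main thread included */
    int cursor;         /* round-robin position; start it at the main thread */
} DemoScheduler;

/* Returns the index of the thread that runs after `current` yields.
   eventPending is nonzero while an event is waiting for the application. */
int DemoNextThread(DemoScheduler *s, int current, int eventPending)
{
    assert(s->threadCount > 0 && current >= 0 && current < s->threadCount);

    /* A pending event sends control to the main thread, unless the main
       thread is the one yielding. The round-robin position is untouched. */
    if (eventPending && current != kDemoMainThread)
        return kDemoMainThread;

    /* Otherwise advance the round-robin position (model assumption). */
    s->cursor = (s->cursor + 1) % s->threadCount;

    /* The main thread has just yielded with an event still pending:
       skip over it so that some other thread gets the turn. */
    if (eventPending && s->cursor == kDemoMainThread && s->threadCount > 1)
        s->cursor = 1;

    return s->cursor;
}
```

With three threads and an event left unprocessed, successive yields in this model run thread 1, main, thread 2, main, thread 1, and so on: the main thread gets every other turn, yet the remaining threads still make progress.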
Obviously these over-simplified rules won't work for everybody. Hopefully, after you've read the bulk of these articles, you'll get some feel for where to begin your experimentation for coming up with a processing time model that works best for your application and thread requirements.
Your application will create these threads whenever it's appropriate to do so. In SortPicts (one of the apps included on the source disk and online sites this month), for example, a thread is created when a window is opened. In ThreadBrot (a threaded Mandelbrot, also on disk and online), threads are created as rectangles within the complex number plane are halved (it's a divide-and-conquer algorithm) until some lower bound rectangle size is reached - recursive algorithms must always have a base case. In Steve Sisak's article next month, we'll see threads created for sending AppleEvents - who'd a thunk it? In short, you're free to create threads during the execution of your program whenever your process is given time by the Process Manager. In fact, you can preallocate threads into a pool (which the Thread Manager will maintain for you) and then you can spawn new threads of execution during preemptive threads or during interrupts (remember, in 68K land your A5 world must be set correctly; for PowerPC applications, you must make this call from PowerPC code - or the Thread Manager will crash your application).
API
There are only a couple of calls you need to know about: NewThread and YieldToAnyThread. With these two calls, you could make your programs amazingly responsive. With other calls we'll discuss, you can do much more.
/* 3 */
FUNCTION NewThread (threadStyle: ThreadStyle;
                    threadEntry: ThreadEntryProcPtr;
                    threadParam: LONGINT;
                    stackSize: Size;
                    options: ThreadOptions;
                    threadResult: LongIntPtr;
                    VAR threadMade: ThreadID): OSErr;
You'll call NewThread whenever you wish to create a new thread of execution. The thread styles are kPreemptiveThread and kCooperativeThread. Again, unless you're writing a program for use only on a 68K machine, you'll want to limit yourself to kCooperativeThread.
Your threadEntry is the address of the function which will get executed when the thread is first executed. You should remember that this entry point will be called only once for your thread. From then on, when you call YieldToAnyThread, your thread will continue execution from the instruction immediately following the yield call.
The threadParam is passed to your thread when it is first called. This allows you to pass a value or the address of an arbitrarily-large data structure to your thread.
The stackSize field defines the size that you believe to be adequate to maintain context switch information, satisfy Toolbox stack requirements, and fulfill your thread's stack needs. If you pass zero for this field, you will get the default stack size. (You can inspect this size by calling GetDefaultThreadStackSize, and you can set it by calling SetDefaultThreadStackSize.)
The options field allows you to define some characteristics about the thread you want created: kNewSuspend creates a thread which is not inherently eligible for execution; kFPUNotNeeded denotes that (on 68K machines) the FPU context will not be saved on the stack during context switches (this option has no effect on the PowerMacintosh implementation of threads).
The threadResult field will be filled in when your thread actually terminates. Its return value is placed in the memory pointed to here. The NIL case is handled correctly (that is, nothing is put there if you pass in NIL; this avoids a write to memory location 0).
The threadMade is the ThreadID of the thread just created. You'll use this ID to refer to threads in the future. For example, if you wish to kill a thread, you pass its ID to DisposeThread.
And last, but not least, there are error codes to be interpreted. Obviously noErr is a good thing. memFullErr means that the thread wasn't created because there wasn't room for the stack or thread structure. paramErr is returned if you attempt to create a kPreemptiveThread on a PowerMacintosh or if you don't use one of the two defined values for the thread style.
OK. So now you've got a dozen threads running rampant in your application and you need to know how to switch from one to the other to the other to the other and back to the main thread of execution (your WNE loop) so that you can actually get some processing done. No problemo, señor y señorita programmer.
YieldToAnyThread() will get you around your threads quite simply. It takes no parameters and will simply invoke the Thread Manager scheduler to determine which thread should be next to execute. There will be occasions when you will know best which thread should execute next. On these occasions, you will want to call YieldToThread( threadID).
There are several other useful threads calls, and I'll cover a few here. For more detail, the entire Thread Manager specifications follow right after this article.
Programming Examples
Let's look at some code examples to see how these routines are used by a real program. The code we'll look at does three things: 1) checks for the complete presence of the Thread Manager; 2) creates the thread; 3) calls YieldToAnyThread().
Checks for the complete presence of the Thread Manager
Checking for the complete presence of the Thread Manager is pretty simple. First, you need to call Gestalt (thanks to the magic of glue, even Gestalt is compatible across all versions of system software back to 4.3) and check that the 'thds' selector is present. Second, you'll need to check to see that the gestaltThreadMgrPresent bit is set. No problem.
For PowerPC code, you need two additional tests. Third, you need to see that the native library is present (this was added to threads with ThreadManager 2.0 -- thanks to Brad Post). Fourth, you need to confirm that the Code Fragment Manager (CFM) actually resolved the Thread Manager code fragment with your application (a shared library).
The fourth test is needed for two reasons: A) the ThreadsLib library may not have loaded -- low memory conditions with VM off, for example; and B) you should be weak linking your application with the ThreadsLib library. This way, if the ThreadsLib library isn't present on PowerPC machines, your native code can handle the condition appropriately (a modeless dialog might mention that some features won't function because some system software features weren't found) as opposed to the Finder bringing up a modal dialog, "ThreadsLib couldn't be found." Like, what does that mean to most of your users?
Without further ado, here's the code:
/* 4 */
// Test for the presence of the threads manager
// 1. Is the gestalt selector defined? If not, we bail immediately
// 2. If we're compiling PPC native code, then check for the ThreadsLibrary
// 3. Also, if we're native, check that CFM actually linked my app to
//    the library
// 4. Is the Thread Mgr Present bit set to True? If not, bail
// Note: == binds tighter than &, so each bit test gets its own parentheses
if( Gestalt( gestaltThreadMgrAttr, &threadGestaltInfo) != noErr ||
#if defined(powerc) || defined (__powerc)
    (threadGestaltInfo & (1<<gestaltThreadsLibraryPresent)) == 0 ||
    (Ptr) NewThread == kUnresolvedSymbolAddress ||
#endif
    (threadGestaltInfo & (1<<gestaltThreadMgrPresent)) == 0)
{
    // This is the bail clause. Yours may be more elaborate
    printf( "Threads Mgr isn't present... Can't run the test.\n");
    return;
}
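One C detail matters in bit tests like the ones in this check: `==` binds more tightly than `&`, so a mask test needs its own parentheses. Written without them, `attrs & (1<<bit) == 0` is parsed as `attrs & ((1<<bit) == 0)`, which is always zero, and the bail clause could never fire on that condition. The sketch below shows both forms in portable C; the function names and bit numbers are stand-ins invented for this illustration, so use the constants from the Gestalt header in real code.

```c
#include <assert.h>

/* Stand-in bit numbers for this sketch only (assumptions, not the
   values from the Gestalt header). */
enum {
    kDemoMgrPresentBit     = 0,
    kDemoLibraryPresentBit = 2
};

/* Returns 1 if bit `bitNumber` is set in `attrs`, otherwise 0. */
int DemoBitIsSet(long attrs, int bitNumber)
{
    assert(bitNumber >= 0 && bitNumber < 31);
    return (attrs & (1L << bitNumber)) != 0;
}

/* What the unparenthesized form really computes: the comparison runs
   first and yields 0, so the result is attrs & 0, which is always 0. */
int DemoUnparenthesizedTest(long attrs, int bitNumber)
{
    return (int)(attrs & ((1L << bitNumber) == 0));
}
```

For an attribute word of 0x5, `DemoBitIsSet` reports bits 0 and 2 as set and bit 1 as clear, while `DemoUnparenthesizedTest` returns 0 no matter what is passed in.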
Create the Thread
There's really no mystery about creating new threads, but we'll walk through an example, just the same.
/* 5 */
pascal void *MyThreadProc( void *refCon);

myErr = NewThread( kCooperativeThread, MyThreadProc,
            (void *)0, 0, kFPUNotNeeded + kCreateIfNeeded,
            (void**)nil, &threadID);
The NewThread parameters are: thread type, entry procedure, refcon, size of stack, thread options, where to put a return value (when the thread entry procedure returns), and, last, a place to put the ID which identifies the thread just created.
So, what this call does is create a cooperative thread, executing the function MyThreadProc with 0 for a refcon and the default stack size. For options, the floating point unit is not needed and the Thread Manager is given the OK to allocate the thread memory if it is needed (remember the thread pool concept from earlier? Yeah, that's what this option affects). If the thread has a return value, I don't care about it (hence the nil passed where a pointer to a void * is expected). Last, when the thread is created I want the thread ID stored in the variable called threadID.
A note to C++ users: Some implementations of C++ (THINK C 5.0 with Objects, for example) allow the creation of Handle-based objects. If your object is Handle-based and you pass the address of an Object variable to this function (or any Mac Toolbox function which may move memory during its operation), your object may move! This is a bad thing because the ptr to your object variable will no longer point to your object variable after your object data block moves. Solutions to this conundrum are well understood: the best is to pass the address of a stack-based variable (i.e., local variable). The alternative (which is much uglier, but also viable) is to lock your object with HLock. I suggest the local variable.
After the thread is created, you can call YieldToAnyThread and the first line of the thread entry procedure will execute. A word regarding thread execution order: if you have more than one thread, there is no guarantee that the threads will execute in any particular order from yield call to yield call. The only exception is that the main thread (the thread that is created automagically which contains your main() function) will execute if an event is pending in the event queue. This feature means that if the user presses the mouse button, just as fast as your thread yields, the main thread will be executed and it will be free to call WaitNextEvent, which will simply return with a valid (mouseDown) event and then you can process the event accordingly.
YieldToAnyThread()
At the heart and soul of a thread, there is a thread procedure which will be executed: once. When that function completes, the thread is deemed dead and is no longer eligible for execution. So for example, logic which reads:
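/* 6 */
pascal void *MyThreadProc( void *refCon);

void main( void)
{
    ThreadID threadID;

    // Same parameters as in the previous example
    NewThread( kCooperativeThread, MyThreadProc,
        (void *)0, 0, kFPUNotNeeded + kCreateIfNeeded,
        (void**)nil, &threadID);
    while( true)
        YieldToAnyThread();
}

pascal void *MyThreadProc( void *refCon)
{
    printf( "Hello from thread\n");
    return nil;
}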
will print only one line, "Hello from thread". Therefore, once your thread gets control, it needs to have its own looping logic built in. Like,
/* 7 */
pascal void *MyThreadProc( void *refCon)
{
    int i;

    for( i = 0; i < 5; i++)
    {
        printf( "Hello from thread\n");
        YieldToAnyThread();
    }
    return nil;
}
This will print 5 "Hello from thread" lines before it terminates.
Please look at the source code which accompanies this article. Next month, look for a case study on an app I wrote called SortPicts (and maybe another about AppleEvents and Threads). Read the Thread Manager Documentation. Reread The Hitchhiker's Guide to the Galaxy. Then take some time off. Enjoy life more. Read back issues of MacTutor and MacTech.
Here's the code to TestThreads.c. It's followed by the output it generates.
TestThreads.c
/* 8 */
#include <threads.h>
#include <stdio.h>
#include <GestaltEqu.h>
#if defined(powerc) || defined (__powerc)
#include <FragLoad.h>
#endif
pascal void *MyThread_A( void *refCon);
pascal void *MyThread_B( void *refCon);
pascal void *MyThread_C( void *refCon);
main
void main( void)
{
    OSErr       errWhatErr;
    int         i;
    ThreadID    threadID_A, threadID_B, threadID_C;
    long        threadGestaltInfo;

    // Test for the presence of the threads manager
    // 1. Is the gestalt selector defined? If not, we bail immediately
    // 2. If we're compiling PPC native code, then check for the ThreadsLibrary
    // 3. Also, if we're native, check that CFM actually linked my app to
    //    the library
    // 4. Is the Thread Mgr Present bit set to True? If not, bail
    // Note: == binds tighter than &, so each bit test gets its own parentheses
    if( Gestalt( gestaltThreadMgrAttr, &threadGestaltInfo) != noErr ||
#if defined(powerc) || defined (__powerc)
        (threadGestaltInfo & (1<<gestaltThreadsLibraryPresent)) == 0 ||
        (Ptr) NewThread == kUnresolvedSymbolAddress ||
#endif
        (threadGestaltInfo & (1<<gestaltThreadMgrPresent)) == 0)
    {
        // This is the bail clause. Yours may be more elaborate
        printf( "Threads Mgr isn't present... Can't run the test.\n");
        return;
    }

    // Create 3 threads: A, B, C
    // (errWhatErr goes unchecked in this test; a real program should test it)
    errWhatErr = NewThread( kCooperativeThread, MyThread_A,
                    (void *)0, 0, kFPUNotNeeded + kCreateIfNeeded,
                    (void**)nil, &threadID_A);
    errWhatErr = NewThread( kCooperativeThread, MyThread_B,
                    (void *)0, 0, kFPUNotNeeded + kCreateIfNeeded,
                    (void**)nil, &threadID_B);
    errWhatErr = NewThread( kCooperativeThread, MyThread_C,
                    (void *)0, 0, kFPUNotNeeded + kCreateIfNeeded,
                    (void**)nil, &threadID_C);

    // Simple loop to test Yielding
    for( i = 0; i < 5; i++) {
        printf( "This is thread main\n");
        YieldToAnyThread();
    }
}
MyThread_A
pascal void *MyThread_A( void * /* refCon */)
{
    int i;

    for( i = 0; i < 5; i++) {
        printf( "------- A ------\n");
        YieldToAnyThread();
    }
    return nil;
}
MyThread_B
pascal void *MyThread_B( void * /* refCon */)
{
    int i;

    for( i = 0; i < 5; i++) {
        printf( "------- B ------\n");
        YieldToAnyThread();
    }
    return nil;
}
MyThread_C
pascal void *MyThread_C( void * /* refCon */)
{
    int i;

    for( i = 0; i < 5; i++) {
        printf( "------- C ------\n");
        YieldToAnyThread();
    }
    return nil;
}
TestThreads produces the following output when run (note that, although the order of execution appears deterministic, you shouldn't rely on that behavior):
This is thread main
------- A ------
------- B ------
------- C ------
This is thread main
------- A ------
------- B ------
------- C ------
This is thread main
------- A ------
------- B ------
------- C ------
This is thread main
------- A ------
------- B ------
------- C ------
This is thread main
------- A ------
------- B ------
------- C ------