Segment DAs
Volume Number: 5
Issue Number: 5
Column Tag: C Workshop
Related Info: Segment Loader
Segmenting DA's
By Tom Saxton, Bellevue, WA
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
Segmented DAs: Breaking The 32K Limit
In a variety of contexts on the Macintosh we find ourselves running up against the 32K limit. One area that has been particularly limiting and persistent is the 32K limit on resources containing code. The reason for this has to do with the way the MC68000 addresses memory; specifically, one can address code within ±32K (and the minus is generally ignored) of some base address very efficiently, so this mode is used for jumping to subroutines and for building jump tables. In particular, Lightspeed C limits the code-containing resources it builds to 32K. This isn't a big problem with applications, since one can break up the code in an application into multiple segments. This segmentation is actually nice in itself, since it allows programs that are too big to fit entirely in memory to run by swapping code resources in and out as needed.
But Desk Accessories and stand-alone code resources (INITs, FKEYs, etc.) are a different story. The Mac OS has nothing built in to handle segmentation for anything but applications, so we're stuck with a 32K size limit. Fortunately, this can all be fixed up with a little inline assembler. I have been working with these sorts of things on several different projects, and have developed a fairly painless way of segmenting either DAs or code resources. This is all done with stand-alone code resources, which I will call PROCs. "Fairly painless" means:
1) One shouldn't have to write and debug a bunch of assembly code every time a PROC or routine is added to the project.
2) A PROC should be allowed to have multiple entry points. Further, accessible routines should be able to have different numbers of arguments. (A routine with a variable number of arguments is a bit trickier, but not an unreasonable variation on the code given below.)
3) Lightspeed's ability to check argument types via function prototypes should be left intact.
4) Calling a procedure in an external PROC should be completely invisible to the calling C routine.
5) It should be possible to have multiple entries into a code resource. That is, it should be possible to jump in and out of a given PROC, even to the point of allowing recursion between two PROCs. In short, there should be no more limitations in calling routines than exist in a normal segmented application.
6) There should be a mechanism for loading and unloading PROCs the same way application code resources are handled, or at least a reasonable approximation thereto.
The biggest drawback of doing your own segmentation is that you have to handle part of the link yourself. That is, when you make a change to one of the PROCs in a project, you have to make it accessible to the parent code either by copying the PROC into the parent's resource file, or by having the parent explicitly open the resource file(s) containing the PROC(s). At the moment, I am working on a ShareWare DA that will make this process considerably less painful.
The LSC documentation supplement (2.01) shows how to write glue to handle a single-entry PROC. Their code breaks down if you want to call different routines within the PROC, or if there is a possibility of the PROC being called from two different places in the same call chain; more on this below. Working with all of this, I found out several interesting side effects of some ToolBox calls. For instance, PrJobDialog() and PrStlDialog() both go to the trouble of redrawing any dialog windows in the window list that need updating. This can really hose things if you have user item-drawing routines in a PROC that isn't set up to handle being called again before it returns from the current call.
The Big Issues
There are three basic problems to solve. The first is how to handle multiple entry points. This is the easiest. The Mac PACK resources do this by tacking on a routine selector to the arguments for a routine within the PACK. Upon entry into the PACK, a bit of glue pulls the selector off of the stack and jumps to the right place. A variation on this theme will serve our needs.
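The selector idea can be modeled in portable C as a table of function pointers behind one entry point. This is only a sketch of the dispatch, not the 68K glue: the routine names and the shared two-argument signature are made up for illustration, whereas the real glue lets each routine take a different number of arguments.

```c
#include <assert.h>

/* Hypothetical selectors; they start at zero and match the table order. */
enum { irtnAdd = 0, irtnSub, irtnNegate, irtnCount };

typedef long (*Routine)(long, long);

static long Add(long a, long b)    { return a + b; }
static long Sub(long a, long b)    { return a - b; }
static long Negate(long a, long b) { (void)b; return -a; }

/* Table order mirrors the enum above, like a jump table in a PROC. */
static const Routine jumpTable[irtnCount] = { Add, Sub, Negate };

/* Single entry point: pick the routine from the selector and call it. */
long ProcEntry(int irtn, long a, long b)
{
    assert(irtn >= 0 && irtn < irtnCount);  /* a bad selector jumps into the weeds */
    return jumpTable[irtn](a, b);
}
```

Calling ProcEntry(irtnSub, 2, 3) lands in Sub(). Reordering the table without reordering the selectors silently calls the wrong routine, which is the same hazard the jump table in a PROC has.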
Next, LSC (and other environments such as Aztec C) allow non-application code resources to have globals by addressing off of register A4 instead of A5 (which is reserved for applications and QuickDraw). This means that upon entry into a PROC (or a DA) the value of A4 has to be set to the correct value to address the PROC's globals. This is relatively easy; the tricky bit is to restore the old value when you exit so that whoever called you can still find their globals. The obvious solution (which LSC describes) is to store the original value of A4 in a fixed location (actually, they use a pc-relative location, which can run afoul of the instruction cache in a 68020, but that's another story). This fails if the PROC gets re-entered from somewhere having a different value of A4, since we can only store one.
The third problem has to do with handling the first two. The simple schemes for handling the first two problems involve three pieces of glue. The first handles putting the routine selector onto the stack and jumping to the (locked!) PROC. The next chunk is at the head of the PROC; it sets up A4, then uses the routine selector to jump to the desired routine. Finally, when the called routine is done, it can't just jump back to the caller; we must go through a last bit of glue to restore the caller's A4. Since we would like to avoid tacking on this glue at every exit point of every externally accessible routine, we want to doctor the return address to trick the called routine into returning to a third bit of glue which handles the clean-up. This means we have to pull the real return address off of the stack and store it somewhere. Storing it in a fixed place leads to the same problems we had with A4 above.
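To see why one fixed save location is not enough, here is a small C model of the re-entry case. A plain variable stands in for register A4, and the two "globals base" constants are made-up values; nothing here is the LSC glue itself, only the bookkeeping around it.

```c
#include <assert.h>

#define CALLER_A4 0x1111L   /* hypothetical globals base of the caller */
#define PROC_A4   0x2222L   /* hypothetical globals base of the PROC */

static long regA4;          /* stands in for register A4 */
static long fixedSlot;      /* the single fixed save location */

/* Entry and exit glue that keep the caller's A4 in one fixed slot. */
static void EnterFixed(void) { fixedSlot = regA4; regA4 = PROC_A4; }
static void ExitFixed(void)  { regA4 = fixedSlot; }

/* Returns the A4 the original caller sees after a nested call. */
long A4AfterNestedFixed(void)
{
    regA4 = CALLER_A4;
    EnterFixed();   /* outer call: slot holds CALLER_A4 */
    EnterFixed();   /* re-entry from inside the PROC: slot now holds PROC_A4 */
    ExitFixed();    /* inner return restores PROC_A4, which is right */
    ExitFixed();    /* outer return restores PROC_A4 again, which is wrong */
    return regA4;
}

/* Same call chain, but each entry saves A4 in its own stack slot. */
long A4AfterNestedStacked(void)
{
    long saved[2];
    int  sp = 0;

    regA4 = CALLER_A4;
    saved[sp++] = regA4; regA4 = PROC_A4;   /* outer entry */
    saved[sp++] = regA4; regA4 = PROC_A4;   /* re-entry */
    regA4 = saved[--sp];                    /* inner return */
    regA4 = saved[--sp];                    /* outer return */
    assert(sp == 0);
    return regA4;
}
```

With the fixed slot the original caller ends up with the PROC's A4 and loses its globals; with one save per call it gets its own value back.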
The Solution
To avoid storing the two values which need to be saved and recovered in fixed locations, we will store them on the stack. We can't store them below the arguments to the called routine, since then the routine will think it is getting two extra (long word) arguments, and this is supposed to be invisible to the C code. So instead, the first bit of glue shuffles the routine's arguments down on the stack to make room for the 8 bytes of storage we need. We then replace the old return address with the address of the third bit of glue (which does the clean-up). We then jump to the second piece of glue, which lives at the entry point to the PROC. It sets up A4, then pulls the selector off of the stack and jumps to the target routine. When the target routine exits, the clean-up code retrieves the saved values, restores A4, then jumps back to the original caller.
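The shuffle is easier to follow on a picture of the stack, so here is a C model that treats the 68K stack as an array of 16-bit words growing toward index zero. It is a sketch of the bookkeeping only: the Stack type, the helper names and the addresses used with it are all hypothetical, and the routine selector and PROC address that the real glue also pushes are left out.

```c
#include <stdint.h>

enum { kStackWords = 64 };

typedef struct {
    uint16_t w[kStackWords];
    int      sp;            /* index of the top word; stack grows toward 0 */
} Stack;

static void PutLong(Stack *s, int at, uint32_t v)
{
    s->w[at]     = (uint16_t)(v >> 16);
    s->w[at + 1] = (uint16_t)(v & 0xFFFFu);
}

static uint32_t GetLong(const Stack *s, int at)
{
    return ((uint32_t)s->w[at] << 16) | s->w[at + 1];
}

/* In:  sp -> caller's return address (2 words), then cwArgs words of arguments.
   Out: sp -> clean-up address, then the arguments, saved A4, real return address. */
void ShuffleForCall(Stack *s, int cwArgs, uint32_t a4, uint32_t cleanupAddr)
{
    int      src = s->sp;       /* return address plus arguments */
    int      dst = s->sp - 4;   /* 8 bytes (4 words) further down */
    int      i;
    uint32_t rtn;

    s->sp = dst;
    for (i = 0; i < cwArgs + 2; i++)    /* the return address is 2 words */
        s->w[dst++] = s->w[src++];
    PutLong(s, dst, a4);                /* save A4 just above the arguments */
    dst += 2;
    rtn = GetLong(s, s->sp);            /* pop the copied return address */
    s->sp += 2;
    PutLong(s, dst, rtn);               /* store it just behind the saved A4 */
    s->sp -= 2;
    PutLong(s, s->sp, cleanupAddr);     /* the target will return to the clean-up */
}

/* Call once the target has returned, so sp points at the arguments.
   Recovers A4, leaves sp where a plain RTS would, returns the real address. */
uint32_t CleanUp(Stack *s, int cwArgs, uint32_t *pa4)
{
    int store = s->sp + cwArgs;

    *pa4 = GetLong(s, store);
    s->sp += 4;                         /* drop the 8 bytes of storage */
    return GetLong(s, store + 2);
}
```

The caller still pops its own arguments afterwards, exactly as it would after an ordinary JSR and RTS; the 8 bytes of storage overwrite the tail of the original argument block, which nobody looks at again.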
The details of implementing these ideas mainly have to do with doctoring the stack so that both the original caller and the target routine act exactly as if the usual JSR and RTS instructions were used to call and return from the target routine. (Remember this is all supposed to be invisible to the C code.) Care has to be taken to not attempt to address globals while A4 is holding the wrong value. We also cannot molest D0 on the way back since that is where any return value is stored.
There is a pair of routines which load and unload a PROC in the spirit of LoadSegment and UnloadSegment. To count as "loaded", the PROC resource must be in memory, locked and marked unpurgeable. "Unloaded" is unlocked and purgeable (don't actually ReleaseResource, in case we want it again). If these PROCs are to be owned by a desk accessory, then we have to calculate the resource's actual ID from the driver's ID number and the sub-ID of the PROC. This is all handled by FLoadProc() and UnloadProc().
The following example is a stupid little DA which calls three routines in an external PROC with differing numbers of arguments and return values. Before doing so, it calls FLoadProc(), which loads the PROC and stores a pointer to the PROC in a location passed to the routine. FLoadProc() calls MoveHHi() on the resource handle to reduce the possibility of causing fragmentation problems. When the DA is closed (which is the only thing that can be done to or with it), it calls UnloadProc() to unlock and free up the space used by the PROC resource. While debugging, it would be wise to set the pointer to the PROC to -1 when it is unloaded, so that if someone tries to use the unloaded PROC, an address error is immediately generated. This is really only an issue if PROCs are to be loaded and unloaded while the program is running to conserve memory (not an unlikely occurrence, but remember that under LSC, globals are effectively re-initialized when a PROC is released from memory and read back in from disk).
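The ID calculation is small enough to check by hand. Owned resource IDs have the top two bits set, the owner's ID in bits 5 through 10 and the sub-ID in bits 0 through 4, and a driver's reference number is -(ID + 1), so ~refNum gives the driver ID back. The sketch below uses fixed-width types so it behaves the same on a modern compiler; the function name OwnedResID is mine, not a Toolbox call.

```c
#include <stdint.h>

/* Resource ID of sub-ID `subid` owned by the driver whose reference
   number is `refNum` (owner type DRVR, so bits 11-13 stay clear). */
int16_t OwnedResID(int16_t refNum, int16_t subid)
{
    uint16_t drvrID = (uint16_t)~refNum;    /* refNum == -(ID + 1) */
    uint16_t id;

    id = (uint16_t)(0xC000u
                    | ((drvrID & 0x3Fu) << 5)       /* owner ID, 0..63 */
                    | ((uint16_t)subid & 0x1Fu));   /* sub-ID, 0..31 */

    /* id is always >= 0xC000, so this is the two's complement value. */
    return (int16_t)((int32_t)id - 0x10000L);
}
```

For DRVR ID 12 the reference number is -13, and OwnedResID(-13, 0) comes out as -16000.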
The first source file is for the DA. It contains a bare bones DA with routines needed to load and call three routines in a PROC. There are also definitions for the procedure selector numbers. These must begin at zero and occur in the same order as the jump table in the PROC. The glue to call a routine in a PROC consists of a call to an assembly language macro that does all of the work. Notice that the macro wants the number of WORDs passed as arguments to the routine, not bytes. Note also that this does not count the return address which also gets passed to the routine; the macro takes that into account.
The second source file is for the PROC resource. It contains the entry glue for the PROC. Every routine that is to be externally callable has to have an entry in the JumpTable; see the note about routine selectors above.
To build the PROC, create a new LSC project, add UselessProc.c, set its project type to Code Resource, its resource type to PROC and its ID to -16000 (sub ID zero owned by DRVR ID 12). Once created, the resource file is named UselessDAProj.rsrc so that it will be copied automatically into the DA file. In general, the PROC resource would have to be copied into the DA's resource file. Note that you don't have to include MacTraps with this project (it does not call any routines defined there).
After building the PROC, and giving its output file the specified name, create a new project called UselessDAProj, add the first source file and MacTraps to the project, and build the DA. Now Run the project (or open the DA file with Suitcase), and open the DA. The DA will then put up a window (the first PROC call), write something funny in the window (the second PROC call). It will then wait until you close it, at which time it will get rid of its window (the third call to the PROC) and go away. Note that it does not handle even so much as activate or update events. Like I said, useless.
For the Non-Hackers
If assembly language is not your second language already, you're likely thoroughly lost. This should not prevent you from using the trickery here. Below is all you need to know to use the segmenting code.
In the project that calls an external PROC, create a source module which contains the macro definition and a routine for each external procedure you wish to call. Pattern them after the routines OpenWindow(), DrawWindow() and KillWindow() in the DA source file. Each just calls the horrid assembler macro. Be sure to count the words of arguments correctly, or things will die horribly. If you're not sure how to do this, consult your local guru. Next, set up a main() in the PROC project patterned after the main() in UselessProc.c. Make one entry in the JumpTable for each routine you want to access. Then build a set of definitions for the routine selectors (in the same order as they appear in the jump table).
Potential Pitfalls
The easiest thing to mess up is counting the words in the argument list. Remember that Pointers, Handles, Points, pointers to structs and arrays, and longs all count as 2 words. Ints, shorts, Booleans and chars all count as 1 word (a char is only half a word, but gets passed as a word on the stack; this can be tricky and is probably compiler-dependent). Counting words in a struct or union can be tricky since bytes can get packed together; use sizeof() when in doubt. Don't use sizeof(char), for the reason above. Next is mismatching the order of the routine selectors; check that one twice, too.
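As a cross-check on the counting rules, here is a small C helper that adds up the words for an argument list from each argument's size in bytes. The sizes are written out as constants for LSC on the 68000 (an assumption on my part, and not what sizeof() reports on a modern machine), and the names are made up for this sketch.

```c
#include <assert.h>

/* Argument sizes in bytes as LSC on the 68000 sees them (assumed here). */
enum { cbChar = 1, cbShort = 2, cbLong = 4, cbPtr = 4 };

/* Words pushed for one argument: an odd size such as a char rounds up. */
#define CW_FOR(cb) (((cb) + 1) / 2)

/* Total words for an argument list, given each argument's size in bytes. */
int CwArgs(const int *rgcb, int carg)
{
    int iarg, cw = 0;

    assert(carg >= 0);
    for (iarg = 0; iarg < carg; iarg++)
        cw += CW_FOR(rgcb[iarg]);
    return cw;
}
```

A pointer followed by four shorts gives 6, two pointers give 4 and a single pointer gives 2, which are the counts the three glue routines in the DA listing pass to the macro.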
Finally, the assembler macro assumes that the glue routine begins with a LINK instruction. This is true for any routine that has arguments or local variables. If you want to call an argument-less routine in a PROC, give its glue routine a dummy local variable to force a LINK instruction (LSC talks about LINKs on page 9-2 of the original manual). Alternatively, one could create another version of the macro with the UNLK statement removed for functions with no arguments, but this would leave two chunks of assembler code to maintain.
Debugging Advice
Although I have been using the code for some time and have written it very carefully, if you hit an inexplicable bug while using it, you're going to have to be convinced that the segmentation glue is not at fault. Besides, there is always a chance that I have made some subtle oversight. So, here is some advice for verifying that it's working correctly, or finding out why it is not.
If you haven't already, read LSC's explanation of MacsBug and the C calling conventions, print out the source for all of the assembler hacking and dive in. If necessary, turn on MacsBug Symbols under the Options and recompile the relevant modules so that you can see routine names. The best place to start watching things is just before the call to the glue routine. Watch the arguments as they are pushed onto the stack, step through the glue (not as messy as it sounds), and into the PROC. Make sure the right entry in the jump table is used. As soon as you hit the LINK statement that begins the target routine, do an "mr" (magic return) in MacsBug. You should then break out in the clean-up glue. Step through it and you're back to the caller. The Heap Dump command "hd" is useful for verifying the integrity of the application heap and that the PROC (and its caller!) are correctly loaded into memory, locked in place and unpurgeable; use it liberally. It's a good idea to write down the state of the stack, the values for A4 and the return value as the glue is entered, for comparison with things before and after the routine in the PROC is called.
The Upshot
This stuff is quite simple and reliable to use once properly set up. By creating multiple PROCs with different sub-IDs, there is virtually no limit to how much code a DA can address (pardon the pun). Routines can even be called directly between PROCs, making it possible to put libraries in PROCs for everyone to use. The only trick is calculating the PROC resource ID, since only the sub ID is known at compile time. The easiest way around that problem is to call an initialization routine for each PROC needing to make inter-PROC procedure calls which stores the base ID in a global.
Listing: Useless.c
#include <MacTypes.h>
#include <MemoryMgr.h>
#include <Quickdraw.h>
#include <WindowMgr.h>
#include <EventMgr.h>
#include <OSUtil.h>
#include <ResourceMgr.h>
#include <ToolboxUtil.h>
#include <DeviceMgr.h>
#define NULL (0L)
#define drvrOpen 0
#define drvrPrime 1
#define drvrControl 2
#define drvrStatus 3
#define drvrClose 4
#define subidProc 0
enum {
irtnOpenWindow = 0,
irtnDrawWindow,
irtnKillWindow
};
extern short main(CntrlParam*,DCtlPtr,short);
extern short DAOpen(CntrlParam*,DCtlPtr);
extern short DAClose(CntrlParam*,DCtlPtr);
extern WindowPtr OpenWindow(char*,short,short,short,short);
extern void DrawWindow(WindowPtr,char*);
extern void KillWindow(WindowPtr);
extern Boolean FLoadProc(short,Ptr*);
extern void UnloadProc(Ptr);
Ptr pProc;
short rsidBase;
#define JumpToProc(pPROC,irtn,cwArgs) \
asm { \
    unlk    a6                    /* get rid of the LSC frame */ \
    movea.l sp,a0                 /* start of rtn addr + arguments */ \
    lea     -8(sp),sp             /* make room on the stack */ \
    movea.l sp,a1                 /* bottom of stack */ \
    move.w  #(cwArgs)+1,d0        /* (# of words to copy)-1 */ \
    /* copy rtn addr and arguments */ \
LCopyAnother: \
    move.w  (a0)+,(a1)+           /* move a word down */ \
    dbra    d0,@LCopyAnother      /* done when --d0 == -1 */ \
    move.l  a4,(a1)+              /* copy our a4 into storage */ \
    move.l  (sp)+,(a1)            /* copy our rtn addr just behind it */ \
    \
    pea     @LReturnToCaller      /* push return for proc */ \
    move.w  #irtn,-(sp)           /* push routine selector */ \
    move.l  pPROC,-(sp)           /* push address of PROC */ \
    beq.s   @LHandleError         /* jump if passed a NULL pPROC */ \
LJumpToProc: \
    rts                           /* jump to proc (sleazy trick) */ \
    \
LHandleError: \
    dc.w    0xA9FF                /* let Macsbug handle the error */ \
    lea     10(sp),sp             /* fix up the stack */ \
    \
LReturnToCaller: \
    lea     (2*(cwArgs))(sp),a0   /* get addr of storage space */ \
    movea.l (a0)+,a4              /* restore our a4 */ \
    movea.l (a0),a0               /* get our return address */ \
    lea     8(sp),sp              /* fix up the stack */ \
    jmp     (a0)                  /* return to sender (address known) */ \
}
short main(piopb, pdce, drvrRoutine)
CntrlParam *piopb;
DCtlPtr pdce;
short drvrRoutine;
{
    GrafPtr pgpSav;
    short valRtn;

    if (pdce->dCtlStorage == NULL) {
        if (drvrRoutine == 0) { /* open, but no data */
            SysBeep(3);
            CloseDriver(pdce->dCtlRefNum);
        }
        return(0);
    }
    GetPort ( &pgpSav );
    switch (drvrRoutine) {
    case drvrOpen:
        valRtn = DAOpen(piopb,pdce);
        break;
    case drvrControl:
    case drvrPrime:
    case drvrStatus:
        valRtn = 0;
        break;
    case drvrClose:
        valRtn = DAClose(piopb,pdce);
        break;
    }
    SetPort ( pgpSav );
    return valRtn;
} /* main */
short DAOpen(piopb,pdce)
CntrlParam *piopb; /* pointer to parameter block */
DCtlPtr pdce;      /* the device control entry */
{
    rsidBase = 0xC000 | (~pdce->dCtlRefNum<<5);
    if ( !FLoadProc ( 0, &pProc ) ) {
        SysBeep(2);
        CloseDriver(pdce->dCtlRefNum);
        return 0;
    }
    /* Using hard-wired window coordinates? For shame... */
    pdce->dCtlWindow=OpenWindow("\POur Window",100,50,300,150);
    ((WindowPeek)pdce->dCtlWindow)->windowKind=pdce->dCtlRefNum;
    DrawWindow(pdce->dCtlWindow,"\PSomething funny.");
    return 0;
} /* DAOpen */
short DAClose(piopb,pdce)
CntrlParam *piopb;
DCtlPtr pdce;
{
    if ( pdce->dCtlWindow != NULL )
        KillWindow ( pdce->dCtlWindow );
    if ( pProc != NULL )
        UnloadProc ( pProc );
    return 0; /* main() returns this value to the Device Manager */
}
Boolean FLoadProc ( subid, ppProc )
short subid;
Ptr *ppProc;
{
    Handle hProc;

    hProc = GetResource ( 'PROC', rsidBase + subid );
    if ( hProc == NULL ) {
        *ppProc = NULL;
        return FALSE;
    }
    /* avoid fragmentation and other evils */
    MoveHHi ( hProc );
    HLock ( hProc );
    HNoPurge ( hProc );
    *ppProc = *hProc;
    return TRUE;
}
void UnloadProc ( pProc )
Ptr pProc;
{
    Handle hProc;

    if ( pProc == NULL )
        return;
    hProc = RecoverHandle ( pProc );
    if ( hProc == NULL )
        return;
    HUnlock ( hProc );
    HPurge ( hProc );
    return;
}
WindowPtr OpenWindow ( stTitle, xLeft, yTop, xRight, yBottom )
char stTitle[];
short xLeft, yTop, xRight, yBottom;
{
    JumpToProc(pProc,irtnOpenWindow,6);
}
void DrawWindow(pwind,st)
WindowPtr pwind;
char st[];
{
    JumpToProc(pProc,irtnDrawWindow,4);
}
void KillWindow(pwind)
WindowPtr pwind;
{
    JumpToProc(pProc,irtnKillWindow,2);
}
Listing: UselessProc.c
#include <MacTypes.h>
#include <MemoryMgr.h>
#include <Quickdraw.h>
#include <WindowMgr.h>
#include <EventMgr.h>
#include <OSUtil.h>
#include <ResourceMgr.h>
#include <ToolboxUtil.h>
#include <DeviceMgr.h>
#define NULL (0L)
extern void main ( void );
extern WindowPtr OpenWindow ( char*, short, short, short, short );
extern void DrawWindow ( WindowPtr, char * );
extern void KillWindow ( WindowPtr );
void main()
{
    asm {
        movea.l a0,a4           ; get pointer to our globals
        move.w  (a7)+,d0        ; pop off routine selector
        add.w   d0,d0           ; double the selector
        add.w   d0,d0           ; double it again
        lea     @JumpTable,a0   ; get address of jump table
        jmp     0(a0,d0.w)      ; jump to our entry
    JumpTable:
        jmp     OpenWindow
        jmp     DrawWindow
        jmp     KillWindow
    }
}
WindowPtr OpenWindow ( stTitle, left, top, right, bottom )
char stTitle[];
short left, top, right, bottom;
{
    Rect rect;

    SetRect ( &rect, left, top, right, bottom );
    return NewWindow(NULL,&rect,stTitle,TRUE,documentProc,-1L,TRUE,0L);
}
void DrawWindow ( pwind, stMessage )
WindowPtr pwind;
char stMessage[];
{
    short h, v;

    SetPort ( pwind );
    v = (pwind->portRect.top + pwind->portRect.bottom)/2;
    h = (pwind->portRect.left + pwind->portRect.right)/2;
    h -= StringWidth(stMessage)/2;
    MoveTo ( h, v );
    DrawString ( stMessage );
}
void KillWindow ( pwind )
WindowPtr pwind;
{
    DisposeWindow ( pwind );
}