Oct 95 Tips
Volume Number: 11
Issue Number: 10
Column Tag: Tips & Tidbits
Tips & Tidbits
By Steve Sisak, Contributing Editor
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
TIP OF THE MONTH
Spotting the Elusive RAM Disk
Here is a handy function that checks dynamically for a RAM disk created by Apple's Memory control panel. I use it in some of my programs when speed is paramount. Because the user can turn off the RAM disk from the Memory control panel, or rename it, at any time, this code must be called each time before using the RAM disk. In other words, don't check just once during your application startup and assume thereafter that it will still be there; check before each access if possible. Program defensively...
- Greg Poole
HasRamDisk.h
#pragma once
#ifdef __cplusplus
extern "C" {
#endif
extern Boolean HasRamDisk( FSSpecPtr ramDiskSpec );
#ifdef __cplusplus
}
#endif
HasRamDisk.c
/******************************************************************************
HasRamDisk.c
A dynamic check for the existence of a RAM disk as created by Apple's
Memory control panel. Since the user can remove the RAM disk at any time
by turning it off from the Memory control panel, and can rename it at any
time, the code here needs to be called before assuming the existence of
a RAM disk. In other words, don't check just once during your application
startup and assume thereafter that it will still be there; check
before each access. Program defensively...
history:
modified: xx/xx/xx who are you? what did you do?
created: 08/10/94 greg poole
Greg Poole
Vital Images, Inc.
505 N. 4th Street
Fairfield, IA 52556
(515) 472-7726
email: greg@vitalimages.com
******************************************************************************/
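#include <string.h>
#include <Devices.h> // DCtlHandle, GetDCtlEntry
#include <Files.h> // FSSpec, HVolumeParam, PBHGetVInfoSync, FSMakeFSSpec
#include "HasRamDisk.h"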
// this structure is based on the DRVR definition in MPWTypes.r
//
struct DRVRresourceRec
{
// description of drvrFlags
//
// struct
// {
// unsigned hiUnused : 1; // unused
// unsigned needLock : 1; // lock drvr in memory
// unsigned needTime : 1; // for periodic action
// unsigned needGoodbye : 1; // call before heap reinit
// unsigned statusEnable : 1; // responds to status
// unsigned ctlEnable : 1; // responds to control
// unsigned writeEnable : 1; // responds to write
// unsigned readEnable : 1; // responds to read
// unsigned loUnused : 8; // low byte of drvrFlags word unused
// } drvrFlags;
short drvrFlags; // flags as defined above
unsigned short driverDelay; // driver delay (ticks)
short deskAccEventMask; // desk acc event mask
short driverMenuID; // driver menu ID
unsigned short offsetOpen; // offset to DRVRRuntime open
unsigned short offsetPrime; // offset to DRVRRuntime prime
unsigned short offsetControl; // offset to DRVRRuntime control
unsigned short offsetStatus; // offset to DRVRRuntime status
unsigned short offsetClose; // offset to DRVRRuntime close
Str31 driverName; // driver name
char driverCode[1]; // driver code
};
typedef struct DRVRresourceRec DRVRresourceRec;
typedef DRVRresourceRec *DRVRresourcePtr, **DRVRresourceHndl;
// constants
//
const char kDrvrHandleBit = 0x40;
// bit 6 (mask 0x40) of the DCE's dCtlFlags signals that the driver is a
// handle instead of a pointer and needs to be locked in memory
const char kRamDiskName[] = "\p.EDisk"; // Apple's RAM disk driver name
// pass in an FSSpecPtr to hold a reference to a RAM disk,
// returns TRUE if there is currently a RAM disk, FALSE if not
//
Boolean HasRamDisk( FSSpecPtr ramDiskSpec )
{
Boolean hasRamDisk = FALSE, isHandle = FALSE;
short whichVol = 1; // start with first disk volume
HVolumeParam volPB;
OSErr theErr = noErr, anErr = noErr;
DCtlHandle dctlHndl = NULL;
DRVRresourcePtr drvrPtr = NULL;
DRVRresourceHndl drvrHndl = NULL;
Ptr aPtr = NULL;
Str31 volName;
do // test each mounted disk volume
{
volPB.ioNamePtr = volName;
volPB.ioVRefNum = 0; // 0 means use ioVolIndex
volPB.ioVolIndex = whichVol; // use this to determine volume
if ( (theErr=PBHGetVInfoSync( (HParmBlkPtr)&volPB )) == noErr)
{
// get this volume's device control entry from the unit table.
// do not lock the dctlHndl, I spent a couple of days figuring
// out that locking this handle causes a crash in the CompServer
// because it is locked at interrupt time...
//
if ( (dctlHndl = GetDCtlEntry(volPB.ioVDRefNum)) != NULL )
{
// is the device's driver in a handle or a pointer?
//
if ((isHandle=(*dctlHndl)->dCtlFlags&kDrvrHandleBit)!= 0)
{
drvrHndl = (DRVRresourceHndl) (*dctlHndl)->dCtlDriver;
drvrPtr = *drvrHndl;
}
else
drvrPtr = (DRVRresourcePtr) (*dctlHndl)->dCtlDriver;
// get this device's driver, check if it is a RAM disk
//
if ( !memcmp( drvrPtr->driverName, kRamDiskName,
*kRamDiskName+1 ) )
{
// this driver is the RAM disk driver, create an FSSpec to its root dir
//
anErr = FSMakeFSSpec( volPB.ioVRefNum, fsRtDirID,
volName, ramDiskSpec );
if ( anErr == noErr )
hasRamDisk = TRUE;
break;
}
}
}
whichVol++; // go to next volume
}
while ( theErr != nsvErr );
return hasRamDisk;
} // end HasRamDisk
// define TEST_RAM_DISK for a standalone test
//
#define TEST_RAM_DISK
#if defined( TEST_RAM_DISK )
// local function prototypes
//
static void InitTheMac( void );
static void InitTheMac( void )
{
InitGraf( &qd.thePort );
InitFonts();
InitWindows();
InitMenus();
TEInit();
InitDialogs( 0L );
InitCursor();
MaxApplZone();
} // end InitTheMac
void main( void )
{
FSSpec ramDiskSpec;
Boolean hasRamDisk;
InitTheMac();
hasRamDisk = HasRamDisk( &ramDiskSpec );
} // end main
#endif // TEST_RAM_DISK
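The driver-name test in HasRamDisk compares `*kRamDiskName+1` bytes because a Pascal string keeps its length in byte 0, so one memcmp covers both the length and the characters. The following is a minimal portable sketch of that idea, not part of Greg's code: the helper name `PStrEqual` is hypothetical, and `"\006.EDisk"` is a hand-written stand-in for the `"\p.EDisk"` literal, which relies on a Macintosh compiler extension.

```c
#include <assert.h>
#include <string.h>

/* A Pascal string stores its length in byte 0, followed by the characters.
   "\006.EDisk" spells out by hand what "\p.EDisk" produces on compilers
   that support the \p extension: a length byte of 6, then ".EDisk". */
static const unsigned char kRamDiskName[] = "\006.EDisk";

/* Hypothetical helper: returns 1 when the candidate matches the reference
   Pascal string exactly. Both buffers are assumed to hold at least
   reference[0] + 1 bytes, as a fixed-size Str31 field always does here. */
static int PStrEqual(const unsigned char *reference,
                     const unsigned char *candidate)
{
    /* reference[0] + 1 bytes = the length byte plus the characters, so a
       name of a different length already differs in the first byte. */
    return memcmp(reference, candidate, (size_t)reference[0] + 1) == 0;
}
```

Passing the known constant as the reference keeps the comparison length fixed at seven bytes, which mirrors the way the listing above sizes its memcmp from kRamDiskName rather than from the driver's own name.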
Anti-Tip of the Month
Since Greg Poole won the Tip-of-the-Month, I thought we could have a little fun and give him the Anti-Tip-of-the-Month as well, for a different submission. (Don't worry, Greg, you're getting paid for this too.)
Greg writes:
Here's a quick and clean way to swap data in place without having to resort to using a temporary memory location:
short *aPtr, *bPtr;
*aPtr ^= *bPtr;
*bPtr ^= *aPtr;
*aPtr ^= *bPtr;
While this is mathematically cool, let's take a look at the assembly code that it generates and see what's really happening. First, for comparison, a couple of more pedestrian implementations:
void swap2(short *aPtr, short *bPtr)
{
short a = *aPtr; // a version with two temporaries
short b = *bPtr;
*aPtr = b;
*bPtr = a;
}
void swap1(short *aPtr, short *bPtr)
{
short a = *aPtr; // a version with one temporary
*aPtr = *bPtr;
*bPtr = a;
}
void swap0(short *aPtr, short *bPtr)
{
*aPtr ^= *bPtr; // Greg's tip
*bPtr ^= *aPtr;
*aPtr ^= *bPtr;
}
Now, let's take a look at what the compiler actually generates for these functions. (I'm using CodeWarrior with all optimizations on for these examples.)
Recall that as processors have gotten faster, memory has not kept pace. For instance, 1/80 ns (the speed of memory in most Macintoshes) = 12.5 MHz. This means that if adjacent instructions have to address memory with no intervening computation, it's as if the processor has slowed to 12.5 MHz.
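The arithmetic behind that figure is just the reciprocal of the cycle time. As a small sketch (the function names are hypothetical, and the 25 MHz clock in the usage note is only an illustrative assumption, not a figure from any particular machine):

```c
#include <assert.h>

/* Converts a memory cycle time in nanoseconds into the rate, in MHz, at
   which back-to-back accesses can complete: f = 1 / t. */
static double AccessRateMHz(double cycleNs)
{
    /* 1 / (cycleNs * 1e-9 s) is the rate in Hz; dividing by 1e6 gives MHz,
       which simplifies to 1000 / cycleNs. */
    return 1000.0 / cycleNs;
}

/* Hypothetical helper: how many CPU clocks elapse during one memory cycle. */
static double ClocksPerAccess(double cpuMHz, double cycleNs)
{
    return cpuMHz / AccessRateMHz(cycleNs);
}

/* Self-check of the 80 ns example: 1000 / 80 is exactly 12.5. */
static void CheckEightyNanoseconds(void)
{
    assert(AccessRateMHz(80.0) == 12.5);
}
```

For example, ClocksPerAccess(25.0, 80.0) comes to 2, meaning a hypothetical 25 MHz processor would spend two of its clocks on every back-to-back access to 80 ns memory.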
First the 68K compiler, starting with the two-temp case:
Name="swap2"(6) Size=26
MOVEA.L $0004(A7),A1
MOVEA.L $0008(A7),A0
MOVE.W (A1),D0
MOVE.W (A0),D1
MOVE.W D1,(A1)
MOVE.W D0,(A0)
RTS
Ignoring the two MOVEA.Ls, which set up the address registers, and the return, this takes four instructions, all of which touch memory. Notice, however, that only one instruction (MOVE.W D1,(A1)) uses the result of the instruction directly before it, meaning that most of the instructions can overlap in the processor pipeline.
Next with one temp:
Name="swap1"(4) Size=24
MOVEA.L $0004(A7),A1
MOVEA.L $0008(A7),A0
MOVE.W (A1),D0
MOVE.W (A0),(A1)
MOVE.W D0,(A0)
RTS
Here we have three instructions, all of which access memory and all of which can overlap. This is smaller than the example above. Whether it is faster depends on the relative timing of the MOVE.W (A0),(A1) instruction. (If anyone wants to time this, I'll print the results.)
Now Greg's tip:
Name="swap0"(1) Size=30
MOVEA.L $0004(A7),A1
MOVEA.L $0008(A7),A0
MOVE.W (A0),D0
EOR.W D0,(A1)
MOVE.W (A1),D0
EOR.W D0,(A0)
MOVE.W (A0),D0
EOR.W D0,(A1)
RTS
This generates six instructions, all of which touch memory. Furthermore, three of these are read-modify-write cycles, which are slower than a plain read or write, and each instruction depends on the result of the instruction directly before it, so it won't overlap in the pipeline. That makes this both the largest and the slowest implementation of the three.
Now let's look at the PowerPC code:
Name=".swap2"(6) Size=20
lha r0,0(r3)
lha r5,0(r4)
sth r5,0(r3)
sth r0,0(r4)
blr
Name=".swap1"(4) Size=20
lha r5,0(r3)
lha r0,0(r4)
sth r0,0(r3)
sth r5,0(r4)
blr
Note that both of the versions with temporaries generated essentially the same code (4 instructions, all touching memory but pipelineable), differing only in register assignment. This is because RISC processors typically don't have memory-to-memory operations; instead, they must move data to a register before operating on it.
Now our tip:
Name=".swap0"(1) Size=52
lha r5,0(r4)
lha r0,0(r3)
xor r0,r0,r5
sth r0,0(r3)
lha r5,0(r3)
lha r0,0(r4)
xor r0,r0,r5
sth r0,0(r4)
lha r4,0(r4)
lha r0,0(r3)
xor r0,r0,r4
sth r0,0(r3)
blr
This implementation generates 12 instructions, 9 of which access memory (6 loads and 3 stores). Furthermore, there are 2 pipeline stalls. Clearly this implementation is the largest and slowest of all.
The moral of the story is: don't get tricky. C programmers often try to minimize the number of lines of C in their program without consideration for what the compiler will generate. When in doubt, write clear code and give the optimizer a chance to maximize performance. Look at the compiler output. Your code will be easier to debug and probably faster too.
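For readers who want to experiment, here is a small portable harness for comparing the swaps. It is a sketch rather than anything from the original submissions: swap0 and swap1 are copied from the listings earlier in this column, while SwapsAgree, XorSwapWithSelf and TempSwapWithSelf are hypothetical helpers added for illustration. It also brings out a correctness hazard of the XOR version that is separate from the speed argument: if both pointers refer to the same location, the first XOR zeroes the value.

```c
#include <assert.h>

static void swap1(short *aPtr, short *bPtr)
{
    short a = *aPtr;        /* the version with one temporary */
    *aPtr = *bPtr;
    *bPtr = a;
}

static void swap0(short *aPtr, short *bPtr)
{
    *aPtr ^= *bPtr;         /* the XOR version */
    *bPtr ^= *aPtr;
    *aPtr ^= *bPtr;
}

/* Hypothetical check: returns 1 if both versions exchange two distinct
   variables holding x and y. */
static int SwapsAgree(short x, short y)
{
    short a0 = x, b0 = y;   /* operands for the XOR version */
    short a1 = x, b1 = y;   /* operands for the one-temporary version */
    swap0(&a0, &b0);
    swap1(&a1, &b1);
    return a0 == y && b0 == x && a1 == y && b1 == x;
}

/* Hypothetical helpers: swap a variable with itself and return what is
   left in it afterwards. */
static short XorSwapWithSelf(short v)
{
    swap0(&v, &v);          /* v ^= v clears v, and it stays cleared */
    return v;
}

static short TempSwapWithSelf(short v)
{
    swap1(&v, &v);          /* the saved copy restores the value */
    return v;
}
```

With distinct variables the two versions agree, but XorSwapWithSelf(7) leaves 0 behind where TempSwapWithSelf(7) leaves 7, so the tricky version needs an extra aliasing check before it is safe in general code such as a sort routine.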
Till next time,
- Steve