June 92 - Creating PCI Device Drivers
Creating PCI Device Drivers
MARTIN MINOW
The new PCI-based Power Macintosh computers bring with them a subset of the
functionality to be offered by the next generation of I/O architecture. New support
for device drivers makes it possible to develop forward-compatible drivers for PCI
devices, while at the same time making them much easier to write and eliminating
their dependence on the Macintosh Toolbox. Key features of the new driver model are
described in this article and illustrated by the accompanying sample PCI device
driver.
Writing Macintosh device drivers has always been something of a black art. Details of how to do it
are hidden in obscure places in the documentation and often discovered only by developers willing to
disassemble Macintosh ROMs and system files. But this art that's flourished for more than a decade
is about to get a lot less arcane.
The PCI-based Power Macintosh computers are the first of a new generation of computers with
support for a driver model that's independent of the 68000 processor family and the Macintosh
Toolbox. Existing 680x0 drivers will continue to work on the PCI machines (although this may not
be true for future systems); a third-party NuBus™ adapter enables the use of existing hardware
devices and drivers without change. But drivers for PCI hardware devices must be written in
accordance with the driver model supported in the new system software release, which makes them
simpler to develop and maintain.
This article will give you an overview of the new device driver model, without attempting to cover
everything (which would fill a book and already has). After discussing key features, it suggests how
you might go about converting an existing driver to drive a PCI device. The remainder of the article
looks at some of the individual parts of a forward-compatible PCI device driver. The sample code
excerpted here and included in its entirety on this issue's CD offers a complete device driver that
illustrates most of the features of the new driver model. Of course, you won't be able to use the
driver without the hardware, and you'll need updated headers and libraries to recompile it.
How to write device drivers for PCI-based Macintosh computers is explained in detail in Designing PCI
Cards and Drivers for Power Macintosh Computers.
KEY FEATURES OF THE NEW DRIVER MODEL
The following list of features will give you some idea of the rationale behind the move away from a
device driver architecture that's served the Macintosh operating system for more than a decade.
Some of these features address problems of the old architecture, while some anticipate new
requirements.
A simplified set of driver services independent of the Macintosh Toolbox
The existing Device Manager design is closely tied to specific features of the Macintosh Toolbox.
The new system software release supports only a small set of driver services, which are independent
of the Toolbox and are limited to just those things that drivers need to do; they don't let drivers
display dialogs, open files, read resources, or draw on the screen. This greatly simplifies both the
driver's task (the driver interacts only with the actual hardware) and the operating system's task (the
OS needn't have a file system or screen available when starting up drivers).
Independence from the 68000 processor family
The old device driver architecture is highly dependent on specific features of the 680x0 processor
architecture. For example, the way code segments are organized and the conventions for passing
parameters depend on the 680x0 architecture and make the old driver code different from other code
modules. This means that drivers can't be written in native PowerPC code -- or must make use of
computationally expensive mixed-mode switches.
Also, in the 680x0 architecture, critical sections and atomic operations use assembly-language
sequences to disable interrupts. The PowerPC processor has a completely different interrupt
structure, effectively making these techniques impossible to transport directly to native PowerPC
code.
In the new system software, support for the driver model is independent of any particular processor,
hiding processor-specific requirements in operating system libraries. Drivers can be compiled into
native PowerPC code and can be written in a high-level language such as C. Because they're standard
PowerPC code fragments, they aren't bound by the segment size limitations of the 680x0
architecture; they can be created with standard compilers and debugged with the Macintosh two-
machine debugger.
A more flexible configuration facility
Driver configuration in the old architecture requires the ability to read resources from a parameter
file, or from a 6-byte nonvolatile RAM area indexed by NuBus slot. These ad hoc configuration
mechanisms based on the Resource Manager, File Manager, and Slot Manager are replaced in the
new system software by a more flexible configuration facility that's used throughout the system.
Drivers use a systemwide name registry for configuration. Each device has an entry in the Name
Registry containing properties pertinent to that device. Device drivers can also store and retrieve
private properties. Device configuration programs (control panels and utility applications) should use
the registry to set and retrieve device parameters.
System-independent device configuration
Devices can use Open Firmware to provide operating system configuration as well as system-
independent bootstrap device drivers. Open Firmware is an architecture-independent IEEE standard
for hardware devices based on the FORTH language. When the system is started up, it executes
functions stored in each device's expansion ROM that provide parameters to the system. A device can
also provide FORTH code to allow the system to execute I/O operations on the device. This means a
card can be used to bootstrap an operating system without having operating system-specific code in
its expansion ROM.
Open Firmware and the bootstrap process are described in detail in IEEE document 1275-1994,
Standard for Boot (Initialization, Configuration) Firmware.
Grouping by family
Drivers are grouped into general families, and family-specific libraries simplify their common tasks.
Currently, four families are defined: video, communications, SCSI (through SCSI Manager 4.3), and
NDRV (a catch-all for other devices, such as data acquisition hardware). The sample code is for a
device driver in the NDRV family.
Direct support for important capabilities
The existing Device Manager doesn't directly support certain capabilities, such as concurrent I/O
(required by network devices) and driver replacement. Driver writers who need these capabilities
have had to implement them independently, which is difficult, error-prone, and often dependent on a
particular operating system release. The new system software supports these capabilities in a
consistent manner.
A choice of storage
Drivers can be stored in the hardware expansion ROM or in a file of type 'ndrv' in the Extensions
folder. A later driver version stored in this folder can replace an earlier version stored in the hardware
expansion ROM.
Forward compatibility
Device drivers written for the new system software will run without modification under Copland, the
new generation of the Mac OS forthcoming from Apple, if they use only the restricted system
programming interface and follow the addressing guidelines in Designing PCI Cards and Drivers for
Power Macintosh Computers.
For more on Copland, see "Copland: The Mac OS Moves Into the Future" in this issue of develop.
CONVERTING AN EXISTING DRIVER
To illustrate how you'd go about converting an existing device driver to drive a PCI device, let's
suppose you've developed a document scanner with an optical character recognition (OCR) facility.
The document scanner is currently controlled by a NuBus board that you designed, and you're
building a PCI board to support the scanner on future Macintosh machines.
A useful way to approach the conversion effort is to conceptualize the device driver as consisting of
three generally independent layers:
- A high-level component that connects the device driver to the operating system
and processes requests.
- A mid-level component that has the device driver's task-specific intelligence. For
example, this might contain OCR algorithms. This part is unique to each driver
and generally hardware independent.
- The low-level bus interface "hardware abstraction layer" that directly manipulates
the external device and thus is always device dependent.
At the same time, you might also organize the code in each of these three layers into the following
functional groups:
- data transfer operations (Read, Write)
- interrupt service routines
- initialization and termination
- configuration and control (power management, parameterization)
Let's look at what you would do to each of these layers and groups.
First, you would throw out the high-level component in your driver that interacts with the Device
Manager and replace it with the considerably simpler request processing of the new system software
release. You would need to add support for the Initialize, Finalize, Superseded, and Replace
commands (discussed later), as they have no direct counterpart in the existing Device Manager. You
would also need to revise the way you complete an I/O request: instead of storing values in 68000
registers and jumping to jIODone, your driver would call IOCommandIsComplete.
The mid-level component in your driver would include scanner management and, in particular, OCR
algorithms. These algorithms comprise the intelligence that sets your product apart from its
competition. To convert your driver to a PCI device driver, you would recompile (or rewrite) the
algorithms for the PowerPC processor. If the algorithms were in 68000 assembly language, you could
get started by making mixed-mode calls between the new driver and the existing functions; however,
this won't work with Copland, and I would recommend "going native" as soon as possible.
You would replace the low-level bus interface that manipulates registers on a NuBus card with code
that manipulates PCI registers. Because this is specific to a particular hardware device, it won't be
discussed in this article, but the sample driver on the CD shows you how to access PCI device
registers.
You would also create Open Firmware boot code to allow your card to be recognized during system
initialization. Because the new driver model doesn't use Macintosh Toolbox services, you would have
to redesign your driver to (1) use the Name Registry for configuration instead of resources and
parameter files, and (2) use the new timer services, replacing any dependency on the accRun
PBControl call (the sample code shows how to call timer services, although it's not discussed here).
How your new driver code would look will become clearer in the next sections, where we examine
key parts of the sample device driver. To get the whole picture, see the sample driver in its entirety
on the CD.
The remainder of this article introduces a number of new operating system functions, as well as a few
new libraries, managers, and such. "A Glossary of New Operating System Terms" will help you
navigate through the new territory.
A GLOSSARY OF NEW OPERATING SYSTEM TERMS
CheckpointIO. A function that releases memory that had been configured by PrepareMemoryForIO.
DoDriverIO. A function provided by the driver that carries out all device driver tasks. When you build a
driver, it must export this function to the Device Manager.
DriverDescription. An information block named TheDriverDescription that the Driver Loader Library uses to
connect a device driver with its associated hardware. When you build a driver, it must export this block to
the Driver Loader Library.
Driver Loader Library. A library of functions used by the Device Manager to locate and initialize all drivers.
It uses the DriverDescription structure to match a driver with the hardware actually present on a machine.
Driver Services Library. A family-independent library of driver services limited to just those things that drivers
need to do.
Expansion Bus Manager. A library that provides access to PCI configuration registers.
GetInterruptFunctions. A function that retrieves the current interrupt service functions established for this
device.
GetLogicalPageSize. A function that retrieves the size of the physical page. Normally called once when the
driver is initialized.
InstallInterruptFunctions. A function that replaces the current interrupt functions with functions specific to this
device driver.
IOCommandIsComplete. A function that completes the current request by returning the final status to the
caller, calling an I/O completion routine if provided, and starting the next transfer if necessary.
MemAllocatePhysicallyContiguous. A function that allocates a contiguous block of memory whose address can
be passed, as a single unit, to a hardware device. This is essential for frame buffers and similar memory
areas that must be accessed by both the CPU and an external device.
Name Registry. A database that organizes all system configuration information. Each device's entry in the
registry contains a set of properties that can be accessed with RegistryPropertyGet and
RegistryPropertyGetSize.
PoolAllocateResident. A function that allocates and optionally clears memory in the system's resident pool.
This replaces NewPtrSys, which isn't available to forward-compatible PCI device drivers.
PoolDeallocate. A function that frees memory allocated by PoolAllocateResident.
PrepareMemoryForIO. A function that converts a logical address range to a set of physical addresses and
configures as much as possible of the corresponding physical memory space for subsequent direct memory
access.
QueueSecondaryInterrupt. A function that runs a secondary interrupt service routine at a noninterrupt level.
RegistryPropertyGet, RegistryPropertyGetSize. Functions that retrieve, respectively, the contents and the size
of a property, given its name and a value that identifies the current Name Registry entity.
Software task. An independently scheduled software module that can call driver services, including
PrepareMemoryForIO. Software tasks can be used to replace time-based processing that previously used the
PBControl accRun service.
SynchronizeIO. A function that executes the processor I/O synchronize (eieio) instruction.
A LOOK AT THE SAMPLE DRIVER: CONFIGURATION AND CONTROL
Now we'll look at key pieces of the sample driver, starting with the code for configuration and
control. As mentioned earlier, the sample driver is a member of the NDRV family. To the operating
system, an NDRV driver is a PowerPC code fragment
containing two exported symbols: TheDriverDescription and DoDriverIO. (Although all drivers have
a TheDriverDescription structure, the particular driver family they belong to determines which other
exported symbols are required.)
TheDriverDescription is a static structure, shown in Listing 1, that provides information to the
operating system about the device that this driver controls. The driver will be loaded only if the
device is present. TheDriverDescription also indicates whether the driver is controlled by a family
interface (such as Open Transport for the communications family) and specifies the driver name to
be used by operating system functions to refer to it. The Driver Loader extracts
TheDriverDescription from the code fragment before the driver executes; thus it must be statically
initialized.
Listing 1. TheDriverDescription
DriverDescription TheDriverDescription = {
/* This section lets the Driver Loader identify the structure
version. */
kTheDescriptionSignature,
kInitialDriverDescriptor,
/* This section identifies the PCI hardware. It also ensures
that the correct revision is loaded. */
"\pMyPCIDevice", /* Hardware name */
kMyPCIRevisionID, kMyVersionMinor,
kMyVersionStage, kMyVersionRevision,
/* These flags control when the driver is loaded and opened,
and control Device Manager operation. They also name the
driver to the operating system. */
( (1 * kDriverIsLoadedUponDiscovery) /* Load at system startup */
| (1 * kDriverIsOpenedUponLoad) /* Open when loaded */
| (0 * kDriverIsUnderExpertControl)/* No special family expert */
| (0 * kDriverIsConcurrent) /* Driver isn't concurrent */
| (0 * kDriverQueuesIOPB) /* No internal IOPB queue */
),
"\pMyDriverName", /* PBOpen name */
0, 0, 0, 0, 0, 0, 0, 0, /* For future use */
/* This is a vector of operating system information, preceded by
an element count (here, only one service is provided). */
1, /* Number of OS services */
kServiceTypeNdrvDriver, /* This is an NDRV driver */
kNdrvTypeIsGeneric, /* Not a special type */
kVersionMajor, kVersionMinor, /* NumVersion information */
kVersionStage, kVersionRevision
};
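The (1 * flag) | (0 * flag) idiom used for the option word in Listing 1 keeps every option in view, whether it's on or off, so that toggling one means editing a single digit. Here's a minimal sketch of the idiom on its own. The kDemo names and their bit values are hypothetical, chosen only for illustration; the real constants come from the system headers.

```c
#include <stdint.h>

/* Hypothetical option bits, for illustration only. The real
   kDriverIs... constants are defined by the system headers. */
enum {
    kDemoLoadUponDiscovery = 0x00000001,
    kDemoOpenUponLoad      = 0x00000002,
    kDemoUnderExpert       = 0x00000004,
    kDemoConcurrent        = 0x00000008
};

/* Multiplying by 1 keeps a bit and multiplying by 0 drops it, so every
   option stays listed and can be switched by changing one digit. */
static const uint32_t kDemoRuntimeOptions =
      (1 * kDemoLoadUponDiscovery)
    | (1 * kDemoOpenUponLoad)
    | (0 * kDemoUnderExpert)
    | (0 * kDemoConcurrent);

/* Returns nonzero if the given option bit is set in the option word. */
int DemoOptionIsSet(uint32_t options, uint32_t flag)
{
    return (options & flag) != 0;
}
```

The result is an ordinary compile-time constant, which matters here because TheDriverDescription must be statically initialized.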
DoDriverIO is a single function called with five parameters to perform all driver services (see Table
1). The overall organization of the driver thus is very simple, as shown in Listing 2.
Table 1. DoDriverIO parameters
Parameter | Usage
addressSpaceID | Used for operating system memory management. Currently, only one address space is supported; future systems will support multiple address spaces.
ioCommandID | Uniquely identifies this I/O request. The driver passes it back to the operating system when the request completes.
ioCommandContents | Varies depending on the ioCommandCode value. For example, for Read, Write, Control, Status, and KillIO commands, it's a pointer to a ParamBlockRec.
ioCommandCode | Defines the type of I/O request.
ioCommandKind | Specifies whether the command is synchronous or asynchronous, and whether it's immediate.
Listing 2. DoDriverIO
OSErr DoDriverIO(AddressSpaceID addressSpaceID,
IOCommandID ioCommandID,
IOCommandContents ioCommandContents,
IOCommandCode ioCommandCode,
IOCommandKind ioCommandKind)
{
OSErr status;
switch (ioCommandCode) {
case kInitializeCommand:
status = DriverInitialize(ioCommandContents.initialInfo);
break;
case kFinalizeCommand:
status = DriverFinalize(ioCommandContents.finalInfo);
break;
case kSupersededCommand:
status =
DriverSuperseded(ioCommandContents.supersededInfo);
break;
case kReplaceCommand:
status = DriverReplace(ioCommandContents.replaceInfo);
break;
case kOpenCommand:
status = DriverOpen(ioCommandContents.pb);
break;
case kCloseCommand:
status = DriverClose(ioCommandContents.pb);
break;
case kReadCommand:
status = DriverRead(addressSpaceID, ioCommandID,
ioCommandKind, ioCommandContents.pb);
break;
case kWriteCommand:
status = DriverWrite(addressSpaceID, ioCommandID,
ioCommandKind, ioCommandContents.pb);
break;
case kControlCommand:
status = DriverControl(addressSpaceID, ioCommandID,
ioCommandKind, (CntrlParam *) ioCommandContents.pb);
break;
case kStatusCommand:
status = DriverStatus(addressSpaceID, ioCommandID,
ioCommandKind,
(CntrlParam *) ioCommandContents.pb);
break;
case kKillIOCommand:
status = DriverKillIO();
break;
default:
status = paramErr; /* Unrecognized command */
break;
}
/* Force a valid result for immediate commands. Other commands
return noErr if the operation completes asynchronously. */
if ((ioCommandKind & kImmediateIOCommandKind) == 0) {
if (status == kIOBusyStatus) /* Our "in progress" value */
status = noErr; /* I/O will complete later */
else
/* To prevent a subtle race condition, the driver must
not store final status in the caller's parameter
block. This prevents a problem where the caller can
reuse the parameter block before the caller's
completion routine is called. */
status = IOCommandIsComplete(ioCommandID, status);
}
return (status);
}
The driver must ensure that immediate operations (those that must complete without delay) return
directly to the caller and that completed synchronous and asynchronous requests call
IOCommandIsComplete. (The sample driver handler functions return the final status if they handled
the request, and a private value, kIOBusyStatus, if an asynchronous interrupt will eventually complete
the operation.)
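The completion rule at the end of DoDriverIO is easier to see when it's separated from the dispatch. The following sketch models only that rule. All the Demo names are hypothetical stand-ins: DemoCommandIsComplete simply records the status it's given and is not the real IOCommandIsComplete, and the status and command-kind values are made up for the example.

```c
typedef int DemoStatus;

enum {
    kDemoNoErr    = 0,
    kDemoParamErr = -50,
    kDemoBusy     = 1      /* Private "in progress" value */
};

enum { kDemoImmediate = 0x1 };  /* Hypothetical command-kind bit */

static int        gDemoCompletions;  /* Number of completion calls */
static DemoStatus gDemoLastStatus;   /* Status given to the last one */

/* Stand-in for IOCommandIsComplete: records the final status. */
static DemoStatus DemoCommandIsComplete(DemoStatus finalStatus)
{
    gDemoCompletions++;
    gDemoLastStatus = finalStatus;
    return finalStatus;
}

/* Applies the completion rule to the status a handler returned. */
DemoStatus DemoFinishRequest(int commandKind, DemoStatus handlerStatus)
{
    if (commandKind & kDemoImmediate)
        return handlerStatus;    /* Return directly to the caller */
    if (handlerStatus == kDemoBusy)
        return kDemoNoErr;       /* An interrupt will complete it later */
    return DemoCommandIsComplete(handlerStatus);
}
```

Only the third path reports completion; an immediate command never does, and a busy request leaves that to the interrupt service routine.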
In the sample driver, individual subroutines carry out the functions. I'll describe the administration
routines first, then the process of carrying out an I/O operation.
INITIALIZATION AND TERMINATION
Currently, drivers perform all of their initialization when called with PBOpen and generally ignore
PBClose. The new system software provides six commands for initialization and termination, as
shown in Table 2. Since drivers are code fragments, they can also use the Code Fragment Manager
initialization and termination routines, although this probably isn't necessary.
For details on the Code Fragment Manager, see Inside Macintosh: PowerPC System Software.
Table 2. Driver commands for initializing and terminating
ioCommandCode Value | Usage
kInitializeCommand | Carries out normal initialization. Called once when the driver is first loaded.
kReplaceCommand | Indicates that this driver is replacing a currently loaded driver for the device (for example, a ROM driver is being replaced by a driver loaded from the system disk).
kOpenCommand | Begins servicing of device requests.
kCloseCommand | Stops servicing of device requests.
kSupersededCommand | Indicates that this driver will be replaced by another.
kFinalizeCommand | Shuts down the device and releases all resources. Called once just before the driver is to be unloaded.
When you look at the sample driver, you'll see that most of the work is done by Replace and
Superseded, with Open and Close having no function there.
Here are the tasks that a driver needs to perform when initialized, whether by Initialize or Replace:
- Initialize its global variables and fetch systemwide parameters, such as the memory
management page size.
- Fetch the device's physical address range (either memory address or PCI I/O
addresses) from the Name Registry.
- Enable memory or I/O access and use the DeviceProbe function to verify that the
device is properly installed.
- Fetch the interrupt property information from the Name Registry and initialize
the interrupt service routine.
- If all initializations complete correctly, use device-specific operations to reset the
hardware.
Listing 3 shows how to extract the physical addresses of your device and use the "AAPL,address"
property to get the corresponding logical addresses. Unlike address space assignments on NuBus
machines, where the slot number directly corresponds to the device's 32-bit address range, PCI
address space assignments are dynamic. Devices define a set of registers, and the system initialization
process (Open Firmware) uses this information, together with information about buses and PCI
bridges, to bind the device to its 32-bit physical address range. (Actually, although addresses use 32
bits, the low 23 bits select the physical address, while the high 9 bits select between main memory
and PCI bus address spaces. The device driver uses the logical address to reference device registers.)
Open Firmware code updates the Name Registry to show the device's binding. Note that the driver
must search for the required address register and can't rely on any particular address being in a
specific location within the property.
Listing 3. Fetching the device's logical address range
typedef struct AssignedAddress {
UInt32 cellHi; /* Address type */
UInt32 cellMid;
UInt32 cellLow;
UInt32 sizeHi;
UInt32 sizeLow;
} AssignedAddress, *AssignedAddressPtr;
#define kAssignedAddressProperty "assigned-addresses"
#define kAAPLAddressProperty "AAPL,address"
#define kIOMemSelectorMask 0x03000000
#define kIOSpaceSelector 0x01000000
#define kMemSpaceSelector 0x02000000
#define kDeviceRegisterMask 0x000000FF
OSErr GetDeviceAddress(UInt32 selector, UInt32 deviceRegister,
LogicalAddress *logicalAddress)
{
OSErr status;
RegPropertyValueSize size;
AssignedAddressPtr addressPtr;
LogicalAddress *logicalAddressVector;
int nAddresses, i;
UInt32 cellHi;
addressPtr = NULL;
logicalAddressVector = NULL;
status = GetThisProperty(kAssignedAddressProperty,
(RegPropertyValue *) &addressPtr, &size);
/* See Listing 6. */
if (status == noErr) {
/* GetThisProperty returned a vector of assigned-address
records. Search the vector for the desired address
type. */
status = paramErr; /* Presume "no such address." */
nAddresses = size / sizeof (AssignedAddress);
for (i = 0; i < nAddresses; i++) {
cellHi = addressPtr[i].cellHi;
if ((cellHi & kIOMemSelectorMask) == selector
&& (cellHi & kDeviceRegisterMask) == deviceRegister) {
if (addressPtr[i].sizeLow == 0)
/* Open Firmware was unable to assign an address
to this memory area. We must return an error
to prevent the driver from starting up (status
is still paramErr). */
break;
/* This is the desired address space. Find the
corresponding LogicalAddress by resolving the
"AAPL,address" property. We want the i'th
LogicalAddress in the vector. */
status = GetThisProperty(kAAPLAddressProperty,
(LogicalAddress *) &logicalAddressVector, &size);
if (status == noErr) {
nAddresses = size / sizeof (LogicalAddress);
if (i < nAddresses)
*logicalAddress = logicalAddressVector[i];
else status = paramErr;
}
break; /* Exit the for loop. */
} /* Check for the requested register. */
} /* Loop over all address spaces. */
DisposeThisProperty((RegPropertyValue *) &addressPtr);
DisposeThisProperty
((RegPropertyValue *) &logicalAddressVector);
} /* If we found our "assigned-addresses" property */
return (status);
}
When the driver reads the "assigned-addresses" property, it looks at the address type (I/O or
memory) and may also need to examine other information to make sure the address range is
appropriate. For example, a device may have two memory address ranges -- one for the device's
registers and a separate range for its on-card firmware. The GetDeviceAddress function in Listing 3
uses the register number to determine which of several address ranges to use, but this may not work
for all hardware. This function also resolves the logical address range that corresponds to the device's
physical address range using an Apple-specific property that records device logical addresses. This is
important for devices that require I/O cycles: using the logical address lets the driver treat these
devices as if they used normal memory addresses, eliminating the overhead of the Expansion Bus
Manager routines.
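To see what the masks in Listing 3 are picking out, it can help to decode the first cell of an "assigned-addresses" entry into its fields. The sketch below is a standalone illustration and isn't part of the sample driver. The address-space and register fields follow the masks in Listing 3; the bus, device, and function fields are my reading of the PCI bus binding to IEEE 1275 and should be checked against that document. The Demo names are hypothetical.

```c
#include <stdint.h>

/* Fields packed into the first (high) cell of an address entry. */
typedef struct DemoPCIAddressFields {
    unsigned space;     /* 0 = configuration, 1 = I/O, 2 = 32-bit memory */
    unsigned bus;       /* PCI bus number */
    unsigned device;    /* Device number on that bus */
    unsigned function;  /* Function number within the device */
    unsigned reg;       /* Configuration register (base register offset) */
} DemoPCIAddressFields;

/* Splits the high cell into its fields with shifts and masks. */
DemoPCIAddressFields DemoDecodeCellHi(uint32_t cellHi)
{
    DemoPCIAddressFields f;

    f.space    = (cellHi >> 24) & 0x03;
    f.bus      = (cellHi >> 16) & 0xFF;
    f.device   = (cellHi >> 11) & 0x1F;
    f.function = (cellHi >>  8) & 0x07;
    f.reg      =  cellHi        & 0xFF;
    return f;
}
```

For example, a high cell of 0x02006814 would describe 32-bit memory space on bus 0, device 13, function 0, assigned through the base register at configuration offset 0x14.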
Listing 4 shows how a driver might use the Expansion Bus Manager to enable a device to become
bus-master and respond to either memory or I/O accesses. It also shows how to read a device register
with the DeviceProbe function. While the actual values are specific to the NCR 53C825 chip, the
technique is generally useful. Note that the command word was changed using a read-modify-write
sequence.
Listing 4. Checking for the correct hardware device
OSErr InitializeMyHardware(void)
{
OSErr status;
UInt8 ctest3;
UInt16 commandWord;
status = ExpMgrConfigReadWord(
&gDeviceEntry, /* kInitializeCommand param */
(LogicalAddress) 0x04, /* Command register */
&commandWord); /* Current chip values */
if (status == noErr)
status = ExpMgrConfigWriteWord(
&gDeviceEntry, /* kInitializeCommand param */
(LogicalAddress) 0x04, /* Command register */
commandWord | 0x0147); /* New chip values */
if (status == noErr)
status = DeviceProbe(
gDeviceBaseAddress + 0x9B,
/* Chip Test 3 register */
&ctest3, /* Store value here */
k8BitAccess);
if (status == noErr && (ctest3 & 0xF0) != 0x20)
status = paramErr; /* Wrong chip revision */
return (status);
}
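The value 0x0147 that Listing 4 ORs into the command register is easier to read when it's built from named bits. The sketch below does the same read-modify-write against a simulated register, so it runs without hardware. The bit positions are those of the standard PCI command register; the kDemo names and the simulated register (with its arbitrary starting value) are assumptions made for illustration and don't stand in for the Expansion Bus Manager calls.

```c
#include <stdint.h>

/* Bits of the PCI command register (configuration offset 0x04). */
enum {
    kDemoCmdIOSpace     = 0x0001,  /* Respond to I/O accesses */
    kDemoCmdMemorySpace = 0x0002,  /* Respond to memory accesses */
    kDemoCmdBusMaster   = 0x0004,  /* Allow the device to be bus master */
    kDemoCmdParityResp  = 0x0040,  /* Respond to parity errors */
    kDemoCmdSERREnable  = 0x0100,  /* Enable the SERR# driver */
    kDemoEnableMask     = kDemoCmdIOSpace | kDemoCmdMemorySpace
                        | kDemoCmdBusMaster | kDemoCmdParityResp
                        | kDemoCmdSERREnable
};

/* Simulated command register with one unrelated bit already set. */
static uint16_t gDemoCommandRegister = 0x0080;

static uint16_t DemoConfigRead(void)         { return gDemoCommandRegister; }
static void     DemoConfigWrite(uint16_t v)  { gDemoCommandRegister = v; }

/* Read-modify-write: set the enable bits, preserve everything else. */
uint16_t DemoEnableDevice(void)
{
    uint16_t command = DemoConfigRead();

    command = (uint16_t) (command | kDemoEnableMask);
    DemoConfigWrite(command);
    return DemoConfigRead();
}
```

Because the existing value is read first, the bit that was already set survives the update, which is the point of the read-modify-write sequence.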
The code for initializing the interrupt service routine, including connecting the primary interrupt
service routine to the operating system, is shown in Listing 5. This code installs a single interrupt
handler; if your device supports multiple interrupts (for example, if it supports several serial lines),
you may want to use the new interrupt management routines in the Driver Services Library to build
a hierarchy of interrupt service routines.
Listing 5. Initializing the interrupt service routine
#define kInterruptSetProperty "driver-ist"
OSErr InitializeInterruptServiceRoutine(void)
{
OSErr status;
OSStatus osStatus;
RegPropertyValueSize size;
InterruptSetMember *interruptSetMember;
status = GetThisProperty(kInterruptSetProperty,
(RegPropertyValue *) &interruptSetMember, &size);
if (status == noErr) {
if (size < sizeof (InterruptSetMember)) {
DisposeThisProperty
((RegPropertyValue *) &interruptSetMember);
status = paramErr;
}
}
if (status == noErr) {
/* We have the interrupt set ID and member number. Save the
current interrupt set and get the current functions for
this interrupt set. */
gInterruptSetMember = *interruptSetMember;/* Save globally */
DisposeThisProperty
((RegPropertyValue *) &interruptSetMember);
osStatus = GetInterruptFunctions(gInterruptSetMember.setID,
gInterruptSetMember.member, &gOldInterruptSetRefCon,
&gOldInterruptServiceFunction,
&gOldInterruptEnableFunction,
&gOldInterruptDisableFunction);
if (osStatus != noErr)
status = paramErr;
}
if (status == noErr) {
/* We have the information we need. Install our own interrupt
handler function. If successful, call the old enabler to
enable interrupts (we don't install a private enabler). */
osStatus = InstallInterruptFunctions(
gInterruptSetMember.setID,
gInterruptSetMember.member,
NULL, /* No refCon */
DriverInterruptServiceRoutine,
/* See Listing 11. */
NULL, /* No new enable function */
NULL); /* No new disable function */
if (osStatus != noErr)
status = paramErr;
}
if (status == noErr)
(*gOldInterruptEnableFunction)(gInterruptSetMember,
gOldInterruptSetRefCon);
return (status);
}
Interrupt management routines are described in Chapter 9 of Designing PCI Cards and Drivers for Power
Macintosh Computers.
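Listing 5 saves the interrupt functions that were installed before it puts its own in place, presumably so that they can be put back when the driver shuts down. That save, install, and restore pattern can be sketched independently of the real interrupt services. Everything below is a hypothetical model: a single function pointer plays the part of the interrupt set member, and none of the Demo functions corresponds to a system call.

```c
#include <stddef.h>

typedef int (*DemoHandler)(void *refCon);

static DemoHandler gDemoSlotHandler;   /* Handler currently installed */
static DemoHandler gDemoSavedHandler;  /* Previous handler, kept for restore */

/* Two handlers that differ only in the value they return. */
static int DemoDefaultHandler(void *refCon) { (void) refCon; return 0; }
static int DemoDriverHandler(void *refCon)  { (void) refCon; return 1; }

/* Save whatever is installed, then install the driver's handler. */
void DemoInstall(void)
{
    gDemoSavedHandler = gDemoSlotHandler;
    gDemoSlotHandler = DemoDriverHandler;
}

/* Put the saved handler back and forget the saved copy. */
void DemoRestore(void)
{
    gDemoSlotHandler = gDemoSavedHandler;
    gDemoSavedHandler = NULL;
}
```

Keeping the old values in driver globals, as Listing 5 does, is what makes the restore step possible later.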
GetThisProperty (Listing 6) is a generic utility function that retrieves a property from the Name
Registry, storing its contents in the system's resident memory pool. This is useful for retrieving
configuration information. The driver must, of course, return the memory to the pool when it's no
longer needed, using DisposeThisProperty, also shown in Listing 6.
Listing 6. Retrieving properties from the Name Registry
OSErr GetThisProperty(RegPropertyNamePtr regPropertyName,
RegPropertyValue *resultPropertyValuePtr,
RegPropertyValueSize *resultPropertySizePtr)
{
OSErr status;
RegPropertyValueSize size;
*resultPropertyValuePtr = NULL;
status = RegistryPropertyGetSize(
&gDeviceEntry, /* kInitializeCommand param */
regPropertyName,
&size);
if (status == noErr) {
*resultPropertyValuePtr =
(RegPropertyValue) PoolAllocateResident(size, FALSE);
if (*resultPropertyValuePtr == NULL)
status = memFullErr;
}
if (status == noErr) {
status = RegistryPropertyGet(
&gDeviceEntry, /* kInitializeCommand param */
regPropertyName,
*resultPropertyValuePtr,
&size);
if (status != noErr)
DisposeThisProperty(resultPropertyValuePtr);
}
if (status == noErr)
*resultPropertySizePtr = size; /* Success! */
return (status);
}
/* DisposeThisProperty disposes of a property that was obtained by
calling GetThisProperty. Note that applications would call
DisposePtr instead of PoolDeallocate. */
void DisposeThisProperty(RegPropertyValue *regPropertyValuePtr)
{
if (*regPropertyValuePtr != NULL) {
PoolDeallocate(*regPropertyValuePtr);
*regPropertyValuePtr = NULL;
}
}
Applications can use the functions in Listing 6 but must replace calls to PoolAllocateResident and
PoolDeallocate with calls to NewPtr and DisposePtr. The latter aren't available to PCI device drivers.
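The pattern in Listing 6 (ask for the size, allocate, fetch, and release the buffer if the fetch fails) can be tried out away from the Name Registry. The sketch below uses a one-entry stand-in for the registry and the standard C allocator. All the Demo names, the property name, its contents, and the error values are hypothetical; only the order of the steps is taken from the listing.

```c
#include <stdlib.h>
#include <string.h>

/* A one-entry stand-in for the Name Registry. */
static const char          kDemoName[]  = "demo-prop";
static const unsigned char kDemoValue[] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Stand-in for the size query: reports the property's size. */
static int DemoGetSize(const char *name, size_t *size)
{
    if (strcmp(name, kDemoName) != 0)
        return -1;                       /* No such property */
    *size = sizeof kDemoValue;
    return 0;
}

/* Stand-in for the fetch: copies the property into the buffer. */
static int DemoGet(const char *name, void *buffer, size_t *size)
{
    if (strcmp(name, kDemoName) != 0 || *size < sizeof kDemoValue)
        return -1;
    memcpy(buffer, kDemoValue, sizeof kDemoValue);
    *size = sizeof kDemoValue;
    return 0;
}

/* Size query, allocation, fetch. On failure nothing stays allocated
   and *value is NULL; on success the caller frees *value. */
int DemoFetchProperty(const char *name, void **value, size_t *size)
{
    size_t needed;
    int status;

    *value = NULL;
    status = DemoGetSize(name, &needed);
    if (status != 0)
        return status;
    *value = malloc(needed);
    if (*value == NULL)
        return -2;                       /* Out of memory */
    status = DemoGet(name, *value, &needed);
    if (status != 0) {
        free(*value);
        *value = NULL;
        return status;
    }
    *size = needed;
    return 0;
}
```

Setting the result pointer to NULL before doing anything else means the caller can always dispose of it safely, whether or not the fetch succeeded.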
CARRYING OUT AN I/O OPERATION
There are two parts to starting an asynchronous I/O operation: the driver must carry out the
operations unique to the particular hardware device and it must configure memory so that hardware
direct memory access (DMA) operations can take place. Completing an operation requires responding
to hardware interrupts, updating user
parameter block fields, selecting the proper status code, and calling IOCommandIsComplete to
inform the Device Manager that the driver has finished with this I/O request. The sequence for a
complete, but somewhat simplified, I/O transaction might be as follows:
- Use parameter block information to configure device-specific information.
- Compute the logical addresses that are needed and call PrepareMemoryForIO to
compute the corresponding physical addresses. PrepareMemoryForIO replaces the
LockMemory and GetPhysical functions and handles virtual memory
considerations.
- With all memory ready for DMA, configure the hardware to start the transfer.
- When the device completes its operation, it will interrupt the PowerPC processor.
The operating system kernel will call your driver's primary interrupt service
routine.
- When the device request is complete, memory structures prepared by
PrepareMemoryForIO for this operation are released with CheckpointIO, and the interrupt service routine calls IOCommandIsComplete to return final status to the
caller.
This sequence represents an idealized and somewhat simplified situation. For example, display frame
buffers generally don't interrupt when written to but might interrupt at the end of a display cycle.
I won't say much about the Read, Write, Control, Status, and KillIO handlers: they carry out tasks
that are specific to the particular driver. Often, they initiate an operation that will be completed by a
device hardware interrupt. Control and Status handlers must process PBControl csCode = 43
(driverGestalt) requests. These provide a systematic way to query device capabilities and are also used
for power management. KillIO replaces the PBControl csCode = 1 (killCode) used for desk
accessories; it stops all pending I/O requests.
Before jumping into the complexities of PrepareMemoryForIO and interrupt service, I need to
mention one small task: setting and reading values in the device registers.
SETTING AND READING DEVICE REGISTER VALUES
The PCI bus architecture gives hardware developers two methods for setting and reading values in
the device registers: memory-mapped I/O and I/O cycle operations (described in more detail in
"Methods of I/O Organization"). A device advertises its I/O organization through bits in its
configuration register and by providing a PCI-standardized "reg" property. When the system starts
up, it assigns each device a range of physical addresses in the system's 32-bit physical address space.
The driver can retrieve the device's physical addresses by resolving the "assigned-addresses" property
and can use the Apple-specific "AAPL,address" property to translate the values in an
"assigned-addresses" property to logical addresses, as was shown in Listing 3. Your driver should use these
values when accessing your device's registers. Ranges of logical addresses are assigned to PCI bus
memory and I/O cycles; thus, your driver can perform I/O cycles without calling operating system
functions.
For example, the sample driver's hardware device has a test register (byte) at offset 0xCC from the
start of its memory base address. Suppose the logical address retrieved by GetDeviceAddress was
stored in the global gDeviceBaseAddress, defined as
volatile UInt8 *gDeviceBaseAddress;
The driver could then read the test register with
testRegister = gDeviceBaseAddress[0xCC];
The volatile keyword is important, as it prevents the compiler from removing what appear to be
unnecessary operations. Drivers will also need to call the SynchronizeIO function in the Driver
Services Library to force the PowerPC processor to flush its data pipeline. While the sample device
driver appears to use only memory operations, the PCI hardware issues either memory or I/O
addresses depending on the particular logical address reference. To issue I/O addresses, your device
driver would have to retrieve the "AAPL,address" property shown in Listing 3.
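The register access just described can be sketched in isolation. This is only an illustration under stated assumptions: an ordinary array stands in for the device's mapped register space (a real driver would use the logical address it resolved from the Name Registry), and the helper names ReadByteRegister and WriteByteRegister are hypothetical, not part of the sample driver. Only the 0xCC offset comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { kTestRegisterOffset = 0xCC, kFakeRegisterSpace = 0x100 };

/* Assumption: a plain array stands in for the device's register space.
   A real driver would store the logical address of the hardware here. */
static uint8_t gFakeDeviceRegisters[kFakeRegisterSpace];

/* volatile tells the compiler that every access has a side effect, so
   it may not cache, merge, or drop reads and writes through this pointer. */
static volatile uint8_t *gDeviceBaseAddress = gFakeDeviceRegisters;

/* Hypothetical helper: read one byte-wide register. */
static uint8_t ReadByteRegister(size_t offset)
{
    assert(offset < kFakeRegisterSpace);
    return gDeviceBaseAddress[offset];
}

/* Hypothetical helper: write one byte-wide register. */
static void WriteByteRegister(size_t offset, uint8_t value)
{
    assert(offset < kFakeRegisterSpace);
    gDeviceBaseAddress[offset] = value;
    /* A real driver would call SynchronizeIO here so that the store
       reaches the device before any later access is issued. */
}
```

Calling WriteByteRegister(kTestRegisterOffset, 0x5A) and then ReadByteRegister(kTestRegisterOffset) performs two distinct accesses; without volatile, an optimizing compiler would be free to satisfy the read from a register copy.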
While byte accesses are straightforward, word (16-bit) and long word (32-bit) accesses are more
complex. This is because the PCI bus is little-endian (the address
of a multibyte entity is the address of the low-order byte), whereas the Mac OS and the PowerPC
chip are big-endian (the address of a multibyte entity is the address of the high-order byte). To access
16-bit and 32-bit data, then, your driver must swap bytes in memory, either by using the PowerPC
lwbrx instruction or by calling the library functions EndianSwap16Bit or EndianSwap32Bit. The
Expansion Bus Manager routines handle "endian swapping" internally. Failing to swap bytes was the
most frequent error when I wrote the sample driver; you would be wise to check this thoroughly in your code.
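To make the byte-order issue concrete, here is a minimal portable sketch of the swaps involved. The names SwapBytes16, SwapBytes32, and ReadLittleEndian32 are hypothetical stand-ins written for this illustration; they are not the library's EndianSwap16Bit and EndianSwap32Bit, and they don't use the lwbrx instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value. */
static uint16_t SwapBytes16(uint16_t value)
{
    return (uint16_t) ((value << 8) | (value >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t SwapBytes32(uint32_t value)
{
    return ((value & 0x000000FFu) << 24)
         | ((value & 0x0000FF00u) << 8)
         | ((value & 0x00FF0000u) >> 8)
         | ((value & 0xFF000000u) >> 24);
}

/* Assemble a 32-bit little-endian value one byte at a time. Because it
   never reads a multibyte quantity, the result is the same whatever the
   byte order of the processor running the code. */
static uint32_t ReadLittleEndian32(const volatile uint8_t *registerAddress)
{
    assert(registerAddress != 0);
    return  (uint32_t) registerAddress[0]
         | ((uint32_t) registerAddress[1] << 8)
         | ((uint32_t) registerAddress[2] << 16)
         | ((uint32_t) registerAddress[3] << 24);
}
```

The byte-at-a-time form costs four bus accesses instead of one, so it suits debugging and self-checks better than a data path; it's a convenient way to verify that the swapped 32-bit read in your driver returns what you expect.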
PREPARING THE MEMORY
Before starting a DMA operation, the operating system must ensure that the data accessed by the
operation is in physical memory and that any data in the processor cache has been written to
memory. This is done with the PrepareMemoryForIO and CheckpointIO routines. Because the
process is complex, I'll break it down into smaller pieces to describe it. Let's assume your driver will
prepare two areas: a permanent shared-memory area used to communicate with the device (this could
be used for a display frame buffer) and a request-specific area used for a single I/O request.
Listing 7. Preparing a shared memory area
IOPreparationTable gSharedIOTable;
LogicalAddress gSharedAreaPtr;
OSErr PrepareSharedArea(
AddressSpaceID addressSpaceID) /* DoDriverIO parameter */
{
OSErr status;
ItemCount mapEntriesNeeded;
gSharedAreaPtr =
MemAllocatePhysicallyContiguous(kSharedAreaSize, TRUE);
if (gSharedAreaPtr == NULL)
return (memFullErr);
gSharedIOTable.options =
( kIOIsInput /* Device writes to memory. */
| kIOIsOutput /* Device reads from memory. */
| kIOLogicalRanges /* Input is logical addresses. */
| kIOShareMappingTables ); /* Share tables with kernel. */
gSharedIOTable.addressSpace = addressSpaceID;
gSharedIOTable.firstPrepared = 0;
gSharedIOTable.logicalMapping = NULL; /* We don't want this. */
/* Describe the area we're preparing and allocate a mapping
table. */
gSharedIOTable.rangeInfo.range.base = gSharedAreaPtr;
gSharedIOTable.rangeInfo.range.length = kSharedAreaSize;
mapEntriesNeeded =
GetMapEntryCount(gSharedAreaPtr, kSharedAreaSize);
gSharedIOTable.physicalMapping = PoolAllocateResident(
(mapEntriesNeeded * sizeof (PhysicalAddress)), TRUE);
gSharedIOTable.mappingEntryCount = mapEntriesNeeded; /* As in Listing 10 */
if (gSharedIOTable.physicalMapping == NULL)
status = memFullErr;
else
status = PrepareMemoryForIO(&gSharedIOTable);
if (status == noErr)
status = CheckPhysicalMapping(&gSharedIOTable,
kSharedAreaSize);
return (status);
}
Preparing the shared area is fairly straightforward: your driver allocates a physical mapping table,
initializes an IOPreparationTable, and calls PrepareMemoryForIO. Listing 7 shows how to prepare a
shared area and Listing 8 shows several related utility routines. Because PrepareSharedArea allocates
memory for its physical mapping table, it must be called when your driver is initialized. Note that
GetLogicalPageSize, used in several routines, returns a systemwide constant value;
a production device driver would call it once, storing the value in a global variable.
Listing 8. PrepareMemoryForIO utilities
/* Return the number of PhysicalMappingTable entries that will be
needed to describe this memory area. */
ItemCount GetMapEntryCount(void *areaAddress,
ByteCount areaLength)
{
ByteCount normalizedLength;
UInt32 theArea;
theArea = (UInt32) areaAddress;
normalizedLength = PageBaseAddress(theArea + areaLength - 1)
- PageBaseAddress(theArea);
return (normalizedLength / GetLogicalPageSize() + 1); /* Count both end pages. */
}
/* Check that the entire area was prepared and that all physical
memory is contiguous. */
OSErr CheckPhysicalMapping(IOPreparationTable *ioTable,
ByteCount areaLength)
{
ItemCount i;
OSErr status;
if (areaLength != ioTable->lengthPrepared)
status = paramErr; /* Didn't prepare the entire area. */
else {
status = noErr;
for (i = 0; i < ioTable->mappingEntryCount - 1; i++) {
if (NextPageBaseAddress(ioTable->physicalMapping[i])
!= ioTable->physicalMapping[i + 1]) {
status = paramErr;
/* Area isn't physically contiguous. */
break;
}
}
}
return (status);
}
/* Return the start of the physical page that follows the page
containing this physical address. */
PhysicalAddress NextPageBaseAddress(PhysicalAddress theAddress)
{
UInt32 result;
result = PageBaseAddress
(((UInt32) theAddress) + GetLogicalPageSize());
return ((PhysicalAddress) result);
}
/* Return the start of the physical page containing this address. */
UInt32 PageBaseAddress(UInt32 theAddress)
{
return (theAddress & ~(GetLogicalPageSize() - 1));
}
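The page arithmetic in these utilities is easy to get wrong by one, so it's worth checking by hand. The sketch below counts the pages touched by a buffer, including both the first and the last page. It assumes a fixed 4096-byte page purely for illustration (a driver would use the value returned by GetLogicalPageSize), and the names PageBase and PagesSpanned are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch only; ask the system for the real value. */
enum { kAssumedPageSize = 4096 };

/* Start of the page containing this address. */
static uint32_t PageBase(uint32_t address)
{
    return address & ~((uint32_t) kAssumedPageSize - 1);
}

/* Number of pages touched by the bytes [address, address + length). */
static uint32_t PagesSpanned(uint32_t address, uint32_t length)
{
    uint32_t firstPage, lastPage;

    assert(length > 0);
    firstPage = PageBase(address);
    lastPage = PageBase(address + length - 1);
    /* The difference counts page boundaries crossed; add one so that
       the first page is counted as well. */
    return (lastPage - firstPage) / kAssumedPageSize + 1;
}
```

A one-byte buffer touches one page, a page-sized buffer that starts one byte past a page boundary touches two, and a two-page transfer that starts on the last byte of a page touches three. That last case is the worst-case alignment a driver has to size its mapping table for.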
To prepare a request-specific user area, your driver will initialize an IOPreparationTable with the
procedure shown in Listing 9. Since your driver can be called from an I/O completion routine, it
can't allocate a physical mapping table for each I/O request. Instead, your initialization procedure
will allocate a maximum-length mapping table.
To process an I/O request, the driver initializes the options and I/O range and then calls
PrepareMemoryForIO and, after I/O completion, CheckpointIO. How to prepare a single request is
shown in Listing 10. You call CheckpointIO to complete your use of the buffer in the interrupt
service routine, as shown later in Listing 11.
Listing 9. Initializing a request-specific IOPreparationTable
IOPreparationTable gRequestIOTable;
ItemCount gRequestMapEntries;
OSErr InitializeRequestIOTable(void)
{
OSErr status;
ByteCount mapTableSize;
/* Compute the worst-case number of map entries. */
gRequestMapEntries =
GetMapEntryCount((void *) (GetLogicalPageSize() - 1),
kDriverMaxTransferLength);
mapTableSize = (gRequestMapEntries * sizeof (PhysicalAddress));
gRequestIOTable.physicalMapping =
PoolAllocateResident(mapTableSize, TRUE);
status = (gRequestIOTable.physicalMapping != NULL)
? noErr : memFullErr;
return (status);
}
A production device driver must extend the algorithm in Listing 10 to handle two more complex
cases:
- Virtual memory is enabled. This being the normal case, the user area isn't
necessarily physically contiguous. If your hardware can handle this, you can
postprocess the physical mapping table into a scatter-gather table.
- The operating system has only a limited amount of permanently resident memory.
Even if your hardware can perform a single 500 MB I/O transfer, you won't want
to allocate that many physical mapping tables; you wouldn't get a significant
performance gain and you would make your driver unusable on smaller
configurations.
The solution to both of these problems is partial preparation. Your driver provides a physical
mapping table of reasonable size. PrepareMemoryForIO prepares as much as possible and your
driver uses the firstPrepared and lengthPrepared fields to navigate the physical mapping table. When
your driver has performed all I/O in a partial preparation, it calls PrepareMemoryForIO again to prepare
the next segment. So the overall, somewhat simplified, algorithm is as follows:
- Prepare the first area.
- Build scatter-gather tables and start up the device. When the device interrupts,
continue with the next step.
- When the device needs more data, have the interrupt service routine check the
state field in the IOPreparationTable. If the I/O is incomplete, send a software
interrupt to the driver's "restart I/O" task.
- Have the "restart I/O" task call PrepareMemoryForIO to prepare the next area
(this can cause virtual memory paging). If successful, continue with the second
step (building scatter-gather tables) to restart the device.
- When I/O completes, call CheckpointIO to release the kernel resources reserved
by PrepareMemoryForIO.
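The bookkeeping behind the steps above can be sketched without any system calls. In this illustration, PrepareNextSegment is a hypothetical stand-in for a repeated PrepareMemoryForIO call: it only models how firstPrepared and lengthPrepared advance through a transfer when the mapping table is smaller than the request. The 4096-byte page, the four-entry table, and all of the names are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

enum { kAssumedPageSize = 4096, kMapTableEntries = 4 };

typedef struct PartialTransfer {
    uint32_t base;            /* Logical start of the user buffer */
    uint32_t length;          /* Total bytes requested */
    uint32_t firstPrepared;   /* Offset of the current segment */
    uint32_t lengthPrepared;  /* Bytes covered by the current segment */
} PartialTransfer;

/* Hypothetical stand-in for one preparation call. It covers as many
   bytes as kMapTableEntries pages can describe, starting where the
   previous segment ended. Returns 1 if a segment was prepared, or 0 if
   the whole transfer has already been covered. */
static int PrepareNextSegment(PartialTransfer *transfer)
{
    uint32_t start, pageOffset, capacity, remaining;

    assert(transfer != 0);
    start = transfer->firstPrepared + transfer->lengthPrepared;
    if (start >= transfer->length)
        return 0;
    /* A segment that starts in mid-page wastes part of its first entry. */
    pageOffset = (transfer->base + start) % kAssumedPageSize;
    capacity = kMapTableEntries * kAssumedPageSize - pageOffset;
    remaining = transfer->length - start;
    transfer->firstPrepared = start;
    transfer->lengthPrepared = (remaining < capacity) ? remaining : capacity;
    return 1;
}

/* Walk an entire transfer and report how many segments it needed. */
static uint32_t CountSegments(uint32_t base, uint32_t length)
{
    PartialTransfer transfer = { base, length, 0, 0 };
    uint32_t segments = 0;

    while (PrepareNextSegment(&transfer)) {
        /* A real driver would build its scatter-gather table, start the
           device, and wait for the interrupt before looping. */
        segments++;
    }
    return segments;
}
```

With these assumed sizes, a page-aligned 16K transfer fits in one segment, while the same transfer starting 100 bytes into a page needs two. In a real driver the loop body runs across interrupts and software tasks rather than in a simple while loop.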
Listing 10. Using the request-specific IOPreparationTable
OSErr PrepareIORequest(AddressSpaceID addressSpaceID,
LogicalAddress userBufferPtr,
ByteCount userCount)
{
OSErr status;
ItemCount mapEntriesNeeded;
gRequestIOTable.options =
( kIOIsInput /* Device writes to memory. */
| kIOLogicalRanges /* Input is logical addresses. */
| kIOShareMappingTables ); /* Share tables with kernel. */
gRequestIOTable.addressSpace = addressSpaceID;
gRequestIOTable.firstPrepared = 0;
gRequestIOTable.logicalMapping = NULL; /* We don't want this. */
/* Store the user parameters in the IOPreparationTable. */
gRequestIOTable.rangeInfo.range.base = userBufferPtr;
gRequestIOTable.rangeInfo.range.length = userCount;
mapEntriesNeeded = GetMapEntryCount(userBufferPtr, userCount);
if (mapEntriesNeeded > gRequestMapEntries)
status = paramErr;
else {
gRequestIOTable.mappingEntryCount = mapEntriesNeeded;
status = PrepareMemoryForIO(&gRequestIOTable);
}
if (status == noErr)
status = CheckPhysicalMapping(&gRequestIOTable, userCount);
return (status);
}
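Listing 10 rejects any buffer that isn't physically contiguous. For hardware that supports scatter-gather transfers, the postprocessing step mentioned earlier might look something like the following sketch. It works on a plain array of page-aligned physical page addresses plus the buffer's offset within its first page, which is a simplification of the real physical mapping table. The ScatterGatherEntry layout, the BuildScatterGather name, and the 4096-byte page size are all assumptions; real hardware defines its own descriptor format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { kAssumedPageSize = 4096 };

/* Hypothetical descriptor: one physically contiguous run of bytes. */
typedef struct ScatterGatherEntry {
    uint32_t physicalAddress;
    uint32_t byteCount;
} ScatterGatherEntry;

/* Merge physically adjacent pages into runs. Returns the number of
   entries written, or 0 if sgTable is too small for the transfer. */
static size_t BuildScatterGather(const uint32_t *pageAddresses,
                                 size_t pageCount,
                                 uint32_t firstPageOffset,
                                 uint32_t totalLength,
                                 ScatterGatherEntry *sgTable,
                                 size_t sgCapacity)
{
    size_t i, used = 0;
    uint32_t remaining = totalLength;

    assert(firstPageOffset < kAssumedPageSize);
    for (i = 0; i < pageCount && remaining > 0; i++) {
        uint32_t offset = (i == 0) ? firstPageOffset : 0;
        uint32_t start = pageAddresses[i] + offset;
        uint32_t chunk = kAssumedPageSize - offset;

        if (chunk > remaining)
            chunk = remaining;
        if (used > 0 && sgTable[used - 1].physicalAddress
                        + sgTable[used - 1].byteCount == start) {
            sgTable[used - 1].byteCount += chunk;  /* Extend the run. */
        } else {
            if (used == sgCapacity)
                return 0;                          /* Table too small. */
            sgTable[used].physicalAddress = start;
            sgTable[used].byteCount = chunk;
            used++;
        }
        remaining -= chunk;
    }
    return used;
}
```

Given pages at 0x10000, 0x11000, and 0x30000 with the buffer starting 0x100 bytes into the first page, the first two pages merge into a single run and the third becomes a second entry. Merging keeps the descriptor count down when virtual memory happens to hand out adjacent pages.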
THE INTERRUPT SERVICE ROUTINE
When the hardware device completes a request, it interrupts the PowerPC processor. The operating
system kernel fields the interrupt and searches an interrupt service tree to find a function that's been
registered to handle that interrupt. A driver has established this function by calling
InstallInterruptFunctions, as was shown in Listing 5.
A driver's interrupt service routine is generally broken into two parts: a primary routine that handles
immediate operations and a secondary routine that completes the operation, releases any system
resources held by PrepareMemoryForIO, and calls IOCommandIsComplete. (Note that some drivers
will have no secondary routine.)
Secondary interrupt routines are serialized: they always run to completion before the system calls
them again. However, they don't block other devices from interrupting the system. This greatly
simplifies device driver design, as the secondary interrupt routine can manage the driver's internal
queues without the significant overhead that blocking all processor interrupts would require.
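The idea of deferring work to a serialized routine can be modeled with a small queue. The sketch below is a single-threaded illustration only: it is not the kernel's mechanism, it takes no precautions against real interrupts, and every name in it (DeferredQueue, QueueDeferred, RunDeferred, RecordCompletion) is hypothetical. What it does show is the property that matters to a driver writer: queued routines run later, one at a time, in the order they were queued.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*DeferredRoutine)(void *p1);

enum { kDeferredQueueSize = 8 };

/* A fixed-size ring of pending routines and their arguments. */
typedef struct DeferredQueue {
    DeferredRoutine routine[kDeferredQueueSize];
    void *p1[kDeferredQueueSize];
    size_t head;    /* Index of the oldest pending entry */
    size_t count;   /* Number of pending entries */
} DeferredQueue;

/* Called from the "primary" side: remember the work and return at
   once. Returns 1 on success or 0 if the queue is full. */
static int QueueDeferred(DeferredQueue *q, DeferredRoutine routine, void *p1)
{
    size_t tail;

    assert(q != NULL && routine != NULL);
    if (q->count == kDeferredQueueSize)
        return 0;
    tail = (q->head + q->count) % kDeferredQueueSize;
    q->routine[tail] = routine;
    q->p1[tail] = p1;
    q->count++;
    return 1;
}

/* Called later: run each pending routine to completion before starting
   the next. Returns the number of routines that ran. */
static size_t RunDeferred(DeferredQueue *q)
{
    size_t ran = 0;

    assert(q != NULL);
    while (q->count > 0) {
        DeferredRoutine routine = q->routine[q->head];
        void *p1 = q->p1[q->head];

        q->head = (q->head + 1) % kDeferredQueueSize;
        q->count--;
        routine(p1);
        ran++;
    }
    return ran;
}

/* Example routine: count one completed request. */
static void RecordCompletion(void *p1)
{
    int *completions = p1;
    (*completions)++;
}
```

Queuing RecordCompletion twice leaves the counter untouched until RunDeferred is called, at which point both routines run in order. Because only one deferred routine runs at a time, each can update shared driver state without locking against the others.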
Device drivers may need more complex processing than can be accomplished with primary and
secondary interrupt routines. For example, a CD-ROM driver needs to check for disk insertion
periodically. Also, all drivers need to handle virtual memory paging. To accomplish this, a driver can
create a software task -- an independent function that's scheduled at a time when all system services
are available. Interrupt service and timer completion routines can schedule software tasks when
necessary.
Listing 11 shows an extremely simplified interrupt service routine to familiarize you with this
organization. DriverInterruptServiceRoutine, the primary routine, stores the hardware completion
status and then queues a secondary interrupt routine to complete the operation. The secondary
interrupt routine completes the I/O request by checkpointing the memory that was prepared before
the transfer started. It then passes final completion status back to the operating system kernel.
Listing 11. A simplified interrupt service routine
InterruptSetMember DriverInterruptServiceRoutine(
InterruptSetMember interruptSetMember, /* Unused here */
void *refCon, /* Unused here */
UInt32 theInterruptCount) /* Unused here */
{
OSErr status;
UInt8 driverStatus;
/* Retrieve the operation status from the device. This is
fiction: a real device will be much more complex. */
driverStatus = gDeviceBaseAddress[kDeviceStatusRegister];
if (driverStatus == <device is not interrupting>)
return (kISRIsNotComplete);
if (driverStatus == kDeviceStatusOK)
status = noErr;
else
status = ioErr;
/* The operation is (presumably) complete. Queue a secondary
interrupt task that will release all memory and return the
final status to the caller. We'll ignore an error from
QueueSecondaryInterrupt. */
(void) QueueSecondaryInterrupt(
DriverSecondaryInterruptRoutine,
NULL, /* No exception handler */
(void *) status, /* Operation ioResult */
NULL); /* No p2 parameter */
return (kISRIsComplete);
}
OSStatus DriverSecondaryInterruptRoutine(
void *p1, /* Has ioResult value */
void *p2) /* Unused */
{
IOPreparationID ioPreparationID; /* Request I/O prep ID */
/* Copy operation-specific values (such as the number of bytes
transferred) into the caller's parameter block. */
gCurrentParmBlkPtr->ioActCount = <device-specific value>;
ioPreparationID = gRequestIOTable.preparationID;
if (ioPreparationID != kInvalidID) {
gRequestIOTable.preparationID = kInvalidID;
(void) CheckpointIO(ioPreparationID, kNilOptions);
}
/* IOCommandIsComplete is the only function that should set the
ioResult field. */
IOCommandIsComplete(gIOCommandID, (OSErr) p1);
return (noErr);
}
This sample doesn't use the interrupt set member number, the refCon, or the interrupt count, which
are needed for interrupt service routines that handle several devices (for example, in the case of a
hardware device that controls several serial lines). Also, to simplify this sample, I'm presuming that all
information is stored in driver globals. A better organization would make use of a "per-request" data
structure that encapsulates all information needed for a single user I/O request (such as PBRead); this
greatly simplifies the driver organization when you want to extend the driver to support multiple
simultaneous requests (concurrent I/O).
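One possible shape for such a per-request structure is sketched below. The field names, the pool size, and the AllocateRequest and ReleaseRequest helpers are all hypothetical, and plain integer types stand in for the system's IOCommandID and IOPreparationID. The records come from a fixed pool because, as noted earlier, a driver that can be called from an I/O completion routine can't allocate memory for each request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Everything the driver needs to finish one user I/O request. */
typedef struct DriverRequest {
    uint32_t commandID;       /* Stand-in for the request's IOCommandID */
    void *parameterBlock;     /* Caller's parameter block */
    uint32_t preparationID;   /* Stand-in for an IOPreparationID */
    int16_t status;           /* Final result, set at completion */
    int inUse;                /* Nonzero while the record is claimed */
} DriverRequest;

enum { kMaxConcurrentRequests = 4 };

/* Fixed pool, reserved once when the driver is initialized. */
static DriverRequest gRequests[kMaxConcurrentRequests];

/* Claim a free record, or return NULL if all are busy. */
static DriverRequest *AllocateRequest(uint32_t commandID, void *parameterBlock)
{
    size_t i;

    for (i = 0; i < kMaxConcurrentRequests; i++) {
        if (!gRequests[i].inUse) {
            gRequests[i].inUse = 1;
            gRequests[i].commandID = commandID;
            gRequests[i].parameterBlock = parameterBlock;
            gRequests[i].preparationID = 0;
            gRequests[i].status = 0;
            return &gRequests[i];
        }
    }
    return NULL;
}

/* Return a record to the pool once the request has completed. */
static void ReleaseRequest(DriverRequest *request)
{
    assert(request != NULL && request->inUse);
    request->inUse = 0;
}
```

With this organization, the interrupt routines receive a pointer to the request they're finishing (for example, through the p1 parameter) instead of reading driver globals, so two requests in flight never share state by accident.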
JUST THE TIP OF THE ICEBERG
There's a lot of material here -- and a lot more that I haven't discussed. Still, this should give you a
good overview of the new driver services and how they work together. While this may be
overwhelming if you've never written a device driver before, those of you who have (for any
operating system) will be happy to note how much isn't here: no assembly language, no dependencies
on the strange quirks of the Mac OS, and all hardware dependencies either hidden from you or
limited to your device's specific needs.
METHODS OF I/O ORGANIZATION
Memory-mapped I/O and I/O cycle operations represent two ways of designing a computer architecture.
Using memory-mapped I/O, device hardware responds to normal memory operations in a particular range
of addresses. For example, PDP-11 computers without memory management hardware reserved 8K for
peripheral hardware registers, limiting the memory available to programs to 56K.
I/O cycle operations effectively place external devices in an independent address space. This gives
programs additional memory but requires special instructions to access peripheral devices. The Intel 80x86
series uses this organization.
To the programmer, memory-mapped I/O has the advantage of allowing direct device operations without
special instructions, making it relatively easy to write device drivers in high-level languages. As bus widths and
memory size limitations have eased, the inability to use part of the address space for programs has become
less of an issue.
Apple's PCI-based machines use only memory-mapped I/O. However, the bus interface hardware generates
PCI I/O cycles for a subset of the physical address space.
REFERENCES
- Designing PCI Cards and Drivers for Power Macintosh Computers will be available from APDA in mid-June.
- IEEE document 1275 -- 1994 Standard for Boot (Initialization, Configuration) Firmware (Part number
DS02683, available from IEEE Standards Department, P.O. Box 1331, Piscataway, NJ 08855).
- Inside Macintosh: PowerPC System Software (Addison-Wesley, 1994), Chapter 3, "Code Fragment
Manager."
MARTIN MINOW recently sneaked away to England from his job at Apple for a (too) brief vacation. The high point was at
the Kew Bridge Steam Museum outside of London, where he stood inside the oldest, or perhaps the largest, working steam
engine in the world. The four-story-high, 50-foot-long engine was used to pump water from the Thames for more than 100
years and is now the centerpiece of a large collection of working steam engines. And speaking of working, Martin's been
doing too much of it and already needs another vacation. *
Thanks to our technical reviewers Jano Banks, Holly Knight, Wayne Meretsky, Tom Saulpaugh, and George Towner. *