Resource Template
Volume Number: 16 (2000)
Issue Number: 3
Column Tag: Programming
Not Your Father’s Resource Template
by Neil Mayhew, Calgary, AB
A C++ template for handling resources in a type-safe way
To C or Not to C...
It's very hard to find a real C compiler these days. All the ones I know of are really C++ compilers operating in backwards-compatibility mode. So I find it surprising that so much code is still being written in C rather than C++. Since C++ is, for almost all practical purposes, a superset of C, there seems to be no reason for not taking advantage of at least some of the features of C++ that make the professional programmer's life so much easier - even just the simple ability to declare variables at their point of use rather than at the head of the current block or function.
Many programmers of course do use C++, although I see a lot of Macintosh code that still looks a lot like plain C - ranging from simple things such as using preprocessor macros instead of const literals or inline functions, to using global or static variables instead of a fully-encapsulated solution using objects.
Why should this be so? As with many things, there are probably a number of reasons. For example:
- Most programmers find it hard to justify the time to learn something new when they are functioning very productively with the skills they already have.
- Many of the standard C++ library functions do not fit well with the Mac OS APIs - by using C strings and pathnames for files, for example - and so it is easier just to stay with the C-style of programming adopted by Apple's Universal Headers.
- C++ is considered to be synonymous with object-oriented programming, and many people feel that the run-time overheads associated with this style of programming are too high in their situation.
- It is thought that using C++ on Mac OS requires the use of an application framework such as MacApp or PowerPlant, and since the program under development is not an application then C++ is not appropriate.
I plan to show how these and other reservations about C++ are actually misconceptions. Not everyone believes all of these things, of course, but for those still with doubts, or those who would welcome the chance to discover more about some of the power-features of C++ applied in a Mac OS context, read on.
Be Careful With That Chainsaw...
Someone has said that compared with other languages, C is like juggling with knives. Someone else has said that compared with C, C++ is like juggling with chainsaws - awesome power, but rather nasty when mishandled. Actually, I think that C++ has the capacity to be much safer than C. The increased ability to hide or localize dangerous operations in a well-tested submodule frees the programmer to concentrate on the logic of the application rather than avoiding all the usual "gotchas". What's more, courtesy of inline functions and stack-based objects this ability can come at zero runtime cost - as you will see.
Many people are content to leave this work in the hands of the providers of application frameworks and class libraries. However, the real gains come when you develop data abstractions that are specific to the code in hand. This in turn requires an understanding of features such as template classes and smart pointers, which many people have not yet fully grasped and made their own. I hope that by the end of this series of articles you will feel that you have.
For those new to C++, I will now define some of the C++ jargon that is used throughout this article, but experienced readers may want to skip the rest of this section.
Data abstraction is the technique of creating an 'idealized' interface to a body of data that is independent of its actual representation. As far as possible, the abstraction should present the data as a single, easy-to-describe concept. For example, the abstraction of a stack could be an object with push and pop operations, rather than an array and a next pointer.
Overloading is the practice of assigning multiple meanings to the same identifier or symbol. In C++ this usually means defining several different functions with the same name but different sets of argument types, and allowing the compiler to distinguish which one should be called according to the types of arguments supplied. It can also mean that new versions of built-in operators (like +) are being defined, that take user-defined types as arguments. Note that overloading is not usually the same as overriding, since the latter refers exclusively to the redefinition of a superclass method in a subclass. A single class may overload several versions of one of its own method names, and overloaded operators are quite often not methods at all but global functions.
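As a small illustration of both kinds of overloading, here is a minimal sketch. The names Money, twice and demoOverloading are invented for this example and do not come from the article's source code.

```cpp
#include <cassert>

// Hypothetical value type, used only for this illustration.
struct Money
{
    long cents;
};

// Two functions share one name; the argument type selects between them.
inline int    twice(int x)    { return 2 * x; }
inline double twice(double x) { return 2.0 * x; }

// An overloaded operator written as a global function rather than a method.
inline Money operator + (Money a, Money b)
{
    Money sum = { a.cents + b.cents };
    return sum;
}

// Exercise both kinds of overloading.
inline void demoOverloading()
{
    assert(twice(3) == 6);         // resolves to twice(int)
    assert(twice(1.5) == 3.0);     // resolves to twice(double)

    Money a = { 150 };
    Money b = { 275 };
    assert((a + b).cents == 425);  // resolves to operator +(Money, Money)
}
```

Nothing here is overridden: there is no class hierarchy at all, just several meanings attached to one name and one operator.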
An inline function is defined just like any other function, and obeys all the same rules, but (usually at the compiler's discretion) its code is generated wherever the function is called, rather than just once where it is defined. In this way it is rather like a macro, except that all the rules for overloading, argument type-conversion and so on apply. It also avoids the typical macro problem of multiple-evaluation of arguments, just as with a 'real' function call. What's more, the compiler is free to optimize the generated code in any way that is consistent with the semantics of the function call, which can often bring huge gains in efficiency over a physical function call. Inline functions can reduce the cost of data abstraction to zero - yielding solutions that are as fast and compact as their unabstracted counterpart.
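The multiple-evaluation problem is easy to demonstrate. In the following sketch (SQUARE_MACRO, square, nextValue and gCalls are all names made up for the purpose), a helper counts how often it is called when passed to a macro and to an inline function.

```cpp
#include <cassert>

// Classic macro: the argument text is pasted in twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Inline function: the argument is evaluated exactly once.
inline int square(int x) { return x * x; }

// Hypothetical helper that records how many times it has been called.
int gCalls = 0;
inline int nextValue() { ++gCalls; return 3; }

inline void demoMultipleEvaluation()
{
    gCalls = 0;
    int viaMacro = SQUARE_MACRO(nextValue());   // expands to two calls
    assert(viaMacro == 9 && gCalls == 2);

    gCalls = 0;
    int viaInline = square(nextValue());        // a single call
    assert(viaInline == 9 && gCalls == 1);
}
```

Both versions compute the same result here only because nextValue always returns 3; with a helper that returned a different value each time, the macro would silently give the wrong answer.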
A stack-based object is one that is designed to be allocated just like any other local variable, rather than by a heap allocation via new. The space-allocation overhead is effectively zero (both in time and heap space) and the only execution overhead is in the constructor- and destructor-calls for the object. Very often, these calls represent code that would need to be executed anyway, for example to initialize a data structure or free some memory, and the compiler takes care of calling this code at its proper time.
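Here is a minimal sketch of the idea, assuming a made-up Lockable type that stands in for anything that must be locked and unlocked in pairs (on Mac OS one might wrap HLock and HUnlock this way). StLock and lockCountInsideScope are likewise invented for the example.

```cpp
#include <cassert>

// Stand-in for something that must be locked and unlocked in pairs.
struct Lockable
{
    int lockCount;
    Lockable() : lockCount(0) {}
};

// Stack-based guard: the constructor locks, the destructor unlocks.
class StLock
{
    Lockable& target;
    StLock(const StLock&);               // copying a guard makes no sense,
    StLock& operator = (const StLock&);  // so it is declared but never defined
public:
    explicit StLock(Lockable& t) : target(t) { ++target.lockCount; }
    ~StLock()
    {
        assert(target.lockCount > 0);
        --target.lockCount;
    }
};

inline int lockCountInsideScope(Lockable& item)
{
    StLock guard(item);       // locked from here...
    return item.lockCount;    // ...and unlocked automatically on the way out
}
```

The guard costs no heap allocation, and the unlock happens on every exit path from the function without the programmer having to remember it.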
A template class is a generic definition of an infinite family of related classes, in which each family member is associated with a different auxiliary data type. This association is made by substituting the specific data type for a formal argument of the template definition. For example, a stack template would define the behavior of a family of stack classes, and each family member (called an instantiation of the template) would be a stack class storing some particular type (char, double, Person*, etc.). Template arguments are specified between wedges <>, and subsequently within the template definition they can be used just like a typedef name. The methods of a template class are usually inline, although they don't have to be.
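The stack example might be sketched as follows. This is only an illustration, with an arbitrary fixed capacity and invented names (Stack, CharStack, DoubleStack), but it shows one definition serving several element types, and it is also a small case of data abstraction: callers see push and pop, not the array and the counter.

```cpp
#include <cassert>

// A fixed-capacity stack; Type is the element type being stored.
template<class Type>
class Stack
{
    enum { kCapacity = 16 };   // arbitrary limit for this sketch
    Type items[kCapacity];
    int count;
public:
    Stack() : count(0) {}
    bool empty() const { return count == 0; }
    void push(const Type& t)
    {
        assert(count < kCapacity);
        items[count++] = t;
    }
    Type pop()
    {
        assert(count > 0);
        return items[--count];
    }
};

// Two instantiations of the same template.
typedef Stack<char>   CharStack;
typedef Stack<double> DoubleStack;
```

CharStack and DoubleStack are unrelated types as far as the compiler is concerned, so pushing a double onto a CharStack is subject to the usual conversion rules rather than silently sharing storage.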
A smart pointer is an object that has the syntax and semantics of a pointer, through overloading the * and -> operators for that class. Hopefully it will also have behavior that is useful enough to be considered smart! For example, a smart pointer class can be defined that automatically maintains a reference count on the objects being pointed to. Smart pointers are almost always stack-based objects, or data members of some other object. They are designed to be passed around by value rather than by reference (otherwise the reference counting would not work, for example). A native C++ pointer is often contrastingly referred to as a raw pointer.
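A reference-counting smart pointer can be sketched in a few lines. The version below is deliberately simplistic and assumes the pointed-to object carries its own count; Counted, CountedPtr and refsSeenByCallee are names invented for this illustration.

```cpp
#include <cassert>

// Hypothetical pointee that carries its own reference count.
struct Counted
{
    int refs;
    int value;
    explicit Counted(int v) : refs(0), value(v) {}
};

// Smart pointer that keeps Counted::refs up to date as copies come and go.
class CountedPtr
{
    Counted* p;
    void acquire() { if (p) ++p->refs; }
    void release() { if (p && --p->refs == 0) delete p; }
public:
    explicit CountedPtr(Counted* raw = 0) : p(raw) { acquire(); }
    CountedPtr(const CountedPtr& other) : p(other.p) { acquire(); }
    CountedPtr& operator = (const CountedPtr& other)
    {
        if (p != other.p) { release(); p = other.p; acquire(); }
        return *this;
    }
    ~CountedPtr() { release(); }

    // The overloads that make this class look like a pointer.
    Counted* operator -> () const { assert(p); return p; }
    Counted& operator *  () const { assert(p); return *p; }
};

// Passing by value copies the smart pointer, so the count rises for the call.
inline int refsSeenByCallee(CountedPtr copy)
{
    return copy->refs;
}
```

Because the count is adjusted in the copy constructor, assignment operator and destructor, passing a CountedPtr by value is exactly what keeps the bookkeeping correct.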
Cast Your Cares Away...
Casts are a fact of life in Mac OS programming. Although they are messy and error-prone, they can't be avoided. For example, GetResource returns a Handle, but this must be cast to whatever structure represents the data before using it. However, a mismatch between the actual contents of the resource and the cast that is used is usually disastrous. The sophisticated type-matching features of C/C++ are designed to help you avoid this kind of thing, but a cast circumvents these checks completely. Whilst good use of C++ cannot remove the need for a cast somewhere in your code, it can at least confine it to one carefully-chosen place. The result, as you will see, is a much clearer and neater program, as well as a safer one.
As an example of using resources, we will consider a code fragment that reads and writes high-scores from a preference file (very loosely based on July's Getting Started column). The first thing is to define a structure that represents the layout of the data in the resource:
Listing 1.1: Resource structure
Score
struct Score
{
    UInt32  score;  // What the player scored
    Boolean used;   // Whether this entry is in use
    Str255  name;   // The player's name (variable-length)
};
The code to read and display the high-score list might look as follows:
Listing 1.2: Traditional approach
ShowScores
const ResType kScoreType = 'Scor';
void ShowScores()
{
    int n = Count1Resources(kScoreType);
    for (int i = 1; i <= n; ++i)
    {
        Score** s = (Score**)Get1IndResource(kScoreType, i);
        if (!s)
            break;
        if ((**s).used)
        {
            // Append (**s).score and (**s).name to score window...
        }
        ReleaseResource((Handle)s);
    }
}
Good practice has been followed in using a symbolic definition for the resource type (although I have seen plenty of code that does not). However, note that:
- two casts are used (ugly and error-prone)
- there is no way to ensure that the cast and the ResType match (in the call to Get1IndResource)
- the resource-type symbol is used in two different places (an opportunity for mistakes if the name is ever changed)
- the cumbersome (**). syntax is needed for accessing the data.
How could we make use of C++ to overcome these deficiencies? Bear with me for a moment and take a look at one possible end result:
Listing 1.3: Alternative approach
ShowScores
typedef ResHandle<Score, 'Scor'> ScoreResource;
void ShowScores()
{
    ScoreResource s;
    int n = s.Count1();
    for (int i = 1; i <= n; ++i)
    {
        s.Get1Ind(i);
        if (!s)
            break;
        if (s->used)
        {
            // Append s->score and s->name to score window...
        }
        s.Release();
    }
}
The symbolic definition of the ResType constant has been replaced with a typedef representing the instantiation of a template (I'll explain this more fully in a minute), and the Resource Manager APIs have been replaced with methods. In contrast with the previous list of deficiencies:
- no casts are visible
- there is no possibility of mismatch between ResType and pointer type, as the relevant information is locked together in the typedef
- the ResType does not need to appear anywhere except in the typedef
- the -> operator is used in place of (**).
Now a word about the typedef. A template class called ResHandle is defined in a header file, and is instantiated (parameterized) with two pieces of information. One is the type of the data that the handle will refer to, and the other is the ResType constant that is associated with the data. Not many people realize that template parameters can be constant values as well as type names, but this is a very powerful feature of C++ that we will return to later. Finally, this instantiation of ResHandle is given the symbolic name ScoreResource.
You may also have realized by now that ResHandle is in fact a 'smart pointer'. It overloads the * and -> operators to perform the double-dereference for you (just as MPW C++ used to, way back). Some people would no doubt dismiss this as 'syntactic sugar' but I am in favor of anything that makes the code clearer and simpler. Of course, in one sense it makes the code more obscure, because it hides the fact that a handle is involved at all, but I think this is a price worth paying, especially if the ResHandle template is used universally throughout an application. Note that both . and -> are used on the ScoreResource object: the former calls methods that affect or use the value of the handle, and the latter allows one to access the object that is pointed to by the handle. This duality is very common with smart pointers, and it is important to understand the two different meanings clearly.
Under The Hood...
We can now take a look at the 'wizardry' behind the ResHandle template, although I hope you'll agree that in good C++, as in mathematics, everything looks very simple once you have chosen the right definitions to use.
The original designers of the Mac OS came from an object-oriented background, and this is reflected in the fact that many of the Mac OS APIs are object-oriented in concept if not in syntax. For example, a resource handle "is a kind of" handle. All the calls that can be performed on a handle can be performed on a resource handle, and resource handles add a few extra calls of their own that can't be performed on a regular handle. So it would be nice if any C++ treatment of resource handles could reflect this inheritance relationship. This would have the added benefit of allowing compile-time detection of passing the wrong type of handle to an API - vastly preferable to discovering it at run-time, or even not discovering it at all.
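The compile-time benefit can be seen in miniature with two empty classes. This is only a sketch of the principle, using invented names (PlainHandle, ResourceHandle, useAnyHandle, useResourceOnly) rather than the real templates.

```cpp
#include <cassert>

// A resource handle "is a kind of" handle.
struct PlainHandle {};
struct ResourceHandle : PlainHandle {};

// Accepts any handle, including resource handles.
inline int useAnyHandle(const PlainHandle&) { return 1; }

// Accepts resource handles only.
inline int useResourceOnly(const ResourceHandle&) { return 2; }

inline void demoHandleHierarchy()
{
    PlainHandle plain;
    ResourceHandle resource;

    assert(useAnyHandle(plain) == 1);
    assert(useAnyHandle(resource) == 1);      // derived converts to base
    assert(useResourceOnly(resource) == 2);

    // useResourceOnly(plain);   // would not compile: base does not convert to derived
}
```

The commented-out call is the interesting one: the mistake of passing an ordinary handle where a resource handle is required is rejected by the compiler instead of surfacing at run-time.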
So, the definition of ResHandle actually inherits from a more basic definition of MemHandle. We'll take a look at that first (Listing 2.1) and then proceed to ResHandle after that (Listing 2.2).
Listing 2.1: MemHandle template class
MemHandle
A smart-pointer template that encapsulates a regular Mac OS memory handle.
template<class Type>
class MemHandle
{
protected:
    Handle h;
    // Basic casts - up/down refers to the 'class hierarchy'
    static Handle upCast(Type** t)
    {
        return reinterpret_cast<Handle>(t);
    }
    static Type** downCast(Handle h)
    {
        return reinterpret_cast<Type**>(h);
    }
public:
    // Constructor - initialize the handle to zero
    MemHandle() : h(0) {}
    // Using compiler's copy constructor and assignment operator
    // Using compiler's destructor
    // Dereferencing operators
    Type* operator -> () { return *downCast(h); }
    Type& operator * () { return **downCast(h); }
    // Conversion operators
    operator Handle() { return h; }
    operator Type**() { return downCast(h); }
    operator Type* () { return *downCast(h); }
    // Status operators
    bool operator ! () { return h == 0; }
    operator bool () { return h != 0; }
    // Mac OS APIs
    bool Allocate(Size n = sizeof(Type))
    {
        h = NewHandle(n);
        return MemError() == noErr;
    }
    bool Resize(Size n)
    {
        SetHandleSize(h, n);
        return MemError() == noErr;
    }
    bool Dispose()
    {
        DisposeHandle(h);
        h = 0;
        return MemError() == noErr;
    }
    // etc. etc.
    OSErr Error()
    {
        return MemError();
    }
};
MemHandle explained
This template contains all the basic machinery needed to manipulate handles, although for simplicity of illustration quite a few methods have been left out (download the source code to see a slightly fuller implementation). It is defined as a template so that it can be used to represent any type of data that is held in a handle, acting much like a raw pointer. The compiler is then able to enforce proper type consistency throughout the program.
The template parameter, Type, is the data type of the contents of the handle. In our example, this is a Score. When used as a pointer, with -> or *, a MemHandle has the syntax and semantics of a Type*.
As far as data is concerned, MemHandle is just a wrapper for a raw Mac OS memory handle. A MemHandle, when allocated on the stack, takes up just the same amount of space. The data member, h, is declared protected so that subclasses (such as ResHandle) can easily access the raw handle. It could have been defined as a Type**, but the API-methods (of which there would be many in a full implementation) are greatly simplified if it is just a Handle.
All the casting is performed by two static, inline methods, upCast and downCast. The naming convention is based on the idea that a cast from a subclass to a superclass goes up the class hierarchy, and vice versa. Although a Handle is not actually a superclass of a Type**, the concept is the same. Internally, these two methods use reinterpret_cast rather than the traditional C cast. This is now preferred in C++, as a way to distinguish the code's intention from other uses of casting (such as static_cast, dynamic_cast and const_cast). By wrapping the casting inside these methods, all the other methods are made considerably cleaner and easier to read.
MemHandle defines a single constructor with no arguments, and allows the compiler to supply a standard copy-constructor (used for making a new MemHandle from another) and assignment operator (used for overwriting one MemHandle with another). Since the only legitimate way to create a non-nil Handle is by calling NewHandle or by copying another one, it makes sense to have these as the only constructors for MemHandle. If you want to allow copying of raw handles into smart ones, it's probably best to define a method (or explicit constructor) that takes a Type**, to avoid accidents. It's not wise to define a destructor that calls DisposeHandle because in the absence of reference-counting there is no way to be sure that another copy of the handle isn't still in use somewhere.
Whether or not you define them yourself, you should always give thought to "the fundamental four" whenever you design a new class: default constructor (no arguments, or arguments with default values), copy constructor, assignment operator and destructor. It's also a good idea to consider how you would define a less-than operator, especially if you ever want to store your object in an STL container. If the copy constructor or the assignment operator does not make sense for your object, then make it inaccessible by declaring it private and never implementing it (to stop the compiler implementing it for you). That way you get an error either at compile-time or at link time, but never at run-time.
The dereferencing operators are just simple calls to one of the casting functions, although the existence of these methods is the key to making our template be a smart pointer. The purpose of each is fairly self-evident, and the reason for the different return types is purely conventional.
Conversion operators allow one to supply a MemHandle whenever one of these other types is expected (such as in a parameter to a function or API), and the compiler will call the appropriate conversion code. Note, though, that because all the methods are inline, everything is resolved at compile-time and in the case of type-conversion no additional code will actually be generated.
The status operators simply provide a convenient way of checking whether the handle is currently nil or not. Statements such as if (mh) ... and if (!mh) ... will work as expected. Note that this is not the same as saying if (mh == 0) ... or if (mh != 0) ...
The rest of the class is API methods, and I have included just a few of these for illustration. Note that there needs to be a way to specify a size using Allocate and Resize since Type may actually have variable length (as in our Score example).
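For readers who would like to experiment with these ideas away from a Mac OS toolchain, here is a cut-down sketch that builds with any C++ compiler. Everything in it is an assumption made for the illustration: Handle is re-declared locally, FakeNewHandle and FakeDisposeHandle are malloc-based stand-ins for the real Memory Manager calls, Score is reduced to two fields, and MiniHandle is a much smaller cousin of MemHandle. The cast is also arranged slightly differently from Listing 2.1: the master pointer is read as a char* and then converted.

```cpp
#include <cassert>
#include <cstdlib>

// Local stand-in for the Mac OS Handle type.
typedef char** Handle;

// Hypothetical replacements for NewHandle and DisposeHandle.
inline Handle FakeNewHandle(std::size_t n)
{
    Handle h = static_cast<Handle>(std::malloc(sizeof(char*)));
    if (h == 0)
        return 0;
    *h = static_cast<char*>(std::calloc(1, n));   // zero-filled block
    if (*h == 0)
    {
        std::free(h);
        return 0;
    }
    return h;
}
inline void FakeDisposeHandle(Handle h)
{
    if (h) { std::free(*h); std::free(h); }
}

// Simplified version of the article's Score structure.
struct Score
{
    unsigned long score;
    bool used;
};

template<class Type>
class MiniHandle
{
    Handle h;   // the only data member
public:
    MiniHandle() : h(0) {}

    // Methods called with '.' act on the handle itself.
    bool Allocate() { h = FakeNewHandle(sizeof(Type)); return h != 0; }
    void Dispose()  { FakeDisposeHandle(h); h = 0; }

    // Status operators.
    bool operator ! () const { return h == 0; }
    operator bool () const   { return h != 0; }

    // '->' reaches through both levels of indirection to the data.
    Type* operator -> () const
    {
        assert(h != 0);
        return reinterpret_cast<Type*>(*h);
    }
};
```

A typical use would be to declare a MiniHandle<Score>, call Allocate on it with the dot syntax, and then fill in the fields with the arrow syntax. On the compilers I would expect most readers to use, sizeof(MiniHandle<Score>) comes out the same as sizeof(Handle), which is the "no extra space" property described above.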
Listing 2.2: ResHandle template class
ResHandle
A smart-pointer template that encapsulates a Mac OS resource handle.
template<class Type, ResType kResType>
class ResHandle : public MemHandle<Type>
{
public:
    // Constructors, etc. - using defaults
    // Note: h is inherited from a template base class, so
    // standard-conforming compilers require it to be written this->h
    // Mac OS APIs
    bool Get(int id, ResType type = kResType)
    {
        this->h = GetResource(type, id);
        return ResError() == noErr;
    }
    bool Get1(int id, ResType type = kResType)
    {
        this->h = Get1Resource(type, id);
        return ResError() == noErr;
    }
    bool GetInd(int i, ResType type = kResType)
    {
        this->h = GetIndResource(type, i);
        return ResError() == noErr;
    }
    bool Get1Ind(int i, ResType type = kResType)
    {
        this->h = Get1IndResource(type, i);
        return ResError() == noErr;
    }
    static short Count(ResType type = kResType)
    {
        return CountResources(type);
    }
    static short Count1(ResType type = kResType)
    {
        return Count1Resources(type);
    }
    void Changed()
    {
        ChangedResource(this->h);
    }
    void Release()
    {
        ReleaseResource(this->h);
        this->h = 0;
    }
    // etc. etc.
    static OSErr Error()
    {
        return ResError();
    }
};
ResHandle explained
Apart from one subtlety in the template parameters, ResHandle just adds some extra API methods. All the basic functionality, and the tricky operator overloading, is inherited from MemHandle.
ResHandle has two template arguments. The first is simply passed on down to MemHandle, and is used to define the data type of the contents of the handle. The second, however, is not allowed to be a class or type name at all. It has to be a value of type ResType, and a constant one at that. As we have seen from the Score example, this would normally be a literal four-char code supplied as part of a typedef:
typedef ResHandle<Score, 'Scor'> ScoreResource;
So this constant value actually becomes a part of the datatype defined by the typedef. This may seem a little hard to grasp at first, but a couple of other examples may make things clearer. In traditional C, it is possible to write:
typedef char Acronym[4];
Acronym* myList;
Then the compiler knows that Acronym means "array of four characters," and when we write myList++ the compiler knows to increment the binary value by 4. So the constant value 4 becomes a part of the type definition.
C++ has extended this concept, mostly to allow generic implementation of fixed-size arrays using a template (useful for Pascal strings). So in C++ we could write:
template<int nchars> struct VariableAcronym
{
    char data[nchars];
    ...
};
typedef VariableAcronym<4> Acronym;
Of course, the constant-template-parameter mechanism can be used for any purpose we like, and in this case it comes in extremely handy as a way of recording the ResType value that belongs with a particular type of resource handle. By making it a template parameter, we never have to store it at runtime along with the handle, because everywhere it is needed we can supply it as a compile-time constant (kResType). Of course, the value of kResType will be different for each instantiation of the template.
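The claim that nothing is stored at runtime can be checked with a small sketch. The names here (FourCharCode, Tagged, ScoreTag, StrTag) are invented for the illustration, and the four-character codes are spelled out in hexadecimal so that the example does not depend on how a particular compiler treats multi-character literals.

```cpp
#include <cassert>

// Stand-in for the Mac OS ResType.
typedef unsigned long FourCharCode;

// The code is a template parameter, so no instance ever stores it.
template<FourCharCode kCode>
struct Tagged
{
    void* ref;                                     // the only data member
    static FourCharCode Code() { return kCode; }   // supplied at compile time
};

typedef Tagged<0x53636F72UL> ScoreTag;   // 'Scor'
typedef Tagged<0x53545220UL> StrTag;     // 'STR '

inline void demoTags()
{
    // Each instantiation reports its own constant...
    assert(ScoreTag::Code() == 0x53636F72UL);
    assert(StrTag::Code()   == 0x53545220UL);

    // ...yet an object is no bigger than the pointer it wraps
    // (true on typical compilers; the language does not strictly promise it).
    assert(sizeof(ScoreTag) == sizeof(void*));
}
```

ScoreTag and StrTag are also distinct types, so the compiler will not let one be assigned to the other, which is the same protection that ResHandle gives to resource handles of different types.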
A different value for the resource type can however be provided at runtime by supplying an optional second argument to the Get methods. This is to allow for resource types that sometimes masquerade as a different ResType (owner resources, for example, which are really 'STR 's). To get the owner resource, you would declare a variable of type StrResource and pass the application's signature to the Get call.
The Count methods have been defined static, because they don't act on any particular handle. They do, however, fill in kResType for us, which is handy. In the Score example that we looked at earlier, the declaration of the ScoreResource handle was moved up out of the loop so that we could use it to count the number of score resources. We could, of course, call ScoreResource::Count1 directly, but that would be more verbose and more error-prone as the name ScoreResource would have to appear twice, once in the method call and once in the handle declaration.
The Final Score...
So how does our solution stack up against the list of problems given at the start of this article?
- Taking time to learn something new: I hope you'll agree that the techniques I have outlined do bring real benefits to our code. I hope too you'll have found that learning something new didn't take up all that much of your valuable time.
- Living with the Mac OS APIs: the C++ that we wrote gets along just fine with the APIs, and makes them a lot easier to work with into the bargain.
- Run-time overheads: using our ResHandle template uses no more memory, generates no more code, and executes just as quickly as our traditional solution. Actually, the constructor does redundantly initialize the handle to zero once, at the start of the function, but the compiler's optimizations remove this (as shown by a disassembly).
- Application frameworks: we relied on nothing more than the Universal Headers, and yet managed to write some fairly effective C++. The low run-time overhead means that this code could be used without problem in a code resource or system extension (an entire example application is 2200 bytes code and 1800 bytes data when built for PPC). Equally, this code would fit quite comfortably into any framework-based application too.
Incidentally, PowerPlant and MacApp both have suites of utility classes that can be used independently of the rest of the framework, and you should check these out before rushing ahead to reinvent too many more wheels. However, it is hard to find a generic solution that fits everyone's needs, so there is still great benefit in being able to develop your own utility classes, or customize those written by others.
The code presented here has necessarily been rather short and simple, of course. It is intended only as an illustration of the kinds of things that become possible when the full power of C++ is brought to bear on the task of programming for the Mac OS. In a future article, I'll present a lightweight ostream-style class that is based around Pascal strings. As well as being invaluable for generating text messages that need to appear in the UI, it provides further illustration of the advanced capabilities of C++.
Neil Mayhew works for Wycliffe Bible Translators, a non-profit organization dedicated to translating the Bible for the world's 400 million people that do not have it in their own language. Neil started programming in C in 1983, and graduated to the Mac in 1989. When he's not at his Mac or trying to beat his kids at video games you might find him flying a stunt kite if it's windy, or throwing a boomerang if it's not.