November 93 - Booch Components
Booch Components
Scott L. Taylor
Many of you may have heard that Bedrock is going to use a subset of the Booch Components and are wondering what the Booch Components have to offer. This article gives an overview of what the Booch Components are, compares them to other frameworks and discusses some of the issues associated with integrating and using them with existing frameworks.
The Booch Components are a robust and complete structures and tools class library that provide fundamental data structures and algorithms used in large scale application development. The Booch Components represent a specialized framework in terms of functionality, but at the same time they are a very generalized and flexible framework. The developer can override any of the components and use the mixin inheritance design aspects of the library to build completely new components with new functionality that are tuned for any particular requirement. The Booch Components package consists of 300 files with almost 300 template based classes and some 3000 methods.
The Booch Components take up where MacApp left off. They are a platform independent reusable collection of C++ classes published by Rational and have been designed from the start with performance in mind. They are a second generation class library that take advantage of many of the new C++ features that are now becoming available in C++ compilers like templates and exception handling. The Booch Components structures included in the class library are as follows: Bags, Deques, Graphs, Lists, Maps, Queues, Rings, Sets, Stacks, Trees and Strings. The tools included are Filters, Pattern Matching, Searching, Sorting and Utilities. The Booch Components offer many more types of structures and tools than frameworks like MacApp or TCL and the they are much more flexible in terms of runtime performance and memory utilization.
A brief description of these structures and tools will be given later. It is obvious from the assortment of components available that an application developer can easily find a suitable structure for a given job. The Booch Components are based heavily on templates which allow a structure to provide type safety for the items contained in the structure. The Booch Components also allow the application programmer to choose different forms of each structure or tool based on the requirements of its use. Not only does the developer have a wide selection of components to choose from, but he or she also has a selection of different forms of the component that provide different time/space tradeoffs. The Booch Components are very customizable and use multiple inheritance to derive structures that have a common public interface but completely different implementations. This concept is used to provide different memory management schemes and different methods of process synchronization for each structure.
Anyone who is familiar with an application framework, like MacApp, has come to know and love the power and flexibility of classes derived from TDynamicArray like TList, and TSortedList. And anyone that has written a substantial application in MacApp has also soon realized that these structure classes are not the end all and be all.
After spending a little time with the Booch Components you will come to appreciate the power of the components and at the same time the expressive power of the C++ language.
Booch Components Design
Overview
The Booch components represent a second generation object oriented design that exploits many of the newer concepts in the C++ language. Because of this, the Booch Components don't look like a typical framework. For instance, the components are arranged as a forest of classes rather than a singly rooted tree of classes like MacApp or TCL. This allows the developer to use only the components desired and avoids dragging in the entire library to use just one component. The Booch Components are very modular and allow the application developer to easily create new types of structures, create new forms of structures, introduce new memory management schemes, and add new process synchronization mechanisms. This flexibility is indeed very powerful, allowing the application programmer to fine tune applications for optimum performance. The Booch Component structures and tools are layered on top of lower level support classes that implement some of the fundamental abstractions shared throughout the library such as arrays, lists, nodes, and memory management. An application programmer never needs to be concerned with these classes, but someone wanting to add new components or override some of the fundamental mechanisms in the library would be.
Forms
Each structure has two representations that provide bounded and unbounded forms for each structure. The components use multiple inheritance to provide the common public interface for each structure and the private implementation for each form. Thus the developer can easily choose the form of the structure that is needed for the task at hand. Bounded forms of structures are stack based and unbounded forms are heap based. Each form inherits and uses a different memory management scheme and allows the application to be tuned for performance or memory utilization. Figure 1 illustrates the class hierarchy for the Set Booch Component. This diagram was made using Object Master 2.0 from ACI US which is an excellent tool to use when working with the Booch Components. It allows graphical browsing of template based C++ code. Notice that the two concrete classes Unbounded_Set and Bounded_Set inherit from the abstract class Set and also inherit different memory managers, Simple_List and Simple_Vector respectively. All of the structure components are designed in this way. The bounded and unbounded forms of each structure are concrete and can be instantiated.
The use of mixin inheritance is a powerful feature of the Booch Components and illustrates a very powerful design concept in object oriented programming. The ability to provide an identical public interface while providing totally different implementations allows different forms of the component that have different time and space semantics to be substituted without side effects, other than the recompilation time required for that component. This is an important feature of the Booch Components which is clearly superior to MacApp and TCL. MacApp and TCL structures are all based on TDynamicArray or CArray respectively and have only one form of representation for time and space tradeoff. Even though the MacApp class TList is called a list it is still based on a dynamic array structure. Operations like inserting and deleting are slow since the underlying structure is an array. Insertions and deletions translate into block memory moves which can be very expensive when using large arrays. When using the Booch Components you have a choice for every structure in the library as to which form you wish to use. Using the unbounded form gives you a list based structure where insertions and deletions are optimized, while using the bounded form gives you an array that is more space efficient and is optimized for finding elements. A problem with the bounded forms is that their size must be declared when the template is instantiated: thus array structures in the Booch Components are not dynamic and cannot be resized at runtime. If a dynamic array type of structure is required, it would be quite easy to write a new memory manager based on a dynamic array and substitute this in when creating a new structure. The multiple inheritance design of the components makes this easy to do while still providing the same interface as the other forms.
Process Synchronization
Platform independence and portability were obviously another primary requirement during the design of the Booch Components and are evident in the handling of process synchronization within each structure. The Booch Components were designed to be used in both sequential applications and applications with multiple threads of control. The bounded and unbounded forms of each structure have subclasses that encapsulate the mechanisms associated with multiple processes accessing the same structure. These are the Guarded and Synchronized forms and represent different ways of dealing with process synchronization. Most Macintosh application programmers will not need to concern themselves with either of these two forms, but programmers working in an environment that supports concurrent threads will be very interested in these forms. Macintosh programmers using the Thread Manager will be interested in these forms as well. Guarded forms of structures add two additional member functions called seize() and release(). In order to operate on a guarded structure object a client must first seize() the object, perform the operation on the object, then release() it. This mechanism provides each component with mutual exclusion which guarantees that only one client may access a given structure at any given time. Synchronized forms do not add any new member functions to achieve synchronization; rather they redefine every virtual member function to make each action an atomic operation. This approach requires no special treatment by the client while achieving process synchronization and mutual exclusion.
Storage Management
The unbounded form of each of the Booch Components can use a variety of memory allocation schemes that allow for a much more efficient application. The default form of each unbounded structure simply uses the built in new and delete operators. This scheme can give very poor run time performance due to heap fragmentation from many allocations and deallocations of different sized objects and data. The Booch Components provide three storage management approaches. The first approach is unmanaged and is the default case using the built in operators new and delete. The second approach is managed. Managed allocations use another free list for allocations of type managed. As long as this managed memory pool has free space, allocations come from it. If this pool runs out, allocation come from the application heap. This scheme is good in the sense that all managed structure object allocations are confined to one contiguous memory block, but bad in the case of an application that typically uses very little managed memory in steady state but uses large amounts for short durations. The managed memory scheme causes the application's runtime footprint to be larger than if it was not used but helps avoid heap fragmentation. The last memory management scheme that is provided by the library is the controlled scheme. This is only used in applications that use multiple threads of control and require process synchronization. This scheme guarantees mutual exclusion of the free list that is shared by all instances of controlled structure objects.
Iterators
Each structure comes with its own form of an iterator that allows traversal of items within a structure. Two types of iterators are provided for each structure, passive and active. Passive iterators require much less interaction on the part of the client. A passive iterator is instantiated and used by calling the iterator's apply() method with a function pointer to the function to apply to all the elements within the structure. Active iterators allow much more flexibility but require more interaction from the client. Active iterators must be told to go on to the next item, and the iterator object returns a reference to each item in the structure for the client to process or use. Active iterators are very similar to MacApp style iterators.
Tools
The Booch Components come with a variety of tools and utilities that add support functionality to the library. Tools encapsulate algorithmic functions like searching, sorting and pattern matching. The Booch tools are based on templates allowing very generic use of each tool and provide type safety on all operations.
Figure 2 is a summary of the various components available in the Booch Components class library package. It is not within the scope of this article to explain the functionality of each structure in great detail. There are many good books on data structures that can provide this information.
Figure 3 gives a brief description of the various tools available in the Booch Components class library package.
Using the Booch components
The design and implementation of the Booch Components is impressive, but the most difficult aspect of the Booch Components is using them. The tools required to support template-based C++ code are not mature yet. Integration of the Booch Components into MacApp applications is difficult and requires a great deal of knowledge of the MPW environment, including MABuild and CPLUS, and the Booch Components TPL template preprocessor scripts and tools. Using the Booch Components with the Symantec C++ for Macintosh Environment and the THINK Class Library is currently impossible due to problems with their C++ translator's inability to process templates correctly. Symantec has acknowledged this problem but has not said when a fix might be available.
I am currently looking at a way to integrate the Booch Components with MacApp and am also very interested in using the Symantec tools to compile the Components as well. I will forward any news or progress through MADA and FrameWorks.
Template Processing
The Booch Components are based on CFront v3.0, and almost all structures and tools are based on templates. Templates are a relatively new addition to the C++ language and are used to provide type parameterization. Templates are, as their name suggests, templates for instances of classes. A template must be instantiated to form a class. Thus a template for a structure like a list can be used for lists of characters, strings, window objects and so forth. As stated earlier these classes are then guaranteed to be type safe for each type of object. Typically one of the arguments of the template is the type of object the structure will store. For example, a Bounded Set Booch Component used to store 100 CStr255 objects would be declared as follows.
Bounded_Set<CStr255, 100>
Templates pose some serious obstacles for the Macintosh developer. MPW CFront is based on v2.0 and will not process templates. Fortunately Rational provides an MPW tool that will preprocess template based C++ code into C++ code that MPW CFront can compile. This tool is called TPL and is invoked prior to the CFront tool in the build sequence. Unfortunately, getting TPL to work with the MABuild script in MacApp is very difficult and is something I am currently working on.
The components come with a complete set of example programs that test each component in the library and each example program comes with the MPW build scripts to compile it. These test programs are very useful when learning the Booch Components as they exercise, from what I could tell, every method of every component in the library. Looking at the build scripts one may soon wonder which is more difficult, writing the actual C++ code or writing the Make files. I am currently looking for ways to simplify this process as well.
Structures Example
In MPW, Component classes are instantiated from templates using the TPL template processing MPW tool that accompanies the Booch components. The following example illustrates how an unbounded set of characters is created. This example uses MPW CFront with the TPL tool provided with the Booch Components package. First a container class for each object or type is created to store elements in. This is done using the Node<char> template. The Unbounded set class can then be generated using the Character_Node as an argument.
// Instantiate the set container class
typedef Node<char> Character_Node;
// Instantiate the unbounded set class
typedef Unbounded_Set<char, Character_Node> Character_Set;
// Instantiate the set iterator class
typedef Set_Active_Iterator<char> Set_Iterator;
void SimpleSetExample( void )
{
Character_Set s1, s2, s3;
// build set of vowels
s1.add('a').add('e').add('i').add('o').add('u');
// build frameworks set
s2.add('f').add('r').add('a').add('m').add('e').add('w').add('o');
// Adding another 'r' here would cause an exception to be thrown,
// Set's don't allow duplicates
// s2.add('r');
s2.add('k').add('s');
s3 = s1; // save set in temporary
s1.intersection(s2); // perform intersection of set s1 and set s2
// Print set s1 which contains intersection of s1 and s2
PrintSet( s1 );
s1 = s3; // restore original s1
s1.set_union(s2); // perform union of set s1 and set s2
// Print set s1 which contains union of s1 and s2
PrintSet( s1 );
}
void PrintSet( Character_Set set )
{
Set_Iterator iter(set);
while (!iter.is_done())
{
fprintf(stdout, "%c", iter.item());
iter.next();
}
}
The example is simple but illustrates several key aspects of the Booch Component Structures. Template instantiation is accomplished via the typedefs and actually creates the classes needed in the code. Iterators are defined and used to traverse the sets and access items in them. The example code illustrates the use of Booch Component iterators in the function PrintSet() and is used to print each character in the set. This example illustrates the power of the Set Booch Component by using some of the classes built in set theory operations. Note that in this example we use the set_union() and intersection() methods to compute the union and intersection respectively of two sets. These methods will work equally well on user defined objects because the user must supply an overloaded equality operator for the item class that will be stored in the set. This allows the set class to compare items in the set and perform the higher level set theory operations. This brings up an important integration issue facing PascalObject derived frameworks and classes.
Using the Booch Components with PascalObject derived frameworks like MacApp and TCL presents a problem due to incompatibilities in C++ objects and PascalObject objects. Most of the Booch Component structures expect that the items or objects that are going to be contained in the structure provide methods that the structure can use. Most structures expect items to provide a default constructor, a copy constructor, assignment operator, and an equality operator. These methods are used internally and called by the structures and must be present for the structure to work properly. This "connection" into the objects contained within the structure allow searching, sorting, insertions and retrievals to operate seamlessly and correctly for any object. The problem with this from a PascalObject point of view is that Pascal Objects do not have and cannot use overloaded operators. This means that PascalObject objects cannot be stored directly in a Booch Component Structure. All objects that are to be stored in a Booch Component structure must be C++ derived. MacApp and TCL programmers can get around this by providing C++ wrapper classes for objects they wish to store in a Booch Structure, but this adds additional complexity and overhead to an application. The Booch Components must be used with true C++ derived objects. Of course this limitation will not be a problem in MacApp 3.1 where all objects are true C++ and the PascalObject dependency has been eliminated.
How They Stack up
The Booch Components are a much more specialized class library than MacApp or TCL so it's really not possible to compare the Booch Components to either MacApp or TCL. But in the areas where MacApp and TCL are similar to the Booch Components, the Booch Components are vastly superior. The Booch Components structures are designed for developing very robust, efficient, commercial quality applications. They have been designed to offer the application programmer a wide choice of time and space tradeoffs where MacApp and TCL only offer one form of one structure, a dynamic array. The Booch Components offer a much more complete selection of structures and offer structures that MacApp and TCL don't have like Trees, Maps, Sets, and true Lists. It can be argued that the Booch Components are more complex and have a larger learning curve than the structures in MacApp or TCL, but for commercial quality applications the Booch Components are the best choice.
The Future
Currently the Booch Components are very difficult and awkward to use with MacApp and are impossible to use with Symantec's TCL Class Library. Symantec has announced that a subset of the Booch Components will be used in Bedrock. This means that it is only a matter of time before the Booch Components become widely used in Macintosh software development. I am currently looking for an easier way to integrate the Booch Components with MacApp and will publish any findings in FrameWorks. If anyone has any thing they want to share regarding the use of the Booch Components in Macintosh software development, please send me an email or AppleLink. The Booch Components are an excellent example of state of the art in object oriented design and offer incredible functionality for serious application development and it will only be a matter of time before they are widely accepted in the Macintosh
OOP community.