November 92 - NeoAccess: Object Persistence Made Simple
NeoAccess: Object Persistence Made Simple
Bob Krause
Even the name raises the hair on the back of the uninitiated's neck: Object-Oriented Database Engine. This article shows how you can use a Mac OODBMS named NeoAccess to reduce resource requirements, provide exceptional performance, organize objects and their relationships to one another, and allow you to focus on those aspects of your application that make it truly unique.
Introduction
So you have this way cool idea for a new Macintosh application, huh? One that will change the way that people use and think about their Mac. You've gone on the obligatory walks on the beach and hikes in the woods to flesh out the details of how you're going to implement this beast. You want to do it right, so first off you know that the app is going to be object-oriented and that you are going to use either the THINK Class Library or MacApp 3.0 as an application framework.
But a problem has begun to set in. Way cool apps need to manipulate complex sets of objects with intricate inter-relationships. That's fine, but the state of some of these objects needs to be preserved across this gulf referred to as session boundaries, that time between when the user quits your application at night and starts it up again eight hours later smelling of coffee and Corn Flakes. You knew that you were going to need to implement Save As and Open menu items, you just didn't think it was going to be so complicated. I mean, look at Greg Dow's Art Class. That was easy, wasn't it?
The more you think about it, the more complicated the issues become. You need a file format. A file format! What has that got to do with way cool? OK, OK, focus… You can just put all instances of a particular class together in the file, then immediately follow that with the next class, and so on. That gets the data written out to the file. (You can worry about subclasses and variable length objects later, right?) But what about the inter-relationships that exist between objects in memory? Don't those need to be saved so that they can be recreated when the objects are read back in? You've got one-to-one, one-to-many and many-to-many connections that need to be maintained. And if each object is written out one after another, then they need to be read in serially as well. When you get right down to it, you need to read in the entire file at once. Otherwise the app needs to do this dance every time it tries to reference an object: Is the object in memory? No? Then go back to the beginning of the file to read it in… If you don't do that then your application partition is going to be 2MB. Even under System 7 that's a lot of memory. And you know that virtual memory isn't an option. What a nightmare!
You finally decide that the Macintosh and object-oriented programming have not yet evolved far enough to support way cool ideas. Or if something like this can be done it has to be by a team of 40 engineers in a cold room up in Seattle. It's still a good idea. Maybe when Bedrock shows up…
NeoAccess
Each and every developer that has tried to write an application has had to face the issue of persistence. It is a difficult problem with many complications. Up until recently every developer has had to face the issue alone. But in this article I would like to show you how to use NeoAccess, an object-oriented database/persistence mechanism that developers can embed into their applications.
As a freelance consultant for many years, I was running through this nightmarish scenario over and over again. I bit the bullet a few times and developed a structured file format, came up with an organization on disk and wrote routines to read and write data. But on the next project I still ended up using only a portion of this code. It seemed that every project was different. Each had a different set of objects that need to be saved, different accessing needs and patterns and different relationships to maintain.
I finally said enough is enough and developed an object persistence mechanism, indeed a full blown database engine called NeoAccess. Because of my previous experiences I've designed the system to be versatile. I make no assumptions about the type of data that needs to persist or the connections between them. Things are very optimized toward storing and retrieving objects, but un-formatted, variable-length blobs of data can also be mixed in. I also tried to avoid making assumptions about how objects are organized in the system. How are objects sorted? How are they indexed? How do objects relate to one another? Are they accessed serially or randomly? There are defaults, but for the most part, with NeoAccess these questions can be answered by the application designer. The programming interface is designed to show minimum visible complexity. Years of consulting on object-oriented projects have shown that no matter how sophisticated a company's products may be, the one thing they all share is a common struggle to minimize complexity. Of course performance is also a major concern. I wondered whether it was possible to provide versatility and performance in the same system. Now I believe that it is, and NeoAccess is proof of that. Finally, the system is portable. The first version of NeoAccess was developed using THINK C and the THINK Class Library. But there are a lot of MacApp programmers out there as well, so now there is a MacApp 3.0 version. It will be moved to Bedrock when that shows up, and a PC version is planned. You might even be seeing it under Prograph and the Component Workshop.
In this article I'm going to show how to use NeoAccess in a way cool application that Bob Ackerman and I built (using the TCL) that I've nicknamed Photographer's Assistant (its real, though less imaginative, name is NeoDemo). The source code of this application is included on a disk that I distribute for free called the NeoAccess Introductory Toolkit. You can download it from most major on-line services. It is also on the latest develop CD.
So check this out… I've got this mythical photographer friend, Toby, who likes to use the Mac in his work. He has his film developed by a service bureau that also digitally scans his photographs and presses them onto CD-ROMs. The images are stored on the CD in a file that is readable by Photographer's Assistant.
My app displays images in a window that looks like the standard scrapbook DA on steroids. I've added a lot of extra fields that the DA doesn't have so that Toby can keep track of which camera, lens and film he used to take a shot. He keeps all this information in a journal and enters it into NeoDemo when the CD arrives. The elephant image shown on the previous page is a good example.
The window is titled "Safari Images" because that is the name of the file that the images and ancillary data are in. Just above the image is the image title, "Charging Elephant" (there's a great story to go along with this one…). Toby can scroll through all the images in the file using the scrollbar. The indicator just below the scrollbar on the left side shows that this is the first of three images. The indicator on the right shows that its format is PICT. The fields in the lower portion of the window were set by Toby; he took the picture on September 2, 1992 with his Nikon using a 150mm lens, and so on. He has assigned this picture a catalog number and a set of keywords. The keywords are handy because NeoDemo has a nice feature that allows him to find all images that have a common keyword, like all animal images, or all those taken in Kenya.
The flag control in the bottom right-hand corner of the window opens up this search feature. When the flag is down an expanded window includes a text edit in which to enter a keyword, a popup to indicate the kind of images to include in the search (PICT, TIFF, GIF or Any), and a find button to start the search (see Figure 2).
Multiple windows can be open at once. That way you can copy an image from one document to another. This is how Toby creates a file to send back to the service bureau for printing with just a single image in it. Not only is the image copied but so is the technical info like where it was shot and how. That way all the information about the image stays with it. Of course if the scrap is pasted into some other application that doesn't know about image objects, then only the PICT data is carried over.
The NeoAccess Class Tree
Let's look under the covers to see how NeoAccess works. The best place to start is by looking at the definitions of some of its classes (see Figure 3). The two classes CObject and CDataFile are members of the core suite of classes in the THINK Class Library. CObject is the abstract root level class. It is the ancestor of all other classes in the library. CDataFile provides know-how for manipulating the data fork of Macintosh files as an extensible and randomly accessible stream of bytes.
CNeoFile
The class CNeoFile builds on the capabilities of CDataFile to provide a mechanism for storing, organizing and retrieving persistent objects.
If you consider for a moment what the responsibilities of the Macintosh File Manager are, you get a pretty good picture of the kinds of things that the CNeoFile class does as well. The File Manager manages allocation blocks on a volume. Most of a volume's blocks are used to store information contained in files (resources and data). The rest of the space is either unused or used to store volume catalog and extents information and desktop-management related data: the organization of folders and files, a list of used blocks, a list of blocks allocated to a particular file, the locations of folder windows on the desktop and the shape and location of icons in windows. All of this needs to be magically maintained for the user.
The primary objective of the File Manager is to reliably administer a volume without burdening the user (or developer) with unnecessary details. The overall efficiency with which it accomplishes this is a major concern. It has to be versatile enough to satisfy the needs of its clients (the Finder and other application programs). And, finally, it needs to be built in such a way that it can evolve in response to future needs.
Instead of managing allocation blocks on a volume, CNeoFile supervises space in a file. Most of this space is used to store the permanent attributes of persistent objects. But most of the complexity of CNeoFile's charter involves the manipulation of its internal data structures. Just as the File Manager must track folders and their contents, CNeoFile must keep track of classes, subclasses and objects. Classes are related to one another to create a class hierarchy. Objects are organized using indexes. Finally, CNeoFile keeps track of the free space in the file.
CNeoFile's primary objectives are very similar to those of the File Manager. Reliability and minimizing visible complexity top the list, followed closely by performance. But it also needs to be versatile and be capable of evolving in future directions.
Many of the capabilities of CNeoFile, such as specifying, opening and closing a Macintosh file, getting and setting the file mark and the file length are actually provided in whole or in part by its parent classes. I won't go into too much detail on what they do or how they do it. The operations we would like to consider are those having to do with objects, classes and the management of file space.
In order to access objects contained in a file, a developer must understand how information is organized. Figure 4 presents a schematic illustration of the reference hierarchy within NeoAccess.
Information is typically organized in a file primarily by class. Just as you must create a folder before putting a file in it, you must add a class to a file before storing objects of that class. For example, NeoDemo files contain camera and image classes.
C++ and most other object-oriented languages support the concept of inheritance. So does NeoAccess. When you define a subclass in C++ you indicate its parent class. When you add a class to a file you can also record what its parent class is. NeoAccess uses this information to maintain a tree that matches the C++ inheritance tree.
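To make the idea of a stored class tree concrete, here is a minimal sketch in standard C++. It is not NeoAccess's actual data structure: the names ClassRegistry, addClass and isKindOf, the ClassID type and the kNoParent sentinel are all assumptions made for illustration. It only shows how recording each class's parent lets a file answer "is this class a kind of that one?"

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for NeoAccess's NeoID; assumed to be a 4-byte value.
using ClassID = std::uint32_t;

// Assumed sentinel meaning "this class has no parent".
const ClassID kNoParent = 0;

// Illustrative registry that maps each class ID to its parent class ID.
class ClassRegistry {
public:
    // Record a new class and its parent. The parent must already be
    // registered, much as a folder must exist before a file is put in it.
    void addClass(ClassID id, ClassID parent) {
        assert(id != kNoParent);
        assert(parents_.count(id) == 0);  // each class is added once
        assert(parent == kNoParent || parents_.count(parent) == 1);
        parents_[id] = parent;
    }

    // True if `id` is `base` or inherits from it, directly or indirectly.
    bool isKindOf(ClassID id, ClassID base) const {
        while (id != kNoParent) {
            if (id == base) return true;
            auto it = parents_.find(id);
            if (it == parents_.end()) return false;  // unknown class
            id = it->second;                         // climb to the parent
        }
        return false;
    }

private:
    std::map<ClassID, ClassID> parents_;
};
```

Because a class can only name a parent that is already registered, the stored tree cannot contain cycles, so the walk up the tree always terminates.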
Objects are organized within a file according to class. This organization is supported by a construct called an index. Indexes keep objects in sorted order. There is a default collating sequence, but the sort order can be changed to support the needs of the application. NeoAccess even provides the ability to maintain multiple indexes for a class. Though developers rarely need to know the details, indexes are implemented as extended b-trees. This allows NeoAccess to locate objects quickly by using binary search algorithms.
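The payoff of keeping objects in sorted order can be sketched in a few lines of standard C++. NeoAccess uses extended b-trees; the flat sorted array below is a deliberately simpler stand-in, and SortedIndex, IndexEntry and ObjectID are hypothetical names. It shows only the binary-search idea, not the on-disk structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for NeoID.
using ObjectID = std::uint32_t;

// One index entry: an object's identity and where it lives in the file.
struct IndexEntry {
    ObjectID id;
    long     mark;  // byte offset of the object in the file
};

// Illustrative flat index kept in ascending order by id.
class SortedIndex {
public:
    // Insert an entry at its sorted position (marks are assumed >= 0).
    void insert(ObjectID id, long mark) {
        assert(mark >= 0);
        auto pos = std::lower_bound(entries_.begin(), entries_.end(),
                                    id, lessThanID);
        entries_.insert(pos, IndexEntry{id, mark});
    }

    // Binary search; returns the file mark, or -1 if the id is absent.
    long find(ObjectID id) const {
        auto pos = std::lower_bound(entries_.begin(), entries_.end(),
                                    id, lessThanID);
        if (pos != entries_.end() && pos->id == id) return pos->mark;
        return -1;
    }

private:
    // Ordering used by lower_bound: compares an entry against a bare id.
    static bool lessThanID(const IndexEntry& e, ObjectID id) {
        return e.id < id;
    }

    std::vector<IndexEntry> entries_;
};
```

A lookup touches roughly log2(n) entries instead of n, which is the property that a tree-structured index preserves while also keeping insertions cheap on disk.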
CNeoPersist
Application-specific objects encapsulate the intelligence of your application. They are the value that you add to the Macintosh. The raison d'être of your application is to provide a mechanism that allows users to manipulate these objects. Application-specific objects are generally persistent objects. Users create something that they can come back to and work with again later. In order for this to happen, applications need to include a mechanism that preserves the state of these objects after the application has quit, and which can be used to locate the objects again later.
Using NeoAccess all persistent classes are based on the class CNeoPersist. The methods and attributes of CNeoPersist provide the know-how and state that allows NeoAccess to manage an object. The abbreviated class definition of CNeoPersist given on the next page shows the kind of operations one might use to manipulate an object's persistence state.
class CNeoPersist : public CObject {
public:
    /** Instance Methods **/
    virtual ~CNeoPersist(void);
    virtual void Dispose(void);
    virtual NeoID getClassID(void);
    /** I/O Methods **/
    virtual void readObject(const long aMark,
                            CNeoPersist *aParent, const NeoID aID);
    virtual void writeObject(const long aLength);
    virtual Boolean sync(CNeoFile *aFile, const Boolean aCompletely,
                         const Boolean aCompress);
    virtual void getFromScrap(const OSType aFrmt,
                              Handle *aScrap, long *aOffset);
    virtual void putToScrap(Handle *aScrap, long *aOffset,
                            const long aSize);
    /** Searching Methods **/
    static CNeoPersist * Find(CNeoFile *aFile, const NeoID aClassID,
                              NeoSelectKey *aKey, const Boolean aSubclass,
                              CNeoArray *aResults);
    static CNeoPersist * FindByID(CNeoFile *aFile, const NeoID aClassID,
                                  const NeoID aID, const Boolean aSubclass,
                                  CNeoArray *aResults);
    CNeoPersist * getNextSibling(void);
    CNeoPersist * getPreviousSibling(void);
    virtual NeoOrder compare(NeoSelectKey *aKey);
    /** Persistence Methods **/
    void setDirty(void);
    virtual void makePermanent(CNeoFile *aFile,
                               const NeoID aID);
    Boolean removeFromFile(CNeoFile *aFile);
    virtual void relocate(const long aNewMark);
    /** Concurrency Methods **/
    void referTo(void);
    void unrefer(void);
    void autoReferTo(void);
    void autoUnrefer(void);
    /** Instance Variables **/
    Boolean fLeaf : 1;        // True if not an inode?
    Boolean fRoot : 1;        // Is the root of hierarchy
    Boolean fBusy : 1;        // object being manipulated
    Boolean fDirty : 1;       // Memory/file states differ
    NeoID fID;                // Symbolic ID of this object
    long fMark;               // Location in file
    CNeoPersist * fParent;    // object's parent
    char fRefCnt;             // Purgeable when zero
};
Adding an Object
Let's look at some of the methods under the "Persistence Methods" category. The method makePermanent is used to add an object to a file (to make it persist). The aFile argument indicates the file which the object is to persist in, and aID is the object's 4-byte "identity". An object's identity is often used to uniquely identify an object in the file. By default, objects of a particular class are sorted in ascending order by id value. So an object's identity may be important. If aID is zero, then a unique identity is assigned by makePermanent.
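The "zero means pick one for me" rule is easy to picture with a small sketch. This is not the NeoAccess implementation: ToyFile, Persistent and the way free identities are chosen are assumptions for illustration only. A multimap is used because, as described above, an identity is often but not necessarily unique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for NeoID.
using ObjectID = std::uint32_t;

// Minimal persistent object: just an identity and a dirty flag.
struct Persistent {
    ObjectID id = 0;
    bool     dirty = false;
};

// Illustrative in-memory "file" that keeps objects ordered by identity.
class ToyFile {
public:
    // Add obj under the given id. If id is zero, assign an unused one.
    ObjectID makePermanent(Persistent& obj, ObjectID id) {
        if (id == 0) {
            // Skip zero (reserved) and any identity already in use.
            while (nextID_ == 0 || objects_.count(nextID_) != 0) ++nextID_;
            id = nextID_;
        }
        assert(id != 0);
        obj.id = id;
        obj.dirty = true;  // a newly added object still has to be written out
        objects_.insert({id, &obj});
        return id;
    }

    // Number of objects stored under a given identity.
    std::size_t count(ObjectID id) const { return objects_.count(id); }

private:
    std::multimap<ObjectID, Persistent*> objects_;  // ascending by id
    ObjectID nextID_ = 1;
};
```

Passing an explicit identity, as NeoDemo does for its cameras, lets the application control sort order; passing zero leaves the choice to the file.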
Deleting an Object
Immediately below makePermanent is the method removeFromFile. Inevitably, an application will need to delete objects from a file. The object's removeFromFile method does this. Note though, that an object continues to exist in memory after it has been removed from a file. It can be manipulated just like any other object. It can even be re-inserted in the same or any other file at some later point.
Locating Objects
We'll come back and discuss some of the other persistence methods in a moment. But first let's look at a couple of methods under "Searching Methods". FindByID is used to locate an object (or set of objects) having a given identity. For example, an image object refers to a camera by its id. NeoDemo uses FindByID to locate the camera using the following call:
camera = (CNDCamera *)CNeoPersist::FindByID(itsFile, kNDCameraID,
                                            aCameraID, FALSE, nil);
The first argument is the file to search. The second argument is the class of object we're looking for. In this case, we're looking for a camera. The constant kNDCameraID is the class id used to refer to the CNDCamera class. Just as an object's identity is defined by an object id, class ids refer to classes. The third argument is the identity of the camera object we are looking for. The fourth argument indicates whether all subclasses of CNDCamera should be searched as well. This ability to search for an object according to any of its base classes is very useful. In this example there are no subclasses of CNDCamera, so this argument is FALSE. If we expect a search operation to find more than one object having the given identity then we could pass a pointer to an array as the last argument to FindByID. We pass nil in our example so the first (and only) camera found will be passed back to us as the return value of the function.
FindByID is capable of locating a camera very quickly because camera objects are indexed by identity. This means that NeoAccess can use a binary search, which is very fast. But sometimes an application needs to locate objects using an attribute that is not a key. For example, Toby often looks at only those images that have a particular keyword. It is fairly easy for NeoDemo to do this even though images are not indexed by keyword. The doUntilObject method of the CNeoFile class is designed for just these kinds of situations. What doUntilObject does is apply an application-specific function to each object of a base class until the function returns a non-nil result. NeoDemo applies a function to a set of images. This function checks each object to see if it has the keyword of interest. If it does, then the object is added to a results array. When doUntilObject returns, the array refers to all those images with the keyword.
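The "apply a function until it says stop" pattern can be sketched independently of NeoAccess. In the sketch below, doUntil, TestFunc, Image, KeywordQuery and matchKeyword are hypothetical names, and a void pointer replaces the long used for shared data in 1992-era code. The visiting function always returns zero, so every object is examined and all matches are collected.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for an image object with a keyword list.
struct Image {
    std::string              title;
    std::vector<std::string> keywords;
};

// Visiting function: a nonzero result stops the walk and is returned.
typedef long (*TestFunc)(Image* object, void* param);

// Apply func to each object in order until it returns a nonzero result.
// Returns that result, or zero if every object was visited.
long doUntil(const std::vector<Image*>& objects, TestFunc func, void* param) {
    assert(func != nullptr);
    for (Image* obj : objects) {
        long result = func(obj, param);
        if (result != 0) return result;
    }
    return 0;
}

// Data shared between the caller and the visiting function.
struct KeywordQuery {
    std::string         key;
    std::vector<Image*> results;
};

// Collects every image carrying the keyword; never stops the walk early.
long matchKeyword(Image* image, void* param) {
    KeywordQuery* query = static_cast<KeywordQuery*>(param);
    for (const std::string& k : image->keywords) {
        if (k == query->key) {
            query->results.push_back(image);
            break;
        }
    }
    return 0;
}
```

A function that returned nonzero on the first match would instead turn the same walk into a "find first" search.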
Changes to an Object's State
Another common occurrence in an application is when the permanent state of an object changes. For example, Toby may add a keyword to an image. That changes the object's state. This change needs to propagate back to the file. When the keyword list of an object is changed, the object's setDirty method is called. This marks the object as being different than its state in the file. Newly added objects are also marked dirty by NeoAccess. If dirty objects are not written back out to the file, then they revert back to their previous state when the file is closed and then reopened. The process of synchronizing the in-file state of objects with their in-memory state is performed by the sync method of the file object. It is called in NeoDemo when Toby does a Save or Save As. Notice that applications don't need to keep track of which objects are dirty. NeoAccess does that for them. And, unlike streams-based mechanisms which rewrite the entire file at once, only those objects that are dirty need to be written out when saving a file.
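A small sketch shows why dirty tracking makes saving cheap. ToyStore and Record are hypothetical, and a std::map stands in for the file on disk; the point is only that a sync pass writes the objects whose flag is set and skips the rest.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Minimal persistent record with a dirty flag.
struct Record {
    int         id;
    std::string state;
    bool        dirty;

    // Mark the in-memory state as differing from the stored state.
    void setDirty() { dirty = true; }
};

// Illustrative store; onDisk_ stands in for the file's contents.
class ToyStore {
public:
    // Newly added records are marked dirty so the next sync writes them.
    void add(Record* r) {
        assert(r != nullptr);
        r->setDirty();
        objects_.push_back(r);
    }

    // Write out only the dirty records; returns how many were written.
    int sync() {
        int written = 0;
        for (Record* r : objects_) {
            if (!r->dirty) continue;   // clean objects are left alone
            onDisk_[r->id] = r->state;
            r->dirty = false;
            ++written;
        }
        return written;
    }

    // The state most recently written for a given id.
    const std::string& stored(int id) const { return onDisk_.at(id); }

private:
    std::vector<Record*>       objects_;
    std::map<int, std::string> onDisk_;
};
```

After the first sync, changing one record and syncing again writes exactly one record, however many the store holds.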
Object Sharing
CNeoPersist provides a sharing property to its subclasses that greatly simplifies intra-application concurrency issues. Every persistent object has a reference count which is used to ensure that an object is not deleted from memory while there are still references to it. The reference count is initialized to 1 when an object is instantiated. A reference is automatically added by makePermanent and the searching methods, and one is removed by removeFromFile. When the Dispose method is called to delete the object the count is decremented, but the object is only freed if the reference count is zero (meaning all references are deleted). The end result is that one component of an application doesn't need to be aware of whether an object that it refers to is referred to by another component. The object stays in memory as long as it needs to, and no longer.
As you can see, adding, removing, locating and changing objects in a NeoAccess file is fairly easy. Notice that the logistics of how objects are organized and where they are located in the file are for the most part transparent to the programmer. Also note that only those objects that are of immediate interest to the application are in memory. Yet simplicity and compactness do not compromise versatility. Indeed, applications such as NeoDemo are object-driven, so versatility is increased while complexity is reduced.
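The counting rule described here can be sketched as follows. The class Shared and its freed-flag constructor argument are hypothetical and exist only so the behaviour can be observed; referTo and dispose are modelled on the description in the text rather than on the NeoAccess source.

```cpp
#include <cassert>

// Illustrative reference-counted object. It must live on the heap,
// which the private destructor enforces.
class Shared {
public:
    // The count starts at 1 for the creator's own reference.
    explicit Shared(bool* freedFlag) : refCount_(1), freed_(freedFlag) {}

    // Add a reference on behalf of another component.
    void referTo() { ++refCount_; }

    // Drop one reference; free the object only when none remain.
    // Returns true if the object was actually freed.
    bool dispose() {
        assert(refCount_ > 0);
        if (--refCount_ == 0) {
            delete this;
            return true;
        }
        return false;
    }

    int refCount() const { return refCount_; }

private:
    // Records the deletion in the caller's flag so it can be checked.
    ~Shared() { if (freed_) *freed_ = true; }

    int   refCount_;
    bool* freed_;
};
```

Two components can each hold a reference and each call dispose when finished; whichever finishes last triggers the actual deletion, and neither needs to know about the other.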
The NeoDemo Class Tree
Now that we have an understanding of what NeoAccess does, let's take a look at how NeoDemo uses these capabilities. The first thing to consider is the set of persistent classes that the application defines and their relationship to the CNeoPersist class and to one another (see Figure 5).
As you might expect, CNeoPersist is the base class of all persistent classes. One of its immediate subclasses is called CNeoBlob. Though object-oriented developers strive to make everything an object, the fact remains that not everything can be. NeoAccess provides an abstract class, CNeoBlob, in recognition of this fact. CNeoBlob is an abstract base class of persistent objects that is used to store and locate free-form, variable-length, non-object entities in a NeoAccess file.
There are a total of seven classes in NeoDemo's database, five of which are specific to NeoDemo. These application-specific classes are of two general types: images and cameras. There is one camera class and a base image class with three subclasses: CNDImagePICT, CNDImageTIFF and CNDImageGIF. The image classes are based on CNeoBlob. It is significant to note that the only difference between a PICT image and a TIFF image is the implementation of their draw method. The other capabilities and attributes are inherited from CNDImage.
Class objects are added to the file in CNeoDemoDoc::NewFile around the time it is created. I would like to extend NeoDemo so that Toby can add new cameras to the database. But for now the four cameras that he typically uses are added automatically when the classes are added.
exposure[0] = -125; /* 1/125 */
camera = new CNDCamera;
camera->INDCamera("\pKodak", 1, exposure);
camera->makePermanent(((CNeoFile *)itsFile), 3);
camera->unrefer();
Notice how simple it is to add a camera object. The exposure array, which indicates the shutter speeds supported by the camera, is filled in first. Then a new camera object is created and initialized. The method makePermanent is used to add the camera to the file. We no longer need to refer to an object once it is added to the file, so we call unrefer to remove our reference to it.
Cutting, Copying and Pasting Images
NeoAccess provides mechanisms for putting objects onto and getting objects off of the clipboard. The mechanics differ depending on whether the application is based on the THINK Class Library or MacApp 3.0. NeoDemo is based on the TCL so it uses putToScrap and getFromScrap. MacApp applications usually use a streams-based interface which CNeoPersist also supports.
The method CNeoDemoDoc::doCutCopy puts an image object onto the scrap.
if (image = GetImage()) {
    ZeroScrap();
    image->putToScrap(&scrap, &length, 0);
    if (scrap) {
        HLock(scrap);
        PutScrap(length, 'IMGE', *scrap);
        HUnlock(scrap);
    }
    …
}
The blob portion is written to the scrap as format PICT data. Any application capable of using the image in PICT format should be able to accept this type of scrap. The other attributes of CNDImage, such as photographer and exposure data, are written as format IMGE data. NeoDemo is the only one that understands this format. But having the image object in the scrap allows Toby to copy an image from one NeoDemo document to another without losing this data.
Images are added by pasting them into a document in the method CNeoDemoDoc::doPaste. The arguments to getFromScrap indicate the scrap format to look for (IMGE in this case), a pointer to a handle which will refer to the scrap data, and a pointer to a long that upon return will indicate the length of the data.
scrap = nil;
image->getFromScrap('IMGE', &scrap, &length);
if (scrap)
    DisposHandle(scrap);
// Was PICT format data found?
if (pict = (PicHandle)image->getBlob()) {
    image->makePermanent(gNeoFile, id++);
    …
}
Scrap data of format IMGE is available only if the scrap was put there by NeoDemo. If the image came from some other app that doesn't support IMGE data (a.k.a. everybody else), then the image object is initialized with default data. The actual image is in the standard PICT format. The method CNDImage::getFromScrap copies this off of the scrap.
Saving a Document
After a new image is pasted into a document Toby needs to enter technical details having to do with where it was shot and under what conditions. This changes the state of the image object in memory. Each time a permanent attribute of an image changes the setDirty method is called to mark the object as needing updating in the file. File updating in NeoDemo occurs when the Save or Save As menu items are selected.
The method CNeoDemoDoc::DoSave is where the actual file update occurs.
Boolean CNeoDemoDoc::DoSave(void)
{
    if (itsFile &&
        !((CNeoFile *)itsFile)->refNum)
        return(DoSaveFileAs());
    ((CNeoFile *)itsFile)->sync(TRUE);  // Write out all objects.
    dirty = FALSE;                      // Document is no longer dirty
    inherited::UpdateMenus();
    return(TRUE);                       // Save succeeded
}
Objects can be added, deleted and searched for in a memory-based CNeoFile object before a Macintosh file has been specified and opened for it. DoSave calls DoSaveFileAs to specify and open the file when this is the case. Updating the file is done with a call to the file's sync method. The single argument to this call indicates whether the file object should attempt to reduce the amount of file space the file uses on disk. This may slow down the synchronization operation but can reduce the file's size dramatically.
Searching for Images
The usefulness of assigning keywords to images is that NeoDemo includes a facility for creating a subset of images having a given format and keyword. Toby asked for this because, though a file may contain hundreds or even thousands of images at a time, he may be interested in only those having a specific keyword.
typedef struct {
    char *   fKey;
    CArray * fResults;
} getImageInfo;
void CNeoDemoDoc::getImages(NeoID aBaseClassID, char *aKey)
{
    short        index;
    char         key[33];
    getImageInfo info;
    /* Translate the key to lower case, stopping before key overflows */
    for (index = 0; index < 32 && aKey[index]; index++)
        if (aKey[index] >= 'A' &&
            aKey[index] <= 'Z')
            key[index] = aKey[index] + 32;
        else
            key[index] = aKey[index];
    key[index] = 0;
    info.fKey = key;
    info.fResults = fImageArray;
    ((CNeoFile*)itsFile)->doUntilObject(nil, aBaseClassID,
        kNeoPrimaryIndex, TRUE,
        (NeoTestFunc1)NDMatchImage, (long)&info);
    …
}
The heart of a search operation is CNeoDemoDoc::getImages, called when the Find button of the extended window is pressed. The arguments indicate the class id of the images to look for (kNDImagePICTID, kNDImageTIFFID, kNDImageGIFID or kNDImageID to indicate all types) and the keyword of interest. Searches are not case sensitive, so the keyword is normalized to lower case.
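The lower-casing step can also be written as a small reusable helper in standard C++. The function name lowerCaseCopy is hypothetical; it takes the destination size explicitly so that a long keyword is truncated rather than written past the end of the buffer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Copy src into dst as lower case, writing at most dstSize - 1 characters
// followed by a terminating NUL. Returns the number of characters copied.
std::size_t lowerCaseCopy(char* dst, std::size_t dstSize, const char* src) {
    assert(dst != nullptr && src != nullptr);
    if (dstSize == 0) return 0;  // no room even for the terminator
    std::size_t i = 0;
    for (; src[i] != '\0' && i + 1 < dstSize; ++i) {
        // tolower expects a value representable as unsigned char.
        dst[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(src[i])));
    }
    dst[i] = '\0';
    return i;
}
```

Called with a 33-byte buffer, as in getImages, this keeps at most 32 characters of the keyword.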
A linear search of the file needs to be done because images are not indexed according to keyword values. The method CNeoFile::doUntilObject is used to apply the function NDMatchImage to each image of the type indicated.
The first argument to doUntilObject indicates the object to start at. The fact that it is nil indicates the search should begin at the beginning of the list. The second argument is the class id of objects to search. This value was passed to getImages. Since objects can be indexed in more than one way, the third argument indicates which index to use. Images have only one index so we use the default value of kNeoPrimaryIndex. The fourth argument being TRUE indicates that all subclasses of the class referred to by argument two should also be searched. So when the base class is CNDImage then all image classes are searched. The fifth argument is a pointer to the function to be applied to each object in turn. Finally, the last argument is a pointer to data shared between the caller of doUntilObject and the function. This can be anything, but in this case it is a pointer to a data structure containing the keyword value and a pointer to an array into which objects having the keyword will be placed.
static long NDMatchImage( CNeoPersist *aObject, long aParam)
{
    getImageInfo * info = (getImageInfo *)aParam;
    CNDImage * image = (CNDImage *)aObject;
    if (image->isValidKey(info->fKey)) {
        image->referTo();
        info->fResults->InsertAtIndex(&image, 0x7FFFFFFF);
    }
    return 0;
}
The function applied to each object, NDMatchImage, is quite simple. Its arguments are a pointer to an image object and a pointer to the shared data structure which contains the keyword value and the results array. If the image object's isValidKey method returns TRUE, then the image is added to the end of the results array. (Note that a reference is added before adding an image to the results array.) After NDMatchImage has been applied to each object in turn, the results array will refer to all those that have the chosen keyword.
Summary
OK, so maybe NeoDemo isn't the coolest app you've ever seen, but it can be useful. Use it as a pumped up scrapbook if you like. But its real value is as an example of how easily the issue of object persistence can be addressed using a mechanism like NeoAccess.