Finding Fault
Volume Number: 17 (2001)
Issue Number: 1
Column Tag: WebObjects
By Sam Krishna and Patrick Taylor
Why the EOFaulting mechanism neatly solves a thorny problem in the Object-Relational paradigm
The Object-Relational paradigm is the conceptual foundation to the Enterprise Objects Framework (EOF). Concepts such as tables map nicely to classes, rows to objects, row-column values to instance variables (ivars), etc. The mapping is so neat and clean that one is rather surprised to find a terrible flaw marring the paradigm.
One difficulty is that few situations are easily encompassed by single, isolated tables. A relational approach is frequently needed. However, having started down the Object-Relational path, what happens to objects that are related to other objects? What happens if a large set of objects is related to a single object? How can these relationships be represented without compromising performance or overwhelming system memory?
In real life, relationships are numerous, complex and widespread. Consider a (simplified) model of a credit card reporting application.
How would such an application be able to traverse from Consumer to Purchases without swamping memory? If we tried to apply this model to 200 million individuals charging approximately US$10,000 per year on 3 credit cards each (on average), the situation could get nasty in a hurry. Imagine fetching 600 million rows (200 million individuals times 3 cards each) into an application in order to traverse the relationship graph for a credit card query. What kind of memory footprint would be needed to do a report on the purchasing information? When the program needs more memory than the machine has available, the machine will grind to a halt.
Without an efficient way of accessing our program's data, the value of the application comes into question. The types of applications created by WebObjects depend on a reasonable amount of performance, and an elegant object-relational mapping is a useless affectation if it becomes an albatross. Object-orientation is not an end in itself, but rather a means - preferably a superior means.
Fortunately for us, the EOF engineering team solved this problem using a mechanism called the EOFault class and its EOFaultHandler. Together these classes form a system where light placeholder objects are substituted for regular objects before they are fetched. The EOFault is a raincheck, promising to fetch the real objects when actually needed.
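The raincheck idea can be sketched in plain Java. This is only an illustration of the placeholder pattern, not EOF's implementation: LazyFault and its methods are hypothetical names, and the lambda stands in for the SQL fetch that EOF would generate.

```java
import java.util.function.Supplier;

// Hypothetical illustration only: LazyFault is NOT an EOF class.
class LazyFaultDemo {
    public static void main(String[] args) {
        int[] fetches = {0};
        LazyFault<String> studio = new LazyFault<>(() -> {
            fetches[0]++;              // stands in for one database round trip
            return "Studio #9";
        });
        System.out.println(studio.isFault());  // placeholder has not fetched yet
        System.out.println(studio.get());      // first access fires the fault
        System.out.println(studio.get());      // second access is served from memory
        System.out.println("fetches: " + fetches[0]);
    }
}

// A lightweight stand-in that defers the expensive fetch until first use.
class LazyFault<T> {
    private final Supplier<T> fetcher;
    private T value;
    private boolean fired;

    LazyFault(Supplier<T> fetcher) {
        this.fetcher = fetcher;
    }

    boolean isFault() {
        return !fired;
    }

    T get() {
        if (!fired) {
            value = fetcher.get();   // the "fault fires" here, exactly once
            fired = true;
        }
        return value;
    }
}
```

The placeholder costs almost nothing to create, so an object can hold thousands of them while paying only for the ones that are actually touched.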
Ready-Aim-Fire!
If you wish to fetch a related row in EOF, a fault has to be fired. Before you get the idea that this is extremely complicated in either theory or practice, stop fretting. It isn't. You simply "walk" the relationship graph and, most of the time, your WebObjects application will do this for you.
Take a look at the Movies example model that ships with WebObjects. Let's assume that you need to get a list of movie roles for any given movie that is selected. After fetching a particular movie, let's take a look at the movieRoles relationship before the fault is fired:
JavaDebug>> po movieRoles
movieRoles = <ArrayFault(0x195cfc8) source: [GID: Movie, (155)] relationship: movieRoles >
As you can see, movieRoles is an ArrayFault which knows to fetch all the rows on the destination side of a to-many relationship named movieRoles. The GID refers to an EOGlobalID, an object that contains the primary key information within Movie that is needed by the faulting system to fetch the related rows in the movieRoles table. When the fault is cleared or fired, EOF generates the SQL necessary to fetch the related rows. What is cool is that you don't have to write any code to fire these faults: WebObjects/EOF simply knows how to fault relationships automagically with very little (if any) programmer intervention.
At this point, your program might need to ask which studio produced which movie. It can do this by faulting the movie's studio relationship. Here's what the fault will look like before it is fired:
movie.studio is <EOGenericRecord(0x2057c98) Fault [GID: Studio, (9)]>
Once the studio fault is fired, EOF will again automatically generate the SQL, fetch the related row, stuff the results in a brand new Enterprise Object (EO), and return the new EO to the WebObjects Movies application.
Batch Faulting
If the Movies example were a real application with real users, then it's quite likely that someone would wish to see a list of credits for the movie's actors. This isn't an unreasonable demand since WebObjects will traverse the to-one relationship from the MovieRole EO to the Talent EO. The problem, however, is similar to the one we discussed in the credit card example. When many roles are being fetched individually in an application getting a million hits per day, you need to do something to minimize the number of round-trip fetches to the database.
Every time a fault is fired, a round trip to the database is made. If your Movies application had 1000 users, each of whom looked at about 10 movies per visit, and each movie had a cast of 20 actors, your application would face approximately 200,000 round trips to the database (1000 x 10 x 20) for the cast list alone. Batch faulting is a great way to rein in your application's demands: by fetching each movie's cast in one trip, it can potentially reduce the number of round trips to 10,000 (1000 x 10). The more relationships hanging off an entity, the bigger the bang for batch faulting. I'm sure that you'll agree that a twenty-fold reduction is nothing to sneeze at!
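The arithmetic behind those figures is easy to check. The short sketch below just reproduces the numbers from the paragraph above; the class and method names are made up for illustration and assume one round trip per fired fault and one round trip per batched cast list.

```java
// Hypothetical helper that reproduces the round-trip estimate from the text.
class RoundTripEstimate {
    public static void main(String[] args) {
        long unbatched = unbatched(1000, 10, 20);
        long batched = batched(1000, 10);
        System.out.println("unbatched round trips: " + unbatched);
        System.out.println("batched round trips:   " + batched);
        System.out.println("reduction factor:      " + (unbatched / batched));
    }

    // One round trip per Talent fault: users x movies per visit x cast size.
    static long unbatched(long users, long moviesPerVisit, long castSize) {
        return users * moviesPerVisit * castSize;
    }

    // One round trip per movie: the whole cast arrives in a single batch.
    static long batched(long users, long moviesPerVisit) {
        return users * moviesPerVisit;
    }
}
```

Note that the reduction factor equals the cast size, which is why larger relationships benefit more from batching.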
Prefetching
Prefetching is an alternative method for faulting. It is used specifically with an EOFetchSpecification, which we will discuss in more detail next month when we cover the EOEditingContext. Suffice it to say, it pre-empts the faulting mechanism. Based upon the key paths you feed to a particular EOFetchSpecification, you will be able to prefetch EOs based upon the relationships from a particular set of entities you are fetching.
A developer must take care when using prefetching because it is too easy to fetch unnecessary EOs without thinking about it. It is better to profile the kinds of fetching patterns that occur during use of the web application before setting prefetching key paths upon the EOFetchSpecification, the class used with the EOEditingContext to fetch objects explicitly.
The method on the EOFetchSpecification class for prefetching, setPrefetchingRelationshipKeyPaths() (setPrefetchingRelationshipKeyPaths: in Objective-C), takes an NSArray of relationship keys. For example, if you wanted to prefetch both the movieRoles and studio relationships, you would pass the strings "movieRoles" and "studio" in an NSArray (@"movieRoles" and @"studio" in Objective-C).
One thing to keep in mind: if you set an EOFetchSpecification to refresh through the method setRefreshesRefetchedObjects() (setRefreshesRefetchedObjects: in Objective-C), any refetches are propagated through the object graph (meaning that both studio and movieRoles would be refetched as well to update the current state of the object graph as reflected in the database).
One way to batch fault is to set the number of items you want faulted in a to-many relationship using EOModeler's Advanced Relationship Inspector. This approach is terribly convenient, since it allows you to fault lists without writing code, but unfortunately it won't help us in this particular example.
We need to programmatically fault a whole array of Talent relationships based upon the MovieRole entity all at once. To achieve this, we message the EODatabaseContext with the batchFetchRelationship() method (batchFetchRelationship:forSourceObjects:editingContext: in Objective-C). It isn't always clear how to get the EODatabaseContext from the EOF stack for a given WebObjects application. Worse yet, even after you retrieve it, there are still a number of other objects you must retrieve before you can send the message to the EODatabaseContext. Here is an example of how we batch faulted the Talent EOs as they were related to an NSArray of MovieRoles for a given movie:
Java
Be sure to import com.apple.yellow.eoaccess.* as one of your import statements
protected NSArray movieRoles() {
    // An example of how to batch fault programmatically in Java...
    NSArray movieRoles =
        (NSArray)movie.valueForKey("movieRoles");
    EOEditingContext ec = session().defaultEditingContext();
    EODatabaseContext dc =
        EOUtilities.databaseContextForModelNamed(ec, "Movies");
    EOModel moviesModel =
        EOModelGroup.defaultGroup().modelNamed("Movies");
    EOEntity movieRoleEntity =
        moviesModel.entityNamed("MovieRole");
    EORelationship talentRelationship =
        movieRoleEntity.relationshipNamed("talent");
    // Fires the array fault to be used for the second argument
    movieRoles.count();
    dc.batchFetchRelationship(talentRelationship, movieRoles, ec);
    return movieRoles;
}
Objective-C
Be sure to declare #import <EOAccess/EOAccess.h> as one of your import statements
- (NSArray *)movieRoles
{
    // Batch faulting example in Objective-C
    NSArray *movieRoles =
        [movie valueForKey:@"movieRoles"];
    EOEditingContext *ec =
        [[self session] defaultEditingContext];
    EODatabaseContext *dc =
        [ec databaseContextForModelNamed:@"Movies"];
    EOModel *moviesModel =
        [[EOModelGroup defaultGroup] modelNamed:@"Movies"];
    EOEntity *movieRoleEntity =
        [moviesModel entityNamed:@"MovieRole"];
    EORelationship *talentRelationship =
        [movieRoleEntity relationshipNamed:@"talent"];
    // Fires the array fault to be used for the forSourceObjects: argument
    [movieRoles count];
    [dc batchFetchRelationship:talentRelationship
        forSourceObjects:movieRoles editingContext:ec];
    return movieRoles;
}
There are quite a few objects to message as part of the EOF stack, particularly the various elements of an EOModel. However, this sample code yields the desired result: a batch fault that fetches all the related Talent EOs in a single round trip to the database. We batch faulted the talent relationship in this method for demo purposes only. In a professional development context, it would make more sense to refactor the batch faulting code into a separate method.
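To see why the batch saves round trips, here is a toy model in plain Java. It does not use EOF at all: Role, FakeTalentTable, and both resolve methods are hypothetical stand-ins, and the select() call simply counts how often the pretend database is visited. The assumption is that a batch fetch gathers the keys of every unresolved destination object and asks for them in one query.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these classes exist in EOF.
class BatchFaultSketch {
    public static void main(String[] args) {
        FakeTalentTable table = new FakeTalentTable();
        List<Role> oneByOne = new ArrayList<>();
        List<Role> batched = new ArrayList<>();
        for (int id = 1; id <= 20; id++) {
            table.rows.put(id, "Talent " + id);
            oneByOne.add(new Role(id));
            batched.add(new Role(id));
        }
        resolveOneByOne(oneByOne, table);
        System.out.println("one at a time: " + table.roundTrips + " round trips");
        table.roundTrips = 0;
        resolveInBatch(batched, table);
        System.out.println("batched: " + table.roundTrips + " round trip(s)");
    }

    // Each unresolved role triggers its own query, like firing faults singly.
    static void resolveOneByOne(List<Role> roles, FakeTalentTable table) {
        for (Role role : roles) {
            if (role.talentName == null) {
                Set<Integer> key = Collections.singleton(role.talentId);
                role.talentName = table.select(key).get(role.talentId);
            }
        }
    }

    // Collect every outstanding key first, then issue a single query.
    static void resolveInBatch(List<Role> roles, FakeTalentTable table) {
        Set<Integer> keys = new LinkedHashSet<>();
        for (Role role : roles) {
            if (role.talentName == null) {
                keys.add(role.talentId);
            }
        }
        if (keys.isEmpty()) {
            return;  // nothing left to fetch, so no round trip at all
        }
        Map<Integer, String> found = table.select(keys);
        for (Role role : roles) {
            if (role.talentName == null) {
                role.talentName = found.get(role.talentId);
            }
        }
    }
}

// Source object holding a foreign key and a not-yet-fetched destination value.
class Role {
    final int talentId;
    String talentName;  // null until resolved

    Role(int talentId) {
        this.talentId = talentId;
    }
}

// Pretend database table that counts how many queries it receives.
class FakeTalentTable {
    final Map<Integer, String> rows = new HashMap<>();
    int roundTrips;

    Map<Integer, String> select(Set<Integer> keys) {
        roundTrips++;
        Map<Integer, String> result = new HashMap<>();
        for (Integer key : keys) {
            if (rows.containsKey(key)) {
                result.put(key, rows.get(key));
            }
        }
        return result;
    }
}
```

With a cast of 20, the first strategy visits the table 20 times and the second visits it once, which matches the per-movie saving described earlier.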
Deferred Faulting
New to WebObjects 4.5 is deferred faulting, a performance-focused approach to dealing with faults. While incredibly interesting, we aren't going to be able to do it justice within the confines of this more general article. Since WO4.5, all EOGenericRecords automatically do deferred faulting so you will definitely be hearing much more about it.
What to look for in code
A method will distinguish itself as a fault-firing method pretty quickly in a generated EO class file. In Java, every accessor method will contain a call to willRead(). In Objective-C, you will see an accessor that reads [self storedValueForKey:@"someKey"]. Remember that all accessors in a generated EO class are fault-firing methods. If a fault cannot respond natively to a method without firing, it will fire and fetch the real object in order to fulfil the response.
For array faults, the easiest way to cause the fault to fire is by sending the count() message to it (count in Objective-C); this will fire any to-many relationship-based faults. If you need to fire a to-one fault, you can either send it a message that you know the original object will respond to, or you can send the fault a willRead() message. In Objective-C, you can send any variety of messages that the original class should respond to, or you can simply use the EOFault class object to fire the fault by sending the [EOFault clearFault:aFault] message. Almost any message that the original class will respond to should do the trick.
Reverting an object back into a fault
If you need to convert an object that was faulted back into its original fault (for example to redisplay the state of a relationship after recent editing), there are a few more hoops to jump through. You need to get the EOGlobalID for an EO, stuff the EOGlobalID into an NSArray, and send a message to the EOEditingContext telling it to invalidate the objects with that particular global ID.
Here is some sample code to refault the MovieRole objects that have been fetched:
Java
public void refaultMovieRoles() {
    // Assume movieRoles is an ivar
    EOGlobalID gid = null;
    int i = -1;
    EOEnterpriseObject eo = null;
    NSMutableArray gids = new NSMutableArray();
    EOEditingContext ec = null;
    for (i = 0; i < movieRoles.count(); i++) {
        // objectAtIndex() returns Object, so a cast is required
        eo = (EOEnterpriseObject)movieRoles.objectAtIndex(i);
        ec = eo.editingContext();
        gid = ec.globalIDForObject(eo);
        gids.addObject(gid);
    }
    session().defaultEditingContext()
        .invalidateObjectsWithGlobalIDs(gids);
    return;
}
Objective-C
- (void)refaultMovieRoles
{
    // Assume movieRoles is an ivar
    EOGlobalID *gid = nil;
    int i = -1;
    id eo = nil;
    NSMutableArray *gids = [NSMutableArray array];
    EOEditingContext *ec = nil;
    for (i = 0; i < [movieRoles count]; i++) {
        eo = [movieRoles objectAtIndex:i];
        ec = [eo editingContext];
        gid = [ec globalIDForObject:eo];
        [gids addObject:gid];
    }
    [[[self session] defaultEditingContext]
        invalidateObjectsWithGlobalIDs:gids];
    return;
}
The invalidateObjectsWithGlobalIDs() method (invalidateObjectsWithGlobalIDs: in Objective-C) is very powerful. It invalidates the entire undo stack for those particular objects within the editing context. It also signals the underlying object stores that the objects will need to be refetched the next time they are accessed (just like what would happen for a pure fault).
There are other methods, such as invalidateAllObjects() (invalidateAllObjects in Objective-C), which will completely discard all EOs in the editing context as well as dump the snapshots at the EODatabaseContext level. This is a massive snapshot dumping operation in a traditional session-based editing context application and should be used very sparingly.
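The refetch-on-next-access behavior can be modeled in a few lines of plain Java. Again, this is a sketch of the idea rather than EOF code: Refaultable and its invalidate() method are hypothetical names, and the supplier stands in for a fetch against the database.

```java
import java.util.function.Supplier;

// Hypothetical illustration only: Refaultable is NOT an EOF class.
class RefaultDemo {
    public static void main(String[] args) {
        String[] databaseRow = {"Role: Rick"};
        Refaultable<String> role = new Refaultable<>(() -> databaseRow[0]);

        System.out.println(role.get());   // fetched on first access
        databaseRow[0] = "Role: Ilsa";    // the row changes behind our back
        System.out.println(role.get());   // still the cached value
        role.invalidate();                // turn the object back into a fault
        System.out.println(role.get());   // next access refetches the new row
    }
}

// Holds a cached value that can be discarded so the next access refetches.
class Refaultable<T> {
    private final Supplier<T> fetcher;
    private T snapshot;
    private boolean fault = true;

    Refaultable(Supplier<T> fetcher) {
        this.fetcher = fetcher;
    }

    boolean isFault() {
        return fault;
    }

    T get() {
        if (fault) {
            snapshot = fetcher.get();  // one round trip per (re)fetch
            fault = false;
        }
        return snapshot;
    }

    // Drops the cached value; nothing is fetched until get() is called again.
    void invalidate() {
        snapshot = null;
        fault = true;
    }
}
```

Invalidation itself is cheap because it only discards state; the cost is paid later, on the next access, which is why invalidating everything at once can trigger a flood of refetches.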
Things to Look for
If you have an object model where important entities have many relationships hanging off of them, you will want to look for opportunities to create a batch fault and optimize the performance of your application. If you have an array of objects that also have important individual to-one relationship targets, think seriously about creating a batch fault for those as well.
Faulting is a very cool mechanism that significantly simplifies the problem of whether or not to fetch the entire object graph based upon an entity. If you don't need to fault an EO into memory, then it is best to leave it alone; however, if you do need an EO on the other side of a relationship, then just reach out and fault it.