Recursion
Volume Number: 16 (2000)
Issue Number: 11
Column Tag: Mac OS X
Recursion: The Programmer's Friend
by Andrew C. Stone
A programmer's bag of tricks is loaded with heuristics, algorithms, and techniques, but of these, few are as powerful as recursion. The Oxford English Dictionary recursively defines recursion as "The application or use of a recursive procedure or definition"! Recursive has the definition "Involving or being a repeated procedure such that the required result at each step except the last is given in terms of the result(s) of the next step, until after a finite number of steps a terminus is reached with an outright evaluation of a result."
In software, recursion is when a function or method calls itself, over and over, with slightly differing arguments. Of course this sounds like the perfect recipe for an infinite loop, but we will design in an exit condition so you don't end up in Cupertino! Recursion allows the writing of elegant and terse code once you understand how it works.
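As a minimal sketch of a self-call with an exit condition, here is the classic factorial written in plain C (which an Objective-C compiler also accepts). The function name and the overflow guard are illustrative assumptions, not part of the original article:

```c
#include <assert.h>

/* Hypothetical helper, not from the article: n! computed recursively. */
unsigned long factorial(unsigned int n) {
    /* 12! is the largest factorial that fits in a 32-bit unsigned long. */
    assert(n <= 12);
    if (n == 0)                     /* exit condition: no further self-call */
        return 1;
    return n * factorial(n - 1);    /* same function, slightly smaller argument */
}
```

Calling factorial(5) makes six nested calls (for 5, 4, 3, 2, 1, and 0) and returns 120; without the n == 0 test the calls would never stop.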
Recursion abounds in nature, and can be visualized by thinking of the fractal Mandelbrot set: no matter how deep into the set you go, the same forms appear over and over, in an increasingly minute, yet perfectly replicated form.
Programmers often use a recursive data structure called a tree to represent hierarchical data. A tree is a root (now that's an oxymoron!) with zero or more subtrees. Each subtree consists of a root node with zero or more subtrees. A subtree node with no branches or children is a leaf node. (Tree terminology uses both botanical (root/branch) and genealogical (parent/child) terms.) A classic use of recursion is for tree traversal, where you want to perform some action on each node in the tree.
Figure 1. A Tree with nodes labeled postorder.
A tree can be implemented in various ways, depending on the structure and use of the tree. It's beyond the scope of this article to cover the implementation of "scales well to large N" trees such as the B-tree; however, for a reasonably small number of nodes (e.g. under 1000), arrays of arrays will work fine. Let's define a simplistic Node as:
@interface Node : NSObject {
    id data;    // an opaque pointer to some kind of data
    NSArray *_children;
    // if an internal node, this contains children nodes
    // if this is a leaf node, it contains 0 elements.
}
// query the Node:
- (NSArray *)children;
- (id)data;
// establish the Node:
- (void)setChildren:(NSArray *)newKids;
- (void)setData:(id)newData;
@end
And our tree controller object would look like:
@interface TreeController : NSObject {
    Node *_rootOfTreeNode;    // the root is all you need to get at any Node
}
- (void)visitNode:(Node *)node;
- (void)printData:(Node *)node;
@end
Walking a Tree: Kinds of Traversals
There are three types of tree traversals: preorder, inorder, and postorder. Each defines whether you work on a node before, between, or after working on its children. In preorder, you work on the parent and then do a preorder traversal of each of the children. In inorder, you do an inorder traversal of the leftmost child, work on the parent, and then do an inorder traversal of each of the remaining children. In postorder, you do a postorder traversal of each of the children and then work on the parent.
Here's the code for a recursive preorder traversal:
- (void)visitNode:(Node *)node {
    NSArray *childrenToVisit = [node children];
    NSUInteger i, count = [childrenToVisit count];
    // visit the current node:
    [self printData:node];
    // if there are no children, the loop body never runs and recursion ends:
    for (i = 0; i < count; i++)    // make recursive call:
        [self visitNode:[childrenToVisit objectAtIndex:i]];
}
- (void)printData:(Node *)node {
    // this is just an example - in reality you might do something useful!
    NSLog(@"%@", [node data]);
}
To traverse an entire tree, you would simply start the recursive process at the top of the tree by calling our method with the root node as the argument:
[self visitNode:_rootOfTreeNode];
Note that to perform a postorder traversal, you just move the "printData" (our placeholder for the action we wish to perform on any node) to the bottom of the method:
- (void)visitNodePostorder:(Node *)node {
    NSArray *childrenToVisit = [node children];
    NSUInteger i, count = [childrenToVisit count];
    for (i = 0; i < count; i++)
        [self visitNodePostorder:[childrenToVisit objectAtIndex:i]];
    // now visit the current node:
    [self printData:node];
}
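The listings above cover preorder and postorder. The inorder rule described earlier (leftmost child first, then the parent, then the remaining children) can be sketched in the same way. This version is plain C, which an Objective-C compiler also accepts. The TreeNode struct, the visitInorder name, and the sample tree are assumptions made for illustration, and "visiting" a node here just appends its label to a string:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 3

/* Hypothetical C stand-in for the article's Node: a one-letter label plus children. */
typedef struct TreeNode {
    char label;
    size_t childCount;
    const struct TreeNode *children[MAX_CHILDREN];
} TreeNode;

/* Sample tree (shape assumed): A has children B, G, H; B has C, F; C has D, E. */
const TreeNode nodeD = {'D', 0, {NULL}};
const TreeNode nodeE = {'E', 0, {NULL}};
const TreeNode nodeF = {'F', 0, {NULL}};
const TreeNode nodeG = {'G', 0, {NULL}};
const TreeNode nodeH = {'H', 0, {NULL}};
const TreeNode nodeC = {'C', 2, {&nodeD, &nodeE}};
const TreeNode nodeB = {'B', 2, {&nodeC, &nodeF}};
const TreeNode nodeA = {'A', 3, {&nodeB, &nodeG, &nodeH}};

/* Appends labels to the NUL-terminated string in out, in inorder.
   The caller must provide room for every label plus the terminator. */
void visitInorder(const TreeNode *node, char *out) {
    size_t i, len;
    assert(node != NULL && out != NULL);
    /* leftmost child first (skipped for a leaf, which ends the recursion) */
    if (node->childCount > 0)
        visitInorder(node->children[0], out);
    /* then the node itself */
    len = strlen(out);
    out[len] = node->label;
    out[len + 1] = '\0';
    /* then each remaining child */
    for (i = 1; i < node->childCount; i++)
        visitInorder(node->children[i], out);
}
```

Starting from an empty buffer (char buf[16] = "";) and calling visitInorder(&nodeA, buf) leaves "DCEBFAGH" in buf. A preorder walk of the same tree would give "ABCDEFGH", and a postorder walk "DECFBGHA".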
Stacks and Stacks of Stacks
So how does this all work? Every programming language that supports recursion (including Objective-C) maintains a stack of information about parameters and local variables for each time a procedure, function, or method is called. When a procedure is called, the information is pushed onto the stack; when the procedure exits, the information is popped from the stack.
Given the tree pictured in Figure 1, let's mentally walk through what's happening. First we call visitNodePostorder: on the root node (node A), placing this method call on the stack. Node A has 3 children, so we call visitNodePostorder: on the first child (node B), pushing that call on the stack. Node B has 2 children, so we call visitNodePostorder: on its first child (node C), placing that call on the stack. We then call the method on node C's first child, node D. Now the stack is 4 deep, but this final node is a leaf, so we call printData: on node D and its call is popped. We're back in the call for node C, at 3 deep, and the method is called on the second child (node E), increasing the stack to 4 again. Since it is also a leaf, we print it, pop the method call, and we're back down to 3. Now node C prints itself, and we're down to 2 deep in the stack. The second child of node B is a leaf, so we go 3 deep, and after printing, we're back at 2 deep. Node B is then printed, and we're back to the first frame in the stack (node A), on the second iteration of its for loop. Since the second child (node G) of the root node is a leaf, it gets printed, and we move on to the third child. And so on for the rest of the tree.
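The walk-through can be checked with a small sketch that records how deep the call stack is each time a node is visited. It is written in plain C (also valid Objective-C) and rests on several assumptions: the TreeNode and Trace structs and the function name are hypothetical, the labels F and H stand in for the two nodes the text leaves unnamed, and the root's third child is assumed to be a leaf:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 3
#define MAX_NODES 32

/* Hypothetical C stand-in for the article's Node. */
typedef struct TreeNode {
    char label;
    size_t childCount;
    const struct TreeNode *children[MAX_CHILDREN];
} TreeNode;

/* Tree assumed from the walk-through: A has B, G, H; B has C, F; C has D, E. */
const TreeNode nodeD = {'D', 0, {NULL}};
const TreeNode nodeE = {'E', 0, {NULL}};
const TreeNode nodeF = {'F', 0, {NULL}};
const TreeNode nodeG = {'G', 0, {NULL}};
const TreeNode nodeH = {'H', 0, {NULL}};
const TreeNode nodeC = {'C', 2, {&nodeD, &nodeE}};
const TreeNode nodeB = {'B', 2, {&nodeC, &nodeF}};
const TreeNode nodeA = {'A', 3, {&nodeB, &nodeG, &nodeH}};

/* Record of a traversal; zero-initialize it before the first call. */
typedef struct Trace {
    size_t maxDepth;             /* deepest the call stack ever got */
    size_t visited;              /* number of nodes visited so far */
    char order[MAX_NODES];       /* labels in the order they were visited */
    size_t depthAt[MAX_NODES];   /* call depth at which each label was visited */
} Trace;

/* Postorder traversal; depth is the number of active calls, 1 for the root. */
void visitPostorderTraced(const TreeNode *node, size_t depth, Trace *trace) {
    size_t i;
    assert(node != NULL && trace != NULL);
    if (depth > trace->maxDepth)
        trace->maxDepth = depth;
    /* each recursive call adds one frame to the stack */
    for (i = 0; i < node->childCount; i++)
        visitPostorderTraced(node->children[i], depth + 1, trace);
    /* children are done, so now visit this node */
    assert(trace->visited < MAX_NODES - 1);
    trace->order[trace->visited] = node->label;
    trace->depthAt[trace->visited] = depth;
    trace->visited++;
    trace->order[trace->visited] = '\0';
}
```

With Trace trace = {0}; and a call to visitPostorderTraced(&nodeA, 1, &trace), the recorded order is "DECFBGHA" at depths 4, 4, 3, 3, 2, 2, 2, 1, and maxDepth ends at 4, matching the pushes and pops described above.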
Figure 2. Snapshots of the stack during the tree traversal.
So, you can see that the stack will grow to the depth of the tree. If you are working with very deep trees and you're concerned with implementation efficiency and want to avoid the overhead inherent in making method calls, you can remove the recursion by managing your own stack of variables. However, this results in much more complex, less elegant, less maintainable code. I'd avoid it in all but the most extreme situations.
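For comparison, here is a rough sketch of what managing your own stack can look like for the simplest case, a preorder traversal. It is plain C (also valid Objective-C); the TreeNode struct, the fixed MAX_PENDING bound, the function name, and the sample tree are all assumptions for illustration. A postorder version needs additional per-node bookkeeping, which is where the extra complexity comes from:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 3
#define MAX_PENDING 64    /* assumed upper bound on nodes waiting on our stack */

/* Hypothetical C stand-in for the article's Node. */
typedef struct TreeNode {
    char label;
    size_t childCount;
    const struct TreeNode *children[MAX_CHILDREN];
} TreeNode;

/* Sample tree (shape assumed): A has B, G, H; B has C, F; C has D, E. */
const TreeNode nodeD = {'D', 0, {NULL}};
const TreeNode nodeE = {'E', 0, {NULL}};
const TreeNode nodeF = {'F', 0, {NULL}};
const TreeNode nodeG = {'G', 0, {NULL}};
const TreeNode nodeH = {'H', 0, {NULL}};
const TreeNode nodeC = {'C', 2, {&nodeD, &nodeE}};
const TreeNode nodeB = {'B', 2, {&nodeC, &nodeF}};
const TreeNode nodeA = {'A', 3, {&nodeB, &nodeG, &nodeH}};

/* Writes labels in preorder into out (capacity outSize, terminator included)
   without any recursive calls. Returns the number of nodes visited. */
size_t visitPreorderIterative(const TreeNode *root, char *out, size_t outSize) {
    const TreeNode *pending[MAX_PENDING];   /* our own stack of nodes to visit */
    size_t top = 0, visited = 0;
    assert(root != NULL && out != NULL && outSize > 0);
    pending[top++] = root;
    while (top > 0) {
        const TreeNode *node = pending[--top];   /* pop */
        size_t i;
        assert(visited + 1 < outSize);
        out[visited++] = node->label;            /* visit */
        /* push children right to left so the leftmost is popped first */
        for (i = node->childCount; i > 0; i--) {
            assert(top < MAX_PENDING);
            pending[top++] = node->children[i - 1];
        }
    }
    out[visited] = '\0';
    return visited;
}
```

Calling visitPreorderIterative(&nodeA, buf, sizeof buf) with a 16-byte buf returns 8 and leaves "ABCDEFGH" in buf, the same order the recursive visitNode: would produce. Even in this easy case the loop, the bounds checks, and the reversed push order obscure the intent more than the four-line recursive method does.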
Conclusion
As programming challenges arise, remember this recursive jewel I learned many years ago in Computer Science school: if you cannot tackle a problem, divide it into smaller problems which you can tackle. Eventually you'll have whittled the problem down to something manageable, and applying this recursive problem-solving tool will usually result in a readable and maintainable software solution.
Andrew Stone <andrew@stone.com> is founder of Stone Design Corp <http://www.stone.com/> and has spent 12 years writing Cocoa software in its various incarnations.