Simple Call Stack Logging
Volume Number: 25
Issue Number: 06
Column Tag: Programming
Simple Call Stack Logging
Who called NSLog()?
by Sengan Baring-Gould
In this article I present an extension to NSLog() which not only prints out a user specified message but also lists the function calls that led to its invocation. By the end of this article you will have a new tool you can use in your own applications, and you'll understand how it works so that you can adapt it to your needs.
Why log?
NSLog() is an important tool for debugging. It can be placed anywhere in an application to log internal state.
Many programmers prefer logging to using a debugger, as it helps them concentrate on possible causes of a problem while filtering out irrelevant information. Logging provides a complete textual record of the problem that can be studied later.
Debuggers on the other hand interrupt the developer who must record by hand all the relevant state before letting the application continue. Continuing is an unforgiving operation: once performed, prior state that was not recorded is lost.
Brian W Kernighan (one of the authors of the seminal text on the C language) said:
"As a personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two. One reason is that it is easy to get lost in details of complicated data structures and control flow; we find stepping through a program less productive than thinking harder and adding output statements and self-checking code at critical places. Clicking over statements takes longer than scanning the output of judiciously-placed displays. It takes less time to decide where to put print statements than to single-step to the critical section of code, even assuming we know where that is. More important, debugging statements stay with the program; debugger sessions are transient".
However, if NSLog() is invoked from a function that is called from many other functions, determining which function call caused the bug can be very difficult. We need a record of the function calls that led to the invocation of NSLog().
Where we are headed
Our new debug function debugLog() will print out any message we want the same way NSLog() does. Following the message, it will list the function invocations that led it to be called:
2009-04-05 15:37:58.119 TestDebugLog[14442:10b] C++ constructor
0000301f CPP::CPP() + 33 (TestDebugLog)
0000316c main + 86 (TestDebugLog)
00001a5e start + 54 (TestDebugLog)
The first line follows NSLog()'s traditional format: the date, the time, the name of the application (TestDebugLog) and then the message we passed as argument: "C++ Constructor".
On the next lines, debugLog() lists the function invocations that led it to be called:
- debugLog() was called from the C++ constructor CPP::CPP() defined in TestDebugLog.
- CPP:CPP() was called by main also defined in TestDebugLog.
- main was called by start also defined in TestDebugLog. (start is the function the Operating System calls when it starts an application).
The module name between parentheses specifies where a function is defined. In the following example, NSPopAutoreleasePool is shown to be defined in the Foundation framework:
2009-04-05 15:40:03.921 TestDebugLog[14462:10b] Objective-C dealloc
000030d3 [Objc dealloc] + 33 (TestDebugLog)
91117e4f NSPopAutoreleasePool + 431 (Foundation)
000031d1 main + 207 (TestDebugLog)
00001a4a start + 54 (TestDebugLog)
Obtaining the list of function invocations
Obtaining the list of function invocations to print is a two step process. First, debugLog() must obtain the addresses of the functions that called it. A computer uses addresses to keep track of what it is doing, but addresses are not specified in a program's source-code.
Then debugLog() must translate these addresses into the function names that appear in the program's source code. Three different methods are required to obtain C, C++, and Objective-C function names.
1. Retrieving the list of called functions' addresses
Compilers transform source-code into machine code that computers understand. When a function is called, the caller's return address must be saved so that the CPU can continue running the caller after the function invocation completes. In the context of this discussion, we will assume that these return addresses are stored on the stack. (We will ignore specific optimizations used by the PowerPC and x86 CPUs).
Unfortunately, the stack also records other information, such as local variables and function arguments. Determining precisely which items in the stack are return addresses requires compiler specific knowledge. Conveniently, the new version of gcc which ships with Leopard provides a new function, backtrace(), which gives us the return addresses in the current stack.
Remember that inlined functions are embedded within their callers rather than being invoked. This means backtrace will not see them in the stack and they will not be listed by debugLog().
2. Function layout in memory
Functions are compiled independently by the compiler and occupy contiguous areas of memory. Therefore if we know the starting addresses of any two consecutive functions f and g, and if we have an address x which falls between f and g ( &f <= x < &g ), we know that x belongs to the function f.
One rarely has addresses that match the beginning of each function precisely. Therefore function-lookup functions are designed to return information about the preceding function when given an address. Our case matches this scenario: the return addresses provided by backtrace occur within functions. Thus we can safely ignore the difference between return addresses and starting function addresses for most of this discussion. Similarly, we'll adopt the standard convention of referring to starting function addresses as function addresses.
3. Retrieving C function names
When an application is first loaded into memory, it needs to be told the addresses of the external library functions it wants to use. Because libraries are updated independently of applications, the addresses of their functions may change, although the names of their functions will not. It is the responsibility of the dynamic linker to give each application this information.
The dynamic linker reads function names from a symbol table embedded in the application and the libraries the application uses. The symbol table lists all the C function addresses and the C function names that can be accessed externally. Therefore if we know an address, we can ask the dynamic linker for the corresponding function name. The backtrace_symbols() function provides this functionality.
Because the dynamic linker only knows about externally visible function names, backtrace_symbols() always returns the nearest preceding external symbol. Static C functions are not exported and will not be given the correct name by the linker.
Most symbols will be exported as external if you compile your application in Debug Configuration. This is not true if you compile it in Release Configuration. The UNIX utility nm lists exported function names with a preceding capital T letting you check if an unexpected function name shows up.
4. Retrieving Objective-C method names
Objective-C does not use the dynamic linker. Instead it uses the Objective-C runtime, which like the dynamic linker keeps track of all known method addresses and names. (A method is a function defined within a class). However there is no equivalent to backtrace_symbols() which returns a function name when given a function address. We must build one ourselves.
The Objective-C runtime provides a function to enumerate all the Objective-C classes that can be invoked by the application, including those in the frameworks bound to the application. It also provides a function to enumerate the methods in any Objective-C class. With these components we can obtain every method's address and name.
Implementation
This code is Objective-C++ so don't forget to use the ".mm" extension for your implementation filename. We start with the necessary includes:
DebugLog Implementation File: SBGDebug.mm
#include "SBGDebug.h"
#include <cxxabi.h>
#include <map>
#include <string>
#include <execinfo.h>
#include <stdio.h>
#import <objc/Object.h>
1. Finding Objective-C methods
We want to build a method-lookup function that will return the method in which an address lies. We know how to enumerate the name and address of every Objective-C method available to the application, but this enumeration is expensive. Therefore we need a data structure in which to store the locations of the member functions.
The data structure must return the correct member function when queried with any address belonging to that member function. NSDictionary does not provide this functionality, but C++'s Standard Template Library's (STL) map container does.
map::upper_bound returns an iterator to the first element in the map whose key is larger than the queried key. For a map with method addresses as keys and method names as values, map::upper_bound will return an iterator pointing to the name of the method following the one we are looking for. Simply decrementing the iterator will make it point to the preceding method name.
// Lookup Function names
static std::map<uint32, std::string>* objectiveCMethodNames = NULL;
inline static uint32 lookupFunction(uint32 addr, const char** name)
{
if (objectiveCMethodNames == NULL)
return NULL;
// Find the next function
std::map<uint32, std::string>::iterator iter
= objectiveCMethodNames->upper_bound( addr );
// Go back a function: now we are looking at the right one!
iter;
*name = iter->second.c_str();
return iter->first;
}
Populating the STL map is a simple matter of iterating through all the classes known to the Objective-C runtime, and enumerating their methods.
// Add classes by stepping through their method lists.
inline static void addObjectiveCMethod(uint32 addr, const char* name)
{ (*objectiveCMethodNames)[addr] = std::string(name); }
void addClass(Class c)
{
unsigned int method_count;
Method *method_list = class_copyMethodList(c, &method_count);
for (int i = 0; i < method_count; i++)
{
Method func = method_list[i];
const char* name = sel_getName( method_getName( func ) );
uint32 addr = (uint32) method_getImplementation( func );
addObjectiveCMethod( addr,
[[NSStringstringWith Format:@"[%s %s]",
class_getName(c), name] cString]);
}
}
We could call the code that enumerates the Objective-C methods explicitly from main, but that requires remembering to add the call to each new application that uses debugLog(). Instead I can put the enumeration code into the load method of an Objective-C class which is guaranteed to be called if debugLog() is built into the application. The only gotcha is that the load method is invoked before Cocoa has created an NSAutoreleasePool. That's why the code below creates its own NSAutoreleasePool to avoid memory leaks.
@implementation SBGDebug
+ (void) load
{
if (objectiveCMethodNames != NULL)
return;
NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
objectiveCMethodNames = new std::map<uint32, std::string>();
int numClasses = objc_getClassList(NULL, 0);
if (numClasses > 0 )
{
Class *classes = (Class*) malloc(sizeof(Class) * numClasses);
numClasses = objc_getClassList(classes, numClasses);
for (int i = 0; i < numClasses; ++i)
addClass(classes[i]);
free(classes);
}
[pool release];
}
@end
2. Deciding whether a function is C, C++ or Objective-C
The return addresses provided by backtrace() could belong to a C function, a C++ function or an Objective-C method. We need a way to decide which case we're dealing with. We start by asking both the C/C++ function-lookup function and the Objective-C method-lookup function to what function they believe a return address belongs. We will obtain two addresses a and b which should both be smaller than the return address r. Because functions are contiguous and do not intersect with each other, a and b must differ, and one of them must be lower than the other. For the same reason, the return address r cannot belong to the function with the lower address, as that function must end before the higher address. Therefore we use the name of the function that starts at the higher address:
extern "C" void debugLog(NSString* format, ...)
{
// Print the debug message
va_list arguments;
va_start(arguments,format);
NSLogv(format, arguments);
// Dump the callstack
uint32 callstack[128];
int frames = backtrace((void**) callstack, 128);
char** strs = backtrace_symbols((void**) callstack, frames);
for (int i = 1; i < frames; ++i)
{
char functionSymbol[64*1024];
char moduleName [64*1024];
int offset = 0;
sscanf(strs[i], "%*d %s %*s %s %*s %d", &moduleName,
&functionSymbol, &offset);
uint32 addr = callstack[i];
if (objectiveCMethodNames)
{
const char* objcName;
uint32 objcAddr = lookupFunction(addr, &objcName);
if ( (objcAddr != 0)
&& (addr > objcAddr)
&& (addr - objcAddr < offset))
{ printf("\t%8.8x %s + %d\t\t(%s)\n", addr, objcName,
addr - objcAddr, moduleName);
continue; };
};
int validCppName;
char* functionName = abi::__cxa_demangle(functionSymbol, NULL, 0,
&validCppName);
if (validCppName == 0)
printf( "\t%8.8x %s + %d\t\t(%s)\n",
addr, functionName, offset, moduleName);
else
printf( "\t%8.8x %s + %d\t\t(%s)\n",
addr, functionSymbol, offset, moduleName);
if (functionName)
free(functionName);
}
free(strs);
}
The output of backtrace_symbols() is an array of C-strings. We use sscanf() to parse them. The resulting function names are passed to the C++ demangler to convert into human-readable form if they are C++ names.
debugLog() uses printf() rather than NSLog() to avoid printing the application's name at the beginning of each line.
debugLog() is declared as extern "C" so it can be linked directly to Objective-C code. The class interface file does the same:
DebugLog Interface File: SBGDebug.h
@interface SBGDebug : NSObject
+ (void) load;
@end
#ifdef __cplusplus
extern "C" {
#endif
void debugLog(NSString* format, ...);
#ifdef __cplusplus
};
#endif
Using DebugLog
Invoke debugLog() just as you would NSLog():
Test file: TestDebugLog.mm
#import "SBGDebug.h"
// C++ test
struct CPP { CPP(); ~CPP(); };
CPP::CPP() { debugLog(@"C++ Constructor"); };
CPP::~CPP() { debugLog(@"C++ Destructor"); };
// Objective C test
@interface Objc : NSObject
@end
@implementation Objc
- (id) init
{ debugLog(@"Objective-C init"); return [super init]; };
(void) dealloc
{ debugLog(@"Objective-C dealloc"); [super dealloc]; };
@end
// Main
int main(int argc, const char* argv[])
{
NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
CPP cpp;
Objc* objc = [[[Objc alloc] init] autorelease];
debugLog(@"C test arguments work too: %@", objc);
[pool release];
return 0;
}
Conclusion
Leopard provides all the components necessary to build a cross-platform NSLog() which can print function call traces. Although the code I provide assumes compilation to a 32-bit executable, extending it to 64 bits should be straightforward as only standard library functions are used.
You can download the entire Objective-C++ project from the MacTech ftp source archive at ftp.mactech.com/src/mactech/volume25_2009/25.06.sit.
Don't forget to compile it and run it in Debug Configuration!
References:
backtrace() : man backtrace.
backtrace_symbols() limitations : http://lists.apple.com/archives/darwin-dev/2009/Mar/msg00111.html.
backtrace_symbols() : man backtrace_symbols.
abi::__cxa_demangle : http://www.ib.cnea.gov.ar/~oop/biblio/libstdc++/namespaceabi.html.
Objective-C runtime: http://developer.apple.com/documentation/Cocoa/Reference/ObjCRuntimeRef/ObjCRuntimeRef.pdf.
The Standard Template Library: http://www.sgi.com/tech/stl/
Dr. Sengan Baring-Gould is a Boulder, Colorado-based independent Mac OS X developer and writer. He is available for consulting and specializes in Algorithms, AI, Cocoa, Debugging tools, High performance code, and UIs. He can be reached at sengan@ansemond.com.