Modern DebugStr
Volume Number: 12
Issue Number: 9
Column Tag: Debugging Aids
DebugStr, the Modern Way
Capturing your program's iostream of consciousness
By Jon Kalb
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
I have a friend who spends most of his time in cutting-edge (read, "reliable tools not yet available") environments. He likes to say: "There is no debugger; there is only DebugStr." While he may stretch the truth, he has a point. Our friends who build source-level debugging tools are a heroic lot, but there are times when MacsBug is the only option. This article, along with the accompanying dout Library, is one attempt to make using DebugStr a little more modern.
Using DebugStr is usually a clumsy process of converting integer values to strings and concatenating Pascal strings. There had to be a better way, I thought. That's when it occurred to me to implement the low-level debugger as a C++ output stream.
The Solution Domain
This approach may be valuable for two groups: fellow users of DebugStr, and individuals interested in using and/or implementing C++ streams. But first, some limitations:
I use the term "library" rather loosely. The dout Library is really just a small collection of source files. I also include project files for both Metrowerks and Symantec, to demonstrate how to use the library.
In no way does this library replace a source-level debugger. If it is possible for you to use one, do so.
Although, for the sake of clarity, I don't follow this practice in this article, in real code you should always surround every use of the dout stream with some type of #ifndef NDEBUG statement. (NDEBUG is the preprocessor symbol that controls the assert macro found in assert.h. Define NDEBUG for shipping code, and leave it undefined for code under development.) To make this task a little easier, the dout Library includes the data statement macro, ds. For example, the line ds(dout << myObject); would be completely stripped if NDEBUG were defined, and would become dout << myObject; if NDEBUG weren't defined.
Streaming the semicolon character ';' to dout is problematic. MacsBug interprets the text following a semicolon as a MacsBug command, and attempts to execute text that you intended to display.
Below are examples of overloading the insertion operator for objects to support debugging in a streams environment. This is possible if your code base doesn't already use insertion operator overloading for other purposes. It may be possible to use the same routine for, say, storing an object to disk and for debugging purposes, but it seems unlikely. I suggest a work-around later.
Streaming data to dout in the constructor of a static or global object (an object that is constructed before the first line of main is executed) is problematic. More on this below.
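The ds macro mentioned in the limitations above is not listed in this article. The following is only a guess at how such a macro could be written, assuming it simply passes its argument through when NDEBUG is undefined and discards it otherwise. The variadic form and the helper traceValue are assumptions made for illustration, and an ostringstream stands in for dout.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical definition of a "ds"-style macro (not the library's own).
// With NDEBUG defined the wrapped statement disappears entirely;
// otherwise it is emitted unchanged.
#ifdef NDEBUG
#define ds(...)
#else
#define ds(...) __VA_ARGS__
#endif

// Illustrative helper: 'log' stands in for dout.
std::string traceValue(int value)
{
    std::ostringstream log;
    ds(log << "value: " << value);   // stripped in shipping builds
    return log.str();
}
```

In a development build traceValue(3) yields the text "value: 3"; in a build with NDEBUG defined it yields an empty string, and no streaming code is compiled at all.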
Using the Library
To use the dout Library you must include the standard C++ iostreams and the source file debugbuf.cp. Each source file that writes to the dout stream must include the header file doutstream.
Standard Streams
You can use dout just as you would cout. For example:
#include "doutstream"
// ...
dout << "file corrupted at offset " << byteOffset << endl;
// ...
dout << "name: " << un << endl << "password: " << pw << endl;
// ...
dout << "completed pass " << i << " of " << total << endl;
// ...
You can use dout just like any standard C++ output stream, but, since the debugger is more than just a byte sink, we should be able to do more, and we can.
The dout Difference
The library defines a set of symbols that cause the stream to behave in interesting ways. Normally, the stream buffers characters until it fills a line (arbitrarily defined as 60 characters) or until the caller streams an endl. You can flush this buffer at any time by streaming an endl. (endl is the standard C++ streams manipulator function for advancing to the next line.) dout produces a blank line if there is currently no text in the stream's buffer. To flush the buffer without creating blank lines, stream the constant doutsoftflush ('\r').
dout supports both horizontal and vertical tabs when streaming douttab ('\t') or doutverttab ('\v'). Streaming doutformfeed ('\f') flushes the buffer and inserts a line of underscores. Streaming doutbackspace ('\b') eats the last character in the buffer. Streaming doutsysbeep ('\a') calls SysBeep; however, just like other data, this is buffered. (To beep immediately, follow doutsysbeep with endl.)
Streaming data to dout doesn't normally suspend your program the way that a call to DebugStr would. If you want to drop into the debugger, then stream either doutdebug ('\0') or doutdropin ('\0'). These are synonyms; either one will flush the buffer and leave you in the debugger with your application suspended. (Streaming the End-of-Text character ('\x03'), usually associated with the Enter key, will do the same thing.)
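The symbols described above are ordinary characters, so they can be pictured as a handful of constants. The sketch below assumes they are declared as const char values; the names and character values come from the text, but the declaration form is an assumption, since the doutstream header is not listed here. The function sampleLine is hypothetical and uses an ostringstream in place of dout.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Assumed declarations; values are those given in the article.
const char doutsoftflush = '\r';   // flush only if the buffer holds text
const char douttab       = '\t';   // horizontal tab
const char doutverttab   = '\v';   // vertical tab
const char doutformfeed  = '\f';   // flush, then a line of underscores
const char doutbackspace = '\b';   // remove the last buffered character
const char doutsysbeep   = '\a';   // queue a SysBeep
const char doutdebug     = '\0';   // flush and stay in the debugger
const char doutdropin    = '\0';   // synonym for doutdebug
const char doutcommand   = ';';    // introduces a MacsBug command

// Illustrative only: shows that the symbols travel through any ostream
// as plain characters; interpreting them is the stream buffer's job.
std::string sampleLine()
{
    std::ostringstream out;
    out << "a" << douttab << "b" << doutsoftflush;
    return out.str();
}
```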
As a side-effect of using MacsBug for streaming output, we can execute MacsBug commands. If a semicolon appears in the data stream, MacsBug attempts to execute the text that follows as a MacsBug command. (This can produce unexpected results if the data you are streaming just happens to contain a semicolon.) Several commands can be executed in sequence as long as each command is preceded by a semicolon.
Although you can use the dout Library in conjunction with source-level debuggers (products from Jasik, Metrowerks and Symantec support capturing messages sent with DebugStr), only MacsBug is going to be able to interpret and execute MacsBug commands.
This is an example of using dout to execute MacsBug commands.
dout << doutcommand << "hc" << doutsoftflush;
// ...
dout << doutcommand << "dm #";
dout << (unsigned long)aPointer << endl;
Note the use of the constant doutcommand (';'), and that the stream buffer is flushed (with either endl or doutsoftflush) after each command. In this example, MacsBug executes the Heap Check command, and displays memory from the location pointed to by aPointer. The default behavior for commands is to execute the command and continue program execution. In the case of Heap Check, MacsBug suspends the execution of your program if the Heap Check finds that the heap is corrupted.
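Because any semicolon in the streamed data reaches MacsBug as the start of a command, text of unknown origin (a user name, a file path) can trigger commands by accident. One defensive option, sketched here under the assumption that losing the semicolons is acceptable in debug output, is to substitute another character before streaming. The helper withoutSemicolons is hypothetical and is not part of the dout Library.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: returns a copy of 'text' in which every ';' is
// replaced, so MacsBug will not treat the remainder as a command.
std::string withoutSemicolons(std::string text, char replacement = ',')
{
    std::replace(text.begin(), text.end(), ';', replacement);
    return text;
}
```

A call such as dout << withoutSemicolons(userText).c_str() << endl; would then display the text without handing any part of it to the MacsBug command processor.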
For my purposes this alone would justify using the library. The library handles conversion of integer values, there is no fooling with Pascal strings, and the C++ implementation gives us type checking without worry. But, as they say on the infomercials: "Wait, there's more."
Extending Streams
Using C++ means that we also get extensibility.
I have included streamstructsmac.h and streamstructsmac.cp as an example of how to stream standard Macintosh types (or any structure). This is not an attempt to define insertion routines for all Macintosh Toolbox structures, only an example to show you how to declare and define insertion routines for structures that you may find useful.
Listing 1: streamstructsmac.h
Declarations of Insertion Operators for Point, Rect, and BitMap
The following declarations allow Points, Rects, and BitMaps to be streamed out.
#include <iostream.h>
#include <QuickDraw.h>
// inserters
ostream &operator<<(ostream &stream, const Point &rhs);
ostream &operator<<(ostream &stream, const Rect &rhs);
ostream &operator<<(ostream &stream, const BitMap &rhs);
This is the standard way to implement insertion operator overloading to standard C++ library streams. Note that there is nothing specific to the dout Library in this set of declarations (or even in the implementation that follows). These same routines could be used to stream structures of these types to any C++ ostream.
Listing 2: streamstructsmac.cp
Definitions of Insertion Operators for Point, Rect, and BitMap
The following definitions allow Points, Rects, and BitMaps to be streamed out.
#include "streamstructsmac.h"
// inserters
ostream &operator<<(ostream &stream, const Point &rhs)
{
    stream << ".v(" << rhs.v << ") ";
    stream << ".h(" << rhs.h << ") ";
    return stream;
}
ostream &operator<<(ostream &stream, const Rect &rhs)
{
    stream << ".t(" << rhs.top << ") ";
    stream << ".l(" << rhs.left << ") ";
    stream << ".b(" << rhs.bottom << ") ";
    stream << ".r(" << rhs.right << ") ";
    return stream;
}
ostream &operator<<(ostream &stream, const BitMap &rhs)
{
    stream << ".baseAddr(" << (void *)rhs.baseAddr << ") ";
    stream << ".rowBytes(" << rhs.rowBytes << ")\n";
    stream << ".bounds(" << rhs.bounds << ") ";
    return stream;
}
Note that the BitMap insertion function calls the Rect insertion function without any special syntax. C++'s ability to extend the language syntax to user-defined types allows this.
The style that I use shows the field names (preceded with a '.') followed by the value of the field in parentheses. Sample output for a Point, a Rect, and a BitMap, might look like this:
.v(1) .h(2)
.t(3) .l(4) .b(5) .r(6)
.baseAddr(0x12345678) .rowBytes(7)
.bounds(.t(3) .l(4) .b(5) .r(6) )
This is all you need to know to use streams with structs or classes that either have no private or protected members or provide accessors for all such members. But what if you want to stream private members? This situation will arise often as you develop your own classes. The solution is to implement the insertion operator overload as always, and declare it as a friend function when declaring your class. A trivial but complete example of this is included in the source files as the class privateMembers.
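The privateMembers class from the source files is not reproduced in this article. As a rough illustration of the friend technique described above, here is a minimal sketch using a hypothetical class named counter, written against the current standard headers rather than iostream.h. It follows the same ".field(value)" output style as the Toolbox examples.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical class with private state that we want to see while debugging.
class counter
{
public:
    explicit counter(int start) : value(start), steps(0) {}
    void increment() { ++value; ++steps; }

    // The friend declaration gives the inserter access to private members.
    friend std::ostream &operator<<(std::ostream &stream, const counter &rhs);

private:
    int value;
    int steps;
};

// Defined like any other inserter; only the friend declaration is new.
std::ostream &operator<<(std::ostream &stream, const counter &rhs)
{
    stream << ".value(" << rhs.value << ") ";
    stream << ".steps(" << rhs.steps << ") ";
    return stream;
}
```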
Suppose that you have already defined insertion operator functions for some purpose other than debugging. For example, if you are streaming objects for persistent storage, the streamed form of your objects may not be what you would like to see streamed into the debugger. One way to work around this problem is to subclass the ostream class and make dout an object of this new subclass. Now, instead of defining the insertion operator in terms of an ostream, create an insertion operator function based on the new subclass.
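The subclassing work-around can be sketched as follows. This is not code from the dout Library; the names debugostream and Widget are hypothetical, and the sketch uses the current standard headers. It assumes a persistence inserter already exists for std::ostream and adds a second inserter that is selected only when the left operand is the debug stream type.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

struct Widget { int id; int size; };   // hypothetical user type

// Existing inserter used for persistent storage: terse, machine-readable.
std::ostream &operator<<(std::ostream &stream, const Widget &rhs)
{
    return stream << rhs.id << ' ' << rhs.size;
}

// Hypothetical subclass; dout would become an object of this type.
class debugostream : public std::ostream
{
public:
    explicit debugostream(std::streambuf *buf) : std::ostream(buf) {}
};

// Debug inserter: chosen by overload resolution when the stream's static
// type is debugostream, because that is an exact match.
debugostream &operator<<(debugostream &stream, const Widget &rhs)
{
    std::ostream &base = stream;   // use the ordinary inserters for fields
    base << ".id(" << rhs.id << ") .size(" << rhs.size << ") ";
    return stream;
}
```

One caveat of this design: the built-in inserters return std::ostream &, so in a chain such as d << "widget: " << w the second insertion sees a plain ostream and picks the storage form. Streaming the object first, or in a statement of its own, avoids the surprise.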
The Implementation
The first thing to know about the implementation of dout is that dout is a standard C++ library ostream. Stream objects are responsible for taking inserted data, formatting it, and passing it to objects of type streambuf. It is the streambuf object that decides where the data ultimately ends up. By subclassing streambuf, we make dout possible. The library contains a debugbuf class that inherits from streambuf.
The following code, from the top of the debugbuf.cp file shows the relationships of these objects:
static debugbuf debugstream;
ostream dout(&debugstream);
dout is a standard ostream. Like all ostreams it requires a streambuf object to function; so, in the constructor, we pass the address of an object of type debugbuf, which is derived from streambuf. All of our work is done in the object of type debugbuf.
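For readers who want to try the same arrangement with a current compiler, here is a minimal sketch of a stream buffer subclass. It is not the debugbuf class: the name linebuf is hypothetical, a vector of strings stands in for DebugStr, and the virtual function signatures are those of today's std::streambuf (int_type and std::streamsize) rather than the iostream.h versions used in this article.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical stream buffer: collects characters and hands each completed
// line to a sink. Here the sink is a vector standing in for DebugStr.
class linebuf : public std::streambuf
{
public:
    std::vector<std::string> lines;

protected:
    // No put area is set up, so every single-character write arrives here.
    int_type overflow(int_type ch = traits_type::eof()) override
    {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);   // anything but EOF means success
        put(traits_type::to_char_type(ch));
        return ch;
    }

    // Multi-character writes arrive here; handle them one at a time.
    std::streamsize xsputn(const char *s, std::streamsize n) override
    {
        for (std::streamsize i = 0; i < n; ++i)
            put(s[i]);
        return n;
    }

private:
    std::string pending;

    void put(char c)
    {
        if (c == '\n') { lines.push_back(pending); pending.clear(); }
        else           { pending += c; }
    }
};
```

Usage mirrors the two lines quoted above: construct a linebuf, pass its address to an ostream, and stream to that ostream as usual.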
Class Declarations
The first thing to notice about the declaration of the debugbuf class is that no destructor or constructors are declared. Let's ignore this for the moment and look at the virtual functions inherited from streambuf in the protected section.
Listing 3: debugbuf.h
Declarations of debugbuf and debugbufinit
For debugbuf, virtual member functions of streambuf are declared as well as private members required for implementation. For debugbufinit the declaration does not have a memory footprint.
class debugbuf: public streambuf
{
public:
    //debugbuf();
    //virtual ~debugbuf();
    // Construction and destruction are really handled by class debugbufinit.
    // Initialization is done in the private routine init().
protected:
    // These protected member functions are virtual functions in the parent
    // (streambuf) class that we override to create our behavior.
    // The name overflow may be confusing -- this just outputs a character to the
    // stream.
    // Calls outputchar().
    virtual int overflow(int c = EOF);
    // Since this is not an input stream, we want both pbackfail() and underflow() to
    // return EOF (thus indicating failure). It turns out that the default behavior (the
    // base class implementation) does just that.
    //virtual int pbackfail(int c = EOF);
    //virtual int underflow();
    // we don't do input so always return EOF
    virtual int uflow() {return EOF;}
    // we don't do input so always return 0 chars read
    virtual int xsgetn(char *, int) {return 0;}
    // Calls outputchar() for each character.
    virtual int xsputn(const char *s, int n);
    // We use the default behavior which returns a streampos that is in an invalid
    // position. We do not support repositioning on this stream.
    //virtual streampos seekoff(streamoff off,
    //                          ios::seekdir way,
    //                          ios::openmode which =
    //                              ios::in | ios::out);
    //virtual streampos seekpos(streampos sp,
    //                          ios::openmode which =
    //                              ios::in | ios::out);
    // We don't support setting the buffer so we use the default which is just to
    // return this.
    //virtual streambuf *setbuf(char *s, int n);
    // There is nothing to sync with, so we just do the default which is to return
    // zero, indicating no error.
    // virtual int sync();
    // Actually, as an alternative implementation it would be possible to use this
    // function to call our soft flush routine. In practice, there would be no different
    // result. sync() is usually only called by pubsync(), which is usually only called
    // by flush(), which is usually only called by the endl manipulator function after
    // it has streamed \n. So the soft flush would always follow a hard flush and
    // result in a no-op.
private:
    enum
    {
        kMaxDebugStrReadableString = 60,
        kSizeOfSemicolonG = 2,
        kSizeOfLengthByte = 1,
        kTabSize = 5,
        kStop = true
    };
    // the buffer
    char pbeg[ kSizeOfLengthByte +
               kMaxDebugStrReadableString +
               kSizeOfSemicolonG];
    // location of the next streamed char
    char *pnext;
    // This always points to the end of the readable string buffer. Once it is set in
    // init(), it is never modified.
    char *pend;
    // the number of queued alerts
    int alertCount;
    void init();
    void outputchar(char c);
    void formfeed();
    void horztab();
    void backspace();
    void verttab();
    void alert();
    void addchar(char c);
    void softflush();
    void flushdebugstring(int stop = false);
    void flushalerts();
    friend class debugbufinit;
};
class debugbufinit
{
    static unsigned int count;
public:
    debugbufinit();
    // Our destructor is not virtual. It is important that objects of this class have no
    // memory footprint. We will end up with one object of this class per translation
    // unit (.cp file). If this class had any virtual member functions then objects of
    // this class would have v-tables in memory. Since this is not intended to be a
    // base class for other classes, there is no need to be virtual.
    ~debugbufinit();
};
Of the virtual functions we inherited from streambuf, four are for input, which we don't support. For two, pbackfail and underflow, we can just accept the default behavior, and for the other two, uflow and xsgetn, we write trivial routines that return values which indicate that reading is not supported. Two other functions, seekoff and seekpos, are for positioning the stream pointer, another feature that we can't support. We also do not allow the caller to set our buffer, so setbuf is not supported. The sync function is also unneeded. The only routines that really do any work are overflow, which calls our private member outputchar once, and xsputn, which calls outputchar once for each character passed to it.
Our private section includes some constants defined as an enum, the buffer and the pointers that we need to manage it, a counter for buffering SysBeep calls, and our private functions. I'll discuss the private member functions later.
We finally return to the observation that instead of constructors and a destructor for debugbuf, a separate class, the debugbufinit class, is declared. This attempts to work around a problem with static objects. Before explaining the problem and what I've done about it, let me point out that I have not implemented a complete solution. Do not stream to dout in the constructor of a static (or global) object unless you are prepared for your application to crash. This is called "crossing the streams." See Ghostbusters.
dout is a static object and, like all static objects, its constructor will be called before the first line of main is executed. But the first line of main is not the first line of code that is executed. If a static object streams data to dout in its constructor, this code may be executed before dout is constructed.
There is no way to reliably order construction of static objects in different translation units (.cp files), but we do know that within translation units, static objects are constructed in the order in which they appear. That is why we have the class debugbufinit and why the header file doutstream declares a static object of this type. Note that it is not declared extern. Each translation unit that includes doutstream has its own object of type debugbufinit (which is why it is important that it does not have a memory footprint).
When the debugbufinit object is constructed, it uses its static member, count, to determine if it is the first object of its class to be constructed. If it is, then it calls the init member of the static debugstream object. The init member performs the function of a constructor for the debugbuf class. This way, debugstream gets initialized only once, at the time of the first construction of a debugbufinit object. Since init may be called before or after the real constructor is called, it is important that the constructor is a no-op.
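The counting scheme can be tried in isolation. The sketch below is a single-file model, not the library's code: tracebuf and tracebufinit are hypothetical stand-ins for debugbuf and debugbufinit, and the one initializer object declared at the end plays the part of the object that the doutstream header would declare in every translation unit.

```cpp
#include <cassert>

// Stand-in for debugbuf. It has no constructor, so the static object below
// is zero-initialized before any dynamic initialization runs, and calling
// init() early cannot be undone by a later constructor call.
struct tracebuf
{
    bool ready;
    int  initCalls;
    void init() { ready = true; ++initCalls; }
};

static tracebuf tracestream;

// Stand-in for debugbufinit: the first object constructed performs the
// initialization; later objects only adjust the count.
class tracebufinit
{
    static unsigned int count;
public:
    tracebufinit()  { if (0 == count++) tracestream.init(); }
    ~tracebufinit() { --count; }
};

unsigned int tracebufinit::count = 0;

// One of these per translation unit, declared by the header in the library.
static tracebufinit tracebufinitializer;
```

However many tracebufinit objects exist, init runs once, triggered by whichever of them is constructed first.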
P.J. Plauger explains this problem, along with the solution used in his implementation of the standard library streams (cout, cin, and cerr), in his book, The Draft Standard C++ Library, which I recommend. Close inspection reveals why my implementation is not a complete solution. Although I can guarantee proper construction of the debugstream object, the same cannot be said for the dout object.
I suspect that a complete solution exists for both Metrowerks and for Symantec, but I do not believe that any single solution works for both. In any case, the complete solution is left as an exercise for the reader (I've always wanted to say that).
Member Function Definitions
As the listing for debugbuf.cp shows, the implementation of both debugbuf and debugbufinit is straightforward.
Listing 4: debugbuf.cp
Definitions of debugbuf and debugbufinit
The debugbuf class manages a buffer using a switch statement to differentiate between characters with special meanings.
#include "doutstream"
#include <string.h> // for strlen() and strcpy()
#include <OSUtils.h> // for SysBeep();
static debugbuf debugstream;
ostream dout(&debugstream);
unsigned int debugbufinit::count = 0;
// this is debug code -- performance is not a goal
void debugbuf::init() // called by friend class debugbufinit
{
    pbeg[0] = '\0';
    pnext = pbeg + kSizeOfLengthByte;
    pend = pnext + kMaxDebugStrReadableString;
    alertCount = 0;
}
int debugbuf::overflow(int c)
{
    if (EOF == c)
    {
        return '\0'; // returning EOF indicates an error
    }
    else
    {
        outputchar(c);
        return c;
    }
}
int debugbuf::xsputn(const char *s, int n)
{
    for (int i = 0; i < n; ++i)
    {
        outputchar(s[i]);
    }
    return n; // we always process all of the chars
}
void debugbuf::outputchar(char c)
{
    switch (c)
    {
        case '\b':
            backspace();
            break;
        case '\f':
            formfeed();
            break;
        case '\n':
            flushdebugstring();
            break;
        case '\r':
            softflush();
            break;
        case '\t':
            horztab();
            break;
        case '\v':
            verttab();
            break;
        case '\a':
            alert();
            break;
        case '\0':
        case '\x03':
            flushdebugstring(kStop);
            break;
        default:
            addchar(c);
            break;
    }
}
void debugbuf::backspace()
{
    if (pbeg[0]) // if the buffer is empty, don't bother
    {
        --pnext;
        --pbeg[0];
    }
    // note that alerts cannot be backspaced away -- a possible enhancement
}
void debugbuf::formfeed()
{
    softflush();
    strcpy(pbeg, (char *)
        "\p_______________________________________________");
    pnext += strlen(pnext);
    flushdebugstring();
}
// both horizontal and vertical tabbing is done by brute
// force -- performance is not a goal
void debugbuf::horztab()
{
    // we don't wrap tabs so if we are within kTabSize of
    // the end of the buffer then we just flush
    if (pend - pnext <= kTabSize)
    {
        flushdebugstring();
    }
    else
    {
        for (int i = 0; i < kTabSize; ++i)
        {
            addchar(' ');
        }
    }
}
void debugbuf::verttab()
{
    int position = pnext - &pbeg[1];
    flushdebugstring();
    for (int i = 0; i < position; ++i)
    {
        addchar(' ');
    }
}
void debugbuf::alert()
{
    ++alertCount;
}
void debugbuf::addchar(char c)
{
    *pnext++ = c;
    ++pbeg[0];
    if (pnext == pend)
    {
        flushdebugstring();
    }
}
void debugbuf::softflush()
{
    if ('\0' != pbeg[0])
    {
        flushdebugstring();
    }
    else
    {
        flushalerts();
    }
}
void debugbuf::flushdebugstring(int stop)
{
    if (!stop)
    {
        // Append ";g" directly rather than through addchar(). addchar() flushes
        // when the readable buffer fills, which would re-enter this routine if
        // the buffer held exactly one character less than a full line.
        *pnext++ = ';';
        *pnext++ = 'g';
        pbeg[0] += kSizeOfSemicolonG;
    }
    flushalerts();
    DebugStr((unsigned char *)pbeg);
    pbeg[0] = '\0';
    pnext = &pbeg[1];
}
void debugbuf::flushalerts()
{
    while (alertCount)
    {
        --alertCount;
        SysBeep(30);
    }
}
debugbufinit::debugbufinit()
{
    if (0 == count++)
    {
        debugstream.init();
    }
}
debugbufinit::~debugbufinit()
{
    if (0 == --count)
    {
        // nothing to dispose, but we should flush
        debugstream.softflush();
    }
}
The main entry points are the protected members overflow and xsputn, both of which call outputchar. outputchar uses a switch statement to call one of the following if the character is special: flushdebugstring, softflush, formfeed, horztab, backspace, verttab, or alert. If the character is normal we call addchar.
If the character being processed is doutbackspace ('\b'), we reduce the length of the Pascal string in the buffer by one and back up the pointer by one character. Of course, we check first to be certain that there is at least one character in the buffer.
If the character being processed is doutformfeed ('\f'), we flush any characters already in the buffer, fill the buffer with underscores, then flush the underscores.
If the character being processed is a horizontal tab or douttab ('\t'), we treat it as if it were kTabSize (5) spaces, except that we don't wrap remaining spaces to the next line. We don't support tab stops.
If the character being processed is a vertical tab or doutverttab ('\v'), we calculate how many characters are currently in the buffer, flush the buffer, and add a space character for each character that had been in the buffer.
If the character being processed is an alert ('\a'), we increment the alert counter. Since all alerts are the same (just a call to SysBeep), we can buffer them with just a counter.
If the character being processed is normal, we process it in addchar. The new character is added to the end of the Pascal string in our buffer. When the string fills the readable string buffer, we flush the buffer by calling flushdebugstring.
Notice that I said the readable string buffer. The buffer is actually two bytes larger than the largest Pascal string that we can handle. This is to save room to append a semicolon and the letter g.
Usually when DebugStr is called, MacsBug displays the Pascal string that is passed to it and waits for commands from the user. This is not really the behavior that we want. We want the string stored in the MacsBug buffer, but we don't want to stop the application and drop into MacsBug every time data is streamed to dout. To work around this, we take advantage of the DebugStr/MacsBug command processing feature. When MacsBug receives the string passed to it by DebugStr, it displays the contents of the string up to, but not including, the first semicolon (if there is one). Any characters after the semicolon are treated as a command for MacsBug to execute.
By appending ";g" to the string passed in DebugStr, we cause MacsBug to execute the g or "go" command, which resumes execution of the suspended application. Since we are going to append this to almost every call to DebugStr (the exceptions being when '\0' or EOT are streamed), we never let the Pascal string in the buffer grow into the last two bytes of the buffer.
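The buffer arithmetic is easy to check on paper: one length byte, up to 60 readable characters, and two bytes reserved for the command, for 63 bytes in all. The sketch below models that layout with a std::string. The function makeDebugString is hypothetical and exists only to illustrate the byte layout; the library itself builds the string in place in its own buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const std::size_t kSizeOfLengthByte          = 1;
const std::size_t kMaxDebugStrReadableString = 60;
const std::size_t kSizeOfSemicolonG          = 2;

// Hypothetical helper: returns the bytes of the Pascal string that would be
// handed to DebugStr. Byte 0 is the length; ";g" is appended unless the
// caller wants to remain in the debugger.
std::string makeDebugString(const std::string &readable, bool stop)
{
    std::string text = readable.substr(0, kMaxDebugStrReadableString);
    if (!stop)
        text += ";g";              // MacsBug shows the text, then resumes
    std::string pascal;
    pascal += static_cast<char>(text.size());   // the length byte
    pascal += text;
    return pascal;
}
```

For the text "hello", the result is eight bytes long: a length byte of 7 followed by hello;g. A full 60-character line produces exactly 63 bytes, the size of the buffer declared in debugbuf.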
The softflush routine is called when the user streams doutsoftflush ('\r'). We check to see if there are any characters in the buffer. If there are, we call flushdebugstring, and if there aren't, we call flushalerts. Since flushdebugstring calls flushalerts, alerts are always flushed.
flushalerts calls SysBeep the number of times specified in alertCount and resets alertCount to zero.
The routine that actually calls DebugStr is flushdebugstring, which is called when the user streams '\n', '\0', or EOT. flushdebugstring takes a single parameter which is used to determine whether or not to append ";g" to the buffer before passing it to DebugStr. After appending the ";g" or not, we flush alerts with a call to flushalerts, call DebugStr, and then zero out the string in the buffer.
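The character handling walked through above can be exercised away from a Macintosh with a small model. The following is a simplified sketch, not the library's code: the class buffermodel is hypothetical, a std::string replaces the Pascal buffer, alerts and form feeds are left out, and each would-be DebugStr call is recorded in a vector instead of being sent to MacsBug.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified, portable model of the buffering rules described in the text.
class buffermodel
{
public:
    std::vector<std::string> sent;   // one entry per would-be DebugStr call

    void put(char c)
    {
        switch (c)
        {
            case '\b': if (!text.empty()) text.pop_back(); break;
            case '\n': flush(false); break;                  // hard flush
            case '\r': if (!text.empty()) flush(false); break; // soft flush
            case '\t': tab(); break;
            case '\v': verttab(); break;
            case '\0':
            case '\x03': flush(true); break;                 // stay in debugger
            default: add(c); break;
        }
    }
    void put(const std::string &s) { for (char c : s) put(c); }

private:
    static constexpr std::size_t kMaxReadable = 60;
    static constexpr std::size_t kTabSize = 5;
    std::string text;

    void add(char c)
    {
        text += c;
        if (text.size() == kMaxReadable) flush(false);   // line is full
    }
    void tab()
    {
        if (kMaxReadable - text.size() <= kTabSize) flush(false);
        else for (std::size_t i = 0; i < kTabSize; ++i) add(' ');
    }
    void verttab()
    {
        std::size_t column = text.size();   // remember the current column
        flush(false);
        for (std::size_t i = 0; i < column; ++i) add(' ');
    }
    void flush(bool stop)
    {
        sent.push_back(stop ? text : text + ";g");
        text.clear();
    }
};
```

Feeding the model the characters a, b, backspace, c, newline records the single string ac;g, which matches the behavior described for doutbackspace and endl.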
Bibliography and References
Information on using the standard C++ iostreams library is readily available. Any recently published work on ANSI C++ would include information on library usage. Implementation is a somewhat different matter. The dout Library would probably not have been possible without:
Plauger, P.J. The Draft Standard C++ Library. Prentice Hall, 1995.
For information on MacsBug:
Apple Computer. MacsBug Reference and Debugging Guide. Addison-Wesley, 1990.