Memman
Volume Number: 7
Issue Number: 9
Column Tag: Programmer's Forum
Related Info: Memory Manager
A Memory Manager for the Rest of Us
By Jordan Zimmerman, Pacific Grove, CA
A Memory Manager for the Rest Of Us: The Evolution of Portable Virtual Memory
[Jordan Zimmerman lives in Burbank, California where he drinks fresh ale, plays Smash T.V. and writes Movie Magic Scheduling for Screenplay Systems, Inc.]
Introduction
Our story begins with a humble Macintosh programmer faced with what is becoming the issue of the '90s: there are people in the world who insist on having windows on their blue boxes.
I was confronted with the task of reconciling the different memory management schemes of the Macintosh and Windows 3.0. In the process of solving this, a memory management scheme was developed that would be useful regardless of the porting issues. This memory manager automatically handles virtual memory (without the need for System 7 or a PMMU), is portable, and traps a multitude of errors and crashes caused by mistakes in memory usage.
While the full-blown manager is beyond the scope of this article, what follows is an outline that should be all one needs to write such a manager.
At this point, I must give due credit to my co-workers Ken Smith and Mark Guidarelli who helped design our Memory Manager, Memman.
In the Beginning
It has been my experience that the Windows 3 API (Application Programming Interface) is less flexible than the Macintosh's. A perfect example is the respective memory managers.
While both platforms use the label Handle for their basic type of memory, they are really very different animals. On the Mac, a Handle points to a real location in memory. At that location there is another pointer that points to your data. Once the OS returns an allocated Handle, you are free to use it at will - you don't need to check with the OS before using it (except, of course, to lock it).
Under Windows, the HANDLE it returns is merely a reference. It doesn't point to any real memory. In order to get a pointer to real memory, you have to go through an OS call. When you are done using the real memory, you make another call to let the OS know you're through.
The restrictions of the Windows model didn't give us a lot of choice. Ultimately, it made a lot more sense to try to fit the Mac model into the Windows model than it did to try it the other way around.
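To make the Mac side of that comparison concrete, here is a minimal portable sketch of double indirection. It is not Toolbox code: dh_handle, dh_new(), dh_relocate() and dh_dispose() are hypothetical names, and malloc() stands in for the real heap. The point it illustrates is that the block can move while the handle held by the caller stays valid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Mac-style Handle: a pointer to a
   "master pointer" that in turn points at the data. */
typedef char **dh_handle;

dh_handle dh_new(size_t size)
{
    dh_handle h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    *h = malloc(size);
    if (*h == NULL) {
        free(h);
        return NULL;
    }
    return h;
}

/* Simulate the memory manager relocating an unlocked block: the data
   moves, the master pointer is updated, and the handle stays valid. */
int dh_relocate(dh_handle h, size_t size)
{
    char *fresh;

    assert(h != NULL && *h != NULL);
    fresh = malloc(size);
    if (fresh == NULL)
        return 0;
    memcpy(fresh, *h, size);
    free(*h);
    *h = fresh;
    return 1;
}

void dh_dispose(dh_handle h)
{
    if (h != NULL) {
        free(*h);
        free(h);
    }
}
```

Any pointer copied out of *h before dh_relocate() runs is stale afterwards, which is exactly the class of bug that locking is meant to prevent.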
And Then There Was Light
It quickly became apparent how much control the memory manager would have, given the constraints on the user of the manager. I could count on several things:
a) I'd know whenever memory was or wasn't being used;
b) I'd have knowledge of every allocation made; and
c) I could do whatever I wanted with the data of an allocation when it wasn't being used so long as I restored its condition before it was needed again.
The Windows Model
Windows has five basic memory routines. They are:
a) GlobalAlloc() - allocates a block of memory;
b) GlobalReAlloc() - changes the size of an allocation;
c) GlobalLock() - returns a real pointer to memory;
d) GlobalUnlock() - signals that real memory is done being used; and
e) GlobalFree() - disposes of an allocation.
So, it seemed simple enough to fit the Macintosh memory model into Windows - just put wrappers around all memory calls.
Memman is born
This is the basic structure of Memman:
/* 1 */
#if defined(windows)
typedef HANDLE cookie_t;
#elif defined(macintosh)
typedef Handle cookie_t;
#endif

cookie_t MMAlloc(long size);
void     MMRealloc(cookie_t hdl, long new_size);
void    *MMUse(cookie_t hdl);
void     MMUnuse(cookie_t hdl);
void     MMFree(cookie_t hdl);
The cookie_t is what we call around the office a magic cookie - a reference to something that is unusable by itself. In Memman, the magic cookie is a Handle on the Macintosh and a HANDLE on Windows. But that doesn't really matter to the user of the manager.
The Memman model ports perfectly between the two platforms:
MEMMAN       Macintosh        Windows
-------------------------------------------------
MMAlloc      NewHandle        GlobalAlloc
MMRealloc    SetHandleSize    GlobalReAlloc
MMUse        HLock            GlobalLock
MMUnuse      HUnlock          GlobalUnlock
MMFree       DisposHandle     GlobalFree
Memman imposes some constraints on a program that Macintosh programmers won't be used to.
Before you read from or write to memory, you MUST call MMUse() to get a real pointer to memory. When you are through reading/writing, you MUST call MMUnuse(). This is a very different way of coding. The program becomes a client of the Operating System. On the Mac, it's normally somewhat the other way around.
Calls to MMUse() can be nested to any depth. However, every MMUse() must eventually be balanced by an MMUnuse().
Here's an example:
/* 2 */
/* the Mac way of allocating memory and
   then writing to it */
. . .
short **short_array;
short *short_ptr;

short_array = (short **)NewHandle(10 * sizeof(short));
short_ptr = *short_array;
short_ptr[1] = 1;
short_ptr[2] = 2;
...

/* now, the Memman way */
. . .
cookie_t reference;
short *short_ptr;

reference = MMAlloc(10 * sizeof(short));
short_ptr = (short *)MMUse(reference);
short_ptr[1] = 1;
short_ptr[2] = 2;
/* etc. */
MMUnuse(reference);
. . .
Where's the VM Beef?
So how does this get us Virtual Memory? Given the control that Memman has over memory allocation and usage, Virtual Memory becomes somewhat simple.
What is Virtual Memory, Anyway?
Virtual Memory is a technique that allows an application to access more memory than is physically present in the system. Data is paged to and from disk as needed, thus giving the appearance of more memory than is really available.
Under System 7, this is done on a hardware level by the Paged Memory Management Unit (PMMU). This is the fastest and most desirable way to implement Virtual Memory. But there is nothing stopping the lowly software programmer from doing it manually.
Today's operating systems provide sophisticated disk I/O and memory managers. These are all a programmer needs to do Virtual Memory.
Memman knows about every allocation that is made. It also knows whenever an allocation is or isn't being used. So, the first thing to do is to keep track of every allocation made through MMAlloc().
/* 3 */
typedef long hdl_t;

typedef struct {
    cookie_t platform_hdl;
    long     size;
    void    *ptr;
    short    access_cnt;
} alloc_rec;
Memman keeps an array of alloc_recs. Every time MMAlloc() is called, an entry is added to this array. platform_hdl is a Handle on the Mac or a HANDLE on Windows. Because there is no equivalent to the Mac's GetHandleSize() on Windows, size stores the size of the allocation.
Instead of MMAlloc() returning a cookie_t, Memman defines its own magic cookie, hdl_t. This is an index into the array of alloc_recs.
ptr is NULL if the allocation isn't currently being used (i.e. MMUse() hasn't been called) or a real memory location if it is being used. This is done as an optimization. If MMUse() is called in a nested way, there is no need to go through the OS (HLock() or GlobalLock()) to get a pointer.
access_cnt is the number of unbalanced times MMUse() has been called for the allocation. This is how Memman determines if an allocation is in use or not. When allocated, the access_cnt is set to zero. Every time MMUse() is called, it is incremented by one. Every time MMUnuse() is called it is decremented by one. When the access_cnt is zero, Memman knows that the allocation is not being used.
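The counting rule is easy to see in isolation. The following is a simplified portable sketch, not Memman itself: uc_rec and the uc_ functions are hypothetical names, and a malloc() block stands in for the platform handle so that the "lock" on first use is just a pointer copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified record: block stands in for platform_hdl. */
typedef struct {
    void  *block;       /* backing storage                         */
    void  *ptr;         /* non-NULL only while the block is in use */
    short  access_cnt;  /* number of unbalanced "use" calls        */
} uc_rec;

int uc_init(uc_rec *rec, size_t size)
{
    rec->block = malloc(size);
    rec->ptr = NULL;
    rec->access_cnt = 0;
    return rec->block != NULL;
}

void *uc_use(uc_rec *rec)
{
    /* Only the first, outermost use has to "lock" the block;
       nested uses get the cached pointer back. */
    if (rec->access_cnt++ == 0)
        rec->ptr = rec->block;
    return rec->ptr;
}

void uc_unuse(uc_rec *rec)
{
    assert(rec->access_cnt > 0);   /* catches an unbalanced unuse */
    if (--rec->access_cnt == 0)
        rec->ptr = NULL;           /* outermost unuse: "unlock"   */
}

int uc_in_use(const uc_rec *rec)
{
    return rec->access_cnt > 0;
}
```

The assert in uc_unuse() is an addition for illustration; it flags the case where MMUnuse() is called more often than MMUse().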
It is the knowledge of when an allocation is in use or not that allows us to do VM. When an allocation isn't in use, its data can be stored on disk (however, you'd probably only want to do this when memory is tight). Let's change alloc_rec a little.
/* 4 */
typedef long hdl_t;

typedef struct {
    cookie_t platform_hdl;
    long     size;
    void    *ptr;
    short    access_cnt;
    long     location;
} alloc_rec;
Memman uses the location field to determine whether an allocation is in memory or on disk. MMUse() is responsible for reading in a paged allocation. If location >= 0, then the allocation's data is on disk; otherwise, location == -1.
A simple implementation of MMAlloc(), MMUse() and MMUnuse() for the Mac might look like this:
/* 5 */
alloc_rec **alloc_array;

hdl_t MMAlloc(long size)
{
    alloc_rec *alloc_ptr;
    Handle h;
    long old_size;
    hdl_t hdl;

    /* get some real memory from the OS */
    h = NewHandle(size);
    if ( MemError() )
        DoError();

    /* add another alloc_rec */
    old_size = GetHandleSize((Handle)alloc_array);
    SetHandleSize((Handle)alloc_array, old_size + sizeof(alloc_rec));
    if ( MemError() )
        DoError();

    /* get the index into the array */
    hdl = old_size / sizeof(alloc_rec);
    /* take the address of the new record; nothing below moves
       memory, so the dereferenced pointer stays valid */
    alloc_ptr = &(*alloc_array)[hdl];

    /* store away the information */
    alloc_ptr->platform_hdl = h;
    alloc_ptr->size = size;
    alloc_ptr->ptr = NULL;
    alloc_ptr->access_cnt = 0;
    alloc_ptr->location = -1; /* in memory */
    return hdl;
} /* MMAlloc */
void *MMUse(hdl_t hdl)
{
    alloc_rec *alloc_ptr;
    void *ptr;

    /* hdl is an index into the array of alloc_recs */
    HLock((Handle)alloc_array);
    alloc_ptr = &(*alloc_array)[hdl];

    /* make sure it's in memory */
    if ( alloc_ptr->location >= 0 )
        load_from_disk(alloc_ptr);

    /* increment the access_cnt and lock the Handle if necessary */
    if ( ++alloc_ptr->access_cnt > 1 )
        ptr = alloc_ptr->ptr;   /* nested use: reuse the cached pointer */
    else {
        HLock(alloc_ptr->platform_hdl);
        ptr = *alloc_ptr->platform_hdl;
        alloc_ptr->ptr = ptr;   /* cache it for nested calls */
    }
    HUnlock((Handle)alloc_array);
    return ptr;
} /* MMUse */
void MMUnuse(hdl_t hdl)
{
    alloc_rec *alloc_ptr;

    alloc_ptr = &(*alloc_array)[hdl];
    if ( --alloc_ptr->access_cnt > 0 )
        return; /* handle is still in use, keep it locked */
    alloc_ptr->ptr = NULL;
    HUnlock(alloc_ptr->platform_hdl);
} /* MMUnuse */
Memman opens a temp file that stores any paged data. Memman defines a function, MMPage(), that is used to page data to disk. This would probably be called from the GrowZone, or could be set up to be called automatically by MMAlloc() (if NewHandle() fails).
Here's a simple implementation of MMPage():
/* 6 */
/* page out needed bytes of data */
void MMPage(long needed)
{
    alloc_rec *alloc_ptr;
    long total = 0;
    long i;
    long count;

    /* GetHandleSize() returns bytes; convert to a record count */
    count = GetHandleSize((Handle)alloc_array) / sizeof(alloc_rec);
    HLock((Handle)alloc_array);
    alloc_ptr = *alloc_array;

    /* go through all allocations paging them out until total >= needed */
    for ( i = 0; i < count; ++i, ++alloc_ptr ) {
        /* only page allocations that are in memory and not in use */
        if ( alloc_ptr->location == -1 && alloc_ptr->access_cnt == 0 ) {
            long offset;

            offset = get_disk_block(alloc_ptr->size);
            write_data(alloc_ptr->platform_hdl, alloc_ptr->size, offset);
            alloc_ptr->location = offset;
            DisposHandle(alloc_ptr->platform_hdl);
            alloc_ptr->platform_hdl = NULL;
            if ( (total += alloc_ptr->size) >= needed )
                break;
        }
    }
    HUnlock((Handle)alloc_array);
} /* MMPage */
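The listings leave out the disk side of paging (load_from_disk() and write_data()). As a rough idea of what that round trip involves, here is a portable sketch using standard C file I/O. It is an assumption-laden stand-in, not Memman's code: pg_rec, pg_page_out() and pg_load_from_disk() are hypothetical names, malloc() replaces the platform handle, and paged data is simply appended to the end of the temp file.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified record for the sketch. */
typedef struct {
    void *data;      /* NULL while the allocation is paged out       */
    long  size;      /* size of the allocation in bytes              */
    long  location;  /* -1 = in memory, else offset in the temp file */
} pg_rec;

/* Append the data to the temp file and release the memory. */
int pg_page_out(FILE *swap, pg_rec *rec)
{
    long offset;

    assert(rec->location == -1 && rec->data != NULL);
    if (fseek(swap, 0L, SEEK_END) != 0)
        return 0;
    offset = ftell(swap);
    if (offset < 0)
        return 0;
    if (fwrite(rec->data, 1, (size_t)rec->size, swap) != (size_t)rec->size)
        return 0;
    free(rec->data);
    rec->data = NULL;
    rec->location = offset;
    return 1;
}

/* Allocate fresh memory and read the paged data back in. */
int pg_load_from_disk(FILE *swap, pg_rec *rec)
{
    void *data;

    assert(rec->location >= 0);
    data = malloc((size_t)rec->size);
    if (data == NULL)
        return 0;
    if (fseek(swap, rec->location, SEEK_SET) != 0 ||
        fread(data, 1, (size_t)rec->size, swap) != (size_t)rec->size) {
        free(data);
        return 0;
    }
    rec->data = data;
    rec->location = -1;   /* back in memory */
    return 1;
}
```

A file opened with tmpfile() works as the swap file here. Note that after a reload the data may live at a different address, which is harmless only because callers never hold a pointer outside an MMUse()/MMUnuse() pair.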
You might consider writing MMPage() so that it pages allocations in least-recently-used order. The way Memman does this is by keeping a field (a short) in the alloc_rec that is incremented every time MMUse() is called on the allocation. Allocations with the smallest time stamp are the oldest and are paged first. This reduces the likelihood of thrashing, where an allocation is paged out only to be read straight back in.
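One way to sketch the victim selection is shown below. It is an interpretation rather than Memman's actual code: it assumes a global counter (lru_clock) whose value is copied into a hypothetical last_used field on every use, and it skips records that are in use or already on disk. All lru_ names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down record for the sketch. */
typedef struct {
    short         access_cnt;  /* > 0 while the allocation is in use */
    long          location;    /* -1 = in memory, else on disk       */
    unsigned long last_used;   /* assumed time stamp field           */
} lru_rec;

static unsigned long lru_clock = 0;

/* Call this from the "use" path to stamp the record. */
void lru_touch(lru_rec *rec)
{
    rec->last_used = ++lru_clock;
}

/* Return the index of the least recently used record that can be
   paged out, or -1 if every record is in use or already on disk. */
long lru_pick_victim(const lru_rec *recs, long count)
{
    long i;
    long victim = -1;

    assert(recs != NULL || count == 0);
    for (i = 0; i < count; ++i) {
        if (recs[i].access_cnt != 0 || recs[i].location != -1)
            continue;
        if (victim < 0 || recs[i].last_used < recs[victim].last_used)
            victim = i;
    }
    return victim;
}
```

A paging routine would call lru_pick_victim() repeatedly, paging out each result, until enough bytes have been freed or it returns -1.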
Ideally, you'll keep track of any free blocks within your temp file and reuse these (a free block is one to which data was paged and then re-read into memory; thus, the block is no longer being used).
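A very small free-block tracker might look like the sketch below. It is only one possible design, assuming a fixed-size table and a first-fit search; fb_swap, fb_get_disk_block() and fb_release_disk_block() are hypothetical names, and nothing here coalesces neighboring free blocks.

```c
#include <assert.h>

#define FB_MAX 64   /* arbitrary table size chosen for the sketch */

typedef struct {
    long offset;
    long size;
} fb_extent;

typedef struct {
    fb_extent free_list[FB_MAX];
    int       free_count;
    long      end_of_file;   /* where a brand-new block would start */
} fb_swap;

void fb_init(fb_swap *s)
{
    s->free_count = 0;
    s->end_of_file = 0;
}

/* First fit: reuse (part of) a free extent, else grow the file. */
long fb_get_disk_block(fb_swap *s, long size)
{
    int i;
    long offset;

    assert(size > 0);
    for (i = 0; i < s->free_count; ++i) {
        if (s->free_list[i].size >= size) {
            offset = s->free_list[i].offset;
            s->free_list[i].offset += size;
            s->free_list[i].size -= size;
            if (s->free_list[i].size == 0)   /* fully consumed */
                s->free_list[i] = s->free_list[--s->free_count];
            return offset;
        }
    }
    offset = s->end_of_file;
    s->end_of_file += size;
    return offset;
}

/* Record a block as reusable once its data is back in memory.
   Returns 0 if the table is full; the space is then simply wasted. */
int fb_release_disk_block(fb_swap *s, long offset, long size)
{
    if (s->free_count == FB_MAX)
        return 0;
    s->free_list[s->free_count].offset = offset;
    s->free_list[s->free_count].size = size;
    ++s->free_count;
    return 1;
}
```

The release call would naturally go wherever a paged allocation is read back in and its location is reset to -1.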
Debugging - The Best Benefit
The final benefit of Memman is the automatic debugging it provides. There are several debugging tools that can be built into this memory model.
The first is inherent in the design: Handles are always locked when they are being used. It is a common plague of the Macintosh that a lot of bugs are caused by unlocked handles. With Memman, this is no longer an issue.
The other tools must be added to the memory manager. The following is a list of things we've added to Memman at the office. It is by no means an exhaustive list. It seems we are always finding new debugging code to add. You should surround all your debugging code with
/* 7 */
#ifndef NDEBUG
...
#endif
so that it can be turned off easily for the shipping product.
Overdraft Protection
We've changed MMAlloc() so that it always allocates 2 bytes more than requested. These two bytes are then set to some unlikely value like 0x1234. Every time MMUse() or MMUnuse() is called, the last two bytes of the allocation are checked and Memman asserts if the value isn't 0x1234. This catches those pesky bugs where the program writes past the end of an allocation (at a resolution that even Protected Memory can't achieve!).
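The guard-byte idea can be sketched in a few lines of portable C. This is an illustration only: gd_alloc() and gd_intact() are hypothetical names, malloc() stands in for the platform allocator, and the check returns a flag so the caller can decide whether to assert.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GD_SIZE 2

/* The "unlikely value" 0x1234, stored byte by byte so the check
   does not depend on alignment or byte order. */
static const unsigned char gd_pattern[GD_SIZE] = { 0x12, 0x34 };

/* Allocate size bytes plus a trailing guard and stamp the guard. */
void *gd_alloc(size_t size)
{
    unsigned char *p = malloc(size + GD_SIZE);

    if (p != NULL)
        memcpy(p + size, gd_pattern, GD_SIZE);
    return p;
}

/* Returns nonzero while the guard bytes are untouched. */
int gd_intact(const void *block, size_t size)
{
    assert(block != NULL);
    return memcmp((const unsigned char *)block + size,
                  gd_pattern, GD_SIZE) == 0;
}
```

In a debug build, the use and unuse paths would call something like assert(gd_intact(ptr, size)) with the size recorded at allocation time.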
Corruption Police
Our Memman has an extra field (a short) in the alloc_rec. This field is used to store a checksum of the data. A checksum of the allocation's data is stored at MMUnuse() time when the access_cnt drops to zero (we use a public domain CRC routine). Whenever MMUse() is called, this checksum is verified and Memman asserts if the checksum doesn't match. This catches memory corruption errors.
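The seal-and-verify cycle can be sketched as follows. Note the substitution: instead of the CRC routine mentioned above, this example uses a Fletcher-16 checksum because it fits in a few lines and in a 16-bit field. The ck_ names and the ck_rec layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fletcher-16 checksum, used here as a stand-in for a CRC. */
unsigned short ck_fletcher16(const unsigned char *data, size_t len)
{
    unsigned long sum1 = 0;
    unsigned long sum2 = 0;
    size_t i;

    for (i = 0; i < len; ++i) {
        sum1 = (sum1 + data[i]) % 255;
        sum2 = (sum2 + sum1) % 255;
    }
    return (unsigned short)((sum2 << 8) | sum1);
}

/* Hypothetical, trimmed-down record for the sketch. */
typedef struct {
    unsigned char *data;
    size_t         size;
    unsigned short checksum;
} ck_rec;

/* Call when the use count drops to zero. */
void ck_seal(ck_rec *rec)
{
    rec->checksum = ck_fletcher16(rec->data, rec->size);
}

/* Call on the first use; returns 0 if the data changed while the
   allocation was supposedly not in use. */
int ck_verify(const ck_rec *rec)
{
    assert(rec->data != NULL);
    return rec->checksum == ck_fletcher16(rec->data, rec->size);
}
```

A 16-bit checksum can miss some corruptions, so a mismatch proves a bug but a match does not prove its absence.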
The Enforcers
Whenever MMFree() is called, every byte of the allocation's data is set to 0xff before DisposHandle() is called. This sets up a condition that will always produce incorrect results if an allocation is accessed after it is disposed.
Whenever MMUnuse() is called and the access_cnt gets set to zero, HandToHand() is called on the allocation to duplicate it. The old Handle has every byte set to 0xff and is then disposed. This is the Memman equivalent of Heap Scrambling.
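Both enforcers can be approximated in portable C as shown below. This is a sketch under assumptions: malloc(), memcpy() and free() stand in for NewHandle(), HandToHand() and DisposHandle(), the en_ names are hypothetical, and a debug build is assumed, since an optimizing compiler is allowed to drop a fill that is immediately followed by free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Overwrite a block with 0xff so stale reads give obvious garbage. */
void en_poison(void *block, size_t size)
{
    memset(block, 0xFF, size);
}

/* Free-time enforcer: poison first, then release. */
void en_free(void *block, size_t size)
{
    if (block != NULL) {
        en_poison(block, size);
        free(block);
    }
}

/* Unuse-time enforcer ("heap scrambling"): move the data to a new
   block, poison and release the old one, and return the new block.
   Returns NULL and leaves the old block alone if memory is short. */
void *en_scramble(void *block, size_t size)
{
    void *fresh;

    assert(block != NULL);
    fresh = malloc(size);
    if (fresh == NULL)
        return NULL;
    memcpy(fresh, block, size);
    en_free(block, size);
    return fresh;
}
```

Because the new block is allocated before the old one is released, the data is guaranteed to end up at a different address, so any pointer kept past the unuse call now points at poisoned, freed memory.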
Conclusion
We are using Memman at our office. It has already proved invaluable in weeding out bugs and cleaning up the way we look at memory allocations. Our Memman has been ported to the Mac, Windows and Unix without a hitch.
Even if porting is not an issue for you, the memory model laid out in this article is valuable for any situation. Indeed, the Memman model, I believe, is ideal for every situation and has become an integral part of all the code that I currently write and plan on writing.