September 94 - KON & BAL'S PUZZLE PAGE

Heaps of Fun

KONSTANTIN OTHMER, BRUCE LEAK, AND STEVE NEWMAN

See if you can solve this programming puzzle, presented in the form of a dialog between Konstantin Othmer (KON) and Bruce Leak (BAL) -- and a special guest, developer Steve Newman. The dialog gives clues to help you. Keep guessing until you're done; your score is the number to the left of the clue that gave you the correct answer. Even if you never run into the particular problems being solved here, you'll learn some valuable debugging techniques that will help you solve your own programming conundrums. And you'll also learn interesting Macintosh trivia.


Steve I've got a machine that crashed into MacsBug. I think it's this bug that some of our beta testers have been reporting; it's really intermittent, so I may not get it to happen again. I've got to find it just by looking at this one crash.

KON It's not reproducible?

Steve Not if it's the bug I've been hearing about. The reports are always the same: The machine crashes while saving a file. Afterward the file is unreadable. If they go back to an older copy of the file, the problem doesn't recur. No single user seems to have had this crash happen more than twice, and no one has been able to associate it with something they were doing in the program before they told it to save.

BAL What does this program do?

Steve It's a PIM -- personal information manager. Data entry and dialog boxes and stuff. It's a pretty big program, but very vanilla in its use of the ROM -- strictly Volume I stuff, plus the Memory Manager and File Manager, of course.

KON You've tried stress testing? Heap scramble, low-memory conditions, MemHell, QC, all of that?

Steve Yeah. It was a war zone, and we couldn't bring out the bug. But it just happened to one of our tech support people, Stephanie. I've taken over her machine until I can figure out what's going on. She closed a file, it asked her if she wanted to save changes, she clicked Yes, and it crashed into MacsBug with an illegal instruction.

BAL Illegal instruction? Sounds like you've branched off into the middle of nowhere. Where's the program counter?

100 Steve wh pc says we're in CODE segment 44, $017C bytes into a routine called Preflush. According to a link map I can look at on another machine, segment 44 has the file-saving code.

KON Is the heap trashed?

95 Steve MacsBug says the heap is fine.

BAL Perhaps some random memory-trashing bug has overwritten part of the code segment. Disassemble around the program counter.

90 Steve It looks like valid code, but the PC is in the middle of an instruction.

KON Do you have any purgeable code segments?

85 Steve We have a fairly complicated code segment management scheme based on reference counting. We're pretty careful about it, though; it's been a long time since we've had any problems there. As it happens, segment 44 is purgeable, but it has too many entry points to do reference counting, so we just unload it from our event loop.

BAL Sounds like code right out of the Finder. Let's try to find out how we managed to branch into the middle of an instruction. Do a stack crawl and see where we came from. Let's look at all the registers to see if one of them contains a clue as to how we got here.

80 Steve OK. sc6 says the last call came from a function named "Document::SaveAs(int, unsigned char)".

KON What kind of a function name is that? It looks more like a UNIX pathname than a function name.

Steve Hey, you should see what it looks like with name unmangling disabled. sc7 shows another return address under that one, in "Document::SaveAs(char*, short, unsigned char, int, unsigned char)".

BAL SaveAs? I thought it was just doing a regular save. And why two functions called SaveAs?

75 Steve Stephanie insists that it was a regular save -- the document had been opened from an existing file and was being saved to that same file. But if that were true, the program should have called a function named Save, not SaveAs. As for having two functions with the same name, the five-parameter one saves into a specified disk file; the two-parameter one brings up a Standard File dialog and then calls the five-parameter one. I'm using C++ function overloading.

BAL You sure you're not running the Finder? Mercer should be able to solve this for you in a snap.

KON I heard Mercer moved to Chicago. So, how did we get called?

70 Steve The call came from a "JSR (A1)" instruction. It looks like a standard C++ virtual function call.

KON What's the value of A1?

Steve It points into the jump table. Disassembling at that address shows a JMP to the current program counter.

KON That makes sense. Virtual function tables are stored in the global data segment, so their function pointers are data-to-code references, which have to go through the jump table. So all virtual function calls go through the jump table. That would be necessary no matter how the vtables were implemented, since at link time there's no way of knowing what version of the function will get called or what segment it's in.

BAL So the jump table is trashed relative to the data in the heap. I still think something's wrong with the heap. MacsBug can be funny about deciding whether a heap is trashed. Do a heap dump and page down until you see the block containing the program counter.

65 Steve It's code segment 44, and everything around it looks reasonable. But there's a question mark next to the master pointer address.

KON That probably means the master pointer doesn't point back to this heap block. Let's look at the header for the heap block and find its master pointer to double-check MacsBug. The format of the header depends on whether the machine is using 24- or 32-bit addressing. We can tell the current mode, which is probably the machine's standard mode unless we're in some slimy QuickDraw code, by looking at MacsBug's status display along the left side of the screen.

Steve MacsBug says the machine is in 24-bit mode. It's an old IIci with only 8 meg of memory.

BAL In that case, the block header is 8 bytes long. The first byte is a tag byte that indicates the type of block (free, pointer, or handle) and the slop factor; the next three bytes are the size; and for handles, the final four bytes are the offset from the beginning of the heap zone to the master pointer for this block. Check that offset in the heap and make sure there's a valid master pointer there.
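To make the layout BAL describes concrete, here is a minimal C++ sketch that decodes an 8-byte, 24-bit-mode block header and checks whether a master pointer points back at the block. The type and function names are hypothetical, and the exact bit positions inside the tag byte (block type in the top two bits, slop in the low four) are an assumption, not something stated in the dialog.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; these are not Memory Manager declarations.
enum class BlockKind { Free, Pointer, Handle, Unknown };

struct BlockHeader24 {
    BlockKind kind;          // decoded from the tag byte
    std::uint8_t slop;       // slop factor (assumed: low four bits of the tag byte)
    std::uint32_t size;      // 24-bit size field
    std::uint32_t mpOffset;  // handles only: offset from zone start to the master pointer
};

// Decode the 8 header bytes (big-endian, as on the 68K).
inline BlockHeader24 decodeHeader24(const std::uint8_t h[8]) {
    assert(h != nullptr);
    BlockHeader24 out{};
    switch (h[0] >> 6) {  // assumption: top two bits of the tag select the block type
        case 0: out.kind = BlockKind::Free; break;
        case 1: out.kind = BlockKind::Pointer; break;
        case 2: out.kind = BlockKind::Handle; break;
        default: out.kind = BlockKind::Unknown; break;
    }
    out.slop = h[0] & 0x0F;
    out.size = (static_cast<std::uint32_t>(h[1]) << 16) |
               (static_cast<std::uint32_t>(h[2]) << 8) | h[3];
    out.mpOffset = (static_cast<std::uint32_t>(h[4]) << 24) |
                   (static_cast<std::uint32_t>(h[5]) << 16) |
                   (static_cast<std::uint32_t>(h[6]) << 8) | h[7];
    return out;
}

// A healthy relocatable block: the master pointer's low 24 bits address the
// block contents, which start right after the 8-byte header.
inline bool masterPointerPointsBack(std::uint32_t masterPointer,
                                    std::uint32_t headerAddress) {
    return (masterPointer & 0x00FFFFFFu) == headerAddress + 8;
}
```

When `masterPointerPointsBack` returns false for a block that claims to be a handle, you are in the situation MacsBug marks with a question mark.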

60 Steve It agrees with the location printed in the heap dump. But the value in that master pointer doesn't point back to this heap block. It turns out MacsBug won't flag this as heap corruption, but it will put a question mark next to the master pointer for blocks where the master pointer doesn't make sense.

KON Do a heap dump and keep paging down until you find the address that it does point to.

55 Steve It points to a block labeled "CODE segment 44". There are two code segment 44s in the heap!

KON Is there a question mark on this one?

50 Steve No, MacsBug seems to be happy with the second block. According to MacsBug, it has the same master pointer address as the first block. Both blocks are marked as being locked and purgeable.

BAL But the master pointer really does point to this block, so there's no question mark. And "locked and purgeable" is the expected state for a purgeable code segment that's currently loaded -- the lock flag overrides the purgeable flag.

KON To decide whether a heap block is a resource, MacsBug looks at the resource flag for the block. In a 24-bit heap, that flag is stored in the high byte of the master pointer. Since both blocks think they have the same master pointer, they're sharing the same flag byte; and when MacsBug searches the open resource maps to figure out which resource each block comes from, it gets the same answer.
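The shared flag byte can be sketched in a few lines of C++. This assumes the usual 24-bit layout in which the high byte of the master pointer carries the lock ($80), purgeable ($40), and resource ($20) bits; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::uint32_t kAddressMask = 0x00FFFFFFu;  // low 24 bits: block address
constexpr std::uint8_t kLocked = 0x80;               // bit 7 of the high byte
constexpr std::uint8_t kPurgeable = 0x40;            // bit 6
constexpr std::uint8_t kResource = 0x20;             // bit 5

// Build a 24-bit-mode master pointer from a flag byte and an address.
inline std::uint32_t makeMasterPointer(std::uint8_t flags, std::uint32_t address) {
    assert((address & ~kAddressMask) == 0);  // the address must fit in 24 bits
    return (static_cast<std::uint32_t>(flags) << 24) | address;
}

inline std::uint8_t flagsOf(std::uint32_t mp) {
    return static_cast<std::uint8_t>(mp >> 24);
}

inline std::uint32_t addressOf(std::uint32_t mp) { return mp & kAddressMask; }

// Human-readable summary of the flag byte, similar in spirit to a heap dump column.
inline std::string describe(std::uint32_t mp) {
    const std::uint8_t f = flagsOf(mp);
    std::string s;
    if (f & kLocked) s += "locked ";
    if (f & kPurgeable) s += "purgeable ";
    if (f & kResource) s += "resource ";
    return s;
}
```

Two heap blocks whose headers name the same master pointer get the same `describe` result, because the flags live in the master pointer and not in either block. That is why both copies of segment 44 show up as locked, purgeable resources.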

BAL Now we have two mysteries: why there are two heap blocks with the same master pointer, and why the jump table points into the middle of a routine. Let's see if the heap blocks are really the same. Check the heap dump to see if they have the same size, and then dump memory from each one to see if they're the same.

45 Steve They are the same. And by the way, you left out one mystery: If we're doing a Save, why is SaveAs on the stack? There's no way that the two-parameter SaveAs can call the five-parameter SaveAs without first bringing up the Standard File dialog; but Stephanie insists there was no such dialog.

KON Maybe we took another bad branch through the jump table earlier on. Take another look at the stack crawl. When did we first enter segment 44?

40 Steve A routine called OKToClose, which is not in segment 44, called the two-parameter SaveAs, which is.

BAL Look at the JSR instruction in OKToClose.

35 Steve It's jumping to an A5-relative address, in the jump table. That address contains a JMP into the middle of the two-parameter SaveAs, shortly before the place where it calls the five-parameter version.

KON Aha! By taking a wild branch into the middle of the routine, it skipped over the call to Standard File. Maybe all the jump table entries for this segment are skewed by the same amount. Disassemble from $017C bytes above where this JMP points.

30 Steve $017C bytes above the JMP target is the beginning of Document::Save.

BAL That makes sense. OKToClose tried to call into segment 44 to the Save routine, but something went wrong with LoadSeg, and it ended up $017C bytes farther down, in the middle of the two-parameter SaveAs. Two-parameter SaveAs called five-parameter SaveAs; this is an intra-segment call, so it wasn't affected by the bad jump table. Then five-parameter SaveAs called Preflush, which is a virtual function, so it went through the jump table even though it's in the same segment. This time the wild branch happened to hit an illegal instruction, so it dropped into MacsBug.

KON It's interesting that the two SaveAs routines were able to function more or less correctly even though OKToClose branched into the middle of the first routine, thus bypassing all of its parameter setup.

BAL Well, it sounds like all of these functions are methods of the same object. MPW's C++ compiler usually puts the object pointer in A4. So any references to object data members or virtual functions would work even though we skipped the entry code for the first SaveAs.

KON Aren't all C++ functions fairly interchangeable? Link, save A4, load A4, test a bit off A4, restore A4, unlink, rts? That's part of the efficiency.

BAL In any case, we need to find out what went wrong in the LoadSeg call. Maybe there's a clue on the stack. Dump memory for a few hundred bytes starting at the stack pointer.

25 Steve $0028 bytes after the stack pointer, you notice a funny value: $4080BD0A.

KON That's an address in ROM, probably a return address. Disassemble around that address.

20 Steve It's in LoadSeg, one instruction after a call to StripAddress. It looks like this:

Disassembling from 4080bce0
    _LoadSeg
        +0000   4080BCE0     MOVEM.L    D0-D2/A0/A1,-(A7)
        +0004   4080BCE4     MOVE.L     D1,-(A7)
        +0006   4080BCE6     JSR        Dispatcher+00C6
        +000A   4080BCEA     MOVE.L     (A7)+,D1
        +000C   4080BCEC     MOVE.W     $0018(A7),D0
        +0010   4080BCF0     BSR.S      LoadSeg+007C
        +0012   4080BCF2     BEQ.S      LoadSeg+0076
        +0014   4080BCF4     HGetState
        +0016   4080BCF6     BTST       #$07,D0
        +001A   4080BCFA     BNE.S      LoadSeg+0026
        +001C   4080BCFC     TST.B      SegHiEnable
        +0020   4080BD00     BEQ.S      LoadSeg+0024
        +0022   4080BD02     MoveHHi
        +0024   4080BD04     HLock
        +0026   4080BD06     MOVE.L     (A0),D0
        +0028   4080BD08     StripAddress
        +002A   4080BD0A     MOVEA.L    D0,A0
        +002C   4080BD0C     MOVEA.L    A5,A1
        +002E   4080BD0E     ADDA.W     CurJTOffset,A1
        +0032   4080BD12     ADDA.W     (A0),A1
        +0034   4080BD14     CMPI.W     #$4EF9,$0002(A1)
        +003A   4080BD1A     BEQ.S      LoadSeg+005C
        +003C   4080BD1C     MOVE.W     $0002(A0),D0
        +0040   4080BD20     BEQ.S      LoadSeg+005C
        +0042   4080BD22     MOVE.W     $0018(A7),D1
        +0046   4080BD26     MOVEQ      #$00,D2
        +0048   4080BD28     MOVE.W     (A1)+,D2
        +004A   4080BD2A     MOVE.W     D1,-$0002(A1)
        +004E   4080BD2E     MOVE.W     #$4EF9,(A1)+
        +0052   4080BD32     PEA        $04(A0,D2.L)
        +0056   4080BD36     MOVE.L     (A7)+,(A1)+
        +0058   4080BD38     SUBQ.W     #$1,D0
        +005A   4080BD3A     BNE.S      LoadSeg+0048
        +005C   4080BD3C     MOVEA.L    $0014(A7),A1
        +0060   4080BD40     SUBQ.L     #$6,A1
        +0062   4080BD42     MOVE.L     A1,$0016(A7)
        +0066   4080BD46     MOVEM.L    (A7)+,D0-D2/A0/A1
        +006A   4080BD4A     ADDQ.W     #$2,A7
        +006C   4080BD4C     TST.B      LoadTrap
        +0070   4080BD50     BEQ.S      LoadSeg+0074
        +0072   4080BD52     Debugger
        +0074   4080BD54     RTS
        +0076   4080BD56     MOVEQ      #$0F,D0
        +0078   4080BD58     SysError
        +007A   4080BD5A     Debugger
        +007C   4080BD5C     ST         ResLoad
        +0080   4080BD60     SUBQ.W     #$4,A7
        +0082   4080BD62     MOVE.L     #$434F4445,-(A7)    ;'CODE'
        +0088   4080BD68     MOVE.W     D0,-(A7)
        +008A   4080BD6A     GetResource
        +008C   4080BD6C     MOVEA.L    (A7)+,A0
        +008E   4080BD6E     MOVE.L     A0,D0
        +0090   4080BD70     RTS

BAL The JSR to Dispatcher+00C6 flushes the instruction cache. Because the 68030 has separate instruction and data caches, LoadSeg needs to do that so the processor fetches the newly loaded code instead of stale instructions still sitting in the instruction cache. Next the subroutine at +007C gets the code resource. If the handle isn't locked, it's moved high and locked. Then we find the first jump table entry for the segment, and test to see if it's loaded by checking whether its instruction word is $4EF9 (a JMP.L). If it's not loaded, each entry for this segment is updated from the unloaded form (involving a call to LoadSeg) to the loaded form (involving a JMP.L). But it must have skipped this, because otherwise the PEA at +0052 would have overwritten the return address from the call to StripAddress, and that return address is still on the stack.
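The jump table rewrite in the loop at +0048 through +005A can be modeled in C++ as a rough sketch. It treats memory as a byte vector and assumes 8-byte entries whose unloaded form begins with a word holding the routine's offset within the segment. All names are hypothetical; only the $4EF9 test and the "segment base + 4 + offset" target come from the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Memory = std::vector<std::uint8_t>;  // stand-in for 68K memory (big-endian)

inline std::uint16_t readWord(const Memory& m, std::size_t at) {
    return static_cast<std::uint16_t>((m[at] << 8) | m[at + 1]);
}
inline void writeWord(Memory& m, std::size_t at, std::uint16_t v) {
    m[at] = static_cast<std::uint8_t>(v >> 8);
    m[at + 1] = static_cast<std::uint8_t>(v & 0xFF);
}
inline std::uint32_t readLong(const Memory& m, std::size_t at) {
    return (static_cast<std::uint32_t>(readWord(m, at)) << 16) | readWord(m, at + 2);
}
inline void writeLong(Memory& m, std::size_t at, std::uint32_t v) {
    writeWord(m, at, static_cast<std::uint16_t>(v >> 16));
    writeWord(m, at + 2, static_cast<std::uint16_t>(v & 0xFFFF));
}

constexpr std::uint16_t kJmpAbsLong = 0x4EF9;  // opcode word tested at +0034
constexpr std::size_t kEntrySize = 8;          // assumed size of one jump table entry

// Convert a segment's entries from unloaded to loaded form.
// Returns the number of entries patched; 0 means the first entry already
// held a JMP, so the whole pass was skipped (the BEQ at +003A).
inline int patchJumpTable(Memory& m, std::size_t firstEntry, std::uint16_t entryCount,
                          std::uint16_t segNum, std::uint32_t segBase) {
    assert(firstEntry + entryCount * kEntrySize <= m.size());
    if (entryCount == 0 || readWord(m, firstEntry + 2) == kJmpAbsLong) return 0;
    int patched = 0;
    for (std::uint16_t e = 0; e < entryCount; ++e) {
        const std::size_t at = firstEntry + e * kEntrySize;
        const std::uint16_t offset = readWord(m, at);  // MOVE.W (A1)+,D2
        writeWord(m, at, segNum);                      // MOVE.W D1,-$0002(A1)
        writeWord(m, at + 2, kJmpAbsLong);             // MOVE.W #$4EF9,(A1)+
        writeLong(m, at + 4, segBase + 4 + offset);    // PEA $04(A0,D2.L)
        ++patched;
    }
    return patched;
}
```

The point to notice is that the patched entries hold absolute addresses. They stay correct only as long as the segment does not move, which is why LoadSeg locks the handle before patching.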

KON It skipped over the code to transform the jump table entries, so the segment must have already been loaded. But if the segment was loaded, the jump table wouldn't have any LoadSeg calls for that segment. Somehow LoadSeg was called for a segment that was already loaded. So your application must be calling LoadSeg manually!

15 Steve Honest, I'm not calling LoadSeg manually. A search of my source code verifies this.

BAL The only other way for LoadSeg to get called is through the jump table. How does your reference-counting segment unloader work? Is it possible that a segment gets called by your reference-counting code while you're in the process of loading it?

10 Steve It shouldn't be. The reference counting is done manually; we don't patch LoadSeg or anything nasty like that. At any rate, segment 44 isn't reference-counted.

KON Here's an idea: When LoadSeg was called to bring in segment 44, it called GetResource to bring the resource into memory. Assuming the code segment had been loaded in the past and later unloaded and purged, GetResource would have called ReallocHandle, which was short on memory and called your GrowZone hook. Your GrowZone function started freeing memory and then called another function in segment 44, triggering a recursive call to LoadSeg.

BAL With enough memory free, segment 44 was loaded. Then the GrowZone function exited back to the ReallocHandle call, which succeeded, and segment 44 was loaded again when the GetResource call completed. When the original LoadSeg checked the state of the jump table, it was already kosher, so the test for $4EF9 fired and the return address from StripAddress didn't get overwritten.
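The reentrancy KON and BAL describe can be reproduced in a toy C++ model. This is only a sketch of the control flow, with hypothetical names and no real Memory Manager or Resource Manager calls. It shows how one call into the segment ends with two copies in the heap and only one pass over the jump table.

```cpp
#include <cassert>

// Toy model of the control flow only; every name here is hypothetical.
struct SegmentLoaderSim {
    bool lowMemory = true;          // the first allocation attempt will fail
    bool jumpTablePatched = false;  // true once the entries are in JMP form
    int copiesInHeap = 0;           // blocks holding the segment's code
    int patchPasses = 0;            // times the jump table was rewritten
    int depth = 0;                  // current nesting of loadSeg calls

    // A call through the jump table: unloaded entries trap to the loader.
    void callIntoSegment() {
        if (!jumpTablePatched) loadSeg();
    }

    void loadSeg() {
        assert(depth < 4);  // guard against runaway recursion in the model
        ++depth;
        getResource();
        if (!jumpTablePatched) {  // the $4EF9 test
            jumpTablePatched = true;
            ++patchPasses;
        }
        --depth;
    }

    // Allocating room for the code may invoke the grow-zone hook first.
    void getResource() {
        if (lowMemory) growZone();
        ++copiesInHeap;
    }

    // The hook frees memory, then calls a routine that lives in the
    // very segment being loaded.
    void growZone() {
        lowMemory = false;
        callIntoSegment();
    }
};
```

With `lowMemory` set, a single `callIntoSegment()` leaves `copiesInHeap` at 2 and `patchPasses` at 1, matching the heap dump and the intact StripAddress return address on the stack. With plenty of memory, the same call yields one copy and one pass.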

KON That certainly explains the confused heap. The GZSaveHnd that was passed to the GrowZone function shouldn't be touched, but you called GetResource on it indirectly via LoadSeg. It also explains the skewed jump table entries: after allocating a memory block, ReallocHandle simply assigns the master pointer to point to that block, without preserving the handle state stored in the high byte of the master pointer. This effectively sets the handle state to 0, erasing the HLock call from the inner LoadSeg. Thus, when the outer LoadSeg called MoveHHi on the second copy of the segment, the lock bit in the master pointer -- which is shared by both blocks -- was clear. So when MoveHHi called CompactMem, the first copy of the segment was free to move (in this case, by $017C bytes). Finally, GetResource returned to the original LoadSeg, which set the lock bit again.
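The arithmetic behind the skew is simple enough to check by hand. The C++ sketch below models a 24-bit master pointer as a plain integer and uses made-up addresses. `reallocAssign` stands in for the behavior KON attributes to ReallocHandle and is not the real routine. The direction of the move (toward lower addresses) is inferred from Save sitting $017C bytes above the stale target.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kLockBit = 0x80000000u;      // lock flag in the high byte
constexpr std::uint32_t kAddressMask = 0x00FFFFFFu;  // 24-bit address

inline void lockHandle(std::uint32_t& mp) { mp |= kLockBit; }
inline bool isLocked(std::uint32_t mp) { return (mp & kLockBit) != 0; }

// Model of the described behavior: the bare address of the new block is
// stored in the master pointer, so whatever was in the flag byte is lost.
inline void reallocAssign(std::uint32_t& mp, std::uint32_t newBlock) {
    mp = newBlock & kAddressMask;
}

// Offset within the moved copy at which a stale jump table target lands.
inline std::uint32_t landingOffset(std::uint32_t staleTarget, std::uint32_t movedBase) {
    assert(staleTarget >= movedBase);  // the copy slid toward lower addresses
    return staleTarget - movedBase;
}
```

For example, suppose the first copy is loaded at a hypothetical $30017C and locked, and the jump table entry for a routine $20 bytes into it is patched to $30019C. After `reallocAssign` repoints the master pointer at the second copy, `isLocked` is false, so compaction may slide the first copy down to $300000. The stale target $30019C now lands $19C bytes into that copy, which is $017C bytes past the routine's real entry point.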

BAL Take another look at your link map. Are there any routines in segment 44 that could be called from your GrowZone hook?

5 Steve That's funny. There are some routines in this segment that shouldn't be there -- in fact, they shouldn't be anywhere. They're supposed to be inline functions!

BAL The C++ compiler won't always copy a function inline, even if it's declared that way. This can happen if the function body is too complicated. Segment loading is a foreign concept that doesn't fit well into a C++ class hierarchy, and the MPW implementation has a few puzzlers.

KON Some of the calls to these "inline" functions were from segment 44, so they happened to be placed in that segment. Then, when the GrowZone hook tried to call one of the inline functions, it had to load segment 44 -- and the rest is history.

BAL C++ claims another victim.

Steve So how do I avoid this in the future? Put segment #pragmas around all my inline functions?

BAL That's a superstition believed by some people who should know better. It doesn't work.

KON What does work is what those Finder folks did. MPW CFront puts the uninlineable functions at the end of the file it's compiling. The Finder folks just end every file with "#pragma segment CFrontCruft," and all the unexpected functions wind up in one easy-to-manage segment.

Steve "Uninlineable" isn't a word.

BAL That's why it's called cruft. Incidentally, this technique also catches functions that the compiler has to synthesize entirely, such as constructors and destructors for classes where they're needed (to initialize the vtable, for example) but aren't declared explicitly. And by looking at the link map, you can see what the compiler is doing behind your back -- although you might be happier not knowing.
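In source form, the trick KON describes amounts to one directive at the very end of each file. The fragment below is only a sketch: the class is invented, and the `HYPOTHETICAL_MPW_CFRONT_BUILD` guard is a made-up macro added so that the file also compiles cleanly with compilers that know nothing about `#pragma segment`.

```cpp
#include <cassert>

// An invented class: it has a virtual function, so the compiler must
// synthesize a constructor to set up the vtable, and it has a function
// declared inline that the compiler is free to emit out of line anyway.
class Document {
public:
    virtual ~Document() {}
    inline int bumpChangeCount() { return ++changeCount_; }

private:
    int changeCount_ = 0;
};

int touchTwice() {
    Document d;
    d.bumpChangeCount();
    return d.bumpChangeCount();
}

// Last thing in the file: anything the compiler emits after this point
// (out-of-line copies of inlines, synthesized constructors and destructors)
// is assigned to one known segment.
#if defined(HYPOTHETICAL_MPW_CFRONT_BUILD)
#pragma segment CFrontCruft
#endif
```

Whether the leftovers really land in that segment is something to confirm in the link map, as BAL suggests.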

KON Nasty.

BAL Yeah.

SCORING

  • 80-100 You should be a guest puzzler yourself; send in a draft to AppleLink DEVELOP.
  • 55-75 Pretty sharp; maybe you can write the first hot OpenDoc container app.
  • 30-50 Maybe you can write an OpenDoc part.
  • 5-25 Maybe you'd better stick to AppleScript.

KONSTANTIN OTHMER AND BRUCE LEAK have given up sleep because they need all the time they can get to manipulate the penny stock market via the budding information superhighway. They're no longer trying to break the sound barrier, but are working on the Hedgehog barrier. BAL wonders, "What's that blue Hedgehog got that our green Armadillo doesn't have?"

STEVE NEWMAN (AppleLink STEVENEWMAN) has been programming on the Macintosh since 1984. Currently, he works at Common Knowledge, Inc., writing information management tools. In a previous life he cowrote FullPaint, FullWrite Professional, and Spectre. When asked if he thought the Power Macintosh was the hottest new game platform he'd seen since the Atari 800, he replied "yes."

Thanks to scott douglass for reviewing this column, and to Ludis Langens for wading into a haystack of hex and emerging with a needle labeled 4080BD0A.

 

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.