KON & BAL'S PUZZLE PAGE
SLEEPING BEAUTY
KONSTANTIN OTHMER AND BRUCE LEAK
See if you can solve this programming puzzle, presented in the form of a dialog
between Konstantin Othmer (KON) and Bruce Leak (BAL). The dialog gives clues
to help you. Keep guessing until you're done; your score is the number to the left of
the clue that gave you the correct answer. These problems are supposed to be tough. If
you don't get a high score, at least you'll learn interesting Macintosh trivia.
During the development of QuickTime, a number of interesting bugs reared their ugly heads. Try
to figure out this one before KON does.
120 BAL Here's the problem: When you play a movie on a Macintosh IIfx the machine hangs
after about six hours. If you turn off the sound, or try it on any Macintosh other than
an fx, it doesn't hang.
KON How does it hang? Is the Macintosh locked up?
115 BAL Well, the movies aren't playing. They just all freeze about halfway through. Menus
still work, though. You can even switch to the Finder and click between windows and
move icons around, but when you launch an application or open a folder, the Finder
draws one of the zoom rectangles and then hangs.
KON Like a time bomb. Can you get into MacsBug?
110 BAL Yeah.
KON Whew. For a second I thought this was going to be really tough. You've got MacsBug,
so what's the problem?
BAL I don't know, you tell me.
KON How do movies get time? SystemTask or something?
105 BAL No, you have to call MoviesTask.
KON Figures. So I set a break on MoviesTask; is it getting called?
100 BAL Nope.
KON The application is supposed to call it?
95 BAL Yep.
KON So is WaitNextEvent getting called?
90 BAL Nope.
KON But menus work???
85 BAL Yep.
KON WaitNextEvent is being called with a bogus sleep time. I set a breakpoint on it, pull
down a menu, and see what the sleep time parameter is.
80 BAL Now you're thinking, Kon! But no. The sleep time is 1, just as it's supposed to be.
KON Hmmm. So I trace 50 times and see where I'm spinning.
75 BAL There's a bunch of stuff going on; it's not just some simple loop.
KON So I record A-traps.
70 BAL There are three traps getting called: ABF7 from inside MF, A0DD (PPCToolbox)
from a big block in the system heap that's not QuickTime, and A030 (EventAvail)
from a 'scod' resource that's different from the block calling ABF7.
KON There are no null events coming through to the application, so MoviesTask never gets
called. The sleep time must never be expiring. Are ticks running?
BAL How do you figure that out?
KON I DM ticks, continue executing, and then DM ticks again.
65 BAL Ticks doesn't change.
KON It's updated by some hardware mechanism at interrupt time, right?
BAL Sure, a hardware mechanism. Is that what they teach you at Caltech?
KON OK, OK. There's a heartbeat task that generates a level-1 interrupt that updates ticks,
right?
60 BAL Yeah.
KON So is the level-1 interrupt happening?
BAL How do you check that?
KON I know from reading this cool develop column that the level-1 interrupt vector is at
location $64, so I set a break there and see if it's firing.
55 BAL Wait a second. That's the way ticks are updated on every Macintosh except the fx.
This problem happens only on an fx.
KON Why is the fx different?
50 BAL The guy that designed the hardware assumed that the heartbeat task--and thus ticks--
was supposed to happen every 60th of a second. Unfortunately it's supposed to be
every 60.14th of a second or something, so Gary Davidian fixed it by installing a Time
Manager task that updates ticks. The extended Time Manager, found in System 6.0.4
and later, adds a drift-free mode for Time Manager callbacks. This allows for accurate
scheduling of periodic events without long-term drift. With the extended Time
Manager, the next callback is scheduled with respect to when the current callback
should have fired, rather than the current time. So as of System 6.0.4 there are two
types of Time Manager tasks: the regular ones and the drift-free ones. Ticks are
updated via a drift-free task, so they advance accurately over time.
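The difference between the two kinds of task can be shown with a minimal Python sketch. This is an illustration only, not the Time Manager's actual code. The function name `due_times`, the 50-microsecond service latency, and the 16,625-microsecond period (roughly a sixtieth of a second) are all assumptions made for the example.

```python
def due_times(period_us, latency_us, callbacks, drift_free):
    """Return the times (in microseconds) at which a periodic task is due.

    Hypothetical model: every callback actually runs latency_us late.
    """
    due = period_us              # when the first callback should fire
    times = []
    for _ in range(callbacks):
        times.append(due)
        fired_at = due + latency_us   # the callback really runs a little late
        if drift_free:
            # Drift-free mode: reschedule relative to when the callback
            # should have fired, so the latency never accumulates.
            due = due + period_us
        else:
            # Regular mode: reschedule relative to the current time,
            # so every callback pushes the schedule back by the latency.
            due = fired_at + period_us
    return times


PERIOD_US = 16_625   # assumed period, roughly a sixtieth of a second
LATENCY_US = 50      # assumed delay before each callback runs

regular = due_times(PERIOD_US, LATENCY_US, 1000, drift_free=False)
accurate = due_times(PERIOD_US, LATENCY_US, 1000, drift_free=True)

# Accumulated error of the regular task at the 1000th callback.
drift_us = regular[-1] - accurate[-1]
```

Under these assumed numbers the regular task is about 50 milliseconds behind by its 1000th callback, while the drift-free task is still due exactly on the period boundary.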
KON Hmmm. So is the Time Manager task that updates ticks getting called?
45 BAL Obviously not; ticks aren't changing.
KON How are the tasks in the queue organized?
40 BAL They're kept in the order that they fire in.
KON So where is the ticks task in the queue?
35 BAL Well, it's not the first one, and the first one is scheduled for sometime tomorrow.
KON Fine. I leave and come back tomorrow. Does my movie start playing?
30 BAL It could be. But we're shipping before then.
KON So the first element is getting messed up and never completes. Then none of the other
items in the queue get executed because they're all deltas off the first element and the
movies hang.
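KON's reasoning about the queue can be sketched with a toy delta list in Python. The layout here, where each entry stores the delay relative to the entry before it, is an assumption made for illustration. The dialog says only that tasks are kept in firing order and are deltas off the first element. The names `to_deltas` and `fired_after` are hypothetical.

```python
def to_deltas(absolute_times):
    """Convert absolute fire times into a delta list in firing order."""
    deltas = []
    previous = 0
    for t in sorted(absolute_times):
        deltas.append(t - previous)
        previous = t
    return deltas


def fired_after(deltas, elapsed):
    """Count how many tasks have fired once `elapsed` time units have passed."""
    fired = 0
    for delta in deltas:
        if delta > elapsed:
            break            # this task is not due yet, so nothing behind it is
        elapsed -= delta
        fired += 1
    return fired


healthy = to_deltas([10, 15, 20])       # becomes [10, 5, 5]
stuck = [10**9] + healthy[1:]           # head delay corrupted to a huge value

healthy_count = fired_after(healthy, 20)
stuck_count = fired_after(stuck, 20)
```

With a sane head entry all three tasks fire within 20 time units. With the corrupted head none of them do, even though the entries behind it are unchanged.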
25 BAL Now we're getting somewhere.
KON So how is that first element getting confused?
BAL Whose problem is this?
KON OK. Who owns the first item?
BAL How do you figure that out?
KON I do an IL to see what traps it's calling. I see where it is in the heap. I set a breakpoint
on it, force it to fire, and see what it does.
20 BAL You're knee deep in spaghetti. You could probably figure it out this way but it's pretty
nasty.
KON OK. I break on InsTime to see who installs it. I break on PrimeTime to see who starts
it up. I figure out whether it's using InsTime or InsXTime.
15 BAL The PrimeTime comes from an 'snth' resource. It was installed with InsXTime.
KON Aha! The Sound Manager. That's why it works when the sound is turned off.
BAL They don't pay you enough, Kon.
KON So how does this element get updated in the queue? Don't Time Manager tasks call
PrimeTime to reschedule themselves?
10 BAL Yeah.
KON So is someone calling PrimeTime on the Sound Manager task with a bogus value?
5 BAL Not really. It always calls PrimeTime with a value of 0, indicating that it wants to be
called right away. The Sound Manager does this since it's not reentrant. By scheduling
a task, it knows it won't be called until it's finished servicing the current interrupt,
avoiding reentrancy problems.
KON I'm sure everyone that's reading this has figured it out by now. I know I have.
BAL You're bluffing again, Kon.
KON It's easy. Since the drift-free Time Manager schedules tasks based on when they
should occur, and since this sound task is using a count of 0 when calling PrimeTime,
its backlog gets bigger and bigger. So each PrimeTime call schedules an event that
should have occurred further and further in the past. The task is executed
immediately, of course, but the backlog builds. It always puts this element at the head
of the queue, but eventually it overflows the long that contains the backlog, and the
scheduling time becomes incredibly large. There are about 4 billion microseconds in a
long, which is about an hour. The Time Manager counts in units that are about 20
microseconds, so it would take about 20 hours to overflow. Take away a bit for signed
math and another because it's calculating scheduling times, and I would guess it should
hang about once every 5 hours. It must have taken forever to find that one.
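KON's back-of-the-envelope estimate can be checked with a few lines of Python. This is a sketch under the dialog's own rough figures: a 32-bit long and a counting unit of about 20 microseconds. Reading "take away a bit" as dropping one binary bit each time is an interpretation, and the name `hours_to_overflow` is hypothetical.

```python
MICROSECONDS_PER_HOUR = 3_600 * 1_000_000
UNIT_US = 20   # approximate Time Manager counting unit, as quoted in the dialog


def hours_to_overflow(bits):
    """Hours needed to count through 2**bits units of UNIT_US microseconds."""
    return (2 ** bits) * UNIT_US / MICROSECONDS_PER_HOUR


full_long = hours_to_overflow(32)     # all 32 bits of the long
signed_long = hours_to_overflow(31)   # one bit lost to signed math
scheduling = hours_to_overflow(30)    # one more bit lost to scheduling math
```

These work out to roughly 23.9, 11.9, and 6.0 hours, which fits both KON's guess of about 5 hours and the hang after about six hours reported at the start.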
BAL Yeah, it was a drag. We fixed it before we shipped QuickTime, of course. But it's just
the kind of thing that makes scheduling software projects so hard.
KON Nasty.
BAL Yeah.
KONSTANTIN OTHMER AND BRUCE LEAK have been puzzling about reality, life, the universe, and even computers for a
long time. Since the great success of their Graphics '90 World Tour, which included peace-keeping, hostage-freeing, and
wall-smashing, they settled down, shipped a few QuickDraw packages, and cleaned out their closets. Then came the
coup: da division of da Union. Konstantin got QuickDraw, 200 rubles, and a guaranteed spot at the front of the bread
line. Official party line on Bruce: "vacationing in the Crimea." Bruce was actually working on the first QuickDraw spinoff.
To provide a seamless upgrade path, and to leverage off of brand awareness, he decided to call this project QuickTime.
SCORING
- 100-120 Members of the QuickTime team and their immediate family aren't eligible.
- 75-95 Scores count only on the first reading!
- 50-70 Not bad--buy yourself an ice cream.
- 25-45 The next one will be easier.
- 5-20 Stick to word searches.
Thanks to Gary Davidian and Jean-Charles Mourey for reviewing this column.