Precise timing
Volume Number: 6
Issue Number: 6
Column Tag: Assembly Lab
Related Info: Time Manager
Mac II Timing
By Olivier Maquelin, Stephan Murer, Zurich, Switzerland
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
Precise timing on the Macintosh II
[Olivier Maquelin and Stephan Murer are both researchers and teaching assistants at the Swiss Federal Institute of Technology in Zurich, Switzerland. Currently they are involved with a dataflow multiprocessor project and are working towards their Ph.D. theses. The working environment at the institute consists of about 80 networked Mac IIs including five AppleShare file servers, some laser printers, a scanner, two MicroVAXes and some communication hardware. We are programming in Pascal and Modula-2 under MPW and make use of many other Mac applications.]
The Problem
Determining time or measuring the duration of some process from within a program is a task most programmers have had to face at least once in their careers. For that reason, most operating systems, including the Mac OS, offer services to determine the current time and date. Unfortunately, in some cases the resolution or the accuracy of the system clock is not sufficient to solve the task at hand. We ran into that problem recently, when we wanted to develop a profiler to test programs written in P1-Modula-2 under MPW. The Time Manager provides only delays with 1 ms accuracy, which is much too coarse to measure the execution time of small procedures. The timer we want to describe here is accurate to a couple of microseconds, depending on how it is used.
The Idea
A straightforward way to measure time on a Mac is to use the global variable Ticks, which is incremented during each Vertical Blanking interrupt, that is every 16.63 ms. A more complicated, but much more precise way to do it is to use one of the hardware timers, which are decremented every 1.2766 µs. The Mac Plus and SE have two such timers, which are used by the Sound Manager and the Disk Driver. The Mac II has four of them, two being used by the Sound Manager and the Disk Driver as in the older machines, one being used to generate the Vertical Blanking signal, and the last one being currently unused by the Mac OS.
We could have used the fourth timer, but that would have meant installing an interrupt routine in the VIA dispatch table and setting up the VIA, and there was the risk of someone else using that timer. We decided instead to use the already set up Vertical Blanking timer in conjunction with the global variable Ticks. Because we don't need to modify the configuration of that counter, multiple applications can use our Timer module at the same time without interfering with one another. A minor complication in doing so is that the timer does not directly generate an interrupt. Instead, each time it reaches zero, bit 7 of VIA2 buffer B is inverted. This bit is used as an output and drives the CA1 pin of VIA1, an interrupt being generated at each transition from 0 to 1. For that reason, the state of VIA2 buffer B also has to be taken into account.
Determining Time
To determine the current time, four different values must be read: the low and high bytes of the Vertical Blanking timer, which must be read separately from the VIA (vT1C and vT1CH), the state of the Vertical Blanking signal (vBufB bit 7) and the global variable Ticks. The Vertical Blanking timer is set up to count down repeatedly from hex $196E (= 6510) to zero. In fact, due to a peculiarity of the 6522 VIA, zero is first followed by hex $FFFF (= -1), and only then by hex $196E, adding a supplementary step to the counting process. Each timing period thus lasts 6512 cycles, which leads to the following formula to calculate the time in microseconds since startup:
{1}
viaVal = (vBufB bit 7) * 6512 - vT1CH * 256 - vT1C
time = (2 * 6512 * (Ticks + 1) - viaVal) * 1.2766µs
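The period arithmetic used in the formula can be checked with a few lines of code. The following Python sketch is not part of the Timer unit; its names are purely illustrative. It enumerates the counter values of one half-period, including the extra $FFFF step, and converts the resulting tick length to milliseconds.

```python
# Illustrative check of the Vertical Blanking timer period.
# All names here are assumptions made for this sketch, not part of the unit.
RELOAD = 0x196E        # value the timer counts down from (6510)
CYCLE_US = 1.2766      # duration of one timer cycle in microseconds


def half_period_states(reload=RELOAD):
    """Return the counter values seen during one half-period, in order."""
    states = list(range(reload, -1, -1))   # reload, reload - 1, ..., 1, 0
    states.append(0xFFFF)                  # extra step before the reload
    return states


states = half_period_states()
cycles_per_half = len(states)              # one inversion of vBufB bit 7
cycles_per_tick = 2 * cycles_per_half      # two inversions per tick
tick_ms = cycles_per_tick * CYCLE_US / 1000.0

print(cycles_per_half, cycles_per_tick, round(tick_ms, 2))  # 6512 13024 16.63
```

The result agrees with the 16.63 ms tick length mentioned earlier.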
Unfortunately, because all these values are constantly changing, it is not sufficient to simply read these values and apply the formula. Consider the following two examples, where the high byte of the counter is read first, then after about two microseconds the low byte:
First example:
counter value (hex): $0228    value read from vT1CH (hex): $02
counter value (hex): $0226    value read from vT1C (hex): $26
Second example:
counter value (hex): $0200    value read from vT1CH (hex): $02
counter value (hex): $01FE    value read from vT1C (hex): $FE
In the first example everything went well. The resulting hexadecimal value is $0226, which corresponds to the last counter value. In the second example however, the resulting hexadecimal value is $02FE, which is much different from either $0200 or $01FE. Such errors always occur when the high byte of the counter changes between the two reads.
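The two examples can be reproduced with a small Python sketch. It is only a model of the effect: the function name and the assumption that exactly two cycles elapse between the reads are ours, and the reload of the counter is ignored.

```python
def torn_read(counter_at_high_read, cycles_between_reads=2):
    """Combine a high byte and a low byte that were read at different times.

    Simplified model: the counter just decrements between the two reads
    (no reload handling), and the delay is a fixed number of cycles.
    """
    high = (counter_at_high_read >> 8) & 0xFF
    counter_at_low_read = (counter_at_high_read - cycles_between_reads) & 0xFFFF
    low = counter_at_low_read & 0xFF
    return (high << 8) | low


print(hex(torn_read(0x0228)))  # 0x226: high byte unchanged, result is valid
print(hex(torn_read(0x0200)))  # 0x2fe: high byte changed between the reads
```

The second result is off by almost 256 cycles, or about 0.3 ms, which is far more than the resolution we are after.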
Different solutions to that problem exist. Our solution, shown as Pascal code below, relies on the fact that the time between two changes of the counter is relatively long. The values needed for the future computations are read once and a test is done to check if the high byte of the counter changed during that time. If it did, all the values are read a second time and should be valid. The variable hib also has to be read once more, in case the first read was from the special timer value hex $FFFF. Interrupts are disabled to make sure that all these operations are done without interruption. The variable Ticks can be read safely as long as interrupts are disabled, because it is incremented by the Vertical Blanking interrupt handler.
{2}
DisableInterrupts;
hib0 := vT1CH; (* read the high byte a first time *)
buf := vBufB; (* read all the values needed *)
lob := vT1C;
hib := vT1CH; (* read the high byte a second time *)
(* if the high byte changed in between... *)
IF hib <> hib0 THEN
BEGIN
(* read all the values once more *)
buf := vBufB; lob := vT1C;
(* in case first read of hib was $FF *)
hib := vT1CH;
END;
(* the Ticks can be read safely here *)
myTicks := Ticks;
EnableInterrupts;
A last problem occurs when the Vertical Blanking signal becomes high after interrupts have been disabled and before the timer has been read. In that case, the state of the VIA reflects the beginning of the new timing interval, while the Ticks variable still contains the old tick value. This can be handled by testing whether the value read from the VIA is within a small number (e.g. 10) of cycles from the beginning of the interval, and incrementing the number read from the Ticks variable by one if this is the case. Such small numbers cannot be read after the Vertical Blanking interrupt, because of the execution time of the interrupt handler.
The Unit Timer
The unit Timer exports procedures to initialize, start and stop software timers and allows any number of them to be active (i.e. started but not yet stopped) at the same time. When stopped, they contain the measured time as a 64 bit wide number of cycles (32 bits allow only measurements up to 1.5 hours). They can be started and stopped repeatedly and will then contain the total time they have been running. A constant to convert the 64 bit format into an extended real value in milliseconds is provided for convenience.
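The range figure quoted above follows directly from the cycle time. A short Python check, with helper names of our own choosing, and using the same conversion factor as the MsPerClock constant of the unit:

```python
CYCLE_US = 1.2766            # one timer cycle in microseconds
MS_PER_CLOCK = 1.2766e-3     # same factor expressed in milliseconds per cycle


def max_hours(bits):
    """Longest interval an unsigned cycle counter of `bits` bits can hold."""
    seconds = (2 ** bits) * CYCLE_US / 1e6
    return seconds / 3600.0


def cycles_to_ms(cycles):
    """Convert a cycle count to milliseconds."""
    return cycles * MS_PER_CLOCK


print(round(max_hours(32), 2))      # 1.52 hours for a 32 bit counter
print(round(cycles_to_ms(27), 4))   # 0.0345 ms for 27 cycles
```

With 64 bits the range is large enough that overflow is of no practical concern.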
Because we want to use these timing routines in a profiler, they should not only be accurate, they also should not disturb the temporal behavior of the code they are timing, even if many measurements are being done at the same time. Because the execution time of a procedure can be very short, this is only possible if the routines execute very fast (a few microseconds) or through some kind of compensation. In our case, the execution time of the routines is about 35µs and a compensation is needed. For that purpose, a counter tracking the total time spent in the routines StartTimer and StopTimer is maintained. In addition, the processor cache is disabled during these routines in order to keep the execution time as constant as possible and to reduce the influence on other parts of the code.
It is interesting to note that a single number is sufficient to contain the state of a timer during its whole existence. To implement the compensation, a single global counter is needed, which contains a running total of the time spent in the routines to be compensated for. The algorithm used here is in fact very simple. As can be seen below, StartTimer subtracts the current time from the timer value and adds the current compensation value, while StopTimer adds the current time to the timer value and subtracts the compensation value. Before doing that, both procedures add their expected execution time to the compensation value. After calls to InitTimer, StartTimer and StopTimer in sequence, timer thus contains the value: 0 - Time1 + 35µs + Time2 - (35µs + 35µs) = Time2 - Time1 - 35µs, which is the time difference between the two calls minus the compensation.
{3}
InitTimer (timer):
timer := 0;
StartTimer (timer):
totalComp := totalComp + 35µs;
timer := timer - ActualTime + totalComp;
StopTimer (timer):
totalComp := totalComp + 35µs;
timer := timer + ActualTime - totalComp;
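The bookkeeping above is easy to try out with explicit time stamps. The following Python sketch is only an illustration: the class name is invented, the current time is passed in as a parameter instead of being read from the VIA, and all values are in cycles (27 cycles correspond to about 35µs).

```python
UNIT_COMP = 27   # expected cost of one StartTimer or StopTimer call, in cycles


class TimerBank:
    """Holds the shared compensation total; timers are plain integers."""

    def __init__(self, unit_comp=UNIT_COMP):
        self.unit_comp = unit_comp
        self.total_comp = 0

    def start(self, timer, now):
        self.total_comp += self.unit_comp
        return timer - now + self.total_comp

    def stop(self, timer, now):
        self.total_comp += self.unit_comp
        return timer + now - self.total_comp


bank = TimerBank()
t = 0                        # InitTimer
t = bank.start(t, 1000)      # StartTimer at time 1000
t = bank.stop(t, 5000)       # StopTimer at time 5000
print(t)                     # 3973, i.e. 4000 - 27

plain = TimerBank(unit_comp=0)
print(plain.stop(plain.start(0, 1000), 5000))   # 4000 without compensation
```

Because the compensation total is shared, a timer that is running while other timers are started and stopped is corrected for those calls as well.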
Using the unit Timer
Consider the following example that shows the usage of the Timer unit. The main program contains two FOR-loops that are both executed 100 times. The first loop does nothing and the second repeatedly calls the empty procedure Dummy. Three timers are used in that example. Timer t1 measures the execution time of the first loop, timer t2 does the same for the second loop and timer t3 measures the total execution time.
{4}
PROCEDURE Dummy; BEGIN END;
...
(* Initialize the three timers *)
InitTimer (t1); InitTimer (t2);
InitTimer (t3);
StartTimer (t3);
StartTimer (t1);
(* First loop *)
FOR i := 1 TO 100 DO END;
StopTimer (t1);
StartTimer (t2);
(* Second loop *)
FOR i := 1 TO 100 DO Dummy END;
StopTimer (t2);
StopTimer (t3);
...
The following table shows the resulting timer values with and without compensation and with the processor cache enabled or disabled. In the compensated case the value of timer t3 is roughly equal to the sum of t1 and t2, as would be expected from an ideal timer. In the uncompensated case the execution time of StartTimer and StopTimer is added once to the value of t1 and t2 and five times to t3 (about 175µs). This example also shows that in this case using the processor cache leads to a speed improvement of 30 to 40% and that the execution time of the procedure Dummy is about 1.5µs. This seems reasonable, since the compiler generates only an RTS instruction for such a procedure.
Compensation   Cache   Timer t1   Timer t2   Timer t3
27 (= 35µs)    On      0.167ms    0.314ms    0.480ms
27 (= 35µs)    Off     0.271ms    0.465ms    0.738ms
0 (Off)        On      0.202ms    0.349ms    0.651ms
0 (Off)        Off     0.306ms    0.499ms    0.910ms
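The figures quoted in the text can be derived from the table with a few lines of Python. The dictionary layout and variable names below are our own; the values are copied from the table.

```python
# Timer values in milliseconds, copied from the table: (t1, t2, t3)
results = {
    ("comp", "on"): (0.167, 0.314, 0.480),
    ("comp", "off"): (0.271, 0.465, 0.738),
    ("nocomp", "on"): (0.202, 0.349, 0.651),
    ("nocomp", "off"): (0.306, 0.499, 0.910),
}

t1, t2, t3 = results[("comp", "on")]
u1, u2, u3 = results[("nocomp", "on")]
c1, c2, c3 = results[("comp", "off")]

dummy_call_us = (t2 - t1) * 1000.0 / 100   # cost of one call to Dummy
overhead_t1_us = (u1 - t1) * 1000.0        # one uncompensated unit in t1
overhead_t3_us = (u3 - t3) * 1000.0        # five uncompensated units in t3
cache_gain_t1 = 1.0 - t1 / c1              # time saved with the cache on
cache_gain_t2 = 1.0 - t2 / c2

print(round(dummy_call_us, 2), round(overhead_t1_us), round(overhead_t3_us))
print(round(cache_gain_t1, 2), round(cache_gain_t2, 2))
```

The overhead seen by t3 comes out slightly below five times the nominal 35µs, which is consistent with the compensation value being an approximation.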
Concluding remarks
As the previous example shows, our timer routines can give very accurate results. Also, because no interrupt routines are used and because the configuration of the hardware timers is not modified, there are no compatibility problems and no unwanted interactions with system routines. A few things have to be kept in mind however. First, our timer works only on the Macintosh II (probably also on the Macintosh IIx and on the SE/30, but we could not test this). Second, measurements of small execution times must be done several times in order to detect slowdowns due to interrupt routines, which have execution times ranging between 60µs and 1ms or more. And third, the compensation value depends not only on the routines themselves but also on the calling sequence generated by the compiler. For example, using timers stored in an array will be slower than using timers stored as variables, because of the additional array indexing operations. When the processor cache is enabled it further depends on how much of the calling sequence is contained in the cache.
*--------------------------------------------
*
*IMPLEMENTATION of UNIT Timing
*
*Version 1.0 / O. Maquelin / 22-May-89
*
**** Runs only on Macintosh II ***
*
* --------------------------------------------
CASE ON
MACHINE MC68020 ; needs 68020 instructions
HWNonPortable EQU 1 ; needs Mac II hardware
onMac EQU 0
onNuMac EQU 1
INCLUDE 'HardwareEqu.a'
INCLUDE 'SysEqu.a'
EXPORT (unitComp, totComp): DATA
EXPORT (INITTIMER,STARTTIMER,STOPTIMER) : CODE
ClkPerTick EQU 13024 ; cycles per tick
;(16.626 ms)
Timer RECORD 0 ; local definition of
; Timer
hi DS.L 1 ; high longword
lo DS.L 1 ; low longword
ENDR
* --------------------------------------------
*
*Declaration of the exported variables
*
* --------------------------------------------
unitComp RECORD EXPORT ; 27 cycles
DC.L 27 ; compensation (35µs)
ENDR
totComp RECORD EXPORT; totComp initially
hi DC.L 0 ; zero
lo DC.L 0
ENDR
* --------------------------------------------
*
*PROCEDURE InitTimer (VAR t: Timer)
*
*Initializes a timer (t := 0)
*
* --------------------------------------------
INITTIMER PROC EXPORT
MOVE.L (SP)+,A0 ; get return address
MOVE.L (SP)+,A1 ; get address of t
CLR.L (A1)+ ; clear two longwords
CLR.L (A1)
JMP (A0); back to caller
ENDPROC
* --------------------------------------------
*
*PROCEDURE GetTime (hi: D0.L; lo: D1.L)
*
*GetTime returns the actual time in
*clock cycles (1.2766 µs per
*cycle) in the registers D0 and D1. Time
*is determined from the
*global variable Ticks and from the
*state of VIA 2.
*
* --------------------------------------------
GetTime PROC ENTRY
MOVE.L #VBase2,A1; get base address of
; VIA2
MOVE SR,-(SP) ; disable interrupts
ORI #$0700,SR
MOVE.B vT1CH(A1),D1; read high byte of
; timer 1
MOVE.B vBufB(A1),D0; read state of
; pseudo-VBL
MOVE.B vT1C(A1),D2; read low byte of
; timer 1
ROR.W #8,D2
MOVE.B vT1CH(A1),D2; read high byte
; second time
CMP.B D1,D2 ; if both are equal we
; are done,
BEQ.S @1; else read everything
; once more
MOVE.B vBufB(A1),D0; read state of
; pseudo-VBL
MOVE.B vT1C(A1),D2; read low byte of
; timer 1
ROR.W #8,D2
MOVE.B vT1CH(A1),D2; read high byte of
; timer 1
@1 ROR.W #8,D2 ; exchange low and high
; byte
MOVEQ #7,D1 ; first phase of the
; tick?
BTST.L D1,D0
BNE.S @2
ADD.W #ClkPerTick/2,D2; no, correct
; number of cycles
@2 MOVE.L Ticks,D1 ; read Ticks
MOVE (SP)+,SR ; enable interrupts
CMP.W #ClkPerTick-10,D2; was the value
; of Ticks valid?
BLE.S @3
ADDQ.L #1,D1 ; no, correct the value
; read (Ticks is a longword)
@3 MULU.L #ClkPerTick,D0:D1; convert ticks
; to cycles and
EXT.L D2; subtract VIA value
SUB.L D2,D1
MOVEQ #0,D2
SUBX.L D2,D0
RTS
ENDPROC
* --------------------------------------------
*
*PROCEDURE StartTimer (VAR t: Timer)
*
*Starts a timer (t := t - Time+totComp;
*totComp := totComp+unitComp)
*
* --------------------------------------------
STARTTIMER PROC EXPORT
MOVEC CACR,D0 ; disable cache, save
; old state
MOVE.L D0,A0
AND.B #$FE,D0
MOVEC D0,CACR
MOVE.L unitComp,D0; increment total
; compensation
ADD.L D0,totComp.lo
BCC.S @1; is there a carry to
; add
ADDQ.L #1,totComp.hi; yes, increment
; high longword
@1 JSR GetTime ; determine actual time
MOVE.L 4(SP),A1 ; get address of t
MOVE.L totComp.hi,D2; subtract
; compensation
SUB.L totComp.lo,D1
SUBX.L D2,D0
MOVE.L Timer.hi(A1),D2; subtract result
; from timer
SUB.L D1,Timer.lo(A1)
SUBX.L D0,D2
MOVE.L D2,Timer.hi(A1)
MOVEC A0,CACR ; restore cache state
MOVE.L (SP)+,A0 ; return to caller
ADDQ #4,SP
JMP (A0)
ENDPROC
* --------------------------------------------
*
*PROCEDURE StopTimer(VAR t: Timer)
*
*Stops a timer (t := t + Time - totComp;
*totComp := totComp + unitComp)
*
* --------------------------------------------
STOPTIMER PROC EXPORT
MOVEC CACR,D0 ; disable cache, save
; old state
MOVE.L D0,A0
AND.B #$FE,D0
MOVEC D0,CACR
MOVE.L unitComp,D0; increment total
; compensation
ADD.L D0,totComp.lo
BCC.S @1; is there a carry to
; add
ADDQ.L #1,totComp.hi; yes, increment
; high longword
@1 JSR GetTime ; determine actual time
MOVE.L 4(SP),A1 ; get address of t
MOVE.L totComp.hi,D2; subtract
; compensation
SUB.L totComp.lo,D1
SUBX.L D2,D0
MOVE.L Timer.hi(A1),D2; add result to
; timer
ADD.L D1,Timer.lo(A1)
ADDX.L D0,D2
MOVE.L D2,Timer.hi(A1)
MOVEC A0,CACR ; restore cache state
MOVE.L (SP)+,A0 ; return to caller
ADDQ #4,SP
JMP (A0)
ENDPROC
END
UNIT Timing;
INTERFACE
{$PUSH} {$J+}
{ The actual variables and code are contained in a
separate assembly language file. The assembled
output must be linked with programs using this
unit }
TYPE
Timer = COMP;
CONST
MsPerClock = 1.2766E-3;
VAR
unitComp: LONGINT;{ Compensation for one call }
totComp: COMP; { Accumulated compensation }
{$POP}
PROCEDURE InitTimer (VAR t: Timer);
{ Initializes a timer (t := 0) }
PROCEDURE StartTimer (VAR t: Timer);
{ Starts a timer (t := t - Time + totComp;
totComp := totComp + unitComp) }
PROCEDURE StopTimer (VAR t: Timer);
{ Stops a timer (t := t + Time - totComp;
totComp := totComp + unitComp) }
END.
PROGRAM TimingTest;
USES Timing;
VAR i: INTEGER;
t1, t2, t3: Timer;
PROCEDURE Dummy; BEGIN END;
BEGIN
unitComp := 0;
InitTimer (t1);
InitTimer (t2);
InitTimer (t3);
StartTimer (t3);
StartTimer (t1);
FOR i := 1 TO 100 DO { empty loop };
StopTimer (t1);
StartTimer (t2);
FOR i := 1 TO 100 DO Dummy;
StopTimer (t2);
StopTimer (t3);
Write (t1 * MsPerClock: 16: 3);
Write (t2 * MsPerClock: 16: 3);
Write (t3 * MsPerClock: 16: 3);
WriteLn;
END.