Read Assembly

Volume Number:		9
Issue Number:		5
Column Tag:		Assembly Workshop

The Secrets of the Machine

Or, how to read Assembler

By Malcolm H. Teas, Rye, New Hampshire

About the author

Malcolm H. Teas, 556 Long John Road, Rye, NH 03870 Internet: mhteas@well.sf.ca.us

AppleLink: mhteas@well.sf.ca.us@INTERNET#

America Online: mhteas

Why Read Assembler?

When we write programs in C or Pascal, what we’re really doing is writing in the computer’s second language. When I studied French I was always translating it into English in my head to understand it. Well, that’s just what the computer’s doing. It’s taking C, Pascal, or whatever you’re programming with and translating it to its native language: machine code, which we read and write as assembler. Translation is the compiler’s job. But just as my French translations would lead to errors and awkward speech, the compiler can occasionally make mistakes, and the code it creates from your source can be a little awkward too; it isn’t always the most efficient. Often this doesn’t matter much since the CPU is quite fast. However, if you’re writing a time-critical algorithm, if your application just isn’t as fast as you want it to be, or if you suspect some strange error, then it’s time to talk to the machine in its native language. This article is a traveller’s phrasebook.

Where do I find assembler?

Although this wasn’t always so, these days you can find some way to examine either the assembler code translated from your source or your disassembled application. If you’re using the latest version of Think C (version 5.0), try the “Disassemble” item in the “Source” menu. This generates the assembler translation of the source in the front window. The new window that results shows the assembler and can be printed, saved, and otherwise treated like any other Think C editor window.

MPW (Macintosh Programmer’s Workshop) also offers a number of tools to get at your assembler listings. The DumpCode tool takes a code resource and disassembles it. It can also list the jump table and other information the resource includes. I’ll talk about jump tables when I cover the memory map of an application. If you use the SourceBug debugger, it has an option to view source as either the original language or as assembler.

ResEdit now has an external that disassembles a code resource when you open it. It is quite helpful in finding the targets of jumps and other memory addresses, since it shows them graphically with arrows. Unfortunately, it doesn’t permit editing of the assembler. While this external is not officially supported by Apple, it has worked well for me.

If you cannot get any of these, you can always use a low-level debugger like MacsBug or TMON. By getting into the debugger while in your application, you can disassemble the code you’re interested in and save it to a file. In MacsBug, you’d use the “ip” command to disassemble around the program counter, then use the “log” command to save the screen to a file.

What the computer really looks like.

When you read assembler, you see instructions that refer to registers and memory locations and that have an unusual syntax. The programming environment for assembler is the bare machine, so it has some constraints. To learn to read assembler, you need to know something about that environment. Actually, this is the hard part; reading the assembler instructions is easy.

First are the registers. The CPUs (a computer’s central processing unit) that are used these days have a number of registers to hold data or addresses currently being used by the program. The Motorola chips used in the Macintosh (the 680x0 family) have eight data registers, eight address registers, a program counter (also called a PC), and a condition code or status register. The 68020 and later chips have some other specialized registers used by the Mac’s operating system for handling interrupts, mapping memory, and managing the CPU’s cache. However, these are only used in the operating system and are not interesting to the application programmer.

Data registers are used more often by instructions that manipulate data like the logical and arithmetic instructions. Address registers are used to address locations in the computer’s memory. They’re often used to index data and can be used in “move” instructions to help calculate the memory location of data. Address register seven (A7) is used by the CPU as a stack pointer. Some instructions can address data on the stack and automatically push or pop the stack. The Mac operating system has a convention to use address register five (A5) as the pointer to the top of an application’s global data and to use A6 (address register six) as the stack frame pointer. I’ll cover more of the stack frame and global data space later.

The program counter (PC) is a special register that holds the address of the next instruction to execute. The status register (SR; its low byte is the CCR, or Condition Code Register) holds flags describing the result of the last data operation: zero, negative, overflow, carry, and so on. These flags are tested by the conditional branch instructions that implement “if” statements, loops, and multi-way ifs like the C “switch” statement.

The memory of the Mac, to an assembler language programmer, just looks like a big array. Some of this array holds the program, some holds the system, some holds the application’s data, and some is used by other applications. This explains why one application’s bugs can cause problems for other applications. The first application can overwrite the contents of memory anywhere, so data or code belonging to another application can be damaged too. As a result, keeping track of pointers and handles is quite important.

But for an application, the Mac’s memory is organized into application memory areas which hold the heap, stack, and global data. Each application is expected to stay in its own area. Low memory belongs to the interrupt table and system globals. Above that is the system heap, followed by the MultiFinder area. In high memory are the address locations of the cards and I/O devices that the Mac is equipped with. The Mac operating system divides the MultiFinder area into application memory areas, or partitions, one for each application in memory at the time. The size of an application’s partition is taken from its ‘SIZE’ resource when the application is launched. If there is no ‘SIZE’ resource, a default partition size of 512K bytes is used. The user can change the partition size in the Finder’s “Get Info” box, which creates a new ‘SIZE’ resource. (See Inside Mac VI page 5-14 for more information on the ‘SIZE’ resource.)

This application memory partition is, in turn, subdivided into the application’s heap, stack, and global area. Your application’s code, opened resources, and handle and pointer blocks are all in the heap, which occupies the bottom part of the partition and may grow upward. The jump table, global variables, and the QuickDraw application globals are all stored in the global area at the top of the partition. Register A5 (by convention) points into this area, at the top of the application’s globals. When a routine references global data, it does so with a negative offset from register A5. Because this addressing mode encodes the offset as a signed 16-bit value, the maximum size of the application globals is 32K. Although some compilers have ways around this limit, it’s best to stay under it: larger global areas are more difficult to access and make your program less efficient. Parameters for routines and local variables are stored on the stack, which grows downward in memory and is located just beneath the QuickDraw application globals.
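The 32K figure follows from the arithmetic of a signed 16-bit displacement. A minimal sketch in C, assuming a hypothetical helper a5_relative() that models how the CPU forms the address (the A5 value used is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "disp(A5)": the CPU sign-extends the 16-bit
   displacement to 32 bits and adds it to the value in A5. */
static uint32_t a5_relative(uint32_t a5, int16_t disp)
{
    return a5 + (uint32_t)(int32_t)disp;
}

static void a5_example(void)
{
    uint32_t a5 = 0x00100000u;   /* made-up value for A5 */

    /* A global stored two bytes below A5. */
    assert(a5_relative(a5, -2) == 0x000FFFFEu);

    /* The most negative displacement reaches 32768 bytes below A5,
       which is where the 32K limit on globals comes from. */
    assert(a5_relative(a5, INT16_MIN) == a5 - 32768u);
}
```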

The jump table is fixed in memory for the life of the application, so it is used to get around the 32K limit on ‘CODE’ resources and to allow them to be moved in memory. When a routine calls a routine that isn’t in the same code resource, the compiler and linker make a jump table entry. This is a jump instruction to the other code resource. So, the calling routine does a JSR (Jump to Subroutine) to the address in the jump table, and the jump table then jumps to the location in the new code resource. When a code resource is moved in memory, the jump table is corrected. This also allows the Segment Loader (part of the Mac Toolbox) to load code resources.
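The indirection a jump table provides can be sketched in C with a table of function pointers. This is only an analogy: a real jump table entry holds a jump instruction rather than a pointer, and all the names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Routine)(int);

/* Stands in for a routine that lives in another code resource. */
static int double_it(int x) { return 2 * x; }

/* Stands in for the same routine after its code resource has moved. */
static int double_it_moved(int x) { return x + x; }

/* The table stays put; only its entries change. */
static Routine jump_table[1] = { double_it };

/* Callers always go through the fixed table entry. */
static int call_via_table(size_t entry, int arg)
{
    return jump_table[entry](arg);
}

/* When the code moves, the table entry is corrected, not the callers. */
static void relocate(size_t entry, Routine new_address)
{
    jump_table[entry] = new_address;
}
```

The caller never needs to know where the routine currently lives, which is what lets a code resource move without every call site being patched.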

The stack grows toward low memory. In other words, when something is pushed onto the stack, the stack pointer is decremented; when it’s popped, the stack pointer is incremented. The stack is used to pass parameters to subroutines. The parameters are pushed onto the stack, then the routine is called. The routine then executes the LINK instruction. This instruction pushes the contents of register A6 on the stack, copies register A7 (the stack pointer) into register A6, then decrements the stack pointer by the cumulative size of the routine’s local variables. This makes A6 the frame pointer. Each routine called has a frame of state information preserved on the stack. This is what enables debuggers to retrace the stack (called a “stack crawl”) to find the list of current routines. So, when you see your code accessing data via a positive offset from A6, the code’s accessing its parameters. A negative offset is used for its local variables.
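The frame layout is easier to see when simulated. The following C sketch models a tiny big-endian memory with A6 and A7 as plain integers; the memory size, the return address value, and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_BYTES = 256 };
static uint8_t  mem[MEM_BYTES];
static uint32_t a6, a7 = MEM_BYTES;   /* empty stack starts at the top */

static void write_long(uint32_t addr, uint32_t v)
{
    mem[addr]     = (uint8_t)(v >> 24);
    mem[addr + 1] = (uint8_t)(v >> 16);
    mem[addr + 2] = (uint8_t)(v >> 8);
    mem[addr + 3] = (uint8_t)v;
}

static uint32_t read_long(uint32_t addr)
{
    return ((uint32_t)mem[addr] << 24) | ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] << 8) | mem[addr + 3];
}

static void push_long(uint32_t v) { a7 -= 4; write_long(a7, v); }
static uint32_t pop_long(void) { uint32_t v = read_long(a7); a7 += 4; return v; }

/* LINK A6,#disp: save old A6, make A6 the frame pointer, reserve locals. */
static void link_a6(int16_t disp)
{
    push_long(a6);
    a6 = a7;
    a7 += (uint32_t)(int32_t)disp;    /* disp is negative */
}

/* UNLK A6: discard locals, restore the caller's A6. */
static void unlk_a6(void) { a7 = a6; a6 = pop_long(); }

/* Walk through one call and return what the callee sees at 8(A6). */
static uint32_t frame_demo(uint32_t param)
{
    uint32_t seen;
    a6 = 0; a7 = MEM_BYTES;
    push_long(param);          /* caller pushes the parameter */
    push_long(0x00001000u);    /* JSR pushes a (made-up) return address */
    link_a6(-2);               /* room for one 16-bit local */
    seen = read_long(a6 + 8);  /* skip saved A6 and return address */
    unlk_a6();
    (void)pop_long();          /* RTS pops the return address */
    return seen;
}
```

The first parameter sits at offset 8 from A6 because the saved A6 and the return address, four bytes each, lie between the frame pointer and the parameters.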

Parameters are passed in different orders depending on the language being used. C passes the parameters from right to left: the rightmost parameters in the C call are pushed on the stack first. This leaves the leftmost parameters topmost on the stack, so a routine can use them to determine how many parameters follow. The C stdio library routines printf() and scanf() use this technique. But, to keep life interesting, Pascal passes parameters in the opposite order. The leftmost parameters are pushed on the stack first. In addition, the return value for a Pascal routine is passed on the stack. The calling routine reserves a location for the callee’s return value before pushing its parameters. A return value for a C routine is passed in register D0. By seeing this, you can tell from the assembler code what language a routine was written in. The Mac Toolbox follows Pascal calling conventions, so stack-based Toolbox traps have their parameters pushed from left to right. Also, both Think C and MPW C let you declare a C routine with the type modifier “pascal”. The routine then expects its parameters in Pascal order on the stack.
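A small C sketch of the two push orders, using hypothetical helpers that simply record which argument is pushed when. Since the last value pushed ends up on top of the stack, the C convention leaves the first argument on top.

```c
#include <assert.h>
#include <stddef.h>

/* Record the order in which n arguments are pushed.
   pushed[0] is pushed first, so it ends up deepest in the stack. */
static void c_push_order(const int *args, size_t n, int *pushed)
{
    for (size_t i = 0; i < n; i++)
        pushed[i] = args[n - 1 - i];   /* rightmost argument first */
}

static void pascal_push_order(const int *args, size_t n, int *pushed)
{
    for (size_t i = 0; i < n; i++)
        pushed[i] = args[i];           /* leftmost argument first */
}

static void push_order_example(void)
{
    const int args[3] = { 1, 2, 3 };   /* models the call f(1, 2, 3) */
    int pushed[3];

    c_push_order(args, 3, pushed);
    assert(pushed[0] == 3 && pushed[2] == 1);  /* 1 pushed last: top of stack */

    pascal_push_order(args, 3, pushed);
    assert(pushed[0] == 1 && pushed[2] == 3);  /* 3 pushed last: top of stack */
}
```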

The instructions

The most common instructions you’ll see in assembler code are move instructions that copy data from one place to another. It often seems that most of what any program does is move data around. Three move instructions follow:

;1
            
 move.b #4, d4
 move.w (a4), d2
 move.l 16(a3, d2.w), -(a7)

Notice that the name of the instruction is on the left and its two operands are on the right. The first operand is the source of the data to move, the second is the destination. The first instruction moves the immediate data “4” into the lowest byte of data register four (d4). Immediate data is coded into the instruction; I’ll talk more about that in the addressing mode section. We know that this is a byte move from the “.b” that follows the instruction name. Instructions come in three sizes: “b” for byte, “w” for word or short integer (two bytes), and “l” for long integer (four bytes). (The “.s” you’ll see on branch instructions marks a short branch displacement, not a data size.) In byte and word operations on a data register, the lower byte or word is always used. The second instruction copies the word pointed to by address register four to data register two. The last instruction also uses an address register as a pointer, but does some address arithmetic before using it. It adds 16 and the word value in D2 to the value of A3 to get the address to copy the data from. Then it subtracts 4 from A7 and puts the data into the memory location now pointed to by A7. This latter part of the move is a stack push. It subtracts 4 from A7 since the size of the move is “long”, or four bytes.
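To make the three moves concrete, here is a rough C model of their effect on a toy machine. The Machine struct, the 64-byte memory, and the starting register values are all assumptions chosen for the example; memory is big-endian, as on the 680x0.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t d2, d4, a3, a4, a7;
    uint8_t  ram[64];
} Machine;

static uint16_t load16(const Machine *m, uint32_t addr)
{
    return (uint16_t)((m->ram[addr] << 8) | m->ram[addr + 1]);
}

static uint32_t load32(const Machine *m, uint32_t addr)
{
    return ((uint32_t)load16(m, addr) << 16) | load16(m, addr + 2);
}

static void store32(Machine *m, uint32_t addr, uint32_t v)
{
    m->ram[addr]     = (uint8_t)(v >> 24);
    m->ram[addr + 1] = (uint8_t)(v >> 16);
    m->ram[addr + 2] = (uint8_t)(v >> 8);
    m->ram[addr + 3] = (uint8_t)v;
}

/* Sign-extend the low word of x to 32 bits, as the ".w" index does. */
static uint32_t sign_extend16(uint32_t x)
{
    return ((x & 0xFFFFu) ^ 0x8000u) - 0x8000u;
}

static void run_three_moves(Machine *m)
{
    /* move.b #4,d4: only the low byte of d4 changes */
    m->d4 = (m->d4 & 0xFFFFFF00u) | 4u;

    /* move.w (a4),d2: only the low word of d2 changes */
    m->d2 = (m->d2 & 0xFFFF0000u) | load16(m, m->a4);

    /* move.l 16(a3,d2.w),-(a7): compute the source address, then push */
    uint32_t ea = m->a3 + 16u + sign_extend16(m->d2);
    uint32_t value = load32(m, ea);
    m->a7 -= 4u;
    store32(m, m->a7, value);
}

static void three_moves_example(void)
{
    Machine m = { 0xAAAA0000u, 0x11223344u, 8u, 0u, 64u, { 0 } };
    m.ram[1] = 2;                                  /* word at (a4) is 2 */
    m.ram[26] = 0xDE; m.ram[27] = 0xAD;            /* long at 8 + 16 + 2 */
    m.ram[28] = 0xBE; m.ram[29] = 0xEF;

    run_three_moves(&m);

    assert(m.d4 == 0x11223304u);
    assert(m.d2 == 0xAAAA0002u);
    assert(m.a7 == 60u);
    assert(load32(&m, 60u) == 0xDEADBEEFu);
}
```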

Although move instructions are the most common, following closely in use are the ALU instructions, so named for the part of the CPU which processes them: the Arithmetic-Logic Unit. These instructions add, subtract, multiply, divide, shift, rotate, and, or, xor, and negate data. There isn’t just one instruction for each; instead, there may be several instructions for each operation. For example, the add operation is done by the ADD, ADDA, ADDI, ADDQ, and ADDX instructions. The different instruction codes for the same operation handle different addressing modes or registers. ADD needs a data register as either its source or its destination, while ADDA adds to an address register. ADDI and ADDQ work with immediate data; ADDQ uses a short form of immediate data that can range from one to eight. ADDX adds the source to the destination, and also adds in the extend bit. This last is used for multiple-word arithmetic operations. This sort of naming is common throughout the ALU instructions.
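The multiple-word arithmetic that ADDX supports can be sketched in C as a 64-bit addition built from 32-bit halves. The Wide type and wide_add() are hypothetical names; the variable x plays the role of the extend bit.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t hi, lo; } Wide;   /* a 64-bit value in two longs */

static Wide wide_add(Wide a, Wide b)
{
    Wide r;
    uint32_t x;

    r.lo = a.lo + b.lo;        /* ADD.L: low halves; a carry out sets X */
    x = (r.lo < a.lo);         /* the sum wrapped around, so X = 1 */
    r.hi = a.hi + b.hi + x;    /* ADDX.L: high halves plus the extend bit */
    return r;
}

static void wide_add_example(void)
{
    Wide a = { 0u, 0xFFFFFFFFu };
    Wide b = { 0u, 1u };
    Wide r = wide_add(a, b);

    assert(r.hi == 1u && r.lo == 0u);   /* the carry rippled into the high long */
}
```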

The ALU operations are generally done by taking the source data and destination data, operating on it, and leaving the result in the destination data’s location. Naturally, this constrains some of the addressing modes. Immediate data isn’t going to be possible for the destination data.

Both the ALU operations and the move operations affect the condition code. This is a set of flags held in the condition code register (CCR) that indicates whether the data is zero, negative, has overflowed, carried, or set the extend bit. Different instructions affect the CCR in different ways, and some don’t affect it at all. For example, ADDA doesn’t change the CCR, since it’s generally used in calculating addresses, not data; all the other add instructions affect the CCR. If you get into reading assembler often, you’ll want one of Motorola’s books on the 68000 family of CPUs as a reference. It’ll be able to tell you, among other things, which instructions affect the CCR and in what ways.

The CCR is used by a set of instructions called conditionals. These instructions make decisions based on the flags in the CCR by directing the program to execute one block of instructions or another, or to skip a block of instructions altogether. The conditional branch instructions are Bcc and DBcc. The “cc” is replaced with a conditional test code. This directs the CPU to test some combination of the flags in the CCR. If the condition is met, the branch is taken; otherwise, execution continues with the next instruction. DBcc is a special loop instruction. It tests the condition and, if it’s true, falls through to the following instruction. Otherwise, it decrements the low word of the data register it’s given as an operand. If the register is now -1, it falls through to the following instruction. Otherwise, it branches to its target address. Another conditional instruction, Scc, doesn’t branch at all; instead it sets or clears the addressed byte. Scc sets its destination to zero if the condition is false, or to all ones if it’s true.
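The DBcc rules are compact enough to model directly. In this C sketch, dbcc() returns whether the branch back to the loop top is taken; the function names are hypothetical, and the condition is passed in as a plain truth value rather than derived from CCR flags.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if DBcc branches to its target, 0 if it falls through. */
static int dbcc(int condition, uint16_t *counter)
{
    if (condition)
        return 0;                          /* condition true: leave the loop */
    *counter = (uint16_t)(*counter - 1u);  /* decrement the low word */
    return *counter != 0xFFFFu;            /* -1 means the count ran out */
}

/* A loop closed by DBcc with an always-false condition (DBF, often written
   DBRA) runs its body (start + 1) times. */
static unsigned count_iterations(uint16_t start)
{
    uint16_t dn = start;
    unsigned n = 0;

    do {
        n++;                               /* the loop body */
    } while (dbcc(0, &dn));
    return n;
}

/* Scc: all ones if the condition holds, zero otherwise. */
static uint8_t scc(int condition)
{
    return condition ? 0xFFu : 0x00u;
}
```

This is why compiled loops often load a count of n - 1 into the data register before entering a DBcc loop that should run n times.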

The CCR can also be set with the CMP family of instructions. CMP does the same thing as a subtract instruction, but the result of the subtraction isn’t saved; only the CCR flags are set. This is a way of comparing two pieces of data so that the CCR reflects that comparison. The TST instruction also sets the CCR flags. However, since it has just one operand, it simply clears the overflow and carry bits and sets the negative and zero bits according to the data.
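As an illustration of how a comparison feeds a branch, the following C sketch computes the four flags that a word-sized CMP or TST would produce and evaluates two common branch conditions from them. The Flags struct and function names are assumptions for the example; the extend bit is left out because CMP and TST don’t change it.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int n, z, v, c; } Flags;

/* CMP.W src,dst: compute dst - src, keep only the flags. */
static Flags cmp_w(uint16_t src, uint16_t dst)
{
    uint16_t r = (uint16_t)(dst - src);
    Flags f;

    f.n = (r & 0x8000u) != 0;                           /* result negative */
    f.z = (r == 0);                                     /* operands equal */
    f.v = (((dst ^ src) & (dst ^ r)) & 0x8000u) != 0;   /* signed overflow */
    f.c = src > dst;                                    /* unsigned borrow */
    return f;
}

/* TST.W data: N and Z follow the data, V and C are cleared. */
static Flags tst_w(uint16_t data)
{
    Flags f = { (data & 0x8000u) != 0, data == 0, 0, 0 };
    return f;
}

static int beq(Flags f) { return f.z; }         /* branch if equal */
static int blt(Flags f) { return f.n != f.v; }  /* branch if dst < src, signed */
```

Note that BLT tests N and V together, so the signed comparison still comes out right when the subtraction overflows.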

There are also jump and jump-to-subroutine instructions. JMP just unconditionally jumps to a new location in the program. The JSR or jump-to-subroutine instruction pushes the current PC (the pointer to the next instruction) on the stack, then, just like the jump instruction, puts the location of the jump’s target in the PC and continues execution. The JSR is one part of a subroutine call. Parameters are pushed on the stack before the JSR is executed, then, at the end of the subroutine, the RTS instruction is executed. It pops the topmost longword off the stack and puts it into the PC. This makes the CPU continue at the instruction just after the JSR. The other elements in the subroutine call are the LINK and UNLK instructions. LINK pushes the value of its first operand (an address register) onto the stack, copies the stack pointer (address register seven) into that register, then adds the displacement (its second operand) to the stack pointer. Since the displacement is normally negative, this has the effect of saving the old routine’s stack frame and allocating a new stack frame for the subroutine. UNLK reverses the process: it copies the address register back into the stack pointer, discarding the locals, then pops the saved value back into the address register. A stack frame is the instantiation of a routine’s data, that is, the local data declared for a routine.

Some other useful instructions are designed to shortcut some common operations. LEA and PEA calculate the address location for the data through the given addressing mode and leave that as their result. LEA leaves it in the named address register and PEA pushes it onto the stack. This is a way of using a complex addressing mode and calculating the resulting address location on the fly.

EXG, SWAP, and EXT can also be useful operations. EXG exchanges the contents of two registers. SWAP exchanges the high and low words of a data register. EXT is used for typecasting bytes to words or words to longwords. It’ll take the high bit of the byte, for example, and copy it to all the bits in the upper byte of that word. This has the effect of sign-extending the byte to a word. The word-to-longword conversion operates similarly.
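SWAP and EXT are easy to express as bit operations on a 32-bit value. A minimal C sketch, with hypothetical function names, treating the argument as the full contents of a data register:

```c
#include <assert.h>
#include <stdint.h>

/* SWAP Dn: exchange the high and low words. */
static uint32_t swap_words(uint32_t d)
{
    return (d << 16) | (d >> 16);
}

/* EXT.W Dn: sign-extend the low byte into the low word;
   the upper word of the register is left alone. */
static uint32_t ext_w(uint32_t d)
{
    uint32_t word = (((d & 0xFFu) ^ 0x80u) - 0x80u) & 0xFFFFu;
    return (d & 0xFFFF0000u) | word;
}

/* EXT.L Dn: sign-extend the low word into the whole register. */
static uint32_t ext_l(uint32_t d)
{
    return ((d & 0xFFFFu) ^ 0x8000u) - 0x8000u;
}
```

Compilers typically emit EXT.L before a signed divide, since DIVS expects a 32-bit dividend.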

How to find data: Addressing modes

When the CPU saves data from a register to memory or loads a register from memory, it uses an address to determine that memory location. But there are a number of ways that this address is determined. The address can be part of the instruction, in which case it’s called absolute. But this is only useful for the low-memory globals on the Macintosh, as memory in the heap is moved around and the application may be loaded in a different area of memory next time it’s run.

More often, the data’s address is calculated. A value from an address register is used, possibly in combination with a constant offset coded into the instruction. Or, the offset can come from a data register. In some cases, as with branch instructions, the address is offset from the PC register. This is called PC-relative addressing.

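An address can also be generated indirectly. An indirect address is a two-step process for the CPU. If we were using address register indirect addressing with register A3, for example, the CPU would read the address held in A3, then go to that memory location for the data. (The 68020 and later chips add memory indirect modes that go one step further: they fetch a pointer from the location A3 addresses, then use that pointer as the data’s address.) Offsets can also be applied to indirect addresses.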

Addressing modes can be quite complex, but most compilers and programmers use a fairly small number. Although the Motorola chip’s designers tried to implement operations that would be useful for compiler writers, the compiler writers found that it was usually easier to use the simpler modes. Doing otherwise required compilers to be larger and slower.

The two simplest addressing modes are “immediate” and “register direct”. Immediate mode is where the operand for the instruction is included in the instruction. The assembler syntax is to specify the operand with “#data” where data is the number. Often small or hardcoded integers are handled like this. Register direct is where the operand is already in a register. In that case just specify the register number: “D0” is data register zero and “A4” is address register four. These modes are the simplest.

The next set of addressing modes are called address register indirect. They’re used when the data (operands) are in memory and an address register already has the data’s memory address. There are several variations of this mode. Address register indirect, the simplest, uses the specified address register as a pointer to the data. The syntax is “(A3)”; the parentheses indicate the indirection. The CPU also does address register indirect with predecrement or with postincrement. The address register is again used as a pointer, but with predecrement, written “-(A3)”, it’s decremented before its use. With postincrement, written “(A3)+”, it’s incremented after use. The amount of the increment or decrement is the size of the instruction: one for byte, two for word, and four for long. These modes are useful in scanning arrays or working with stacks.
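C programmers already know these two modes under other names: postincrement is the *p++ idiom and predecrement is *--p. A short sketch, with hypothetical names, showing an array scan and a small stack (a 32-bit accumulator is used here to sidestep 16-bit overflow):

```c
#include <assert.h>
#include <stdint.h>

/* add.w (a0)+,d0 in a loop: read a word, then step the pointer by 2 bytes. */
static int32_t sum_words(const int16_t *a0, unsigned count)
{
    int32_t d0 = 0;

    while (count-- > 0)
        d0 += *a0++;
    return d0;
}

static void autoincrement_example(void)
{
    const int16_t table[3] = { 10, 20, 30 };
    int32_t stack[4];
    int32_t *sp = stack + 4;   /* empty stack: just past the highest slot */
    int32_t first, second;

    assert(sum_words(table, 3) == 60);

    *--sp = 7;                 /* move.l #7,-(a7): step down 4 bytes, then store */
    *--sp = 9;                 /* move.l #9,-(a7) */
    first  = *sp++;            /* move.l (a7)+,d0: load, then step up 4 bytes */
    second = *sp++;            /* move.l (a7)+,d1 */

    assert(first == 9 && second == 7);
    assert(sp == stack + 4);   /* pushes and pops balance */
}
```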

When dealing with arrays or structures, the next two address modes are useful. Address register indirect with displacement adds a 16 bit (word or short integer) displacement to the pointer in the specified address register to get the data’s address. Address register indirect with index also lets you specify a second register (either data or address) holding an index value to add to the pointer. This mode has a displacement too, but this time it’s only eight bits long. The index register can be treated as a word or a long by appending a size after the index register specification.
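These two modes correspond closely to structure field access and array indexing in C. The sketch below assumes a hypothetical Rect16 struct of four 16-bit fields and two helpers that compute an effective address the way the CPU would; the displacement is a field offset and the index is the byte offset of an array element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Four 16-bit fields, so no padding is expected between them. */
typedef struct { int16_t top, left, bottom, right; } Rect16;

/* d16(An): base pointer plus a constant displacement. */
static const void *ea_disp(const void *an, int16_t d16)
{
    return (const char *)an + d16;
}

/* d8(An,Xn): base pointer plus an index register plus a small displacement. */
static const void *ea_index(const void *an, int32_t xn, int8_t d8)
{
    return (const char *)an + xn + d8;
}

static void addressing_example(void)
{
    Rect16 rects[3] = { {0, 0, 10, 20}, {1, 2, 30, 40}, {5, 6, 70, 80} };
    int16_t disp  = (int16_t)offsetof(Rect16, bottom);   /* field offset */
    int32_t index = (int32_t)(2 * sizeof(Rect16));       /* element 2 */
    const int16_t *p;

    /* rects[0].bottom is reached with a displacement alone. */
    assert(ea_disp(&rects[0], disp) == (const void *)&rects[0].bottom);

    /* rects[2].bottom needs the index register as well. */
    p = (const int16_t *)ea_index(rects, index, (int8_t)disp);
    assert(p == &rects[2].bottom);
    assert(*p == 70);
}
```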

You have the tools now to read assembler. If you’d like to write it, you’ll need more work. But, like anything, practice makes perfect. Often just knowing what’s really happening is a help. Just remember that the lower the level you are in the computer, the simpler things often are; it’s just that they’re in unfamiliar terms.

;2

/*----------------------------------
 Center the window or dialog on the current screen.  First,
 find the height and width of both the window passed in and
 the screen.  Then take half of the total margin for the
 width and height to find the top left point of the window.
 Move the window.  */
          
void center (WindowPtr w)
{
 int    wHeight, wWidth,  /* Window height and width. */
 sHeight, sWidth,/* Screen height and width. */
 top, left; /* The new top-left of the window. */
; Make the stack frame
00000000        LINK      A6,#$FFFE
; Save the registers we’ll use.    
00000004        MOVEM.L   D3-D7/A4,-(A7)     
; Get the parameter in A4
00000008        MOVEA.L   $0008(A6),A4 

 if (w == 0L)  return;    /* If null window, ignore it. */
; Do a move, this sets the CCR
0000000C        MOVE.L    A4,D0    
; PC-relative jump if zero to 52 
0000000E        BEQ.S     *+$0044            

 /*Find the heights and widths. */
 wHeight = w->portRect.bottom - w->portRect.top;
; Get portRect.bottom
00000010        MOVE.W    $0014(A4),D7 
; Subtract portRect.top
00000014        SUB.W     $0010(A4),D7 

 wWidth = w->portRect.right - w->portRect.left;
; Get portRect.right
00000018        MOVE.W    $0016(A4),D6 
; Subtract portRect.left  
0000001C        SUB.W     $0012(A4),D6 

 sHeight = screenBits.bounds.bottom - screenBits.bounds.top;
; Get screenBits bottom
00000020        MOVE.W    $000A(A5),D5 
; Subtract screenBits top
00000024        SUB.W     $0006(A5),D5 

 sWidth = screenBits.bounds.right - screenBits.bounds.left;
; Get screenBits right
00000028        MOVE.W    $000C(A5),D4 
; Subtract screenBits left
0000002C        SUB.W     $0008(A5),D4 

 /*Now calculate top-left point of the centered window. */
 top = (sHeight - wHeight) / 2;
; Move screen height to D3
00000030        MOVE.W    D5,D3    
; Subtract window height
00000032        SUB.W     D7,D3    
; Prepare for division
00000034        EXT.L     D3
; Divide height result by two
00000036        DIVS.W    #$0002,D3

 left = (sWidth - wWidth) / 2;
; Move screen width to D0
0000003A        MOVE.W    D4,D0    
; Subtract window width
0000003C        SUB.W     D6,D0    
; Prepare for division
0000003E        EXT.L     D0
; Divide width result by two
00000040        DIVS.W    #$0002,D0
; Save left result in local variable
00000044        MOVE.W    D0,$FFFE(A6) 

 MoveWindow (w, left, top, FALSE); /* And center it. */
; Push the windowPtr on stack
00000048        MOVE.L    A4,-(A7) 
; Push the left (horizontal) result on stack
0000004A        MOVE.W    D0,-(A7) 
; Push the top (vertical) result
0000004C        MOVE.W    D3,-(A7) 
; Push FALSE, the Boolean “front” parameter
0000004E        CLR.B     -(A7)    
; Trap call 
00000050        _MoveWindow 

}
; Restore registers
00000052        MOVEM.L   (A7)+,D3-D7/A4     
; Clear stack frame
00000056        UNLK      A6
; And return from subroutine
00000058        RTS
