Natural Object Rotation
Volume Number: 15 (1999)
Issue Number: 3
Column Tag: Programming Techniques
Natural Object Rotation without Quaternions
by Kas Thomas
The usual way of doing free-rotations isn't necessarily the best way.
Introduction
User interface design in 3D applications remains a difficult area. The problem of how best to allow user access to features like object, face, and vertex selection; translation, scaling, and rotation; and selection and movement of lights and camera, is still largely an open question. Most interfaces are highly modal and rely on the use of unfamiliar widgets, glyphs, or icons. Most end-users, alas, remain confused.
One of the most fundamental operations in 3D design is object rotation. This may, in fact, be the single most important mode of object manipulation. Users expect to be able to rotate objects at will. (This is why, for example, the Apple 3D Viewer supports free rotation as the default mouse-drag behavior.) Most interfaces, the 3D Viewer's included, support simultaneous rotation on x- and y-axes, keyed to mouse horizontal and vertical movements. That is to say, side-to-side displacement of the mouse results in an apparent rotation of the 3D object about the yaw (y) axis, while up-and-down (or fore-and-aft) dragging of the mouse causes rotation on the pitch (x) axis. Since the mouse has only two degrees of freedom, z-axis rotations are ignored. The code for achieving the rotation usually looks something like this:
delta_x = mousePtNew.h - mousePtOld.h;
delta_y = mousePtNew.v - mousePtOld.v;
// now rotate about the x-axis by the amount of vertical motion (delta_y)
// and about the y-axis by the amount of lateral motion (delta_x):
RotateObject( delta_y, delta_x, 0.0 );
The RotateObject function typically stuffs the (appropriately scaled) argument values into a 4X4 rotation matrix and performs a matrix multiplication with the object's global "rotation matrix" to calculate the new orientation of the object.
The problem with this approach is that object motion is not constrained to just the x- and y-axes. Cross-coupling of axis rotations inevitably causes some motion of the object about the third (z) axis. This well-known effect (which is responsible for "gimbal lock") arises because the three degrees of rotational freedom that would seem to be available in a Cartesian coordinate system are not truly independent. Translations in Cartesian 3-space are independent, because they combine (via vector addition) in commutative fashion. Rotations on Cartesian axes are not independent, because their combination (via matrix multiplication) is not commutative.
The reason this is a problem for user-interface design is that mixed x/y-axis object rotation (using the kind of code shown above) gives confusing feedback to the user. Some z-axis crossover always occurs. The unexpected z-axis rotation that occurs is counterintuitive in and of itself; but there is also the very serious problem of pointing-device hysteresis: This refers to a situation where rotational positioning of objects depends not only on the final position of the mouse or trackball, but the path it took getting there. In other words, when you drag the mouse from Point A to Point B, the final orientation of the object can be different each time depending on the exact path the cursor took. Dragging the cursor back to Point A may or may not restore the object to its original position. (Usually it does not.)
This behavior goes against Apple's own Quickdraw™ 3D User Interface Guidelines (as published in the original SDK for QD3D), where it is stated: "To support interaction, the application must complement every user action with appropriate feedback."
It's hard to conceive of how a pointing device with serious built-in hysteresis and a propensity to generate uncommanded third-axis rotations can be said to provide "appropriate feedback."
The Way Out
Because of programmers' traditional slavish adherence to Euler-angle rotation methods, there would seem, at first, to be no way out of the "uncommanded z-rotation" dilemma. But in fact, as first pointed out by Ken Shoemake (at SIGGRAPH '85), quaternion algebra provides a solution to this very problem.
Quaternions were conceived (as an extension of complex numbers to higher dimensions) in 1843 by the mathematician William Hamilton. They bear a strong resemblance to complex numbers - that is, numbers of the form a + bi, where a is a real number and bi is the so-called imaginary term (i being equivalent to the square root of -1). A quaternion is a four-component number (hence the name) of the form:
q = a + bi + cj + dk
where i² = j² = k² = -1. Just as complex numbers can be written as ordered pairs (a, b), quaternions can be considered as a grouping of a scalar quantity with a vector quantity, q = (s, v), where s is the scalar component and v is a 3D vector in ijk-space. In this view of things, an ordinary 3D vector can be considered a kind of degenerate quaternion; that is, a quaternion with a zero-valued scalar component. Such a quaternion is known as a "pure quaternion." Also, just as ordinary vectors can be normalized (a process in which the components of the vector are divided by the vector's magnitude) to give unit vectors, quaternions can be normalized to give "unit quaternions." Space doesn't permit a full discussion of quaternion algebra here, but the resemblances to vector algebra and complex-number arithmetic are extensive. (See Chapter 15 of Advanced Animation and Rendering Techniques by Watt, or a good math text, for further details.)
Perhaps the most intuitive way to consider quaternions is to regard them as pairings of rotational angles with rotational axes. The scalar part of a quaternion can be seen as the amount of rotation, while the vector part represents the axis on which the rotation takes place. A pure-vector quaternion is thus an axis without a specified rotation.
The significance of quaternion algebra for graphics programmers lies in the fact that rotations can be calculated in a way that makes arc intervals combine like vectors laid head to tail, so that the net result depends only on the endpoints. This is because quaternions assume a parametrization of coordinate space based on the natural rotational axes of objects. In the quaternion view of things, two arbitrary points A and B on a unit sphere are separated by an angular interval that can be expressed in a single quaternion. A third point C can be expressed in terms of its arc-interval from A or B by the appropriate quaternion. Owing to the way quaternions parametrize rotation space, the quaternion expressing the rotation from A to C is just the product of the quaternions giving A-to-B and B-to-C. Contrast this with an Euler-angle representation of "rotation space" in which going from point A to point B requires fixed rotations on x, y, and z axes. There are generally no fewer than 12 different ways of combining Euler angles to achieve a desired rotation in x-y-z space.
If quaternion algebra has a downside, other than unfamiliarity, it's that most graphics systems are designed around matrix math. Even if rotations were to be represented wholly in terms of quaternions, scalings and translations are typically done with matrices, and for convenience it is usually the case that individual matrices are combined into one final "object manipulation" matrix that simultaneously performs translations, scaling, and rotation. So the graphics programmer who wants to use quaternions to control rotations usually is forced to convert between quaternion and matrix representations on a frequent basis. The necessary conversion code in turn adds to program size and computational overhead - two things most graphics programs don't need more of.
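For a sense of what that conversion code involves, here is a sketch of the standard unit-quaternion-to-matrix formula in plain C. It assumes a 3x3 matrix acting on column vectors, which may not match the layout a given graphics library expects; `Quat` and `quat_to_matrix` are hypothetical names.

```c
#include <assert.h>
#include <math.h>

typedef struct { double s, x, y, z; } Quat;

/* Fill m with the rotation matrix equivalent to the unit quaternion q
   (column-vector convention: v' = m * v). */
void quat_to_matrix(Quat q, double m[3][3])
{
    double n2 = q.s * q.s + q.x * q.x + q.y * q.y + q.z * q.z;
    assert(fabs(n2 - 1.0) < 1e-6);   /* q must be (nearly) unit length */

    double xx = q.x * q.x, yy = q.y * q.y, zz = q.z * q.z;
    double xy = q.x * q.y, xz = q.x * q.z, yz = q.y * q.z;
    double sx = q.s * q.x, sy = q.s * q.y, sz = q.s * q.z;

    m[0][0] = 1.0 - 2.0 * (yy + zz);
    m[0][1] = 2.0 * (xy - sz);
    m[0][2] = 2.0 * (xz + sy);

    m[1][0] = 2.0 * (xy + sz);
    m[1][1] = 1.0 - 2.0 * (xx + zz);
    m[1][2] = 2.0 * (yz - sx);

    m[2][0] = 2.0 * (xz - sy);
    m[2][1] = 2.0 * (yz + sx);
    m[2][2] = 1.0 - 2.0 * (xx + yy);
}
```

The cost is a dozen or so multiplications per conversion, small in isolation but paid every time a quaternion result has to be folded back into a matrix pipeline.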
The Pure-Vector Alternative
It's not strictly necessary to use quaternion math to achieve hysteresis-free mouse or trackball rotation of screen objects. Hysteresis-free rotations can be set up using ordinary vector math, if we apply the lessons learned from quaternion space parametrization. The main idea is to think in terms of arc intervals. If we have two arbitrary points A and B on the surface of a unit sphere, the most natural way to get from A to B is to rotate the sphere so that A follows the shortest path (or geodesic) from A to B. Thus the rotation occurs in the plane of the geodesic. If a = (xa, ya, za) and b = (xb, yb, zb) are the position vectors of the points, the axis of rotation is given by the cross product a × b. The angle of the rotation can be obtained from the dot product, as cos⁻¹(a · b).
How do we get from 2D mouse coordinates to 3D rotations? We construct a pseudo-3D coordinate space as follows. (This is essentially the method of Shoemake in Graphics Gems IV, p. 176.) Superimpose an imaginary sphere on our 3D object such that the center of the sphere is at the position vector c = (screen_x, screen_y, 0), where screen_x and screen_y are the local screen coordinates of the center of the object. We assume that any mouse-downs will happen on the surface of our imaginary sphere, which has a radius of r pixels. To get the 3D position vector m0 = (mx, my, mz) of the point on the sphere where a mouse-down occurs, we make the assumption that
mx = (float) (mouseDown_x - screen_x)/r;
my = (float) (mouseDown_y - screen_y)/r;
mz = sqrt( 1.0 - (mx * mx + my * my) );
Note that in scaling mx and my by the sphere radius, we end up with a normalized vector - a position vector on a unit sphere.
We now track the mouse and calculate the components of a "current mouse position" vector, cm0 = (cmx, cmy, cmz) by the same method:
cmx = (float) (currentMouse_x - screen_x)/r;
cmy = (float) (currentMouse_y - screen_y)/r;
cmz = sqrt( 1.0 - (cmx * cmx + cmy * cmy) );
Point A is now m0 and Point B is cm0 . The axis of rotation is obtained as the cross product of the two vectors. The angular interval is obtainable from the dot product of the two. (Recall that the dot product of any two normalized vectors is the cosine of the angle formed by them.)
Code Listing No. 1 shows the function Vectorize, which converts a mouse hit to a 3D position vector on the unit sphere. The code is extremely straightforward and needs little comment except to note that it is crucial to check for mouse out-of-range conditions. If a mouse hit occurs outside the radius of the imaginary sphere, you'll be trying to get the square root of a negative number and sqrt() will return NAN (not a number). Note that the radius value is entirely arbitrary. It's up to you to decide how wide a "sphere of influence" to give your user. For consistency, we passed the radius argument as a float, but it will generally be some integral number of pixels (e.g., the greater of the view pane width or height). Note also that we multiply y by minus-one, to correct for the fact that in screen space the y-axis has a positive downward direction and negative upward direction, just the opposite of ordinary 3D space.
Listing 1: Vectorize()
Vectorize()
// Create a 3D position vector on a unit sphere based on
// mouse hitPt, sphere's screen origin, and sphere radius in
// pixels. Return zero on success or non-zero if hit point
// was outside the sphere. This routine uses Quickdraw™ 3D
// and assumes that the relevant QD3D headers and library
// files have been included. Also need to #include <math.h>.
#define BOUNDS_ERR 1L
long Vectorize(
    Point *hit,
    Point *origin,
    float radius,
    TQ3Vector3D *vec )
{
    float x, y, z, modulus;

    x = (float)(hit->h - origin->h)/radius;
    y = (float)(hit->v - origin->v)/radius;
    y *= -1.0; // compensate for "inverted" screen y-axis!
    modulus = x*x + y*y; // squared distance from the sphere's center
    if (modulus > 1.) // outside radius!
        return BOUNDS_ERR;
    z = sqrt(1. - modulus); // compute fictitious 'z' value
    Q3Vector3D_Set( vec, x,y,z ); // store pseudo-3D mouse position
    return 0L;
}
Code Listing No. 2 shows how we can use Quickdraw™ 3D routines to perform a zero-hysteresis rotation of a 3D object with the aid of the vectors generated by Vectorize(). Note that in Listing 2, we assume that the 3D object we're going to rotate is contained in the fModel field of the global gDocument, an instance of a DocumentRec, which is a custom data structure. (This should look familiar to anyone who has read the sample code in the Quickdraw™ 3D SDK.) The gDocument struct also holds our object's rotation transform, which is kept in fRotation. The arguments v1 and v2 are just the position vectors of Point A and Point B (the starting and ending points of our geodesic). To save time, we start by checking the dot-product of these two vectors to see if it's equal to unity, in which case the angle between them is zero and we obviously needn't continue. Assuming the dot-product is not unity, we next calculate the cross product of the two input vectors. (Quickdraw™ 3D has an extensive math-utility library for performing vector calculations.) The cross product gives us our rotational axis. Note: We normalize the result of the cross product to avoid unintentionally rescaling our object during the ensuing matrix rotation. (Recall that the cross product is a vector whose magnitude is governed by the sine of the angle between the two input vectors. If we don't keep the magnitude equal to one, we might accidentally rescale our object!)
Once we have a rotational axis and an angular displacement value, we can make use of Quickdraw™ 3D's handy Q3Matrix4x4_SetRotateAboutAxis function, which will custom-make a 4x4 rotation matrix for us that achieves exactly the rotation we want. This function takes, as arguments, a pointer to a (destination) matrix structure, a pointer to an origin point for the rotation, a pointer to an orientation vector for the axis of rotation, and an angle in radians. All we have to do is stuff the appropriate values into these arguments, multiply the resulting "rotate on axis" matrix by our 3D object's stored rotation matrix, and draw the object.
Listing 2: ZeroHysteresisRotation()
ZeroHysteresisRotation()
// From two 3D vectors representing the positions of
// points on a unit sphere, calculate an axis of rotation
// and an amount of rotation such that Point A can be
// moved along a geodesic to Point B.
// CAUTION: Error-checking omitted for clarity.
void ZeroHysteresisRotation( TQ3Vector3D v1, TQ3Vector3D v2 )
{
    TQ3Vector3D cross;
    TQ3Matrix4x4 theMatrix;
    TQ3Point3D orig = { 0.,0.,0. };
    float dot,angle;

    dot = Q3Vector3D_Dot( &v1, &v2 );
    if (dot >= 1.0) // vectors coincide (>= also catches rounding past 1)
        return; // nothing to do
    Q3Vector3D_Cross( &v1, &v2, &cross ); // axis of rotation
    Q3Vector3D_Normalize( &cross,&cross );
    angle = 2.*acos( dot ); // angle of rotation
    // set up a rotation around our chosen axis...
    Q3Matrix4x4_SetRotateAboutAxis(&theMatrix,
        &orig, &cross, angle);
    Q3Matrix4x4_Multiply( &gDocument.fRotation,
        &theMatrix,
        &gDocument.fRotation); // multiply
    DocumentDraw( &gDocument ) ; // draw
}
Note that the rotation angle obtained from the call to acos() is multiplied by 2. This is not a sensitivity adjustment value! (If you want to give the user a sensitivity adjustment for his mouse rotations, scale the radius of the imaginary sphere on which our Point A/Point B geodesic is drawn. Don't adjust the rotation angle.) The multiply-by-2 is necessary in order to make the rotations behave according to the laws of quaternion algebra. If we change this value, our rotations will not commute and we will have reintroduced hysteresis or axis-crossmodulation, which we don't want.
To track mouse movements, we simply need to call a function like the one in Listing 3 from our event loop (assuming we're in free-rotate mode and not object-translation mode or some other mode). Our screen object will now rotate in accordance with geodesic-addition laws - and user expectations.
Listing 3: FreeRotateWithMouse()
FreeRotateWithMouse()
// Call this function from the main event loop to
// do free-rotations of 3D objects
// the mouse action radius in pixels:
#define RADIUS_VALUE 300.0
void FreeRotateWithMouse(void)
{
    Point now, oldPt, center;
    WindowPtr win = FrontWindow();
    float radius = RADIUS_VALUE;
    TQ3Vector3D v1,v2;
    long err;

    GetMouse( &oldPt );
    center.h = (win->portRect.right - win->portRect.left)/2;
    center.v = (win->portRect.bottom - win->portRect.top)/2;
    while (StillDown())
    {
        GetMouse( &now );
        err = Vectorize(&oldPt, &center, radius, &v1 );
        err += Vectorize(&now, &center, radius, &v2 );
        if (!err)
            ZeroHysteresisRotation( v1,v2 );
        oldPt = now;
    }
}
Does It Really Work?
The proof, as they say, is in the pudding. The best way to judge the desirability of hysteresis-free object rotations is to experience the phenomenon for yourself. A sample program using the above routines (and also the more conventional Euler-angle matrix rotations, for comparison purposes) is available at ftp://www.mactech.com. Our sample program, FreeRotate, has menu options that let you choose Euler_XY, Euler_XZ, or Euler_YZ rotation styles in addition to the Zero-Hysteresis method outlined above. If you try the four methods, you'll surely agree that the Euler-angle methods pale by comparison to the zero-hysteresis "geodesic arc" method. The Zero-Hysteresis method gives an all-around superior user experience, because:
- Objects turn in the direction of mouse travel, and in direct proportion to mouse travel, as expected.
- An object's final orientation depends on the final position of the mouse, but not on how the mouse got there.
- There is no permanent penalty for dragging the mouse "the wrong way" - returning it to the starting point always restores the object's original orientation. (The incremental angular error accumulation of mouse dragging is, for once, reversible.)
The Euler_XY method obeys the first test but fails the other two. (The other Euler methods fare even worse.)
A simple test: Load a 3DMF teapot in the SimpleText Viewer, click on the center of the model, and drag the mouse in three counterclockwise circles of ever greater radius, then return the mouse to where you originally clicked. If you do this in any current version of Apple's 3D Viewer, you will end up with the teapot spout-down - i.e., having experienced a net counterclockwise rotation on the z-axis; it is no longer in its original orientation. (The exact amount of z-rotation depends on how far away from center you moved the mouse.)
With zero-hysteresis rotation, you can make as many random circles with the mouse as you want; returning it to center always returns the teapot to its original orientation.
Conclusion
Engineering a solid, correct-feeling User Interface for 3D programs is difficult, in part because of the challenges inherent in implementing three-dimensional object manipulations with a two-degree-of-freedom pointing device (such as the mouse or trackball). But this hardly means we should settle for counterintuitive pointing-device feedback, especially if something better is available. In the case of Euler-angle free rotations, something better is available: namely, rotation along geodesics.
Because free rotations are central to the 3D user experience, it is crucial to implement this feature in the best, most linear, most intuitive and natural-feeling (to the user) manner possible. That means implementing zero-hysteresis rotations - rotations along geodesics.
Kas Thomas <tbo@earthlink.net> has been a Macintosh user since 1984 and has been programming in C on the Mac since 1989. He holds U.S. Patent No. 5,229,768 for a high-speed data compression algorithm and is the author of a QD3D-powered Photoshop® plug-in, Callisto 1.0, available at http://users.aol.com/Callisto3D.