Chapter 7

Three-Dimensional Graphics -


M. Firebaugh

© Wm. C. Brown Communications, Inc.


Visual forms - lines, colors, proportions, etc. - are just as capable of articulation, i.e. of complex combinations, as words. But the laws that govern this sort of articulation are altogether different from the laws of syntax that govern language. . . . They do not present their constituents successively, but simultaneously, so the relations determining a visual structure are grasped in one act of vision.

Suzanne Langer


As we move from two to three dimensions, let us consider several concepts valuable in visualizing the 3D environment. The first involves new geometric considerations and problems posed by the additional dimension. Second is the increased importance of object representation for efficient algorithmic processes in 3D. Finally, we will observe how smoothly the homogeneous coordinate representation allows us to extend 2D transformations to 3D.

Another concept introduced is visual realism, which is the theme running through the chapters on 3D. Visual realism has been the Holy Grail of classical computer graphics almost from its inception. As many of the figures in the book demonstrate, great progress has been made toward this goal. The trend in both graphical hardware and software is to provide the programmer with tools for production of realistic visual images.

The goal of visual realism itself remains irresistible, but difficult for all but relatively simple scenes. The science of ray tracing and radiosity and the mathematics of fractal geometry provide powerful tools for rendering realistic scenes. And yet, the trained observer can, in most cases, distinguish a photograph of a given scene from a computer graphics rendering of the same scene. As you scan the computer-generated graphics in the computer science literature, try to detect the subtle clues which distinguish natural from artificial scenes.

As a final observation, we note that, while visual realism remains an important goal of traditional computer graphics, it is definitely not the goal of all computer graphics. Many productions of computer animation (e.g., "flying logos") and computer art (e.g., distortions of color and space) explicitely exploit the computer-origin of the images rather than attempting to model any physical reality. Similarly, the whole area of image processing is intrinsically an artificial process-the manipulation of natural or artificial images to achieve the desired visual effect. So while visual realism is an important aspect of computer graphics, it is not the whole story.


Moving from 2D to 3D

The task of envisioning 3D worlds on 2D devices has been characterized by Tufte as "escaping flatland." Part of the fun in the N-dimensional approach to graphics is to speculate on what new effects will arise as we move from dimension N to dimension N+1 space. The move from 2D to 3D is particularly interesting. Not only do new transformations appear that are related to the dimensionality of the space itself, but also new problems arise related to the viewing pipeline - the sequence of steps required to transform a world coordinate scene into an image on the screen.


New Transformations

As we moved from 1D to 2D, certain transformations became possible simply through the addition of another axis. These included the translations and scaling now possible along two axes independently. In moving into 3D these transformations naturally extend to all three axes. In going from 1D to 2D, however, some fundamentally new transformations appeared. These included rotation, a transformation undefined in 1D. In moving to 3D new transformations, undefined in 2D, become possible. Two of the most interesting include rotations about an arbitrary axis, shown in Figure 7.1, and the parity operation, shown in Figure 7.2. The parity operation changes right-handed systems into left-handed systems. Handedness (parity) is a concept undefined in 2D.


Figure 7.1

Rotation about an arbitrary axis. In 3D, rotations may be performed about any of the three principle coordinate axes or about any arbitrary axis as shown.


Figure 7.2

The parity operation. The operation P(X) takes the vector, X, into the vector = -X. The effect is to change a right-handed system into a left-handed system. The parity operator applied to the right hand (top view) changes it into a left hand which is shown rotated 180° (bottom view).

The viewing transformation is particularly simple in 2D since it involves mapping a window on a 2D scene onto a viewport on a 2D display. In the previous chapter we showed that this transformation involves a single matrix generated by concatenating a scaling, a translation, and another scaling. In 3D, the viewing transformation is not so simple because the viewing process now involves projecting a 3D scene onto a 2D display surface. The projection transformation involves mapping world coordinate (x,y,z) triplets onto (x',y') display pairs. As we shall see shortly, projection may be a nonlinear function of the newly introduced z-coordinate. This new complexity is considered in the section on projections and viewing in 3D.

Two Points of View

There are two completely equivalent modes for performing the projection process of the viewing transformation. It is helpful to present these now since they interrelate the rotation transformation of objects and the projection phase of the viewing transformation. By clearly naming these alternative modes, the basic concepts become apparent. Consider the world scene shown in Figure 7.3.


Figure 7.3

Original configuration of camera and world scene.


Now consider the following two options for obtaining a side view of the world coordinate scene:


Figure 7.4

Fixed world/movable camera viewing mode.

A little thought will convince the reader that these two modes are completely equivalent. Since the fixed camera/movable world mode fits so naturally into the transformation matrix formalism, this is the mode we adopt in this chapter.

Representation of 3D Objects

Consider how objects might be represented in 3D space. What are the options available? A range of alternatives exist on the concrete/abstract spectrum. The most concrete geometric representation consists of specifying every (x,y,z) point on the surface of the object. This representation requires no abstract data structure-a single file of points defines the object. But it is also enormously inefficient. A single object would require a megabyte or more of storage space to represent it to the resolution available on most display screens.


Figure 7.5

Fixed camera/movable world mode.


The other end of the spectrum involves the abstract representation available through parametric curved surfaces. With a relatively small set of parameters, significant portions of the surface of the object may be accurately modeled. The whole object may, in turn, be represented by smoothly joining a set of parametric patches. The advantages of this representation are efficient storage and a high degree of accuracy independent of the characteristics of the display. The main disadvantages of parametric curved surface representations stem from the difficulties in determining the correct set of curved patches with which to model real objects.

Wire Frame Representation

Most 3D graphics systems represent a compromise along the concrete/abstract spectrum. For certain applications, a wire frame representation is adequate. This can be achieved most simply by a file of straight lines, each of which is represented by its two end points, (x1,y1,x2,y2). Wire frame representations show adequately the skeletal structure of objects, but are not capable of representing surfaces or any other features of a 3D object. Figure 7.6 illustrates the use of wire frame representations.

For certain applications, wire frame representations have distinct advantages. They allow the user to "see through" objects and visualize the internal structure and shape of normally invisible surfaces. However, wire frames also have definite limitations. The complexity of even relatively simple objects soon overwhelms the observer, with insight into shape being lost in the clutter of lines. Figure 7.6(b) is approaching this limitation. Another severe limitation stems from the inefficiency of the independent- line data structure. For instance, each vertex (except the top and bottom vertices) of Figure 7.6(a) is the end point for four lines. So a simple file of lines would contain a four-fold redundancy of data points, and the top and bottom vertex would appear twelve times!


    (a)     	(b)

Figure 7.6

Wire frame representation of two objects.

Polyhedral Representations

The compromise most widely used in computer graphics systems is the polyhedron, a set of smoothly joined polygons. By the appropriate selection of polygon shape and number, polyhedrons are capable of modeling any 3D object to the desired degree of accuracy. Some objects, e.g. tetrahedra and cubes, may be modeled precisely by a set of four equilateral triangles or six squares, respectively. Others, such as spheres and cylinders, may be approximated by combinations of trapezoids, triangles, rectangles, and n-sided polygons. For certain applications a coarse polygon grid may be adequate-for applications requiring greater accuracy, more polygons with a smaller grid spacing may be required. Figure 7.7 demonstrates how a sphere may be approximated with increasing accuracy by polyhedra.


(a)   (b)   (c)

Figure 7.7

Approximating a sphere by polyhedra. (a) consists of 32, (b) of 288, and (c) of 3200 polygons.


Figure 7.8

Human with Robot scene based on polyhedra. Note how the whole scene is composed of rectangles, trapezoids, or, in the case of the 3D letters, rectangles and n-sided polygons.


A polyhedra-based 3D scene such as that shown in Figure 7.8 may be represented by the hierarchical data structure shown in Figures 7.9-7.10. The highest level is the scene; in the next lower level are objects composing the scene; each object is composed of subobjects which may contain other subobjects or geometric primitives; each geometric primitive is a polyhedron composed of polygons; each polygon, in turn, is composed of vertex points; and finally, each point consists of a (x,y,z) triplet.

An illustration of the hierarchical structure of one of the objects in the Figure 7.8 scene is shown graphically in Figure 7.9 in which the robot is exploded into its geometric primitive subobjects. As the figure indicates, the robot has only one level of subobject. However, in general, subobjects may themselves consist of subobjects which, in turn, may consist of subobjects and so on. A useful algorithmic approach is to label the lowest level of sub-object as a 3D geometric primitive and recursively branch down the hierarchical tree until a primitive object is reached. Primitive objects contain, as a minimum, a polygon list pointing to the vertices defining the polygon and a vertex list defining the coordinates in three dimensions. This data structure provides the basis for the complete polyhedral scene.


Figure 7.9

Exploded view of hierarchical structure of robot. The robot main object (bold) is constructed by assembling graphical primitive subobjects which are easily generated by CAD systems.


The polyhedra-based representation of scenes has the advantages of simplicity, generality, and computational efficiency. Objects may be manipulated and transformed by operating on the points composing the polygons. Polygon surfaces have well-defined orientations which simplify computations of visibility and shading. Thus, polygonal representations provide a natural basis for the addition of refinements to increase the visual realism of natural objects. The addition of shading, shadows, and edge antialiasing to the camera polyhedron of Figure 7.6(b) produces the more realistic Figure 7.11. These techniques for enhancing visual realism are discussed in subsequent chapters and continue as areas of active computer graphics research. A serious attempt to establish a mathematical theory for representation of 3D graphical objects and scenes is given in Fiume.

The principal weakness of polyhedral representations is their poor approximation to smooth curved surfaces, complex shapes, and life-like forms. Simply increasing the number of polygons to achieve visually realistic representations of complex scenes will overwhelm the storage and computing capacity of even the largest computers. Efficient algorithms have been developed smoothing the transition from one polygon to the next in order to reduce the discontinuities in color and shading as one moves across a multipolygon surface. Such smoothing techniques can produce visually realistic results with a manageable number of polygons.

Figure 7.10

Hierarchical structure of polyhedral scene. Note that each subobject will have its own polygon list and associated vertex list. Also, subobjects such as right arm will have its own subobjects such as upper arm, lower arm, and hand. The hand may, in turn, have subobjects such as fingers, and so on.

Transformations of 3D Objects

In the remainder of this chapter we consider two important but conceptually distinct topics - the manipulation of objects in 3D space and the projection of objects from 3D to 2D space. By clearly distinguishing these two concepts the reader will save herself considerable grief.

From our experience to date with the homogeneous coordinate system and the matrix representation, the reader should find the extension to 3D transformations straightforward, particularly for translation and scaling. Recall that the fundamental transformation equation may be written in homogeneous coordinates as:

 = X·M        [7.1]


X´ = [x´y´z´1] (transformed point),

X = [x yz 1] (original point),


M = 4 x 4 transformation matrix.



Figure 7.11

More visually realistic rendering of polyhedral representation of camera. Adding shading, shadows, and edge antialiasing to the polygons of Figure 7.6(b) produces this image.

Primary Transformations

Note that by transforming the points defining the object (wire frame or polygon vertices), the object itself is transformed. The desired transformation is uniquely determined by setting the transformation matrix, M, appropriately. We can summarize the M matrices for the primary transformations as follows.



For translating an object a distance Tx along the x-axis, Ty along the y-axis, and Tz along the z-axis, the matrix M is the T matrix, defined as:


Substituting T for M in equation 7.1 and multiplying gives the standard linear translation equations:


x´ = x + Tx [7.3]

y´ = y + Ty

z´ = z + Tz .

In vector notation, this transformation may be considered as the addition of a displacement vector, D, to an original position vector, x, to move the object to . Mathematically, this is written:


= x + D [7.4]


x = (x,y,z), (original coordinates of object)

= (x´,y´,z´) . (object coordinates after translation).

The displacement vector, D, has Tx, Ty, and Tz as its components as illustrated in Figure 7.12.



Generalizing the 2D scaling transformation to 3D involves adding the scale factor, Sz, to the previously defined Sx and Sy for independent scaling along the new z-axis. The M matrix of Equation 7.1 is then represented by the S matrix, defined as:


Substituting S for M in Equation 7.1 and multiplying gives the standard linear scaling equations:


x´ = Sx · x [7.6]

y´ = Sy · y

z´ = Sz · z

Figure 7.12

Translation of an object by vector D.The transformation is performed by the T matrix with components Tx, Ty, and Tz whose meanings are shown here.


For the commonly used uniform magnification of an object by a scale factor S, all three scale factors are set equal to S. However, differential scaling also provides a useful tool for object deformations as demonstrated in Figure 7.13.

Figure 7.13

Differential scaling of a sphere. Starting with the centered sphere, the oblate ellipsoid (top) was generated by (Sx,Sy,Sz) = (2,1,2). The prolate ellipsoid (right) was produced with (Sx,Sy,Sz) = (1,2,1).



Note how naturally and easily the translation and scaling matrices have expanded to accommodate the third dimension. In fact, comparing with the extension from 1D to 2D, you probably predicted the Equations 7.2 and 7.5 as the correct form for 3D translations and scaling, respectively. However, in extending rotations to 3D we run into some unforeseen complexity. There are three aspects to this complexity.

The solution to the first problem is to define three rotation matrices, Rx, Ry, and Rz, corresponding to a rotation by q about the x-, y-, and z-axes, as shown schematically in Figure 7.14. The last of these, Rz, is a simple extension of the 2D R matrix because the motion in both R and Rz occurs parallel to the x-y plane. The three matrices are expressed as:

Figure 7.14

Rotation transformations about three Cartesian coordinate axes.


Substituting these rotation matrices for M in Equation 7.1 and multiplying gives the familiar equations for rotation about the x-axis:


x´ = x [7.10]

y´ = y cos q - z sin q

z´ = y sin q + z cos q

and similar sets of equations for rotations about the y- and z-axes.


Secondary Transformations

The most important secondary transformations are reflections and shearing. The introduction of a new axis increases the number of such transformations.



Reflections can be classified as occurring through one, two, or three planes. As examples of each of these classes, consider the following three reflection matrices:

Although this sequence of transformations appears to be simply a case of increasing negativism, it involves some interesting geometry. Equations 7.11-7.13 also each have two other forms corresponding to cycling the (-) sign through other possible scale factors. Comparison of Equations 7.11-7.13 to Equation 7.5 confirms that reflections are special instances of the more general class of scaling transformation.

Mx corresponds to simple mirroring through the y-z plane. Mxy corresponds to mirroring through both the y-z and x-z planes. Mxyz corresponds to mirroring the object through all three coordinate planes and is designated as the parity operator, written as P(X). In vector notation it can be expressed as:



P(X) Æ -X [7.14]

That is, performing the parity operation on any vector simply inverts it.

A curious effect is that both the simple mirror operators (e.g., Mx) and the parity operator, P = Mxyz, change right-handed systems into left-handed systems and visa versa. These operators are very useful to designers in developing systems with a unique parity. In modeling a human form, for instance, only the right hand, arm, and leg need to be designed. The corresponding left limbs are designed by copying the right limbs and applying a simple mirror or parity operation.

Figure 7.15

Mirror transformation. Note how a simple reflection changes a right-hand coordinate system into a left-hand coordinate system.



As discussed in the 2D section, shearing may be thought of as a coordinate-dependent translation in which the cross section in one dimension is preserved while its position along that dimension depends on the other dimension. We can define analogous 3D shearing as preserving the cross section parallel to a given plane but making the cross section's position a function of the third coordinate. Thus, for instance, a y-dependent shear could be written as:


x´ = x + a y [7.15]

y´ = y

z´ = z + b y


a, b = shearing constants, one of which may be zero.

This transformation may be written in matrix form as

Figure 7.16 shows the results of applying this transformation to a simple object.

      (a)             (b)                 (c)

Figure 7.16

Shearing transformation in 3D - Shearing parameters in Equation 7.16 are:
a) a = 1/3; b = 0
b) a = 1/3; b = 1/3
c) a = 0; b = 1/3


Inverse Transformations

All of the M transformations presented to this point have inverse transformations indicated as M-1. The role of M-1 is to return the transformed object back to its original state. We can summarize the inverse transformations with the matrix element substitutions listed in Table 7.1.


Table 7.1
Inverse Transformation Matrix Elements

Tx,Ty, Tz













-a, -b


As mentioned in the 2D discussion, one can readily verify that M-1 is correct by multiplying it by M. If M-1 correct, the result will yield the identity matrix, I.


Rotation about an Arbitrary Axis - Matrix Composition

As with lower dimensional transformations, complex operations may be composed by concatenating a series of more primitive transformations. Such transformations include shearing in directions other than the coordinate axes and reflections through planes displaced from the origin and not parallel to planes containing the axes. One of the most useful operations is rotation about an arbitrary axis (see Figure 7.17).

The general algorithm for the rotation of an object about an arbitrary axis requires the following sequence of five steps: shift the axis to the origin, align it with one of the coordinate axes, perform the desired rotation, undo the initial alignment, and shift back to the original position. Note that any of the three coordinate axes may be used for the alignment. We have chosen the y-axis to assist in visualizing the process. Note also that the alignment process requires two rotation steps in general.

To get more specific, consider what is required to rotate an object by an angle q about an axis, A, which contains point P a distance D from the origin. Seven primary transformations which accomplish the five-step algorithm are summarized here and illustrated in Figure 7.18.

1. Translation, T, of the object/axis system a vector distance -D so that some point, P, on the axis lies on the origin.

2. Rotation Rx about the x-axis by qx in order to bring axis A into the x-y plane.

3. Rotation Rz about the z-axis by the angle qz to align axis A with the y-axis.

4. Rotation Ry by the desired angle, q.

5. Rotation Rz-1 about the z-axis by the angle -qz to undo step (3).

6. Rotation Rx-1 about the x-axis by the angle -qx to undo step (2).

7. Translation, T-1, of point P a distance D to undo step (1).

How might we express this sequence of primary transformations in terms of the familiar homogeneous matrix formalism? It is clear from the figures that we need more information than just q in order to specify the four matrices, T, Rx, Ry, and Rz and their relevant inverse matrices. In particular, we need the location of P and the orientation of A.

Assume that coordinates of point P are the triad, (Px,Py,Pz). This information is sufficient for specifying the two translations, T and T-1.

In order to define the rotation matrices we need to specify the orientation of the rotation axis, A. We can specify this orientation in terms of a triad of angles, (a,b,g) which the vector A makes with x-, y-, and z-axes at the origin. We can then define a unit vector, L, parallel to A in terms of the unit vectors (i,j,k) and its x, y, and z components, (a,b,c):

L = a i + b j + c k 	[7.19]


a = x-component of L,

i = unit vector in x-direction, and so on.

The relationship between the angles, (a,b,g), and the components, (a,b,c), is simply:

a=cos a 				[7.20]
b=cos b
c=cos g


Figure 7.17

Rotation about an arbitrary axis. The object in the upper left system is to be rotated by q = 90° to give the lower right configuration.

As defined in Equation 7.20, a, b, and c are called the direction cosines. Figure 7.19 demonstrates these relationships and clarifies how the rotation parameters are derived from the orientation angles.

From Figure 7.19 and because L is a unit vector, the following expressions may be written:

a2 + b2 + c2 = 1 			[7.21]

(L is a unit vector),

d2 = b2 + c2 				[7.22]
(by construction).

The rotation, Rx, required to take A (and L) into the x-y plane, also swings d onto the y-axis. The angle involved in the rotation is qx, and, in this case, it will have a negative value because of the right-hand rule. A positive angular rotation about the x-axis carries the y-axis into the z-axis. The rotation parameters for Rx may be written:


cos qx = b/d [7.23]

sin qx = - c/d [7.24]

Similarly, inspection of Figure 7.19 leads to the following expressions for qz, the angle required to rotate L onto the y-axis within the x-y plane.


cos qz = d [7.25]

sin qz = a [7.26]

Because the Rz rotation carries the x-axis into the y-axis in a right-handed system, the angle qz is positive. Equations 7.23-7.26 allow us to complete the specification of the rotation matrices, Rx and Rz . These are written:

The final rotation of the object by q degrees occurs, in this case, about the y-axis, and the rotation matrix is the familiar

Figure 7.18

Primary transformations composed to generate a rotation about an arbitrary axis. Note that each figure shows the result of the transformation indicated. Also note that transformation (7), the final shift back to P, is not shown.


The inverse matrices for steps 5-7 are readily derived from Table 7.1. Now that we have the component primary transformations, the final concatenated matrix, M, for the rotation of an object about an arbitrary axis may be computed by the product matrix shown in the following box.


Although this expression looks scary, every element in it is defined above. By substituting numerical values corresponding to an actual rotation, the matrix multiplication reduces M to a simple 4 ¥ 4 numerical matrix.

Figure 7.19

Rotation angle, qx, in terms of orientation angles and direction cosines.

Projections and Viewing in 3D

Now that we know how to represent objects with a polyhedral structure and to manipulate them with matrix operators, the next step is to explain the fundamentals for viewing our models on the screen. The addition of a third dimension in world coordinates adds complexity to the viewing task. Because scene models are represented in 3D but viewing occurs in 2D, an additional process called projection must be added to the viewing pipeline. Projection can be thought of as the process of collapsing a 3D representation into a 2D image. Burger and Gillies present a summary of the projection operation in concise vector notation.

A helpful model for understanding projection is the analogy between projection of computer graphics images and projection of slide film shown in Figure 7.20. In a slide film projector, light starts at the bulb of the projector, passes through the slide, and focuses an image of the slide on the screen. In computer graphics projection, the eye is analogous to the projector bulb, the display screen is analogous to the slide, the world scene is analogous to the slide screen image, and the direction of the light is reversed. Figure 7.20 illustrates several important concepts concerning projection.




Figure 7.20

Analogy between slide projection and graphics projection. Note the reversal in light ray direction between (a) and (b).


Classes of Projections

Projections commonly used in computer graphics fall into three categories which can be summarized as:

In order to specify these projections in more detail, we need to define several additional terms related to 3D viewing. These are illustrated in Figure 7.21.

With these definitions in mind, let us examine each of the three projection classes in more detail and present the matrix which performs the projection transformation.




Figure 7.21

Definition of 3D viewing terms. The projectors map the 3D object from the view volume in (a) onto the 2D viewing plane in (b).

Parallel Projection

When the projectors are all anti-parallel to the z-axis we have the case of orthographic parallel projection. The effect is to map all view volume triads, (x,y,z), onto view plane pairs, (x´,y´), according to the equations:


x´ = x [7.31]

y´ = y

z´ = 0

(since viewing plane is the x-y plane).

In matrix formalism, this can be considered a transformation of the form = XM1, where M1 is given by

The relationship between projectors, object, and image is shown schematically in Figure 7.22.


Figure 7.22

Parallel projection.This unique parallel projection, in which the direction of projection is in the (-z) direction, is called an orthographic projection. Note that the (x´,y´) coordinates of an image point are independent of the z coordinate of the object point.


Orthographic parallel projections are particularly useful for applications such as architectural CAD. Floor plans are generated by setting the direction of projection in the (-y) direction. The front view (shown here) and side views are called elevations. This projection mode has the advantage of mapping parallel lines of the object into parallel lines on the image. Lines lying in a plane parallel to the x-y plane preserve their orientation and length. All other lines are foreshortened. The primary disadvantage of an orthographic projection is the loss of visual cueing on the depth (z-coordinate) of points on the object.


Oblique Projections

Another class of parallel projections, called oblique projections, retains the parallelism preserving advantages of orthographic projections while overcoming the lack of depth cueing. The basic idea is to retain the parallelism of the projectors but introduce a non-zero angle, g, between the projectors and the z-axis. The choice of g determines the scale of the depth encoding of the z dimension. Two standard choices include:

Defining m as the unit vector parallel to the oblique projectors, we have the following relationships.

m = a i + b j + c k 			[7.33]


a,b,c = direction cosines

i,j,k = unit vectors along (x,y,z) axes.


Figure 7.23

Two cavalier projections. Parallel projectors make an angle of g = 45° with respect to the z-axis. Shown are two of an infinite set of projections as a function of the angle, d, the angle between the x-axis and the projector projected on the x-y plane.


For cavalier projections, the g = 45° constraint gives c = cos g = 0.707. The direction cosines a and b are each arbitrary but constrained by the relationship a2 + b2 = 0.5. Different cavalier projections are generated by different choices of a and b. A useful angle for specifying these projections is the angle, d, defined as:

d = tan-1(a/b) 				[7.34]


This is the angle the vector, d, makes with respect to the x-axis, where d is the component of the unit projector which is perpendicular to the z-axis.

d = a i + b j  				[7.35]

The only change in this formalism for cabinet projections is to set


d/c=1/2 [7.36]

This leads to

g = tan-1(d/c) = 26.56°			 [7.37]

A whole family of cabinet projections are generated for various d angles. Particularly useful choices are 30° and 45°. Figure 7.24 shows cabinet projections for d = 45° and -30°.

The transformation equations mapping the 3D point, (x,y,z), onto the viewing plane at (x´,y´), are:

These equations may be simplified and put in matrix form as M2:

Figure 7.24

Cabinet projections for two different angles, d. Note how the z-depth is now encoded by a scale factor of 0.5. Note also how parallel lines project as parallel.


Some interesting observations are apparent from comparing Figures 7.23 and 7.24. Both oblique projections correctly encode z-depth information, with cavalier projection having the edge since the encoding is one-to-one. However, all parallel projections distort the perspective image generated by our eyes. Of the two oblique projections, this nonperspective distortion is severe for cavalier projections but much less objectionable for cabinet projections. Cabinet projections provide a compromise between readily decoded depth information and visual realism.

Perspective Projection

Perspective projection provides the most visually realistic transformation of 3D scenes onto 2D viewing planes. Perspective projections mimic the process by which the eye maps world scenes into images on the retina and the process used by a camera in recording images on film. We therefore intuitively recognize images created by perspective projections as "real" or correct and detect those made by other projections as "unnatural" or distorted. The one exception to this general rule are photographs taken by telephoto lenses which closely approximate orthographic parallel projections. These photographs are intuitively interpreted as natural even though virtually no effects of perspective projection are perceptible.

The mathematics of mapping an object point at (x,y,z) into an image point at (x´,y´) with perspective projection is simply derived from similar triangles shown in Figure 7.25.

The coordinates of the five points indicated in Figure 7.25 are as follows:


a Æ (0,y,z) dÆ (0,0,0)

b Æ (0,y´,0) e Æ (0,0,z)

c Æ (0,0,-D)


Figure 7.25

Perspective projection of object ae onto viewing plane image bd. Center of projection (COP) is located on the negative z-axis at coordinate (0,0,-D). The viewing plane has the equation, z = 0.


Projector a-b-c forms the hypotenuse of two similar triangles, ace and bcd. Setting the tangents of the common angle equal gives:

The three transformation equations, 7.42-7.44, can be summarized in homogeneous coordinate representation as = XM3, where M3 is given by

Equation 7.46 emphasizes the most important aspect of perspective projection-the scale factors transforming object coordinates to image coordinates are depth-dependent. The z-dependence of (x´,y´) points is the basis for perspective side effects such as the non-parallel projection of parallel lines. The emergence of vanishing points arises directly from this side effect. Figure 7.26 illustrates perspective projection and the vanishing point concept.

We conclude the section on projections by a short object lesson on the power of mathematical abstraction. A casual scan of the figures of this section may appear intimidating as new phenomena such as depth encoding and vanishing points are encountered. Without a careful reading of the mathematics it is natural to worry about such unnecessary questions as: How do I pick the vanishing points? How do I set the projection angle, d and so on.

The point of the object lesson is that all the subtle and mysterious projection effects presented in this chapter are contained in the three projection matrices, M1, M2, and M3. Applying these matrices to the sets of points defining the world scene generates the various projections and all associated effects. Note in particular the simplicity of the matrices. M1 (parallel projection) contains nothing except the numeric constants 0 and 1. M2 (oblique projection) contains only the direction cosines, (a,b,c), specifying the angle of the parallel projectors. M3 completely specifies perspective projection with two parameters: D, the distance from the origin to the eye, and the variable, z, the depth coordinate of the point being projected. Thus, with two numbers, four constants, and one variable we can compute the three matrices, M1, M2, and M3, which completely specify parallel, oblique, and perspective projections. This is one of the most convincing examples of the power of the homogeneous coordinate representation.

Figure 7.26

Perspective projections. Foreshortening of the z-axis in (a) produces one vanishing point, P1. Foreshortening the x- and z-axis results in two vanishing points in (b). Adding a y-axis foreshortening in (c) adds an additional vanishing point along the negative y-axis.

Viewing in 3D

As indicated earlier in this chapter, there are two approaches to viewing in 3D. The first involves fixing the COP in a permanent (x,y,z) coordinate system and transforming the coordinates of the world objects to obtain the desired viewpoint and viewing direction. This we consider the fundamental viewing operation, and we have in place all the mathematical formalism necessary for this mode of 3D viewing.

The second, completely equivalent, mode of viewing is the synthetic camera model.


Pascal Example - 3D Projections

To illustrate the simplicity of the concepts involved in 3D projections we develop an algorithm, Project, and implement it in Pascal. What criteria and features should we consider in designing a 3D projection algorithm?

Figure 7.27 illustrates the implementation of these design specifications. The specific implementation of each feature is discussed in detail below.


Hierarchical Data Structure

The world scene, consisting of a cube centered at the origin and a tetrahedron on down the z-axis was represented by the twelve points and ten polygons listed in Table 7.2.


Table 7.2
Data Structure for Cube/Tetrahedron Scene


1 -100 100 100

2 100 100 100

3 100 -100 100

4 -100 -100 100

5 -100 100 -100

6 -100 -100 -100

7 100 -100 -100

8 100 100 -100

9 -100 -100 323.2

10 0 73.2 273.2

11 100 -100 323.2

12 0 -100 150


1 4 1 2 3 4

2 4 2 8 7 3

3 4 1 5 8 2

4 4 1 4 6 5

5 4 3 7 6 4

6 4 5 6 7 8

7 3 9 10 11

8 3 11 10 12

9 3 10 9 12

10 3 11 12 9


Figure 7.27

Output and graphical user interface of Project . Note the hierarchical menu structure with four major functional blocks: ROTATE, PROJECTION, CONTROL, and ANGLE selection. The first three operate as pick devices (selectable by clicking the mouse) and the last as a valuator device (dragging the pointer selects the angle read out in the lower left-hand box).


The first data item is the number of points in the file. Next comes twelve lines of data, each corresponding to a point in the format: Point i, xi,yi,zi. The "10" in line 14 is the number of polygons for which the points form the basis. The last ten lines are the polygon vertex table in the format: polygon number, number of vertices in this polygon, and the vertex (point) numbers constituting that polygon. Note the efficiency that this two-level hierarchy permit-each point is specified only once but can contribute to three or more polygons. Any transformations (rotations, translations, or projections) involve only the points. The polygonal structure is preserved through the transformation and rides along through the viewing pipeline, including the final 2D plotting. The program Project makes use of a higher-level function to plot the polygons.

Note that no attempt was made to distinguish the two objects in the scene. For complex scenes made from many objects or scenes in which one wishes to manipulate objects individually, it is necessary to maintain an object table with pointers to individual objects. The present table considers the combination of cube + tetrahedron to be one object.



When one of the ROTATE buttons is clicked, it lights up and assigns the appropriate axis of rotation. Clicking a second axis button re-assigns the rotation axis. No actual rotation computation is performed, however, until the Comp.Rot button is clicked. Each button remains lit until the New Object or Quit button is selected. This retains a partial graphical record of the operations performed to date on the object.

The coordinate system used for rotations is the standard graphics system centered in the display window, with y-axis pointing up, x-axis pointing to the right, and the z-axis pointing along the line of sight into the screen. After a few minutes of practice, the user develops an accurate intuition about the behavior of the system and can predict what the resulting image will be before performing the selected rotation. This, in fact, is the primary pedagogical value of programs like Project. Figure 7.28 shows a continuing series of 30° rotations about the y-axis following the configuration shown in Figure 7.27.


Figure 7.28

Project images of object rotated in 30° increments about the y-axis. Note that the tetrahedron, located about 300 units along the z-axis originally, swings around and "collides" with the COP (eye) located at z = -300.

Note the problem of interference of the displayed image of the graphical object and the menu items. Careful clipping to a viewport would eliminate this problem. There are two attractive features of Project which Figure 7.28 illustrates. First, it demonstrates graphically the phenomena of an object swinging behind the COP located here at z = -300. The back plane of the tetrahedron extends to z = - 323 for the 180° rotation case.

Secondly, all images in Figure 7.28 were generated by a sequence of two button clicks-Comp.Rot and Plot Proj. The program remembers the indicated status of axis = 'y', projector = 'P', and angle = 30 indicated in Figure 7.26 and uses them for all subsequent computations and displays, until the user overwrites them with another selection. All three parameters may be altered at will at any point during a user session.


The purpose of the PROJECTION module of the menu is to select the mode of projection. This is accomplished by setting the parameter projector to a character value of ('L', 'O', 'P') for parallel, oblique, and perspective projections, respectively. Upon exiting the CONTROL option Comp.Rot, this parameter is used to set the projection matrix M equal to M1, M2, or M3 defined in the projection section of this chapter.

The programming of this assignment operation raises an interesting issue of computational efficiency. The issue is this: M1 and M2 are completely determined once and for all by the selection of parallel and oblique projection. The optimal strategy for these projections is given by the algorithm:

1. Select axis and angle, and set R = Rx or Ry or Rz.

2. Select projection and set a projection matrix P = M1 or M2.

3. Compose the rotation and projection matrix to generate a final matrix M = RP.

4. Apply M to all points (xi,yi,zi) to generate (xpi,ypi).

5. Reconstruct the transformed polygons from (xpi,ypi) and plot.

However, the z-dependence of the perspective projection, M3, requires a different strategy. We can call this strategy delayed evaluation, and it is the basis of the algorithm used in Project. We separate the steps of computing the rotation from that of computing the projection by assigning a separate CONTROL button to each operation. The actual projection routine assumes a parallel projection matrix, M1, as default value and modifies it to an oblique projection matrix, M2, or perspective projection matrix, M3, as required. By adopting the delayed evaluation strategy, we can treat all three projection modes in an identical manner.

Implementing this strategy in Project we follow the algorithm:

1. Select axis and angle, and set R = Rx or Ry or Rz.

2. Select projection and assign pointer projector to store latest selection.

3. Upon selection of CONTROL option, Comp.Rot, apply R to present object, (xi,yi,zi), to generate rotated object which is stored back into (xi,yi,zi). This option may be repeated to effect multiple rotations.

4. Upon selection of Plot Proj. CONTROL option, use the latest value of projector to set M = M1 , M2, or M3, and apply it individually to each point (xi,yi,zi) to generate projected points, (xpi,ypi).

5. Shift these points to the center of the screen, restructure them as polygons, and plot the polygons.

Figure 7.29 shows the output of Project for selection of each of the three projection modes with no rotations applied.


Figure 7.29

Output of Project for three different projection modes.


The reader can readily verify most of the features of the three projections from this figure. For example, parallel projections give no depth cuing, oblique projections correctly encode the z-depth for lines parallel to the z-axis, and perspective projections involve one or more vanishing points. However, Project also provides a useful tool for investigating the properties of projections for "non-textbook" cases, e.g. when planes are not parallel to the viewing plane and edges are not parallel to the z-axis. Figure 7.30 demonstrates non-intuitive difficulties arising from an oblique projection of the scene after a y-axis rotation of -45°.


CONTROL Module and GUI

The behavior of the CONTROL module is fairly apparent from the listed menu options. Quit exits the program and New Object opens a query window from which the user can select the hierarchical file specifying a new scene. Plot Proj. and Comp.Rot have been discussed. However, Project contains several helpful graphical user interface features that are not apparent from the figures. These include:

Figure 7.30

Oblique projection of scene following a -45° rotation. Note how z-depth cuing is confused because no edges satisfy the criterion of z-axis parallelism. In fact, it takes considerable concentra-tion to interpret the flattened parallelepiped as a cube, but it can be done.

Although these features are not exhaustive, they do provide a flexible and capable system for viewing 3D objects in wire frame mode from any angle with any of the standard projections. Additional options which would improve performance might include:

The problem with implementing such extended features lies not only with the additional complexity of the programming involved. It also involves the steeper learning curve for potential users. Mastering a 3D object manipulation package easily requires that it be modular and conceptually simple. Each additional feature extends the range of transformations possible and increases the difficulty of mastery. As it stands, Project represents a reasonable compromise between capability and simplicity.

Pascal ProgramProject

program Project;
{Program to read the X, Y, Z coordinates of 			}
{an object and the polygon connectivity information 	}
{for the object,rotate it repeatedly upon demand 		}
{about any axis by N degrees and display the		} 
{projected image with a user-selectable projection.	}
{All interaction is menu-driven.					}
		Nctrl = 15;
		nd = 4;
		D = 400;
		a = 0.5;						{Direction Cosines }
		b = 0.5;						{of Oblique Projector}
		c = -0.707;				{for Cavalier projection.}
		rotate = [2, 3, 4];
		projection = [6, 7, 8];
		control = [10, 11, 12, 13];
		draw = [2, 3, 4, 6, 7, 8, 10, 11, 12, 13, 14];
		mat = array[1..4, 1..4] of real;
		vect = array[1..3] of real;
		BoxArray = array[1..Nctrl] of Rect;
		x, y, z: array[1..500] of real;
		xp, yp, zp: array[1..500] of real;
		key: array[1..500] of integer;
		ind: array[1..500, 1..5] of integer;
		npp, shade: array[1..500] of integer;
		box: BoxArray;
		vectIn, vectOut, P, R, M: mat;
		i, j, k, npts, rank, nplanes: integer;
		active, xm, ym, angle: integer;
		done: Boolean;
		axis, projector: char;
		th, dt, Lmag: real;
		window: rect;
		Pt: Point;

{******************* diagmat *********************}
	procedure diagmat (var xmat: mat; nd: integer);
{procedure to generates identity matrix}
			i, j: integer;
		for i := 1 to nd do
				for j := 1 to nd do
					xmat[i, j] := 0.0;
				xmat[i, i] := 1.0;

{******************* matmlt **********************}
	procedure matmlt (var d, a, b: mat; r, nd: integer);
{Calculates matrix product: D=A*B }
			n, i, j, k: integer;
			sum: real;
			temp: mat;
		n := nd;
		for i := 1 to r do
				for j := 1 to n do
						sum := 0.0;
						for k := 1 to n do
							sum := sum + 
								a[i, k] * b[k, j];
						temp[i, j] := sum;
		for i := 1 to r do
			for j := 1 to n do
				d[i, j] := temp[i, j];

{******************* GetObject ******************}
	procedure getObject;
{Procedure to read hierarchical data structure.}
			filename: string;
			j, i, n, nppp: integer;
			infile: text;
		reset(infile, OldFileName
				('File of Input Object?'));
		readln(infile, npts);
		for i := 1 to npts do
				readln(infile, n, x[i], y[i], z[i]);
		read(infile, nplanes);
		for i := 1 to nplanes do
				read(infile, n, nppp);
				for j := 1 to nppp do
					read(infile, ind[n, j]);
				npp[i] := nppp;

{**************  Set Rotation Matrix **************}
	procedure SetRot (axis: char; degrees: real;
												nd: integer; var R: mat);
			s, c: real;
			i: integer;
		th := degrees * 2 * 3.141592654 / 360.0;
		diagMat(R, nd);
		c := cos(th);
		s := sin(th);
		vectIn[1, 4] := 1.0;
		if axis = 'x' then
				R[2, 2] := c;
				R[3, 3] := c;
				R[2, 3] := s;
				R[3, 2] := -s;
		if axis = 'y' then
				R[1, 1] := c;
				R[3, 3] := c;
				R[1, 3] := -s;
				R[3, 1] := s;
		if axis = 'z' then
				R[1, 1] := c;
				R[2, 2] := c;
				R[1, 2] := s;
				R[2, 1] := -s;

{******************** Menu  **********************}
	procedure Menu (var box: BoxArray);
{Procedure to draw interactive menu for selecting}
{		• Rotation axis									}
{		• Rotation angle								}
{		• Projection Mode							}
			Nctrl = 15;
			T = 10;
			B = 400;
			L = 1;
			R = 550;
			BoxArray = array[1..Nctrl] of Rect;
			SBox: Rect;
			Top, Bot, Left, Right: Integer;
			CLabel: array[1..Nctrl] of string;
			color: array[1..Nctrl] of LongInt;
			i: integer;
		for i := 1 to Nctrl do
			color[i] := blueColor;
		color[1] := whiteColor;
		color[5] := whiteColor;
		color[9] := whiteColor;
		color[15] := redColor;
		CLabel[1] := 'ROTATE';
		CLabel[2] := 'x-Axis';
		CLabel[3] := 'y-Axis';
		CLabel[4] := 'z-Axis';
		CLabel[5] := 'PROJECTION';
		CLabel[6] := 'Parallel';
		CLabel[7] := 'Oblique';
		CLabel[8] := 'Perspective';
		CLabel[9] := 'CONTROL';
		CLabel[10] := 'Comp. Rot';
		CLabel[11] := 'Plot Proj.';
		CLabel[12] := 'NewObject';
		CLabel[13] := 'Quit';
		CLabel[15] := 'ANGLE';
													{Open Drawing Screen}
		SetRect(SBox, L, T, R, B);
{Outline Main Menu Sections}
		PenSize(3, 3);
		Top := -15;
		Bot := Top + 104;
		Left := 5;
		Right := 100;
		SetRect(SBox, Left, Top, Right, Bot);
		Top := Bot - 3;
		Bot := Bot + 100;
		SetRect(SBox, Left, Top, Right, Bot);
		Top := Bot - 3;
		Bot := Bot + 125;
		SetRect(SBox, Left, Top, Right, Bot);
{Iterate to produce control buttons}
{ and name boxes.}
		PenSize(2, 2);
		for i := 1 to Nctrl - 2 do
				Left := 15;
				Right := Left + 20;
				Top := 25 * i - 45;
				Bot := Top + 20;
				SetRect(Box[i], Left, Top,
							Right, Bot);
				MoveTo(Right + 5, Top + 15);
		Left := 5;
		Right := Left + 40;
		Top := Bot + 15;
		Bot := Top + 20;
		SetRect(Box[15], Left, Top, Right, Bot);
		MoveTo(Right + 5, Top + 15);
		Move(5, 0);
		Left := 100;
		Right := 460;
		SetRect(Box[14], Left, Top, Right, Bot);
		MoveTo(Right + 5, Top + 15);

{************** Apply Rotation  ******************}
	procedure transformRot (R: mat);
{Procedure to apply rotation matrix to all points.}
			i: integer;
		for i := 1 to npts do
				vectIn[1, 1] := x[i];
				vectIn[1, 2] := y[i];
				vectIn[1, 3] := z[i];
				vectIn[1, 4] := 1.0;
				matmlt(vectOut, vectIn, R, 1, nd);
				x[i] := vectOut[1, 1];
				y[i] := vectOut[1, 2];
				z[i] := vectOut[1, 3];

{************* Projection transform *************}
	procedure transformView (projector: char);
{Procedrue to apply viewing projection}
{ to all points.}
			i: integer;
		DiagMat(M, nd);	{Use parallel projection}
		M[3, 3] := 0.0;		{as default.}
		for i := 1 to npts do
				vectIn[1, 1] := x[i];
				vectIn[1, 2] := y[i];
				vectIn[1, 3] := z[i];
				vectIn[1, 4] := 1.0;

				case projector of	
					'O': 		{Adjust projection }	
						begin	{matrix for oblique.}
							M[3, 1] := -a / c;
							M[3, 2] := -b / c;
					'P': 	{Compute z-dependent }
						begin	{scale factor.}
							M[1, 1] := 1 / (z[i] / D + 1);
							M[2, 2] := M[1, 1];
				matmlt(vectOut, vectIn, M, 1, nd);
				xp[i] := vectOut[1, 1];
				yp[i] := vectOut[1, 2];
				zp[i] := vectOut[1, 3];

{******************* Plot Proj  *******************}
	procedure plotProj;
{Procedure to shift points to (xc,yc) of screen,	}
{reconstruct polygons from vertex table, and	}
{plot out resulting projected polygons.			}
			xc = 300;
			yc = 200;
			xx, yy: array[1..6] of integer;
			i, j, np, np1: integer;
			zz: array[1..6] of real;
			window: rect;
			triPoly: PolyHandle;
			pat: pattern;
		penSize(2, 2);
		for i := 1 to nplanes do
				np := i;
				for j := 1 to npp[np] do
					zz[j] := zp[ind[np, j]];
					xx[j] := round(xp[ind[np, j]]) + xc;
					yy[j] := -round(yp[ind[np, j]]) + yc;
				np1 := npp[np] + 1;	{Close polygon.}
				xx[np1] := xx[1];
				yy[np1] := yy[1];
				tripoly := OpenPoly; {Open polygon}
				moveto(xx[1], yy[1]);	{ structure.}
				for j := 2 to np1 do
					lineto(xx[j], yy[j]);

{* * * * * * * * * MAINPROGRAM * * * * * * * * * * * * }
{Program to decode menu selections and jump	}
{to the appropriate worker routines.			}
	done := false;
	DiagMat(M, nd);
	repeat		{until done}
		repeat	{Loop until control button pushed.}
			repeat{Loop to highlight}
				GetMouse(Pt);	{ active button-box}
				for i := 1 to Nctrl - 1 do
					if i in draw then
							if PtInRect(Pt, Box[i]) then
								active := i;
				while PtInRect(Pt, Box[14]) do
					begin				{Read angle}
						GetMouse(Pt);	{box valuator}
						xm := Pt.h;
						ym := Pt.v;
						moveto(xm, box[14].top + 2);
						lineto(xm, box[14].bottom - 2);
						angle := (xm - 100) div 2;
							 box[15].bottom - 5);
			until Button;
			foreColor(magentaColor);	{Color active}
			paintRect(box[active]);	{button}
			ForeColor(BlueColor);		{Reframe it.}
			if active in rotate then	{Assign}
				case active of	{ selected axis.}
						axis := 'x';
						axis := 'y';
						axis := 'z';
			if active in projection then	{Assign }
				case active of  {selected projector.}
						projector := 'L'; 	{Parallel case}
						projector := 'O';	{Oblique case}
						projector := 'P';	{Perspective}
		until active in control;
		case active of
			10: 		{Concatenate rotations.}
					setRot(axis, angle, nd, R);
					transformRot(R);  {Perform rotation}
				begin			{Perform projection}
					repeat		{Loop while marveling} 
					until button;	{at output.}
					SetRect(Window, 100, 0, 550, 400);
				done := true;
	until done;



The chapter begins by highlighting new transformations possible in 3D which were undefined in 2D. These include rotation about an arbitrary axis and the parity operations. In addition, the viewing pipeline now requires an additional operation, projection, for transforming 3D scenes onto 2D viewing planes. Two modes for viewing 3D scenes from any desired viewpoint includes the fixed world/movable camera and the fixed camera/movable world paradigms. Since the world is easily moved by the homogeneous coordinate formalism, the latter mode was selected for the viewing transformations presented in this chapter.

Before objects can be transformed, however, they must be represented. A range of representations was summarized, and the polyhedral representation high-lighted as a compromise offering simplicity of data structure with reasonable visual realism. The advantage of using hierarchical data structures in scene representations was discussed. Transformations in 3D are provided naturally through extensions of the homogeneous coordinate matrix representation. The only additional complexity arises from the need for three independent rotation matrices for rotations about the three coordinate axes and a seven-matrix composition for rotation about a general axis. Projection was illustrated by describing the three principle classes-parallel, oblique, and perspective-graphically and mathematically. The projection operations are readily formulated as transformation matrices within the homogeneous coordinate representation. The algorithm Project was developed to demonstrate the three classes of projections within a user-friendly GUI environment providing three axes of rotation.