Chapter 5 

Two-Dimensional Graphics -

Representation

M. Firebaugh 

© Wm. C. Brown Communications, Inc. 

 

The real task of computer graphics is that of describing graphical entities.

J. W. Wendorf




As we move from 1D to 2D graphics, we would like to do so in the context of representation issues. Representation is at the heart of computer science. Finding the optimal representation for a problem is often the key to its solution. By representation, we mean the description of graphical objects. "Graphical objects" implies an object-oriented description-both the data structures themselves as well as the operations allowable on the data must be defined. Representation occurs at various levels of abstraction. At the highest level, we have the purely mathematical representation of vectors, Xi, transformation matrices, Mj, and the relationships between them. The next lower-level of abstraction is the algorithmic level in which the sequence of operations required to solve the problem is specified. Finally, at the lowest level, we have the program which implements the algorithm. At this level actual data structures must be specified and specific plotting routines called.

We start the discussion of 2D graphics by introducing the more abstract, mathematical representations of graphical objects and conclude it with a chapter on issues of implementation. However, before looking at how points, lines, and so on are represented, it helps to note how a 2D environment differs from that of 1D.

Efforts to put computer graphics on a firm mathematical foundation constitute much of the academic research in computer graphics. A particularly good summary of these efforts is given by Eugene Fiume in the book based on his thesis research.

 

Moving from 1D to 2D

In the preceding chapter, a 1D environment was used to develop the representation of graphical objects and the operations which are possible on them. Matrix techniques for operating on homogeneous coordinates were introduced for combining scaling and translation transformations, and the method was shown to be easily extensible for more complex transformations such as scaling about a point and reflection about a point. The viewing transformation for mapping a scene onto a screen was described in terms of windows, viewports, and the clipping operation.

In the next two chapters, we extend these ideas to two dimensions. However, as suggested in Chapter 4, when we turn the dimension meter up from 1 Æ 2, we expect to observe the following:

Other complexities arise as we click the dimension dial from N = 1Æ 2. Lines are no longer constrained to lie along a single axis, but instead can wander about the whole x-y plane, requiring the description of a new object, the curve. Objects, which in 1D required n numbers to specify, now require 2n numbers. In a Cartesian coordinate system, points are specified by the pair, (x,y). Straight line segments require the 4-tuple, (x1,y1,x2,y2). Finally, increased complexity appears in the clipping algorithm since line segments may now be totally inside, totally outside, or straddling the edge of the clipping window.

The beauty of the matrix representation will become apparent as we attempt to specify the operations (transformations) allowed on 2D objects. Just as 2 ¥ 2 matrices were required to handle the general 1D transformation, the affine transformations possible in 2D require 3 ¥ 3 matrices. It is a relatively simple task to determine the appropriate matrix.


(a)

(b)

(c)

(d)

Figure 5.1
Some 2D graphics primitives. (a) point defined as (x,y) pair, (b) line defined as the (x,y) coordinates of its end points, (c) curve defined as x and y as a function of the parameter, t, and (d) polygon defined as vertex points.


Representation of 2D Objects

Before looking at what operations are allowed on objects in 2D space, we ought to consider just what classes of objects we are interested in and how they can be represented. A little thought leads to the following categories:

In Figure 5.1, we show the four primitive objects and the parameters that specify them. Note that points are no longer simple 1D scalar objects but rather vectors requiring the 2D coordinate pair, (x,y), as shown in (a). To understand the term vector, imagine an arrow with a head at point (x,y) and a tail at (0,0). Clearly, a line is well-specified by giving the coordinate pairs of its end points as shown in (b).

 



(a)

(b)



(c)



(d)

Figure 5.2
Four composite 2D graphical objects. (a) A scatter plot of the concentration of CO2 in parts per million (volume) in excess of 300 vs year, (b) Scanned image of FIGS author climbing Glacier Peak, c)US Map, and d) Electron density plot of Niobium trimer.


 

Curves are somewhat more problematic to specify. If y is a single-valued function of x, such as that shown in (c) (i.e., no loops, circles, and so on), the generally accepted representation is to give y = f(x), where f(x) is some mathematical function of x. However, a more general representation, which we discuss below, is to specify both x and y as a function of a third, independent parameter, t. This parametric representation has the virtue of generality-circles, ellipses, spirals, and loops are represented with the same formalism as simple functions of x.

Planar regions may be represented in 2D by a variety of techniques. They may be generated as the interior surface bounded by analytic functions such as circles, ellipses, and lines. They may be specified as the interior surface of a closed polygon bounded by a set of lines; primitives we already know how to specify. They may be specified as the interior surface of a closed figure bounded by a series of connected parametric curve segments such as that shown in (c). Finally, they may be specified by the interior surface of a polygon bounded by lines connecting a set of points as shown in Figure 5.1 (d).

In Figure 5.2, four composite 2D images are shown.

Note the increasing complexity of the composite objects in Figure 5.2. Graph (a) is composed essentially from lines and text with a round-cornered rectangle pasted on as title. Image (b) is an electronic reproduction of a photographic reproduction of a real mountain scene. It is a PICT file generated by a color scanner. Map (c) is a collection of objects (states), each of which is drawn as a polygon filled with a color. Finally, synthetic image (d) is a supercomputer simulation of the electron density of three Niobium atoms in the process of ionization. Although both (b) and (d) represent 3D objects at a fundamental level, we consider (b) to be a simple 2D photograph and (d) to be a 2D slice through a 3D object.


Mathematical Representations

Two levels of abstraction are available for representing 2D graphical primitives. The first we shall call a scalar representation; the second is called a vector representation. Another way of distinguishing these two representations is by considering the scalar representation as algebraic and the vector representation as geometric.

 

Points

Figure 5.3 illustrates both the scalar and vector representation of a point. Note that the scalar representation does require a pair of scalars to represent the point in 2D, and the abstract position vector, P, requires specification of its two components for any implementation. (Note: vectors are designated by boldface characters.)




Figure 5.3
Two equivalent representations of a point in 2D. The scalar representation, (a), requires an n-tuple (n = 2) of scalar numbers. The vector representation, (b), defines the point more abstractly as the vector, P, whose components must be specified before any computation.


 

 

The homogeneous coordinate system representation equivalent to both of these representations is the 3-element row vector

 

Lines

Lines may be defined in several ways. Two familiar algebraic schemes include the slope-intercept form:

and the two-point form:

Note that both Equations 5.1 and 5.2 may be rewritten in the more general form,

The parameters, m and b, of the slope-intercept representation are shown in Figure 5.4(a).

The general algebraic form, [5.4], may be generated by setting to zero the scalar product of the row vector, P, with a column vector whose elements are the A, B, C coefficients of [5.4]. Using row vector, P = [x, y, 1], and column coefficient vector, C, given as

Equation 5.4 is the 1 ¥ 1 scalar product of these row and column vectors:

Thus the algebraic formulation of a line has a very concise vector representation. Points are represented as 3-element row vectors, and lines are represented by 3-element coefficient column vectors.

 


Figure 5.4
Two representations of a line. (a) Slope-intercept algebraic form: y = mx + b, (b) Parametric, vector form: L = P + t T, where t = parameter.


 

The parametric vector representation of a line assumes that every point on the line is a vector, L, from the origin to that point. The parametric vector equation for L in terms of a vector point P on the line, a tangent vector T parallel to the line, and an arbitrary parameter, t, is given as

This parametric vector representation has a very simple geometric interpretation shown in Figure 5.4(b). Equation 5.6 says that you can get to point L on the line by first going to point P on the line and then moving t units along the vector T tangent to the line. For the case shown in Figure 5.4(b), where T is a unit vector, the value of the parameter, t, would be approximately 2. To get to those points on the line left of P, the value of t would be negative.

Note that the parameter, t, acts like a "control knob" telling us how far away from point P we are on the line. When t = 0, we are precisely at point P. As t is turned up to 1, we get to the tip of the T vector in Figure 5.4(b). As it moves on towards 2, we approach the tip of the L vector. As it moves past 2 toward infinity, we move along the line to the right toward infinity.

The parametric vector representation is a convenient way to represent line segments as well. If the line segment is defined by two end point vectors, P1 and P2, the tangent vector, T, in Equation 5.6 can be replaced by the unnormalized vector, (P2 - P1):

Then, the vector L sweeps uniformly along the line from P1 to P2 as the parameter t changes from 0 Æ 1.

The connection between the Cartesian coordinates, (x,y), of vector L and the coordinates and components of the defining vectors, P and T, are given by:

The advantage of the vector formalism is the ease with which relationships between points and lines may be computed. Consider the following useful tests for classifying points P and lines C.


Inside test: P · C < 0

Multiplying P = [x, y, 1] by C = [A, B, C]T and applying the inequality gives the result

which is the algebraic expression for points (x,y) which lie inside (i.e., on the same side as the origin) of a line whose coefficients are (A, B, C). Note that the "T" symbol means transpose-interchanging rows and columns.


Outside test: P · C > 0

Similarly, the test for "outside" points (i.e., those on the other side of the line from the origin) is given by


Normal lines test: T1 · T2 = 0

Recall from geometry that two vectors, V and U, are normal lines (perpendicular to each other) if U · V = 0. Writing the vector representation of two lines as

A little thought leads to the conclusion that the elements of this representation important for testing perpendicularity are the tangent vectors, Ti. The test for two lines being normal then reduces to the simple expression

In terms of Cartesian components, this product yields:

This may be rewritten as

which is equivalent, in the slope-intercept form (y = m x + b), to

and, since m = -(A/B) in the Ax + By + C = 0 form, the relationship of the coefficients for two perpendicular lines may also be written:


Distance from P to C

The distance, d, from a point P to a line C is given by the expression:

This result follows from the following arguments. Given the point: P = [xp,yp,1] and the line: C = [Ao, Bo, Co]T Equation 5.16 can be used to write the following equation for the line normal to C that contains P:

The normal line intersects the original line at point (xi,yi), at which point the equation of the original line can be written:

These last two equations may be solved for the point of intersection to give

Since the point of intersection is the point on the line closest to point P, the distance from P to the original line is simply

Substituting Equations 5.20 and 5.21 into 5.22 and doing a bit of algebra yields

which is an expanded form of the desired result, Equation 5.17.

 

Polygons

The third 2D primitive we discuss in this chapter is the polygon. Polygons are valuable graphical objects for a number of reasons. First, they provide powerful modeling tools with a simple data structure format. That is, n-sided polygons can be used to model triangles, squares, pentagons, stars, and even circles and ovals in the limit of n Æ large. Very irregular objects, such as the outline of an animal or a human face, may be represented accurately by the appropriate n-sided polygon with n sufficiently large.

Secondly, collections of polygons have proven to be extremely effective 3D modeling tools. That is, any solid object may be represented by the seamless conjunction of three- and four-sided polygons (triangles and rectangles). This is an important tool in CAD and the essential tool of finite element analysis. Since these techniques are becoming standard techniques for engineering design and scientific modeling, it is important to study the nature of polygons and their representations.

How might we represent a generalized polygon? First, it helps to define a polygon as a closed, n-sided, planar, simply-connected geometric figure. This definition and some of the problems it helps avoid are illustrated in Figure 5.5.

Several schemes are possible for representing polygons. Each has its own particular advantages and drawbacks. Three possible representations of an n-sided polygon include:

1. An ordered list of n+1 (x,y) vertices (absolute) - This representation has the virtue of the simplicity of a "dot-to-dot" algorithm, but suffers slight redundancy in that the first and last data pair are identical. This redundancy can be useful, however, in flagging the last point of a general, n-sided polygon.

2. An ordered list of n+1 (x,y) vertices (relative) - Here the first vertex is given in absolute coordinates, and all successive vertices are in relative coordinates. This is a more object-oriented structure defined by a single position (coordinates (x1,y1)) and internal structure (relative coordinates (xi,yi)). It has the advantage of increased efficiency since relative coordinates, as smaller, off-set numbers, generally require fewer bits to represent them. The cost of this scheme is a somewhat increased complexity over absolute schemes.

3. A set of n lines - This representation is a higher level of abstraction that correctly reflects our intuition of a polygon consisting of a collection of edges. It has the advantage of flexibility, generality, and extensibility. That is, the lines need not be ordered, and any line, for example, may be deleted and replaced by two or more additional lines which close the figure. These advantages are bought at a dear price in memory requirements, however, since each line requires two vertices for representation. This doubles the storage requirements over schemes (1) and (2).



(a)

(b)

(c)

(d)

Figure 5.5
Legal and illegal polygons. (a) Although an n-sided figure, it is not a polygon because it is not closed. (c) Although constructed from five lines, it is not a polygon because it is not simply-connected. Both (b) and (d) are legal polygons.


 

The particular choice of representation will depend upon the application. In this book, we use both scheme (1) and (3).

 

Curves

The intuition we developed in calculus suggests that it is possible to represent any geometric object to any desired degree of accuracy by using enough straight line segments, or, at a higher level, by some combination of polygons. This concept, in fact, is the foundation for finite element analysis in which complex objects are decomposed into many primitive polygons or polyhedra. While this intuition is correct and finite element techniques have proven to be powerful analysis tools, from a mathematical point of view these techniques are embarrassingly crude and often computationally inefficient. Curves and curved surfaces have the tremendous advantage of more realistic modeling of graphical objects using far fewer parameters than required by finite element techniques.

Consider, for example, the two contrasting representations of a circle.

Furthermore, many of the objects of nature are smooth curves or have smooth surfaces with well-behaved mathematical properties. Examples include the trajectories of particles moving in force fields, an egg, a raindrop, and the human face. Such smooth, "organic" forms have greatly influenced the design of human artifacts as a survey of silverware, furniture, and automobile design will verify. In fact, the motivation for some of the best work in computer modeling of smooth surfaces was specifically to develop more flexible and powerful tools for automobile hood design. It was at Citroën where P. de Casteljau developed his (unpublished) algorithms for designing and displaying smooth surfaces in the early 1960s. And it was at Renault where Pierre Bézier developed his UNISURF CAD system and the curves which bear his name.

Before looking in more detail at the mathematical representation of Bézier curves, it is helpful to consider the question: Why use curves? There are two quite distinct answers which are determined, again, by the all important representation issue.

A final observation concerns the constraints relating points and curves, and it applies equally to both scientific and the design applications. The two classes of constraints are:

 

Parametric Cubic Polynomial Representations

A number of schemes have been developed for relating curves and points. In scientific curve fitting, the user typically selects some trigonometric or polynomial series as the basis for representing the data. In computer aided geometric design, on the other hand, the choice has consistently been a parametric polynomial representation, generally with polynomials of order two, three, or four. Cubic curves (order 3) provide a compromise between simplicity and computational efficiency on the one hand, and smoothness and sensitivity to points on the other. The major forms of parametric polynomials are the Hermite, Bézier, B-spline, and Beta-spline.

The advantages of a parametric cubic representation include the following.

 

Bézier Formulation

We can express x and y as cubic functions of parameter t by writing:

Four points are selected to control the curve as follows:

The four points and the controlled curve are shown in Figure 5.6.

 

 


Figure 5.6
Bézier curve with control points. Vector T1 determines the direction of the curve leaving point P1. Vector T2 determines the direction the curve enters point P4. Longer vectors pull on the curve more than do shorter vectors. Point P1 corresponds to t = 0, and point P4 to t = 1.


 

Note that Equation 5.25 may be represented in vector form as:

where

and

and similarly for y.

Now the problem becomes: Given a set of control points, P1,P2,P3, and P4, how can we express the unknown coefficients, (a, b, c, d)x and (a, b, c, d)y in terms of the control point coordinates, (x1,y1), (x2,y2), (x3,y3), and (x4,y4). From now on, all arguments applied to x will apply similarly to y, so we shorten the discussion by solving only the x equations.

We need four equations to solve for these four unknowns, and we obtain them by applying boundary conditions to the curve and its derivative at points P1 and P4. The derivatives are taken with respect to the parameter, t. Point P1 corresponds to the parameter t = 0, and P4 is reached when t = 1. The derivatives at the two end points are defined in terms of the four points as:


The number "3" is chosen as an arbitrary scale factor. The parametric form itself gives:

and since

direct substitution of Equations 5.29 and 5.30 into 5.25 and 5.31 yields:

These four equations in four unknowns are readily solved to give:

This may be concisely expressed in matrix form as:

Substituting Equation 5.34 back into the original matrix representation, 5.26, gives

Equation 5.37 is a very concise parametric form for x(t) in terms of the controlling point coordinates. Note that the parametric form for y(t) would be identical with y's substituted for x's in 5.37 and 5.38. This form is very convenient for implementation in a computer algorithm.


Blending Functions

An equivalent form for representing Bézier curves is through blending functions. This formulation may be written:

The Bi(t)s are the Bernstein polynomial blending functions. Since [5.25] and [5.39] are equivalent representations, we should be able to derive the blending functions directly from equation 5.37. Multiplying it out and simplifying, we get:

from which, by comparison to 5.39, we can read off:

Note how the blending functions, Bi, act as weighting functions for the four control point coordinates, xi. That is, at point 1 (where t = 0), point 1 controls completely since B1 = 1 and all the other Biís are zero. As t approaches 0.33, point 2 dominates the curve as the influence of point 1 is dying out and that of point 3 is growing but not yet strong. As we move through t = 0.67, point 3 dominates and the influence of points 2 and 1 is dying and dead, respectively. Finally, at point 4, B4 approaches 1 and all the other blending functions fade away. This behavior can be better visualized by examining the blending function behavior as a function of t as shown in Figure 5.7.

 


Figure 5.7
Blending functions. These functions, called Bernstein polynomials, serve as weighting functions to weight points P1, P2, P3, and P4, respectively.


 

Pascal Program Bezier


program Bezier;
{Program to read in four Bezier control points: } 
{ P1 = first point } 
{ P2 = second point, controlling slope at first } 
{ P3 = third point, controlling slope at fourth } 
{ P4 = fourth point, end of section } 
{and compute and plot resulting parametric } 
{cubic curve. } 
type 
mat = array[1..4, 1..4] of real; 
var 
Px, Py, Cx, Cy: mat; 
Tv,Mb,x, y: mat; 
i, j, k, xx, yy: integer; 
window: rect; 
t, dt: real; 
P:point;
{************* MatMlt **************} 
procedure matmlt (var d, a, b: mat; n, m: integer); 
{ Program to calculate matrix product: } 
{D=A*B, where } 
{ A = n x 4 input matrix } 
{ B = 4 x m input matrix } 
{ D = n x m output matrix } 
var 
i, j, k: integer; 
sum: real; 
temp: mat; 
begin 
for i := 1 to n do 
begin 
for j := 1 to m do 
begin 
sum := 0.0; 
for k := 1 to 4 do 
sum := sum + a[i, k] * b[k, j]; 
temp[i, j] := sum; 
end; 
end; 
for i := 1 to n do 
for j := 1 to m do 
d[i, j] := temp[i, j]; 
end;

{+++++++++ Get-Points ++++++++++++} 
procedure getPoints; 
var 
i, x, y: integer; 
begin {First set up drawing window} 
setRect(window, 30, 30, 400, 300); 
setDrawingRect(window); 
showDrawing; 
penSize(2, 2); 
moveto(40, 20); {Print heading } 
textSize(18); 
textFont(2); 
writeDraw('Bezier Parametric Cubic Curve'); 
textSize(12); 
ForeColor(409); {Set pen to blue.} 
{Loop to read 4 points.} 
for i := 1 to 4 do 
begin 
setRect(window, 90, 30, 255, 50); 
eraseRect(window); 
moveto(100, 45); 
writeDraw('Please click in point', I : 3); 
frameRect(window); 
repeat 
getMouse(P) 
until button; 
repeat 
until (not button); 
Px[i, 1] := P.h; 
Py[i, 1] := P.v; 
setRect(window, (P.h - 2), (P.v - 2), (P.h + 4), (P.v + 4)); 
{Draw point as 6-pixel diameter circle.} 
paintOval(window); 
end; 
end;

procedure set_BezMat; 
begin {Simplest to go brute force.} 
Mb[1, 1] := -1; 
Mb[1, 2] := 3; 
Mb[1, 3] := -3; 
Mb[1, 4] := 1; 
Mb[2, 1] := 3; 
Mb[2, 2] := -6; 
Mb[2, 3] := 3; 
Mb[2, 4] := 0; 
Mb[3, 1] := -3; 
Mb[3, 2] := 3; 
Mb[3, 3] := 0; 
Mb[3, 4] := 0; 
Mb[4, 1] := 1; 
Mb[4, 2] := 0; 
Mb[4, 3] := 0; 
Mb[4, 4] := 0; 
end;
{******** MAIN PROGRAM **************} 
begin 
set_BezMat; 
getPoints; 
{Calculate coefficient vector} 
matmlt(Cx, Mb, Px, 4, 1); 
matmlt(Cy, Mb, Py, 4, 1); 
{Now trace out curve in 101 steps } 
t := -0.01; 
xx := round(Px[1, 1]); 
yy := round(Py[1, 1]); 
moveto(xx, yy); 
ForeColor(137); {Set pen to magenta.}
for i := 1 to 101 do 
begin 
t := t + 0.01; 
Tv[1, 4] := 1; 
Tv[1, 3] := t * Tv[1, 4]; 
Tv[1, 2] := t * Tv[1, 3]; 
Tv[1, 1] := t * Tv[1, 2]; 
matmlt(x, Tv, Cx, 1, 1); 
matmlt(y, Tv, Cy, 1, 1); 
xx := round(x[1, 1]); 
yy := round(y[1, 1]); 
lineto(xx, yy); 
end; 
end. 


 

The purpose of Bezier is to illustrate the Bézier control paradigm with a simple but convenient program. The user is prompted for four control points and the program then constructs the resulting cubic curve using the matrix representation developed above.

Let's examine some output from program Bezier. In Figure 5.8, the curve starts at point 1 in the lower lefthand corner and heads towards point 2 immediately above it. It then veers off and curves into point 4, approaching it from the direction determined by point 3 near the bottom of the window.

 


Figure 5.8
Bezier parametric cubic curve. Note how second and third points determine the tangents at points one and four.


 


Figure 5.9
Another Bézier curve. Note the capability of loops.


 


Figure 5.10
Bézier curves are capable of cusps when control points are arranged in an "X."


 

In Figure 5.9, a fish-like object is generated from point 1, the lower, lefthand point, with the curve pulled towards point 2 in the upper, righthand corner. As the influence of point 3 in the lower, righthand corner is felt, the curve loops around and heads for point 4, the upper righthand point. Figure 5.10 demonstrates that Bézier curves are capable of generating sharp cusps as well as smoothly-rounded curves.

These three curves demonstrate the great flexibility and power of Bézier curves as design tools.


More Complex Curves

In actual practice, a designer generally requires curves with more complexity than that possible with a single Bézier segment. Three available approaches include:

  • The use of higher order Bézier curves - By increasing N the polynomial order, from three to four, five, or higher, the user gets five, six, or more control points with which to mold the curve between the first and last points. The price paid is that of more computational complexity.

  • The joining of multiple cubic Bézier segments - It is relatively simple to make the first point of a second Bézier segment coincide with the last point of the first segment. This guarantees continuity of the curve itself (called C0 continuity. By taking care to align point P3 of the first segment with point P2 of the second segment and both co-linear with the bracketed point, the user can achieve a smooth join, that is, one with continuity in the slope of the curve at the join point (called C1 continuity).

  • The use of cubic B-splines - The B-spline curve can be used to fit n points where n is 3 4. It is even smoother than Bézier curves by having continuous curvature (C2 continuity) as well as C0 and C1 continuity. The curve goes through the first and last point but is only pulled by the intermediate control points. The mathematical formalism is identical with that of Bézier curves, with Ms replacing Mb and the point vector composed of the elements (Pi-1, Pi, Pi+1, Pi+2). The full matrix representation for the coordinate x is then given as:
  • Thus, as t goes from 0 to 1, the set of control points moves from i = 2 to i = n-2 for an n control-point set. That is, the local control-point set starts at points (1,2,3,4) for t = 0 and then moves through (2,3,4,5), (3,4,5,6), and so on until it reaches (n-3, n-2, n-1, n) for t = 1. Each set corresponds to a different cubic fit with the four nearest points controlling the curve. An equivalent reresentation applies for y(t).


    Representing Actions - Transformations in 2D

    As noted in Chapter 4, when we move from N dimensions to N+1 dimensions, new transformations are possible which were undefined and unimaginable in the lower dimension. The most important new transformation, which is undefined in 1D but opens up in 2D space, is rotation. The homogeneous matrix formalism previously introduced is easily extended to represent both the familiar translation and scaling operations as well as the new concept of rotation. Let us consider these in sequence.

    All 2D homogeneous coordinate transformations are represented by the familiar equation:

    It is useful to distinguish between the following four classes of transformations:

    Primary transformations (translation, scaling, and rotation) are so common and useful that they appear in nearly every drawing and drafting program. Secondary transformations (reflections and shears) are less common and often used for special effects. Finally, complex transformations are concatenated from a series of primary, secondary, or other complex transformations to perform particular operations.


    Primary Transformations

    The following three transformations perform the operations of shifting, magnification, and rotation of objects.

    Translation

    For translating an object a distance Tx along the x-axis and Ty along the y-axis, the matrix M is the T matrix, defined as:

    Substituting T for M in Equation 5.45 and multiplying gives the standard linear translation equations:

     

    Scaling

    Generalizing the 1D scaling transformation to 2D involves defining separate scale factors, Sx and Sy, for independent scaling along each axis. The M matrix of Equation 5.45 then becomes the S matrix, defined as:

    Substituting S for M in Equation 5.45 and multiplying gives the standard linear scaling equations:

    (a)
    (b)
    (c)

    Figure 5.11
    Three primary transformations: (a) linear translation by the vector T = Tx·x + Ty·y, (b) linear scaling by S = Sx = Sy = 2, and (c) rotation about the origin by q = 45°.


     

    Rotation

    A very useful transformation in drawing, drafting, and CAD applications involves rotating an object by some arbitrary, user-defined angle, q. This transformation is performed by redefining the generalized M matrix as the rotation matrix, R, where

    Substituting R for M in equation 5.45 and multiplying gives the standard rotation equations, familiar from trigonometry:

    These three primary 2D transformations are illustrated in Figure 5.11.

    Two items of note are apparent in [5.11]. First is that scaling involves both a magnification (as expected) and a translation (sometimes an unwanted side-effect). Second is that a pure rotation as specified by Equations 5.51 is defined as taking place about the origin, (x,y) = (0,0). Therefore, unless the object to be rotated is already centered on the origin, rotation will cause it to swing in a large circle centered at the origin. The frequently desired rotation in place may be accomplished by composing a series of transformations to produce a single complex transformation.


    Secondary Transformations

    Two additional classes of transformations, that may be useful in certain applications, are reflections and shearing. Reflections are special cases of the scaling transformation with Sx, Sy, or both having the value -1. Shearing may be considered as a special case of scaling in which the scale factor in the direction of one axis is a function of the coordinate of the other axis. Letís look at these two transformations in more detail.

     

    Reflections

    We will consider three reflections here. First, an object may be reflected across the x-axis (all yi Æ -yi). Second, an object may be reflected across the y-axis (all xi Æ -xi). Third, an object may be reflected through the origin (all xi Æ -xi and all yi Æ -yi). These three reflections are neatly accomplished by the three mirror reflection matrices, Mx, My, and Mxy, given by:

    These three reflection transformations are illustrated in Figure 5.12.

    (a)
    (b)
    (c)

    Figure 5.12
    Reflection transformations. (a) Reflection through x-axis, Mx, (b) Reflection through y-axis, My, (c) Reflection through origin, Mxy.




    Shearing Transformations

    Shearing transformations are a particular form of translation where the offset factor is position dependant. The most common dependencies involve x-shear and y-shear defined by the following transformation matrices:

    When substituted for M in Equation 5.45 and multiplied out, these matrices lead to the following sets of transformation equations:

    and

    In Figure 5.13, the effects of shearing transformations along the two axes are shown. The x-axis shear transformation is easily demonstrated physically by laying a paper-back book on a table, holding the bottom cover stationary with respect to the table top, and pressing the top cover in a direction perpendicular to the binding direction with the other hand. The end view of the book will illustrate the x-axis shearing transformation.

     

    (a)
    (b)

    Figure 5.13
    Shearing transformations. (a) along the x-axis; (b) along the y-axis.




    Complex (Composed) Transformations

    The final class of transformations considered in this chapter are those complex transformations generated by composing (multiplying) two or more elementary transformation matrices. Such complex transformations are required to perform the useful operations of scaling about a point, reflection about a line, and rotation about a point. Consider how we might, for instance, magnify an object in place, i.e., scale it about some internal point.

    Scaling About a Point

    As Figure 5.11(b) clearly indicates, simple scaling produces not only magnification but also a translation in general (unless the object happens to be at the origin). Since the goal is frequently to simply magnify the object in place, we must introduce more complexity into the transformation process.

    To accomplish the goal of scaling about a point, consider the following algorithm which is illustrated in Figure 5.14.

    This sequence of operations can be expressed very cleanly through the composition of the three transformation matrices within the homogeneous coordinate system. We simply rewrite Equation 5.29 as

    where the composed matrix, Mc, is computed by

    Computing this gives the matrix,

    (a)

    (b)
    (c)
    (d)

    Figure 5.14
    Scaling about a point. To scale the icon shown in (a) by 400 percent about center point, (xc,yc), we perform the sequence: (b) translate by T(-xc,-yc); (c) scale by S(4,4); finally, translate back by T(+xc,+yc).


     

    As a concrete example, consider Figure 5.14 where Sx = Sy = 4 and we measure (xc,yc) to be (10,5). The composed matrix then reduces to

    Solving Equation 5.59 for the transformed coordinates in terms of the original ones gives the set of transformation equations

    Rotation About a Point

    Nearly identical arguments can be made for the operation of rotation in place or rotation about a point.

    The algorithm, illustrated in Figure 5.15, becomes:

    (a)
    (b)
    (c)
    (d)

    Figure 5.15
    Rotation about a point. Select a point, (xc,yc), as in (a). Then, shift the object to the origin by T(-xc,-yc) as shown in (b). Next, rotate by desired angle, q, as shown in (c). Finally, shift back to the original position as in (d).


     

    The mathematics for the complex transformation of Figure 5.15 closely resembles that for scaling about a point, with the rotation matrix, R(q), replacing the scaling matrix, S(Sx,Sy). Rotation about a point may be represented as:

    Computing this gives the concatenated matrix,

    Although this looks a bit forbidding, knowing (xc,yc) and q allows us to quickly compute the composed rotation-about-a-point matrix, Mc, as a simple 3 ¥ 3 numerical matrix. Again, the reader should appreciate the computational efficiency of representing a complex, composition of three transformation matrices by a single, numerical array.


    Inverse Transformations

    For each forward transformation matrix, M, discussed above, there is an inverse transformation matrix, M-1. What M-1 means, in a graphical sense, is that whatever transformation M does, M-1 "undoes" it. If M shifts an object 4 units to the right, M-1 shifts it 4 units to the left; if M magnifies an object by a factor of 2, M-1 de-magnifies it by a factor of 0.5; and if M rotates an object by 45°, M-1 rotates it by -45°. It is apparent that the combined operation of applying M followed immediately by applying M-1 will leave an object completely unchanged.

    From a matrix algebra point of view, we say that the composition of M and M-1 is the identity matrix, I. That is, M M-1 = I. The identity matrix is a square matrix with 1s on the diagonal and 0s elsewhere. Its effect, when multiplied with any other conformal matrix, is to return the original matrix. Therefore, we have X I = I X = X, a result which agrees with the word proof above.

    A little thought on the nature of translation, scaling, and rotation leads to the following inverse matrices for the primary and secondary transformations.

    Inverse Translation:

    Inverse Scaling:

    Inverse Rotation:


    Inverse Reflection Transformations:

    These matrices are their own inverses.


    Inverse Shearing Transformations:

    The reader may wish to verify that these are correct representations of the inverse transformations by checking that M M-1 = I. The inverse transformation matrix is particularly useful in implementing the Undo feature of drafting, drawing, and CAD programs.


    Pascal Example - TRANSFORM

    To illustrate the role of representation of data structures and the implementation of the matrix transformations, we conclude this section with a Pascal example, TRANSFORM. In TRANSFORM, we must consider and resolve several important representation issues which greatly influence algorithm design and implementation.

    The goal of TRANSFORM is to build a minimal system for performing the primary transformations on reasonably complex and general line drawings. The representation issues which arise immediately include:

    As a realistic object to represent by line drawing, we selected the sail plan from a boat featured in the magazine, Sailing. A grid was superimposed on the schematic and the major features of the boat were outlined by a set of vertices connected by straight lines. The (x,y) location of each vertex was measured and tabulated.

    At this point, there were several options. We could have used a brute force approach and represented the boat by a set of lines, each specified by the quadruple, (x1,y1,x2,y2), of its end points. Or, we could have attempted to force some structure on the drawing by representing each feature of the boat by a polygon. However, this would have required re-digitizing the drawing with this structure as a constraint, and the representation would have been unnatural for long, thin elements such as the forestay and backstay.

    The compromise data structure chosen was a file of ordered triads, (xi,yi,ci), representing the coordinates of each vertex along with a control code, ci. The control code is given as:

    This scheme is equivalent to a list of (non-closed) polygons in which one jumps to the first point on a polygon labeled with the ci = 0 code and continues plotting the polygonís sides using vertices labeled with the ci = 1 code. The scheme eliminates the considerable redundancy of points inherent in a set-of-lines representations and quite naturally matches the mapping technique by which the original vertex table was generated. The entire sailboat figure is represented by the 48 vertices listed in Table 5.1.

     

    Table 5.1

    Sailboat Encoded Vertices

    xi yi ci
    xi yi ci
    xi yi ci
    xi yi ci

    4.5 0.35 0

    14.75 0.35 1

    15.55 0.62 1

    16.0 0.85 1

    16.65 1.2 1

    17.0 1.55 1

    17.3 1.90 1

    15.0 1.65 1

    11.3 1.40 1

    8.0 1.325 1

    5.0 1.20 1

    1.50 1.35 1

    (i = 1 Æ 12)

    1.85 1.10 1

    4.5 0.35 1

    11.6 1.50 0

    11.2 21.25 1

    11.1 21.25 1

    11.3 1.5 1

    15.55 1.75 0

    11.35 16.2 1

    11.1 2.65 1

    15.55 1.75 1

    11.3 2.6 0

    3.7 2.6 1

    (i = 13 Æ 24)

    3.7 2.85 1

    11.3 2.85 1

    11.1 21.25 1

    9.2 17.1 1

    1.5 1.35 0

    11.1 21.25 1

    4.8 1.22 0

    5.0 1.6 1

    8.6 1.75 1

    11.0 1.75 1

    11.3 1.4 1

    5.0 6.65 0

    (i = 25 Æ 36)

    6.8 6.0 1

    6.3 10.2 0

    8.0 9.5 1

    7.7 13.7 0

    9.5 13.0 1

    9.2 17.1 0

    10.5 16.7 1

    3.7 2.85 0

    5.0 6.65 1

    6.3 10.2 1

    7.7 13.7 1

    9.2 17.0 1

    (i = 37 Æ 48)

     

    The natural Pascal data structure selected for representing matrices was the square array of dimension nd. The nd = 3 arrays required for homogeneous 2D coordinates are easily extended to nd = 4 arrays required for 3D transformations. So all of the matrix manipulation routines of TRANSFORM are easily extended to 3D applications.

    The one nonstandard matrix algebra short-cut employed in TRANSFORM is based on the observation that all of the primary transformations and complex transformations built up from multiplying them can be represented by the form:

    Therefore, the matrix multiplication, = X M, is equivalent to the two algebraic expressions:

    and this is the form used to apply the transformation in routine applyxfn. This allows a somewhat cleaner, more efficient code than possible with vector matrix multiplication.

    The control structure selected for TRANSFORM was a user-controlled loop in the routine, Menu, which requires the user to move in any order among the following transformations and utility functions by use of one-letter keyboard commands.

    Although this set of operations and utilities is minimal, it is capable of providing any transformation possible through composition of the primary transformations.


    Pascal Program Transform


    program Transform;
    { Program to transform graphical objects } 
    { by user-specified translation, scaling, and rotation } 
    {and any concatenation of these three transformations.} 
    const 
    nd = 3; {Dimension of matrices} 
    ns = 100; {Maximum size of object } 
    {(in control points)} 
    VxL = 20; {Viewport left boundary} 
    VxR = 450; {Viewport right boundary} 
    VyT = 20; {Viewport top boundary} 
    VyB = 350; {Viewport bottom boundary} 
    Xo = 50; {X pixel coordinate off-set of origin} 
    Yo = 50; {Y pixel coordinate off-set of origin} 
    type 
    mat = array[1..nd, 1..nd] of real; 
    var 
    dx, dy: real; 
    m: mat; 
    i, j, k, npts: integer; 
    x, y, xp, yp: array[1..ns] of real; 
    c: array[1..ns] of integer; 
    infile: text; 
    pt, xx, yy, cc: integer; 
    ViewPort: rect; 
    {***************************************} 
    procedure applyxfn; 
    {Applies concatenated transform.} 
    var 
    i: integer; 
    begin 
    for i := 1 to npts do 
    begin 
    xp[i] := x[i] * m[1, 1] + y[i] * m[2, 1] + m[3, 1]; 
    yp[i] := x[i] * m[1, 2] + y[i] * m[2, 2] + m[3, 2]; 
    end; 
    end; 
    {************************************} 
    procedure diagmat (var xmat: mat); 
    {Generates the identity matrix} 
    var 
    i, j: integer; 
    begin 
    for i := 1 to nd do 
    begin 
    for j := 1 to nd do 
    xmat[i, j] := 0.0; 
    xmat[i, i] := 1.0; 
    end; 
    end; 
    {*************************************} 
    procedure matmlt (var d, a, b: mat); 
    {Calculate matrix product: D=A*B } 
    var 
    n, i, j, k: integer; 
    sum: real; 
    temp: mat; 
    begin 
    n := nd; 
    for i := 1 to n do 
    begin 
    for j := 1 to n do 
    begin 
    sum := 0.0; 
    for k := 1 to n do 
    sum := sum + a[i, k] * b[k, j]; 
    temp[i, j] := sum; 
    end; 
    end; 
    for i := 1 to n do 
    for j := 1 to n do 
    d[i, j] := temp[i, j]; 
    end;
     
    {*****************************************} 
    procedure display; 
    {Procedure to draw object in viewport} 
    var 
    i, xx, yy: integer; 
    screen: rect; 
    begin 
    ShowDrawing; 
    applyxfn; {Apply concatenated matrix to X vector} 
    for i := 1 to npts do {Draw object} 
    begin 
    xx := VxL + Xo + round(xp[i]); 
    yy := VyB - Yo - round(yp[i]); 
    if (c[i] = 0) then {invisible move} 
    moveto(xx, yy); 
    if (c[i] = 1) then {draw visible line} 
    lineto(xx, yy); 
    end; 
    end;
     
    {**************************************} 
    procedure getobject; {Procedure to select } 
    {and read in data file.} 
    var {Format is (xi, yi, ci).} 
    infile: text; 
    i, pt, cc: integer; 
    xx, yy: real; 
    begin 
    i := 0; 
    open(infile, OldFileName('Transform DataFile?')); 
    repeat 
    i := i + 1; 
    readln(infile, xx, yy, cc); 
    x[i] := xx; 
    y[i] := yy; 
    c[i] := cc; 
    until eof(infile); 
    npts := i; 
    end;
     
    {*************************************} 
    procedure load; 
    begin 
    getobject; 
    diagmat(m); 
    display; 
    end;
    
    
    {**************************************} 
    procedure translate; 
    {Procedure to translate point by dx,dy} 
    var 
    t: mat; 
    begin {Get translation parameters.} 
    writeln('Tx, Ty?'); 
    readln(dx, dy); 
    diagmat(t); {Set up matrix, T } 
    t[nd, 1] := dx; 
    t[nd, 2] := dy; 
    matmlt(m, m, t); {Now compose with M matrix} 
    end;
    {**************************************}
     
    procedure scale; 
    {Procedure to scale X by SX and Y by SY} 
    var 
    sx, sy: real; 
    s: mat; 
    begin {Get scaling parameters.} 
    writeln('SX, SY?'); 
    readln(sx, sy); 
    diagmat(s); {Set up matrix, S} 
    s[1, 1] := sx; 
    s[2, 2] := sy; 
    matmlt(m, m, s); {Now compose with M} 
    end; 
    {***************************************} 
    procedure rotate; 
    {Procedure to set up rotation matrix} 
    const 
    rad = 57.29578; 
    var 
    th: real; 
    R: mat; 
    begin {Get rotation angle, th} 
    writeln('Angle (deg)?'); 
    readln(th); 
    th := th / rad; 
    diagmat(R); 
    R[1, 1] := cos(th); 
    R[2, 2] := R[1, 1]; 
    R[1, 2] := sin(th); 
    R[2, 1] := -R[1, 2]; 
    matmlt(m, m, r); {Concatenate M with R} 
    end;
     
    {***************************************} 
    procedure clear_viewprt; 
    begin 
    SetRect(ViewPort, 0, 0, VxR - VxL, VyB - VyT); 
    EraseRect(ViewPort); 
    end;
     
    {***************************************} 
    procedure menu; 
    var 
    done: Boolean; 
    command: char; 
    begin 
    done := false; 
    setRect(ViewPort, VxR - 20, VyT, VxR + 80, VyB); 
    setTextRect(ViewPort); 
    ShowText; 
    setRect(ViewPort, VxL, VyT, VxR, VyB); 
    setDrawingRect(ViewPort); 
    repeat 
    writeln(' Menu'); 
    writeln('L) Load'); 
    writeln('T) Translate'); 
    writeln('S) Scale'); 
    writeln('R) Rotate'); 
    writeln('D) Display'); 
    writeln('C) Clear'); 
    writeln('Q) Quit'); 
    readln(command); 
    {Now branch to chosen procedure} 
    case command of 
    'l', 'L': 
    load; 
    't', 'T': 
    translate; 
    's', 'S': 
    scale; 
    'r', 'R': 
    rotate; 
    'd', 'D': 
    display; 
    'c', 'C': 
    clear_viewprt; 
    'q', 'Q': 
    done := true; 
    otherwise 
    begin 
    writeln('Bad command'); 
    writeln('Try again . '); 
    readln; 
    end; 
    end; 
    until done; 
    end;
     
    {********** Main Program *************} 
    begin 
    diagmat(m); 
    getobject; 
    menu; 
    end. 
    

    Figure 5.16
    Output of TRANSFORM. The small object on the left is generated by displaying the raw data file given in Table 5.1. It is so small because its world coordinates (cm) are plotted out directly as pixels with no scaling. The larger boat is generated by scaling the model by ten times as indicated in the menu transcript. The menu can be scrolled to recall previous transformations.

     

    Let's examine three cases of output generated by TRANSFORM.

    When the program runs, it requests a data file containing the image. Upon selecting the file SAIL.Dat (Table 5.1) and typing "D", the user observes the tiny object shown in the lower lefthand corner of Figure 5.16. The obvious reaction is "to magnify that thing so we can see what it is." So we type "s" to scale the object and respond with "10 10" when asked for Sx and Sy. Upon displaying this transformation, we get the magnified boat shown in Figure 5.16.

    Various aspect ratios between height and width are easily achieved by making Sx different from Sy as demonstrated in Figure 5.17. This figure also indicates that the shipís direction can be reversed by the reflection operation: Sx = -1.

    Finally, more complex scenes such as the sailboat race shown in Figure 5.18 may easily be composed by combining instances of the basic object under various transformations. The illusion of a 3D scene is created by appropriate scaling and small rotations.

    Note that the screen contains two viewports: a Drawing window in which the objects appear and a TEXT window for the control menu and commands. The scrolling control window provides both a continuous prompt on command options and a complete transcript of preceding commands and transformation parameters. This, frequently, is useful information.

    While TRANSFORM does provide many graphics functions in a relatively compact program, it does have its problems. Among these are the lack of any Undo option for erasing the last object drawn and the use of awkward keyboard control instead of the more natural mouse menu control. These are easily implemented, but were omitted because of program length considerations.



    Figure 5.17
    Reflections and differential scaling output of TRANSFORM. By judicious use of the three primary transformations, interesting special effects are possible.


     


    Figure 5.18
    Rounding the mark in a sailboat race. The illusion of 3D is achieved by scaling down the size of each object and giving it a small rotation.




    Conclusions


    As we move from 1D into 2D, new transformations such as the rotation group become possible. Within a homogeneous coordinate system representation, the matrix formalism is easily extended to incorporate rotations as well as other emergent transformations like differential scaling and shearing. The key importance of representation issues becomes apparent in describing polygons and 2D curves. Considerations in the selection of representation for polygons and curves are discussed and example solutions are presented. Transformations of these graphical objects are represented elegantly and concisely with the matrix formalism. Finally, a program is presented to show an implementation of this formalism and the ease with which complex transformation may be generated by a composition of primary transformations. Throughout the chapter the theme has been the importance of the correct representation to the solution of any given problem.