Lights! Camera! Action!

A guide to The Fundamental Maths And Principles Of Rendering.

Part 1

By James Sharman.

The Absolute Basics

Co-ordinate systems

The first thing that must be understood before progressing with 3D graphics is co-ordinate systems. An agreed co-ordinate system is what allows us to deal with 3 dimensional or 2 dimensional space by allowing us to define position numerically. Imagine a piece of graph paper where you have numbered the horizontal and vertical edges, from zero (in the bottom left) up to a relevant value. You have now created a co-ordinate system that allows you to identify any point on the graph paper with two numbers. Screens, texture maps and 3D environments are all addressed in this simple manner. When dealing with co-ordinate systems remember that values can be negative as well as positive, and that points are expressed relative to the origin.

You must however understand that there can be different co-ordinate systems describing any one piece of space. If our graph paper example is compared to a similar piece of paper that has the vertical scale numbered in reverse order, any number pairs (co-ordinates) from the old system take on a different meaning.

It is important before we move on to understand the difference between a change in the co-ordinate system and a change in the viewing parameters. To explain this further let us consider how we would draw a shape onto our graph paper using the co-ordinate system. We can identify a point on the graph paper with two values which we will call X and Y. Now we create a list of random points using this notation and join them together with lines. The points are (3,2), (1,6), (3,6), (7,7), (4,5), (9,3).

On the right there are two images. Figure 1 shows the original piece of graph paper we discussed earlier with the points plotted and joined up with lines. The image this produces is an interesting shape that can be used to demonstrate the characteristics of the co-ordinate system. Look carefully at figure 2. This is the same data plotted with the X axis increasing vertically and the Y axis decreasing horizontally. The shape has clearly been rotated by 90 degrees although it is still the same shape. The co-ordinate system is still the same and it is simply the way in which we are viewing it that has changed.

 

Figure 1.   Figure 2.

 

Now examine figure 3. This image is identical to figure 1 except that the Y axis now decreases with height. This time the image has changed. The image has been flipped from top to bottom and is now totally different; no degree of rotation will reverse the change. We can therefore say that this is a different co-ordinate system. There are 2 possible co-ordinate systems within a 2 dimensional environment, both with many possible orientations. I know of no naming convention to separate these two co-ordinate systems, although I would say that in figures 1 & 2 X increases to the right of increasing Y, and that in figure 3 X increases to the left of increasing Y.

Figure 3.

So far we have used the labels X & Y for the two axes in a co-ordinate system, but this is of course not always the case. There are a number of different sets of labels used in 2 dimensional co-ordinate systems that are essentially the same thing. The following table shows the four most common letter pairs and gives some idea of where you may find them used.

X,Y

The most commonly used co-ordinate naming convention. These are most commonly used to describe a tangible environment such as the screen, a 2D gaming world or a piece of graph paper.

U,V

Another very common co-ordinate pair. This is most commonly used where the co-ordinate space is non-tangible (such as referencing a texture map).

A,B

A & B are quite general purpose. It is more common however for these to be used where the values represent a vector as opposed to a location.

S,T

Another general purpose co-ordinate pair. It is however common to find these used where the values are non-linear.

Now it's time to look at three dimensional co-ordinate systems. We make the change to three dimensions by introducing a new axis 'Z'. Thinking about 3 dimensional space really is quite simple once you have properly got to grips with 2 dimensional space. We now have 3 values that define a location, labelled X, Y & Z, but apart from that all the same principles apply.

There are 2 different co-ordinate systems in 3 dimensional space that you need to worry about, and unlike their two dimensional counterparts these have well established names. The Left-Handed Co-ordinate System is most commonly used (and unless specified otherwise will be the system used here). If you imagine our earlier piece of graph paper as a window onto a 3 dimensional world, then for the left handed co-ordinate system X would increase to the right, Y would increase vertically and Z would increase into the scene.

Basic 2D math

More complex 2 dimensional math will be described in later parts of this document; here are the basics you need to work with 2 dimensional structures. Pay particular attention to the dot product. It is extremely important and it is advisable to get to grips with it as soon as possible.

Sine & Cosine

Two mathematical functions you really have to understand before moving on are sine and cosine. Both sine and cosine are single parameter mathematical functions. To best understand sine and cosine in the context in which we will be using them, imagine a circle with a radius of 1 with a point at the top (12 on a clock face). Now rotate the point clockwise, plotting the point's position on the x axis as sine and its y position as cosine. You should see that the diagram on the right showing sine and cosine curves (result plotted against angle) is the result. Any medium level text book on math will give you the complete set of relationships between sine / cosine and angles. I have listed some relationships that you may wish to note and that will be useful to remember whilst reading the remainder of this document.

Generalised Rotation

When dealing with rotation of any kind we are usually dealing with rotations around the origin. To rotate around a point other than the origin is simply a matter of performing a translation before and after the rotation. For example, to rotate around the point x=10 and y=5 we would subtract 10 from the x co-ordinate and 5 from the y co-ordinate of each point before rotation; after rotation we would then add the same values back on. The basic formula for 2D rotation is shown below in pseudo code.

newx = x * cos( r ) - y * sin( r );
newy = x * sin( r ) + y * cos( r );

In this sample code r is the angle by which the point is to be rotated, x and y are the original position of the point, and newx, newy are the rotated co-ordinates of the point. The image on the right shows the example from the co-ordinates discussion rotated by 10 degrees (or 0.175 radians). The original position of the shape has been included in a lighter shade of red for reference. Note that the algorithm produces rotation in a counterclockwise direction. Although the algorithm can be easily modified to rotate in a clockwise direction, this is the convention that most systems seem to work with. If you were actually implementing this for speed you should of course calculate the values of 'sin( r )' and 'cos( r )' only once rather than for every point.

Stepped Rotation

It is possible to create a rotation by 90 degrees very easily with only additions and subtractions. The following table shows how to rotate a point by 90, 180 and 270 degrees. In situations where you need to rotate by these angles this technique gives a massive speed increase since it requires no multiplications. Once again the angles represent counterclockwise rotation.

Degree of Rotation    Formula

90                    newx = 0 - y;    newy = x;
180                   newx = 0 - x;    newy = 0 - y;
270 ( -90)            newx = y;        newy = 0 - x;

The Dot Product

The Dot Product really is one of the basic building blocks of graphics math. The dot product takes two vectors and returns a single value. The formula is simply the sum of the products of the corresponding elements of the two vectors (the same applies to 2D, 3D and nD). The interpretation of the result depends on the length of the input vectors; if both are unit vectors then the result is the cosine of the angle between the vectors. The dot product has a wide variety of uses both in 2D and 3D and will be referred to often in this document series.
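Since the dot product is used so heavily, here is a small C sketch of it. The function name dot and the choice of passing the vectors as plain arrays are assumptions made for this example; any vector representation would do.

```c
#include <stddef.h>

/* Dot product of two vectors with n elements each: the sum of the
   products of the corresponding elements. Works for 2D, 3D and nD. */
double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

Two perpendicular unit vectors such as (1,0) and (0,1) give a result of 0 (the cosine of 90 degrees), and a unit vector with itself gives 1 (the cosine of 0 degrees).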

The Line

One of the simplest structures we can define within 2D space is the infinite line. It is usual to define a line segment (a non-infinite line) by defining the two end points; we can easily define an infinite line in the same way by presuming the line extends beyond the end points. The formula that can be used to describe a mathematical line is shown below.

Ax + By + C = 0

Essentially the line is defined with the values A, B and C. For any point (x,y) on the line the function results in zero. The two values (A,B) are a vector perpendicular to the direction of the line; we call this vector the Normal. The normal vector can be calculated by taking a vector that lies parallel to the line and rotating it by 90 degrees (you can use the stepped rotation algorithm to do this). If for a while we ignore the C component of the equation then the function becomes a dot product between the normal of the line and the vector between the origin (0,0) and the point being tested. Presuming that both are unit vectors we know that the result will be the cosine of the angle between the vectors. Conveniently the cosine of the angle is also the distance from the point to the line (we have only defined the angle of the line so we presume it passes through the origin). If the co-ordinates of the tested point are not equivalent to a unit vector then the dot product still returns the distance from point to line; effectively the result has been scaled according to the length of the vector. The remaining parameter C is used to move the line so that it does not necessarily pass through the origin. Since we want any point on the line to result in zero we can calculate C as being:

-C = Ax + By 

Where (x,y) is any point we know to be on the line (normally we use one of the end points of the line segment used to calculate A and B). We can now construct the line formula for any line. The value returned from the formula is signed: for points on the side of the line that the normal points toward the result will be positive, and for points on the other side of the line the result is negative. If you construct this function with a normal that is not a unit vector then the result will scale according to the vector's length. This can be useful if you want the result to be scaled or if you only want to know which side of the line the point is on.

Basic 3D math

As with 2 dimensional math, more complex elements will be discussed in later documents. Pay particular attention to the dot product and the cross product. You will find yourself using these very regularly; they are two of the most fundamental building blocks of 3D math.

Rotations in 3D

The simplest form of rotation in 3 dimensions is to deal with it as a sequence of 2 dimensional rotations. We could for example rotate around the Z axis by ignoring Z values and performing a standard 2 dimensional rotation of the X and Y components. When we talk about rotating around a specific axis it is common to refer to rotations as being clockwise or anticlockwise. To determine what direction a clockwise rotation around an axis is, imagine yourself looking directly down the axis (so that the axis increases with depth). The image on the right shows a basic left handed co-ordinate system with counterclockwise rotation marked with an arrow for each of the three primary axes. Convention has it that positive rotation values (degrees or radians) result in a counterclockwise rotation. The one remaining issue with performing simple 3 dimensional rotations is the order in which these rotations take place; the results of rotation can be very different depending on the order in which the three possible rotation stages are carried out. It is usual to rotate first around the X axis, secondly around the Y axis and thirdly around the Z axis, although this is hardly a standard and you should use whichever combination you want (based on personal preference or the requirement of a specific task). In later parts of this document series we will discuss using matrices and other methods of rotation.

The Cross Product

The Cross Product is one of the fundamental elements of 3D math. The function returns a vector that is perpendicular to both of the two input vectors. For now it is better that we use it as a 'black box' function, although in a future document we can look at how to modify it to better suit specific needs.

The pseudo code for the cross product is:

c->X = ( a->Y * b->Z ) - ( a->Z * b->Y );
c->Y = ( a->Z * b->X ) - ( a->X * b->Z );
c->Z = ( a->X * b->Y ) - ( a->Y * b->X );
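A self-contained C version of the cross product might look like the following. The Vector3 type and the function name cross_product are assumptions made for this example. The result is built in a temporary so that the output may safely be the same variable as one of the inputs.

```c
/* Hypothetical 3D vector type used only for this example. */
typedef struct { double X, Y, Z; } Vector3;

/* Store in *c a vector perpendicular to both *a and *b. */
void cross_product(const Vector3 *a, const Vector3 *b, Vector3 *c)
{
    Vector3 r;   /* temporary, so that c may point at a or b */
    r.X = ( a->Y * b->Z ) - ( a->Z * b->Y );
    r.Y = ( a->Z * b->X ) - ( a->X * b->Z );
    r.Z = ( a->X * b->Y ) - ( a->Y * b->X );
    *c = r;
}
```

As a quick check, the cross product of (1,0,0) and (0,1,0) is (0,0,1), and the dot product of the result with either input is zero.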

The Plane

In 3 dimensional space the plane is very similar mathematically to the line in 2 dimensional space. We know that both the plane and the 2 dimensional line are infinite and are comprised of one less dimension than the environment.

Ax + By + Cz + D = 0

You will notice the function is very similar to the function for a line in 2 dimensional space. The makeup of the function is also very similar in that we calculate the dot product between the normal of the plane and the point before adding an adjusting value. The normal of a plane is a little more complicated to calculate than the normal of the line. The normal is the result of a cross product between two non-parallel vectors that are parallel to the plane. Since we are usually calculating a plane from a polygon we can use two of the edges of the polygon to calculate these vectors. We finish off by calculating the final value in the same way as we did for the line; the same rules concerning vector length apply.

Display Projection

What good is being able to manipulate a 3 dimensional scene if we have no way of displaying it? Except for a few rare pieces of hardware we only have 2 dimensional devices for outputting our images. The solution is of course to find some way of discarding some of the 3 dimensional information in order to gain a realisable view of the information.

Orthogonal

Orthogonal projections are the easiest to perform. Simply scale two of the axes to a relevant size for the screen and discard the remaining axis. Orthogonal projection is quite commonly used in design applications where it is useful for the artist / designer not to be confused by changing scale. I rendered the button bar at the top of the main index pages with an orthogonal projection so as to ensure that there would not be any kind of perspective distortion. Orthogonal projections however do suffer in that they have a very limited ability to convey 3 dimensional information such as depth. Orthogonal projections are very similar to Forced 3D projections. The two images to the right show a cube displayed using an orthogonal projection. The left hand cube is shown with one of its corners pointing exactly at us; the cube on the far right shows how the shape can easily disappear under the correct rotation.

Perspective

Perspective projection is perhaps the most commonly used since it gives a more realistic impression of the third dimension. The principle of the perspective transform is quite simple: if you look at your monitor as a window into the scene, a point is projected by drawing a line from its location in the scene to your eye, and the projected point is located where the line intersects the screen. The maths is still relatively simple although you may find it expressed in many different ways since it can be rearranged quite easily. Consider the image on the right: the camera is marked on the left with a small eye symbol and the point to be projected is marked by a cross on the right hand side. Exactly how we determine the result depends on where the origin lies. Some graphics programmers consider the origin to be located at the point of the camera (and so D would represent the distance into the scene at which the viewing plane is located), other programmers would place the origin in the centre of the viewing plane (D representing the distance the camera is offset back from the viewing plane). For the rest of this document I will be using the former case but beware that other documents may presume the latter. The projection is very simple from here: the value Z is divided by D to give a relative figure for Z, and Y is then divided by the result, which gives us V=Y/(Z/D). We can now look at how to optimise this math. The first thing we can observe is that Y*D/Z = Y/(Z/D), which allows us to replace one division with a multiplication (multiplication is usually far faster than division). The second optimisation is to consider D to be always equal to 1, eliminating the need for even the multiplication. Remember however that D is equivalent to the field of view: moving the camera closer to the window causes more of the scene to be visible through the window.
Since in this case we would want to project both the X and Y values for the point we can take advantage of inverse multiplication to speed the operation yet further (calculate D/Z once and multiply both X and Y by it), reducing the operation to only one divide and two multiplies.
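The one divide and two multiplies version can be sketched in C like this. The Point2 type, the function name and the parameter names are invented for the example. The origin is assumed to be at the camera, and the point is assumed to be in front of the camera (z greater than zero); no clipping is attempted.

```c
/* Hypothetical screen-space point type used only for this example. */
typedef struct { double u, v; } Point2;

/* Perspective projection with the origin at the camera and the viewing
   plane at distance d into the scene. Assumes z > 0. */
Point2 project_perspective(double x, double y, double z, double d)
{
    const double scale = d / z;   /* the only divide */
    Point2 p;
    p.u = x * scale;              /* U = X*D/Z */
    p.v = y * scale;              /* V = Y*D/Z */
    return p;
}
```

A point at (2,4,8) with d = 2 projects to (0.5,1); doubling its distance to z = 16 halves the projected co-ordinates.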

Forced 3D

There is a whole range of Forced 3D projections in use; the one you are most likely to have come across is the isometric projection used in many games. The maths required to produce a Forced 3D projection is far simpler than perspective and has the added advantage to sprite based games that distance on any axis does not affect size. The generalised math for the Forced 3D projection involves multiplying each of the X, Y, Z co-ordinate elements by a vector; the vector essentially defines the length and angle of a line along that axis. Forced 3D projections are usually used as is and no rotations are applied before the transformation from scene co-ordinates to screen co-ordinates takes place. The table below shows the approximate vector groups for the example images. Since programs rarely use this kind of generalised approach I have also included example code to make the transformation directly.

Example               Vectors        Hard Coded

Left hand Example     X*(1,0.5)      sx=x-z;
                      Y*(0,0.7)      sy=(x*0.5)+(y*0.7)+(z*0.5);
                      Z*(-1,0.5)

Right hand Example    X*(1,0)        sx=x+(z*0.3);
                      Y*(0,1)        sy=y+(z*0.3);
                      Z*(0.3,0.3)

Vertical offset       X*(1,0)        sx=x;
                      Y*(0,1)        sy=y+z;
                      Z*(0,1)
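For completeness, the generalised approach (one screen vector per scene axis) might be sketched in C as follows. The Vec2 and Forced3D types and the function name project_forced are invented for this example.

```c
/* Hypothetical types used only for this example. */
typedef struct { double x, y; } Vec2;

/* One screen-space vector for each of the scene's X, Y and Z axes. */
typedef struct { Vec2 x_axis, y_axis, z_axis; } Forced3D;

/* Multiply each co-ordinate element by its axis vector and sum the
   results to give the screen position. */
Vec2 project_forced(const Forced3D *f, double x, double y, double z)
{
    Vec2 s;
    s.x = x * f->x_axis.x + y * f->y_axis.x + z * f->z_axis.x;
    s.y = x * f->x_axis.y + y * f->y_axis.y + z * f->z_axis.y;
    return s;
}
```

With the vertical offset vectors X*(1,0), Y*(0,1), Z*(0,1) the point (1,2,3) lands at (1,5), which matches the hard coded form sx=x; sy=y+z.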

General Projection Considerations

All the projections listed here have the advantage that straight lines are retained through the transformation. It is important to match a projection with the needs and requirements of your engine. In real time, speed becomes an issue and obviously the perspective function (with its divides) is the slowest, although this tends to be quite a small proportion of the overall rendering time. Forced 3D does offer the real time engine the advantage that the amount of the scene visible can be quite tightly controlled without the need for rapid 'fogging out' or depth cueing. There are projections that are significantly more complex and where the straight line rule doesn't apply. These more complex projections tend to be used in more specialist applications and will be covered in a later document in the series.

Lighting and Illumination

Light can be transmitted from a light source to the camera through many different processes. In use today are three very simple mathematical models of these processes that are used to produce fast and efficient lighting. Both the mathematics used and the real life lighting being simulated are covered. In later documents in this series we will go on to look at various kinds of light source as well as some more advanced illumination models.

Diffused

When light hits a surface some of the light is scattered in all directions. The reflected light is modulated with the colour of the surface (due to the absorption of parts of the light by the surface). We can simulate this process for illumination directly from a light source with a comparatively simple mathematical formula. Firstly we need to know how much light is hitting the relevant point on the surface. We can calculate this by modulating the intensity of the light source by the cosine of the angle between the direction vector to the light source and the surface normal. The angle's cosine can of course be calculated using a simple dot product. We should now modulate the resulting light level by the quantity of diffused light the surface can reflect (the diffused level). If we are using a colour illumination model remember that each colour component (r, g and b) from the light source must be independently multiplied by the result. We can now modulate the light level with the surface colour to give the final levels transmitted to the viewer. The illumination graph on the right shows the light that is potentially reflected to the camera across the surface of a circle. Note how the viewing direction plays no part in determining the light level.
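The diffused calculation might be sketched in C like this. The Colour and Vec3 types, the function name and the parameter names are invented for the example. Both vectors are assumed to be unit length, and clamping a negative cosine to zero (so that surfaces facing away from the light receive nothing) is my own assumption rather than something stated above.

```c
/* Hypothetical types used only for this example. */
typedef struct { double r, g, b; } Colour;
typedef struct { double x, y, z; } Vec3;

/* n: unit surface normal. l: unit vector from the surface to the light.
   light: colour/intensity of the light source.
   surface: colour of the surface.
   diffuse_level: quantity of diffused light the surface can reflect. */
Colour diffuse(Vec3 n, Vec3 l, Colour light, Colour surface, double diffuse_level)
{
    /* Cosine of the angle between the normal and the light direction. */
    double cos_angle = n.x * l.x + n.y * l.y + n.z * l.z;

    /* Assumption: light arriving from behind the surface is ignored. */
    if (cos_angle < 0.0)
        cos_angle = 0.0;

    const double k = cos_angle * diffuse_level;

    /* Each colour component is modulated independently. */
    Colour out = { light.r * k * surface.r,
                   light.g * k * surface.g,
                   light.b * k * surface.b };
    return out;
}
```

Note that no vector to the viewer appears anywhere in the function, in keeping with the viewing direction playing no part in diffused lighting.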

Specular

Specular light differs in a number of key ways from diffused light. When light hits a surface some of it may be reflected directly from the surface. Unlike a true mirror reflection (a mirrored surface) the light is scattered slightly by imperfections in the surface. We obviously cannot afford to calculate the effects of millions of imperfections in the surface (called micro facets) so we approximate the effect with a mathematical function. Since the light is reflected directly by the surface the observed colour will be dependent entirely on the light source, unaffected by any colour on the surface. To approximate this effect we calculate the vector that light from the light source is reflected along. This is essentially the reflection of the vector to the light source (L) about the surface normal (N). We can then make a comparison between the reflection vector (R) and the vector to the viewer (V) by calculating the dot product between R and V. We calculate R with the following formula.

R = 2N (N dot L) - L.

The light reflection does not necessarily exhibit the same properties as with diffused light (which would be the equivalent of a very rough surface). We can control the apparent roughness of the surface by raising the result to a power of N, with N being the roughness of the surface (this N is a plain number and should not be confused with the surface normal N used above). Higher values of N produce a more focused Specular highlight (the lighting diagram on the right shows a Specular highlight where N=20). The resultant level is modulated with the amount of light the surface can reflect. The light level reaching the camera can then be calculated by modulating this final value with the intensity of the light source. Note how the lighting is dependent on the viewing direction. In future documents in this series we will see how to simplify this model to make it faster in certain circumstances.

 

Ambient

A simple ambient lighting model is the simplest to deal with mathematically. If you look at any area with no direct illumination (such as under a desk) the area is not void of light. This light has been reflected off potentially many different surfaces before it hits your eye. This diffused interaction between surfaces is very difficult to render and expensive systems such as radiosity are the only way of doing it properly. The solution is to have what is called an ambient light term in your lighting equation. The ambient term is essentially global light that illuminates totally evenly; you can see from the lighting diagram on the right that ambient light is evenly distributed around the range of angles regardless of viewing direction. Mathematically you simply modulate the ambient lighting value with the surface colour to determine the amount of ambient light reaching the camera.
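In C the ambient term reduces to a per component multiply. The Colour type and the function name below are invented for this example.

```c
/* Hypothetical colour type used only for this example. */
typedef struct { double r, g, b; } Colour;

/* Modulate the global ambient light with the surface colour.
   No normal, light direction or viewing direction is involved. */
Colour ambient(Colour ambient_light, Colour surface)
{
    Colour out = { ambient_light.r * surface.r,
                   ambient_light.g * surface.g,
                   ambient_light.b * surface.b };
    return out;
}
```

An ambient light of (0.2,0.2,0.2) on a surface coloured (1,0.5,0) gives (0.2,0.1,0) everywhere on the object.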

The Final Combination

The lighting diagram on the right shows all three of these initial lighting methods combined on one diagram. Note that the centre of the Specular highlight is offset from the centre of the diffused illumination. The various colour mixing is not representative in any way of resulting surface colour.

Basic Lighting System Comparisons

Below are some comparative examples designed to give you a good idea of how ambient, diffused and Specular lighting can be made to interact. Obviously there are no right or wrong levels; the levels chosen for any given surface are supposed to define the properties of it as a surface. In the later parts of this document I will explain the mechanics of reflection and refraction along with other lighting systems to make surfaces look more metallic, glass like etc.

Ambient & Diffused Illumination

This grid of spheres shows combinations of ambient and diffused lighting. Vertically the spheres represent ambient light from none up to 50% of the maximum possible output. Look carefully at the interaction of the two lighting systems. The spheres lit only with ambient light (the left hand column) appear to be only circles with no shape. The spheres with only diffused lighting (the bottom row) give a better impression of shape; the black areas however look unrealistic (particularly if the background is black), and so the spheres only look right with a carefully balanced combination of both ambient and diffused lighting.

 

Specular & Diffused Illumination

This grid of spheres shows various combinations of Specular and diffused lighting; ambient lighting is held at a gentle constant. The size of the Specular highlight is also held constant with an appropriate value. Notice how the Specular lighting on its own (the left hand column) does not provide a good feel as to the shape of the object. Specular illumination is only of real value where a diffused component is also included.

 

Specular Intensity & Specular Size

This final comparison grid shows Specular intensity set against the size of the Specular highlight. Appropriate constant values for ambient and diffused were used to clarify the image. The spheres on the extreme left simulate a very rough surface with large Specular highlights. Of particular note should be that the most visually pleasing results follow a sloping line where the larger the Specular highlight the duller it is. This is true to reality, where the Specular highlight is very bright when focussed in a point and quite dull when it is being spread over a larger area. If you remember that we are simulating light being reflected off the surface then this fits: the light on a larger highlight would be spread over a wider area and hence give a larger but less bright highlight.