Adding Extreme Detail To Polygons
Using Voxel Sets As Input Data
By Jacco Bikker
Current 3D hardware is capable of drawing huge amounts of polygons at high refresh rates. Current 3D games typically show only hundreds of polygons on the screen simultaneously, and still no one seems to be asking why today's games apparently aren't capable of showing the amount of polygons that are possible according to the hardware vendors.
The reason why 3D games use only a small amount of polygons is this: Each and every polygon requires a huge amount of processing. First, the renderer has to decide wich polygons to draw. While a good renderer is perfectly able to determine the set of visible polygons, the amount of work per visible polygon is considerable. After the renderer has decided that a polygon is potentially visible, it needs to be clipped, rotated and finally send to the accelerator to be rasterized. Because of the amount of processing per polygon, detail is usually added by using detailed textures, effectively faking detail. While this gives rather satisfying results, one can't help but wonder why this is satisfying: Sure, the polygon count has gone up, but not by a factor of 100.
The General Idea
There is however a way to use the power of an accelerator, without having to do all the work that is normally done for each individual polygon. How? Simple: Postpone adding detail until the very last moment. Use plain, large polygons in the early stages of the rendering pipeline: Visibility determination, rotating, clipping. Then subdivide them just before rasterization, adding relief (using a bumpmap, for example). Or rather, make the subdivision process part of the visualization routines. That way, subdivision can be omitted for slower cards or software rendering without changing the entire renderer.
So far, I assume I have not really said anything new. In the rest of this article, I will probably not do so either, but I would like to talk over some ideas here. Subdividing polygons this way introduces some interesting problems, but if we can overcome these, we might be able to use extreme polygon counts, and more important: Have extreme detail in our 3D environments. So, please read on.
Questions And Potentially Incomplete Answers
Q. How do we subdivide a polygon using a bumpmap and a texture in real time?
A. We don't have to. We could simply precalculate a set of polygons for each polygon/texture/ bumpmap combination, and store it as a set of triangle fans. These could be rendered pretty quickly.
Q. But that would require us to have lots of triangle fan sets: One for each polygon/texture/bumpmap combination...
A. Use something similar to a lightmap cache for this, and calculate the fans only when needed.
Q. So it has to be done in real-time after all. How do we turn a texture/bumpmap combination into a triangle fan?
A. Start with a regular grid with the resolution of the bumpmap. Since we're generating triangles, the number of triangles is: [Hbump * Vbump * 2]. Then, find triangles that are (almost) in the same plane, and merge them. After all, we only need to add polygons for detailed spots on the bumpmap.
Q. How about clipping? Clipping thousands of polygons might get slow...
A. Absolutely true. Although I think that most of the tiny triangles will not get clipped, there's quite a huge amount of clipping at the edges of the screen. I have no idea how well an accelerator will handle this. A possible optimization is to add a band of pixels around the rendered view, that is cleared just before the buffers are swapped. this band could be used to draw triangles in without clipping them. Triangles that are completely outside the view area are not rendered at all, triangles that are completely on-screen, but partially in the band are clipped by clearing the band, so now only triangles that are partially off- screen, and partially in the view need to be clipped. Depending on the average size of a triangle and the size of the band, this might reduce the number of clips considerably.
Q. How about culling? Bumps may be still visible, even when a plane is not facing the camera.
A. True again. Consider this situation, a wall section with a round button:
Obviously, when the 'side view' turns away a bit more from the camera, the bump would still be visible. There is a moment at wich the entire 'detail polygon' is invisible however. For the moment, it might be best to find out when, and draw all the polygons until then using z-buffering. We could also test all the triangles of course, but that would move the backface culling process to our rasterizer, and that is exactly what we where trying to prevent.
Q. But we still need to rotate all those vertices of the tiny triangles...
A. True, but we can apply some interesting optimizations here. First, the range of the coordinates is limited: If the bumpmap is 64x64, then we have 64 possible 'x' values (in texture space), 64 'y' values, and 256 'z' values. For 'z' we may as well use a range of 0..31 to get sufficient height levels. We already have the rotated 3D coordinates of the large detail polygon. These coordinates can now be used to quickly determine the 3D coordinates of the small triangles: First create three temporary lookup tables. One for X (containing 64 coordinates, interpolated from vertex 1 to vertex 2), One for Y (containing 64 coordinates, interpolated from vertex 1 to vertex 3), One for Z. The Z table introduces another problem:
Our detail polygon is flat. Well, let's make it fat! So, each detail polygon gets an extra set of vertices, at a fixed distance of the original polygon. This distance is the maximum height that can be stored in the bummap. Now we can also interpolate Z in the third table. Determining the rotated vertex for a triangle is now simply a matter of combining the values in the LOT's: For the triangle vertex that is at grid position (U, V) and height (W), the 3D coordinate is:
x = Xtable[U] + Ytable[U] + Ztable[U] y = Xtable[V] + Ytable[V] + Ztable[V] z = Xtable[W] + Ytable[W] + Ztable[W]
So, we have now reduced the cost of rotating all the 3D coordinates to creating 3 LOT's that can be interpolated using only additions, plus 9 look-up's and 6 additions per vertex. That's much better than processing all those triangles in our rendering pipeline. I'm sure this can be made much better still, I've even heard someone say that he could do something like this with a SINGLE look-up and TWO additions per vertex...
So where do we go from here? Well, there are some other interesting issues involved with this 'algorithm'. Let me summarize some:
1. How far should we go with using bumpmaps for representing detail in our world? Theoretically, you could even build stairs with a single polygon this way... Or an completely detailed and textured mountain...
2. How do we get polygons to the accelerator as fast as possible? When we're talking about 500 polygons, Glide is fine, and so is OpenGL. But when we're talking about 10.000 for example, low-level access to an accelerator might become interesting...
3. Should we always generate triangle fans at the same resolution? No way, you could use LOD...
4. How do you generate the triangle fans as fast as possible? After all, we don't want any hick-ups...
5. How do we shade the detail polygons? (just an idea: Simply use the lightmaps that we're used too?)
I have presented an algorithm that may or may not be new, and I have shown that theoretically this could result in much more detailed graphics. I have also tried to show that this approach requires a different way of thinking. The keyword is: 'Postponing'. I hope that I gave you something to think about. If I'm completely wrong, or if you have additions, suggestions, questions, don't hesitate to contact me.
Keep on coding,
|The FlipCode Outpost||for more stuff from Jacco Bikker.|