Realtime Visibility Determination
Using An Octree And A C-Buffer
By Jacco Bikker
a.k.a. "THE PHANTOM"
The problem of rendering scenes that contain many polygons, of which only a small set is visible from a given camera position, has been solved in several ways. Some approaches give satisfying results: A portal structure, for example, allows rendering of huge data sets while only the visible set of polygons is processed. Other approaches offer similarly fast rendering, independent of scene size and dependent only on local complexity.
All these approaches have disadvantages, however. Most of them require the scene to be 'indoor', or severely limit the possibility of real-time modifications to the scene. All of them require some sort of pre-processing.
In this document I describe a rendering algorithm that allows real-time insertion, deletion and movement of polygons, while providing a 100% perfect visibility set. The algorithm requires only a small amount of preprocessing, and accepts both enclosed volumes and double-sided polygons. Finally, it can be applied to both hardware and software rendering.
An octree is an easy-to-compute and easy-to-maintain structure for 3D polygonal data. It consists of a set of recursively subdivided cubes, starting with a cube in which the entire data set fits. This cube is split into eight subcubes (hence the name), and each subcube can in turn be subdivided into eight subcubes. Subdivision stops when a cube contains fewer than a preset (small) number of polygons. This way, areas of the data set with many small polygons are split into tiny cubes, while mostly empty areas are represented by larger cubes in the structure. When a polygon crosses a cube boundary, it is split.
For this algorithm, I will extend the octree with some extra data: For each node (cube), six pointers are kept to the neighbours in the six axis directions. These pointers are also easy to determine and maintain, and they make moving from one cube to another straightforward.
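The node layout described above might be sketched as follows (a minimal C++ sketch; the field names and the empty `Polygon` placeholder are my own, not from the article):

```cpp
#include <array>
#include <vector>

// Hypothetical polygon placeholder; the article doesn't fix a representation.
struct Polygon { /* vertices, plane, material, ... */ };

// One octree cube. Leaves hold polygons; internal nodes hold eight children.
struct OctreeNode {
    float minX = 0, minY = 0, minZ = 0;       // cube origin (axis-aligned)
    float size = 0;                           // edge length
    std::array<OctreeNode*, 8> child{};       // all null for a leaf
    std::array<OctreeNode*, 6> neighbour{};   // -X,+X,-Y,+Y,-Z,+Z; null at the border
    std::vector<Polygon> polys;               // polygons stored in this cube
    bool isLeaf() const { return child[0] == nullptr; }
};
```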
Adding a polygon to this structure is easy: It is inserted by adding it to the correct cubes, split against their boundaries. Then these cubes need to be re-evaluated and, if necessary, split into smaller cubes. Removing a polygon is a bit harder: All the parts of the original polygon need to be removed from their cubes, and when a cube becomes empty, its parent needs to be re-evaluated. For large polygons, this can be a time-consuming process.
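A minimal sketch of the insertion-and-split rule, assuming for brevity that each polygon is assigned to the child containing its centroid instead of being clipped against cube boundaries (the article clips; this shortcut just keeps the sketch small), with a hypothetical threshold `MAX_POLYS`:

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Poly { float cx, cy, cz; };  // stand-in: polygon reduced to its centroid

struct Node {
    float x = 0, y = 0, z = 0, size = 0;  // cube origin and edge length
    std::array<Node*, 8> child{};
    std::vector<Poly> polys;
    bool leaf() const { return child[0] == nullptr; }
};

const std::size_t MAX_POLYS = 4;  // split threshold (tunable)

void insert(Node* n, const Poly& p) {
    if (n->leaf()) {
        n->polys.push_back(p);
        if (n->polys.size() <= MAX_POLYS) return;
        // Cube got too full: create the eight subcubes and push polygons down.
        float h = n->size * 0.5f;
        for (int i = 0; i < 8; ++i)
            n->child[i] = new Node{ n->x + (i & 1) * h,
                                    n->y + ((i >> 1) & 1) * h,
                                    n->z + ((i >> 2) & 1) * h, h };
        std::vector<Poly> moved;
        moved.swap(n->polys);
        for (const Poly& q : moved) insert(n, q);
        return;
    }
    // Pick the child octant that contains the centroid.
    float h = n->size * 0.5f;
    int i = (p.cx >= n->x + h) | ((p.cy >= n->y + h) << 1) | ((p.cz >= n->z + h) << 2);
    insert(n->child[i], p);
}
```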
A c-buffer is a structure that is normally used to speed up software rendering by eliminating overdraw. This is achieved by storing information about drawn spans per scanline: For each scanline a linked list of spans is maintained. When a polygon is drawn, it is turned into spans, and each span is inserted in the linked list. If it partially overlaps a span that is already in the list, it is clipped against that span, and the clipped portion (if anything is left) is inserted. This is normal s-buffer behaviour. A c-buffer extends this algorithm: When a new span touches an existing span, the two spans are merged, so that only one span remains. This means that at the end of the rendering process, each scanline contains a single span, provided that the entire screen is filled with pixels.
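The span insertion with merging might look like this (a sketch assuming half-open spans `[x1, x2)` and one sorted linked list per scanline; the names are illustrative):

```cpp
#include <algorithm>
#include <list>

// One drawn span on a scanline: pixels [x1, x2) are covered.
struct Span { int x1, x2; };

// Insert [x1, x2) into a sorted span list, merging with every span it
// touches or overlaps, so the list stays minimal (c-buffer behaviour).
void cbufInsert(std::list<Span>& line, int x1, int x2) {
    auto it = line.begin();
    // Skip spans that end strictly before the new span starts.
    while (it != line.end() && it->x2 < x1) ++it;
    // Absorb every span that touches or overlaps [x1, x2).
    while (it != line.end() && it->x1 <= x2) {
        x1 = std::min(x1, it->x1);
        x2 = std::max(x2, it->x2);
        it = line.erase(it);
    }
    line.insert(it, Span{x1, x2});
}
```

Note how inserting a span that bridges two existing spans leaves a single merged span behind, which is exactly why a fully covered scanline ends up as one span.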
The rendering process can now be done as follows: First, we determine the cube in which the camera is located. If we know where the camera was in the previous frame, we can quickly check whether it is still in that cube, or has moved to one of its neighbours. Otherwise, the entire tree must be searched. This is a very fast process too, as we can simply walk the octree with a few simple tests per level. Next, we draw the polygon(s) in that cube. The polygons are clipped against the view frustum, and rasterized. Rasterization can be done using either software rendering, or a hardware accelerator. In both cases, the polygon is also inserted into the c-buffer. In the case of hardware rendering, this means that the c-buffer must be maintained at the same resolution as the accelerated view. Then we scan the c-buffer for areas that are not covered by polygons yet. This is done by scanning for spans that do not have an end coordinate equal to the physical screen width. When such an 'empty space' is encountered, we cast a ray from the camera viewpoint through that pixel, and determine which cube in the octree is at that position. Then we render the polygons in that cube, and resume the scan.
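The gap scan over a scanline's span list can be sketched as follows (assuming the same half-open spans as above; the function returns the first uncovered x, or -1 when the line is fully covered):

```cpp
#include <algorithm>
#include <list>

struct Span { int x1, x2; };  // pixels [x1, x2) are covered

// Return the x of the first uncovered pixel on a scanline, or -1 if the
// scanline is completely covered up to screenWidth.
int firstGap(const std::list<Span>& line, int screenWidth) {
    int x = 0;
    for (const Span& s : line) {
        if (s.x1 > x) return x;      // hole before this span
        x = std::max(x, s.x2);       // extend the covered prefix
    }
    return x < screenWidth ? x : -1; // hole at the end, or none
}
```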
Each polygon that we find this way will produce multiple spans in the c-buffer, so the number of rays that we need to cast is limited. In general, if the octree holds only a single polygon per cube, we need to cast exactly one ray per visible polygon. The ray casting process can be quite fast, because of the axis-aligned nature of the octree and the neighbour pointers we stored in it.
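One step of such a ray walk, using the neighbour pointers, might be sketched as follows (an illustrative implementation; the face order -X,+X,-Y,+Y,-Z,+Z is my own convention, not fixed by the article):

```cpp
#include <array>
#include <cfloat>

struct Cube {
    float x = 0, y = 0, z = 0, size = 0;  // axis-aligned bounds
    std::array<Cube*, 6> neighbour{};     // -X,+X,-Y,+Y,-Z,+Z; null at the border
};

// Step a ray (origin o, direction d) from one cube into the neighbour it
// exits through. Returns the next cube (or null at the border) and sets t
// to the ray parameter of the exit point.
Cube* step(const Cube* c, const float o[3], const float d[3], float& t) {
    float best = FLT_MAX;
    int face = -1;
    const float lo[3] = { c->x, c->y, c->z };
    for (int a = 0; a < 3; ++a) {
        if (d[a] > 0) {                   // exits through the + face of axis a
            float tf = (lo[a] + c->size - o[a]) / d[a];
            if (tf < best) { best = tf; face = 2 * a + 1; }
        } else if (d[a] < 0) {            // exits through the - face of axis a
            float tf = (lo[a] - o[a]) / d[a];
            if (tf < best) { best = tf; face = 2 * a; }
        }
    }
    t = best;
    return face >= 0 ? c->neighbour[face] : nullptr;
}
```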
To speed up this process further, we can store, for each pixel, a pointer to the last cube encountered. That way, a ray only traverses multiple cubes when it passes through empty ones.
The c-buffer is quite efficient too: It doesn't need to hold texture or depth data, as we are only interested in the occlusion of pixels. Clipping spans is not necessary when we use hardware acceleration, as existing spans can always be extended with the new span when they overlap. For software rendering, we do need the span-to-span clipping to determine which pixels to render.
I have presented an algorithm that determines the visible set of polygons using an easy-to-modify data structure, the octree, in combination with another simple structure, the c-buffer. The algorithm processes only visible polygons, so its speed is independent of the size of the data set. It requires the first stages of software rasterization (edge scanning) for each visible polygon, to be able to store it in the c-buffer, plus the casting of a single ray per visible polygon. This algorithm can be used as a basis for a more advanced one that treats objects or their bounding volumes differently from the larger occluders. It is therefore a good replacement for well-known visibility determination techniques that place more restrictions on the designers of the data set.