Thursday, February 12, 2009

POSIX Threads And OpenMP Can Co-Exist

Not exactly a major revelation, but as an experiment I changed the mesh build process to a threaded model using POSIX threads (pthreads). The short version is that when the quad tree changes, first a manager thread is spawned, which spawns several worker threads. The worker threads build the actual mesh and then quit when there are no more meshes to build. The manager thread just waits for all the worker threads to finish then updates the render queue with the new list of meshes and quits.

One important thing to mention, besides all the mutexes to protect the lists, is that the quad tree is not allowed to be updated while the new meshes are being built. Changing the quad tree while still building would invalidate our list of meshes to build and could get us into a situation where the render queue is never updated because the list of meshes to build is never completed!

After the meshes are all built and the new visible list of meshes is given to the render queue then the quad tree is allowed to update, and when the tree changes the whole process begins again.

Also note that while the worker threads are building the new mesh, the render queue continues to display the meshes from the previous version of the quad tree.

The second part of the experiment was to use OpenMP to utilize the extra core of my processor in building the meshes using work sharing. Fortunately, XCode supports GCC 4.2 which supports OpenMP 2.5, and that will suffice for now. I only added parallel processing to the for loop that generates the vertices so I don't expect to see much improvement, however when I start generating normal maps I think the improvement will be noticable. Anyways, it compiles and runs.

Next I have to fix some logic bugs, re-implement the code that deletes old meshes in the cache list, and either re-implement the heightfield and normal generating, or take a crack at a second quad tree that is updated based on where the camera will be so those meshes can be pre-built and put in the cache.

A small aside - making the build process threaded made my simple profiler not completely reliable anymore because the thread may start sampling during one frame and then before it is done sampling, the profiler might attempt to process the samples because that part of the profiler runs in the main thread! Fortunately that is unlikely to occur and if it does it shouldn't skew the results much, though I suppose I could add a semaphore to resolve the issue.

No comments: