Finally, I had my first 3D model textured, lit, and spinning on the screen. Not content with the same old example, Full Sail's Ed Womack dug up a better-looking model and I placed a sky background behind his game model of a tri-plane. Figure 1 is the result, running live on the Gizmondo. The plane model is lightweight, only 977 triangles, and I was getting a frame rate of just over 67 fps.
On small screens, you don't need a lot of geometry for nice-looking models. Using low poly models, 3D games can easily get playable frame rates. Also, in my example, I was using floating-point math. There is no native floating-point support on the ARM9 CPU, and I knew that I was falling back to floating-point emulation, and that biting the bullet and going to fixed point could have dramatic performance benefits. I couldn't have been more wrong.
Float Versus Fixed
Switching to fixed-point math is probably the biggest single change in mindset that developers will face when moving to the embedded space. My SDK had a library for doing fast fixed-point math on the ARM9 CPU, and the documentation encouraged use of this library and fixed point.
In reality, I wasn't doing a lot of fixed-point math. The problem was that I wasn't writing a real gameI was just displaying and rotating a 3D model. So what I was really benchmarking was geometry submission and processing speed. Using floating-point emulation is certainly not going to be faster than fixed-point math for most game-time calculations, and there is quite a bit of math going on behind today's games.
When I converted my model loading code to store the model as arrays of fixed-point values, the frame rate dropped considerably. The conversion of my model data from float to fixed occurs only at startup, and the data sits in a vertex array and is simply dropped to the hardware every frame. When I did this, my frame rate dropped from 67 fps to 52 fps. Even higher resolution test models resulted in a drop of almost one half! What was going on?
According to the spec sheet, the NVidia 4500 processor supports both fixed-point and floating-point geometry batches. However, what is really going on is that fixed-point geometry is being converted to floating-point data before the hardware actually starts the transform and lighting process. My NVidia contact was a bit vague as to whether this was a hardware issue (really supporting only
floats), or if the Gizmondo was to blame by insisting in its driver that all fixed-point values go to the hardware as
floats. Regardless (at least on this platform), by sending down geometry as fixed point, I was incurring a fixed-to-float conversion cost with every single frame. The lesson learned? At least on the 4500 Go Force platforms, store your static geometry as floating-point vertex datanot fixed point!
My last performance observation was the cost of lighting. If possible, lighting should be avoided in your games. Turning off the lights in my plane sample (and eliminating the submission of normals in the vertex batch) resulted in a jump in frame rate to 106 fps. That's about a 50-percent boost in performance if you can forgo using the standard OpenGL lighting model. The same boost occurred with fixed-point vertex and normal data. There are a number of multitexture lighting tricks that old PC OpenGL games used to avoid lighting in the days of CPU transformlighting that should prove beneficial to today's OpenGL ES games.
Mobile Phones Get Smaller, Software Builds Get Bigger
The feature and form-factor war in mobile handsets has generated a growing headache for developers of software for mobile devices: Longer and longer compile times have been met with shorter and shorter release cycles. At the same time, feature proliferation has made build times grow longer. This math obviously does not add up.
Case in point: One of the leading global developers of wireless telecommunication products found itself with a build matrix so complex that the build team at one business unit had more than 100 unique software builds to support at any given time. At the same time that complexity was increasing, build times had increased dramatically, such that a single build grew from 20 minutes to more than two hours in just two years. Multiply that by 100 unique builds and the usual overnight build cycle was just not feasible.
To address this bottleneck, the company turned to parallel builds enabled, in this case, by Electric Cloud (www.electric-cloud.com). The wireless company quickly achieved an 84 percent reduction in build speed while enabling engineers to make incremental code changes they avoided in the past due to long compile times. They're now running 30-40 simultaneous builds during peak times of the day with more than 500 developers building to a cluster of 288 CPUs (432 build agents) at one site alone. They've now rolled out parallel build clusters across four sites in three countries.
A frame rate of over 60 fps for my first attempt was encouraging. Although my model contained less than 1000 triangles, NVidia claims that this hardware can transform and draw 3000 to 4500 triangles every 1/60th of a second. These are most certainly unlit triangles. As in the old days of 3D game development, low poly models are a necessary prerequisite. Texture memory is also tight, and using compressed textures is paramount. While game-time math calculations should be fixed point, static geometry submissions may be significantly faster using floating-point values.
Richard writes science visualization and educational software at Starstone Software Systems and teaches OpenGL programming at Full Sail Real World Education. He can be reached at email@example.com.