Texture Atlases and Bitmap Fonts

I realise that I haven’t been posting a lot of text posts about the new game’s progress since I started posting video blogs (or “vlogs” as the kids call them these days). However, I spent the last two days working on some code that doesn’t really feel appropriate for the vlog, so I thought I’d write about it here. Hooray!

I decided to take a bit of a break from gameplay coding and instead focus on some performance stuff. I want the game to run at 60 fps, even on a 1st gen iPod touch, if I can. On Monday I took at look at the frame rate and it was around 40 fps on my iPod. I booted up Shark (a performance analysis tool for Mac OS X and iPhone) and took at look at what was slowing things down. The game’s not doing a whole lot right now, so there’s no reason it shouldn’t be running at 60 Hz. Plus, the physics engine uses a fixed time-step, so the frame rate needs to maintain 60 fps as much as possible to keep the physics in check.

What Shark told me was that my font rendering was taking up about 40% of my frame time! At 40 fps, each frame is taking about 25 ms, which means about 10 ms was devoted to rendering text on the screen! That’s insane! Especially since I was rendering two strings: “Score: 5” and “Debug”.

The code I was using for text rendering is pretty inefficient, and I had always intended on replacing it. I just hadn’t realised how inefficient it was. The code was from the old CrashLander example app that Apple pulled off the dev site a long time ago (because of stuff like this). It was using CoreGraphics to dynamically render the text out to a texture, which was then used as an alpha mask to render the text to the screen. This was all extremely slow.

So, it looked like I needed to implement my bitmap font system sooner than I had planned. A bitmap font is a fast way of drawing text, but that has some drawbacks. A bitmap font is created by rendering out the character of a font at a particular size to a texture (the big downside of using a bitmap font is that it doesn’t scale well, since the font is rendered out at a specific size). This is all done on your computer. You end up with a texture atlas (more on that in a minute) that contains all the characters you want to be able to render on one big texture (or several small textures, if you want).

So what’s a texture atlas, you ask? A texture atlas is when you cram a bunch of smaller textures into one big texture. You end up with one large texture (that has power of 2 dimensions for optimal video memory usage) that has all the smaller textures laid out next to each other. A texture atlas also requires a data file that describes the atlas. The data file will contain information on where each sub-texture lives in the atlas. This data can be used in the game to determine which small portion of the atlas to draw onto a polygon that gets drawn to the screen.

The reason this is done is that every time OpenGL has to change which texture it’s currently drawing with, it takes time. So if you draw, say, 100 different little sprites every frame, and each one is in it’s own texture, OpenGL has to change which texture it’s using 100 times. This can lead to a lot of inefficiency and can actually significantly slow your rendering code. But if you put those 100 sprites into one big texture atlas, then OpenGL doesn’t need to swap textures, it just changes the coordinates of the current texture each time.

So for texture atlases to really work, you want sprites that need to be drawn together grouped into the same atlas. You lose all the efficiency gains if you have two atlases and every alternating draw call is in a different atlas. In huge 3D games, this usually means putting all of a character model’s textures in one atlas (so a soldier gets his uniform textures, facial textures, etc all put into one atlas), since the character is rendered all at once. In a small game like mine, I can generally fit all the sprites I need into one texture atlas.

Finally, the other big benefit of texture atlases is that they can be more video memory efficient. You can do a much better job of fitting non-power of 2 textures into one giant power of 2 atlas, than padding out each smaller texture to a power of 2. This means you’ll have more VRAM available for other things.

Building a texture atlas creation and rendering system was the first thing I did this week. I actually use a Python tool that my friend Noel pointed me to, to do the packing of the texture:

  • AtlasGen (atlasgen.svn.sourceforge.net)

Then I parse the atlas data into a plist which I can load at runtime. Writing all this code would allow me to speed up the rendering of my sprites in general, but I could reuse the rendering code for my font rendering system.

So, back to the bitmap font system. I considered building my own bitmap font generation system, but that seemed silly. I poked around on the internet looking for Mac tools available, but couldn’t find any. Then Noel pointed me at some tools for Windows. Earlier this year I bought a copy of VMWare Fusion, so I can run Windows programs on my Mac. Hooray, it’s coming in handy! I did some more poking around and found this tool, which I quite like:

One of the nice things about this tool is that it seems to handle kerning (the distance between adjacent characters) quite nicely, even for italic fonts. This tool allows me to export a font texture and also generates the data file for me. Then it was just a matter of parsing the data file in the game and using the existing texture atlas code I had already written. Finally, to render the string, it’s just a matter of iterating over the string to render, grab each character, look up into the font data, and draw the appropriate part of the texture onto an appropriately sized quad.

The net result of this? My frame rate is holding at 60fps most of the time now. I still get some spikes, but it’s good enough for now. Shark tells me that my font rendering now takes about 9% of my frame time. On a 16.7 ms frame (60 Hz), that’s about 1.5 ms. And digging further into the profile, it looks like a significant portion of that time is actually spent inside NSString operations. The actual rendering is about half the time. That’s a huge reduction in render time! On an iPhone 3GS, this thing will fly!

So things are looking good. The frame rate is back to a point where it matches the physics system, which means I can do some proper tuning of the physics now. Thanks for sticking with me through such a technical post. Look for another video blog episode in the next day or two.