What every developer should know about bitmaps
Virtually every developer will use bitmaps at times in their programming. Or if not in their programming, then in a website, blog, or family photos. Yet many of us don’t know the trade-offs between a GIF, JPEG, or PNG file – and there are some major differences there. This is a short post on the basics which will be sufficient for most, and a good start for the rest. Most of this I learned as a game developer (inc. Enemy Nations) where you do need a deep understanding of graphics.
Bitmaps fundamentally store the color of each pixel. But there are three key components to this:
- Storing the color value itself. Most of us are familiar with RGB where it stores the Red, Green, & Blue component of each color. This is actually the least effective method as the human eye can see subtle differences on some parts of the color spectrum more than others. It’s also inefficient for many common operations on a color such as brightening it. But it is the simplest for the most common programming tasks and so has become the standard.
- The transparency of each pixel. This is critical for the edge of non-rectangular images. A diagonal line, to render best, will be a combination of the color from the line and the color of the underlying pixel. Each pixel needs to have its level of transparency (or actually opacity) set from 0% (show the underlying pixel) to 100% (show just the pixel from the image).
- The bitmap metadata. This is informat about the image which can range from color tables and resolution to the owner of the image.
Bitmaps take a lot of data. Or to be more exact, they can take up a lot of bytes. Compression has been the main driver of new bitmap formats over the years. Compression comes in three flavors, palette reduction, lossy & lossless.
In the early days palette reduction was the most common approach. Some programs used bitmaps that were black & white, so 1 bit per pixel. Now that’s squeezing it out. And into the days of Windows 3.1 16 color images (4 bits/pixel) were still in widespread use. But the major use was the case of 8-bits/256 colors for a bitmap. These 256 colors would map to a palette that was part of the bitmap and that palette held a 24-bit color for each entry. This let a program select the 256 colors out of the full spectrum that best displayed the picture.
This approach was pretty good and mostly failed for flat surfaces that had a very slow transition across the surface. It also hit a major problem early on with the web and windowed operating systems – because the video cards were also 8-bit systems with a single palette for the entire screen. That was fine for a game that owned the entire screen, but not for when images from different sources shared the screen. The solution to this is a standard web palette was created and most browsers, etc. used that palette if there was palette contention.
Finally, there were some intermediate solutions such as 16-bits/pixel which did provide the entire spectrum, but with a coarse level of granularity where the human eye could see jumps in shade changes. This found little usage because memory prices dropped and video cards jumped quickly from 8-bit to 24-bit in a year.
Next is lossy compression. Compression is finding patterns that repeat in a file and then in the second case just point back to the first run. What if you have a run of 20 pixels where the only difference in the second run is two of the pixels are redder by a value of 1? The human eye can’t see that difference. So you change the second run to match the first and voila, you can compress it. Most lossy compression schemes let you set the level of lossiness.
This approach does have one serious problem when you use a single color to designate transparency. If that color is shifted by a single bit, it is no longer transparent. This is why lossy formats were used almost exclusively for pictures and never in games.
Finally comes lossless. This is where the program compresses the snot out of the image with no loss of information. I’m not going to dive into what/how of this except to bring up the point that compressing the images takes substantially more time than decompressing them. So displaying compressed images – fast. Compressing images – not so fast. This can lead to situations where for performance reasons you do not want to store in a lossless format on the fly.
Transparency comes in three flavors. (If you know an artist who creates web content – have them read this section. It’s amazing the number who are clueless on this issue.) The first flavor is none – the bitmap is a rectangle and will obscure every pixel below it.
The second is a bitmap where a designated color value (most use magenta but it can be any color) means transparent. So other colors are drawn and the magenta pixels are not drawn so the underlying pixel is displayed. This requires rendering the image on a selected background color and the edge pixels that should be partially the image and partially the background pixel then are partially the background color. You see this in practice with 256 color icons where they have perfect edges on a white background yet have a weird white halo effect on their edges on a black background.
The third flavor is 8 bits of transparency (i.e. 256 values from 0 – 100%) for each pixel. This is what is meant by a 32-bit bitmap, it is 24-bits of color and 8 bits of transparency. This provides an image that has finer graduations than the human eye can discern. One word of warning when talking to artists – they can all produce “32-bit bitmaps.” But 95% of them produce ones where every pixel is set to 100% opacity and are clueless about the entire process and the need for transparency. (Game artists are a notable exception – they have been doing this forever.) For a good example of how to do this right take a look at Icon Experience – I think their bitmaps are superb (we use them in AutoTag).
Many formats have a resolution, normally described as DPI (Dots Per Inch). When viewing a photograph this generally is not an issue. But take the example of a chart rendered as a bitmap. You want the text in the chart to be readable, and you may want it to print cleanly on a 600 DPI printer, but on the screen you want the 600 dots that take up an inch to display using just 96 pixels. The resolution provides this ability. The DPI does not exist in some formats and is optional in others (note: it is not required in any format, but it is unusual for it to be missing in PNG).
The important issue of DPI is that when rendering a bitmap the user may want the ability to zoom in on and/or to print at the printer’s resolution but display at a lower resolution – you need to provide the ability for the calling program to set the DPI. There’s a very powerful charting program that is useless except for standard viewing on a monitor – because it renders at 96 DPI and that’s it. Don’t limit your uses.
Ok, so what file formats should you use? Let’s go from most to least useful.
PNG – 32-bit (or less), lossless compression, small file sizes – what’s not to like. Older versions of some browsers (like Internet Explorer) would display the transparent pixels with an off-white color but the newer versions handle it properly. Use this (in 32-bit mode using 8 bits for transparency) for everything.
ICO – This is the icon file used to represent applications on the desktop, etc. It is a collection of bitmaps which can each be of any resolution and bit depth. For these build it using just 32-bit png files from 16×16 up to 256×256. If your O/S or an application needs a lesser bit depth, it will reduce on the fly – and keep the 8 bits of transparency.
JPEG – 24-bit only (i.e. no transparency), lossy, small file sizes. There is no reason to use this format unless you have significant numbers of people using old browsers. It’s not a bad format, but it is inferior to PNG with no advantages.
GIF – 8-bit, lossy, very small file sizes. GIF has two unique features. First, you can place multiple GIF bitmaps in a single file with a delay set between each. It will then play through those giving you an animated bitmap. This works on every browser back to the 0.9 versions and it’s a smaller file size than a flash file. On the flip side it is only 8 bits and in today’s world that tends to look poor (although some artists can do amazing things with just 8 bits). It also has a set color as transparent so it natively supports transparency (of the on/off variety). This is useful if you want animated bitmaps without the overhead of flash or if bandwidth is a major issue.
BMP (also called DIB) – from 1 up to 32-bit, lossless, large file sizes. There is one case to use this – when speed is the paramount issue. Many 2-D game programs, especially before the graphics cards available today, would store all bitmaps as a BMP/DIB because no decompression was required and that time saving is critical when you are trying to display 60 frames/second for a game.
TIFF – 32-bit (or less), lossless compression, small file sizes – and no better than PNG. Basically the government and some large companies decided they needed a “standard” so that software in the future could still read these old files. This whole argument makes no sense as PNG fits the bill. But for some customers (like the federal government), it’s TIFF instead of PNG. Use this when the customer requests it (but otherwise use PNG).
Everything Else – Obsolete. If you are creating a bitmap editor then by all means support reading/writing every format around. But for other uses – stick to the 2+4 formats above.