Thus far, we’ve discussed the reasons for wanting to do depth of field (DoF) on the GPU. We’ve established that we need some idea of what our focal length should be (strictly speaking, the focus distance: the depth at which the scene should appear sharp), and that we need to work out by how much the depth of each fragment in our scene render differs from it. All of this information is available in the depth buffer we generated when we rendered the scene’s geometry; if we rendered into a framebuffer object with a depth attachment, we already have it on the GPU in the form of a texture.
Getting data back off the GPU is a pain. It can be accomplished effectively with pixel buffer objects, but the available bandwidth between CPU and GPU is comparatively tiny and we don’t really want to take any up if we can help it. Additionally, we don’t want either the CPU or the GPU stalled while waiting for each other’s contribution, because that’s inefficient. It’s therefore more logical to analyse the depth buffer on the GPU using a shader. As a bonus, you can do this at the same time as you’re linearising the depth buffer for other post-processing operations – for example, you might need a linear depth buffer to add fog to your scene or for SSAO.
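To make the per-fragment arithmetic concrete, here is a minimal sketch of it in Python rather than shader code. It assumes a standard OpenGL perspective projection with the default depth range of 0 to 1; the function names and the `near`, `far`, `focus_distance` and `focus_range` parameters are placeholders of mine, not anything defined elsewhere in this post.

```python
def linearize_depth(d, near, far):
    """Convert a depth-buffer value in [0, 1] to an eye-space distance.

    Assumes a standard OpenGL perspective projection and the default
    depth range, so d = 0 maps to the near plane and d = 1 to the far plane.
    """
    z_ndc = 2.0 * d - 1.0  # back to normalised device coordinates, [-1, 1]
    return (2.0 * near * far) / (far + near - z_ndc * (far - near))


def blur_factor(linear_depth, focus_distance, focus_range):
    """How far a fragment is from the focus distance: 0 = sharp, 1 = fully blurred.

    focus_range (a hypothetical tuning parameter) is the distance over which
    the blur ramps up from nothing to its maximum.
    """
    return min(abs(linear_depth - focus_distance) / focus_range, 1.0)


if __name__ == "__main__":
    near, far = 0.1, 100.0
    for d in (0.0, 0.5, 0.9, 0.99, 1.0):
        z = linearize_depth(d, near, far)
        print(d, z, blur_factor(z, focus_distance=10.0, focus_range=5.0))
```

In a real fragment shader the same two expressions would run once per pixel, with `d` sampled from the depth texture and the other values passed in as uniforms.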
What we’re going to do is generate a representation of how each fragment’s depth differs from the focal length, the results of which will look something like this:
Modern 3D games use a bunch of tricks to convince our brains that we are viewing their world through some bizarre hybrid sense organ which consists of about 30% human eye and 70% movie camera. Hence we get lens flares, aperture changes and other movie staples which aren’t exactly true to life; we accept these effects probably because a) we are all so highly mediated these days that we expect the things which appear on our TVs/monitors to look like that and b) they make shiny lights dance around the screen, and us primates love that stuff.
(An aside: anybody who wears glasses is totally used to lens flares, bloom lighting and film grain effects in everyday life, which is probably another reason why us nerds are so accepting of seeing the world as a movie. These settings can be temporarily toggled off with the use of a small amount of detergent and a soft cloth, but tend to return to the defaults over time.)
What the human eye does have in spades, though, is dynamic depth of field. Anything outside of the centre of the field of view is out of focus and therefore appears blurred (and also in black and white, but let’s pretend we don’t know that). Humans generally focus on whatever is in the centre of their visual field, even when the thing they are actually attending to is somewhere else (hence when you watch something out of the corner of your eye, it’s still blurry). Because depth of field effects weren’t at all viable on early graphics hardware, a lot of people have got used to everything in a scene having the same sharpness and dislike the addition of depth of field. However, used tastefully, it can work nicely as a framing effect; it’s also pretty handy for hiding lower-resolution assets in the background.
The technique I am going to explain here has a major advantage for my purposes: the whole thing can be done as a post-process on the GPU, meaning that you don’t have to fiddle around with scene graphs or read your depth buffer back for calculations on the CPU.