March 2, 2011

Rick Regan's Non-deterministic Floating Point Bug


Like many systems software engineers, I am proficient but not expert in floating point arithmetic. I've read the classic texts, such as David Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic", and William Kahan's papers on the IEEE 754 floating point standard.

But for the most part I am content to be a simple user of floating point arithmetic, letting the authors of the standard library routines such as strtod or Double.parseDouble() handle the heavy lifting for me. I know when to use floating point, and I know how to use it effectively and reasonably efficiently.

Still, there's always something more to learn, so I've been enjoying a recent deep dive into floating point arithmetic brought on by Rick Regan's fascinating discovery of a new bug in Java's floating point conversion library. Regan's article clearly and thoroughly explains the bug, and takes you through his analysis.

As is often the case, Regan initially thought he had non-deterministic behavior on his hands, but after experimenting and discussing the problem with others he realized that he was seeing a variation of a problem he'd studied earlier: a double-rounding-on-underflow error.
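For readers who haven't met "double rounding" before, here is a minimal sketch of the general phenomenon in Java. It is not Regan's test case and it does not reproduce the library bug (there is no underflow here). It only shows how rounding a value in two steps can give a different answer than rounding it once. The class name, the method names, and the input value 1 + 2^-24 + 2^-60 are all my own choices for illustration.

```java
import java.math.BigDecimal;

final class DoubleRoundingDemo {
    // Exact value of the illustrative input x = 1 + 2^-24 + 2^-60.
    // new BigDecimal(double) converts a double to decimal exactly.
    static BigDecimal exactInput() {
        return BigDecimal.ONE
                .add(new BigDecimal(Math.scalb(1.0, -24)))
                .add(new BigDecimal(Math.scalb(1.0, -60)));
    }

    // One rounding: choose whichever neighbouring float is closer to x.
    // Only valid for x strictly between 1.0f and the next float up,
    // and it does not implement tie-breaking (x is not a tie).
    static float roundOnce(BigDecimal x) {
        float lo = 1.0f;
        float hi = Math.nextUp(lo); // 1 + 2^-23
        BigDecimal distLo = x.subtract(new BigDecimal(lo)).abs();
        BigDecimal distHi = new BigDecimal(hi).subtract(x).abs();
        return distLo.compareTo(distHi) < 0 ? lo : hi;
    }

    // Two roundings: x -> nearest double -> float.
    static float roundTwice(BigDecimal x) {
        // Assumed nearest double to x; the check below confirms the
        // error is under half a unit in the last place.
        double nearestDouble = 1.0 + Math.scalb(1.0, -24);
        BigDecimal err = x.subtract(new BigDecimal(nearestDouble)).abs();
        BigDecimal halfUlp = new BigDecimal(Math.ulp(nearestDouble) / 2);
        if (err.compareTo(halfUlp) >= 0) {
            throw new IllegalStateException("not the nearest double");
        }
        // The double now sits exactly halfway between two floats, so the
        // cast breaks the tie to the even neighbour and discards the
        // 2^-60 that should have tipped the result upward.
        return (float) nearestDouble;
    }

    public static void main(String[] args) {
        BigDecimal x = exactInput();
        System.out.println("rounded once:  " + roundOnce(x));
        System.out.println("rounded twice: " + roundTwice(x));
    }
}
```

Rounded once, x goes up to the next float above 1.0, because it lies just past the halfway point. Rounded twice, the intermediate double lands exactly on the halfway point, and the second rounding sends it down to 1.0. The same input yields two different results depending on the path, which is easy to mistake for non-determinism.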

Recently, a colleague of mine remarked that

fixing bugs is a prolific source of bugs,

which is true; however, there is a related observation, which is that

finding bugs is a prolific source of bugs.

When you find a bug, it's often true that there are other similar bugs lurking about nearby. Perhaps that same coding pattern was repeated in several places, in slight variations. My next-door cubicle buddy just spent last week tracking down a common "value out of range" problem in a whole family of formatting routines.

Kudos to Rick Regan, not only for finding this bug, and not only for such a clear and thorough explanation of it, but for having that sense of deja vu and realizing that he'd seen that bug before and knew what it was. People often call these instinctive reactions to a bit of code "code smells", and if you aspire to be a great programmer, you should constantly work on training your "code nose", as Regan does.