Seven Pillars of Pretty Code

The essence of pretty code is that one can infer much about the code's structure from a glance, without completely reading it. I call this "visual parsing": discerning the flow and relative importance of code from its shape. Engineering such code requires a certain amount of artifice to transform otherwise working code into working, readable code, taking the extra step to leave visual cues for the reader, not the compiler.

These Pillars of Pretty Code are somewhat intertwined. The first five are formulaic; the last two require intuition. Just about all of them are evidenced in jam/make.c. This is a C example, but these practices can be applied to just about any high level programming language.

 

Blend In

Code changes should blend in with the original style. It should not be possible to discern previous changes to a file without seeing the previous revisions. Nothing obscures the essential visual cues more than a shift in style.

This practice should be applied as widely as possible: absolutely within functions, generally within a file, and if you're lucky across the system.

When you are presented with really ugly or neglected code, and can't infer anything about its structure from a glance, you may have to consider reformatting it wholesale. The deep understanding you gain will then be available to all subsequent readers.

make.c is on revision 44, with no major rewrites.

 

Bookish

Keep columns narrow. Just as with books and magazines, code should be narrow to focus the gaze. As I mention in "Overcome Indentation," the left edge of the code holds the structure and the right side holds the detail, and big long lines mix zones of structure and detail, confusing the reader.

There are many remedies for long lines: use shorter names (see "Declutter"); line up multiple function arguments one per line (see "Make Alike Look Alike"); and just plain streamline logic (see "Overcome Indentation").

As a rule of thumb, 80 columns fits everywhere, though admittedly it is not physically possible to format some code (such as wide tables) within this strict limit.
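
As a rough illustration of the first two remedies, the sketch below shows the same function twice: once with long names on long lines, and once with short names and one argument per line. Both functions and all of their names are hypothetical, invented for this example; neither comes from make.c.

```c
#include <stdio.h>

/* Wide: long names push the detail far past the right margin. */
int
format_path_wide( char *output_buffer, size_t output_buffer_length, const char *directory_name, const char *base_file_name, const char *file_extension )
{
    return snprintf( output_buffer, output_buffer_length, "%s/%s.%s", directory_name, base_file_name, file_extension );
}

/* Narrow: short names, and one parameter per line. */
int
format_path(
    char *buf,
    size_t len,
    const char *dir,
    const char *base,
    const char *ext )
{
    return snprintf( buf, len, "%s/%s.%s", dir, base, ext );
}
```

The two versions behave identically; only the second fits comfortably within 80 columns.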

make.c uses both short variable names and a strong hand on indentation to keep itself narrow.

 

Disentangle Code Blocks

Break code into logical blocks within functions, and disentangle the purpose of separate blocks, so that each does a single thing or single kind of thing. A reader can only avoid a total reading if a cursory inspection can reveal the whole block's nature.

One approach applies when a function is actually a series of mini-functions: each mini-function is a block, and should be fairly self-contained. That is, information passed from block to block should be carefully considered.

Another approach applies when a function is a single large operation. In this case separate blocks could be organized by type of activity: e.g. initializing variables, checking parameters, computing results, returning results, and printing debug output.

This practice is applied recursively for subblocks within large blocks (like big while loops).
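
A minimal sketch of the second approach, a single operation whose blocks are organized by type of activity, might look like the following. The function mean_positive() and its behavior are assumptions made up for illustration; it is not taken from make.c.

```c
/*
 * Hypothetical example: average the positive entries of v[0..n-1].
 * Returns the number of entries averaged, or -1 on bad arguments.
 */
int
mean_positive( const double *v, int n, double *mean )
{
    int i;
    int used;
    double sum;

    /* Check parameters. */

    if( !v || !mean || n < 0 )
        return -1;

    /* Initialize. */

    used = 0;
    sum = 0;
    *mean = 0;

    /* Compute: accumulate only the positive entries. */

    for( i = 0; i < n; i++ )
    {
        if( v[i] <= 0 )
            continue;

        sum += v[i];
        used++;
    }

    /* Return results. */

    if( used )
        *mean = sum / used;

    return used;
}
```

Each block does one kind of thing, so a reader hunting for, say, the argument checks can skip the other three blocks entirely.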

make.c is a hybrid: it separates out a block of debugging/tracing, and is otherwise a series of mini-functions, with each block's purpose segregated.

 

Comment Code Blocks

Set off code blocks with whitespace and comments that describe each block. Sometimes large blocks (with multiline comments) may embed small blocks (with single line comments).

Comments should rephrase what happens in the block, not be a literal translation into English. That way, even if your code is inscrutable and your comments gibberish, the reader can at least attempt to triangulate on the actual purpose.

Big comments are needed for subtle or problematic code blocks, not necessarily big code blocks.
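
The sketch below tries to show comments that rephrase rather than translate. A literal comment on the first block would read "find '#' and store a zero there"; the comment used instead says what the block is for. The function strip_comment(), the '#' convention, and the "Step" numbering are all assumptions for illustration only.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical example: reduce a config line, in place, to its data.
 * Returns the length of what remains.
 */
size_t
strip_comment( char *line )
{
    char *p;
    size_t n;

    /* Step 1: anything after '#' is commentary, not data. */

    if( ( p = strchr( line, '#' ) ) )
        *p = 0;

    /* Step 2: trailing blanks would otherwise leak into the value. */

    n = strlen( line );

    while( n && isspace( (unsigned char)line[ n - 1 ] ) )
        line[ --n ] = 0;

    return n;
}
```

Blank lines set each block off from its neighbors, and the comment above each block gives the reader a second, independent statement of intent to check the code against.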

Historically, my code has a ratio of 15% blank lines and 25% comment lines.

make.c goes so far as to number the blocks and subblocks for easy identification.

 

Declutter

Reduce, reduce, reduce. Remove anything that will distract the reader.

Use short names (like i, x) for variables with a short, local scope or ubiquitous names. Use medium length for member names. Use longer names only for global scope (like distant or OS interfaces). Generally: the tighter the scope, the shorter the name. Long, descriptive names may help the first time reader but hinder him every time thereafter.

Eliminate superfluous syntactic sugar (like '!= 0', needless casts, and heavy parenthesizing). Such stuff may help educate a novice programmer, but is unneeded by anyone doing serious debugging and a hindrance to someone trying to get the big (or medium) picture.
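
To make the point concrete, here is a sketch of one small loop written twice: first with long names, explicit comparisons, a needless cast, and extra parentheses, then decluttered. The struct and both functions are hypothetical, invented for this example.

```c
#include <stddef.h>

struct node {                   /* hypothetical list node */
    struct node *next;
    int flags;
};

/* Cluttered: long local names, '!= NULL', '!= 0', cast, parentheses. */
int
count_flagged_verbose( struct node *list )
{
    int flagged_node_count = 0;
    struct node *current_node;

    for( current_node = list;
         current_node != NULL;
         current_node = current_node->next )
    {
        if( ( ( current_node->flags ) != 0 ) )
            flagged_node_count = ( flagged_node_count + (int)1 );
    }

    return flagged_node_count;
}

/* Decluttered: same behavior, short names for a short scope. */
int
count_flagged( struct node *list )
{
    int n = 0;
    struct node *p;

    for( p = list; p; p = p->next )
        if( p->flags )
            n++;

    return n;
}
```

Both versions return the same count; the second simply leaves less for the eye to step over.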

Drop 'ifdef notdef' and any other dead code altogether. It's hard enough reading live code. SCM systems hold old code.

make.c uses short names almost exclusively, has just about no syntactic sugar, and has no dead code.

 

Make Alike Look Alike

Two or more pieces of code that do the same or similar thing should be made to look the same. Nothing speeds the reader along better than seeing a pattern.

Further, these similar looking pieces of code should be lined up one after the other. Such grouping reduces the number of entities the reader has to grasp, a critical approach to simplifying the apparent complexity of code.

This practice is best used in conjunction with "Disentangle Code Blocks": a separated code block, composed of a pattern of lines with a single purpose, is a simple entity for the reader to grasp.

Unfortunately, this practice must be applied everywhere and requires finesse. Fortunately, it rarely affects the generated code. Examples help:

  • Initialize variables together.
  • Consistently use 'this' (or don't).
  • Line up parameters on a long function call.
  • Consistently use {} around if/else clauses: either all blocks have them, or none do.
  • Put the { of an if/for/while on its own line (because the closing } is).
  • Break apart conditionals at the &&'s or ||'s and align them.
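
Two of the items above, initializing variables together and breaking a conditional at its &&'s, can be sketched as follows. The struct rect type and rect_contains() are hypothetical names chosen for this illustration.

```c
struct rect {                   /* hypothetical rectangle type */
    int x, y, w, h;
};

/* Returns nonzero if point (px, py) lies inside r. */
int
rect_contains( const struct rect *r, int px, int py )
{
    /* The four edges are computed together and lined up. */

    int left   = r->x;
    int right  = r->x + r->w;
    int top    = r->y;
    int bottom = r->y + r->h;

    /* One test per edge, broken at the &&'s and aligned. */

    return px >= left
        && px <  right
        && py >= top
        && py <  bottom;
}
```

Because the four initializations and the four tests each share a shape, the reader can take in each group as a single entity rather than as four separate lines.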

make.c's "4d" block is an elaborate example of how many lines of code can be made to look like a single entity.

 

Overcome Indentation

The left edge of the code defines its structure, while the right side holds the detail. You must fight indentation to safeguard this property. Code which moves too quickly from left to right (and back again) mixes major control flow with minor detail.

Forcibly align the main flow of control down the left side, with one level of indentation for if/while/for/do/switch statements. Use break, continue, return, even 'goto' to coerce the code into left-side alignment. Rearrange conditionals so that the block with the quickest exit comes first, and then return (or break, or continue) so that the other leg can continue at the same indentation level.
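
As a sketch of this rearrangement, the two functions below count the strings in an array that are at least min characters long. The first nests every condition; the second puts the quickest exits first and uses return and continue to keep the main flow at the left. Both functions are hypothetical and written only for this illustration.

```c
#include <string.h>

/* Nested: the real work ends up four levels deep. */
int
count_long_nested( const char **v, int n, size_t min )
{
    int i, hits = 0;

    if( v )
    {
        for( i = 0; i < n; i++ )
        {
            if( v[i] )
            {
                if( strlen( v[i] ) >= min )
                    hits++;
            }
        }
    }

    return hits;
}

/* Flattened: quick exits first, main flow down the left side. */
int
count_long( const char **v, int n, size_t min )
{
    int i, hits = 0;

    if( !v )
        return 0;

    for( i = 0; i < n; i++ )
    {
        if( !v[i] )
            continue;

        if( strlen( v[i] ) >= min )
            hits++;
    }

    return hits;
}
```

The two functions return the same result for the same input; in the second, the indentation that remains marks a genuine step from structure into detail.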

Real code of course requires substantial subblocks, necessarily indented, and these subblocks will then have their own indentation battles to be fought. You must ensure that you are indenting only to move from structure to detail, and not because of an artifact of the programming language.

This is the most difficult of these practices, as it requires the most artifice and can often influence the implementation of individual functions.

make.c rarely goes more than two levels of indentation, and this is by no means an accident.

 

These Seven Pillars of Pretty Code aren't the last word, or even the first word, on good code. But among good code they are, I believe, what distinguishes the readable, pretty code from the rest.

 

Additional Reading

In their chapter, "Code in Motion," from Beautiful Code: Leading Programmers Explain How They Think (©2007 O'Reilly), Laura Wingerd and Christopher Seiwald discuss source code that survives and thrives in an evolving, long-lived software product. The real-world case they cite is that of "DiffMerge," a component of the Perforce SCM System.