An Anthropology of Clean Code
The cultural origins of the "clean" code metaphor and a half-review of Mary Douglas's Purity and Danger
Swallow. Now, spit in a glass, and swallow that.
Why is doing the second thing gross? Sure, sometimes we rightfully care about germs.
But in our case, it’s the same saliva in both scenarios. Once we bracket off the pathogenic notion of uncleanliness, what else drives this feeling? Mary Douglas's Purity and Danger contends that what’s left is the conception of dirt as “matter out of place”.
Ketchup in a bottle, great. Ketchup on your dress, dirty.
Blood in body, fine. Blood out of body, dirty.
Shoes in closet, normal. Shoes on counter, dirty.
On this view, if an object is dirty we know at least two things: that there is a system for classifying and placing objects, and that the dirty object is in the wrong place in that system.
Programmers, more than most, depend on systems of classification. Writing software is writing abstractions, reinterpreting binary data into complex mathematical structures so they can be safely operated on.
Incidentally, programmers are also obsessed with cleanliness:
Tidy up that interface. Have you read Clean Code? This code is gross. Clean up that PR. Run a code linter. There are still bugs in this. I'm missing some plumbing code. This is a code smell. Referentially transparent macros are hygienic [1]. This is a big ball of mud. That cast is dirty. Haskell is pure. Run that through a taint checker. Code quality is like a dirty bread factory. Protect your computer from viruses. Also worms. Don't let unsanitized input near your database.
Is this a coincidence?
As individuals, humans are schematizers. Newborn babies receive a chaos of sensory inputs, a "blooming, buzzing confusion", and build from this a conceptual world where raw perceptual data maps to objects with defined lines and permanence. As experience accrues, we understand more complex objects and build more complex rules. These rules ossify so that when an ambiguous object is discovered we tend to ignore or distort it. Only when the incongruence is unavoidable do we undertake the large task of rebuilding our classification system to accommodate it.
In a similar way, groups of humans create agreed-upon categories to understand the physical and social world. And because the world is difficult to understand, any attempt to classify the world is imprecise and ambiguity-laden. Often, attempts end up looking like the Celestial Emporium of Benevolent Knowledge [2]:
In its remote pages it is written that the animals are divided into:
(a) belonging to the emperor,
(d) sucking pigs,
(g) stray dogs,
(h) included in the present classification,
(k) drawn with a very fine camelhair brush,
(l) et cetera,
(m) having just broken the water pitcher,
(n) that from a long way off look like flies.
Compared to the individual level, ambiguity at the cultural level is more difficult to resolve:
A human mind can ignore a perceptual anomaly; a human culture has no analogous power.
Confronted with a large enough anomaly, an individual may (begrudgingly) revise her categorizations. This could be impossible for a culture.
So a society must provide some mechanism for dealing with anomaly if it is to avoid being constantly destroyed and remade. Rather than ignore them, societies use the rules of pollution to avoid anomalies if they can, and ritually cleanse them if they must.
Purity and Danger argues that this is the essential component of culture. Differences between human societies can be reduced to how ambiguity arrives in their cultural order and the details of how this ambiguity is handled.
On occasion, these ambiguities can be redefined by fiat, in what can only be described as a brutal C-style cast:
when a monstrous birth occurs, the defining lines between humans and animals may be threatened. If a monstrous birth can be labelled an event of a peculiar kind the categories can be restored. So the Nuer treat monstrous births as baby hippopotamuses, accidentally born to humans and, with this labelling, the appropriate action is clear. They gently lay them in the river where they belong.
Or they can be thrown out, like this judicious replay attack prevention:
in some West African tribes the rule that twins should be killed at birth eliminates a social anomaly, if it is held that two humans could not be born from the same womb at the same time.
But the usual mechanism is the concept of cleanliness. Our behavior towards dirt "condemns any object or idea that is likely to confuse or contradict our cherished classifications." For the most part, cleanliness is a matter of avoidance, attributing danger to system-threatening ambiguity.
Whether it's the class structure:
A Havik, working with his untouchable servant in his garden, may become severely defiled by touching a rope or bamboo at the same time as the servant.
Or taxonomy (on unclean meat in Leviticus):
Cloven-hoofed, cud-chewing ungulates are the model of the proper kind of food for a pastoralist. If they must eat wild game, they can eat wild game that shares these distinctive characters and is therefore of the same general species. [...] this failure to conform to the two necessary criteria for defining cattle is the only reason given in the Old Testament for avoiding the pig; nothing whatever is said about its dirty scavenging habits. [...] in general the underlying principle of cleanness in animals is that they shall conform fully to their class. Those species are unclean which are imperfect members of their class, or whose class itself confounds the general scheme of the world.
Or nationality (on the Mae Enga, whose men marry women from an enemy clan):
The Enga belief about sex pollution suggests that sexual relations take on the character of a conflict between enemies in which the man sees himself as endangered by his sexual partner, the intrusive member of the enemy clan. There is a strongly held belief that contacts with women weaken male strength. So preoccupied are they with avoiding female contact that the fear of sexual contamination effectively reduces the amount of commerce between the sexes. [...] Above all they fear menstrual blood:
‘They believe that contact with it or with a menstruating woman will, in the absence of appropriate counter-magic, sicken a man and cause persistent vomiting, “kill” his blood so that it turns black, corrupt his vital juices so that his skin darkens and hangs in folds as his flesh wastes, permanently dull his wits, and eventually lead to a slow decline and death.’
Time and again we find the rules of pollution are there to sharpen the lines between categories.
Chaos, for the programmer, is bits. Imagine if your brain never built up the concepts of depth and objects, and your visual field was just points of colored light with no boundaries or lines. This is how it feels to read x86 assembly.
Software must include rules that constrain what patterns of bits can be operated on, and name these patterns so they can be used. Bits that look like this are a request, bits in this location are an integer. This conceptual structure can be constructed from that conceptual structure, etc. Soon, if she's lucky, the programmer hardly thinks about bits at all.
Clean code categorizes chaos neatly. It is abstract enough that there is no disorder, but not so over-fitted that new cases can't be handled. Things are named so that the reader understands what everything does. It avoids nulls, that formless spectre that may possess any unsuspecting type. It encapsulates, enforcing separation between distinct objects. Its physical structure must also be organized: slipping a configuration file into the place where utilities go is dirty.
Perhaps cleanliness was just one of many linguistic metaphors we could have chosen. Why couldn't the default analogy have been dance (graceful, flowing, ungainly, uncoordinated), or ethics (virtuous, righteous, wicked, immoral)? Because the survival of a program depends on its structure, and cleanliness is what we use to preserve it. Clean code isn't beautiful; it's organized, safe. And so, dirty code is dangerous. It could mean the process crashing, the software becoming unmaintainable, or the company folding.
Humans use cleanliness both to avoid a direct pathogenic threat (germs) and a symbolic threat (ambiguity) to their system of classification. Both of these are real; ambiguity is a threat to societal function. That the symbolic threat is only psychological doesn't make it any less threatening. For programmers (who I guess are also humans), all threats are symbolic, since programs are abstract to begin with. Still, some threats are only symbolic in the sense that their effect is psychological. This is why threats to an object-oriented tribe member might not match those to a functional tribe member — their systems of categorization are different. On Douglas's view, we should expect the notions of cleanliness between different cultures of programmers to reflect their systems of classification:
Functional Freya categorizes items in the world based on inputs and outputs. She puts things that take an int and return a string in one place, she puts things that have a certain type of effect in another. If Freya wants to know how to concatenate two strings, she looks at the class of things that take two strings and return one. A function that takes an input, returns an output, and modifies some memory not represented in that output, is not acceptable to Freya's system. It's unclassifiable, impure.
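To make Freya's complaint concrete, here is a minimal Python sketch. The names `concat`, `concat_and_log`, and `call_log` are made up for illustration. Both functions have the same signature, so they land in the same drawer of Freya's filing system, but only one of them is fully described by that signature.

```python
from typing import List

# Hypothetical module-level state: the "memory not represented in the output".
call_log: List[str] = []


def concat(a: str, b: str) -> str:
    """Pure: the result depends only on the inputs, and nothing else changes."""
    return a + b


def concat_and_log(a: str, b: str) -> str:
    """Impure: same signature, but it also mutates state the signature never mentions."""
    call_log.append(a)
    return a + b
```

From the outside the two are indistinguishable by type, which is exactly why the second one is "matter out of place" for Freya.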
Object-oriented Otto, on the other hand, slices the world into hierarchies of named objects. He decides what boundaries are most useful to him: This is a dog, that's a cat, both are animals. If Otto wants to concatenate two strings, he looks at the category of strings and sees what he's allowed to do to them. If he makes some method call that manipulates some memory, it's no big deal. His dog is still fundamentally a dog, no schematic transgression has occurred. What bugs Otto is objects that are not what they say. Like a god object that does so much it cannot possibly be classified. Or an object that claims to support all IO but only supports reading. Objects like these make Otto uneasy: he cannot accomplish anything if he cannot trust the category of an object.
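Otto's unease can be sketched the same way. The classes below (`Stream`, `ReadOnlyFile`) and the helper `copy` are hypothetical, invented for this illustration. The object declares itself a full member of the IO category, then refuses half of what the category promises.

```python
import io
from abc import ABC, abstractmethod


class Stream(ABC):
    """Hypothetical category: anything that can be both read and written."""

    @abstractmethod
    def read(self) -> str: ...

    @abstractmethod
    def write(self, data: str) -> None: ...


class ReadOnlyFile(Stream):
    """Claims to be a Stream, but only honours the reading half of the contract."""

    def __init__(self, contents: str) -> None:
        self._contents = contents

    def read(self) -> str:
        return self._contents

    def write(self, data: str) -> None:
        # The category said this was allowed; the object disagrees at runtime.
        raise io.UnsupportedOperation("not writable")


def copy(src: Stream, dst: Stream) -> None:
    """Trusts the category: any Stream should be a valid destination."""
    dst.write(src.read())
```

Calling `copy` with a `ReadOnlyFile` as the destination type-checks against the category and then fails at runtime, which is the schematic transgression Otto cares about.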
Both Freya and Otto have cultural rules of cleanliness that harden the lines of their systems of classification. But since their systems of classification differ, the things that they think are ambiguous, and thus dirty, differ too.
How does this compare to societal notions of cleanliness? Pollution thinking reinforces societal structures only symbolically, and often unconsciously. We may feel that Freya and Otto's cleanliness inclinations are instead based on real (er, digital) world considerations, and are the result of reasoned reflection. This is misguided on both points.
One: symbolic action in a society is no less impactful than symbolic action in programs. The object of action in a ritual is the psyches of everyone participating in it. Take Douglas's description of a shamanistic cure for a patient with both psychological and physical ailments:
The symptoms were palpitations, severe pain in the back and disabling weakness. The patient was also convinced that the other villagers were against him and withdrew completely from social life. Thus there was a mixture of physical and psychological disturbance. The doctor proceeded by finding out everything about the past history of the village, conducting seances in which everyone was encouraged to discuss their grudges against the patient, while he aired his grievances against them. Finally the blood-cupping treatment dramatically involved the whole village in a crisis of expectation that burst in the excitement of the extraction of the tooth from the bleeding, fainting patient. Joyfully they congratulated him on his recovery and themselves on their part in it. They had reason for joy since the long treatment had uncovered the main sources of tension in the village.
Just as writing code that is pleasant to read has a real psychological effect on its readers, so too can symbolic cultural actions have real psychological impacts.
Two: the structural motivation for cleanliness in a culture is often opaque (the Israelite who believes rock badgers unclean does not think it is because they are taxonomically confusing), but we think the programmer's cleanliness feelings are transparent. This gives the programmer too much credit: though our programming preferences sometimes reflect our own reasoning, best practices most often come from someone else. This is a feature, not a bug. It's faster to learn some rules about what's clean and why than to make and learn from every possible mistake. But just like pollution thinking, the gap between the reasoning and the learned notion of cleanliness can widen. Take the overzealous DRY-radical. Or, like, I fully believe it's okay to use a goto to break out of a nested for-loop. But do this in a PR and prepare for a ceaseless cacophony of cries from engineers who've never experienced a goto-ridden codebase first-hand.
The programmer wants to write maintainable code that functions well. It is difficult to evaluate programming decisions on this criterion, though. It's akin to moral judgment: the morality of an action is rarely clear-cut, and often involves conflicting moral principles. Unequivocal rules of cleanliness are easier to apply. In human cultures these can be used to supplement the moral code, simplifying cases where reaching a moral judgment is impossible. Likewise, when it's hard to determine whether a change will be to the benefit of a program, the harder rules of clean code can be used as an approximation.
Not everything that is immoral has a pollution belief. When the moral judgment is clear, there's no need for the system of morals to be augmented with pollution rules. Similarly, if some code is just wrong (say, an off-by-one error), we don't bring cleanliness into the discussion. It's already clear the code will fail at its intended purpose.
For the first time, you're working on the low-level serialization code in your application — it's a bit less legible than the rest of the codebase. Confused and uneasy, you call over Gandalf the Greybeard, the code's terrifying but wise owner, and mumble: "I've got the raw bytes, how can I deserialize them?" Gandalf whisks away your keyboard and hammers at it, jumping to hard-coded offsets in the buffer, adding checks for magic numbers and checksums, then, once sure, reaches into the void* and casts out your struct.
You back away, aghast.
Gandalf shoots you a knowing look, and whispers: "reinterpret_cast is serialization" [3].
You nod, quietly promising yourself you'll never pick up a serialization layer ticket again.
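For the curious, the shape of Gandalf's ritual can be sketched in Python, where `struct.unpack` plays the part of the cast. Everything specific here is an assumption made up for illustration: the `Reading` record, the `MAGIC` value, and the wire layout (a little-endian header of magic number plus CRC32, followed by a fixed-size body). It is a sketch of the checks in the story, not anyone's real format.

```python
import struct
import zlib
from typing import NamedTuple

MAGIC = 0xC0DEC0DE              # hypothetical magic number
HEADER = struct.Struct("<II")   # assumed layout: magic, CRC32 of the body
BODY = struct.Struct("<Hd")     # assumed layout: uint16 id, float64 value


class Reading(NamedTuple):
    """Hypothetical record standing in for 'your struct'."""
    sensor_id: int
    value: float


def serialize(r: Reading) -> bytes:
    body = BODY.pack(r.sensor_id, r.value)
    return HEADER.pack(MAGIC, zlib.crc32(body)) + body


def deserialize(buf: bytes) -> Reading:
    # Check the length first so the fixed offsets below stay inside the buffer.
    if len(buf) != HEADER.size + BODY.size:
        raise ValueError("unexpected buffer length")
    magic, checksum = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    body = buf[HEADER.size:]
    if zlib.crc32(body) != checksum:
        raise ValueError("checksum mismatch")
    # Only once sure: reinterpret the raw bytes as the structure.
    return Reading(*BODY.unpack(body))
```

The checks are the counter-magic: they exist so that the one moment of reinterpreting formless bytes happens only after the bytes have proven they belong to the category.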
Programming that blurs the lines between categories, that works without the support of structure, is witchcraft. When performed by the uninitiated, it's considered only dangerous. But there's a power here too — some of our most respected, and feared, code sheds abstraction and drops into disorder.
Energy to command and special powers of healing come to those who can abandon rational control for a time. Sometimes an Andaman Islander leaves his band and wanders in the forest like a madman. When he returns to his senses and to human society he has gained occult power of healing
That rare truly heinous bug, something in the compiler, or the operating system, or (god forbid) a hardware issue, is a rite of passage. The programmer drops into a frenzy, dumping cores and inspecting the raw contents of RAM. They come to doubt even the atoms of programming; they start double-checking the results of every float operation and whether assignments are actually happening. Inevitably, the programmer triumphs and returns to the sane world of structured programming, but they've changed: they know the order they rely on is only a single obscure gcc bug away from descending into total chaos.
Power, and danger, lies in the margins.
Take, for example, the unborn child. Its present position is ambiguous, its future equally. For no one can say what sex it will have or whether it will survive the hazards of infancy. It is often treated as both vulnerable and dangerous. The Lele regard the unborn child and its mother as in constant danger, but they also credit the unborn child with capricious ill-will which makes it a danger to others. When pregnant, a Lele woman tries to be considerate about not approaching sick persons lest the proximity of the child in her womb causes coughing or fever to increase.
Among the Nyakyusa a similar belief is recorded [...]
‘The child in the belly . . . is like a witch; it will damage food like witchcraft; beer is spoiled and tastes nasty, food does not grow, the smith’s iron is not easily worked, the milk is not good.’
The Lele are preoccupied with animal taxonomy and have built a complex structure of dietary avoidance rules. But because of this, power can be found in the ambiguous pangolin:
Then comes the inner cult of all their ritual life, in which the initiates of the pangolin, immune to dangers that would kill uninitiated men, approach, hold, kill and eat the animal which in its own existence combines all the elements which Lele culture keeps apart. [...] they confront ambiguity in an extreme and concentrated form. They dare to grasp the pangolin and put it to ritual use, proclaiming that this has more power than any other rites.
Programmers too respect and fear the power of code that transgresses order. It's understood that delving underneath abstractions, into the arcane, may allow the initiated programmer to perform feats that would be otherwise slow or impossible.
I previously discussed my formula for enjoyable non-fiction:
I. Pick an aesthetic or moral principle that I'm mostly on board with
II. List interesting historical anecdotes. A lot of them.
On these criteria, Purity and Danger succeeds.
On the first, "Dirt is matter out of place" is a surprisingly useful insight. More generally, Douglas asserts that "it is part of our human condition to long for hard lines and clear concepts". Before Purity, if I were to attempt to understand a mysterious motivation, I'd consider emotion, habit, or maybe signaling. Now, I ask if the action could be driven by ambiguity-aversion. I found this to be a fruitful perspective, applicable to much more than just programming culture.
On the second, this book has a lot of neat anecdotes. Like this advice for Mountain Arapesh men, who are wary of marrying women outside of the clan:
If he marries one, he should not marry her hastily but permit her to remain about the house for several months growing accustomed to him, cooling down the possible passion of slight acquaintance and strangeness. Then he may copulate with her, and watch. Do his yams prosper? Does he find game when he goes hunting? If so, all is well. If not, let him abstain from relationship with this dangerous, oversexed woman still many more moons, lest the part of his potency, his own physical strength, the ability to feed others, which he most cherishes, should be permanently injured.
Or this myth among the Coorgs, on spit:
A Goddess in every trial of strength or cunning defeated her two brothers. Since future precedence depended on the outcome of these contests, they decided to defeat her by a ruse. She was tricked into taking out of her mouth the betel that she was chewing to see if it was redder than theirs and into popping it back again. Once she had realised she had eaten something which had once been in her own mouth and was therefore defiled by saliva, though she wept and bewailed she accepted the full justice of her downfall. The mistake cancelled all her previous victories, and her brothers’ eternal precedence over her was established as of right.
My largest misgiving is that Purity is pretty reductive. You'd think that a book that is entirely about humans rigidly categorizing intrinsically ambiguous reality wouldn't... rigidly categorize an intrinsically ambiguous reality? But Purity reduces almost all cultural motivations to structure-preservation. And, alright, I'm convinced structure-preservation is more important than I thought. But I'm not convinced that categorization is like the fundamental human drive or anything.
I think this relates to my other problem with this book: I'd estimate about 50% of Purity is dedicated to dunking on other anthropologists. Douglas aims at those who argue for a fundamental (usually condescending) distinction between primitive and modern society. It sounds like these guys needed to get dunked on, I guess, but it's boring to read a half-century later when they've already been thoroughly dunked on. Broadly, Douglas argues that primitive pollution beliefs are highly symbolic and differ from our own by 'only a matter of detail':
We moderns operate in many different fields of symbolic action. For the Bushman, Dinka and many primitive cultures the field of symbolic action is one. The unity which they create by their separating and tidying is not just a little home, but a total universe in which all experience is ordered. Both we and the Bushmen justify our pollution avoidances by fear of danger. They believe that if a man sits on the female side his male virility will be weakened. We fear pathogenicity transmitted through micro-organisms. Often our justification of our own avoidances through hygiene is sheer fantasy. The difference between us is not that our behaviour is grounded on science and theirs on symbolism. Our behaviour also carries symbolic meaning. The real difference is that we do not bring forward from one context to the next the same set of ever more powerful symbols: our experience is fragmented. Our rituals create a lot of little sub-worlds, unrelated. Their rituals create one single, symbolically consistent universe.
So of course if Douglas is claiming that all societies are the same in some way, this lens might end up a bit reductive.
On reflection, a good name for this genre might be 'hedgehog' non-fiction, after Isaiah Berlin's hedgehog/fox distinction: "A fox knows many things, but a hedgehog knows one big thing." Purity and Danger has one big idea. But this is hardly a criticism; hedgehog books are more fun to read.
[1] Fun fact: this etymology precedes its application in Lisps. Probably comes from the "hygiene condition" in lambda calculus.