A defense of machine-information metaphors in biology

Metaphors in biology

If you pick up any biology textbook, you will likely notice that many of the concepts therein are explained primarily by analogy. Sub-cellular organelles and biomolecular complexes are routinely described as “factories”, while the genome may be described as a “blueprint”, “recipe” or “code” for life, with informational properties. The “central dogma of molecular biology” is entirely structured as an analogy: DNA is first transcribed into mRNA and then translated into polypeptide chains. It is evident that analogies have an appeal, both to educators and to frontline scientists grasping for conceptual footholds as they break new ground. I have always found this to be a fairly reasonable and innocent approach. My experience, moving from high school biology through undergrad and graduate school, has been that the process of learning is practically synonymous with picking up didactically useful oversimplifications that, with greater refinement, one eventually learns not to take too seriously.

I think that analogies are a form of abstraction, and abstraction is at the core of scientific explanation. So I believe that metaphors have the potential to be very useful. Before saying more about metaphors, however, I want to mention two salient features of scientific abstraction which I think will be useful to bear in mind. For one, all (descriptive) abstractions inevitably oversimplify the underlying phenomena they seek to capture. This is because there is a trade-off between explanatory power and intelligibility, and good abstractions are those that approach some kind of Pareto optimality with respect to this trade-off. The second point is that scientific abstractions are constantly evolving: better and more intelligible abstractions are, I think, a target of the scientific enterprise as it proceeds; they are not fixed totems to be fetishised or despised.

The debate about metaphors

Late in my undergraduate education, I became interested in some work in philosophy of science (in particular, the vexed debate over “reductionism” in biology). This exposed me to a parallel universe of arguments, taking place in philosophy journals and seminars, over what it is that biologists do and, even more surprisingly perhaps, what they should or should not be doing. At the outset, I should make clear that I take no issue with this in principle; I’m not about to advocate for scientism. However, some of the debates seemed strange, at least to me, coming from my own perspective as a student of biology. One thing that really took me aback, and which is now the basis for this blog, was all of the work aimed at articulating a critique of the use of metaphors in science, and in biology in particular. Many articles have been written on this subject. The best and most informed among these, in my view, are those of the brilliant biologist-turned-philosopher Massimo Pigliucci (this article in particular was a major proximal impetus for writing this blog).

The most problematic metaphors seem to be those that relate to genetics. It appears very troubling to some philosophers that anyone should consider a “blueprint” or even “code” to be an appropriate analogy for the genome. At some level, this seems fair. After all, if we take “blueprint” specifically, we cannot just look at an arbitrary genome and form a picture of what the developed organism will look like in the way we (roughly) can from a real blueprint. In fact, we can’t even come close. Moreover, with some rare exceptions (e.g., the Drosophila Hox genes), there is no clear spatial mapping between sub-genomic units and visible adult tissues/organs. “Code” analogies might likewise suggest a kind of determinism and predictability of organismal phenotypes from genotypes that simply does not obtain in the real world. There is a (maybe justified) worry that such inaccurate metaphors contribute to the spread of misinformation and reinforce problematic “genetic determinist” ideas. For my part, however, I do not think there is much to be gained from these sorts of arguments. These are essentially empirical questions about which techniques are effective in biology education. To my knowledge, no one in the “critique” camp has taken up a serious empirical program to show that “blueprint” or “code” metaphors are any worse than some alternative approach from a pedagogical standpoint. Unless a substantive critique along these lines arises, I believe it is better to stay in the conceptual realm, and there I think there are several points worth making. Indeed, while critiques of these sorts of metaphors abound, I have not seen many defenses of them, so I present a small one here.

It goes without saying, in line with my points above about abstraction, that all analogies are imperfect and harbour disanalogies. Every biology student learns the “lock and key” model of enzyme action in high school as part of their first forays into biochemistry. In this model, enzyme active sites fit their substrates almost perfectly in order to mediate chemical reactions, in the same way that the perfect complementarity of a key to its lock enables the “door unlocking” reaction. This is a wildly inaccurate and discarded idea, but the average ~14-year-old student may lack the necessary background knowledge to understand the theories of transition state stabilisation and conformational change widely accepted in modern enzymology. The lock and key model is crude, but it captures something. It is a useful way for learners to start thinking about proteins as having internal sub-structure and substrate selectivity, and is a building block towards the deeper understanding that will come later in their education.

Criteria for good and bad metaphors

The best way to evaluate an analogy, in my view, is to consider whether it usefully captures important features of the underlying phenomenon, given our current¹ understanding of it. What is the argument that “code” and “blueprints” fail so spectacularly in this regard? I cannot do this full justice in a short blog, but I take this from Søren Hough to be a pithy and fairly typical presentation:

Describing DNA as the “code of life” is a common trope, and one that we should disabuse ourselves of. Though our psychological, social, and biological development are certainly affected by genetics, DNA alone cannot capture the whole picture.

The argument, then, is simply that, presumably unlike other types of “code”, DNA cannot capture the “whole picture” of “psychological, social, and biological development”. To my mind, it is hard to think of any putative code-for-X that meets this criterion vis-à-vis its respective X. Does the code for a computer program contain all of the information required to capture how it is compiled, executed, and used by end users? The source code of a computer program deeply underspecifies all of these things, often to the despair of software engineers (this is part of the reason for the “it works on my machine” meme). When I experimented with building a Python application I developed during my PhD for multiple targets (macOS, Windows, and Linux), I found that the GUI styling looked acceptable (but different) in Windows/Linux, and horrid in macOS, and that the app was prone to crashes in Windows that I had never experienced while developing on Linux. So I decided to stick to a Linux target, precisely because the code I wrote with my own hands didn’t sufficiently constrain the runtime behaviour of my app. Even the binary of a compiled application cannot possibly account for all the varieties of downstream use it will be put to and the consequences of such use, and it still requires execution in the correct environment (compare the behaviour of video games on emulator systems with their performance on their native console system). In fact, the existence of user feedback and evolving codebases inclines me to think that actual “code” is far less constraining over the whole lifecycle behaviour of a piece of software than even the more modest proposals for genetic constraints on phenotypes.
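
To make the “it works on my machine” point concrete, here is a minimal sketch in Python. It is not taken from the application described above; the function name and the file names are hypothetical, chosen only for illustration. It relies on the fact that Python's standard library ships two implementations of its path rules, `posixpath` (Linux/macOS conventions) and `ntpath` (Windows conventions), and that `os.path` is simply whichever one matches the machine the program runs on.

```python
import ntpath     # Windows path rules
import posixpath  # Linux/macOS path rules


def config_path(path_module, user_dir, filename):
    """Build a settings path using whichever platform rules are supplied.

    Hypothetical helper: the point is that the body of the function is
    identical in both cases; only the environment differs.
    """
    return path_module.join(user_dir, "settings", filename)


# The same source line, "executed" under two different environments.
linux_result = config_path(posixpath, "home", "app.ini")
windows_result = config_path(ntpath, "home", "app.ini")

print(linux_result)    # home/settings/app.ini
print(windows_result)  # home\settings\app.ini
```

The text of the function does not change between the two calls, yet its result does. This is a toy case, but it is the same kind of underspecification that shows up, at much larger scale, in GUI styling and platform-specific crashes.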

I am not an architect, but I hope I can help myself to saying that much the same obviously holds for blueprints. Blueprints may need to change through a construction project as unanticipated problems are encountered onsite, and even if not, it is quite clear that the actual execution and quality of a construction project depends on many variables outside the blueprint itself (e.g., the handiwork of the particular workers on the job). I would argue that the mapping between, say, a protein-coding gene’s nucleotide sequence and its biochemical properties is much tighter than that between components of a blueprint and their architectural cognates. Full chemical synthesis of functional proteins given only their nucleotide sequence is possible and is highly reproducible. It seems reasonable, too, to guess that there is, say, more in common between cloned animals/plants than between buildings built from the same blueprint by different construction teams on different sites.
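
As a rough illustration of how tight the first step of that mapping is, the sketch below translates a DNA coding sequence into an amino acid sequence using a handful of entries from the standard genetic code. The codon table is deliberately partial, and the “gene” is a made-up toy sequence rather than a real one. The sketch only covers the sequence-to-sequence step; folding and biochemical properties are a further matter that it does not attempt to model.

```python
# Partial standard codon table (DNA coding-strand codons).
# Only the entries needed for the toy sequence below are included.
CODON_TABLE = {
    "ATG": "M",  # methionine (start)
    "GCC": "A",  # alanine
    "AAA": "K",  # lysine
    "TGG": "W",  # tryptophan
    "TAA": "*",  # stop
    "TAG": "*",  # stop
    "TGA": "*",  # stop
}


def translate(dna):
    """Translate a coding sequence codon by codon, halting at the first stop."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


toy_gene = "ATGGCCAAATGGTAA"  # hypothetical sequence, for illustration only
print(translate(toy_gene))    # MAKW
```

Run it a thousand times, on any machine, and the same nucleotide sequence yields the same residues. It is hard to name a component of an architectural blueprint that constrains the finished building this reliably.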

Now, should terms like “blueprint” and “code” be discarded in programming and construction because they impute too much ontological significance to their role in constraining system behaviour? Hardly. So it seems that this is inadequate grounds to reject these metaphors. What they have in common with their biological cognates is not some possibly incoherent “universal explanatory power”, but rather, I take it, a certain kind of internal regularity, composability, transmissibility, and perhaps even fungibility, as well as standing in analogous relations to external systems. The extent to which any of these specific homologies are valid may well be disputed on deeper theoretical grounds, but my sense is that most critics are not doing this. Instead, it seems to me that they are for the most part simply overrating the amount of determinism that is inherent to mechanical/computational systems at relevant scales.

Ulterior motives

In my view, underneath some of the critiques of biological machine-information metaphors is in fact a dispute with a certain kind of scientific approach which these analogies are taken to reify, or “smuggle in”. Although I introduced the critique as coming primarily from philosophers of biology, it also frequently emerges from a group of biologists aligned to developmental systems theory (DST). It is out of scope for me to engage in a dissection of ideas within this school of thought here, but suffice it to say that proponents regard themselves as hewing to a more “holistic” model of biological function than the mainstream, and often express skepticism about various attempts to systematise or schematise biological knowledge in concrete ways (in my opinion, they have much in common with the early vitalists, at least in this regard). If I’m right in my characterisation, we may predict that almost any structured metaphor that does not directly convey the infinite complexity and malleability of biological systems will be subject to critique by DST advocates. Thus, it is not a matter of this metaphor or that metaphor, but rather of the “reductive” attempt to find metaphors whatsoever. I believe theirs is a fundamentally counterproductive perspective, and that would be the case even if it were true. Science hitherto has progressed as a function of our collective ability to develop compact, human-intelligible abstractions over the behaviour of complex systems. The central claims of DST, it seems to me, amount to the claim that this is not feasible with respect to living systems. That could well be right, but if it is, I do not see the alternative proposal for actually understanding anything. I think the prospects for useful understanding of extremely complex systems with few generalisable regularities are very dim indeed, and the DST perspective amounts to the claim that this is precisely the challenge facing biologists.

It nevertheless strikes me that there are some powerful abstractions available to biologists today. Considering living systems as predictable, constrained systems governed by (at least manageably) few common principles, together with functional modularity, has, as a matter of empirical fact, been fruitful so far. This perspective naturally lends itself to machine-information metaphors, considering that machines and information systems have been constructed with precisely these traits in mind and, I think not coincidentally, very often directly inspired by living systems.

Footnotes

¹ The “current” proviso here seems important, since I think we’d want debates over the scientific facts and theories to be a broadly separate domain from arguments about how best to express or teach a particular scientific theory.