from Medieval Academy News

Publish or Perish: How It Used to Work for Texts as Well as Authors
by John L. Cisne
Manuscript traditions grow like family trees and can be studied in at least as many ways. So far, however, study of the growth process has concentrated almost exclusively on only one of these: deducing the branching in a manuscript tradition's stemma from inferred slips of the pen, in much the same way that a family tree's branching now can be deduced from inferred slips in the copying of DNA. I described a complementary alternative in "How Science Survived: The 'Demography' of Manuscripts and the Survival of Classic Texts," Science 307 (2005), 1305–7.

The approach itself was pioneered on family trees by Charles Darwin's younger cousin Francis Galton, one of the founders of mathematical statistics. Concerned for the survival of Britain's peerage, Galton posed what came to be known as the Family Name Problem. This can be stated in various ways: given statistical fluctuations in the growth of the family tree, how is the number of a peer's male heirs likely to change with time? How likely is the peer's title to pass eventually to collateral relatives? How long could they expect to wait?

Change a few words and the Family Name Problem becomes the Publish or Perish Problem discussed here: how likely was a text to perish through statistical fluctuation in scribes' "publication" of manuscripts? How are surviving manuscripts now likely to be distributed by age?

The Family Name Problem crops up in many other contexts: the growth of a biological population, the fate of a mutant gene, the contagion of a disease, the detonation of an atomic bomb. The common element is that the maximum rate at which the statistical population can grow is directly proportional to the population's size: the larger the population, the faster it can grow. This is the recipe for exponential growth. My paper points out that it also applies to manuscript traditions. Unbeknownst to me when I wrote it, this had already been pointed out by the late Michael P. Weitzman in "The Evolution of Manuscript Traditions," Journal of the Royal Statistical Society, Series A, 150 (1987), 287–308. Had this young philologist lived, he would have left me with nothing to add and much to read.

Obviously, each of these examples has unique attributes that must be taken into account in building a successful model. To get a better sense of the abstractions, approximations, and simplifications involved, let us concentrate for now on Galton's work rather than my own because everyone has a feeling for what it is to be a twig on a genealogical tree but not on a stemmatic one.

In an instance of lèse-majesté perhaps unsurpassed in science, Galton conceptualized the growth of Britain's great families in such general terms that his model could apply to bacteria. As if inspired by Gilbert and Sullivan's arch-aristocrat Pooh-Bah, who traced his ancestry back to "a protoplasmal primordial atomic globule," Galton supposed that in any given instant a peer has a certain probability λ per unit time of budding off a son and heir, and likewise a certain probability μ per unit time of dying. The statistical population's expected growth rate per capita will be the difference between the birth probability λ and the death probability μ.

Galton's predictions tended to confirm his fears. If death probability μ exceeds birth probability λ, the lineage is doomed. It is expected to decay exponentially in size and to go extinct with probability one. If birth probability λ exceeds death probability μ, on the other hand, the lineage is expected to grow exponentially, but may go extinct anyway. In theory, the population will go extinct with probability μ/λ if it can grow indefinitely large, or with probability one if not. In practice, a potentially viable population is expected either to be extinct within relatively few generations or to have grown so large it likely will survive until doomsday.
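The extinction claim can be checked by simulation. What follows is a minimal sketch in Python, not Galton's calculation or the paper's: the rates 1.0 and 0.5, the population cap of 50, and the function name are assumptions chosen for illustration. It follows a lineage founded by a single individual, one birth or death at a time, and counts how often the lineage dies out. When births outpace deaths, the simulated fraction should land near the ratio of the death probability to the birth probability.

```python
import random

def extinction_fraction(birth, death, trials=10_000, cap=50, seed=0):
    """Estimate how often a lineage founded by one individual dies out."""
    rng = random.Random(seed)
    # Whatever the population size, the next event is a birth with this chance.
    p_birth = birth / (birth + death)
    extinct = 0
    for _ in range(trials):
        n = 1
        # Stop at extinction, or at a cap large enough that the lineage
        # is treated as safe for practical purposes.
        while 0 < n < cap:
            n += 1 if rng.random() < p_birth else -1
        if n == 0:
            extinct += 1
    return extinct / trials

birth, death = 1.0, 0.5  # illustrative rates, not estimates from the paper
estimate = extinction_fraction(birth, death)
print(f"simulated: {estimate:.3f}   death/birth: {death / birth:.3f}")
```

The cap stands in for "so large it likely will survive until doomsday": once a lineage with these rates reaches 50 members, its chance of later extinction is negligible.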

Change a few words and the preceding applies to manuscript traditions.

What Galton discovered has come to be known as the birth-and-death process, one form of the branching process. Though built to describe statistical populations that either explode or fizzle, the birth-and-death model can be recast to describe other statistical processes in which the population can approach a steady state. I adapted one of these, the logistic process, which occurs so widely that it was once held up as a universal law of growth. In population biology, the logistic differential equation has proved so important that one introductory textbook actually pictured it emerging from a cloud, like some figure on the ceiling of the Sistine Chapel.
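For readers who have not met it, the textbook deterministic logistic equation is dN/dt = rN(1 − N/K), where r is the per-capita growth rate and K the ceiling the population approaches. The sketch below evaluates its standard closed-form solution. It is the generic textbook curve, not the stochastic version adapted in the paper, and the values of r, K, and the starting size are hypothetical.

```python
import math

def logistic(t, r, K, n0):
    """Closed-form solution of dN/dt = r*N*(1 - N/K) with N(0) = n0."""
    return K / (1 + (K - n0) / n0 * math.exp(-r * t))

# Hypothetical values: growth rate per unit time, ceiling, starting size.
r, K, n0 = 1.0, 100.0, 1.0

for t in range(0, 13, 2):
    print(f"t = {t:2d}   N = {logistic(t, r, K, n0):6.1f}")
```

Early on the curve is nearly exponential; as N approaches K, growth slows and the curve levels off, which gives the familiar S shape.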

Once a model has been sketched out, the next step in applying the scientific method is to test its predictions against observations to determine whether it could indeed apply to the real world. Galton pioneered many of the standard testing procedures. One useful statistic is the squared standard deviation of the differences between observed and predicted values divided by the total variance, the squared standard deviation of the observed values. This can be written 1 − R², where R² is the coefficient of determination, the fraction of the total variance explained by the model. For the perfect, too-good-to-be-true model (R² = 1), predictions plot on top of the observations along a perfectly straight line. For the utterly hopeless model (R² = 0), points typically form a round, featureless cloud.
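The coefficient of determination is simple to compute. The sketch below uses the standard sum-of-squares form (one minus the residual sum of squares over the total sum of squares), which may differ in small details from the calculation in the paper. The data are made up for illustration.

```python
def r_squared(observed, predicted):
    """Fraction of the variance in the observations explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    # Unexplained part: squared differences between observed and predicted.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total part: squared differences between observed values and their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Made-up numbers, not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(observed, predicted)
print(f"R squared = {r2:.3f}")
```

A model that reproduces every observation scores exactly 1; one that does no better than predicting the mean of the observations every time scores 0.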

The idea in developing a successful model is to explain as much as possible with as little as possible, that is, to maximize R² using a plausible equation that contains as few constants as possible that must be estimated in fitting a curve to the data. For manuscript traditions, as for biological populations, the logistic model is about as simple as can be. My version predicts the distribution of surviving manuscripts by age given the number of surviving manuscripts and the time of appearance. The shape of the curve is determined by μ/λ, the ratio of the death to the birth probabilities encountered above, and it changes continuously from an S-shaped logistic curve (μ/λ = 0) to a more or less exponential growth curve (μ/λ > 0.2).

The paper tests this no-frills model on four likely candidates, all works by the Venerable Bede on technical matters. It fits curves to the data points by maximizing R², as described above, while simultaneously estimating the birth and death probabilities λ and μ.

So how well does the model stand up to scrutiny? In each case, the model explains more than 95% of the variance (R² > 0.95), leaving less than 5% to be explained by any number of real or imagined complicating factors. For similar data on biological populations living in field or laboratory under conditions favorable for logistic growth, this would be considered very good agreement. Past a certain point, scrutinizing a simple model becomes as pointless as looking at a map with a microscope. Trying to explain the unexplained 5% without more and better data seems well past that point.

To verify that the model can indeed be falsified when tested (qualifying it as science as opposed to pseudoscience), my paper's online supplement takes advantage of the well-documented perturbation of the English monastery system by Vikings to demonstrate, using Bede's History of the English Church and People, that the model does not test positive where it is not supposed to.

The results, rough as they are: birth probability λ ~ 3/century, death probability μ ~ 0.1/century, and μ/λ ~ 0.03, which translates as a logistic-looking curve shaped more like a running sigma than an S-shaped logistic one.

Conclusions? At least for Bede's four works, Bernhard Bischoff was about right after all in estimating that roughly one in seven manuscripts survives in some form from Carolingian libraries. Medieval librarians drew up remarkably short inventories, but claimed losses beyond measure after the barbarians came or the building burned down. Insurance adjustors must see cases like this all the time.

Feedback on the paper has been something of an experiment in itself. From the first, the signal seems not to have been received too clearly at the other end of C. P. Snow's Bridge between the Two Cultures. The resulting confusion even echoed back to the other side, and perhaps even back again. Readers are invited to judge for themselves from Science 307 (2005), 1208–9; 309 (2005), 698–701; 310 (2005), 1618, and, especially in the last case, their online supplements (available at the Science Website through libraries or other subscribers).

Two Cultures are too many. I interpret the experiment as showing the need to reverse the trend toward even more, and for my part resolve to improve my deplorable Latin and all but nonexistent Greek.

Editor's note: John Cisne teaches the course on dinosaurs at Cornell University. He thanks Robert Ziomkowski, the medievalist patiently collaborating with him in following up on the research discussed here, who deserves much credit and none of the blame.

 


