Real Historians Do Bayes!

This work is licensed under a Creative Commons Attribution 4.0 International License.

by Neil Godfrey

How do historians, comparative linguists, biblical and textual critics, and evolutionary biologists establish beliefs about the past? How do they know the past?

That’s the subject of Aviezer Tucker’s Our Knowledge of the Past: A Philosophy of Historiography (2004). Tucker’s interest is the relationship between the writing of history (historiography) and evidence (p. 8). It is written for audiences interested in philosophy, history, biblical criticism, the classics, comparative linguistics and evolutionary biology (p. 22).

When I began to review Richard Carrier’s book, Proving History, I pointed out that far from substituting crude mathematics for historical inquiry, the application of Bayes’ Theorem merely expresses in symbolic terms the way historians evaluate the nature of evidence and test hypotheses to explain evidence for certain events and artefacts. Some fearful critics have objected to the application of Bayes because they have never understood this fact.

All Bayes’ theorem does is help us clarify our thinking. Bayes’ theorem is simply a symbolic way of expressing how we do our best thinking when seeking explanations for evidence or evaluating hypotheses against the evidence. The more complex the factors that need to be considered in addressing a problem, the easier it is for us to overlook a critical point or draw invalid comparisons. Bayes’ helps us to clarify thinking about the most complex of issues, including those in the social sciences and history. *

Why Bayes?

Tucker writes as a philosopher and concurs with the above assessments of other authors addressed in my earlier posts. Philosophers like to clarify the complexities they are discussing and are apt to use illustrative symbols to this end.

Philosophers find often that formal representation, Bayesian probability in our case, clarifies and concentrates the discussion. Some historians and many classicists may not be as used to this form of representation as their philosophical colleagues. . . . When I use formal representation, I express the same concepts in words, for the benefit of readers who are not accustomed to formal notation. (p. 22)

Historians ask questions like the following:

To what degree does a piece of evidence contribute or not to the confirmation of a hypothesis, given background conditions? (p. 96)


To what extent does a similar saying in the Gospels of Matthew and Luke support, or not support, the Q hypothesis, given everything else we know that is relevant to the question?

To what extent does the passage “born of a woman” in Galatians 4:4 support, or not, the hypothesis that the author believed Jesus was an historical person in the recent past, given everything else we know about Galatians, that verse in particular and its context, and evidence for Jesus?

The Bayesian theorem purports to state formally the relation between a particular piece of evidence and the hypothesis. (p. 96)

In the fifty or so pages of chapter 3 Tucker demonstrates

that an interpretation of Bayesian logic is the best explanation for the actual practices of historians. (p. 96)

The Theorem

Tucker sets out Bayes’ Theorem thus:

Pr(H|E & B) = [Pr(E|H & B) x Pr(H|B)]:Pr(E|B)

Pr — the Probability of. . .

H — the Hypothesis, or any historical proposition about past events

E — the Evidence (often this means similarities between two or more independent sources)

B — the Background knowledge of theories, methods, other hypotheses

The vertical line | should be read as “given”. So the first part of the equation expresses:

The Probability of the Hypothesis being true given the evidence and background information.
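For readers who find code easier to follow than symbols, the formula can be written as a very small function. This is only a sketch: the function name, the argument names, and the numbers in the example call are invented for illustration and are not taken from Tucker.

```python
def posterior(likelihood: float, prior: float, expectancy: float) -> float:
    """Tucker's form of Bayes' theorem.

    likelihood  -- Pr(E|H & B): how probable the evidence is if the hypothesis is true
    prior       -- Pr(H|B):     how probable the hypothesis was before this evidence
    expectancy  -- Pr(E|B):     how probable the evidence is on background knowledge alone
    Returns Pr(H|E & B), the probability of the hypothesis given the evidence.
    """
    if not 0.0 < expectancy <= 1.0:
        raise ValueError("Pr(E|B) must be greater than 0 and at most 1")
    return likelihood * prior / expectancy


# Hypothetical numbers, chosen only to show the arithmetic.
print(posterior(likelihood=0.9, prior=0.5, expectancy=0.6))
```

With these made-up inputs the result is roughly 0.75: evidence that is more expected if the hypothesis is true than it is in general raises the probability of the hypothesis above its prior.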

Pr(H|E & B) translated into words:

The probability of the hypothesis that George Washington was the first president of the United States, given the massive amount of documentary evidence for it and background knowledge of the causal chains that led to this evidence, is almost 1. We are almost certain that George Washington was the first president. (p. 97)

Contrast this:

The probability of the hypothesis that Jesus was the founder of what became the Christian Church, given the massive documentary evidence for it and background knowledge of the causal chains that led to this evidence, is . . . ?

Unfortunately we have no background knowledge of the causal chains that led to the Gospels and writings of Paul. We only have other hypotheses (e.g. oral tradition) to fill in these gaps.

Pr(H|B) in words:

This is the prior probability of a particular hypothesis being true given our background knowledge prior to knowledge of the evidence.


The probability of the hypothesis that there was a city of Troy that was destroyed in a war in the twelfth century BCE was low given the background information that had been known prior to the archaeological discovery of the city.


The probability of the hypothesis that there was a Gospel of Thomas that was the text of an “unorthodox” Christian group in the second and third centuries CE was relatively high given the background information that had been known prior to its discovery in 1945 (e.g. Hippolytus of Rome (c. 222–235) and Origen of Alexandria (c. 233) wrote about it.)

Pr(E|H & B) in words:

This “expresses the likelihood of the evidence given the hypothesis in question in conjunction with background knowledge.”


Given the hypothesis that George Washington was the first president of the United States, and background theories and information about the nature of paper and its preservation and use over two centuries and our knowledge of paper trails that governments and politicians generate, it is highly likely that we can encounter today many contemporary documents that refer to Washington as the first president. (p. 97)


Given the hypothesis that Jesus was revealed through Jewish Scriptures, and background theories and information about the way Second Temple Jews used and adapted their Scriptures, it is highly likely that we can encounter many passages in the Jewish Scriptures that are alluded to in the Gospels and Epistles when talking about Jesus.


Given the hypothesis that education in Greek literacy required the study of Greek literature, and background theories and information about the way Second Temple Jews used and adapted Greek literature, it is highly likely that we can encounter at least some traces of Greek literary influences and ideas in the early Christian Greek literature.

Finally for now,

Pr(E|B) in words:

This is “the expectancy, the probability of the evidence given background information.”

For example,

If our evidence is an invitation to Washington’s inaugural, it is only to be expected, given all that we know. If, however, we find it in the archives of King George with a personal dedication from Washington saying, “hoping to see you there,” it is highly surprising and would require rewriting American historiography. (p. 97)

An early Christian studies example,

If our evidence is Christian apologetic writings that claim to quote a letter from Jesus to the king of Edessa, it can be dismissed as a fabrication given all we know. If, however, we found the letter in scientifically verifiable archives of King Abgar of Edessa, it would be very surprising and lead to a serious rethink about Jesus and Christian origins.

Pr(H|E & B) in words:

This is “the posterior probability of the hypothesis given new evidence and background information.” This is “the ratio of the likelihood of the evidence given the hypothesis and its prior probability, to the expectedness probability of the occurrence of the evidence whether or not the hypothesis is true.”
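The phrase “whether or not the hypothesis is true” can be unpacked with the standard law of total probability: the expectancy is the likelihood of the evidence if the hypothesis is true, weighted by the prior, plus the likelihood of the evidence if it is false, weighted by the remaining probability. The sketch below assumes a simple two-way split (hypothesis true or false); the function names and the example numbers are illustrative assumptions, not Tucker’s.

```python
def expectancy(p_e_given_h: float, prior: float, p_e_given_not_h: float) -> float:
    """Pr(E|B) = Pr(E|H & B) x Pr(H|B) + Pr(E|not-H & B) x Pr(not-H|B)."""
    return p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)


def posterior(p_e_given_h: float, prior: float, p_e_given_not_h: float) -> float:
    """Pr(H|E & B), with the expectancy computed from the two likelihoods."""
    return p_e_given_h * prior / expectancy(p_e_given_h, prior, p_e_given_not_h)


# Hypothetical numbers, for illustration only.
print(expectancy(0.9, 0.5, 0.3))
print(posterior(0.9, 0.5, 0.3))
```

Written this way, the only inputs are the prior and the two likelihoods, which is also the form used in the worked Synoptic Problem example in the comments.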

Let our hypothesis be that there was widespread literacy among speakers of the Y language in the time of X: . . . .

  • New evidence comes to light. It is a book in prose in the Y language and from the time of X.
  • Our background knowledge informs us that prose writings are associated with widespread literacy since the earliest (nascent) stages of literacy produce characteristically poetic (easily remembered) writings. Prose is a later development that accompanies growing literacy.
  • The posterior probability of our hypothesis that there was widespread literacy at this time among speakers of language Y is almost 1.

Or let our hypothesis be that Chinese printing caused the invention of European printing:

  • New evidence comes to light. It is a fourteenth century Persian text describing Chinese printing techniques.
  • Our background knowledge informs us that there are many possible scenarios where a Persian author might know about Chinese printing that do not require that knowledge to be transmitted to Europe.
  • Our posterior probability of our hypothesis that Chinese printing led to European printing is not increased by this new evidence.

Or imagine this (as per Tucker):

  • New evidence comes to light. It is Gutenberg’s diary in which he recounts meeting a Chinese merchant from whom he bought a Chinese printed book before the time of his printing press.
  • The likelihood of such evidence existing is very high given our hypothesis that Chinese printing led to the invention of printing in Europe. So the posterior probability of our hypothesis is dramatically increased by the discovery of the new evidence.
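The contrast between the Persian text and the imagined Gutenberg diary can be sketched numerically. Every number below is invented purely to illustrate the shape of the argument; none comes from Tucker or from any historical estimate. The point is only that evidence about as likely with or without the hypothesis leaves the probability where it was, while evidence far more likely under the hypothesis moves it sharply.

```python
def update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Posterior probability of H after one piece of evidence E."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))


PRIOR = 0.10  # hypothetical prior that Chinese printing led to European printing

# Persian text: assumed equally likely whether or not the hypothesis is true.
after_persian_text = update(PRIOR, p_e_given_h=0.5, p_e_given_not_h=0.5)

# Gutenberg diary: assumed far more likely if the hypothesis is true.
after_diary = update(PRIOR, p_e_given_h=0.6, p_e_given_not_h=0.001)

print(round(after_persian_text, 3), round(after_diary, 3))
```

Under these assumed values the Persian text leaves the probability at about 0.1, while the diary raises it above 0.98.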

Discovery of new evidence does not necessarily mean tossing a stone into another cave and hearing the sound of it hitting a pot full of new manuscripts. Sometimes it can be an observation or record that has been in the literature but long overlooked.

I’ll cover more examples used by Tucker to demonstrate that Bayes’ theorem really does give us a symbolic representation of the processes by which historians really do evaluate hypotheses and test evidence.

Unfortunately, we will also see, by way of contrast, how theologians who think they are historians of the historical Jesus fail badly and really do not “do history like real historians do”. We will see that in fact many of them value what Tucker calls “therapeutic values” above “cognitive values”.


H/T Richard Carrier @ http://freethoughtblogs.com/carrier/archives/3923


* Another critic of Carrier’s view, a theologian, has confused a Bayesian application to historical questions with classical logical-positivism. That, too, is a misinformed criticism: historians have long since (close to a hundred years now!) moved away from such positivism. Evidence is not theory-free. Theories are acknowledged today as necessary for deciding where to look for evidence, how we decide certain data is relevant evidence, etc. Clear thinking (which is all Bayes helps us to keep in mind) applies to more than just one philosophical approach to evidence and interpretation.

16 thoughts on “Real Historians Do Bayes!”

    1. The beauty of your post is that it establishes clear arguments with all factors and options given due consideration, which is the benefit of Bayes’. (The symbols are not necessary for this but are very useful! And maybe it’s good to speak about it as symbolizing thought processes rather than as “maths”.) It is so easy to dismiss the virgin birth as “impossible”, but of course this sort of reaction always leaves room for protest. Bayes’ makes all the factors transparent in arriving at a conclusion. (Not that a person of faith will be affected, but we are talking about “cognitive values” here, supposedly.)

    2. “The last one, Pr(E|H & B), looks like the likelihood instead of the posterior probability.”

      Damn, you’re right, of course. I repeated that one by switching my H and E around. It’s fixed now.

  1. My only problem with Bayesian probability is that the initial probability of anything is assumed and, therefore, arbitrary. This fact kind of renders Bayes’ Theorem a tautology that encourages a form of cognitive bias known as “anchoring and adjustment.” Garbage in is still garbage out. One of these days I will figure out how to demonstrate this formally in a Bayesian way.

    1. No, it’s not arbitrary. This is a common misconception. It’s merely a representation of what we believe the probability to be on the basis of current information. Take the letter of Jesus to the king of Edessa. How likely is it that that is a genuine letter? (I almost wrote “email” there!) If we think it is almost certainly genuine, then we can assign it a probability of anything, say, between 80% and 100%. That’s not an arbitrary range.

      We use these sorts of numbers all the time when discussing how much we like something or believe something to be true.

      Are we 50-50 on something? Or only slightly favouring one option over another, say 60-40? Do we think a horse has only a 5% chance of winning a race? We accept this number talk in everyday situations, and there is no reason to consider our numbers here any differently. In fact, that’s all we are doing in this case.

      What happens is that Bayes’ is helping us think carefully about each of our pieces of data and background information relating to it and to our hypothesis — so that we keep tabs on what bits are highly likely, what are highly unlikely, etc. to arrive at a balanced and well-considered decision at the end of it all.

      We don’t need Bayes’ theorem to do this. We can get exactly the same result without it. All Bayes’ is doing is assisting us to keep track of each piece of data that needs to be considered. All too often it is easy to overlook something or make a careless assumption along the way and then another scholar picks up our failing and the dialogue goes back and forth till we arrive at the “optimum decision”. By using Bayes we are encouraged to set all the details and assumptions out on the table and less likely to overlook or misjudge something in the first place. Bayes is simply a tool to help in the most efficient and best informed evaluations of evidence and hypotheses.

      1. “We don’t need Bayes’ theorem to do this. We can get exactly the same result without it”

        For decades I hung around lawyers and judges socially and professionally, cos of my job, and even in social situations they loved to use words as qualifiers. It was a cultural habit.
        Take for example the statement above, “Take the letter of Jesus to the king of Edessa”. My legal mates would have thrown in an ‘alleged’ thusly:

        “Take the alleged letter of Jesus to the king of Edessa”

        Bingo – the statement is no longer presented as one of given fact but as a questionable claim [‘claim’ being another qualifying word].
        “In Mark 36.44 Jesus says …” could be, should be, presented as something like “in the anonymous gospel ascribed to Mark, the author has his [or her] character Jesus say ….”
        Words like these, plus ‘purported’, ‘supposed’, and probably several others, should be far more widely used in Christian studies than they are, in the interest of accuracy and credibility.
        The result would be twofold [at least]: some incredibly clunky expressions as alleged and claimed and purported pile on top of each other, but also a probably valid awareness that all too frequently we are reading claims, not facts.

        1. This is the strongest benefit of Bayes’ — it forces inquirers to take note of exactly what it is that they are talking about. So often scholars will speak of the empty tomb as if it were raw data; in fact what is raw data is a narrative about an empty tomb. So the questions centre around the probabilities of this or that about the evidence we have — the narrative.

          If we don’t think we need to use it at all then it never hurts to do a check afterwards upon our own reasoning by setting it out and seeing how our results compare with a Bayesian conclusion. If it’s about the same, that’s reassurance. If it’s not, maybe we have been alerted to something we overlooked in our initial foray.

    2. Anchoring and adjustment only applies if you update once. BT is about updating multiple times; you don’t just use it once. And the driving force of BT is the conditional probabilities, not the prior. So as long as your conditionals aren’t arbitrary, two people who start out with vastly different priors can eventually converge as evidence is gathered. The only priors that aren’t useful are 0% or 100%, since no amount of evidence can move them.

      So for example, if we have two people who are dealing with the Synoptic Problem — one starts with a prior of 90% that Mark was written first and the other starts with a prior of 22% that Mark was written first — it almost doesn’t matter where you start at because accumulating all of the evidence will push the two priors towards each other. And actually, if your prior is really in line with reality, then it won’t move much no matter how much evidence is gathered. So for example:

      Person 1 assumes the prior probability that Mark was written first, or P(Mark) = 90%. P(Mark) for the second person is 22%. The first evidence they analyze is the length of Mark. It is a fact that Mark is the shortest Synoptic Gospel (MSSG), so this is our evidence E (if Mark being the shortest gospel was only hypothetical, then the logic behind Occam’s Razor would apply, not BT). How likely is it that Mark would be the shortest gospel given that Mark was written first? A historian might look at this by doing a large survey of ancient works and see how many of the newer versions are shorter than the older versions. I’m not a historian so I wouldn’t know, but I would guess that the usual way it happens is that original compositions are shorter than the ones derived from them, though there could be shorter versions that came after the long version (like what’s argued is Marcion’s relation to Luke, at least until relatively recently). The word “usually” could be quantified in some way somewhere around 80% (since I don’t have the experience with historical documents that actual historians do). The third value we need is the false positive rate, or how many times a shorter document is actually a rewrite of a longer document. Again, not a historian, but this doesn’t seem like it happens a lot (5%). So for the sake of example BT would look like this:

      Person 1: P(Mark | MSSG) = [P(MSSG | Mark) * P(Mark)] / ([P(MSSG | Mark) * P(Mark)] + [P(MSSG | Not Mark) * P(Not Mark)])
      : = [.8 * .9] / ([.8 * .9] + [.05 * .1])
      : = .9931

      Person 2: P(Mark | MSSG) = [P(MSSG | Mark) * P(Mark)] / ([P(MSSG | Mark) * P(Mark)] + [P(MSSG | Not Mark) * P(Not Mark)])
      : = [.8 * .22] / ([.8 * .22] + [.05 * .78])
      : = .8186

      Both priors moved up, which makes sense since shorter documents are usually earlier versions of longer documents (but again, not a historian here).

      With our new priors, we look at some other evidence, such as content found only in Mark, for example Mk 8.22-26. How likely is it that this pericope would be in Mark given Mark’s priority? How likely is it that this pericope would be in Mark given some other Gospel’s priority or, restated, how likely is it that Mark added this pericope after reading the other Synoptics?

      Again, just for example since I’m not a historian, P(Mk 8.22-26 | Mark) is “probable” or 80% and P(Mk 8.22-26 | Not Mark) is “extremely improbable” or 1%.

      Person 1: P(Mark | Mk 8.22-26) = [P(Mk 8.22-26 | Mark) * P(Mark)] / ([P(Mk 8.22-26 | Mark) * P(Mark)] + [P(Mk 8.22-26 | Not Mark) * P(Not Mark)])
      : = [.8 * .9931] / ([.8 * .9931] + [.01 * .0069])
      : = .9999

      Person 2: P(Mark | Mk 8.22-26) = [P(Mk 8.22-26 | Mark) * P(Mark)] / ([P(Mk 8.22-26 | Mark) * P(Mark)] + [P(Mk 8.22-26 | Not Mark) * P(Not Mark)])
      : = [.8 * .8186] / ([.8 * .8186] + [.01 * .1814])
      : = .9972
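The two rounds of updating above can be reproduced with a short script. This is a sketch only: the function and variable names are made up for illustration, and the conditional probabilities are the commenter’s admitted guesses, not measured data. Each posterior is fed forward as the prior for the next piece of evidence.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """One application of Bayes' theorem for a hypothesis and its negation."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))


# (P(E | Mark first), P(E | Mark not first)): guesses from the comment above.
EVIDENCE = [
    (0.80, 0.05),  # Mark is the shortest Synoptic Gospel
    (0.80, 0.01),  # Mk 8.22-26 appears only in Mark
]


def run(prior):
    """Return the belief after each piece of evidence, feeding each posterior forward."""
    beliefs = []
    belief = prior
    for p_h, p_not_h in EVIDENCE:
        belief = update(belief, p_h, p_not_h)
        beliefs.append(belief)
    return beliefs


for name, prior in [("Person 1", 0.90), ("Person 2", 0.22)]:
    print(name, [round(b, 4) for b in run(prior)])
```

Rounded to four places this gives .9931 and .9999 for Person 1, and .8186 and .9972 for Person 2, matching the hand calculation. Calling `update` with a prior of exactly 0 or exactly 1 returns the same value whatever the evidence, which is why those two priors cannot take part in any convergence.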

      As you can see, the two estimates are starting to converge. You would repeat the process for each piece of evidence both for and against Markan Priority, with the posteriors from the previous use of BT functioning as the priors for the next line of evidence. Again, the driving force here is the conditional probabilities. That makes sense, since these two numbers are what determine the Bayes factor, which measures how strongly the evidence favors or disfavors your hypothesis.
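The Bayes factor is the ratio of the two conditional probabilities, and the same update can be written in terms of odds: posterior odds equal prior odds multiplied by the Bayes factor. The sketch below uses hypothetical function names and the guessed conditionals from this comment (.8 against .05, giving a factor of about 16, and .8 against .01, giving about 80).

```python
def bayes_factor(p_e_given_h, p_e_given_not_h):
    """How many times more likely the evidence is if the hypothesis is true."""
    return p_e_given_h / p_e_given_not_h


def update_with_factor(prior, factor):
    """Convert the prior to odds, multiply by the Bayes factor, convert back."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * factor
    return posterior_odds / (1.0 + posterior_odds)


shortest_gospel = bayes_factor(0.80, 0.05)  # about 16
unique_pericope = bayes_factor(0.80, 0.01)  # about 80

for name, prior in [("Person 1", 0.90), ("Person 2", 0.22)]:
    after_first = update_with_factor(prior, shortest_gospel)
    after_second = update_with_factor(after_first, unique_pericope)
    print(name, round(after_first, 4), round(after_second, 4))
```

Both people multiply their odds by the same factors, so the prior only sets the starting point; evidence with a factor well above 1 pulls even a sceptical starting point upward.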

      Of course, in this example my conditionals really are subjective, since I’m not a historian, so this is a less valid, more subjective use of BT.

      1. @J. Quinton,

        Thanks for the response. I don’t think anchoring and adjustment is a one shot process, and I am not sure why you say it is. The anchor defines the range of acceptable possibilities and won’t disappear after one try. Indeed, one of the reasons mythicists are dismissed is because their views, regardless of how objectively reasonable they may appear, cannot be accepted. By comparison, even minimalists play by the same rules as maximalists and are worthy of criticism (as opposed to complete scorn). While you may be right that probabilities converge under BT, that is only true if you keep on applying it. Most people stop once they are satisfied.

  2. Thanks, J. Quinton. A much more competent response than my amateurish effort.

    Just for interest, here are a few extracts from the “preface and note to readers” in The theory that would not die : how Bayes’ rule cracked the enigma code, hunted down Russian submarines, & emerged triumphant from two centuries of controversy by Sharon Bertsch McGrayne:

    On its face Bayes’ rule is a simple, one-line theorem: by updating our initial belief about something with objective new information, we get a new and improved belief. To its adherents, this is an elegant statement about learning from experience. Generations of converts remember experiencing an almost religious epiphany as they fell under the spell of its inner logic. Opponents, meanwhile, regarded Bayes’ rule as subjectivity run amok. . . .

    At its heart, Bayes runs counter to the deeply held conviction that modern science requires objectivity and precision. Bayes is a measure of belief. And it says that we can learn even from missing and inadequate data, from approximations, and from ignorance. . . .

    It has become a metaphor for how our brains learn and function. . . .

    1. Even religious claims and false propaganda use logic and reason. 🙂

      And I’ve posted here on scientific methods in historical analysis, most recently with respect to “scientific methods of dating” texts.

      But I invite you to read, say, my earlier posts on Carrier and how Bayes sits against the nature of historiography as understood by most historians themselves. Even Tucker calls “historiography” a science, but why not take up my suggestion from some months ago and actually read what historians themselves believe they are doing? One well-known book to start with is On “What Is History” by Jenkins. It won’t hurt, unless readjusting long held preconceptions really hurts. You might actually expand your understanding.

      1. Even religious claims and false propaganda use logic and reason.

        True, but they also use ill-logical and un-reason. History (if you want to do it well) should not. [In saying that I am presuming, of course, that there is historical truth, however imperfectly humans can attain it, and that it is not merely an arbitrary social construction. But then I presume that you are committed to that position also?]

        1. I do wish you would read what historians themselves explain about the nature of their work. None so blind as he who will not see. “Historical truth” is a highly debated construct among historians. If you mean simply an historical fact, such as that Julius Caesar crossed the Rubicon, and call that “historical truth”, then that’s fine if by “truth” you simply mean an existing “fact”. But historians do a lot more than merely record chronicles nowadays.

          1. I do wish you would read what historians themselves explain about the nature of their work.

            As I’ve replied several times, I have read what they say and I am aware of what they say.

            None so blind as he who will not see.

            On this issue you seem very like a Jesus-historicist, simply pointing to a mainstream consensus as though that settles it and refusing to discuss the issue further. You of all people should realise that just because a group of people have agreed something among themselves doesn’t make it right.

            “Historical truth” is a highly debated construct among historians.

            Indeed so, and opinions range all the way from those who argue the postmodernist stance that there is no “truth” and that the different accounts are merely social constructions, to those who consider that there is indeed a truth about historical events, at which they are trying to get. I’d consider that the “real” historians, who use Bayes and evidence, are in the latter camp.
