A scientific standard of evidence

by Dr Barry R. Clarke

Having been engaged in the Shakespeare authorship question for many years, and having witnessed the arguments put forward for various contestants, I have to say that I have grown less interested in the answer to the question ‘Who contributed to this particular Shakespeare work?’ than in how the answer is to be established beyond reasonable doubt. In other words, how effective are our methods for establishing the degree of truth of this or that fact?

For example, did a group of contributors write the Sonnets or only one person? My response to this question is: How are we going to decide? My best current answer is: By means of a rare phrase test using the Early English Books Online (EEBO) database. This method, to which I devoted considerable thought and time during my PhD work at Brunel University 2010-13, I call Rare Collocation Profiling (RCP). Although still in need of much development, I cannot see a more convincing method available. After all, the rarer a phrase the more it points to a single mind, and the procedure gives thousands of contemporary authors the opportunity to appear or not appear in the search results for rare correspondences. Crucially, it offers the chance to eliminate contestants, an important characteristic of the scientific method, and in my view, it is far more reliable than traditional academic methods such as a stylometric word count of a document, which rests on the dubious assumption that a text is uniform in a single contributor.

The difficulty with a stylometric method is that it relies on an extended body of text to perform a count. To be informative, this text must have only one contributor, who is then represented by a set of numbers derived from the word characteristics being counted (e.g. one characteristic could be words ending in ‘-ish’). Now, it is by no means certain that a text in an author’s corpus has not been revised by a different hand, and if it has, then this set of numbers would be an inaccurate representation of this author. However, there is a larger objection. A target text with more than one contributor has little prospect of its several hands being suggested by the method. The assumption is usually made that multiple contributors would be allocated different scenes of a play text, and would therefore be separated in their contribution. However, it is easily possible that a scene has been revised at a later time by a different contributor. This impasse, which I call here the ‘problem of uniformity’, is the chief difficulty with stylometric methods and can easily result in misattribution. What is needed is a more forensic procedure that does not rely on an uncorrupted text, and which, through the meaningfulness and rarity of the elements being processed (i.e. phrases and collocations), is more likely to point to particular personalities. The RCP method satisfies this criterion.
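As a rough illustration of the rare phrase idea described above, here is a minimal Python sketch. It is not the RCP procedure itself: the toy corpus, the invented author names, the substring matching, and the `max_authors` rarity threshold are all assumptions made for the example, and a real search would run against EEBO rather than an in-memory dictionary.

```python
from collections import defaultdict

# Toy stand-in for a searchable database: author -> list of texts.
# Every name and text here is invented for illustration only.
TOY_CORPUS = {
    "Author A": ["the sable night did steal upon the meadow", "a tedious brief scene"],
    "Author B": ["the sable night did creep along", "plain words and a common phrase"],
    "Author C": ["plain words and honest dealing", "a common phrase indeed"],
    "Author D": ["a common phrase indeed", "plain words"],
}


def authors_using(phrase, corpus):
    """Return the set of authors with at least one text containing the phrase."""
    return {
        author
        for author, texts in corpus.items()
        if any(phrase in text for text in texts)
    }


def rare_phrase_profile(target_phrases, corpus, max_authors=2):
    """List, per author, the rare target phrases that author shares.

    A phrase is treated as rare when at most `max_authors` authors use it
    (a hypothetical threshold chosen for this toy example).
    """
    profile = defaultdict(list)
    for phrase in target_phrases:
        users = authors_using(phrase, corpus)
        if 0 < len(users) <= max_authors:
            for author in sorted(users):
                profile[author].append(phrase)
    return dict(profile)


# Phrases taken from a hypothetical target text.
target = ["sable night", "plain words", "tedious brief", "common phrase"]
profile = rare_phrase_profile(target, TOY_CORPUS)
for author, phrases in profile.items():
    print(author, phrases)
```

In this toy run the widely shared phrases drop out, and only the authors who share a rare phrase appear in the profile. Authors C and D have a corpus but no rare matches, so they are the ones who have had the opportunity to appear and did not.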

So far I have applied RCP to A Funerall Elegye by W.S. (identified matches: John Ford, Shakespeare canon), the Gesta Grayorum (identified matches: Francis Bacon), The Comedy of Errors (identified matches: Thomas Nashe, Thomas Heywood, Francis Bacon), Love’s Labour’s Lost (identified matches: Thomas Nashe, Thomas Heywood, Thomas Dekker, Francis Bacon), The Tempest (identified matches: Francis Bacon), and Twelfth Night (identified matches: Thomas Heywood, George Chapman, Francis Bacon). The Tempest is strong in rare correspondences that Bacon shares, and although it may come as a surprise that he appears in these lists, he was heavily involved in the 1594-5 Gray’s Inn Christmas revels where The Comedy of Errors and Love’s Labour’s Lost were intended for performance (see PhD thesis) and he was a prominent member of the Virginia Company which has strong connections to The Tempest (see peer-reviewed publication ‘The Virginia Company’s role in The Tempest’). Also, the method’s identification of John Ford as the major contributor to A Funerall Elegye has assisted in settling a long-standing debate as to whether or not ‘W.S.’ was referring to William Shakespeare. It seems not, at least not unless Ford’s name was in need of concealment in the dangerous circumstance of a murder having been committed.

However, what if the RCP test is applied to the Sonnets and the result is inconclusive, that is, no contestant turns out to have a particularly strong return? There are several reasons why this might be so, for example, the supposed sole Sonnets contributor has no corpus of letters and prose work in the EEBO database. In that case, he won’t be identified and I say we cannot then challenge Shakespeare’s default claim to the Sonnets (which is justified by his name, or a similar one, appearing on the work). In fact, this is precisely how one would conduct a statistical hypothesis test. First, set up the null hypothesis ‘Mr Shakespeare wrote the Sonnets alone’, then establish the alternative hypothesis ‘There are one or more different contributors’, and finally test it using the available data in EEBO. Of course, Mr Shakespeare could not participate in such a test himself as there are no independent letters or prose works of his to test the Sonnets against. So since he has no opportunity to fail the test, the conclusion has to be that he cannot be scientifically eliminated from contributing to the Sonnets.
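The logic of the hypothesis test outlined above can be sketched in Python. This is only an illustration under stated assumptions: the text does not specify a test statistic, so the binomial model, the `background_rate` at which an arbitrary author would share a rare phrase by chance, the significance level `alpha`, and the function names are all hypothetical choices made for the example.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def assess_candidate(matches, rare_phrases, background_rate, corpus_words, alpha=0.05):
    """Classify a candidate against the null hypothesis of a sole author.

    Assumed model: under the null hypothesis, a candidate shares each rare
    phrase independently with probability `background_rate`.
    """
    if corpus_words == 0:
        # No searchable corpus: the candidate cannot pass or fail the test.
        return "untestable"
    p_value = binomial_tail(matches, rare_phrases, background_rate)
    # A small p-value means the matches are hard to explain by chance alone.
    return "suggested" if p_value < alpha else "not suggested"


# Hypothetical figures for three candidates tested on 40 rare phrases.
print(assess_candidate(9, 40, 0.05, corpus_words=200_000))  # many matches
print(assess_candidate(2, 40, 0.05, corpus_words=200_000))  # about chance level
print(assess_candidate(0, 40, 0.05, corpus_words=0))        # nothing to search
```

The third case is the one that matters for the argument: a candidate with no corpus is returned as untestable rather than eliminated, which mirrors the point that a person with no opportunity to fail the test cannot be ruled out by it.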

There are those who might recoil at the suggestion that a current attribution method they are employing is unsound. For example, someone might object that they have spent a lifetime researching Joe Soap and have found dozens, no … hundreds, of biographical correspondences between his life and the Sonnets. The trouble is, so have other researchers … for Fred Bloggs, Egbert Nobacon, Sid Snipe … What does this imply? It implies that this type of evidence can be collected for several candidates and for this reason it establishes nothing. The notion that biographical connections are informative is an illusion and they have no place in the science of authorship attribution. Their only use is to reinforce the views of those who have already decided on their favourite candidate, who only collect information relating to that person, and who reject any opportunity to test whether or not he/she was actually involved. This is certainly not a scientific attitude yet the Shakespeare authorship question is heavily populated with such investigators who have no intention of modifying their hypotheses as new evidence arises!

I can hear others objecting that there are some very odd things going on, like no one came out at the time and said Mr Shakespeare wrote the Sonnets, and I would agree that it is not what might reasonably be expected for a man whose name is on a collection of first-rate poetry. In fact, there are books devoted to lists of doubts which range from the non-existence of manuscripts to the absence of eulogies on Shakespeare’s death in 1616. Nevertheless, however well researched these works are, suspicion is not evidence. There must be some kind of scientific test that gives every contestant a reasonable chance of being either eliminated or suggested, and if there isn’t one, or if it is inconclusive, then I say we must be ready to admit that a scientific challenge to Shakespeare’s claim to this or that work is not possible. Fortunately, it seems that the RCP test shows promise despite being limited only to those with several works in the EEBO database.

I have often thought it unfortunate that the Shakespeare authorship question has attracted so many who trade with abandon their favourite candidate’s biographical connections to the Shakespeare work. The attraction seems to lie in the thought that since little is known with any degree of certainty, then one is free to fantasize whatever one likes, free from critical analysis. After all, if no one knows very much then one can be secure in the knowledge that there is no compelling evidence to deliver the pain of contradiction. As a trained scientist this thought depresses me greatly, because there is no easy route through the vast and intricate maze of connections that is the Shakespeare authorship question, and the construction of a convincing scientific argument takes considerable (and I mean considerable) research, care, and judgment.

So what I would like people to do is to think more carefully about the standard of evidence of a proposed method, and to consider ways in which a proposed scheme might allow the ruling in or out of other contestants. To me, any attribution method that does not have this characteristic is defective and biased. I also believe we should be more ready to declare that such and such a question has insufficient data to answer it, rather than succumb to the temptation to over-interpret circumstantial evidence. It’s a pleasurable game but it is unscientific.

I hope that these conclusions don’t extinguish anyone’s enthusiasm for investigation in any way. All I ask for is more thought devoted to the standard of evidence and how far it eliminates other contestants.

barryispuzzled

My name is Barry R. Clarke. I have a PhD from Brunel University, UK, on the thesis 'A linguistic analysis of Francis Bacon's contribution to three Shakespeare plays: The Comedy of Errors, Love's Labour's Lost, and The Tempest'. My paper 'The Virginia Company and The Tempest' questioning Shakespeare's access to the Strachey letter appeared in the Journal of Drama Studies (July 2011). A book chapter 'The Virginia Company's role in The Tempest' examining Bacon's connections to the play appears in Petar Penda, 'The Whirlwind of Passion: New Critical Perspectives on William Shakespeare' (Cambridge Scholars Publishing, 2016). I write mathematics and logic puzzles for The Daily Telegraph and Prospect (UK) magazine. My books of original puzzles include Puzzles for Pleasure (Cambridge University Press, 1993), Mathematical Puzzles and Curiosities (Dover, 2013), and Extreme Logic Puzzles (Puzzlewright Press, 2014). Challenging Logic Puzzles Mensa (Sterling, 2003) has sold over 80,000 copies and is an amazon.co.uk 'Brainteasers' bestseller. Since its creation over ten years ago, my puzzles website barryispuzzled.com has received over half a million hits! I also have a research degree and publications in quantum mechanics, play guitar, and draw cartoons. My latest book is "The Quantum Puzzle: Critique of Quantum Theory and Electrodynamics", an academic treatise on the foundations of physics for World Scientific Publishing.


April 30, 2016


8 thoughts on “A scientific standard of evidence”


alexanderwaugh
May 1, 2016 at 2:14 pm


Dear Barry,

I am full of admiration for your commitment to investigate the Shakespearean authorship mystery using word tests from EEBO and am sure you will continue to make many interesting discoveries along the way. You will of course be severely hampered by the fact that there is not, as far as I am aware, much, if anything at all, to be found on EEBO that is ascribed to Oxford, who, as you well know, is the most popular alternative authorship candidate to Stratford-Shakspere. John Bodenham (1600) tells us that ‘divers essayes of poetrie’ by Oxford, Derby, Raleigh, Dyer, Greville and Harrington are ‘extant among other Honourable personages writings’, which means, in simple language, that their works are published under the names of other living people, and I would be interested to know how your scientific method intends to deal with this.
What you call ‘scientific method’, I am sure you will agree, is a mathematical method that draws conclusions from the probabilities of various patterns and groups of patterns recurring among different writers. If probability is key to your method, may I suggest that you also conduct research into the probabilities of other key pieces of evidence presented by Baconians and Oxfordians. For instance, I should be interested to know what mathematical probability you put upon the leading alternative authorship candidate (Oxford) appearing as ‘Our de Vere – a secret’ under the name ‘Oxford’ in a unique formulation (‘courte-deare-verse’) that is carefully annotated by the marginal note ‘Sweet-Shakspeare’ on one of only 5 printed references to William Shakespeare of the 1590s – Covell’s ‘Polimanteia’ of 1595. What are the odds? It is not interesting to me whether people instinctively believe in this or not; I want to know the mathematical odds of its occurring by random chance. I asked Stanley Wells to comment upon it, but he is innumerate and mumbled something about the compositor putting the printed marginal note ‘sweet Shakspeare’ in the wrong place. So I asked him to which part of the text he thought the margent should have been appended, and he could not say. ‘Covell is being cryptic,’ he wrote, ‘but I cannot explain it.’ Well, I have offered an explanation, and many people do not like it, but I am interested to learn what are the mathematical probabilities of ‘our de Vere – a secret’ appearing under the name ‘Oxford’ and annotated by the words ‘Sweet Shakspeare’ on one of only five printed references to Shakespeare from this decade that, incidentally, uses a well established technique, known as ‘charade’, to disguise a name within longer words without disturbing the name’s correct letter and word order.
I would also be very grateful if you could turn your scientific approach to the vexing question of why no one knew ‘Shakespeare’. I have recently taken the names of all of those whom Meres (1598) cites as the best for tragedy and best for comedy, who were active in the theatre world in the decade 1588-1598. The connections between them are labyrinthine – they all either worked together, knew one another, wrote about each other during each other’s lifetimes or wrote elegies for one another, immediately following a death – all except one! What are the odds that William Shakespeare should be the only name among 20 that does not have a single documented connection to a single one of the other 19 names? So, while I wish you well in your EEBO trawls I urge caution and, at the same time, hope to inspire you in new areas of scientific enquiry concerning the (to me) extraordinary improbabilities of the Shakespeare data as it has come down to us,

Warmest wishes,

Alexander

barryispuzzled
May 1, 2016 at 5:25 pm


~~~ “the most popular alternative authorship candidate to Stratford-Shakspere”
The degree of truth of a fact is not determined by its degree of popularity. Otherwise, the heliocentric theory would never have caught on. I cannot imagine sitting in my PhD viva and arguing that they have to accept the validity of my RCP method because all my friends like it!

~~~ “our de Vere – a secret’ appearing under the name ‘Oxford’ and annotated by the words ‘Sweet Shakspeare”
I don’t know what the explanation of this is. It seems that Stanley Wells doesn’t know either. Yet are you saying you do? If you don’t know the probability either why are you committing to a view?! The fact that one cannot provide a verified explanation for an event does not show that some other favoured interpretation is correct.

~~~ “You will of course be severely hampered by the fact that there is not … anything at all, to be found on EEBO that is ascribed to Oxford”
Oxford has eight published poems in Richard Edwards, The Paradyse of Daynty Deuises (1576), which is insufficient data. This is not a defect in my method. It is an opportunity to conclude that he cannot be suggested or eliminated by the RCP method (or any other method). In that case, a claim for rare verbal correspondences between his work and the Shakespeare canon (which is surely the best test of attribution) would be reduced to a metaphysical claim. Of course, there are his 77 extant letters of some 50,000 words which (in my humble opinion) are completely lacking in rhetorical skill. If you would like to suggest just one phrase from them that might be rare enough to be significant I should be delighted to test it.

In conclusion, I think where there is no testing procedure, it is wiser to admit that the answer to a question is unknown than to persist with a favoured interpretation.

P.S. Please don’t get angry with me for saying this, but I’m mighty relieved for you that you didn’t gamble your £40,000 on a mock trial! 🙂

Frode
May 11, 2016 at 9:59 pm


Hi!
Dennis McCarthy, in his “North of Shakespeare” uses the same method you are using to prove that Thomas North wrote plays attributed to Shakespeare. Do you think he is right?

Frode

barryispuzzled
May 11, 2016 at 10:55 pm


Dear Frode

If Mr McCarthy’s method claims to prove that anyone wrote (i.e. originated) an entire play then he cannot be using the method I am using.

Frode
May 12, 2016 at 12:01 am


McCarthy is using the same method you are using, but he draws bolder, less responsible conclusions from his data. However, from these data we could equally well draw the more modest conclusion that Thomas North contributed (or probably contributed) to plays attributed to Shakespeare. Some of the data are found in McCarthy’s book, while most of his examples were presented on a website which at the moment is not available. My personal belief is that the data are interesting, but could also be the result of pure coincidences, or all sorts of different reasons why two distinct authors would use the same words in close proximity.

Frode

barryispuzzled
May 12, 2016 at 8:01 am


OK, so could you please take one particular play and provide some examples of phrase matches that support the view that he contributed to it. Are you the author or a friend of the author promoting this book?

Frode
May 12, 2016 at 2:35 pm


No, and no. You can find his book if you’re interested. Just wondered if you were aware that others are using the same method. I like the method, but since only a fraction of the Elizabethan and Jacobean plays remain, and the EEBO therefore represents a very limited database, one has to be cautious about assumptions concerning possible influences and rareness.

barryispuzzled
May 12, 2016 at 7:26 pm


Well, with no data there isn’t anything for me to comment on. Thank you for your interest anyway.

