For years, researchers in literary stylistics have been interested in identifying the 'genetic fingerprint' of individual authors. The argument is that style originates in the unconscious mind and that, unless an author deliberately attempts to disguise it, his or her writing will carry a unique fingerprint. In general, author fingerprinting attempts to demonstrate that works by a particular author are similar to each other but different from other works with which they can meaningfully be compared. Before the development of the kinds of corpus handling techniques that we talked about in the previous unit, such work was extremely time-consuming.
Probably the most famous early case in which author fingerprinting was used to establish authorship is Mosteller and Wallace's seminal work of 1964, which helped to settle the authorship of the disputed Federalist Papers. In order to establish whether particular papers were written by Hamilton or by Madison, Mosteller and Wallace used techniques such as analyzing the relative frequency of synonym pairs (e.g. 'while' and 'whilst') in different texts.
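To make the idea of comparing synonym pairs concrete, here is a minimal sketch in Python. It is not Mosteller and Wallace's actual procedure, which relied on a much fuller statistical analysis; it only counts how often each marker word occurs per 1,000 tokens. The function name `relative_frequencies` and the two sample texts are invented for illustration and are not quotations from the Federalist Papers.

```python
import re
from collections import Counter


def relative_frequencies(text, markers, per=1000):
    """Return the number of occurrences of each marker word per `per` tokens."""
    # Very simple tokenizer: lowercase the text and keep runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {m: 0.0 for m in markers}
    return {m: counts[m] * per / total for m in markers}


# Invented sample sentences (assumption: stand-ins for two authors' texts).
sample_a = "While the plan is debated, the states wait. While they wait, trade suffers."
sample_b = "Whilst the plan is debated, the states wait. Whilst they wait, trade suffers."

markers = ["while", "whilst"]
freq_a = relative_frequencies(sample_a, markers)
freq_b = relative_frequencies(sample_b, markers)

print("Sample A:", freq_a)
print("Sample B:", freq_b)
```

In a real study the samples would be whole texts of known authorship, and the rates for a disputed text would be compared against them. Normalizing by text length matters because raw counts would otherwise simply favor the longer text.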
In another case, corpus analysis was used to investigate one of the most elaborate hoaxes ever played on the literary scene. It involved a well-known author, Romain Gary, who published a number of novels under the name of Emile Ajar. By doing so, he received the prestigious Prix Goncourt twice, something that is expressly forbidden. 'Linguistic Fingerprints and Literary Fraud' by Vina Tirvengadum of the University of Manitoba, Canada, investigates this hoax and shows how corpus linguistics can be applied to questions of authorship. Although the article itself contains quite a lot of statistics, its introduction and conclusion provide useful insights into linguistic fingerprinting.