Text similarity analysis entails studying identical and closely similar text passages across large corpora, with a particular focus on intentional and unintentional borrowing patterns. At a larger scale, detecting repeated passages takes on added importance, as the same text can convey different meanings in different contexts. This approach offers numerous benefits, enhancing intellectual and literary scholarship by simplifying the identification of textual overlaps. Consequently, scholars can focus on the theoretical aspects of reception with an expanded corpus of evidence at their disposal. This article adds to the expanding field of historical text reuse, applying it to intellectual history and showcasing its utility in examining reception, influence, popularity, authorship attribution, and the development of tools for critical editions. Focused on the works and various editions of Bernard Mandeville (1670–1733), the research applies comparative text similarity analysis to explore his borrowing habits and the reception of his works. Systematically examining text reuses across several editions of Mandeville’s works, it provides insights into the evolution of his output and influences over time. The article adopts a forward-looking perspective in historical research, advocating for the integration of archival and statistical evidence. This is illustrated through a detailed examination of the attribution of Publick Stews to Mandeville. Analysing cumulative negative evidence of borrowing patterns suggests that Mandeville might not have been the author of the piece. However, the article aims not to conclude the debate but rather to open it up, underscoring the importance of taking such evidence into consideration. Additionally, it encourages scholars to incorporate text reuse evidence when exploring other cases in early modern scholarship. 
This highlights the adaptability and scalability of text similarity analysis as a valuable tool for advancing literary studies and intellectual history.

Keywords: Bernard Mandeville, text reuse, reception, text similarity analysis, sequence alignment, Reception Reader

Text similarity analysis entails a systematic approach that examines overlapping identical, or near-identical, passages across different works.1 Here, the term similarity is used to encompass a wider definition of the detection of textual overlap, to include not only the deliberate reiteration of text, but other forms of intentional and unintentional borrowing. What may seem like a trivial exercise of identifying repeated textual passages takes on a new dimension when done at scale, as the same textual passage can carry different meanings depending on its context, making this data adaptable to a wide range of research questions.

Identifying and working with text reuse – sometimes called sequence alignment – in historical texts is a burgeoning field. Significant examples include the work done by the ARTFL project to identify textual sources of the Encyclopédie along with other intertextual studies, and the eTrap project, which used text reuse to identify how ancient authors copied, paraphrased and alluded to each other.2 Text reuse analysis has also been widely used to study the viral circulation of news stories.3 Our contribution in this paper is to expand the scope of analysis in historical research by demonstrating how the detection of overlapping textual passages can be applied to examine the complete works of specific early-modern authors and their interactions with other printed books of the time. We aim to propose new use-cases for sequence alignment, namely how it can benefit stylometry and the creation of critical editions. Furthermore, we focus on text similarity rather than text reuse, as defined above.

Our study focuses on the works of Bernard Mandeville, a prominent 18th-century figure, and applies text reuse analysis to gain insights into his borrowing habits, questions of authorship, popularity and reception in the context of 18th-century British and European literature and political thought. Our approach can be applied to any early-modern author, provided that the relevant data is available. By doing so, we aim to showcase the potential of text similarity analysis to deepen our understanding of the literary landscape of early-modern Britain.

Scholars have traditionally relied on direct and allusive references to understand an author’s influences and the impact of their works.4 However, despite increasing interest in and adoption of text reuse analysis in digital humanities, a systematic approach that explores an author’s entire body of work, as we have done, has not been extensively undertaken until now. Although we believe that working with methods of text reuse is a natural step in building critical editions (as we demonstrate in this paper), and a natural continuation of earlier work, we are not aware of other scholars who have implemented this approach before.5 Our paper contributes to the scholarship in a number of ways, beginning with showcasing the potential of text reuse analysis to inform the construction of critical editions, and we hope that our work will inspire further research in this area. Additionally, the practice of using an author’s own works and primary texts of reference as negative evidence in authorship attribution has not been thoroughly exploited in existing literature.6 In this paper, we use text reuse to shed light on this aspect of authorship attribution. Furthermore, we extend the methodology of text reuse analysis by tracing all instances of reuse and similarity, both large and small, from a large corpus of 18th-century books, to provide a comprehensive understanding of Mandeville’s works and their reception.

Our paper addresses three interrelated research questions: 1) What kind of overlap did Mandeville have within his own works, and what can it reveal about their evolution? 2) What patterns of borrowing did Mandeville have, and how much did he borrow from specific authors? 3) Who later quoted Mandeville, and what does that tell us about his reception and the debates in which he was involved? By answering these questions, we use text similarity analysis to gain new insights into Mandeville as an author and to compare his works with those of other authors.

Our findings confirm and expand on previous scholarship’s views of Mandeville’s borrowing habits, particularly his use of Pierre Bayle, but with the advantage of a systematic approach which is as close to comprehensive as possible.7 We also demonstrate how text similarity analysis can complement traditional work on critical editions, and provide examples of how this can work in practice, using Mandeville’s Treatise of the Hypochondriack and Hysterick Passions (hereafter Treatise) and Free Thoughts on Religion. Furthermore, our study provides statistical evidence for authorship attribution, clearing ground to challenge previous assumptions about Mandeville’s authorship of A Modest Defence of Publick Stews, based on negative text similarity evidence and compositional patterns that differ from Mandeville’s expected corpus.

These findings highlight the value of text similarity analysis for enhancing our understanding of the evolution, context and influence of literary works. They also open up new avenues for further research, such as investigating the cultural and ideological implications of Mandeville’s intertextual practices and comparing them with those of other writers and thinkers of the period. However, we acknowledge that our study is not meant to settle all debates or questions about Mandeville’s works or authorship, but rather to contribute to a more comprehensive and nuanced understanding of how texts interact and shape each other. The use of text similarity as a methodological tool in intellectual history has great potential, but it is crucial to maintain a balanced view of its significance.

Data and methods

Detecting identical, overlapping sequences of text is a useful way to assess influence. This practice provides significant evidence of the popularity and afterlife of a given text, which may come in many forms such as quotations, borrowing of idioms, sentences, phrases, or extended reprinting of passages from works to form a dialogue with them. The 18th century was a period that gradually moved from a culture where imitation was commonplace to one which recognised the primacy of authors’ original utterances.8 However, the direct reuse of such original utterances was still common, and 18th-century authors often reused others’ material freely (Duhaime 2016; see also Spencer 2019).

In practice, we can define text similarity as any pair of identical (or near identical) sequences of text where one sequence (the reused) was originally published before the other (the reusing).9 To look at text reuse at this scale, we use a large dataset of reused sequences of text across the entirety of Eighteenth Century Collections Online (ECCO). This ECCO data comprises the digitised versions of over 200 000 texts published in the 18th century, either in the English language or published in Great Britain and Ireland (Tolonen, Mäkelä and Lahti 2022). While far from complete, it is by far the largest digital text corpus available for the period, and is widely used by scholars. The text of ECCO – the original data – is derived by OCR from digitised scans created from microfilm. It is important to note that the texts suffer from significant OCR errors: the scans themselves are often of low quality, which has affected the accuracy of the derived text. This is traditionally a barrier to detecting text reuse at scale, as detection typically relies on identical or very close sequences of text.
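The definition above, in which the earlier passage of a pair is the reused one and the later passage the reusing one, can be sketched in a few lines. This is a minimal illustration only: the `Passage` type, its field names, and the identifiers are hypothetical and do not reflect the structure of the actual reuse database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    work_id: str  # hypothetical identifier for a document
    year: int     # publication year of the edition containing the passage
    text: str


def orient_pair(a: Passage, b: Passage):
    """Return (reused, reusing), with the earlier passage first.

    Returns None when both passages share a year, because the direction
    of borrowing cannot be decided from the year alone.
    """
    if a.year == b.year:
        return None
    return (a, b) if a.year < b.year else (b, a)


bayle = Passage("bayle_1708", 1708, "she fairly surrendred, and then slew her self")
mandeville = Passage("mandeville_1714", 1714, "she fairly surrender'd, and then slew herself")

reused, reusing = orient_pair(mandeville, bayle)
```

Note that publication year alone only orders the two witnesses; it does not show that the later author drew on the earlier one directly, which is why secondary quotations need separate treatment.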

However, the innovation of the text reuse dataset used for this article is that it was built with software developed specifically to recognise connected sequences in noisy data. Called BLAST, this software was originally developed for comparing and aligning biological protein sequences (Vesanto 2019; Vesanto et al. 2017). With this method, the text of ECCO is converted into a sequence resembling protein data, and the software detects the parts of it that align. In another study, BLAST has been shown to be almost twice as effective at finding text reuse passages in noisy data as a widely used alternative, Passim (Vesanto et al. 2017, pp.54–58).
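To give a rough sense of what converting text into a protein-like sequence might involve, the sketch below maps the most frequent letters of a text onto the twenty standard amino-acid letters and drops everything else. This is an assumption made for illustration: the mapping rule, the function names, and the choice of alphabet are ours, and the encoding actually used to build the dataset may differ.

```python
from collections import Counter

# The 20 standard amino-acid letters; using exactly this alphabet is an assumption.
AMINO_ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_mapping(corpus_text: str) -> dict[str, str]:
    """Map the most frequent letters of the corpus to amino-acid letters."""
    counts = Counter(ch for ch in corpus_text.lower() if ch.isalpha())
    # Sort by descending frequency, breaking ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return dict(zip(ranked, AMINO_ALPHABET))


def encode(text: str, mapping: dict[str, str]) -> str:
    """Encode text as a protein-like string, dropping unmapped characters."""
    return "".join(mapping[ch] for ch in text.lower() if ch in mapping)


sample = "she fairly surrendred, and then slew her self"
mapping = build_mapping(sample)
encoded = encode(sample, mapping)
```

Because spaces, punctuation and capitalisation are discarded, two passages that differ only in such details encode to the same string, which is one reason an approach of this kind tolerates some OCR noise.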

Figure 1

Figure showing key features of the text reuse method. The above is a pair of passages marked as an instance of text reuse by the BLAST software method. Inconsistencies and very minor edits between the texts are highlighted, showing that even in these cases, the software can detect genuine text reuse.

The result of this process is a large dataset containing almost two billion text reuse pairs. These have been made available in a queryable database, and are connected to rich information on the source texts, such as dates, authors, publishers and so forth.10 To facilitate the qualitative, close-reading of the resulting reuse, the data also forms the basis for an online resource, ‘Reception Reader’: a user interface which allows users to quickly get data on the reuse of any book through a search interface, and ‘zoom in’ to the specific passages in question (Figure 2; Rosson et al. 2023, p.5). This article makes use of both of these resources for its claims.

Figure 2

Reception Reader interface

Naturally, the sequences found cover a wide range of uses. Taking inspiration from a similar version created by the eTrap project, we created a taxonomy of the text reuse (Figure 3) found by the BLAST method (Büchler et al. 2014). The sequences can mostly be explained under a small number of categories. The first is reprints: text reuse detected because it is found in a reprinted version or new edition of an earlier text. These can be further divided into full and partial reprints. Next are quotations: we distinguish first between short and extensive quotations, which in practice can serve very different purposes. Short quotations can range from a few words to a short passage, and are used, for example, when the reusing author is in direct dialogue with the language being reused. Extensive quotations are more likely to be entire sections or passages of a book, where the purpose is not such specific dialogue with or reuse of an author’s words but simply repeating their text. Quotations (both short and extensive, but more often the latter) can be verbatim or near verbatim (both are picked up in this ‘fuzzy’ method of text reuse). An important sub-category of quotations is secondary quotations, which are cases of an author quoting another by way of a third text, for example Bible quotes used by an author and then requoted by another. A final category of text reuse, artefacts, covers other reuse snippets picked up by the method which are not quotations or reprints, such as the reuse of identical title page information, or simply incorrect reuse due to OCR errors.

Figure 3

Taxonomy of lexical text reuse

How some of these types of reuse look in practice is best explained with a series of examples from the works of Bernard Mandeville, found using the Reception Reader interface. The most common type of text reuse is the direct short quote. Mandeville, as is well known, borrowed liberally from other authors, and such quotes are plentiful within his works. We might think of, for example, Mandeville’s reuse of Pierre Bayle in his Fable of the Bees. As well as being indirectly influenced by Bayle’s ideas, Mandeville directly borrows short passages and phrases from translations of Bayle, such as his word-for-word reuse of Miscellaneous Reflections, Occasion’d by the Comet (a 1680 translation of Bayle’s Lettre à M.L.A.D.C., docteur de Sorbonne). For example, Mandeville reuses part of Bayle’s description of the rape of Lucretia by Sextus. The following is the text from Bayle’s work (1708, p.372):

She bravely held out against all the Prince’s Attacks, even when he threaten’d her Life. But when he threaten’d her Reputation with eternal Infamy, she fairly surrendred, and then slew her self. An evident Argument, she valu’d nothing in the Vertue but the Glory which attends it

This becomes the following in Mandeville’s text (1714, p.196; identical parts in bold):

Lucretia held out bravely against all attacks of the Ravisher, even when he threaten’d her Life; which shows that she valued her Virtue beyond it: But when he threaten’d her Reputation with eternal Infamy, she fairly surrender’d, and then slew herself; a certain sign that she valued her Virtue less than her Glory.

The key point here is that the method can pick up sequences even when they are broken by intervening text, as is the case here, or contain slight differences or modifications.
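The way a broken sequence can still be recovered as a set of shared runs can be illustrated with Python’s standard `difflib` module. This is a stand-in chosen for brevity, not the BLAST-based method used to build the dataset; the passages are abridged from the two quotations above, and the tokenisation rule and three-word threshold are assumptions.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase the text and split it into words, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def shared_runs(earlier, later, min_tokens=3):
    """Return runs of at least min_tokens consecutive words found in both texts."""
    a, b = tokens(earlier), tokens(later)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    runs = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_tokens:
            runs.append(" ".join(a[block.a:block.a + block.size]))
    return runs


bayle = (
    "She bravely held out against all the Prince's Attacks, even when he "
    "threaten'd her Life. But when he threaten'd her Reputation with "
    "eternal Infamy, she fairly surrendred, and then slew her self."
)
mandeville = (
    "Lucretia held out bravely against all attacks of the Ravisher, even "
    "when he threaten'd her Life; which shows that she valued her Virtue "
    "beyond it: But when he threaten'd her Reputation with eternal Infamy, "
    "she fairly surrender'd, and then slew herself"
)

runs = shared_runs(bayle, mandeville)
```

Here Mandeville’s inserted clause (‘which shows that she valued her Virtue beyond it’) splits the borrowing into separate runs, and spelling variants such as ‘surrendred’ and ‘surrender’d’ end a run early. A purely exact matcher like this one therefore under-reports reuse, which is the gap a fuzzy alignment method is meant to close.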

A second type of reuse commonly found in Mandeville is the secondary quote: a passage which Mandeville reuses but where the original quote actually comes from somewhere else, such as commonplaces or other sources. Returning again to borrowings of Bayle found in Fable of the Bees, one detected sequence of text reuse is the intentional requoting of a text or idea not found directly in ECCO: a chapter on courtesans in Relation de Venice attributed to one ‘Monsieur Didier’. In this case, Bayle gives Didier as the source of his description in a footnote; Mandeville reuses the text directly, modified to insert the name of the source directly in the main body of the text.

Bayle’s (translated) version (1708, p.335):

About two hundred and fifty years ago, Venice being in want of Courtizans, the Republick was oblig’d to procure a great Number from Foreign Parts. Doglioni, who has written the Memorable Affairs of Venice, highly extols the Wisdom of the Republick in this Point

And used by Mandeville (1714, p.69):

About Two Hundred and Fifty Years ago, says Monsieur de St. Didier, Venice being in want of Courtezans, the Republick was obliged to procure a great number from Foreign Parts. Doglioni, who has written the memorable Affairs of Venice, highly extols the Wisdom of the Republick in this Point

This kind of secondary borrowing – in this case, Mandeville is directly quoting a passage from a translated version of Bayle, which in turn is referencing or quoting information from two other sources – is typical of the kinds of complicated networks of reuses and intertextuality found in general by the text reuse method.

In addition to these shorter quotes or passages, text reuse detection also finds more extensive sections of reuse. For example, Mandeville reuses whole passages of poetry in his Virgin Unmask’d: in it, one of two women, Lucretia, shows the other, Antonia, a poem which she claims demonstrates the view men have of women (1724, p.116). Though the poem is not directly attributed, the text reuse method connects it to a version found in Alexander Smith’s The School of Venus, or, Cupid Restor’d to Sight, from 1716 (p.168). Turning from Mandeville’s borrowings to reuses of his work, we find extensive and attributed quotations of Mandeville in texts published later in the century. These are very clearly demarcated and referenced quotations, in line with the change in the value and treatment of ‘original utterances’; one example is Frederick Morton Eden’s extensive quotation of Mandeville’s Fable of the Bees in his 1797 work The State of the Poor (pp.286–87).

There are many other reuses of text for various reasons which cannot really be considered as evidence for an author’s influence. Naturally, 18th-century authors borrowed ideas and passages from classical works or canonical texts. Because these were regularly reprinted in 18th-century editions, they are detected as text reuse in the data. Examples include (attributed) quotes from Plutarch’s Lives (1703, p.154), found in the Fable of the Bees (1714, p.213), and extremely commonplace Latin quotes from classical texts, such as Horace.

We must also acknowledge these as part of the text reuse data to be taken into account. Alongside these we also find ‘artefacts’: for example, the text reuse data includes texts from book titles, which will often be found ‘reused’ in book catalogues, which are also commonly found in ECCO. These may perhaps be interesting as a study in their own right but for our purposes are ignored. Imprint information, where the same imprint is found across many books by the same publisher, is also a common source of ‘false’ reuses (Figure 4). Other common reuses include commonplace writing and quotes, reused or boilerplate text, or commonly reused text such as recipes or Bible passages.

Figure 4

An (almost identical save for the year) imprint picked up by the text reuse method as a reused passage of text.

Tracing the evolution of editions

One interesting aspect of text similarity is how it evolves over different editions of the same work. When an author revises their work, they may choose to add entirely new material, incorporate material from previous editions, or they may choose to remove passages that they no longer find relevant or useful. This process can be tracked through the use of computational tools. If text is added or taken away from one edition to a subsequent one by an author, this can result in a net increase or decrease in the amount of text reuse between the two editions. Other projects have made similar visualisations of changes between editions, though in this case the text reuse data itself is used to detect the similarities and changes – a process we know is robust to noisy OCR (Whitelaw, Hinchcliffe and Roe, n.d.). In Mandeville’s case, analysing the evolution of text reuse over multiple editions can shed light on the author’s writing process and the evolution of his ideas over time. It can also reveal patterns of text reuse that may be indicative of an author’s style or thematic preoccupations. Overall, the study of the evolution of editions provides a valuable tool for literary analysis and can help us better understand the creative process of authors.

The question of an author’s role in further editions of a particular work is fundamental to our understanding of the role of the author and their impact. F. B. Kaye’s edition of Fable of the Bees has been significant, but his scholarship was not always accurate. Although Kaye commended Mandeville as a deliberate stylist editing the Fable in consideration of its subsequent editions post-1723 (Kaye 1988, p.xxxv), in actuality, there were only a few noteworthy changes following the publication of the first Tonson edition. Mandeville’s intervention in his own texts is complex. On the one hand, unlike David Hume, Mandeville did not act as an editor for many of his own works; he simply finished them, sold them to publishers, and moved on. This was also the case with the first part of Fable of the Bees, which Mandeville for his part seems to have completed in 1724 (Tolonen 2010, pp.73–74). On the other hand, there is substantial evidence for Mandeville having a direct role in the process of editing some of his works, as will be outlined below. Additionally, Mandeville happily repurposed his old works, such as his own Grumbling Hive which was incorporated as part of Fable of the Bees.

This ability of the text reuse method to look within an author’s own work has several uses. A key one is that it helps us to understand the patterns and evolution of editions of a single work. In this case, we use textual overlaps not as a way to find connected snippets of text, but rather to look for gaps and unbroken sequences of text from one edition of a work to the next, which allows us to see, at a glance, what has been removed and what has been inserted. To do this, we treat a seed book as a single text, and then extract all the text reuse between the seed book and a later edition, which is represented as a sequence. An identical copy of the original would result in an unbroken sequence of text reuse; any breaks in the sequence point to additions to and subtractions from the first edition in the later one. In essence, this allows us to see the evolution of Mandeville’s texts over time, in terms of their added and removed parts. This information can be visualised, giving a quick overview of the differences between editions of a single text, which could be used, for example, to support the preparation of critical editions.
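The gap-finding step described above can be sketched as a small interval computation. The sketch assumes that each detected reuse is available as a half-open `(start, end)` character span within the later edition; the function names, the span values, and the `min_gap` threshold are hypothetical and are not taken from the actual tooling.

```python
def merge_spans(spans):
    """Merge overlapping or touching (start, end) spans into a sorted list."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(span) for span in merged]


def find_gaps(spans, text_length, min_gap=1):
    """Return stretches of the later edition not covered by any reuse span.

    Each gap is a candidate addition. Running the same function over the
    spans as located in the seed edition gives candidate removals instead.
    """
    gaps = []
    cursor = 0
    for start, end in merge_spans(spans):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if text_length - cursor >= min_gap:
        gaps.append((cursor, text_length))
    return gaps


# Invented spans for a later edition that is 1000 characters long.
later_edition_spans = [(0, 120), (100, 300), (450, 900)]
gaps = find_gaps(later_edition_spans, text_length=1000)
```

Raising `min_gap` suppresses very short breaks, which in noisy data are more likely to reflect OCR differences than authorial changes; any gap reported this way still needs to be checked against the page images.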

To provide insight into the role of Mandeville as an author and editor, two case studies of his works will be examined. The first is the comparison between the 1720 and 1729 editions of Free Thoughts on Religion, the Church, and National Happiness, which were assumed to have minimal changes. The second case study looks at the changes in the Treatise between its first edition and the third one, where substantial changes were made. Through the text reuse method, we can gain a deeper understanding of the changes in these texts. While Mandeville’s role in the development of the first part of Fable of the Bees is well known, his contributions to Free Thoughts and Treatise have often been undervalued in previous scholarship.11 Irwin Primer’s edition of Free Thoughts, in particular, placed greater emphasis on the first edition, resulting in some significant changes being left out.12 The text reuse method makes detecting these changes much easier than before, streamlining the process of creating critical editions.

Free Thoughts was first published in 1720 and republished in 1729 with Mandeville’s involvement (a third edition, published after his death, is not compared). Scholars working on the text have done so on the assumption that little changed between the first and the second, save for a few key passages and some smaller edits. In particular, Primer, in his edition of Free Thoughts (2001), treats the second edition as more or less the same, highlighting several important passages but otherwise no substantial changes (pp.xxvi–xxix). From a scholarly perspective, downplaying the second edition (as Primer consistently did) is also curious because Free Thoughts was the work that was translated in the 1720s during Mandeville’s productive period; it is therefore likely that its European reception affected Mandeville’s own views of the work in the time between the first and second edition. Using the text reuse method, we can look in detail at the specific changes. In the first instance, the changes noted by Primer are also picked up by the method, confirming its usefulness. These include a new section in the preface (Mandeville 1729, pp.xix–xx), plus a passage on the royal succession (pp.350–51).

However, we also see numerous small changes – i.e. gaps in the text reuse – pointing to a pattern of systematic editing by the author. In many cases the small edits result in important changes in meaning. In the first chapter, ‘On Religion’, Mandeville edits the short paragraph below to remove the final words ‘to be performed on ourselves’, suggesting a subtle change in his view of religious self-denial (Mandeville 1720, p.15; Mandeville 1729, p.16):

The chief duty then of real Religion among Christians consists in a Sacrifice of the Heart, and is a task of self-denial, with the utmost severity against nature to be performed on ourselves

Many edits are for clarity or improved style: the phrase in the first edition ‘I would tell them likewise, that many of them gave not themselves that trouble, or else were very inconsistent with themselves’ (Mandeville 1720, p.51) becomes the following in the second edition: ‘I would likewise ask them whether, while they are distinguishing themselves from others, they are always consistent with their principles?’ (Mandeville 1729, p.56). Mandeville changes the sentence ‘Every Man may be convinc’d within himself, that Believing is not a Thing of Choice’ (1720, p.67) to ‘But whoever will attend to what passes in his own mind, may soon be convinced, that believing is not a thing of choice’ (1729, p.73). The method picks up even smaller changes, such as the inclusion of the bolded phrase in the following sentence: ‘The impossibility there is in our little knowledge of reconciling either the system of predestination, or that of free will, to all the necessary Attributes of God’ (Mandeville 1729, p.125). Elsewhere, ‘The heathens’ becomes ‘The defenders of the religion of old Rome’ (Mandeville 1729, p.156). One interesting addition is to a passage arguing that early Christian fathers deliberately misrepresented some of Pagan Rome’s customs: Mandeville specifically adds that the early Christian author Lactantius may have done this by mistake rather than to deliberately deceive (1729, p.191).

In the section ‘On Government’, Mandeville makes some more substantial changes, many of which were noted by Primer. The most significant addition by Mandeville that is omitted from Primer’s notes on the second edition is the extension of a paragraph (Mandeville 1729, p.338; see also Mandeville 1720, p.302). The second edition adds to the paragraph below as follows (new text in bold):

From what has been said it is evident that the chief end why the king is invested with this power is to enable him to maintain the laws, and since the king has no prerogative but what is given by law, it is impossible he could have a power, without his parliament, to make, repeal, or alter any; and nothing is more absurd than to advance that a person has a just authority to destroy what he has sworn to keep. But to render it still more manifest that the king has no power to claim obedience, and that it is not so much as surmis’d he should require it of his subjects to any command that is unlawful, we are but to observe what everybody knows, that all persons are accountable for their own actions, and that no order of the king, however plain or expressed soever, unless produced in writing and corroborated with his sign manual, can extenuate a man’s guilt, much less exempt him from fine, imprisonment, or other punishment, incurring that order he has acted against the law. Nay the king has no power to claim obedience to any command, that is not founded in law, that is, where there is not some law that requires obedience to it. If the king commands me to give him my estate, if I think fit to comply and give it to him, I break no law; the act I do is not unlawful; but I am not bound to do it, neither in law nor in conscience; because there is no law that gives him the authority to require it.

Elsewhere, in a section on royal powers, Mandeville makes a small but significant change to his description of the limits of the power of the king: in the first edition, Mandeville writes that ‘the Power of Arbitrary Confinements is only granted to the King for a few months’ (1720, p.303). In the second, this becomes ‘the power of arbitrary confinements is never given to the king; all that is done by so laying aside the Habeas Corpus act, is only suspending for a few months the privileges given to the subjects by that act’ (Mandeville 1729, p.339).

In general, we see patterns of Mandeville clarifying his political ideology and subtly changing his arguments and evidence. The power and role of Parliament are extended, from an authority which ‘makes Commonwealths and mix’d Governments from Absolute Monarchies’, to one which, in England, ‘established the regal authority’. On James II’s abdication, Mandeville adds the sentence: ‘He had claimed and exercised a despotick power inconsistent with the limited authority of the regal power which the laws vested in him’ (1729, p.355).

Overall, when taken individually, these small edits are easily overlooked or dismissed. However, when we look at them from this perspective, we can see a process of Mandeville gently but significantly reshaping his views and arguments throughout the text.

A text with more substantial known changes is Treatise. First published in 1711, this treatise is a medical text which takes the form of a series of dialogues between a physician and his patient, who suffers from a number of ailments including hypochondria. The 1730 edition was substantially revised, including the title (the word ‘diseases’ replacing ‘passions’) and much of the text, which grew by about one hundred new pages (Kleiman-Lafon 2013; Kleiman-Lafon 2017). Using the text reuse method to compare editions shows at a quick glance the position and size of the major and minor revisions to the text. Figure 5 shows the first edition compared to the 1711 reprint, the 1715, and two 1730 editions. In this figure, each additional word of the text (in other words, gaps in the text reuse) not found in the original is represented by a thin black line: as can be seen, the 1711 and 1715 editions are more or less identical, save some additional snippets where OCR errors have made the text different enough to be highlighted by the text reuse method. Substantial revisions can be seen in both 1730 editions: many at the beginning, in the preface, as well as several longer chunks of text towards the middle and end of the work.

Figure 5

A visualisation comparing subsequent editions of Treatise of the Hypochondriack and Hysterick Passions to the first one. Each is compared to the first edition, and black vertical lines represent a gap in the sequence of text reuse from one edition to the next.

Using a similar interface to the Reception Reader to zoom in to specific parts of the text, we can highlight these passages and read them in their original context, verify they are genuine additions, and situate them in terms of where they are inserted with reference to the original text. With this method we can see some of the major revisions to the text; these are already well known to scholars, but the method allows us to verify them and demonstrates how quickly such differences can be found. The evolution of the editions highlights how Mandeville revised his thinking and his arguments throughout his career. Mandeville inserted, for example, significant new chunks of text regarding anatomical evidence for his earlier argument that ‘animal spirits’ were responsible for the process of digestion. In the dialogue, Mandeville warns of the risks of inference based on observations from other animals (or accepted truths), and argues that only empirical evidence will suffice. In the original, Mandeville goes straight to a metaphysical, Cartesian argument, which he would later revise (see Kaye 1921, pp.420–21).

As well as these significant additions, this method allows us to survey the many other smaller changes made between the editions. There are parts added to the preface to aid readability, for example a description of the dialogues and characters, and an explanation that in this edition Mandeville will translate Latin passages; an extension of his argument on the relationship between body and soul (1730, pp.50–53); additional passages on digestion which mention specific theories and natural philosophers such as Robert Boyle (1730, pp.89–90); and a case taken from Philosophical Transactions as evidence that hypochondria was not caused by the spleen or liver.

All in all, the results from this method line up with what is already known from scholarly editions and scholarly research. However, the edition comparer allows us to see a quick graphical overview, to understand the text reuse in graphical terms, and quickly zoom in, particularly valuable for understanding the context around minor edits. For other, lesser-known texts which have not had critical editions, this method is valuable as a starting-point to understand changes and differences between texts.

The absence of textual overlaps: the case of Publick Stews

Moving from text reuse as a method for comparing multiple editions to each other, a second way of using text reuse to look at a single author is to compare their own works with each other. Mandeville, like most 18th-century authors, was a prodigious borrower of his own work (Hayes 1993). This can take various forms: reprinted works, such as Grumbling Hive, which are reprinted in part as new works (in this case it forms the basis of Fable of the Bees), as well as self-quotation and self-reference, or the reuse of quotes from a second author.13 Text reuse methods allow us to compare all of Mandeville’s works to each other, and understand how they function in terms of internal borrowing.

Previous studies, including those by Paul Anderson and Maurice Goldsmith, have used textual overlaps that were retrieved by hand to provide positive evidence for Mandeville’s authorship of Female Tatler (Anderson 1935; Mandeville 1999, pp.44–47). These findings, including Anderson’s identification of Mandeville’s verse fables within Female Tatler and Goldsmith’s discovery of other overlapping Mandeville passages from different sources, have been widely accepted in Mandeville scholarship (see Vichert 1964; McKee 1993). This paper follows a similar path but takes a more comprehensive approach, examining all attributed Mandeville works available in ECCO in multiple different ways.

Authors commonly engage in text reuse, drawing from their own previous works to clarify or expand upon ideas, utilising familiar phrases or idioms. However, even by the standards of the 18th century, Bernard Mandeville stands out as an extreme case of self-reuse in his writings. Our data shows, for example, that 96% of Grumbling Hive is reused in Fable of the Bees; that 96% forms about 2.5% of the latter. About 22% of Mandeville’s last publication, A Letter to Dion, comes from Fable of the Bees (about 4% of the latter is reused). At a smaller scale, we can also see small sections of reuse: about 3% of Mandeville’s earlier work, Free Thoughts on Religion, the Church, and National Happiness (1720) is reused later in Fable of the Bees. About 38% of An Enquiry into the Origin of Honour, and the Usefulness of Christianity in War can be found in Fable of the Bees, 11% in Fable of the Bees Part II, and very small amounts (3%) of overlap between An Enquiry and Free Thoughts.
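Percentages of this kind are directional: the same reused passages make up a large share of a short source and a small share of a long target. A minimal sketch of that calculation follows, assuming each detected reuse is stored as a pair of character spans (one in the source, one in the target). The span values, the text lengths and the function names are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) character spans so nothing is counted twice."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def coverage(intervals, text_length):
    """Fraction of a text of text_length characters covered by the spans."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / text_length


# Hypothetical hits: (span in source text, span in target text).
hits = [((0, 900), (5000, 5900)), ((850, 960), (5850, 5960))]
source_length = 1_000    # assumed length of the shorter, earlier work
target_length = 40_000   # assumed length of the longer, later work

source_share = coverage([src for src, _ in hits], source_length)
target_share = coverage([tgt for _, tgt in hits], target_length)
print(f"{source_share:.1%} of the source is reused")
print(f"{target_share:.1%} of the target consists of that reuse")
```

Merging overlapping spans before summing matters because reuse detectors often report overlapping hits for the same passage. Without the merge, the share of a text that is reused could exceed 100%.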

What is perhaps even more intriguing are the gaps in this pattern of textual overlaps. Figure 6 counts the number of Mandeville texts which appear in other Mandeville texts. One remarkable omission is A Modest Defence of Publick Stews, which has been attributed to Mandeville. Astonishingly, this work shows no textual overlap with any other text that we unquestionably know to have been authored by him. Whether Publick Stews was authored by Mandeville or not is the subject of some dispute, as will be shown. Not a single passage or snippet is detected by the text reuse software, making this absence of reuse in the case of Modest Defence of Publick Stews a significant and noteworthy observation. The role of Publick Stews in Mandeville’s œuvre ought to be a matter of debate even without the introduction of further evidence. The book was published in 1724, during the height of the controversy surrounding Mandeville’s Fable of the Bees. It was not published by Tonson although Mandeville had just entered his publishing roster with the third edition of the Fable (also Treatise in 1730 was published by Tonson; see Tolonen 2010, pp.47–49). This is of course circumstantial evidence and despite it, it is possible that Mandeville did in fact write Publick Stews; yet there are different kinds of external evidence that ought to make scholars stop before declaring that Publick Stews was authored by Mandeville.

Figure 6

Count of the text reuse in other Mandeville works for each Mandeville work.

The evidence supporting the attribution of Publick Stews to Mandeville is slim and internal. Kaye, who compiled a canon of Mandeville’s publications in the 1920s that most scholars still follow, had very little to work with in the case of Publick Stews but felt comfortable attributing it to Mandeville (1921, p.454). His main argument, which Paul Sakmann had already provided (1897, p.34), was that Publick Stews can be seen as an extension of Remark H of the Fable (published in 1714) (see also Hundert 1994, pp.216–17; Branchi 2022, p.114). The other proof that Kaye provides consists of elaborations of the idea that ‘the content and style are typical of Mandeville’ (1921, pp.454–55). In the 1930s, the authorship of Publick Stews was debated based on the name under which it was entered into Stationers’ records, indicating the owner of the copyright (Harder 1933, pp.200–203).14 Richard Cook in the 1970s trusted Kaye wholeheartedly without providing any new evidence for the authorship question (Cook 1973; Cook 1975). Scholars have since analysed Publick Stews as Mandeville’s, without an apparent need to think about the question of authorship (Rogal 1976; Canfield 1998; Nacol 2015). Irwin Primer, in his 2006 edition of the Publick Stews, does not discuss the attribution question at any considerable length, nor does he provide any new evidence for it.

In the light of the internal evidence alone there are good reasons to consider seriously the possibility that Mandeville did not author the Publick Stews. Statistically, the absence of a habitual practice of borrowing from his own works is negative evidence, which can be used to strengthen or weaken a hypothesis. In this case, the absence of such a practice weakens the hypothesis that Mandeville authored Publick Stews, because it runs counter to what is known about his authorial habits. By contrast, if there were evidence that Mandeville did borrow from his own works in the Publick Stews, just as in the case of the Female Tatler, this would strengthen the hypothesis of his authorship, because it would be consistent with his known authorial practices.

In summary, the absence of a habitual practice of borrowing from his own works is a valuable piece of negative evidence that should be taken into account when considering the attribution of this work. Negative evidence is important in statistical inference because it can be used to rule out certain hypotheses, even in cases where positive evidence is not available.15 For example, in a criminal trial, the absence of DNA evidence linking a suspect to a crime scene may weaken the hypothesis of their guilt, even if there is some positive evidence linking them to the crime. Similarly, in literary attribution studies, the absence of a habitual practice of borrowing from an author’s own works can weaken the hypothesis of their authorship, even if there is no positive evidence pointing to another author.
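The logic of negative evidence can be made concrete with a small Bayesian sketch. The probabilities below are illustrative assumptions, not estimates derived from the data in this article; the function name `posterior` is likewise hypothetical. The point is only to show the direction of the update: if an observation is unlikely under a hypothesis, seeing it lowers the credibility of that hypothesis.

```python
def posterior(prior, p_obs_if_true, p_obs_if_false):
    """Bayes' rule: P(hypothesis | observation) for a binary hypothesis."""
    numerator = prior * p_obs_if_true
    return numerator / (numerator + (1 - prior) * p_obs_if_false)


# All three numbers are assumed for illustration only.
prior_authorship = 0.8           # prior belief that Mandeville wrote the work
p_no_overlap_if_author = 0.1     # a habitual self-borrower rarely shows zero overlap
p_no_overlap_if_not_author = 0.9 # another author would usually show none

updated = posterior(prior_authorship,
                    p_no_overlap_if_author,
                    p_no_overlap_if_not_author)
print(f"prior {prior_authorship:.2f} -> posterior {updated:.2f}")
```

Under these assumed values the observation of no overlap moves the estimate from 0.80 to roughly 0.31. Different assumptions would give different magnitudes, but the update points downward whenever zero overlap is less likely for the habitual self-borrower than for another author.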

Mandeville’s patterns of borrowing from others

Bernard Mandeville is a controversial author known for his unique style of engaging his audience to confront their own hypocrisy. However, despite his versatility and genius, his legacy has been marred by the limited surviving evidence about his life and his reputation as the ‘Man-Devil’.16 One reason for this treatment is that Mandeville did not follow the conventional style of academic philosophers or authors, choosing instead to use doggerel, dialogues and fables.17 Moreover, he borrowed extensively from other authors, including lifting historical reflections directly from their works. In this section we aim to address the extent of Mandeville’s text reuse of other authors and its relevance. While it is evident that Mandeville borrowed heavily from other authors, using text reuse we seek to determine the precise extent of his reuse and explore its implications.

Figure 7

Count of the reuse (in pages) of authors by Mandeville.

Previous scholarship was correct in its overall views of Mandeville’s borrowing habits, especially the relevance of Mandeville’s use of Bayle. Mandeville’s extensive use of quotations is well-documented, particularly in his references to Horace, as evidenced by Figure 7, which lists the authors he borrowed from the most. Additionally, his engagement with Shaftesbury is apparent in the substantial amount of borrowing he did from him, as well as his admiration for William Temple. Other well-known authors he drew from include Thomas Willis and Jean Le Clerc. However, the list of top authors also includes some surprising religious figures, such as John Alphonso Turretine, the author of the translated Dissertations on Natural Theology, and William Beveridge (1637–1708). But, above all, what becomes most evident from Figure 7 is the role of Pierre Bayle in Mandeville’s borrowing habits.

Kaye’s edition of the Fable made the presence of Bayle abundantly clear (Mandeville 1988, pp.44, 97, 315–16 and throughout; see also Kaye 1921). In Irwin Primer’s introduction to Free Thoughts (2001), it is noted that much of Mandeville’s work was constructed from the words and thoughts of Pierre Bayle. Primer points out that Free Thoughts is ‘highly derivative’ (p.xviii). Indeed, Mandeville has been called a ‘blind follower’ of Bayle by the contemporary critic Bluet, who implores readers to compare Fable to Bayle. Bluet argues that Mandeville’s inspiration comes not from the original Bayle, but from his English translator (1725, pp.132 and 121–38 passim.).18 Thus, it has been clear for a long time that there was a close relationship between Mandeville and Bayle, with Mandeville drawing heavily from Bayle’s ideas and reusing his texts in constructing his own philosophical system.

However, existing scholarship has not looked at Mandeville’s borrowing at scale. Doing so allows us to draw more concrete conclusions from the sources used by Mandeville. When we examine Publick Stews, we encounter the same lack of evidence as in the case of borrowings from Mandeville’s own works. While Bayle’s writings are the most frequently reused in Mandeville’s other works, Publick Stews contains no reuses of Bayle. Kaye and other scholars have used the similarity of style and themes between Publick Stews and Fable of the Bees to support their argument for Mandeville’s authorship of the former. One would expect that such similarities would extend to borrowing from Bayle as well, yet no such evidence exists. While Publick Stews does contain references to other authors, these are not the same as those found in Mandeville’s other works (for example, there are several text reuses of William Butler, Edmund Gibson, Thomas Hayley and John Heylyn in Publick Stews, all of which are absent from Fable, Free Thoughts and Letter to Dion). This combination of negative and positive evidence further undermines the case for Mandeville’s authorship of Publick Stews, highlighting both the absence of expected borrowings and the presence of distinctive borrowing patterns not found in Mandeville’s other works. The absence of a habitual practice of borrowing from his own works and from the same authors (especially Bayle) that he uses in his other works raises doubts about Bernard Mandeville’s authorship of the Publick Stews. This evidence is significant for Mandeville scholarship and highlights the importance of a systematic approach to the study of text reuse more broadly.

Reception, popularity and influence: the evidence from text reuse

The narrative of Mandeville’s popularity, as depicted in Kaye, Hundert and other sources, outlines the story of the public scandal caused by the third edition of the Fable of the Bees and Mandeville’s indirect impact on Enlightenment thinkers such as Voltaire, Smith, Hume and others.19 This ‘rags to infamy’ narrative has been further amplified by interpretations that position Mandeville as a predecessor of capitalism by the Chicago School (Hayek 1966; see also Tolonen et al. 2020). However, even if these accounts are true, it is important to interrogate them as evidence for Mandeville’s influence. What are the criteria used for measuring popularity, and how does it compare to other authors such as Berkeley, Hutcheson, Smith or others who have been hailed as champions of the 20th century?20 What does popularity actually mean in the 18th-century context? These are critical questions that scholars should consider when making claims about Mandeville’s popularity or that of other authors. Kaye, writing in 1922, argued that Fable of the Bees had a significant effect on 18th-century thought, calling Mandeville a ‘major dignitary’ (1922, p.83). On the other hand, Kaye writes that his literary influence was ‘not considerable’ (1922, p.85). Because scholars have consistently used the same evidence and repeated the same narrative, ambiguity persists in assessing Mandeville’s impact and popularity, particularly with regard to his long-term influence and public reception.

Measuring the influence and popularity of early-modern authors is a complex task if taken seriously. For studying the hand-press era (roughly until the 1830s), one approach is to systematically study the number of editions and published works over time, which provides objective and comparative insights into an author’s literary output (Tolonen et al. 2021). However, this method only captures one aspect of influence, specifically popularity in terms of relative readership. A commonly-used approach to measure influence and popularity is to examine references and allusions to an author in the works of popular authors, as demonstrated by Kaye and other scholars in the case of Mandeville.21 This method allows for a more nuanced understanding of an author’s impact, as it takes into account how their ideas were referenced and incorporated by other influential writers of their time. However, it should be noted that this approach may not be systematic enough and could be biased towards particular trends or perspectives that were popular during the time of analysis.

Kaye’s treatment of Mandeville as a utilitarian thinker in his work is an example of how anecdotal approaches can be prone to emphasising particular trends or selecting authors deemed important at the time of analysis, rather than taking a more objective and broader perspective. Kaye initially presented the lack of utilitarian components in the Fable as a possible reason to doubt Mandeville’s authorship of the Publick Stews (1921, p.456). However, a year later he used Publick Stews as an example of Mandeville’s influence on utilitarian thinking in his subsequent work on influence, which was motivated by trends in scholarship in the 1920s (1922, p.87). When evaluated objectively, this may be considered lacking in scholarly rigour. For Kaye, being able to claim that Mandeville authored Publick Stews was important because there were several European publications of that work, which could have added to the popularity and influence of Mandeville. In this way, Mandeville’s popularity was not only riding on the public controversy based on the 1723 printing of the Fable.

In the case of Mandeville’s reception, his public reception (which in Britain, France and Germany was undoubtedly a matter of public outcry) needs to be conceptually separated from authorial reception at the level of productive ideas, which is a somewhat different matter. Computational similarity analysis allows us to understand Mandeville’s authorial reception systematically and at scale, rather than just understanding his influence on a small number of significant, canonical authors. This raises an interesting question from a methodological perspective – it is easy to claim that an author was popular, but it is much harder to provide specific numbers or justifications for such claims, besides particular and anecdotal aspects of reception.

In this part of our article, we aim to implement text reuse as an additional method for measuring influence, which can complement other approaches such as examining raw numbers based on edition and work counts during the hand-press era. While Mandeville’s influence on Enlightenment thinkers has been well established, the magnitude of his popularity in 18th-century Britain is not entirely clear. Our aim here is to introduce one new way of studying influence in the concrete case of Mandeville and show what this means in terms of producing comparable numbers. The aim, therefore, is not to overturn traditional Mandeville scholarship but to supplement it. It is hoped that in the future these approaches will merge.

Text reuse as an indicator of influence

Text reuse allows us to see exactly how this direct influence looked and how it can be found throughout ECCO. In total, instances of Mandeville text reuse can be found in 2557 works (3989 editions), ranging from extensive passages to very short segments of text. In Figure 8, this reuse is charted, for Mandeville’s most reused texts: Fable of the Bees, followed by Free Thoughts on Religion, Treatise, and Fable of the Bees Part II. Each bar represents the amount (in pages for simplicity, converted from characters) of reuse of Mandeville works found in editions by other authors. Reuse found in reprinted editions is also included, using the logic that if Mandeville was reused by an author, and then that author’s work was subsequently popular enough to warrant further editions, then this is also an indirect part of Mandeville’s influence.

If we look at the pattern of the reuse of these top four works over the century, we can chart in some detail Mandeville’s changing influence. The chart shows a similar initial pattern for each work: following the publication of the first edition, the reuse of each grows to a peak of popularity before fading out again. This is followed by sustained ‘bumps’ in reuse at semi-regular intervals throughout the century, as well as a substantial increase in the reuse of Fable of the Bees from about 1795. While this should be taken in context of the volume of material in ECCO (which increases greatly as the century goes on meaning that in real terms the proportion of Mandeville reuse falls), it is still a meaningful change in the general pattern and worth further exploration.

Figure 8

Reuse of four most reused Mandeville texts, by year in the 18th century.
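The aggregation behind a chart of this kind, and the caveat about the growing volume of ECCO, can be sketched as follows. This is a minimal illustration under stated assumptions: the characters-per-page constant, the hit records and the yearly corpus sizes are all invented, and `reuse_by_year` is a hypothetical helper rather than part of any pipeline described in this article.

```python
from collections import defaultdict

CHARS_PER_PAGE = 2_000  # assumed conversion factor, for illustration only


def reuse_by_year(hits, corpus_chars_by_year):
    """Sum reused characters per year and report (pages, share of corpus)."""
    chars = defaultdict(int)
    for year, n_chars in hits:
        chars[year] += n_chars
    summary = {}
    for year in sorted(chars):
        pages = chars[year] / CHARS_PER_PAGE
        share = chars[year] / corpus_chars_by_year[year]
        summary[year] = (pages, share)
    return summary


# Invented records: (publication year of the reusing edition, reused characters).
hits = [(1724, 4_000), (1724, 2_000), (1795, 12_000)]
# Invented corpus sizes in characters for the same years.
corpus_chars_by_year = {1724: 1_000_000, 1795: 10_000_000}

for year, (pages, share) in reuse_by_year(hits, corpus_chars_by_year).items():
    print(year, f"{pages:.1f} pages", f"{share:.4%} of corpus")
```

In this invented example the later year shows twice as many reused pages but a smaller share of the corpus, which is the pattern the text cautions about: absolute reuse can rise while the proportion falls as the corpus grows.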

Text reuse also helps us to understand exactly who was reusing Mandeville. Counting the total amount of Mandeville texts reused by other authors, we see a list of many important authors, including many identified by Kaye as influenced by Mandeville (e.g. George Berkeley and William Law, as well as many others). Substantial reuse of Mandeville can be found in the work of Frederick Eden (whose writings on social issues such as poverty make a natural alignment with Mandeville), the antiquary John Smith (1747–1807), and the Whig historian Catharine Macaulay. Other quoting authors are surprisingly absent or minimally present, such as David Hume, Edward Gibbon and Adam Smith. As we know, these authors directly and indirectly referenced and were influenced by Mandeville, for example in the course of the luxury debate (Berry 1994). This demonstrates the limits of the method which should be combined with other measures of influence.

What caused these ‘spikes’? The measured metric is the amount of text reused, so in some cases a single author reusing longer passages is the cause. This is the case for the spike in 1736–1737, where Mandeville is extensively quoted by Erasmus Jones in his works Luxury and The Man of Manners. Text reuse highlights early, overlooked reception to Fable of the Bees, such as that by Aubry de La Mottraye in his Travels. In many cases, the strongest examples of reuse are direct responses to Mandeville, such as Law’s Remarks Upon a Late Book, Entituled, The Fable of the Bees, Dennis’s Vice and Luxury Publick Mischiefs, or the anonymously published The True Meaning of the Fable of the Bees. George Berkeley quotes short passages – attributed to Mandeville in footnotes – such as quoting Mandeville on ‘moral virtues’ as a political method of keeping the populace in check, a quote he uses in both Alciphron and A Discourse Addressed to Magistrates and Men in Authority.

By the end of the century we can see the more direct influence of Fable of the Bees. Eden directly quotes from it in reference to poverty, and calls Mandeville’s writings on civil society ‘ingenious’. Mandeville text can be found in reference works (Motherby’s A Medical Dictionary; or, General Repository of Physic, Francis Fuller’s Medicina gymnastica) and literary magazines such as Tatler. We find Mandeville quoted in the work of Edward Henry Iliff (Angelo, a Novel, which quotes a passage of Mandeville where he argues that the principle of honour keeps society together). Vicesimus Knox’s The Spirit of Despotism quotes and borrows liberally from Fable of the Bees, often without attribution or attributed simply to a ‘sagacious author’.

Overall, therefore, the pattern of borrowing from Mandeville is clear. Taking Fable of the Bees as an example, we can plot its popularity or influence over time as found through the text reuse data. The timeline shows that Fable had an initial run of popularity in the years after its first publication, peaking in about 1732, before remaining steady or waning for most of the century until the final two decades, when it rose sharply. This points to a strong revival of interest in the ideas of Mandeville, for example his thoughts on luxury as a necessity for a functioning society, which became the prevailing viewpoint later in the century. It is clear that this manner of detecting text reuse is of significant value to tracing the real influence of a given author or text, rather than relying solely on contemporary anecdotes or 20th-century scholarship.

Mandeville and genre

As mentioned above, scholars have historically tended to downplay Mandeville’s literary influences, highlighting instead his influence on social theory, the history of ideas and so forth. However, when looking at text reuse at this scale, we can get a much more complete picture of the types of works which reused Mandeville. The chart in Figure 9 displays the counts of the volume of Mandeville text reused by various genres of texts found in ECCO.22

Figure 9

This diagram visualises the proportions of reuse for the top five most reused works of Mandeville, by genre. The first edition of each Mandeville text is used to avoid the emphasis of later editions (texts with many editions will show up as being reused more, overall). The size of the rectangle containing each work and genre is determined by the word count of the reused text.

At this level, taken at the aggregate, a very different picture of Mandeville’s reuse can be seen. What is immediately surprising is the importance of literary and religious texts, contrary to the traditional view of influence. This chart shows that while we traditionally consider Mandeville’s influence as primarily around social and political thought, in fact his work was more likely to be reused by religious and literary texts. Treatise, for example, is reused by authors in literary and religious texts more often than in scientific, medical texts. This is of particular interest when we consider the changing role of the author over the 18th century. By extracting passages in this ‘cut and paste’ way, authors could and did change the original meaning of a passage, which gains much of its meaning from its context.23 Perhaps unsurprisingly, authors were less concerned about the original generic context of a quote or passage, and had no qualms about using it in a completely different context – they were interested primarily in how it could be repurposed to suit their own means. This is perhaps why we see this wide spread of genres across which Mandeville was reused.

The network of Mandeville reusers

A final way to understand the influence of Mandeville is by looking at how the authors who reuse him are clustered by their own textual influence and cross-referencing – to build a picture of the network of intertextuality surrounding Mandeville. To do this, we took all the authors who reuse Mandeville (747 in total), and then extracted all of their text reuse across all of ECCO. These authors are responsible for nearly 23 000 editions and overall they reuse, to some degree, almost the entirety of the texts in ECCO – nearly 200 000 editions. To understand more about the structure of this group of authors and their patterns of reuse, we counted the number of times pairs of these authors reused the same work, and used an algorithm to cluster the results – so that authors who reuse many of the same works have a strong ‘weight’ connecting them, and will be clustered together.24
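The weighting step described above, counting how often pairs of authors reuse the same work, can be sketched in a few lines. The clustering algorithm itself is not reproduced here; the sketch only builds the weighted edge list that such an algorithm would take as input. The author and work labels are placeholders, and `co_reuse_weights` is a hypothetical name.

```python
from collections import defaultdict
from itertools import combinations


def co_reuse_weights(reuses):
    """Given (author, reused_work) pairs, count for each pair of authors
    how many works both of them reuse."""
    authors_by_work = defaultdict(set)
    for author, work in reuses:
        authors_by_work[work].add(author)
    weights = defaultdict(int)
    for authors in authors_by_work.values():
        # Sorting gives each unordered pair a single canonical key.
        for a, b in combinations(sorted(authors), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Placeholder records, not real reuse data.
reuses = [
    ("author_a", "work_1"), ("author_b", "work_1"),
    ("author_a", "work_2"), ("author_b", "work_2"), ("author_c", "work_2"),
]

for pair, weight in sorted(co_reuse_weights(reuses).items()):
    print(pair, weight)
```

The resulting dictionary is a weighted, undirected edge list. Any standard community detection method could then be applied to it, so that authors connected by heavy edges tend to fall into the same cluster.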

The algorithm splits this network into two evenly sized clusters, which might be thought of as the separate ‘communities’ of those influenced by Mandeville. The first is mainly literary and includes Samuel Johnson, Gilbert Burnet, George Brewer, Alexander Pope, Joseph Addison and Richard Steele. The second is weighted towards religious authors in particular. Authors in this community include the Anglican cleric and philosopher Samuel Clarke, Joseph Priestley (known primarily as a natural philosopher, although he also published theological works), John Wesley, Isaac Watts and the Scottish reverend Thomas Boston.

If we look at what these communities are actually reusing from Mandeville, there is a clear difference, with the religious ‘community’ referencing more Free Thoughts and the first group with a bias towards Fable of the Bees and Treatise.

Figure 10

Count of reuse of Mandeville of the two communities identified by network analysis. This chart shows which Mandeville texts each of those communities were mostly reusing.

What this highlights are two separate communities within which Mandeville’s influence falls in the 18th century. We might think of them as two different communities of practice, each with its own reason for reusing Mandeville and also having their own types of texts which they in turn reuse when we look at ECCO more widely. This information, along with the genre analysis above, allows us to dig more deeply and get a fine-grained picture of the communities of authors (and texts) which reuse Mandeville and for what reasons. This is further evidence that Mandeville’s texts and quotes are repurposed for multiple different reasons.


Using a systematic and data-driven approach through text-similarity analysis to examine the patterns of overlap in Mandeville’s works and comparing them to other popular works of the time, we have been able to study the composition of Mandeville’s works, shed light on their reception and challenge previous assumptions about authorship attribution.

One of the key contributions of our study is the critical examination of the attribution of Publick Stews to Mandeville. Our findings have revealed clear borrowing patterns that differ from those found in Mandeville’s other works, casting doubt on the attribution of Publick Stews to him. This highlights the importance of using empirical evidence and data-driven analysis in authorship attribution studies, as relying solely on content similarities can be misleading.

Furthermore, our research has demonstrated the potential of text reuse analysis in tracing the borrowing habits of both Mandeville and other authors. By systematically examining the reuse patterns across different editions of Mandeville’s Free Thoughts and Treatise, we have been able to gain insights into the evolution of these works and the influences on Mandeville’s writing. This has opened up new avenues for understanding the intertextual connections between literary works of the 18th century and the broader intellectual history of the time.

Our study also underscores the significance of using text reuse analysis as a tool for enhancing our understanding of the reception context of literary works. By analysing direct quotes from Fable of the Bees in comparison to other works of Mandeville and popular works of the time, we have been able to provide empirical evidence for the popularity and influence of Mandeville’s works during his time, which would have been difficult to ascertain through traditional methods alone.

Our research has highlighted the versatility and scalability of text reuse analysis as a valuable tool for advancing literary studies and the history of ideas, particularly in the context of historical texts available in digital databases like EEBO or ECCO, and has expanded its use beyond studies of virality and news. Our systematic analysis of Mandeville’s works has demonstrated the power of text reuse analysis in revealing insights about intertextuality, borrowing habits, authorship attribution and reception context. The scalability of this approach offers opportunities for reproducibility, transparency and interdisciplinary collaborations. However, limitations such as the quality of digitised texts and potential biases should be acknowledged. Overall, our study encourages further exploration and utilisation of text similarity analysis in early-modern scholarship.


