Encyclopedias, Copyright, and Plagiarism

Joseph Reagle


Wikipedia and Bentham

My adviser noted that material appearing in the Jeremy Bentham Wikipedia article was a verbatim copy of that found elsewhere.

Jeremy Bentham article

Hi I was just going through WP:CP#September_29 and I noticed that Jeremy Bentham has been listed as a copyvio of the biography at the UCL site . If you did copy this from there can you ask permission from Irena Nicoll (email ) if they don't give permission the article will need to be redrafted.... Cheers Arniep 00:37, 12 October 2005 (UTC)

So redrafted, with apologies. --Susurrus 05:04, 12 October 2005 (UTC)

So, perhaps a technical copyright infringement has been avoided, but is this still plagiarism?

Pliny the Elder ("the vacuum")

Gaius Julius Solinus ("the Xerox")

Stockwell (2001:19) argues that "the work of Gaius Julius Solinus of the third century draws so heavily (about 90%) on Pliny's Natural History, without acknowledgment, and on other works of the time that one hesitates to list him at all except for the fact that several medieval writers copied parts of it into their own encyclopedias."

Chambers ("the bee")

Chambers admitted that his dictionary contained "little new, and of my own growth," but felt no embarrassment: the work was professedly "not the produce of one man's wit" but a collection from the world of learning, in which "nobody that fell in my way has been spared, ancient nor modern, foreign nor domestic, Christian nor Jew, nor Heathen: philosophers, divines, mathematicians, critics, casuists, grammarians, physicians, antiquaries, mechanics, have been all brought under contribution" (Yeo 2001:205).

Chambers further expresses his personal views in the actual entry on 'Plagiary,' likening the practice to scientific contribution or to the work of a humble bumble bee:

Their [dictionary compilers'] Works are supposed, in great Measure, Assemblages of other Peoples; and what they take from others they do it avowedly, and in the open Sun. In effect, their Quality gives them a Title to every thing that may be for their purpose, wherever they find it; and if they rob, they don't do it any otherwise, than as the Bee does, for the public Service. Their Occupation is not pillaging, but collecting Contributions. (qtd. in Yeo 2001:216)

William Smellie ("the scissors")

William Smellie, the compiler of the first Encyclopedia Britannica, is said to have admitted over a drink, "with paste pot and scissors I compose it" (McArthur 1986:107 citing Kogan). In another account he confessed that he "made a Dictionary of Arts and Sciences with a pair of scissors, clipping out from various books a quantum sufficit of matter for the printer" (Yeo 2001:182 quoting Kerr).

The rest (filchers all)

Johnson's publishers were surprised by Bailey's sometimes thinly veiled plagiarism (Morton 1994:27).

Webster and Worcester warred over alleged plagiarism (Morton 1994:46).

Many articles of the Encyclopédie were lifted directly from Chambers, though Diderot claimed most had been reworked (Yeo 2001:126).


Between 1710 and 1774 ... the new major British encyclopedias -- the Cyclopedia (in several editions and a Supplement) and the Encyclopaedia Britannica -- as well as five editions of Harris' Lexicon Technicum and its Supplement were published, together with at least four other minor competitors. All borrowed from each other, and especially from Chambers who had himself used earlier material. (Yeo 2001:206)

Uses of knowledge

Abridgment was a long-standing scholarly practice, and Yeo argues that encyclopedias evolved from commonplace books:

John Le Clerc, who asked Locke to publish this method, made a link with the Renaissance interest in memory aids. In urging the use of a commonplace book, he remarked that when Memory became "Oppressed, or Over burthen'd by too many Things, Order and Method are to be called in to its Assistance. So that when we extract any Thing out of an Author which is like to be of future Use, we may be able to find it without any trouble." (Yeo 2001:112)

Knowledge as property

Chambers wrote in his Preface:

'Tis vain to pretend anything of property in things of this nature. To offer our thoughts to the public, and yet pretend a right reserved therein to one's self, if it be not absurd, yet it is sordid. These words we speak, nay, the breath we emit, is not more vague and common than our thoughts, when divulged in print. (qtd. in Yeo 2001:215)

Declaring against perpetual literary property he [Lord Camden in 1774] said that "science and learning are in their nature publici juris, and they ought to be as free and general as air or water... Knowledge has no value or use for the solitary owner; to be enjoyed it must be communicated." (Yeo 2001:204)

Sharing the light

Lord Camden reminds me of Thomas Jefferson, who wrote:

He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature, when she made them, like fire, expansible over all space, without lessening their density in any point, and like the air in which we breathe, move, and have our physical being, incapable of confinement or exclusive appropriation. Inventions then cannot, in nature, be a subject of property. Society may give an exclusive right to the profits arising from them, as an encouragement to men to pursue ideas which may produce utility, but this may or may not be done, according to the will and convenience of the society, without claim or complaint from anybody... (letter to Isaac McPherson, 1813 as cited in Koch & Peden, 1972)

Knowledge as commerce

Ironically, whatever justifications editors gave for borrowing from other works -- if they even bothered to rationalize -- they were also concerned about infringement of their own work: they applied for and prominently displayed the sovereign grants ("letters patents") that gave them a monopoly to publish the work. This practice continued even after the first copyright law, the Statute of Anne, was passed, given ambiguities about the statute's relationship to existing common law, grants, and the types of works involved. And one can understand why: the Cyclopedia was one of the most valuable literary works of its day. Any copyright worth over 400£ was prized and the average value per work was 200£; the Cyclopedia was thought to be worth 6,400£ in the 1740s (Yeo 2001:199).

HG Wells, the World Brain, and the Spinster

Given the resources of "micro-photography" Wells felt: "the time is close at hand when a student, in any part of the world, will be able to sit with his projector in his own study at his or her convenience to examine any book, any document, in exact replica" (Wells 1938:54).

An additional feature of Wells' proposal was that it "should consist of selections, quotations, and abstracts as assembled by authorities -- one need not create summaries" (Wells 1936:921). Interestingly, Wells may have applied this liberal attitude toward borrowing in the creation of his own book, The Outline of History. Florence Deeks accused him of plagiarism in that book, and he dismissed her for "conceiv[ing] the strange idea that she held the copyright to human history" (Keats 2002), as detailed in A.B. McKillop's The Spinster and the Prophet.

Recently, the famous popular historian Stephen Ambrose got into trouble for borrowing primary source quotations and the surrounding secondary material, tweaking that, and presenting it as his own, though with an (imprecise) citation, as detailed in Hoffer's (2004) Past imperfect: facts, fictions, fraud -- American history from Bancroft and Parkman to Ambrose, Bellesiles, Ellis, and Goodwin.

Plagiarizing Wikipedia

On Slashdot in January:

Tim Ryan, a 21 year veteran entertainment columnist for the Honolulu Star Bulletin, was fired yesterday after an investigation revealed multiple instances of his incorporating unattributed paragraphs from other sources. This case is unique in that it was first revealed by Wikipedia after an attentive Wikipedia editor noted similarities between a Wikipedia article and one of Ryan's columns. However he wasn't fired until after other news outlets started to run the story. Sadly, though the Star-Bulletin has admitted to the plagiarism, they failed to publicly acknowledge that Wikipedia was responsible for bringing this situation to light. (anonymous 2006)

Citing Wikipedia

If a reference work points me to a more authoritative source, should I not at least acknowledge this bit of help, particularly if I'm more likely to be influenced by the summary provided by the reference? Additionally, why would any book among the thousands published each year be any more authoritative than a general reference work on the sole basis of its form? I could compile a multipage bibliography of books denying the Holocaust, but find few -- if any -- general-purpose reference works that did the same.

Should we teach students to trust a claim simply because it was uttered by a credentialed person? Or should we encourage them to click a link and teach them how to investigate for themselves?

Provenance and protection

Many open source projects are instituting mechanisms by which they can ensure the provenance of contributions. But in the case of Wikipedia this might be harder (given the massive number of editors, many of them anonymous), and it would not address plagiarism. In fact, does the notion of plagiarism even exist for code?

Larry Sanger, a co-founder of Wikipedia, posted an essay last year entitled Why collaborative free works should be protected by the law, in which he asked how open content communities can be protected against attacks arising from incidental abuses (e.g., copyright or libel) within the community (Sanger 2005).

... the law should be written in such a way as to make it possible for free, collaborative works to survive, because society will greatly benefit from them.

He proposes that such works be treated as "shopwork," "a portmanteau constructed from 'shared open work.'"

A recent example

[WikiEN-l] Our Featured Article of the Day Contains Probable Copyright Violations


Do we need different standards for the production of knowledge on the basis of:


