Published: Wed 11 January 2023
By Joseph Reagle
In technology.
Like the startling clap that follows distant rumbles of thunder,
artificial intelligence (AI) has arrived. Stunning images and precocious
prose can be generated at the behest of anyone. You need only download
Diffusion
Bee or create a free account at chat.openai.com to toy with
these marvels yourself.
And anyone who does so won’t be surprised by the many headlines
warning that “AI’s threats to jobs and human happiness are real”
(Strickland 2022). The thunderclap can be so startling that a Google
engineer came to believe machines had finally become sentient; he was
fired in July 2022 for publicizing his belief (Lemoine 2022). At the
outset of 2023, anyone can experiment and draw their own conclusions.
While this technology might, some day soon, be disruptive to artists,
journalists, authors, teachers, copywriters, photographers,
illustrators, and others, it is already a problem for some: online
moderators. At Reddit, Wikipedia, and Stack Overflow, contributors pride
themselves on producing content that is, in their lights, correct. These
are epistemic communities, whose members share an understanding
of what constitutes a quality contribution, how it is made, and how to
reward its authors (Tzouris 2002, 21).
Reddit and Stack Overflow, especially, gamify the creation of quality
content via “karma” and “reputation.” Such voting is intrinsic to the
platforms’ curation, and members share more substantive and personal
recognition by way of flair and awards. On r/AskHistorians users upvote
questions and answers; additionally, great questions are labeled as
such; flairs are given by moderators to designate areas of expertise based
on previous contributions; and fellow users can give awards including
“helpful,” “awesome answer,” and “wholesome.” Questions and answers are
similarly voted on at Stack Overflow; open questions can have bounties,
and flairs include gold, silver, and bronze badges. At Wikipedia, voting
and ranking are less explicit; nonetheless, some dedicated Wikipedians
keep a public tally of their edit count, the quality assessments of
their articles, and the awards given to them by their peers.
The challenge for epistemic communities in the face of AI is that of
verisimilitude. The latest bots can produce content that looks
good but substantively fails, even if the substance is spot-on most of
the time. When I asked ChatGPT about this concept, it responded that
“verisimilitude does not necessarily have to be based on actual truth or
reality. Rather, it refers to the appearance of being true or real, or
the extent to which a work of fiction seems believable or convincing”
(ChatGPT 2022). This aligns with other definitions, including
Wiktionary’s user-edited definition: “1. The property of seeming true,
of resembling reality; resemblance to reality, realism. 2. A statement
which merely appears to be true” (“Verisimilitude” 2022).
How do I know, however, that someone didn’t use ChatGPT to create the
Wiktionary definition? In this case, both definitions align with more
authoritative references, but this is the challenge for epistemic
communities. When AI can produce verisimilitude, whether in response to
requests for advice, questions about history, or queries about dunder
methods in Python, what ought these communities to do?
Traditionally, there’s been a rough correlation between the quality of
content and its polish. Yet poor-quality content can now evince a
brilliant sheen.
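(For readers who do not write Python: “dunder,” or double-underscore,
methods are the hooks that let an object plug into the language’s
built-in syntax. The toy class below is my own minimal illustration,
not drawn from any of the communities discussed; it is exactly the
sort of question where a fluent but subtly wrong bot answer is hardest
to catch.)

    class Playlist:
        """A toy class illustrating "dunder" (double-underscore) methods."""

        def __init__(self, tracks):
            self.tracks = list(tracks)

        def __len__(self):
            # Defining __len__ makes the built-in len() work on instances.
            return len(self.tracks)

        def __contains__(self, track):
            # Defining __contains__ makes the `in` operator work.
            return track in self.tracks

    mix = Playlist(["Intro", "Outro"])
    print(len(mix))        # 2
    print("Intro" in mix)  # True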
Unlike the artists who object to their work being used to train AIs,
or the illustrators who fear that their jobs are threatened, or the
teachers who worry cheating will be harder to detect, moderators are
complaining that their communities are seeing polished content that
appears accurate but is not. Two years ago, u/pianobutter on
r/TheoryOfReddit anticipated “The New Generation of Spam Bots are
Coming.” Their thread, on a subreddit dedicated to musing about Reddit
itself, began: “Reddit is about to become a battleground. A test site
for a new age of social media. Perhaps even civilization. Things are
going to get weird.” They believed that “Transformer models +
Reinforcement Learning” would replace “human astroturfers and trolls.”
(ChatGPT uses these techniques; it is a Generative Pre-trained
Transformer chatbot.) Most frighteningly, “Downvoting them won’t help:
by downvoting them you are just training them, making them better.
Ignoring them is no good either. They will participate and their
influence will inevitably increase.” Reddit would be the front line, and
the bots would usher in “a new era of propaganda” (pianobutter 2020).
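pianobutter’s worry is easy to make concrete. The sketch below, which
is mine and not anyone’s actual bot, is roughly all the code a
comment-generating account would need, using OpenAI’s completion API as
it stood in late 2022; the prompt, the model choice, and the
draft_comment name are illustrative, and actually posting the result
(with a library such as PRAW) is omitted.

    import openai  # the pre-1.0 openai package, current in late 2022

    openai.api_key = "sk-..."  # a real API key would go here

    def draft_comment(post_title: str) -> str:
        """Generate a plausible-sounding reply to a Reddit post title."""
        response = openai.Completion.create(
            model="text-davinci-003",  # a completion model of the period
            prompt=f"Write a short, friendly Reddit comment replying to: {post_title}",
            max_tokens=80,
            temperature=0.8,
        )
        return response["choices"][0]["text"].strip()

    print(draft_comment("What's the best beginner houseplant?"))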
Not surprisingly, r/CryptoCurrency was one of the first to be hit
with GPT-style bots. The bots were using GPT-NEO, an inexpensive yet
“extremely powerful neutral langue network [sic],” to farm karma
(i_have_chosen_a_name 2021). One can
imagine a network of high-karma bots running pump-and-dump
cryptocurrency schemes. More recently, a concerned Redditor posted to
r/ModSupport that r/nostupidquestions is seeing an “increasing[ly] large
number of GPT style answer bots which provide nonsensical but reasonable
sounding responses” (Petwins 2022). The karma these bots
accumulate could be used to boost propaganda, for example. In December,
another Redditor noted that the problem had spread to other subreddits,
including r/AskReddit, whose moderators had banned over a thousand GPT
bots in that week alone and were struggling to keep up. Even r/houseplants
discovered that a well-regarded embroidery of a leaf posted to their sub
turned out to be an AI creation (VVHYY 2022).
Over at Stack Overflow, where you can find questions and answers about
artificial intelligence algorithms, ChatGPT answers were banned soon
after the tool’s release: though “the answers which ChatGPT produces
have a high rate of being incorrect, they typically look like they might
be good and the answers are very easy to produce” (Makyen 2022). A ban, of course, is only
observed by honest contributors. Dishonest contributors seeking to
“reputation farm” must somehow be moderated, perhaps by capping how many
questions an account can answer in a day, as in the sketch below.
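Here is a minimal sketch of such a cap, assuming a hypothetical
moderation hook that runs before an answer is accepted; the five-per-day
limit, the in-memory counter, and the may_post_answer name are all
illustrative rather than Stack Overflow’s actual policy or code.

    from collections import defaultdict
    from datetime import date

    DAILY_ANSWER_LIMIT = 5  # illustrative threshold, not a real site policy

    # Maps (account_id, day) to the number of answers posted so far.
    answers_today = defaultdict(int)

    def may_post_answer(account_id: str) -> bool:
        """Count the answer and return True if the account is under quota."""
        key = (account_id, date.today())
        if answers_today[key] >= DAILY_ANSWER_LIMIT:
            return False
        answers_today[key] += 1
        return True

    # A burst of rapid-fire answers, as a reputation farmer might attempt:
    print([may_post_answer("bot-account") for _ in range(7)])
    # [True, True, True, True, True, False, False]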
Even so, you need merely ask ChatGPT to “write me a Wikipedia article
with five cited sources” and it appears to do so, even if some of the
sources don’t, in fact, exist.
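One crude defense, sketched below on the assumption that a reviewer has
already extracted a citation’s URLs into a list, is simply to check
whether the cited pages exist at all; the spot_check_urls name and the
placeholder URL are mine, and a page that loads is of course no proof
that it supports the claim it is cited for.

    import requests  # third-party: pip install requests

    def spot_check_urls(urls):
        """Flag cited URLs that do not resolve at all."""
        results = {}
        for url in urls:
            try:
                response = requests.head(url, allow_redirects=True, timeout=10)
                results[url] = response.status_code < 400
            except requests.RequestException:
                results[url] = False  # DNS failure, timeout, and the like
        return results

    # A placeholder URL, not one from an actual ChatGPT-generated article:
    print(spot_check_urls(["https://example.com/plausible-but-fake-source"]))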
Because Wikipedia lacks explicit voting, karma farming is not so rife. When
Wikipedian Ian Watt shared the example that resulted from this prompt, another longtime
contributor, Andrew Lih, reviewed it relative to the distinctions
between data, information, knowledge, and wisdom: “Vanilla GPT produces
plausible data, prose that esoterically resembles information, passable
but inconsistent knowledge for certain verticals, and most definitely
not wisdom. The worry comes when the bad and good are commingled and
indistinguishable from one another.” Amusingly, he also noted that when he
asked ChatGPT how to upload media to Wikipedia, “it’s [sic] answer was
clearer than most of our on-wiki documentation. I’m not sure if that’s a
compliment to the AI, or an indictment of our documentation” (Lih 2022). Whether uploading
AI-generated images is acceptable has been a topic of discussion at
Wikimedia Commons for the past two years (Owlsmcgee 2019; RAN 2021). The
most recent discussion was accompanied by an image whose caption spoke
to one opinion on the copyright of the resulting image: “‘An astronaut
riding a horse, in the style of Monet’. Monet did not paint this image,
and even if he were alive today, he is not the copyright holder of this
work simply because of the brushstroke patterns” (Arkesteijn 2022). There are many other
issues implicated as well.
As the many headlines indicate, the widespread availability of Stable
Diffusion and transformer-based AI has far-reaching implications for the
near future. But people at Reddit, Stack Overflow, and Wikipedia are
grappling with those implications today. And many of us will soon be
grappling with the meaning of verisimilitude in the digital age as it is
used to infiltrate the epistemic communities we rely upon in a world
already struggling with misinformation.
(Thanks to Sarah Ann Gilbert
for discussing this with me.)
References
Arkesteijn, Jan. 2022. “Commons:Village pump/Archive/2022/10.” Wikimedia Commons. October 21, 2022. https://commons.wikimedia.org/wiki/Commons:Village_pump/Archive/2022/10#AI-generated_works.
ChatGPT. 2022. “A Chat about Verisimilitude.” OpenAI. December 15, 2022. https://chat.openai.com/chat.
i_have_chosen_a_name. 2021. “/r/cryptocurrency Is Being Run over by GPT-NEO Bots. Every Single Topic You Make, Not Matter What It Is about Will Instantly Have 5-10 Comments Made by Bots. A Good 40% of New Comments Made Here Are Made by Bots.” r/CryptoCurrency. https://www.reddit.com/r/CryptoCurrency/comments/p8m0ik/rcryptocurrency_is_being_run_over_by_gptneo_bots/.
Lemoine, Blake. 2022. “Is LaMDA Sentient? — an Interview by Blake Lemoine.” Medium (blog). June 11, 2022. https://cajundiscordian.medium.com/is-lamda-sentient-an-interview-ea64d916d917.
Lih, Andrew. 2022. “Now That Said….” Mastodon. https://wikis.world/@fuzheado/109467318402404985.
Makyen. 2022. “Temporary Policy: ChatGPT Is Banned.” Meta Stack Overflow. December 8, 2022. https://meta.stackoverflow.com/questions/421831/temporary-policy-chatgpt-is-banned.
pianobutter. 2020. “The New Generation of Spam Bots Are Coming: Where Do We Go from Here?” r/TheoryOfReddit. https://www.reddit.com/r/TheoryOfReddit/comments/i7yd7m/the_new_generation_of_spam_bots_are_coming_where/.
Strickland, Eliza. 2022. “AI’s Threats to Jobs and Human Happiness Are Real.” IEEE Spectrum, May 12, 2022. https://spectrum.ieee.org/kai-fu-lee-ai-jobs.
Tzouris, Menelaos. 2002. “Software Freedom, Open Software and the Participant’s Motivation - a Multidisciplinary Study.” M.Sc. thesis. London School of Economics and Political Science. http://opensource.mit.edu/papers/tzouris.pdf.
“Verisimilitude.” 2022. Wiktionary. December 10, 2022. https://en.wiktionary.org/wiki/verisimilitude.