A two-month trial of the Pending Changes feature on the English Wikipedia is scheduled to begin on June 14. This will mean that for a small subset of problematic pages (i.e., biographies) edits by unregistered or newly-registered users will first have to be reviewed by an experienced editor to be seen by the "public" (i.e., those not logged in to Wikipedia).
I fully expect this will prompt much attention on whether Wikipedia is now more closed, open, or has even "failed."
I recently read Jaron Lanier's manifesto: You Are Not a Gadget. Lanier's critique of Wikipedia and digital Maoism plays an important role in my discussion of Wikipedia's reception. Hence, I was surprised to find the tone of Lanier's book to be more muted than I expected. While he does make an argument against "cybernetic totalism," it reads like learned musings that lead to intriguing pet-theories rather than a diatribe about Web 2.0. Jon Dron has written an informative review.
I did specifically wish to comment on something that makes me uncomfortable with a lot of cultural criticism: the critic's POV. (I use Wikipedia's acronym for "point of view" tongue-in-cheek in that critique is quite contrary to Neutral Point of View (NPOV).) In my exposure to cultural criticism, including Theodor Adorno's seminal 1936 critique On Jazz, I've had the uncomfortable sense that much of this is simply the subjective, disenchanted complaints of a grouch who attempts to convince us that his or her opinions are anything more than his or her opinions. Actually, it's not even that they are trying to convince us, but that anyone who does not agree with their subjective opinion is obviously part of the problem that they are railing against in the first place. I think it is important to be skeptical, to be critical, and to have personal opinions. (Despite the provocative title, Mark Bauerlein's The Dumbest Generation: How the Digital Age Stupefies Young Americans and Jeopardizes Our Future is an excellent example of persuasive criticism beyond opinion.) But sometimes the critic's statements seem over "totalizing." In Lanier's case, consider his distinction between first- and second-order expression -- using terms that have an authoritative mathematical/logical sort of feel:
First-order expression is when someone presents a whole, a work that integrates its own worldview and aesthetic. It is something genuinely new in the world. Second-order expression is made of fragmentary reactions to first-order expression. A movie like Blade Runner is first-order expression, as was the novel that inspired it, but a mashup in which a scene from the movie is accompanied by the anonymous masher's favorite song is not in the same league. (p. 122)
Now, I love Blade Runner; I think it is genius. And of course, a YouTube mashup is not in the same league as the complete film. But it is something, and maybe something people value, even if lightly. I laughed at some of the recent Hitler parodies from Downfall and was sad to see them removed. But video mashups in no way diminish the value of the original film. And Blade Runner is itself a second-order expression of a written book, and one that is famous for a cyber-noir aesthetic that synthesized so many existing elements of visual culture. Later, Lanier writes without qualification or caveat: "The web should have developed along the ThinkQuest model instead of the wiki model -- and would have, were it not for hive ideology" (p. 146). Well, if wishes were horses, beggars would ride, and the subjectivity and ahistorical conceit in such a statement simply boggle the mind.
The way in which Wikipedia is collaboratively produced has caught the attention of the world. Discourse about the efficacy and legitimacy of such a work abounds, from the news pages of the New York Times to the satire of the Onion. Building on the literature around controversies surrounding other reference works, such as Harvey Einbinder’s The Myth of the Britannica and Herbert Morton’s The Story of Webster’s Third, Joseph Reagle makes a broader argument that reference works can serve as a flashpoint for larger social anxieties about technological and social change. With this understanding in hand, he tries to make sense of the social unease embodied in and prompted by Wikipedia relative to technological inspiration in knowledge projects.
I just finished an excellent biography of Ayn Rand and her philosophy in the context of American political culture. While reading, I couldn't help but think of Wales' expressed interest in Objectivism, and the next-to-last page actually comments on this issue:
One of the many ironies of Rand's career is her latter-day popularity among entrepreneurs who are pioneering new forms of community. Among her high-profile fans is Wikipedia's founder Jimmy Wales, once an active participant in the listserv controversies of the Objectivist Center. A nonprofit that depends on charitable donations, Wikipedia may ultimately put its rival encyclopedias out of business. At the root of Wikipedia are warring sensibilities that seemed to both embody and defy Rand's beliefs. The website's emphasis on individual empowerment, the value of knowledge, and its own risky organizational model reflects Rand's sensibility. But its trust in the wisdom of crowds, celebration of the social nature of knowledge, and faith that many working together will produce something of enduring value contradict Rand's adage "all creation is individual." (Burns 2009, p. 284)
2009 Dec 07 | News of Wikipedia’s Death Has Been Greatly Exaggerated (Again)
In the past few weeks there's been much discussion of news stories based on Felipe Ortega's dissertation; the concern is that Wikipedians are abandoning the online encyclopedia “in droves.” (What is a drove you ask? According to Wikipedia, it is an ancient route by which livestock were herded.) However, Erik Zachte, with the help of Felipe, shows how in such analysis the way that one constructs one’s parameters significantly affects the conclusions one can draw. For example, the alleged drop-off (deaths) of Wikipedia editors may be more the result of when and how the analysis is done. If you assume that an active Wikipedian is someone who did one edit (i.e., someone who was just experimenting), rather than five, or some other number (i.e., actual Wikipedians), this can significantly affect the outcome. Or, if you assume that a "death" is when someone has not been active for a month, you will naturally have a lot of deaths at the end of the analysis period because these people may have been simply "sleeping" for that month, but come back in the next month and you weren't there to see it. (Like the line from Twin Falls Idaho, a favorite movie of mine, "The sad ending is only because the author stops telling the story. But it still goes on. It's just untold.")
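To make the point about parameters concrete, here is a minimal sketch with made-up edit histories. The data and the function names (`is_active`, `count_deaths`) are hypothetical and are not taken from Ortega's or Zachte's work; the sketch only shows how the activity threshold and the length of the "quiet" window change the number of apparent deaths at the end of an observation period.

```python
# Hypothetical monthly edit counts per editor (list index = month of the study).
# These numbers are invented for illustration only.
histories = {
    "alice": [6, 7, 0, 5, 8, 0],  # skipped a month once and came back; quiet in the final month
    "bob":   [1, 0, 0, 0, 0, 0],  # a single experimental edit
    "carol": [9, 5, 6, 0, 0, 0],  # really does stop after the third month
    "dave":  [5, 0, 6, 5, 7, 0],  # regular contributor; quiet in the final month
}

def is_active(history, min_edits):
    """Treat an editor as a Wikipedian if any single month reaches min_edits."""
    return any(n >= min_edits for n in history)

def count_deaths(histories, min_edits, quiet_months):
    """Count qualifying editors with no edits in the last quiet_months observed months."""
    deaths = 0
    for history in histories.values():
        if not is_active(history, min_edits):
            continue  # never counted as a Wikipedian, so cannot "die"
        if all(n == 0 for n in history[-quiet_months:]):
            deaths += 1
    return deaths

# Loose definitions: one edit makes a Wikipedian, one quiet month is a death.
loose = count_deaths(histories, min_edits=1, quiet_months=1)

# Stricter definitions: five edits in a month, three quiet months.
strict = count_deaths(histories, min_edits=5, quiet_months=3)

print(loose, strict)
```

With the loose definitions every editor in this toy data looks "dead" at the end of the window, including the two who may simply be sleeping; with the stricter ones only the editor who actually stopped is counted.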
Wikimedia’s lesser-noted response to the story claims significant efforts are being made to improve the recruitment and retention of users, but on the numbers side:
On the English Wikipedia, the peak number of active editors (5 edits per month) was 54,510 in March 2007. After a more significant decline by about 25%, it has been stable over the last year at a level of approximately 40,000. (See WikiStats data for the English Wikipedia.) Many other Wikipedia language editions saw a rise in the number of editors in the same time period.
Successful open communities must occasionally interact with closed worlds. For example, Wikipedia's openness and transparency sometimes conflict with its obligations to be responsive to the law (e.g., defamation, copyright, and human safety). Such is a consequence of becoming a notable and established institution.
A new source of tension is the "professionalization" of Wikipedia administration -- a move I otherwise commend. It appears professional marketers were asked to develop a marketing/fundraising campaign, yielding the "WIKIPEDIA FOREVER" slogan. Some Wikipedians feel this is inappropriate, arrogant, and loud -- a sentiment with which I agree. A more wiki-typical discussion of appropriate slogans can be found here.
In the previous analysis, of the 174 women from the National Women's History Project, Wikipedia lacked articles on 23 of the women, while Britannica missed 65. Hence, I found no support for the idea that gender imbalance in Wikipedians leads to similar imbalance in biographical coverage. However, this did support the (unsurprising) fact that Wikipedia has greater coverage in its number of subjects and article length. Therefore, as noted, on the gender question it would be nice to have a sense of relative proportions.
Consequently, in the second analysis I look at Time's "100" most influential people from 2008. (There are more than 100 subjects because there are a few couples that I break out.)
43 entries are missing from EB; 4 from WP. 4 entries are in neither. For articles existing in both, WP articles are 7.66 times larger on average (median of 6.81).
Of the 105 entries: I guess that 23 are female, 82 are male and 0 are unknown. That is, the ratio of females to males is 0.28. Of the Wikipedia articles, females are 0.29 (23/78) of males; and 0.27 (13/49) at Britannica.
That is, while one might claim that this ratio of 0.28 is evidence of a bias -- on the part of Time or the world at large -- it is a baseline from which we can judge the reference works: neither Wikipedia nor Britannica is disproportionately better or worse. If the reference works were biased towards coverage of men, we would expect that ratio to be lower than 0.28 (e.g., if all missed articles were females).
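The ratios above can be recomputed from the counts given in the text. This is only a sketch of the arithmetic; the dictionary layout and the `female_to_male` helper are my own naming, and the "worst case" line is a hypothetical scenario (all four articles missing from Wikipedia being women), not something observed in the data.

```python
# Counts as reported in the text for Time's 2008 list.
counts = {
    "Time list":  {"female": 23, "male": 82},
    "Wikipedia":  {"female": 23, "male": 78},
    "Britannica": {"female": 13, "male": 49},
}

def female_to_male(c):
    """Ratio of female subjects to male subjects."""
    return c["female"] / c["male"]

ratios = {name: round(female_to_male(c), 2) for name, c in counts.items()}

# Hypothetical worst case: suppose the 4 entries missing from Wikipedia
# had all been women while every man was covered.
worst_case = round((23 - 4) / 82, 2)

for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
print(f"Hypothetical biased case: {worst_case}")
```

A reference work whose ratio sat well below the baseline, as in the hypothetical case, would suggest the missing articles fall disproportionately on women; neither work shows that here.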
Of course, I'd like to run this over a larger corpus, but in terms of easy-to-find lists of notable persons, these "100" lists are all I've found so far. Also, I'm relying upon heuristics again to guess the gender of subjects, but they seem to be working well. (EB's Mia-Farrow article is guessed as male because it's actually a stub/sentence in the Woody Allen article.) Finally, an additional feature of my approach is that it can augment the table with the content from both reference works, but I expect Britannica would not be happy about that so I don't provide that version publicly.
The recent controversy about gender imbalance and sexism in open content communities
has been remarkable this summer, and this week's news about Shuttleworth's comments
might mean it will extend into the autumn. While I think these events merit a
historical and cultural analysis -- and prompt the question of whether sexism
has increased, is being noticed more, or both -- I want to postpone that
undertaking for the moment. Instead, I wonder whether the recent demographic data
showing that women are about 13% of Wikipedians
says anything about Wikipedia's topical coverage.
64 entries are missing from EB; 23 from WP. 23 entries are in neither. For articles existing in both, WP articles are 6.29 times larger on average (median of 4.00).
That is, of 174 women, Wikipedia is missing articles on 23 of them. That's
just over a third of the number missing from Britannica, which doesn't have
any articles not at Wikipedia. When both do have an article, Wikipedia articles
have much more content. Of course, those are just the quantitative numbers.
Even so, when I browse the actual articles, I am partial to the extra content
and images of Wikipedia.
Yet, a difficulty in this work is finding a useful corpus of biographical
subjects. To say that there are more articles about men than women in any
reference work isn't saying much given world history. So, for this analysis I
use those women recognized by the NWHP for Women's History Month. The NWHP is a
nice collection in that it has both well known women and lesser-known women who
are thought to be notable nonetheless. However, this only tells us that
Wikipedia has greater coverage of women than a traditional encyclopedia. (And
while this is one of the first large and topical -- rather than quality --
comparisons it should not be all that surprising given Wikipedia's size.) And,
Wikipedians are aware of their own systemic
bias and make attempts to counter it. For example, those recognized by
Black History Month were the focus of a WikiProject
that documented every person recognized. (Ironically, this list was taken from
Britannica. And perhaps the NWHP list will prompt a similar project at
Wikipedia, which is why I use permanent links to the specific versions I
analyzed.)
What would really be nice is a source corpus of notable persons, both male
and female. I could then compare this against Wikipedia and Britannica
to see how they fare relative to the source corpus. That is, a source corpus of
100 people might recognize 75 men and 25 women (25% female), and if one of the
references had a 60/15 split, it'd be less "feminine" (20%) than the source.
How, then, do the reference works compare to each other, relative to their
source? If you have a suggestion for a corpus, please leave a comment.
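The comparison I have in mind is simple arithmetic, sketched here with the illustrative 75/25 and 60/15 figures from the paragraph above. The `female_share` helper and the idea of dividing a reference work's share by the source's share are just one way to express "relative to the source"; other measures would work too.

```python
def female_share(men, women):
    """Fraction of subjects who are women."""
    return women / (men + women)

# Illustrative numbers from the text, not real data.
source = female_share(men=75, women=25)      # the source corpus
reference = female_share(men=60, women=15)   # what a reference work covers

# A value below 1.0 means the reference work is less "feminine" than its source.
relative = reference / source

print(f"source: {source:.0%}, reference: {reference:.0%}, relative: {relative:.2f}")
```

Computing the same relative figure for both Wikipedia and Britannica against one shared source corpus would let them be compared to each other directly.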
Finally, while speaking with Nora about this, she also raised the question
of whether the gender ratio of disruptive editors differs from that of the larger
community. Our hypothesis is that disruptive editors might be disproportionately
male. But who can say? Unfortunately, I expect it's difficult to get survey
responses from such editors.
I just finished reading Eric Goldman's "Wikipedia’s
Labor Squeeze and its Consequences" and it is a more reasoned argument than
the hyperbolic prediction of Wikipedia's failure. In fact, the claim that there
is a tension between openness and protecting against disruption shouldn't be a
surprise to anyone who is familiar with online communities. Wikipedia
has always had to balance the merits and challenges of openness (i.e.,
collaboration and disruption). Goldman's paper is a nice treatment of this
tension; here's my summary:
The author poses the feature of "free editability" against the need to
defend against unproductive contributions. Noting that technological
restrictions to date have been "fairly modest", he suggests Flagged Revision
features may be a significant change. The plateau of Wikipedian growth is
likely caused by editor turnover, an inability to attract and keep new
editors, and the lack of incentive mechanisms (e.g., relying only upon
intrinsic motivation). The author endorses technological barriers that
further constrain "free editability," and the recruitment and maintenance of
new contributors, including converting readers into contributors, recruiting
cash-motivated individuals, companies, academics, and students to
participate.
I have two substantive comments on the paper. First, I am surprised that he
even made the failure claim, or that he treats the observation of this tension
as novel, given that he quotes a 2005 email by Jimmy Wales. Last week, when I wrote that
an open community was not the founding vision of Wikipedia, but a surprisingly
productive means, I did not include one of the most compelling -- but later --
messages on that topic. Goldman quotes one sentence from Wales' 2005
message:
Wikipedia is first and foremost an effort to create and distribute a free
encyclopedia of the highest possible quality to every single person on the
planet in their own language. . . .
However, the rest of that paragraph that Goldman doesn't include shows that
Wales was purposely highlighting the encyclopedia as the goal, and the
community as a means:
. . . Asking whether the community comes before or after this goal is
really asking the wrong question: the entire purpose of the community is
precisely this goal.
Furthermore, Wales writes:
The community does not come before our task, the community is organized
around our task. The difference is simply that decisions ought to
always be made not on the grounds of social expediency or popular majority,
but in light of the requirements of the job we have set for ourselves. (Wales2005w)
I recommend you read the whole message.
Second, Goldman characterizes Wikipedia as atypical in rejecting
contributions from paid/professional content creators. He is conflating the conflict of
interest policy with the means of production. Yes, free and open source
developers are often paid for their work, and while this hasn't taken off at
Wikipedia (the market/incentives are different), I am not aware of any
Wikipedia policy that prohibits the adoption of professionally produced content
if it is appropriate to the encyclopedia and under a compatible license.
However, Wikipedia is rightfully careful about contributors who edit articles
about their own financial or reputational interests. This is the difference
between incorporating content written by a paid expert on their topic of
expertise, and rejecting their edits to their own biography.
So, on this note, what are some examples of content that was produced for
pay at the Wikimedia Foundation? I can think of some archival material, such as
the use of some material from the 11th edition of Britannica and images now in
Commons.
2009 Sep 04 | Some Figures on Wikipedia Protection Mechanisms
The recent focus on Wikipedia "failing" or being "closed" merits some figures and explanation. On the afternoon of Sept 04, 2009, the English Wikipedia has 3,024,063 articles.
5,137 articles are protected (that's 0.17% of all articles).
The majority of those (3,553 articles or 69% of protected articles) are semi-protected, meaning that while they aren't editable by anonymous users, they are by Wikipedians (i.e., those with an account in good standing).
Therefore, only 1,583 articles (0.05%) are fully protected and cannot be edited by non-administrator Wikipedians.
Of all the articles being protected, 1,337 of them (26%) are set to expire, most within a month or two.
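For anyone who wants to check the percentages, here is a small sketch that recomputes them from the counts quoted above. The variable names and the `pct` helper are mine; the counts themselves are the figures as stated in this post and are not re-queried from Wikipedia.

```python
# Counts as quoted in the post for the afternoon of Sept 04, 2009.
total_articles = 3_024_063
protected = 5_137
semi_protected = 3_553
fully_protected = 1_583
expiring = 1_337

def pct(part, whole):
    """Express part as a percentage of whole."""
    return 100 * part / whole

share_protected = pct(protected, total_articles)      # protected, of all articles
share_semi = pct(semi_protected, protected)           # semi-protected, of protected
share_full = pct(fully_protected, total_articles)     # fully protected, of all articles
share_expiring = pct(expiring, protected)             # expiring, of protected

print(f"protected: {share_protected:.2f}% of all articles")
print(f"semi-protected: {share_semi:.0f}% of protected articles")
print(f"fully protected: {share_full:.2f}% of all articles")
print(f"set to expire: {share_expiring:.0f}% of protected articles")
```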
That's the status quo. Yet, flagging a vetted version of an article has been discussed since 2005. The current widely discussed idea is to conduct a two-month experiment in which biographies of living people (402,672 articles, about 13% of the English Wikipedia), or some subset, are "flag protected"; this means anyone can still edit but the public (not Wikipedians) see the last reviewed version. This doesn't necessarily replace the existing protection mechanisms, but could be a good alternative to semi-protection. The experiment will helpfully give guidance on who should be a "Reviewer," and answer the questions of whether this limits disruption, whether it furthers quality, and how long it takes to review and flag a newer version of an article. Another part of the experiment is "patrolled revisions," which would apply to a wider swath of articles and permit vandalism fighters to bookmark a known good version so they can easily evaluate subsequent contributions, but it won't affect who can edit or what the public sees.
The goal of this, and other features, is to maximize the benefits of open collaboration while minimizing the damage from disruptive edits. In my opinion, this has always been the case and Wikipedia continues to experiment with achieving the best balance.