Open Codex

2009 Jun 30 | Wikipedia Suppressing News

There's been a lot of coverage of the New York Times story "Keeping News of Kidnapping Off Wikipedia." It's prompted discussion about balancing issues of free speech, safety, and responsibility at the Times and Wikipedia. Within Wikipedia, the discussion has only just begun, but it has started off quite constructively, as seen in Wikipedian Apoc2400's proposed policy: in the short term, Wikipedia should refrain from spreading information if that information is not widely and reliably sourced, is of little public interest, and is "likely to have very severe direct negative consequences."

this entry posted to social/wikipedia;
comments (0)

2009 Jun 25 | Our Work After Us

At the beginning of this year, I was sad to learn of the passing of Peter Kollock. He was one of the first to carefully think about cooperation and online communities. I've been citing his 1996 paper "The Economies of Online Cooperation: Gifts and Public Goods in Cyberspace" for a long time now.

Unfortunately, while checking Web references, I discovered the above link to his paper no longer works (i.e., 404). This is the link that appears on his Wikipedia page and dozens of online bibliographies. It appears UCLA yanked his whole web space. The lack of institutional commitment to preserving work and providing stable URIs has always been a great irritation (e.g., see my entry on digital posterity about the links in my dissertation that were soon broken); at the W3C we would frequently talk about this frustration and how best to maintain our own commitment to preservation. And it's not only in death that our work soon disappears. After my time at the Berkman Center, subsequent to a Web site reorganization, I noted all the links to my work there were broken. They were able, and kind enough, to restore the HTML files, though my biographical page looks screwy because of broken CSS and relative links -- so I don't even link to that anymore.

In the case of this particular paper by Kollock, it was fortunately published in a book, and I found a PDF version as well -- though I preferred the HTML.

Kollock, P. (1999a). The economies of online cooperation: Gifts and public goods in cyberspace. In Smith, M. and Kollock, P., editors, Communities in Cyberspace. Routledge Press, London. URL http://dlc.dlib.indiana.edu/archive/00002998/

this entry posted to method;
comments (0)

2009 Jun 25 | Anderson and Citing Wikipedia

Chris Anderson's "apparent plagiarism" of Wikipedia has prompted me to post something I was experimenting with last week about citations and URLs. Anderson claims that his text, which is very much like that of some Wikipedia articles, previously quoted and cited Wikipedia as a reference. However, in discussions with his publisher, there was some uncertainty about how to treat URLs (since Web pages might change) and Wikipedia (since it is collaboratively authored). Hence, he attempted a "write-through" for the "case of source material without an individual author to credit (as in the case of Wikipedia)." This is obviously problematic, and Wikipedia, on every article, gives guidance on how it can be cited, including the use of a permanent link to a specific version.

However, I can sympathize with the ugliness of long URLs and "last accessed" requirements. Since I began work on my Wikipedia manuscript, an aspiration has been to create a work in which the vast majority of historical and ethnographic sources are readily accessible to the reader. This means I have a lot of references. So, as I give thought to the book in print and online form, I wonder how to strike the best balance. I've moved on from the dissertation's APA author-year towards Chicago Manual of Style notes format. Yet, I noticed that notes with URLs can get rather ugly, particularly if one has more than one citation in a note. (Otherwise it looks like a law review paper.) My notes-only implementation of Chicago, where the first reference is a full citation and subsequent references are short but include the oldid since I make use of different versions of the same article, is below. Imagine pages of this stuff; it's not easy to read:

  1. Wikipedia, "Wikipedia:Neutral Point of View," Wikimedia, September 16, 2004, http://en.wikipedia.org/w/index.php?title=Wikipedia:Neutral_point_of_view&oldid=6042007 (accessed March 5, 2004); Wikipedia, "Wikipedia:Neutral Point of View," Wikimedia, November 3, 2008, http://en.wikipedia.org/w/index.php?title=Wikipedia:Neutral_point_of_view&oldid=249390830 (accessed November 3, 2008).

    ...

  2. Wikipedia, "Wikipedia:Neutral Point of View (oldid=249390830)."
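Permanent links of the kind used in these notes follow a regular pattern, so they can be generated rather than typed by hand. The following is only a minimal Python sketch, assuming the `index.php?title=...&oldid=...` form shown in the notes above; the function names `permalink` and `short_note` are hypothetical helpers, not part of any Wikipedia tool.

```python
from urllib.parse import urlencode

# Base URL as it appears in the notes above.
BASE = "http://en.wikipedia.org/w/index.php"


def permalink(title, oldid):
    """Build a permanent link to one specific revision of a page."""
    # Page URLs use underscores in place of spaces; keep ":" readable.
    query = urlencode({"title": title.replace(" ", "_"), "oldid": oldid}, safe=":")
    return f"{BASE}?{query}"


def short_note(title, oldid):
    """Format a short subsequent note that still pins the revision (hypothetical helper)."""
    return f'Wikipedia, "{title} (oldid={oldid})."'


url = permalink("Wikipedia:Neutral point of view", 249390830)
note = short_note("Wikipedia:Neutral Point of View", 249390830)
print(url)
print(note)
```

Keeping the oldid in one place means the long first note and the short later notes cannot drift apart.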

In the context of the Chicago notes variants, I've made the following experiment in my manuscript:

  1. Long (end) notes upon first instance (including URL) followed by short notes (with version number noted in title of Wikipedia pages, such as in note 63 above) yields 396 pages.
  2. Exclusively short (end) notes followed by bibliography with full citation (including URL) yields 452 pages.

Option 2 is more readable, but requires another redirection by the reader if they want full bibliographic detail, and adds pages (and weight and cost) to a book. Another option is to use an adaptation of Option 1: standard long-then-short Chicago without URLs in the printed book, which are provided online. This makes a practical sort of sense (and this is what Anderson says he was planning to do), but is non-standard and I'm not sure how it would be received.

However, this difficulty doesn't mean that one should simply "write through" one's sources (whatever that means) and remove the attributions altogether.

this entry posted to method;
comments (1)

2009 Jun 11 | The Informed Analysis of New Media

I recently finished two works about the "free culture" movement, which are polar opposites -- and in a way that is unsettling. The most recent is Mark Helprin's Digital Barbarism: A Writer's Manifesto. I have long found it ironic that critics of "Web 2.0" -- to use a problematic term for this larger new media phenomenon -- end up adopting the evils they attribute to their subjects: visceral, from the hip, slapdash. Lawrence Lessig excoriates Helprin in a review so I need not waste any words here; even so, I continue to be surprised at what passes for informed criticism. On the other hand, David Bollier's Viral Spiral: How the Commoners Built a Digital Republic of Their Own is an excellent history of the Creative Commons and Free Culture movement.

However, am I only praising those works that are congruent with my sympathies? While Bollier is not presenting criticism (pro or con), it is a favorable portrayal. But I don't think I'm being unfair. I consider myself allergic to unalloyed "Net boosterism" and the "Boing Boing" crowd. In my work on Wikipedia, I admit that I am fond of it but I try to take a "Neutral Point of View" both as a scholar and as an intellectual hobby. By this I mean that beyond academic concerns, I personally enjoy learning about different perspectives and trying to understand how people come to differing opinions. (So I'm identifying as a "skeptic" more so than an academic.) In fact, I was delighted to read Mark Bauerlein's The Dumbest Generation: How the Digital Age Stupefies Young Americans and Jeopardizes Our Future (Or, Don't Trust Anyone Under 30). While it sounds like another rant, it is a well-founded critique of how digital media is damaging literacy and civic preparedness in youth. He argues that while screen-based technology might further spatial cognitive skills, knowledge is being replaced with a narcissistic preoccupation with social peers and popular culture. And he actually makes logical arguments based on citations to research. One doesn't have to agree with his argument, but it deserves one's full consideration.

This is why I was disappointed a few semesters ago when I recommended Bauerlein to an otherwise excellent student who was a Net enthusiast. She treated Bauerlein as if he were a Keen or Helprin, cursorily brushing him off as someone who didn't "get it." This was counter to the spirit I was trying to inculcate in that class and began my musing on whether we have a genuinely informed and vital discourse.

this entry posted to method;
comments (2)

2009 Jun 10 | Wiki-Conference New York, July 25-26

This year's picnic will be better than ever, as we'll have an unconference to get us started:

The 1st Wiki-Conference New York will be held over the weekend of July 25-26, 2009 (confirmed!) at New York University, and hosted by Free Culture @ NYU and Wikimedia New York City.

Sign up on the wiki, propose a lightning talk or breakout topic, or round up some Wikimedians for a panel discussion.

this entry posted to social/wikipedia;
comments (0)

2009 Jun 08 | Institutions vs. Norms

In Noam Cohen's recent New York Times article about "The Wars of Words on Wikipedia's Outskirts" (i.e., the recent ArbCom decision about Scientology edit wars) I note that organizations often develop towards bureaucratic forms (citing Max Weber) but even in their more free-form states communities still have structure, even if informal and implicit (citing Jo Freeman). I believe this means that while we might enjoy the informal and personal touch of working within a small community, if it is successful, that community will likely move towards more bureaucratic forms. This can also have some benefits if the informal/implicit structures were unsavory. (As Mitch Kapor wryly noted, "Inside every working anarchy, there's an Old Boy Network.") As I said to Noam, rather than lament the passing of the good old days, I think it better to ask how to address issues in the present (including the maintenance of earlier values). (And actually, while it has slipped a bit from its original mission/intention, I think the ArbCom is doing a good job.)

Richard James asks if this sentiment is contrary to my focus on informal social norms, particularly in my blog entry about "Morality and the Dilemma" (i.e., Olson, Ostrom, and Hardin). Also, am I not abusing notions of "technical solutions" with institutional governance? To be clear, Wikipedia production might be explained by any number of approaches including: technical features, institutional governance, and social norms. In trying to complete my dissertation, I had lengthy, and sometimes stressful, arguments about the extent to which one of these is more important than any other. Granted, all of these are important and to claim otherwise is silly. However, I found the initial focus upon technical features in accounts of FOSS/Wikipedia to be insufficient, and therefore offered a complementary social/cultural account of Wikipedia in response. But I'm not excused from trying to understand how these things interrelate and affect one another. My argument is that informal "good faith" social norms (supported by wiki features) are good at dealing with good faith participants, but more formal and autocratic forms of authority are often necessary to deal with those of bad faith or to make decisions as a last resort when no community consensus emerges -- hence the existence of Benevolent Dictators in open content communities. If such leadership or institutional governance persistently fails, the community might then fork.

this entry posted to social/wikipedia;
comments (0)

2009 May 21 | Extrapolating to 100,000 Featured Articles

I recently noted there were some new numbers on the 100,000 feature-quality articles page. In May 2008 (based on a January assessment, I believe) there were 2,421 featured articles. Today, based on a February 2009 assessment, there are 2,570. That's a 6% increase -- below the 24% growth rate to 2.7 million total articles. If we assume a similar rate of increase, it would take 62 years to reach the goal of 100,000 featured articles.

initial = 2570; target = 100000; growth = .06;
years = (log(target)-log(initial))/log(1+growth)

If we relax the goal to have 100,000 good or better articles, that will require 24 years at a 16% growth starting with 11,024 "good" articles. Of course, I don't know to what extent the rate of growth is increasing or decreasing.
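The extrapolation above can be rerun for any starting count and growth rate. This is a minimal Python sketch of the same compound-growth formula; the function name `years_to_target` is a hypothetical label, and the inputs are simply the figures quoted in this entry.

```python
from math import log


def years_to_target(initial, target, growth):
    """Years of compound growth at rate `growth` for `initial` to reach `target`."""
    return (log(target) - log(initial)) / log(1 + growth)


# Featured articles: 2,570 today, growing about 6% a year.
featured = years_to_target(2570, 100000, 0.06)

# "Good or better" articles: 11,024 today, growing about 16% a year.
good = years_to_target(11024, 100000, 0.16)

print(f"featured: {featured:.1f} years; good or better: {good:.1f} years")
```

With the inputs exactly as quoted, the formula gives a little under 63 years for featured articles and roughly 15 years for the "good or better" case, so the 24-year figure presumably rests on a starting count or rate not shown here.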

this entry posted to social/wikipedia;
comments (0)

2009 May 18 | Morality and the Dilemma

The challenge at the heart of collective action is how cooperative behavior emerges when there are apparent reasons for it not to. This is famously demonstrated by the Prisoner's Dilemma in which two co-suspects have compelling cause to defect -- turn informer -- against the other but the consequence of both following such a strategy is worse than had they cooperated and remained silent (Axelrod 1984). That is, if your partner remains silent, you will get six months in jail if you are also silent, but you go free by defecting and saddling your partner with a ten-year sentence. If your partner informs on you and you do the same, you each receive five years; if you instead stay silent, you're the sucker and get ten. Defecting is the dominant "equilibrium" state regardless of your partner's choice: going free is preferable to six months; five years is preferable to ten. So both players defect, get five-year sentences, and wish they had remained silent and gotten off with six months. The dilemma is that the individual's dominant strategy also creates a mutually suboptimal result; in this case, fear of the worst-case scenario inhibits beneficial collective action. Understanding the distance between the lack of cooperation implied by the dominant strategy and the mutual benefits of cooperation has been a central concern of social science since Garrett Hardin's (1968) article "The Tragedy of the Commons." In this scenario, the dominant strategy of a herder is to put as many animals as possible on common land, despite the fact that if everyone were to do the same it would soon be overgrazed. A few years before, in 1965, Mancur Olson (1971) published a book in which he characterized this type of problem as "The Logic of Collective Action."
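The sentences in the Prisoner's Dilemma described above can be tabulated to check that defecting really is the better reply to either choice by one's partner. This is a minimal Python sketch using the jail terms from the paragraph (six months, five years, ten years, or going free); the names `SENTENCES` and `best_response` are hypothetical.

```python
SILENT, DEFECT = "silent", "defect"

# Jail time in years, keyed by (my_move, partner_move) -> (my_sentence, partner_sentence).
SENTENCES = {
    (SILENT, SILENT): (0.5, 0.5),  # both stay silent: six months each
    (SILENT, DEFECT): (10, 0),     # I stay silent, partner informs: I am the sucker
    (DEFECT, SILENT): (0, 10),     # I inform, partner stays silent: I go free
    (DEFECT, DEFECT): (5, 5),      # both inform: five years each
}


def best_response(partner_move):
    """Return the move that minimizes my own sentence given the partner's move."""
    return min((SILENT, DEFECT), key=lambda me: SENTENCES[(me, partner_move)][0])


# Defecting is dominant if it is the best reply to every possible partner move.
replies = {partner: best_response(partner) for partner in (SILENT, DEFECT)}
defect_is_dominant = all(reply == DEFECT for reply in replies.values())

# Yet mutual defection leaves each player worse off than mutual silence.
worse_together = SENTENCES[(DEFECT, DEFECT)][0] > SENTENCES[(SILENT, SILENT)][0]

print(replies, defect_is_dominant, worse_together)
```

Both checks come out true at once, which is the dilemma: the individually best reply is the same in every case, and following it leaves both players worse off.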

Olson, considering production rather than consumption, asks who would contribute to a common public good when they might just as easily defect and "free ride"? Yet, again, should everyone follow this reasoning, no public goods will be produced. Olson provides an extensive taxonomy of group characteristics that affect this logic, including their size and interdependence, the market's demand elasticity, the balance of costs and benefits, and the ability for a group to exclude or penalize those who fail to contribute. (Ultimately, "trust" becomes a central element in such group dynamics and might arise in the context of time and reputation, institutional controls, or group norms.)

Around the same time, Robert Trivers (1971) characterized a related problem in animal behavior. In his article "The Evolution of Reciprocal Altruism," he defined an "altruistic situation" as one in which "one individual can dispense a benefit to a second greater than the cost of the act to himself" (Trivers 1971) and modeled the conditions under which altruistic behaviors were likely to emerge. (As with Olson, these relate to the character and extent of social interaction.) Of course, as noted by Frans de Waal (2008), "a return-benefits calculation typically remains beyond the animal's cognitive horizon" and altruism itself is likely the result of a more proximate evolved behavior: empathy. (This link between empathy and altruism is hypothesized, outside of the evolutionary context, by Daniel Batson (1991).)

Recently, these two threads of political economy and evolution have been combined in the work of Elinor Ostrom. In "Governing the Commons" she makes a slight digression away from a macro-political perspective to note that "communities of individuals have relied on institutions resembling neither the state nor the market to govern some resource systems with reasonable degrees of success over long periods of time" (Ostrom 1990). Having studied such institutions, she recommends that the dilemma of "common pool resources" might be addressed by eight institutional design principles: clearly defined boundaries, congruence between appropriation/provision rules and local conditions, collective-choice arrangements, monitoring, graduated sanctions, conflict-resolution mechanisms, state recognition of groups' right to self-organize, and the nesting of enterprises in large systems.

More recently, Ostrom makes greater use of the evolutionary approach to focus on the emergence of norms (Ostrom 2000). She takes issue with Olson's (1971) earlier claim that unless the group is small, or there is a way to force individuals to act in their common interest, "rational self-interested individuals will not act to achieve their common or group interests." She characterizes this as Olson's "zero contribution thesis" and notes that it contradicts everyday experience; the problem of free riding exists, but community governance regimes do emerge and persist (Ostrom 2000). While it might be "irrational" from the egoist perspective, a significant proportion of people will act cooperatively (i.e., 40-60% of people will initially contribute to the public good in a finite-round game). This cooperation is affected by factors such as expectations about others, and the framing and number of interactions between peers. And, in keeping with Olson, people will expend resources to punish those who make below average contributions. Hence Ostrom characterizes norms as those values (e.g., reciprocity, fairness, and trustworthiness) that affect the preference for cooperation. If there is a sufficient proportion of "norm using" players (i.e., conditional cooperators and willing punishers), this "creates an opening for collective action" (Ostrom 2000). This is especially so if there is good information about the trustworthiness of one's peers. If cooperation has been successfully established, new members will likely be appropriately acculturated. Hence, collective action and its supportive social norms can emerge in an evolutionary context: the gap of the cooperative dilemma can be bridged. Indeed, Ostrom recommends her eight institutional mechanisms (or "principles") to further such outcomes.

Recently, a number of scholars have applied this literature on collective action to Wikipedia. Johnson (2007) uses Ostrom to characterize vandalism and point-of-view (POV) pushing as collective action problems. Viegas, Wattenberg, and McKeon (2007) argue that Wikipedia's Featured Article process reflects Ostrom's first four principles of locality, collective choice (participation), monitoring (accountability) and conflict resolution. Andrea Forte and Amy Bruckman (2008) use all eight of Ostrom's design principles to evaluate Wikipedia governance and its Biographies of Living Persons policy; they argue that there is decentralized policy creation, interpretation (i.e., its Arbitration Committee) and enforcement (i.e., administrators) but conclude the biggest lack relative to Ostrom is the uneven enforcement of policy.

However, these works tend to remain at an institutional level, focusing on community mechanisms for content and membership policy. (Two exceptions are a quantitative analysis of patterns in Wikipedian references to policies and guidelines from discussion pages (Beschastnikh, Kriplean, and McDonald 2008) and a characterization of the type of "utterances" used on Discussion pages (Goldspink 2009).) If, following Ostrom, we can think of norms as those values (e.g., reciprocity, fairness, and trustworthiness) that affect the preference for cooperation, can we find and characterize such norms in Wikipedia culture? I believe we can, and this is the focus of my work on Wikipedia.

Might we even characterize prosocial norms as a form of morality, in the sense employed by Bowles and Gintis (1998)? Indeed, despite preceding theorists of collective action by almost two centuries, Kant's (2005) categorical imperative is a moral response to the collective action dilemma: "I ought never to act in such a way that I couldn't also will that the maxim on which I act should be a universal law." Coincidentally, the lesser-known subtitle to Hardin's famous "Tragedy of the Commons" article is "the population problem has no technical solution; it requires an extension in morality." Therefore, I do not think it is a stretch to conclude that Wikipedia collaboration is as much a "moral" problem as a technical one.

this entry posted to social/wikipedia;
comments (1)

2009 May 11 | Making Word Useful

Because I use speech recognition software (SR) I'm forced to tangle with proprietary software and formats; this provides a continuous reminder of the benefits and joys of Free Software. However, I have learned a few things about maintaining a Windows system for SR over the past five years.

In 2004 I began using continuous SR with ViaVoice on a headless Shuttle box accessed over VNC. (This was a big improvement over the discrete speech system I used 10 years before.) Despite the amelioration provided by imaging the OS partition (PING is great for this), Windows was still a dreadful thing to maintain; the advent of virtualization has been a blessing. And up until the beginning of this year, I relied upon Win2K so as to keep a lean and portable OS. However, security and software support for Win2K are ending and the excellent VirtualBox 2.* software permits one to emulate a consistent hardware profile (including the BIOS); this allows me to placate XP's annoying validation system.

I presently use NaturallySpeaking 10.1. While the underlying recognition is often remarkable, the user experience and Nuance's support are dreadful. To have useful macro support one must pay hundreds of dollars more for a "professional" version to a company that charges its users for tech support on its own breakage and then ignores the bugs they report. Fortunately, there is a friendly FOSS community and DragonFly is an amazing (Python-based) macro application that helps me get around the worst annoyances in NaturallySpeaking.

Then there is the matter of application support. While coders might be content with Emacs or UltraEdit, I dictate prose and want a visually meaningful processor: paragraph/heading styles, a spelling and grammar checker, word counter, etc. Lyx, Amaya, OpenOffice, and Abiword are not "Select-and-Say" capable applications (i.e., not useful with NaturallySpeaking). That leaves Microsoft Word and its loathsome ".doc" binary format. These binary files are impervious to the more useful features of versioning systems, or simple scripting. If I need to fix the capitalization of a term in my manuscript, I have to manually open each chapter and do "find/replace" rather than fix it with a simple one-line command (or with KFileReplace). While I had some hope the new ".docx" format would be useful (it is easy enough to unzip and parse), making sense of it is an outrageously difficult task (particularly lists). So, for years now I've been writing pseudo-LaTeX in doc files, converting them to text via antiword and processing them from there.
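Once chapters live in plain text, the capitalization fix mentioned above becomes a few lines of script. The following is only a minimal Python sketch of that kind of batch find/replace; the function name `fix_term`, the `*.txt` pattern, and the directory layout are all assumptions made for illustration.

```python
import re
from pathlib import Path


def fix_term(directory, wrong, right, pattern="*.txt"):
    """Replace whole-word occurrences of `wrong` with `right` in matching files.

    Returns a dict mapping each changed file name to its replacement count.
    """
    changed = {}
    regex = re.compile(r"\b" + re.escape(wrong) + r"\b")
    for path in sorted(Path(directory).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        # A function replacement avoids backslash surprises in `right`.
        new_text, count = regex.subn(lambda match: right, text)
        if count:
            path.write_text(new_text, encoding="utf-8")
            changed[path.name] = count
    return changed
```

For example, `fix_term("chapters", "wikipedia", "Wikipedia")` would rewrite every matching file in a hypothetical `chapters` directory and report how many replacements were made in each.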

However, I recently accumulated enough Microsoft Word hacks to turn it into a decent text editor.

  1. Set the default save format as plain text and its default font to something nice like Andale Mono.
  2. Bind {control-v} to this PasteUnformattedText() macro.
  3. Bind {control-s} to this FileSave() macro to get rid of the annoying "you will lose your formatting saving to text" dialog.
  4. Office XP doesn't use UTF-8 encoding by default and nags you with a dialog every time you open such a file. UTF-8 is the encoding used by every other sensible application of late. Make it the default with this registry edit, but realize it uses the byte-order-mark (BOM) which even otherwise sensible applications get confused by. When processing the text, you can remove it in Python with: line = line.lstrip(unicode(codecs.BOM_UTF8, "utf8")).
  5. You can even "syntax highlight" your text with VBA: this AutoOpen() macro shows how editing markdown and LaTeX visually looks much like what I was seeing before, but it is now an open format UTF-8 encoded text file!
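The BOM-stripping call in step 4 is written for Python 2, where `unicode()` exists. Under Python 3, a minimal sketch of the same idea might look like the following; the function names `read_text_without_bom` and `strip_bom` are hypothetical, while the `utf-8-sig` codec and `codecs.BOM_UTF8` are part of the standard library.

```python
import codecs


def read_text_without_bom(path):
    """Read a UTF-8 file, dropping a leading byte-order mark if one is present."""
    # The "utf-8-sig" codec decodes UTF-8 and silently skips a leading BOM.
    with open(path, encoding="utf-8-sig") as handle:
        return handle.read()


def strip_bom(line):
    """Python 3 counterpart of the lstrip() call in step 4, for text already decoded."""
    return line.lstrip(codecs.BOM_UTF8.decode("utf-8"))
```

Reading with `utf-8-sig` is usually the simpler choice, since it also works on files that have no BOM at all.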

this entry posted to technology;
comments (0)

2009 May 05 | A Google Group Gripe

In the past few months I have received invitations to join varied Google Groups. While they are no doubt easy to set up, the (ironic) thing these groups had in common was a focus on free culture (e.g., FOSS and Wikipedia). However, I have not been able to learn how to subscribe to these groups. Instead, I have to log in using a GMail identity. So not only do we have echoes of Microsoft's presumptive ubiquity (if you don't have their software, you are not welcome to participate), Google has access to both your browsing history and your private email?! I am not a Google-hater, but I am concerned about my privacy and proprietary lock-in. The majority of my web browsing is done in Konqueror, and I don't accept any cookies from Google. I also filter email to GMail via a procmail recipe for when I'm out and about, but this is occasional, rare, and on public machines that I don't spend significant time on. (If I need to use a Google service, I pull up Firefox with cookies enabled.) In a literal sense, those people that use Google for searches, email, and calendaring are like Alice in Wonderland, having eaten a magic cookie from Google that reveals all. (I suppose if I must, I will have to create another Gmail identity for subscribing to these lists exclusively; I can then fetch those messages via the POP service that Google kindly provides.)

On another (miscellaneous) privacy-related note, this was certainly an odd conflict: Mozilla Ponders Policy Change after Firefox Extension Battle

this entry posted to social;
comments (2)

Open Communities, Media, Source, and Standards

by Joseph Reagle


reagle.org
