2007 Dec 17 | The elites and bourgeoisie
I recently had the opportunity to catch up on some of my reading
including new quantitative analysis of Wikipedia contribution. In
particular, the question about the inequality of user contribution is a
long-standing one (Wales 2005wew2, Voss 2005mw, Swartz 2006www,
Ball 2007, Kittur et al. 2007, Viegas et al. 2007, and Priedhorsky et
al. 2007). Jimmy Wales originally noted in December of 2005 that "half
the edits by logged in users belong to just 2.5% of logged in users."
Research since 2005, particularly Kittur et al., measured contribution
differently and showed that elite contributions were less dominant
relative to the long tail of small contributors, or even that the trend
has changed over time. (As those authors put it in their title: "Power of the Few vs. Wisdom of the Crowd: Wikipedia and the Rise of the Bourgeoisie.") However, in "Quantitative Analysis of the Wikipedia Community of Users,"
Felipe Ortega and Jesus Gonzalez-Barahona
(2007) conclude that "approximately 90% of
the active editors is responsible altogether for less than 10% of the
total number of contributions (Gini coefficient of 0.9360)" (p. 82). So
the long tail isn't doing as much as we might think. The authors
explain this difference by way of a methodological concern: counting
user contribution via total contributions over the life of the user
misses those users who are new and active but have not yet accumulated a
significant total count. After segmenting users based on their
contributions in specific periods, Ortega and Gonzalez-Barahona find
that users with a high number of edits in early months typically
continue to make a high number of edits (i.e., contribution is stable), and that the
discrepancy between high-contributing and low-contributing editors is
significant (i.e., contribution is unequal).
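For readers unfamiliar with the Gini coefficient cited above, here is a minimal sketch of how such a figure can be computed from per-editor edit counts. The edit counts are hypothetical (they are not Ortega and Gonzalez-Barahona's data), the function names are my own, and I am assuming the standard sorted-rank formula rather than whatever procedure the authors actually used.

```python
def gini(counts):
    """Gini coefficient of non-negative counts.

    0 means every editor contributes equally; values near 1 mean a
    few editors account for almost all contributions.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    # Standard rank formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with x sorted ascending and ranks i starting at 1.
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def share_of_bottom(counts, percent):
    """Fraction of all contributions made by the least active `percent` of editors."""
    values = sorted(counts)
    k = len(values) * percent // 100
    return sum(values[:k]) / sum(values)


# Hypothetical community: 90 editors with 1 edit each, 10 with 100 edits each.
edits = [1] * 90 + [100] * 10
print("Gini coefficient:", round(gini(edits), 3))
print("Share of edits by bottom 90%:", round(share_of_bottom(edits, 90), 3))
```

Even in this toy community the bottom 90% of editors make less than 10% of the edits, which is the shape of the claim quoted above; the real distribution is considerably more skewed.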
I met Felipe Ortega at this year's WikiSym and recently asked him about the present state of research:
The current state of research about the inequality of contributions to
the English Wikipedia (also extended to the top ten language editions of
Wikipedia) shows that the distribution of contributions to articles
(including stubs and redirects, filtering bots) is strongly skewed
towards a small core of very active contributors. This is the same
well-known effect already identified in libre software development
projects. The graphs depicting the contributions from distinct
generations of very active users, along with the graphs showing the Gini
coefficients of contributions per month, rebate the argument of the
"rise of the bourgeoisie" stated by Kittur et al. The inequality level
of contributions to the English Wikipedia has remained stable during the
past 4 years. Similar inequality levels per month have been found for
the other top ten language editions, thus showing a common pattern
shared among the biggest Wikipedias. Moreover, we have found that the
inequality level in these top-ten language editions is stabilized around
a 80%-85% interval for the Gini coefficient, showing a spontaneous
autorregulation process that deserves further research.
this entry posted to
social/wikipedia;
comments (1)
2007 Dec 13 | The Iron Law of Oligarchy
In my dissertation I make only a passing reference to the "Iron Law of Oligarchy," an expression coined by Robert Michels, an early 20th-century sociologist and student of Max Weber, in his book Political Parties.
As with Weber, many of the cases and references are difficult to
follow because of the book's age and issues of translation, but there are
still some gems that are relevant today. In particular, I find his
"final considerations" to be worth sharing on the Wikipedia
question, touching on "adminitus," incumbency, and evolution:
Leadership is a necessary phenomenon in
every form of social life. Consequently it is not the task of science
to inquire whether this phenomenon is good or evil, or predominantly
one or the other. But there is great scientific value in a
demonstration that every system of leadership is incompatible with the
most essential postulates of democracy. We are now aware that the law
of the historic necessity of oligarchy is primarily based upon a series
of facts of experience.... The process which has begun in
consequence of the differentiation of functions and the party is
completed by a complex of qualities which the leaders acquire through
their detachment from the mass. At the outset, leaders arise
spontaneously; their functions are accessory and gratuitous. Soon,
however, they become professional leaders, and in the second stage of
development they are stable and irremovable.... (Michels 2001:240)
It
follows that the explanation of the oligarchical phenomenon which thus
results as partly psychological; oligarchy derives, that is to say,
from the psychical transformations which the leading personalities in
the parties undergo in the course of their lives.... The oligarchical
structure of the building suffocates the basic democratic principle.
That which is oppresses that which ought to be. (Michels 2001:240-241)
The
democratic currents of history resemble successive waves. They break
ever on the same shoal. They are ever renewed. This enduring spectacle
is simultaneously encouraging and depressing. When democracies have
gained a certain stage of development, they undergo a gradual
transformation, adopting the aristocratic spirit, and in many cases
also the aristocratic forms, against which at the outset they struggled
so fiercely. Now new accusers arise to denounce the traitors; after an
era of glorious combats and of inglorious power, they end up by fusing
the old dominant classes; whereupon once more they are in their turn
attacked by fresh opponents who appealed to the name of democracy. It
is probable that this cruel game will continue without end. (Michels
2001:245)
this entry posted to
social/wikipedia;
comments (0)
2007 Dec 12 | Conflict management and class exercises
In the spring I will again be teaching a class on conflict management. More than one colleague has expressed puzzlement as to why I would teach this class, but I really enjoy it. While I, and a few students, might enjoy discussions on the historical nuances of technology or reference works, conflict management is relevant to everyone -- and I do get to discuss Wikipedia NPOV and good faith! I have developed two exercises that might be of use to others: one on cognitive priming of cooperation/competition (i.e., the prisoner's dilemma) and one on integrative bargaining.
this entry posted to
career/teaching;
comments (0)
2007 Dec 05 | Blogging anxiety and post-naive abandonment
As we all know by now, there are manifest anxieties associated with
the practice of blogging. The most common is the stress of
feeling as if one hasn't "updated" the blog recently. Biella Coleman offers the non-intuitive theory that not updating
is a virtue. (She kindly refers to the moribund state of this blog as
an example.) Another perennial issue is those who throw off the burden
of blogging and declare that while it caught their interest for a time,
they are done with it, such as Peter Krapp's recent "Top Ten Reasons I Don't Blog Anymore".
Inspired by a common theological turn, I think of this as a
"post-naive" blog declaration. Marcus Borg, a liberal theologian, argues
that many people will go through three phases of religious belief:
naivete (a superstitious child), critical (a skeptical adult), and
post-critical naivete (an open heart). (Neil Gillman has noted a similar theory of transition in Abraham Heschel's "situational thinking," Gabriel Marcel's "secondary reflection," and Paul Ricoeur's
"second" or "willed naivete".) Therefore, I often expect that after the
initial flush of excitement with blogging, subsequent anxiety and
abandonment, there will come a time when the "post-critical" blogger
will post again without worry about site statistics, updates, and ego.
On another "blog" (though it had daily content, photos, audio programs,
and such before blogs, flickr, and podcasts), I've asked the robots to
please pass on by ("User-Agent: * Disallow: /") and presently post to it
about once a month. I'm quite happy with that.
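A small aside on the robots.txt directive, since the detail is easy to get wrong: a "Disallow:" line with an empty value permits crawling of everything, whereas "Disallow: /" asks robots to stay away from the whole site. The sketch below checks both forms with Python's standard urllib.robotparser; the URL, the "ExampleBot" agent name, and the helper function are hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="ExampleBot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Asks every robot to stay out of the entire site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
# An empty Disallow value excludes nothing, so everything may be crawled.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

URL = "http://example.org/2007/12/entry.html"  # hypothetical page

print("Disallow: /  ->", allowed(BLOCK_ALL, URL))
print("Disallow:    ->", allowed(ALLOW_ALL, URL))
```

Of course, compliance is voluntary: the file is a polite request, and only well-behaved robots honor it.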
this entry posted to
social;
comments (0)
2007 Dec 04 | Peer production: Wikipedia and the OED
I often compare the "peer production" of Wikipedia to that of the OED, which was also built with the contributions of hundreds of people from all walks of life. I just finished Simon Winchester's The Meaning of Everything: the Story of the Oxford English Dictionary -- an excellent complement to his earlier The Professor and the Madman -- and note that the following could perhaps be said of Wikipedians as well:
... but we do not really know why so many people gave so much of their time for so little apparent reward. And this is the abiding and most marvelous mystery of the enormously democratic process that was the Dictionary -- that hundreds upon hundreds of people, for motives known and unknown, for reasons both stated and left unsaid, helped to chronicle the immense complexities of the language that was their own, and that they dedicated in many cases -- such as the Thompson sisters did -- years upon years of labour to a project of which they all, believed by some set of unfathomable and optimistic notions, insisted on becoming a part. (Winchester 2003:215)
this entry posted to
social/wikipedia;
comments (0)
2007 Nov 29 | Durova: openness and enclaves
Previously, I've explained
what openness means relative to a community rather than the license
governing its content. In short, an open content community, as I specify
it, has five normative features: open products, transparency,
integrity, nondiscrimination, and noninterference. It also has an
important descriptive feature: some level of structure/closure (which I
believe is unavoidable, see "The Tyranny of Structurelessness")
and consequent discussion about how this can be reconciled with the
larger egalitarian ethos. In the dissertation I consider Wikipedia in light
of these criteria and explore three challenging cases: whether anyone can really
edit, office actions and oversight, and the female-only WikiChix list.
The latest Wikipedia controversy (Mestel 2007) would also be a good case given references to a secret cabal-like email list, as summarized in The Signpost:
A case involving the actions of Durova and Jehochman,
and in particular a controversial block by Durova of !! as a
sockpuppet, based on evidence she refused to reveal on-wiki, saying
that she was concerned that to do so could give puppetmasters too much
information on her investigative techniques. This evidence was provided
to some administrators, and was later leaked, with some users,
including the Arbitration Committee, arguing that the evidence was
insufficient for blocking !!. Durova has resigned her adminship, and,
in what appears to be an exceptionally quick resolution of the case,
remedies have been proposed, admonishing Durova to exercise greater
care when issuing blocks, admonishing all participants to act with
proper decorum, and noting that Durova must go through normal channels
to regain adminship.
The very fact that there may be ancillary structure and closure
(what Sunstein calls "enclaves") is alarming to some, particularly when
Wales (2007moh) wrote: "*I*
am involved in multiple ongoing private discussions with dozens of
people. The list in question is being badly misrepresented as some kind
of problem. It is a good list, and the purpose of the list is good, and
not everyone on the list is perfect (as is always true)." However, with
respect to the criteria, I believe it is appropriate -- even
unavoidable -- for enclaves to form. If they aren't handled well -- as
in this case -- they can come off as rather unseemly. And they can
never be relied upon as a source of authority and discussion in the
larger community: Arguments and their evidence must be ported into the
larger open context or the criteria of transparency and integrity are
at risk. Durova's mistake was not only in incorrectly banning someone,
but in referring to that list and her secret methods of "sleuthing"
sock-puppets as an authority within the larger open discourse. (Silence
also had an interesting role to play, as it often does: it appears that
while some people on that list did not object, this does not mean they
assented.)
However, Durova has apologized for the mistaken administrative
action and I presume she now appreciates the illegitimacy (and
sensitivity) associated with recourse to closed authority. What I also
find interesting in this discourse is the recognition of the dangers of
administration -- I saw it referred to as something like "adminitus"
but can't find it now. Furthermore, I am intrigued by the common
Wikipedia sentiment that administration is compromising, but that
encyclopedic work has restorative powers for one's own wiki-soul and
relation to the community.
this entry posted to
social/wikipedia;
comments (0)
2007 Nov 07 | The cognitive construal of a free encyclopedia
In a parenthetical in my dissertation I note that when I speak of the men in the history of reference works I am being unfortunately literal. Aside from Suzanne Briet, a French documentalist at the beginning of the 20th century, few women have fallen within the scope of my readings in the secondary literature. (And sadly, Briet doesn't yet have a Wikipedia article about her.) Consequently I've kept my eyes open for such materials when perusing bibliographies. (And as a researcher of Wikipedia I've been especially engaged by Milos's Women and Wikimedian projects interviews.) I recently started Gillian Thomas' (1992) A Position to Command Respect: Women and the Eleventh Britannica in which she writes of her father's reverence for his EB11:
As far as he was concerned, whatever he read aloud to us from those finely-printed onion-skinned pages was incontrovertible truth. My mother was more skeptical, and would point out that she would put more trust in an "up-to-date" reference book My father bought his copy of the Britannica as a young man intent on self-improvement, like thousands of others, through its widely-advertised installment system. (Thomas 1992:v)
This then made me think about the authority of Wikipedia. Thomas writes: "The outlay of money had been a considerable sacrifice at the time, which was perhaps another reason why he retained a reverent attitude towards its authority" (p. vi). Paying for something, or undertaking a difficult task, is known to positively influence the construal of that something (Tavris and Aronson 2007:17); therefore perhaps the freeness of free content shades how it is perceived. I can imagine interesting cognitive experiments along these lines. For example, reward subjects for accurately answering a factual question while providing them with access to a free or nonfree version of the same reference material -- the cost would essentially be charged against the eventual reward. I expect those who paid for access would have greater confidence in their answer and a greater sense of authority in the resource than those who did not. And this is but a single experiment among a whole number of interesting construal tasks where the material essentially stays the same but secondary characteristics are varied: how does the perception of confidence in one's own answer or authoritativeness of the reference change relative to cost, media presentation (e.g., print or online), timeliness, the number of contributors, etc.?
this entry posted to
social/wikipedia;
comments (0)
2007 Oct 18 | WikiSym Presentation
My presentation for WikiSym 2007 is now up, including a link to the abstract and paper. I fear it's a little wordy, but I find that the academic habit of citation is hard to suppress even in slides, and my love of ethnographic excerpts is similarly irrepressible.
this entry posted to
social/wikipedia;
comments (0)
2007 Sep 27 | All is reviewed, but in what order?
The recent
discussions about the deployment of features for supporting
vetted/approved pages and trusted authors, starting with German Wikipedia and
perhaps expanding to the English one, are interesting in light of one
of the few analyses of whether it is best to review a contribution upfront
or after the fact. (This is like the arguments in programming about whether
it is better to write code in which you "look before you leap" or just leap
with the knowledge that "it's easier to ask forgiveness than permission"). If
you haven't read it, and you are short of time, just have a look at the
discussion and conclusion which includes:
These results suggest that Wiki-Like systems create value faster than
Pre-Review ones. Most people contributed under the Pre-Review system
overall, but since people prefer editing to checking, a backlog of wasted
work builds up. The model also suggests that Pre-Review group will do about
the same as the Wiki-Like group in the long-term However, contributions
taper off faster than predicted Pre-Review systems may increase people's
willingness to contribute... or deter people from damaging the system...
compared to Wiki-Like. (Cosley et al. 2006:9)
Also, one provocative observation is that one could start with Wiki-Like then
switch to Pre-Review as contributions taper off; "However, imagine the outcry
from its members if Wikipedians were to switch to a Pre-Review system."
(Cosley et al. 2006:9)
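For those who haven't met the programming analogy mentioned above, here is a minimal sketch of the two styles in Python. The dictionary and function names are hypothetical and have nothing to do with Cosley et al.'s system; "look before you leap" plays the role of pre-review, and "easier to ask forgiveness than permission" the role of the wiki-like approach.

```python
def lookup_lbyl(index, term):
    """Look before you leap: check first, then act."""
    if term in index:
        return index[term]
    return None


def lookup_eafp(index, term):
    """Easier to ask forgiveness than permission: act, then handle failure."""
    try:
        return index[term]
    except KeyError:
        return None


# Hypothetical reference-work index used only for illustration.
index = {"wiki": "a website that anyone can edit"}

print(lookup_lbyl(index, "wiki"))
print(lookup_eafp(index, "encyclopedia"))
```

Both functions return the same answers; the difference lies in whether the cost of checking is paid on every contribution or only when something goes wrong.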
this entry posted to
social/wikipedia;
comments (0)
2007 Sep 24 | The travail of grading
I find "giving" grades as a teacher to be as troublesome as getting them
when I was a student. Alfie Kohn (1999) in his book Punished by Rewards: The
Trouble with Gold Stars, Incentive Plans, A's, Praise, and Other
Bribes argues, based on solid research, that rewards, such as
grades, often undermine intrinsic motivation (p. 148), which is key to
substantive long-term learning. This counterproductive practice persists
because our educational system attempts to do two things that are often at
odds with one another: facilitating learning and sorting students (p. 202).
I've seen this in my own classroom. Some of the brightest students, and no
doubt the most consistent "performers," have expressed a strong distaste for
open-ended assignments. Asking them to propose a topic that interests them is
far too frightening relative to the more remedial types of tasks they have
clearly mastered. As Kohn notes, "when we are working for reward, we do
exactly what is necessary and no more" (p. 63); this isn't necessarily
because of laziness, it also avoids the risk of hurting one's GPA. On the
flipside I've seen students with a lot of potential but also significant
challenges (perhaps English isn't their first language, their previous
education wasn't as rigorous, illness, financial constraints, etc.) become
demoralized with a poor grade. Few things are as frustrating as seeing
motivated students and a positive classroom culture taking hits because of
grades. Nor do I want to be in that position of judging students'
circumstances: perhaps Solomon could fairly judge between genuine illness,
family emergency, forced overtime, or a hangover -- but I can't.
I'm not completely comfortable with my present approach; perhaps one day I
will become an "easy" grader and submit all "A"s except for the most
obviously negligent. But this is what I work with now: explicit criteria and
early feedback.
According to the standards
of my department, which are quite useful and comprehensive, an "A" is a
reflection of "An Outstanding Student" whose "Writing demonstrates
impressive understanding of readings, discussions, themes and ideas. Written
work is fluid, clear, analytical, well-organized and grammatically polished.
Reasoning and logic are well-grounded and examples precise." My present
understanding of an "A" is also informed by my experience as a Ph.D. student.
I expect I've been a bit of a "grade grubber" myself, though fortunately
willing to take risks to pursue my interests. One of my greatest
disappointments in my nearly 10 years of classes, but not my lowest grade,
was an "A-" in a historical methods course. I loved the course, adored the
professor, and invested a lot of myself in the research and final paper. But
at the outset the professor said he only gave an "A" to those papers he could
see being accepted for publication and he was true to his word. After a few
days I could admit to myself that my paper was not yet at that level, my
research and thinking weren't developed enough yet, and I learned I was not
alone -- in fact I was in the vast majority. (It's a sad truth of how we can
feel better or worse about ourselves through comparison with others! A
colleague of mine once cynically captured this with a sentiment that, "every
time a friend of mine succeeds I die a little inside.")
In any case, I use a similar threshold in undergraduate classes. I don't
"give" grades, I evaluate performance according to the departmental criteria.
I don't grade on a curve, but I do make sure my expectations are reasonable
by first reviewing the range of performance. An "A" is truly outstanding,
something I could use as an exemplar in future courses or even recommend to
someone interested in the topic. An "A-" falls a little short and could be an "A"
with a few small tweaks. A "B" is a reflection of good work, a "C" of "fair"
work. I do want to be humane (some professors have cut me slack in the past)
but also fair. It is not at all uncommon, at the end of the semester when
I'm porting grades from my spreadsheet to the bubbles of the Scantron, for me to want
to bump a grade, but I fear this may be favoritism, so I don't.
Grading sucks, but it's a requirement of the job, and I am not sure of
what the alternatives would be.
this entry posted to
career/teaching;
comments (5)
2007 Aug 28 | Keen's misunderstanding?
I recently read Andrew Keen's The
Cult of the Amateur: How Today's Internet Is Killing Our Culture. Keen
provides an easy-to-read litany of problems associated with "Web 2.0"
user-generated content. While I do have some thoughts in response to his larger
thesis, one thing I found confusing is his understanding of a particular
decision made on Wikipedia. Like many critics, Keen is concerned by the fact
that experts aren't appreciated or respected above amateur contributors. With
respect to a well-known "climate change" conflict, he writes how Dr. William
Connolley, an expert on climate modeling, had to go "head-to-head" with an
aggressive Wikipedia editor, and was eventually put on "editorial parole" by
the Arbitration Committee:
Connolly, who was pushing no POV [point of view] other than that of
factual accuracy was put on editorial parole by Wikipedia, he was limited
to making one entry a day. When he challenged the case, Wikipedia
arbitration committee gave no weight to his expertise, treating Connolly,
an international expert on global warming, with the same deference and
level of credibility as his anonymous vote -- who, for all anyone knew,
could have been a penguin in the pay of ExxonMobil.
I believe Keen is referring to this
case, which I consider to be a lovely example of judicious decision
making on Wikipedia. While I'm very sympathetic to the frustration and amount
of work entailed in repulsing crackpots and moonbats from articles such as
evolution and global climate change, the decision was not as Keen describes
it. Consider the following excerpts from the arbitration case:
- Principles
- Neutral point of view
- Consensus
3) As put forward in Wikipedia:Dispute resolution,
Wikipedia works by building consensus. This is done through the
use of polite discussion, in an attempt to develop a consensus
regarding proper application of Wikipedia:Policies and
guidelines such as Wikipedia:Neutral point of view. Surveys and
the Request for comment process are designed to assist
consensus-building when normal talk page communication has not
worked.
- Relative value of references
8) Since the goal of Wikipedia is to provide accurate
content, we cannot regard all references as equally valid and
give them all equal weight. Editors should exercise care in the
selection and use of references. The closer a reference is to
current peer reviewed work, the better. Balance must also be
attained by properly labeling and attributing significant
dissenting views (where they exist).
- Passed 7-1 with 1 abstain
- Findings of Fact
- William M. Connolley as expert
- Cortonin's view of real greenhouses
5) Cortonin has persistently and aggressively advanced
views which confuse metaphorical explanations of the greenhouse
effect and greenhouses with the technical scientific phenomena
underlying them. Despite determined efforts by other editors to
inform him and point him to information on the subject he seems
to have difficulty understanding both the use of metaphor and
the scientific literature in the field, see Talk:Greenhouse
effect. This is a persistent condition which seems likely to
continue.
- Remedies
That is, despite Keen's summary, Connolley was paroled for reverting
others too frequently without explanation. His expertise was noted, as was
the importance of providing authoritative references -- which are the ultimate
authority for claims one can make on Wikipedia. As a remedy, he was put on a
six-month parole limiting his ability to make reversions; there was no limit
to "making one entry a day" from this case. Yet of his "opponents," Cortonin
received a six-month ban from editing global climate articles, and JonGwynne
was banned from Wikipedia for three months and banned from editing climate
related articles for six months.
Because Keen opts not to reference online sources, it is possible he is
referring to something else. But if he is referring to this case, which I
think he is, his summary is very misleading.
this entry posted to
social/wikipedia;
comments (0)
2007 Aug 24 | Ward Cunningham's Keynote at Wikimania 2005
At this year's Wikimania conference I was talking to Cormac about my
dissertation and how, at the beginning of the 20th century, some like Wells and
Otlet were inspired by the potential of technology (i.e., microfilm and
index cards) and its possible contribution towards a universal
encyclopedia. Cormac noted that Cunningham had actually mentioned index cards in
his Wikimania 2005 presentation, which I was not able to attend. Fortunately,
I was able to find a copy of the video and my rough notes on it are available
(Cunningham
2005).
this entry posted to
social/wikipedia;
comments (0)
2007 Aug 15 | Lessig and emergent corruption
It is with interest that I've noted Lawrence Lessig's announcement
of a career shift away from his work on restoring sanity to the copyright
regime. I too once grew frustrated and fatigued with the policy process
around copyrights, patents, and privacy because of that which Lessig now
calls "corruption." (Fortunately, Lessig had the fortitude to launch Creative
Commons and help give voice to the free culture movement despite his much
invoked pessimism.) By "corruption" he doesn't mean bald bribery but systemic
and often hidden "queering" of the political process away from genuine
discussion and reason, and towards monied interests. (See my little "done
with privacy" essay I wrote when I had gotten fed up in the privacy
space.)
I agree -- and hope -- that if there were a way to tackle this corruption
issue, other concerns might be more amenable to improvement. And I think
Lessig is the man for the job. In addition to his passion, commitment, and
brilliance -- and despite his claim that he's changing his focus -- I think
his expertise and history with free-speech and constitutional law are highly
relevant to this new endeavor. I say this because I've always felt that this
issue of "corruption," or perhaps "monied influence" is a very American sort
of paradox: the democratic ideal of free speech, we each say our piece,
entangled with a market and sociological reality, we are influenced most by
those with the most resources.
For example, I have been a grudging advocate of campaign finance reform. I
am an advocate because of the problems I see in our present system, but
hesitant because I can't easily square any corrective intervention with my
free-speech values. At times I think that the solution is to attack the
corporate angle: companies should not be considered persons, nor should they
have civic rights (i.e., to support a candidate) as a person does. But I'm
not sure this would solve the problem -- some wealthy might simply (continue)
to act as agents for their corporate symbionts as it is in their own personal
short-term interest.
So, I think -- and hope -- Lessig can help. Yet I'm surprised to see in
his musings that he's worrying about questions
of intention: is Hillary Clinton consciously taking money for votes, or
perhaps being subconsciously influenced? (I would suspect there is some social
and political
psychology research on opinion formation and social influence that might be
relevant.) But sadly, while there is plenty of conscious and unconscious
influence peddling going on, the problem is actually much harder.
Consider a scenario in which every politician is scrupulously honest and
true to her beliefs -- a best case relative to our present. Suppose also that support,
particularly money, can bestow much power upon any politician to influence
the public -- a case much like our own today. In such a scenario resource-rich
minorities can exert disproportionate influence counter to the interests
of the majority of the present, the majority of the future (i.e., the long term),
and reason itself. While this might be "corrupt," it requires no conscious or
even subconscious persuasion from money. Rather, it is an emergent result of
the social, psychological, and system dynamics of influence: supporters simply
select the politician, and the politician, through
media purchases, influences the public. And it just so happens that this is
sometimes counter to our own public interests, and not easily fixed.
this entry posted to
social;
comments (0)
2007 Aug 13 | Source-as-primary-character
I recently finished Peter Heather's (2006) The
Fall of the Roman Empire. This popular, though no less rigorous, history
is widely praised. The narrative is engaging and I appreciate the glossary,
dramatis personae, and timeline; these help given that the book spans
150 years, dozens of emperors (East and West), generals, and barbarian kings.
What most impressed me was Heather's treatment of sources. Many histories,
particularly of ancient societies, are written in the third person objective.
Yet, as I learned in my historical methods course, the practice of history is
more than a recounting of events, but a substantiated argument about people
and events in time. Heather presents his arguments as such: identifying when
he agrees or disagrees with others or scholarly consensus, and addressing the
circumstances of his sources. Rather than being simply a footnote, sources
come to the foreground and become part of the story. The histories of sources,
such as Pullodius' commentary on Ambrose written in the margins of De Fide,
or the listing of fourth-century military and civilian offices, the Notitia Dignitatum,
are interesting in themselves and contribute to a much deeper understanding
of the ground on which Heather's arguments rest. While a popular history
might present a more accessible or exciting version of an old tale, it is
rare for it to communicate the challenges and excitement within the
discipline -- because popular history often obscures its scholarship. But
Heather brings it forward and what I thought might be a rather staid field --
don't we already know all we can about the ancients? -- is shown as alive
with new archaeological finds, textual fragments, analysis, and argument.
I know this will influence the next revision of one of my historical
chapters with respect to how I speak about some of the primary sources I
found.
this entry posted to
method;
comments (0)
2007 Jul 27 | The Wikipedia Plays
One of my favorite categories in my "field notes" mindmap is that of
"zeitgeist." Very little of this material will actually find its way into the
dissertation -- indeed, in that mindmap few of the ~800 sources I have across
~50 categories will -- but I collect it nonetheless. I use "zeitgeist" for --
often amusing -- examples of Wikipedia leaking out into the broader culture.
A friend of mine just forwarded this announcement of "a mini-marathon of
short plays that surf the wikipedia wave through seventeen related entries"
(Gans 2007).
That tickles me for some reason.
In a few days I also leave for Taiwan to attend Wikimania 2007.
This should be a lot of fun, and I look forward to reconnecting with and
meeting anew fellow enthusiasts.
this entry posted to
social/wikipedia;
2007 Jun 29 | Punditry and The Web 2.0 debate
I've been following the discussion at the Web 2.0
Forum with interest. In summary, Michael Gorman of Encyclopaedia
Britannica complains that Web 2.0, and its supposed champion Wikipedia, is a
"digital tsunami" (Gorman
2007jer) threatening education, scholarship, and the underlying values of
Western civilization (Gorman
2007ssi1). Yet, while I follow the discussion with interest, I actually
don't find it substantively engaging. Many of the arguments, particularly
Gorman's, tend to be characterized by unsubstantiated claims and the
purposeful construal of nuanced issues as extremes -- propping up strawmen
for subsequent potshots. As I've already indicated,
while it might bring pundits a sense of righteousness and attention, in the
end "Time, not arguments, will ultimately tell." (And, for this reason, I
appreciate Larry Sanger's continuing efforts to implement his vision.)
Why, then, do I find this discussion of interest? Punditry, communicative
disorders, and history. First, I'm trying to come to an understanding of
"punditry," and I think Gorman's recent blogging is an exemplar. My sense is
that sometimes people argue for argument's sake. That is, even if they
genuinely believe the thing they are arguing for, attention, not persuasion,
is the goal. (In a sense, perhaps it is a high-brow, and perhaps more
genuinely held, form of trolling -- another interesting phenomenon.) Second,
I'm interested in communicative disorders. For example, Gorman faults Wales
for allegedly saying "If you can't google it, it doesn't exist." (This
quotation was originally unsourced, challenged by Wales, sourced by Gorman,
and the disagreement continued.) But earlier in the same essay, Gorman
himself complains "More solid and reputable websites are buried by the
current algorithms of the Internet because they are often fee-based and
cannot garner as many links as free sites (links are key to boosting one's
search engine rank)" (Gorman
2007ssi1). If Wales made such a claim, I would expect it would
likely have been a descriptive statement, rather than normative. That is,
this is largely the way it is, versus the way it should be. And, this is
essentially the same thing Gorman notes, and laments, above: if one's content
is not freely accessible it is "buried." And I can't imagine anyone claiming
that all nondigital information should be dismissed out of hand, and perish
from the earth. The real issue is the normative response one should make in
light of the "Google description": make information freely accessible, or
enable Google to index proprietary sources, and even nondigital media (e.g.,
old books). This, to me, is an interesting question, something which is
happening today, and something I would like to learn more about. For example,
what kind of arrangements does Google make with fee-based sites to index their
content? (JSTOR articles are often prominent results in my Google queries.)
What is the user response to a search result which is not immediately
available to them? By purposeful misunderstanding, punditry confuses genuine
grounds for agreement and disagreement, and possible understanding.
My final reason for interest is the ways in which this
debate parallels similar discussions throughout history. That's right, as I
argue elsewhere on this site, and in my forthcoming dissertation, reference
works frequently act as a flashpoint and focus for larger social anxieties
about change. I suspect my argument here is a consequence of a historical
sensibility: what most people see as an extraordinary shift appears, with
perspective, to be one thread in a larger tapestry. For example, Gorman's
concern with plagiarism is ahistorical. Again, elsewhere I argue that the
history of reference works is a history of plagiarism. When the Britannica
was in its infancy, much like the Wikipedia today, its founding editor
admitted he "made a Dictionary of Arts and Sciences with a pair of scissors,
clipping out from various books a quantum sufficit of matter for the
printer." (Yeo
2001:180) I'm not condoning plagiarism, but I find if we want to make
normative statements about how things should be, it is best to genuinely
understand the way things are, and have been in the past. This seems to be a
difficult task in the midst of punditry, without some level of scholarly
remove or Neutral Point of View -- as Wikipedians say.
this entry posted to
social/wikipedia;
2007 Jun 20 | Creating a Semester's Class Schedule
I just discovered Python's awesome dateutil package which
implements much of the iCalendar standard, including recurrences!
Consequently, it's trivial to generate a calendar for the days classes
meet. I assume with a little work one could even handle the holidays.
In any case, here's an example:
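#!/usr/bin/env python3
# List every class meeting (Mondays and Wednesdays) in the semester.
from dateutil.parser import parse
from dateutil.rrule import rrule, WEEKLY, MO, WE, SU

sem_start = '20070903T140000'  # first day of the semester, 2pm
sem_end = '20071212T140000'    # last day of the semester, 2pm
days = (MO, WE)
meetings = list(rrule(WEEKLY, wkst=SU, byweekday=days,
                      dtstart=parse(sem_start), until=parse(sem_end)))
for meeting in meetings:
    print(meeting.strftime("%b %d %a"))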
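The holiday handling mentioned above might be sketched with dateutil's rruleset, which can exclude specific datetimes from a recurrence via exdate(). This is only a minimal sketch under assumptions: the class_meetings helper and the two-item holiday list are made up for illustration, and it assumes Python 3 with python-dateutil installed.

```python
from datetime import datetime
from dateutil.rrule import rrule, rruleset, WEEKLY, MO, WE, SU

def class_meetings(start, end, days, holidays=()):
    """Return meeting datetimes between start and end, skipping holidays."""
    schedule = rruleset()
    schedule.rrule(rrule(WEEKLY, wkst=SU, byweekday=days,
                         dtstart=start, until=end))
    for holiday in holidays:
        # exdate() matches on the full datetime, so reuse the meeting time
        schedule.exdate(datetime.combine(holiday.date(), start.time()))
    return list(schedule)

start = datetime(2007, 9, 3, 14, 0)
end = datetime(2007, 12, 12, 14, 0)
# Hypothetical holiday list: Labor Day and the day before Thanksgiving
holidays = [datetime(2007, 9, 3), datetime(2007, 11, 21)]
meetings = class_meetings(start, end, (MO, WE), holidays)
for meeting in meetings:
    print(meeting.strftime("%b %d %a"))
```

Because exdate() only removes exact matches, a holiday that does not fall on a meeting day is simply ignored.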
this entry posted to
technology/python;
2007 Jun 19 | Steinhardt LaTeX template
Given a wretched experience with using Microsoft Word for my Master's
thesis 10 years ago, I decided to try LaTeX this time around. One
might think that Microsoft Word has improved, but my tinkering has shown it
is still quite dangerous. Word's notions of styles are extremely frustrating,
and have changed over time. Additionally, creating multi-document files, or
very large files risks corruption. Furthermore, given I work in an
interdisciplinary space, it is useful to be able to format a document,
including footnotes and bibliography, as, say, either historical or
sociological: LaTeX is quite good at this.
That said, LaTeX is a pain. Granted,
I prefer a simple structured text markup language over a corruptible
proprietary binary blob, but LaTeX is like the Perl of markup languages, and
I am a Python
guy. (To be fair, TeX and LaTeX are now decades old.) No doubt,
regardless of what you want to do, there is a way to do it in LaTeX. The
problem is, like Perl, there are too many ways to do it. There are dozens of
packages that appear to do the same thing, though many are different enough
to make you wonder why the difference is important. It is difficult to
discern the present best practices and most of the documentation is in
annoying PDF. Even understanding LaTeX syntax is a confounding task. Does
'[]' mean an optional parameter to a command? Mostly yes, but sometimes no.
The only way I could get a handle on the world of LaTeX was to purchase TeX
for the Impatient and The
LaTeX Companion.
In any case, when I do have a problem the LaTeX community on comp.text.tex
is extremely helpful. So even though there is a steep learning curve, when I
ascend a particular hill, that challenge stays behind me. There is no
equivalent to Microsoft Word kicking me down the mountain.
Like all colleges, Steinhardt has a particular format they require for
doctoral dissertations. Unfortunately, its specification is sometimes
ambiguous, and more a creature of typewriting, than computer typesetting.
(For example, section headings are supposed to be underlined!) In any case, I
thought I would share the fruits of my frustrations: steinhard-pkg-opts.tex.
I haven't yet received approval that this is sufficient, nor am I an expert
in LaTeX, but, should someone else at Steinhardt need such a thing, this
might be a start.
this entry posted to
career/phd/dissertation;
2007 May 02 | Help with teleonomy
While grading papers I noticed that a student referred to the Wikipedia definition of teleonomy, a concept I have discussed in class, with a quotation that I found contrary to my own understanding: "teleonomic process, such as evolution..." I thought, "Whoa!" Evolution is not teleonomic, and the version of the Wikipedia article I had referred the students to did not say as much. It turns out the student used the current version of the article, which has been greatly expanded (a good thing) but in a way contrary to my understanding. (The lesson for the students, as I told them, is that they should treat Wikipedia articles as authoritative only when they use the specific versions I have vetted. Otherwise, an article should be used as a non-authoritative (starting) source, backed up by an authoritative source.)
In any case, I ended up falling into a rabbit hole of posting a massive comment about my concerns with the article, but now need to get back to grading papers! Are there any "philosophers of science" out there who are not grading papers :-) and would be willing to contribute to the article or discussion?
this entry posted to
social/wikipedia;
2007 Apr 25 | The untested prejudice
Sanger's latest article, Who Says
We Know: on the New Politics of Knowledge is an argument that
meritocracy, including an authority accorded to credentialed experts, is
preferable to "epistemic egalitarianism." He writes that Wikipedia defenders
argue for epistemic egalitarianism, or "dabblerism," on the basis of
pragmatics (i.e., experts don't have the time or interest, or the crowd can
be wise) and fairness. However, Sanger objects to these claims as either
inaccurate (i.e., experts can be included, or the crowd is often dumb) or
inferior to a genuine meritocracy. He notes that Wikipedia, too, in a conflicted
and twisted way, also "likes" meritocracy. But its meritocracy is not with
respect to what you know, but how much time you spend on the project. Or,
academically credentialed expertise, as a form of merit, is accepted, but
only with respect to citation. This leads him to ask, "If Wikipedians
actually believe that the credibility of articles is improved by citing
things written by experts, will it not improve them even more if people like
the experts cited are given a modest role in the project?" Sanger concludes
that Wikipedia has an "untested" prejudice against experts.
The essay has a number of interesting points. For example, Sanger claims
Wikipedia's parity with Encyclopedia Britannica in the famous Nature study
was not because of many Wikipedians' backgrounds in technology and science,
but because the science domain is epistemologically more "objective" and
consequently an easier topic on which to collaborate. This makes me think of
The Story of Webster's Third: Philip Gove's Controversial Dictionary and
Its Critics, in which the Third's editor, Philip Gove, campaigned for a
new standard of objectivity, much to the chagrin of the editor preparing the
wine guidelines (Morton 1994:92). I am no wine expert, but I wonder how
Wikipedia's article fares?
In any case, it is the claim that Wikipedia has an untested prejudice that
I find most perplexing. If Wikipedia has a prejudice, it is towards
"dabblerism" because of the often cited failure of the Nupedia "test" --
which perhaps unfairly taints "expertism" -- and the success of Wikipedia.
Sanger will keep making his expertise argument, and Wikipedians will keep
making their crowd arguments -- and despite all the words being spent,
neither party precludes the participation of crowds, in Sanger's case, nor of
experts, in Wikipedia's case. Ultimately, this argument will be won by the
test of "running code" -- in this case a widely used reference work. Will
Sanger's new project provide enough added value to sustain itself? Or, will
Wikipedia develop means of rating the quality of articles or contributors?
Time, not arguments, will ultimately tell.
this entry posted to
social/wikipedia;
2007 Apr 20 | Recent special issues on commons and wikis
Recently two short essays, which started out as blog entries here, have
been published in collections that might be of interest to readers. In
Re-public's Wiki politics -
part I, I'm hoping Yanis
Varoufakis can tell us more about the actual practice of democracy in ancient
Greece compared to present day Wikipedia governance. In addition to my NYU
colleague Alex Galloway, I particularly recommend Felix Stalder's work from
In
the Shade of the Commons - Towards a Culture of Open Networks.
this entry posted to
career;
2007 Apr 12 | NYU and Sallie Mae
Longtime readers will know I've been frequently frustrated and puzzled by
NYU's infrastructural choices. When I first arrived I learned they used my
SSN as my student identification. I protested,
wrote a formal letter, and was eventually issued a new SSN-like number. But,
of course, this didn't easily propagate through the institution so there were
often mismatches, duplication, and confusion when I used the computer labs,
athletic, library, health, and insurance services. Eventually, this policy blew up in
their face when students' SSNs were accidentally released and all
students were issued a new, proper, identification. But this meant that for a
period, I had to remember three different numbers!
Other poor choices of NYU: an inefficient library site, an inaccessible
and proprietary classroom management system (i.e., Blackboard), a broken
student registration site (e.g., "don't use the back button", and "it
works in Internet Explorer"), blocking the SSH port so I can't securely use
their network, requiring a proprietary client for their wireless network (for
which there's no "support" for Linux and Palm), etc.
The latest head-scratcher is an announcement that we must use their new
electronic billing system. Clicking on a URL in the e-mail announcement, I
was redirected to what appeared to be a Sallie Mae site with an NYU logo. I
immediately thought this felt an awful lot like a phishing attack. (I
frequently receive advertisements from spammers, who know I am a student, to
my unpublished NYU e-mail address, so somehow even this leaked out!) And
even if this isn't a phishing attack, Sallie Mae is the last company I want
to have any dealings with: their slimy marketing and brutal collection
practices are widely criticized;
NYU's announcement came just before Sallie Mae's sweetheart payoffs
to university officials became headline news!
After I e-mailed NYU about the security and privacy practices of the
service, I learned I can "petition" to opt out
after I fill out more paperwork. NYU... man...!
this entry posted to
career;
2007 Mar 29 | Brandt and Wikipedia plagiarism
Since I am a student of plagiarism in reference work production, I read with interest Daniel Brandt's (2006) analysis of Plagiarism by Wikipedia editors.
He claims that he found 145 instances of Wikipedia plagiarizing others
(roughly 1% of his sample), and projects that the "plagiarism rate on
Wikipedia is at least 2%."
I have a few thoughts, the first of which concern statistical
nuances. For example, the size of the sample that was used to calculate
the 1% rate is not clear. His description of the winnowing down of his
original corpus to the plagiarized cases is somewhat confusing, but
this is how I understood it:
- 16,750: the original biographical articles
- 12,095: those articles containing one or more "clean" sentences not obfuscated by wiki markup
- 5,867: those articles for which Google returns a verbatim copy of a clean sentence from another online source
- 1,682: articles remaining after removing "rogue" sites
- 145: the final plagiarizing articles
Because his 1% figure is rounded, it's not clear if his divisor is
the 16,750 articles he started with (yielding 0.0087) or the 12,095
articles in which he found at least one clean sentence (yielding
0.012); I think it should be the latter: 1.2%.
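The candidate rates can be checked directly from the counts in the list above. The following is just an illustrative sketch: the labels paraphrase the list, and the variable names are my own.

```python
# Counts from the winnowing list; labels are paraphrased for brevity.
funnel = [
    ("original biographical articles", 16750),
    ("articles with a clean sentence", 12095),
    ("articles with a Google match", 5867),
    ("articles after removing rogue sites", 1682),
    ("confirmed plagiarizing articles", 145),
]

confirmed = funnel[-1][1]
for label, count in funnel:
    # Share of each stage that ends up as confirmed plagiarism
    print(f"{label}: {count:,} -> {confirmed / count:.2%}")

rate_all = confirmed / funnel[0][1]    # divisor: all articles sampled
rate_clean = confirmed / funnel[1][1]  # divisor: articles that could be tested
```

Only the second divisor counts articles that could actually be tested, which is why it seems the better base for the rate.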
Also, it's difficult to follow the details of the winnowing of the
5,867 articles for which Google found duplication, to the 1,682
articles that weren't "rogue," to the 145 "confirmed" plagiarism, but
clearly the bulk of duplications were those containing material from
Wikipedia. This invites the question of how much Wikipedia itself is
plagiarized.
And, remember, these are descriptive statistics of his sample only.
Before making any inferential claim to the whole of the Wikipedia one
has to ask about the sampling methodology: why biographical articles,
and are they more or less likely to have plagiarism than others? (In my
experience I find biographical plagiarism to be common.) Also, we have
no parameters for the confidence
of the inference. For example, to be very confident (i.e., 99%) that
the inferred estimate is only off by 1%, one would need a proper random
sample of 16,453 ("clean") articles.
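For readers who want to experiment with such estimates, here is a sketch of the textbook sample-size calculation for a proportion. It assumes the worst-case variance (p = 0.5) and, optionally, a finite-population correction; the function name and parameters are my own. The result depends on these assumptions, so it will not necessarily reproduce the figure quoted above.

```python
import math
from statistics import NormalDist

def sample_size(confidence, margin, p=0.5, population=None):
    """Sample size needed to estimate a proportion within +/- margin."""
    # Two-sided critical value, e.g. about 2.576 for 99% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite-population correction shrinks the requirement
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)

# 99% confidence, 1% margin of error, no population correction
print(sample_size(0.99, 0.01))
# Same target, corrected for a population of 12,095 testable articles
print(sample_size(0.99, 0.01, population=12095))
```

With no correction the requirement comes out at roughly 16,600 articles, the same order of magnitude as the number above and more than the 12,095 testable articles in Brandt's sample.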
Finally, and more substantively, Brandt conflates plagiarism
with verbatim copying. As Posner (2007:12) writes: "not all
plagiarism is
copyright infringement and not all copyright infringement is
plagiarism." I'm not sure if he excludes all public domain copying
(which still might be considered plagiarism), or only unsourced public
domain material. Also, elsewhere, I wrote how I found a verbatim copy
of text in a biographical article, raising copyright infringement
concerns. After placing my report on the copyright infringement page
(there can be dozens of reports today), the infringing text was
rewritten perhaps removing it from the scope of copyright -- and
Brandt's method -- but perhaps not from the cloud of plagiarism.
Consequently, my understanding of Brandt is that his conclusion
should be read as: 1.2% of a sample of Wikipedia biography articles
appear to contain infringing verbatim copying from other online
sources. Brandt's approach was conservative, but isn't really about the
whole of the much larger, but murkier, domain of (non-verbatim)
plagiarism.
this entry posted to
social/wikipedia;
2007 Mar 29 | The origins, or etymology, of the "Internet"
On the Association of Internet Researchers mailing list Tamara Paradis raised the question of the origins of the term "Internet". I am not sure why, but I am always drawn to questions like this -- perhaps it is my historical sensibility. For some reason, I enjoy going through old documents to find the origins of terms. For instance, in 1999 as a complement to my work on the intersection of law and computer design, I wondered how technicians came to talk about computer proxies and agents, common terms from contract law. So, I researched The Etymology of "Agent" and "Proxy" in Computer Networking Discourse. Similarly, as a complement to work on how informal norms in the form of quotations about the Internet govern, I researched the provenance of many famous quotations such as "Inside every working anarchy, there's an Old Boy Network."
So, I thought I would share some of my notes on the "Internet." Vint Cerf (2000) is fond of talking about how the merging of ARPANET, PRNET, and SATNET was known as the "'inter-net' problem." However, I've not found much documentation of that.
What I have found is that the terms "international", "internet", and "internetwork" were used rather interchangeably throughout the 1970s; they couldn't even settle on what to call it, or what ITP stood for:
- Cerf (1973), A partial specification of an International Transmission Protocol which specifies an International Transmission Protocol (ITP) implemented via TCP.
- Cerf, Dalal and Sunshine (1974), Specification of Internet Transmission Control Program.
- Cerf (1977), IEN #5: Specification of Internet Transmission Control Program: TCP (Version 2), which uses the term Internet, but otherwise speaks about Internetwork.
- Cerf and Postel (1978), Specification of Internetwork Transmission Control Program: TCP, Version 3, which simplifies TCP by breaking out IP into a separate specification, but then goes back to using "Internetwork".
Because version 3 (1978) split IP out of TCP and unambiguously referred to it as the Internet Protocol, I think that's when the term began to stick. However, there is further ambiguity in the details and versioning of these specs, so it's not as easy as that!
this entry posted to
social;
2007 Mar 22 | The elusive Jimmy Wales
Like others, I have been surprised that the Wikipedia policies of No Original Research (WP:NOR, Wikipedia 2006nor) and Verifiability (WP:V, Wikipedia 2006v) had been collapsed into a new policy of Attribution (WP:ATT, Wikipedia 2007a). The two former policies, in addition to Neutral Point of View (WP:NPOV, Wikipedia 2006npv), have been essential for understanding and explaining Wikipedia collaborative culture -- even more so than the Trifecta (Wikipedia 2006pt) and The Five Pillars (Wikipedia 2006fp).
But Wikipedia is huge, and it's not hard to miss something even as important as this, so last week I updated my dissertation to read:
The second policy of Attribution requires, in a nutshell, that "All material in Wikipedia must be attributable to a reliable, published source" (Wikipedia 2007a). In a manner of speaking, this second policy is relatively new -- becoming "official" in February 2007 -- because it incorporates and supersedes the two long-standing policies of No Original Research (WP:NOR, Wikipedia 2006nor) and Verifiability (WP:V, Wikipedia 2006v).
Yet, for the dissertation, I concluded that having these two ideas remain distinct was useful to me in my writing and I would continue referring to them even if it now required a parenthetical comment about their merger. Evidently, Wales (2007wvw) agreed, as he wrote yesterday:
The change was made before a sufficient process had taken place to make the change, with the result that many good editors were unaware that such a fundamental change was about to take place. Many have reported being baffled and unhappy with the change.
However, his intervention with a "rejection of [[WP:ATT]]" (Wales 2007jrw) prompted two threads of interest to me: to what extent does WP:NOR act as a proxy for Notability, and "Just what is Jimbo's role anyway?" (Bennett 2007). In writing about leadership in Wikipedia and other open content communities, I have wondered the same, and now some were pressing for an explicit enumeration of Jimbo's powers. Others engaged in the perennial question of whether this role is more like that of a dictatorship, ministership, presidency, or monarchy. Stephen Bain (2007tdv) has posted a thoughtful argument that constitutional monarchy is the most apt, something Wales himself has said in the past:
But we have retained a 'constitutional monarchy' in our system and the main reason for it is to support and make possible a very open system in which policy is set organically by the community and democratic processes and institutions emerge over a long period of experimentation and consensus-building. No one needs to be afraid that VfD will be hijacked, and our rules turned against us. (Wales 2005nnw1)
And this brings me, finally, to the point of this essay: the elusive Jimmy Wales. I am not sure if this is a feature of other auctorial leaders (e.g., Linus Torvalds, Guido van Rossum, Larry Wall, etc.) but it is a sometimes frustrating and seemingly useful characteristic of Wales, who commented on the ambiguity of his role as follows:
I think the limits on my power are quite a bit unknown for a few reasons, mainly that I really don't exercise power all that much, ever, and so most questions of what I could do just simply don't come up. And passing a priori laws against me seems rather injudicious since our community institutions are all quite carefully limited for good reasons in an effort to create an atmosphere of calm loving respect. (Wales 2007jwi1)
This ambiguity is one reason why I find the statist notions somewhat inappropriate and place my theory of leadership in the lineage of emergent leadership (Yoo and Alavi 2003).
Wales has followed this strategy from the start, once characterized, to the frustration of Wikipedia cofounder Larry Sanger (2005), as a good cop rarely wielding a big stick:
In retrospect, I wish I had taken Teddy Roosevelt's advice: "Speak softly and carry a big stick." Since my "stick" was very small, I suppose I felt compelled to "speak loudly," which I regret. (This was not such a problem, by the way, on Nupedia; partly, that was because there were not nearly as many problem users on Nupedia, but partly it was because there was clear enforcement authority.) As it turns out, it was Jimmy who spoke softly and carried the big stick; he first exercised "enforcement authority." Since he was relatively silent throughout these controversies, he was the "good cop," and I was the "bad cop": that, in fact, is precisely how he (privately) described our relationship. Eventually, I became sick of this arrangement. Because Jimmy had remained relatively toward the background in the early days of the project, and showed that he was willing to exercise enforcement authority upon occasion, he was never so ripe for attack as I was.
This elliptical approach serves Wales well and I think it is appropriate to the many challenges he faces, but it is also a challenge for writing about Wikipedia. For example, in explaining an inspiration for Wikipedia, Wales (2005nt) has acknowledged the debt to Hayek's, "'On the Use of Knowledge in Society' as a pivotal essay in guiding my own thinking on topics like decentralization, knowledge, and society." But Wales has also resisted (Wales 2005wew) the idea that The Wisdom of Crowds (Surowiecki 2004) is a factor in Wikipedia dynamics. This isn't trivial for me to reconcile and I can only do so by reading the latter statement not as a disavowal of social emergence, but a purposeful shift in focus to the community and culture of Wikipedia. (Since that, too, is my own belief: there are underlying emergent dynamics, but don't forget the collaborative culture!)
Or, consider that Wales (2005wdm) vehemently disagreed with Seigenthaler's claim (AP 2005) that because Wikipedia allows anyone to edit, Wikipedia permits vandalism. Yet, six months later, in order to limit such vandalism Wales argued for what was popularly understood as a new blocking feature, so those logged in from an otherwise blocked IP address could still edit. Wales (2006nyt1) argued this was not a restriction:
Openness refers not only to the number of people who can edit, but a holistic assessment of the entire process. I like processes that cut out mindless troll vandalism while allowing people of diverse opinions to still edit. Those are much better than full locking.
"Holistic," "elliptical," "elusive"... and a challenge in writing about Wales!
this entry posted to
social/wikipedia;
2007 Mar 15 | Reuse vs. self plagiarism
Yesterday's New York Times reported on another case of high profile plagiarism: a relatively young professor who had copied parts of her dissertation from another. Even though she had previously acknowledged as much in private and has now resigned -- so there's no question of ambiguous boundaries -- a few things struck me as salient:
- A colleague, fed up with low-quality peers, started the investigation and even hired a private investigator to bust her.
- Before the story broke she contacted her former adviser about revising copied material, but when the reporter questioned the adviser, "He said that he barely recalled her, 'I only remember one thing, that she was in a hurry?'" That seems odd.
- The article's evidence of plagiarism consists of two fragments: a short description of a reference, and a one sentence description of her project. I find both types of statements lend themselves to a remarkable degree of homogeneity, and wouldn't find them convincing on their own among hundreds of pages of text.
Interestingly, this article came along at the same time I have been following an interesting discussion of turnitin, a student plagiarism detection service, and struggling with issues of "reusing" my own work.
Unlike my previous issue of how to deal with priority in relation to self-published grey literature, my present concern arises out of published work. I am presently working on Chapter 4 of my dissertation, which specifies criteria for an "open content community" as well as some interesting boundary cases on openness. A dissertation is, understandably, supposed to be an original work; I read this as "new work since matriculation" but I have heard it said this could mean only unpublished work: this would be horrible. Perhaps my strong view is partly because I'm a "mid-career" Ph.D. student who has already presented papers, and it strikes me as contrary to stop a professional activity that is essential for getting feedback. I also appreciate that new Ph.D.s reuse their dissertations in subsequent articles and/or books, which I also plan to do. But to sit on all that material and labor on in solitude -- aside from one's dissertation committee members -- until the dissertation is complete seems counterproductive. Consider the genealogy of parts of the present chapter:
- the notion of an open content community was a fragment from a sociology term paper from four years ago that I then developed into a short published paper, which I then expanded into a more extensive published paper that I planned to make use of in this chapter.
- the boundary case of open community and closed law was a blog entry I planned to use in this chapter. I received a request for its use in a book and it is now "published."
- the case of WikiChix was another blog entry I planned to use and was happy to extend and submit it in response to a request since I knew I'd eventually want to work on it more; it will soon be published.
- the case of a Wikipedia blocking proposal, now implemented, is written and no one else has ever seen the text. Consequently, I expect it stinks and I thought of posting it here.
In no case did I assign any copyright -- though they of course are published under various copyright licenses -- and so I am not legally precluded from using them in compilations or derivative works. Making them available has provided me with feedback and opportunities for publication which yields more feedback and builds relationships within my scholarly community. This is great! But what of "self-plagiarism"? (So, perhaps this is like my earlier post but questions of priority and public but "unpublished" work are exchanged for questions of "published" works and self plagiarism.)
I've been reading up on the topic and found Green (2005) interesting, and Hexham (1999) useful:
Self-plagiarism must be distinguished from the recycling of one's work that to a greater or lesser extent everyone does legitimately. Although self-plagiarism in academic publications is a gray area many universities implicitly recognize the practice as fraudulent. Thus most universities have rules preventing students from submitting essentially the same essay for credit in different courses. There are also rules against someone submitting the same thesis to different universities. Among established academics self-plagiarism is a problem when essentially the same article or book is submitted on more than one occasion to gain additional salary increments or for purpose of promotion.
Like all plagiarism, self-plagiarism occurs when the author attempts to deceive the reader. This happens when no indication is given that the work is being recycled or when an effort is made to disguise the original text. The issue once again is one of deception. Disguising a text occurs when an author makes cosmetic changes that make the same book or paper look different when it actually remains unchanged in its central argument. Changing such things as paragraph breaks, capitalization, or the substitution of technical terms in different languages, causes readers to believe they are reading something completely new. If these are the only changes an author has made then they may be legitimately described as self-plagiarism and fraudulent.
The extent of re-cycling is also an indication of self-plagiarism. Academics are expected to republish revised versions of their Ph.D. thesis. They also often develop different aspects of an argument in several papers that require the repetition of certain key passages. This is not self-plagiarism if the complete work develops new insights. It is self-plagiarism if the argument, examples, evidence, and conclusion remain the same in two works that only differ in their appearance.
Which brings me, finally, to my simple and mundane question for my dissertation. Is a citation to my own published works sufficient if I am reusing text -- though continuing to rework and integrate it -- or should I also give the acknowledgment often seen in scholarly books that "portions of this text are republished from or based on..."?
(BTW: a possible irony is I expect this and earlier entries could be turned into a decent paper on "scholarship in the open" should the opportunity ever present itself!)
this entry posted to
method;
comments (2)
2007 Mar 05 | Good faith, bad faith
Despite appearances, I do make note of the Wikipedia disaffected. A possible criticism of my
work is that my focus on the notion of "good faith" in collaboration misses
this constituency. No doubt one could write many volumes on the
conflict and "dissensus" of Wikipedia. (Deetz (1996) considers the
consensus/dissensus cleavage as a common facet in approaches to
"Describing Differences in Approaches to Organization Science.")
However, I often feel that going for conflict is the easier, more dramatic
route; it is even privileged. Perhaps to a lesser extent than
in psychology (Seligman 2002), social scientists -- and particularly
"critical theory" types -- privilege the pathological. When writing of
his studies of primate behavior, de Waal (1996) noted that "terms
related to aggression, violence, and competition never posed the
slightest problem" to editors and reviewers. Yet, he "was supposed to
switch to dehumanized language as soon as the affectionate aftermath of
the fight was the issue" (p. 512). This bias is common, but fortunately
in organizational science there is room to consider both good and bad
organizational leadership, culture, and structures, with a descriptive,
normative, and/or prescriptive stance (Bell, Raiffa and Tversky 1988).
This week a controversy has been raging about
misrepresentations a prominent Wikipedian made about his real-life
credentials, and Wales was criticized for seemingly endorsing this
behavior. I planned to say nothing, or at least be patient, as it
seemed much of the heat was from those with grudges or posing as pundits. However, Wales has finally spoken up
-- he had been traveling in India without easy access to an Internet
connection -- with the following:
I have asked EssJay to resign his
positions of trust within the community. In terms of the full
parameters of what happens next, I advise (as usual) that we take a
calm, loving, and reasonable approach. From the moment this whole thing
became known, EssJay has been contrite and apologetic. People who
characterize him as being "proud" of it or "bragging" are badly
mistaken.... Wikipedia is built on (among other things) twin pillars of
trust and tolerance. The integrity of the project depends on the core
community being passionate about quality and integrity, so that we can
trust each other. The harmony of our work depends on human
understanding and forgiveness of errors. (Wales 2007utj)
I read that and think: this is why I am
studying good faith.
In any case, the thread that really caught my interest this
week was the banning of a user from the Wikipedia e-mail lists who
exemplifies the operation of bad faith in two interesting ways. (I'm
not linking to or identifying this person as it's not my intention to
attack or to make any judgments about the substance of his complaints.)
The first interesting issue is: what constitutes a contribution? In
goal-oriented and merit-based open content communities one's authority in
policy/administrative issues is often built upon contributions towards
the substantive goals of the community (e.g., writing an encyclopedia).
Strident, even if partially sound, criticism from someone who does
little else carries little weight. When John Doe responded that he did
contribute, he asked:
What do they consider "not making any
positive contributions"? How about this? Or this? Or this?
The fact that these links only pointed to
other complaints is telling. Later, when his links to his previous
criticism in the email archives broke, he assumed the worst:
Yes, that's right. Someone
sort-of-cunningly reindexed the month. ... Why would they do this?
Presumably, it's an attempt to hide the post. They know they'd get
caught if they just deleted it, but if they just "shift" it a bit...
sneaky, boys.
Much as I marvel at Wales' invocation of
good faith, I boggle at the paranoia of this statement. As readers of
this blog know, I too was frustrated when references to the archive
broke -- a common "feature" of mailman -- but to assume it was done to
"hide the truth," which is the subject of Doe's entries, is an
astounding and ill-founded assumption of bad faith.
this entry posted to
social/wikipedia;
comments (0)
2007 Feb 16 | History, Wikinomics, and Causation
An issue related
to the question of priority, noted in a previous entry, is the general
historical question of causality. Priority, who first had an idea or
published it, can be a trivial question relative to a claim about who
or what caused something. Niels Bohr modeled the
atom, but how did the US come to drop the bomb on Japan?
In the history of technology, the question of "founding" is of
particular interest to me. Given the spat
between Wikipedia's cofounders on the balance of credit, I note with
interest how Wikipedia's "creation story" is told. The new book, Wikinomics
(Tapscott and Williams 2006), tells the story as follows:
Wales first ventured into the world of
encyclopedic content in 1998, when he established Nupedia with former
employee Larry Sanger. Like Wikipedia, Nupedia allowed anyone to submit
articles and content. Unlike Wikipedia, it was a centralized, top-down
hierarchy: paid academics and topic experts follow the laborious
seven-step process to review and approve content. One year and $120,000
into the project, Nupedia had only published twenty-four articles, and
Wales decided to scrap it.
One of Wales' employees then introduced him
to the Wiki, a concept invented by Ward Cunningham in March 1995, and
Wales started again with a much more open way of organizing the site
that would allow anyone with the inclination to participate. In the
first month, Wikipedia published 200 articles, and in the first year
the total reached 18,000 (p.72).
This is the only reference to Sanger in the
book, and it omits his well-known and public role in Wikipedia's launch in
favor of the undocumented
claim that Bomis employee Jeremy Rosenfeld
proposed a wiki as a way to solve Nupedia's problems before Sanger did.
Despite Sanger's strident efforts to defend his claim, I expect that in
the popular press he will only be known as the authority-mad academic
of Wikipedia's failed predecessor.
(I don't necessarily discount Wales' claim, but one can't deny
that the historical documentation shows Sanger played a prominent role
in launching Wikipedia. Nor do I believe Sanger was the right leader for Wikipedia; with regard to the original vision and eventual success of
Wikipedia, we have Wales to thank.)
this entry posted to
social/wikipedia;
comments (0)
2007 Feb 13 | Grey literature, stigmergy and priority
Last
week I read a provocative paper by Helen Nissenbaum (2002) where she
considers the norms, values, and ends previously served by the
convention of scholarly priority, and, now that the contextual
landscape is changing because of electronic media, whether intellectual
property (patents) can serve just as well in their stead. Helen
recommended it to me while we were discussing my dissertation chapter
on encyclopedic production, including questions of copyrights and
plagiarism. This chapter is partly based on a draft I wrote in 2005 in
which I argued the concept of stigmergy is helpful in understanding the
sort of sociality involved in the cumulative production of knowledge in
reference works.
An irony is that Nissenbaum's paper speaks to the question of
scholarly priority in the age of the Internet, which bears on my
adoption of the term stigmergy. (She doesn't mention blogs or wikis,
but instead refers to "wildcat publishers," "grey literature," and
whether there is any scholarly obligation to search these realms for
the purposes of citation.)
I think I first wrote of stigmergy in the spring of 2005, in a
draft I made available on this blog on September
30. Roughly a year later, I read Mark Elliott's piece Stigmergic
Collaboration: The Evolution of Group Work in the May issue
of the online M/C Journal. Elliott explores the idea much more
thoroughly than I did or will, and that is good. But how do I deal with
the question of priority and citation? I definitely want to -- and do
-- cite Elliott in my present version of the chapter, but what to do
with my earlier version? I don't know Elliott and assume he knows
nothing of me. And I don't feel that proprietary about saying Wikipedia
might be stigmergic. And for all I know we read the same thing about
wasps -- though I was also inspired by early reference work compilers
likening their copying of others' work to a useful "busy bee." But I
don't want it to appear I am simply borrowing the idea from elsewhere
and I prefer not to cite earlier "unpublished" drafts. This concern
with priority is in the face of the biggest irony of all: an argument
of this chapter is that knowledge is inherently interdependent and
cumulative!
Presently, the text in question reads:
Stigmergy is a term coined by Pierre-Paul
Grassé to describe how wasps and termites collectively build complex
structures; as Karsai (2004:101) writes, it "describes the situation in
which the product of previous work, rather than direct communication
among builders, induces [and directs how] the wasps perform additional
labor." In addition to my proposal that this notion might be helpful in
understanding Wikipedia collaboration (Reagle 2005fss), Mark Elliott
(2006) has also, more thoroughly, argued the same: "As stigmergy is a
method of communication in which individuals communicate with one
another by modifying their local environment… [t]he concept
of stigmergy therefore provides an intuitive and easy-to-grasp theory
for helping understand how disparate, distributed, ad hoc contributions
could lead to the emergence of the largest collaborative enterprises
the world has seen" (Elliott 2006:4). However, we need not apply this
notion only to new media. For example, stigmergy might also be
applicable to Newton’s seemingly generous sentiment of
acknowledging the contributions of his predecessors: "If I have seen
further [than you and Descartes] it is by standing upon ye shoulders of
giants." (As cited in a 1676 letter from Newton to Hooke, by Merton
(1993), who details a long history of this aphorism and Newton's
probably less than magnanimous intention (Hawking 2002) of insulting
Robert Hooke, his short and hunchbacked rival.)
Is this appropriate?
this entry posted to
method;
comments (0)
2007 Feb 09 | Auctorial Leadership?
A few days ago, while walking home from the local library, I recalled
an expression I learned in a class on early Christian history: primus inter pares.
This notion was used by early church leaders (e.g., the Bishop of Rome,
now the Pope) and present day patriarchs to indicate a status of "first
among equals." Perhaps this could help me with my question
of what to call benevolent dictatorship in open content communities.
But, the sentiment wasn't quite right and it would be difficult to coin
a term out of that Latin expression. But as I followed links from the primus page I encountered the terms "patriarch," "ethnarch," "archons" and finally "auctoritas."
The Oxford Classical Dictionary defines patrum auctoritas
as: "the assent given by the 'fathers' (patres) to decisions of the
Roman popular assemblies. The nature of this assent is unclear, but it
may have been a matter of confirming that the people's decision
contained no technical or religious flaws. The 'fathers' in question
were probably only the patrician senators, not the whole senate..."
(Momigliano and Cornell 2003). Auctoritas is the Latin root of the English words authority and author.
Given that "benevolent dictators" are often the founding authors of open
content projects, it seems appropriate. (In the Internet standards
context, I spoke of "elders.")
While I was convinced for a time my term would include the root "arch,"
for "ruler," the more I read of auctoritas the more I liked it.
Additionally, the form of power inherent in auctoritas fits my
notion of leadership. It is not a coercive order but a recommendation
with a normative force based on the prestige and charisma of a leader.
Theodor Mommsen wrote of it as a force that is "more than an advice
and less than an order: it is an advice whose compliance it is not easy
to evade..." (Mommsen, as cited in Lottieri 2005:15). Lottieri
concludes his discussion of the notion by writing:
For all these reasons we can say that auctoritas
wa[s] on the edge between the legal world and the social life, the
beliefs, the customs. It is in condition to influence the decisions by
its prestige. Therefore, people refusing the auctoritas can ignore it,
but they know that by the decision they are out of the community.
(Lottieri 2005:15).
And this dovetails into the possibility of forking!
So, I find the term to be a surprisingly good fit. Now I need to
figure out how to pronounce "auctorical" or "auctorial," or maybe even
"authorial," leadership. Is this too awkward?
this entry posted to
social/wikipedia;
comments (0)
2007 Feb 08 | ZotZero and BusySponge
I have been reading of ZotZero in Josh's
blog and am hopeful that it will help bridge the gap between the
dynamic and informal life of the Web (e.g., reading, blogging,
bookmarks, RSS, etc.) and the seemingly lifeless task of bibliography.
Wouldn't it be nice if citing something was as easy as bookmarking it?
Or, if you could read what your colleagues were reading via an RSS feed?
While I haven't played with ZotZero
yet -- and I use the Konqueror browser not Firefox -- I share this
vision and hope to see it become a reality. And since I recently posted
of my Freemind Extract tool (for transforming a mindmap into a
bibliography) I realize I haven't spoken of the flipside in a couple of
years: absorbing information. But first, a historical digression.
The way I make note of and annotate resources and tasks evolved out
of two practices at the W3C. The first was a decree by Timbl
which I objected to strongly at the time: the great datespace shift of
1999. Because the W3C's root file/name space was getting too crowded,
Tim's new policy forbade new top-level spaces like www.w3.org/Signature or www.w3.org/Encryption.
There were too many already and who were we to lay claim to such spaces
for all time? There might be a new digital signature activity 10 years
from now, so where would they live? (Consequently, the subsequent key
management working group received www.w3.org/2001/XKMS.)
I appreciated this concern at the root level, but cringed at only being
able to organize other files by date of creation. Try finding a
document you wrote a couple of years ago in a space no more structured
than /2001/{01,..,12} and shared by 50+ other
people. It's not easy. I realized the only way I could keep track of
things I had worked on was to have a log of events and documents I
cared about. (This shift also affected how we collaborated in our
shared space given issues of ownership, access controls, and version
management -- but perhaps more on that another time.)
The second W3C practice was that each of its hosts (worksites) had a
weekly meeting at which we shared the important events of the past week
and raised agenda issues for common discussion. To make it easier for
the minute takers we e-mailed two minutes to an e-mail list and a bot
would collect them into draft minutes which would be augmented with the
IRC log.
Preparing my two minutes before 10 a.m. Tuesday morning always
seemed more frantic than it need be. But, once I started keeping a log
of what I had done as a result of the datespace shift, it became
trivial. (In fact, I wrote a script to grab the past week
automatically, and even generated an RSS feed from the work log so that
one could "subscribe" to my work log by keyword/task -- anticipating
RSS feeds of tagged bookmarks.)
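A minimal sketch of what grabbing "the past week" from such a log might look like follows. It is not the original script: the in-memory log layout, the keywords, and the names LOG and past_week are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical work-log entries as (date, keyword, note) tuples.
# The real log format is not described here; this layout is an assumption.
LOG = [
    (date(2002, 3, 4), 'xmldsig', 'Published updated test vectors'),
    (date(2002, 3, 8), 'xkms', 'Reviewed requirements draft'),
    (date(2002, 3, 11), 'xmldsig', 'Answered interop questions'),
]


def past_week(entries, today, keyword=None):
    """Return entries from the seven days up to and including `today`.

    If `keyword` is given, keep only entries tagged with that keyword,
    which is roughly what a per-task feed would need.
    """
    start = today - timedelta(days=6)
    return [
        entry for entry in entries
        if start <= entry[0] <= today
        and (keyword is None or entry[1] == keyword)
    ]


if __name__ == '__main__':
    for when, tag, note in past_week(LOG, date(2002, 3, 12)):
        print(when.isoformat(), tag, note)
```

Filtering by keyword is the same operation a per-task feed would perform before rendering the entries as feed items.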
By 2002 I had tired of manually logging events, via an HTML editor,
to my personal blog and work log, so I wrote a specification for a
dream tool: Busy Sponge. It would soak up everything I touched of importance and send it to the right place. I opted for a command-line tool I named b.py.
Returning to today, a challenge I'm sure I share with
the ZotZero folks is how to automatically scrape as much metadata as
possible from a Web resource. Busy Sponge continues to be the primary
way I input data into my work log and mind maps. Because metadata is no
more common or standard on the Web than it was five years ago, I am
dependent on screen-scraping heuristics. For example, the following
code allows me to easily capture and cite messages of Wikipedia mailing
lists -- and that is why it was such a hassle when the archives broke:
    elif url.startswith("http://marc.theaimsgroup.com/"):
        try:
            author = re.search('''From: *(.*?)''', html).group(1)
        except AttributeError:
            author = re.search('''From: *(.*)''', html).group(1)
        # undo the archive's address obfuscation and HTML escaping
        author = author.replace(' () ','@').replace(' ! ','.')\
            .replace('&lt;', '<').replace('&gt;', '>')
        # keep only the display name, without the address or quotes
        author = author.split(' <')[0]
        author = author.replace('"','')
        mlist = re.search('''List: *(.*?)''', html).group(1)
        mdate = re.search('''Date: *(.*?)''', html).group(1)
        ...
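As a self-contained illustration of this kind of screen-scraping heuristic, the sketch below pulls the author, list, and date out of an archived message page. It is not the b.py code: the sample page, the line-oriented patterns, and the names SAMPLE_PAGE, field, and scrape_message are assumptions, and a real archive's markup may well differ.

```python
import html as htmllib
import re

# Hypothetical archive page; the real markup is an unknown here.
SAMPLE_PAGE = """<pre>
List:       wikien-l
From:       "Jane Doe" &lt;jane () example ! org&gt;
Date:       2007-01-05 12:34:56
Subject:    [WikiEN-l] Stable links
</pre>"""


def field(name, page):
    """Return the text following 'Name:' on its own line, or None."""
    match = re.search(r'^%s: *(.*)$' % re.escape(name), page, re.MULTILINE)
    return match.group(1).strip() if match else None


def scrape_message(page):
    """Extract author, list, and date with simple per-line heuristics."""
    author = field('From', page)
    if author is not None:
        author = htmllib.unescape(author)            # &lt; becomes <
        author = author.replace(' () ', '@').replace(' ! ', '.')
        author = author.split(' <')[0].replace('"', '')  # display name only
    return {
        'author': author,
        'list': field('List', page),
        'date': field('Date', page),
    }


if __name__ == '__main__':
    print(scrape_message(SAMPLE_PAGE))
```

Returning None for a missing field, rather than raising, lets the caller fall back to manual entry when a page does not match the expected layout.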
Unfortunately, beyond a couple of mailing list archives and wikis --
which, fortunately, are the majority of what I grab -- I have to
manually edit my sponges with proper meta/bibliographic data. And
curses upon those bloggers who make it difficult to determine the
author of an article or even the whole blog -- even a pseudonym will
do! Beyond the usage of my tool, I can imagine much value in a social
tool that allows users to share annotations, or even screen-scraping
"plug-ins." One can hope!
this entry posted to
technology/python;
comments (0)
2007 Feb 01 | The Anti-Grand Poobah
In a draft on Wikipedia leadership I
labeled the "benevolent dictator" form of leadership in open content
communities as "paramount leadership." As I turn to revisit the topic
in my dissertation I am still not satisfied with this term. A possible
tactic would be to adopt the native's term of "benevolent dictator" and
use that. The problem, though, with that is that the term is much
confused and discussed -- which is why it is interesting -- and would
fail to distinguish the distinct theoretical concept I am offering
relative to the wider, imprecise, usage. Paramount means "superior to
all others" and serves the purpose of being distinct, but doesn't quite
capture the meaning of this type of leadership in open content
communities. Such leaders often have no formal title but founded the
community or otherwise achieved the position through merit. They serve
to make decisions that the community has a difficult time making
itself. And they must act with humility and humor or the community
might fail or fork. Some other ideas I've been bouncing around are
"provisional dictator" and "jester king." I need something that is the
opposite of "Grand Poobah." Any ideas are welcome!
this entry posted to
social/wikipedia;
comments (0)
2007 Jan 30 | Student Usage of Wikipedia
The Blogosphere has been abuzz about a history department's policy restricting students from citing Wikipedia. I'm not fond of this position, as I explained last year, and I thought I'd share the "best practice" I encourage in my students.
this entry posted to
career/teaching;
comments (0)
[This entry is now deprecated, please see Thunderdell (Freemind Extract).]
I am releasing version 0.6 of
the fe mindmapping bibliographic tools. As
explained in Extracting
Bibliographies from Freemind, these are Python scripts that are able to
convert between Freemind
mindmaps (using a few simple conventions) and bibliographic formats (i.e.,
OO.org CSV and BibTeX). They also make it very easy for me to search my notes and quote authors
(e.g., "Giddens").
There are no massive changes, just the usual tweaks and bug fixes. One
notable change is that the regular expressions in pe.py are much improved;
the script is now quite uncanny at extracting bibliographic keys of the form
'Snide and Smith (2003)' or '(Snide, Smith and Smittie 2004)' from
natural language text.
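To give a flavor of this kind of extraction, here is a minimal sketch that matches the two citation forms mentioned above. These are not the tool's actual regular expressions; the patterns and the names NAME, AUTHORS, NARRATIVE, PARENTHETICAL, and find_citations are assumptions for illustration, and they only handle simple capitalized surnames.

```python
import re

# A surname: a capital letter followed by letters, apostrophes, or hyphens.
NAME = r"[A-Z][A-Za-z'-]+"
# One or more surnames: 'Snide', 'Snide and Smith', 'Snide, Smith and Smittie'.
AUTHORS = r"%s(?:, %s)*(?: and %s)?" % (NAME, NAME, NAME)

# Narrative form, e.g. 'Snide and Smith (2003)'.
NARRATIVE = re.compile(r"(%s) \((\d{4})\)" % AUTHORS)
# Parenthetical form, e.g. '(Snide, Smith and Smittie 2004)'.
PARENTHETICAL = re.compile(r"\((%s) (\d{4})\)" % AUTHORS)


def find_citations(text):
    """Return (authors, year) pairs in the order they appear in `text`."""
    found = []
    for pattern in (NARRATIVE, PARENTHETICAL):
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(1), match.group(2)))
    return [(authors, year) for _, authors, year in sorted(found)]


if __name__ == '__main__':
    sample = ("As Snide and Smith (2003) argue, this holds "
              "(Snide, Smith and Smittie 2004).")
    print(find_citations(sample))
```

A sketch like this will miss lowercase name particles, "et al.", and year suffixes, which is where most of the tweaking in a real extractor would go.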
this entry posted to
technology/python;
comments (0)
2007 Jan 25 | Dunc-Tank and Money
Like Biella,
I have been following from afar the controversy [1,2] associated
with the dunc-tank project: a way for a few Debian developers to accept
donations. The moderate amount of money (appreciated
nonetheless I'm sure) caused an extraordinary ruckus among
other volunteers, leading to protest and resignations.
How
is it that money can be so divisive? The study Biella points
to suggests that even ambient reminders of money (what
psychologists
call "priming") can lead to "antisocialness." Subtly
reminding subjects of money or high affluence (e.g., descrambling a
sentence about a high salary, exposure to a poster or screensaver of
currency, seeing a big pile of Monopoly Money) led them to be more
"self-sufficient," that is, less likely to ask for or provide help, less
likely to donate to a cause, to arrange chairs further apart, and to
prefer singular activity to collaboration and social recreation (Vohs, Mead and Goode 2006).
this entry posted to
social;
comments (0)
2007 Jan 23 | Broken lists
I'm presently cursing whoever changed the configuration/names
of Wikipedia lists. Identifying emails in archives is sadly a difficult
problem (it really need not be), but fortunately the good folks at the
aimsgroup MARC also archive the lists and associate the unique identifier of every
message with a persistent and unique URL, as I wrote about previously.
But when Wikipedia moved its lists from "foo@wikimedia.org" to
"foo@lists.wikimedia.org" it not only broke email filters across
the land, it evidently broke the MARC archives too. No message has been
available in the MARC archive since the change on January 6. Now, Wikipedians are realizing
that many of the links from the Wikis to email messages (e.g.,
referencing a message on the Wikimedia Foundation list) are broken.
My
backlog of email messages to scrutinize is growing as I hope Hank
Leininger and the other volunteers at MARC find the time and means to
address the problem. What would be great is if Wikipedia and other
users of archive software (i.e., mailman) pressed for stable references
to messages as a priority feature!
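One way an archive could offer stable references is to derive each message's URL from its Message-ID header rather than from the list address or an index position. The sketch below illustrates that idea only; it is not how MARC or mailman actually works, and the base URL and the name stable_url are hypothetical.

```python
import hashlib
from email import message_from_string

# Hypothetical archive base URL, used only for illustration.
ARCHIVE_BASE = 'http://archive.example.org/msg/'


def stable_url(raw_message):
    """Build a URL that depends only on the message's Message-ID header."""
    msg = message_from_string(raw_message)
    message_id = msg['Message-ID']
    if message_id is None:
        raise ValueError('message has no Message-ID header')
    digest = hashlib.sha1(message_id.strip().encode('utf-8')).hexdigest()
    return ARCHIVE_BASE + digest


if __name__ == '__main__':
    raw = ("From: someone@example.org\n"
           "To: foo@wikimedia.org\n"
           "Message-ID: <12345@example.org>\n"
           "\n"
           "Hello, list.\n")
    # Renaming the list address leaves the derived URL unchanged.
    moved = raw.replace('foo@wikimedia.org', 'foo@lists.wikimedia.org')
    print(stable_url(raw) == stable_url(moved))
```

Because the URL ignores the list address and the message's position in a monthly index, neither a list rename nor a reindexing would break links built this way.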
this entry posted to
method;
comments (4)
2007 Jan 10 | Understanding technology
I have just completed revising the syllabus for a course I will again be teaching in the spring of 2007: Understanding how we understand: technological predictions, myths, and implications. Pedagogically, the class is very much influenced by Brookfield and Preskill (1999), Discussion as a way of teaching: tools and techniques for democratic classrooms.
this entry posted to
career/teaching;
comments (0)