Socialization in Open Technical Communities

Joseph M. Reagle Jr. <email address>


While many definitions of openness focus on the character and licenses of the software products, relatively few directly address the character of the social organization that develops those products. This essay offers a definition of openness and considers how that characteristic affects the recruitment and socialization of newcomers to such organizations. The relevance of socialization is clear when one considers the growth of on-line communities and the precariousness of membership in voluntary organizations. I then suggest that “forking,” a split of the communities, is integral to the definition of openness and a possible vector for communicating social norms between communities, and that a significant difference between open technical communities and some other open/voluntary communities is the internal orientation of status seeking within the community.

Table of Contents

1 Introduction

2 Openness Considered

2.1 Free Software and Open Source

2.2 Openness, Voluntariness, and Remoteness

2.2.1 Openness

2.2.2 Voluntariness

2.2.3 Remoteness

2.3 Open Communities

3 Open Communities and Socialization

3.1 Motivation

3.2 Structure

3.3 Joining

3.4 Learning

3.5 Goals

3.6 Group and Self Identity

3.7 Role and Attribution

4 Conclusion and Exit

5 Bibliography

1 Introduction

How do the “openness” and “voluntariness” of communities, such as those that write free software (e.g., GNU/Linux) and open standards (e.g., Internet and Web), affect the socialization of their members? The majority of work on the social character of open technical communities has focused on motivation: why would someone volunteer their time to work on such a project? However, there are a number of other interesting questions related to socialization, including the relevance of structure, joining, learning, goal setting, identity, and roles.

In the second section I introduce a few open communities and provide definitions for a number of terms and organizational models. In the third section I consider a set of features (characteristics) relevant to socialization in “open communities” with references to existing literature, examples in contemporary practice, and questions meriting further research. I rely upon literature from both the technical (e.g., studies of the open source communities) and non-technical domains (e.g., a mountain rescue team). I conclude by arguing that open communities are a compelling and novel subject for study in light of socialization because of the continual arrival of newcomers, the communities' tendency to splinter (“fork”), and the fact that, perhaps unlike some other voluntary organizations, most of the status derived from participation is orientated within the community itself, not externally.

2 Openness Considered

The Dictionary of the Social Sciences defines socialization as “. . . the process through which individuals internalize the values, beliefs, and norms of a society and learn to function as its members” (Calhoun 2002:447). Socialization occurs through processes by which the individual learns of, and is expected to conform to, social norms, roles, and conceptualizations of self and group. For a decade I've followed and participated in various open communities (including the IETF and W3C described below), witnessing collaborative successes and disappointing breakdowns in the social fabric of the groups. What socialization processes influence members of these communities towards successful collaboration?

In this section I introduce a number of concepts and institutions, which I use in the next section to consider the question of “how does the open and voluntary character of these communities affect the socialization of its participants?”

What I mean by an open community is a generalization of the Free Software, Open Source and Open Standards movements. Communities marshaling themselves under these banners cooperatively produce, in public view, software, technical standards, or other content that is intended to be widely shared. I use the term “open community” for two reasons. First, it is not as unwieldy as “open or free voluntary software and technical standards communities.” Second, the generalization permits me to abstract away characteristics that are not relevant to sociological questions. For example, the following questions are relevant and will be addressed: is Open Source different from Free Software, are the communities open, are their products open, and are they solely voluntary efforts? However, once addressed, I do not need to invite attention to these questions unless they are relevant to the discussion at hand.

2.1 Free Software and Open Source

The Free Software movement was begun by Richard Stallman at MIT in the 1980s. Previously, computer science operated within the scientific norm of collaboration and information sharing. When Stallman found it difficult to obtain source code to a troublesome Xerox printer, he feared that the norms of freedom and openness were being challenged by a different, proprietary, conceptualization of information. To challenge this shift he created the GNU Project in 1984 (Stallman 1998), founded the Free Software Foundation (FSF) in 1985 (Stallman 1996), and authored the GNU General Public License in 1989.

The goal of the GNU Project was to create a free version of the UNIX computing environment, with which many computer practitioners were familiar, and to which many had even contributed, but which was increasingly being encumbered with proprietary claims. GNU is a playful form of a recursive acronym: GNU's Not Unix. The computing environment was supposed to be similar to but independent of UNIX and include everything a user needed, including an operating system kernel (e.g., Hurd) and common applications such as small utilities, text editors (e.g., EMACS) and software compilers (e.g., GCC).

The FSF is now the principal sponsor of the GNU Project and focuses on administrative issues such as copyright licenses, policy, and funding; software development and maintenance is still an activity of GNU. The GPL is the FSF's famous copyright license for “free software”; it ensures that the “freedom” associated with being able to access and modify software is maintained with the original software and its derivations. It has important safeguards, including its famous “viral” provision: if you modify and distribute software obtained under the GPL, your derivation also must be publicly accessible and licensed under the GPL.

In 1991, Linus Torvalds started development of Linux: a UNIX-like operating system kernel, the core computer program that mediates between applications and the underlying hardware. While it was not part of the GNU Project, and differed in design philosophy and aspiration from GNU's kernel (Hurd), it was released under the GPL. While Stallman's stance on “freedom” is more ideological, Torvalds' approach is more pragmatic. Furthermore, other projects, such as the Apache web server, and eventually Netscape's Mozilla web browser, were being developed in open communities and under similar licenses except that, unlike the GPL, they often permit proprietary derivations. With such a license, a company may take open source software, change it, and include it in their product without releasing their changes back to the community.

The tension between the ideology of free software and its other, additional, benefits led to the concept of Open Source in 1998. The Open Source Initiative (OSI)  was founded when, “We realized it was time to dump the confrontational attitude that has been associated with 'free software' in the past and sell the idea strictly on the same pragmatic, business-case grounds that motivated Netscape.” (OSI 2003) Since the open source label is intended to cover open communities and licenses beyond the GPL, they have developed a meta (more abstract) Open Source Definition (OSI 1997) which defines openness as:

  1. Free redistribution

  2. Accessible source code

  3. Permits derived works

  4. Ensures the integrity of the author's source code

  5. Prohibits discrimination against persons or groups

  6. Prohibits discrimination against fields of endeavor

  7. Prohibits NDA (Non-Disclosure Agreement) entanglements

  8. Ensures the license must not be specific to a product

  9. Ensures the license must not restrict other software

  10. Ensures the license must be technology-neutral

A copyright license that OSI finds to satisfy these requirements is listed as an OSI certified/approved license; the GPL, of course, is among them.

Substantively, Free Software and Open Source are not that different: the differences are of motivation, personality, and strategy. The FLOSS (Free/Libre and Open Source Software) survey of 2,784 Free/Open Source (F/OS) developers found that 18% of those that identified with the Free Software community and 9% of those that identified with the Open Source community considered the distinction to be “fundamental” (Ghosh et al. 2002:55).

Given the freedom of these communities, forking (a split of the community where work is taken in a different direction) is common to the development of the software and its communities. One can conceive of the Open Source movement as having forked from the Free Software movement. I won't belabor the point of this difference further except to point out that the most common form of this difference today is over the term “Linux.” In particular, Richard Stallman objects to the term when it is used to refer to the whole computing environment, including GNU software beyond the core “Linux” kernel. The whole computing environment is typically integrated into a distribution, such as the one provided by the RedHat company or Debian community; there are dozens of distributions with variances in their focus, the hardware on which they run, or, if affiliated with a company, the support services offered with them. When one wants to refer to the complete environment without specifying a particular distribution, Stallman argues that one should use “GNU/Linux.” While this seems petty to many, Stallman feels it is important to continue to highlight GNU and FSF's role so that the moral issue of “freedom” is not forgotten in the rush to marketplace adoption (Stallman 1997).

The benefits of openness are not limited to the development of software. The Internet Engineering Task Force (IETF) and World Wide Web Consortium (W3C) host the authoring of technical specifications that are publicly available and implemented by applications that must communicate interoperably over the Internet. For example, different Web servers and browsers should be able to work together using the technical specifications of HTML, which structures a Web page, and HTTP, which is used to request and send Web pages. The approach of these organizations is markedly different from the “big S” (e.g., ISO) standards organizations, which typically predicate membership on nationality and often only provide specifications for a fee. This model of openness has extended even to forms of cultural production. For example, the Creative Commons provides licenses and a community for supporting the sharing of texts, photos, and music.

2.2 Openness, Voluntariness, and Remoteness

Organizations can be characterized along numerous criteria including size; public versus private ownership; criterion for membership; beneficiaries (cui bono); Parsons' four societal goals; Hughes' voluntary, military, philanthropic, corporate, and family types; and Thompson and Tuden's decision making strategies, among others (Blau and Scott 1962:40). I wish to explore the characterization of openness and voluntariness of communities where much of the work is done remotely (over the Internet).

2.2.1 Openness

My previous definition of an open community is an extensional definition: enumerating the communities of Free/Open Software (F/OS), open standards and content. This definition is unsatisfactory: the enumeration pertained to their products, not their social organization. Private firms do release free software under the GPL license; while subsequent users may modify such software, this example tells us little about how work is done “in the open.”

The approach of Tzouris  (2002) was to borrow from the literature of “epistemic” communities so as to provide four characteristics of “free/open” communities:

(1) Shared normative and principled beliefs: refers to the shared understanding of the value-based rationale for contributing to the software.

(2) Shared causal beliefs: refers to the shared causal understanding or the reward structures. Therefore, shared causal beliefs have a coordinating effect on the development process.

(3) Shared notions of validity: refers to contributors' consensus that the adopted solution is a valid solution for the problem at hand.

(4) Common policy enterprise: refers to a common goal that can be achieved through contributing code to the software. In simple words, there is a mutual understanding, a common frame of reference of what to develop and how to do it. (Tzouris 2002:21)

However, for my purposes these are slightly over-determined in that I'm interested in how newcomers are introduced to the present consensus and how it is arrived at.

Consequently, I shall venture my own definition. Yet, I shall not attempt a myopic definition of perfect openness or democracy in social organization. Very few organizations have completely homogeneous social structures. As I wrote in Why the Internet is Good: Community Governance That Works Well (Reagle 1999), even an organization like the IETF with the credo of, “We reject kings, presidents and voting. We believe in rough consensus and running code,” has explicit authority roles and informal elders. Consequently, in the following definition of open communities there is some room for contention. (I've always considered a characteristic of an open community to be the continual self-reflexive consideration of what it means to be open on difficult boundary issues.) An open community delivers or demonstrates:

  1. Open products: provides products which are available under licenses like those that satisfy the Open Source Definition.

  2. Transparency: makes its processes, rules, determinations, and their rationales available.

  3. Integrity: ensures the integrity of the processes and the participants' contributions.

  4. Non-discrimination: prohibits arbitrary discrimination against persons, groups, or characteristics not relevant to the community's scope of activity. Persons and proposals should be judged on their merits.  Leadership should be based on meritocratic or representative processes.

  5. Non-interference: the linchpin of openness; if a constituency disagrees with the implementation of the previous three criteria, they can take the products and commence to work on them under their own conceptualization without interference. While “forking” is often complained about in open communities (it can create some redundancy/inefficiency), I have argued, and continue to argue, that it is an essential character and major benefit of open communities as well.

While I've now provided an intensional definition, it is specified with respect to the mostly technical communities I've already mentioned. While it might be relevant to other communities, my focus and usage of the term is determined by my examples.

2.2.2 Voluntariness

In addition to the models of organization referenced by Blau and Scott (1962), Amitai Etzioni (1961) describes three types of organizations:  "coercive" organizations that use physical means (or threats thereof), "utilitarian" organizations that use material incentives, and "normative" organizations that use symbolic awards and status. He also describes three types of membership: “alienative members” feel negatively towards the organization and wish to leave, “calculative members”  weigh benefits and limitations of belonging, and “moral members” feel positively towards the organization and may even sublimate their own needs in order to participate (Etzioni 1961).

As noted by Jennifer Lois (1999:118), normative organizations are the most underrepresented type of organization discussed in the sociological literature. Even so, Etzioni's model is sufficient such that I define a “voluntary” community as a “normative” organization of “moral” members. I adopt this synonymous definition not only because it allows me to integrate the character of the members into the character of the organization, but also to echo the importance of the sense of the collaborative “gift” in discussions among members of the community.

However, as with openness, it is difficult to draw a clear line: one cannot exclusively locate all open communities and their members within the “normative” and “moral” categories, though they are dominant. Many members of open communities are volunteers, either because of a “moral” inclination and/or an informal “calculative” concern with a sense of satisfaction and reputation. While the FLOSS survey concluded “that this activity still resembles rather a hobby than salaried work” (Ghosh et al. 2002:67), 15.7% of their sample declared they do receive some remuneration for developing F/OS. Even at the IETF and W3C, where many engineers are paid to participate, it is not uncommon for some to endeavor to maintain their membership even when they are not employed or their employers change.

2.2.3 Remoteness

Much of the work done by the open communities I've introduced is done remotely: many of the participants are not in the same geographical location and have few opportunities for face-to-face (F2F) interaction. In fact, since the character of much of the work is technical and the participants are working on networking and collaborative technology, they have the opportunity to “eat your own dog food,” wherein one should actually use what one is working on.

Unlike openness and voluntariness, where open communities are often contrasted as an opposite to strictly utilitarian organizations (e.g., private firms), remote collaboration is not a characteristic of opposites, but of degree. Tzouris (2002:20) cites a number of references that claim, “Free/open source software contributors comprise communities that have many characteristics in common with Communities of Practice . . . groups of people informally bound together by shared expertise and interest.” Because of a lack of resources (institutional and monetary) open community participants are often constrained with respect to travel. Consequently, while open communities are an excellent case study, that which is learned there or in the firm can often be applied to either context. For example, Nardi and Whittaker “theorize that people create social 'fields' within which communication can take place” (Nardi and Whittaker 2002:84): the creation of these communicative zones is a critical socialization factor, but the zones tend to degrade and need to be maintained, which is aided by physical interaction. I expect that the need to maintain communicative zones holds regardless of the type of community; even open communities recognize the importance of some members meeting at F2F events so as to develop a coherent plan for the future or fix inter-related software errors at a “bug squashing festival.”

What is interesting about open communities is the degree to which they've been able to extract benefit and minimize cost from the constraint of remoteness.

2.3 Open Communities

I define an open community as one that is both open and voluntary and in which much of the work is done remotely, as discussed. Remote collaboration is a contextual variable to many of the socialization features I discuss, but the intent of this paper is to focus on the variables of openness and voluntariness in that context. The openness of the community is perhaps dominant in describing the character of the organization, though the voluntariness is critical to understanding the moral/ideological light in which many of the members view their participation.

3 Open Communities and Socialization

This section considers a number of socialization features in the context of “How does the open and voluntary nature of open communities relate to that feature?” While I've separated the features into separate sections for purposes of exposition, they are not discrete in practice. Consequently, I've attempted to present them with an understanding that they are entangled, rendering the concepts progressively and identifying crucial intersections with other features when appropriate. Furthermore, some socialization features are accompanied by an additional brief set of questions that might be addressed in future research.

3.1 Motivation

Given the recent excitement around open communities, the question of what motivates their members has been the predominant question of investigation to date. A well known maxim of Eric Raymond (1997), a prominent F/OS philosopher and developer, is that, "Every good work of software starts by scratching a developer's personal itch." Linus Torvalds, Linux's developer, is well known for his light-hearted approach, “I want to have fun in my life. Not fun like parties . . . I want do something that matters. The fact that it matters to a lot of people gives it meaning to me. I want to be the best I can be, and I don't plan to join the Army! [Laughs.]” (Simon 1964)

Does research back up such anecdotes? A recent Master's thesis surveyed much of the literature across multiple disciplines: “In essence, it was found that out of the thirty-six (36) articles, that were considered overall, 33.3%, that is 12 articles, are using theory of communities, 27.7%, that is 10 articles, are using motivational psychology, 13.8%, that is 5 articles, are using governance structure, 13.8%, that is 5 articles, are using economics, and 11%, that is 4 articles, are using gift economy or some other theory” (Tzouris 2002:18). While Tzouris refused to draw any conclusions with respect to primary motives (instead, he examined the differences inherent in each discipline's approach and argued for a totalized understanding), he did cite research that found that “scratching your itch” is an important immediate benefit. Subsequent research within the past year has found that “enjoyment-based intrinsic motivation, namely how creative a person feels when working on the project, is the strongest and most pervasive driver” (Karim and Wolf 2003), and that engagement is determined by “pragmatic motives to improve [their] own software” (Hertel, Niedner, and Herrmann 2003). In fact, the preference of engineers for interesting and challenging work was identified more than forty years ago (Becker and Carper 1956b:344). Of course, self-satisfaction is not the only motive: “We also find that user need, intellectual stimulation derived from writing code, and improving programming skills are top motivators for project participation” (Karim and Wolf 2003).

Von Krogh, Spaeth, and Lakhani (2003) borrow Charles Tilly's concept of “scripts” in collective action to define joining as, “a behavioral script that provides a structure for the activity of becoming a member of a collective action project.” While I shall return to the socialization feature of joining below, I will note that an extremely common response “script” to requests (for features or corrections) is to “fix it yourself” as evidenced in this unhappy message from someone who asked about a bug:

We all understand that all the cygwin people are busy. We are all busy too. I think it is totally improper to respond to unpleasant bug reports like this with "You don't like it, fix it yourself!" I certainly don't expect you to learn how to build my open source software. I expect you to let me know if I've released something broken. And since my software is a building block that you rely on, I'll try to fix it asap. Or, I'll at least tell you that some part is broken, and not to use it. (Wampler, Bruce 2000)

While the effects of motivation on socialization may not be immediately apparent, the dominant motives of the existing community, whatever they might be, might be projected onto newcomers: if an established member joined because she wanted to “scratch her own itch”, she might expect the same of newcomers.

Future Research

As a member becomes established within a community, how does the expression of motivation change? Do members shift their expressed motives to reasons other than “scratching an itch” as their involvement increases?

3.2 Structure

Before one can even examine how newcomers come to join and learn about a community, one might consider what is at stake. In Socialization to Heroism: Individualism and Collectivism in a Voluntary Rescue Group, Jennifer Lois (1999) examines the processes by which newcomers are socialized to “Peak”, a volunteer mountain-environment search and rescue group. The training required significant investment on the part of the expert and newcomer. However, because of the initial asymmetry of expertise and time, the physical danger, and the “heroic” reputation associated with the group, established Peak members feared "free riders" and “predatory behavior” (Lois 1999:126). Consequently, there was an explicitly articulated structure of status/role: rescue member, mission coordinator, support, and candidate (Lois 1999:119). In addition to learning and demonstrating the necessary skills, members “were socialized to develop a consciousness that showed their lack of self-interest by downplaying arrogance and egoism by displaying humility and respect” (Lois 1999:121). Until then, “Peak kept members at the fringes until they had proved themselves by demonstrating their internalization of the group's most crucial norms and values. Schein (1968) termed these essential elements for acceptance 'pivotal norms and values'; Caplow (1964) identified them as constituting the 'normative system'” (Lois 1999:121). The lack of a friendly reception at Peak meetings demonstrates a purposeful equating of social boundaries with technical boundaries.

In the open communities with which I am concerned, no one's life is at stake. Even so, “joining a developer community may not be costless. Software development is a knowledge-intensive activity that often requires very high levels of domain knowledge, experience, and intensive learning by those contributing to it” (von Krogh, Spaeth, and Lakhani 2003:3). Furthermore, "Fichman and Kemerer (1997) found that in commercial software development, complex technologies can erect significant barriers of understanding and contribution, to both users and developers of the software, and the integration of newcomers can be arduous" (von Krogh, et al. 2003:3). As I will show in Joining (section 3.3), newcomers improve their acceptance by demonstrating pre-existing skill; in Learning (section 3.4) I show how experts attempt to document their experience for easier acquisition; and in Role and Attribution (section 3.7) I will consider the effects of explicit role designation and attribution.

In addition to the intentional structuring, or at least the structuring that can be rationalized a posteriori by the membership, there are other emergent structural characteristics. Von Krogh et al. (2003:14) found that in their project of study, “Participation in the development list was highly concentrated with four individuals, all of them developers, or 1.1% of the population accounting for 50% of the e-mail list traffic.” Many communities exhibit “scale-free” or “power-law” (Barabási 2002) tendencies: the amount of some variable (e.g., connections, wealth, or, in this case, sent email) is inversely proportional to one's rank. For example, the person in the second position (N=2) has half (½) as much of the variable as the first ranked, the third ranked has one third (1/3) as much as the first, etc. This structure also seems apt for describing the active portion of Linux kernel developers, “Note that approximately 2% of the contributors contributed more than 50% of the messages, an indicator of the differentiated role structure and contribution profile of the community.” (Moon and Sproul 2002:395)
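The inverse-rank rule described above can be made concrete with a small calculation. The following is only an illustrative sketch, not a method used in any of the cited studies: it assumes an idealized community in which the person at rank r contributes exactly 1/r as much as the top-ranked person, and the function names (rank_shares, top_share) and the population size of 364 are hypothetical choices made for the example.

```python
from fractions import Fraction


def rank_shares(n):
    """Share of total activity held by each rank, 1..n.

    Assumes the idealized rule from the text: the person at rank r
    contributes 1/r as much as the person at rank 1.
    """
    weights = [Fraction(1, r) for r in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_share(n, k):
    """Fraction of all activity accounted for by the top k of n ranks."""
    return float(sum(rank_shares(n)[:k]))


if __name__ == "__main__":
    # Three participants: weights 1, 1/2, 1/3, which sum to 11/6,
    # so the shares are 6/11, 3/11, and 2/11.
    print(rank_shares(3))

    # Hypothetical list of 364 participants (an assumed figure for
    # illustration): share of traffic attributed to the top four.
    print(top_share(364, 4))
```

Under this strict 1/rank assumption the top few ranks hold a large but not overwhelming share of the activity, so a community in which about 1% of participants account for half of the traffic would be more concentrated still than the idealized rule implies.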

In both the Peak community and open technical communities there are boundaries to entry. However, in the open technical community the transition can be quickly accommodated. In the case of Peak, regardless of one's expertise, it takes time to demonstrate one's skill and to be trusted by established members. However, if the contributor's skill is substantial and the technical “contribution barrier” is low (via the ease and modularity of the software architecture and code and the choice of the computer language (von Krogh, et al. 2003:24)), it is possible to demonstrate one's skill quickly, as indicated in the next section.

Future Research

How flexible is the structure of voluntary groups given the variables of membership rank/status, longevity, and the amount of work done? To what extent are newcomers cognizant of the structure? Is the ability for newcomers to quickly achieve prominence in an open technical community an important characteristic for recruitment? However, when there are intransigent incumbents, I've noted that some newcomers easily get frustrated and leave a community when they recognize the scale-free character of the community and an inability to immediately have an effect.

3.3 Joining

If someone is motivated and ready to participate in an open community, how do they join? In a study of the KDE (K Desktop Environment) community, Andreas Brand (2003) reported that involvement was an informal process. (KDE is a project that provides a suite of applications and a windowing environment, much like Windows originally provided for DOS.) Brand writes that knowledge of a KDE project by newcomers was obtained via discussions with friends or reading articles. Developer status could be conferred (within 6 to 12 months) by sending small source code changes. Entry was voluntary, transition was meritocratic/reputation based, and voluntary exit occurred when the contributor had to focus his or her attention on other responsibilities within the project (a new role) or outside (family or work). Forced exit occurred when important conventions were violated, such as the deletion of code from the community repository without authority or consensus (Brand 2003).

Von Krogh, Spaeth, and Lakhani (2003) studied the Freenet community in order to address the question of joining. The goal of Freenet is to provide an extremely robust peer-to-peer network with characteristics resistant to censorship: a high capacity for delivery, no centralized control, and no fixed location for the storage of information. Von Krogh et al. conducted 13 semi-structured telephone interviews and analyzed approximately 11,200 emails and the management of 54,000 lines of software code to characterize the process of joining the Freenet community. As already described in Motivation (section 3.1), von Krogh et al. rely upon the idea of a “script” to categorize the approach to joining a community. They offer four propositions from their research:

  1. Participants behaving according to a joining script (level and type of activity) are more likely to be granted access to the developer community than those participants that do not follow the project's joining script. (von Krogh et al. 2003:21)

  2. In an evolving software architecture of open source software projects, contribution barriers of modules (modifying and coding, variation in computer language, plug-in, and independence) are related to the specialization of newcomers. (von Krogh et al. 2003:28)

  3. Feature gifts by newcomers are related to their specialization in open source software projects. (von Krogh et al. 2003:29)

  4. In open source software projects, feature gifts by newcomers emerge from the newcomers' prior domain knowledge and user experience. (von Krogh et al. 2003:29)

These propositions are validated by specific empirical data that further elucidate the properties of successful joining scripts. For example, in terms of the initiating contact, “A high 78% of the population of development list participants attempted to initiate dialog, via starting a new thread, at least once. Of these attempts only 29 (10.5%) participants did not receive any reply to their initial posting and subsequently did not appear on the developer list again” (von Krogh et al. 2003:14). And in the context of that initiation, ". . . no joiner started out by unsolicited 'new' technical suggestions, perhaps indicating that it might be wise to start out humbly and not to boldly announce 'great ideas' for solving problems. In fact none of the 12.3% who suggested technical solutions without accompanying software code in their first post were joiners" (von Krogh et al. 2003:18). Introducing oneself with an actual contribution, even if minor, is commonly noted as being very welcome by the community.

Future Research

Lurkers are an interesting phenomenon of open communities; they are the submerged portion of the iceberg, those who read or watch the community but do not choose to participate or represent themselves. When people first communicate with the community they often precede their comment with, “I'm just a lurker but . . . .” In an open Working Group that I chaired, the email list was subscribed to by hundreds of lurkers who never posted, seemingly commensurate with the less active “tail” of a scale-free network. What effect does this massive but unstudied constituency have on the life of an open community? How long do members lurk before they post, and what prompts them to do so? Furthermore, are there different types of lurkers? Are some simply new and shy? Are others sufficiently experienced in other open communities to know that absent a commitment to contribute (i.e., a successful instance of a joining script) they'll have little effect and need not bother until they do?

3.4 Learning

The critical component of socialization is for newcomers to learn the culture/norms of a community so that they can successfully contribute to the products of the community, or even its culture. How does the open and voluntary nature of open communities relate to learning?

In the case of the Peak rescue group, learning is achieved through face-to-face, time- and attention-intensive training and practice. The most immediate effect of both remoteness and openness in on-line communities is the ready availability of community information: it should be documented somewhere. At the W3C we would often respond to a question with, “It's on the Web.” Of course, as the Web grew and finding information became more difficult, this statement took on a more humorous connotation.

One of the oldest open community maxims I'm familiar with is RTFM: “Read The Fucking/Fine Manual.” The Google Groups archive is an extensive collection of messages from Usenet, a pre-Web bulletin-board-type system on the Internet, from 1981 to the present. The first instance of “RTFM” usage in the archive is a 1983 message referring to the VMS operating system community:

Try looking in the Master Index to the VMS volume set. There, under BACKUP, you will see a pointer off to appendix B of manual 4A. This appendix describes the BACKUP tape format in some detail. The VMS people have a cute little piece of advice for people who are too slug-headed to read the manuals: RTFM. (Reason 1983)

If one wants a definition of RTFM, one recursively applies the principle of reading before asking: the Jargon File contains definitions for many “hacker” terms and acronyms:

RTFM /R-T-F-M/ imp. [Unix] Abbreviation for `Read The Fucking Manual'. 1. Used by gurus to brush off questions they consider trivial or annoying. Compare Don't do that then!. 2. Used when reporting a problem to indicate that you aren't just asking out of randomness. "No, I can't figure out how to interface Unix to my toaster, and yes, I have RTFM." Unlike sense 1, this use is considered polite. See also FM, RTFAQ, RTFB, RTFS, STFW, RTM, all of which mutated from RTFM, and compare UTSL. (Raymond 2000)

The list of acronyms at the end of the definition indicates many of the potential sources of information from which to learn, and differing levels of rebuke to the seeker, including source code (RTFS) – Use The Source Luke (UTSL) is the preferred Star Wars inspired variant – the Web (STFW), and the FAQ (RTFAQ).

The Frequently Asked Questions (FAQ) document genre is also rooted in the history of Usenet. Given its prodigious growth (Gray 1996), the Internet has the novel characteristic that, for nearly all of its history, half of its members have been “new” within a given year. To respond to the constant flood of repeated questions, Usenet newsgroups began to compile documents of frequently asked questions and answers, which were posted to the newsgroup every few weeks. However, FAQs found their true medium in the hypertextual Web. As the Web grew in the 1990s, so did the scope of content captured in FAQs: there are now tens of thousands of FAQs on as many topics. Notably, even FAQs from technical communities address questions of social norms and values. As noted by Moon and Sproul, the Linux Kernel Mailing List FAQ states that, "A line of code is worth a thousand words. If you think of a new feature, implement it first, then post to the list for comments" (Moon and Sproul 2002:396). This corresponds to the analysis of joining scripts by von Krogh et al. (2003:18) as noted in the previous section. Actions can speak louder than words.

Finally, no one comes to an open community as a blank slate, though members of on-line communities can be quite young. What are the effects of previous professional socialization? Many open source contributors do have proprietary jobs that preceded their open community experiences: “All the respondents have been contributing to FS/OSS for less time than they have been programmers. In particular, 16 have been contributing to FS/OSS for a period between 2 and 8 years, while only 2 have been contributing for more than 8 years” (Tzouris 2002:30). And while students “play also a significant role in the community, . . . project performance and leadership is primarily a matter of professionals” (Ghosh et al. 2002).

Future Research

In addition to the Jargon File and FAQ, there is the HOWTO document genre, and there are conventions for collaboratively editing Web pages (e.g., Wiki), on-line chatting, and software development. What can be learned from the history and differences of these genres with respect to socialization?

To what extent does experience in one community translate to others? For example, does the fact that people often work on more than one project mean they consequently transfer effective norms between communities? When a community forks, how many of the social norms are retained?

And, as asked by von Krogh et al. (2003:33), given that open source software is voluntary and there is often a high degree of turnover, does the transparency of the development process outweigh the loss of generational experience?

3.5 Goals

The formality of a charter of an open community is often dependent on the character of the leadership and the size of the community. At the birth of an open community, the project often starts small and the operative goals (“what an organization is actually trying to do”) precede official goals (“formal statements”) (Perrow 1961). While these statements might also be said of any community, the transparency of an open community makes it difficult for these two types of goals to significantly diverge or be misrepresented.

Communities that need focus and cohesion often benefit from explicit charters (Blau and Scott 1962:7). Requirements for cohesion tend to arise because of the natural preference of the leadership, the large size of the community, or the community's origin as a fork from an existing (unsatisfactory) organization, in which case it wishes to clearly articulate its dissatisfactions and differences. Smaller projects (a handful of people) might simply share an informal charter based on the previous direction of the project's creator, correspondence on an email list, and future expectations listed in a “TO-DO” document. Any such documentation obviously serves as a useful mechanism of socialization for newcomers.

The Linux kernel development community is very large and includes both commercial and voluntary participants (Moon and Sproul 2002). However, because of its strong “elder” leadership structure (Reagle 1999), it forgoes an explicit charter. In fact, Linus Torvalds, its originator and leader, tends to disavow any explicit goals beyond that which is considered to be good, useful, or elegant to an engineer's sensibilities:

There isn't a final goal -- the goal changes constantly. A good example is doing multiprocessor support. Four years ago I thought it was too expensive and normal people wouldn't have access to it; that there's no point. I wasn't very interested in it. So the first SMP versions of Linux came out without any help from me at all. Then I started to be very interested indeed because the systems are very cheap these days and it's technically a very challenging area . . . I decided it was what I wanted to do. So that's what I've been working on. It's not just me, however. The fact that SMP was done without me is an example of somebody else who had a different priority. (Linux Magazine 1999)

The interesting point to note here is that the development model of the open community (Raymond 1997) mitigates the need for a priori specification of requirements and their resourcing. If coordination and integration costs are sufficiently low, new features and new directions can easily be adopted. Consequently, the newcomer can have a significant effect on the direction of the community and need not merely absorb its existing social norms.

Future Research

Transparency does afford many opportunities to view the ways in which organizational goals are affected by “interpersonal differences” and discussion about the direction of the project (“multiple criteria”) (Simon 1964). Often organizations attempt to shield newcomers from horizontal (unrelated groups) and vertical (high level management) disputes. To what extent does the visibility of such disputes affect the recruitment or integration of newcomers? For example, do newcomers first hear about a project because of a much discussed flame war, or are they turned off by it? If flames attract newcomers, do they attract the sort of members who are good for the community: do those members adopt this form of interaction as the norm?

3.6 Group and Self Identity

In her study of the Peak mountain rescue group, Lois found that, “Because members' conformity is more important than in low-risk organizations, their socialization experiences are more explicit” (Lois 1999:119). This led to the requirement that members deny the self and orientate towards the group (Lois 1999:121). Members were not encouraged to speak to the press or demonstrate “excessive autonomy.” In particular, members should “Avoid 'basking in the reflected glory' of the group (BIRGing) as found by Cialdini et al. (1976)” (Lois 1999:122). Consequently, “New members were made to feel like outsiders. Peak tightly guarded the camaraderie and excitement from them until they had proved their willingness to participate” (Lois 1999:129).

However, in the low-risk communities with which I am concerned, I posit there's little benefit to excluding others out of fear of “BIRGing”. In the Peak case, the risk was of free-riders claiming membership to a larger outside community, “Indiscriminately granting heroic status to anyone who volunteered could dilute the prestige of membership” (Lois 1999:132). In open technical communities, because of their openness and technical character, membership does not automatically grant a member status within a larger community: local townsfolk neither know nor care who is a member of the local GNU/Linux user group, even if they have benefited from its members' contributions on their home PCs. Instead it is status within the group that is likely to be more important.

Still, there is a strong individualistic/libertarian bent to many on-line cultures and excessive ego can cause tensions in any community. Mosfet, a productive though controversial developer of KDE desktop software, was often disassociated from the larger developer community, and I noted with amusement the prominence of the “I” on his Web site, “Click here to read about who I am and what I have done for Unix/Linux free systems” (Mosfet 2003). As already noted in the discussion of Free Software and Open Source (section 2.1) communities, the leadership styles and their articulation of self can vary widely, from the self-effacing diplomat to the dogmatic ideologue. As noted by Lois (1999:119), the “heroic concept” tends to be limited to voluntary organizations, and it is in evidence in open technical communities, though its scope tends to be limited to its immediate community.

Future Research

Lois provides an example of the paradox of the “hero” in the Peak context, “Goode, however, pushes this relationship further by revealing the underlying paradox of heroism: Subsuming the self to the group results in exalted individual status” (Lois 1999:134). The status of “hero” extends outside of the Peak rescue group. Must one make a distinction in the open technical community context between global heroes (e.g., a handful of celebrities such as Richard Stallman (FSF), Eric Raymond (OSI), Bruce Perens (Debian), and Tim Berners-Lee (Web)) and local heroes to understand the character of this status? Do these global celebrities or local (community specific) heroes affect the aspirations or behavior of newcomers?

In what ways are new members of open communities expected to alter their sense of individual self, and relate to a group identity?

3.7 Role and Attribution

In Structure (section 3.2) I focused on a community's structural characteristics and “entry barriers.” Of course, entry is only the first of many possible transitions within a community. Within the context of a coherent member identity and organizational structure, newcomers learn the skills necessary to obtain and transition between roles: “Roles describe specific forms of behavior associated with given tasks; they develop originally from task requirements” (Katz and Kahn 1978:37). Roles might be (1) formally specified and understood by members of the community, (2) implicitly/informally understood by members of the community, or (3) defined and attached by external researchers.

In Katz and Kahn's model (1978:180) a role is a recurrent activity; an office is a location within the organizational space responsible for one or more roles. This presumes a requirements-based, task-driven organization with specific positions and titles. As I've noted, in open communities these presumptions do not necessarily hold. Granted, in voluntary communities, acknowledging the contributions of members, via attribution, is one of the few incentives the community has for its membership. However, the relationship between individualistic/egoistic motivation, distaste/disdain for formal titles, and the mostly internal orientation of status (versus BIRGing in a larger community) is complex.

One of the most important activities/capabilities in open technical communities is that of CVS write access, or “check-in.” CVS (Concurrent Versions System) permits contributors to write to the shared repository of community products, usually software, but some organizations (such as the W3C) use it for managing documents as well. CVS access is an important variable of status and analysis. Those with CVS access often mediate changes to the source code for those who do not: a contributor sends a “patch” with modifications to existing code, and the developer with CVS access reviews this patch before applying it within CVS.

CVS's importance is reflected in von Krogh et al.'s (2003:9) definitions of the roles of newcomer, joiner, and developer within the Freenet community. Not being granted CVS access can become a source of complaint or forking. Recently, the Cygwin port of the XFree86 windowing system was forked when requests for mediated CVS changes, and then requests for direct CVS access, were not responded to productively:

It seems that David Dawes has made his decision not to let the Cygwin/XFree86 project commit patches directly to the CVS tree; he doesn't seem to want to go on record saying this, but he has written several messages during this discussion about ftp servers, cvs servers, etc., while making no attempt to provide a direct answer to my question. Had he intended otherwise he could have easily said, "Gotta wait until the next board meeting", or "Sorry, server died, gonna have to wait a couple days". I can only interpret his ignoring the issue as a passive refusal to grant Cygwin/XFree86 CVS commit access. (Hunt 2003)

One of the most visible representations of status within on-line communities is to have an email address that originates within the community itself. For example, KDE's Applying For A KDE CVS Account document speaks to the privilege and responsibility of such an attribution, “KDE email addresses are not granted so easily anymore, as too many people have ranted with a KDE address and other people thought that it was the official position of the KDE team” (KDE Webmaster 2003).

While many communities are developer focused (those that write actual software code), others, such as the KDE desktop community, happily acknowledge the importance of GUI and icon designers, translators, and documentation writers. Even for Linux kernel development, “. . . Torvalds acknowledged a contributor for compiling the Linux information sheet (Torvalds, 1992, January 9), and stated that one did not 'need to be a *kernel* developer to be on the credits list' (Torvalds, 1993, December 21).” (Moon and Sproul 2002:393).

In my experience, role definitions and transition policies are not as explicit or regimented in open communities as in other types of organization. Few members bandy about a title and transitions tend to be an informal decision made by those already in the leadership positions. (Though larger or forked communities might be the exception: the Debian GNU/Linux distribution has a formal constitution and election process (Debian 2003).) I expect much of this has to do with the voluntary nature and the lower costs of entry and transition. I doubt it is because there are fewer roles than in a proprietary software development context, and wonder if formal “labels” are unimportant and, instead, people know you by what you've done. In fact, the lack of titles might be a paradoxically important community boundary. Lois notes that in heroic communities there is the paradox that, “Subsuming the self to the group results in exalted individual status” (Lois 1999:134). In open technical communities, one need not subsume one's self, but if someone doesn't know who you are based on what you've done, then they are clearly the outsiders.

Future Research

How many open communities have explicit charters that include definitions and policies for role transition? Is there support for my hypothesis that forked or large communities are those most likely to have explicit charters? How does the explicitness of a community affect the aspiration and orientation of members? What makes members feel that the community is effective and how does this relate to their personal sense of reward?

4 Conclusion and Exit

In this paper I've attempted to synthesize existing literature and community practice so as to derive a number of questions for future research. In looking at the socialization features of motivation, structure, joining, learning, goal setting, identity, and roles and attribution I've posited a few novel characteristics of these communities that have interesting implications for socialization. In summary:

  1. Many open technical communities are characterized by significant growth, in addition to the high turnover typical of voluntary organizations. Seemingly, there are always “newbies.”

  2. While considered aberrant by some, the action of “forking” is critical to the very conception and life of open communities.

  3. Unlike other voluntary organizations such as the Peak rescue group, and aside from a handful of celebrities, much of the status derived from participation is orientated within the community itself.

Furthermore, I posed my questions for future research without much reference to how they might be answered. Fortunately, another interesting characteristic of researching on-line communities is the wealth of explicit data available for analysis. Von Krogh et al. (2003) provide an example of how one can use email archives and CVS repositories to yield insightful analysis. Two other potential research grounds are SourceForge and Wikis.

SourceForge is an open source development service for F/OS projects; it provides resources such as Web hosting, CVS repositories, email lists and archives, and bug/issue tracking (SourceForge 2003). While some researchers have already begun studying SourceForge for work unrelated to socialization (Stewart and Gosain 2003), it might help answer my question of cross-community participation: one could determine how many open source projects a SourceForge member belongs to and the nature of her multiple commitments, and look for evidence of these members spreading norms between projects.
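As a minimal sketch of the first step in such an analysis, suppose one has already assembled a list of (member, project) pairs. The sample data and function names below are hypothetical illustrations, and the sketch does not rely on any actual SourceForge interface. Under those assumptions, members with multiple commitments, and the pairs of projects they bridge, might be tallied as follows:

```python
from collections import defaultdict
from itertools import combinations


def projects_by_member(memberships):
    """Group (member, project) pairs into a member -> set-of-projects map."""
    grouped = defaultdict(set)
    for member, project in memberships:
        grouped[member].add(project)
    return dict(grouped)


def multi_project_members(memberships):
    """Return only the members who belong to more than one project."""
    grouped = projects_by_member(memberships)
    return {m: p for m, p in grouped.items() if len(p) > 1}


def shared_member_counts(memberships):
    """Count, for each pair of projects, how many members they share."""
    counts = defaultdict(int)
    for projects in projects_by_member(memberships).values():
        # Sort so that each pair of projects has one canonical ordering.
        for pair in combinations(sorted(projects), 2):
            counts[pair] += 1
    return dict(counts)


# Hypothetical sample data: (member, project) pairs.
sample = [
    ("alice", "editor"),
    ("alice", "compiler"),
    ("bob", "editor"),
    ("carol", "compiler"),
    ("carol", "editor"),
    ("carol", "mailer"),
]

bridging_members = multi_project_members(sample)
project_overlap = shared_member_counts(sample)
```

Project pairs with many shared members would be natural places to look for norms travelling between communities, though membership overlap alone says nothing about whether norms are actually transmitted.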

As described on the What is Wiki page, “Wiki is a piece of server software that allows users to freely create and edit Web page content using any Web browser. Wiki supports hyperlinks and has a simple text syntax for creating new pages and crosslinks between internal pages on the fly” (Wiki 2002). In researching a Wiki community one might consider the timidity with which newcomers approach the task, the roles various persons take in editing the documents, and how conflicts arise and resolve.

Finally, because socialization is focused on member acquisition and orientation I've largely ignored an interesting sociological feature: exit. While some of the authors I've cited have commented on the testy circumstances prompting member exit, I suspect it's a complex and interesting phenomenon for open communities. In his book Exit, Voice, and Loyalty, Albert Hirschman notes that, “Organizations and firms producing public goods or public evils constitute the environment in which loyalist behavior (that is, postponement of exit in spite of dissatisfaction and qualms) peculiarly thrives. . . In the case of public good . . . one continues to 'care' as it is impossible to get away from it entirely” (Hirschman 1972:104). I expect many open community participants feel as if their contributions are a public good, and that Hirschman's model of exit (leaving the organization), voice (raising concern about the organization), and loyalty (postponing exit and invigorating voice for an additional threshold) might provide a good model for the phenomenon of forking (group exit). And, just as newcomers learn from their predecessors how to join and negotiate a community, they also might develop a sense of the appropriate conditions and conventions for productive exit.

In an episode of Seinfeld the friends gossip about a “bad breaker-upper” and agree that the end of the relationship is the most important part. Unlike in the dangerous circumstances of Peak where exit can be “permanent,” those that leave an open community may move on to a new community or even return: they will bring with them their whole range of socialized experience, from entry to exit.

5 Bibliography

"Socialization." 2002. Craig Calhoun (ed.). Pp. 447-448 in Dictionary of the Social Sciences. New York, NY: Oxford University Press.

Barabási, Albert-László. 2002. Linked: The New Science of Networks. Cambridge, MA: Perseus Publishing.

Becker, Howard, James W. Carper. 1956. "The Elements of Identification with an Occupation." American Sociological Review 21(3):341-348.

Blau, Peter, W. Richard Scott. 1962. Formal Organizations: A Comparative Approach. New York, NY: John Wiley.

Brand, Andreas. 2003. The Structure, Entrance, Production, Motivation and Control in an Open Source Project.

Wampler, Bruce. 2000. Supporting Open Source software.

Debian. 2003. Constitution for the Debian Project (v1.2).

Etzioni, Amitai. 1961. Modern Organizations. New York, NY: Free Press of Glencoe.

Ghosh, Rishab, Ruediger Glott, Bernhard Krieger, Gregorio Robles. 2002. Free/Libre and Open Source Software: Survey and Study.

Gray, Mathew. 1996. Internet Statistics: Growth and Usage of the Web and the Internet.

Hertel, Guido, Sven Niedner, Stefanie Herrmann. 2003. "Motivation of Software Developers in Open Source Projects: An Internet-based Survey of Contributors to the Linux Kernel." Research Policy 32:1159-1177.

Hirschman, Albert. 1972. Exit, Voice, and Loyalty. Cambridge, MA: Harvard University Press.

Hunt, Harold. 2003. Cygwin/XFree86 - No longer associated with

Lakhani, Karim, Bob Wolf. 2003. "Why Hackers Do What They Do: Understanding Motivation and Effort in Free/Open Source Software Projects." Sloan Working Paper 4425-03:1-28.

Katz, Daniel, Robert L. Kahn. 1978. Social Psychology of Organizations. New York, NY: John Wiley & Sons.

KDE Webmaster. 2003. Applying For A KDE CVS Account.

Linux Magazine. 1999. The Linux Interview.

Lois, Jennifer. 1999. "Socialization to Heroism: Individualism and Collectivism in a Voluntary Rescue Group." Social Psychology Quarterly 62(2):117-135.

Moon, Jae, Lee Sproul. "Essence of Distributed Work: The Case of the Linux Kernel." 2002. Pamela Hinds and Sara Kiesler (eds). Chapter 16 in Distributed Work. Cambridge, MA: MIT Press.

Mosfet. 2003. Internet Archive's Way Back Machine Archive of Mosfet Home Page.

Nardi, Bonnie, Steve Whittaker. "The Place of Face-to-Face Communication in Distributed Work." 2002. Pamela Hinds and Sara Kiesler (eds). Chapter 4 in Distributed Work. Cambridge, MA: MIT Press.

Open Source Initiative. 1997. The Open Source Definition.

Open Source Initiative. 2003. History of the OSI.

Perrow, Charles. 1961. "The Analysis of Goals in Complex Organizations." American Sociological Review 26(6):854-866.

Raymond, Eric. 1997. The Cathedral and the Bazaar.

Raymond, Eric. 2000. The Jargon File 4.2.2.

Reagle, Joseph. 1999. Why the Internet is Good: Community Governance That Works Well.

Reason, Corot. 1983. Re: Wanted VMS BACKUP for UNIX.

Simon, Herbert. 1964. "On the Concept of Organizational Goals." Administrative Science Quarterly 9(1):1-22.

SourceForge. 2003. SourceForge.

Stallman, Richard. 1997. Linux and the GNU Project.

Stallman, Richard. 1996. Free Software Foundation.

Stallman, Richard. 1998. The GNU Project.

Stewart, Katherine, Sanjay Gosain. 2003. Impacts of Ideology, Trust, and Communication on Effectiveness in Open Source.

Tzouris, Menelaos. 2002. Software Freedom, Open Software and the Participant's Motivation - A Multidisciplinary Study. London, UK: London School of Economics and Political Science.

von Krogh, Georg, Sebastian Spaeth, Karim R. Lakhani. 2003. "Community, Joining, and Specialization in Open Source Software Innovation." Research Policy 32(7):1217-1241.

Wiki. 2002. What is Wiki.