Google Summer of Code 2010!


For this year's Google Summer of Code, I proposed an alternate project that follows up on the research for my master's thesis in Management of Technology. Should my proposal be accepted, it will be mentored by my thesis supervisor, Prof. dr. Michel van Eeten.

My proposal can be found here:

Hierarchy in Meritocracy: Community building and code production in the ASF, EF, and PSF

An extended abstract is shown below:

Hierarchy in Meritocracy: Community building and code production in the ASF, EF, and PSF

Abstract

The ASF explains the success of its communities and the software they produce by claiming that meritocratic principles and an organizational orientation towards software engineering through community building set them apart from other open source communities. The relevance of these claims, and therefore of institutions like Meritocracy, can be explained and better understood by analyzing them through an organizational model of open source.

Introduction

The Apache Software Foundation (ASF) explains the success of its communities and the software they produce by claiming that meritocratic principles and an organizational orientation towards software engineering through community building set its communities apart from other open source communities. The relevance of these claims, and therefore of institutions like Meritocracy, can be explained and better understood by analyzing them through an organizational model of open source. Such a model was developed by Van Wendel de Joode (2005): “The model is intended to serve as input to discussion and reflection and as a guide for further research on the subject.” The organizational model of open source contains eight design principles; for this project the principle of collective choice was chosen because of its relevance to decision-making. The model (Van Wendel de Joode, 2005) can help explain the relevance of claims that account for institutional success and the extent to which they affect code production.

An empirical research project is currently being conducted [1] to determine the influence of hierarchy (in Meritocracy) on code production in open source communities of the ASF and to compare it to similar community building institutions in other communities, such as those of the Eclipse Foundation (EF) and the Python Software Foundation (PSF). The theoretical aim of this research project is to evaluate the role of hierarchy (in Meritocracy) as an organizational institution within open source communities. The practical aim is to gain insight into the role of hierarchy (in Meritocracy) in community building and code production in the ASF and in other open source communities in the EF and the PSF.

The research questions are:

My alternate GSoC’09 project marked the start of this research. In that project I focused on operationalizing the design principle of collective choice through the voting and community structure present in organizational open source networks. The outcome of my alternate GSoC’09 project was a contribution [2] to the SVNPlot project that enables: (1) collection of SVN repository data for a specified period of time, and (2) generation of social network graphs in CMU’s *ORA format. Using these contributions I collected data for all open source communities in the ASF [3], for 2008 and 2009, and then generated social network graphs and statistics using CMU’s *ORA software. I am currently replicating this analysis for all the open source communities in the EF and the PSF.

I am now proposing an alternate project for GSoC’10 to contribute to SVNPlot again, by: (1) adding social network analysis capabilities to the code I developed in GSoC’09, and (2) enabling the generation of social network graphs in GraphML format to allow follow-up visualization and analysis with Gephi. Furthermore, the resulting code of this alternate project will be integrated into ‘trunk’ to make it part of the official SVNPlot release.

Afterwards I will propose an Apache Lab that uses the official SVNPlot release for active and ongoing analysis of community Meritocracy [4] within the ASF and that provides stunning visualizations of its open source communities using Gephi [6]. Proposing an Apache Lab was also part of my alternate GSoC’09 project. However, since CMU’s *ORA is not open source software, I decided against proposing an Apache Lab after GSoC’09. This year the outcome of my alternate GSoC project will be completely open source, which will allow me to propose an Apache Lab that relies entirely on open source software.

Moreover, the proposed alternate project will enable me to wrap up my data collection and analysis in order to defend my master’s thesis at the end of the summer and present my research in the Business and Community Track of ApacheCon NA this coming November.
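
As a rough illustration of the two proposed additions (social network analysis and GraphML export), the sketch below builds a small developer network and writes it out as GraphML that Gephi should be able to open. It does not use SVNPlot's actual code or API. The commit records, the function names, and the rule that two developers are linked when they changed the same file are all assumptions made for the example, not the method used in the thesis.

```python
import itertools
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical commit records (author, path touched). Real data would come
# from the repository information SVNPlot collects; these values are made up.
COMMITS = [
    ("alice", "/trunk/core/a.py"),
    ("bob", "/trunk/core/a.py"),
    ("carol", "/trunk/docs/readme.txt"),
    ("alice", "/trunk/docs/readme.txt"),
    ("bob", "/trunk/core/b.py"),
]


def build_network(commits):
    """Return {(dev_a, dev_b): weight}, linking developers who changed the same file.

    The weight counts how many files the pair has in common (an assumed rule).
    """
    authors_by_path = defaultdict(set)
    for author, path in commits:
        authors_by_path[path].add(author)
    edges = defaultdict(int)
    for authors in authors_by_path.values():
        for a, b in itertools.combinations(sorted(authors), 2):
            edges[(a, b)] += 1
    return dict(edges)


def degree_centrality(edges):
    """Return each developer's number of neighbours divided by (n - 1)."""
    nodes = {name for edge in edges for name in edge}
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    denom = max(len(nodes) - 1, 1)  # avoid division by zero for tiny graphs
    return {name: degree[name] / denom for name in nodes}


def to_graphml(edges):
    """Serialize the undirected, weighted network as a GraphML string."""
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    ET.SubElement(root, "key", {
        "id": "weight", "for": "edge",
        "attr.name": "weight", "attr.type": "int",
    })
    graph = ET.SubElement(root, "graph", id="G", edgedefault="undirected")
    for name in sorted({n for edge in edges for n in edge}):
        ET.SubElement(graph, "node", id=name)
    for (a, b), weight in sorted(edges.items()):
        edge = ET.SubElement(graph, "edge", source=a, target=b)
        ET.SubElement(edge, "data", key="weight").text = str(weight)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    network = build_network(COMMITS)
    print(degree_centrality(network))
    print(to_graphml(network))
```

Degree centrality is only one of many possible measures; it is used here because it is simple to compute without external libraries.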

The Project?

This alternate project is about automating the analysis of voting and community structure that I am using for my master’s thesis. The main goals are to present my research findings at ApacheCon NA and to implement the open source tools used for organizational analysis in an Apache Lab similar to that of Apache Agora [5]. The organizational analysis I propose is based on the influence of hierarchy (in Meritocracy) in community building and code production in open source communities of the ASF and other communities in the EF and the PSF.

This alternate project helps address the marked lack of organizational research focused on institutions in open source communities in general, and in incubating communities in particular. The organizational analysis that this alternate project will be a part of focuses on the open source communities of the ASF, EF, and PSF, but is also applicable to any open source community that uses a Subversion code repository. Furthermore, ongoing research is already providing compelling insights [B] that may contribute to an improved and in-depth understanding of organizational open source. In itself this makes the proposed alternate project worthwhile and interesting, as it will be embedded in a broader research project that aims at scientific publication.

In addition to providing insights into the role of hierarchy (in Meritocracy) in community building and code production, this alternate project could be useful in the evaluation of incubator projects and communities. For instance, it could be used for active mentoring of open source communities under incubation as well as in deciding when such communities should graduate. More broadly, the proposed alternate project would be useful for community management in the ASF, EF, or PSF. Even companies could use it to monitor the activities of software developers and better understand the informal organization that builds from the grassroots level through code interaction. In general, the proposed approach is deemed useful and extensible.

This alternate project involves coding at various levels, from contributing to the SVNPlot open source project to developing an interface for its use in Apache Labs. Just as importantly, the alternate project will codify the analysis I came up with for my master’s thesis so that, if found to be useful, it can be implemented in an Apache Lab. From there onwards, the tools and analysis will be available for experimentation and innovation in the community.

The idea for this alternate project came to me after GSoC’09, when I was able to collect data but was still forced to do many things manually and had to rely on software tools like CMU’s *ORA, which are not open source. In other words, this alternate project is an extended way of scratching my own itches and cleaning up my work so that it is completely open source. The outcome will help me fully automate the data collection and analysis for my master’s thesis and subsequently for an Apache Lab.

Goals

This project has several goals at different levels, most of which are shared with my master's thesis. The main goals are described below.



ocastaneda@apache.org