Google Summer of Code 2009

For this year's Google Summer of Code, I proposed an alternative project based on my interests and my current work on my master's thesis for the Management of Technology programme. My proposal was accepted and will be mentored by my thesis supervisor, Prof. Michel van Eeten.

My selected proposal can be found here:

Understanding New Open Source Communities in The Apache Software Foundation (Evolution of Poddlings in the Apache Incubator)

An extended abstract is shown below, and the full proposal can be found at the following link:

Master thesis research

Understanding New Open Source Communities in The Apache Software Foundation (Evolution of Poddlings in the Apache Incubator)


New open source communities in the Apache Incubator are groups of individuals contributing to the evolution of a software codebase. Some contributors volunteer time and effort, while others are paid to develop software. They seldom meet but often collaborate through shared infrastructure on the Internet. Still, they organize in communities to collectively develop enterprise-grade software and eventually become full-fledged open source communities. The question behind this project is: 'how?'

Burning questions...

New open source communities continue to become part of The Apache Software Foundation (ASF). The entry path for these communities is the Apache Incubator, a top-level project created by the ASF to develop new projects and communities that comply with legal standards and adhere to the guiding principles of ‘The Apache Way’ [1]. The enormous success enjoyed by the ASF, its open source communities, and the open source software they produce raises a number of questions.

Answering these questions requires an understanding of how open source communities are organized and how they sustain themselves. It also requires an understanding of how open source community development works in practice. To acquire the theoretical and practical understanding needed to answer these burning questions, I studied the PhD dissertation of Dr. Ruben van Wendel de Joode, related it to what I experienced collaborating with the Apache Tuscany incubator project during and after GSoC'08, and attended ApacheCon Europe 2009 as a staff volunteer for the second year in a row.

These three sources of knowledge and experience inspired the empirical strategy that I will use in my master's thesis to study two of the most important aspects in the organization and sustainability of new open source communities: voting and community structure.

How are new open source communities in the ASF organized and how do they sustain themselves?

I believe that the role of institutions and processes active in the ASF, such as 'The Apache Way', goes above and beyond that of increasing external recognition and protecting the communities from outside pressures. Such institutions and processes also have an important internal role in community development, as well as a vital interfacing role between communities and corporations (O'Mahony, 2002, 2003). I believe that evidence of such roles can be found in the relationship between voting and community structure, which I hypothesize to be self-reinforcing. To investigate this relationship empirically, I will use software tools to mine the SVN repositories and mailing lists of new open source communities under incubation in the Apache Incubator. From the data gathered from those sources, the social network structure of the communities can be constructed and analyzed. I propose that patterns in the evolution of community structure can be used to explain the relevance and importance of voting in new open source communities in the Apache Incubator.
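As a rough illustration of how a social network might be constructed from repository data, the sketch below links two committers whenever they have both changed the same file. It is not the method used by SVNPlot or Network Workbench; the commit records, author names, and the functions `build_coauthor_network` and `degree` are all hypothetical, and real records would have to be extracted from the SVN logs first.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical commit records: (author, path touched).
# These values are invented for illustration only.
commits = [
    ("alice", "core/Parser.java"),
    ("bob", "core/Parser.java"),
    ("bob", "web/Index.java"),
    ("carol", "web/Index.java"),
    ("alice", "core/Lexer.java"),
]


def build_coauthor_network(records):
    """Link two authors when both have committed to the same path.

    Returns a dict mapping an (author_a, author_b) pair, in sorted order,
    to the number of paths the two authors share.
    """
    authors_by_path = defaultdict(set)
    for author, path in records:
        authors_by_path[path].add(author)

    edges = defaultdict(int)
    for authors in authors_by_path.values():
        for a, b in combinations(sorted(authors), 2):
            edges[(a, b)] += 1
    return dict(edges)


def degree(edges):
    """Count how many distinct collaborators each author has."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


network = build_coauthor_network(commits)
print(network)
print(degree(network))
```

In this toy data set the committer with the most collaborators sits between two otherwise unconnected contributors, which is the kind of structural pattern the analysis would look for in real communities.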

The Project?

This project is about operationalizing the analysis of voting and community structure that I will use for my thesis. The main goal is to implement the tools used for analysis (currently SVNPlot and Network Workbench) in an Apache Lab similar to that of Apache Agora. The analysis will be based on the evolution of community structure over time and its relation to the collective institution of voting in new open source communities. The analytical framework and organizational orientation come from the research of Van Wendel de Joode (2005) and my own experience in open source. The lack of such organizational and evolutionary analysis of podlings in the Apache Incubator, and the compelling insights it can provide, make this project worthwhile and interesting.
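One simple way to look at the evolution of community structure over time is to split interactions into periods and compute a network measure for each period. The sketch below does this with network density on mailing-list replies. It is only an assumption about what such an analysis could look like: the reply events, the monthly periods, and the function `density_by_period` are hypothetical and not taken from the proposal or from any of the tools named above.

```python
from collections import defaultdict

# Hypothetical reply events from a mailing list: (period, sender, replied_to).
# These values are invented for illustration only.
replies = [
    ("2009-01", "alice", "bob"),
    ("2009-01", "bob", "alice"),
    ("2009-02", "alice", "bob"),
    ("2009-02", "carol", "bob"),
]


def density_by_period(events):
    """Return the undirected network density for each period.

    Density is the number of distinct pairs that interacted divided by
    the number of possible pairs among the participants of that period.
    """
    nodes = defaultdict(set)
    edges = defaultdict(set)
    for period, sender, target in events:
        if sender == target:
            continue  # ignore self-replies
        nodes[period].update((sender, target))
        edges[period].add(frozenset((sender, target)))

    result = {}
    for period in sorted(nodes):
        n = len(nodes[period])
        possible = n * (n - 1) / 2
        result[period] = len(edges[period]) / possible if possible else 0.0
    return result


densities = density_by_period(replies)
for period, value in densities.items():
    print(period, round(value, 2))
```

Comparing such per-period values with the dates of community votes would be one possible starting point for relating voting to changes in community structure.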

This project will enable the evaluation of Apache Incubator projects and communities. It could, for instance, help in deciding when a project graduates from the incubator, and it would also be useful for the continuous mentoring of communities under incubation. The project involves coding at several levels, from modifying the open source tools to be used to developing an interface for their use in Apache Labs. Just as importantly, the project involves coding the analysis to be performed in my thesis so that, if found to be useful, it can continue to be used in the Apache Incubator through Apache Labs.

The idea for this project came from a need for data collection for my thesis. In other words, it is a way of scratching my own itch. During data collection I thought to myself, 'It would be nice to have a tool like Apache Agora for this...' I see the idea as being useful to the ASF for continued analysis of projects in the Apache Incubator. As such, one of the goals of this project is to implement the tools used for analysis and make them readily available to the community through Apache Labs. From there onwards, the tools will be available for experimentation and innovation in the community.


This project has several goals at different levels, most of which are shared with my master's thesis. The main goals are described below.
