Is Analyzing Open Source Projects by Contributors a Valid Metric?

April 19, 2012 By Tim OBrien

ReadWriteWeb's Joe Brockmeier has an interesting piece analyzing OpenStack Essex. While this isn't an exact overlap with the kind of analysis we're working on for Insight and Nexus, it's a view into the social and open source dynamics of a project.

Brockmeier's article summarizes analysis that OpenStack contributor Mark McLoughlin assembled from commits and Gerrit code reviews. It's a breakdown of activity by organization. As with many open source projects that have corporate involvement, one or two companies tend to dominate the commit breakdown.
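To make the idea of a per-organization breakdown concrete, here is a minimal sketch of how such a tally could be computed. It is not McLoughlin's actual method. It assumes you already have a list of commit author email addresses and a hand-maintained mapping from email domain to organization; the function name `commit_share_by_org`, the sample addresses, and the mapping are all hypothetical.

```python
from collections import Counter


def commit_share_by_org(author_emails, domain_to_org):
    """Return {organization: fraction of commits}.

    Assumption: each commit is attributed to an organization purely by the
    domain of its author email, which is a rough heuristic at best.
    """
    counts = Counter()
    for email in author_emails:
        # Take everything after the last "@" as the domain.
        domain = email.rsplit("@", 1)[-1].lower()
        # Domains missing from the mapping are grouped as "unaffiliated".
        counts[domain_to_org.get(domain, "unaffiliated")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {org: n / total for org, n in counts.items()}


# Hypothetical sample data, one entry per commit.
emails = ["a@rackspace.com", "b@rackspace.com", "c@redhat.com", "d@example.org"]
orgs = {"rackspace.com": "Rackspace", "redhat.com": "Red Hat"}

for org, share in sorted(commit_share_by_org(emails, orgs).items()):
    print(f"{org}: {share:.0%}")
```

Note that a tally like this only counts commits. It says nothing about documentation, infrastructure, or funding, which is exactly the limitation discussed in the rest of this post.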

Where the article is a little off-base is in its assessment of community health: you can't judge the "health" of an open source project by the mix of companies represented in a commit breakdown alone. It's an interesting statistic, but there's so much more to open source than code commits, including documentation efforts, marketing spend by companies invested in a project, and financial support for essential efforts not directly related to code (legal, infrastructure, etc.). Open source isn't about code alone, and while balance is an ideal for open source projects with corporate involvement, that balance can shift over time.

As Sonatype comes to market with more tools focused on helping you make better decisions about the components you use, we're going to focus first on "actionable" metrics like popularity and quality. While commit-breakdown metrics like these are interesting (and you can get them from Eclipse BTW, if you go to this page), I don't think it makes sense to create arbitrary, commit-based assessments about the health of a community.

I don't think it would ever make sense to have a "Warning: Project dominated by one corporation" flag, because I predict you'd end up seeing this flag on just about every open source project out there. It's a subjective measure, and while these numbers may suggest that Rackspace is dominating the codebase of OpenStack, they may also just suggest that Rackspace has created a solid framework for a maturing community.

Cloud as the New OSS Battleground: Permissive vs. Copyleft

These open source cloud platforms (OpenStack and Eucalyptus among others) are the current battleground for computing. There's a massive amount of investment and attention being paid to cloud computing, and as a result, there's also a large amount of investment being poured into sponsored open source development. With CloudStack recently moving to the Apache Software Foundation and Eucalyptus recently signing agreements with Amazon to ensure AWS compatibility, there are weekly strategic moves from companies like Oracle, Red Hat, Citrix, Amazon, VMware, IBM, Intel, AppFog, Engine Yard... (I'm going to stop because I could fill up several pages).

One of the big stories over the last week was Eucalyptus' embrace of the GPL versus CloudStack's embrace of the ASL. Whatever your views on open source and licensing, it's important to note that OSS licensing issues are front and center, with commentators writing reams of analysis about how Cloud will eventually turn into a battle between permissive licensing models and GPL licensing models.

One thing is certain: people are paying attention to licensing issues like they never have before. If you haven't yet done so, you should start too. Download a copy of Nexus and start getting a sense of your OSS license usage.

Tags: Nexus Repo Reel, Sonatype Says, Everything Open Source

Written by Tim OBrien

Tim is a Software Architect with experience in all aspects of software development, from project inception to developing scalable production architectures for large-scale systems during critical, high-risk events such as Black Friday. He has helped many organizations, ranging from small startups to Fortune 100 companies, take a more strategic approach to adopting and evaluating technology and managing the risks associated with change.