Evaluate Open Source Software

11 May 2010

Open Source software selection starts with the creation of a short-list of open source packages, and the very next step is the evaluation of all candidates.

Read the dogfood label first.

Open source projects are often planned, developed and maintained using publicly accessible revision control systems (e.g. Bazaar, CVS, Git, Mercurial or SVN), collaboration tools (blogs, forums, IRC channels, mailing lists and wikis) and tracking systems (e.g. Bugzilla, GNATS, OTRS, Trac). Although going through them all can be time-consuming, they are the primary sources of information about an open source project.

Public open source software repositories like SourceForge, Google Code or Codeplex provide all-in-one solutions with all the necessary tools; others, like GitHub or the Codehaus, focus only on code production, and developers’ communications take place elsewhere.

Software metrics

Resources for Software Metrics.

Process, product and resource metrics all matter, and methods to measure code (product) metrics, e.g. how to look at 10,000 lines of code in an hour, are not the ultimate answer to fully qualify an open source project.

Beyond the code, metrics related to software development activities (e.g. average time to fix a bug, how many people contribute) and metrics related to resources (human resources involved, etc.) also matter. Some open source repositories make tools and statistics available to understand ‘social production’; GitHub’s Network Graph Analyzer is an effective tool to visualize developers’ contributions, even across different repositories.
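The two process metrics mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the shape of the input (pairs of opened/closed timestamps, a list of commit author identifiers) and the sample data are all hypothetical, and would have to be adapted to whatever a given tracker or repository actually exports.

```python
from datetime import datetime
from statistics import mean

SECONDS_PER_DAY = 86400


def average_days_to_fix(issues):
    """Mean days between opening and closing, over closed issues only.

    `issues` is an iterable of (opened, closed) datetime pairs; `closed`
    is None for issues that are still open (hypothetical input format).
    """
    durations = [
        (closed - opened).total_seconds() / SECONDS_PER_DAY
        for opened, closed in issues
        if closed is not None
    ]
    return mean(durations) if durations else None


def contributor_count(commit_authors):
    """Number of distinct authors, ignoring case and stray whitespace."""
    return len({author.strip().lower() for author in commit_authors})


# Made-up sample data, for illustration only.
issues = [
    (datetime(2010, 1, 1), datetime(2010, 1, 4)),  # fixed in 3 days
    (datetime(2010, 2, 1), datetime(2010, 2, 2)),  # fixed in 1 day
    (datetime(2010, 3, 1), None),                  # still open, ignored
]
authors = ["alice@example.org", "Bob@example.org", "bob@example.org "]

print(average_days_to_fix(issues))  # 2.0
print(contributor_count(authors))   # 2
```

Open issues are left out of the average on purpose; counting them would require choosing a cut-off date, which is a separate decision.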

Ohloh provides a wealth of information about open source projects and also makes it available via an API. For every project listed in its directory, Ohloh returns a summary, some basic code metrics, the names of contributors with a graph of their contributions, a detailed list of commits and the licenses of the project’s files. You can also compare up to three projects, but the comparison is limited to code base, activity and number of contributors. If you want to build your own open source software evaluation system, have a look at Ohloh’s open source tools: Ohcount, ohloh_scm and ohdb.

The Melquiades website makes available data about over 2600 projects. The repository finder Octopus locates open source projects’ forges; Bicho is then used for issue-tracking systems, CVSAnalY for code repositories, MailingListStats for mailing lists and Sloccount to inspect source code. Melquiades makes its data accessible via API, as both dumps and graphs. Indeed, the fastest way to get the gist of a project’s vitality is to go through its SCM, mailing-list and issue-tracker charts.

FLOSSMole collects data about over 200,000 projects hosted at the Free Software Directory, OW2, Rubyforge, Savannah and SourceForge, as well as data from FreshMeat and now Google Code too, along with some historical collections.


Bringing it all together.

Meta-forges and research tools can save us a lot of time, making it unnecessary to dig into every forge, communication tool and bug-tracking system. Paradoxically, the abundance of information offered by data-collecting tools may also add to the problem of open source selection, especially in terms of aggregation and correlation.

Open source projects’ names are often spelled in different ways (e.g. zenoss, Zenoss Core). Moreover, different meta-forges and directories return different information about the same project.

SOS Open Source solves the naming issue using project aliases. Data crawled from different sources is aggregated and correlated automatically, taking advantage of meta-project records stored in the internal database. SOS Open Source provides the user with the most appropriate attributes, using content-specific heuristics designed to reduce the complexity of open source software selection.
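To make the alias idea concrete, here is a minimal Python sketch of how records about the same project, spelled differently by different sources, might be folded into one meta-project record. It is not SOS Open Source’s actual implementation, which is not described here: the alias table, the function names, the source labels and the attribute values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical alias table: normalized spelling -> canonical project name.
ALIASES = {
    "zenoss": "zenoss",
    "zenoss core": "zenoss",
}


def canonical_name(raw_name, aliases=ALIASES):
    """Lower-case, collapse whitespace, then look the name up in the aliases."""
    key = " ".join(raw_name.lower().split())
    return aliases.get(key, key)


def merge_records(records, aliases=ALIASES):
    """Group (source, name, attributes) records by canonical project name.

    Each attribute keeps one value per source, so disagreements between
    sources stay visible instead of being silently overwritten.
    """
    merged = defaultdict(dict)
    for source, name, attributes in records:
        project = canonical_name(name, aliases)
        for attr, value in attributes.items():
            merged[project].setdefault(attr, {})[source] = value
    return dict(merged)


# Made-up records from two imaginary sources, for illustration only.
records = [
    ("forge_a", "zenoss", {"contributors": 12}),
    ("directory_b", "Zenoss  Core", {"contributors": 15, "license": "example-license"}),
]

merged = merge_records(records)
print(merged["zenoss"]["contributors"])  # {'forge_a': 12, 'directory_b': 15}
```

Keeping one value per source defers the choice of the "most appropriate" attribute to a later step, where heuristics of the kind mentioned above could pick between conflicting values.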

The organization of code production is largely affected by governance decisions. Whether and how decision-making processes are driven by the community is of great importance in assessing a project’s sustainability. Single sponsors tend to retain full release authority, but they rarely state it explicitly.

SOS Open Source also helps find answers about code modularity, community management style, licensing and sponsorship, which makes it easier to assess the level of openness regarding production, governance and IP.