acm - an acm publication

2010 - November

  • An Interview with Prof. Andreas Zeller: Mining your way to software reliability
    In 1976, Les Belady and Manny Lehman published the first empirical growth study of a large software system, IBMs OS 360. At the time, the operating system was twelve years old and the authors were able to study 21 successive releases of the software. By looking at variables such as number of modules, time for preparing releases, and modules handled between releases (a defect indicator), they were able to formulate three laws: The law of continuing change, the law of increasing entropy, and the law of statistically smooth growth. These laws are valid to this day. Belady and Lehman were ahead of their time. They understood that empirical studies such as theirs might lead to a deeper understanding of software development processes, which might in turn lead to better control of software cost and quality. However, studying large software systems proved difficult, because complete records were rare and companies were reluctant to open their books to outsiders. Three important developments changed this situation for the better. The first one was the widespread adoption of configuration management tools, starting in the mid 1980s. Tools such as RCS and CVS recorded complete development histories of software. These tools stored significantly more information, in greater detail, than Belady and Lehman had available. The history allowed the reconstruction of virtually any configuration ever compiled in the life of the system. I worked on the first analysis of such a history to assess the cost of several different choices for smart recompilation (ACM TOSEM, Jan. 1994). The second important development was the inclusion of bug reports and linking them to offending software modules in the histories. This information proved extremely valuable, as we shall see in this interview. The third important development was the emergence of open source, through which numerous and large development histories became available for study. Soon, workers began to analyze these repositories. Workshops on mining software repositories have been taking place annually since 2004. I spoke with Prof. Andreas Zeller about the nuggets of wisdom unearthed by the analysis of software repositories. Andreas works at Saarland University in Saarbrücken, Germany. His research addresses the analysis of large, complex software systems, especially the analysis of why these systems fail to work as they should. He is a leading authority on analyzing software repositories and on testing and debugging. -- Walter Tichy, Editor