A Mixed Methods Approach to Mining Code Review Data

Examples and a Study of Multicommit Reviews and Pull Requests

Book chapter (2015)

Authors

Peter C. Rigby Concordia University

A. Bacchelli Software Engineering (TU Delft)

G. Gousios Radboud Universiteit Nijmegen

Murtuza Mukadam Concordia University

Research Group

Software Engineering (TU Delft)

Inspection, Empirical software engineering, Modern code review

To reference this document use:

http://resolver.tudelft.nl/uuid:b9cdc3dd-ea90-4789-b10b-a6f674af0cab

Published Date

01-09-2015

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Electrical Engineering, Mathematics and Computer Science

Department

Software Technology

Research Group

Software Engineering

Abstract

Software code review has been considered an important quality assurance mechanism for the last 35 years. The techniques for conducting modern code reviews have evolved along with the software industry, and have become progressively incremental and lightweight. We have studied code review in a number of contemporary settings, including Apache, Linux, KDE, Microsoft, Android, and GitHub. Code review is an inherently social activity, so we have used both quantitative and qualitative methods to understand the underlying parameters (or measures) of the process, as well as the rich interactions and motivations for doing code review. In this chapter, we describe how we have used a mixed methods approach to triangulate our findings on code review. We also describe how we use quantitative data to help us sample the most interesting cases from our data to be analyzed qualitatively. To illustrate code review research, we provide new results that contrast single-commit and multicommit reviews. We find that while multicommit reviews take longer and have more lines churned than single-commit reviews, the same number of people are involved in both types of review. To enrich and triangulate our findings, we qualitatively analyze the characteristics of multicommit reviews, and find that there are two types: reviews of branches and revisions of single commits. We also examine the reasons why commits on GitHub pull requests are rejected.