Pull requests are a staple of open source software. When someone finds a bug or inefficiency in a section of code, they can submit a “pull request” to the repository owner through a Git hosting service. The owner compares the new code with the original, and if they agree the project is better off with the new code, it’s in. Sounds like an open and shut case for high quality, right?
Not so fast. A new study has been submitted that questions the validity of that process. It turns out there’s an inherent flaw in the system, which revolves around how much reputation matters when developers accept or deny pull requests. (Spoiler alert: reputation matters a lot.)
So why isn’t code quality making a difference on pull requests? While the researchers can’t provide empirical evidence to answer that question, we can at least postulate. Even better, we have a solution.
A paper titled, "Does Code Quality Affect Pull Request Acceptance? An empirical study," was recently submitted to the journal "Information and Software Technology."
The goal of the paper was to understand how much code quality is considered when pull requests are accepted. The authors knew that pull request acceptance might be determined more by reputation and features than by code quality, but they wanted to know how much code quality mattered overall.
The researchers describe their analysis of 28 open-source Java projects, which covered 4.7 million code quality issues in roughly 36,000 pull requests.
Industry and researchers agree that code inspection helps to reduce the number of defects. In the past, developers organized review meetings to inspect the code line by line. It was tedious, and the effort required to perform code inspections that way often kept teams from doing them at all.
To understand if code quality influences a pull request’s approval, researchers conducted a case study involving 28 well-known Java projects to analyze the quality of more than 36K pull requests.
They analyzed the quality of pull requests using one of the four tools used most frequently for software analysis. They then evaluated the code against a standard rule set, which allowed them to detect different quality aspects generally considered harmful, including code smells, anti-patterns, design issues, and various coding style violations.
Reasons for Pull Request Rejection (and Effects)
Prior studies have shown that pull requests are more likely to be rejected when technical problems are not properly solved and when the number of forks increases. Other rejection factors:
- Inexperience with pull requests
- The complexity of contributions
- The locality of the artifacts modified
- The project's contribution policy
From the integrator’s perspective, an inherent problem with pull requests is addressing the social challenges: how to motivate contributors to keep working on the project after they’re rejected, and how to explain the reasons for rejection without discouraging them.
Unexpectedly, code quality turned out not to affect the acceptance of a pull request at all
From the contributor’s perspective, prior studies found that it’s important for contributors to reduce response time (i.e., respond faster), maintain awareness, and improve communication.
Results of this Study
They conducted a case study across 28 open-source Java projects, analyzing the presence of 4.7 million code quality issues in 36,344 pull requests.
They found that 19,293 pull requests (53.08%) were accepted and 17,051 pull requests (46.92%) were rejected. Eleven projects accounted for the vast majority of the pull requests (80%) and technical debt (TD) items (74%). The distribution of the TD items differs greatly among the pull requests. For example, two of the projects (Cassandra and Phoenix) contain a relatively large number of TD items compared to the number of pull requests, while three other projects (Groovy, Guacamole, and Maven) have a relatively small number of TD items.
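As a quick sanity check on the figures above, the acceptance and rejection percentages can be recomputed from the raw counts. This is only a minimal sketch: the counts come from the article, while the helper name `share` is made up for illustration and is not part of the study.

```python
# Counts reported in the article for the 28 Java projects.
accepted = 19_293
rejected = 17_051
total = accepted + rejected


def share(part: int, whole: int) -> float:
    """Hypothetical helper: part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


print(total)                    # 36344
print(share(accepted, total))   # 53.08
print(share(rejected, total))   # 46.92
```

The two counts add up to the 36,344 pull requests analyzed, and the recomputed shares match the percentages quoted in the study.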
The results complement those obtained by prior studies, namely that the reputation of the developer might be more important than the quality of the code developed.
The main takeaway for practitioners, and especially for open-source projects, is that they should pay more attention to software quality. Pull requests are a very powerful instrument, which could provide great benefits if used for code reviews as well.
They also noted that researchers should investigate whether other quality aspects might influence the acceptance of pull requests.
The study's central finding is that the quality of the code submitted in a pull request has no bearing on whether it is accepted. The same result held in each of the 28 projects independently.
The reputation of the developer submitting the pull request is one of the most important acceptance factors.
Their study also concluded that we should raise awareness in the open-source community so that maintainers consider code quality when accepting pull requests. Another factor worth investigating is the developers' personality as a possible influence on the acceptance of pull requests.
Collaborator Prevents Reputation-Based Signoff
While this study focused on open-source projects, it’s probably safe to say that developer reputation also impacts closed-source ones. Additionally, we know many organizations push code through quickly to meet deadlines, or assume that because a senior developer made the change, it must be safe. Regardless of the project being worked on, a clearly defined and measurable process drives quality.
Collaborator drives quality by:
- Ensuring the right people are on the review: It provides defined participant roles and review subscriptions to place the appropriate team members on each review.
- Providing traceability and process documentation: It eases auditing and compliance burdens, and improves traceability with reports and metrics.
- Integrating with existing tools: Reviews are automatically created when a pull request is initiated. It can even prevent the merge from taking place until the review is completed.
Bottom line, your application needs to be performant, and street cred should not be the determining factor of whether or not code is approved. There needs to be a process in place, and Collaborator makes it easy. Try it for free today.