
Establish a Code Ownership Loop with Collaborator and Bugsnag

July 25, 2022

This blog is derived from the webinar “Accelerate Releases Through Code Ownership with Collaborator and Bugsnag”, which focused on establishing a culture of code ownership and its benefits through the lens of the SmartBear tools Collaborator and Bugsnag.

Taking a line from the SmartBear 2021 Annual State of Software Quality Report: "Quality is top of the mind for every individual and every team. Every developer has felt the pain of software releases that did not go well leading to customer calls and late nights. Quality software remains the insurance policy that continues to pay out."

Engineering teams must constantly catch and remediate defects in order to ensure stable, high-quality applications. Testing and monitoring methods are typically applied throughout the entire software development lifecycle (SDLC) - from initial development, through testing and into production. With each added stage in the process, the relative cost of the issue increases. By the time an issue proceeds undetected into production, the reputation of the brand is on the line.

One effective strategy to streamline bug identification and resolution through the SDLC is to leverage the practice of code ownership.

In this blog we discuss:

- General code ownership principles, and how they drive a sense of clear responsibility, a factor proven to correlate with overall quality and efficiency in resolving issues.
- How Bugsnag and Collaborator natively support and enhance this approach by providing tools to help squads or code owners make decisions best suited to their parts of an application.

What Is Code Ownership?

Code ownership is the idea that individual developers or teams are responsible for the features they create and deploy. For example, suppose that an ecommerce application has two teams, one responsible for user conversion and one for ecommerce. If there’s an error in the shopping cart, the ecommerce team owns that code and is responsible for the fix. [source codeownership blog]
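To make the shopping cart example concrete, here is a minimal sketch of an ownership lookup. The directory names, team names and the `owner_for` helper are all hypothetical, chosen only for illustration; they are not part of Collaborator or Bugsnag.

```python
from pathlib import PurePosixPath

# Hypothetical ownership map: directory prefix -> owning team.
OWNERS = {
    "src/cart": "ecommerce",
    "src/checkout": "ecommerce",
    "src/signup": "user-conversion",
}


def owner_for(path: str, owners: dict = OWNERS, default: str = "unowned") -> str:
    """Return the team that owns `path`, preferring the longest matching prefix."""
    parts = PurePosixPath(path).parts
    best_team, best_len = default, 0
    for prefix, team in owners.items():
        prefix_parts = PurePosixPath(prefix).parts
        # Compare whole path components so "src/cartoon" does not match "src/cart".
        if parts[: len(prefix_parts)] == prefix_parts and len(prefix_parts) > best_len:
            best_team, best_len = team, len(prefix_parts)
    return best_team
```

With this map, an error raised in `src/cart/totals.py` resolves to the ecommerce team, while a file outside every prefix falls back to `unowned`, which makes ownership gaps visible instead of silent.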

We see many teams adopting what is referred to as code stewardship, or weak code ownership, where teams of engineers take general responsibility over building, testing and maintaining certain parts of the application.

Benefits of Weak Code Ownership:

Minimizing Defects: “The proportion of reviewers without expertise shares a strong, increasing relationship with the likelihood of having post-release defects.” [Code Ownership and Review Study]. Naturally, developers familiar with a certain part of an application should be responsible for other processes around it, such as reviewing code changes. Developers who have contributed to a module or function may have a deeper understanding of the specific dependencies, data requirements or other tricky conditions that must be met when new code is added. Having navigated these themselves, they can catch more issues and incorrect assumptions.

Efficiency: When code review and error monitoring responsibilities are allocated to a module’s owners, those code owners can review changes more efficiently because they are often already up to speed on the module’s functionality and context.

Culture: A culture of ownership encourages developers to take initiative as the subject matter expert of their code and actively participate in the relevant technical decisions. Granting teams the ability to monitor their own parts of the code, triage, prioritize and resolve their own issues, and review code changes with their own requirements brings the flexibility to tailor decisions around the functionality they understand. This fosters a healthier development environment.

One example is tying error budgets to individual squads rather than defining them at the organization level. By doing so, teams can define their own error budgets and goals for their modules, which may in turn make company-level estimates more precise.
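As a rough illustration of a per-squad error budget, the sketch below derives the number of crashed sessions a squad can tolerate from a stability target it sets for itself. The `SquadBudget` class and its field names are assumptions made for this example, not a Bugsnag feature or API.

```python
from dataclasses import dataclass


@dataclass
class SquadBudget:
    """Hypothetical per-squad error budget for one reporting window."""

    squad: str
    stability_target: float  # e.g. 0.999 means 99.9% crash-free sessions
    total_sessions: int
    crashed_sessions: int

    @property
    def allowed_crashes(self) -> float:
        # The budget is the share of sessions the target allows to fail.
        return (1.0 - self.stability_target) * self.total_sessions

    @property
    def remaining(self) -> float:
        return self.allowed_crashes - self.crashed_sessions

    @property
    def exhausted(self) -> bool:
        return self.remaining < 0
```

A squad with a 99% target and 10,000 sessions in the window can absorb roughly 100 crashed sessions; once `exhausted` is true, the squad might choose to pause feature work in favor of stability fixes.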

Challenges

Below are some challenges we hear from teams implementing code ownership principles around their existing structures:

- How do we make sure teams get alerted to stability issues in the parts of the code they own? How do we align teams around which bugs to focus on?
- How do we make sure nothing slips through the cracks if there are gaps between what each team owns, or if things come through that no one team is specifically responsible for?
- How can we make this work within our broader ecosystem of tools, like JIRA, without duplicating effort to keep all of these data sources organized?

How Do Collaborator and Bugsnag Fit In?

Bugsnag helps development teams identify stability issues in every release and provides actionable insights to prioritize and fix those with the greatest impact.

1. Alerting and Workflow Engine: This is an all-encompassing term for our many integrations with tools like Slack, JIRA and PagerDuty, as well as the intelligent logic layered on top to trigger and route alerts to the right people at the right times.
- Bugsnag can be configured to alert the team who built the module or the integration when there are spikes in the errors coming from that code. The alert triggers can be tuned per module to ensure they are meaningful and align with that module’s behavior.
- For example, if one team is responsible for an app’s integrations with third-party services, they may want to use spike detection to get alerted to potential outages. Another team might want alerts for every new error if they are responsible for rolling out a new feature.
- Teams should aim to minimize alert fatigue by adjusting alerting processes to follow code ownership boundaries. This helps to avoid high volumes of alerts firing off to one channel, where it is unclear who is responsible for what.

2. Issue tracker integrations: Engineering teams typically have pre-existing workflows built into their toolsets to support their squads, such as JIRA boards. Our integrations allow you to keep in sync with issue trackers and notification platforms, so the same issue does not have to be managed in several tools.

3. State tracking: Our platform incorporates state tracking, so teams can build a triaging process and know at any given time what the plan is for each error. Nothing gets lost in a high volume of data, even when there are gaps in ownership coverage.
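The routing idea described above can be sketched in a few lines. This is not Bugsnag's configuration format or API; the `AlertRule` type, the channel names and the thresholds are hypothetical and only illustrate per-team triggers with a fallback for unowned errors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical per-team alert rule."""

    channel: str
    on_new_error: bool = False            # alert on every previously unseen error
    spike_factor: Optional[float] = None  # alert when count >= factor * baseline


# Assumed team names and channels, following the examples in the text.
RULES = {
    "integrations": AlertRule(channel="#integrations-alerts", spike_factor=3.0),
    "new-feature": AlertRule(channel="#feature-alerts", on_new_error=True),
}


def should_alert(rule: AlertRule, is_new_error: bool, count: int, baseline: float) -> bool:
    if rule.on_new_error and is_new_error:
        return True
    if rule.spike_factor is not None and baseline > 0:
        return count >= rule.spike_factor * baseline
    return False


def route(team: str, is_new_error: bool, count: int, baseline: float,
          rules: dict = RULES, fallback: str = "#triage") -> Optional[str]:
    """Return the channel to notify, or None when the team's rule does not fire."""
    rule = rules.get(team)
    if rule is None:
        # Errors without an owner still land somewhere visible.
        return fallback
    return rule.channel if should_alert(rule, is_new_error, count, baseline) else None
```

Keeping one rule per team means each channel only receives alerts its owners asked for, and the fallback channel catches anything that falls between ownership boundaries.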

Customer story: A great example of an engineering organization that leveraged Bugsnag’s features to transform the way they triage and resolve issues around a code ownership model is found here: https://www.bugsnag.com/customers/gusto 

Collaborator adds a level of accountability with clear, documented review processes and cross-functional alignment to catch errors early in your SDLC. 

- Integrations with tools: Designed to fit into the same workflows as tools like GitHub and JIRA, Collaborator makes code review simpler, puts processes in place to give developers ownership of their code and makes it easy to bring in peers for additional approval of code shipped to production.
- Enforce process with a uniform workflow: Regardless of integration, a team’s review workflow is the same for code artifacts as it is for documentation review or Simulink model review.
- Provides review enforcement: For most integrations, Collaborator provides a gating mechanism to ensure artifact review before any release is approved, further shifting defect discovery earlier and enabling the release of the highest quality code possible.
- Custom rules set up per team: 82% of teams satisfied with their code review process have specific guidelines on how reviews should be performed. [source] To create clear guidelines that ensure everyone on your team understands expectations, Collaborator lets you build custom checklists in review templates so participants with different roles and responsibilities can easily see what’s expected of them on each project.

Shift Left in the SDLC

The vast majority of defects are typically introduced in the coding/development phase of the SDLC, but most go undetected until later phases, such as pre-production testing (unit tests, functional tests and system tests) and production.

As seen below, an increasing number of bugs are typically found throughout the testing phases into release. The diagram highlights that the cost to repair a bug in a staging or production release is exponentially higher than the cost of a bug found during coding or early testing phases. This is in part because mean time to resolution generally increases in later stages, and revenue or brand reputation can also take a hit when a bug reaches your user base.

Given there is a direct relationship between the number of code reviewers and the quality of the code post-review, the goal becomes shifting error identification and resolution earlier, towards the coding phase, through more robust peer reviews supported by a tool like Collaborator. This in turn shifts the “percentage defects found” curve to the left and lowers the cost of remediating those defects in terms of developer time, user experience and brand loyalty.
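The effect of shifting the curve left can be approximated with simple arithmetic: weight the cost of a fix in each phase by the share of defects found there. The sketch below is illustrative only. The phase names and the `expected_cost_per_defect` helper are assumptions, and any figures you pass in should come from your own data rather than from this post.

```python
def expected_cost_per_defect(found_share: dict, cost_per_phase: dict) -> float:
    """Average repair cost per defect, given where defects are found.

    found_share:    phase -> fraction of all defects found in that phase (sums to 1)
    cost_per_phase: phase -> relative cost of fixing one defect in that phase
    """
    total = sum(found_share.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"found_share must sum to 1, got {total}")
    return sum(share * cost_per_phase[phase] for phase, share in found_share.items())


# Placeholder numbers, NOT measurements: relative cost units per phase.
PLACEHOLDER_COST = {"coding": 1.0, "testing": 5.0, "production": 25.0}

before = expected_cost_per_defect(
    {"coding": 0.2, "testing": 0.5, "production": 0.3}, PLACEHOLDER_COST
)
after = expected_cost_per_defect(
    {"coding": 0.5, "testing": 0.4, "production": 0.1}, PLACEHOLDER_COST
)
```

With these placeholder inputs, `after` comes out lower than `before` because more defects are caught in the cheapest phase. The size of the gap depends entirely on the cost ratios you supply.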

An ROI tool can be found on the Collaborator page, where you can enter some basic figures for developer costs and quickly see the return on investment that early code review provides. https://smartbear.com/product/collaborator/roi-calculator/

How Collaborator and Bugsnag Close the Loop

Collaborator and Bugsnag in tandem allow teams to minimize the errors released into production with more thorough code reviews, while also providing robust monitoring and alert routing for those errors that do ultimately make it to your end users.

We can see an ideal workflow below, illustrating the following:

1. A customizable review and approval workflow is completed in Collaborator by module owners to ensure code changes meet criteria.
2. Approval of the PR results in an automatic merge, effectively applying the new changes to the codebase and moving them forward for release.
3. Once the newest release is in production, Bugsnag will begin to process events in real time from users that have adopted the new version. The Release dashboard will pick up the new version and monitor its live stability (crash-free session and user rates).
4. As new exceptions occur on users’ devices, Bugsnag will leverage information about where in the code the exception occurred and alert the right team to new errors or spikes in known errors in their modules, via our integrations.
5. On receiving an alert, developers leverage the information in the Bugsnag dashboard to triage, diagnose and prioritize the error. When the squad fixes the bug, the fix results in commits and a PR in GitHub, which kicks off a new squad-specific review process in Collaborator to review and release the fix.
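For readers unfamiliar with the stability metrics mentioned in the workflow, the sketch below computes crash-free session and user rates from a list of sessions, using the generic definition of each metric. The `Session` type and `crash_free_rates` function are hypothetical, and the exact calculation behind the Bugsnag Release dashboard may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Session:
    """Hypothetical record of one app session on a given release."""

    user_id: str
    crashed: bool


def crash_free_rates(sessions: Iterable[Session]) -> Tuple[float, float]:
    """Return (crash-free session rate, crash-free user rate) as fractions of 1."""
    sessions = list(sessions)
    if not sessions:
        raise ValueError("at least one session is required")

    crashed_sessions = sum(1 for s in sessions if s.crashed)
    users = {s.user_id for s in sessions}
    # A user counts as affected if any of their sessions crashed.
    crashed_users = {s.user_id for s in sessions if s.crashed}

    session_rate = 1.0 - crashed_sessions / len(sessions)
    user_rate = 1.0 - len(crashed_users) / len(users)
    return session_rate, user_rate
```

The two rates can diverge: a single user who crashes repeatedly lowers the session rate more than the user rate, which is one reason to watch both when judging a release.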

To view live demonstrations of this feedback loop in the context of both tools, take a look at our full webinar here.