Sonatype CEO on the future of the software supply chain

August 26, 2020 By Mark Henke

6 minute read time

As CEO of Sonatype for the past ten years, Wayne Jackson has a rich perspective on where software development is heading and where it intersects with security. As he noted in an interview with Sonatype's CMO Matt Howard at the company's 2020 Nexus User Conference, it has been fascinating to watch the industry grow and change, and to help Sonatype play a part in shaping its future. Here are some of his thoughts on a variety of topics, such as Maven, the software supply chain, and speed vs. security no longer being at odds.

Software everywhere

To Wayne, what is astonishing is how much software is being developed in every vertical. The saying goes that software is "eating the world," but that statement does not fully capture how much it has transformed everything we do. Software isn't eating the world; it is the world.

The notion of a trillion different requests for open source components is hard to grasp, especially when there does not seem to be a clear pattern to it. For example, an open source library discloses a bug or vulnerability, and yet the number of downloads for that library doesn't decrease. As Wayne notes, it's almost mind-boggling that people aren't paying closer attention to this.

Sonatype knows whether such disclosures impact usage statistics because the company can measure download requests in a way others can't. As the curator of Maven Central, the company has unique insight into what’s happening around open source downloads and has been able to look at usage patterns for well over a decade. Since Maven Central is the de facto repository for open source libraries, Sonatype has an accurate measurement of who is using open source in the Java community.

State of the Software Supply Chain

Speaking of open source use, Sonatype recently released a piece of research called the State of the Software Supply Chain, which Wayne and Matt dove into during the conversation. The report looks at public, proprietary, and empirical third-party data to understand how developers use open source software.

Built-in backdoors

A truly interesting shift is in how bad actors are targeting the open source supply. They are becoming active contributors to this software, planting bugs and backdoors for malicious use later. This is alarming, given that open source has historically had a reputation for safety because of its transparency. Yet these backdoors are sometimes able to hide in plain sight.

This development is due in part to bad actors' laziness. Malicious users will usually take the easiest path to compromise a system. Vulnerabilities can be hard to find, and finding them involves repetitive scripting and searching. Attackers can make their work easier by implanting more convenient backdoors in the code upfront.

There are other benefits beyond security that come from understanding the supply of open source. Wayne even goes as far as to say that security is not the most important benefit. Instead, it's the innovation that comes from building upon existing, quality libraries. By adopting a trustworthy library, an organization is freed up to focus on more abstract ideas.

The more data developers have on the quality of a library, the better they can choose a healthy direction for their software. If they don't have this data, they may end up with dozens of libraries in a codebase that do essentially the same thing, and those developers would have to maintain all of them.

The State of the Software Supply Chain report has a rich set of data on what differentiates open source libraries in quality, giving readers a lot to think about.

Speed vs. security

Continuing the discussion of how to manage software supply chains, Matt and Wayne discuss how security stops being an obstacle to delivery speed once an organization reaches a certain level of maturity. Practices like automated deployments and automated scans can actually enable teams to move fast without sacrificing security.

The story is similar to that of testing, which used to be done by a separate department. It was an afterthought, so teams would sacrifice testing in order to improve perceived speed, or vice versa. With automated testing so prevalent, speed vs. testing is no longer a dichotomy. It is becoming the same with speed vs. security, and it's encouraging to see more teams go in this direction.

Without data, decisions are made blindly. But with data, you can make decisions that benefit the organization.

Dependency migration

On the topic of dependency management, there are some interesting observations about how developers behave. For example, how many developers manage dependencies in a programmatic way rather than maintaining them manually?

The answer is mixed. We can see signs of people beginning to manage dependencies in an automated way. But even when developers do this, there still seems to be a gap in maintaining these dependencies in a secure, programmatic way.

To understand this data, context matters. A team will care much more about a library that affects customer-facing code than about something that helps manage the team’s lunch schedule.

We can get insight into this context by looking at how developers migrate from one library to another.

hibernate-validator library

For example, hibernate-validator has two large clusters of migrations. It turns out that the versions in between, after the earlier version 5 releases and before version 6, had a lot of security vulnerabilities. The migration data therefore gives us a crowd-sourced way of knowing which insecure library versions to avoid. As a developer, if I see download counts by version, I am likely to go to version 5 or 6 and skip the ones in between, suspecting that they may have security issues.
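The crowd-sourced signal described above can be sketched in a few lines of Python. This is a minimal illustration, not how Sonatype computes its figures: the migration pairs, the version numbers, the function names, and the 20% threshold are all made up for the example.

```python
from collections import Counter

# Hypothetical migration events: (version a project left, version it moved to).
# These pairs are invented for illustration; they are not data from the report.
migrations = [
    ("5.1", "5.4"),
    ("5.2", "5.4"),
    ("5.4", "6.0"),
    ("5.3", "6.0"),
    ("5.1", "6.0"),
    ("5.4", "5.5"),
]


def destination_counts(events):
    """Count how often each version is chosen as a migration target."""
    return Counter(to_version for _, to_version in events)


def rarely_chosen(events, known_versions, min_share=0.2):
    """Return known versions that attract less than min_share of all migrations."""
    counts = destination_counts(events)
    total = sum(counts.values())
    if total == 0:
        # No migration data at all: nothing can be called popular.
        return list(known_versions)
    return [v for v in known_versions if counts[v] / total < min_share]


known = ["5.4", "5.5", "6.0"]
print(destination_counts(migrations).most_common())
print(rarely_chosen(migrations, known))  # ['5.5']
```

A version that few projects move to is not necessarily insecure, so a result like this is a prompt to look closer rather than a verdict.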

In a case like this, a developer on version 5 of a library may say "it's not worth the effort to migrate." But GitHub recently published a paper that describes the subtle long-term cost of making that decision. Staying on older versions looks like a short-term cost saving, but in many cases it loses money in the long run. To developers who understand the creeping cost of technical debt, this may not be that surprising.

There is a related idea called net innovation, which is gross innovation minus technical debt. Gross innovation is great, but it will eventually be stifled if technical debt is left to grow unchecked in a product.
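As a small worked example of that subtraction, the Python sketch below holds gross innovation steady while technical debt grows. The numbers and their unit (say, engineer-hours per quarter) are assumptions chosen for illustration, not figures from the report or the conversation.

```python
def net_innovation(gross_innovation, technical_debt):
    """Net innovation as described above: gross innovation minus technical debt."""
    return gross_innovation - technical_debt


# Hypothetical quarters: steady gross output, debt cost growing while unaddressed.
gross_per_quarter = [100, 100, 100, 100]
debt_per_quarter = [10, 25, 45, 70]

net_per_quarter = [
    net_innovation(gross, debt)
    for gross, debt in zip(gross_per_quarter, debt_per_quarter)
]
print(net_per_quarter)  # [90, 75, 55, 30]
```

Even with gross output unchanged, the net figure shrinks every quarter that the debt is left alone.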

In another migration data point, we can see a more typical pattern with Spring Core. Migration from one version to another is more seamless.

Developers are not all jumping to the most recent version but are incrementally jumping to hygienic versions of Spring Core over time.

spring-core library

There is also fascinating data on reverse migration, where developers revert to an earlier library version after trying to migrate forward.
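One way to spot a reverse migration in a list of version changes is to compare the two versions numerically. The sketch below assumes purely numeric, dot-separated versions; real Maven versions often carry qualifiers (such as a trailing "Final") that it does not handle, and the event data and function names are hypothetical.

```python
def parse_version(text, width=4):
    """Turn '5.4.1' into (5, 4, 1, 0) so versions compare numerically.

    Assumes numeric dot-separated parts only; qualifiers are not supported.
    """
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def is_reverse_migration(from_version, to_version):
    """True when the target version is older than the one being left."""
    return parse_version(to_version) < parse_version(from_version)


def reverse_migrations(events):
    """Keep only the (from, to) pairs that move backwards."""
    return [(a, b) for a, b in events if is_reverse_migration(a, b)]


# Hypothetical version changes, invented for illustration.
events = [("5.2", "5.3"), ("5.3", "5.2"), ("5.9", "5.10"), ("5.10.1", "5.10")]
print(reverse_migrations(events))  # [('5.3', '5.2'), ('5.10.1', '5.10')]
```

Comparing tuples of integers rather than raw strings matters here: as text, "5.10" sorts before "5.9", which would wrongly flag that upgrade as a rollback.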

Overall, there are some fascinating observations we can make by looking at the state of the software supply chain. Wayne highlights that more and more, the industry is embracing the idea of dependency management. Doing so brings not only security benefits but also gains in productivity and innovation.

You can see the full conversation between Wayne and Matt as well as other Nexus User Conference sessions here.

 

Tags: State of the Software Supply Chain, Nexus User Conference, News and Views, Guest Post

Written by Mark Henke

Mark Henke has spent over 10 years architecting systems that talk to other systems, doing DevOps before it was cool, and matching software to its business function. Every developer is a leader of something on their team, and he wants to help them see that.