Meet the developers behind Sonatype's automated malware detection system securing open source supply chains

April 08, 2021 By Ax Sharma


Since we debuted our Advanced Development Pack in late 2020, Sonatype's discovery of malicious packages infiltrating npm has been making headlines over and over [1, 2, 3, 4, 5].

While it's been a company-wide initiative, the progress has really been made possible by the team building our automated malware detection system, Release Integrity. Part of the next-generation Sonatype Intelligence products, it regularly monitors newly released npm packages and flags suspicious components.

This data, which flows through our Advanced Development Pack, is powerful enough on its own, but when combined with Sonatype Repository Firewall, it automatically thwarts attacks on your software supply chain by immediately quarantining suspicious and malicious components.

Let's meet our principal software engineer Xiaorong Xiang and data scientist Cody Nash, part of the development team behind the Release Integrity system.

 

On any given day, Cody and Xiaorong can be seen extensively monitoring events and activity patterns associated with malicious components being published in the wild, and tuning our AI/ML-based automated malware detection algorithms accordingly.

One example is the frequent "spikes" we continue to see in dependency confusion copycats being published. By mid-March 2021, the number of copycats had surpassed 10,000, and it is only growing.

The copycats have become so frequent that it hasn't been possible to write about all of them, but during the spikes seen through March, Release Integrity caught more than 5,000 copycats on top of the 5,000+ already seen on npm and PyPI that we were able to cover.


As Xiaorong explains in the video, every newly published npm package is ingested and evaluated by the Release Integrity system against a set of criteria comprising over five dozen "signals," or red flags, that indicate the package could be suspicious, such as the age of the package, the history associated with its author, and, more importantly, the code inside the package.

As soon as Release Integrity flags a package or a dependency as "suspicious," it enters a quarantine queue, where it is then manually reviewed by a member of the Sonatype security research team.
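To make that flow a little more concrete, here is a minimal sketch of signal-based triage. It is not Release Integrity's implementation: the three checks, their thresholds, the `PackageInfo` fields, and the function names are all hypothetical stand-ins for the signal categories mentioned above (package age, author history, and the code inside the package).

```python
from dataclasses import dataclass


@dataclass
class PackageInfo:
    """Hypothetical metadata for a newly published package."""
    name: str
    age_days: int                 # days since the package first appeared
    author_prior_packages: int    # packages the author published before
    has_install_script: bool      # stand-in for "code inside the package"


def evaluate_signals(pkg):
    """Return the names of the illustrative red flags a package raises."""
    flags = []
    if pkg.age_days < 7:                  # assumed threshold
        flags.append("new_package")
    if pkg.author_prior_packages == 0:    # author has no publishing history
        flags.append("author_without_history")
    if pkg.has_install_script:            # code runs at install time
        flags.append("runs_code_on_install")
    return flags


def triage(packages, threshold=2):
    """Split packages into a quarantine queue and a cleared list.

    A package is queued for manual review when it raises at least
    `threshold` flags (an assumed rule, chosen only for illustration).
    """
    quarantine, cleared = [], []
    for pkg in packages:
        flags = evaluate_signals(pkg)
        if len(flags) >= threshold:
            quarantine.append((pkg.name, flags))
        else:
            cleared.append(pkg.name)
    return quarantine, cleared
```

Calling `triage` on a brand-new package from a first-time author and on a long-established package would place only the former in the quarantine list, along with the flags a reviewer would want to look at first.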

Advanced Development Pack: Release Integrity

The good news is that while this manual analysis is in the works, users of Sonatype Repository Firewall are already protected, as any suspicious components are quarantined before they are pulled "downstream" into a developer's open source build environment.

Moreover, with the "Dependency Confusion Policy" feature configured, users get proactive protection from dependency confusion attacks, should conflicting package names exist in both a public repository and their private internal repos.

Users of Sonatype Nexus Repository can additionally download Sonatype's "dependency/namespace confusion checker" script from GitHub to check if they have artifacts with the same name between repositories, and to determine if they have been impacted by a dependency confusion attack in the past. Users should also take advantage of routing rules to dictate that internal dependencies get pulled from a trusted repository.
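The idea behind both checks can be sketched in a few lines. This is not Sonatype's checker script or the Nexus Repository routing rule engine: the function names, the case-insensitive comparison, and the `@acme/` scope in the usage note are assumptions made purely for illustration, and the sketch works on plain lists of names rather than querying any repository.

```python
import re


def find_name_conflicts(internal_names, public_names):
    """Return package names that appear in both repositories.

    Names are compared case-insensitively (an assumption), since a
    conflicting name is what a dependency confusion attack relies on.
    """
    internal = {name.lower() for name in internal_names}
    public = {name.lower() for name in public_names}
    return sorted(internal & public)


def should_block_from_public(package_name, internal_patterns):
    """Return True if a name is reserved for internal packages.

    `internal_patterns` is a hypothetical list of regular expressions
    describing names that should only ever come from a trusted
    internal repository.
    """
    return any(re.fullmatch(pattern, package_name)
               for pattern in internal_patterns)
```

For instance, `find_name_conflicts(["acme-utils", "acme-auth"], ["acme-auth", "lodash"])` would report `acme-auth` as a name worth investigating, and `should_block_from_public("@acme/billing", [r"@acme/.*"])` would indicate that the request should never be satisfied from a public source.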

Sonatype's 2020 State of the Software Supply Chain report states that next-generation upstream software supply chain attacks are far more sinister because bad actors are no longer waiting for public vulnerability disclosures. Instead, they are taking the initiative to contribute code to open source projects and then - unbeknownst to the other OSS project maintainers - injecting malicious code. Those code changes then make their way into open source projects that feed the software supply chains of developers around the world.

And this is happening at a rapidly increased rate. In fact, there was a 430% increase in upstream software supply chain attacks over the past year. Keeping this in mind, it is virtually impossible to manually chase and keep track of such components.

Sonatype's world-class security research data, combined with our automated malware detection technology, safeguards your developers, customers, and software supply chain from infections.

Tags: vulnerabilities, featured, News and Views, Employee Spotlights

Written by Ax Sharma

Ax is a security researcher and engineer at Sonatype who holds a passion for perpetual learning. His works and expert analyses have frequently been featured by leading media outlets. Ax's expertise lies in security vulnerability research, reverse engineering, and software development. In his spare time, he loves exploiting vulnerabilities ethically and educating a wide range of audiences.