PyTorch namespace (dependency) confusion attack

January 04, 2023 By Ilkka Turunen

The holiday season has been rough for software supply chains in recent years. December 2021 famously saw the disclosure of Log4Shell, and in late December 2022 a new incident hit PyTorch, a popular machine learning framework.

Disclosed by PyTorch themselves and first reported by BleepingComputer, this attack relies on a supply chain tactic known as namespace confusion or dependency confusion. We have previously elaborated on this type of attack, and our recent State of the Software Supply Chain report identified it as one of the fastest growing forms of supply chain attack.

Who is affected?

As per the PyTorch disclosure, this attack targeted users of the PyTorch-nightly build as opposed to the regular releases. Users of the stable release are not affected. The nightly build downloads its dependent packages from a private Python registry, as opposed to the official PyPI package index from which the official releases get their dependencies.

How does it work?

One dependency used in the nightly build is a package called torchtriton. The attackers went on the official pypi.org registry and registered the package name there with a high version number. Many open source registries do not have namespace protection, meaning that anyone can register any package name for themselves.

During dependency resolution, pip does not give a private or alternative index priority over pypi.org. When both are configured, a package on pypi.org with the same name and a higher version number generally wins, which is what the attacker used to their advantage.
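
The effect can be sketched with a toy model. It assumes the resolver simply pools candidates from every configured index and prefers the highest version; it is a simplification, not pip's actual resolver. The index labels and version numbers below are illustrative assumptions, not values taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: str
    index: str  # label of the registry that offered this candidate


def version_key(version: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '3.0.0' into a sortable tuple."""
    return tuple(int(part) for part in version.split("."))


def pick_candidate(candidates: list[Candidate]) -> Candidate:
    """Pick the highest version, ignoring which index it came from."""
    return max(candidates, key=lambda c: version_key(c.version))


# Hypothetical candidates for the same package name on two indexes.
candidates = [
    Candidate("torchtriton", "2.0.0", "private-nightly-index"),  # legitimate
    Candidate("torchtriton", "3.0.0", "pypi.org"),  # attacker-registered
]
chosen = pick_candidate(candidates)
print(f"{chosen.name} {chosen.version} from {chosen.index}")
```

Because nothing in the selection step looks at where a candidate came from, whoever publishes the highest version number under the shared name wins.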

The malicious package includes a binary file that is executed when the triton package is imported. The payload reads various files, including SSH keys and the contents of up to 1,000 files in the $HOME directory, gathers a whole host of other information about the system, and exfiltrates all of it to a command and control server.

A complete list of exfiltrated files can be found in the official disclosure.

How do I find out if I’m affected by the PyTorch-nightly incident?

Any machine affected by this issue should be considered compromised, and any credentials stored on it should be rotated. To find out whether you’re affected, you need to identify every machine on which PyTorch-nightly has been installed.
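
As a first triage step, the following minimal sketch lists whether a distribution named torchtriton is installed in the current Python environment. It uses only the standard library. The list of names to flag is an assumption. A hit does not by itself prove compromise, because the legitimate nightly dependency used the same name, and a miss does not clear other environments on the same machine. Treat the official disclosure as the authoritative check.

```python
from importlib import metadata

# Assumption: distribution names worth flagging for manual review.
SUSPECT_NAMES = ("torchtriton",)


def find_installed(names=SUSPECT_NAMES):
    """Return {name: version} for each listed distribution that is installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment; nothing to report.
            continue
    return found


hits = find_installed()
if hits:
    for name, version in hits.items():
        print(f"Review needed: {name} {version} is installed")
else:
    print("No flagged distributions found in this environment")
```

Run it once per virtual environment or container image, since each has its own set of installed packages.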

Sonatype has added the issue to our database as sonatype-2023-0001. Any known affected components discovered during continuous monitoring will trigger automatic alerts.

PyTorch have taken steps to rename the torchtriton package to pytorch-triton and have now reserved the package name on pypi.org to prevent similar incidents in the future.

How can I protect my company from dependency confusion attacks?

Protecting against this type of attack can be tricky. Users of Sonatype Platform can take advantage of the namespace confusion protection built into Sonatype Repository Firewall with Sonatype Nexus Repository. It prevents protected packages from being downloaded from external sources, so proprietary packages do not fall victim to the same strategy. Read our documentation.

Generally, this strategy is used as a targeted method to infiltrate malicious code into a specific organization. Using namespace confusion as a mass attack against the users of a popular package, rather than as a targeted infiltration of a single organization, is a novel way of leveraging this type of attack that we have not seen before.

As a maintainer of internal packages, especially when parts of your code are published to the wider world, you should as a best practice reserve your package names on public registries, even if you do not intend to publish the packages themselves.
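
One way to find names that still need reserving is to ask the public registry whether each internal name is already taken. The sketch below assumes the public PyPI JSON endpoint at https://pypi.org/pypi/<name>/json answers with HTTP 404 for a project that does not exist. The package names are hypothetical, and a 404 only suggests a name is unclaimed; it does not guarantee that the registry will let you register it.

```python
import urllib.error
import urllib.request


def pypi_status(name: str) -> int:
    """Return the HTTP status of the public PyPI JSON endpoint for a name."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # A missing project surfaces as an HTTPError carrying the 404 code.
        return err.code


def unreserved_names(internal_names, fetch_status=pypi_status):
    """Return the internal names that appear to be unregistered publicly."""
    return [name for name in internal_names if fetch_status(name) == 404]


# Offline demonstration with a stubbed lookup; the names are hypothetical.
fake_registry = {"acme-billing": 200, "acme-internal-utils": 404}
print(unreserved_names(fake_registry, fetch_status=fake_registry.get))
```

Calling unreserved_names with only a list of names uses the real network lookup. Any name it returns is a candidate for a placeholder registration, and any internal name that is already taken by someone else deserves a closer look.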

The problem stems from architectural choices in upstream registries. As custodians of Maven Central, we here at Sonatype will continue to advocate for strong namespace and package name protection in upstream registries, including as part of our work at the OpenSSF.

Protecting against supply chain attacks is a part of managing dependencies

There is no doubt 2023 will bring new types of supply chain attacks to the forefront. We all continue to rely on the goodness that open dependency ecosystems bring us. Being aware of the risks and managing your software supply chain in a thoughtful, architected way helps keep developers from taking on unnecessary risk, and helps your organization react in a swift, practiced manner when a new incident like the PyTorch-nightly attack occurs.

Having the right tools in place will help you deal with modern risks without losing agility, and we are here to help our customers and the world alike tackle this challenge. Talk to us.

Written by Ilkka Turunen

Ilkka serves as Field CTO at Sonatype. He is a software engineer with a knack for rapid web-development and cloud computing and with technical experience on multiple levels of the XaaS cake. Ilkka is interested in anything and everything, always striving to learn any relevant skills that help towards building Sonatype for success.