Sonatype finds malicious npm packages which broadcast your IP, username, and device fingerprint info on the web

September 30, 2020 By Ax Sharma


Sonatype researchers discovered and confirmed the presence of two new malicious npm packages. The discovery was initially made by Sonatype's malicious code detection bots. By applying machine learning and artificial intelligence to identify suspicious code commits, update signals, and developer patterns, the bots continuously assess changes across millions of open source software component releases. Following alerts from the bots, our security research team verified the presence of malicious code in the two npm packages and traced the intended exploit path.

The two packages are:

  • electorn
  • loadyaml

The two packages, which represent next-generation software supply chain attacks, rely on typosquatting: an attack that impersonates legitimate packages and makes the impostors available for unsuspecting developers to download. Typosquatting packages count on a developer or user making a minor typographical error that tricks them into installing the malicious package instead of the one they intended to download. For example, the developer requests the "electron" package but unintentionally spells it "electorn."
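
To illustrate how close such a name is to the real one, here is a minimal sketch that measures edit distance (counting a swap of two adjacent letters as a single edit) between a requested name and a short allowlist. The `KNOWN_PACKAGES` list, the `findLookalikes` helper, and the distance threshold of 2 are assumptions made for this example; this is not how Sonatype's bots work.

```javascript
// Optimal string alignment distance: Levenshtein edits plus
// swaps of two adjacent characters, each counted as one edit.
function editDistance(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d = Array.from({ length: rows }, () => new Array(cols).fill(0));
  for (let i = 0; i < rows; i++) d[i][0] = i;
  for (let j = 0; j < cols; j++) d[0][j] = j;
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1, // deletion
        d[i][j - 1] + 1, // insertion
        d[i - 1][j - 1] + cost // substitution
      );
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1); // adjacent swap
      }
    }
  }
  return d[a.length][b.length];
}

// Hypothetical allowlist of names a team actually intends to install.
const KNOWN_PACKAGES = ['electron', 'lodash'];

// Return known names that are close to, but not the same as, the request.
function findLookalikes(name, known = KNOWN_PACKAGES, maxDistance = 2) {
  return known.filter((k) => k !== name && editDistance(name, k) <= maxDistance);
}

console.log(editDistance('electorn', 'electron')); // one adjacent swap
console.log(findLookalikes('electorn'));
console.log(findLookalikes('lodashs'));
```

A check like this could run before an install command is accepted, prompting the developer when the requested name is a near miss for a package the team already uses.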

Once installed, the packages discovered by Sonatype collect the user's IP address, geolocation data, and device fingerprinting information, and publish this data to a public GitHub page.

Last year, Sonatype unveiled its next-generation malicious code detection bots being built into our Sonatype Intelligence products, to enable detection of malicious releases of open source components, known as "counterfeit components," and blocking their use within modern software factories.

Our release integrity monitoring efforts have constantly evolved since then and continue to provide us with top-notch security intelligence which protects our customers and their software supply chains.

Diving Deep into `electorn` and `loadyaml`

Multiple packages identified by Sonatype's malicious code detection bots include `electorn`, `loadyaml`, `lodashs`, and `loadyml`. Let's take a deep dive into each of these.

It is worth noting that all four packages share the same author, "simplelive12."
At the time of our research, two of these packages (`electorn`, `loadyaml`) were still available for download on npm, while the other two had been unpublished by the author.

  • sonatype-2020-0784: `electorn`  (downloads to date: 255)

    The package misspells the legitimate electron package. There's only one version (10.0.0) of the package available to download.

    This version comprises three files: the manifest (package.json), index.js, and update.js. The "index.js" file is a mere placeholder with innocuous skeleton code.

A quick glance at the manifest (package.json) reveals 3 interesting findings.
On Line 4, "electorn" touts itself as a component which is an electron wrapper offering some kind of auto-update functionality, in vague terms.

Line 8 has explicit instructions to launch the "update.js" script in the background (notice the "&" in the shell command) as soon as the user attempts to install the package.

Because "preinstall" scripts are executed before the installation begins, this finding indicates the malicious component's author was relying on a user mistyping "npm install electron" as "npm install electorn."

image1

Further, line 13 pulls in the legitimate "electron" package as a dependency, giving the false impression that "electorn" really is a wrapper package. However, "update.js" is not concerned with the legitimate "electron" at all and doesn't use the dependency anywhere.
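
The manifest pattern described above (an install-time script that launches a file in the background) can be checked for mechanically. Below is a minimal sketch, assuming a parsed package.json object; `auditManifest` and `exampleManifest` are hypothetical names, and the example manifest is invented for illustration rather than copied from the real package.

```javascript
// Lifecycle hooks that npm runs automatically as part of `npm install`.
const INSTALL_HOOKS = ['preinstall', 'install', 'postinstall'];

// Report every install-time script in a parsed manifest, and note
// whether the command ends with a single "&" (shell background job).
function auditManifest(manifest) {
  const findings = [];
  const scripts = manifest.scripts || {};
  for (const hook of INSTALL_HOOKS) {
    const command = scripts[hook];
    if (typeof command !== 'string') continue;
    findings.push({
      hook,
      command,
      // Matches a trailing "&" but not a trailing "&&".
      backgrounded: /(^|[^&])&\s*$/.test(command.trim()),
    });
  }
  return findings;
}

// Hypothetical manifest, loosely modeled on the description above.
const exampleManifest = {
  name: 'example-wrapper',
  version: '1.0.0',
  scripts: { preinstall: 'node update.js &', test: 'node index.js' },
  dependencies: { electron: '*' },
};

console.log(auditManifest(exampleManifest));
```

An install hook is not malicious by itself, since many legitimate packages compile native code this way, so a finding here is only a prompt to read the referenced script.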

Collects and exposes sensitive information

The "update.js" file contains minified NodeJS code packed on a single line. When unfolded, the code reveals functions which "fingerprint" the device it is running on - by collecting the logged in user's username, home directory path, and CPU model information.

image2

The "fetchIPInfo" function further gathers the user's IP address and looks up the corresponding city and country of the IP.

The essence of the malicious behavior lies in the "update()" function which is called every hour.

The function uploads all of this collected information to a public page on GitHub. But nowhere in the file are any obvious URLs present.

The malware disguises API endpoints and URLs as base64 encoded strings.

For example, on line 87, the "fetchIPInfo" function is fed the base64-encoded API endpoint https://ifconfig.co/json, which returns the IP address and geolocation data as a JSON response. (Note: ifconfig.co is an unrelated web service with legitimate use cases, used for IP lookups.)

image3

The asynchronous function "comment" on line 90 is what will post all this collected data via the GitHub API ("api.github.com") to a public-facing page.

The exact address of the endpoint is revealed by line 93: L3JlcG9zL2g... decodes to: /repos/<repo name>/<path>/issues/4/comments
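
Strings hidden this way are easy to surface once you know to look for them. The following is a minimal sketch of a scanner that decodes base64-looking string literals and keeps those that turn out to be URLs or absolute paths. `findHiddenEndpoints`, the 16-character minimum, and the sample input are assumptions for this example; the sample line is constructed here and is not taken from the malware.

```javascript
// Decode a base64 string to UTF-8 text using Node's built-in Buffer.
function decodeBase64(text) {
  return Buffer.from(text, 'base64').toString('utf8');
}

// Find quoted string literals that look like base64 and that decode
// to printable text starting with "http://", "https://", or "/".
function findHiddenEndpoints(source) {
  const literal = /["'`]([A-Za-z0-9+/]{16,}={0,2})["'`]/g;
  const hits = [];
  for (const match of source.matchAll(literal)) {
    const decoded = decodeBase64(match[1]);
    if (/^(https?:\/\/|\/)[\x21-\x7e]+$/.test(decoded)) {
      hits.push({ encoded: match[1], decoded });
    }
  }
  return hits;
}

// Build a demo line of "minified" code containing an encoded endpoint.
const encoded = Buffer.from('https://ifconfig.co/json').toString('base64');
const sample = `fetchIPInfo("${encoded}");`;

console.log(findHiddenEndpoints(sample));
```

Running a scan like this over a package's files before installation would reveal both the IP lookup endpoint and any API path of the kind described above.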

On taking a closer look at the GitHub page where comments were being posted, we observed that each comment comprised the IP address, city, country, and an "fp" (fingerprint) field, all visible to the public.

The "fp" field contains the username of the logged in user, their home directory, followed by their CPU as explained above. For example, when decoded this would look like:

johnsmith/Users/johnsmithIntel(R)Core(TM)i5-XXXXXCPU@2.30GHz
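
A string of this shape can be reproduced with nothing more than Node's standard `os` module, which is why no special privileges are needed to collect it. The sketch below is an assumption-laden reconstruction: the concatenation order and the whitespace stripping are inferred from the example above, and `buildFingerprint` and `localInputs` are hypothetical names, not the malware's own code.

```javascript
const os = require('os');

// Join the three values and strip whitespace (inferred from the
// example string above, not confirmed against the original code).
function buildFingerprint({ username, homedir, cpuModel }) {
  return `${username}${homedir}${cpuModel}`.replace(/\s+/g, '');
}

// Where the same three values come from on a real machine.
// Defined for reference only; the demo below uses fixed inputs.
function localInputs() {
  const cpus = os.cpus();
  return {
    username: os.userInfo().username,
    homedir: os.homedir(),
    cpuModel: cpus.length > 0 ? cpus[0].model : '',
  };
}

const example = buildFingerprint({
  username: 'johnsmith',
  homedir: '/Users/johnsmith',
  cpuModel: 'Intel(R) Core(TM) i5-XXXXX CPU @ 2.30GHz',
});

// The published "fp" value is base64, so decoding recovers the string.
const encodedFp = Buffer.from(example).toString('base64');
const decodedFp = Buffer.from(encodedFp, 'base64').toString('utf8');

console.log(example);
console.log(decodedFp === example);
```

Because base64 is an encoding rather than encryption, anyone viewing the public page could recover these values in the same way.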

image4
Image: The package collects IP address and device fingerprinting information such as username, home directory, and the CPU model, and publishes this data (in base64 format) on a public GitHub page.

Another key observation: while the GitHub issue page reported over 800 "comments" (including duplicates, since the package broadcasts the collected information every hour), comments older than 24 hours were deleted.

image5

It is not entirely clear how this data is being processed or why it is removed from the public page every 24 hours.

sonatype-2020-0784 has been assigned to the malicious package `electorn`.

  • sonatype-2020-0781: `loadyaml`  (downloads to date: 48)

    Another package `loadyaml` published by the same author is nearly identical to `electorn` in its functionality with one particular exception: it publishes the user data every 30 minutes as opposed to every hour.
    The author "simplelive12" had also previously published the identical typosquatting packages `lodashs` and `loadyml` on npm, then removed them before they could be detected or flagged by anyone. sonatype-2020-0781 covers `loadyaml`, `lodashs` and `loadyml`.

Sonatype release integrity protects your software supply chain

All of the malicious packages are accounted for in Sonatype's data as sonatype-2020-0781 and sonatype-2020-0784 (formerly, sonatype-2020-0735).

The four packages were published on npm roughly a month ago and have had a little over 400 downloads combined, to date:
  • electorn: 255
  • lodashs: 78
  • loadyaml: 48
  • loadyml: 37

Our malicious code detection bots had picked up these packages within a day of their release on npm downloads and flagged them as suspicious, which kept our customers and their software supply chains protected from the start. As the timeline below shows, the confirmation of maliciousness occurred more recently, leading to this disclosure.

This month, having completed a Deep Dive into all four packages, we are releasing our findings.

Timeline:

August 17-24, 2020: Malicious package `electorn` is published to npm on the 17th, followed by the other packages, two of which (`loadyml`, `lodashs`) are removed by the author shortly afterward.

August 18, 2020: Sonatype automated malware detection systems pick up the suspicious packages. The components are added to our Fast-Track data.

August 25, 2020: GitHub issue page is opened to public and begins publishing collected user data as "comments."

September 30, 2020: Sonatype security research team performs a thorough Deep Dive analysis on `electorn` and the other identical packages. We report our findings to GitHub and npm, and simultaneously make them publicly available. We are disclosing publicly because sensitive information of users who inadvertently downloaded these packages is already being exposed on the web and the malicious packages continue to exist on npm; the standard vulnerability disclosure timelines therefore do not apply in this case.

As explained above, because these malicious packages were flagged as suspicious shortly after being published, our customers remained safe. The Sonatype Security Research team specializes in world-class vulnerability research. Malware research is a newer offering, and we are constantly improving our processes for researching malicious components. As a result, although the packages were automatically identified and added to our data as suspicious components early on, it was not until today that a thorough Deep Dive was completed, confirming the malicious behavior, and that our findings are being made public. Going forward, this gap between identifying a malicious package and completing a Deep Dive will be reduced significantly.

October 1, 2020: Following Sonatype's report, npm removes both packages (`electorn` and `loadyaml`) from its registry. GitHub removes the publicly visible issue page broadcasting user data.

Growth of next generation software supply chain attacks

As we've shared in our 2020 State of the Software Supply Chain report, these types of next-generation software supply chain "attacks" are far more sinister because bad actors are no longer waiting for public vulnerability disclosures. Instead, they are taking the initiative and actively injecting malicious code into open source projects that feed the global supply chain.

By shifting their focus "upstream," as with the open-source malware in "electorn," bad actors can infect a single component, which is then distributed "downstream" (as has probably happened in this case) through legitimate software workflows and update mechanisms.

Our 2020 report also shows that this is happening at a rapidly increased rate. In fact, there was a 430% increase in next-generation software supply chain attacks over the past year. Keeping this in mind, it is virtually impossible to manually chase and keep track of such components.

Sonatype's world-class security research data, combined with our automated malware detection technology, safeguards your developers, customers, and software supply chain from infections like these.

Remediation

DevOps-native organizations with the ability to continuously deploy software releases have an automation advantage that allows them to stay one step ahead of malicious intent. Sonatype Nexus Repository customers were notified of these malicious packages within hours of the discovery, and their development teams automatically received instructions on how to remediate the risk.
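
For a quick manual check of a single project, the four package names reported in this post can be matched against a manifest's declared dependencies. This is a minimal sketch and not a Sonatype tool: `findFlaggedDependencies` and the example manifest are hypothetical, and the check only sees direct dependencies, so transitive ones would require inspecting the lockfile as well.

```javascript
// The four package names reported in this post.
const FLAGGED = new Set(['electorn', 'loadyaml', 'lodashs', 'loadyml']);

// Sections of package.json that can declare dependencies.
const SECTIONS = [
  'dependencies',
  'devDependencies',
  'optionalDependencies',
  'peerDependencies',
];

// Return every flagged name declared directly in a parsed manifest.
function findFlaggedDependencies(manifest) {
  const hits = [];
  for (const section of SECTIONS) {
    for (const name of Object.keys(manifest[section] || {})) {
      if (FLAGGED.has(name)) hits.push({ section, name });
    }
  }
  return hits;
}

// Hypothetical project manifest containing one mistyped dependency.
const projectManifest = {
  dependencies: { electorn: '^10.0.0', express: '^4.17.1' },
  devDependencies: { lodash: '^4.17.20' },
};

console.log(findFlaggedDependencies(projectManifest));
```

In a real project the manifest would come from `JSON.parse(fs.readFileSync('package.json', 'utf8'))`, and any hit would warrant removing the package and treating the machine's exposed details as public.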

If you're not a Sonatype customer and want to find out if your code is vulnerable, you can use Sonatype's free Sonatype Vulnerability Scanner to find out quickly.

Visit the Sonatype Intelligence Insights page for a deep dive into other vulnerabilities like this one or subscribe to automatically receive Sonatype Intelligence Insights hot off the press.


Written by Ax Sharma

Ax is a Security Researcher at Sonatype and Engineer who holds a passion for perpetual learning. His works and expert analyses have frequently been featured by leading media outlets. Ax's expertise lies in security vulnerability research, reverse engineering, and software development. In his spare time, he loves exploiting vulnerabilities ethically and educating a wide range of audiences.