Why Do I Need a Binary Repository Manager?

January 30, 2020 By Ember DeBoer


This is an excerpt from Out of the Wild: A Beginner's Guide to Package and Dependency Management, a Sonatype Guide. This is the final installment. (Read part one and part two.)

So, why do I need a Binary Repository Manager?

Binary repository managers serve a couple of important functions as part of a modern software development lifecycle.

First, they can serve as a local copy, or “proxy,” repository for the language-specific package repositories/registries we discussed earlier. Creating these proxy repositories in a repository manager to store and cache your OSS components locally—rather than downloading them directly from an online repository every time you kick off a build—can provide some of the following benefits, as stated in our own Repository Management Basics course:

  • Increasing build performance due to a wider distribution of software and locally available parts.
  • Reducing network bandwidth and dependency on remote repositories.
  • Insulating your company from internet outages, outages of public repositories (Maven Central, npm, etc.), or even the removal of an open source component.
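To make the caching behavior behind these benefits concrete, here is a minimal sketch in Python. It is not how Nexus Repository is implemented; `FakeRegistry`, `ProxyRepository`, and the `org.example:widget:1.0.0` coordinate are hypothetical names made up for illustration, and the "registry" is an in-memory stand-in rather than a real network call.

```python
from typing import Dict


class FakeRegistry:
    """Stand-in for a public registry such as Maven Central or npm (hypothetical)."""

    def __init__(self) -> None:
        self.available = True
        self.requests_served = 0

    def download(self, coordinate: str) -> bytes:
        if not self.available:
            raise ConnectionError("upstream registry is unreachable")
        self.requests_served += 1
        return f"contents of {coordinate}".encode()


class ProxyRepository:
    """Toy proxy repository that caches every component it fetches."""

    def __init__(self, upstream: FakeRegistry) -> None:
        self._upstream = upstream
        self._cache: Dict[str, bytes] = {}

    def get(self, coordinate: str) -> bytes:
        # Serve from the local cache when possible; go upstream only on a miss.
        if coordinate not in self._cache:
            self._cache[coordinate] = self._upstream.download(coordinate)
        return self._cache[coordinate]


registry = FakeRegistry()
proxy = ProxyRepository(registry)

first = proxy.get("org.example:widget:1.0.0")   # cache miss: fetched from upstream
second = proxy.get("org.example:widget:1.0.0")  # cache hit: no upstream request

registry.available = False                      # simulate a registry outage
during_outage = proxy.get("org.example:widget:1.0.0")  # still served locally
```

Only the first request reaches the upstream registry, which is where the bandwidth and build-time savings come from. Note that a component that was never cached would still fail to download during the outage, so the insulation only covers what your builds have already pulled through the proxy.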

In addition, repository managers serve as a “single source of truth” for the binaries used in your build processes.

At this stage, you may be asking yourself, “Why can’t I just store my binaries where I store my source code?” The short answer is that you can. But you probably won’t want to after you understand more about how version or source control tools like Git differ from binary repository managers…

I use a Version/Source Control Management repository to store my source code. Why do I need a Repository Manager for my binaries?

As DZone’s Refcard on Using Repository Managers concisely states, “Repository Managers are to binaries what source repositories or VCS (Version Control Systems) are to sources.”

Authors Brian Fox and Carlos Sanchez go on to explain that binary files are much larger and need a lot of metadata stored with them, such as package name, version, and license. They also don’t need to be diffed or cloned in the way that source code does.

Because of these differences, an artifact repository makes a lot more sense for storing binaries, whether they’re the outputs of your build (.zip, .jar, .war, etc.), packages downloaded from an online registry, Docker images, etc.

This thread on StackOverflow also provides some clarification on how the two tools differ:

“In everyday use, you’d store your source code and its history in a git repository, and store your build artifacts (e.g. the compiled software you want to deliver) in Nexus.”

Put more succinctly: “You manage what you code in Git, and what you build in Nexus.”

So while proxy repositories are the best method to store open source packages downloaded from online registries as we mentioned earlier, hosted repositories can serve as a means to store your internal build artifacts, including snapshots and releases.
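As a rough illustration of how hosted repositories separate snapshots from releases, the Python sketch below routes a build artifact by its version string. It relies on the Maven convention that in-progress versions end in `-SNAPSHOT`. The `HostedRepository` class, the repository names, and the redeploy policy shown here are assumptions for the example, not a description of any particular product's configuration.

```python
from typing import Dict


class HostedRepository:
    """Toy hosted repository holding internal build artifacts (hypothetical)."""

    def __init__(self, name: str, allow_redeploy: bool) -> None:
        self.name = name
        self.allow_redeploy = allow_redeploy
        self.artifacts: Dict[str, bytes] = {}

    def deploy(self, coordinate: str, data: bytes) -> None:
        # Assumed policy: releases are immutable, snapshots can be overwritten.
        if coordinate in self.artifacts and not self.allow_redeploy:
            raise ValueError(f"{coordinate} already exists in {self.name}")
        self.artifacts[coordinate] = data


snapshots = HostedRepository("snapshots", allow_redeploy=True)
releases = HostedRepository("releases", allow_redeploy=False)


def repository_for(version: str) -> HostedRepository:
    # Maven convention: in-progress builds carry a -SNAPSHOT version suffix.
    return snapshots if version.endswith("-SNAPSHOT") else releases


def publish(name: str, version: str, data: bytes) -> str:
    repo = repository_for(version)
    repo.deploy(f"{name}:{version}", data)
    return repo.name


publish("widget", "1.1.0-SNAPSHOT", b"build 1")
publish("widget", "1.1.0-SNAPSHOT", b"build 2")  # snapshot is overwritten
publish("widget", "1.0.0", b"release build")     # stored once in releases
```

Keeping released versions immutable in this sketch means that anything built against `widget:1.0.0` keeps getting the same bytes, while snapshot consumers knowingly get the most recent build.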

Lastly, repository managers reduce risk in your build process. We alluded earlier to the risks you open yourself up to when specifying the “latest” version of a particular dependency, or even a version range, in your application-level package manager’s manifest. Downloading unvetted versions directly from online registries presents more risk because bad actors are increasingly poisoning the well, injecting malicious code into libraries or removing them altogether.
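The following Python sketch shows why a version range can change what your build pulls in without any change on your side. It uses a deliberately simplified caret-style rule (same major version, at least the base version) and ignores special cases such as 0.x versions and pre-release tags; `resolve_caret_range` and the version numbers are hypothetical and are not taken from any package manager's actual resolver.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn 'major.minor.patch' into a tuple that compares numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def resolve_caret_range(base: str, published: List[str]) -> str:
    """Pick the highest published version with the same major version as base.

    Simplified: raises ValueError if no published version satisfies the range.
    """
    base_version = parse(base)
    candidates = [
        v for v in published
        if parse(v)[0] == base_version[0] and parse(v) >= base_version
    ]
    return max(candidates, key=parse)


published = ["1.2.0", "1.2.1"]
pinned = "1.2.1"                                     # exact version in the manifest
before = resolve_caret_range("1.2.0", published)     # resolves to "1.2.1"

published.append("1.3.0")                            # a new, unvetted release appears
after = resolve_caret_range("1.2.0", published)      # now resolves to "1.3.0"
```

The pinned dependency stays at the version you vetted, while the range silently moves to the newly published release on the next build. If that release is malicious or broken, a range picks it up automatically.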

As Mykel Alvis explained in his Nexus User Conference presentation, the ability to insulate yourself from outages or vulnerabilities that may occur in such cases is made possible by use of a caching repository manager.

Putting it all together

Looking at the diagram below, you can see how the application-level package managers (invoked at the developer and CI circles) and their corresponding registries/repositories (top left), source control management systems (bottom) and a binary repository manager (top/right) all work together as part of a modern software development process. Continuous integration can also easily be added to the mix to further your organization’s DevOps goals.

Package Managers in a DevOps Pipeline

Image credit: https://www.sonatype.com/product-nexus-repository

Further Learning

Repository Management Basics (Course) This course is designed to provide new customers with the first steps towards optimizing their Nexus Repository Manager configuration. Specifically, it provides critical, high-level theory, best practice, and practical application related to understanding specific concepts and terminology related to Nexus Repository Manager.

Nexus Repository Manager - Proxying Maven and npm Quick Start (Guide) If you’re new to repository management with Nexus Repository Manager 3, use this guide to get familiar with configuring the application as a dedicated proxy server for Maven and npm builds. To reach that goal, follow each section to:

  • Install Nexus Repository Manager 3
  • Run the repository manager locally
  • Proxy a basic Maven and npm build

Go Dependencies in Nexus Repository (Guide) This guide will give you fundamentals on dependency management with Go modules. Modules were added to the Go ecosystem to give you built-in versioning and dependency management. Now you and your fellow developers can adapt Go software development to Nexus Repository. Use this guide to get an understanding of the Go toolset, version control, and environment configuration.

Thank you for reading. Find more resources like this in the Sonatype Community, a place where you can ask questions of other Nexus users and the Sonatype team. Choose from an assortment of learning paths, developed by a team of experts, that help make using Nexus even easier.

Written by Ember DeBoer

Ember is Senior Tech Content Developer at Sonatype.