Apple’s Swift software development ecosystem is undergoing a big change, restructuring how Swift code packages are created and distributed. This move echoes the same change in the Go ecosystem a few years earlier, and reveals an essential design trade-off at the heart of package management: do we even need packages at all?
To answer this, let’s look at a question that every major software ecosystem has to answer: how do we share and reuse software?
In the dark ages of open source, using third-party packages required fishing around on the internet for download sites. This could mean wasting time, using old versions, or borrowing from dead projects. Thankfully, this landscape has changed completely.
Today, developers share packages within the walls of their organizations through private repositories. Massive public repositories like Maven Central or the Python Package Index (PyPI) let developers share their open source software contributions with developers around the world.
What makes all this possible is the humble package manager, a type of utility that lets developers bundle up, describe, publish, and consume software declaratively. Package managers like Maven for Java or npm for JavaScript remove the pain of finding, downloading, and installing packages.
Every package gets a unique name (the so-called “package coordinates”), allowing developers to list which ones they need while the package manager automatically finds and downloads them. By freeing developers from worrying about where to get a package (and all its dependencies), we can now simply declare what we want and get on with the coding.
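To make the idea of declaring dependencies concrete, here is a minimal sketch of how a package manager might expand a declared list into everything that needs downloading. The `INDEX` dictionary, the `name:version` coordinate format, and the `resolve` function are all hypothetical stand-ins for illustration; real package managers also handle version ranges and conflicts, which this sketch ignores.

```python
from collections import deque

# Hypothetical in-memory index: coordinate -> direct dependencies.
# A real package manager would query a repository for this metadata.
INDEX = {
    "web-app:1.0.0": ["http-client:2.1.0", "json:1.4.2"],
    "http-client:2.1.0": ["json:1.4.2", "tls:3.0.1"],
    "json:1.4.2": [],
    "tls:3.0.1": [],
}


def resolve(declared):
    """Return every coordinate needed (direct and transitive), in discovery order."""
    seen = []
    queue = deque(declared)
    while queue:
        coordinate = queue.popleft()
        if coordinate in seen:
            continue  # already scheduled; shared dependencies are fetched once
        if coordinate not in INDEX:
            raise LookupError(f"unknown package: {coordinate}")
        seen.append(coordinate)
        queue.extend(INDEX[coordinate])
    return seen


# The developer declares one dependency; the resolver finds the rest.
print(resolve(["http-client:2.1.0"]))
```

The developer only names `http-client:2.1.0`; the two packages it depends on are discovered automatically.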
By caching third-party packages where they enter the organization, DevOps teams can provide their developers with fast, repeatable access to every dependency that an application requires.
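The caching behavior described above can be sketched as a pull-through proxy. This is an illustrative toy, not how any particular repository manager is implemented: `CachingProxy` and the `fetch_upstream` callable are hypothetical names, and the cache is a plain in-memory dictionary.

```python
class CachingProxy:
    """Toy pull-through cache: fetch from upstream once, then serve locally."""

    def __init__(self, fetch_upstream):
        # fetch_upstream is any callable that maps a coordinate to package bytes.
        self._fetch_upstream = fetch_upstream
        self._cache = {}
        self.upstream_calls = 0

    def get(self, coordinate):
        if coordinate not in self._cache:
            # Cache miss: go to the public repository exactly once.
            self.upstream_calls += 1
            self._cache[coordinate] = self._fetch_upstream(coordinate)
        return self._cache[coordinate]


# Stand-in for a download from a public repository.
proxy = CachingProxy(lambda coordinate: f"bytes-of-{coordinate}".encode())
first = proxy.get("json:1.4.2")
second = proxy.get("json:1.4.2")
print(proxy.upstream_calls)  # the second request never left the organization
```

Because the second request is answered from the local cache, builds stay repeatable even if the upstream source is slow or unavailable.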
In an age of rapidly increasing software supply chain attacks, it’s important to have a single point where open source dependencies enter the organization. That single entry point is where monitoring and automated policy enforcement can be applied.
Despite their similarities, no two package managers are the same. A healthy spirit of experimentation and innovation has produced a bevy of useful improvements.
For example, many package managers now have a built-in search mechanism (Maven being a notable exception). This makes it possible to find and download packages from the command-line interface (CLI) or standardized REST endpoints.
Meanwhile, Docker’s protocol for downloading and publishing container images includes built-in support for de-duplicating the layers shared between many images. Without this, repeated copies of common base images would swamp repositories with terabytes of redundant data.
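One common way to get this kind of de-duplication is content-addressed storage, where each layer is stored under the hash of its bytes. The sketch below illustrates that general technique under simplifying assumptions; `LayerStore` and its methods are hypothetical names and do not mirror the actual registry API.

```python
import hashlib


class LayerStore:
    """Toy content-addressed store: identical layers are kept only once."""

    def __init__(self):
        self._blobs = {}  # digest -> layer bytes

    def put(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        # setdefault leaves an existing blob untouched, so a repeated
        # upload of the same layer costs no additional storage.
        self._blobs.setdefault(digest, data)
        return digest

    def push_image(self, layers):
        """Store each layer and return the image as a list of digests."""
        return [self.put(layer) for layer in layers]

    def blob_count(self):
        return len(self._blobs)


store = LayerStore()
base = b"base-os-layer"
image_a = store.push_image([base, b"app-a-layer"])
image_b = store.push_image([base, b"app-b-layer"])
print(store.blob_count())  # four layers pushed, three blobs stored
```

Both images reference the same digest for the base layer, so the shared bytes exist in the store only once.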
However, one trend that has proven less promising is getting dependencies straight from the git source repositories where they are written.
Building a central package repository like Maven Central is an expensive and time-consuming proposition. The widespread availability of public source control systems like GitHub offers an appealing alternative.
After all, why not skip having a centralized repository completely?
Originally, Swift and Go both embraced this idea in the design of their package managers. Rather than requiring a central repository for package binaries, they could resolve package coordinates as tags in any public git source repository.
With no central repository to build and operate first, this is a tempting accelerator for a new ecosystem, where a critical mass of reusable packages is so important for adoption in the development mainstream.
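Resolving a version against git tags can be sketched roughly as follows. This is a simplified illustration, not the actual resolution logic of Swift or Go: the tag format (an optional `v` followed by three numbers), the `latest_compatible` function, and the rule of staying within one major version are all assumptions made for the example.

```python
import re

# Assumed tag format: optional "v" prefix, then MAJOR.MINOR.PATCH.
TAG_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag):
    """Return (major, minor, patch) as integers, or None for non-version tags."""
    match = TAG_PATTERN.match(tag)
    return tuple(int(part) for part in match.groups()) if match else None


def latest_compatible(tags, major):
    """Pick the highest version tag within the requested major version."""
    candidates = []
    for tag in tags:
        version = parse_tag(tag)
        if version is not None and version[0] == major:
            candidates.append((version, tag))
    if not candidates:
        return None
    # Tuples compare numerically, so 1.10.1 correctly outranks 1.9.3.
    return max(candidates)[1]


# Tags as they might be listed from a source repository.
tags = ["v1.2.0", "v1.10.1", "v2.0.0", "nightly", "v1.9.3"]
print(latest_compatible(tags, 1))  # v1.10.1
```

Note that nothing in this scheme prevents the repository owner from moving or deleting a tag later.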
But if that’s the case, why has the Swift community proposed an alternative approach?
In a mature, widely used ecosystem, new challenges that favor binary packages over git-based distribution become apparent.
Switching to immutable binary packages addresses these risks. Since they can’t be changed, they can be cached in a local repository manager for fast, repeatable builds that let developers work quickly. Teams are insulated from disruptions in the supply chain of dependencies because whatever versions are in use can be readily sourced from private caches or the centralized repository.
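As a rough illustration of what immutability means in practice, the sketch below models a repository that refuses to overwrite a published version and lets clients verify a checksum on download. `ImmutableRepository` is a hypothetical class written for this example, and SHA-256 is an assumed choice of checksum.

```python
import hashlib


class ImmutableRepository:
    """Toy repository: a published coordinate can never be replaced."""

    def __init__(self):
        self._artifacts = {}  # coordinate -> package bytes

    def publish(self, coordinate, data: bytes) -> str:
        if coordinate in self._artifacts:
            raise ValueError(f"{coordinate} is already published and cannot change")
        self._artifacts[coordinate] = data
        # The checksum is what consumers record and verify later.
        return hashlib.sha256(data).hexdigest()

    def fetch(self, coordinate, expected_sha256):
        data = self._artifacts[coordinate]
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            raise ValueError(f"checksum mismatch for {coordinate}")
        return data


repo = ImmutableRepository()
checksum = repo.publish("json:1.4.2", b"compiled-json-library")
package = repo.fetch("json:1.4.2", checksum)  # same bytes on every build
```

Because the bytes behind a coordinate never change, any cache holding them is as trustworthy as the original source.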
When every package is uniquely identified, reliable software composition analysis becomes possible. Teams can easily comply with requirements for a software bill of materials, and when advisories or recalls are necessary, identifying which application versions are affected is a simple process.
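With unique identifiers, finding the applications affected by an advisory reduces to a lookup. The following sketch assumes each application version has a recorded set of component coordinates; the `SBOMS` data and the `affected_by` function are invented for illustration and do not follow any specific bill-of-materials format.

```python
# Hypothetical bills of materials: application version -> component coordinates.
SBOMS = {
    "billing-service:4.2.0": {"json:1.4.2", "tls:3.0.1"},
    "billing-service:4.3.0": {"json:1.5.0", "tls:3.0.1"},
    "web-portal:2.0.0": {"json:1.4.2"},
}


def affected_by(advisory_coordinate, sboms):
    """Return, sorted, every application version that ships the given component."""
    return sorted(
        application
        for application, components in sboms.items()
        if advisory_coordinate in components
    )


# An advisory is issued against one exact package version.
print(affected_by("json:1.4.2", SBOMS))
# ['billing-service:4.2.0', 'web-portal:2.0.0']
```

The newer `billing-service:4.3.0` is correctly excluded because it ships a different version of the component.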
Finally, malicious open source attacks on developers and development infrastructure can be stopped at the repository, before they even enter the software development life cycle.
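Stopping a package at the repository can be pictured as a screening step that runs before anything is served to developers. This is a deliberately simple sketch: the `BLOCKED` set, the example coordinates, and the `screen` function are hypothetical, and real policy engines evaluate far richer rules than a blocklist.

```python
# Hypothetical blocklist of coordinates flagged as malicious.
BLOCKED = {"left-padd:1.0.0"}


def screen(requested, blocked):
    """Split requested coordinates into (admitted, rejected) lists."""
    admitted = [c for c in requested if c not in blocked]
    rejected = [c for c in requested if c in blocked]
    return admitted, rejected


admitted, rejected = screen(["json:1.4.2", "left-padd:1.0.0"], BLOCKED)
print(admitted)  # ['json:1.4.2']
print(rejected)  # ['left-padd:1.0.0']
```

Since every request passes through the same gate, a flagged package is rejected for all teams at once.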
Using binary packages for software distribution has important benefits for a mature software ecosystem, across all stages of the software development life cycle. As more ecosystems take root and grow, I hope we’ll see even more bold innovations while fully utilizing the hard-won lessons from existing package managers.
* * *
Want to talk shop about package management, using DevOps at scale, or anything open source? Come join us over on the Sonatype Community.
Sonatype runs anywhere: self-hosted, in the cloud, or air-gapped. Sonatype's cloud offerings are hosted on AWS, where they can also be found.