Managing OSS Forges with Nexus


January 6, 2010 By Brian Fox

In addition to managing and maintaining the Maven Central repository, I also serve as the administrator for two very large forge repositories: repository.apache.org and nexus.codehaus.org. This post dives into the best practices that I’ve developed to maintain these very large instances. I will focus on the configuration of Nexus in this post, but if you’re interested in system-level details, those are documented here.

Both of these repositories have a few things in common that have driven the design:

  • there are many disparate projects deploying artifacts that require fine grained access control per project
  • release repositories are synced to central
  • they are the most commonly used snapshot repositories in the Maven ecosystem
  • the majority of users are anonymously reading the snapshots
  • they are transitional repositories that replace older static repositories

They also have a few things that are very different:

  • Apache is a Solaris Zone
  • Codehaus is an Ubuntu Jeos VM
  • Apache is using httpd for reverse-proxying and ssl
  • Codehaus is using Nginx for reverse-proxying and ssl

This post contains two sections: the first covers some system-wide Nexus configuration, and the second contains details about adding individual projects, along with security and staging configuration. If you are setting up a public Maven repository, this post might give you some ideas about configuration and administration issues that you’ll need to think about.

Nexus System Configuration

We want to protect all authenticated traffic, so both systems rewrite all http access to https (you can see how that’s done in the server setup linked above). However, since 99% of all traffic to these systems is anonymous, I’ve allowed the snapshot urls to poke through without being redirected.
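The redirect rule itself lives in the reverse proxy, not in Nexus. As a rough sketch of the decision it makes, assuming anonymous snapshot reads arrive under a /snapshots path prefix (the function name and the prefix constant are illustrative and are not taken from the actual httpd or Nginx configuration):

```python
# Hypothetical model of the proxy's redirect decision; the real rule is
# written in httpd or Nginx configuration, not in Python.
SNAPSHOT_PREFIX = "/snapshots"  # assumed path prefix for anonymous snapshot reads


def needs_https_redirect(scheme: str, path: str) -> bool:
    """Return True when a request should be redirected to https."""
    if scheme == "https":
        return False  # already secure, nothing to do
    # Plain-http snapshot reads are allowed through untouched.
    is_snapshot = path == SNAPSHOT_PREFIX or path.startswith(SNAPSHOT_PREFIX + "/")
    return not is_snapshot
```

Everything that could carry credentials is pushed to https, while the bulk of the anonymous traffic avoids the extra redirect round trip.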

Since I am using reverse proxies in front of Nexus, and the protocol doesn’t have a good way to tell Nexus what the inbound protocol was, I need to tell Nexus how to generate absolute urls that are used in the REST API. This is done by setting the following options in the server configuration pane:

0-baseurl
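To make the effect of that setting concrete, here is a minimal sketch of how a server behind a proxy might build the absolute links it returns, assuming the configured base URL is the public https address. The constant value and the helper name are assumptions for illustration, not Nexus internals:

```python
from urllib.parse import urljoin

# Assumed value of the base URL setting; behind the proxy the server cannot
# infer the public scheme or host from the request it receives.
BASE_URL = "https://repository.apache.org/"


def absolute_url(base_url: str, relative_path: str) -> str:
    """Build a public-facing link from the configured base URL."""
    # Normalise so urljoin appends to the base instead of replacing its last segment.
    base = base_url if base_url.endswith("/") else base_url + "/"
    return urljoin(base, relative_path.lstrip("/"))
```

Because every generated link starts from the configured value, clients keep talking to the proxy over https even though Nexus itself only ever sees the proxied request.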

The systems are configured with just two hosted repositories: releases and snapshots. Both systems are transitional, meaning that projects elect to convert at a convenient time. To support this, I proxy the old snapshot repository and aggregate it with the locally hosted snapshot repo. When you hit http://repository.apache.org/snapshots or http://nexus.codehaus.org/snapshots, you’re hitting this group and it appears as one repository. We also have a staging group that is used to aggregate all staging repos that haven’t yet been promoted. Here’s what the repository list looks like:

0-repositories
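Conceptually, the snapshots group behaves like an ordered lookup across its members. The sketch below models each repository as a simple path-to-content mapping, which is an assumption made purely for illustration; the repository contents and the function name are hypothetical:

```python
from typing import Dict, List, Optional

# Each repository is modelled as a plain path -> content mapping (an assumption
# for illustration; real repositories are backed by storage or a remote URL).
Repository = Dict[str, bytes]


def resolve_from_group(members: List[Repository], path: str) -> Optional[bytes]:
    """Return the first hit for path, checking group members in order."""
    for repo in members:
        if path in repo:
            return repo[path]
    return None  # no member has the path


# Hypothetical contents: a converted project and one still on the old repository.
hosted_snapshots = {"/org/new/app/1.0-SNAPSHOT/app.jar": b"new"}
legacy_proxy = {"/org/old/lib/2.0-SNAPSHOT/lib.jar": b"old"}
snapshots_group = [hosted_snapshots, legacy_proxy]
```

A consumer only ever configures the group URL, so a project can move from the old static repository to the hosted one without anyone downstream noticing.
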
One benefit of using Nexus in these forge setups is that we are able to configure rules that automatically check staged artifacts before they can be promoted. This includes things like validating that the PGP signature is present and signed with a publicly accessible key, looking for sources and javadocs, validating the POM, etc. This is one way we are helping to improve the data in Central, by helping to correct it right at the source. Since these rules are tied in to the Staging support, we want to disable the ability to deploy directly to the releases repository. This is done by setting the repository url to be read-only:

0-disable-redeploy
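To give a feel for what a staging rule checks, here is a simplified sketch that only looks for the presence of the expected companion files of a jar release. It is not how Nexus implements its rules: it assumes jar packaging and standard Maven file naming, and it does not verify signatures or look up keys. The function name is hypothetical:

```python
from typing import Iterable, List


def missing_companions(artifact_id: str, version: str, files: Iterable[str]) -> List[str]:
    """List the expected companion files that are absent from a staged jar release."""
    present = set(files)
    base = f"{artifact_id}-{version}"
    expected = [
        f"{base}.pom",
        f"{base}.jar",
        f"{base}-sources.jar",
        f"{base}-javadoc.jar",
    ]
    # Every deployed file is expected to carry a detached PGP signature (.asc).
    signatures = [name + ".asc" for name in expected]
    return [name for name in expected + signatures if name not in present]
```

An empty result means this particular check would pass; anything else tells the project what to fix before the staged repository can be promoted.
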
I also have configured the following jobs:

  • Configuration Backup – Backs up the Nexus Configuration files. I have it set to run daily and to keep 10 days of backups
  • Publish Indexes – This packages the internal real-time indexes into a format that is consumable by downstream Nexus instances and m2eclipse users (other tools consume this data as well). I have this set to run daily.
  • Purge Proxy Artifacts – Since we’re transitional and proxying the old, static snapshot repositories, I have configured a task to evict items that haven’t been requested in more than 10 days. This just reduces the disk consumption on the repo. If a file is re-requested later, it will be retrieved again from the proxy on demand.
  • Snapshot Cleanup – We want to enforce best practices and keep snapshots moving forward. The cleanup task is set to keep a maximum of 3 timestamped snapshots for each artifact for a minimum of 10 days. All snapshots for an artifact are also purged when a release is promoted.
  • Empty the trash – Delete operations in Nexus never actually delete anything; they just move files to a trash folder. This is a safety net in case you misconfigure a cleanup task, or simply make a mistake in the UI (like dropping a repo you meant to promote). We keep on top of the trash by scheduling this task to run once a week. New in 1.4 is the ability to purge things from the trash only after they have been deleted for x days. I’ve set this to hold things in the trash for at least 7 days. This gives projects more than enough time to detect any issues and recover the artifacts.

10-trash
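The trash retention setting boils down to a simple age check. A minimal sketch, assuming the trash is tracked as a mapping from path to deletion time (the data structure and function name are illustrative, not the Nexus implementation):

```python
from datetime import datetime, timedelta
from typing import Dict, List

TRASH_RETENTION = timedelta(days=7)  # matches the 7-day hold described above


def purgeable(trash: Dict[str, datetime], now: datetime,
              retention: timedelta = TRASH_RETENTION) -> List[str]:
    """Return trashed paths whose deletion time is older than the retention window."""
    return sorted(path for path, deleted_at in trash.items()
                  if now - deleted_at > retention)
```

With a weekly task and a 7-day hold, anything deleted by mistake stays recoverable for at least a week before the task is allowed to remove it.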

Project Specific Configuration

Each project needs to have access to only its own artifacts. Nexus supports two different ways to handle the security separation. If you want to read more about the two modes, read my previous post: “Optimal Nexus Repository Configuration”. We have chosen to manage the forge by partitioning a single pair of repositories. The next few steps show how this is done.

First, we define a new Repository Target for this project’s artifacts. Don’t worry if you’re not a regexp genius: wildcards are very easy, and we let you define multiple regexps so you don’t have to figure out more complicated and/or expressions. In the image below, I’ve created a new target called “org.codehaus.mojo” that will contain all artifacts in the paths /org/codehaus/mojo and below.

1-repotarget
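If you want to sanity-check a pattern before entering it, the matching can be approximated outside Nexus. The sketch below assumes a target is a list of regular expressions and that a path belongs to the target when any pattern matches the whole path; that full-path matching behaviour and the helper name are assumptions on my part:

```python
import re
from typing import List

# One pattern covering /org/codehaus/mojo and everything below it.
MOJO_TARGET: List[str] = [r".*/org/codehaus/mojo/.*"]


def in_target(patterns: List[str], path: str) -> bool:
    """True if the repository path matches any of the target's patterns."""
    return any(re.fullmatch(p, path) for p in patterns)
```

Note that the leading `.*` also matches paths that merely contain /org/codehaus/mojo/ further down, so keep patterns as specific as the project layout allows.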

Now that we’ve defined our “bucket” of artifacts, we need to create some permissions that are associated with it. You’re probably thinking “create permissions?” Yes: the Repository Target is a generic concept that lets you arbitrarily group artifacts, and notice that we haven’t yet associated the target with any repositories.

NOTE: The logic behind this approach is that you may want to grant people read access to all org.foo artifacts, but what if you only want them to see artifacts that have been promoted and not things that are still being staged?

In this case, we want to grant CRUD (Create / Read / Update / Delete) on SNAPSHOT artifacts, but only CRU on releases. (Only admins are allowed to delete releases; this prevents problems once things are synced to Central.) I will do this in two steps as shown below. First create a set of permissions that link the org.codehaus.mojo Repository Target to “All Repositories”:

2-allpriv

Then create a set of permissions that only apply to the hosted Snapshot repository:

3-snapshotpriv
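The net effect we are aiming for can be modelled as a set of (target, repository, action) triples, where create, read and update apply everywhere and delete applies only to snapshots. This is a simplified sketch of the outcome, not the Nexus data model; the tuple layout, the "*" wildcard and the function name are all illustrative:

```python
from typing import Set, Tuple

# (target, repository, action) triples; "*" stands for "All Repositories".
Privilege = Tuple[str, str, str]

MOJO_PRIVILEGES: Set[Privilege] = {
    ("org.codehaus.mojo", "*", "create"),
    ("org.codehaus.mojo", "*", "read"),
    ("org.codehaus.mojo", "*", "update"),
    ("org.codehaus.mojo", "snapshots", "delete"),
}


def is_allowed(privs: Set[Privilege], target: str, repo: str, action: str) -> bool:
    """Check an action against repository-specific and all-repository privileges."""
    return (target, repo, action) in privs or (target, "*", action) in privs
```

With this set, a developer can delete a stale snapshot but cannot delete anything from releases, which is exactly the asymmetry described above.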

Both the Apache and Codehaus forges use the Staging support in conjunction with Staging rules to validate the integrity of releases.

NOTE: Nexus staging is unique in that it’s entirely controlled from the server side, which means Admins can adjust as needed without changing the poms. It’s also designed so that all projects use a single url for deployment that is abstracted from the repository, which provides two benefits: 1) you can change the repository hosting artifacts without changing poms and 2) you can specify the distributionManagement url in just one place, reducing the errors. For example, at Apache we have a parent pom that contains all the logic a project needs to be staged.

We control this in Nexus by creating a Staging Profile. Fortunately the profile reuses the Repository Target we defined earlier:

4-staging-profile

Not shown are the settings that let you set the validation rules and who should be notified at each promotion step. Now that we have defined the staging profile, the system has automatically created a few new permissions that let you specify who is allowed to stage, drop and promote these artifacts. We want to grant these permissions to users, and this is done via roles.

The Codehaus repository is linked to their LDAP system. Nexus takes a unique approach that lets you easily grant access to dozens of users without having to configure each user in the system. We do this by allowing you to “map an external role” and then grant Nexus-specific permissions to any user that has this role. To do this, navigate to the Role pane and select the Add External Role Mapping as shown below:

5-external_role

This brings up a dialog where you can select the external realm (you could have multiple realms), and then you will see a list of all the roles known to the external system. Here I’m selecting “mojo-developers”.

6-role-mapping

This creates a new role in Nexus with an id matching the external system id “mojo-developers”. Any authenticated user that has this role in the external user account will automatically be granted these permissions.

mojo_-_role-config

I now select the roles and permissions I want to grant. Specifically, I grant the following:

  • Staging: Deployer (org.codehaus.mojo) - This is a role that was created when I set up the profile. It contains the basic set of permissions needed to allow a user to view, stage, close and drop org.codehaus.mojo staging repositories (only).
  • UI: Staging Repositories – This lets the user actually see the staging view, without this they couldn’t see what they staged.
  • org.codehaus.mojo – All – Here I’m granting Create, Read and Update for all matching artifacts (remember these are the permissions we created above that apply to all repositories)
  • org.codehaus.mojo – snapshots: Here I’m granting delete, but only for the org/codehaus/mojo artifacts in the Snapshot repository.
  • Staging: Profile org.codehaus.mojo – (promote): The Staging: Deployer xxx roles give the ability to stage, but not the ability to promote. This permission may be granted to managers, qa, or PMC etc as appropriate. Here we’re letting developers stage and promote their own artifacts.
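The external role mapping can be pictured as a lookup keyed by role id: whatever roles the external realm reports for a user are unioned into a set of grants. The sketch below is only a model of that idea; the dictionary, the abbreviated grant labels and the function name are illustrative and do not reflect how Nexus stores roles:

```python
from typing import Dict, Iterable, Set

# Nexus role id -> granted items; labels abbreviated from the list above.
ROLE_GRANTS: Dict[str, Set[str]] = {
    "mojo-developers": {
        "Staging: Deployer (org.codehaus.mojo)",
        "UI: Staging Repositories",
        "org.codehaus.mojo - All (create, read, update)",
        "org.codehaus.mojo - snapshots (delete)",
        "Staging: Profile org.codehaus.mojo (promote)",
    },
}


def effective_grants(external_roles: Iterable[str]) -> Set[str]:
    """Union the grants of every mapped role the external realm reports for a user."""
    grants: Set[str] = set()
    for role in external_roles:
        grants |= ROLE_GRANTS.get(role, set())  # unmapped roles grant nothing
    return grants
```

No per-user entry appears anywhere in this model, which is the point: membership is managed in LDAP and Nexus only needs to know what each role means.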

And we’re done. Notice that I didn’t need to go grant permissions to every user, and I didn’t have to put the users into groups. This is the power that Nexus provides in user management. This is core functionality that applies to any realm you may have, not just Nexus Professional ones.

Now, to illustrate some more power of this approach, what if org.codehaus.mojo later starts managing artifacts under org.mojo? I don’t have to redo everything here, I just extend the Repository Target “bucket” by adding “.*/org/mojo/.*” and instantly all the permissions, staging profiles, etc apply to the new groupId. This has definitely saved me many times at Apache with the Webservices Projects… they have more groupIds than I can count, but they all map back to the same external role. Each time a new one comes along, I just add it to the target and I’m done.

Automating Nexus Administration via REST

This is the process we’ve ironed out over several months. I definitely don’t click through this UI every time. Since Nexus has a full REST API (the UI is just a REST client built in JavaScript), we have developed a set of command line tools that take a few simple inputs, such as groupId, external role id and project name, and automate all of this configuration via REST calls. You can see those tools here and they make a great example of how to integrate Nexus into the DNA of your organization.
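To show how little input the process really needs, here is a sketch that turns the three inputs into an ordered plan of the objects described in this post. It deliberately makes no HTTP calls and does not name any REST endpoints or payload formats, since those are not covered here; the function name, the dictionary keys and the choice to name the staging profile after the project are assumptions for illustration, and this is not the code of the actual tools:

```python
from typing import Dict, List


def provisioning_plan(group_id: str, external_role: str,
                      project_name: str) -> List[Dict[str, str]]:
    """Describe, in order, the objects to create for one project (no HTTP calls made)."""
    # A groupId such as org.codehaus.mojo maps to the path /org/codehaus/mojo.
    pattern = ".*/" + group_id.replace(".", "/") + "/.*"
    return [
        {"step": "repository target", "name": group_id, "pattern": pattern},
        {"step": "privileges", "target": group_id, "repository": "All Repositories"},
        {"step": "privileges", "target": group_id, "repository": "snapshots"},
        {"step": "staging profile", "name": project_name, "target": group_id},
        {"step": "external role mapping", "role": external_role},
    ]
```

Calling `provisioning_plan("org.codehaus.mojo", "mojo-developers", "Mojo")` yields the same five steps walked through by hand above, each of which a real tool would translate into one or more REST requests.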