Nexus Tips and Tricks: Backup Nexus


June 15, 2011 By Juven Xu

Sonatype is looking through the archives and re-posting popular articles for those new to Sonatype tools.

The first blog in the series is by Sonatype software developer Juven Xu, on backing up Nexus.

Nexus is the industry-leading repository manager that helps reduce build times and increase your control of open source artifacts by managing the software artifacts required for development, deployment, and provisioning. Nexus greatly simplifies the maintenance of your own internal repositories and access to external repositories such as Maven Central. With Nexus you can completely control access to, and deployment of, every artifact in your organization from a single location.

If you are already using Nexus, this article will teach you how to back up your repository manager (if you don’t use Nexus, you can click here for more information).

Why Backup

To back up Nexus simply means to make a copy of your Nexus files for safekeeping. The copy should be stored on different hardware from the original. For example, you might want to copy your Nexus settings in the sonatype-work/nexus/conf folder to a removable hard disk.

It is important to back up your Nexus files because sometimes things fail. A hard disk might crash, files might be deleted by other programs, and we might even delete important files ourselves by mistake. With a backup, you are protected from these frustrating events.

What To Backup

Nexus consists of two parts. One is the runtime web application, which can always be downloaded again from the Sonatype site. The other is the sonatype-work/ folder, which contains all the user data and configuration, so that folder is our focus here. If you are running the Nexus bundle, this folder is next to the nexus-webapp directory. If you are using the Nexus war, this folder is under the user home directory by default.

You can simply back up the whole sonatype-work/ directory if you have plenty of disk space, but it’s not necessary. Some of the files in the directory are very important, or even irreproducible, while others can be easily regenerated. Here is a full list of sub-folders, from the most important to the least:

  1. nexus/storage (hosted repositories): When you use Nexus, you deploy lots of internal artifacts into these hosted repositories. Each repository has a corresponding folder in this directory. For example, a hosted repository with id ‘release’ has the folder ‘nexus/storage/release’. Because the artifacts stored in hosted repositories are unlikely to be available from any public repository, we must back them up.
  2. nexus/conf: All the Nexus configuration files are stored here, such as repository configuration, security configuration, and log configuration. Although you can recreate them in theory, doing so takes time and losing them would be a pain, so back up this folder.
  3. nexus/logs: All the Nexus logs are stored here. You might want to back them up so that you keep the history.
  4. nexus/timeline: Most of the important events, like authentication failures, scheduled tasks starting, and recently deployed artifacts, are recorded and shown via RSS feeds. These events are stored internally in this folder. You can back them up in case you need to trace something later.
  5. nexus/storage (proxy repositories): Once you’ve been using Nexus for a long time, a huge number of artifacts are cached from public repositories like Maven Central. Your Maven builds are faster thanks to these cached artifacts. Since they can be retrieved from public repositories at any time, there’s no need to back them up. But if you have adequate disk space, you can still consider backing them up, since this saves the time it would take to retrieve them again.
  6. nexus/trash: Should you back up the trash? Most of the time there is no need. When artifacts are deleted in Nexus, they are actually moved to the trash, so you can choose to back it up if you think you are likely to do stupid things very often. :)
  7. nexus/indexer: Nexus indices are stored in this folder. Since Nexus can rebuild indices using the reindex task, there is no need to back up this folder.
  8. nexus/proxy: Artifact attributes such as repository path, whether the artifact is readable, and last requested time are stored in this folder. Nexus can rebuild these attributes as well, so there is no need to back them up.
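
When deciding which of these sub-folders are worth the disk space, it can help to see how big each one is. The following is a minimal sketch rather than anything Nexus provides: the function name nexus_folder_sizes is made up for illustration, and it assumes the layout described above (a nexus/ directory inside sonatype-work).

```shell
# Hypothetical helper: list the size of each sub-folder of nexus/,
# in kilobytes, smallest first.
# $1 = path to the sonatype-work directory
nexus_folder_sizes() {
    du -sk "$1"/nexus/*/ 2>/dev/null | sort -n
}
```

For the setup used later in this article it would be called as nexus_folder_sizes ~/bin/sonatype-work.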

How To Backup

Now that you know which files to back up for your situation, you need to know how to back them up.

Take this example:

My Nexus is running on Ubuntu Linux, the path of the sonatype-work directory is /home/juven/bin/sonatype-work/, and I have a removable disk mounted at /media/disk/. I use rsync to make the backup:

$ rsync -a --delete -v ~/bin/sonatype-work /media/disk/

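This command is roughly equivalent to:

$ cp -a ~/bin/sonatype-work /media/disk/

except that rsync is more efficient when there are only a few differences, and with --delete it also removes files from the backup that no longer exist in the source.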

  • The -a option tells rsync to run in archive mode, which means it runs recursively and preserves file modes, ownership, symbolic links, and so on.
  • The --delete option tells rsync to delete files in the target directory if they were deleted in the source directory.
  • The -v option tells rsync to print a verbose log.
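
Because --delete removes files from the backup, it can be worth previewing a run before doing it for real. Here is a minimal sketch: rsync's --dry-run option (short form -n) lists what would be copied or deleted without changing anything. The function name preview_backup is only an illustration.

```shell
# Hypothetical helper: show what a backup run would change,
# without actually changing anything.
# $1 = source directory, $2 = destination directory
preview_backup() {
    rsync -a --delete -v --dry-run "$1" "$2"
}
```

With the paths from this example, the call would be preview_backup ~/bin/sonatype-work /media/disk/.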

So this command backs up the whole sonatype-work folder. What if I only want to back up some of the important sub-folders?

Suppose I only want to back up these folders: /sonatype-work/nexus/storage/releases/, /sonatype-work/nexus/storage/thirdparty/, /sonatype-work/nexus/conf/, /sonatype-work/nexus/logs/, and /sonatype-work/nexus/timeline/. Create an rsync includes list file like this:

+ /sonatype-work/nexus/storage/
+ /sonatype-work/nexus/storage/releases/
+ /sonatype-work/nexus/storage/thirdparty/
+ /sonatype-work/nexus/conf/
+ /sonatype-work/nexus/logs/
+ /sonatype-work/nexus/timeline/
- /sonatype-work/nexus/storage/*
- /sonatype-work/nexus/*

This tells rsync to include only what we want and exclude everything else. Now run rsync again with the includes list:

$ rsync -a --delete -v --include-from includes.list ~/bin/sonatype-work /media/disk/
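
If you prefer to create the includes list from the shell instead of an editor, a here-document is one way to do it. This is just a sketch: the function name write_includes_list is made up, and the rules are the same ones listed above.

```shell
# Hypothetical helper: write the includes list shown above
# to the file given as $1.
write_includes_list() {
    cat > "$1" <<'EOF'
+ /sonatype-work/nexus/storage/
+ /sonatype-work/nexus/storage/releases/
+ /sonatype-work/nexus/storage/thirdparty/
+ /sonatype-work/nexus/conf/
+ /sonatype-work/nexus/logs/
+ /sonatype-work/nexus/timeline/
- /sonatype-work/nexus/storage/*
- /sonatype-work/nexus/*
EOF
}

# Create includes.list in the current directory.
write_includes_list includes.list
```

The order of the rules matters because rsync uses the first rule that matches a path, which is why the folders to keep are listed before the two exclude rules.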

Automate The Backup

If your backup disk is always connected to your machine and you don’t want to repeat the backup command manually, schedule a task that runs the backup periodically.

On Linux, cron is a perfect tool for automating the backup task. In my case, with user name ‘juven’, I edit the file /etc/crontab and add a line like this:

0 0 * * 0 juven rsync -a --delete --include-from ~/bin/includes.list ~/bin/sonatype-work /media/disk/

This entry tells cron to run the rsync command as user juven once a week, at midnight on Sunday.
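
Since cron runs unattended, you may want the job to skip a run when the backup disk is not there, rather than fail or write somewhere unexpected. Here is a minimal sketch of such a guard. The function name run_backup is made up for illustration, and the check only tests that the destination is a writable directory, which is an assumption about how your disk gets mounted; where it is available, mountpoint -q is a stricter test.

```shell
# Hypothetical wrapper for the cron job: skip the run if the
# destination is not a writable directory.
# $1 = includes list, $2 = source directory, $3 = destination directory
run_backup() {
    if [ ! -d "$3" ] || [ ! -w "$3" ]; then
        echo "backup skipped: $3 is not a writable directory" >&2
        return 1
    fi
    rsync -a --delete --include-from "$1" "$2" "$3"
}
```

Placed in a small script that calls run_backup ~/bin/includes.list ~/bin/sonatype-work /media/disk/, it could take the place of the plain rsync command in the crontab entry.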

And don’t forget to restart the cron daemon:

$ sudo /etc/init.d/cron restart

And that’s it! It should automatically perform the backup at the time you specified.