Why Putting Repositories in your POMs is a Bad Idea


February 25, 2009 By Brian Fox

I get this question frequently so it is time to write down my thoughts on the answer so I can stop repeating myself. Here’s the question:

Should I put the URLs to my repositories in my POMs or in my settings?

The short answer is: settings.

The long answer is: it depends.

There are two scenarios to consider here: Enterprise software (generally not published externally) and public software. Let’s take Enterprise software first.

Enterprise

In an Enterprise scenario, you will typically want to have a repository manager like Nexus sitting on your front lines. It will be proxying all external repositories and hosting your internal ones.

When Maven looks for an artifact, it walks down the list of repositories defined in the POMs and settings until it finds what it wants. If you define your repository manager as a repository in your POMs, Maven will still fall back to the built-in Central repository when artifacts are not found in the repository manager.

One possible solution to this is to overload the “central” repository id and point it at your repository manager. This can work, but you would want to do this in your corporate POM so you don’t need to define it in every project you have. Once you’ve overloaded “central”, you have a “Catch-22”: in a clean environment, you need to know where the repository is to find the parent POM, yet the parent POM is what tells you where the repository is. You will end up having to define this repository in your settings anyway just to bootstrap.
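As a rough sketch of what overloading “central” in a corporate parent POM might look like (the URL is a placeholder I have assumed, not a real server, and the fragment belongs inside the project element):

```xml
<!-- Fragment of a corporate parent pom.xml; the URL is a hypothetical placeholder. -->
<repositories>
  <repository>
    <!-- Reusing the id "central" overrides the built-in Central definition. -->
    <id>central</id>
    <url>https://repo.example.com/repository/maven-public/</url>
    <releases><enabled>true</enabled></releases>
    <snapshots><enabled>false</enabled></snapshots>
  </repository>
</repositories>
<pluginRepositories>
  <pluginRepository>
    <!-- Plugins are resolved separately, so the override is repeated here. -->
    <id>central</id>
    <url>https://repo.example.com/repository/maven-public/</url>
    <releases><enabled>true</enabled></releases>
    <snapshots><enabled>false</enabled></snapshots>
  </pluginRepository>
</pluginRepositories>
```

Note that a project inheriting from this parent still has to locate the parent POM first, which is exactly the bootstrap problem described above.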

Simply redefining Central has a few other problems. Specifically, it doesn’t redirect repositories that may be defined in other POMs you include transitively. This causes a few problems. First, it will slow down your builds as Maven goes out to all the external repositories looking for artifacts, possibly even your internal artifacts that would never be found there. Second, it means something may build for one developer but not another. Third, it means that as an organization you have no idea where your artifacts are coming from.

As an organization, you generally will want all your developers to use the same set of repositories for their builds and to make all requests via a controlled mechanism. This is best accomplished with a mirrorOf * entry in your settings to redirect all lookup requests to your repository manager. It is not possible to define a mirrorOf in a POM. See this section of the Nexus Book for examples of what a good setup will look like.
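A minimal sketch of such a settings.xml entry follows. The mirror id, name, and URL are placeholders I have assumed for illustration; substitute the address of your own repository manager.

```xml
<!-- ~/.m2/settings.xml; id, name, and url are hypothetical placeholders. -->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <mirrors>
    <mirror>
      <id>internal-repo-manager</id>
      <name>Internal repository manager</name>
      <url>https://repo.example.com/repository/maven-public/</url>
      <!-- "*" redirects requests for every repository, including ones
           declared in POMs pulled in transitively, to this mirror. -->
      <mirrorOf>*</mirrorOf>
    </mirror>
  </mirrors>
</settings>
```

With this in place, repository entries found in any POM are still read, but the requests themselves all go to the one mirror URL.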

With all these things considered, you can see that defining the repositories in your POMs doesn’t really help you much and generally just gets in the way.

This all presumes that your artifacts are not consumed externally. If they are, then you also fall into the next category. (The two are not mutually exclusive.)

Open Source Projects

If you are publishing your software and others will check it out and build it, then there are more points to consider. If all your dependencies are available in the Central repository, then you have nothing else to do. If, however, you have artifacts that may exist only in your repository (think snapshots of your parent POMs), or in other third-party repos, then developers will have a hard time building your source. Only in this case does it make sense to add repository entries to your POMs. However, there are side effects to this that should be noted:

  • The entries you have defined will be burned forever into your released POMs. This means that, should the URLs change down the road, anyone consuming your POMs will face these broken URLs and have to track down the new ones manually.
  • Each of these entries will end up in your consumers’ builds, since Maven needs to check them all for missing dependencies. When you introduce a repository, Maven currently isn’t able to tell which dependencies are supposed to be found there (and it gets worse when you consider transitivity).
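For reference, the kind of repository entry being discussed might look like the sketch below. The id and URL are made up for illustration. Limiting the entry to snapshots is one way to narrow what Maven looks for there, though the entry is still carried into released POMs as described above.

```xml
<!-- Fragment of a project pom.xml; id and url are hypothetical placeholders. -->
<repositories>
  <repository>
    <id>example-project-snapshots</id>
    <url>https://repo.example.org/snapshots/</url>
    <!-- Only snapshot artifacts are looked up in this repository. -->
    <releases><enabled>false</enabled></releases>
    <snapshots><enabled>true</enabled></snapshots>
  </repository>
</repositories>
```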

If the product of your build is a tool (like Nexus) rather than components used as dependencies, then adding the repository to the POM is fairly safe. In this case, it is unlikely others will depend on your artifacts directly in some other build, and the above concerns are nullified since presumably the new source would have the correct repository URLs.

If you are publishing jars that will be consumed by others, then you should think about getting them synced to the Central repository. This means that they will always be available to all users no matter what happens to your repository or URLs, and you won’t accidentally introduce new repositories to their builds. Central will eventually reject POMs with repository elements for this reason.

Summary

So to sum up, if you want to have repeatable builds and good control over your organization internally, use a repository manager and use a mirrorOf entry in everyone’s settings.xml to point at that URL.

If you are exposing your source and want to make it easy for others to build, then consider adding a repository entry to your POM. Don’t pick a URL lightly, though: think long-term and use a URL that will always be under your control. If your URL has to change down the road, make sure that you will always be able to track 404s and write the appropriate mod_rewrite rules to ensure that future builds will be able to find the appropriate artifacts.

PS. If you got here from the JFrog blog, then I suggest you read my response.