Maven 3.x: Paving the Desire Lines - Part One

November 04, 2009 By Jason van Zyl


Maven 3.0

With Maven 3.x we have made our best attempt to listen to users, find out what they need and want, and make reasonable preparations and plans to fix the problems, implement useful features, and create the integration points for working properly with third-party systems like Nexus and Hudson. While there have been many new feature requests -- and we have many new features done or in the works -- the number one concern is backward compatibility. Maven 3.x makes an ardent attempt to be 100% backward-compatible with Maven 2.x. All your Maven 2.x builds should work out of the box using 3.x without change to your projects. We simply can't cause pain to the existing user base. We are very cognizant that even small changes can have a huge impact, and so we have tried to achieve complete Maven 2.x fidelity with Maven 3.x.

In Maven 3.x we have reduced the number of modules, we have simplified and extracted the artifact resolution system, we have put a performance framework in place, and we have created an enormous set of integration tests for Maven's core and for many of the core plugins. We are constantly trying to validate that core features and core plugins are working properly. We hope that the simplifications we have made will encourage more people to get involved in the project. We have done some heavy overhauling of many of the internals to make way for future features and I believe Maven 3.x is really the only path forward for Maven. What technique did we use to plan and work toward the completion of Maven 3.x? A very practical approach to determining an optimal path called desire lines.

Desire Lines

A desire line usually represents the shortest or most easily navigated route between an origin and destination. The width and amount of erosion of the line represents the amount of demand. The term was coined by Gaston Bachelard in his book The Poetics of Space. Desire lines can usually be found as shortcuts where constructed pathways take a circuitous route. A concrete example: the pathways around the Berkeley campus in California. During the early years of Berkeley no pathways were paved. Instead they let inhabitants walk in optimal paths between the buildings and locations over a period of time to form clear pathways over the grass. Once these pathways had been established they could be paved to make them more permanent. This is very similar to what happened with Maven 2.x. Consider Maven 2.x the pathways marked in the grass. Consider Maven 3.x as taking all that learning from Maven 2.x, adopting the optimal forms of use, and codifying them, i.e. paving the desire lines.

Over the course of hundreds of thousands of people using Maven we have a very good idea of what these optimal pathways look like now. Here are a few examples of some of the things we've seen and the things we have to do in order to make these lasting improvements.

Responding to invitations for improvement

DesireLine1.png

There are many things in Maven 2.x that work reasonably well but have some minor flaws which cause irritation:

  • Dependency management. How to effectively manage what we will call a target platform (intentionally using the Eclipse nomenclature). It is easy to manage the versions of an application's runtime or its target platform. But what happens when you start pulling in multiple, separately developed target platforms?
  • Version specifications and version ranges. Maven's notion of this is workable, but for version specifications and ranges it's time to just defer to OSGi. What OSGi specifies and what Maven specifies are not wildly different, but they are different enough to cause some problems. We'll be working on aligning Maven with the version specification of OSGi.
  • The way plugins interact with embedded environments was just wrong. We allowed for no support for incremental changes. With incremental build support in Maven 3.x, the result is that when you change an individual resource in M2Eclipse, that individual resource can be processed efficiently without firing up all of Maven to run the whole build lifecycle. Sorry, but you won't have time to get a coffee anymore.
  • There are a slew more, but we'll save those for other blog entries.
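
To make the version range point above concrete, here is a minimal sketch of how a range is declared in a Maven 2.x POM today. The com.example coordinates are hypothetical and chosen only for illustration; the range notation itself is existing Maven syntax, not the aligned OSGi form discussed above.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates, for illustration only -->
  <groupId>com.example</groupId>
  <artifactId>range-demo</artifactId>
  <version>1.0-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>example-lib</artifactId>
      <!-- Square bracket = inclusive bound, parenthesis = exclusive bound:
           any version from 1.0 up to, but not including, 2.0 -->
      <version>[1.0,2.0)</version>
    </dependency>
  </dependencies>
</project>
```

One of the differences that can cause trouble: a bare version such as 1.0 in a Maven dependency is only a recommended version, whereas a bare version in an OSGi import is read as "this version or anything later".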

Responding to ... being completely wrong

DesireLine2.jpg

There are some things in Maven 2.x we just didn't get right and we have to make reparations:

  • Composition versus inheritance in the POM. There are some great debugged toolchains like we have in the Apache Organization POM for releasing, but this toolchain is not easily consumable by outside projects because it makes no sense to inherit from the Apache Organization POM for your projects. So how can we make these chunks of debugged combinations of plugins and their associated configuration reusable? We will introduce mixins to help with this. Essentially a mixin will be a POM consisting of plugins and configurations that can be externally parameterized. These mixins will be deployed to a repository and be referenced with a standard coordinate. Basically it will be an intelligent import with validation which will allow composition in your POMs.
  • Checking out the whole source tree versus an individual module and parent element versioning. We have always intentionally made projects specify versions in the parent elements. When you check out individual modules this is necessary. But for projects where the whole source tree is checked out, the version of the parent can be inferred and the requirement for specifying the version element in the parent can be removed.
  • Clean separation of operations that need to happen at the beginning and end of the lifecycle versus in the lifecycle itself. To make the OSGi integration we created work, we needed to get at the projects before the build lifecycle was executed, so we added a clear hook before lifecycle execution. What we have done to make OSGi integration work in Maven 3.x is simply not possible in Maven 2.x. We also had a terrible problem with reporting and aggregation because there was no clear point at which the lifecycle was over. Now once the build lifecycle is complete there is a clear hook to get the projects in the reactor so they can be processed easily. Accurate aggregation is now possible.
  • Again, there are a slew more, but we'll also save those for other blog entries.
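
As an illustration of the parent versioning point above, this is roughly what a module POM has to declare in Maven 2.x. It is a sketch only; the com.example coordinates are hypothetical.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <!-- Hypothetical coordinates, for illustration only -->
    <groupId>com.example</groupId>
    <artifactId>example-parent</artifactId>
    <!-- Required in Maven 2.x even when the parent POM is sitting
         one directory up in the same checkout -->
    <version>1.0-SNAPSHOT</version>
    <!-- Default location of the parent POM; shown here for clarity -->
    <relativePath>../pom.xml</relativePath>
  </parent>

  <!-- groupId and version are inherited from the parent -->
  <artifactId>example-module</artifactId>
</project>
```

When the whole source tree is checked out, the parent POM found at the relative path already states its own version, which is why the version element in the parent block becomes a candidate for inference.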

Responding to under-utilization of what exists

Unfortunately there are still a lot of powerful features and plugins in Maven that a lot of people don't know about. We have tried to remedy this situation with the four books that we constantly update, a stream of blog posts, and the soon-to-be-launched enterprise Maven users list, which will focus on the holistic and systematic use of Maven, M2Eclipse, Nexus and Hudson. We need to do a better job explaining end-to-end best practices for developing, testing, and provisioning software. We also need to do a better job describing some of the incredibly useful plugins we have like the enforcer plugin and the dependency plugin.

But again, our best attempt to show people what is available is through the four books that we have.

Here is a picture I always use to illustrate the point that even though something may be sitting directly in front of you, it's not always immediately obvious unless someone points it out to you. This is very much the case when users ask us for help with particular use cases, and frequently it's very easy to answer the question by pointing to a URL. My example is the FedEx logo, which was designed so that an arrow is formed between the last two letters of the logo. Quite ingenious, but most people I point this out to haven't noticed it.

Fedex1.png

Fedex2.png

We are going to try a lot harder to make sure the value that exists is easier to find. I don't enjoy reading blog posts from unhappy users at all, but I especially don't like it when I know there is a solution which has been documented somewhere. Those painful situations can be reduced if not eliminated completely.
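
As a small example of the kind of existing value mentioned above, here is a sketch of an enforcer plugin configuration that fails the build when it is run with an unexpected Maven or Java version. The execution id and the version bounds are arbitrary choices made for illustration, and the plugin version is deliberately left out rather than guessed; a real build should pin one.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <!-- Plugin version intentionally omitted in this sketch -->
      <executions>
        <execution>
          <!-- Arbitrary id chosen for this example -->
          <id>enforce-versions</id>
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <!-- Require Maven 2.0.9 or later -->
              <requireMavenVersion>
                <version>[2.0.9,)</version>
              </requireMavenVersion>
              <!-- Require Java 1.5 or later -->
              <requireJavaVersion>
                <version>[1.5,)</version>
              </requireJavaVersion>
            </rules>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

On the dependency plugin side, running mvn dependency:tree prints the resolved dependency tree of a project, which answers a good share of the "where did this jar come from" questions we see.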

Backward Compatibility

I can't stress enough how important backward compatibility is to us. We have done a non-trivial amount of work to pave the way for new capabilities in Maven 3.x while preserving 100% backward compatibility for:

  • Maven itself and the behaviour that users expect from the CLI
  • Artifact Resolution API
  • Plugin API
  • Plugin configuration
  • Site generation (even though we've extracted site/reporting entirely into the maven-site-plugin)

We must ensure that plugins and reports written against the Maven 2.0.x APIs remain viable in 3.0. We don't want people rewriting their plugins or having to change their projects at all or it will simply be chaos. We simply have too many users and requiring changes will have an enormous human cost so we've been slow in announcing Maven 3.x. The version of Maven 3.x you can pull from the Sonatype grid is probably the best version of Maven that has existed but we are still being careful about releasing it. If you want to try it you can find it here.

We must also ensure that POMs of version 4.0.0 are supported in 3.x along with the behavior currently experienced. We will need to make changes to the POM in order to introduce many of the new features so we've made the requisite preparation. We are relying heavily on our integration tests right now, but as we move forward the work that Benjamin is doing on the project/model-builder will help us to accommodate different versions of a POM, and any different formats we decide to support. The model-builder code has no limitation with respect to formats. We can support XML, or any source that anyone can dream up. These implementations may find use outside of Maven. For example someone might build something with Maven, JRuby, and Mercury to create a JRuby-based system. The same could be done for Groovy. We already have some examples of this with the Polyglot Maven project we've started, which will be the topic of a subsequent post.

We have managed to keep almost everything backward compatible and we will go so far as to provide an isolated execution environment that can contain older versions of the plugin API so that everything from the past can continue to work in Maven 3.x. We will not truly be able to do this until we change the internal Maven runtime over to OSGi but we know what needs to be done.

Integration Testing

An enormous amount of time and energy has gone into improving the integration tests for Maven. The integration tests are the gatekeeper and let us determine that we have a Maven 3.x that is compatible with Maven 2.x. We now have 506 integration tests that will help protect us as we move forward. We will likely approach 600 integration tests by the time we reach the Maven 3.0 GA.

All core Maven plugins now have ITs and we have started branching out into the Mojo project to create plugin ITs there as well. We have a good pattern so we are encouraging anyone writing a Maven plugin to create ITs so that we can help ensure we don’t break people in the future. We have really taken fixing the problems we see seriously. I couldn't help but make a little spoof of the SNL skit and adapt it to the Maven perspective.

In the next post I will start talking about some of the technical changes we have made to Maven 3.x in order to achieve our goals. Those will include:

  • Refactored model/project builder (the support Polyglot Maven uses)
  • Queryable lifecycle
  • Lifecycle Extension points
  • Error and integrity reporting
  • Mercury: Jetty Client & SAT4J
  • Embedding
  • Incremental build support (used heavily in M2Eclipse for performance improvements, these are changes to the plugin API)

I plan to try and write something about Maven 3.x frequently in preparation for the Maven 3.x talks in Stockholm, Dusseldorf, Cologne, and Devoxx.

Stay tuned!

Written by Jason van Zyl

Jason is a co-founder and the former CTO of Sonatype.