Over the past ten years, demands for faster and better software have reshaped the way we do our jobs. Software used to be written. Today it’s assembled from components, and built and delivered continuously.
Join Jason van Zyl for 30 minutes on Thursday, June 28 at 2:00 PM EDT (GMT-0400), when he will discuss the trends that will shape the next phase of software development. Jason will share insights on the future of Apache Maven-based development and the emerging trend towards component lifecycle management.
I’m surprised by how few engineers know his work. Rube Goldberg was a cartoonist who lived from 1883 to 1970, famous for drawing cartoons of ridiculous and inconceivably complex machines. His work was important during a time in which the world was becoming increasingly mechanized and automated, providing a sort of cultural “steam vent”: a way for people to poke fun at machines and industry. I’d embed his work here, but none of it is public domain, so see for yourself or search Google Images. (Be warned, you can spend hours looking at these cartoons.)
I learned about Rube Goldberg from an Engineering professor who, at the time, said, “Rube Goldberg is the most important thing you’ll learn over the next four years”. Back then, we all thought he was joking, but it turns out that he wasn’t. In fact, I wish more people, especially “build engineers”, had some exposure to these cartoons. If they had, they’d take a step back and realize that there has to be a better way.
Over the course of the past few years, I’ve interacted with hundreds of people when talking about build tools and repository management. It continues to surprise me how many people don’t realize where these artifacts come from. When you run a build and these JARs just show up alongside all of their dependencies, it’s like magic to most people. If you know how it works, it’s very obvious to you that running a repository manager is the right thing to do. This post is a reminder to everyone using build tools that rely on Central: take time to proxy Central with a repository manager.
“Wait, that’s how Central works?”
There’s something so automatic about dependency management in Maven that it often takes people a few months to understand exactly where those JAR files are coming from.
In an 8-hour Maven class, I get to dependencies in the third hour. After I describe Central, what it was like before Central, how metadata is stored in a repository alongside binaries, transitive dependencies, and so on, it all falls into place. People realize that this simple thing they’ve grown accustomed to is only easy because of a ten-year effort to refine the model, the creation of a support structure for source forges at places like Oracle and Google, and a constant investment in infrastructure.
On the one hand, it’s a great success that Central is, for the most part, an invisible utility that supports developers. On the other hand, it’s the kind of thing that people can start to take for granted very easily.
For example, a few months ago I spoke to someone who worked in an environment disconnected from the internet for security reasons. This individual was talking about how limiting it was to have to download JARs from open source projects manually and assemble them in a project. His words were: “It’s like programming Java in 2001 again.”
How can you help?
Imagine millions of developers spread all over the world: different time zones, different applications, but they all hit the same service: Central. Some regions have more developers than others, so we certainly see peaks in usage throughout the day, but in general, Central is serving thousands of files throughout the world at any given time during the day.
Maybe someone just installed Maven for the first time, or maybe they blew away a local repository. With numbers like these, we see a world that has a constant appetite for artifacts. It isn’t a problem for Central, and I’m not writing this because Central is falling down on the job. Central can handle it, but it certainly isn’t the most efficient way to support millions of developers. It isn’t a good use of network bandwidth, and it isn’t a good use of energy to constantly cart around the same static JARs over and over and over again when the solution is so easy.
If everyone who used a build tool that interacted with Central adopted a repository manager such as Nexus, we’d have a faster, more responsive system. Central’s maintainers would be focused less on addressing the occasional runaway build and could spend more time and resources on increasing the availability and functionality of this essential service.
The other factor playing into this is that Maven builds only download releases once. It isn’t like these build tools are repeatedly returning to Central to download release artifacts over and over again.
Well… actually… that isn’t true. We’ve seen some installations of Hudson configured to delete a local repository before every build, placing a high load on Central. Imagine a build that downloads 50 MB of dependencies running once every 5 minutes. That’s one build consuming ~14 GB a day, never mind the time wasted downloading static artifacts.
While these broken builds are the exception, they do still show up from time to time. Central can handle the load, but imagine 1000 of these broken builds running continuously and you can see the challenge.
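The figures above are easy to check with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 50 MB and 5-minute values come from the example in the text, while the function name and the use of decimal units (1 GB = 1000 MB) are assumptions made for illustration.

```python
# Back-of-the-envelope check of the bandwidth example in the text.
MB_PER_BUILD = 50            # dependencies downloaded per build (from the text)
MINUTES_BETWEEN_BUILDS = 5   # one build every 5 minutes (from the text)


def daily_download_gb(mb_per_build, minutes_between_builds, builds=1):
    """Return GB downloaded per day, assuming decimal units (1 GB = 1000 MB)."""
    runs_per_day = (24 * 60) // minutes_between_builds
    return builds * runs_per_day * mb_per_build / 1000


one_build = daily_download_gb(MB_PER_BUILD, MINUTES_BETWEEN_BUILDS)
thousand_builds = daily_download_gb(MB_PER_BUILD, MINUTES_BETWEEN_BUILDS, builds=1000)

print(f"one broken build:   {one_build:.1f} GB/day")
print(f"1000 broken builds: {thousand_builds / 1000:.1f} TB/day")
```

Under these assumptions a single misconfigured build runs 288 times a day and pulls roughly 14.4 GB, and a thousand of them would pull roughly 14.4 TB a day of artifacts that never change.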
A Simple Reminder: Please Proxy Central
We’re constantly watching the performance of the system and making sure it stays up and running for an entire world of developers. If you use a build tool that hits Central, whether it is Buildr or Maven or Gradle or Ivy, you can help us by running a Nexus instance.
Even if all of your builds work perfectly, running a local Nexus instance helps preserve Central as a public, free resource, and it will lead to faster, more responsive builds.
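To make the effect of a proxy concrete, here is a toy model of the idea. It is not how Nexus is implemented: the class names, the `simulate` helper, and the team size of 20 are all hypothetical, and it assumes every developer starts with an empty local repository and requests the same three artifacts.

```python
class Upstream:
    """Stands in for Central and counts how many requests reach it."""

    def __init__(self):
        self.requests = 0

    def fetch(self, coordinate):
        self.requests += 1
        return f"bytes-of-{coordinate}"


class CachingProxy:
    """Toy repository manager: asks upstream once per artifact, then serves from cache."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}

    def fetch(self, coordinate):
        if coordinate not in self.cache:
            self.cache[coordinate] = self.upstream.fetch(coordinate)
        return self.cache[coordinate]


def simulate(developers, artifacts, use_proxy):
    """Return the number of requests that reach the upstream repository."""
    central = Upstream()
    source = CachingProxy(central) if use_proxy else central
    for _ in range(developers):
        for coordinate in artifacts:
            source.fetch(coordinate)
    return central.requests


# Illustrative coordinates only; any set of shared dependencies would do.
artifacts = ["junit:junit:4.10", "commons-lang:commons-lang:2.6", "log4j:log4j:1.2.16"]

direct = simulate(developers=20, artifacts=artifacts, use_proxy=False)
proxied = simulate(developers=20, artifacts=artifacts, use_proxy=True)
print(f"requests reaching Central without a proxy: {direct}")
print(f"requests reaching Central with a proxy:    {proxied}")
```

In this simplified model the upstream load without a proxy grows with the number of developers, while with a shared proxy it depends only on the number of distinct artifacts the team uses.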
Sonatype is going through the archives and digging up articles that we think would be useful to developers using our tools. If you use Maven, keep reading the post below from Sonatype Vice President of Engineering Brian Fox on Maven best practices and how-tos.
We have a handful of Maven best practices and how-tos documented in the blogs. Over time they get buried by newer posts, but the content is still just as relevant. Learn more about Maven by checking out the following blogs:
Some questions about syncing Maven repositories between two sites were recently asked on GetSatisfaction.com.
“We will be moving data centers and want to setup another Maven2 repo that is managed with Nexus OSS. We want both repositories to be online and read\writeable until we migrate all our environments to the new site. I have a few questions:
What is the best method for copying the repo to the new location?
What is the best method for keeping the two repos in sync? We want to minimize network bandwidth usage.”
The video below answers these questions, and offers multiple solutions: