<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Sonatype Blog &#187; builds</title>
	<atom:link href="http://blog.sonatype.com/people/tag/builds/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.sonatype.com/people</link>
	<description>Sonatype is transforming software development with tools, information and services that enable organizations to build better software, faster, using open-source components.</description>
	<lastBuildDate>Tue, 18 Jun 2013 15:30:05 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Benefits of a Repository Manager: Part III Continuous Build Deployment</title>
		<link>http://blog.sonatype.com/people/2010/08/benefits-of-a-repository-manager-part-iii-continuous-build-deployment/</link>
		<comments>http://blog.sonatype.com/people/2010/08/benefits-of-a-repository-manager-part-iii-continuous-build-deployment/#comments</comments>
		<pubDate>Fri, 06 Aug 2010 13:46:30 +0000</pubDate>
		<dc:creator>Tim O'Brien</dc:creator>
				<category><![CDATA[Maven]]></category>
		<category><![CDATA[Nexus]]></category>
		<category><![CDATA[ant]]></category>
		<category><![CDATA[builds]]></category>
		<category><![CDATA[continuous]]></category>
		<category><![CDATA[Hudson]]></category>
		<category><![CDATA[repository]]></category>
		<category><![CDATA[Tutorial]]></category>

		<guid isPermaLink="false">http://www.sonatype.com/people/?p=5914</guid>
		<description><![CDATA[In the previous post in this series I discussed three compelling ways in which a repository manager can benefit the development cycle. It proxies artifacts locally, it is optimized to store binary artifacts, and it facilitates a new level of collaboration and agility that isn&#8217;t possible when your SCM is the only way for workgroups to [...]]]></description>
				<content:encoded><![CDATA[<p>In the previous post in this series I discussed three compelling ways in which a repository manager can benefit the development cycle. It proxies artifacts locally, it is optimized to store binary artifacts, and it facilitates a new level of collaboration and agility that isn&#8217;t possible when your SCM is the only way for workgroups to collaborate. In this post, I&#8217;m going to talk about how a repository manager works in concert with a continuous integration server like Hudson or Bamboo.</p>

<p><span id="more-5914"></span></p>

<p><a href="http://www.sonatype.com/people/wp-content/uploads/2010/08/ci-builds.png"><img class="aligncenter size-full wp-image-5915" title="ci-builds" src="http://www.sonatype.com/people/wp-content/uploads/2010/08/ci-builds.png" alt="" width="337" height="147" /></a></p>

<p>First, the how, what, and when of a continuous integration server. Continuous integration (CI) servers are an established fact of modern development infrastructure. A CI server, for the most part, waits and watches. It keeps a vigilant eye on your source control system and jumps into action every time it sees a code change. When code changes, your CI system is usually configured to run the entire build, execute all of your unit and integration tests, and send out an email to every developer if it identifies a defect or a failed test.</p>

<p>It does this so that you will have an easier time identifying where a particular problem was introduced to the source code.   If John checks in some bad code, the CI system runs the build immediately, and about 30 minutes later, everyone in the group receives an email with the subject header &#8220;John just broke the build&#8221;.   It is a great way to identify errors, and it is also a great way to motivate developers to test locally before committing to a source control system as no one likes to be the reason for a build failure email.</p>

<p>Running a CI server is more than &#8220;just a good idea&#8221;. Once your system reaches a certain level of complexity, you can&#8217;t scale it without committing to continuous integration and testing. If you don&#8217;t have continuous integration, you end up having to put all development on hold each time you want to perform a release. If you don&#8217;t build, test, and deploy your system on a regular basis &#8211; if it isn&#8217;t something that is well rehearsed &#8211; integration becomes a time-consuming nightmare of manual testing and builds that often leads to inconsistent builds. This is especially true if your development effort spans multiple systems and multiple development workgroups. You run a CI system because building, testing, and deploying your system should be automatic: it should be as trivial as pressing a button.</p>

<p>The concept of a CI server is only slightly more established than that of a repository manager, and very often you will see that an organization has identified the need for a CI server before they&#8217;ve identified the need for a repository manager. If you are coding a complex system, there is a very good chance that you are already running a CI server. The most popular servers out there are Hudson, Bamboo, and CruiseControl. While the connection between CI servers and repository managers might not be immediately obvious, when used together they can introduce some new possibilities for the way you develop your systems.</p>

<h2>Continuous Publishing</h2>

<p>When you have a system to continuously build your code, you also have a system that can continuously publish SNAPSHOT artifacts to a repository manager to enable a more granular approach to development.   What do I mean by &#8220;a more granular approach to development&#8221;?  To answer that question, let&#8217;s take a look at a complex multi-module project using the example of the eCommerce group from the previous post in this series.</p>

<p><a href="http://www.sonatype.com/people/wp-content/uploads/2010/08/ecom-multimod.png"><img class="aligncenter size-full wp-image-5918" title="ecom-multimod" src="http://www.sonatype.com/people/wp-content/uploads/2010/08/ecom-multimod.png" alt="" width="434" height="569" /></a></p>

<p>Assume you have a new programmer starting tomorrow. Instead of throwing them at the entire forty thousand lines of code, you would like to be able to give that developer a small, easy-to-digest task. You want this developer to add support for PayPal&#8217;s Adaptive Payments API in your eCommerce system. That&#8217;s it. You don&#8217;t want them to be distracted by the overwhelming scope of the project, and you certainly can&#8217;t afford for them to take a three-month voyage through your project&#8217;s code before they start contributing to the effort. Deadlines are tight, and you don&#8217;t have enough people on your team. It is important that new hires start programming as soon as they walk in the door.</p>

<p>Without a repository manager hooked up to a continuous integration server, if you try to check out just the ecom-paypal project, the build is going to fail because Maven will try to download the module&#8217;s dependencies from a repository that doesn&#8217;t have them. In the case of the ecom-paypal project, assume that the dependency graph looks like this.</p>

<p><a href="http://www.sonatype.com/people/wp-content/uploads/2010/08/ecom-multimod-dep.png"><img class="aligncenter size-full wp-image-5919" title="ecom-multimod-dep" src="http://www.sonatype.com/people/wp-content/uploads/2010/08/ecom-multimod-dep.png" alt="" width="448" height="191" /></a></p>

<p>When you have a repository manager and a continuous integration server, you can configure your continuous integration server to publish SNAPSHOT artifacts (in-progress SNAPSHOT binaries) to your repository manager.   This will allow you to just check out a single, isolated portion of a much larger multi-module project.</p>

<p>Without a repository manager, trying to build version 1.3-SNAPSHOT of ecom-paypal in isolation is going to generate errors unless you first check out the entire codebase to build and install all of the dependencies in your local repository. With a repository manager, SNAPSHOT artifacts are being continuously published because Hudson is checking your SCM every few minutes and building the latest code. When you run the ecom-paypal module&#8217;s build in isolation, Maven is going to download the most recent SNAPSHOT.</p>
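
<p>As a rough sketch of both halves of this workflow (not the exact setup from this series): the CI job publishes SNAPSHOTs with Maven&#8217;s <code>deploy</code> phase, and a developer builds one module with <code>-U</code> so that Maven re-checks the repository for newer SNAPSHOTs. This assumes the project&#8217;s POM already declares a snapshot repository under <code>distributionManagement</code> and that the developer&#8217;s Maven settings point at the repository manager; the directory layout is hypothetical.</p>

<pre><code># On the CI server (for example, as the build step of the Hudson job):
# build everything, run the tests, and publish the resulting SNAPSHOTs.
mvn clean deploy

# On the new developer machine: check out only ecom-paypal, then build
# it alone; -U forces a check for updated SNAPSHOT dependencies.
cd ecom-paypal
mvn -U clean install
</code></pre>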

<p>Without a repository manager, your new developer is going to have to download the entire codebase and run a large time-consuming build.   With a repository manager you can work on specific components of a larger multi-module project.    This ability to divide and conquer your codebase comes in very handy when you need a consultant to take a look at a specific problem, or when you need to look at a coding problem in isolation.</p>

<p>When you continuously publish build artifacts to a repository manager, you move away from the single monolithic project build and toward a project layout and architecture that lends itself to modularization.</p>

<p>In tomorrow&#8217;s post: How a Repository Manager decouples deployments from source code, and what that means for developer operations.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.sonatype.com/people/2010/08/benefits-of-a-repository-manager-part-iii-continuous-build-deployment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Hudson Build Farm Experience, Volume III</title>
		<link>http://blog.sonatype.com/people/2009/02/the-hudson-build-farm-experience-volume-iii/</link>
		<comments>http://blog.sonatype.com/people/2009/02/the-hudson-build-farm-experience-volume-iii/#comments</comments>
		<pubDate>Wed, 04 Feb 2009 16:18:01 +0000</pubDate>
		<dc:creator>John Casey</dc:creator>
				<category><![CDATA[How-To]]></category>
		<category><![CDATA[Maven]]></category>
		<category><![CDATA[builds]]></category>
		<category><![CDATA[Hudson]]></category>

		<guid isPermaLink="false">http://blogs.sonatype.com/people/?p=1528</guid>
		<description><![CDATA[I&#8217;ve been working on a Hudson-based build farm for Sonatype and Maven open source builds since sometime in September of 2008. I&#8217;ve learned a lot as a result, so I thought I&#8217;d share some experiences from the trenches. In this third - and probably, final - installment I&#8217;ll discuss some issues we tackled with our VMWare environment itself, and look ahead to some issues with which we still grapple on a day-to-day basis.]]></description>
				<content:encoded><![CDATA[<p>I’ve been working on a Hudson-based build farm for Sonatype and Maven open source builds since sometime in September of 2008. I’ve learned a lot as a result, so I thought I’d share some experiences from the trenches. In this third &#8211; and probably, final &#8211; installment I’ll discuss some issues we tackled with our VMWare environment itself, and look ahead to some issues with which we still grapple on a day-to-day basis.</p>

<h2>VMWare, Efficiency, and the Space-Time Continuum</h2>

<p>Compared to what we went through trying to get Windows builds running reliably out on the build farm, this discussion is going to seem somewhat…nitpicky. However, there are some important things to understand when you’re running a build farm on VMWare ESXi, so let’s dive in and take a look.</p>

<p><span id="more-1528"></span></p>

<p>The first thing to understand is that the hardware specs of your ESXi machine represent a sort of theoretical maximum. Just looking at those numbers (we have 8 cores at 3.16 GHz and 32 gigs of RAM), you’ll be tempted to salivate and wring your hands as you dream about all the simultaneous builds you can run. <strong>Resist!</strong> Remember that you’ll have multiple virtual machines sharing that hardware, each of which has a certain sunk cost in terms of memory (and, minimally, CPU) overhead. This overhead comes from the RAM and CPU necessary to run a full-blown operating system, on which your Hudson instance executes. In some cases like Ubuntu JeOS (Just enough OS), which is designed for use in virtual machines, the overhead is pretty minimal though still noticeable; in other cases like Solaris or Windows, you’re stuck with the same operating system your desktop machine might run…complete with GUI. OK, I’m sure you can turn off the GUI on Solaris &#8211; it runs webservers, right? &#8211; but I’m not a Solaris expert, and more to the point I’m not interested in tainting that environment too much with customizations. Too much customization can render your build platform unique, which is a bad thing.</p>

<p>Additionally, there can be a bit of inefficiency related to allocation of RAM and CPU resources if you structure your VMs to grab and reserve those resources no matter what. This means that even if those VMs are completely idle, they may hold onto a certain amount of RAM (usually not CPU really, in my experience) and choke out other competing VMs. On the other hand, if you don’t reserve resources for your VMs, you may face sudden lock-ups if you have too many VMs competing for what is fundamentally a finite resource.</p>

<p>In theory, this should simply slow down all VMs on the system; sort of a reverse rising-tide-lifts-all-ships effect. In practice, we’ve found that this sort of competition can lead to full-out system crashes. Funny thing: it turns out some operating systems don’t respond favorably to having less RAM than they thought. If it’s just a CPU-competition issue, then your VMs may simply leak time…but we’ll talk about this in a minute. After groping around in the dark for several days, we gradually determined that the best policy was to try to limit the total pseudo-hardware configurations for all running VMs to something on the order of 90% of what the ESXi machine actually has. Note that you must always tell each VM how many CPUs and how much RAM it “owns”, even if you don’t reserve those resources by messing with the Resources tab in the VM settings. (Reserving them via the Resources tab should force more of a hard allocation, limiting VMWare’s ability to shuffle resources to where they’re most needed, as I alluded to above.) What I’m talking about is really trying to keep the total resources “owned” by all running VMs just below the actual hardware resources available on the machine…it just seems to function more smoothly that way.</p>
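
<p>To make that rule of thumb concrete, here is a minimal sketch of the arithmetic for the hardware mentioned above (8 cores, 32 GB of RAM). The 90% figure is the approximate ceiling described in this post, not a VMWare recommendation, and the variable names are made up for illustration.</p>

<pre><code>#!/bin/sh
# Host hardware from this post; BUDGET_PCT is the rough 90% ceiling.
HOST_RAM_GB=32
HOST_CORES=8
BUDGET_PCT=90

# Integer arithmetic, so results are rounded down.
ram_budget_mb=$(( HOST_RAM_GB * 1024 * BUDGET_PCT / 100 ))
core_budget=$(( HOST_CORES * BUDGET_PCT / 100 ))

echo "RAM to hand out across running VMs: ${ram_budget_mb} MB"
echo "vCPUs to hand out across running VMs: ${core_budget}"
</code></pre>

<p>With these numbers the script reports 29491 MB and 7 vCPUs, so the configured RAM and CPU counts of all running VMs should add up to no more than that.</p>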

<h2>Managing Resources: Understanding Your Builds’ Needs</h2>

<p>I need to stop things here and provide a bit of a disclaimer. Some of our builds are quite large, and can take a very long time to complete. In the past, each time we’ve run into resource problems in our build farm, it’s been as a result of these huge builds running on all available VMs at once. So, the load put on our particular build farm varies tremendously from moment to moment. This may seem like a strange niche case, but there’s a critical lesson here.</p>

<p><strong>You have to plan for the maximum momentary load you’re likely to see on the whole build farm.</strong></p>

<p>It only takes one instant of maxing out the RAM on your ESXi hardware to cause one or more of your VMs to grind to a halt. If you have more than one build that can run for a long time or runs on all VMs at the same time, you need to be prepared for saturating your server’s hardware. You can limit the effects of this a little bit by using the Locks and Latches Hudson plugin and keeping long-running jobs on the same lock. This will cause your build times for any particular distributed job to balloon, so be prepared; but failing to do this can completely lobotomize a VM, leaving it with a corrupted disk or something similar. You’ll have to ask someone else for a technical explanation of why this is, but believe me: I’ve had to rebuild VMs on multiple occasions because of this problem.</p>

<p>On the other hand, if you have a lot of small builds that are unlikely to jam up the works for long by themselves, you can probably get away with tuning the number of Hudson executors on each VM and leaving the CPU/RAM allocations to each VM as suggestions. That way VMWare doesn’t have to set aside that segment of its resources for an idle VM. Even if you have this sort of setup, but still have that one huge build, you can avoid Hudson gridlock by making sure you have at least two executors on each VM where the long-running job will build. This way, the more agile builds have a passing lane to get around that trundling, grindingly slow 18-wheeler of a build.</p>

<p>We’ve actually been able to cheat the resource allocation rule I mentioned above to a certain extent. Our private build farm tends to have much faster, less frequent builds, so we’ve been able to almost double the number of running VMs on the ESXi server since the VMs allocated to the private build farm are idle much of the time. As we add new jobs to each build farm, I’m sure this will cease to be true, but for now the two farms look like they’re running on twice as much hardware as we physically have…and they seem happy as two peas in a pod.</p>

<h2>Keeping Time</h2>

<p>Virtual machines running on ESXi tend to have some trouble keeping time. It’s a little embarrassing, and we try not to talk about it in public, but there it is. Left to their own devices, VM operating systems may move backward <em>or</em> forward in time relative to any outside fixed point. To the outsider, some VMs will appear slightly blue, while others will appear slightly reddish…Einstein would be impressed.</p>

<p>Okay, bad physics jokes aside, they’re not really <em>moving</em> in time; they just sort of lose track of it. The problem is pretty well documented out on the internet, and there are some pretty good instructions for compensating, like <a href="https://docs.sonatype.com/download/attachments/15076609/vmware_timekeeping.pdf">this one</a> (PDF) from VMWare. It seems that the timekeeping problem arises from CPU allocation and kernels that count CPU ‘ticks’ to keep time. The best practice seems to be taking a two-pronged approach to keep everything synchronized. First install VMWare Tools, and second configure NTP time synchronization on each VM operating system.</p>

<p>VMWare Tools is meant to keep VMs in sync by catching them up when they fall behind (probably due to not getting the CPU access they expect). However, the tools are apparently useless for reining in VMs that run out ahead of the bunch. Personally, I have no idea why a VM operating system would skip <em>ahead</em>, but the internet assures me it’s possible, and I’ve actually seen it happen in our build farm. To handle this problem, we must enable NTP clock synchronization for our VMs. Installing VMWare Tools is a breeze on most operating systems, except for FreeBSD. It seems there is no version available for BSD, so you’re left with NTP to keep things up-to-date. That’s okay; it does pretty well. As far as enabling NTP, this is also a breeze on most operating systems; most already have NTP installed, or can have it installed through a simple command like:</p>

<pre><code>sudo aptitude install ntp
</code></pre>

<p>…Except, of course, on Windows. On Windows, you’ll need to dig into the Policies section of the Control Panel, as described <a href="http://technet.microsoft.com/en-us/library/cc749145.aspx">here</a>, to enable network time protocol. This is far less intuitive or simple than on just about any other OS (except possibly Solaris, and on Solaris the cure for all problems is a good manual).</p>

<p>One other interesting point about NTP: if you have an NTP server on your VMWare machine you’re thinking about using, <strong>STOP!</strong> Use an NTP server external to VMWare; remember how VMWare has some problems with timekeeping? I may have touched on this point somewhere above. In our build farm, we’re using the following NTP configuration (or approximations of this, on the Windows systems):</p>

<pre><code>$ cat /etc/ntp.conf

server 0.north-america.pool.ntp.org
server 1.north-america.pool.ntp.org
server 2.north-america.pool.ntp.org
server 3.north-america.pool.ntp.org
</code></pre>

<p>Pretty simple, really. Using multiple time sources gives your network the ability to compensate for any clock skew that may appear in any one of the sources. It should also make your configuration more resilient to partial network outages, such as when the entire east coast of the US disappears from the internet (it’s happened before).</p>
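
<p>Once <code>ntpd</code> has been running for a few minutes, one way to confirm that a VM is actually tracking those servers is to list its peers. This is only a quick check, and it assumes the standard <code>ntpq</code> tool that ships with the NTP package is installed:</p>

<pre><code># List the configured time sources with their current offset and jitter
# (both reported in milliseconds). An asterisk in the first column marks
# the source the daemon is currently synchronized to.
ntpq -p
</code></pre>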

<p>But why go to all this trouble? Why does it matter that all OS clocks tick in perfect harmony? Apparently, Hudson can lose build results if the timestamps are off by too much. It’s not just an urban legend; we’ve had this problem (which is why I know so much about VMWare’s timekeeping). Again, I’m not completely sure why Hudson loses build results, or why it relies on timestamps from slave instances at all for that matter; these are questions best asked of Hudson developers. What I can tell you is keeping time in sync throughout your build farm is in your best interest.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.sonatype.com/people/2009/02/the-hudson-build-farm-experience-volume-iii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Hudson Build Farm Experience, Volume II</title>
		<link>http://blog.sonatype.com/people/2009/02/the-hudson-build-farm-experience-volume-ii/</link>
		<comments>http://blog.sonatype.com/people/2009/02/the-hudson-build-farm-experience-volume-ii/#comments</comments>
		<pubDate>Mon, 02 Feb 2009 15:30:09 +0000</pubDate>
		<dc:creator>John Casey</dc:creator>
				<category><![CDATA[How-To]]></category>
		<category><![CDATA[Maven]]></category>
		<category><![CDATA[builds]]></category>
		<category><![CDATA[Hudson]]></category>

		<guid isPermaLink="false">http://blogs.sonatype.com/people/?p=1517</guid>
		<description><![CDATA[I&#8217;ve been working on a Hudson-based build farm for Sonatype and Maven open source builds since sometime in September of 2008. I&#8217;ve learned a lot as a result, so I thought I&#8217;d share some experiences from the trenches. In this second installment I&#8217;ll discuss a few more details related to remote maintenance, along with the hurdles we encountered integrating Windows into our Hudson farm (and the solutions we found). I know I promised somewhat more than this in Volume I, but in the end it would have resulted in a truly gargantuan post&#8230;so, I&#8217;ll (hopefully) finish up this little mini-series with a discussion of VMWare issues, and some in-progress challenges with which we continue to grapple, in my next post.]]></description>
				<content:encoded><![CDATA[<p>I’ve been working on a Hudson-based build farm for Sonatype and Maven open source builds since sometime in September of 2008. I’ve learned a lot as a result, so I thought I’d share some experiences from the trenches. In this second installment I’ll discuss a few more details related to remote maintenance, along with the hurdles we encountered integrating Windows into our Hudson farm (and the solutions we found).</p>

<h2>Eyes and Ears: Getting Access</h2>

<p>Having access to the build farm is critical for maintenance, but it can also be very important to developers who are debugging a failing build. In our build farm, we’re using various mechanisms to provide this access, largely based on what is best suited for a particular VM operating system. The basic requirement here is to provide “natural” browsing capabilities for the filesystem on each VM, along with the ability to upload files if necessary (this came in very handy for installing and testing FlashPlayer, for instance).</p>

<p><span id="more-1517"></span></p>

<h2>SSH</h2>

<p>For starters, we have SSH access to all machines in the farm. This is partially born out of convenience, since we use SSH to connect Hudson nodes together, and partially out of practicality, since it’s one of the best connection methods for headless Linux systems (those running without X Windows). Initially, we had an SSH port mapped through the router VM from the public connection into each Hudson VM. However, we soon realized that direct access to Hudson slave VMs was largely unnecessary since it was next to impossible to remember which non-standard port led to which VM. Now, we’ve simplified SSH access down to two public ports: one for accessing the webserver VM, and one for accessing the Hudson master instance. From the Hudson master instance, it’s a breeze to connect to any of the Hudson slave VMs, using their internal DNS hostnames to avoid confusion. While this may not be quite as efficient, it greatly simplifies the knowledge of our build farm that someone needs in order to make use of it. It’s simply not realistic to force someone to have a wiki page open so they can look up the port number to use to reach our Windows Hudson slave. It’s not realistic, and it only adds two new layers of maintenance burden (maintaining the wiki page, and maintaining the NAT mappings). Simple trumps efficient.</p>
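
<p>As an illustration of that two-step hop, the command below logs in to the master and immediately opens a second SSH session to a slave from there. The public hostname and port are placeholders, not our real values; <code>jeos1</code> stands in for the internal DNS name of a slave.</p>

<pre><code># -t allocates a terminal on the master so the inner ssh session is interactive.
# hudson-master.example.com and port 2222 are hypothetical.
ssh -t -p 2222 hudson-master.example.com ssh jeos1
</code></pre>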

<h2>Terminal Services</h2>

<p>While SSH is a good, all-purpose access mechanism, it’s not exactly ideal for navigating the Windows filesystem. And you can forget about managing (read: killing) Windows processes from the command line; it’s definitely not a user-friendly experience. To compensate, we use Windows Terminal Services to get access to the desktop of the VM. There are some pitfalls associated with Terminal Services on Vista…but I’ll get into that in a minute. Once set up correctly, Terminal Services (or TSC, or RDC, or rdesktop, or whatever you prefer to call it) provides a nice, snappy way to navigate a remote Windows system from just about any client operating system. The proliferation of names for this protocol provides a clue as to just how many different client applications there are. Beware, though: they work with varying degrees of success. I’ve found Microsoft’s own RDC application to work best for OS X.</p>

<h2>VNC</h2>

<p>As if two connection protocols weren’t enough, our addition of the OSX VM to the build farm forced us to use yet another: VNC. OSX is actually pretty well-behaved when it comes to ease of navigation over an SSH connection, and managing processes on the machine via the command line is really no problem at all. However, there is at least one function that all OSX machines perform which absolutely requires access to the desktop: software updates. Without graphical access to the Mac, we can’t install updates to the operating system since Software Update is a graphical application. So, while we don’t use VNC very often at all, we do have to maintain access to the Mac desktop for the relatively rare software update to run. Incidentally, VNC has also proven quite handy for installing new software out on the Mac.</p>

<p>Finally, there is one maintenance task that even the best remote connection has a hard time coping with. As anyone who manages remote machines for a living will tell you, there are times when there simply is no substitute for an on-site thumb to push the power button. In the early days of this build farm, we experienced several instances where our Ubuntu Hudson slave simply maxed out its available RAM. I’m not sure whether dedicated hardware would react in precisely the same way, but when we exceeded the allocated RAM for that virtual machine, it simply froze. Solid. The only way to bring it back was to power cycle the virtual machine (even this didn’t always work…I’ve completely rebuilt the Ubuntu VM a couple of times now). This is where our last line of defense comes in: VMWare Infrastructure Client. Infrastructure Client gives you a bird’s eye view into the running ESXi machine, where you can provision, decommission, modify, and manipulate all of the running VMs. You can even grab a monitor-ish view on an individual VM in order to execute commands on the running OS. When nothing else works, Infrastructure Client still does. This handy application has saved my bacon on multiple occasions, from misconfigured network interfaces to the aforementioned RAM-locked coma. But beware: this level of access comes at a price. VMWare Infrastructure Client only runs on Windows, a fact that for me required installation of a Windows XP virtual machine via Parallels on my Mac. Fortunately, this solution works well, and I don’t have to keep a dedicated Windows machine in the closet.</p>

<h2>The Square Peg: Dealing with Windows</h2>

<p>On the whole, Windows has not only been the hardest to integrate with our Linux-based Hudson master instance, but also by far the hardest to connect to in a consistent, reliable way. I suppose some of this is to be expected, given the difference in filesystem structure between Windows and the rest of the world. But what may be a little less expected &#8211; it was for me, at least &#8211; is just how hard it is to integrate Windows into an overall plan for remote access. I’m going to address the remote access question first, since it at least has a reasonable solution; but rest assured, I’ll get back to the filesystem challenges soon.</p>

<h2>Windows Connectivity</h2>

<p>As I mentioned above, SSH has become our lowest-common-denominator connection for remote access. SSH works well (in <em>almost</em> all cases), is easy to configure, and performs a double duty by allowing both remote shell access to a machine and the ability to remotely execute a program, as with the following example:</p>

<pre><code># Quote the command so $PATH is expanded by the remote login shell, not locally
ssh jeos1 "bash -l -c 'echo \$PATH'"
</code></pre>

<p>SSH also has the advantage of ubiquity; you can install it everywhere, on basically any operating system. Or so I thought. It turns out that SSH for Windows comes in basically two flavors: a Windows-native “port” that uses Windows’ own authentication methods and contains some pretty interesting divergences from *nix or BSD brands of SSH, and SSH over Cygwin. In a previous life, I had already tried using the native SSH daemon, with little or no success. Since many of us at Sonatype have experience working with Cygwin, we opted for that solution. Once set up, this option works fairly well…but the setup is not for the faint of heart. After much Googling, I came across <a href="http://pigtail.net/LRP/printsrv/vista-cygwin.txt">this explanation</a> for installing SSHd on Cygwin+Vista. While it doesn’t provide much explanation along the way, the steps do work for gaining basic SSH access to a Vista machine. Later, I found out the hard way that providing desktop access to applications executed through that SSH session was another matter entirely, one that required quite a bit of trial and error to solve. In the end, we configured the SSHd to run via the same user (‘hudson’) as the Hudson <code>slave.jar</code>, to ensure Hudson builds that use FlashPlayer to run tests had access to the desktop. In point of fact, I’m still not sure we have that one completely figured out…</p>

<p>Once we had a reliable SSH connection to our Vista VM, we learned that failed Hudson jobs could pollute the running system with orphaned processes, Java or FlashPlayer instances that would never complete for whatever reason, but instead would squat on their reserved sockets and file locks until forcibly removed. In the end, we simply could not avoid enabling desktop access via Terminal Services. However, we found out that merely turning on this service is not enough; Vista uses a newer protocol version than just about any client out there (except, I imagine, the Vista terminal services client). Got XP? Maybe OS X? Tough luck.</p>

<p>Once again, after much Googling, we learned that Terminal Services on Vista uses a new protocol feature called Network Authentication by default. This feature excludes pretty much the rest of the free world from connecting to your Windows machine, even if you ask nicely. Luckily, I dug up <a href="http://theillustratednetwork.mvps.org/RemoteDesktop/RDP6ConfigRecommendations.html">this page</a> that gave some tips for working around Network Authentication. For specifics, see the section entitled “Enable Vista Remote Desktop host computer use of Network Level Authentication”, about 3/4 of the way down that page. With Network Authentication disabled, XP and OS X clients were free to connect to our Vista VM, allowing us access to install and test applications such as FlashPlayer, and to manage (<em>KILL!</em>) orphaned processes.</p>

<h2>Integrating Hudson: Linux Master, Windows Slave</h2>

<p>As many developers know, writing software that’s meant to run on both Linux and Windows can be particularly difficult. It’s not simply that Linux uses ‘<code>/</code>’ for file- and directory-separation, while Windows uses ‘<code>\</code>’. It’s not just that Linux uses ‘<code>/</code>’ to denote the root of the filesystem for the whole computer so that URLs often look like: <code>file:///home/hudson/.m2/settings.xml</code>, while Windows uses drive letters and has multiple filesystem roots resulting in URLs that look like: <code>file:/C:/Users/hudson/.m2/settings.xml</code>. No, the pain of programming for both Windows and non-Windows target environments is all of this, and much more. From the aforementioned path-formatting differences, to different line endings, to differing approaches to child processes and file locking, working in a mixed environment like this can often feel like death by a thousand cuts. It’s not that any one of these inconsistencies is insurmountable; it’s that, taken together, coping with all the differences can lead to very, very complex configurations.</p>
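
<p>To make the path-formatting difference concrete, here is a minimal sketch of a shell function that produces the two URL styles quoted above from a native path. The function name <code>to_file_url</code> is hypothetical, it only handles absolute paths, and it is an illustration rather than anything we run on the farm.</p>

<pre><code>#!/bin/bash
# Hypothetical helper: turn an absolute native path into a file: URL
# in the two styles shown above.
to_file_url() {
  local path=$1
  # Normalize Windows separators: every backslash becomes a forward slash.
  path=${path//\\//}
  case $path in
    [A-Za-z]:/*) echo "file:/$path" ;;   # drive letter: file:/C:/Users/...
    /*)          echo "file://$path" ;;  # POSIX root:   file:///home/...
    *)           return 1 ;;             # relative paths are not handled
  esac
}

# Usage:
#   to_file_url '/home/hudson/.m2/settings.xml'
#   to_file_url 'C:\Users\hudson\.m2\settings.xml'
</code></pre>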

<p>As a case in point, adding Windows to our predominantly Linux-compatible Hudson build farm involved:</p>

<ol>
    <li>installing Cygwin to run SSHd</li>
    <li>writing a separate slave-launching script (<code>start.bat</code>, mentioned in the last post)</li>
    <li>liberal use of the Windows symlink approximation (abomination?) called <code>mklink</code>, to give the Windows slave a filesystem layout that matches that of our other Hudson VMs</li>
    <li>a <em>lot</em> of hand-holding to clear orphaned processes resulting from failed builds</li>
</ol>

<p>In spite of the fact that we brought Windows into the build farm mix early on &#8211; we’ve been running builds on Vista since around the beginning of October at least &#8211; the Windows slave VM still contains more chewing gum, duct tape, and shade-tree engineering than the rest of the build farm put together. The fact that it works for most of our builds is something I attribute more to luck than skill, and it’s still impossible to run the Maven bootstrap on anything but the default JDK version set up in the <code>start.bat</code> file. Forget specifying the Java version in that job definition. The Maven bootstrap uses Ant to orchestrate a rough first pass, then spawns a Maven process toward the end of the build to run unit tests and other verifications. Any explicit declaration of a Java version in this job results in the Windows slave using an incompatible <code>JAVA_HOME</code> path, which causes the build to fail abysmally.</p>

<p>I’ve got a lead on fixing this problem in the form of a patch to Hudson, but for now this issue is filed solidly in the ‘In-Progress’ category.</p>

<p>Path problems aside, we ran into some interesting issues with the Maven versions we installed on Windows. I can’t say I’m entirely sure how this happened; it seems like anything installed should have the equivalent of <code>755</code> and <code>644</code> directory and file permissions, respectively. In any case, when I installed Maven on the Vista VM, the <code>hudson</code> user didn’t have permissions to actually execute it. As a result, I modified the permissions of the entire <code>/opt/maven</code> equivalent path &#8211; the parent directory for all Maven installations &#8211; to allow Everyone access to All Permissions. Sure, that’s probably too lax; but this is a slave VM that’s basically cut off from the world, and in the end, it’s pretty much disposable. If it gets compromised, the perpetrator will only have access to OSS source code, and we can decommission/re-provision a new Windows VM quickly thanks to our SVN repository.</p>
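
<p>A permissions problem like this is easier to catch before a build fails than after. The following is a rough sketch of a pre-flight check that reports any tool the current user cannot execute; the function name <code>check_executables</code> is hypothetical, and on Windows it assumes a Cygwin shell like the one behind our SSHd.</p>

<pre><code>#!/bin/bash
# Hypothetical pre-flight check: list tools the current user cannot execute.
# Returns 0 when every path is executable, 1 otherwise.
check_executables() {
  local missing=0 tool
  for tool in "$@"; do
    if [ ! -x "$tool" ]; then
      echo "not executable by $(id -un): $tool"
      missing=1
    fi
  done
  return $missing
}

# Usage, with the variables exported by the slave launch script:
#   check_executables "$M2_HOME/bin/mvn" "$ANT_HOME/bin/ant" "$JAVA_HOME/bin/java"
</code></pre>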

<h2>Summary</h2>

<p>Now that we’ve covered the ins and outs of connecting to the build farm and dealing with the “special” nature of Windows, I think this is a good place to end.</p>

<p>In the last post, I covered the basic topology and system configuration for our build farm. In this post, we’ve talked at length about remote connectivity and Windows-related issues. Please keep an eye out for my next post, when I’ll wrap up this series by discussing some special considerations for VMware’s ESXi environment, and by looking ahead to some of the as-yet-unresolved issues we’re facing today.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.sonatype.com/people/2009/02/the-hudson-build-farm-experience-volume-ii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Hudson Build Farm Experience, Volume I</title>
		<link>http://blog.sonatype.com/people/2009/01/the-hudson-build-farm-experience-volume-i/</link>
		<comments>http://blog.sonatype.com/people/2009/01/the-hudson-build-farm-experience-volume-i/#comments</comments>
		<pubDate>Wed, 28 Jan 2009 00:31:54 +0000</pubDate>
		<dc:creator>John Casey</dc:creator>
				<category><![CDATA[How-To]]></category>
		<category><![CDATA[Maven]]></category>
		<category><![CDATA[builds]]></category>
		<category><![CDATA[Hudson]]></category>

		<guid isPermaLink="false">http://blogs.sonatype.com/people/?p=1503</guid>
		<description><![CDATA[I&#8217;ve been working on a Hudson-based build farm for Sonatype and Maven open source builds since sometime in September of 2008. I&#8217;ve learned a lot as a result, so I thought I&#8217;d share some experiences from the trenches.]]></description>
				<content:encoded><![CDATA[<p><a href="http://blogs.sonatype.com/people/wp-content/uploads/2009/01/hudson.png"><img class="alignright size-medium wp-image-1510" title="hudson" src="http://blogs.sonatype.com/people/wp-content/uploads/2009/01/hudson.png" alt="" width="162" height="66" /></a>I’ve been working on a Hudson-based build farm for Sonatype and Maven open source builds since sometime in September of 2008. I’ve learned a lot as a result, so I thought I’d share some experiences from the trenches. In this first installment, I’m only going to cover our goals and outline the basic setup of our farm; I’ll save discussion of specific hurdles and advantages offered by our environment for the next post.</p>

<p><span id="more-1503"></span></p>

<h2>The Challenge</h2>

<p>Java software must function in a nearly endless variety of runtime environments. While the bytecode itself is basically portable from one operating system to another, I’m sure everyone knows this doesn’t mean software written in Java is automatically portable. The Write-Once-Run-Anywhere (WORA) promise of Java is an ideal; in real life, software must be tested on all platforms. In the past, Maven releases have relied on a best-effort approach, where the continuous integration builds and integration tests were run on one operating system, and other operating systems were periodically &#8220;spot checked&#8221; just before a release. We were using JIRA and our development community to compensate for the lack of a real build farm, which would have allowed us to continually check for problems on a variety of platforms. Since we were running our CI operations on a Linux, BSD, or Solaris machine (it varied), we relied on developers to file JIRAs for anything that turned up broken on Windows or OS X. Since most of us work on one of these two platforms, the most critical issues were normally caught and fixed. If an issue cropped up on an operating system that wasn’t exposed on the CI system or the developers’ own workstations, it typically survived until the next release cycle, after a user reported it and worked with the development team to test and get it resolved.</p>

<p>When we started releasing open source here at Sonatype, we decided to take a much more proactive role in verifying our software. Our approach has been fairly straightforward: make sure we encounter and fix as many of the issues in our software as possible, before they have a chance to trip up our users. Like any other aspect of the software engineering world, our ideal has been tempered by a dose of practical reality…but I’ll get to that later. For now, suffice it to say that we wanted the ability to test software on as many operating systems as could run Java, and as many Java implementations as we are willing to say we support commercially. Additionally, the results of all these myriad builds should be collated and easy to understand.</p>

<p>Since our business is very much dependent on the health of Maven, we decided this new build farm should be provided as a resource for the Maven community in addition to our own open-source offerings.</p>

<h2>Enter Hudson</h2>

<p>We settled on Hudson as our continuous integration system for a few reasons. First, it’s dead simple to install and use in the non-distributed sense, and many of us had glowing opinions of this little application. Even now, my sentiments toward Hudson are similar to those I’d have toward a long-time friend and colleague: I’m still impressed despite the fact that I now have enough experience to see its flaws.</p>

<p>The second and third reasons for choosing Hudson were even more practical. It was the only system with a history of supporting multiple versions of Maven and multiple JDKs. Also, at the time, it was the only system that could collate distributed builds from multiple slave nodes running different operating systems. While this latter feature was new &#8211; and we really didn’t appreciate just <em>how</em> new at the time &#8211; it was working and documented.</p>

<p>Finally, Hudson offers a plugin API and a large number of plugins to help cope with extra requirements like IRC notification or Git support. These plugins were a big attraction, but the fact that Hudson’s developers were thoughtfully exposing a plugin API meant that we could probably provide any extra bells and whistles we might require.</p>

<p>In fact, at the point where we decided to implement a build farm, we already had a one-dimensional, non-distributed Hudson deployment. So I guess you could even add that to the list of advantages: we had a certain amount of experience maintaining a Hudson instance. What remained was learning how to set up and maintain the underlying array of operating systems on which the build farm would rely, then learning how to run Hudson on this array.</p>

<h2>Nuts and Bolts: Our Farm Environment</h2>

<p>Since we really weren’t sure what operating-environment details we might require for adequate testing in our build farm, we opted to run the whole thing &#8211; or, as much of it as possible &#8211; on a large VMware ESXi machine. This gives us the ability to provision operating systems as needed, or decommission old VMs when they outlive their usefulness. It also gives us a certain degree of scalability, since (theoretically) we can deploy copies of a given operating system to adjust to demand. In practice, this scalability is limited by the resources available on the machine as a whole, but more on that later! In any case, alternatives like Xen would have limited the range of operating systems we could have deployed. Alternatives like separate hardware per node would leave us guessing up front what our real operating system needs would be, and which types of hardware would support those needs. VMware ESXi seems uniquely suited for this sort of system; its flexibility has proven to be a great asset as we planned and then updated our build farm.</p>

<h3>Hardware</h3>

<p>Our hardware consists of two quad-core 3.16 GHz Xeon CPUs, with 32GB of RAM, and a 1.3TB disk array. On this machine, we run a router VM and a bare-bones httpd VM that uses mod_proxy to connect to our Hudson master VM, itself an Ubuntu JeOS instance. In addition, we have Hudson slave VMs for Windows Vista 64-bit, Ubuntu JeOS (to prevent overloading of the master instance), FreeBSD, Solaris, and CentOS. Finally, we have a Mac OS X machine colocated with the ESXi machine and connected up as if it were just another virtual machine.</p>

<h3>VMs and Configuration</h3>

<p>The router VM provides NAT/firewall capabilities for the entire farm, as well as DHCP for new VM setups, and internal DNS. The httpd VM literally runs nothing above the operating system level except for SSHd, Apache, and logrotate (to manage the disk-clogging tendencies of webserver logging). For obvious reasons, the Hudson master and slave VMs are far more involved, mostly due to Hudson, SSH, Subversion, and Maven configurations, plus the Ant, Maven, Java, and Hudson files themselves. All of these Hudson VMs require basically the same software, and actually share all of the same files that can be used across platforms. To facilitate keeping all these configurations and software installations up to date across an array of systems, we check them into a dedicated Subversion repository, then simply check them out on each machine as a series of working directories. Got an update for the Maven settings you need to use for builds on the farm? No problem. Just update the VM working directories. To make this even easier, we’ve created a couple of Hudson jobs that will actually call <code>svn update</code> for each working directory on each Hudson VM. In cases where a piece of software is operating system-dependent (<em>hint,</em> the JDK) we simply create a directory structure in Subversion for each operating system class, with mirrored directory structures within. Want to add a new Linux slave? Check out the JDKs from <code>/grid/linux/opt/java</code>. New Solaris slave? You’ll want <code>/grid/solaris/opt/java</code>. Everything we need to provision a new Hudson VM is contained within Subversion.</p>
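
<p>The update step those Hudson jobs perform can be sketched roughly as follows. This is a minimal illustration, not our actual job definition: the function name <code>update_working_dirs</code> and the overridable <code>SVN</code> variable are assumptions made so the loop can be dry-run with <code>echo</code>, and the directories in the usage line are only examples.</p>

<pre><code>#!/bin/bash
# Hypothetical sketch of an "update every working directory" job step.
# SVN is overridable (an assumption of this sketch) to allow a dry run.
SVN=${SVN:-svn}

update_working_dirs() {
  local dir failed=0
  for dir in "$@"; do
    if [ -d "$dir/.svn" ]; then
      # Remember a failure, but keep updating the remaining directories.
      if ! $SVN update "$dir"; then failed=1; fi
    else
      echo "skipping $dir: not a Subversion working directory"
    fi
  done
  return $failed
}

# Usage (example directories):
#   update_working_dirs /opt/hudson/slave /opt/maven /opt/java
</code></pre>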

<p>The normalization of our VM directory structures is a huge advantage when you’re supporting a distributed Hudson environment; so much so that it’s highly recommended in Hudson’s own documentation for setting up distributed builds. Since all path information is managed by the master Hudson instance and passed on to the slave instances, it’s absolutely critical that the directory structures and installed software look as uniform as possible. This has obvious consequences for adding Windows to the mix, which is actually still one of our biggest pain points…but I’ll discuss this at length in the next installment.</p>

<h3>Running Hudson</h3>

<p>As for how to actually run Hudson, we’re using JavaServiceWrapper for the master instance. It gives us a nice, familiar way to configure and control the system, all packaged in a script that’s compatible with System V initialization. The slaves are actually controlled through the master instance as well, using SSH public-key authentication and a convoluted launch command. Each new slave VM gets a standardized <code>$HOME/.ssh</code> directory that allows the Hudson master to use its SSH key to log in to the slave machines without a password. The only thing that remains is to add the new slave’s DNS hostname to the <code>$HOME/.ssh/known_hosts</code> file on the Hudson master, to keep the SSH client from prompting when Hudson starts the slave connection. Once this is done, we simply configure the Hudson master to connect to the new slave using a command line like the following:</p>

<pre><code>ssh jeos1.grid.sonatype.com bash -l -c /opt/hudson/slave/start.sh
</code></pre>
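
<p>The <code>known_hosts</code> step mentioned above could be scripted along these lines. Treat it as a sketch under assumptions: the function name <code>add_known_host</code> and the overridable <code>KEYSCAN</code> variable are hypothetical, the duplicate check only recognizes plain (unhashed) host names, and a key fetched with <code>ssh-keyscan</code> should still be verified out of band before you trust it.</p>

<pre><code>#!/bin/bash
# Hypothetical helper: append a slave's host key to known_hosts once.
# KEYSCAN is overridable (an assumption) so the logic can be tried offline.
KEYSCAN=${KEYSCAN:-ssh-keyscan}

add_known_host() {
  local host=$1
  local file=${2:-$HOME/.ssh/known_hosts}
  mkdir -p "$(dirname "$file")"
  touch "$file"
  # Skip hosts that already have an entry; the match is approximate
  # because dots in the host name act as regex wildcards.
  if grep -q "^$host[ ,]" "$file"; then
    return 0
  fi
  $KEYSCAN "$host" >> "$file"
}

# Usage on the Hudson master:
#   add_known_host jeos1.grid.sonatype.com
</code></pre>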

<p>The <code>start.sh</code> script itself just does some basic environmental setup to account for differences in the SSH server behavior on different operating systems, then launches the Hudson <code>slave.jar</code>. Our particular script looks a little crufty, scarred from our path up the learning curve:</p>

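<pre><code>#!/bin/bash
# Launches the Hudson slave agent; invoked by the master over SSH.

# Toolchain locations, kept identical on every slave VM.
export JAVA_BASE=/opt/java/sdk
export JAVA14=$JAVA_BASE/1.4
export HOME=/home/hudson
export MAVEN_OPTS="-Xmx512M -Duser.home=${HOME}"
export M2_HOME=/opt/maven/apache-maven-2.1.0-M1
export ANT_HOME=/opt/ant/apache-ant-1.7.1
export JAVA_HOME=/opt/java/sdk/current
export PATH=$M2_HOME/bin:$ANT_HOME/bin:$JAVA_HOME/bin:$PATH

# Prefer tools in /opt/local/bin on machines that have that directory.
if [ -d /opt/local/bin ]; then
  export PATH=/opt/local/bin:$PATH
fi

# Refresh the slave working directory from Subversion before launching.
svn up /opt/hudson/slave

# Optional per-machine overrides, also kept under version control.
if [ -f "$HOME/.hudson-config" ]; then
  svn up "$HOME/.hudson-config"
  source "$HOME/.hudson-config"
fi

cd "$HOME"
# Run the agent at the lowest scheduling priority.
nice -n 19 java \
    -Djava.util.logging.config.file=/opt/hudson/slave/logging.properties \
    -Duser.home="${HOME}" \
    -jar /opt/hudson/slave/slave.jar
</code></pre>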

<p>On our Windows VM, we wound up using a DOS batch file that is much the same as the above: <code>start.bat</code>. We found it much simpler to spawn a batch file in Windows, despite the fact that we’re actually connecting through an SSH daemon that’s running on Cygwin. Similarly, on Solaris we have to change the <code>HOME</code> environment variable to <code>/export/home/hudson</code>, which warrants yet another <code>start.sh</code> variant: <code>start-solaris.sh</code>.</p>
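
<p>One possible alternative to a separate Solaris script is to pick <code>HOME</code> from the kernel name reported by <code>uname -s</code>. This is only a sketch of that idea, not what we run: the function name <code>hudson_home_for</code> is hypothetical, and the only non-default location taken from our setup is the Solaris one.</p>

<pre><code>#!/bin/bash
# Hypothetical alternative to start-solaris.sh: choose HOME by kernel name.
# uname -s prints "SunOS" on Solaris; everything else gets the default here.
hudson_home_for() {
  case $1 in
    SunOS) echo /export/home/hudson ;;
    *)     echo /home/hudson ;;
  esac
}

# Usage inside start.sh:
#   export HOME=$(hudson_home_for "$(uname -s)")
</code></pre>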

<h3>Summary</h3>

<p>Now that I’ve gone through the basic topology and configuration for our Hudson environment, I think I’ll end the post here. In the next post, I’ll discuss some of the harder lessons we’ve learned from actually running Hudson in this configuration, particularly when it comes to including operating systems like Vista. Be sure to tune in!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.sonatype.com/people/2009/01/the-hudson-build-farm-experience-volume-i/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
