Category Archives: Central

Browse Repos Easily with Updated Maven Central Search


July 17, 2011 By Joel Orlina

We’ve rolled out an update to Maven Central Search that makes the tool easier to use and fixes a handful of minor bugs.

Enhancements

  • Browse quickly to artifact directories when you know the path. The new browser bar text box lets you browse the repository in Central Search in almost the same way you browse http://repo1.maven.org with your web browser. Just type a path into the browser bar text box and you’ll be taken right to the artifact directory. Accessing the browser bar is easy: select the edit icon (which looks like a pen) next to the repository path in the Browse view, and the repository path will be replaced with the browser bar text box.

    [Image: Access the browser bar by selecting the edit icon]

    [Image: Type your search into the browser bar and select OK]
  • Classname search is more flexible. You can paste in a path to a source file (e.g., org/apache/coyote/ajp/AjpAprProcessor.java) or to a class file (e.g., org/apache/coyote/ajp/AjpAprProcessor.class), and Central Search will construct a fully-qualified classname search from it.
  • Multi-page results are easier to browse. When search results run longer than your browser window, you can navigate from page to page without scrolling back to the top of the page, as was previously required.
  • Browse view performance is improved. Long pages in the Browse view of the repository are now cached. This improves responsiveness when viewing the root directory of the Browse view (as well as the org, com, and net directories).
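
To make the classname enhancement above concrete, here is a minimal sketch of how a pasted file path could be turned into a fully-qualified classname. It is an illustration only, not Central Search's actual code; the function name path_to_classname is hypothetical.

```python
from pathlib import PurePosixPath


def path_to_classname(path: str) -> str:
    """Turn a source or class file path into a dotted classname.

    Hypothetical helper for illustration; not part of Central Search.
    """
    # Drop surrounding whitespace and any leading slash so that the
    # root directory does not show up as a path component.
    p = PurePosixPath(path.strip().lstrip("/"))
    # Strip only the two extensions mentioned in the post.
    if p.suffix in (".java", ".class"):
        p = p.with_suffix("")
    # Directory separators become package separators.
    return ".".join(p.parts)


print(path_to_classname("org/apache/coyote/ajp/AjpAprProcessor.java"))
print(path_to_classname("org/apache/coyote/ajp/AjpAprProcessor.class"))
```

Both calls print org.apache.coyote.ajp.AjpAprProcessor, which is the form a classname search expects.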

Maven Central Building Blocks


June 27, 2011 By Joel Orlina

In my last post, I explained the REST-style API that underlies Maven Central’s browser-based search UI. That API essentially comes “for free” with the main components on which Maven Central Search is built:

In this post, I will highlight those components and describe how they were used to implement Maven Central Search.

When we started the project, we looked at a couple of options for implementing search, including Solr and the existing Nexus search capability built directly on top of Apache Lucene. The Nexus approach initially seemed compelling: we have significant experience with it, and Nexus search even provides a REST API for full-text search that we could have used. So why did we end up choosing Solr when we could have simply re-used the search functionality in Nexus, or even crafted a web UI backed by an instance of Nexus running on top of Central? Two reasons:

  • Flexibility: We discovered early in the design phase of Central Search that we needed changes to the schemas, fields, and even field contents in the Lucene indices used by Nexus. Making those schema changes would have required other changes within the Nexus codebase. With Solr, we could simply point our Solr installation at an existing index, or even have Solr build a new index from scratch by adding documents through Solr’s REST API. We could rapidly prototype schema changes (often in one or two lines of XML, without even restarting Solr) and see our updated search results almost immediately.
  • Scalability: Solr bills itself as an “enterprise search platform.” One of the enterprise features that attracted us to Solr was its built-in support for replication. As query load increases in the future, we can simply balance that load across hardware serving multiple copies of the same data. Solr’s support for multiple indexes also leaves a path open for sharding our index data once it becomes too large to serve out of a single index on a single server.
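
As a rough illustration of the scalability point above, the sketch below spreads queries across several replicas in round-robin order. It is not how Maven Central or Solr distributes load; the ReplicaPool class and the replica URLs are hypothetical, and a real deployment would more likely put a load balancer in front of the replicas.

```python
from itertools import cycle


class ReplicaPool:
    """Hand out replica URLs in round-robin order (illustrative only)."""

    def __init__(self, replica_urls):
        replicas = list(replica_urls)
        if not replicas:
            raise ValueError("at least one replica is required")
        # cycle() repeats the list forever, one replica at a time.
        self._replicas = cycle(replicas)

    def next_replica(self) -> str:
        """Return the replica that should serve the next query."""
        return next(self._replicas)


# Hypothetical replica addresses, each serving a copy of the same index.
pool = ReplicaPool([
    "http://replica-1.example.invalid",
    "http://replica-2.example.invalid",
])
for _ in range(4):
    print(pool.next_replica())
```

Because every replica holds the same data, any of them can answer any query, which is what makes this kind of simple rotation possible.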

You Don’t Need A Browser to Use Maven Central


June 9, 2011 By Joel Orlina

Since its release in January, the Maven Central website (http://search.maven.org) has provided Apache Maven users with:

  • Search functionality that allows one to quickly track down artifacts and their dependency details when trying to resolve build problems.
  • Browse functionality that aids in discovery of new artifacts to use in projects.

In the intervening months, Sonatype has focused its efforts on improving the usability of the Maven Central user interface in the hopes of making it the first place users look when trying to find an artifact.  Recently, users who have reaped the benefits of using the Maven Central website have asked about interacting programmatically with the search functionality.

If you pay attention to your web browser’s address bar when conducting searches on Maven Central, you can already see that a REST-style API exists. For example, searching for “guice” from the main search box generates the following URL (the URLs in this post are NOT URL-encoded, for the sake of readability):

Translating the search request into English, that URL requests a basic search for any artifact (irrespective of version) containing the word “guice” in either the groupId or artifactId, returning only the first page of results.  Each row of the results shows the latest version of the artifact and the date the artifact was last updated as well as any classifiers associated with the artifact.

You can build up the complete library of search requests simply by paying attention to your web browser’s address field as you use the Maven Central website.  For the sake of convenience, we’ve collected all the URLs that make up Maven Central’s search API in a document available here.

Sadly, these URLs are still only useful when requested from a web browser. They are links that can be bookmarked or e-mailed, but they do NOT work with a non-browser agent like wget or curl. The Maven Central user interface is essentially a browser-based application that uses JavaScript to make asynchronous requests to yet another set of URLs. Once you make a request that looks like the URL above, the browser fires off the actual request to another Maven Central URL, which conducts the search and returns results that the browser then formats.

The sample request above, when converted to an actual Maven Central search request, looks like this:

The actual text of your query goes in the appropriately named “q” parameter, the “rows” parameter restricts the results to a smaller number than the full result set, and the “wt” parameter can be either “xml” or “json,” depending on how your application prefers to handle results.
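
For readers who want to assemble such a request in code, here is a minimal sketch that builds the query string from the three parameters just described. The endpoint below is a placeholder, not the real search path (consult the API Guide for that), and the function name and the default of 20 rows are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration, NOT the real path.
SEARCH_ENDPOINT = "http://search.maven.org/EXAMPLE-SEARCH-PATH"


def build_search_url(query, rows=20, wt="json", endpoint=SEARCH_ENDPOINT):
    """Build a search request URL from the q, rows, and wt parameters."""
    # The post names exactly two supported result formats.
    if wt not in ("xml", "json"):
        raise ValueError("wt must be 'xml' or 'json'")
    # urlencode() also takes care of escaping special characters in q.
    params = urlencode({"q": query, "rows": rows, "wt": wt})
    return endpoint + "?" + params


print(build_search_url("guice"))
```

The resulting string can be handed to any HTTP client, such as curl, wget, or urllib.request, once the placeholder endpoint is replaced with the real one.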

Some useful examples appear below.  Again, please refer to the API Guide for a complete listing:

In an upcoming post, I’ll describe the architecture behind Maven Central that makes all this functionality possible.

Maven Central Failover Mechanism Improves: Temporary IP change on Monday


May 9, 2011 By Brian Fox

Spoiler Alert! This post contains information about a change to Maven Central’s IP addresses. If your network has firewall rules in place that need specific IPs, be sure to read this post.

We’re working hard and investing continued effort into making sure that Central is as available as possible. As Maven Central supports a world of developers, even a few minutes of downtime is completely unacceptable to us. In line with our previous efforts to make Maven Central as bullet-proof and available as possible, we are planning to make the US repository even more fault tolerant using a tool called Pacemaker. Once we’ve had time to evaluate the impact of the changes described in this post, we will deploy similar measures for our European Union and Asia/Pacific mirrors.

The US repository runs on two virtual machines (VMs) in a VMware cluster with four physical nodes configured to use VMware’s High Availability support. Despite having multiple levels of fault tolerance, recovery from a misconfiguration or other catastrophic failure still requires a DNS update to a standby IP to restore Maven Central. This DNS change requires time: time to make the change, and then an often unpredictable time for the change to propagate over the entire Internet.

This is unacceptable to us. Millions of developers depend on Maven Central, and we’ve invested in redundant virtual machines running on redundant physical hardware. If there is an unforeseen event, the problem should be addressed in a few seconds.

To achieve immediate failover in the event of failure, we will be using a tool called Pacemaker to manage Maven Central’s floating IP cluster. Pacemaker monitors the repository IP address, the Nginx process status, and sample content from Maven Central. If Pacemaker identifies a failure in any one of these components, it will immediately fail over to the backup machine. In my testing, this takes about 3 to 5 seconds.
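
The decision rule described above (fail over as soon as any one monitored component fails) can be sketched as follows. This is not Pacemaker configuration or Pacemaker’s API; the function names and the stand-in checks are hypothetical and only model the logic.

```python
from typing import Callable, Dict, List


def failed_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every health check and return the names of those that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            # A check that raises is treated the same as a failed check.
            ok = False
        if not ok:
            failed.append(name)
    return failed


def should_fail_over(checks: Dict[str, Callable[[], bool]]) -> bool:
    """Fail over if any single monitored component is unhealthy."""
    return bool(failed_checks(checks))


# Stand-in checks for the three components named in the post.
checks = {
    "repository IP": lambda: True,
    "Nginx process": lambda: False,  # simulate a dead Nginx process
    "sample content": lambda: True,
}
print(failed_checks(checks))
print(should_fail_over(checks))
```

In this simulated run only the Nginx check fails, and that alone is enough to trigger a failover.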

In a previous post I discussed the systems we have in place and how the IPs are configured:

We are aware that some users have firewall rules that are locked to the external service IP. Because of this, we strive to maintain a consistent IP for each system; however, most users access the repository primarily by DNS. At times, our failover escalation or maintenance procedures may require us to redirect the DNS for one system to another. For this reason, if you have firewall rules in place that need specific IPs, please allow every address in this list so that you won’t be affected by any temporary transitions:

  • 207.223.240.88: US primary
  • 207.223.240.92: US staging / standby
  • 89.167.251.252: UK primary
  • 89.167.251.253: UK standby

Enhancements to Maven Central


March 17, 2011 By Brian Fox

Sonatype is committed to ensuring that Maven Central is a reliable resource for the community. We are continuing to invest in enhancements that improve system availability, including the conversion to virtual systems and adding redundancy to Central’s Internet connection.  These new improvements follow the deployment of the new official Maven Central repositories in the UK that enabled much faster access for all Maven users in Europe, as well as providing another geographically redundant backup of artifacts.

Virtual Servers: We now have 6 nodes running in a private cloud cluster, and both Central machines in the US are fully converted to virtual machines. This isolates the two systems from hardware failure and provides greater flexibility for load balancing and performance tuning.

Internet Redundancy: Maven Central is now connected to Contegix’s fully managed network, which provides multiple routes to the Internet. While the existing connection had been extremely stable in the past, we believe that using the managed and redundant network is the best choice for providing a reliable service to our users. The conversion to the new network is already complete, and you may notice that Maven Central now has a new IP address.

We are aware that some users have firewall rules that are locked to the external service IP. Because of this, we strive to maintain a consistent IP for each system; however, most users access the repository primarily by DNS. At times, our failover escalation or maintenance procedures may require us to redirect the DNS for one system to another. For this reason, if you have firewall rules in place that need specific IPs, please allow every address in this list so that you won’t be affected by any temporary transitions:

  • 207.223.240.88: US primary
  • 207.223.240.92: US staging / standby
  • 89.167.251.252: UK primary
  • 89.167.251.253: UK standby
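
If you keep such a list in a script rather than directly in firewall rules, a check against it might look like the sketch below. The four addresses come from the list above; the function name is hypothetical, and real firewall rules would be written in your firewall’s own syntax.

```python
import ipaddress

# The four Maven Central addresses listed in this post.
ALLOWED_ADDRESSES = {
    ipaddress.ip_address(addr)
    for addr in (
        "207.223.240.88",  # US primary
        "207.223.240.92",  # US staging / standby
        "89.167.251.252",  # UK primary
        "89.167.251.253",  # UK standby
    )
}


def is_allowed(addr: str) -> bool:
    """Return True if addr is one of the listed Maven Central IPs."""
    try:
        return ipaddress.ip_address(addr) in ALLOWED_ADDRESSES
    except ValueError:
        # Anything that is not a valid IP address is rejected.
        return False


print(is_allowed("207.223.240.92"))
print(is_allowed("10.0.0.1"))
```

Allowing all four addresses, rather than only the primary, is what keeps builds working when DNS is temporarily pointed at a standby system.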