Nexus Indexer API: Part 1


June 1, 2009 By Damian Bradicich

This series of posts focuses on integrating the Nexus Indexer into your own application. If your application needs to find an artifact by its GAV (groupId, artifactId, version) coordinates or by class name, you can use the Nexus Index format and the Nexus Indexer API to locate artifacts in any repository that creates a Nexus Index. The Nexus Indexer API exposes four main functions:

  • Indexing
  • Searching
  • Packing
  • Updating

This post will focus on Indexing and provide you with some concrete code examples that show you how to integrate the Nexus Indexer into your own application.

Indexing

The Nexus Indexer looks at a Maven repository directory layout and generates a searchable index of that repository’s contents. Below is a very simple Plexus component that uses the Nexus Indexer to index a directory:

package org.damian;
 
import java.io.File;
import java.io.IOException;
import java.util.List;
 
import org.codehaus.plexus.component.annotations.Component;
import org.codehaus.plexus.component.annotations.Configuration;
import org.codehaus.plexus.component.annotations.Requirement;
import org.codehaus.plexus.personality.plexus.lifecycle.phase.Disposable;
import org.codehaus.plexus.personality.plexus.lifecycle.phase.Initializable;
import org.codehaus.plexus.personality.plexus.lifecycle.phase.InitializationException;
import org.sonatype.nexus.index.NexusIndexer;
import org.sonatype.nexus.index.context.IndexCreator;
import org.sonatype.nexus.index.context.IndexingContext;
import org.sonatype.nexus.index.context.UnsupportedExistingLuceneIndexException;
 
/**
 * Sample app showing how to integrate with the Nexus Indexer.  Note that this is a simple Plexus
 * component implementing the SampleApp interface:
 *
 * public interface SampleApp
 * {
 *    void index()
 *        throws IOException;
 * }
 *
 * @author Damian
 *
 */
@Component( role = SampleApp.class )
public class DefaultSampleApp
    implements SampleApp,
        Initializable,
        Disposable
{
    // The nexus indexer
    @Requirement
    private NexusIndexer indexer;
 
    // The list of index creators we will be using (all of them)
    @Requirement( role = IndexCreator.class )
    private List<IndexCreator> indexCreators;
 
    // The indexing context
    private IndexingContext context = null;
 
    // The path to the repository to index, value will be pulled from
    // the plexus context
    @Configuration( value = "${repository.path}" )
    private File repositoryDirectoryPath;
 
    // The path to store index files, value will be pulled from
    // the plexus context
    @Configuration( value = "${index.path}")
    private File indexDirectoryPath;
 
    // Initialize the index context
    public void initialize()
        throws InitializationException
    {
        try
        {
            // Add the indexing context
            context = indexer.addIndexingContext(
                // id of the context
                "sample",
                // id of the repository
                "sampleRepo",
                // directory containing repository
                repositoryDirectoryPath,
                // directory where index will be stored
                indexDirectoryPath,
                // remote repository url...not in this example
                null,
                // index update url...not in this example
                null,
                // list of index creators
                indexCreators );
        }
        catch ( UnsupportedExistingLuceneIndexException e )
        {
            throw new InitializationException( "Error initializing IndexingContext", e );
        }
        catch ( IOException e )
        {
            throw new InitializationException( "Error initializing IndexingContext", e );
        }
    }
 
    // clean up the context
    public void dispose()
    {
        if ( context != null )
        {
            // Passing true also deletes the index files. A real application would typically
            // pass false to keep them, but this is just a test app...
            try
            {
                indexer.removeIndexingContext( context, true );
            }
            catch ( IOException e )
            {
                e.printStackTrace();
            }
        }
    }
 
    // index the repository
    public void index()
        throws IOException
    {
        // Perform the scan, which indexes all artifacts in the repository directory.
        // Once this is done, searching will be available.
        indexer.scan( context );
    }
}

As you can see, it doesn’t take much code to manage an index of your Maven 2 repository. This is the simplest case, where you have nothing to add to the index. But suppose you want to index additional data.
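
To picture what a scan has to recover from the directory layout, here is a minimal sketch, using only the JDK, of how a Maven 2 repository path maps back to GAV coordinates. GavPathSketch is a hypothetical helper written for this illustration; it is not part of the Nexus Indexer API and it is not how the indexer itself is implemented. It assumes a repository-relative path with "/" separators and does not handle timestamped snapshot file names.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of the Nexus Indexer API. */
final class GavPathSketch {
    final String groupId;
    final String artifactId;
    final String version;
    final String classifier; // null when the file name carries no classifier
    final String extension;

    private GavPathSketch(String groupId, String artifactId, String version,
                          String classifier, String extension) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
        this.classifier = classifier;
        this.extension = extension;
    }

    /**
     * Parses a path of the form
     * group/parts/artifactId/version/artifactId-version[-classifier].extension.
     * Returns null when the path does not follow that layout.
     */
    static GavPathSketch parse(String relativePath) {
        String[] parts = relativePath.split("/");
        if (parts.length < 4) {
            return null; // need at least group, artifactId, version and file name
        }
        String fileName = parts[parts.length - 1];
        String version = parts[parts.length - 2];
        String artifactId = parts[parts.length - 3];
        // Every leading directory is one segment of the groupId
        String groupId = String.join(".", Arrays.copyOfRange(parts, 0, parts.length - 3));

        String prefix = artifactId + "-" + version;
        if (!fileName.startsWith(prefix)) {
            return null; // e.g. maven-metadata.xml
        }
        String rest = fileName.substring(prefix.length()); // "[-classifier].extension"
        String classifier = null;
        if (rest.startsWith("-")) {
            int dot = rest.indexOf('.');
            if (dot <= 1) {
                return null; // missing extension or empty classifier
            }
            classifier = rest.substring(1, dot);
            rest = rest.substring(dot);
        }
        if (!rest.startsWith(".") || rest.length() < 2) {
            return null;
        }
        return new GavPathSketch(groupId, artifactId, version, classifier, rest.substring(1));
    }

    public static void main(String[] args) {
        GavPathSketch gav = parse("org/apache/maven/maven-core/2.0.9/maven-core-2.0.9.jar");
        System.out.println(gav.groupId + ":" + gav.artifactId + ":" + gav.version);
    }
}
```

Files that do not match the pattern, such as maven-metadata.xml, come back as null, which is roughly the "this is not an artifact" decision any scanner has to make for each file it visits.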

The Nexus Indexer uses a set of IndexCreator Plexus components to create the Lucene index that stores the data. It ships with two by default: the MinimalArtifactInfoIndexCreator (min) and the JarFileContentsIndexCreator (jar). The min creator takes common items from an artifact and its POM (making a best guess if a POM isn’t available), such as groupId, artifactId, version, packaging, and classifier, and puts them in the index as searchable fields. The jar creator simply puts all class names into the index as searchable fields. Of course, you can create your own IndexCreator Plexus component to index whatever you see fit.
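
To give a feel for the kind of data the jar creator works with, here is a minimal sketch, using only the JDK, that lists the class names found in a jar file. JarClassNamesSketch is a hypothetical name introduced for this illustration. The dotted name format it returns is an assumption made for readability; the format the real JarFileContentsIndexCreator stores in the index is not shown in this post and may differ.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper for illustration only; not part of the Nexus Indexer API. */
final class JarClassNamesSketch {
    private static final String CLASS_SUFFIX = ".class";

    /** Returns the dotted names of all .class entries in the jar, in entry order. */
    static List<String> classNames(File jar) throws IOException {
        List<String> names = new ArrayList<String>();
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String entryName = entries.nextElement().getName();
                if (entryName.endsWith(CLASS_SUFFIX)) {
                    // org/damian/Foo.class becomes org.damian.Foo
                    String withoutSuffix =
                        entryName.substring(0, entryName.length() - CLASS_SUFFIX.length());
                    names.add(withoutSuffix.replace('/', '.'));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Usage: pass the path to any jar file as the first argument
        for (String name : classNames(new File(args[0]))) {
            System.out.println(name);
        }
    }
}
```

A custom IndexCreator could gather data from an artifact in a similar way before adding it to the index.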

The sample creator below shows how to add a new IndexCreator. (Note that nexus-indexer-2.0.1-SNAPSHOT is required for this sample to run properly.)

package org.sonatype.nexus.index.creator;
 
import java.io.IOException;
 
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.codehaus.plexus.component.annotations.Component;
import org.sonatype.nexus.index.ArtifactContext;
import org.sonatype.nexus.index.ArtifactInfo;
import org.sonatype.nexus.index.context.IndexCreator;
 
/**
 * A Sample Index Creator that will show how to properly create your own IndexCreator component
 * @author Damian
 *
 */
// Define the plexus component, and the hint that plexus will use to load it
@Component( role = IndexCreator.class, hint = "sample" )
public class SampleIndexCreator
    extends AbstractIndexCreator
{
    // The name of my sample field
    public static final String MY_FIELD = "myfield";
 
    /**
     * Populate ArtifactInfo with data specific to your application.  Note that the
     * artifactContext contains other useful objects, which may come in handy.
     */
    public void populateArtifactInfo( ArtifactContext artifactContext )
        throws IOException
    {
        // Add the data to the ArtifactInfo
        artifactContext.getArtifactInfo().addAttribute( MY_FIELD, "value" );
    }
 
    /**
     * Populate ArtifactInfo from an existing Lucene index document. You will want to populate
     * the same fields that you populate in populateArtifactInfo.
     */
    public boolean updateArtifactInfo( Document document, ArtifactInfo artifactInfo )
    {
        // Add the data to the ArtifactInfo
        artifactInfo.addAttribute( MY_FIELD, document.get( MY_FIELD ) );
 
        // Note that returning false here notifies the calling party of failure
        return true;
    }
 
    /**
     * Add data from the artifactInfo to the index
     */
    public void updateDocument( ArtifactInfo artifactInfo, Document document )
    {
        document.add(
            new Field(
                // Field name, should be unique across all IndexCreator objects
                MY_FIELD,
                // get your new data and add to the index
                artifactInfo.getAttribute( MY_FIELD ),
                // Whether the data should be stored (YES or NO)
                Field.Store.YES,
                // Whether the field should be indexed (NO, TOKENIZED, UN_TOKENIZED)
                Field.Index.UN_TOKENIZED ) );
    }
}
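
The three methods of the creator form a round trip: a value is computed at scan time, written into the stored document, and later read back out of the document. The sketch below models that contract with plain maps so it can run without the indexer on the classpath. The maps are stand-ins for ArtifactInfo and the Lucene Document, and CreatorRoundTripSketch is a hypothetical name; none of this is the real API. Returning false when the field is absent is a choice made for this sketch, not behavior confirmed by the sample above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the creator contract; plain maps stand in for the real types. */
final class CreatorRoundTripSketch {
    static final String MY_FIELD = "myfield";

    // Stand-in for populateArtifactInfo: compute the value while scanning the artifact
    static void populateArtifactInfo(Map<String, String> artifactInfo) {
        artifactInfo.put(MY_FIELD, "value");
    }

    // Stand-in for updateDocument: copy the value from the artifact info into the document
    static void updateDocument(Map<String, String> artifactInfo, Map<String, String> document) {
        String value = artifactInfo.get(MY_FIELD);
        if (value != null) {
            document.put(MY_FIELD, value);
        }
    }

    // Stand-in for updateArtifactInfo: restore the value from a stored document
    static boolean updateArtifactInfo(Map<String, String> document, Map<String, String> artifactInfo) {
        String value = document.get(MY_FIELD);
        if (value == null) {
            return false; // nothing stored for this creator
        }
        artifactInfo.put(MY_FIELD, value);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> scanned = new HashMap<String, String>();
        populateArtifactInfo(scanned);

        Map<String, String> document = new HashMap<String, String>();
        updateDocument(scanned, document);

        Map<String, String> restored = new HashMap<String, String>();
        updateArtifactInfo(document, restored);
        System.out.println(restored.get(MY_FIELD)); // prints "value"
    }
}
```

If a value can be missing for some artifacts, a null check like the one in updateDocument is probably worth carrying over to a real creator.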

Simply pass this creator in with the list of creators when adding an indexing context, and it will be available. That’s it for this entry; next time we’ll get to searching the Nexus index.

A sample Maven project can be found at http://svn.sonatype.org/nexus/trunk/sandbox/nexus-indexer-sample and will be updated periodically as I put together more details for these blog posts.