Nexus Indexer API: Part 2


June 7, 2009 By Damian Bradicich

This series of Nexus Indexer posts focuses on integrating the Nexus Indexer into your own application. If your application needs to search for an artifact by GAV coordinates or by class name, you can use the Nexus Index format and the Nexus Indexer API to locate artifacts in any repository that creates a Nexus Index. There are four main functions that are exposed in the Nexus Indexer API.

This post will focus on Searching and provide you with some concrete code examples that show you how to use the indexer search capabilities in your application.

Searching

The Nexus Indexer uses Apache Lucene to handle storage and searching of the index data, and provides a very simple method to search against fields in the index. Below, I have extended the Plexus component from Part 1 of this blog series to include search functionality.

package org.damian;
...
/**
 * Sample app to show how to integrate with the nexus indexer.  Note that this is a simple plexus
 * component extending the SampleApp interface
 *
 * public interface SampleApp
 * {
 *    void index()
 *        throws IOException;
 *
 *    Set<ArtifactInfo> searchIndexFlat( String field, String value )
 *        throws IOException;
 *
 *    Set<ArtifactInfo> searchIndexFlat( Query query )
 *        throws IOException;
 *
 *    Map<String, ArtifactInfoGroup> searchIndexGrouped( String field, String value )
 *        throws IOException;
 *
 *    Map<String, ArtifactInfoGroup> searchIndexGrouped( String field, String value, Grouping grouping )
 *        throws IOException;
 *
 *    Map<String, ArtifactInfoGroup> searchIndexGrouped( Query q, Grouping grouping )
 *        throws IOException;
 * }
 *
 * @author Damian
 *
 */
@Component( role = SampleApp.class )
public class DefaultSampleApp
    implements SampleApp,
        Initializable,
        Disposable
{
    // The nexus indexer
    @Requirement
    private NexusIndexer indexer;
 
    ...
 
    // search for artifacts
    public Set<ArtifactInfo> searchIndexFlat( String field, String value )
        throws IOException
    {
        // Build a query that will search the documents for the field set to the supplied value
        // This uses predefined logic to define the query
        // See http://svn.sonatype.org/nexus/trunk/nexus-indexer/src/main/java/org/sonatype/nexus/index/DefaultQueryCreator.java
        // for details
        Query query = indexer.constructQuery( field, value );
 
        return searchIndexFlat( query );
    }
 
    // search for artifacts using pre-built query
    public Set<ArtifactInfo> searchIndexFlat( Query query )
        throws IOException
    {
        // Build the request
        FlatSearchRequest request = new FlatSearchRequest( query );
 
        // Perform the search
        FlatSearchResponse response = indexer.searchFlat( request );
 
        // Return the artifact info objects
        return response.getResults();
    }
 
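    // search for artifacts, grouping the results with the default GAV grouping
    public Map<String, ArtifactInfoGroup> searchIndexGrouped( String field, String value )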
        throws IOException
    {
        // We will simply use the GAV grouping, meaning that each groupId/artifactId/version/classifier
        // will have its own entry in the returned map
        return searchIndexGrouped( field, value, new GAVGrouping() );
    }
 
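    // search for artifacts, grouping the results with the supplied grouping
    public Map<String, ArtifactInfoGroup> searchIndexGrouped( String field, String value, Grouping grouping )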
        throws IOException
    {
        // Build a query that will search the documents for the field set to the supplied value
        // This uses predefined logic to define the query
        // See http://svn.sonatype.org/nexus/trunk/nexus-indexer/src/main/java/org/sonatype/nexus/index/DefaultQueryCreator.java
        // for details
        Query query = indexer.constructQuery( field, value );
 
        return searchIndexGrouped( query, grouping );
    }
 
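    // search for artifacts using pre-built query, grouping the results with the supplied grouping
    public Map<String, ArtifactInfoGroup> searchIndexGrouped( Query q, Grouping grouping )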
        throws IOException
    {
        GroupedSearchRequest request = new GroupedSearchRequest( q, grouping );
 
        GroupedSearchResponse response = indexer.searchGrouped( request );
 
        return response.getResults();
    }
}

As you can see, there are two ways to perform a search. The first lets the Nexus Indexer decide what type of Lucene Query to create, based upon the input string. The second is to pass in your own Lucene Query object; you can use any object that extends the org.apache.lucene.search.Query abstract class. If you choose to let the Nexus Indexer build your query, the following rules are applied to the search term:

  • the string is converted to lowercase
  • if the first character is ‘^’, it is dropped
  • if the first character is anything other than ‘^’, ‘*’ is prepended to the string
  • if the last character is ‘ ‘, ‘<’ or ‘$’, it is dropped
  • if the last character is anything other than ‘ ‘, ‘<’ or ‘$’, ‘*’ is appended to the end of the string
  • if the modified string does not contain ‘*’, then a TermQuery is used (exact match)
  • if the modified string contains ‘*’ only in the last position, then a PrefixQuery is used (and ‘*’ is dropped from the string)
  • if the modified string contains ‘*’ in any other position, then a WildcardQuery is used
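
To make the rules above concrete, here is a minimal sketch that applies them to a search term and reports which kind of query would result. It is only an illustration of the list as written, not the actual DefaultQueryCreator code, and it has no Lucene dependency. The names QueryRuleSketch, Kind, Result and classify are hypothetical and exist only for this example, and rejecting an empty term is my own assumption.

```java
import java.util.Locale;

// Illustrative sketch only: mirrors the bullet rules literally and is NOT the
// actual DefaultQueryCreator. All names in this class are hypothetical.
final class QueryRuleSketch
{
    enum Kind { TERM, PREFIX, WILDCARD }

    static final class Result
    {
        final Kind kind;   // which Lucene query type the rules would select
        final String text; // the term text that would be handed to that query

        Result( Kind kind, String text )
        {
            this.kind = kind;
            this.text = text;
        }
    }

    static Result classify( String input )
    {
        // Assumption: an empty term is rejected; the rules do not cover this case
        if ( input == null || input.isEmpty() )
        {
            throw new IllegalArgumentException( "search term must not be empty" );
        }

        String q = input.toLowerCase( Locale.ROOT );

        // Drop a leading '^'; otherwise prepend '*'
        q = q.charAt( 0 ) == '^' ? q.substring( 1 ) : "*" + q;

        // Drop a trailing ' ', '<' or '$'; otherwise append '*'
        if ( !q.isEmpty() )
        {
            char last = q.charAt( q.length() - 1 );
            boolean dropLast = last == ' ' || last == '<' || last == '$';
            q = dropLast ? q.substring( 0, q.length() - 1 ) : q + "*";
        }

        int firstStar = q.indexOf( '*' );
        if ( firstStar < 0 )
        {
            return new Result( Kind.TERM, q ); // no wildcard: exact match
        }
        if ( firstStar == q.length() - 1 )
        {
            return new Result( Kind.PREFIX, q.substring( 0, firstStar ) ); // only a trailing '*'
        }
        return new Result( Kind.WILDCARD, q ); // '*' somewhere else
    }

    public static void main( String[] args )
    {
        for ( String s : new String[] { "^Value$", "^val", "alue$", "val" } )
        {
            Result r = classify( s );
            System.out.println( s + " -> " + r.kind + " " + r.text );
        }
    }
}
```

Under these rules, "^Value$" becomes an exact match on "value", "^val" becomes a prefix search on "val", "alue$" becomes the wildcard "*alue", and a bare "val" becomes the wildcard "*val*". In other words, a plain search term behaves like a "contains" search unless you anchor one or both ends.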

There will probably be occasions where this list of rules doesn’t fit your use case. In that case, you will want to give the Indexer your own Query object instead. That leaves you free to choose from the many Query implementations that Lucene provides, or even to write your own if necessary.

You will also notice that there are two different types of result sets you can get back: flat or grouped. Flat results are simply a Set of ArtifactInfo objects, one for each match in the index. Grouped results combine the matches based upon the Grouping you request (in my sample, the default is GAVGrouping, which groups together items that share the same groupId/artifactId/version/classifier).
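
The difference between the two result shapes can be sketched in plain Java (8 or later), without the indexer on the classpath. This is only a conceptual illustration, not how the Nexus Indexer implements grouping: GroupingSketch, Artifact, group and gavKey are hypothetical names, Artifact is a stand-in for ArtifactInfo, and the colon-separated key format is an assumption based on the four-part key checked in the grouped unit test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Conceptual sketch only: all names here are hypothetical and this is not the
// Nexus Indexer's own grouping code.
final class GroupingSketch
{
    // Hypothetical stand-in for ArtifactInfo, carrying only the coordinates
    static final class Artifact
    {
        final String groupId;
        final String artifactId;
        final String version;
        final String classifier;

        Artifact( String groupId, String artifactId, String version, String classifier )
        {
            this.groupId = groupId;
            this.artifactId = artifactId;
            this.version = version;
            this.classifier = classifier;
        }
    }

    // Turns a flat collection of matches into a map of group key -> members
    static <T> Map<String, List<T>> group( Iterable<T> flat, Function<? super T, String> keyOf )
    {
        Map<String, List<T>> grouped = new LinkedHashMap<>();
        for ( T item : flat )
        {
            grouped.computeIfAbsent( keyOf.apply( item ), k -> new ArrayList<>() ).add( item );
        }
        return grouped;
    }

    // Assumed key format: groupId:artifactId:version:classifier
    static String gavKey( Artifact a )
    {
        return a.groupId + ":" + a.artifactId + ":" + a.version + ":" + a.classifier;
    }
}
```

Calling group( flat, GroupingSketch::gavKey ) yields one entry per distinct coordinate set, while passing a different key function, such as a -> a.groupId, yields coarser groups. The choice of key is the only thing that changes between groupings.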

Here are some samples from a unit test in the project source code that exercise the different kinds of searching.

    public void testSampleSearch()
        throws Exception
    {
        app.index();
 
        Set<ArtifactInfo> artifacts = app.searchIndexFlat( SampleIndexCreator.MY_FIELD, "value" );
 
        assertNotNull( "returned artifacts is null", artifacts );
        assertFalse( "returned artifacts is empty", artifacts.isEmpty() );
 
        for ( ArtifactInfo ai : artifacts )
        {
            assertEquals( "returned artifact has invalid data", "value", ai.getAttributes().get( SampleIndexCreator.MY_FIELD ) );
        }
    }
 
    public void testSampleSearchWithTermQuery()
        throws Exception
    {
        app.index();
 
        // This type of query will be totally built outside of nexus indexer, and will not
        // be tied to constraints defined in
        // http://svn.sonatype.org/nexus/trunk/nexus-indexer/src/main/java/org/sonatype/nexus/index/DefaultQueryCreator.java
 
        // A TermQuery matches equal strings
        Query q = new TermQuery( new Term( SampleIndexCreator.MY_FIELD, "value" ) );
 
        Set<ArtifactInfo> artifacts = app.searchIndexFlat( q );
 
        assertNotNull( "returned artifacts is null", artifacts );
        assertFalse( "returned artifacts is empty", artifacts.isEmpty() );
 
        for ( ArtifactInfo ai : artifacts )
        {
            assertEquals( "returned artifact has invalid data", "value", ai.getAttributes().get( SampleIndexCreator.MY_FIELD ) );
        }
    }
 
    public void testSampleSearchWithPrefixQuery()
        throws Exception
    {
        app.index();
 
        // This type of query will be totally built outside of nexus indexer, and will not
        // be tied to constraints defined in
        // http://svn.sonatype.org/nexus/trunk/nexus-indexer/src/main/java/org/sonatype/nexus/index/DefaultQueryCreator.java
 
        // A PrefixQuery will look for any documents containing the MY_FIELD term that starts with val
        Query q = new PrefixQuery( new Term( SampleIndexCreator.MY_FIELD, "val" ) );
 
        Set<ArtifactInfo> artifacts = app.searchIndexFlat( q );
 
        assertNotNull( "returned artifacts is null", artifacts );
        assertFalse( "returned artifacts is empty", artifacts.isEmpty() );
 
        for ( ArtifactInfo ai : artifacts )
        {
            assertEquals( "returned artifact has invalid data", "value", ai.getAttributes().get( SampleIndexCreator.MY_FIELD ) );
        }
    }
 
    public void testSampleSearchWithWildcardQuery()
        throws Exception
    {
        app.index();
 
        // This type of query will be totally built outside of nexus indexer, and will not
        // be tied to constraints defined in
        // http://svn.sonatype.org/nexus/trunk/nexus-indexer/src/main/java/org/sonatype/nexus/index/DefaultQueryCreator.java
 
        // A WildcardQuery supports the * and ? wildcard characters
        Query q = new WildcardQuery( new Term( SampleIndexCreator.MY_FIELD, "*alue" ) );
 
        Set<ArtifactInfo> artifacts = app.searchIndexFlat( q );
 
        assertNotNull( "returned artifacts is null", artifacts );
        assertFalse( "returned artifacts is empty", artifacts.isEmpty() );
 
        for ( ArtifactInfo ai : artifacts )
        {
            assertEquals( "returned artifact has invalid data", "value", ai.getAttributes().get( SampleIndexCreator.MY_FIELD ) );
        }
 
        // A WildcardQuery supports the * and ? wildcard characters
        q = new WildcardQuery( new Term( SampleIndexCreator.MY_FIELD, "v?lue" ) );
 
        artifacts = app.searchIndexFlat( q );
 
        assertNotNull( "returned artifacts is null", artifacts );
        assertFalse( "returned artifacts is empty", artifacts.isEmpty() );
 
        for ( ArtifactInfo ai : artifacts )
        {
            assertEquals( "returned artifact has invalid data", "value", ai.getAttributes().get( SampleIndexCreator.MY_FIELD ) );
        }
 
        // A WildcardQuery supports the * and ? wildcard characters
        q = new WildcardQuery( new Term( SampleIndexCreator.MY_FIELD, "val*" ) );
 
        artifacts = app.searchIndexFlat( q );
 
        assertNotNull( "returned artifacts is null", artifacts );
        assertFalse( "returned artifacts is empty", artifacts.isEmpty() );
 
        for ( ArtifactInfo ai : artifacts )
        {
            assertEquals( "returned artifact has invalid data", "value", ai.getAttributes().get( SampleIndexCreator.MY_FIELD ) );
        }
    }
 
    public void testSampleSearchGroup()
        throws Exception
    {
        app.index();
 
        Map<String, ArtifactInfoGroup> groupedArtifacts = app.searchIndexGrouped( SampleIndexCreator.MY_FIELD, "value" );
 
        assertNotNull( "returned groupedArtifacts is null", groupedArtifacts );
        assertFalse( "returned groupedArtifacts should not be empty", groupedArtifacts.isEmpty() );
 
        for ( ArtifactInfoGroup artifactGroup : groupedArtifacts.values() )
        {
            String[] parts = artifactGroup.getGroupKey().split( ":" );
            //1st part groupId
            //2nd part artifactId
            //3rd part version
            //4th part classifier
            assertEquals( "should be 4 parts to the group key", 4, parts.length );
            assertFalse( "each group should contain at least 1 artifact", artifactGroup.getArtifactInfos().isEmpty() );
        }
    }
 
    public void testSampleSearchGroupNewGrouping()
        throws Exception
    {
        app.index();
 
        // Search using my own grouping, which will group based upon the MY_FIELD parameter
        Map<String, ArtifactInfoGroup> groupedArtifacts = app.searchIndexGrouped(
            SampleIndexCreator.MY_FIELD,
            "value",
            new AbstractGrouping()
            {
                @Override
                protected String getGroupKey( ArtifactInfo artifactInfo )
                {
                    return artifactInfo.getAttributes().get( SampleIndexCreator.MY_FIELD );
                }
            } );
 
        assertNotNull( "returned groupedArtifacts is null", groupedArtifacts );
        assertEquals( "returned groupedArtifacts should have 1 entry", 1, groupedArtifacts.size() );
        assertEquals( "group key should be value", "value", groupedArtifacts.values().iterator().next().getGroupKey() );
    }
}

As you can see from the last test method, you can create your own Grouping object to group things any way you see fit. In this case, the results are grouped using the MY_FIELD parameter.

That’s all for now. Next time, we will get into packing up indexes for publishing.

The sample Maven project can be found at http://svn.sonatype.org/nexus/trunk/sandbox/nexus-indexer-sample and will be updated periodically as I put together more details for the blog posts.