Setting Up A Simple Custom Lucene Index With Umbraco

Umbraco, like a lot of CMSs, decided against building its own search functionality from scratch. Instead, it uses one of the most widely used search libraries available, Lucene, through a wrapper library called Examine. Lucene is a really powerful search library that is also super quick. In today’s guide, I’m going to cover the basics to get you up and running with your own custom indexes in Umbraco.

Where Are My Indexes?

If you are new to Umbraco, you will find all of your website’s Lucene files in your website’s webroot, under ‘App_Data’ -> ‘TEMP’ -> ‘ExamineIndexes’. Out of the box, you should see folders called ‘Internal’ and ‘External’. The first tip I’m going to go over is how to force Umbraco to re-index your website. Open your Umbraco backend in a browser:

[Image: umbraco_indexes_introduction_1]

Click on the ‘Developer’ section and then the ‘Examine Management’ tab. Under the Indexes section, you should see a list of the indexes that have been set up in your site. If you click on any of the indexes, you should see a ‘Rebuild index’ button. Clicking it regenerates that index for you.
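If you would rather trigger a rebuild from code than from the backend, Examine exposes the same operation on its index providers. The snippet below is only a minimal sketch, assuming an Umbraco 7 site with the Examine version that ships with it. The `IndexTools` class and `Rebuild` method are names I have made up for illustration, and the indexer name you pass in has to match whatever is registered in your own config.

```csharp
using Examine;

public static class IndexTools // hypothetical helper, not part of Umbraco
{
    // Rebuilds the named index from scratch.
    // Returns false when no indexer with that name is registered.
    public static bool Rebuild(string indexerName)
    {
        var indexer = ExamineManager.Instance.IndexProviderCollection[indexerName];
        if (indexer == null)
            return false;

        indexer.RebuildIndex();
        return true;
    }
}
```

A rebuild deletes and recreates the index, so on a large site it is probably something to run on demand rather than on every request.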

How Are Umbraco Indexes Defined?

If you are wondering how these indexes magically appear on your website, then you can take a look in your ‘Config’ folder:

[Image: umbraco_indexes_introduction_2]

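In here you will see two files called ‘ExamineIndex.config’ and ‘ExamineSettings.config’. ‘ExamineIndex.config’ defines what goes into each index, and ‘ExamineSettings.config’ registers the providers that build and search them. If you look inside both files, you should be able to see where the default indexes are defined.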

Custom Umbraco Indexes

You might be wondering why you would need to create a custom Umbraco index. A few reasons include:

  • Locking down your search for security by only exposing certain Umbraco items
  • Creating a specialist search that only contains results for a given language
  • Performance, since a smaller, more focused index is quicker to search and rebuild

To get started with a custom index, you first need to define a custom indexset. An indexset is a configuration element in ‘ExamineIndex.config’ that defines the doctypes and fields the index will use. Here is an example of a custom indexset I’ve created:

<IndexSet SetName="Contacts" IndexPath="~/App_Data/TEMP/ExamineIndexes/Contacts/" IndexParentId="1">
  <IndexAttributeFields>
    <add Name="id" />
    <add Name="nodeName" />
    <add Name="updateDate" />
    <add Name="writerName" />
    <add Name="nodeTypeAlias" />
  </IndexAttributeFields>
  <IndexUserFields>
    <!-- Contact Page -->
    <add Name="companyHQTitle" />
    <add Name="hQAddressLine1" />
    <add Name="hQAddressLine2" />
    <add Name="hQAddressLine3" />
    <add Name="hQPostcode" />
  </IndexUserFields>
  <IncludeNodeTypes>
    <add Name="ContactPage" />
  </IncludeNodeTypes>
  <ExcludeNodeTypes>
  </ExcludeNodeTypes>
</IndexSet>

This indexset can be added anywhere inside the root element of ‘ExamineIndex.config’, alongside the indexsets that are already there. The element might look scary at first, but it’s quite easy when you break it down:

  • SetName: The name of the indexset (or alias in Umbraco talk). This is the name the providers in ‘ExamineSettings.config’ use to reference it
  • IndexPath: Where the index will live on disk
  • IndexParentId: Optional. Limits the index to content that sits beneath the node with this ID
  • IndexAttributeFields: The in-built Umbraco fields to include in the index
  • IndexUserFields: The custom fields in your doctypes to include
  • IncludeNodeTypes: The doctypes you want to include
  • ExcludeNodeTypes: The doctypes you want to exclude
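To show where the indexset sits in the file, here is a rough sketch of ‘ExamineIndex.config’ with the Contacts set in place. I’m assuming the stock Umbraco 7 layout here, where the root element is called ExamineLuceneIndexSets. If your file looks different, follow your file rather than this sketch. The Contacts set is trimmed down to keep the example short.

```xml
<?xml version="1.0"?>
<ExamineLuceneIndexSets>

  <!-- The indexsets that ship with Umbraco stay here untouched -->

  <!-- Custom indexset: sits next to the defaults, in any order -->
  <IndexSet SetName="Contacts" IndexPath="~/App_Data/TEMP/ExamineIndexes/Contacts/">
    <IndexAttributeFields>
      <add Name="id" />
      <add Name="nodeName" />
    </IndexAttributeFields>
    <IndexUserFields>
      <add Name="companyHQTitle" />
      <add Name="hQPostcode" />
    </IndexUserFields>
    <IncludeNodeTypes>
      <add Name="ContactPage" />
    </IncludeNodeTypes>
  </IndexSet>

</ExamineLuceneIndexSets>
```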

After defining what you want to include in your custom index, you need to register it with Umbraco. This is done in ‘ExamineSettings.config’ and is a two-step process: you register an index provider under ExamineIndexProviders and a search provider under ExamineSearchProviders. The code to register the index provider looks like this and needs to live within the ExamineIndexProviders -> providers section.

<add name="Contacts" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
     indexSet="Contacts"
     supportUnpublished="false"
     supportProtected="false"
     analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />

Next, you will need to register the search provider in the ExamineSearchProviders -> providers section.

<add name="Contacts"
     indexSet="Contacts"
     type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
     analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />

The most important part of this section is that the ‘indexSet’ property matches the SetName of the indexset you defined in ‘ExamineIndex.config’. If the two don’t match, the provider has no index to point at.
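Putting the two registrations together, ‘ExamineSettings.config’ ends up looking roughly like the sketch below. This assumes the stock Umbraco 7 layout, including ‘ExternalSearcher’ as the default search provider, so treat the surrounding elements as an assumption and check them against your own file. I have also spelled out indexSet on the indexer, on the assumption that Examine can only work the set out for itself when the names follow its ‘...Indexer’ / ‘...IndexSet’ naming convention.

```xml
<Examine>
  <ExamineIndexProviders>
    <providers>
      <!-- The indexers that ship with Umbraco stay here untouched -->

      <!-- Step 1: the provider that builds the Contacts index -->
      <add name="Contacts" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
           indexSet="Contacts"
           supportUnpublished="false"
           supportProtected="false"
           analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
    </providers>
  </ExamineIndexProviders>

  <ExamineSearchProviders defaultProvider="ExternalSearcher">
    <providers>
      <!-- The searchers that ship with Umbraco stay here untouched -->

      <!-- Step 2: the provider that queries the Contacts index -->
      <add name="Contacts"
           indexSet="Contacts"
           type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
           analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />
    </providers>
  </ExamineSearchProviders>
</Examine>
```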

Searching The Index

Now that you’ve defined an index, you should see its files appear on disk once it has been built (use the ‘Rebuild index’ button from earlier if they don’t). The next step is to use it in code:

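// Requires: using System.Linq; using Examine; using Examine.SearchCriteria;
var searchTerm = "search term";
// The key must match the name registered under ExamineSearchProviders
var searcher = ExamineManager.Instance.SearchProviderCollection["Contacts"];
if (searcher == null)
    return null;
// Start a query whose clauses are combined with OR by default
var searchCriteria = searcher.CreateSearchCriteria(BooleanOperation.Or);
// Pass the term straight through as a raw Lucene query
ISearchCriteria query = searchCriteria.RawQuery(searchTerm);
// Run the search and put the best matches first
IOrderedEnumerable<SearchResult> searchResults = searcher.Search(query).OrderByDescending(x => x.Score);
return searchResults;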

In the code above, we use the ExamineManager, passing in the name of the search provider to select which index to query. We then create a searchCriteria and pass in a search term. You can be very detailed in how you query the index, but in this example I’m just doing a basic search.
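As an idea of what a more detailed query can look like, Examine’s fluent API lets you target individual fields instead of passing a raw query string. This is a minimal sketch rather than a recommended implementation. It assumes the Examine version bundled with Umbraco 7, and `ContactSearch` and `ByTitleOrPostcode` are names I have invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using Examine;
using Examine.SearchCriteria;

public static class ContactSearch // hypothetical helper, not part of Umbraco
{
    // Looks for the term in the title field or the postcode field
    // of the Contacts index, best matches first.
    public static IEnumerable<SearchResult> ByTitleOrPostcode(string term)
    {
        var searcher = ExamineManager.Instance.SearchProviderCollection["Contacts"];
        if (searcher == null || string.IsNullOrWhiteSpace(term))
            return Enumerable.Empty<SearchResult>();

        var criteria = searcher.CreateSearchCriteria(BooleanOperation.Or);

        // Field(...) targets one indexed field; Or() chains the next clause;
        // Compile() turns the chain into something Search() accepts.
        var query = criteria
            .Field("companyHQTitle", term)
            .Or()
            .Field("hQPostcode", term)
            .Compile();

        return searcher.Search(query).OrderByDescending(x => x.Score);
    }
}
```

The field names come from the IndexUserFields in the Contacts indexset, so a field that isn’t listed there can’t be searched this way.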

Lastly, we query the index and order the results by the ‘Score’ property. When Lucene searches an index, it also rates how close a match each result is, so ordering with the most relevant results first is pretty standard practice.
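Once you have the results, each one carries its node ID, its score and the indexed field values. The sketch below shows one way to read them, assuming the same Examine version as above. `ContactResults` and `Describe` are made-up names, and the output format is only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Examine;

public static class ContactResults // hypothetical helper, not part of Umbraco
{
    // Turns raw results into "score: node name (id)" lines, best match first.
    public static List<string> Describe(IEnumerable<SearchResult> results)
    {
        return results
            .OrderByDescending(r => r.Score)
            .Select(r =>
            {
                string name;
                // Fields only holds what the indexset was configured to include
                if (!r.Fields.TryGetValue("nodeName", out name))
                    name = "(no nodeName)";

                return string.Format("{0:F2}: {1} ({2})", r.Score, name, r.Id);
            })
            .ToList();
    }
}
```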

Conclusion

If you followed everything correctly, you should now have a custom index defined, you should be able to see the physical files being generated, and you should be able to use Umbraco’s ExamineManager to search that index in your code. If you have a simple need for an index, then I’m hoping this guide covers everything you need to get up and running. If you have more complex needs, Lucene can be a complicated beast and you can get your hands a lot dirtier.

Jon D Jones

Software Architect, Programmer and Technologist Jon Jones is founder and CEO of London-based tech firm Digital Prompt. He has been working in the field for nearly a decade, specializing in new technologies and technical solution research in the web business. A passionate blogger at heart, speaker and consultant from England, always on the hunt for the next challenge.
