Ascend'15 - Episerver Find - Advanced Developer Session

At the start of November I was fortunate enough to attend the EPiServer Ascend conference, which was held in Las Vegas.

Together with Patrick van Kleef I held a lab session titled "Episerver Find - Advanced Developer Scenarios".
Patrick covered topics such as unified search, highlighting and statistics - you can read his session summary here: http://www.patrickvankleef.com/

My focus for the session was EPiServer Find's .NET Client API and this blog post is a short summary of some of the topics I covered.

Episerver Find - A powerful query engine

Episerver Find is not just a product for free text search; it is also well suited to content retrieval. It is a powerful and scalable query platform that can index and query large amounts of data of any type.

Indexing and customizing serialization

Episerver Find offers an intuitive API for indexing and querying objects which can be used from just about any .NET application.

.NET Client API 

  • EPiServer.Find.dll
  • Uses JSON.NET for serialization
  • No dependencies on other EPiServer products
  • Easy to install using NuGet

All communication with Find's REST API goes through an instance of the IClient interface. 

To obtain such an instance you can use the Client.CreateFromConfig() method, and that will create it based on settings in your app.config/web.config. Note that if you are using Find with any other EPiServer Products, like CMS or Commerce, you should always use the SearchClient.Instance singleton.

// Namespaces: EPiServer.Find (Client), EPiServer.Find.Framework (SearchClient)

// Creating a client instance from app.config/web.config settings
var client = Client.CreateFromConfig();

// *ALWAYS* use the SearchClient.Instance singleton instead if you're using
// Find with any other EPiServer products, like CMS/Commerce
var searchClient = SearchClient.Instance;

This is important since the client holds a number of default conventions related to how different types should be serialized, how they define IDs and more.

By default, all public properties on your model are serialized and sent to the search engine. By using attributes or conventions, you can customize how the data is serialized.

As a start, it is a good idea to indicate which property on your model should represent the ID. Unless specified, the service automatically assigns an ID to an indexed document. To explicitly specify an ID, you can either use the Id attribute or specify your property using conventions.

// Using the Id attribute
public class Product
{
    [EPiServer.Find.Id]
    public string ProductId { get; set; }
}

// Specify the Id property to Find using conventions
client.Conventions.ForType<Product>().IdIs(p => p.ProductId);

You can also use conventions to add extra fields to the index. This is especially handy if you're not allowed to change the model, say when you're indexing types from third-party libraries. You can also exclude fields, again by using attributes or convention methods. Take a look at the documentation for some examples.

Remember that customizing client conventions should be part of the initialization, typically in an initializable module. And for the indexing part itself, make sure you always bulk index.

Filtering and caching

Filters are similar to queries, but they perform better because, as opposed to free text search, a filter either matches completely or not at all. Filters are very well documented on Episerver World; see the Filtering section. And if you are familiar with LINQ, it is really not rocket science.

In general, use Match for an exact match and Exists to determine whether a field has a value.
For strings you have MatchCaseInsensitive if casing is not important, and there is also support for fuzzy search using MatchFuzzy.

You can use methods with familiar names from LINQ to apply sorting and paging - OrderBy, Skip, Take etc.
Once you're done building your search request, execute it against the REST API and get the result using the GetResult() method.

In general, it is a good idea to cache your queries, and there are built-in methods provided for this; see the example code below.

// Filter example: find all products in the knitwear category
// with a price in the range 10-50.
// Caches the result for 5 minutes.
var client = Client.CreateFromConfig();
var result = client.Search<Product>()
    .Filter(p => p.CategoryEnum.Match(CategoryEnum.Knitwear))
    .Filter(p => p.Price.InRange(10, 50))
    .StaticallyCacheFor(TimeSpan.FromMinutes(5))
    .GetResult();

How does caching work?

Find will use the query as the cache key and then cache the result in memory in your application. A good way of ensuring that you are in fact caching correctly is to look at Fiddler traffic - if you are caching correctly you should not see any requests to Find while the cache is valid. 

Keeping in mind that the query is the cache key, you need to be aware of the following scenario:

Let's say we add one more filter to the query example from above:

var yesterday = DateTime.Now.AddDays(-1);
// We are caching, but the result will never be fetched from the cache
var result = client.Search<Product>()
    .Filter(p => p.CategoryEnum.Match(CategoryEnum.Knitwear))
    .Filter(p => p.Price.InRange(10, 50))
    .Filter(p => p.LastUpdated.LessThan(yesterday))
    .StaticallyCacheFor(TimeSpan.FromMinutes(5))
    .GetResult();

We are caching but since the variable yesterday will change every time, the query is different and a new cache key will be "constructed" by Find. Over time we end up filling up memory and never take advantage of the cached items. 

To solve the issue, I would create a method that calculates the yesterday value based on the last time the query was cached, so that the value stays the same for as long as the cached result is valid.

var yesterday = GetCacheRelativeNowDateTime().AddDays(-1);
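One way to get that behaviour is to round the current time down to the start of the cache window, so that every query built within the same window contains an identical date and therefore maps to the same cache key. The sketch below is only an assumption about what such a helper could look like, not the code shown in the session: the CacheClock class, the RoundDown method and the 5-minute CacheDuration are hypothetical names and values, and only the GetCacheRelativeNowDateTime name comes from the line above.

```csharp
using System;

public static class CacheClock
{
    // Assumption: should match the duration passed to StaticallyCacheFor.
    public static readonly TimeSpan CacheDuration = TimeSpan.FromMinutes(5);

    // Returns "now" rounded down to the start of the current cache window.
    public static DateTime GetCacheRelativeNowDateTime()
    {
        return RoundDown(DateTime.Now, CacheDuration);
    }

    // Rounds a timestamp down to the nearest multiple of the given window.
    public static DateTime RoundDown(DateTime value, TimeSpan window)
    {
        if (window <= TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(window));
        }

        long windowStartTicks = value.Ticks - (value.Ticks % window.Ticks);
        return new DateTime(windowStartTicks, value.Kind);
    }
}
```

With a 5-minute window, a query built at 12:01 and one built at 12:04 both see 12:00, so the yesterday filter is identical until the window rolls over. The trade-off is that the date filter can lag behind the real clock by up to one window, which is the same staleness the cache already accepts.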


Project to improve performance

As already indicated, queries to Find are JSON requests and responses sent over the network.

Let's say we are looking for the size and stock of a product named "Mio Cardigan". If stock and size are all the data we need, there is no need to retrieve the whole Product object. By using projection we can tailor the objects to our needs. Let's look at an example.

// Example: find sizes and stock for the product named 'Mio cardigan'.
// Tailor the objects to your needs: less data transferred = smaller response.
var result = client.Search<Product>()
    .Filter(p => p.Name.Match("Mio cardigan"))
    .Select(r => new { Id = r.ProductId, SkuItems = r.SkuItems })
    .StaticallyCacheFor(TimeSpan.FromMinutes(1))
    .GetResult();

foreach (var hit in result)
{
    Console.WriteLine(hit.Id);
    foreach (var sku in hit.SkuItems)
    {
        Console.WriteLine("\t Size {0} Stock {1}", sku.Size, sku.Stock);
    }
}

The less data transferred, the smaller the response and the better the performance! Projection is supported to both anonymous types and classes.

I also talked about facets and multi search. If you want to learn more about multi search, read my blog post: Building an advanced search page using Episerver Find.

A console application with more sample code is available on GitHub. Download it, go to find.episerver.com to create an index, add it to app.config and you are good to go!

Support for nested queries in Episerver.Find 11

A couple of days after Ascend, Episerver released support for nested queries as part of update 89. Nested queries in Episerver Find let you query a parent object and filter it on a child collection marked as "nested". This is particularly useful in solutions with Episerver Commerce and Find, to manage search queries in large catalog structures.
Read more about it here.

Key points

  • EPiServer Find is more than free text search - it is also a flexible and scalable query platform
  • Use SearchClient.Instance when using Find with other EPiServer products
  • Customize serialization using attributes or conventions
  • Customizing client conventions should be part of an initializable module
  • Make sure you cache your queries
  • Use projection to improve performance
  • Episerver.Find 11 now supports nested queries