Charlie Hull (Flax) alerted me to a blog post by Jon Tai about tuning a search engine to boost specific types of content in a result list. The post is a nice illustration of two important points. The first is that when you want to enhance search performance you really do need to know your content well, and know it within the context of your users. The search application is Elasticsearch, and it is being used on the IGN website, so you can relate the comments to a real-world example. Jon talks about compensating for different document types, adding domain-specific logic, and understanding the pros and cons of boosting at index time versus query time.
The second is that it demonstrates the openness of open source software. It is certainly possible to do the same sort of boosting on a commercial engine, but you are limited to the tools supplied. If they meet your need that’s fine, but searching is a very granular process and you may need to get down to a very specific document type. Jon’s post not only provides the script for another Lucene/Elasticsearch developer to use but also points out what he tried that did not work. That is not likely to be in the user manual of a piece of commercial software. I also liked the nice use of Wolfram Alpha Clip ‘n Share at the end of the blog post.
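To make the idea of query-time boosting by document type a little more concrete, here is a minimal sketch in plain Python. It is not Jon's script and it does not call the Elasticsearch API; the `Hit` class, the type names and the boost values are all assumptions invented for illustration. It simply multiplies each hit's relevance score by a factor that depends on its document type and then re-sorts the list.

```python
from dataclasses import dataclass

# Hypothetical per-type multipliers; names and values are illustrative only.
TYPE_BOOSTS = {"article": 1.5, "video": 1.2, "forum_post": 0.8}


@dataclass
class Hit:
    doc_id: str
    doc_type: str
    score: float  # relevance score as returned by the engine


def boost_at_query_time(hits, boosts=TYPE_BOOSTS):
    """Return new hits, re-ranked by score multiplied by the boost for their type.

    Types without an entry in `boosts` keep their original score (factor 1.0).
    """
    boosted = [
        Hit(h.doc_id, h.doc_type, h.score * boosts.get(h.doc_type, 1.0))
        for h in hits
    ]
    return sorted(boosted, key=lambda h: h.score, reverse=True)


# Example: a forum post that scored highest drops below boosted content types.
hits = [
    Hit("a", "forum_post", 2.0),
    Hit("b", "article", 1.5),
    Hit("c", "video", 1.6),
]
ranked = boost_at_query_time(hits)
print([h.doc_id for h in ranked])
```

Because the factors are applied when the query runs, they can be changed without reindexing. An index-time boost would instead be fixed when the document is indexed, which is the trade-off Jon's post discusses.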
On a related topic, read Charlie’s post on Lucene/Elasticsearch.