Budget planning for enterprise search – the Fourth Dimension

For the last 20 years one of the constant challenges of search procurement projects has been working with clients to balance software costs, professional services costs and the investment in a search team. Computational costs could pretty much be ignored in an on-prem installation other than perhaps some additional servers for the index. Now the three dimensions have become four. The fourth dimension is the computational costs incurred with the cloud service provider.

A recent post from the guru of search gurus, Daniel Tunkelang, sets out some of these costs and makes very informed suggestions as to where investment in computational functionality is wise and where the impact is low. The post is primarily directed at open-source solutions but it is completely relevant to commercial enterprise search situations. The difference is that with a commercial application the customer has little control over the allocation of computing resources. All the normal rules on capacity planning go out of the window with search, especially where there is a migration from on-prem to cloud, because the customer rarely has any prior knowledge of the difference between enterprise applications that use databases and search applications with look-ups from an inverted file. The content processing pipeline is a good example.

The vendor may have been transparent about the elements involved but will not be able to fine-tune the cloud service costs without a seriously deep dive into both the current architecture and the architecture-to-be. In my experience two of the elements that often catch clients out are auto-suggest (because every added character is in effect a new query to run against the index) and the need to undertake a partial re-index. Another factor is federated search. It may seem such a good idea to be able to search all applications, but will the budget support this use case and deliver value?
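To make the auto-suggest point concrete, here is a rough back-of-envelope sketch. It assumes the worst case in which every keystroke fires a request against the index, with no debouncing or caching, and the function names and traffic figures are purely illustrative, not taken from any vendor.

```python
def suggest_requests_per_search(chars_typed: int, min_chars: int = 1) -> int:
    """Suggest requests fired while typing one query.

    Assumes one request per keystroke once min_chars characters
    have been typed (no debouncing, no client-side caching).
    """
    return max(0, chars_typed - min_chars + 1)


def daily_index_requests(searches_per_day: int,
                         avg_chars_typed: int,
                         min_chars: int = 1) -> int:
    """Total requests hitting the index per day, suggestions included."""
    # One extra request per search for the query the user finally submits.
    per_search = suggest_requests_per_search(avg_chars_typed, min_chars) + 1
    return searches_per_day * per_search


# Hypothetical figures: 10,000 searches a day, 12 characters per query.
without_suggest = 10_000
with_suggest = daily_index_requests(10_000, 12)
with_threshold = daily_index_requests(10_000, 12, min_chars=3)

print(f"No auto-suggest:             {without_suggest:,} requests/day")
print(f"Suggest from 1st character:  {with_suggest:,} requests/day")
print(f"Suggest from 3rd character:  {with_threshold:,} requests/day")
```

Under these assumptions the index handles more than ten times the nominal search volume. Real implementations will reduce that with debouncing and caching, but the multiplier is worth asking the vendor about before the cloud bill arrives.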

Then there is the system uptime problem. It is unusual for a search vendor to have their own cloud platform so they will be buying space on AWS, Google, Azure or any number of other services. This space usually comes with an uptime commitment of (say) 99.7%. That may seem fine until you work through the implications of the 0.3% and decide that a 99.95% uptime is essential to maintain global service levels. The cloud provider can certainly meet that requirement but the cost of the additional 0.25% might make your eyes water. Federated search again needs to be factored in because the required search uptime may be greater than that of one of the selected applications (HR is a good example) where computation speed has never been a major requirement. Of course you may be planning to use the cloud service with which you already have an enterprise agreement. Even then you are going to have to flex some numbers.
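Working through the implications of the 0.3% is simple arithmetic, sketched below. The 99.7% and 99.95% figures are the ones used above. The 99.0% figure for a federated HR application is a hypothetical illustration, and the combined calculation assumes that failures are independent and that a federated search is only fully available when every component is up.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours per year a service may be down at a given uptime commitment."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


def combined_uptime_percent(*uptime_percents: float) -> float:
    """Uptime of a chain in which every component must be available.

    Assumes component failures are independent of each other.
    """
    combined = 1.0
    for percent in uptime_percents:
        combined *= percent / 100
    return combined * 100


for commitment in (99.7, 99.95):
    hours = downtime_hours_per_year(commitment)
    print(f"{commitment}% uptime allows about {hours:.1f} hours down per year")

# Hypothetical: search at 99.95% federated with an HR system at 99.0%.
federated = combined_uptime_percent(99.95, 99.0)
print(f"Federated search across both: about {federated:.2f}% uptime")
```

A 99.7% commitment allows roughly 26 hours of downtime a year, against a little over 4 hours at 99.95%. The federated case shows why the weakest application matters: the combined figure can never be better than the lowest uptime in the chain.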

It is in the nature of a search procurement that even the preferred vendor may not be able, or may not wish, to present the complete financial picture until contract negotiations start. Part of this is a concern that the costs may be significantly higher than the customer expected and another part is that the total is very dependent (inter alia) on the user profiles, index freshness, query strategies and content complexity. It might well take a serious proof-of-concept to get a handle on some of the numbers but the question with any PoC is the extent to which it scales. In a procurement project last year a vendor who could meet all the functional and non-functional requirements of my client could not meet the uptime requirement within the available budget and was excluded from consideration. The final twist may be that the vendors you are considering use different cloud platforms and different algorithms for the usage costs. Welcome to the Fourth Dimension of enterprise search procurement!

Martin White