Over the last few years I have been collecting and collating research papers and conference papers that are in some way relevant to enterprise search. There are very few ‘case studies’ of enterprise search implementations but there is a significant amount of research into techniques that could in principle be applied to ‘document-centric’ search applications. In total I now have almost 1500 research papers. My categorization scheme broke down several years ago and I’ve decided that it is now time to rebuild it, initially as a list of categories and then perhaps as a taxonomy.
The list of 21 categories below excludes AI-ML research, especially around large language models. ML techniques will certainly play a role in enterprise search applications but at present the direct impact is not clear. The categorization is built from a review of the most recent 240 papers added to my collection, dating back to September 2021. That means I am collecting around 25 papers a month, again excluding AI-ML research.
- Conversational search
- Enterprise search
- Entity extraction and name recognition
- Evaluation metrics
- Federated search
- Information extraction from documents
- Information retrieval (fundamental research)
- Knowledge management
- Language issues
- Personalized search and recommendations
- Professional search
- Query matching
- Question answering
- Result management
- Result ranking
- Retrieval models
- Snippets
- Summarization
- Systematic searching
- Task completion and retrieval models
- User experience
As always, there is some overlap between categories.
My objective in publishing this list of categories (which is very much a work in progress!) is to illustrate the breadth of topics that you should be following if you want to understand the future directions of search in organizations. The rate at which I am adding to my collection also gives an indication of the very high level of investment being made in search research and the speed of innovation.
Martin White