Enterprise social networks Part 3 – Searching and monitoring
In Parts 1 and 2 of this series of three posts on enterprise social networks I have given some indication of the wealth of information that is available in the research literature about good practice and success metrics assessment. In Part 3 I want to look at the issues around searching ESNs. As far as I am aware the only paper offering a detailed analysis of search and social media focuses more on public social media services and has very little to say specifically about enterprise networks. Nevertheless many of the comments made are broadly applicable to ESNs. Search has another role in ESNs and that is as a means of monitoring ESN messages against profiles set up by users. Pioneering work on search-based monitoring is described in detail in a paper published in 2013 by the IBM team working on the Streamz application. Note that some versions of this paper carry a 2017 date because this was the date it was uploaded as an open access document on the INRIA servers in France.
There is an understandable concern about indexing and searching emails but seemingly not about searching ESNs, even though one of the justifications for an ESN may well have been reducing email traffic. Of course a major difference between an ESN and email is that in an ESN there are unlikely to be any contributions from outside the organisation, so there is not the concern about indexing email content that is not owned by the organisation. The similarities include the extent to which employees will attach documents (used here as a generic term for any content item), the short messages in an individual post that may only make sense within the context of a thread, and usually a rapid response to a post.
The posts could be very short indeed. I might post a request asking if anyone knows if there is a standard on web usability. Within minutes I receive a response from Charlie that just says “ISO9241”. This is an example of a complete and helpful answer in just seven characters. Does it mean that the person sending the response is an expert in usability? Not necessarily. They could be a corporate representative on other ISO committees and just happen to know the standard number. I am making this point because at present there is a great deal of enthusiasm for mining ESN traffic to identify expertise. There is an IBM research paper dating from 2007 which provides a good introduction to such an application.
There is a book to be written on searching ESNs (now there’s an idea!) because there are so many issues that should be taken into account. ESN posts will be HTML files with (usually) Office files attached. The Office files will probably come from SharePoint and therefore will already have been indexed. Should the ESN application be indexing just the HTML or the attached file as well? The ESN search solutions I have seen (by no means all) tend not to offer the depth of indexing that would be provided by an enterprise search application.
The problem with ESNs is that the messages will be short, may be in a range of languages, may not contain many ‘quality’ words worth indexing and are being generated in considerable volume by the business. There could of course be some useful tagging in terms of the communities using the ESN but the extent to which this is both useful and consistent is open to question. Maintaining index freshness could be very important. I might recall that earlier in the day Charlie sent me some information about a usability standard but that was six hours ago. I need to find it. This requires that the index is refreshed in near real-time. If there is a standalone search within an ESN that may be feasible, but if the ESN is being indexed by the main enterprise search application this amount of incremental indexing could be a serious server architecture challenge. There are also likely to be some challenges in security mapping.
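To make the freshness point concrete, here is a minimal sketch of an index that makes a post searchable the moment it is added. It is a toy held in memory, not how any particular ESN or enterprise search product works; the class name `LiveIndex`, its methods and the sample posts are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class LiveIndex:
    """Hypothetical in-memory inverted index, updated one post at a time."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of posts containing it
        self.posts = {}                   # post id -> (author, text, posted_at)

    def add(self, post_id, author, text, posted_at):
        # Incremental update: the post is searchable as soon as this returns.
        self.posts[post_id] = (author, text, posted_at)
        for term in text.lower().split():
            self.postings[term].add(post_id)

    def search(self, query, since=None):
        terms = query.lower().split()
        if not terms:
            return []
        # A post matches only if it contains every query term.
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        if since is not None:
            ids = {i for i in ids if self.posts[i][2] >= since}
        # Newest first, since recency matters more than anything else here.
        return sorted(ids, key=lambda i: self.posts[i][2], reverse=True)


now = datetime(2024, 1, 1, 15, 0)
index = LiveIndex()
index.add("p1", "Dana", "usability standard review notes", now - timedelta(days=2))
index.add("p2", "Charlie", "ISO9241 is the usability standard", now - timedelta(hours=6))

# "What did someone send me about a usability standard earlier today?"
todays_hits = index.search("usability standard", since=now - timedelta(hours=8))
```

The hard part in practice is not this logic but doing it at volume inside a shared enterprise index while respecting each post's access permissions.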
Integrating ESN search into an enterprise search application that is focused primarily on non-web content also requires some careful consideration of how the ESN content will be ranked. ESN posts are so short that tf.idf models are going to give some highly unpredictable results. If multiple languages are involved, language identification at either index time or query time may be a challenge as there is so little text to work with. The trade-offs between having the ESN and enterprise search indexes integrated or separate need careful consideration, especially in providing what are likely to be very valuable profiling services running across multiple ESN channels.
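A small experiment shows why tf.idf struggles with posts this short. The sketch below uses a plain textbook formulation (length-normalised term frequency multiplied by log inverse document frequency), not the ranking model of any specific search product, and the four posts are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def tfidf_scores(query, docs):
    """Score each document against the query with a plain tf.idf sum."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    scores = []
    for doc in tokenized:
        counts = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if counts[term]:
                tf = counts[term] / len(doc)       # length-normalised term frequency
                idf = math.log(n_docs / df[term])  # rarer terms weigh more
                score += tf * idf
        scores.append(score)
    return scores


posts = [
    "ISO9241",
    "does anyone know if there is a standard on web usability",
    "usability usability usability",
    "the usability workshop is on Thursday and the standard agenda applies",
]
scores = tfidf_scores("usability standard", posts)
for post, score in sorted(zip(posts, scores), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {post}")
```

With these posts the content-free "usability usability usability" ranks first, the original question and an unrelated workshop notice tie, and the one genuinely useful reply, "ISO9241", scores zero because it shares no terms with the query. Ranking ESN content well is likely to need signals beyond term statistics, such as the thread a reply belongs to.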
There are ways around all the issues mentioned above, but they will be very dependent on many variables in terms of content, use cases, technology and user expectations. The sheer volume of ESN content in terms of items posted and then indexed, and the need to have a quality search experience if employees are to find just what they are looking for (my guess is that precision is going to be far more important than exploration) mean that a significant amount of research, testing and training is going to be required. If you have been managing without a search strategy so far and are now recognising the totality of information, knowledge and expertise that could be within your ESN then creating a strategy is essential. When Andrew McAfee put Search as the first item on SLATES I think he knew exactly why search was going to be so important to the utility of ESNs.
Martin White