In order to be awarded my chemistry degree in 1970 I had to pass a reading comprehension examination in German. This was because at that time the Beilstein and Gmelin handbooks on organic and inorganic chemistry were primarily published in German, as was Angewandte Chemie, one of the leading primary journals in chemistry. This expertise (such as it was) became useful at the start of my career in the metallurgical industry as many of the leading journals and technical magazines were in German.
Move forward to 2020 and a superb Master's thesis in German by Ann-Kathrin Heike Kennecke entitled Untersuchung der Barrierefreie-Informationstechnik-Verordnung mit Fokus auf Anforderungen von Menschen mit Lese- und Rechtschreibstörung. This is an outstanding piece of research with implications for the dyslexia community. In my experience it is not uncommon to find a thesis that is written in the primary language of the candidate and the examiners.
I recently authored a paper on managing enterprise language diversity that was published in Business Information Review. (If you do not have access to BIR contact me and I will send you a pre-print.) The range of languages in use in an organisation is usually not well understood or supported. National languages are especially important where there is a need to communicate with consumers (drug adverse reaction reporting) or to meet compliance requirements (environmental impact statements), to name but two examples.
In my paper I did not specifically cover the language implications of publishing research in languages other than English but there are many research papers that report on the issues. You can read them here and here and both are open access.
Where am I going with this post?
The issue is about discoverability and the gap between the almost total focus on English for AI/ML/neural network/NLP development and the importance of having unbiased access to global information resources, a situation elegantly presented by Sebastian Ruder. When I talked to an enterprise search vendor recently, the CEO was almost dismissive of the fact that they had majored on English in their model development. If you look at the web sites of most search vendors there is rarely a comment on the extent to which their applications are modelled across languages other than English.
If you are relying on Microsoft Search for your global, and ideally multilingual, search then you should read a recent post by Agnes Molnar on the complexities of handling even two languages.
Another factor here (which is mentioned in my paper) is that in many organisations employees are searching English language content in a second or even third language. They may not have the command of English (especially synonyms) needed to construct alternate query strategies, and their queries may well mask their search intent, even though working out that intent is what many search applications claim as their primary AI/ML contribution to effective search.
Organisations love to write policy documents, but in my experience there is rarely a corporate language management policy that is based on research, reality and user requirements. Even in Nordic organisations, where there is usually a good level of competence in English, problems can arise – read this case study (download). What is the situation in your organisation?
If you would like to learn more about managing multilingual search it is one of the topics covered in my one-day on-site enterprise search management training course.
Martin White