Personalisation – legality, safety and transparency
Earlier this year I blogged about possible ethical issues in providing personalized search results. It seems that others have had similar concerns, and the issues were at the heart of a challenging presentation by Professor Maarten de Rijke in his lecture as Strix Award Winner for 2017 in London on 24 November. A few days later, at the British Computer Society Information Retrieval Special Interest Group’s Search Solutions 2018 conference, several of the speakers commented on what might be termed safety and transparency issues, notably Gabriella Kazai (Microsoft Research) and Mounia Lalmas (Spotify).
Earlier this month the Dutch Government published a 90-page report on the data protection issues raised by the telemetry that Microsoft builds into its Office software. It had been commissioned by the Ministry of Justice and Security for the benefit of SLM Rijk (Strategic Vendor Management Microsoft Dutch Government) because the Government was concerned about the information that Microsoft was collecting through the logging routines built into the applications.
The report identifies the following data protection risks:
- No overview of the specific risks for individual organisations due to the lack of transparency (no data viewer tool, no public documentation)
- No possibility to influence or end the collection of diagnostic data (no settings for telemetry levels)
- The unlawful storage of sensitive/classified/special categories of data, both in metadata and in content, such as for example subject lines of e-mails
- The incorrect qualification of Microsoft as a data processor, instead of a joint controller as defined in article 26 of the GDPR
- Not enough control over sub-processors and factual processing
- The lack of purpose limitation both for the processing of historically collected diagnostic data and the possibility to dynamically add new events
- The transfer of (all kinds of) diagnostic data outside of the EEA, while the current legal ground is the Privacy Shield and the validity of this agreement is subject of a procedure at the European Court of Justice
- The indefinite retention period of diagnostic data and the lack of a tool to delete historical diagnostic data.
The report notes that discussions have been held with Microsoft but that the issues remain open. This is not surprising, as these sub-routines go to the heart of how Microsoft delivers functionality. The potential GDPR issues of employee monitoring have also been considered by the Article 29 Data Protection Working Party of the European Commission in Opinion 2/2017 on data processing at work, adopted on 8 June 2017.
At present the search vendors are promoting the use of logging software both to deliver personalized sets of results to employees using their search application and to identify the expertise of employees so that others can find potential experts within their organisation. In theory these seem to be very helpful initiatives.
However, there are now concerns on two fronts about the impact of these initiatives. The first is whether they satisfy the requirements of the GDPR; this is the reason for the actions taken by the Dutch Ministry of Justice and Security. The second is about the ethics of personalization. The presentations I have referred to above both focused on the importance of individuals being able to discover what information was being collected and how it was being used to create profiles that could determine the selection of information presented to them, whether from a search or from news monitoring applications. It is not just the raw data that matters but the weighting that has been put on each element. You only have to look at the recommendations for books and other products displayed by Amazon to wonder just how the personalization software came to a conclusion about which books to present.
The situation becomes even more challenging in an enterprise setting, where the core factor in deciding which information to present is the security permissions of the employee. If these are not disclosed then the employee is not in a position to query whether a mistake has been made or whether information has been withheld for some other purpose. It should be noted that in Germany works councils take a very close interest in the use and potential misuse of personal information. To add to the challenges, the USA has no Federal data privacy legislation, though California (as just one example) has passed very comprehensive data privacy legislation which is in line with the GDPR.
The ethical implications were a particular feature of Maarten de Rijke’s lecture. He emphasized not just the transparency implications but also the safety considerations. What recourse would an employee have if they were put into a dangerous situation because they did not have access to all the information needed to make an informed decision?
No matter how good the business case might seem for the delivery of personalized search results or computer-derived profiles, my advice would be to ensure that you have taken qualified legal advice on the situation for your organisation and your vendors. This should take into account not only the EU GDPR but also the implications of data transfer to non-compliant countries, and the reputational impact of a whistleblower who considers they have been discriminated against but who does not have access to the profiles that were generated, and so cannot present their case.