Every day I scan through around 200-300 additions to the arXiv preprint database, looking for research that has implications for the assessment, adoption and governance of AI in the enterprise. I skip over technical developments on LLMs and research into the performance of ChatGPT et al. The benefit of the arXiv service is that it facilitates the early publication of research. Because it is open access, IT managers and developers without access to academic journal services can download the research. The downside is that subsequent peer review prior to journal publication sometimes results in modifications to the preprint version.
This is the first of what I intend to be a monthly synopsis of selected outcomes of my scanning. I have not tried to summarise or critique the research papers, as a good abstract is only a click away. The objective is to give you a curated list of open-access research that, in my opinion, you should at least be aware of, and perhaps download and circulate to colleagues even if you do not have time to read it yourself.
Forgotten Knowledge: Examining the Citational Amnesia in NLP
https://arxiv.org/abs/2305.18554
ChatGPT is a Remarkable Tool—For Experts
https://arxiv.org/abs/2306.03102
When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings
https://arxiv.org/abs/2104.08671
The Two Word Test: A Semantic Benchmark for Large Language Models
https://arxiv.org/abs/2306.04610
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
https://arxiv.org/abs/2304.06364
Lost in Translation: Large Language Models in Non-English Content Analysis
https://arxiv.org/abs/2306.07377
The Ethics of Algorithms: Key Problems and Solutions
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3662302
Inverse Scaling: When Bigger Isn’t Better
https://arxiv.org/abs/2306.09479
Friend or Foe? Exploring the Implications of Large Language Models on the Science System
https://arxiv.org/abs/2306.09928
TRUSTGPT: A Benchmark for Trustworthy and Responsible Large Language Models
https://arxiv.org/abs/2306.11507
An Overview of Catastrophic AI Risks
https://arxiv.org/abs/2306.12001
Testing of Detection Tools for AI-Generated Text
https://arxiv.org/abs/2306.15666
Martin White
20 July 2023