The information discovery challenges of PowerPoint files

I must start by pleading guilty. I have over 700 PowerPoint files dating back over a decade. A significant number of them are work-in-progress versions, and in most cases the file name is the conference title and not the subject of the presentation. This collection excludes presentations given to clients, which at a rough guess (based on a sample) would add another 300-400. Each file invariably consists of text and many images, often of Excel spreadsheets and, of course, screenshots. The images are usually JPEGs, so the content in them is invisible to the search index crawler. The titles are often sourced from my somewhat dry sense of humour and are designed primarily to make the audience/client smile at the outset of the presentation. The crawler/index issue is significantly more challenging when the PPT file has been wrapped up as a PDF file.

This situation mirrors my experience with every client I’ve worked for. Highly important information (often confidential to the people in the room) is incarcerated in the file, and only makes sense when heard in the context of the presentation. The audience may well have comments on the statements made, but the slide deck is rarely updated and re-circulated. There may be a date of the presentation on the title page, but there will be no useful metadata in the individual slides, if only because of the challenges in adding footer information. It is also not uncommon for the final version to be presented from a laptop or iPad following a last-minute (often late-night in the hotel) scramble to correct some of the information. It then remains on the laptop and is not uploaded to the main server, especially if the presentation is being given in a different location/country and gaining access to the server becomes fraught with password complications. This update will probably result in a new date being applied even if the presentation is in fact a couple of months old, creating a false idea of how current the file actually is.
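
For anyone who wants a rough feel for how much of a deck a text-only crawler can actually see, the following is a minimal sketch in Python using only the standard library. It relies on the fact that a .pptx file is a ZIP package with slides under ppt/slides/, embedded media under ppt/media/ and core properties in docProps/core.xml. The function name audit_pptx and the shape of the result are my own illustrative choices, not part of any search product, and the sketch does not handle legacy .ppt files or decks saved as PDF.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside the .pptx package.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
DCTERMS_NS = "{http://purl.org/dc/terms/}"

SLIDE_RE = re.compile(r"ppt/slides/slide\d+\.xml$")
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff")


def audit_pptx(source):
    """Summarise what a text-only crawler could see in a .pptx file.

    `source` is a file path or a binary file-like object.
    The function name and result keys are illustrative assumptions.
    """
    with zipfile.ZipFile(source) as pptx:
        names = pptx.namelist()

        # Count characters held in text runs (<a:t> elements) on each slide.
        slide_names = [n for n in names if SLIDE_RE.match(n)]
        text_chars = 0
        for name in slide_names:
            root = ET.fromstring(pptx.read(name))
            for node in root.iter(A_NS + "t"):
                text_chars += len(node.text or "")

        # Embedded pictures: any text inside them is not indexed as text.
        images = [
            n for n in names
            if n.startswith("ppt/media/") and n.lower().endswith(IMAGE_EXTS)
        ]

        # "Last modified" as stored in the package; this is the date of the
        # last save, not necessarily the date the content was written.
        modified = None
        if "docProps/core.xml" in names:
            core = ET.fromstring(pptx.read("docProps/core.xml"))
            node = core.find(".//" + DCTERMS_NS + "modified")
            if node is not None:
                modified = node.text

    return {
        "slides": len(slide_names),
        "text_chars": text_chars,
        "images": len(images),
        "modified": modified,
    }
```

Calling audit_pptx on a file path returns a small dictionary. A deck with many images and very few text characters is a likely candidate for being close to invisible in search, and the modified value is exactly the kind of date that a late-night edit in the hotel will have refreshed.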

Number 7 of my ten ‘stress tests’ for enterprise search is designed to alert you, and your organisation, to the discovery problems that can arise with PowerPoint files. While you are at it, why not run the other nine tests?

Martin White  10 August 2023