FIAT/IFTA World Conference 2022: The Role of AI in Archiving and Data Management

You can learn so much when you take the time to speak to those on the frontline of your industry, to hear their vision and what they’re dealing with. Having attended the FIAT/IFTA World Conference in Cape Town, South Africa, last month, I can say with some certainty that archiving and data management are being seen as essential to democracy. That is an important mission, but alongside it looms a lot of change: in how the role of an archivist works, what skills archivists need and how the wider world sees their work.

Moments Lab (ex Newsbridge) was a Gold Sponsor of the FIAT/IFTA World Conference, and I was joined at the conference by industry lead Carole Pigeard. We each had many conversations with people on the ground, those working hands-on in archiving and indexing roles. We were happy to share how we are working with AI - our way of doing things - to help illuminate what’s possible with this technology.

*Frederic Petitpont, Kathey Battrick (Asharq News) and Carole Pigeard, with Kathey holding a 'shortlisted candidate' sign*

Archives for democracy

One of the first presentations came from a representative of the South African Broadcasting Corporation (SABC), whose viewpoint really inspired me. The SABC is one of the biggest rights holders on the continent, and they see preserving the archive and providing access to it as a matter of democracy - a message that gave a lot of meaning and renewed drive to everything we do. From their point of view, an archivist must be objective in their mission, making sure the past is archived and indexed in a fair way, so that when future generations look back they will have an objective view of our times.

This “fairness” in indexing raises the question: how can we trust an AI to do the job? Is it right to delegate such an important task to technology which can learn biases? We’ve heard many times that deep learning can bias how an AI makes its decisions, through the data used to train it, and this is something that must be faced if we are to truly harness the power of this technology. One way to fight this bias is to build better datasets, which means indexing “fairness” actually does require humans.

*Carole Pigeard presents Moments Lab's mission at the 2022 FIAT/IFTA World Conference*

Asharq News innovates with multilingual data models

To demonstrate the results that can be achieved with Moments Lab’s platform, our client Asharq News was at the conference and celebrated winning the award for Excellence in Media Management for the project “AI Metadata for Arabic Archive”. Asharq News uses the Moments Lab Multimodal AI Indexing technology to achieve great things with their Arabic-language indexing and archiving, and I was proud to join them on stage as they presented their project, fielding the more technical questions from the audience.

Asharq is a fairly new channel; they didn’t have a legacy archiving system to worry about, so could start with all new technology. They have no big shelves of archived tape; they are very dynamic, young and fresh, and are working on accessibility using Moments Lab’s Media Hub platform. They’re showing that it can work at scale - they have more than 2,000 video hours per month - and they can make it work with the Arabic language. Their system is dual-language, but they also have the capacity to use OCR in Arabic and detect named entities, people and brands.

The team at Asharq have spent a lot of time training their AI, and they are building the first dataset capable of recognising Arab business leaders.

Think of the Arabic language, and something as simple as the name Mohammad: there are three different ways to spell that one name. If the AI is trained on one spelling but users search with another, you will create duplicates of the same person. In a proper semantic model, every spelling resolves to the same Mohammad. Leveraging Wikidata, you are basically declaring these variants to be synonyms for the same person. This is something people have been theorizing about a lot over the last few years, and we have it working at Asharq.

We want Moments Lab’s platform to be inclusive and accessible, to have no language left behind. At the moment, archiving with AI is largely an English-only exercise, no matter where in the world the archive is located. When it comes to localization and technology, we want to make sure AI can work with other languages, including regional dialects.

Evolving job roles

As this technology begins to permeate the industry, the role of the archivist and the skills needed to work in indexing are changing. As with many roles that interface with technology today, archivists are having to become data scientists of a sort - people who were media managers must now become AI and data engineers. Some greet this with reluctance, but they are not the majority. Many see AI technology as an enabler; as the world is producing more and more content, we realize that the ones who know how to manipulate data and train AI have a superpower.

This change in how we handle data became clear as we spoke to Sound & Vision (Beeld & Geluid), the national archive of the Netherlands, and they shared their pain as they encounter data fractures while archiving more than 70 years of content.

The way something was archived 20 years ago impacts how we can view it today, and the work of archivists today will in turn impact future accessibility.

Again, this reflects the role of archiving in democracy, and it’s actually quite inspirational for us to think of our work and our platform as playing a role in that important job.

And yet, cloud adoption is not where we would expect it to be. FIAT/IFTA shared the results of its annual research into cloud adoption, and they were staggering: more than 35% of respondents fear cloud will be more expensive, almost 30% believe they already have sufficient storage and so have no need for the cloud, and almost one-quarter have concerns about data protection issues. The hybrid approach of part cloud and part on-premises still seems to be the norm in archiving and indexing circles.

Interestingly, though, more than 40% of respondents to that survey say they are still investigating the market - and that resulted in us having many great conversations with conference attendees, many of whom actively approached us to commend us on the promise of our technology. Where many are still assessing the potential of AI to improve the lives of indexers, our Multimodal AI is proving the possibilities are here already.

We have come a long way, but there is still further to go. The more people I spoke with at the conference, and the more questions I answered during panel sessions and presentations, the more I realized how much work is still to be done on the ground with the indexing and archiving community to help reassure them about the opportunities of artificial intelligence and machine learning in their job roles. No, we are not out to replace them with machines - and no, implementing AI is not a way to slash costs. We just don’t have the human capacity to watch and index every piece of content produced every day. Some leaders were disappointed to hear this! But how do you ensure archiving fairness when facing these volumes of content?

The memory of the world is expanding fast. We are breaking the myth that you don’t need the human; you do actually need to have people creating the models and guiding the AI, or it won’t work to its full potential.

AI and deep learning are used across many industries and in our everyday lives; you could say it’s a matter of democracy that people should understand how the technology works. As a company working with AI, we had to commit to this role of reassuring those we spoke with, and of explaining the workings and possibilities. After having attended so many conferences dedicated to AI, this was a big wake-up call for us. At the frontline level, AI is still the next big thing. We still have a lot of work to do in evangelization. We need to make it more accessible, but in a way that helps people to understand how they can make it their own and create value with it.

There is a lot of work ahead of us, for sure. But I am energized and ready, because I know what we do at Moments Lab is the answer to a lot of the challenges we heard about at FIAT/IFTA.


For more information about Moments Lab and our solutions, please contact us.
