Research Studies

Abstracts - Project List by Title
Title (Code)
AI in the Middle ages-Arrangement of ancient documents via appearance-based recognition (AD03)
AI literacy for record management and Archives (MA02)
AI Tutorials (GS01)
Analyzing Public Data Sets (AD02)
Building & Creating a Digital Twin for Preservation (CU04)
Case Study on Extraction and Identification of Records containing Personal Data and Sensitive Personal Data for Long Term Preservation (RA02)
Comparative assessment of ethical codes of the archival/records management and artificial intelligence communities (MA01)
Context and Provenance for Disaster Data (CU01)
Employing AI for Retention & Disposition in Digital Information and Recordkeeping Systems (DIRS) (AA01)
Enterprise Master Data Management and the Role of Metadata (RP02)
Gamification of archival experience for users (RA04)
Identification of critical archival challenges which are the best candidates for improvement by AI technologies in the context of retention and preservation of digital records (RP01)
Inova HFA-Using AI to better manage health records and patient outcomes (CU06)
Intensional logic-based AI for the Records in Contexts Conceptual Model (RiC-CM) and Ontology (RiC-O) (AD04)
Investigating the Use of AI technologies in the Realm of E-Government Development (CU07)
Model for an AI-Assisted Digitization Project (RA03)
Personal Information Content Assessment (MA05)
Preserving AI Techniques as Paradata (RP04)
Privacy in Digital Records Containing Personally Identifiable Information (PII)-An Exploration of Current Status and the potential of AI Tools and Techniques (RA06)
Records Classification using Natural Language Processing Techniques to Support Trustworthy Public Digital Record (CU02)
Research on the process of declassifying personal information using AI tools - Israel State Archives case study (RA07)
Smart Grid Data Communication and Analytics (CU03)
Teachable AI for Arrangement and Description (AD01)
The Development of Ethical Guidelines for AI use with Records (MA03)
The role of AI in identifying or reconstituting archival aggregations of digital records and enriching metadata schemas (CU05)
The Role of Records and RM in Environments Where Trustworthy AI is the Focus (MA04)
User approaches and behaviours in accessing records and archives in the perspective of AI-A global user study (RA05)

Creation and Use

  • Title: Context and Provenance for Disaster Data (CU01)

    Lead Researcher: Michael Stiber, University of Washington, Bothell

    Timeline: September 2021-August 2026

    Abstract: This study will explore fundamental challenges associated with a particular use case for AI in records and archives: data from critical systems surrounding incidents such as natural disasters or actions by bad actors. It leverages ongoing work at the University of Washington Bothell (UWB) and at the University of British Columbia (UBC). At UWB, the work has focused on developing simulations of emergency public communications systems, such as "next generation 911" (NG911) in North America, and on managing that work via a system that collects, visualizes, and manages provenance information from simulation artifacts. These systems, Graphitti and Workbench, are described at https://depts.washington.edu/biocomp/index.html. At UBC, the focus is on building and preserving digital disaster archives in Japan, such as the 2011 Great East Japan Earthquake archives, from the point of view of archival studies.

  • Title: Smart Grid Data Communication and Analytics (CU03)

    Lead Researcher: Mohamed Ibnkahla, Carleton University

    Timeline: 2021-2026

    Abstract: With the popularity of smart electrical appliances and home energy management systems, massive amounts of data are generated about electricity consumption. These data benefit utility companies because they reveal consumers' behaviour patterns and can inform decisions on how to optimize the load on the grid. The data obtained from the communication system are stored in a database hosted in the cloud. Our aim is to build the data communication system between the transformer agent (TA), attached to a neighbourhood’s electric transformer, and the customer agents (CAs), attached to each house, using inexpensive and commonly used devices and modules, and to process these data to help utility companies design better demand side management (DSM) programs for the efficient transmission and distribution of energy. This addresses the problem of balancing electric demand and supply on the grid and reduces peak demand, which helps lower consumers' electricity bills. In this context, we analyze household electricity consumption data to forecast energy consumption over the short term (hours or days ahead) and the long term (weeks or months ahead). What records are generated in these systems, what should be preserved and how, and what role will artificial intelligence play?

  • Title: Records Classification using Natural Language Processing Techniques to Support Trustworthy Public Digital Record (CU02)

    Lead Researcher: Umi Mokhtar, Universiti Kebangsaan Malaysia

    Timeline: 2021-2024

    Abstract: In the library field, classification is used for retrieval and searching, whereas in records management, classification is designed to be used for preservation purposes and to maintain required records characteristics. The authenticity, reliability, integrity, and usability of records must be maintained throughout their lifecycle. This study will extend the Functional Model for Classification: The Records Management Approach developed by the lead researcher in 2015 to embed AI techniques to automate the classification of records.

  • Title: Building & Creating a Digital Twin for Preservation (CU04)

    Lead Researcher: Stephen Fai, Carleton University

    Timeline: 2021-2026

    Abstract: A digital twin is an ecosystem of multi-dimensional and interoperable subsystems made up of physical things in the real world, digital versions of those real things, synchronized data connections between them, and the people, organizations, and institutions involved in creating, managing, and using these. The Carleton Immersive Media Studio (CIMS) has constructed several important digital versions, simulations, models, virtual experiences, and digitally assisted narratives of historical and contemporary structures, artifacts, and regions using a variety of data, AI/ML, and other technologies, both open source and proprietary, including:

      • Building information management systems (BIM)
      • Asset Management Systems (AMS)
      • Unreal Game Engine and a variety of others for simulation purposes
      • VR and modelling
      • AI/ML
      • Real-time data for decision making
      • Many others

    CIMS projects include the Parliament of Canada, the National Capital Commission in Ottawa, the Ontario East Economic region, Muskowekwan Residential School, documentation of the Tomb of Tutankhamen and the Tomb of Nefertari, the Kasbah de Taourirt in Morocco, and many others (see the URL below). These are precursors to CIMS' work on digital twin technologies. This ITrust AI study will explore the question: Can a digital twin be preserved, and what is required at the point of creation to ensure that it can be? The case study will be CIMS' closest model of a digital twin so far, the Sustain Digital Campus project, which involves a digital twin of the Carleton University campus.

  • Title: The role of AI in identifying or reconstituting archival aggregations of digital records and enriching metadata schemas (CU05)

    Lead Researcher: Mariella Guercio, Associazione Nazionale Archivistica Italiana-ANAI, and Stefano Allegrezza, Università di Bologna

    Timeline: January 2022-July 2023

    Abstract: The uncontrolled creation of huge numbers of current records that lack the metadata necessary to ensure the reliability, trustworthiness, quality, and sustainability of appraisal and acquisition is a common and complex problem today. It includes:

      1. Records managed by ERMS without the full set of information required for proper records creation;
      2. Business systems that create and manage records with only partial identification and procedural information;
      3. Records created by systems without metadata and without being integrated in ERMS, including email repositories.

    This study will use case studies to explore the question: Can we use AI tools to constitute or reconstitute archival aggregations and create metadata schemas for them? It will assess existing AI technologies that can address the problem of non-aggregated, unarranged, or de-contextualized records in both the current and semi-current phases of their lifecycle in order to ensure an accurate appraisal and guarantee managed and controlled transfer procedures. Further, it will identify archival requirements for new tools, which should be developed according to archival concepts and principles. In particular, the study aims to assess:

      - the possibility of using AI tools to re-establish the archival bond among a multitude of de-contextualized records;
      - the possibility of using AI tools to integrate incomplete recordkeeping metadata schemas;
      - the capability of existing AI technologies to address the critical archival issues mentioned above; and
      - the ability of archival concepts and principles to inform new AI tools aligned with the archival needs named above.

  • Title: Inova HFA-Using AI to better manage health records and patient outcomes (CU06)

    Lead Researcher: Claudio Gottschalg-Duque

    Timeline: September 2021-December 2023

    Abstract: This study will investigate the use of AI (NLP), distributed ledger technology (DLT), and smart contracts to solve electronic health record (EHR) management problems, including how to use the health records register to improve patients' health outcomes (early recognition of clinical deterioration) while respecting the General Personal Data Protection Law (LGPD, Brazilian law number 13.709-2018). Can AI help researchers, clinicians, and administrators improve their services efficiently and effectively using patient data without invading patient privacy? The objective is to implement Hospital 4.0 in the HFA with the help of academia and IT processes.

  • Title: Investigating the Use of AI technologies in the Realm of E-Government Development (CU07)

    Lead Researcher: Proscovia Svärd

    Timeline: February-October 2022

    Abstract: This study will explore what AI technologies are being used within the realm of e-government development and what recordkeeping challenges can be identified, specifically: a) At what points in the processes where AI is being deployed are records created? b) How are they created and captured for use? c) How are they pluralized? d) Are there any challenges that can be addressed by AI? By conducting a systematic literature review, the study team will: 1. Establish the legislative and regulatory guidelines in three selected countries (Sweden, Finland, and South Africa) that inform e-government development pertaining to different AI technologies and the creation and use of records. 2. Identify, in each country, key trendsetters (national government agencies or municipalities) that use AI for e-government development. The focus is to determine what AI these trendsetting organisations use for e-government, and to what extent. 3. Identify recordkeeping challenges that arise when AI is used within the realm of e-government development.

Appraisal and Acquisition

  • Title: Employing AI for Retention & Disposition in Digital Information and Recordkeeping Systems (DIRS) (AA01)

    Lead Researcher: Pat Franks, San Jose State University

    Timeline: October 2021-September 2024

    Abstract: Not all records that would benefit from storage in a trusted digital repository (or other electronic storage solution) must be preserved indefinitely. We must, therefore, trust not only that our records are being preserved for access as long as necessary but also that records that must be disposed of are disposed of according to a defensible retention and disposition schedule. The records management function provides controls related to records disposition following an approved retention schedule. AI may provide the means to dispose of such records accurately and efficiently even if they are stored in trusted digital repositories that were not designed to facilitate disposition. This study will investigate how AI can be used not only to implement but also to create retention schedules, enable litigation controls, provide PII security, and ensure consistency with organization-wide policies and procedures.

Arrangement and Description

  • Title: Teachable AI for Arrangement and Description (AD01)

    Lead Researcher: Richard Arias Hernandez, University of British Columbia

    Timeline: October 2021-September 2023

    Abstract: This study aligns with ITrust AI’s overall goal "to design, develop, and leverage Artificial Intelligence to support the ongoing availability and accessibility of trustworthy public records" by focusing on creating lesson plans and educational materials that enable archival students, archivists, and records managers to at least leverage (and possibly "design") Artificial Intelligence to support the ongoing availability and accessibility of trustworthy public records in the areas of archival arrangement and description. This project can be the basis for, or join, a bigger project that focuses on broader curriculum development of AI for Archival Science.

  • Title: Analyzing Public Data Sets (AD02)

    Lead Researcher: Ozgur Kulcu, Hacettepe University

    Timeline: October 2021-September 2023

    Abstract: This study will explore how to analyze archival contents and what meaningful results can be obtained by using AI technologies, including data mining and machine learning. The study will use digital archival content produced by different public institutions in Turkey. Topics to be investigated include the support that unstructured big data archives can provide for institutional decision-making, the detection of missing and incorrect information in data archives, automatic classification, topic generation and subject detection, and automatic process management.

  • Title: AI in the Middle ages-Arrangement of ancient documents via appearance-based recognition (AD03)

    Lead Researcher: Benedetto Luigi Compagnoni

    Timeline: 2021-2026

    Abstract: Interest in applying Artificial Intelligence to image data analysis is growing, and scientists are increasingly using it as a powerful, complex tool for statistical inference. Computer-based image analysis provides an objective method of scoring visual content independent of subjective manual interpretation, while potentially being more sensitive, consistent, and accurate. This study will test AI tools for appearance-based recognition of the "signum tabellionis" of ancient parchments and documents. The approach, based on neural networks, aims to reduce manual annotation while using the manual annotation that remains as a form of continuous learning. The whole system needs manual tagging of large training data. All manually verified data will be used for continuous learning and will be maintained as training datasets. A deep neural network based on an object detector will be used to recognise the "signum tabellionis" of the parchments. This system covers not only the recognition and classification of the objects present in the images, but also the location of each of them. The study expands the implementation of AI in archival science with a method that could be reproduced by many other archives and for different types of documents.

  • Title: Intensional logic-based AI for the Records in Contexts Conceptual Model (RiC-CM) and Ontology (RiC-O) (AD04)

    Lead Researcher: Hugolin Bergier

    Timeline: TBD

    Abstract: The purpose of this study is to establish mutual understanding between the archival and AI fields within the context of archival arrangement and description. The team will identify specific AI technologies that can address critical archival challenges: the research will apply an enriched intensional formalism using logic-based AI to analyze and enrich the Records in Contexts Ontology (RiC-O).

Retention and Preservation

  • Title: Identification of critical archival challenges which are the best candidates for improvement by AI technologies in the context of retention and preservation of digital records (RP01)

    Lead Researcher: Hrvoje Stancic, University of Zagreb

    Timeline: October 2021-May 2022

    Abstract: This study aims to identify critical archival challenges in the context of retention and preservation of digital records. The issues arising from the implementation of OAIS-based and other digital archive solutions will be investigated. For example, the research will look for repetitive tasks, tasks that involve large quantities of digital records, and other tasks that are the best candidates for improvement by AI technologies. The research will also aim to identify challenges arising from digital preservation risks. Once the critical challenges are recognized, the specific factors within them will be identified and mapped, and ways to address them with AI technologies will be proposed. The study has two research objectives: 1. identify critical challenges to be addressed by AI, and 2. identify within each critical challenge the specific factors to be addressed and how AI might address them.

  • Title: Enterprise Master Data Management and the Role of Metadata (RP02)

    Lead Researcher: Alex Richmond, Bank of Canada

    Timeline: 2021-2026

    Abstract: As many public agencies invest in research and infrastructure to advance their data and analytic capabilities, the challenge of mastering enterprise data has surfaced as a key pain point. By combining domain expertise from archival science, specifically descriptive standards and the use of metadata, with recent thinking in data warehouse and data lake architectures and object modeling, the Bank of Canada is proposing a research study centered on a proof of concept at the Bank that uses the Legal Entity Identifier standard to create an enterprise master data set for financial institutions. In this study we will look at various AI technologies that can contribute to the use of metadata in the development and maintenance of enterprise master data sets. Further, we will look at virtual graph technologies to link various data sets and information assets and so increase their accessibility and reusability by stakeholders. Finally, we aim to produce a set of best practices and guidelines for applying metadata in the creation and maintenance of enterprise master data, which will aid findability, access, and preservation over time.

  • Title: Preserving AI Techniques as Paradata (RP04)

    Lead Researcher: Pat Franks, San Jose State University and Babak Hamidzadeh, University of Maryland

    Timeline: October 2021-September 2024

    Abstract: If an AI technique is used to facilitate or automate an archival, recordkeeping, or other process, how much of that AI technique should be preserved: its code, the data (probably a subset of existing records) used to train it, the test cases and test results used to examine its efficacy, its parameters and their values at or over the time of application, the technical environment in which it is executed, and the records to which the AI technique is applied for automation purposes? The question is not about preserving AI techniques for their own sake, but about preserving them as contextual material and information in support of preserving the records they are applied to. As such, are they preserved as part of procedural context, technological context, a combination of the two, or other contexts? How do we preserve the pieces that constitute the AI technique (code, training data, test cases, parameters, etc.)? How reproducible should what we preserve be? If there is non-determinism or randomness in any of these AI techniques, how do we identify, characterize, and preserve it? If one or more humans are intertwined with the AI technique in the decision-making process, how are their roles and their relationship with the AI technique captured and preserved? This study will explore these questions, gathering data about the present state of practice and proposing best practices and solutions.

Management and Administration

  • Title: Comparative assessment of ethical codes of the archival/records management and artificial intelligence communities (MA01)

    Lead Researcher: Jim Suderman

    Timeline: 2021-2025

    Abstract: This study is based on the archival concept of authentic records being reliable evidence of the past. Archival ethics focus on the application of archival concepts and principles in ethical ways so that archivists and archival organizations are trusted to preserve authentic records and make them available to users in the context in which the records were created. As AI technologies are increasingly used by records professionals and archives, archival and AI ethical issues overlap and intersect. The study asks the questions: 1. What similarities and differences exist between representative ethical codes of the two communities? 2. In what ways might or should the ethical codes of each community influence the other? It will complement the Ethical Guidelines for AI use in Private Sector Organizations (MA03).

  • Title: AI literacy for record management and Archives (MA02)

    Lead Researcher: Moises Rockembach, Universidade Federal do Rio Grande do Sul

    Timeline: October 2021-April 2023

    Abstract: The use of artificial intelligence to support records management activities involves not only the development of AI applications but also human-machine interaction. If AI can help us with records management, we need to develop AI literacy to work together and adopt new solutions. How can we develop AI literacy in the context of records management and archives? This study will identify competencies for critical AI evaluation; analyze the digital transition scenario and its impacts on labor dynamics; identify the challenges involving communication and interaction between humans and AI; and propose ways to engage in AI-based records management and archives solutions.

  • Title: The Development of Ethical Guidelines for AI use with Records (MA03)

    Lead Researcher: Mia Steinberg, Collabware

    Timeline: September 2021-September 2023

    Abstract: Organizations, especially in the tech sector, will make use of AI for analysis and data comprehension. It would be prudent to have an established set of guidelines for these organizations to use to create in-house ethics review committees and ensure that their AI use is responsible and transparent. The objective of this study is two-fold. First, we seek to identify and create a set of reasonable and prudent practices for the development of an ethics review process within an organization. Once established, we will create a meaningful plan on how these processes would be applied. Therefore, the objectives cover both the procedure of creating an ethics review process as well as its outcome. Records should be authentic evidence and their use and access are already governed by ethical guidelines within the archival profession; this study extends that ethical framework to focus specifically on the application of AI to records, and brings a multidisciplinary approach that gives consideration to the complexities of the technology. This study complements the Comparative assessment of ethical codes of the archival/records management and artificial intelligence communities (MA01).

  • Title: The Role of Records and RM in Environments Where Trustworthy AI is the Focus (MA04)

    Lead Researcher: Sherry Xie, Renmin University of China

    Timeline: 2021-2026

    Abstract: This study explores the meaning of “trustworthy artificial intelligence” (https://digital-strategy.ec.europa.eu/en/policies/artificial-intelligence) and the roles of digital records and records management in the EU's AI strategy for building trustworthy AI. It will assess the EU's "trustworthy artificial intelligence" initiatives against current developments in (digital) RM in the context of Asia, North America, and the G20. If current (digital) RM developments are not aligned with or represented in the EU's initiatives, the study team will establish recommendations on next steps for the RM profession.

  • Title: Personal Information Content Assessment (MA05)

    Lead Researcher: Jim Suderman

    Timeline: October 2021-December 2022

    Abstract: Privacy protection is a central legal responsibility of public sector organizations. For some such organizations privacy protection is already a risk-based process. Semi-structured digital records, e.g., email, present a significant challenge to assessing risks of privacy breaches because every record must be reviewed for personal information before it can be made generally accessible. Confidence and trust in public sector organizations will be improved if a robust, AI-supported means to assess the scope, type, and location of personal information can help the human experts charged with protecting privacy focus their attention where the risks are highest. This grounds the study not in one or more specific archival principles but in the role of archival organizations and archivists as trusted preservers. To continue to be trusted, archivists, including those managing archival collections, need to be aware of the information in their collection, their legal obligations for administering it, and the ethical and moral responsibilities to assess how changing contexts can affect their responsibilities.

Reference and Access

  • Title: Case Study on Extraction and Identification of Records containing Personal Data and Sensitive Personal Data for Long Term Preservation (RA02)

    Lead Researcher: Alicia Barnard

    Timeline: January 2022-June 2023

    Abstract: This study aims to develop an algorithm for recognizing and extracting unstructured information that constitutes personal data and sensitive personal data in digitized records (PDF with OCR) by applying artificial intelligence (AI) techniques, in particular machine learning (ML) algorithms, and to identify possible requirements for, or equivalents of, trustworthiness (accuracy, reliability, and authenticity) in the AI and ML product to be obtained.

  • Title: Model for an AI-Assisted Digitization Project (RA03)

    Lead Researcher: Eng Sengsavang, UNESCO

    Timeline: 2022-2026

    Abstract: This study will explore the following questions:

      ● What key archival functions and best practices are carried out in effective digitization projects?
      ● What AI-based tools are currently being developed and/or used by practitioners and vendors for digitization activities?
      ● What digitization projects have been implemented, completed, or are in progress that have used AI technologies?
      ● What are the benefits and risks, limitations and potential biases when using such AI technologies in digitization projects?
      ● What AI-based tools might be developed in the future, particularly to assist archival functions during the digitization process (pre-digitization, digitization, post-digitization), and particularly solutions that are low-cost or less resource-intensive?

    The study will accomplish the above by modeling key archival functions carried out during digitization projects that may benefit from AI technologies, whether already developed, emerging, or to be developed in the future. The study will also explore potential low-cost AI solutions, recognizing that digitization is a resource-intensive process, and that resource-strapped organizations, groups, and least developed countries in particular face barriers to digitization.

  • Title: Gamification of archival experience for users (RA04)

    Lead Researcher: Demet Soylu, Hacettepe University

    Timeline: 2021-2026

    Abstract: This study explores problems related to the user experience during the online access and retrieval process in digital archives. It challenges the existing traditional approach to the access and retrieval process utilized within archives and aims to promote a user-centric approach to online access and retrieval. The study also aims to enable the easy facilitation of archival services for key target group(s) as identified through other ITrust AI research studies. The focus of this study is to improve the technical features related to online access and retrieval of archival records through the synthesis of gamification as a component of machine learning within the AI context.

  • Title: User approaches and behaviours in accessing records and archives in the perspective of AI-A global user study (RA05)

    Lead Researcher: Pierluigi Feliciati

    Timeline: October 2021-March 2023

    Abstract: This preliminary user study aims to bring the users' perspective to the definition of requirements and guidelines for developing trustworthy, valuable, and viable AI tools to improve archival reference and access. Data is lacking internationally on the actual user experience (UX) of accessing records and archives. In the absence of shared protocols and metrics, studies of users' behaviour and satisfaction (quality of access) are conducted at the “local” level and limited to specific services. How much do we know about how users perform their research? Do they use personal names? Places? Dates? Functions? Subjects? Are they comfortable with the language of interfaces and records? Are they willing to share their research data to improve archival services? Data collected from a sample of end users could reveal more about satisfaction with existing digital archival reference and access services (access to digital records, access to digitized records and documents, and access to digital finding aids and reference and access tools), and about actual expectations and concerns regarding the application of AI to reference and access for archival records. Finally, this study provides an opportunity to better define the reference and access function in an AI context (articulating it in activities, processes, and quality indicators).

  • Title: Privacy in Digital Records Containing Personally Identifiable Information (PII)-An Exploration of Current Status and the potential of AI Tools and Techniques (RA06)

    Lead Researcher: Georg Gaenser, European Free Trade Association

    Timeline: 2021-2026

    Abstract: This study explores a key barrier (i.e., risk to privacy) to providing open access to digital archival records. It will investigate how archival institutions are protecting privacy in digital records containing PII when providing access to them, how AI tools and techniques could help address the challenges faced by archival institutions in providing access to these kinds of records, and what the implications are of using AI tools and techniques to deal with privacy issues in records. Specific objectives will be: to identify the main needs/obligations of archival institutions concerning the protection of privacy in the provision of access to digital records (derived from legal/regulatory requirements, institutional policies, standards, social expectations, etc.); to describe current approaches, processes, techniques, and/or tools used by archival institutions to protect PII in digital records when providing access to them; to identify gaps between the needs/obligations that archival institutions face regarding the protection of privacy in digital records and the current approaches, processes, techniques, and tools being used to provide access to them; to identify AI tools and/or techniques that could help archival institutions to comply with regulatory/institutional/social requirements when providing access to records; and to explore uses, issues, and challenges associated with the use of the identified AI tools and techniques in the archival field, as well as new tools that could be developed to that end.

  • Title: Research on the process of declassifying personal information using AI tools - Israel State Archives case study (RA07)

    Lead Researcher: Silvia Schenkolewski-Kroll, Bar Ilan University, and Assaf Tractinsky, Israel State Archives

    Timeline: November 2021-November 2023

    Abstract: Archivists and records managers around the world deal with the process of declassifying information in paper and digital form, a process constrained mainly by the quantity of material and the time it requires. The problem grows when digital information needs to be declassified, in part because of its large volume and its structure. This study proposes to explore the nature of declassification in the paper and digital environments in order to identify and develop guidelines for the future use of AI tools for declassification. The main purpose of the study is to review the literature and practice in archival institutions and the process of declassification in the paper and digital environments, and to create a framework and guidelines that will serve as a basis for future declassification using AI tools.

General Studies

  • Title: AI Tutorials (GS01)

    Lead Researcher: Muhammad Abdul-Mageed

    Timeline: 2021-2026

    Abstract: This AI and ML Tutorial Repository (https://github.com/UBC-NLP/itrustai-tutorials) houses tutorials created for the iTrustAI SSHRC Partnership Grant. The collection will grow over time and will be used in hands-on workshops to train archivists and records professionals, as well as anyone interested in learning about different AI tools and techniques. Topics will include Natural Language Processing (NLP) and its core tasks (Part of Speech (POS) Tagging, Named Entity Recognition (NER), Text Classification, Machine Translation); Speech Processing; Image Processing; and Practical Machine Learning.