About the Research

Research Activities

Date Range | Research Objective | Activities
2021 - 2022 1: Identify specific AI technologies that can address critical records and archives challenges
  • Identify critical challenges to be addressed by AI, adding to our initial survey of techniques (Table 1)
  • Surveys and interviews with practitioners within the global records and archives community
  • Identify within each critical challenge the specific factors to be addressed and how AI might address them
  • Expert interviews and mapping
  • Identify and prototype candidate AI technologies
  • Candidate use cases
  • Create initial evaluation criteria for AI solutions for records and archival challenges, including a diverse set of challenge datasets focusing on specific issues
2022 - 2023 2: Determine the risks and benefits of using AI technologies on records and archives
  • Determine the requirements of public records compared to the capabilities of AI technologies
  • Doctrinal legal research
  • Development of a value structure for risks and benefits
  • Identify the limitations of each potential AI solution
  • Policy analysis
  • Expert interviews
  • Environmental scans
  • Comparison studies of AI solutions on representative datasets
  • Develop list of threats and vulnerabilities
  • SWOT/PESTLE analysis
  • Theoretical analysis
  • Stakeholder interviews
  • Expert assessment
  • Error analysis of AI solutions based on performance on challenge datasets
  • Iterate on validation criteria, for instance creating new versions of challenge datasets, to address any important factors discovered through threat and vulnerability analysis
2023 - 2024 3: Establish how archival concepts and principles can inform the development of responsible AI
  • Establish archival principles to be used to inform AI development
  • Develop and improve AI tools based on these principles
  • Identify and mitigate biases present in training datasets and models
  • Consistency analysis
  • Determine whether AI informed by archival principles is more aligned with archival needs
  • Experimental comparison of models on challenge datasets
2024 - 2025 4: Validate outcomes from Objective 3 through case studies and demonstrations
  • Deploy archival-oriented AI tools
  • Measure AI solutions against the validation criteria developed in Phases 1 and 2
  • Examine feasibility, sustainability, bias, transparency, generalizability, and preservation of context in AI solutions
  • Case studies
  • Use cases
  • Detailed error analysis of AI solutions in the context of case studies
  • Develop and validate tools, including a framework for evaluation and checklists for institutions considering AI implementation
2025 - 2026 5: Completion of Outputs
  • Finalize overarching publication of outcomes
  • Packaged software (e.g., to automatically caption historical photos, sensitize descriptions of documents, or translate historical documents in indigenous languages)

Expected Outcomes

This project will:

  • generate new knowledge on the uses of AI, such as Machine Learning (ML), on public records
  • improve upon existing tools and create new ML tools that will address archival needs, such as machine translation, image recognition and description, optical character recognition (OCR) and handwritten text recognition, text summarization and classification, and text style transfer for language civilization (e.g., removal of bias, hate, and sexism). The tools developed will be tailored to, tested for, and deployable by the records and archives community
  • produce best practices, standards, and guidelines for applying ML tools across an entire problem space, bringing archival knowledge and practice to bear on problems such as bias in ML, explainable artificial intelligence (XAI), and image description
  • enrich research and lead to knowledge co-creation in several disciplines, including archival science, records management, AI, cybersecurity, information science, law, and ethics, through knowledge exchange and uptake between scholars and practitioners within and among those disciplines
  • train a substantial cohort of students -- future scholars and professionals -- who will bring their enhanced knowledge to the institutions, organizations, communities, and governments they will serve, and to the archives and records community as a whole