Research Ideas and Outcomes
Research Article
Corresponding author: Corey DiPietro (dipietroc@si.edu)
Academic editor: Editorial Secretary
Received: 26 Sep 2023 | Accepted: 11 Oct 2023 | Published: 25 Oct 2023
This is an open access article distributed under the terms of the CC0 Public Domain Dedication.
Citation:
Dikow RB, DiPietro C, Trizna MG, BredenbeckCorp H, Bursell MG, Ekwealor JTB, Hodel RGJ, Lopez N, Mattingly WJB, Munro J, Naples RM, Oubre C, Robarge D, Snyder S, Spillane JL, Tomerlin MJ, Villanueva LJ, White AE (2023) Developing responsible AI practices at the Smithsonian Institution. Research Ideas and Outcomes 9: e113334. https://doi.org/10.3897/rio.9.e113334
Applications of artificial intelligence (AI) and machine learning (ML) have become pervasive in our everyday lives. These applications range from the mundane (asking ChatGPT to write a thank-you note) to high-end science (predicting future weather patterns in the face of climate change), but, because they rely on human-generated or mediated data, they also have the potential to perpetuate systemic oppression and racism. For museums and other cultural heritage institutions, there is great interest in automating the kinds of applications at which AI and ML can excel, for example, computer vision tasks such as image segmentation and object recognition (labelling or identifying objects in an image) and natural language processing tasks (e.g. named-entity recognition, topic modelling, generation of word and sentence embeddings), in order to make digital collections and archives discoverable, searchable and appropriately tagged.
A coalition of staff, Fellows and interns working in digital spaces at the Smithsonian Institution, who are either engaged with research using AI or ML tools or working closely with digital data in other ways, came together to discuss the promise and potential perils of applying AI and ML at scale; this work results from those conversations. Here, we present the process that has led to the development of an AI Values Statement and an implementation plan, including the release of datasets with accompanying documentation (dataset cards) to enable these data to be used with improved context and reproducibility. We plan to continue releasing dataset cards and, for AI and ML applications, model cards, in order to enable informed usage of Smithsonian data and research products.
artificial intelligence, machine learning, GLAM, galleries, libraries, archives, museums, collections
The Smithsonian Institution is the world's largest museum, education and research complex. It includes 21 museums, eight research centres, 15 archival repositories, 21 specialised library branches and a zoo. The collection holdings contain approximately 157.2 million objects and specimens, 148.2 thousand archival cubic feet (4.2 thousand cubic metres) and 2.3 million library volumes (Fiscal Year 2022; https://www.si.edu/dashboard/national-collections). In terms of digitised collections, 37 million objects and specimens have a digital record, 7.5 million objects and specimens have a digital image, 113 thousand archival cubic feet (3.2 thousand cubic metres) have a digital record and 27.2 thousand archival cubic feet (770 cubic metres) have a digital image. A total of 1.5 million library volumes have a digital record and 59.7 thousand library volumes have a digital image. The Digitization Program Office, part of the Smithsonian Office of the Chief Information Officer (OCIO), has been instrumental in collaborating with museums across the Smithsonian to investigate how to digitise data most efficiently and at scale for representative projects. The Smithsonian Open Access Initiative (
Museum collections, libraries and archives contain many kinds of information resources. These include:
All of these resources can also have associated metadata, some of which are generated automatically during digitisation (e.g. EXIF metadata for digital photographs), while others are added manually by content experts in many different roles, including but not limited to archivists, cataloguers, librarians, collections information specialists, data managers and curators.
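As a minimal sketch of the first kind of metadata mentioned above, the snippet below reads the EXIF block that a capture device embeds automatically during digitisation, using the Pillow library in Python; the file name is a hypothetical placeholder.

```python
from PIL import Image, ExifTags

# Open a digitised image and read the EXIF block written by the capture
# device during digitisation (camera model, timestamps, resolution etc.).
img = Image.open("digitised_object_0001.tif")  # hypothetical file name
exif = img.getexif()

# Translate numeric EXIF tag IDs into human-readable names before printing.
for tag_id, value in exif.items():
    tag_name = ExifTags.TAGS.get(tag_id, tag_id)
    print(f"{tag_name}: {value}")
```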
The opportunity to introduce AI and ML tools into parts of these workflows is appealing for many reasons. What rises to the top during conversations with knowledge workers at Galleries, Libraries, Archives and Museums (GLAM institutions), both large and small, is the desire to make more of the collections available to the public in a way that enables education, discovery and research. This desire is tempered by the recognition that there will never be enough staff or funding to enable this digital transformation because existing workflows do not allow for the massive scale required by the size of the collections. While there have been a number of reviews and experiments on the use of AI in GLAM institutions (e.g.
One of the most promising uses for AI in GLAM institutions is to improve accessibility. Ensuring that digitised content and data are accessible to all users, whether by adding alt-text to images (alternative text that describes the function or appearance of an image), text transcriptions to audio and video or translations into multiple languages, is crucial and, in many cases, required (e.g. WCAG Web Content Accessibility Guidelines;
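As a hedged sketch of what machine-assisted accessibility can look like in practice, the snippet below drafts alt-text for an image with a pre-trained open-source captioning model, leaving a human to review and edit the output; the model checkpoint and file name are illustrative assumptions, not tools or files used at the Smithsonian.

```python
from transformers import pipeline

# Load a pre-trained image captioning model (an example checkpoint, not a
# Smithsonian-endorsed tool) to draft alt-text for a digitised image.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

# The draft caption is a starting point for human review, not something to
# publish as-is.
result = captioner("digitised_object_0001.jpg")  # hypothetical file name
print(result[0]["generated_text"])
```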
The rapid increase in the pace of digitisation has been matched by a dramatic increase in the availability and usability of pre-trained models for diverse machine-learning tasks. While there is an eagerness to test these models on Smithsonian collections, there are many reasons why these pre-trained models might not work well on data associated with GLAM institutions or, if applied broadly across collections, could produce outputs that are misleading or even harmful. Off-the-shelf computer vision models, for example, have been trained on image datasets whose labels may be outdated, offensive or inaccurate (e.g.
These benchmark computer vision training datasets are also not representative of the kinds of collections held by GLAM institutions, since the objects present in these training data are those for which many image examples can be found online and labelled by non-experts. Many of the collections held by GLAM institutions are historical, rare or unique objects or would require other nuanced labelling. In early experiments at the Smithsonian applying commercial computer vision models to digitised collections objects from the National Museum of American History, we found examples where the model was simply inaccurate (e.g. a photo of a Morse Daguerreotype camera processed by Google Vision was classified with high probability as a sound box; Fig.
A photo of a Morse Daguerreotype camera. When processed by Google Vision, it was classified with high probability as a sound box. Source: https://n2t.net/ark:/65665/ng49ca746a6-6a45-704b-e053-15f76fa0b4fa. Date accessed: 22-06-2023.
A photo of prop shackles worn by LeVar Burton as Kunta Kinte in Roots. When processed by Google Vision, it was classified with high probability as jewellery. Source: https://n2t.net/ark:/65665/ng49ca746a9-d072-704b-e053-15f76fa0b4fa. Date accessed: 22-06-2023.
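As an open-source illustration of the failure mode behind these examples (a sketch, not the commercial Google Vision pipeline used in our experiments), the snippet below classifies an image with an ImageNet-trained torchvision model; because the model can only choose among 1,000 generic everyday categories, a historical object is forced into whichever label scores highest. The file name is hypothetical.

```python
import torch
from PIL import Image
from torchvision.models import resnet50, ResNet50_Weights

# Load an ImageNet-trained classifier and its matching preprocessing.
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()

# A historical object photo (hypothetical file name): none of the 1,000
# generic ImageNet categories may actually describe it.
img = Image.open("daguerreotype_camera.jpg").convert("RGB")
batch = preprocess(img).unsqueeze(0)

with torch.no_grad():
    probs = model(batch).softmax(dim=1)[0]

# Print the single highest-scoring generic label and its probability.
top = probs.argmax().item()
print(weights.meta["categories"][top], float(probs[top]))
```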
There is a robust body of scholarship around the topics of bias in AI and the disproportionate harm it can cause to people of colour (e.g.
Methods to remediate or at least document these risks and biases have also been proposed. Data statements (
Attempts at regulation and policy by government entities have lagged significantly behind the academic literature. Just in the past few years, however, the U.S. Executive Branch has convened AI experts from academia and industry and released multiple policy recommendations and proposed actions. Developed under Alondra Nelson’s leadership at the Office of Science and Technology Policy, a Blueprint for an AI Bill of Rights was released in October 2022, which states, “The Blueprint for an AI Bill of Rights is an exercise in envisioning a future where the American public is protected from the potential harms, and can fully enjoy the benefits, of automated systems” (
In 2021, during COVID pandemic restrictions on in-person gatherings and meetings, a virtual AI- and ML-focused reading group was formed at the Smithsonian. In order to promote the broadest participation possible, we chose readings that were of interest not only to Smithsonian staff already building or implementing AI in their work, but also to staff from across the Institution who interact with data in any way. We quickly agreed that focusing on data was important because the data themselves play such a central role in the downstream application of any computational tools. Due to the expansive footprint of the Smithsonian, the subject-matter expertise of staff is extremely broad. One constant challenge is finding ways to break down the silos that form as staff work in their organisational “unit” (museum, department, research centre etc.); sharing knowledge across units and even departments can be challenging. That was one impetus for the formation of the reading group: many are interested in similar topics, but it can be difficult to find time to stop what we are doing to talk to each other. The books chosen in this initial phase were: Atlas of AI: Power, Politics and the Planetary Costs of Artificial Intelligence by Kate Crawford (
Almost immediately during the reading group conversations, we realised that we needed a document that could serve as best practices or guardrails as AI and ML applications become more prevalent. Not having this in place was hindering our ability to work together on these topics across our distributed organisation. Our first goal was to draft an “AI Values Statement” with feedback from Smithsonian staff and affiliates with diverse expertise. We saw this as purposefully distinct from an official Smithsonian policy on the use of AI, which, we felt and still feel, would be difficult to draft and implement while technologies are changing so rapidly and there may not be a one-size-fits-all solution that works across all Smithsonian data types and units. Indeed, during the months between drafting our Values Statement and compiling the community feedback, ChatGPT was released and added a new dimension to this work that could not easily be fitted into our existing language. We found inspiration in the Stanford Special Collections and University Archives Statement on Potentially Harmful Language in Cataloguing and Archival Description (
We also felt the need to walk a bit of a tightrope; guidance should not stifle creativity and innovation in piloting experimental tools, but instead should empower potential users and consumers of AI tools and outputs to understand the potential risks and how to navigate the process when deciding whether and how to implement AI. We hope it can be a guide for users to ask appropriate questions before entering into a new project or partnership using AI tools. Particularly as AI tools have become part of applications we are already using (e.g. photo editing software, search engines, machine-generated transcription for video and audio, autocomplete in document and email programmes, computer vision weapons detection in security systems), there is really no avoiding this technology becoming a part of existing workflows, but that does not mean we cannot make choices about which tools we use and how we use them.
In order to begin to implement the recommendations from the Values Statement, we chose an initial handful of Smithsonian datasets on which to pilot Dataset Cards and used the Dataset Card template from HuggingFace (
All dataset cards have been posted to GitHub (https://github.com/smithsonian/dataset-cards) and archived at Zenodo (https://zenodo.org/doi/10.5281/zenodo.8381116).
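For readers unfamiliar with the format, the sketch below generates a minimal dataset card: a README with YAML front matter followed by prose sections, loosely following the shape of the HuggingFace template. Every field value and section heading here is an illustrative placeholder, not the content of our published cards.

```python
from pathlib import Path
from textwrap import dedent

# Assemble a minimal dataset card: YAML front matter plus prose sections.
# All values and headings below are illustrative placeholders.
card = dedent("""\
    ---
    license: cc0-1.0
    language:
    - en
    pretty_name: Example Smithsonian Collection Dataset
    ---

    # Dataset Card: Example Smithsonian Collection Dataset

    ## Dataset Description
    A one-paragraph summary of what the dataset contains and how it was digitised.

    ## Known Biases and Gaps
    Gaps in collections scope and known dataset biases, described up-front.

    ## Intended and Out-of-Scope Uses
    Cautionary information and potential risks for both content and format.
    """)

Path("README.md").write_text(card, encoding="utf-8")
```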
During community discussions, we thought it was important to distinguish between two main tracks of work at the Smithsonian that use or may use AI tools, which we refer to as the “research” and “strategic” tracks. The research track has been vibrant at the Smithsonian since 2017, particularly for digitised natural history datasets. For these projects, Smithsonian researchers have generally built custom convolutional neural networks (most recently using transfer learning on open-source models, for example, ResNet) or natural language processing pipelines, which were trained or fine-tuned using Smithsonian content. It is in our best interests to make sure that the methods, techniques and lessons learned during the course of these research projects are discussed and shared. Some examples of these research projects include:
At a strategic level, we are still trying to determine where AI can have the most impact and improve the efficiency of current collection workflows and practices to the greatest extent. While AI applications to research do indeed provide efficiencies (e.g. a person would not be able to measure leaves on 4.5 million herbarium sheets), the goal is often not focused solely on efficiency, but on new ways of capturing data or features to generate scientific insights (e.g. analysing total plant shape as opposed to restricting to a handful of traditional measurements). The strategic applications may rely more heavily on commercial models and, thus, may require more scrutiny after implementation to identify inaccuracies or harmful outputs.
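For readers unfamiliar with the transfer-learning pattern described above for the research track, the sketch below shows the basic recipe with torchvision: freeze an open-source ResNet backbone pre-trained on ImageNet and train only a new classification head on collection-specific labels. The class count and layer choices are illustrative assumptions, not any specific Smithsonian project.

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

# Start from an open-source backbone pre-trained on ImageNet.
model = resnet50(weights=ResNet50_Weights.DEFAULT)

# Freeze the pre-trained layers so only the new head is updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully-connected layer with one sized for the
# collection-specific labels (the class count here is hypothetical).
num_collection_classes = 12
model.fc = nn.Linear(model.fc.in_features, num_collection_classes)

# model.fc's parameters are newly created and trainable; standard PyTorch
# training on labelled collection images then fine-tunes only this head.
```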
The Smithsonian AI Values Statement is below and is also posted at https://datascience.si.edu/ai-values-statement. Fig.
Engage internal community: In order to maintain open lines of communication among AI, ML and data practitioners across all Smithsonian units, we plan to continue our reading group as well as institute regular AI community meetings. At these meetings, community members can present projects using or building AI tools, whether in the planning or implementation phase, to receive feedback from other community members. We also see the opportunity to connect practitioners from different units who may be using the same tools or developing methods with shared challenges. We see this as an informal way of keeping track of which technologies, vendors and methods community members are using.
In addition to these gatherings, which may organically draw more technically-focused staff and AI practitioners, we plan to share learning opportunities and resources in non-technical venues, both synchronously and asynchronously. We think it is important to ensure that communications include as broad a group of Smithsonian staff and affiliates as possible, as AI touches all of us. By plugging into existing committees and standing working groups, from higher-level Director’s meetings and strategic teams focusing on data governance, digital practitioners, webmasters and web developers, to a series of presentations at unit all-staff meetings, we hope to bring topics like AI literacy and policy advocacy to all Smithsonian staff. Plugging into and leveraging existing networks can help complement and grow the existing community.
Promote the use of Dataset and Model Cards: The Smithsonian Open Access Initiative has made millions of digitised objects and records available for public use. Datasets made available by GLAM institutions can be difficult to use due to institution-specific metadata or cataloguing practices that are unclear to end-users. We also cannot easily anticipate all the ways these data will be used. Dataset Cards, which are human-readable README pages that contain general information about the data and how they should be used, can provide a way for institutional expertise, context and bias to be conveyed to end-users (and even our future selves). Our Dataset Cards completed to date are posted on GitHub (https://github.com/smithsonian/dataset-cards). The advantages and disadvantages of Dataset Cards that we have identified are summarised below:
Dataset Card Advantages | Dataset Card Disadvantages
For discrete or “complete” datasets, these can provide comprehensive information for users of the data. | For broad datasets that span departments, answers to prompts may be too non-specific to be useful.
Provide context, awareness, cautionary information and potential risks for both the data content and data format. | For Smithsonian data, there is no real way to describe all Open Access data as a single set, even though users may be interested in these data as a whole.
Gaps in collections scope and dataset biases (when known) can be identified and described up-front. | If the content of the card changes, it may be challenging to ensure users use the newer version.
Cards and their associated datasets and model cards can be integrated with HuggingFace and GitHub. | Incomplete or growing datasets may be more difficult to describe comprehensively and will require more extensive versioning.
Can be used for both “internal” and public-facing datasets, assisting both future staff and external users. | Datasets may be modified and manipulated into new versions depending on AI task or goal; currently, there is no straightforward way to link to related datasets from a dataset card.
Provide training opportunities: The Smithsonian has a robust data science skills training programme coordinated by the OCIO Data Science Lab with more than 20 Smithsonian staff and fellows who have completed Carpentries (
Build in mechanisms for feedback: The opportunity for AI to provide text descriptions, labels, captions and other enhanced metadata further enables more Smithsonian content to be shared online with the public. The increase in content also means that mechanisms must be in place to allow user feedback and suggestions for improving these machine-generated metadata. Both staff and the public should have opportunities to correct inaccuracies and flag terms that are outdated or harmful, particularly as vocabularies and language usage change over time or as objects develop new or different cultural relevance.
We also think it is crucial for any AI-generated content to be clearly labelled as such so that it is not confused with metadata that has been created by humans. While both machine-generated and human-generated metadata can be inaccurate and may need to change over time, it is important for users of the data to know how they were produced. Documentation of methods and model versions would be required of any scholarly research using these tools and we think it is as important when any AI methods are applied to data put online for public audiences.
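One lightweight way to make that distinction concrete is to attach provenance fields to every machine-generated value; the sketch below is a minimal illustration, where the field names, identifier and model name are assumptions for the example rather than any Smithsonian schema.

```python
import json
from datetime import datetime, timezone

# A metadata record that keeps machine-generated values clearly separated
# from human-created description. Field names, the identifier and the model
# name are illustrative assumptions, not a Smithsonian schema.
record = {
    "object_id": "example_0001",
    "alt_text": "A wooden box camera mounted on a tripod.",
    "alt_text_provenance": {
        "created_by": "machine",              # vs. "human"
        "model": "example-captioning-model",  # hypothetical model name
        "model_version": "1.0.0",
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "reviewed_by_human": False,           # flips to True after review
    },
}

print(json.dumps(record, indent=2))
```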
Collaborate with other GLAM and federal institutions: For federal organisations, federal procurement regulations can sometimes limit our ability to be agile in the adoption of new technologies. In order to address such limitations, we have been developing collaborations with other federal organisations including the Library of Congress and the National Archives and Records Administration, as well as colleagues at Virginia Polytechnic Institute and State University, to discuss shared challenges and opportunities around the topic of AI. While policies will likely be institution-specific as our data as well as processes and organisational structure are all unique, we see great value in building on each other’s progress. We hosted two workshops for staff from our institutions in 2022 and presented a panel discussion at the Joint Conference on Digital Libraries (
Evaluate vendors and partners: The use of contracts with outside vendors is common at many GLAM institutions and federal agencies, in particular for experimental work for which it would take a long time to build the case for, and dedicate funding to, new staff positions. These institutions also often cannot match salaries in the for-profit sector, so contracting can be a mechanism to bring in expertise on a project-by-project basis. It can sometimes be difficult for institutions to evaluate vendor promises and for vendors to fully understand challenges before embarking on a project because GLAM data are often historical, messy and not uniform. The terms of the agreements signed by vendors and institutions may not detail the exact technology used when building or implementing an AI system and, in many cases, the training data, the model or the pipeline may be closed-source. How vendors evaluate success may also differ from the metrics valued by the institution. This is particularly important given the status of our institutions as trusted sources: inaccurate outputs can begin to erode public trust.
We expect the Values Statement to evolve over time, in response to changing technology, feedback from users and broadening of the types of data available for AI applications. We also foresee a time when a more structured governance framework is needed, as the number of users and use cases of AI applications grow. When the specifics of such a framework are considered, we hope that the steps put into place here lay the groundwork for a growing, vibrant, community of digital practitioners engaged in experimenting, evaluating and integrating new technologies into traditional collections practices.
Technology is not neutral.
The use of Artificial Intelligence (AI) tools¹ to describe, analyse, visualise or aid discovery of information from Smithsonian collections, libraries, archives and research data reflects the biases and positionality of the people and systems who built each tool, as well as those who collected, catalogued and described any data used for their training. These tools might hold extensive value in their use at the Smithsonian, but there are issues that will limit the applicability and reliability of their use due to the way they were planned and created.
We seek to begin only AI projects² that implement tools and algorithms that are respectful to the individuals and communities represented by the information in our museum, library and archival collections. We aim to be proactive in identifying and documenting biases and methodologies when building and implementing such tools and in making that documentation available to audiences that will interact with the resulting products. We recognise that technology evolves over time and that our efforts must also evolve to ensure our ethical framework stays relevant and robust. We encourage any person, community or stakeholder involved with or affected by said tools and algorithms to provide feedback and point out any concerns.
We acknowledge the opportunities that AI tools present for cultural heritage organisations:
We urge anyone contemplating an AI project to consider:
We strive to promote the following actions when implementing AI tools:
We strive to recognise the following when implementing AI tools:
We strive to promote the following when partnering with outside organisations on AI tools or projects:
¹The term “AI tools” includes a variety of technologies that seek to create decision-making software. Some examples include facial and speech recognition, machine-learning-based optical character recognition, language translation, natural language processing, image recognition, object detection and segmentation and data clustering. Common commercial examples include virtual assistants such as Siri or Alexa, website search and recommendation algorithms and tagging and identification of people in images on social media platforms.
²The term “AI project” refers to an intentional effort to utilise or create an AI tool in research or in an existing workflow.
We thank Deron Burba, Carmen Iannacone, Beth Stern and Diane Zorich for providing feedback on earlier versions of the AI Values Statement and this manuscript and Benjamin Charles Germain Lee and Lukas Hughes-Noehrer for their reviews of the preprint. We thank everyone who attended an AI reading group meeting and contributed to conversations about this topic. In addition to the authors, we also thank Smithsonian staff who contributed to dataset cards: Jessica Bird, Torsten Dikow, Sylvia Orli, Douglas Remley, Eric Schuettpelz and Kamilah Stinnett. RBD thanks Thomas Padilla, Liz Lorang, Harish Maringanti, Jefferson Bailey, Ryan Cordell, Abbey Potter, Meghan Ferriter, Jill Reilly, Bill Ingram, Sylvester Johnson, Laura Coyle and Courtney Bellizzi for discussions over the past few years that helped refine the ideas presented here. We dedicate this work to the memory of our colleague Effie Kapsalis, who led the Smithsonian Open Access Initiative.