Data + Archives and Archival Violence
Sources:
Chiang, T. (2023, February 9). ChatGPT is a blurry jpeg of the web. The New Yorker.
- In this article, sci-fi author Chiang uses the analogy of lossy image compression (a blurry JPEG) to explain how large language models (LLMs), including ChatGPT, function and why they hallucinate, approximating seemingly correct but compromised information. It explores how the internet, as an archive of information, serves as a dataset for training generative AI. #Practical #Philosophical
Colavizza, G., Blanke, T., Jeurgens, C., & Noordegraaf, J. (2021, December 14). Archives and AI: An overview of current debates and future perspectives. Journal on Computing and Cultural Heritage, 15(1), 1-15.
- This academic journal article serves as a literature review of scholarship on the role of AI in archival processes: how the digitization of archival materials transforms content into big data, which archivist tasks could be automated and how automation will shift archivists' work, and the implications of both shifts for how information is organized and knowledge is rendered. The authors acknowledge that their survey doesn't include any substantive consideration of the ethics of AI use in this capacity. Dense reading, but a useful exploration written for subject experts. #Philosophical
Crawford, K., & Paglen, T. (2019, September 19). Excavating AI: The politics of training sets for machine learning. Excavating AI.
- In this position paper, originally published by New York University’s AI Now Institute to accompany the authors’ art exhibit, Training Humans, Crawford and Paglen elucidate the biases in training data (both the processes and the data itself), beginning with the problematic nature of taxonomy as an act of power before specifically critiquing ImageNet and its impact. It’s important to note that the paper and art exhibit have been criticized for a double standard: featuring images of subjects without their explicit consent, which is part of the authors’ original critique of the image archive. #Philosophical
freemyn, k.b. (2022). Expanding the Black archival imagination: Digital content creators and the movement to liberate Black narratives from institutional violence. In Burns-Simpson, S., Hayes, N.M., Ndumu, A., & Walker, S. (Eds.). The Black librarian in America: Reflections, resistance, and reawakening.
- In this chapter, freemyn discusses how independent Black digital memory workers use Instagram to do archival work outside institutional archives, which perpetuate violence against marginalized people. She describes how the conventions of the platform (e.g., the grid interface, hashtags) serve to construct independent archives of the work of Black women, and how institutional archivists can support this work through outreach, collaboration, and professional expertise. This chapter explores how technology and new media can be used to confront and circumvent institutionalized culture work that harms marginalized people. #Philosophical #Practical
Hempel, J. (2018, November 13). Fei-Fei Li’s quest to make AI better for humanity. Wired.
- This article profiles computer scientist and AI pioneer Fei-Fei Li, who founded the image archive ImageNet and ran Stanford's AI Lab for several years. It covers her motivations and process for starting ImageNet, her controversial time working for Google, and her views on the need for increased diversity in the field of AI. #Practical #Philosophical
Internet Archive. (2023, May 2). Generative AI meets open culture [Video]. Archive.org.
- A 60-minute recording of a panel discussion with expert representatives from the Internet Archive, the Wikimedia Foundation (Wikipedia), and Creative Commons. Topics covered include how the Internet Archive (IA) is using AI to explore and improve its records, how generative AI can be fun and joyful, how Wikipedia editors are testing ChatGPT, how creative participation is changing, the limits of copyright, and the tension between the intended public good of the internet and the corporate motivations of most AI companies. [Ed. note: The closed captions aren’t the most accurate.] #Practical #Philosophical
Jules, B. (2016, November 11). Confronting our failure of care around the legacies of marginalized people in the archives [Keynote presentation]. Medium.
- Dr. Jules is an archivist, historian, and co-founder of the Archiving the Black Web project. In this keynote presentation at the National Digital Stewardship Alliance annual meeting, he discusses how the field's emphasis on neutrality and standards over an ethics of care leads to the symbolic annihilation of marginalized peoples in information preservation and knowledge construction through archives, libraries, and museums. Though the keynote doesn't mention AI, it's instructive for recognizing how a similar insistence that technology like AI is neutral is both incorrect and harmful in its impact. #Philosophical #Practical
Moore, K. (2024, March 27). Review: Kate Crawford’s ‘Atlas of AI’, chapter 4: Classification. Hastac Commons.
- This peer-reviewed blog post, written by a doctoral student, synthesizes and analyzes the chapter on classification in Crawford’s book, Atlas of AI, including how classification acts as a system of power that shapes how AI is trained and used, drawing on a historical tradition of bias and racism in archival processes. #Practical #Philosophical
PBS NewsHour. (2017, January 2). Internet history is fragile. This archive is making sure it doesn’t disappear [Video]. YouTube.
- A 9-minute news report that serves as an explainer for the Wayback Machine, the visual search function of the Internet Archive, which allows users to access screen captures of pages from defunct or updated websites. #Practical
Rebeiz, L.D. (2023, November 1). AI tools paint a blurry picture of our current reality - so what do these biases mean for our future? It’s Nice That.
- In this article, artist Rebeiz discusses their experience training a Generative Adversarial Network (GAN) to generate paintings in the style of their own work, using a database of their own paintings as training data, and contrasts it with Midjourney and DALL-E, which are trained on web data and generate biased and limited outputs as a result. Rebeiz explores the complicated necessity of the archive as knowledge preservation, even when the preservation processes and the knowledge itself are problematic. They view engagement with AI as a “necessary form of resistance” to erasure from the digital memory of the world. #Philosophical
Wong, J.C. (2019, September 18). The viral selfie app ImageNet Roulette seemed fun - until it called me a racist slur. The Guardian.
- In this article, a technology journalist discusses her experience using the app created by Crawford and Paglen as part of their ImageNet critique, which allows users to upload an image of themselves that the app then categorizes according to ImageNet’s taxonomy, which includes racist categories. This piece is useful for exploring ethical considerations around critique as a concept and its direct discriminatory impact on people engaging with the technology. #Practical #Philosophical