At the closing ceremony of the 2023 iPres conference, an international gathering of digital preservation practitioners held at the University of Illinois Urbana-Champaign (UIUC), event organizer and UIUC Interim Associate Vice Chancellor for Research & Innovation Christopher Prom described being interviewed by the local paper, The News-Gazette, for an article about the conference. When asked by the interviewer why digital preservation is important, Prom spoke enthusiastically about the values of the profession and our social responsibility to ensure that the digital record remains accessible. He then shared the published article’s headline, which was quoted from an offhand phrase he used during the interview: Digital preservation [is] “a different kettle of fish.” The iPres audience laughed at this non sequitur chosen to represent the professional challenges that had been the focus of the week-long conference. Sensing that an important new figure of speech had entered the digital preservation lexicon, I used Internet Archive’s Save Page Now feature to archive the article for future generations of scholars.
Running from September 19 to 22, the “different kettle of fish” conference program covered a variety of digital preservation topics. Concurrent sessions grouped by themes included a mix of workshops, presentations of short and long papers, panels, lightning talks, posters, bake-offs, and games. Here are a few highlights from some of the sessions that I attended:
Policy Development and Documentation
- The Digital Preservation Coalition (DPC) hosted a workshop to demonstrate two assessment tools. The Rapid Assessment Model (RAM) enables organizations to identify and track continuous incremental progress towards achievable digital preservation policy goals with the intention of approaching “good enough” practice rather than striving for unachievable best practice. DPC’s Sharon McMeekin talked about the value of acknowledging our deficiencies when things don't work out and mentioned the example of a clandestine ‘failure club’ organized under the motto, “If you aren’t failing, you aren’t trying.” The workshop also presented the Competency Audit Toolkit (CAT), a worksheet designed to help assess and build staff skill levels within an organization by charting the diversity of skills necessary to enhance a digital preservation program.
- In a later session, DPC’s Jenny Mitcham joined virtually to present Documentation Good Practice and premiere a new DPC guide on Digital Preservation Documentation. Tying the topic of documentation to the main conference theme ‘Digital Preservation in Disruptive Times,’ Mitcham emphasized the importance of documentation as something that we turn to in times of crisis.
- Rebecca Frank’s presentation on Repository Staff Perspectives on the Benefits of Trustworthy Digital Repository Certification described the results of a study based on interviews with repository staff members of institutions that have gone through the Trustworthy Repositories Audit & Certification (TRAC) process. Respondents reported internal benefits to the certification, such as being forced to review and document processes. Frank described this benefit in terms of the “lottery test”: If someone in your organization won the lottery and left their job, would you lose their institutional knowledge because it hasn’t been documented? The external benefits included a perceived status in the community and reassurance for stakeholders. Notably, while respondents generally found TRAC valuable, no one considered their content to be better preserved as a result, and many were ambivalent about the costs and the rating system.
- New York University’s Laura McCann and Weatherly A. Stephan presented From Silos to Community: The Path to a Holistic Digital Preservation Policy, which described a participatory and inclusive decision-making model to connect colleagues engaged in digital preservation work across multiple units of NYU Libraries. The resulting Digital Preservation Policy was developed collectively by starting from a shared understanding of digital preservation and building up themes to define the scope of a values-based policy statement.
- The Lessons from the Future: Looking Back at Policy Development panel included several members of the DPC together with Elizabeth England (National Archives and Records Administration), Martin Gengenbach (National Library of New Zealand), and Kieran O’Leary (National Library of Ireland). Speaking largely from the perspective of national libraries and archives, the panel talked about turning aspirational policy into actionable strategy, organizing policy work across a large institution, measuring impact, and the cycle of reviewing and revising policies over time.
File Formats
- Several file format presentations at iPres referenced a panel discussion hosted by the Open Preservation Foundation (OPF) earlier in the year: Do Unacceptable File Formats Exist? The OPF panel was reprised on the second day of the conference with two of the original participants, Sam Alloing and Valentijn Gilissen, joined by the “file format extended universe” of Leslie Johnston, Kate Murray, Micky Lindlar, and Tyler Thorsted. The core debate concerns the terms used by institutions to express file format policy, and whether it may be harmful to restrict the formats a repository ingests to ‘acceptable’ or ‘preferred’ formats that are considered better for long-term preservation. Some argue that this restrictive terminology puts the burden of responsibility for preservation on the content creator rather than on the institution, and may force depositors to alter their content by converting it to a format that is better for preservation but less suited to the content itself. Johnston and Murray brought up the concept of technical debt created by accepting a wide variety of formats, which may require institutions to take on the maintenance of emulators. Lindlar argued that people will use data in the formats available, regardless of whether those formats are suitable for preservation. All of the panelists seemed to agree that institutional context is important, and that file format policies say more about institutional capabilities than they do about any inherent preservation characteristics of the file formats themselves. The key takeaway of the discussion was that an effective file format policy involves accepting content in its original format and being transparent about the service levels that the institution can provide to keep content accessible. ‘Recommended formats’ are better expressed as guidance for content creation, ideally to be considered before a digital project has begun.
- Tyler Thorsted’s presentation, Key Elements of File Format Strategy, identified the technical factors that an institutional policy should document to better understand and assess format risks. Thorsted also acknowledged the importance of tracking institutional factors, such as the history of policy decisions.
- Micky Lindlar described a model for a shared understanding of error-message handling in Not Well-Formed or Invalid. Now What? – Towards a formalized workflow for format validation error treatment. The model lays out a decision process for dealing with validation errors when encountering files that don't meet the file format specification. An understanding of file formats is obviously important to identifying and fixing validation errors. I would be interested in seeing this model adapted to help deal with implementation errors for files that pass validation checks and are considered 'well-formed' but still have rendering issues (PDF).
Other Topics
- Sawood Alam of the Internet Archive presented an intriguing concept for tracking versions of web objects in a decentralized manner: IPARO: InterPlanetary Archival Record Object for Decentralized Web Archiving and Replay. This system enables web archives to index content in a way that requires storing only one copy of a digital object even if it appears in multiple places.
- Laurel Provencher explained the basics of DNA-based data storage and described a text search experiment using a DNA-encoded dataset of Shakespeare's complete works in A Storage and Search Demonstration with DNA-Encoded Text. The advantages of DNA data storage include information density, stability, easy replication, readability (understanding DNA will always be necessary for human health), and the ability to enable massively parallel in-storage computing.
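For readers curious how an archive can keep only one copy of an object that appears in multiple places (the idea mentioned in the IPARO item above), here is a minimal sketch of content addressing, where each object is stored under a hash of its bytes. This is an illustration of the general technique only, not a description of how IPARO itself is implemented; the class and method names (`ContentAddressedStore`, `add_capture`, `replay`) are hypothetical, and the sketch does not attempt to model version tracking.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by the SHA-256 digest of each object's bytes.

    Hypothetical illustration only; not IPARO's actual design.
    """

    def __init__(self):
        self._blobs = {}  # digest -> payload bytes (one copy per unique payload)
        self._index = {}  # (url, timestamp) -> digest

    def add_capture(self, url, timestamp, payload):
        """Record a capture; identical payloads share a single stored blob."""
        digest = hashlib.sha256(payload).hexdigest()
        # setdefault keeps the existing blob if this digest was already stored.
        self._blobs.setdefault(digest, payload)
        self._index[(url, timestamp)] = digest
        return digest

    def replay(self, url, timestamp):
        """Return the bytes captured for a given URL and timestamp."""
        return self._blobs[self._index[(url, timestamp)]]

    def stored_copies(self):
        """Number of distinct payloads actually held in storage."""
        return len(self._blobs)
```

If the same image bytes are captured from two different URLs, both index entries point at the same digest, so `stored_copies()` stays at one while either capture can still be replayed.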
I'm still catching up on papers for sessions that I missed. Did you attend iPres 2023? I'd love to hear your thoughts and recommendations. Get in touch!