Preserving Digital Archives


Most attendees at the Beinecke Library’s recent conference on digital archiving–“Beyond the Text: Literary Archives in the 21st Century”–arrived equipped with the idea that there is no preservation without loss.

What may have given some attendees pause, particularly those who work primarily on the first two centuries following the Reformation, is how much 21st-century digital stuff is being preserved–and how idiosyncratic the process of selection can be.

Faced with the data deluge of a contemporary literary figure’s electronic correspondence, for example, how do archivists determine what gets archived and what gets tossed?  Now that archiving can begin during a writer’s or publisher’s lifetime, without a family member’s interference (think Cassandra Austen), who shapes the archive?  And if digital archivists shape the archive, what principles of retention do they use?  Where do their loyalties lie? With the author?  Or with the data-hungry and feverishly scandal-mongering scholars of posterity?

The two-day conference raised unresolved and provocative questions, many of which focused on the problem of selection.  Fran Baker, the Assistant Archivist for John Rylands Library at the University of Manchester, discussed the complexity of archiving the Carcanet editorial papers, including email.  Deciding what stays and what gets tossed may not seem new to librarians familiar with the problem of sorting and discarding, but in the context of shaping an archive, that decision-making process and its likelihood of error take on urgency.

There were stories of forensic success, the most notable of which is Matthew Kirschenbaum’s narrative of the extensive and collective effort tracking down William Gibson’s electronic poem, “Agrippa,” which was designed to encrypt itself after a single reading.  That a text programmed to go away can be recovered suggests both the value of collaborating on large digital projects like The Agrippa Files and the perils of assuming that an author has control over her or his electronic archives.  Similarly, Beth Luey’s account of the rich storehouse of data contained in publishers’ records–sales data, copies printed, copies sold, print runs, design decisions, contracts, marketing files, legal disputes, reviews, book jacket design, subsidiary rights, and so forth–both encouraged work on publishers’ records and raised ethical and legal issues.  In the discussion that followed, for example, it became clear that though some publishers did not retain rejected manuscripts, others did, including pertinent correspondence and readers’ reports.

The keynote talk by David Sutton noted that literary manuscripts are like no other manuscripts in that they offer insights into the act of creation.  He showcased ongoing projects that promote an awareness of digital literary archives.

Hazel Carby’s eloquent, harrowing, and culturally resonant account of tracing her family genealogy back to a slave owner’s carefully archived records reminded everyone that archives preserve both the beautiful and the monstrous.

Diane Ducharme drew on her experience at the Beinecke to warn that however much we may desire an unmediated past and a pristine archival order free from editing and explicating, all archives arrive shaped and selected.  Her discussion underscored the importance of searching for the traces of a previous archivist’s work.

Micki McGee described her experience with the Yaddo Archive Project, which aims at providing visualizations of the social network of writers who worked at Yaddo.  She described the process of seeking a relational database with social network mapping and a visualization widget.  Though the project, Yaddo Circles, requires authentication and is not yet available for public view, this vimeo provides an overview.  Clicking here reveals the kind of relational visualization this project might produce.

McGee also recommended looking at the following projects:

These projects have potential for helping us recover the intensely sociable and highly competitive literary worlds of the long eighteenth century.   Like the many other provocative and interesting papers and introductions to sessions, they point a way forward even as they raise methodological, logistical, and even ethical questions.

This conference made clear the value of a longer conference, with sessions focusing on specific problems posed by digital archives of material both old and new.  I welcome contributions by others who attended the conference to help complete this cursory overview.


10 Responses to “Preserving Digital Archives”

  1. Eleanor Shevlin Says:

    Many thanks, Anna, for this excellent report on the “Beyond the Text” conference held at Yale. I look forward to hearing from others who attended.

    Did any of the papers or ensuing discussions address the before-and-after effects of digitization on deciding what to preserve? In other words, how is the decision-making process of what to include being altered by digitization? Do the choices focus on what to digitize and what not? Or does what is “tossed” literally mean discarded permanently rather than simply not digitized?

    Let me also mention an upcoming event that should contribute to this conversation. One of the pre-conference events at the SHARP 2013 conference this July seems quite relevant. The roundtable session, Digitization Crossroads, is being organized by Dr. Alea Henle, University of Wisconsin-Madison, as part of SHARP’s outreach to the Society of American Archivists (SAA) and related professionals. The session aims “to bring together scholars and practitioners from archives, history, library science, and literary studies to encourage interdisciplinary and cross-profession exchanges.” As the roundtable description explains, professional and disciplinary interests frequently result in different approaches to collecting and preserving materials. These differences, in turn, assume key importance in creating an archive, for a designer’s perspective on collection and preservation will influence his or her choices in constructing the archive and thus ultimately the shape and functionality of the archive itself.

    Speakers include:

    • Alea Henle, University of Wisconsin-Madison, Chair
    • Mark A. Greene, University of Wyoming
    • Melissa Homestead, University of Nebraska
    • A. Mitchell Fraas, University of Pennsylvania
    • Kristine Hanna, Internet Archive
    • August Imholz, Jr., Readex (retired)


  2. Anna Battigelli Says:

    Thanks, Eleanor, for calling attention to the SHARP Digitization Crossroads roundtable, which sounds like it will provide a productive multiplicity of perspectives on archival issues.

    To answer your question, the before-and-after effects of digitization were an undercurrent in many of the sessions. The Agrippa Files story suggests that the tossed is not always permanently tossed.

    But the general question of how to select remained problematic, despite vague claims to be “true to the author.”

    Missing was a sense of what it means to be true, not to an author, but to a work. If literary archives are, as David Sutton noted, unique because of the insight they provide to the creative process, loyalty to a work–and to the thought directed at that work–seems key. Here too, the Agrippa Files project seems relevant. No one seemed to care that the author intended for the electronic poem to disappear. But they did care about preserving the poem.


    • Eleanor Shevlin Says:

      Thanks, Anna. This need to be true to the work, too, also serves as an important reminder that the work is a product of many hands. This multiplicity should receive consideration as well.

      My before-and-after query is tied to issues of tossing. One hears about material being discarded after it has been digitized, and I wonder if roughly the same amount of material was being “tossed” pre-digital preservation. Or was it instead just receiving very low priority in terms of being catalogued?


      • Anna Battigelli Says:

        The discussion was mostly of archiving digital records, though Catherine Hobbs talked about the variety of media confronting archivists as they bring order to personal archives. One example she provided was Carol Shields’ mother’s diary, a physical book that is part of Shields’ archive and holds significance because Shields includes a diary very much like it in The Stone Diaries.

        You are right about the need to consider collaborative authorship. Heather MacNeil addressed the need to use nuanced understandings of authorship in her paleontological account of archive creation.


      • Eleanor Shevlin Says:

        I do wonder about the effects of digitization on non-textual materials that are sometimes included in archives. Steve Enniss (Folger Shakespeare Library, soon-to-be director of the Harry Ransom Center) gave a fascinating talk a few years ago to the Washington Area Group for Print Culture Studies. He discussed various types of objects–I believe locks of hair, cigars, and the like were examples he used–that were included along with the texts (not surprisingly, these objects often became subject to fetishization).


  3. Alea Says:

    Sounds like a fascinating conference. I wish I could have been there–but this wrap-up is a start!

    As it happens, the matter of discarding materials after they’ve been digitized (and the related matter of whether digitizing books involves disbinding them or digitizing while intact) is an example of why I’m vested in increasing dialogue across disciplines and professions about digitization. From a material standpoint, each copy potentially offers unique information. From a library and/or archival viewpoint, a digitized copy plus a few physical ones may be sufficient to justify de-accessioning copies or not accessioning additional ones. There is no one right answer (to invoke a librarian mantra).

    Two of the terms in archival literature which I’d like to bring into wider discussion are transparency and provenance. Transparency involves allowing users of archives, databases, and digital and other resources, as much information as practical about their mediation. Provenance has been gaining traction (of sorts) in book history particularly with respect to physical books. I’m not particularly fond of the term primarily because it is usually interpreted in a singular manner as the history of ownership of an item. My personal research is on the ways individuals collecting material for the writing of history (centered on the U.S. before the Civil War) affected what was collected and preserved. Based on that, I think it is important to consider multiple, overlapping histories of sources as items but also as representatives of various types of sources.


  4. Eleanor Shevlin Says:

    Thanks so much, Alea, for your remarks. Transparency is crucial, and the concept seems to inform the roundtable you have organized for SHARP. Are you promoting a more expansive definition of “provenance”? A meaning that would extend far beyond simple ownership to include motivations and methods?


  5. Alea Says:

    I’ve been using “source histories” as an alternative to provenance. Should I get my wish to see the matter receive wider attention & discussion, I’m not sure how much influence I’ll have in determining terminology! Perhaps provenance may be defined more expansively. And yes, I’d like to see meanings going beyond simple ownership of specific items to consider motivations and methods of specific items–and of the specific items as representative or not of broader categories of items.

    All too often, we don’t have much provenance information for a variety of reasons (sometimes it’s not that the info doesn’t exist, but that it exists in library or archives institutional records unconnected to the actual item). Yet, if we consider items as specific sources and as members of a variety of categories — creator(s) (gender, race, status, age, influence . . .), topic, genre, form, owner(s), means of transfer of ownership, and so on — we can develop more complex histories or provenance. I’m working on providing an example of this in one of my works-in-progress.


  6. Eleanor Shevlin Says:

    Your notion of “source histories” seems extremely useful. I can readily see the value in charting the “means of transfer of ownership” for book historians.


  7. Anna Battigelli Says:

    Alea: This issue received attention at the conference from both Catherine Hobbs and Heather MacNeil.

    The term they used was “fonds.” Their perspective was that a given fonds’ history was shaped, not just by the original collector, but also by archivists and, more generally, by cultural forces. Michael O’Driscoll added the perspective that institutions also shape collections.

