Digital Humanities and Archives II: ‘Archival Effects’ of Digitization

In an earlier EMOB post, “Digital Humanities and the Archives I: Economics and Sustainability”, we discussed the varied connotations that the term “sustainability” evokes. Yet the concept of “archives” also engenders a multiplicity of meanings, as does the word “database.” In some circles “archive” and “database” are used interchangeably, while for others the terms signal distinctions between the past and the present. As Marlene Manoff has observed,

When scholars outside library and archival science use the word “archive” or when those outside information technology fields use the word “database,” they almost always mean something broader and more ambiguous than experts in these fields using those same words. The disciplinary boundaries within which these terms have been contained are eroding. Scholars use the terms metaphorically, appropriating them from the professional experts. (Manoff, “Archive and Database as Metaphor: Theorizing the Historical Record.” portal: Libraries and the Academy, 10.4 [2010], 385)

The submissions for the “Digital Humanities and the Archives” roundtable at ASECS 2012 attest to the varied meanings scholars ascribe to “archive” as a digital entity. While some proposals viewed commercial textbases such as ECCO or EEBO as archives, others considered non-commercial digital projects (some of which were designed to perform additional roles beyond being a repository) as falling under the “archival” designation. Still others proposed topics that were not tied to specific digital collections or projects. Reflecting this diversity, the selected presentations featured two papers on the nature of searching within digital environments (Randall Cream, West Chester Univ., and Bill Blake, New York Univ.), another on the coding issues encountered in building a performance history database (Mike Gavin, Rice University; University of South Carolina, Fall 2012), a fourth on the potential evidence that can be derived from negative results (Sayre Greenfield, Univ. of Pittsburgh, Greensburg), and the last on a digital archive aimed at facilitating exchange between scholars and those outside the academy (Jessica Richard, Wake Forest Univ.). In his post on the many Digital Humanities sessions at ASECS, Stephen Gregg offers a fine overview of this roundtable, so the following comments supplement his summary. In addition, they serve as a springboard for discussing digitization’s broader “archival effects,” a term coined by Marlene Manoff to “suggest the ways in which digital media bring the past into the present” (386).

Contrasting the old and the new, Randall Cream noted that unlike traditional archives, whose contents are not always fully known, digital archives and databases afford more certainty because their creation involves detailing and defining their various parts, in effect an encyclopedic naming. For Cream, this difference has also meant that searching digital archives lacks the serendipitous discovery that scholars often experience when working in brick-and-mortar archives. He suggested concept-linked searching as a possible means of fostering chance discoveries within digital environments, a suggestion that provided a fitting segue to Bill Blake’s talk on crafting more effective digital searches. Blake argued for thinking beyond topical keyword searches aimed solely at retrieval. Instead, he called for adopting higher-quality, conceptually based searches that will yield better results; such searches will counter the drift and spread that occur when the aim of retrieval replaces the goal of discovery. (Given earlier EMOB discussions of semantic- or meaning-based searches, it should be noted that Blake was referring to the ways users select and fashion search terms and not to the new search platforms that enable semantic or meaning-based searching such as Mimas used in JISC’s Historic Books collection.)

Cream’s and Blake’s remarks point to what could be termed a remediation of research practices as print and digital interact, and both their talks highlighted searching as perhaps one of the most significantly reconfigured practices. And indeed the concept of searching has undergone major reformulations in the digital environment. While accessibility and speed of obtaining results are often seen as digital archives’ main advantages over print, a key benefit of digital collections resides in their enabling users to traverse immense areas of text multi-directionally. Put another way, what seems radically different about searching in the digital world is not merely unprecedented access and speed, but rather the ways one can alter search strategies instantaneously, shifting at a moment’s notice not only the search terms employed but also the temporal and spatial coordinates in which those terms are placed. This capability expands the ways we approach the search as a strategy, opening up new conceptualizations even as we retain the habits and training we acquired working with print. As Wired magazine’s Kevin Kelly has observed: “What search uncovers is not just keywords but also the inherent value of connection…Search opens up creations. …As a song, movie, novel or poem is searched, the potential connections it radiates seep into society in a much deeper way than the simple publication of a duplicated copy ever could” (Kevin Kelly, “Scan this Book!” New York Times, 14 May 2006).

The searching enabled within digital archives reorients our thinking about what constitutes relevant information and exposes the kinds of connectivity that we would likely miss or overlook working with print and manuscript in traditional environments. This reorientation, moreover, possesses its own opportunities for serendipity. While serendipitous discoveries made when working in a traditional archive or even browsing in the stacks typically occur within a bounded space and a pre-selected range of call numbers, digital archives and databases enable virtual movement throughout their holdings to uncover relevant but unforeseen connections not bounded by categories of expectations. In short, capable of serving as far more than text delivery systems and repositories, these digital archives and databases function as “discovery aids.” Fostering a culture of connectivity, these intellectual laboratories of sorts can provide access not only to individual titles but also to a larger, dynamic field of textual and sociocultural activity.

Sayre Greenfield’s paper demonstrated the kind of discoveries that this rethinking of relevant information can yield. Noting that assessing negative findings requires caution, Greenfield explored the ways in which a lack of search results—negative evidence—can translate into meaningful information and concluded that “absences are most useful when measured against positive results found elsewhere, in different genres or different periods.” In offering examples of the different hits obtained from performing the same search in ECCO and Burney, he drew attention to the importance of knowing the scope of a given database and the value of working across databases.

Mike Gavin’s paper also underscored the importance of understanding the operation of digital archives and the rethinking that such understanding can prompt. As Gavin recounted, creating a digital archive of dramatic works that incorporates their performance history has necessitated adapting TEI coding to facilitate searching. While his comments reflect the perspective of those constructing the archive, they also hold significance for users of digital archives. The tagging examples he provided illustrate the significant intellectual labor that goes into the creation of digital databases and archives; encoding a document, after all, is an interpretive practice requiring careful thought and subject expertise. His illustrations are a cogent reminder that archives, whether traditional or digital, are never neutral but are always rooted in the views and principles of their creators. In the case of digital archives or databases, users benefit from being cognizant of their “constructedness.” Having an awareness of a digital archive’s creators, the circumstances surrounding its creation, the quality of its metadata, and the idiosyncrasies of its search engine will almost certainly enhance a user’s search process and, in some cases, even his or her analysis of results. Unfortunately, it is not always possible to uncover such details about digital archives and databases. Moreover, even when there is transparency and one can familiarize oneself with a digital archive’s encoding principles and information architecture, the tagging can still limit what results searches return. On a different note, it seems worth mentioning that the tasks of coding and organizing the contents of a traditional archive will, in turn, often enrich knowledge of its physical material. And this physical material remains important, for the digital and the material are not one and the same.

Unlike the first four papers, which focused on either existing archives or ones nearing completion, Jessica Richard’s paper dealt with the early planning stages of a digital project. The impetus for the project was a desire to foster exchange between eighteenth-century science studies scholars and a non-academic readership; a website seems an ideal medium for the public-humanities thrust of this project. Notwithstanding its differences from the other talks, Richard’s topic very much reflects how the digital is transforming our traditional conceptions of archives. The project’s rethinking of audience, attention to wide access, and desire to translate scholarship for an interested general public all exemplify aspects of this transformation.

As these five talks illustrated, digital media are transforming our theoretical conceptions of “archives”; creating new paradigms and inspiring shifts in existing models as the digital and traditional archival cultures interact; and shaping the kinds of archival projects being undertaken, the methodologies used, and the types of research questions posed. Early in her essay Manoff suggests that “our current moment reflects the convergence of two phenomena–new technical capacities and an age-old impulse to gather and preserve. The ease of capturing digital data is an incitement to archive” (386). In light of the linguistic history of “archive,” connections between new technical capacities and the desire to collect and preserve have perhaps an even longer history. The word “archive” does not appear until after the invention of hand-press printing. While its use as a noun to denote either a historical document that is preserved or the place in which such documents are kept dates from the late 1630s/early 1640s, its verbal form–to archive–does not enter the lexicon until the twentieth century. Whether coincidence or not, this verb does not gain wide currency until the 1980s, a timing that corresponds with the growth in the use of computers and related technologies. In the past two decades the extensive adoption of digital technologies has dramatically spurred efforts to assemble large-scale collections of visual, verbal, and even oral materials and make them virtually available, either freely or commercially.

For Manoff, metaphorical appropriations of “archive” are useful not only for theorizing the ever-increasing growth of these collections but also for theorizing the digital in terms of its archival effects on our conceptions of history and the cultural record (385-6). As Manoff observes at the close of her essay, “archive” especially lends itself to such theorizing because the concept “carries within it both the ideal of preserving collective memory and the reality of its impossibility” (396). The musings about traditional and digital archives presented here touch upon only a few of the archival effects that digital transformations are exercising on our research practices and our broader relationships with history and knowledge. I hope others will add their thoughts about these changes and the explanatory power of “archive” to address our cultural moment.

10 Responses to “Digital Humanities and Archives II: ‘Archival Effects’ of Digitization”

  1. Anna Battigelli Says:

    Thanks, Eleanor, for this great overview of a truly thought-provoking ASECS session. If the concept of an “archive” is being re-thought, so, too, are the methods by which we search that archive.

    I found all of the speakers interesting, and I would love to hear more about their concerns here. I was particularly interested in Bill Blake’s distinction between searching with the goal of retrieval as opposed to searching for discovery. How might we approach and fashion search terms? What limits do we run up against and what limits do we ourselves impose?

  2. Laura Rosenthal Says:

    Thanks for this post; it’s good to have the chance to read about an interesting panel that I wasn’t able to get to. Two issues leap to mind: (1) a while back Laura Mandell, who knows more than anyone about this kind of thing, was describing to me how imperfect these databases are. I can’t remember the exact number, but she was saying that because of the less advanced technology in ECCO, for example, you only get a small fraction of the hits that are out there and that a better technology would find. Since then, I have viewed negative results with skepticism and been reluctant to draw conclusions based on them. (2) It would be interesting to think about the differences between getting lost in a paper archive vs getting lost in a digital one. Both can be vast and overwhelming.

  3. Eleanor Shevlin Says:

    Thanks, Laura. It’s good that you raised the issue about negative results because that point deserves clarification. Sayre, who spoke on negative results at ASECS 2012, is well aware of the problems stemming from poor OCR (Optical Character Recognition) and the resulting unreliable hits returned. In fact, a few years ago Sayre spoke about ways to work around the poor OCR to obtain hits that more accurately reflect a database’s actual contents, as part of a two-part roundtable (the first at EC/ASECS and the second at ASECS) that Anna and I organized. The sessions were devised to address this problem as well as other bibliographic issues. This blog was initially created to prepare for those sessions, as its initial post, Introductions: How to Improve EEBO, ECCO, and Burney Collection Online?, details. Some other relevant prior posts include ones that can be found here and here; there are others too. Laura Mandell briefly mentions the work on improving OCR technology in her EMOB post on 18thConnect, and much work has advanced on that front as well as on developing means for those without access to these commercial databases to obtain texts through coding and correcting work. In her essay “Brave New World: A Look at 18thConnect,” which appears in a special electronic resources forum that Anna and I co-edited for Age of Johnson (vol. 21), Laura Mandell discusses 18thConnect in more depth. Jim Tierney’s contribution to that forum, “The State of Electronic Resources for the Study of Eighteenth-Century British Periodicals: The Role of Scholars, Librarians, and Commercial Vendors,” discusses, among other issues, OCR problems in digital databases devoted to periodicals.

    In the paper for this year’s roundtable, Sayre did stress caution. His examples dealt with phrases from Hamlet and considered when and in what kinds of texts these phrases began to appear with regularity.

    Getting lost in either paper or digital archives can result in both wasted time and serendipitous finds. Yet the experiences nonetheless differ from one another. I do think that the digitization of physical documents can result in revised, more detailed finding aids for the paper-based archives.

  4. Anna Battigelli Says:

    I’ve been thinking about Laura’s distinction between digital and print archives. She asks whether we get lost in different ways in these archives. Do we tend to cluster greater numbers of items in digital archives, and zero in on one or two items in print archives? The time required to call up a text at a rare book library tends to slow us down, as does the limit on the number of items one can view at one time.

    But what is the difference between sorting through a box of many miscellaneous manuscript papers or a commonplace book with little identification in a rbr and the flood of texts we might call up in a digital archive? Are we less patient or less methodical with the digital search? And if so, why? Because we can return to it at will, whereas a visit to a rare book library tends to require careful and focused time management? Or because we try to guess the provenance or the cataloguer’s logic in clustering manuscripts into one box?

  5. Anna Battigelli Says:

    Perhaps more importantly, do we skim for “data” in a digital archive, and read for greater comprehension or perhaps even something like mastery or contextual understanding when we examine a printed text? Are we as patient with being lost in a digital archive as we are in a print archive? Being lost, at least for some time, seems important for any useful discovery.

  6. Eleanor Shevlin Says:

    An interesting thread. “Being lost” could refer to two different experiences. In one sense being lost in an archive, whether paper or digital, could signal an inability to find one’s bearings or make sense of the archive’s organization–or it could signify just a sense of being overwhelmed by the quantity of its contents. At the same time, “being lost” also evokes deep absorption. I have experienced both senses when working in both paper and digital repositories.

    I have also been impatient when working in both environments. When working in archives abroad especially but also in other situations in which time is limited (consider a trial for a digital database), I have often felt panicked about insufficient time. Of course, the ability to take digital photos has sometimes eased my sense of “time-is-running-out”.

    As for the kinds of attention I give to documents, the questions I am asking, the type of material I am consulting, the research project, and other factors all help shape and determine the degree of focus and attention I exercise.

  7. Michael Says:

    I don’t know. As a young boy, I was once lost in a large state park. It was a terrifying experience and in no way comparable to anything I’ve ever felt while working in an archive, digital or otherwise. I’m not sure what we’d learn by ruminating on “being lost” as a metaphor for archival work, but my initial reaction is that it extends to the point of contortion what is already a potentially misleading spatial metaphor: Can a person ever really be “in” an archive? You could walk the stacks, I suppose, or spend a day (or a decade) in a reading room. With a digital archive, you might never leave your kitchen table. Perhaps I’m missing something?

    One thing, though, that I felt myself chewing on after the roundtable was the repeated emphasis on the concept of “searching.” It seems like “searching” is something that we do with digital archives, but “researching” is the usual term for intellectual work. These terms seem to entail a cluster of differing associations and assumptions about the nature of the archive, its objects, and the labor involved when incorporating those objects into histories and arguments.

    It’s odd, I think, because “searching” suggests an act of looking for something that’s already, immutably there, while “researching” suggests an investigation into the internal workings and causes of some phenomenon that must be imagined and, at least in part, constructed. Yet, the digital archive — the nonplace of searching — is where the objects of our knowledge are not objects at all… If you’ll forgive a small play on words.

    Perhaps this connects to Anna’s point that physical objects come to us — either in boxes, bookshelves, or museum spaces — already organized and explicitly catalogued, whereas digital texts are variously tagged but rarely foldered. In the physical archive, you’re usually working with objects that have already been found.

    Surely someone has written (better than I could) on this idea. Eleanor?

  8. Anna Battigelli Says:

    The boxes I look at in rare book libraries and archives often contain items that are neither particularly organized nor fully catalogued, though someone has placed them in that box, and there may be an organizational principle to be detected. But more often a kind of forensics is required to interpret what that box tells us, or, as de Certeau puts it, to let the dead speak.

    Digital archives lack most of the material or bibliographical clues that material artifacts provide. To return to Laura’s distinction, we read them differently because they contain different sets of information. And to return to Eleanor’s original point, what we mean by “archive” changes, depending on, among other things, what information a given archive can deliver.

  9. Eleanor Shevlin Says:

    I have heard many describe feeling lost when first working in digital textbases such as EEBO, ECCO, and Burney. While these students or new users did not experience the terror that Mike felt while being lost in a large state park at a young age, their confusion, frustrations, and loss of bearings were nonetheless quite real. Paper archives can equally produce such feelings. And, of course, the sense of being lost in both cases operates metaphorically. Literal space does not work for either type of archive (reading rooms that house paper archives, for instance, are not large enough for someone to become lost).

    As I noted earlier, digitization projects can result in improved organization of paper archives and better finding aids and catalogue notes. Yet, in cases when digitization is viewed as replacing the material artifacts, the contents of paper archives can become disorganized as the digitization takes place and the objects are not conscientiously returned to their prior order.

    Yes, search does seem to be key, and Mike’s musings on the differences that “search” and “research” evoke are worth noting. While we tend to associate “search” with the digital, the verb’s meaning to “peruse, look through, examine (writings, records) in order to discover whether certain things are contained there,” dates from the late 14th century (“search” v. 4a, OED), but this meaning seems to sustain currency only through the eighteenth century, emerging again in the 1960s in the context of computer searches. “Research,” as in to “investigate or study closely” (“research” v. 1a OED) does not appear until late in the 16th century but then endures through today. (I should note that the dating of word usage in the OED will probably undergo many, many revisions as more and more earlier works are digitized and searched). “Searching” in the sense of seeking something already present is no doubt a product, in part at least, of the verb’s current connections with databases and textbases. (Yet I do wonder if New Historicism’s championing of the archives played a role in popularizing the term “search.”) While it seems that “research” requires more hefty intellectual work, the discussions of “search,” such as those by Bill Blake and Randall Cream at this roundtable as well as those elsewhere by others, indicate a reconsideration of the searching process as requiring intellectual labor in its own right and also influencing the process of research as well as the types of projects pursued.

  10. researching without regular access to ecco? | The Long Eighteenth Says:

    […] between Laura Rosenthal and Eleanor Shevlin on EMOB about the benefits (and perils) of “getting lost in the archives,” whether these were digital or paper-bound; the second bit of inspiration was from some […]
