Archive for the ‘Uncategorized’ Category

Free Access to Orlando during the Month of March

March 22, 2013

In honor of Women’s History Month, Cambridge University Press’s Orlando: Women’s Writings in the British Isles from the Beginnings to the Present is offering free access during March. Orlando “provides entries on authors’ lives and writing careers, contextual material, timelines, sets of internal links, and bibliographies. Interacting with these materials creates a dynamic inquiry from any number of perspectives into centuries of women’s writing.”

To gain access, the login is womenshistory2013, and the password is Orlando.

EEBO Interactions Ends

March 11, 2013

EEBO Interactions, the web site that fused social networking and digital bibliography, is shutting down at the end of March 2013.

ProQuest’s decision to decommission EEBO Interactions should come as no surprise. If traffic indicates success, the site received too little to certify its academic or commercial value. The small core of contributors who worked brilliantly and doggedly to improve bibliographic entries was not enough to prove that value. Why should it be? In a world where crowd-sourcing promises instant and free correction, EEBO Interactions’ small stream of corrections proved too little and too slow.

Nevertheless, the decision to shut down EEBO Interactions is a disappointment because it ends a promising and visionary venture on ProQuest’s part. ProQuest accomplished at least two great things. First, it offered a rare joint venture uniting academic and commercial worlds. Second, it conjured up the first bibliography to offer relational cataloging. If this iteration of that vision did not quite take off, it is to be hoped that later iterations will. Traffic may be one indication of success, but vision is another.

As an editor for EEBO Interactions, I would like to thank EI’s contributors. They are a special group of readers, experts willing to put time into a promising experiment. I have told Stephen Brooks that I would ask emob readers what EEBO Interactions could have done to encourage traffic or otherwise improve. What might a second iteration include or not include? Is an unedited, crowd-sourced version of EEBO that runs parallel to EEBO the way to go for such interactions? Or is an ESTC-led editorial board the way? An option in between these two poles?

One note of caution.  Anyone interested in preserving information recorded on EEBO Interactions should download material before the end of the month.   ProQuest will save material contributed to EI in some form, but it will be difficult  to access.

Those interested in correcting EEBO entries in the future will want to use http://eebo.chadwyck.com/about/webmaster.htm.

English Broadside Ballad Archive (EBBA at UCSB)

February 25, 2013

This is the second of a two-part series on free digital archives featuring English ballads.  It follows Eleanor’s discussion of the JISC-funded Broadside Ballad Initiative at Oxford.

The University of California at Santa Barbara has created a free digital ballad collection called The English Broadside Ballad Archive (EBBA), which provides access to more than 8,000 seventeenth-century ballads. The collection includes ballads from the Pepys Collection, the Roxburghe Collection, the Euing Collection, and the Huntington Library. EBBA is directed by Patricia Fumerton at UCSB. This project was supported by the N.E.H.

Individual entries provide links to  sheet facsimiles, facsimile transcriptions, and often recordings.  These features facilitate introducing students both to ballads’ visual details–ornaments, woodcuts, columned verse–and to their tunes.

Cataloging is full and includes the following:

EBBA ID: An internal identifier. Each individual ballad in the archive has a unique EBBA ID.

Title: A diplomatic transcription of the ballad title as it appears on the ballad sheet. The title consists of all ballad text before the first lines of the ballad, including verse headers but excluding text recorded elsewhere under other catalogue headings (such as the license or author, date, publisher and printer imprints).

Date Published: The year—or, in most cases, range of years—during which EBBA believes the ballad to have been published. See Dates.

Author: The recognized author of the ballad in cases where an indication of authorship has been printed on the ballad or, in the case of Pepys ballads, when Weinstein has identified an author from external sources (e.g., Wing, Rollins).

Standard Tune: The standardized name for the melody (according to Claude M. Simpson or other reliable sources). Clicking the standard tune name will return all ballads with the same melody, including alternate tune titles.

Imprint: A diplomatic transcription of the printing, publishing, and/or location information as it appears on the ballad sheet.

License: A diplomatic transcription of the licensing or permission information as printed on the ballad.

Collection: The name of the collection to which the ballad belongs. In cases where the ballad is not part of a named collection, the name of the holding library plus “miscellaneous” will appear. For example, Huntington Library ballads that are not part of a collection are grouped as “HEH Miscellaneous.”

Sheet/Page: For ballads that are collected as independent sheets, the citation page displays the word “Sheet” and lists the sheet number given to it by its holding institution (usually part of its shelfmark). For ballads bound in a book, the citation page displays the word “Page” and lists the page number within the bound volume.

Location: The name of the holding institution.

Shelfmark: The shelfmark assigned by the holding institution.

ESTC ID: The Citation Number for the English Short Title Catalogue (ESTC). Use this number to find the full ESTC citation for any given ballad at http://estc.bl.uk/.

Keyword Categories: The keywords from EBBA’s standardized keyword list that relate to the ballad’s theme and content.

Notes: Clarify potential areas of confusion for users, such as ballads that have print on both sides of a sheet.

MARC Record: A link to our MARC-XML records

Additional Information: Information specific to each part of the ballad.

Title: Separate titles for multi-part ballads.

Tune Imprint: Tune title(s) as printed.

First Lines: A diplomatic transcription of the first two lines of the ballad text proper, below any heading information included in the title or elsewhere under other catalogue headings.

Refrain: Repeated lines at the end of or within ballad stanzas.

Condition: Description of ballad sheet damage and the current state of the sheet. (This information is from Weinstein and is currently for the Pepys collection only.)

Ornament: A list of decorations made of cast metal that appear on the ballad. Frequently used to fill empty spaces in the forme and/or to delimit parts of the ballad text, these ornaments include vertical rules, horizontal rules, and cast fleurons. (This information is from Weinstein and is currently for the Pepys collection only.)

Ballad scholars working with EEBO or ECCO will be familiar with the difficulty of finding ballads there, which makes the English Broadside Ballad Archive and Bodleian Library Broadside Ballads indispensable.

Together with new printed resources, such as Patricia Fumerton and Anita Guerrini’s Ballads and Broadsides in Britain, 1500-1800 (Ashgate 2010) and Angela McShane’s Political Broadside Ballads of Seventeenth-Century England: A Critical Bibliography (Pickering & Chatto 2011), these digital resources provide a robust and growing archive  for the systematic study of a format whose transiency may have discouraged such studies in the past.

Folger Institute “Early Modern Digital Agendas”

November 29, 2012

The following announcement, from Owen Williams, Assistant Director of the Folger Institute, will be of interest to readers:

In July 2013, the Folger Institute will offer “Early Modern Digital Agendas” under the direction of Jonathan Hope, Professor of Literary Linguistics at the University of Strathclyde. It is an NEH-funded, three-week institute that will explore the robust set of digital tools with period-specific challenges and limitations that scholars of early modern English now have at hand. “Early Modern Digital Agendas” will create a forum in which twenty faculty participants can historicize, theorize, and critically evaluate current and future digital approaches to early modern literary studies—from Early English Books Online-Text Creation Partnership (EEBO-TCP) to advanced corpus linguistics, semantic searching, and visualization theory—with discussion growing out of, and feeding back into, their own projects (current and envisaged). With the guidance of expert visiting faculty, attention will be paid to the ways new technologies are shaping the very nature of early modern research and the means by which scholars interpret texts, teach their students, and present their findings to other scholars.

This institute is supported by an Institutes for Advanced Topics in the Digital Humanities grant from the National Endowment for the Humanities’ Office of Digital Humanities. Please visit http://emdigitalagendas.folger.edu/ for more details.

Owen writes that he will be happy to answer questions pertaining to this interesting new project.

T-PEN: A New Tool for Transcription of Digitized Manuscripts

October 22, 2012

One of the exciting turns of events for scholars has been the growing number of unpublished, hand-written documents now available on the World Wide Web. Textual scholars no longer have to travel to distant countries to view the essential manuscript(s) for their research. Instead, they can now sit themselves down in front of their laptop and display each successive page. This has moved many sources that were once difficult to access into the “completely accessible” category.

But does that make them usable? Despite the desire to make many manuscript collections freely accessible, many digital repositories use “tile-based” viewers in order to prevent unauthorized copying of the collection. This is completely understandable, but those viewers sometimes place limits on how a digital surrogate can be viewed. They can even make it difficult for scholars to extract what they often want most: a transcription of the manuscript’s content. Moreover, the current practice of transcribing from digitized pages can easily permit mistakes to occur. Transcribers currently move from the image to a word processing application in another display window (either on the same screen or on a different monitor). That process can easily reproduce the same mistakes that the original scribe could make: haplography (omission of content between similar or identical words; “saut du même au même”), dittography (repetition of letters or syllables), duplication or omission (of letters, words, or lines), often caused by homoeoarcton and homoeoteleuton (similar beginnings and endings of words), and transpositions. Could it then be possible to make these digital manuscripts both accessible and highly usable?

T-PEN (Transcription for Paleographical and Editorial Notation) seeks to address both the accessibility and usability of digital repositories. Developed by the Center for Digital Theology of Saint Louis University, in collaboration with the Carolingian Canon Law Project of the University of Kentucky, this new digital tool is a sophisticated web-based application that assists scholars in transcribing these manuscripts. To reduce the likelihood of transcription errors, we took advantage of digital technology to place both the transcription and the exemplar in a manner that minimized the visual movement between the two as much as possible. We accomplished this with a simple but novel visualization of the lines of script in the exemplar, which we integrated with interactive transcription spaces. To build the tool, we developed an algorithm for “parsing” the lines of script in an image, and a data model that connected the image delivery of manuscript repositories with the actions of transcribers.

But we wanted T-PEN to offer more than just a means to ensure good transcription. We had, in fact,  three goals in mind:

  1. To build a tool useful for any kind of scholar, from the digital Luddite to those obsessed with text encoding;
  2. To provide as many tools as possible to enhance the transcription process;
  3. To help scholars make their transcriptions interoperable so that those transcriptions would never be locked into the world of T-PEN alone.

After two years of design, development, and intensive testing this tool is now available to the wider public. It was built in the first instance for those working with pre-modern manuscripts, but there is nothing in its design that would prevent early modern scholars from exploiting T-PEN for their purposes. T-PEN is a complex application and to explain every function would take several posts. Instead, I want to provide a brief overview of how someone can set up a transcription project, how they can use T-PEN to produce high-quality work and finally how to get transcriptions out of T-PEN and into other applications or contexts.

Choosing your Manuscript

T-PEN is meant to act as a nexus between digital repositories and the scholar. To date, we have negotiated access to over 3,000 European manuscripts and we are working on further agreements to expand that list. Our aim is to have a minimum of 10,000 pre-modern European manuscripts available for transcription. Even with that number, we will never be able to satisfy all potential users. We therefore enabled private uploads to extend T-PEN’s usability. Many scholars have obtained digital images of a manuscript and they have permission to make use of them for research purposes. Private uploads to T-PEN are an extension of that “fair use.” Users zip the JPG images into a single file and then upload them to T-PEN. These types of projects can only add five additional collaborators (see project management, below), and they can never become public projects. Currently T-PEN can support around 300 private projects, and we are expanding our storage capacity for more.

T-PEN's Catalog of Available Manuscripts

Transcribing your Manuscript

Once you select your manuscript you can immediately begin your transcription work. T-PEN does not store any permanent copies of the page images, so each time you request to see a page T-PEN loads the image from the originating repository. If you have never transcribed the page before, T-PEN takes you to the line parsing interface. This adds a little time to the image loading as T-PEN parses the image in real time. When it finishes, you will see a page that looks like this:

T-PEN's Line Parsing Interface

T-PEN attempts to identify the location of each line on the page and then uses alternating colors to display those coordinates. As you can see, we make no claim of absolute perfection. We worked on this algorithm for almost two and a half years and, after extensive testing, we’ve been able to promise, on average, an 85% success rate. There are a number of factors that prevent complete accuracy and so we offer a way for the transcriber to introduce corrections herself. You can add, delete or re-size columns, and insert or merge lines as well. You can adjust the width of individual lines if they vary in length, and you can combine a number of lines if you want to have them grouped together for your transcription. Sometimes, manuscripts don’t fit well in our modern, rectilinear world: many handwritten texts were written at an angle or were so tightly bound that the page could not be photographed as flat. T-PEN ultimately doesn’t care: what really matters for connecting a transcription to a set of coordinates on a digital image is that the left side of the line box aligns with the written text. That’s the anchor.
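
The corrections described above (merging lines, adjusting widths while keeping the left edge fixed) can be pictured with a minimal sketch. This is not T-PEN’s actual data model: `LineBox`, its fields, `merge_lines`, and `resize_width` are hypothetical names, and the sketch assumes each parsed line is a plain rectangle measured in image pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineBox:
    """Hypothetical rectangle around one line of script, in image pixels."""
    left: int
    top: int
    width: int
    height: int


def merge_lines(a: LineBox, b: LineBox) -> LineBox:
    """Combine two line boxes into the smallest box that covers both."""
    left = min(a.left, b.left)
    top = min(a.top, b.top)
    right = max(a.left + a.width, b.left + b.width)
    bottom = max(a.top + a.height, b.top + b.height)
    return LineBox(left, top, right - left, bottom - top)


def resize_width(box: LineBox, new_width: int) -> LineBox:
    """Change a line's width while keeping its left edge (the anchor) fixed."""
    if new_width <= 0:
        raise ValueError("width must be positive")
    return LineBox(box.left, box.top, new_width, box.height)


# Two adjacent lines (invented coordinates) grouped into one box.
first = LineBox(left=120, top=200, width=900, height=40)
second = LineBox(left=120, top=240, width=850, height=42)
merged = merge_lines(first, second)
```

In this sketch the left edge never moves when a line is resized, which mirrors the point that the left side of the line box is the anchor.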

When you are satisfied with the line parsing, you can start transcribing. The transcription interface looks like this:

T-PEN Transcription User Interface

This interface allows you to transcribe line by line, with the current line surrounded by a red box. There are some basic features to note. First, as you transcribe, the previous line is shown above because sentence units are so often split across lines. Transcription input is stored in Unicode, and T-PEN will accept whatever language set the user has enabled on the computer. If there are special characters in the manuscript, the transcriber can insert them either by clicking on the special character button or by using a hotkey (the first ten are mapped to CTRL+1 through 0).

Second, users can encode their transcription as they go. On this aspect, T-PEN is both innovative and provocative. Many scholarly projects that include text encoding adopt a three-step process: the scholar transcribes the text and then hands it to support staff to complete the encoding, which is finally vetted by the scholar. However, there are many times in which semantic encoding of transcriptions has to include how the text is presented on the page. T-PEN innovatively allows scholars to integrate transcription (with the manuscript wholly in view) and encoding into one step. Often the best encoder is the transcriber herself. That innovation comes with a provocative concept, however. In digital humanities, where TEI is the reigning orthodoxy, T-PEN is at least heterodox if not openly heretical. T-PEN’s data model neither expects nor requires a transcription to be encoded, much less to use TEI as the basis of structured text. Instead, T-PEN treats all XML elements as simply part of the character stream. T-PEN can support transcribers who don’t want to encode at all as well as those who are wholly committed to the world of TEI. For those who want to encode, a schema can be linked to a project to produce a set of XML buttons that can be used in the transcription interface.
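
One way to picture what treating XML elements as part of the character stream can mean in a line-by-line tool is the short sketch below. It is an illustration under stated assumptions, not T-PEN’s implementation: the two sample lines are invented, `is_well_formed` is a hypothetical helper, and `<persName>` stands in for whatever element a project might use. Each line is stored as an ordinary string, so an element may open on one line and close on the next; only the joined text needs to be well formed.

```python
import xml.etree.ElementTree as ET

# Invented sample: one plain string per manuscript line.
# The element opens on the first line and closes on the second.
lines = [
    "In the yeere of our Lord <persName>Iohn",
    "Smith</persName> came to the towne.",
]


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a root element."""
    try:
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False


# Checked line by line, neither string is well-formed XML on its own.
per_line = [is_well_formed(line) for line in lines]

# Checked as one continuous stream, the text parses without error.
whole_text = is_well_formed(" ".join(lines))
```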

Project Management

For those who simply want to start transcribing, project management will not be that important. For those who envisage a more sustained project (and perhaps a collaborative one at that), it will be vital. There are a number of components in managing a T-PEN project, but here I want to highlight two of them.

Collaboration. Like most digital tools, T-PEN allows you to invite collaborators to join your project. All members of a project have to be registered on T-PEN (but that’s free and requires only providing your full name and an email address). Managing collaboration involves three features, though few projects will use all three. The first is adding and deleting project members. Any member of a project can see who else is a member, but only the project leader can add or delete members. A project leader can even have T-PEN send an invitation to someone who is not yet registered and invite them to join (and once they do, they automatically become part of that project).

Collaboration in Project Management

Second, there is a project log to inspect. This log records any activity that changes the content or parameters of the project. This can be particularly helpful when tracking down how a transcription has changed in a shared project (and a user can display the history of each line in the Transcription UI). Finally, projects can make use of T-PEN’s switchboard feature. This is for transcription projects that may be part of a larger project, and where the transcriptions will be aggregated in another digital environment. Switchboard does two things for a project: (1) it allows different projects to share the same XML schema so that all transcriptions will conform to the larger project’s standards; and (2) it will expose the transcription through a web service to permit easy export to the larger project.

Project Options. The two most important options are button management and setting the transcription tools. As seen in the screen shot of the transcription interface, users can use buttons to insert both XML elements and special characters. Those buttons are created and modified as part of the project options. If there is an XML schema for the project, a project leader can link it to the project. Then in button management, the elements in that schema populate the XML button list. The button populator does not distinguish between metadata elements and elements found in the body of an encoding schema. Users then have to modify the button list to cull the elements that won’t be used during transcription. There’s an additional advantage to editing that list: each button can gain a more readable title. This can be helpful if the encoding schema exploits the varying use of the <seg> or the <div> elements in TEI. When the possible deployment of the tag might be unclear to those with less experience with TEI, a more straightforward title can become a better guide to its use.

Special characters allow the user to insert Unicode characters that may not be represented on a standard keyboard. These can be created by entering the correct Unicode value for the character. The first 10 characters are mapped to hotkeys CTRL+1 through 0.
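
For readers unsure what “the correct Unicode value” looks like in practice, here is a minimal sketch of turning a code point into its character. The helper name `char_from_code_point` is hypothetical and the two letters are only examples; this shows the general Unicode lookup, not how T-PEN’s own button form works.

```python
import unicodedata


def char_from_code_point(value: str) -> str:
    """Turn a code point written as 'U+00FE' or '00FE' into its character."""
    cleaned = value.strip().upper()
    if cleaned.startswith("U+"):
        cleaned = cleaned[2:]
    return chr(int(cleaned, 16))


# Two letters common in medieval English manuscripts but absent from most keyboards.
thorn = char_from_code_point("U+00FE")
yogh = char_from_code_point("021D")

# The official Unicode names confirm that the right characters were produced.
thorn_name = unicodedata.name(thorn)
yogh_name = unicodedata.name(yogh)
```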

Finally, the tools available on the transcription interface are set in project options. T-PEN has thirteen built-in tools, and most of them were included to assist transcribers of pre-modern manuscripts. Some will be helpful to editors of modern texts. If those tools are not enough, the user can expand the list: all that is needed is the name of the tool and its URL. Once a tool is attached to the project, the user will be able to access it in the transcription interface.

Getting your Transcription out of T-PEN

Digital tools often fall into one of two categories. “Thinking” tools are ones that allow users to manipulate and process datasets in order to test a certain idea or to visualize an abstract concept. They can also allow the user to annotate a resource as a way of processing the scholar’s conception of the object’s meaning or the hermeneutical framework it may require. These tools are invaluable, but they do not easily produce results that can be integrated into a print or digital publication. The second type is what I call the production tool. With these applications, the final objective is to produce something that can be integrated in other contexts. T-PEN falls firmly into this second category, although it has its own annotation tool with which a user can record observations about each manuscript page (and it is compliant with the W3C standard, the Open Annotation Collaboration). Scholars normally transcribe for one of three reasons: to create a scholarly edition; to place those transcriptions in footnotes or in the appendices of a monograph; or to integrate an encoded text into a larger resource.

T-PEN supports four basic export formats: XML/plaintext, where the user can filter out one or more XML tags; PDF; RTF which is compatible with most word processors; and finally, basic HTML. For the first one, if the user has attached a header to the project, that header can be included in the export. There is an important caveat here:  T-PEN was not designed to be an XML editor. We do offer a basic, well-formedness check (which stops at the first error), but T-PEN does not offer full validation services. Most scholars who encode with T-PEN export their transcriptions to an XML editor for full validation of the file. The last three export formats include some simple transformation for text decoration (italics, bold, etc.). Users can also identify the whole transcription or specify a range based on the pagination (or foliation) of the manuscript.
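
The difference between filtering tags on export and checking well-formedness can be sketched in a few lines. The sketch below is not T-PEN’s export code: `strip_tags`, `first_xml_error`, and the sample line are hypothetical, and it relies on Python’s standard XML parser, which, like the check described above, reports only the first error it meets and performs no schema validation.

```python
import re
import xml.etree.ElementTree as ET
from typing import Iterable, Optional


def strip_tags(text: str, tags: Iterable[str]) -> str:
    """Remove the opening and closing tags named in `tags`, keeping their content."""
    for tag in tags:
        # Matches <tag>, <tag attr="...">, <tag/> and </tag>, but not <tagger>.
        text = re.sub(rf"</?{re.escape(tag)}(\s[^>]*)?/?>", "", text)
    return text


def first_xml_error(text: str) -> Optional[str]:
    """Return the parser's first well-formedness error, or None if there is none."""
    try:
        ET.fromstring(f"<transcription>{text}</transcription>")
    except ET.ParseError as err:
        return str(err)
    return None


# Invented sample line with two encoded features.
sample = 'The <hi rend="italic">kinges</hi> <del>grete</del> letter'

plain = strip_tags(sample, ["hi", "del"])       # plaintext export, both tags filtered
only_del = strip_tags(sample, ["del"])          # keep <hi>, drop <del>
ok = first_xml_error(sample)                    # well formed
broken = first_xml_error("The <hi>kinges letter")  # unclosed <hi>
```

A passing check here means only that tags open and close correctly; whether the elements are allowed by a project’s schema still has to be tested in a full XML editor.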

T-PEN's Export Options

This post only covers the basics of T-PEN. There are more features available to the user. There is a demonstration video on YouTube  where you can walk with one of T-PEN’s research fellows as she begins a transcription project.  T-PEN is freely available, thanks to a major investment from the Andrew W. Mellon Foundation and a Level 2 Start-up grant from the National Endowment for the Humanities. So go to t-pen.org and register for an account.

From Boston to Peru: Reading Books at the Boston Athenaeum and the Peru Free Library

October 9, 2012

How are we to bring order into this multitudinous chaos and so get the deepest and widest pleasure from what we read? 

V. Woolf, “How to Read a Book”

Photo Credit: Megan Manton/Boston Athenaeum

“To enter the building is to feel an overwhelming impulse to read.”  So wrote Sarah Schweitzer about the Boston Athenaeum in a 2009 Boston.Com article.  Indeed, pushing back the building’s red, leather-bound doors, one plunges into the world of reading like a sea-creature slipping into the ocean’s depths.

How is it that a building can transform us from scatter-brained urban land creatures subject to Boston’s many disparate calls into more focused beings equipped to swim through the world of learning?  It may be that the library’s high ceilings and twelve floors expand our sense of possibility, inviting the mind to unbend.  Certainly, the Athenaeum’s quiet aura of uninterrupted work offers a refuge from the jostling noise of the city’s streets.  Fellow readers lost in concentration call us to our task.  Art, sculpture, newspapers, journals, 750,000 books, maps–all await, encouraging inquiry.  The interior’s opulence telegraphs the value of spending time with books, transporting us to a lost age when leisure allowed one to linger over fictions and treatises, sermons and histories, maps and art, with nothing more pressing awaiting than afternoon tea.

But the Athenaeum’s true luxury is something even more precious and more rare than comfort and splendor alone: it offers the order necessary for sustained reading.

We see this order in the carefully designed reading spaces enticing one to that concentrated state of mind so beneficial for reading.  Solid walnut tables provide space for research materials.  Desks tucked between bookshelves beckon. Upholstered chairs placed next to side tables allow readers to sit next to stacks of books and begin the task of browsing.  The reference room displays recent journals side-by-side on long tables (shown below) carefully ordering the chaotic possibilities before us.

Photo Credit: Megan Manton/Boston Athenaeum

In short, the library has been designed for readers by readers to encourage us to leave the tyranny of the present by plunging into the otherworldly and timeless worlds contained in books.  Seated at the Athenaeum, we can take down volumes and, in Woolf’s words, “make them light up the many windows of the past; we can watch the famous dead in their familiar habits and fancy sometimes that we are very close and can surprise their secrets, and sometimes we may pull out a play or a poem that they have written and see whether it reads differently in the presence of the author.”

Photo credit: Megan Manton/Boston Athenaeum

The Boston Athenaeum is a subscription library.  To borrow books and use the upper floors requires a membership fee beyond the reach of many.  But the first floor is open to the public six days a week, and the Athenaeum’s programs, including concerts, are open to the public free of charge.  Its value as a public space is at least threefold: it is a research and membership library; an art museum and public gallery; and a public forum for lectures, readings, concerts, and other events.

Perhaps most of all, the Boston Athenaeum is a valuable icon reminding us of the civic value placed by a community on reading.

Less palatial, but no less essential, are the public spaces created by our public libraries.  Situated by the apple orchards of upstate New York is the Peru Free Public Library (shown below), a lovely 1927 structure that blends the old and the new.  It maintains its early twentieth-century elegance, even as it runs on solar energy.

Photo credit: Theresa Sanderson

Smaller in scale than the Boston Athenaeum (it holds about 14,000 items), it, too, beckons readers with its carefully arranged reading spaces.  A fireside (below) often warms  readers working at the reference room’s long tables during the shortening fall days and throughout the winter.

Photo credit: Theresa Sanderson

Carefully arranged reading spaces offer an opportunity to clear one’s head:

Photo credit: Theresa Sanderson

A children’s reading room is designed to invite young minds to the world of books:

Photo credit: Theresa Sanderson

The Peru Free Library’s many activities bind the community through art shows, pottery shows, book sales, children’s activities, public lectures, and other events.  Like the Boston Athenaeum, the Peru Free Library is carefully and creatively managed.

Public reading spaces like the Boston Athenaeum and the Peru Free Library contribute immeasurably to their communities and to their readers, allowing them to expand their sense of who they are.   By orchestrating spaces designed to slow us down long enough to stop skimming and sink into deep reading, they encourage a more studied approach to thought than is possible away from books.  If we feel as Woolf did, that heaven is “one continuous unexhausted reading,” the Boston Athenaeum and the public libraries that share its commitment to encouraging reading make it a little easier to experience heaven on earth.

New Digital Projects II: Vernacular Aristotelianism and Digitized Archives at the Wellcome Library

October 4, 2012

The following guest post, the second of two parts, is by Andie Silva, Wayne State University

In a previous post, I discussed the Vernacular Aristotelianism database featured during the first week of a two-week workshop at the University of Warwick this past summer.  During that workshop, Chris Hilton, senior archivist at the Wellcome Library, presented that library’s massive restructuring of its archives and plans to digitize their entire collection.

As if that were not already an impressive undertaking, the Wellcome promises that all their material will be available, not only for open access (with a library membership card—free with in-person visit) but also for sharing: users will be free to copy, link, and even embed any digital materials from the Wellcome for any non-commercial purposes.

However, as Hilton demonstrated, digitization is not enough: without the proper coding and re-cataloguing of the material, most users won’t know where to look, or what to look for. What he calls the “white box” syndrome is a constant challenge to digital archivists: how does one translate the intricate, detailed knowledge of the archivist into a blank search box? In a way, that is largely impossible; having heard many tales of “found treasures” from scholars who took the time to get to know a librarian and talk to them about their research, I am not one to underestimate the value of physically visiting an archive.

Of course, not everyone is able to do so, and that is where the digital archive comes in. While digitization cannot fully supplant true archival research, it allows instead for new kinds of research. Take, for instance, the already fully digitized Wellcome Arabic Manuscripts. Thanks to a very generous grant by the JISC’s Islamic Studies Program (and, no doubt, some very hard-working grad students), this digital archive is an online researcher’s dream. Each manuscript has been photographed in full, including the covers, binding, and original coloring and detailing. Because the whole book, and not just a close-up of the pages, has been photographed, the researcher is better able to grasp the sizing, page setting, and general condition of the manuscript. From this broader view, the reader can then zoom in to a specific page and actually read its contents. Granted, this process is a little slow—however, given the quality and viewing options, I can’t see that as a major flaw.

Another fantastic improvement is that the thumbnails of each page appear in a separate frame, allowing the viewer to browse the entire work while inspecting specific pages. This kind of “horizontal browsing” (although in this case the frame is vertical) is something Hilton hopes will be applied to the rest of the Wellcome digital materials. According to Hilton, this extra frame will also contain information about related materials, cross-searching, and external links. I imagine that due to monetary and time constraints the rest of the materials will not be as detailed as the Arabic Manuscripts. Nonetheless, this collection demonstrates the incredible amount of information and details that are possible for those implementing digitizing projects. Thanks to those who catalogued and annotated the Arabic collection, researchers have the option to investigate even material details like binding and physical conditions of a manuscript and never have to pay more than the price of an internet connection.

While digital projects may not (and perhaps should not) replace material archives, they offer new possibilities for research. Scholars interested in statistics, for example, are now better able to quantify and analyze data at the speed of a search engine. One of the workshop participants, for instance, questioned the use of “Publics” in the title “Reading Publics,” arguing that it was not a word contemporary to Renaissance audiences and therefore inaccurate to describe their acts of reading, purchasing, and engaging with books. His claim was backed by a database search for the use of the word “public” in sixteenth- and seventeenth-century texts, which revealed only a few works using the word anywhere in the text. He quickly realized, however, that his initial search had failed to retrieve books that included variant spellings or synonyms. What’s more, his research was limited to English texts, and (more importantly) to texts that had already been transcribed by the Text Creation Partnership for EEBO (another exciting project that is not yet entirely available to the public).

This example makes clear some of the limiting aspects of digital research: we are always, sometimes unawares, conditioned by the parameters of the search box—and, more specifically, by whoever coded the keywords into the database. This example also highlights a new kind of conversation that is made possible by virtue of digital projects.  New endeavors like Vernacular Aristotelianism and the Wellcome Arabic Manuscripts show us that digital archives have the opportunity (perhaps even, I dare say, the responsibility) to rethink literary categories, to open up new angles for research, and to foreground aspects of book production and reception beyond the figure of the author.
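The variant-spelling problem in the “public” search described above can be made concrete with a small sketch. It assumes a handful of invented sample lines and a hypothetical, deliberately incomplete pattern for early modern spellings; it does not reflect how EEBO or any other database actually indexes its texts.

```python
import re

# Hypothetical pattern for a few early modern spellings of "public"
# (publick, publicke, publike, publique, publyke, pvblic ...). Not exhaustive.
PUBLIC_VARIANTS = re.compile(r"\bp[uv]bl[iy](?:cke?|c|ke?|que)s?\b", re.IGNORECASE)

# Invented sample lines for illustration only; not quotations from real texts.
SAMPLE = [
    "for the publique good of the realme",
    "a publick declaration to all men",
    "the publike weale and the private",
    "an act of public thanksgiving",
]


def count_exact(texts, word):
    """Count whole-word matches of one modern spelling."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return sum(len(pattern.findall(text)) for text in texts)


def count_variants(texts):
    """Count matches of any spelling covered by PUBLIC_VARIANTS."""
    return sum(len(PUBLIC_VARIANTS.findall(text)) for text in texts)


if __name__ == "__main__":
    print("exact 'public':", count_exact(SAMPLE, "public"))
    print("with variants: ", count_variants(SAMPLE))
```

On these four sample lines the exact search finds one occurrence and the variant search finds four. Even so, the pattern still knows nothing about synonyms, other languages, or texts that were never transcribed.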

CFP: JEMCS Special Issue on the Early Modern Digital

August 11, 2012
The following call for papers, posted on SHARP-L, may be of interest to readers.  Contact Devoney Looser for additional information (contact information below).
Journal for Early Modern Cultural Studies:  Special Issue on the Early Modern Digital (due 15 Jan 2013)
It is well understood that “the digital turn” has transformed the contemporary cultural, political and economic environment.  Less appreciated perhaps is its crucial importance and transformative potential for those of us who study the past.  Whether through newly—and differently—accessible data and methods (e.g. “distant reading”), new questions being asked of that new data, or recognizing how digital reading changes our access to the materiality of the past, the digital humanities engenders a particularized set of questions and concerns for those of us who study the early modern, broadly defined (mid-15th to mid-19th centuries).

For this special issue of JEMCS, we seek essays that describe the challenges and debates arising from issues in the early modern digital, as well as work that shows through its methods, questions, and conclusions the kinds of scholarship that ought best be done—or perhaps can only be done—in its wake.  We look for contributions that go beyond describing the advantages and shortcomings of (or problems of inequity of access to) EEBO, ECCO, and the ESTC to contemplate how new forms of information produce new ways of thinking.

We invite contributors to consider the broader implications and uses of existing and emerging early modern digital projects, including data mining, data visualization, corpus linguistics, GIS, and/or potential obsolescence, especially in comparison to insights possible through traditional archival research methods. Essays of 3000-8000 words are sought in .doc, .rtf, or .pdf format by January 15, 2013 to jemcsfsu@gmail.com.  All manuscripts must include a 100-200 word abstract. JEMCS adheres to MLA format, and submissions should be prepared accordingly.

In addition, we would welcome brief reports (500-1500 words) that describe digital projects in progress in early modern studies (defined here as spanning from the mid-fifteenth to the mid-nineteenth centuries), whether or not these projects have yet reached completion.  These reports, too, should be submitted in .doc, .rtf, or .pdf format, using MLA style, by 15 January 2013 to jemcsfsu@gmail.com.

Devoney Looser, Catherine Paine Middlebush Chair and Professor of English
Co-Editor, Journal for Early Modern Cultural Studies
Tate Hall 114
Department of English
University of Missouri
Columbia, MO 65211
573-884-7791
FAX: 573-882-5785
looserd@missouri.edu
http://www.devoneylooser.com

Digital Humanities and Archives II: ‘Archival Effects’ of Digitization

April 29, 2012

In an earlier EMOB post, “Digital Humanities and the Archives I: Economics and Sustainability”, we discussed the varied connotations that the term “sustainability” evokes. Yet the concept of “archives” also engenders a multiplicity of meanings as does the word “database.” In some circles “archive” and “database” are used interchangeably, while for others the terms signal distinctions between the past and the present. As Marlene Manoff has observed,

When scholars outside library and archival science use the word “archive” or when those outside information technology fields use the word “database,” they almost always mean something broader and more ambiguous than experts in these fields using those same words. The disciplinary boundaries within which these terms have been contained are eroding. Scholars use the terms metaphorically, appropriating them from the professional experts. (Manoff, “Archive and Database as Metaphor: Theorizing the Historical Record.” portal: Libraries and the Academy, 10.4 [2010], 385)

The submissions for the “Digital Humanities and the Archives” roundtable at ASECS 2012 attest to the varied meanings scholars ascribe to “archive” as a digital entity. While some proposals viewed commercial textbases such as ECCO or EEBO as archives, others considered non-commercial digital projects (some of which were designed to perform additional roles beyond being a repository) as falling under the “archival” designation. Still others proposed topics that were not tied to specific digital collections or projects. Reflecting this diversity, the selected presentations featured two papers on the nature of searching within digital environments (Randall Cream, West Chester Univ., and Bill Blake, New York Univ.), another on the coding issues encountered in building a performance history database (Mike Gavin, Rice University; University of South Carolina, Fall 2012), a fourth on the potential evidence that can be derived from negative results (Sayre Greenfield, Univ. of Pittsburgh, Greensburg), and the last on a digital archive aimed at facilitating exchange between scholars and those outside the academy (Jessica Richard, Wake Forest Univ.). In his post on the many Digital Humanities sessions at ASECS, Stephen Gregg offers a fine overview of this roundtable, so the following comments supplement his summary. In addition, they serve as a springboard for discussing digitization’s broader “archival effects,” a term coined by Marlene Manoff to “suggest the ways in which digital media bring the past into the present” (386).

Contrasting the old and the new, Randall Cream noted that unlike traditional archives whose contents are not always fully known, digital archives and databases afford more certainty because their creation involves detailed and defining–an encyclopedic naming of their various parts. For Cream, this difference has also meant that searching the digital archives lacks the serendipitous discovery that scholars often experience when working in brick-and-mortar archives. He suggested concept-linked searching as a possible means of fostering chance discoveries within digital environments, a suggestion that provided a fitting segue to Bill Blake’s talk on crafting more effective digital searches. Blake argued for thinking beyond topical keyword searches aimed solely at retrieval. Instead, he called for adopting more quality, conceptually-based searches that will yield better results; such searches will counter the drift and spread that occur when the aim of retrieval replaces the goal of discovery. (Given earlier EMOB discussions of semantic- or meaning-based searches, it should be noted that Blake was referring to the ways users select and fashion search terms and not to the new search platforms that enable semantic or meaning-based searching such as Mimas used in JISC’s Historic Books collection.)

Cream’s and Blake’s remarks point to what could be termed a remediation of research practices as print and digital interact, and both their talks highlighted searching as perhaps one of the most significant reconfigured practices. And indeed the concept of searching has undergone major reformulations in the digital environment. While accessibility and quickness of obtaining results are often seen as digital archives’ main advantage over print, a key benefit of digital collections resides in their enabling users to traverse immense areas of texts multi-directionally. Put another way, what seems radically different about searching in the digital world is not merely unprecedented access and speed, but rather the ways one can alter search strategies instantaneously, shifting not only the search terms employed at a moment’s notice but also the temporal and spatial coordinates in which those terms are placed. This capability expands the ways we are approaching the search as a strategy, opening up new conceptualizations even as we retain the habits and training we acquired working with print. As Wired magazine’s Kevin Kelly has observed: “What search uncovers is not just keywords but also the inherent value of connection…Search opens up creations. …As a song, movie, novel or poem is searched, the potential connections it radiates seep into society in a much deeper way than the simple publication of a duplicated copy ever could” (Kevin Kelly, “Scan this Book!” New York Times, 14 May 2006).

The searching enabled within digital archives reorients our thinking about what constitutes relevant information and exposes the kinds of connectivity that we would likely miss or overlook working with print and manuscript in traditional environments. This reorientation, moreover, possesses its own opportunities for serendipity. While serendipitous discoveries made when working in a traditional archive or even browsing in the stacks typically occur within a bounded space and a pre-selected range of call numbers, digital archives and databases enable virtual movement throughout their holdings to uncover relevant but unforeseen connections not bounded by categories of expectations. In short, capable of serving as far more than text delivery systems and repositories, these digital archives and databases function as “discovery aids.” Fostering a culture of connectivity, these intellectual laboratories of sorts can provide access not only to individual titles but also to a larger, dynamic field of textual and sociocultural activity.

Sayre Greenfield’s paper demonstrated the kind of discoveries that this rethinking of relevant information can yield. Noting that assessing negative findings requires caution, Greenfield explored the ways in which a lack of search results—negative evidence—can translate into meaningful information and concluded that “absences are most useful when measured against positive results found elsewhere, in different genres or different periods.” In offering examples of the different hits obtained from performing the same search in ECCO and Burney, he drew attention to the importance of knowing the scope of a given database and the value of working across databases.
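Greenfield’s point that an absence means most when set against positive results elsewhere can be illustrated with a rough sketch. The collection labels and counts below are hypothetical placeholders, not figures from ECCO, Burney, or his paper. The idea is simply to turn raw hit counts into rates relative to how many documents were actually searchable, so that a zero in a small collection is not read the same way as a zero in a large one.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    collection: str  # hypothetical label, not a real database
    hits: int        # documents matching the search
    searched: int    # documents actually searchable in that collection


def rate_per_thousand(result):
    """Hits per 1,000 searchable documents."""
    if result.searched == 0:
        raise ValueError("cannot compute a rate for an empty collection")
    return 1000 * result.hits / result.searched


def compare(first, second):
    """Return each collection's hit rate, rounded for side-by-side reading."""
    return {r.collection: round(rate_per_thousand(r), 2) for r in (first, second)}


if __name__ == "__main__":
    # Invented numbers for illustration only.
    a = SearchResult("collection_a", hits=0, searched=20000)
    b = SearchResult("collection_b", hits=150, searched=50000)
    print(compare(a, b))
```

A comparison like this says nothing about why a term is absent. It only makes the scope of each collection explicit before an absence is interpreted.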

Mike Gavin’s paper also underscored the importance of understanding the operation of digital archives and the rethinking that such understanding can prompt. As Gavin recounted, creating a digital archive of dramatic works that incorporates their performance history has necessitated adapting TEI coding to facilitate searching. While his comments reflect the perspective of those constructing the archive, they also hold significance for users of digital archives. The tagging examples he provided illustrate the significant intellectual labor that goes into the creation of digital databases and archives; encoding a document, after all, is an interpretive practice requiring careful thought and subject expertise. His illustrations are a cogent reminder that the archives–whether traditional or digital–are never neutral but always are rooted in the views and principles of their creators. In the case of digital archives or databases, users benefit from being cognizant of their “constructedness.” Having an awareness of a digital archive’s creators, the circumstances surrounding its creation, the quality of its metadata, and the idiosyncrasies of its search engine will almost certainly enhance a user’s search process and, in some cases, even his or her analysis of results. Unfortunately, it is not always possible to uncover such details about digital archives and databases. Moreover, even when there is transparency and one can familiarize oneself with a digital archive’s encoding principles and information architecture, the tagging can still limit what results searches return. On a different note, it seems worth mentioning that the tasks of coding and organizing the contents of a traditional archive will, in turn, often enrich knowledge of its physical material. And this physical material remains important, for the digital and the material are not one and the same.

Unlike the first four papers that focused on either existing archives or ones nearing completion, Jessica Richard’s paper dealt with the early planning stages of a digital project. The impetus for the project was a desire to foster exchange between eighteenth-century science studies scholars and a non-academic readership; a web-based site seems an ideal medium for the public-humanities thrust of this project. Notwithstanding its differences from the other talks, Richard’s topic very much reflects how the digital is transforming our traditional conceptions of archives. The project’s rethinking of audience, attention to wide access, and desire to translate scholarship for an interested general public all exemplify aspects of this transformation.

As these five talks illustrated, digital media are transforming our theoretical conceptions of “archives”; creating new paradigms and inspiring shifts in existing models as the digital and traditional archival cultures interact; and shaping the kinds of archival projects being undertaken, the methodologies used, and the types of research questions posed. Early in her essay Manoff suggests that “our current moment reflects the convergence of two phenomena–new technical capacities and an age-old impulse to gather and preserve. The ease of capturing digital data is an incitement to archive” (386). In light of the linguistic history of “archive,” connections between new technical capacities and the desire to collect and preserve have perhaps an even longer history. The word “archive” does not appear until after the invention of hand-press printing. While its use as a noun to denote either a historical document that is preserved or the place in which such documents are kept dates from the late 1630s/early 1640s, its verbal form–to archive–does not enter the lexicon until the twentieth century. Whether coincidence or not, this verb does not gain wide currency until the 1980s, a timing that corresponds with the growth in the use of computers and related technologies. In the past two decades the extensive adoption of digital technologies has dramatically spurred efforts to assemble large-scale collections of visual, verbal, and even oral materials and make them virtually available, either freely or commercially.

For Manoff, metaphorical appropriations of “archive” are useful not only for theorizing the ever-increasing growth of these collections but also for theorizing the digital in terms of its archival effects on our conceptions of history and the cultural record (385-6). As Manoff observes at the close of her essay, “archive” especially lends itself to such theorizing because the concept “carries within it both the ideal of preserving collective memory and the reality of its impossibility” (396). The musings about traditional and digital archives presented here touch upon only a few of the archival effects that digital transformations are exercising on our research practices and broader relationships with history and knowledge. I hope others will add their thoughts about these changes and the explanatory power of “archive” to address our cultural moment.

Digital Public Library of America (DPLA) to open April 2013

March 30, 2012

By April 2013, the Digital Public Library of America should be up and running.  With this announcement, Robert Darnton opened a recent talk about DPLA sponsored by Harvard Library Strategic Conversations.

Darnton reviewed DPLA’s brief history, including its origin at a meeting at Harvard’s Radcliffe Institute on 1 October 2010, its successful coalition of foundations committed to providing financial support, its appointment of a steering committee, and its selection of John Palfrey as the steering committee’s chair.  Six “workstreams” have been designed to arrive at consensus-driven plans in the following areas:

To join a workstream listserv, consult the appropriate web page.

Darnton insisted that DPLA was not simply a response to Google, though DPLA is open to working with Google and has extended invitations to that effect.  He provided an incisive history of Google Book Search’s legal troubles, and noted that DPLA has much to learn from that history.

Next, John Palfrey (chair of the DPLA steering committee, and author of Born Digital: Understanding the First Generation of Digital Natives) outlined some of DPLA’s goals, though he conceded that the exact nature of the DPLA was still to be determined:

  • constructing a creative and technologically sophisticated learning environment beyond that created by e-books.  This involves imaginative work by architects, programmers, catalogers, users, and just about anyone else prepared to think innovatively.
  • considering the following elements that will shape the still indistinct and ever-evolving nature of DPLA:
    • code will be free and open source
    • metadata will aggregate existing data and create additional data.  DPLA has already arrived at an agreement to network with Europeana, Europe’s digitized knowledge-sharing platform.
    • content will include all media types
    • tools and services will facilitate public innovation.  Palfrey provided as an example the use of a “scanabego,” a truck with scanning tools that would be driven across the country to local historical societies, offering to digitize their records in exchange for linking those records to DPLA.
    • DPLA’s community will be widespread and participatory.  According to the DPLA web site, “DPLA will actively support the community of users and developers that want to reuse and extend its content, data, and metadata.”

In the discussion that followed the presentation, one of the most interesting comments was Charles Nesson’s description of a Digital Registry Project to address the copyright issues that plagued Google Books.  The Registry seeks pro bono commitments from major law firms “to defend the copyright status determinations of major cultural institutions such as libraries and universities” (see the memo available on Charles Nesson’s web site).  According to the DPLA web site,

The objective of the Digital Registry Project is to create a comprehensive registry to undergird digital exploitation of intellectual property—for personal, educational, or commercial use. This vision encompasses all copyrighted works, all orphan works, and all works in the public domain. The Digital Registry seeks to kick start the registry process by beginning with those works that belong to no one and therefore belong to everyone: the public domain. This registry is intended to be a simple and unassailable starting point for all larger registries.

More information is available at the extensive and carefully designed DPLA web site and the DPLA blog, which is guaranteed to interest emob readers.

