Archive for the ‘Digital Humanities’ Category

Book History and Digital Humanities: SHARP at #MLA 14 #s738

January 27, 2014

The recent MLA 2014 conference featured numerous sessions dealing with digital humanities in its various incarnations. More than a few of those sessions dealt with the interrelationships between new and old technologies, including Session 738, a stimulating roundtable sponsored by the Society for the History of Authorship, Reading & Publishing (SHARP) and organized by Lise Jalliant (University of Newcastle). Unfortunately, Lise was not able to attend MLA as planned, so Eleanor Shevlin served as chair in her stead.

Designed to “shed light on the digital future of book history and the bibliographical roots of digital humanities” (MLA special session proposal), the “Book History and Digital Humanities” roundtable featured six projects that attest to the close interrelationships between the two fields. The presentations were delivered in the chronological order of the projects. Not only did these projects illustrate the ways in which the digital and the book historical are tightly intertwined, but they also demonstrated various technological advances and highlighted what a new generation of digital capabilities and thinking is affording scholarship.

Greg Hickman, head of the University of Iowa’s Special Collections and Archives, opened the session by discussing the Atlas of Early Printing, an interactive map that visualizes the spread of printing during the incunabula period. The 2013 version Greg demonstrated marks a technological advance over the map’s Flash-based design launched in 2008 and has been optimized to work on mobile devices as well as desktops.

Atlas of Early Printing

Unlike the two-dimensional print maps from which it draws its inspiration, the Atlas layers information related to the spread of print, such as the locations of paper mills, universities, and trade routes. Users can select any or all of this additional information to contextualize the ways the press and printing took hold throughout Europe in the decades leading up to the sixteenth century.

Interested in using technology for purposes beyond gathering, organizing, and explaining information, Michael Gavin, a professor of English at the University of South Carolina, discussed using computer simulation as a more generative way of working with information. Specifically, Gavin draws on Joshua Epstein’s work in agent-based computational simulation to model early modern print culture and to “grow information” about seventeenth- and eighteenth-century book-trade issues, including censorship and the effects readers exercised on printers and booksellers. Such computer modeling focuses on simulating social behavior in order to generate and test information; if the model is right, then it should not crash.
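A minimal sketch can suggest what agent-based modeling of a book trade involves; the agent types, parameters, and rules below are illustrative assumptions, not details of Gavin's model.

```python
import random

# Illustrative toy agent-based model of a book trade (not Gavin's actual code):
# readers signal demand, a bookseller decides whether to reprint a risky title,
# and a censor occasionally seizes stock. All parameters are invented.

class Reader:
    def __init__(self, appetite_for_controversy):
        self.appetite = appetite_for_controversy

    def buys(self, title_is_controversial):
        # Controversial titles sell only to readers with enough appetite for them.
        return random.random() < (self.appetite if title_is_controversial else 0.5)

class Bookseller:
    def __init__(self):
        self.stock = 10
        self.profit = 0

    def step(self, readers, censor_active):
        sold = sum(1 for r in readers if self.stock > 0 and r.buys(True))
        sold = min(sold, self.stock)
        self.stock -= sold
        self.profit += sold
        if censor_active and random.random() < 0.2:
            self.stock = 0            # stock seized by the censor
            self.profit -= 5          # fine
        elif self.stock == 0 and self.profit > 0:
            self.stock = 10           # reprint if the title has been profitable

def run(years=20, n_readers=50, censorship=True):
    readers = [Reader(random.random()) for _ in range(n_readers)]
    seller = Bookseller()
    for _ in range(years):
        seller.step(readers, censorship)
    return seller.profit

if __name__ == "__main__":
    print("profit with censorship:", run(censorship=True))
    print("profit without censorship:", run(censorship=False))
```

Running the two scenarios side by side is the point of the exercise: the model "grows" the aggregate behavior from simple individual rules, which can then be compared against what the historical record shows.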

The director of NINES and professor of English at the University of Virginia, Andrew Stauffer, made a cogent plea on behalf of imperiled nineteenth-century printed books. Individual copies of nineteenth-century books, often still in the stacks or in the process of being de-accessioned (if not already removed), possess rich, layered histories and the evidence of their multiple temporalities. In an effort to preserve the histories of these works “hidden in plain sight,” and in addition to advocating for the primacy of the printed work as a site embodying distinct, irreplaceable data, Stauffer is developing a crowd-sourcing project that will ask academic institutions, other holding bodies, and individuals to use Instagram and other technologies to capture this heritage digitally and make it accessible.

Matthew Laven, the Associate Program Coordinator of the Mellon-funded “Cross Boundaries: Re-envisioning the Humanities for the 21st Century” at St. Lawrence University, addressed the question “What is a digital bibliography of a book?” through his work on a dynamic, visually enriched publishing history of Willa Cather’s Death Comes for the Archbishop (1927) for the Willa Cather Archive. Acting as a case study for the digital representation of both various material artifacts (e.g., manuscripts, printed translations, unusual editions) and textual variants, the project also seeks to convey the bibliographical ties among the various artifacts and is informed by a Functional Requirements for Bibliographic Records (FRBR)-based ontology.
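The FRBR hierarchy that informs such an ontology, Work, Expression, Manifestation, and Item, can be illustrated with a brief sketch; the classes and the sample record below are simplified placeholders, not the project's actual data model.

```python
from dataclasses import dataclass, field

# Minimal sketch of FRBR's four-level hierarchy (Work -> Expression ->
# Manifestation -> Item). Entities and relationships are simplified; the
# sample data is illustrative, not drawn from the Cather Archive.

@dataclass
class Item:                      # a single physical or digital copy
    identifier: str

@dataclass
class Manifestation:             # a particular published edition
    publisher: str
    year: int
    items: list = field(default_factory=list)

@dataclass
class Expression:                # a realization of the work (a text, a translation)
    language: str
    manifestations: list = field(default_factory=list)

@dataclass
class Work:                      # the abstract intellectual creation
    title: str
    author: str
    expressions: list = field(default_factory=list)

work = Work("Death Comes for the Archbishop", "Willa Cather", [
    Expression("English", [Manifestation("Knopf", 1927, [Item("copy-001")])]),
])

for expr in work.expressions:
    for m in expr.manifestations:
        print(work.title, expr.language, m.publisher, m.year, len(m.items), "copy/copies")
```

The value of the hierarchy for a publishing history is that translations, reissues, and individual surviving copies all attach to the same abstract Work, so bibliographical ties among artifacts can be traversed programmatically.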

Hannah McGregor, a SSHRC postdoctoral fellow at the University of Alberta, spoke about constructing an innovative methodological approach to studying periodicals that she and Paul Hjartarson, professor of English and film studies at the University of Alberta, have been developing in collaboration with the Editing Modernism in Canada research group. A key working hypothesis of this project is that periodicals are ideally situated for digital remediation as relational databases because they themselves resemble databases (that the word “magazine” also meant a storehouse bespeaks this similarity). While middlebrow magazines serve as the project’s focal point, McGregor drew her examples from the Western Home Monthly and Pictorial Review. The issue of labeling—what to call different items, the problem of categories and categorization—has been a vexed point and one no doubt complicated by the multiplicities of genres and the nature of periodical materials (think of the Burney 17th and 18th Century Newspaper Collection). This issue of labeling underscored the ways in which coding is important intellectual labor.
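The database analogy can be made concrete with a small relational sketch of how periodical items and their labels might be stored; the table and field names below are hypothetical, not the schema the project actually uses.

```python
import sqlite3

# Hypothetical relational sketch of a periodicals database: each magazine issue
# contains many items, and each item can carry multiple genre labels. The
# labeling problem discussed above lives in the item_labels table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE issues (id INTEGER PRIMARY KEY, magazine TEXT, year INTEGER, month INTEGER);
CREATE TABLE items  (id INTEGER PRIMARY KEY, issue_id INTEGER REFERENCES issues(id),
                     title TEXT, contributor TEXT);
CREATE TABLE item_labels (item_id INTEGER REFERENCES items(id), label TEXT);
""")
con.execute("INSERT INTO issues VALUES (1, 'Western Home Monthly', 1925, 6)")  # sample row
con.execute("INSERT INTO items VALUES (1, 1, 'Sample serial instalment', 'Anon.')")
con.executemany("INSERT INTO item_labels VALUES (?, ?)",
                [(1, "fiction"), (1, "serial"), (1, "illustrated")])

# One item, several labels: the query shows why rigid categories are hard to impose.
for row in con.execute("""SELECT i.title, group_concat(l.label, ', ')
                          FROM items i JOIN item_labels l ON l.item_id = i.id
                          GROUP BY i.id"""):
    print(row)
```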

The final participant, Elizabeth Wilson-Gordon, professor of English at King’s University College in Alberta, presented the Modernist Archives Publishing Project (MAPP). A collaborative effort involving Canadian, U.K., and U.S. institutions, the project seeks to advance research in the history of modernist presses and publishing. Wilson-Gordon used Virginia Woolf’s Hogarth Press to illustrate the capabilities of MAPP. The Hogarth Press offered an especially rich example not only because of the insights its history affords about Woolf and her work but also because of its importance to interwar publishing and its longevity throughout the twentieth century. Like many of the other projects discussed, MAPP illustrates the importance of collaboration and of communities of scholars working in tandem. The launch of the Hogarth Press open-access portion of MAPP is slated for 2017.

The Book History and Digital Humanities session was one of three excellent panels sponsored by SHARP. SHARP’s liaison to MLA, Greg Barnhisel, has written a full account of the other two, equally invigorating sessions for the spring issue of SHARP News: the official SHARP panel, Session #501, Books and the Law, and Session #398, Virginia Woolf and Book History, co-sponsored with the Virginia Woolf Society.

2013 ODH Project Directors Meeting

September 23, 2013

The NEH has just announced that its 2013 Office of Digital Humanities (ODH) Project Directors Meeting will take place on Friday, October 4, 2013, at NEH Headquarters in Washington, DC.

As in the past, the meeting will feature 3-minute Lightning-Round presentations from ODH grantees. This year thirty-two grant recipients from 2013 will be presenting–almost all of those who received a grant this year. EMOB will be reporting on these presentations in a subsequent Fall post. See an earlier post for reporting on past NEH awards.

In addition to these lightning rounds, Dr. Michael Witmore, Director of the Folger Shakespeare Library, will give one of two keynote addresses. His talk is titled “Adjacencies, Virtuous and Vicious, in the Digital Spaces of Libraries.”
Abstract: This talk will explore how techniques of discovery — scanning shelves, exploring digital texts and catalogues — may change the nature of research conducted in Libraries. The argument: with the advent of massively searchable digital corpora, the uses and advantages of “nearness” in Libraries will change.

Dr. Amanda French, of the Center for History and New Media at George Mason University, will deliver the second keynote, “On Projects, and THATCamp.”
Abstract: Since its start in 2008, THATCamp, The Humanities and Technology Camp, has seen more than 170 events held or planned worldwide and has provided digital training and professional development to more than 6000 people, most of them humanities scholars, students, or professionals. Whether we consider it one project or many, THATCamp has become an essential feature of the digital humanities landscape, and it is time for some perspective on it.

While there is no charge to attend, one must register. For more details and to register to attend, please visit the ODH webpage.

Teaching with ECCO

August 17, 2013

As posted yesterday, Gale Cengage is providing SUNY colleges with trial access to ECCO (Eighteenth Century Collections Online) and NCCO (Nineteenth Century Collections Online) this fall. Gale Cengage is also sponsoring essay contests for SUNY students using these tools. This is a great opportunity to test these products, to think about how best to teach with them, and to evaluate students’ responses to them. So how best to introduce these resources?

Thinking about my undergraduate Gothic Novel class this fall, I decided that short videos would be the most effective way to introduce students unfamiliar with eighteenth-century texts to ECCO. I prepared three brief videos (below). I would love to hear how others introduce students to these tools.

There are a number of other videos on using ECCO. Below are a few from Virginia Tech:

The following essays from The Eighteenth-Century Intelligencer are also helpful. See especially the appendices Eleanor included in her illuminating essay. You may have to scroll through the PDF document to find each individual essay.

For those relatively new to using ECCO in the classroom, the following resources may provide useful background. I will use Gale’s guide as a handout after students have watched the videos.

For those using Burney (which is included in the free trial), our “Preliminary Guide for Using Burney” may be helpful.

Finally, Laura Rosenthal opened a valuable discussion on this topic in 2009 on Long Eighteenth that may interest readers. I’d love to hear updates to that discussion, particularly ideas for effective teaching assignments. What works? What doesn’t?

Trial Access to ECCO and NCCO for SUNY Colleges + Essay Contests

August 16, 2013

The following announcement from Gale Cengage will interest faculty and students at SUNY schools. It’s a great opportunity to explore these resources and students’ responses to them.

We hope to hear about classroom experiences here on emob.
AB

*****

This fall, Gale Cengage Learning is sponsoring an essay contest for SUNY students. Its purpose is to encourage primary source research using advanced databases like Eighteenth Century Collections Online (ECCO) and Nineteenth Century Collections Online (NCCO). We hope this experience with these key resources will help students prepare for a digital future.

We are offering free access to SUNY schools during fall 2013 through our new platform Artemis, which will contain both ECCO and NCCO. We hope you and your students will explore these tools to see how they enrich the learning environment. We also hope you will encourage your students to submit essays that incorporate these resources as part of the contest.

Two undergraduate essay awards ($250 each) and one graduate essay award ($500) will be offered for the best submissions on 18th- and 19th-century history and/or literature.

More information can be found at the link below: http://galesupport.com/suny/

Questions can be forwarded to Theresa DeBenedictis:

Theresa DeBenedictis
Gale, Cengage Learning
Theresa.debenedictis@cengage.com
1-800-877-4253 x 2229
Cell: 732-865-4249

Conference to Launch the Digital Miscellanies Index, a New Resource

August 5, 2013

On 17 September 2013, St. Peter’s College, Oxford, will host a one-day conference, “A Miscellany of Miscellanies: Popular Poetic Collections and the Eighteenth Century Canon,” and an evening performance of eighteenth-century music to launch the Digital Miscellanies Index.

This Leverhulme-funded index was three years in the making. Its publication will make freely available records of 1,000 poetic miscellanies published during the eighteenth century. The Index adds to the projects hosted by the Bodleian’s Centre for the Study of the Book. The Bodleian Library’s Harding Collection, “which houses the most significant but largely neglected group of miscellanies in the world,” contains the majority of the miscellanies, but the project also includes data about copies held at the British Library and the Cambridge Library. The project developers based their work on Professor Michael Suarez, S.J.’s recent bibliography of eighteenth-century poetic miscellanies.

Dr. Abigail Williams (St. Peter’s College Oxford) is the Index’s principal investigator. Some EMOB readers may have heard Dr. Jennifer Batt, DMI’s post-doctoral project coordinator, speak about this exciting project at past American Society for Eighteenth-Century Studies conferences. As the DMI website notes, “In displaying this material for the first time, the Index will enable users to map the changing nature of literary taste in the eighteenth century.”

We look forward to the availability of the Digital Miscellanies Index and to hearing the experiences of EMOB readers using this new resource.

SHARP 2013 Digital Projects and Tools Showcase

July 29, 2013

In mid-July the Society for the History of Authorship, Reading & Publishing (SHARP) met for its twenty-first annual conference, “Geographies of the Book,” in Philadelphia. Hosted by University of Pennsylvania, the conference included a three-hour, stand-alone digital showcase on Saturday, July 20th. Before I turn to the sixteen projects featured in the showcase, a few words about the history of digital sessions at SHARP are in order.

The tradition of showcasing digital projects at SHARP conferences was begun by Dr. Katherine Harris (San Jose State University) for the 2008 conference held in Oxford, England. Currently serving as the E-Resources Review Editor for SHARP News, Dr. Harris continued to organize showcases for subsequent conferences. These highly popular sessions ran concurrently with other sessions. Although the organizers of the 2011 Washington, DC conference had tried to find space for a stand-alone session that would not compete with other panels, space limitations prevented it; a successful digital project session for that conference was nonetheless organized, once again by Kathy Harris. The 2013 Digital Showcase at Penn thus marked the first time that demonstrations of new digital projects and tools at SHARP had a dedicated time slot of their own as well as a setting well-suited to such an exhibition.

With a dedicated three-hour running time, the digital showcase ran from 12:30 to 3:30 pm; it competed for attention with parallel programming only during its final hour. The showcase’s location in the Hall of Flags in Penn’s Houston Hall easily accommodated 16 six-foot tables, each with its own monitor, and afforded room for numerous attendees to navigate the various stations with ease.

Mitch Fraas (UPenn) demonstrates his project.
Photo credit: Alex Franklin (Univ. of Oxford)

Alan Galey (UToronto) demonstrates his project.
Photo credit: Alex Franklin (Univ. of Oxford)

The following is a list of the sixteen projects:

Eight of the sixteen projects deal directly with the early modern period, and at least two–Mark Algee-Hewitt and Tom Mole’s Bibliograph and Tim Stinson’s ARC and Collex–extend beyond the historical confines of the early modern but possess specific relevance to the period. I have counted Alan Galey’s The Borders of the Book: Visualizing Paratexts and Marginalia in Multiple Copies and Editions among the early modern projects because his work relies on texts from this period. Yet his work on digital visualizations of differences in paratextual features and in different readers’ marginalia found in multiple copies of the same books has larger application, too. All of the projects, no matter what the period, embody approaches and strategies afforded by the digital that can help advance work in book history and related fields. The projects are also at various stages–you will notice that some have links and some do not, because the latter are either in very early stages or simply not ready for widespread release. Bibliograph, for instance, is currently a prototype, with a beta version in the works for testing; its launch is planned for 2014 or 2015.

END: Early Novels Database is a collaborative project involving several Philadelphia academic institutions but still in the midst of digitization and construction. In contrast, the Eighteenth-Century English Grammars Database is, in one sense, “complete,” but, as Professor Yáñez-Bouza noted, it is also “an open-end project because one can always add more grammars and some of the fields could be completed with more information had we the resources to look into contemporary book reviews and sales catalogues (e.g. the fields Price and Target Audience).”

Several of the projects have made previous appearances in EMOB posts. A post last June mentioned ARC (Advanced Research Consortium), and it is very good to see the progress since then. The Mellon grant that the Early Modern OCR Project (see the entry for Jacob Heil) received was announced in a post last fall. More recently, EMOB devoted a post to the image-matching software developed at the Bodleian that Alex Franklin presented at SHARP. Finally, the Mapping the Republic of Letters project, which EMOB discussed in a post several years ago, served as the inspiration for Mitch Fraas’s Expanding the Republic of Letters: India and the Circulation of Ideas in the Late Eighteenth Century.

Explore and comment!

Virtual Paul’s Cross Project website is now available for exploration!

May 8, 2013

About a year ago, EMOB devoted a post to several NEH-funded digital projects. John N. Wall, Project Director and Professor of English Literature at NC State University, has let us know that the Virtual Paul’s Cross Project website is now available for exploration at http://vpcp.chass.ncsu.edu. We provide below the press release announcing its availability and invite EMOB readers to explore and comment.

The Virtual Paul’s Cross Project uses visual and acoustic modeling technology to recreate the experience of John Donne’s Paul’s Cross sermon for November 5th, 1622. The goal of this project is to integrate what we know, or can surmise, about the look and sound of this space, destroyed by the Great Fire of London in 1666, and about the course of activities as they unfolded on the occasion of a Paul’s Cross sermon, so that we may experience a major public event of early modern London as it unfolded in real time and in the context of its original surroundings.

The Virtual Paul’s Cross Project has been supported by a Digital Start-Up Grant from the National Endowment for the Humanities.

The Virtual Paul’s Cross Project has sought the highest degree of accuracy in this recreation. To do so, it combines visual imagery from the 16th and 17th centuries with measurements of these buildings made during archaeological surveys of their foundations, still in the ground in today’s London. The visual presentation also integrates into the appearance of the visual model the look of a November day in London, with overcast skies and an atmosphere thick with smoke. The acoustic simulation recreates the acoustic properties of Paul’s Churchyard, incorporating information about the dispersive, absorptive or reflective qualities of the buildings and the spaces between them.

This website allows us to explore the northeast corner of Paul’s Churchyard, outside St Paul’s Cathedral, in London, on November 5th, 1622, and to hear John Donne’s sermon for Gunpowder Day, all two hours of it, in the space of its original delivery and in the context of church bells and the random ambient noises of dogs, birds, horses, and crowds of up to 5,000 people.
There is a Concise Guide to the whole site here.

In keeping with the desire for authenticity, the text of Donne’s sermon was taken from a manuscript prepared within days of the sermon’s original delivery that contains corrections in Donne’s own handwriting. It was recorded by a professional actor using an original pronunciation script and interpreting contemporary accounts of Donne’s preaching style.

For John Donne’s Paul’s Cross sermon for November 5th, 1622 (in 15-minute segments), as heard from 2 different positions in the Churchyard, go here.

On the website, the user can learn how the visual and acoustic models were created and explore the political and social background of Donne’s sermon. In addition to the complete recordings of Donne’s Gunpowder Day sermon, one can also explore the question of audibility of the unamplified human voice in Paul’s Churchyard by sampling excerpts from the sermon as heard from eight different locations across the Churchyard and in the presence of four different sizes of crowd.

For excerpts of the sermon from eight different locations and in the presence of different sizes of crowd go here.

The website also houses an archive of materials that contributed to the recreation, including visual records of the buildings, high resolution files of the manuscript and first printed versions of Donne’s sermon for Gunpowder Day 1622, and contemporary accounts of Donne’s preaching style. In addition, the website includes an acoustic analysis of the Churchyard, discussion of the challenges of interpreting historic depictions of the Cathedral and its environs, and a review of the liturgical context of outdoor preaching in the early modern age.

To see the visual model in detail in a fly-around video, go here. This is especially dramatic if viewed in HD video and at Full Screen display.

This Project is the work of an international team of scholars, engineers, actors, and linguists. In addition to the Project Director, they include David Hill, Associate Professor of Architecture at NC State University; Joshua Stephens, Jordan Grey, Chelsea Sacks, and Craig Johnson, graduate students in architecture at NC State University; John Schofield, Archaeologist at St Paul’s Cathedral and author of St Paul’s Cathedral Before Wren (2011); David Crystal, linguist; Ben Crystal, actor; Ben Markham and Matthew Azevedo, acoustic engineers with Acentech, Inc.; and members of the faculty in linguistics and their graduate students at NC State University, especially professors Walt Wolfram, Erik Thomas, Robin Dodsworth, and Jeff Mielke.

Wall’s team is now planning a second stage of this Project, with the goal of completing the visual model of Paul’s Churchyard, including a complete model of St Paul’s Cathedral as it looked in the early 1620s, during John Donne’s tenure as Dean of the cathedral. This visual model will be the basis for an acoustic model of the cathedral’s interior, especially the Choir, which will be the site for restaging a full day of worship services, including Bible readings, prayers, liturgies from the Book of Common Prayer, sermons, and music composed by the professional musicians on the cathedral’s staff for performance by the cathedral’s organist and its choir of men and boys. They will be competing for our attention, as they did in the 1620s, with the noise of crowds who gathered in the cathedral’s nave, known as Paul’s Walk, to see and be seen and to exchange the latest gossip of the day.

SHARP 2013 Call for Submissions, for digital projects related to book history and bibliography

December 13, 2012

The Organizing Committee for the Philadelphia SHARP Conference 2013 announces a second Call for Submissions, for digital projects related to book history and bibliography. These may include but are not limited to research tools, apps and software, bibliographies or databases, corpora of media or texts, digitization initiatives, remediations, and interactive interfaces.

 We will exhibit up to 20 of these projects in a free-form session in which participants will be able to share their digital and new media work with an audience of nearly 300 conference delegates (faculty, librarians, administrators, independent scholars, graduate students).

 The Showcase will be held between 12 and 3pm on Saturday, July 20, 2013. The conference runs from Thursday, July 18 to Sunday, July 21, 2013.

 We welcome submissions on all aspects of SHARP’s purview: authorship, reading, and publishing. We particularly encourage proposals of new or recent work, as well as proposals directly relevant to the conference theme, “Geographies of the Book.” (To learn more about the 2013 Conference, please visit our website at http://www.library.upenn.edu/exhibits/lectures/SHARP2013/index.html).

The deadline for proposals is Friday, January 25, 2013, at 11:59 p.m. Eastern Standard Time (GMT-5).

 

To submit, please email the SHARP 2013 Program Committee at sharpupenn2013@gmail.com with a brief introduction (up to 400 words) of your project/tool/software. Questions that may be addressed include:

  •  what were the origins of your project; what are its theoretical underpinnings and its goals?
  • what are the historical period and geography/ies covered?
  • what determined its design? what tools and software were used? if your project *is* a tool or software, how does it benefit book historians and/or bibliographers?
  • how did the digital or media component(s) of your project enable, strengthen, or transform the materials and methods under consideration? what new questions were raised?
  • how might this approach or tool be scaled up, appropriated, or reused in other contexts?

 Please be sure to name all participants and institutions involved.

 Participants will be expected to provide their own hardware for demonstrations (PCs/Macs, tablets, drives, sound systems, etc.). The conference’s Local Arrangements Committee will provide logistical assistance (tables, chairs, extension cords, Internet access) but cannot offer tech support.

 Those who have submitted papers to the main conference program may also submit project proposals to the Digital Projects Showcase, but, with consideration for program planning and maximal participation, will only be selected for one or the other.

 One participant for each proposal must be(come) a member of SHARP prior to the conference.

 

Some financial assistance may be available; in the past we have been able to fund between 10% and 15% of all travel grant requests. If you wish to apply for a travel grant, please include a statement of up to 150 words explaining how much funding you are requesting and why.

 

Please contact the SHARP Program Committee with any questions by email at sharpupenn2013@gmail.com or by phone at +1.347.6SHRP13 (+1.347.647.7713).

 We look forward to your submissions and to showcasing our changing digital landscape in Philadelphia next July.

 Sincerely,

David McKnight

Convenor, SHARP 2013 Conference, Philadelphia

 

“Geographies of the Book”

The 21st Annual Conference of the Society for the History of Authorship, Reading, and Publishing (SHARP)

18-21 July, 2013

University of Pennsylvania, Philadelphia, PA http://www.library.upenn.edu/exhibits/lectures/SHARP2013/index.html

Folger Digital Texts: Shakespeare’s Plays, Cutting-Edge Code: A Powerful Research Tool for Scholars

December 6, 2012

The Folger is delighted to announce the launch of Folger Digital Texts. These are reliable, expertly edited, and free digital Shakespeare texts for use by researchers. Starting from the Folger Editions of Shakespeare’s works edited by Barbara Mowat and Paul Werstine, Folger Digital Texts uses XML to create a highly articulate indexing system. Researchers can read the plays online, download PDFs for offline reading, search a play or the whole corpus, and navigate by act, scene, line, or the new Folger Throughline Numbers. In short, every word, space, and piece of punctuation has its own place online. Twelve plays are currently available, and the remainder of the works and poems will be released throughout 2013.

The XML-coded files are offered as a free download for noncommercial use by scholars and can be used as the groundwork for digital Shakespeare research projects, app development, and other projects.
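Because every word has its own place in the markup, simple corpus questions can be answered with a short script once a file has been downloaded. The sketch below is a hedged example: the word-level element name ("w") and the file name are assumptions that should be checked against the actual downloaded XML.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hedged sketch: count the most frequent words in a downloaded Folger Digital
# Texts XML file. The element name "w" and the file name are assumptions;
# consult the markup of the files you download before relying on them.
def word_frequencies(path, top=20):
    tree = ET.parse(path)
    words = [
        (el.text or "").lower()
        for el in tree.iter()
        if el.tag.rsplit("}", 1)[-1] == "w" and el.text   # ignore any XML namespace prefix
    ]
    return Counter(words).most_common(top)

if __name__ == "__main__":
    for word, count in word_frequencies("hamlet.xml"):   # hypothetical local file name
        print(f"{word:>12}  {count}")
```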

The Folger Shakespeare Library editions, published by Simon and Schuster, remain available in print and as ebooks and include essays, glosses, notes, and illustrations from the materials in the Folger collections.

The Folger Digital Texts team includes Rebecca Niles, editor and interface architect, and Michael Poston, editor and encoding architect. They welcome your feedback at folgertexts (at) folger.edu.

If you click here, you will be taken directly to Folger Digital Texts.

T-PEN: A New Tool for Transcription of Digitized Manuscripts

October 22, 2012

One exciting turn of events for scholars has been the growing number of unpublished, handwritten documents now available on the web. Textual scholars no longer have to travel to distant countries to view the manuscripts essential to their research. Instead, they can sit down in front of their laptops and display each successive page. This has moved many sources that were once difficult to access into the “completely accessible” category.

But does that make them usable? Despite the desire to make manuscript collections freely accessible, many digital repositories use “tile-based” viewers to protect against unauthorized copying of the collection. This is completely understandable, but those viewers sometimes place limits on how a digital surrogate can be viewed. They can even make it difficult for scholars to extract what they often want most: a transcription of the manuscript’s content. Moreover, the current practice of transcribing from digitized pages can easily permit mistakes to occur. Transcribers currently move from the image to a word-processing application in another display window (either on the same screen or on a different monitor). That process can easily reproduce the same mistakes that the original scribe could make: haplography (omission of content between similar or identical words; “saut du même au même”), dittography (repetition of letters or syllables), duplication or omission (of letters, words, or lines), often caused by homoearcton and homoeoteleuton (similar beginnings and endings of words), and transpositions. Could it then be possible to make these digital manuscripts both accessible and highly usable?

T-PEN (Transcription for Paleographical and Editorial Notation) seeks to address both the accessibility and usability of digital repositories. Developed by the Center for Digital Theology of Saint Louis University, in collaboration with the Carolingian Canon Law Project of the University of Kentucky, this new digital tool is a sophisticated web-based application that assists scholars in transcribing these manuscripts. To reduce the likelihood of transcription errors, we took advantage of digital technology to place both the transcription and the exemplar in a manner that minimized the visual movement between the two as much as possible. We accomplished this with a simple but novel visualization of the lines of script in the exemplar, which we integrated with interactive transcription spaces. To build the tool, we developed an algorithm for “parsing” the lines of script in an image, and a data model that connected the image delivery of manuscript repositories with the actions of transcribers.
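The underlying idea, anchoring each transcribed line to a region of the page image delivered by the repository, can be sketched roughly as follows; the field names and values are illustrative, not T-PEN's internal data model.

```python
from dataclasses import dataclass

# Rough sketch of the idea described above: each transcribed line is anchored
# to a rectangle of pixel coordinates on the repository-delivered page image.
# Field names are illustrative, not T-PEN's actual representation.

@dataclass
class LineRegion:
    x: int          # left edge: the "anchor" that must align with the written text
    y: int
    width: int
    height: int

@dataclass
class TranscribedLine:
    page_image_url: str
    region: LineRegion
    text: str       # the transcription, possibly with inline XML in the character stream

line = TranscribedLine(
    page_image_url="https://repository.example/ms123/f1r.jpg",  # placeholder URL
    region=LineRegion(x=210, y=340, width=900, height=42),
    text="In principio erat verbum",
)
print(line.region.x, line.text)
```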

But we wanted T-PEN to offer more than just a means to ensure good transcription. We had, in fact,  three goals in mind:

  1. To build a tool useful for any kind of scholar, from the digital Luddite to those obsessed with text encoding;
  2. To provide as many tools as possible to enhance the transcription process;
  3. To help scholars make their transcriptions interoperable so that those transcriptions would never be locked into the world of T-PEN alone.

After two years of design, development, and intensive testing, this tool is now available to the wider public. It was built in the first instance for those working with pre-modern manuscripts, but there is nothing in its design that would prevent early modern scholars from exploiting T-PEN for their purposes. T-PEN is a complex application, and to explain every function would take several posts. Instead, I want to provide a brief overview of how someone can set up a transcription project, how they can use T-PEN to produce high-quality work, and finally how to get transcriptions out of T-PEN and into other applications or contexts.

Choosing your Manuscript

T-PEN is meant to act as a nexus between digital repositories and the scholar. To date, we have negotiated access to over 3,000 European manuscripts, and we are working on further agreements to expand that list. Our aim is to have a minimum of 10,000 pre-modern European manuscripts available for transcription. Even with that number, we will never be able to satisfy all potential users. We therefore enabled private uploads to extend T-PEN’s usability. Many scholars have obtained digital images of a manuscript and have permission to make use of them for research purposes. Private uploads to T-PEN are an extension of that “fair use.” Users zip the JPG images into a single file and then upload them to T-PEN. These types of projects can only add five additional collaborators (see project management, below), and they can never become public projects. Currently T-PEN can support around 300 private projects, and we are expanding our storage capacity for more.
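For reference, bundling a folder of page images into the single zip file that a private upload expects takes only a few lines of scripting; the folder and file names below are placeholders.

```python
import zipfile
from pathlib import Path

# Bundle a folder of JPG page images into one zip file for a private upload.
# "my_manuscript/" and "my_manuscript.zip" are placeholder names.
def zip_page_images(folder="my_manuscript", out="my_manuscript.zip"):
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for jpg in sorted(Path(folder).glob("*.jpg")):   # keep pages in order
            zf.write(jpg, arcname=jpg.name)
    return out

if __name__ == "__main__":
    print("wrote", zip_page_images())
```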

T-PEN's Catalog of Available Manuscripts

Transcribing your Manuscript

Once you select your manuscript you can immediately begin your transcription work. T-PEN does not store any permanent copies of the page images, so each time you request to see a page T-PEN loads the image from the originating repository. If you have never transcribed the page before, T-PEN takes you to the line parsing interface. This adds a little time to the image loading as T-PEN parses the image in real time. When it finishes, you will see a page that looks like this:

T-PEN's Line Parsing Interface

T-PEN attempts to identify the location of each line on the page and then uses alternating colors to display those coordinates. As you can see, we make no claim of absolute perfection. We worked on this algorithm for almost two and a half years, and after extensive testing we have been able to promise, on average, an 85% success rate. A number of factors prevent complete accuracy, so we offer a way for the transcriber to introduce corrections herself. You can add, delete, or re-size columns, and insert or merge lines as well. You can adjust the width of individual lines if they vary in length, and you can even combine a number of lines if you want them grouped together for your transcription. Sometimes manuscripts do not fit neatly into our modern, rectilinear world: many handwritten texts were written at an angle or were so tightly bound that the page could not be photographed flat. T-PEN ultimately doesn’t care: what really matters for connecting a transcription to a set of coordinates on a digital image is that the left side of the line box aligns with the written text. That’s the anchor.

When you are satisfied with the line parsing, you can start transcribing. The transcription interface looks like this:

T-PEN Transcription User Interface

This interface allows you to transcribe line by line, with the current line surrounded by a red box. There are some basic features to note. First, as you transcribe, the previously transcribed line is displayed above, because sentence units are so often split across lines. Transcription input is stored in Unicode, and T-PEN will accept whatever language input the user’s computer is set up to type. If there are special characters in the manuscript, the transcriber can insert them by clicking on the special character buttons (the first ten are hot-keyed to CTRL+1 through 0).

Second, users can encode their transcription as they go. On this aspect, T-PEN is both innovative and provocative. Many scholarly projects that include text encoding adopt a three-step process: the scholar transcribes the text and then hands it to support staff to complete the encoding, which is finally vetted by the scholar. However, there are many times in which semantic encoding of transcriptions has to account for how the text is presented on the page. T-PEN innovatively allows scholars to integrate transcription (with the manuscript wholly in view) and encoding into one step. Often the best encoder is the transcriber herself. That innovation comes with a provocative concept, however. In digital humanities, where TEI is the reigning orthodoxy, T-PEN is at least heterodox if not openly heretical. T-PEN’s data model does not expect, nor require, a transcription to be encoded, much less to utilize TEI as the basis of structured text. Instead, T-PEN treats all XML elements as simply part of the character stream. T-PEN can support transcribers who don’t want to encode at all as well as those who are wholly committed to the world of TEI. For those who want to encode, a schema can be linked to a project to produce a set of XML buttons that can be used in the transcription interface.

Project Management

For those who simply want to start transcribing, project management will not be that important. For those who envisage a more sustained project (and perhaps a collaborative one at that), it will be vital. There are a number of components in managing a T-PEN project, but here I want to highlight two of them.

Collaboration. Like most digital tools, T-PEN allows you to invite collaborators to join your project. All members of a project have to be registered on T-PEN (but registration is free and requires only a full name and an email address). Managing collaboration involves three features, and only a few projects will use all three. The first is adding and deleting project members. Any member of a project can see who else is a member, but only the project leader can add or delete members. A project leader can even have T-PEN send an invitation to a non-T-PEN person inviting them to join (and once they do, they automatically become part of that project).

Collaboration in Project Management

Second, there is a project log to inspect. This log records any activity that changes the content or parameters of the project. This can be particularly helpful when tracking down how a transcription has changed in a shared project (and a user can display the history of each line in the Transcription UI). Finally, projects can make use of T-PEN’s switchboard feature. This is for transcription projects that may be part of a larger project, and where the transcriptions will be aggregated in another digital environment. Switchboard does two things for a project: (1) it allows different projects to share the same XML schema so that all transcriptions will conform to the larger project’s standards; and (2) it will expose the transcription through a web service to permit easy export to the larger project.

Project Options. The two most important options are button management and setting the transcription tools. As seen in the screenshot of the transcription interface, users can use buttons to insert both XML elements and special characters. Those buttons are created and modified as part of the project options. If there is an XML schema for the project, a project leader can link it to the project. Then, in button management, the elements in that schema populate the XML button list. The button populator does not distinguish between metadata elements and elements found in the body of an encoding schema, so users have to modify the button list to cull the elements that won’t be used during transcription. There’s an additional advantage to editing that list: each button can gain a more readable title. This can be helpful if the encoding schema exploits the varying uses of the <seg> or <div> elements in TEI. When the possible deployment of a tag might be unclear to those with less experience with TEI, a more straightforward title can become a better guide to its use.

Special characters allow the user to identify characters in the UTF-8 system which may not be represented on a standard keyboard. These can be created by entering the correct Unicode value for the character. The first 10 characters are mapped to hotkeys CTRL+1 through 0.
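Entering a Unicode value amounts to mapping a code point to the character it names; a tiny illustration follows (the code points for thorn and long s are examples only, not a T-PEN default set).

```python
# Turn Unicode code points into the characters a special-character button inserts.
# U+00FE (thorn) and U+017F (long s) are example values chosen for illustration.
for codepoint in ("00FE", "017F"):
    print(f"U+{codepoint} -> {chr(int(codepoint, 16))}")
```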

Finally, the set of tools available in the transcription interface is set in project options. T-PEN has thirteen built-in tools, most of them included to assist transcribers of pre-modern manuscripts; some will also be helpful to editors of modern texts. If those tools are unhelpful, the user can expand the list: all that is needed is the name of the tool and its URL. Once attached to the project, the tool will be accessible in the transcription interface.

Getting your Transcription out of T-PEN

Digital tools often fall into one of two categories. “Thinking” tools allow users to manipulate and process datasets in order to test a certain idea or to visualize an abstract concept. They can also allow the user to annotate a resource as a way of processing the scholar’s conception of the object’s meaning or the hermeneutical framework it may require. These tools are invaluable, but they do not easily produce results that can be integrated into a print or digital publication. The second type is what I call the production tool. With these applications, the final objective is to produce something that can be integrated into other contexts. T-PEN falls firmly into this second category—although it has its own annotation tool with which a user can record observations about each manuscript page (and it is compliant with the W3C standard, the Open Annotation Collaboration). Scholars normally transcribe for one of three reasons: to create a scholarly edition; to place transcriptions in the footnotes or appendices of a monograph; or to integrate an encoded text into a larger resource.

T-PEN supports four basic export formats: XML/plaintext, where the user can filter out one or more XML tags; PDF; RTF, which is compatible with most word processors; and finally, basic HTML. For the first, if the user has attached a header to the project, that header can be included in the export. There is an important caveat here: T-PEN was not designed to be an XML editor. We do offer a basic well-formedness check (which stops at the first error), but T-PEN does not offer full validation services. Most scholars who encode with T-PEN export their transcriptions to an XML editor for full validation of the file. The last three export formats include some simple transformations for text decoration (italics, bold, etc.). Users can also export the whole transcription or specify a range based on the pagination (or foliation) of the manuscript.
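Outside T-PEN, both of those export-time operations are easy to approximate with a standard library; the sketch below is an approximation for illustration, not T-PEN's own export code.

```python
import re
import xml.etree.ElementTree as ET

# Approximations of two operations described above (not T-PEN's implementation):
# a basic well-formedness check that stops at the first error, and a filter
# that strips chosen XML tags from an exported transcription.

def check_well_formed(xml_string):
    try:
        ET.fromstring(xml_string)
        return "well-formed"
    except ET.ParseError as err:        # reports only the first error found
        return f"error: {err}"

def strip_tags(text, tags):
    for tag in tags:                    # remove e.g. <lb/> or <hi>...</hi> markers
        text = re.sub(rf"</?{tag}\b[^>]*>", "", text)
    return text

sample = '<line>In principio <hi rend="red">erat</hi> verbum<lb/></line>'
print(check_well_formed(sample))
print(strip_tags(sample, ["hi", "lb"]))
```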

T-PEN's Export Options

This post covers only the basics of T-PEN; there are more features available to the user. There is a demonstration video on YouTube where you can walk with one of T-PEN’s research fellows as she begins a transcription project. T-PEN is freely available, thanks to a major investment from the Andrew W. Mellon Foundation and a Level 2 Start-up grant from the National Endowment for the Humanities. So go to t-pen.org and register for an account.

