Archive for the ‘TCP’ Category

March 9th Bodleian Libraries Hosts EEBO-TCP Hackfest

February 23, 2015
Readers may be interested in the following announcement of an upcoming hackfest:

The Bodleian Libraries are hosting a one-day hackfest on 9 March to celebrate the release of 25,000 texts from the Early English Books Online project into the public domain. The event encourages students, researchers from all disciplines, and members of the public with an interest in the intersection between technology, history and literature to work together to develop a project using the texts and the data they may generate.

The EEBO-TCP corpus covers the period from 1473 to 1700 and is now estimated to comprise more than two million pages and nearly a billion words. It represents a history of the printed word in England from the birth of the printing press to the reign of William and Mary, and it contains texts of incomparable significance for research across all academic disciplines, including literature, history, philosophy, linguistics, theology, music, fine arts, education, mathematics, and science.

Prizes will be given to the best of the day’s projects.

Participants in the day’s event are encouraged to consider entering their ideas into the online Early English Books Ideas Hack (http://www.bodleian.ox.ac.uk/get-involved/competitions-and-projects), which seeks to explore innovative and creative approaches to the data and identify potential paths for future activity. Submissions for the Ideas Hack close on 2 April.

Early English Books Online Text Creation Partnership: User Survey

October 8, 2012

Posted on behalf of the EEBO-TCP project

Please help the Early English Books Online Text Creation Partnership
plan for the future by filling in our user survey, and be entered into
a prize draw to win one of ten £50 Amazon vouchers!

http://bit.ly/EEBO-TCPSurvey

The survey is part of a JISC-funded project, SECT: Sustaining the
EEBO-TCP Corpus in Transition, which is investigating the impact and
sustainability of the EEBO-TCP collection. For more details on the
project, go to http://www.bodleian.ox.ac.uk/eebotcp/SECT

The Devonshire Manuscript: A Digital Social Edition

February 29, 2012

Readers are invited to participate in a promising and methodically thought-through experiment in social editing.

The University of Victoria’s Electronic Textual Cultures Lab’s Devonshire MS Editorial Group invites contributions to a new project involving collaborative knowledge curation. The project aims at attributing contributions and ensuring scholarly authority.

Guided by Ray Siemens, the ETCL’s editorial group is producing a collaborative electronic wikibooks edition of the Devonshire manuscript, which contains 185 items from the 1530s and 1540s, including complete poems, transcriptions, verse fragments, excerpts, anagrams, and notes by many authors and transcribers.

Because 125 of the poems are attributed to Sir Thomas Wyatt and have been transcribed and published in print, the miscellany was long considered exclusively as a source for his work. Arthur Marotti notes, however, that this author-centered view of the miscellany obscures its value as a document “illustrating some of the uses of lyric verse within an actual social environment” (Marotti, Manuscript, Print, and Renaissance Lyric, 1995). In addition to Wyatt, other contemporaries contributing to the manuscript include Henry Howard, earl of Surrey, Lady Margaret Douglas, Richard Hatfield, Mary Fitzroy (née Howard), Lord Thomas Howard, Sir Edmund Knyvet, Sir Anthony Lee, Henry Stewart, Lord Darnley, Mary Shelton, and perhaps Anne Boleyn.

The Devonshire manuscript wikibooks site states that the purpose of the edition is to

preserve the socially mediated textual and extra-textual elements of the manuscript that have been elided in previous transcriptions.  These “paratexts” make significant contributions to the meaning and appreciation of the manuscript miscellany and its constituent parts: annotations, glosses, names, ciphers, and various jottings; the telling proximity of one work and another; significant gatherings of materials; illustrations entered into the manuscript alongside the text; and so forth.  To accomplish these goals, the present edition has been prepared as a diplomatic transcription of the Devonshire Manuscript with extensive scholarly apparatus.

The miscellany illustrates the social use of verse and provides what Colin Burrow calls “the richest surviving record of early Tudor poetry and of the literary activities of 16th-century women.”

Currently, a PDF version of the edition is under review at the University of Toronto’s Iter Gateway.  In July, the PDF and Wikibooks versions will be compared and a final edition will be published by Medieval and Renaissance Texts and Studies.

Readers are invited to participate in the editing of this interesting and complex manuscript.  Some immediate questions include the following:

  • How should blank spaces–often tellingly omitting one name to suggest another–be presented?
  • How can the manuscript’s structure be maintained, while allowing for efficient navigation?  For example, use of “forward” and “backward” buttons might misrepresent the complex spatial relationship among the poems, which frequently appear side-by-side in the manuscript.
  • What is the best way to ensure credit for Wikibooks editors?

Access to the digital facsimile is available to subscribers of Adam Matthew. The link can be found at the bottom of the Devonshire Manuscript’s wikibooks page.

Text Creation Partnership makes 18th century texts freely available to the public

April 25, 2011

This announcement is making the rounds of listservs and the like, and it should be of interest to emob readers:

(Ann Arbor, MI—April 25, 2011) — The University of Michigan Library announced the opening to the public of 2,229 searchable keyed-text editions of books from Eighteenth Century Collections Online (ECCO). ECCO is an important research database that includes every significant English-language and foreign-language title printed in the United Kingdom during the 18th century, along with thousands of important works from the Americas. ECCO contains more than 32 million pages of text and over 205,000 individual volumes, all fully searchable. ECCO is published by Gale, part of Cengage Learning.

The Text Creation Partnership (TCP) produced the 2,229 keyed texts in collaboration with Gale, which provided page images for keying and is permitting the release of the keyed texts in support of the Library’s commitment to the creation of open access cultural heritage archives. Gale has been a generous partner, according to Maria Bonn, Associate University Librarian for Publishing. “Gale’s support for the TCP’s ECCO project will enhance the research experience for 18th century scholars and students around the world.”

Laura Mandell, Professor of English and Digital Humanities at Miami University of Ohio, says, “The 2,229 ECCO texts that have been typed by the Text Creation Partnership, from Pope’s Essay on Man to a ‘Discourse addressed to an Infidel Mathematician,’ are gems.”

Mandell, a key collaborator on 18thConnect, an online resource initiative in 18th century studies, says that the TCP is “a groundbreaking partnership that is creating the highest quality 18th century scholarship in digital form.”

This announcement marks another milestone in the work of the TCP, a partnership between the University of Michigan and Oxford University, which since 1999 has collaborated with scholars, commercial publishers, and university libraries to produce scholar-ready (that is, TEI-compliant, SGML/XML enhanced) text editions of works from digital image collections, including ECCO, Early English Books Online (EEBO) from ProQuest, and Evans Early American Imprint from Readex.

The TCP has also just published 4,180 texts from the second phase of its EEBO project, having already converted 25,355 books in its first phase, leaving 39,000 yet to be keyed and encoded. According to Ari Friedlander, TCP Outreach Coordinator, the EEBO-TCP project is much larger than ECCO-TCP because pre-1700 works are more difficult to capture with optical character recognition (OCR) than ECCO’s 18th-century texts, and therefore depend entirely on the TCP’s manual conversion for the creation of fully searchable editions.

Friedlander explains that, for a limited period, the EEBO-TCP digital editions are available only to subscribers—ten years from their initial release—as per TCP’s agreement with the publisher. Eventually all TCP-created titles will be freely available to scholars, researchers, and readers everywhere under the Creative Commons Public Domain Mark (PDM).

Paul Courant, University Librarian and Dean of Libraries, says that large projects such as those undertaken by the TCP are only possible when the full range of library, scholarly, and publishing resources are brought together. “The TCP illustrates the dynamic role played by today’s academic research library in encouraging library collaboration, forging public/private partnerships, and ensuring open access to our shared cultural and scholarly record.”

More than 125 libraries participate in the TCP, as does the Joint Information Systems Committee (JISC), which represents many British libraries and educational institutions.

To learn more about the Text Creation Partnership, visit http://www.lib.umich.edu/tcp. To learn more about ECCO, visit http://gdc.gale.com/products/eighteenth-century-collections-online/

Collaborative Reading: Elizabeth Scott-Baumann and Ben Burton’s “Encoding form: A proposed database of poetic form”

March 8, 2010

Elizabeth Scott-Baumann and Ben Burton’s recent paper, “Encoding form: A proposed database of poetic form”, written for the recent E-Conference (February-March 2010) of APPOSITIONS: Studies in Renaissance / Early Modern Literature and Culture, is suggestive of how new digital resources can be developed to augment the capabilities of existing tools such as EEBO and ECCO. Responding many years later to Heather Dubrow’s 1979 call for “new methodology in early modern studies,” Scott-Baumann and Burton are constructing a database devoted to poetic form. Their project will afford a means of studying poetic form, historically and formally, by enabling queries about form and generic transformations that resemble those we can now pose about words, thanks to electronic databases such as EEBO and ECCO:

  • What is the origin (or origins) of a given form?
  • How does its structure, use, and meaning change over time?
  • Are there variations in use and meaning in different regions, or among different groups?
  • How does a given form relate to others, and how does this relationship change over time?
Concentrating on sixteenth- and seventeenth-century poetry, Scott-Baumann and Burton will use existing EEBO-TCP texts and enhance them with additional mark-up that builds upon Text Encoding Initiative (TEI) tags. As those familiar with TEI documentation will recall, its tags include ones designed for encoding verse: “stanza divisions, caesurae, enjambment, rhyme scheme, and metrical information, as well as a special purpose rhyme element to support the simple analysis of rhyming words.” Because encoding can go beyond marking general formal conventions and can also represent interpretive judgments, Scott-Baumann and Burton will experiment with both possibilities. The inevitably time-consuming nature of their task will probably result in building the database in stages.
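For readers who have not worked with verse mark-up, a minimal Python sketch may help show what a rhyme element makes possible. It is an illustration under stated assumptions, not the authors' encoding scheme: the stanza and its words are invented, the element and attribute names (lg, l, rhyme, label) follow the TEI verse vocabulary, and the TEI namespace that real files declare is omitted to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style fragment: the stanza and its rhyme labels are
# invented for illustration and do not come from an EEBO-TCP file.
SAMPLE = """
<lg type="stanza" rhyme="abab">
  <l>The long day <rhyme label="a">wanes</rhyme></l>
  <l>and shadows <rhyme label="b">fall</rhyme></l>
  <l>across the <rhyme label="a">lanes</rhyme></l>
  <l>beside the <rhyme label="b">wall</rhyme></l>
</lg>
"""


def rhyme_scheme(stanza):
    """Return the rhyme labels of a stanza's lines, in order, as one string."""
    labels = []
    for line in stanza.findall("l"):
        rhyme = line.find("rhyme")
        # A line without a marked rhyme word is recorded as "x".
        labels.append(rhyme.get("label", "x") if rhyme is not None else "x")
    return "".join(labels)


def rhyme_words(stanza):
    """Group the marked rhyme words by their label."""
    groups = {}
    for rhyme in stanza.iter("rhyme"):
        groups.setdefault(rhyme.get("label"), []).append(rhyme.text)
    return groups


stanza = ET.fromstring(SAMPLE.strip())
print(rhyme_scheme(stanza))  # abab
print(rhyme_words(stanza))   # {'a': ['wanes', 'lanes'], 'b': ['fall', 'wall']}
```

Once rhyme words are tagged in this way, a scheme can be derived from the text itself and checked against the scheme declared on the stanza, which is one small instance of the formal queries the paper envisions.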

    As for publication plans for the database, its creators “aim to negotiate with EEBO and Chadwyck-Healey to find a form of publication which both respects intellectual property and commercial interests, while also making this rich new material accessible to the widest possible audience.” Scott-Baumann and Burton have clearly thought hard about issues of access and how to maximize this database’s availability for users. They present four different possible options, formulated with an eye to those lacking access to EEBO. As they note though, much will depend on what arrangements they are able to make with EEBO/Chadwyck-Healey.

    Noting that their database, once built, could be expanded beyond its present focus on the 1500s and 1600s to cover all periods of poetry, they then devote a section of their paper to its potential scholarly and pedagogical uses. Most obvious perhaps is the usefulness this planned tool could have in advancing work in historical formalism, an emerging approach that revisits “poetic form as historically specific, historically determined, and historically efficacious.” The ability to conduct specific searches across a significant number of poetic texts enables the quick capture of evidence to support or disprove what are currently only hypothetical propositions based on a small textual sample. Rightly claiming that this database “would change the way in which scholarship on poetic form is conducted,” Scott-Baumann and Burton detail a wealth of possible questions and issues it could serve. This section also offers a range of pedagogical uses for this tool and addresses a range of audiences from the undergraduate to the secondary student.

    Before a brief conclusion, the paper then turns to discussing the two-stage pilot project for the database:

    1. A small database containing information on the metrical structures and rhyme schemes of all verse in the first edition of 10 texts published between 1590 and 1599.
    2. A larger database containing information on the metrical structures and rhyme schemes of all verse in first editions of texts published during this period.
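To make the pilot stage more concrete, here is a rough Python sketch of how such a database might be queried. Everything in it is an assumption for illustration: the table name verse_form, its columns, the text identifiers, and the sample rows are all invented, and the paper does not specify a schema or a database engine.

```python
import sqlite3

# In-memory database with a hypothetical one-table schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE verse_form (
        text_id      TEXT,     -- hypothetical identifier for a text
        pub_year     INTEGER,  -- year of the first edition
        rhyme_scheme TEXT,     -- e.g. 'ababcc'
        metre        TEXT      -- e.g. 'iambic pentameter'
    )
    """
)

# Invented sample rows, not real bibliographic data.
rows = [
    ("text-01", 1591, "ababcc", "iambic pentameter"),
    ("text-02", 1591, "abab", "iambic tetrameter"),
    ("text-03", 1595, "ababcc", "iambic pentameter"),
    ("text-04", 1595, "ababcc", "iambic pentameter"),
    ("text-05", 1598, "aabb", "iambic pentameter"),
]
conn.executemany("INSERT INTO verse_form VALUES (?, ?, ?, ?)", rows)


def forms_by_year(connection, scheme):
    """Count how often a rhyme scheme occurs in each publication year."""
    cursor = connection.execute(
        "SELECT pub_year, COUNT(*) FROM verse_form "
        "WHERE rhyme_scheme = ? GROUP BY pub_year ORDER BY pub_year",
        (scheme,),
    )
    return cursor.fetchall()


print(forms_by_year(conn, "ababcc"))  # [(1591, 1), (1595, 2)]
```

A query of this kind is one way to approach the question of how the use of a given form changes over time. A real schema would presumably also need to record stanza-level detail and the interpretive judgments the authors mention.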

    Scott-Baumann and Burton’s database plans present another way of thinking about EEBO and how to augment its value. That they have proposed to build their database using EEBO-TCP seems essentially a wise plan, notwithstanding unsettled questions about access.* For one, linking one’s project to an already well-established resource should ensure its visibility. Too often very worthy projects are launched but remain unknown to many who would benefit from them. In addition, such a tie-in helps ensure continuity among resources. This augmentation of EEBO’s capabilities and the efforts to provide continuity are similar to what NINES and 18thConnect are offering for later periods.

    *One of the access options does offer “[o]pen access to database and texts but not with mark up. …if we are not able to make the XML-encoded texts freely available, we would display the texts in their entirety [as users request them], but with the encoding invisible. … and display the verse with, for example, its stresses marked with accents, or its rhyme scheme colour-coded, rather than with visible tags.”

    Text Creation Partnership (Redesigned Website)

    December 3, 2009

    The Text Creation Partnership (TCP) at the University of Michigan has recently launched its redesigned website. As its name suggests, TCP fosters collaborative efforts to create “accurately keyboarded and encoded editions of thousands of culturally significant works in all fields of scholarly and artistic endeavor.” That TCP works together with both the international library community and commercial publishers of scholarly electronic is one of its defining strengths. It is concerned not only with creating electronic texts in formats that keep pace with shifting technological changes but also with promoting access to texts. Its partnership projects with EEBO, ECCO, and Evans illustrate these commitments. Over 25,000 EEBO texts have already been encoded, and these texts will become part of the public domain on January 1, 2015. Aaron McCollough, Text Creation Partnership Project Outreach Librarian, has commented on this forthcoming access to these EEBO-TCP texts and also provided an example of what such access may look like in a recent comment to an earlier emob posting.

    Among the features of TCP’s redesigned website that Aaron announced on the SHARP-L listserv, the following should especially interest readers of emob:

    * regularly updated TCP “spotlights” on project milestones and related projects in research and scholarly application

    * reviews of recently encoded texts

    * fun with early modern print

    As McCollough noted in his announcement, “we aim for it to be a place of encounter between students and scholars working in Early Modern fields of study, especially those interested in the role of digital archives in those fields.”

    One can also follow TCP developments on the TCP News & Views blog. One of the recent announcements here and on the TCP website alerts users to the newly created EEBO Introduction Series. This series provides bibliographical and contextual information, and more, for less well-known early modern texts. Ten editions are now available, but access to them does require a subscription to EEBO.

    Monk Project

    July 24, 2009

    Among the text databases included in the Monk (metadata offer new knowledge) Project are ECCO and EEBO, both of which are part of the Text Creation Partnership (TCP). While not addressing bibliographic errors, this initiative is relevant to our discussions on improving these tools. In particular, the project’s efforts are apparently aimed at giving scholars the means to work more effectively and simultaneously with texts created and housed in different databases.

    A recent PowerPoint presentation about the Monk Project, Tools for Textual Data (May 20, 2009), by John Unsworth sketches such issues as treating text as data, the Monk Project’s efforts to facilitate means to “mix and match” texts that reside in different databases, the development of features that will enable searches that users may wish to conduct (for example, what adjectives does a given author favor the most?), and the acceptable level of curatorial/user intervention. The tools being developed to allow both the posing of questions that users may wish to ask and the mining of the data to yield responses to these queries seem highly promising.
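As a rough idea of what treating text as data can look like, the adjective question can be sketched in a few lines of Python. This is not the Monk Project's own tooling: the collection names, authors, words, and part-of-speech tags below are invented, and the sketch simply assumes that tokens from different databases have already been tagged and brought into one common shape.

```python
from collections import Counter

# Hypothetical pre-tagged tokens drawn from two collections:
# (collection, author, word, part_of_speech). All values are invented.
TOKENS = [
    ("collection-1", "Author A", "sweet", "ADJ"),
    ("collection-1", "Author A", "rose", "NOUN"),
    ("collection-2", "Author A", "Sweet", "ADJ"),
    ("collection-2", "Author A", "cruel", "ADJ"),
    ("collection-1", "Author B", "cruel", "ADJ"),
    ("collection-2", "Author B", "fair", "ADJ"),
    ("collection-2", "Author B", "fair", "ADJ"),
]


def favourite_adjectives(tokens, author, top_n=3):
    """Return the author's most frequent adjectives across all collections."""
    counts = Counter(
        word.lower()
        for _collection, who, word, pos in tokens
        if who == author and pos == "ADJ"
    )
    return counts.most_common(top_n)


print(favourite_adjectives(TOKENS, "Author A"))  # [('sweet', 2), ('cruel', 1)]
print(favourite_adjectives(TOKENS, "Author B"))  # [('fair', 2), ('cruel', 1)]
```

The counting itself is trivial. The hard part, and presumably where much of the project's effort goes, is getting texts from different databases into a shared, reliably tagged form in the first place.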

    Under “Questions for Discussion” (slide 22), I was interested in the two-part query, “Should users be allowed to change, correct, or improve data? If so, under what constraints or conditions?”. This question set seems directly pertinent to our discussion of how to improve bibliographic issues in these databases, but it rightly also asks about what sorts of constraints should (or need to) be in place–the answer to which would speak to issues of quality control. Another question, “Should those who provide collections also collect the results of work done on their collections? Why or why not?,” was surprising to me. While I could see how gathering information about the ways that the collections were being used and the results obtained could help developers improve these databases’ functionality and accuracy, the collection of this information–especially by the owners of databases that are commercial enterprises–seemed far more worrisome to me.