Archive for the ‘18thConnect’ Category

Text Creation Partnership makes 18th century texts freely available to the public

April 25, 2011

This announcement is making the rounds of listservs and the like, and it should be of interest to emob readers:

(Ann Arbor, MI—April 25, 2011) — The University of Michigan Library announced the opening to the public of 2,229 searchable keyed-text editions of books from Eighteenth Century Collections Online (ECCO). ECCO is an important research database that includes every significant English-language and foreign-language title printed in the United Kingdom during the 18th century, along with thousands of important works from the Americas. ECCO contains more than 32 million pages of text and over 205,000 individual volumes, all fully searchable. ECCO is published by Gale, part of Cengage Learning.

The Text Creation Partnership (TCP) produced the 2,229 keyed texts in collaboration with Gale, which provided page images for keying and is permitting the release of the keyed texts in support of the Library’s commitment to the creation of open access cultural heritage archives. Gale has been a generous partner, according to Maria Bonn, Associate University Librarian for Publishing. “Gale’s support for the TCP’s ECCO project will enhance the research experience for 18th century scholars and students around the world.”

Laura Mandell, Professor of English and Digital Humanities at Miami University of Ohio, says, “The 2,229 ECCO texts that have been typed by the Text Creation Partnership, from Pope’s Essay on Man to a ‘Discourse addressed to an Infidel Mathematician,’ are gems.”

Mandell, a key collaborator on 18thConnect, an online resource initiative in 18th century studies, says that the TCP is “a groundbreaking partnership that is creating the highest quality 18th century scholarship in digital form.”

This announcement marks another milestone in the work of the TCP, a partnership between the University of Michigan and Oxford University, which since 1999 has collaborated with scholars, commercial publishers, and university libraries to produce scholar-ready (that is, TEI-compliant, SGML/XML enhanced) text editions of works from digital image collections, including ECCO, Early English Books Online (EEBO) from ProQuest, and Evans Early American Imprint from Readex.

The TCP has also just published 4,180 texts from the second phase of its EEBO project, having already converted 25,355 books in its first phase, leaving 39,000 yet to be keyed and encoded. According to Ari Friedlander, TCP Outreach Coordinator, the EEBO-TCP project is much larger than ECCO-TCP because pre-1700 works are more difficult to capture with optical character recognition (OCR) than ECCO’s 18th-century texts, and therefore depend entirely on the TCP’s manual conversion for the creation of fully searchable editions.

Friedlander explains that, for a limited period, the EEBO-TCP digital editions are available only to subscribers—ten years from their initial release—as per TCP’s agreement with the publisher. Eventually all TCP-created titles will be freely available to scholars, researchers, and readers everywhere under the Creative Commons Public Domain Mark (PDM).

Paul Courant, University Librarian and Dean of Libraries, says that large projects such as those undertaken by the TCP are only possible when the full range of library, scholarly, and publishing resources are brought together. “The TCP illustrates the dynamic role played by today’s academic research library in encouraging library collaboration, forging public/private partnerships, and ensuring open access to our shared cultural and scholarly record.”

More than 125 libraries participate in the TCP, as does the Joint Information Systems Committee (JISC), which represents many British libraries and educational institutions.

To learn more about the Text Creation Partnership, visit http://www.lib.umich.edu/tcp. To learn more about ECCO, visit http://gdc.gale.com/products/eighteenth-century-collections-online/

Collaborative Reading: Elizabeth Scott-Baumann and Ben Burton’s “Encoding form: A proposed database of poetic form”

March 8, 2010

Elizabeth Scott-Baumann and Ben Burton’s recent paper, “Encoding form: A proposed database of poetic form,” for the February-March 2010 E-Conference of APPOSITIONS: Studies in Renaissance / Early Modern Literature and Culture, is suggestive of how new digital resources can be developed to augment the capabilities of existing tools such as EEBO and ECCO. Responding many years later to Heather Dubrow’s 1979 call for “new methodology in early modern studies,” Scott-Baumann and Burton are constructing a database devoted to poetic form. Their project will afford a means of studying, historically and formally, poetic form by enabling queries about poetic form and generic transformations that resemble those we can now pose about words, thanks to electronic databases such as EEBO and ECCO:

  • What is the origin (or origins) of a given form?
  • How does its structure, use, and meaning change over time?
  • Are there variations in use and meaning in different regions, or among different groups?
  • How does a given form relate to others, and how does this relationship change over time?

Concentrating on sixteenth- and seventeenth-century poetry, Scott-Baumann and Burton will use existing EEBO-TCP texts and enhance them with additional mark-up that builds upon Text Encoding Initiative (TEI) tags. As those familiar with TEI documentation will recall, its tags include ones designed for encoding verse: “stanza divisions, caesurae, enjambment, rhyme scheme, and metrical information, as well as a special purpose rhyme element to support the simple analysis of rhyming words.” Because encoding capabilities extend beyond merely marking general formal conventions and can also entail encoding that represents interpretive judgments, Scott-Baumann and Burton will experiment with both possibilities. The inevitably time-consuming nature of their task will probably result in building the databases in stages.
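
To make the TEI verse tags quoted above a little more concrete, here is a minimal Python sketch. It is not Scott-Baumann and Burton's actual schema. It assumes a stanza encoded with the TEI elements lg (line group), l (verse line), and rhyme (with a label attribute). The sample markup of the opening quatrain of Shakespeare's Sonnet 18 and the helper names rhyme_scheme and rhyme_words are illustrative assumptions, and the XML namespace that real TEI files would normally carry is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample encoding (an assumption for illustration, not EEBO-TCP data):
# <lg> = line group (stanza), <l> = verse line, <rhyme> = tagged rhyming word.
SAMPLE_STANZA = """\
<lg type="quatrain" rhyme="abab">
  <l n="1">Shall I compare thee to a summer's <rhyme label="a">day</rhyme>?</l>
  <l n="2">Thou art more lovely and more <rhyme label="b">temperate</rhyme>:</l>
  <l n="3">Rough winds do shake the darling buds of <rhyme label="a">May</rhyme>,</l>
  <l n="4">And summer's lease hath all too short a <rhyme label="b">date</rhyme>:</l>
</lg>
"""


def rhyme_scheme(stanza):
    """Build a rhyme scheme from the label on the last <rhyme> in each <l>."""
    labels = []
    for line in stanza.findall("l"):
        rhymes = line.findall("rhyme")
        # A line with no tagged rhyme word is recorded as "x" (unrhymed).
        labels.append(rhymes[-1].get("label", "x") if rhymes else "x")
    return "".join(labels)


def rhyme_words(stanza):
    """Group the tagged rhyme words by their label."""
    groups = {}
    for rhyme in stanza.iter("rhyme"):
        groups.setdefault(rhyme.get("label"), []).append(rhyme.text)
    return groups


stanza = ET.fromstring(SAMPLE_STANZA)
scheme = rhyme_scheme(stanza)

# Compare the scheme derived from the tagged words with the one declared on <lg>.
print(scheme, scheme == stanza.get("rhyme"))
print(rhyme_words(stanza))
```

Once verse is encoded this way, queries of the kind listed above (which forms appear when, and how they change) come down to counting and comparing such derived schemes across many texts.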

    As for publication plans for the database, its creators “aim to negotiate with EEBO and Chadwyck-Healey to find a form of publication which both respects intellectual property and commercial interests, while also making this rich new material accessible to the widest possible audience.” Scott-Baumann and Burton have clearly thought hard about issues of access and how to maximize this database’s availability for users. They present four different possible options, formulated with an eye to those lacking access to EEBO. As they note though, much will depend on what arrangements they are able to make with EEBO/Chadwyck-Healey.

    Noting that their database, once built, could be expanded beyond its present focus on the 1500s and 1600s to cover all periods of poetry, they then devote a section of their paper to its potential scholarly and pedagogical uses. Most obvious perhaps is the usefulness this planned tool could have in advancing work in historical formalism, an emerging approach that revisits “poetic form as historically specific, historically determined, and historically efficacious.” The ability to conduct specific searches across a significant number of poetic texts enables the quick capture of evidence to support or disprove what are currently only hypothetical propositions based on a small textual sample. Rightly claiming that this database “would change the way in which scholarship on poetic form is conducted,” Scott-Baumann and Burton detail a wealth of possible questions and issues it could serve. This section also offers a range of pedagogical uses for this tool and addresses a range of audiences from the undergraduate to the secondary student.

    Before a brief conclusion, the paper then turns to discussing the two-stage pilot project for the database:

    1. A small database containing information on the metrical structures and rhyme schemes of all verse in the first edition of 10 texts published between 1590 and 1599.
    2. A larger database containing information on the metrical structures and rhyme schemes of all verse in first editions of texts published during this period.

    Scott-Baumann and Burton’s database plans present another way of thinking about EEBO and how to augment its value. That they have proposed to build their database using EEBO-TCP seems essentially a wise plan, notwithstanding unsettled questions about access.* For one, linking one’s project to an already well-established resource should ensure its visibility. Too often very worthy projects are launched but remain unknown to many who would benefit from them. In addition, such a tie-in helps ensure continuity among resources. This augmentation of EEBO’s capabilities and the efforts to provide continuity are similar to what NINES and 18thConnect are offering for later periods.

    *One of the access options does offer “[o]pen access to database and texts but not with mark up. …if we are not able to make the XML-encoded texts freely available, we would display the texts in their entirety [as users request them], but with the encoding invisible. … and display the verse with, for example, its stresses marked with accents, or its rhyme scheme colour-coded, rather than with visible tags.”

    Collaboration, Costs, and Digital Resources

    January 30, 2010

    On February 19 and 20 Yale will host a graduate student symposium, The Past’s Digital Presence Conference: Database, Archive and Knowledge Work in the Humanities. A quick survey of the conference program and available abstracts indicates several topics that dovetail with issues or subjects that have engaged emob. Jessica Weare’s paper, “The Dark Tide: Digital Preservation, Interpretive Loss, and the Google Books Project”, for instance, examines the discarding of material evidence in the process of digitizing Vera Brittain’s The Dark Tide. Similarly, Scott Spillman and Julia Mansfield’s presentation, “Mapping Eighteenth-Century Intellectual Networks”, discusses their work on Benjamin Franklin’s letters and their relationship within the Republic of Letters. The conference’s purpose also addresses many of the questions we have been posing on this blog:

    ■ How is digital technology changing methods of scholarly research with pre-digital sources in the humanities?
    ■ If the “medium is the message,” then how does the message change when primary sources are translated into digital media?
    ■ What kinds of new research opportunities do databases unlock and what do they make obsolete?
    ■ What is the future of the rare book and manuscript library and its use?
    ■ What biases are inherent in the widespread use of digitized material? How can we correct for them?
    ■ Amidst numerous benefits in accessibility, cost, and convenience, what concerns have been overlooked?

    Peter Stallybrass is offering the keynote, and Jacqueline Goldsby will be the colloquium speaker, while Willard McCarty, Rolena Adorno, and others will appear on the closing roundtable. Such a lineup points to the range of perspectives represented. The conference is free to all affiliated with a university.

    Among the places this conference has been announced is the JISC Digitisation News section of the UK Digitisation Programme website, and its announcement emphasizes the participation of students “from around the globe.”

    Collaboration as it occurs across boundaries is the implicit topic of this posting, and I wish to use reports from the JISC website both as a springboard and as a contrast in discussing the topic.

    A JISC report, Enriching Digital Resources 2008-2009, on the Enriching Digital Content program (a strand of the JISC Online Content Program), features a podcast with Ben Showers. Because of the national nature of JISC, the program described offers a unified, coherent approach to advancing digital resources for its institutions of higher education; it represents a collaborative agenda. In this podcast Showers explains the purpose of the program: rather than fund the creation of new resources, the program invested £1.8 million to enhance and enrich existing digital content while also developing a system for universities and colleges to vet and recognize this work. He then turns to explaining the following four key benefits of this program:
    • “unlocking the hidden—making things that are hard to access easy” to obtain and preserve. To illustrate, he uses CORRAL (UK Colonial Registers and Royal Navy Logbooks) project as an example of opening up primary data to make it not only much more available but also to preserve it.
    • enhancing experiences of students. Here Showers cites the Enlightening Science project at Sussex, which offers students opportunities to watch video re-enactments of Newton’s experiments and read original texts by Newton and others.
    • speeding up research—once a document has been digitized, there is no need to repeat the process. The document will now be available for all other researchers to use.
    • widening participation—engaging broader audiences including not only faculty and students within Britain’s educational community but also participants globally.

    Turning to the new goals for the 2009-2011 program cycle, Showers notes an emphasis on the “clustering” of content, that is, bringing various projects together and establishing, when appropriate, links among them. Another focus is further building skills and strategies within institutions to deliver digital content effectively. Finally, he mentions the strengthening of transatlantic partnerships, and here the US National Endowment for the Humanities (NEH) is given as an example. Of course, there is a long history of scholarly collaboration between the NEH and British institutions, perhaps most notably on the English Short Title Catalogue (ESTC).

    Indeed, through collaborative digital grants offered by JISC and NEH several transatlantic projects are underway or near completion, including the Shakespeare Quartos Archive, a collaborative effort involving Oxford University and the Folger Library, and the St Kitts-Nevis Digital Archaeology Initiative, undertaken by Southampton University and the Thomas Jefferson Foundation, Charlottesville, VA, to advance scholarship on slavery. There are several others as well.

    Both the goals and benefits detailed by Showers are ones that would attract the support of diverse parties, and they do parallel many arguments being made on this side of the Atlantic for such work, including ones advanced by the NEH. Moreover, this and other JISC reports suggest that JISC has also helped broker mutually beneficial relationships between British universities and commercial vendors such as Cengage-Gale and ProQuest. Yet another JISC report, Value for Money, offers arguments that we need to be making and also points to the obstacles and divides affecting various types of collaboration in the United States.

    After offering the following figures on the return of money invested in the JISC,

    • For each £1 spent by JISC on the provision of e-resources, the return to the community in value of time saved in information gathering is at least £18.

    • For every £1 of the JISC services budget, the education and research community receives £9 of demonstrable value.

    • For every £1 JISC spent on securing national agreements for e-resources, the saving to the community was more than £26.

    the report summary offers the following remarks:

    These are the figures revealed by a recently-published Value for Money report on JISC services. Although many countries have centrally provided research and education networks, and some have provided supplementary services, no other country has a comparable single body providing an integrated range of network services, content services, advice, support and development programmes.

    The cost-effectiveness of JISC is again highlighted in two sidebars:

    These figures suggest that for every £1 JISC spent on securing national agreements for e-resources, the saving to the community was more than £26
    and
    The added value, equivalent to more than £156m per year, suggests the community is gaining 1.4 million person/days, by using e-resources rather than paper-based information.

    The end of the summary further reinforces why investments in JISC benefit the UK as a whole:

    The value of JISC activities extends beyond the benefits identified here. Education and research are high-value commodities that play an important role in the UK economy and underpin the UK’s global economic position.

    The JISC’s “Value for Money” report contains the types of arguments and data that we in the US need to be presenting. While our system of higher education does not operate under the centralized system that characterizes that of the UK, the push for more transparent reporting on and assessment of what our various universities and colleges are delivering perhaps provides an opportunity for new forms of collaboration. Through national scholarly societies, the NEH, Mellon Foundation, ALA, and more, we need to supply some “noisy feedback” from a dollars-and-cents/sense perspective about what investing in digital resources means not just for our institutions of higher learning but also for our society.

    Technology and the “Republic of Letters”

    December 28, 2009

    The “sell” for a recent article on Mapping the Republic of Letters, a Stanford University digital humanities project led by Dan Edelstein and Paula Findlen, highlights the ways in which technology is altering our understanding of the past and shaping the kinds of questions we can ask:

    Researchers map thousands of letters exchanged in the 18th century’s “Republic of Letters” – and learn at a glance what it once took a lifetime of study to comprehend

    In this case researchers have applied GIS (geographical information system) mapping technology to explore the wealth of letters exchanged by Enlightenment figures. As the article details, the computer mapping of correspondence from the Enlightenment (the dates focus on 1759 to 1780, but the project also contains letters from the Renaissance) has enabled the relationship among vast amounts of material to be organized and presented in flexible ways. This YouTube video, Tracking 18th-century “social network” through letters, shows snapshots of the trajectories of Locke’s and Voltaire’s correspondence.

    The “big pictures” that this project facilitates are altering perceptions of Enlightenment networks and their influences. As the video demonstrates, despite French views of England as an incredible site of religious freedom and tolerance, Voltaire actually corresponded very little with those in England.

    What is especially interesting (but not surprising) is the importance of metadata and collaboration to this project’s success. That Oxford “supplied the metadata for 50,000 letters,” Dan Edelstein explains, allowed the project to go “beyond any of our expectations.” Mapping the Republic of Letters has also acquired the data for all of Benjamin Franklin’s correspondence, and talks are underway to obtain data from other European sources.

    Projects such as TCP and 18thConnect, which are establishing rich, reliable metadata for digital texts, are expanding the possibilities for scholarly exploration of past textual worlds, for both individual and collaboratively driven scholarship.

    Jonathan Rose, whose post on SHARP-L drew my attention to this project, noted the potential of GIS technology for literary and intellectual history. Canadian book historians Bertrum MacDonald and Fiona Black have already begun to realize this potential for book historians. Their article “Geographic Information Systems: A New Research Method for Book History” (Book History 1 (1998): 11-31) can be found through Project Muse, and they have also

    proposed a long-term, international, collaborative project using GIS for comparative analyses of defined elements of print culture in several countries. An Advisory Board is being established, which currently includes scholars in the United States and the United Kingdom. The project has three primary goals: to explore the methodology through a variety of applications concerning various aspects of book history; to aid comparative studies; and to provide the foundation for an electronic atlas of book history (GIS for Book History International Collaborative Project, description from Fiona Black’s website).

    Such technology of course has rich potential for other projects, and we have had various mentions of such projects in past emob posts including comments on the Monk Project.

    For more recent work on uses of GIS in historical research, see the special issue of Historical Geography: An Annual Journal of Research, Commentary, and Reviews, Emerging Trends in Historical GIS, ed. Anne Kelly Knowles, vol. 33 (2005).

    Collaborative Readings #4: Shawn Martin’s “Reaching Out: What do Scholars Want from Electronic Resources?”

    September 24, 2009

    Shawn Martin’s brief 2005 article, “Reaching Out: What do Scholars Want from Electronic Resources?,” still poses relevant questions for this community. Noting the varied responses received by the TCP (Text Creation Partnership) when they interviewed scholars about why digital tools were not more widely used, Martin suggests the following:

    1. Consider examining and perhaps reshaping the interface of the database.

    2. Encourage librarians and faculty to raise awareness about the existence of these tools in their college communities.

    3. Generate grants, contests, or prizes designed to reward “innovative electronic publication and research.”

    As Martin goes on to note, however, these suggestions only raise larger questions about the influence of electronic resources on the humanities, including how use can be maximized, how best to reach out to college communities, how we can identify which obstacles impede using these resources in the classroom or in scholarly research, or how we evaluate their impact on the humanities.

    All of these questions are important, but the last one seems especially significant. As promising new platforms such as 18thConnect begin to take shape, we should be asking what we want from these new capabilities and potentials.

    18thConnect

    August 7, 2009

    Hello to the Early Modern Online Bibliography blog: your discussions here are amazing, and rich with references.

    Robert Markley at the University of Illinois and I started 18thConnect (we are co-directors) as a subsidiary organization to NINES (http://www.nines.org), which is incredibly supportive, both financially and in other ways as well.  Basically, 18thConnect is an organization that will peer-review digital resources created by 18th-century scholars and then aggregate those resources along with commercial resources.

    What does that mean?  When you come to the 18thConnect home page, you will be able to search for digital resources among free scholarly resources available on the web that have been judged high quality through peer review, AND commercial catalogs:  ECCO, Adam Matthew’s Eighteenth-Century Journals Portal, JSTOR, ProjectMuse, etc.  Our finding aid will deliver links to these resources — 18thConnect won’t house them in any way — and then, when you click on a link to an edition of Clarissa, say, proffered by ECCO, if your library subscribes to it and you are logged in at work, you will be sent directly to the resource.

    Here is the news for those of you who already know about this initiative: at our summer meeting, July 15, in Dublin, Ireland, at the Royal Irish Academy, Gale consented to give us their page images.  We will attempt to machine-read them better, using our own home-made OCR program, in order to produce better plain text files, something closer to the keyed texts produced by the ECCO TCP.  Gale will allow us to index the texts that we produce to allow keyword searching on ECCO texts EVEN FOR THOSE PEOPLE WHO DON’T OWN the ECCO catalog.  In other words, you’ll be able to find the bibliographic data of the texts containing the keywords for which you search: if your library subscribes to ECCO, you can get the text directly, but if not, at least you now know which texts you’ll have to find through some other means (microfilm, interlibrary loan, visit to special collections).
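
    The finding-aid behaviour described above can be sketched roughly as follows. This is an illustration only, not 18thConnect's software: the Record fields, the build_index and search functions, and the two sample entries (including which one is marked as subscription-only) are all hypothetical, and a real index would be built from OCR output rather than from single quoted lines.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Bibliographic data for one text (hypothetical fields)."""
    title: str
    author: str
    subscription_required: bool


def build_index(texts):
    """Map each lowercased word to the set of records whose text contains it."""
    index = defaultdict(set)
    for record, plain_text in texts:
        for raw in plain_text.lower().split():
            word = raw.strip(".,;:!?\"'()")
            if word:
                index[word].add(record)
    return index


def search(index, keyword, has_subscription):
    """Return (record, access) pairs for every text containing the keyword."""
    hits = []
    for record in sorted(index.get(keyword.lower(), set()), key=lambda r: r.title):
        if has_subscription or not record.subscription_required:
            access = "direct link"
        else:
            # The reader still learns which text matched and can seek it elsewhere.
            access = "bibliographic data only"
        hits.append((record, access))
    return hits


# Illustrative entries; the subscription flags are assumptions, not catalog facts.
pope = Record("An Essay on Man", "Alexander Pope", subscription_required=True)
gray = Record("Elegy Written in a Country Churchyard", "Thomas Gray",
              subscription_required=False)

index = build_index([
    (pope, "Know then thyself, presume not God to scan; "
           "The proper study of mankind is man."),
    (gray, "The curfew tolls the knell of parting day,"),
])

for record, access in search(index, "mankind", has_subscription=False):
    print(record.title, "by", record.author, "->", access)
```

    The point of the design is that the keyword index and the bibliographic records are open to everyone, while only the final hop to the full text depends on a library's subscription.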

    We are now negotiating with the British Library and ESTC to get that catalog in as well.  The Digital Bibliography for English Literature (formerly the NCBEL) will be in soon.  We don’t yet have the 18thConnect finding aid up and running: once we have the Gale (ECCO), Adam Matthew (18th-c Journals Portal), DBEL, and ESTC data ingested and running smoothly, we will launch, we hope, in June 2010.

    If you would like to contribute ideas to how this organization should work, you may wish to first take a look at online videos about NINES and 18thConnect available at:

    http://unixgen.muohio.edu/~poetess/NINES

    and

    http://unixgen.muohio.edu/~poetess/NINES/18thConnect.html

    (our temporary home)

    The NINES interface has changed since I made these videos, but the principles of its operation have not.

    Please contribute ideas here, as I will check frequently, but also feel free to email me: mandellc@muohio.edu

    Update on 18thConnect

    July 25, 2009

    Laura Mandell has posted an update on 18thConnect indicating that an agreement has been reached for 18thConnect to work with Gale. There’s a link to a recording of her ALA talk that is not opening for me, as well as the following news about a grant the project has received from ICHASS:

    18thConnect: From PDF Images to Clean Data Sets, led by the University of Illinois’ Robert Markley, will use supercomputer time to run a parallelized optical character recognition (OCR) program on pages of images of 18th century printed texts, made available through its collaboration with Gale Group. The resulting archive of machine-readable 18th-century texts in history, literature, art, the sciences, and the emerging social sciences will be accessible to scholars for faceted searching, automated semantic tagging, hand encoding of digital scholarly editions, and data mining. By converting a vast archive of images into machine-readable texts, this project will provide a model for adapting OCR programs to field-specific problems that must be solved in order to preserve the full range of our cultural heritage.

    I am hoping that Laura and Bob may be able to tell us more.

    HPCwire: The Next Big Thing in Humanities, Arts and Social Science Computing: 18thConnect

    July 1, 2009

    HPCwire: The Next Big Thing in Humanities, Arts and Social Science Computing: 18thConnect
