Archive for the ‘ESTC’ Category

UC Riverside wins $405,000 Mellon Foundation Grant for ESTC

February 9, 2014

The UC Riverside Center for Bibliographical Studies and Research (CBSR) has won $405,000 to build software that will help edit and curate the English Short Title Catalogue (ESTC).

In the past, the CBSR won $48,500 from the Mellon Foundation for curating and expanding the ESTC. The goal of the new grant is to allow scholars to help curate the ESTC by adding information to entries. According to a write-up in UCR Today:

Approval from ESTC staff will be required for changes suggested to core catalog data, which must remain intact for use by librarians . . . The new software will allow additional information provided by researchers to be recorded in different data fields, with safeguards designed to prevent errors.

Congratulations to the staff at CBSR for this tremendous accomplishment.  For more information, see ucrtoday.ucr.edu.

English Short Title Catalogue, 21st century (ESTC21): Call for Feedback

March 20, 2012

Brian Geiger, Director, CBSR and ESTC/NA, has recently sent us the following announcement and call for feedback:

Big changes are underway with the English Short Title Catalogue (ESTC), and we need your input. A union catalog and bibliography of English printing from 1473 to 1800, the ESTC has developed over the last three decades into one of the most comprehensive and authoritative bibliographies available. Yet access to ESTC data has evolved very little.

Last year the ESTC was awarded a planning-grant from the Andrew W. Mellon Foundation to “redesign the project as a 21st century research tool.” For the last nine months a planning committee has discussed how to make the resource more usable to a broad spectrum of researchers and librarians and to harness the knowledge and input of those users to refine and expand ESTC data. The recommendations of that committee are now available online at the estc21 blog.

The planning committee welcomes and encourages feedback on our ideas from ESTC users. The ESTC21 website with our recommendations will remain active through April 20. Please support this effort to rethink the future of the ESTC by commenting on the ESTC21 pages and taking the brief survey at the end of the website. Your feedback is critical.

From the entire planning committee, thank you for your contributions to this project.

Brian Geiger
Director, CBSR and ESTC/NA

EEBO Interactions and Bibliography: Linking the Past to the Present

February 5, 2012

“Even as more and more texts become widely available through digital surrogates, studies of the book remain grounded in physical bibliography.”

–Stephen Tabor, “ESTC and the Bibliographical Community”

This is a heady time for literary scholars using digital tools. Visualization and text tagging software offers new ways to analyze old texts’ rhetorical and linguistic features. DocuScope, for example, is being used by Michael Witmore, Director of the Folger Shakespeare Library, to chart maps of Shakespeare’s plays using 1000-word strings. The resulting maps, posted on Witmore’s blog, Wine Dark Sea, reveal that Othello, for example, shares linguistic features, such as frequent first-person forms, with Shakespeare’s comedies. Asking why this is so may provide a more detailed understanding of Shakespeare’s craft.

Other data mining projects, underway at Matthew Jockers and Franco Moretti’s Stanford Literary Lab, broaden and transform the practice of literary study, in part by advancing what Moretti calls “distant reading.”  These projects forgo traditional “close” reading of individual texts to analyze computer-generated data derived from running thousands of texts through specific programs.

Elsewhere, annotation tools, such as Digital Mappaemundi, allow annotation of digital artifacts such as, in DM’s case, medieval maps and geographic texts.

Aggregating platforms, including 18thConnect and NINES, create virtual environments where digital work can be shared.  Digital texts, images, maps, data, video, and audio can be collected and annotated for projects difficult to imagine just a few years ago.

Finally, the digital world has nourished new participatory models of scholarship, advanced, for example, by Kathleen Fitzpatrick’s Planned Obsolescence.

These new and often visually alluring scholarly ventures chart new avenues of inquiry and reshape literary studies as we know it.  Stanley Fish has blogged about them; Witmore has been interviewed by Forbes, introducing them to the commercial world; and granting agencies like the NEH have responded by dedicating specific funds for such projects.

But in the shadow of these projects runs a slower, more methodical, far less glamorous digital task on which all other projects rely: ensuring that digital texts retain bibliographical integrity. As Stephen Tabor put it in the 2007 comment used as the epigraph above, “even as more and more texts become widely available through digital surrogates, studies of the book remain grounded in physical bibliography” (The Library 8:4, 369).

EEBO Interactions offers a unique venue for scholarly dialogue about bibliographical matters. Though it describes itself as a “social network for Early English Books Online,” it might be more accurate to think of it as a site for asynchronous conferencing about bibliographical matters. A broad range of readers–ProQuest editors, graduate students, theologians, literary scholars, historians, philosophers, independent scholars, curators, librarians and library administrators, digital editors, undergraduates, bibliographers, and textual critics–have already posted queries or comments, often correcting bibliographical entries or expanding our understanding of a given text. The comments appear under the following rubrics:

Comments about this copy: Comments include requests that missing title pages be restored, or that two variants counted as the same copy by both ESTC and EEBO be distinguished.  They range from providing resolutions of complex pagination problems, to asking general book history questions.

About this work:  This section allows readers to suggest the broader context of a given text.  Nick Poyntz of Mercurius Politicus fame identifies one pamphlet as an advertorial for a cup lined with antimony and notes that two customers died after using the cup.  Other readers correct publication dates, post questions about attribution, note additional authors not mentioned in the EEBO or ESTC entries, or track the evolution of a text from one edition to the next.

Notes:  Aliases can be discussed here, something helpful in reading recusant literature.  This is also the space to discuss a text’s plurality–its relation to other texts it cites or responds to, and its reception.

Suggest a link: This space allows for links to ODNB entries or to pertinent articles, particularly useful for acquiring a fuller understanding of little known works. 

Perhaps most innovatively, EEBO Interactions invites scholars and librarians to talk with one another and with representatives from the commercial world that produced EEBO. EEBO Interactions is the only purpose-built space designed to bring together members of the bibliographical community–normally working in isolation and apart from one another–to collaborate for a moment or two on the joint endeavor of linking the past to the present. This is the kind of experiment that benefits everyone.

It would be great to hear readers’ responses to EEBO Interactions.

ASECS 2011 Sessions on Electronic Resources and Related Topics

February 16, 2011

Below are sessions related to the digital humanities, electronic resources, or book history at the upcoming annual meeting of the American Society for Eighteenth-Century Studies in Vancouver.  If you would like a session included in the list below, please let me know.

8-9:30 Thursday, March 17

9. “Media Technologies and Mediation in Intercultural Contact”

(Roundtable) Pavilion Ballroom D

Chair: Scarlet BOWEN, University of Colorado, Boulder

1. Mary Helen MCMURRAN, University of Western Ontario

2. Neil CHUDGAR, Macalester College

3. Jordan STEIN, University of Colorado, Boulder

9:45-11:15 Thursday, March 17

19. “Scholarship and Digital Humanities, Part I: Editing and

Publishing” (Roundtable) Grand Ballroom BC

Chair: Lorna CLYMER, California State University, Bakersfield

1. Timothy ERWIN, University of Nevada, Las Vegas

2. Christopher MOUNSEY, University of Winchester

3. Eleanor SHEVLIN, West Chester University

4. Christopher VILMAR, Salisbury University

23. “Britain 2.0: The New New British Studies?” (Roundtable) Cracked Ice Lounge

Chair: Leith DAVIS, Simon Fraser University

1. James MULHOLLAND, Wheaton College

2. Michael BROWN, Aberdeen University

3. Eoin MAGENNIS, Eighteenth-Century Ireland Society

26. “Eighteenth-Century Reception Studies” – I Port Hardy

Chair: Marta KVANDE, Texas Tech University

1. Alise JAMESON, Ghent University, “The Influence of Gerard Langbaine’s Seventeenth-Century Play Catalogues on Eighteenth-Century Criticism and Authorship Ideals”

2. Diana SOLOMON, Simon Fraser University, “Sex and Solidarity:

Restoration Actresses and Female Audiences”

3. Jennifer BATT, University of Oxford, “The Digital Miscellanies Index

and the Reception of Eighteenth-Century Poetry”

4. Michael EDSON, University of Delaware, “From Rural Retreat to Grub

Street: The Audiences of Retirement Poetry”

29. “Bodies, Affect, Reading” Parksville

Chair: David A. BREWER, The Ohio State University

1. Amelia WORSLEY, Princeton University, “Lonely Readers in the Long

Eighteenth Century”

2. Amit YAHAV, University of Haifa, “Rhythm, Sympathy, and Reading

Out Loud”

3. Wendy LEE, Yale University, “A Case for Impassivity”

11:30-1pm, Thursday, March 17

38. “Scholarship and Digital Humanities, Part II: Authoritative

Sources” (Roundtable) Grand Ballroom BC

Chair: Christopher VILMAR, Salisbury University

1. Katherine ELLISON, Illinois State University

2. Ben PAULEY, Eastern Connecticut State University

3. Adam ROUNCE, Manchester Metropolitan University

4. Brian GEIGER, University of California, Riverside

5. Lorna CLYMER, California State University, Bakersfield

2:30-4 Thursday, March 17

56. “Scholarship and Digital Humanities, Part III: Materials for

Research and Teaching” (Roundtable) Grand Ballroom BC

Chair: Bridget KEEGAN, Creighton University

1. Mark ALGEE-HEWITT, McGill University

2. Anna BATTIGELLI, State University of New York, Plattsburgh

3. Ingrid HORROCKS, Massey University

4. John O’BRIEN and Brad PASANEK, University of Virginia

59. “The Private Library” Pavilion Ballroom D

Chair: Stephen H. GREGG, Bath Spa University

1. Laura AURICCHIO, Parsons the New School for Design, “Lafayette’s

Library and Masculine Self-Fashioning”

2. Nancy B. DUPREE, University of Alabama, “The Life and Death of a

Library: The Collection of John Joachim Zubly”

3. Meghan PARKER, Texas A&M University, “Private Library, Public Memory”

4. Mark TOWSEY, University of Liverpool, “‘The Talent Hid in a Napkin’: Borrowing Private Books in Eighteenth-Century Scotland”

66. “Editing the Eighteenth Century for the Twenty-First Century

Classroom” (Roundtable) Junior Ballroom B

Chair: Evan DAVIS, Hampden-Sydney College

1. Joseph BARTOLOMEO, University of Massachusetts, Amherst

2. Linda BREE, Cambridge University Press

3. Anna LOTT, University of North Alabama

4. Marjorie MATHER, Broadview Press

5. Laura RUNGE, University of South Florida

9:45-11:15 a.m., Friday, March 18

102. “The Eighteenth Century in the Twenty-First: The Impact of the Digital Humanities” (Digital Humanities Caucus) (Roundtable)

Grand Ballroom BC

Chair: George H. WILLIAMS, University of South Carolina, Upstate

1. Katherine ELLISON, Illinois State University

2. Michael SIMEONE, University of Illinois, Urbana-Champaign

3. Elizabeth Franklin LEWIS, University of Mary Washington

4. Kelley ROWLEY, Cayuga Community College

11:30-1 p.m. Friday, March 18

130. “Writing and Print: Uses, Interactions, Cohabitation” – II

(Society for the History of Authorship, Reading, and Publishing,

SHARP) Junior Ballroom D

Chair: Eleanor SHEVLIN, West Chester University

1. Shannon L. REED, Cornell College, “The Enactment of Theory:

Literary Commonplace Books in the Eighteenth Century”

2. Miranda YAGGI, Indiana University, “‘A Method So Entirely New’:

Female Literati and Hybrid Forms of Eighteenth-Century Novel

Criticism”

3. Shirley TUNG, University of California, Los Angeles, “Manuscripts

‘Mangled and Falsify’d’: Lady Mary Wortley Montagu’s ‘1736.

Address’d T –‘ and The London Magazine”

4. A. Franklin PARKS, Frostburg State University, “Colonial

American Printers and the Transformation from Oral-Scribal to Print

Culture”

132. “The Eighteenth Century on Film” Orca

(Northeast American Society for Eighteenth-Century Studies)

Chair: John H. O’NEILL, Hamilton College

1. Elizabeth KRAFT, University of Georgia, “The King on the Screen”

2. Natania MEEKER, University of Southern California, “Le Bonheur au

féminin: Passion and Illusion in Du Châtelet and Varda”

3. David RICHTER, Graduate Center, City University of New York,

“Writing Lives and Telling Stories: The Narrative Ethics of the

Jane Austen Biopics”

2:30-4 p.m., Friday, March 18

146. “New Media In the Eighteenth Century” (New Lights Forum:

Contemporary Perspectives on the Enlightenment) Port Alberni

Chair: Jennifer VANDERHEYDEN, Marquette University

1. Lisa MARUCA, Wayne State University, “From Body to Book: Media

Representations in Eighteenth-Century Education”

2. Caroline STONE, University of Florida, “Publick Occurences and the

Digital Divide: The Influence of Technological Borders on Emergent

Forms of Media”

3. George H. WILLIAMS, University of South Carolina, Upstate,

“Creating Our Own Tools? Leadership and Independence in

Eighteenth-Century Digital Scholarship”

8-9:30 a.m., Saturday, March 19

156. “The Circulating Library and the Novel in the Long Eighteenth

Century” Orca

Chair: Hannah DOHERTY, Stanford University

1. Lesley GOODMAN, Harvard University, “Under the Sign of the

Minerva: A Case of Literary Branding”

2. Natalie PHILLIPS, Stanford University, “Richardson’s Clarissa and the

Circulating Library”

3. Elizabeth NEIMAN, University of Maine, “Novels Begetting Novels—

and Novelists: Reading authority in (and into) Minerva Press Formulas”

9:45-11:15, Saturday, March 19

170. “Will Tomorrow’s University Be Able to Afford the Eighteenth Century? If So, How and Why?” (Roundtable) (New Lights Forum: Contemporary Perspectives on the Enlightenment) Parksville

Chair: Julie Candler HAYES, University of Massachusetts, Amherst

1. Downing A. THOMAS, University of Iowa

2. Daniel BREWER, University of Minnesota

3. Melissa MOWRY, St. John’s University

4. Albert J. RIVERO, Marquette University

173. “Colloquy with Matt Cohen on The Networked Wilderness” (Roundtable) Port Alberni

Chair: Dennis MOORE, Florida State University

1. Birgit Brander RASMUSSEN, Yale University

2. Bryce TRAISTER, University of Western Ontario

3. Cristobal SILVA, Columbia University

4. Jeffrey GLOVER, Loyola University, Chicago

5. Matt COHEN, University of Texas at Austin

6. Sarah RIVETT, Princeton University

177. “Crowd-sourcing and Collaboration: Community-Based

Projects in Eighteenth-Century Studies” Grand Ballroom D

Chair: Bridget DRAXLER, University of Iowa

1. Margaret WYE, Rockhurst University, “The Challenge and

Exhilaration of Collaboration: From Post Grad to Undergrad, It’s All

Research, All the Time”

2. Victoria Marrs FLADUNG, Rockhurst University, “Undergraduate

Research: How I Learned to Love Irony in Jane Austen’s Mansfield

Park”

3. Laura MANDELL, Miami University, “Crowd-sourcing the Archive:

18thConnect.org”

Respondent: Elizabeth GOODHUE, University of California, Los Angeles

2-3:30 p.m., Saturday, March 19

181. “Evaluating Digital Work: Projects, Programs and Peer Review” (Digital Humanities Caucus) (Roundtable) Grand Ballroom BC

Chair: Lisa MARUCA, Wayne State University

1. Holly Faith NELSON, Trinity Western University

2. Bill BLAKE, University of Wisconsin, Madison

3. Allison MURI, University of Saskatchewan

4. Laura MCGRANE, Haverford College

5. Gaye ASHFORD, Dublin City University

6. Anne Marie HERRON, Dublin City University

184. “New Approaches to Teaching the Great (and not-so-great) Texts of the Eighteenth Century” (Roundtable) (Graduate Student Caucus) Junior Ballroom B

Chair: Jarrod HURLBERT, Marquette University

1. Christian BEDNAR, North Shore Community College

2. Ann CAMPBELL, Boise State University

3. Christopher NAGLE, Western Michigan University

4. Peggy THOMPSON, Agnes Scott College

5. Deborah WEISS, University of Alabama

193. “Marketing and Selling Books in Eighteenth-Century France: People, Places and Practices” Orca

Chair: Reed BENHAMOU, Indiana University

1. Thierry RIGOGNE, Fordham University, “Marketing Literature and

Selling Books in the Parisian Café, 1680-1789”

2. Marie-Claude FELTON, Ecole des Hautes Etudes en Sciences Sociales,

Paris and Université du Québec à Montréal, “Cutting out the

Middlemen: Self-Publishing Authors and their Autonomous

Commercial Endeavors in the Parisian Literary Market, 1750-1791”

3. Paul BENHAMOU, Purdue University, “Le Commerce de la lecture à

Lyon dans la seconde moitié du 18ème siècle: Le cas du libraire-

imprimeur Reguilliat”

EEBO Editions Now Available Through Amazon

October 10, 2010

In August,  Eleanor posted a piece on ECCO’s print on demand (POD) offerings through various online booksellers.  These POD copies are produced by companies such as Nabu, Bibliolife, BiblioBazaar, and others.

EEBO has also struck a deal with Bibliolife, making about 3,000 EEBO POD titles available through Amazon.com.  These can be found by searching Amazon for “EEBO Editions.”  According to Jo-Anne Hogan, Product Manager at ProQuest, this initial offering through Bibliolife is  a trial stage; evaluating the response to and quality of the books will be necessary before ProQuest will expand the title list offered through POD.  It is thus a good  moment to reflect on the nature of the entries.

Neither Gale nor ProQuest flags the status of the books it sells as digital reprints on or near the title line, though both companies include boilerplate marketing blurbs about the nature of digital reprints later in the entry. A simple flag next to the initial title, something like [paperback digital reprint] or [paperback digital facsimile], would help all readers understand what these books are.

ECCO’s POD entries provide something like full bibliographical information only inconsistently. EEBO entries on Amazon provide consistently fuller bibliographical information, though this information appears under “Editorial Reviews” rather than under “Product Details.” By scrolling down Amazon’s entry for the digital reprint of a pirated copy of Lily’s Short Introduction of Grammar (1570), for example, we find the following information:

++++
The below data was compiled from various identification fields in the bibliographic record of this title. This data is provided as an additional tool in helping to insure edition identification:
++++

A shorte introduction of grammar generally to be vsed, compiled and sette forth, for the bringyng vp of all those that intende to attaine the knowledge of the Latine tongue.
Lily, William, 1468?-1522.
Colet, John, 1467?-1519.
Robertson, Thomas, fl. 1520-1561.
By William Lily, with contributions by John Colet, Thomas Robertson, and others.
Signatures: A-C D4, A-G H4, A-B4.
In three parts.
Part 2 has a separate title page, without imprint, reading: Brevissima institutio seu ratio grammatices cognoscendae, ad omnium puerorum vtilitatem praescripta, quam solam regia maiestatis in omnibus scholis profitendam praecipit.
Part 3 has a half title, reading: Nominum in regulis generum contentorum, tum heteroclitorum, ac verborum interpretatio aliqua.
Title pages for parts 1 and 2 within ornamental borders.
A pirated edition, probably printed in Holland.–STC.
Another edition of STC 15610.10, first published in 1548.
Some print faded and show-through; some pages marked and stained.
[192] p.
[Holland? : s.n., c. 1570]
STC (2nd ed.) / 15615
Latin
Reproduction of the original in the Cambridge University Library

This is, in fact, a slightly revised version of the EEBO entry for the same pirated edition of Lily’s Short Introduction of Grammar:

Title: A shorte introduction of grammar generally to be vsed, compiled and sette forth, for the bringyng vp of all those that intende to attaine the knowledge of the Latine tongue.
Author: Lily, William, 1468?-1522.
Other authors: Colet, John, 1467?-1519.
Robertson, Thomas, fl. 1520-1561.
Imprint: [Holland? : s.n., c. 1570]
Date: 1570
Bib name / number: STC (2nd ed.) / 15615
Physical description: [192] p.
Notes: By William Lily, with contributions by John Colet, Thomas Robertson, and others.
Signatures: A-C D4, A-G H4, A-B4.
In three parts.
Part 2 has a separate title page, without imprint, reading: Brevissima institutio seu ratio grammatices cognoscendae, ad omnium puerorum vtilitatem praescripta, quam solam regia maiestatis in omnibus scholis profitendam praecipit.
Part 3 has a half title, reading: Nominum in regulis generum contentorum, tum heteroclitorum, ac verborum interpretatio aliqua.
Title pages for parts 1 and 2 within ornamental borders.
A pirated edition, probably printed in Holland.–STC.
Another edition of STC 15610.10, first published in 1548.
Some print faded and show-through; some pages marked and stained.
Reproduction of the original in Cambridge University Library.
Copy from: Cambridge University Library
UMI Collection / reel number: STC / 1354:02
Subject: Latin language — Grammar — Early works to 1800.

While this bibliographical information is provided consistently for EEBO editions on Amazon and its affiliate, Abebooks, it does not  get transferred to entries provided by other online booksellers, like Alibris.  It would be interesting to account for this failure to get full bibliographical information transferred.

ProQuest’s decision to make EEBO titles available through POD is a promising new development.  Its attempt to create a template providing fuller bibliographical information than has yet been attempted must be applauded.  Some questions remain:

  • Are the entries as functional as they need to be?  That is, can a scholar looking for a specific edition of an early modern text locate the exact POD copy, given the entries provided?
  • Can the layout be improved?
  • Is there a more efficient template (a different set of fields, for example) for bibliographical information than the fields currently envisioned?

I look forward to hearing readers’ reactions to these new POD offerings.

Classification and Interpretation, and the Construction of Digital Resources

September 28, 2010

In “The Alchemy of Turning Fiction into Truth” (Journal of Scholarly Publishing, [July 2008]: 354-372), David Henige examines the LC classification system and its treatment of “historical” works. Noting that works catalogued under the LC classification system’s D-DX, E and F categories are generally assumed to be factually based, Henige demonstrates the error of this assumption. He opens by discussing four types of works devoted to studying the past (“history based on solid evidence and argument, history based on less acceptable forms of these, pseudo-history, and counterfactual history” [354]), but his key concern is with the cataloguing of the last kind of history. Counterfactual histories or “pretend histories”

immediately and unabashedly depart from accepted versions of the past in order to hypothesize about what the course of the past and present might have been, if only different events and outcomes had taken place. They never quite pretend that these alternative histories did occur, but they clearly often wish they had. (357-58)

Despite addressing themselves to a past that never occurred, these counterfactual works are more often than not given LC designations that place them among works of actual history. Such placement seems all the more odd if we consider that the LC system does have other categories that would better signal their status. For example, the HX806-HX811 call numbers represent Utopias, the Ideal State, and these categories often seem a far better fit for the titles Henige discusses (368). Although most of these works end up in history, a few have been correctly placed under the classification designations for fiction. That some do end up in fiction ironically boosts the apparent factual status of those fictional works that remain classified as history. Further clouding the status of these “pretend histories” is their frequent adoption of the trappings of authoritative scholarly work: the appearance of “maps, footnotes, numbers, and pictures with false captions” (362) as well as the imprint of a university press.

While Henige identifies general readers as the population at greatest risk for viewing titles bearing D-DX, E or F designations as credible and factually based, his study does address issues relevant to the creation of scholarly digital resources. Henige notes that although “guides to the LC classification scheme spend considerable time classifying history, they ignore the equally important task of defining it” (363). Similarly, building digital resources entails designing classification schemes, and it is important to make the logic of those systems transparent. Henige’s article usefully reminds us that classification is an exercise in interpretation and that users must understand the rationale and assumptions behind the interpretative processes employed in the various classificatory designations. Even a cursory look at the description of the TEI header on the Text Encoding Initiative’s website makes the link between classification and interpretation abundantly clear.

From another, less technical perspective, the feedback sought by Julia Flanders and John Melson for “Exploring Reception History in Women Writers Online” represents the type of forethought necessary for anticipating users’ needs and assumptions effectively and for creating the type of supporting contextual documents that will help lay bare the thought processes involved in creating a digital resource. In the process of discussing visionary failures of the LC classification designers, Henige points out that its originators assigned essentially the same amount of classification space to the history of Asia as they did to the history of gypsies and labels this case “the most egregious example in the D-DX (history properly speaking) classes of the failure to anticipate growth” (360). While this decision seems inexplicable, generally it can be very difficult to predict future needs and build a resource capable of growth. This difficulty is compounded by the potential of digital resources to create new perspectives and new areas of inquiry not yet imagined.

The cataloguing of “pretend histories” as actual history that Henige identifies underscores that even accepted authorities like the LC classification scheme are not infallible. A parallel to Henige’s critique, work by Jim May, Stephen Tabor and others on problems with the ESTC, has already received attention on emob, and both cases suggest a healthy dose of skepticism is often warranted even when dealing with respected and well-established resources.

Google Books Award: ESTC Receives Digital Humanities Grant

July 21, 2010

Posted on behalf of Brian Geiger, University of California, Riverside.

Brian reports:

I’m pleased to announce that Ben Pauley and I have received one of twelve inaugural Google Digital Humanities grants to match pre-1801 items in Google Books to the ESTC. The official announcement was made last week. You can read more about the grant at Inside HigherEd.

Our plan is to match as much as we can through computer matching, putting urls for Google Books in appropriate ESTC records and providing Google with ESTC ids and metadata. We don’t know for sure, but estimate that there will be between 100,000 and 200,000 ESTC-related items in Google Books. Based on matching that the Center for Bibliographical Studies and Research (CBSR) has done of records from electronic library catalogs, we should be able to computer match up to 50% of the Google records. This number could be lower than usual, however, given the truncated nature of much of the Google metadata.

The remaining 50% or so of the records we hope to put in a version of Ben’s Eighteenth-Century Book Tracker and make publicly accessible for users to help with the matching. For those of you teaching bibliography or bibliographically-minded courses next year, this could be a wonderful teaching tool, allowing your students to struggle with the complexities of early modern bibliography and learn first-hand its importance for understanding the history of the book.

We’ll update this blog about our progress with the Google Books metadata and hope to have a version of the Eighteenth-Century Book Tracker ready for use by the end of the fall or early spring.
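For readers curious what "computer matching" of truncated metadata might involve, here is a minimal sketch. It is an illustration only and not the CBSR's procedure, which the announcement does not describe. It assumes each record is a plain dictionary with "id", "title", and "year" keys (hypothetical field names), compares a possibly truncated title against the same-length opening of each candidate title, and leaves anything below a chosen threshold unmatched for human review. The function names and the 0.9 threshold are likewise assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase a title, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return " ".join(cleaned.split())


def match_score(short_title, full_title):
    """Score a possibly truncated title against the opening of a fuller one."""
    short, full = normalize(short_title), normalize(full_title)
    if not short or not full:
        return 0.0
    # Compare only as many characters as the shorter record supplies,
    # so truncation alone does not lower the score.
    return SequenceMatcher(None, short, full[:len(short)]).ratio()


def best_match(record, catalogue, threshold=0.9):
    """Return the id of the best same-year catalogue record, or None."""
    scored = [
        (match_score(record["title"], entry["title"]), entry["id"])
        for entry in catalogue
        if entry["year"] == record["year"]
    ]
    if not scored:
        return None
    score, entry_id = max(scored)
    # Below the threshold the record is left for a person to match by hand.
    return entry_id if score >= threshold else None
```

Records that come back as None in a sketch like this correspond to the remainder that the announcement proposes to open to users for manual matching.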

An update on Eighteenth-Century Book Tracker

July 20, 2010

[Edit: fixed a couple of broken links—my apologies. -bp]

I wanted to let readers of this blog know about a couple of updates at Eighteenth-Century Book Tracker that I hope will make the site a valuable adjunct for those who look for early modern books at Google Books and the Internet Archive. These changes should also make it easier for users to contribute links to the site.

For several months, between about November 2009 and March 2010, visitors to the site wouldn’t have seen a whole lot happening. During that period, rather than adding new links to the site, I was re-tooling the site’s data model in order to make things more flexible and robust; essentially, I was recreating all of the site’s content along new lines. This was not fun, but I think the results are worth it.

Bibliography: An Endangered Skill?

June 10, 2010

Recently Jennifer Howard, a reporter for the Chronicle of Higher Education, posted a query on SHARP-L asking whether bibliography was an endangered skill or art in the academy. She sought thoughts from teachers and students about this question as well as “where the field bibliography might be headed.”

Her query generated a number of responses ranging from ones that indicated bibliographic training was alive and well in the responder’s particular program to ones that indicated students’ exposure to the topic was highly dependent upon the faculty member they had for a given course or the climate within the department. That Howard added a note later that afternoon in which she clarifies what she meant by bibliography–“I’m interested in the book-history side of bibliography, not in how to prepare correct bibliographic citations”–is telling in my mind. While responses posted to the list before Howard’s clarification primarily addressed the “book-history side,” I do wonder if off-list comments suggested possible confusion about what Howard meant by “bibliography.” Bibliographic citations, annotated bibliographies, and the like are still the standard staples of what is taught in first-year writing courses and even more advanced topics. So it would seem odd, to me at least, if someone had misinterpreted her query, especially one posted on a listserv devoted to the history of the book.

Many of our discussions on emob have noted the important relationship between traditional bibliographic knowledge and electronic resources such as EEBO, ECCO, and Burney. (See for instance the discussion that emerged in the collaborative reading of Ian Gadd’s “The Use and Misuse of Early English Books Online.”) But we have not had an extended discussion about the state of bibliographic training. Rather, some comments have taken it as a given that descriptive and analytical bibliographic skills are not taught as regularly or as vigorously in graduate programs (with admitted exceptions), while others have stressed the need for such knowledge. Thus, I would like to hear more about whether and how we teach these skills in our undergraduate and graduate classrooms as well as whether students respond well to such lessons. How do colleagues respond? (One SHARP commentator made mention of “sneaking” this material into courses.) What tools and materials do people use? And what is the context or type of course(s) in which such skills are taught? Some SHARP-L responses to Howard’s query favored teaching bibliographical skills within a textual studies context, while others preferred a “book-history” context.

I have tended to use both approaches, but it depends upon the course. In methods/skills courses, I have used Oxford University’s manuscript exercise, Wilfred Owen’s “Dulce et Decorum Est.” While some students found the process of editing tedious, almost all appreciated being exposed in a hands-on way to issues they had never considered. I also use videos and the workshop materials for the hand-press book from the University of Virginia’s Rare Book School to teach bibliography from a book-history standpoint.

ASECS Summary of “Some Noisy Feedback” Roundtable, Albuquerque 3/18/10

March 27, 2010

ECCO, EEBO, and the Burney Collection: Some “Noisy Feedback” Roundtable

Chair: Anna Battigelli (SUNY Plattsburgh)   Panelists: Sayre Greenfield (University of Pittsburgh, Greensburg), Stephen Karian (Marquette University), James E. May (Penn State University—DuBois), Eleanor Shevlin (West Chester University), Michael Suarez (Rare Book School, University of Virginia).  Respondents: Jo-Anne Hogan (ProQuest), Brian Geiger (ESTC, University of California, Riverside), and Scott Dawson (Gale/Cengage).

The following offers a summary of the roundtable that took place on Thursday, March 18, 2010, at the ASECS 2010 conference in Albuquerque, N.M.  This session was the second part of a two-part series, the first part having been a roundtable discussion chaired by Eleanor Shevlin at the EC/ASECS meeting in Bethlehem, Pa., in October 2009.  Copies of Eleanor’s summary of the EC/ASECS session (published in the Eighteenth-Century Intelligencer and also on this blog) were distributed at the outset of this session.  Many thanks to the members of the audience who so cheerfully presented themselves at an early hour on the conference’s first day.

Sayre Greenfield opened discussion with detailed working solutions to problems caused by ECCO’s OCR (optical character recognition) software.  He recommended that Gale provide an ECCO OCR troubleshooting page on their web site and noted that blogs like this one would be sure to start that process (see below).  Aided by Deidre Stuffer, he found ways to correct for errors stemming from the following letter combinations that OCR typically mistranslates: s, ss, and ct.  Using the word, fishmonger, he substituted for the s every other letter, then substituted numbers, and finally the wildcard question mark.  Advice from his search results, including how best to use the question mark as a wildcard, can be found on the ECCO OCR Troubleshooting Page on the “Pages” section of this blog.  He warned that using the question mark for any medial or initial s is problematic if one is using variables elsewhere, adding that ECCO does not allow wildcards for the first letter of a word.  Additionally, letters surrounding the s seem to affect how the OCR reads the s.  The double ss, for example, frequently morphs into fl, transforming passion into paflion. Word searching within a text also proved problematic.  Though he found 32 instances of passion or passions when he read John Tottie’s A View of Reason and Passion, his electronic search using passion* yielded only half of these.  Turning to ct, he found that OCR often reads ct as t, so that objection becomes objetion.  These results suggest that ECCO would help users by strengthening its web site, which currently recommends fuzzy searches to address OCR problems.  Fuzzy searches create too many false positive results.  Including a more robust help page on this issue is necessary.  (For now, see Sayre’s ECCO OCR Troubleshooting Page on this blog.)
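
The substitutions Sayre describes can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a tool offered by Gale or by the panel: the function name `ocr_variants` is hypothetical, the rules cover only the three misreadings reported above (ss read as fl, ct read as t, and a single s replaced by the question-mark wildcard), and ECCO's actual query syntax may handle these strings differently. The first letter is never replaced, since ECCO is reported not to allow a wildcard there.

```python
def ocr_variants(word):
    """Return search strings that anticipate the OCR misreadings noted above.

    Hypothetical helper: the rules below are taken from the roundtable
    summary, not from any ECCO documentation.
    """
    variants = {word}
    # Double s is often read as "fl" (passion -> paflion).
    if "ss" in word:
        variants.add(word.replace("ss", "fl"))
    # "ct" is often read as "t" (objection -> objetion).
    if "ct" in word:
        variants.add(word.replace("ct", "t"))
    # Replace each non-initial s with the single-character wildcard "?".
    # Position 0 is skipped because a leading wildcard is reportedly rejected.
    for i, ch in enumerate(word):
        if ch == "s" and i > 0:
            variants.add(word[:i] + "?" + word[i + 1:])
    return sorted(variants)


for term in ("fishmonger", "passion", "objection"):
    print(term, "->", ocr_variants(term))
```

Each variant would still have to be run as a separate search and the results merged by hand, which is part of why a dedicated help page seems worthwhile.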

Steve Karian began by acknowledging the indispensability of ESTC for bibliometrics, but he also identified four problems that need to be addressed if the ESTC is to become the powerful tool it can be for the twenty-first century.  The first is the ESTC’s unit of measurement: the ESTC record.  Users often equate an ESTC record with an imprint, title, edition, or an issue.  Because of variations in the correlation of record to item, one cannot simply assume that two parallel sets of search “hits” can be compared reliably.  As he puts it, “one is constantly comparing apples to oranges.”  Additionally, field records vary, limiting or complicating the kinds of searches that can be done.  These need to be standardized if searching is to become reliable.  The two ESTCs—one at UC-Riverside, the other at the British Library—use the same data but different interfaces.  Dates are complicated because they appear in two MARC (Machine-Readable Cataloguing) fields.  Steve recommended deleting the MARC record entirely and replacing it with a new database structure, one designed to expand and grow.  He called for a new stage of innovation, allowing the ESTC to transform itself from a bibliographical catalogue into a bibliographical database.  Only through such a transformation will the ESTC become the powerful tool it promises to be.

Jim May discussed the Burney Collection, which he argued should be called the Burney Collection of Newspapers, Periodicals, and Other Printed Matter.  Its material was first collected by Charles Burney, subsequently increased by the British Library, and eventually microfilmed before being turned over to Gale/Cengage.  It includes material dating back to the 1620s and beyond 1800 and material printed in Barbados, India, Ireland, and North America.  Citing James Tierney’s comments at the Bethlehem meeting, Jim noted that the collection includes 237 newspapers and 161 periodicals, 60 of which are partially available in Adam Matthew’s Eighteenth-Century Journals series or ProQuest’s British Periodicals.  Burney allows one to read an entire issue or study issues by year or month, and it offers searching, though this is problematic.  According to Jim’s results, searching sometimes yields only 10% of the relevant items.  Searching for “Tatler” between 1708 and 1712 yields 80 hits.  Though he has found hundreds of advertisements of Smollett’s Continuation of the Complete History of England, only a few of these can be found through an electronic search.  Similarly, only a third or fewer of The London Evening Posts published 1760-61 turn up when you search for “London Evening”.  Robert Hume and Ashley Marshall have an essay forthcoming in Papers of the Bibliographical Society of America discussing Burney and noting, among other problems, how definite and indefinite articles interfere with searches.  Jim also cited Simon Tanner’s article in D-Lib Magazine (July/August 2009), which found the following accuracy rates for Burney: character 75%, word 65%, significant word 48.4%, capitalized word 47.4%, and number 59.3%.  The magnification feature enlarges pages by 100% and would be more useful if it magnified by 33%.  Spread dates are misrepresented, due to the lack of editorial apparatus explaining when newspapers were actually issued.
Burney’s lack of editorial apparatus, cross references, comments, and so forth is a deficit.  Having a scholarly editor, perhaps through a graduate student or postdoc internship, would improve its utility.  Also needed is a review of the entire database.  A page dedicated to errors encountered by users would help, something EEBO is now working on with its “EEBO Interactions, A Social Network.”
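
To see why the word accuracy Jim cited from Tanner’s article sits below the character accuracy, a rough sketch may help. It is not the method used in Tanner’s study: it simply aligns a correct transcription with an OCR reading using Python's standard `difflib` module and reports the share of characters, and then of whole words, that survive. The function names and the sample strings are made up for illustration, borrowing the misreadings Sayre reported for ECCO.

```python
from difflib import SequenceMatcher


def match_rate(truth, ocr):
    """Share of the items in `truth` that can be aligned with items in `ocr`."""
    if not truth:
        return 1.0
    matcher = SequenceMatcher(None, truth, ocr, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(truth)


def character_accuracy(truth, ocr):
    # Compare the two strings character by character.
    return match_rate(truth, ocr)


def word_accuracy(truth, ocr):
    # Compare whitespace-separated words; one wrong letter spoils the word.
    return match_rate(truth.split(), ocr.split())


# Illustrative strings only, not taken from Burney or ECCO.
truth = "a view of reason and passion"
ocr = "a view of reafon and paflion"

print(f"character accuracy: {character_accuracy(truth, ocr):.1%}")
print(f"word accuracy:      {word_accuracy(truth, ocr):.1%}")
```

Because a single misread letter makes the whole word unfindable, the word score falls faster than the character score, and it is the word score that governs what a keyword search can retrieve.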

Eleanor Shevlin identified three pressing needs: 1) fostering greater awareness of the context of texts; 2) encouraging collaboration among users; and 3) cultivating greater access to these electronic resources.  She pointed to the need for bibliographical training in order to use these resources accurately and called for an examination of the cognitive effects these tools have on research processes.  Specifically, she wondered how EEBO’s TCP transcriptions or ECCO’s searching mechanism affects research methodology.  Noting that these tools provide opportunities to correct bibliographical inaccuracies, she urged the need for a more standardized process through which corrections could be forwarded to the ESTC or to commercial databases.  She also cited examples of productive collaboration among members of the bibliographic community, including her own experience correcting an error in Kansas’s Spencer Research Library, a correction made possible by sending ECCO’s image of the British Library’s copy of a text to Kansas.  Finally, she noted that access continues to be a problem.  Scholars in the U.S. work at a notable disadvantage compared to scholars in the U.K., who typically have access to ECCO and ECCO II through the Joint Information Systems Committee (JISC).  ASECS President Peter Reill’s recent calls for feedback regarding access suggest that the issue is at least on the radar of those who can help, either through negotiations for large-scale access or individual subscriptions.

Michael Suarez warned against the illusion of comprehensiveness in database searches.  Users are frequently unaware of what is missing in these databases, and the databases’ selectivity impoverishes word searches as tools for analysis.  Turning to the task of text-mining, he expressed skepticism regarding the mentalities of mining.  Where sustained engagement with individual texts allows for work linking texts to their culture and to other texts, textual extraction can produce radically decontextualized results.  Because these database tools are easy to use, we are, he warned, insufficiently uneasy with what they actually accomplish.  Suarez insisted that textual analysis demands an effort to fuse horizons between text and reader, a fusion that involves a reader’s deep engagement with a text’s historical context and with a text’s relationship to other texts.  Such contextualization, as James Boyd White would agree, is essential to a functional and robust literary hermeneutics.  Additionally, text-mining tools encourage scholars to work in even greater isolation, away from libraries and other scholars.  Precisely because the digital future will change the way we think, Suarez called for a greater bibliographical literacy in order to make these promising tools work properly.

Panelists’ Responses:

Jo-Anne Hogan (ProQuest) agreed with Michael’s concern regarding the impact of these digitization projects.  She added that EEBO routinely receives emails pointing out errors, asking for missing items, and making recommendations, and that it works to incorporate these suggestions.  But she also noted a growing digital divide: concerns voiced at conferences like ASECS differed from those at conferences on the digital humanities.  At the latter, attendees ask EEBO to produce more tools for text-mining.  It is sometimes difficult to reconcile the competing requests received.  Money matters in these issues, and will always be a factor.  She agreed that more could be done to align the bibliographic data in EEBO with that in the ESTC and pointed out that efforts are under way to make that happen.  She also introduced the prospect of a social networking site for EEBO intended to facilitate communication between scholars and users so corrections can be reported and more contextual information can be made available.  We hope to hear more from her about this on this blog in the near future.  She concluded that access continues to be a concern, agreeing with Eleanor that it is unfortunate not to have a model for broad access in the U.S.  Personal subscriptions seem unlikely because such subscriptions cannot cover costs, at least not at subscription rates individuals are willing to pay.  She hoped there might be a point in the future when ProQuest can provide broader access, but she could not guarantee such a thing.  More promising is the prospect that about half of the books in EEBO will soon be available for purchase at reasonable rates via Print on Demand.

Scott Dawson (Gale) agreed with Sayre’s suggestion that a Help screen dedicated to OCR problems is an idea to consider seriously.  He added that Gale would look into post-OCR checks that might correct results.  18thConnect will help by testing new OCR software on ECCO page images, and that might solve problems.  Turning to Steve’s comments about ESTC, Scott noted that ECCO depends on ESTC for metadata, and that Gale is working with ESTC to add a link within the ECCO Full Citation to report problems with a given record.  He agreed with Jim May that Burney presents additional obstacles to getting accurate OCR results.  Gale has been working with the British Library to resolve the issue of spread dates and hopes to have an update in the next few months.  On the issue of access raised by Eleanor, Scott mentioned that ECCO is concerned about the issue, but that by providing access to more than 500 institutions globally, it has helped make early modern printed material more accessible than is possible through hard copy or microfilm.  Tiered pricing and consortia-designed contracts help non-ARL institutions find ways to subscribe to ECCO.  He agreed with Michael Suarez that ECCO is incomplete, even with the 50,000 titles added through ECCO II.  Gale is not planning an ECCO III.  But the possibility of linking missing titles to ECCO is being considered.

Brian Geiger (ESTC) outlined two main areas of work at the Center for Bibliographical Studies and Research (CBSR), which manages the North American branch of the ESTC.  First, they continue to upgrade and add records to the ESTC.  They are processing OPAC extracts from libraries, and recently began on an extract from Oxford University that resulted in some 200,000 records that will be matched against the file.  These OPAC extracts provide shelf marks (or call numbers) for existing items, and have turned up tens of thousands of new copies and hundreds of entirely new items.  They are adding URLs from online collections.  EEBO, ECCO and TCP are matched, though not yet displayed by the public version at the British Library.  Brian has requested URLs from Google and will do the same from Internet Archive.  They are digitizing title pages from paper reports submitted over the last two decades and will attach those images to the appropriate records, allowing users to compare a title page to its MARC record.  They hope to have many of the title pages in the ESTC by 2011.  And they have enhanced some 180,000 MARC records from title pages in ECCO.  Second, the ESTC has started to assess how to transform the project from an online catalog to a flexible and interactive database-driven research tool.  Brian corroborated Steve Karian’s assessment that this new resource should be built on relational databases, and noted with appreciation the value of the kind of collaborative thinking Steve offered about the project’s future.  Brian emphasized that a number of partner projects and institutions should be involved in the redesign, to ensure that the new project meets a variety of user needs and to try to plan for the sharing of information across platforms.  He mentioned some of the features that he thought should be included, among them user editing of bibliographic data and metadata and tools to send information to users about updates or changes to records.
He ended by pointing out that development of the database will require resources and the next stage of the ESTC’s evolution will be contingent on funding.  The ESTC is currently engaged in grant development.  It will be in a better position to discuss specific solutions once funding is secured.