Summary of EC/ASECS Roundtable: Bibliography, the ESTC, and 18th-Century Electronic Databases

October 24, 2009

Bibliography, the ESTC, and 18th-Century Electronic Databases:  A Roundtable

Chair: Eleanor F. Shevlin (West Chester University)   Participants: James E. May (Penn State University—DuBois), James Tierney (University of Missouri—St. Louis), David Vander Meulen (University of Virginia), Benjamin Pauley (Eastern Connecticut State University), Brian Geiger (ESTC, University of California, Riverside), and Scott Dawson (Gale/Cengage).

The following offers a summary of the roundtable that took place, Saturday, October 10, 2009, at the EC/ASECS 2009 conference hosted by Lehigh University and held at Bethlehem, Pennsylvania, October 8-11, 2009.

Jim May opened the roundtable, and his remarks highlighted and extended the discussion he offered in his essay, “Some Problems in ECCO (and ESTC),” in The Eighteenth-Century Intelligencer, 23.1 (Jan. 2009), the article that inspired this session and Anna Battigelli’s forthcoming roundtable at ASECS (March 18th, 9:45 am to 11:15 am). Key issues Jim raised included the need to correct missing images, to address the “disappearance” of letters originally printed in red ink on title pages, and to bring the ESTC up to date. In addition, he noted that ECCO’s electronic index is not always representative of what is actually there digitally. Work is also needed on providing or revising information about subscription lists, textual history, and attributions in ESTC. While noting that he had already addressed problems with Burney in his article in The Eighteenth-Century Intelligencer, 23.2 (May 2009), and that Jim Tierney would be discussing this tool next, Jim commented on the usefulness of Burney, particularly to those working on the history of a publication.

Turning to the Burney collection, Jim Tierney drew attention to the potentially confusing name for this electronic collection because it is not by any means restricted to newspapers. Instead, it includes a good number of periodicals as well. Specifically, the collection consists of 237 newspapers and 161 periodicals, and, furthermore, some of the titles included are neither newspapers nor periodicals. That the Burney digitized collection follows the Anglo-American cataloguing procedure of creating a new entry every time a newspaper undergoes a title change results in the illusion of more titles than actually exist as well as confusion about the history of a given newspaper. Jim also provided a detailed handout (posted here as a page) listing the digitized periodicals (note: not newspapers) in Burney. The handout includes notes about missing issues, other locations where titles in Burney can be found, and a tentative list of Burney titles duplicated by other digitization projects. The two overarching points Jim made were the failure to have scholars involved in the planning of Burney and other digitization projects and the need for far greater collaboration among the creators/purveyors of these databases, librarians, and scholars. That given titles in Burney often include only a few issues when other issues were available elsewhere and, if digitized, would have approached a more complete run, exemplifies the need for far better coordination and collaboration.

While David Vander Meulen serves on the ESTC board, his remarks for the roundtable were offered in his role as a researcher and user of these tools. He began by noting that ESTC is an evolving tool, a work in progress, and that ECCO follows ESTC. Moreover, although incomplete, the ESTC is still “functional and valuable” even as it progresses. Nonetheless, “any addition to ESTC will change the context.” An important development occurred in 2006 when the British Library initiated free access to this tool. As for problems, the ESTC had made the decision to truncate titles and places. Yet ECCO generally offers the full titles, while expanded locations can occasionally be found by going to public library catalogues. To improve these resources, David explained, we need to have an easier way to convey corrections to the British Library or University of California Riverside (the North American home of the ESTC) and, equally important, an ongoing staff to process editorial changes and comments. In discussing this need for a means of processing updates, David also raised the question of whether the uncontrolled notes field should be visible. Unfortunately, agencies that have funded the ESTC, as he explained in his closing remarks, have decided the project is complete. Obviously, given ESTC’s status as a work-in-progress, such a decision presents additional problems for continued updating and correcting.

Ben Pauley spoke next about a project he has initiated. He began by noting the lack of access that many institutions (and thus their scholars and students) have to paid databases such as EEBO and ECCO. Both the Internet Archive and Google Books, however, have a number of eighteenth-century books in their freely accessible databases. Yet it is typically very hard to identify properly what text one has accessed. Viewing these freely available texts as an opportunity, Ben established The Eighteenth-Century Book Tracker, a project in which he is supplying the bibliographic data so sorely lacking in eighteenth-century texts found in Google Books. Doing so has compelled him to become a textual scholar or an “accidental bibliographer.” Thus far, he has recorded about 150 copies not appearing in ESTC. At present, the project features 480 texts and 4 periodicals. Ben has been asked to write an article on the Eighteenth-Century Book Tracker for The Eighteenth-Century Intelligencer that will detail much more about his undertaking.

Speaking as the Associate Director and Resident Manager of the Center (University of California Riverside), the North American home of the ESTC, Brian Geiger explained that the British Library’s ESTC role has focused on cataloguing its own collection and that the University of California Riverside has handled everything else. In addition to reiterating points about the problem with truncated titles, he also discussed the lack of subject headings as a shortcoming. Turning to the digital surrogates of early modern imprints, he explained that the ECCO and Adam Matthew collections are based on ESTC, but EEBO is not. Next Brian addressed the need to foster better communication between ESTC and scholars. While the channels of communication between ESTC and librarians have remained strong, that has not been the case with scholars. Like Ben, Brian will also be writing an article on the ESTC for The Eighteenth-Century Intelligencer.

 Scott Dawson from Gale-Cengage concluded the presentations by roundtable panelists. He first supplied an historical overview of ECCO and Burney. In 1982 Research Publications began to microfilm the “Eighteenth Century” microform collection. By 2002 twenty-six million pages of eighteenth-century titles had been filmed. This microfilm collection is the basis for ECCO, but using the ESTC in conjunction with the microfilm has been overall a real plus for the project.  ECCO II, released at the start of this year, features 50,000 additional titles. By mid 2010 ECCO II, representing holdings from fifteen libraries, will be completed (titles from the Harry Ransom Center are still being prepared). ECCO and ECCO II, combined, will have made 185,000 eighteenth-century titles available to subscribers. As for the digitization of Burney, that project was handled by the British Library and not Gale-Cengage. Scott also addressed some of the problems that can and cannot be corrected. When pages are blurred, for instance, the microfilm plays a key role in what can be done. If the microfilm is clear, then the page is re-filmed. Yet if the problem occurred because the page is blurred in the microfilm, then, from the perspective of Gale, nothing can be done. When duplications of a title are discovered, however, the duplications can be deleted. 

After all six panelists had offered opening statements, the discussion was opened to the audience’s questions and comments. The point perhaps most stressed in the discussion with the audience was a need for far greater involvement by scholars in the creation and improvement of digital resources. In terms of updating or correcting resources, questions arose about how this might be done and what types of controls are needed. In subsequent discussions, the creation of advisory boards and (or) the involvement of a committee representing ASECS arose as possible avenues for communicating and addressing the scholar’s perspective more effectively. The establishment of an advisory board and/or ties with ASECS could play a vital role in future projects, and members of a board or ASECS committee could also devise potential solutions to some of the shortcomings with existing tools.  The resurrection of Factotum, the now defunct ESTC news publication of the British Library (ceased with issue no. 40 in 1995), or the initiation of a similar publication would be a way of establishing regular, ongoing communication with a broader base of scholars. (For those interested in the content of previous issues, see the index for Factotum.) Of course, an obstacle here is staffing and funding. Questions also arose about plans to make Burney more complete by digitizing issues not included for a particular newspaper or periodical title but available elsewhere. Yet that this digitization project had been undertaken by the British Library (see final report) and not Gale complicates the issue. Also, when asked about any plans for an ECCO III, Scott explained that the creation of ECCO II caused surprise among many libraries that had purchased ECCO because they believed that ECCO was complete at the time. When ECCO II was introduced for purchase, libraries were promised that there would not be any additional forms of ECCO.  
(Depending on the discovery of additional eighteenth-century titles, however, I see no reason that another collection could not be pursued; if enough material for another collection becomes available, then scholars need to insert and assert themselves in conversations with vendors and librarians and make the need and value of a third collection known.)

Another very real, pressing concern was the large number of scholars who do not have access to these databases and whose institutions are unlikely to be able to afford these resources even in the future. The point was raised that all universities in the U.K. have access to ECCO and ECCO II for an annual hosting fee through the auspices of the Joint Information Systems Committee (JISC), “established by the UK further and higher education funding councils in 2006 to negotiate with publishers and owners of digital content.” Because the situation differs greatly in the U.S. (we have no higher education government council overseeing all our universities), we do not have such a prospect here. While Ben Pauley’s Eighteenth-Century Book Tracker promises to bring some order to the current anarchy that characterizes freely available eighteenth-century texts, his valuable project can’t and won’t solve the inequity of access in the United States.


Are ECCO and Burney Classroom Necessities?

September 16, 2009

We all know how indispensable these text-bases have become to eighteenth-century research.  The question many of us face from skeptical librarians controlling acquisitions budgets is whether these text-bases are crucial to undergraduate teaching.  It seems to me that a strong case can be made that the existence of these text-bases changes the nature of what can be taught in the classroom.  Whether we look at large century-spanning text-mining projects such as Matthew Wilkens’s study of parts of speech and allegory or the three very targeted assignments recently described by Laura Rosenthal, Eleanor Shevlin, and Dave Mazella on The Long Eighteenth, these text-bases  make new kinds of assignments possible.  Are these new kinds of assignments tied to a new kind of reading?  Is there a new kind of learning that can now take place in the classroom, and if so, is it an important kind of learning?

Many of us mounting arguments on behalf of acquiring these text-bases would be interested in hearing readers’ responses to these questions.  Have these text-bases become essential, or do they merely contribute to an alternative but no more important kind of learning experience than what the classroom offers without them?

Burney database now at the Library of Congress

September 8, 2009

The Library of Congress has now obtained the “17th – 18th Century Burney Collection Newspapers” database.

It also has the following electronic resources:

  • 19th Century British Library Newspaper Collection
  • 19th Century UK Periodicals
  • British Periodicals
  • ECCO, Part I and II
  • EEBO (at long last, but not the Text Creation Partnership searchable part)
  • and plenty of fine American stuff

Abby Yochelson, a Humanities Librarian at the LC, noted, “Sometimes it’s tricky to find the listing for the database if it starts with 19th because it can be listed as 19th or Nineteenth, but generally not both. Do a keyword search on other parts of the title!”

    my new Jane Austen course: UPDATE

    August 23, 2009

    Since Anna requested this, I’m letting people take a peek at my course-blog syllabus for my Jane Austen and the Undergraduate Novel Course for the next few days; I’ll have to shut down access after that, as soon as students begin having their discussions.  I’m still working on the blog, but the syllabus and resource page should give you at least an idea of what I’m up to.  I expect I’ll build some of the Burney assignments into their weekly blogging assignment.

    Any thoughts, suggestions?


    on the uses of newspapers, in and out of the classroom (updated)

    August 19, 2009

    I found this post from Rachel at  A Historian’s Craft (via Carnivalesque 52) a while back, and thought it would be a useful way to discuss the Burney collection and its potential for the classroom.  Frankly, since I had already spent part of the summer reading Scottish newspapers in Edinburgh, I was very interested in what Rachel had to say about the best ways to plow through such materials.

    I think the best advice in Rachel’s post is to prepare a list of themes or events to use while browsing, since it’s so easy to get lost in the columns and columns of details.  This would be especially important for students, if you expected them to find anything relevant to a particular novel.

    I also agree with Rachel that the letters and advertisements in newspapers are probably the most interesting to us as researchers, because they are the most human, least standardized elements of a very standardized medium.  They provide a period flavor to readers that other parts of the paper do not, largely because they contain such a concentration of “everyday life” and its unspoken/barely spoken assumptions.  I suspect that for a novel class, these would often be the most important parts.

    Since I got access to the Burney, I’ve been playing around with the keyword searching, figuring out the types of assignments that would work best for my Austen and her Predecessors novel course, and this is what I’m thinking:

    • keyword searching in newspapers works really well for author/work information, since it is mostly contained in advertisements.  I’d pair this up with the Oxford Dictionary of National Biography, to see if students could compare the publication information they find in the newspapers with what they find in the bio.
    • advertisements also yield good contextual clues for everyday products or practices unlikely to be fully glossed.  So, for example, I found some good ads for “masquerades” and “masquerade-makers” that would be useful for readers of Fantomina.  Students are probably best off getting these kinds of keywords assigned to them, at least initially.  I’d pair this exercise with a period dictionary, to see if the terms coincide or diverge.
    • I think historical events, if they could be named with some precision, could be usefully glossed using the Burney.  Unfortunately, many of the novels that we’re reading (Haywood and Davys, for example) are less interested in such “realism,” though that of course makes for another point of entry into a discussion of such issues as realism.  And I’d endorse prefacing any use of the Burney with a discussion of realism and the critical debates surrounding its “rise,” including the Campbell article, I suppose.
    • A more general way to approach this kind of historicization, though, would be to assign students the task of finding the first advertisement of the assigned novel, then browsing the issue of the newspaper in which it occurred, to see what historical events, political debates, etc. are occurring at the moment of its first appearance.  If you were doing this, you would be facing a “stump the prof” style exercise unless you were fully prepared before they undertook their researches (not a bad thing, actually).  It would be interesting to compare their newspapers’ versions of that year with a typical scholarly chronology, and discuss the differences.
    • It would also be useful to see if you could get students to find real-world analogues to situations in the novels, but this would take some experience and direction, I think.  It might also work better if teachers found such an analogue ahead of time, and used it for discussion.
    • Overall, the effect of the Burney searches is pointillistic: you get details, very much embedded in local contexts, without much explanation of their significance.  So the kind of general question that a student might have, like, “why doesn’t Fantomina get married at the end?” will not get addressed by this kind of research activity.  But it would be interesting to see how one could use this resource to investigate the multifacetedness of eighteenth-century marriages, for example.  This would require a series of directed prompts, I think.
    • As I read over my bullet-points, I’m noticing that the best uses of Burney would entail pairing it up with other kinds of resources (ODNB, dictionaries, chronologies, etc.) so that students could follow up on what they found in Burney with additional information.

    So these are some of my initial reactions.  What do the rest of you think?


    Trial Access for Burney Collection and Search Methods

    August 12, 2009

    Gale/Cengage has generously agreed to offer a free trial of the Burney Collection for readers of this blog.  This provides us with an opportunity for an open discussion of the Burney Collection’s merits, both as a scholarly resource and as a pedagogical tool.

    In preparation for the two sessions on digital text-bases, it would be interesting to hear more about how users search Burney.  Search results can be overwhelming and show the need for the Library of Congress cataloguing and classification system to help categorize and make sense of the wealth of data that emerges from any given search.  Thomas Mann, a Reference Librarian at the Library of Congress, has a still-useful 2005 discussion on the limits of computerized searching for research.  Mann’s site might be particularly helpful in discussing computerized searching with students.  His example is that the 11,000,000 results for the word “Afghanistan” are unclassified, whereas under the LC system, they are neatly parsed into “Antiquities,” “Bibliography,” “Biography,” “Boundaries,” “Civilization,” and so forth.  So the argument in favor of LC classification and cataloguing is clear.

    On the other hand, it would be foolish to overlook the value of non-classified search results.  Matthew’s post on machine reading makes clear the value of understanding more about what computers can do.  But how best to search Burney isn’t necessarily clear from the outset.  It would be very interesting to hear more about how individuals use search methods within ECCO, EEBO, and particularly Burney.  We are grateful to Gale/Cengage for making this collective review possible.

    Digital Textbases and Optical Character Recognition (OCR)

    July 16, 2009

    Experienced users of ECCO know about the limits of its full-text capability. The long s in eighteenth-century fonts is one of many peculiarities that can wreck an automated effort at optical character recognition (OCR). Though I’m grateful that I can search ECCO and other databases using full text, I often wonder how complete my search is. I usually get a sense of how many false hits I find, but how many true hits am I missing? How accurate are the full-text capabilities of these resources?

    A recent article presents a method for assessing the accuracy of OCR using the British Library’s 19th Century Newspaper Project as a case study:

    Simon Tanner, Trevor Muñoz, and Pich Hemy Ros, “Measuring Mass Text Digitization Quality and Usefulness: Lessons Learned from Assessing the OCR Accuracy of the British Library’s 19th Century Online Newspaper Archive,” D-Lib Magazine 15.7/8 (2009).

    This is available at:

    The article briefly mentions Gale’s Burney newspapers project. One of the good points in this article concerns how we should measure accuracy:

    Given a newspaper page of 1,000 words with 5,000 characters if the OCR engine yields a result of 90% character accuracy, this equals 500 incorrect characters. However, looked at in word terms this might convert to a maximum of 900 correct words (90% word accuracy) or a minimum of 500 correct words (50% word accuracy), assuming for this example an average word length of 5 characters. The reality is somewhere in between and probably more at the higher extent than the lower. The fact is: character accuracy of itself does not tell us word accuracy nor does it tell us the usefulness of the text output. Depending on the number of “significant words” rendered correctly, the search results could still be almost 100% or near zero with 90% character accuracy.

    The term “significant words” refers to words that users are likely to search for, in contrast to function words (pronouns, prepositions, etc.). A textbase’s accuracy in terms of “significant words” is an appropriate yardstick for how useful its full-text search is.
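The word-accuracy bounds in the quoted passage can be checked with a short calculation. The sketch below is only an illustration of that arithmetic: the function name and parameters are hypothetical (they do not come from the article), and it assumes, as the quotation does, that every word has the same average length.

```python
import math


def word_accuracy_bounds(total_words, avg_word_len, char_accuracy):
    """Return (wrong_chars, worst_case, best_case) word accuracy.

    Illustrative helper, not taken from the D-Lib article. It assumes
    every word has exactly avg_word_len characters.
    """
    total_chars = total_words * avg_word_len
    wrong_chars = round(total_chars * (1 - char_accuracy))

    # Best case: the wrong characters cluster together, so they spoil
    # as few whole words as possible.
    fewest_wrong_words = math.ceil(wrong_chars / avg_word_len)

    # Worst case: every wrong character falls in a different word.
    most_wrong_words = min(wrong_chars, total_words)

    best_case = (total_words - fewest_wrong_words) / total_words
    worst_case = (total_words - most_wrong_words) / total_words
    return wrong_chars, worst_case, best_case


# The example from the quotation: 1,000 words, 5 characters each,
# 90% character accuracy.
wrong, worst, best = word_accuracy_bounds(1000, 5, 0.90)
print(wrong, worst, best)
```

For the figures in the quotation this gives 500 incorrect characters and a word accuracy between 50% and 90%, which is why character accuracy alone says little about how reliable a full-text search will be.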

    The full article merits reading. The authors found that for significant word accuracy, the 19th Century Newspaper Project was 68.4% accurate and the Burney Newspapers was 48.4% accurate. Eighteenth-century newspapers can be astonishingly difficult to read even in the originals, so this low percentage is not that surprising. I suspect that ECCO is somewhere in between these two percentages.

    Roundtable Discussion at EC/ASECS 2009

    June 30, 2009

    EC/ASECS conference, Bethlehem, Pennsylvania, 8-11 October, 2009, hosted by Lehigh University.

    Bibliography, the ESTC, and 18th-Century Electronic Databases:  A Roundtable

    Inspired by James May’s recent essay, “Some Problems in ECCO (and ESTC),” in The Eighteenth-Century Intelligencer (23.1 [Jan. 2009]), this roundtable will examine current bibliographic shortcomings found in ECCO, the Burney Collection of 17th and 18th Century Newspapers, and the ESTC and will explore ways that scholars and the managers of such databases could join forces to help solve these problems and improve these tools. Each participant will offer a 5 to 8-minute opening statement, and ample time will be allowed for audience involvement in the discussion. Offering an east coast forum, this roundtable will follow on the heels of a similar roundtable that will be taking place at the Huntington when the International ESTC board meets this September. In addition, “ECCO and EEBO: Some ‘Noisy Feedback’”, an ASECS 2010 roundtable organized by Anna Battigelli, will offer a “part-two” to this EC/ASECS session.

    Chair: Eleanor Shevlin (West Chester University)

    Participants: James E. May (Penn State University—DuBois), James Tierney (University of Missouri—St. Louis), David Vander Meulen (University of Virginia), Benjamin Pauley (Eastern Connecticut State University), Brian Geiger (ESTC, University of California, Riverside), Scott Dawson (Cengage-Gale).

    This blog, Early Modern Online Bibliography (EMOB), offers an excellent opportunity for exchange and discussion in advance of these roundtables.

    Tentative Bibliography of Articles Pertaining to Early Modern Online Text-bases

    June 19, 2009

    There are a number of excellent articles on online text-bases, some of them online.   Below is a preliminary list of items.  As additional entries are received, they will be entered in the bibliography listed under  the “Pages” link on the blog’s home page.  Please refer to that link for the most updated version of the bibliography.


    Robin C. Alston, “The History of ESTC,” The Age of Johnson: A Scholarly Annual 15 (2004), 269-329.

    Hugh Amory, “Pseudodoxia Bibliographica, or When is a Book Not a Book? When It’s a Record,” in The Scholar & the Database: Papers Presented on 4 November 1999 at the CERL Conference Hosted by the Royal Library, Brussels, ed. Lotta Hellinga, 2 (2001), 1-14.

    Kevin Berland, “Formalized Curiosity in the Electronic Age and Uses of On-line Text-Bases,” The Age of Johnson 17 (2006), 392-413.

    Peter W. M. Blayney, “The Numbers Game: Appraising the Revised STC,” Papers of the Bibliographical Society of America 88:3 (1994), 353-407.

    Peter Damian-Grint, “Eighteenth-Century Literature in English and Other Languages: Image, Text, and Hypertext,” A Companion to Digital Literary Studies, ed. Susan Schreibman and Ray Siemens. Oxford: Blackwell, 2008.

    Marilyn Deegan and Simon Tanner, “Conversion of Primary Sources,” A Companion to Digital Humanities, ed. Susan Schreibman and Ray Siemens. Oxford: Blackwell, 2008.

    Gabriel Egan and John Jowett, “Review of the Early English Books Online (EEBO),” Interactive Early Modern Literary Studies (January 2001), 1-13.

    Alan B. Farmer and Zachary Lesser, “Early Modern Digital Scholarship and DEEP: Databases of Early English Playbooks,” Literature Compass Online:

    Kevin Franklin and Karen Rodriguez’G, “The Next Big Thing in Humanities, Arts and Social Science Computing: 18thConnect,” HPCWire (November 24, 2008), 3 pp. Humanities_Arts_and_Social_Science_Computing_18thConnect_35010199.-html 

    Ian Gadd, “The Use and Misuse of Early English Books Online,” Literature Compass Online:

    Sayre Greenfield, “ECCO-Locating the Eighteenth Century,” The Eighteenth-Century Intelligencer N.S. 21:1 (Jan. 2007), 1-9.

    Robert D. Hume, “The ECCO Revolution,”

    William A. Jackson, “Some Limitations of Microfilm,” Papers of the Bibliographical Society of America 35 (1941), 281-88.

    George Justice, “The ESTC and Eighteenth-Century Literary Studies,” Literature Compass Online:

    Diana Kichuk, “Metamorphosis: Remediation in Early English Books Online (EEBO),” Literary and Linguistic Computing 22:3 (2007), 291-303.

    Thea Lindquist and Heather Wicht, “‘Pleas’d By a Newe Inuention?’: Assessing the Impact of Early English Books Online on Teaching and Research at the University of Colorado at Boulder,” The Journal of Academic Librarianship 33:3 (2007), 347-60.

    Shawn Martin, “EEBO, Microfilm, and Umberto Eco: Historical Lessons and Future Directions for Building Electronic Collections,” Microform & Imaging Review 36:4 (2007), 159-64.

    Shawn Martin, “Digital Scholarship and Cyberinfrastructure in the Humanities: Lessons from the Text Creation Partnership,” Journal of Electronic Publishing 10:1 (2007),

    Shawn Martin, “Collaboration in Electronic Scholarly Communication: New Possibilities for Old Books,” Journal of the Association for History and Computing 9:2 (2006),

    Shawn Martin, “Reaching Out: What do Scholars Want from Electronic Resources?” Proceedings of the Association for Computing in the Humanities, (2005),

    James May, “Some Problems in ECCO (and ESTC),” The Eighteenth-Century Intelligencer N.S. 23:1 (Jan. 2009), 20-30.

    James May, “Accessing the Inclusiveness of Searches in the Online Burney Newspapers Collection,” The Eighteenth-Century Intelligencer N.S. 23:2 (May 2009), 28-34.

    James E. May, “Who Will Edit the ESTC? (And Have You Checked OCLC Lately?),” Analytical and Enumerative Bibliography, n.s. 12 (2001), 288-304.

    John P. Schmitt, “Early English Books Online,” The Charleston Advisor 4:4 (2003), 5-8.

    Henry L. Snyder and Michael S. Smith, eds., The English Short-Title Catalogue: Past, Present, Future (New York, AMS Press, 2003).

    Matthew Steggle, “‘Knowledge Will be Multiplied’: Digital Literary Studies and Early Modern Literature,” in A Companion to Digital Literary Studies, ed. Susan Schreibman and Ray Siemens (Oxford: Blackwell, 2007).

    Stephen Tabor, “ESTC and the Bibliographical Community,” The Library 7th ser., 8:4 (2007), 367-86.

    Simon Tanner, Trevor Muñoz, and Pich Hemy Ros, “Measuring Mass Text Digitization Quality and Usefulness: Lessons Learned from Assessing the OCR Accuracy of the British Library’s 19th Century Online Newspaper Archive,” D-Lib Magazine 15.7/8 (2009).

    Claire Warwick, “Print Scholarship and Digital Resources,” A Companion to Digital Humanities, ed. Susan Schreibman and Ray Siemens. Oxford: Blackwell, 2008.

    William Proctor Williams and William Baker, “Caveat Lector.  English Books 1475-1700 and the Electronic Age,” Analytical and Enumerative Bibliography 12 (2001), 1-29.