Posts Tagged ‘Burney Collection’

Digital Projects at SHARP 2015–Part I

July 25, 2015

The Society for the History of Authorship, Reading, and Publishing (SHARP) has featured digital projects at its conferences for many years now. With the SHARP 2013 conference at the University of Pennsylvania, SHARP began the tradition of hosting a stand-alone digital projects showcase at its conferences. During a two-hour time slot, creators present and demonstrate their projects to attendees. SHARP 2015, held in Montreal this past July 7th through July 10th, offered attendees the following fourteen fascinating digital projects and tools:

  • Jonathan Armoza, “Topic Words in Context (TWiC)”
  • Belinda Barnet, Jason Ensor and Sydney Shep, “A Prototype for Using Xanadu Transclusive Relationships in Academic Texts”
  • Troy J. Bassett, “At the Circulating Library: A Database of Victorian Fiction, 1837–1901”
  • Léon Robichaud, “Bibliographie de l’histoire de Montréal”
  • Richard Cunningham, “Architectures of the Book Knowledge Base”
  • Bertrand Gervais, “Arts et littératures numériques: du répertoire à l’agrégateur”
  • Joshua McEvilla, “Facet-Searching the Shakespearian Drama”
  • Jordan Michael Howell, “Digital Bibliography Quick Start”
  • Hélène Huet, “Mapping Decadence”
  • Mireille Laforce, “Des innovations pour faciliter le dépôt légal à Bibliothèque et Archives nationales du Québec”
  • Sophie Marcotte, “Le projet HyperRoy”
  • Andrew Ross, Sierra Dye and Melissa Ann McAfee, “From Wandering Peddlers to Purveyors of Bit-Streams: The Rebirth of Scottish Chapbooks in the Twenty-First Century”
  • Chantal Savoie, Pierre Barrette, Olivier Lapointe, “Le « Laboratoire de recherche sur la culture de grande consommation et la culture médiatique au Québec » : un ambitieux système de métadonnées pour mieux comprendre la culture populaire”
  • Mélodie Simard-Houde, “Présentation de la plateforme numérique Médias 19”

Complete abstracts may be found on the SHARP 2015 conference website.

This two-part post, however, will focus on a few projects most relevant to EMOB’s focus. Part I will focus on Joshua McEvilla’s “Facet-Searching the Shakespearian Drama” and Andrew Ross, Sierra Dye and Melissa Ann McAfee’s “From Wandering Peddlers to Purveyors of Bit-Streams: The Rebirth of Scottish Chapbooks in the Twenty-First Century.” Part II will cover Jordan Michael Howell’s “Digital Bibliography Quick Start” and Richard Cunningham’s “Architectures of the Book Knowledge Base.”

Joshua McEvilla’s “Facet-Searching the Shakespearian Drama” showcased his An Online Reader of John Cotgrave’s The English Treasury of Wit and Language, a resource aimed at encouraging the study of neglected seventeenth-century dramatic authors whose work and contributions have been overshadowed by the attention given to Shakespeare.

As the site’s introduction explains, John Cotgrave’s The English Treasury of Wit and Language (1655) is the first seventeenth-century book of quotations to draw its material exclusively from early modern dramas. As such, Cotgrave’s collection “provides a means of studying the original reception of the plays of Shakespeare with the plays of other dramatists” (Cotgrave home). In turn, Dr. McEvilla’s digital edition of Cotgrave’s work, complete with a concept-based faceted search tool (introduction and search tool), a full list of all the known plays from which the quotations are drawn, data tables, and much more, harnesses the power of the digital to transform this printed resource into a dynamic tool. Besides assisting researchers and encouraging study of neglected seventeenth-century English dramatic works, the Online Reader of John Cotgrave’s ETWL also seems useful for teaching English drama in an advanced undergraduate classroom or graduate course. For those with access to Early English Books Online (EEBO) and/or the 17th and 18th Century Burney Newspaper Collection, McEvilla’s tool could serve as an important complement, helping students understand the contexts for the drama contained in EEBO or providing them with a guide for selecting texts in EEBO. That the bookseller Humphrey Moseley held the license to print Cotgrave’s work is also worthy of note. As David Kastan recounts in “Humphrey Moseley and the Invention of English Literature,” Moseley played an important role in what Kastan terms the “invention” of English literature (see Agent of Change: Print Culture Studies after Elizabeth L. Eisenstein, Univ. of Mass Press, 2007, pp. 104-124).

Andrew Ross, Sierra Dye and Melissa Ann McAfee’s Scottish Chapbook Project at the University of Guelph draws from the university’s collection of Scottish chapbooks, the largest such collection in North America. A true exercise in collaboration, the digital project results from the cooperation of the university’s Archival and Special Collections and its Department of History. Not only have librarians, faculty, and graduate students been involved, but undergraduate students (114 since 2013!) in Dr. Andrew Ross’s digital humanities course have also helped to build various exhibits, such as the one depicted in this image.

Exhibit: A Groat’s Worth of Wit for a Penny

Besides the exhibits, the site also features teaching modules geared to high school instruction, thus extending the reach of this work beyond the university student population.

Among the site’s goals stated in the SHARP abstract is the aim of supporting “an ongoing analysis of the role of woodcut images for the popular readership in Scotland during the early modern period” as well as “the goals of the recently formed Chapbook Working Group of the UK Bibliographic Society.” At present one can browse 416 items, and more are being added regularly. The ultimate aim of this project is to integrate all the estimated extant 10,000 Scottish chapbooks in an interconnected site. Such a long-term goal of integration and interconnection is a promising one, especially in terms of centralizing sources and information on a given topic. As a related aside in terms of integration of projects, Benjamin Pauley’s Eighteenth-Century Book Tracker (see prior emob post, post, and post) is now being phased out, and its information being incorporated into the English Short Title Catalogue.

Please explore these tools and offer your comments and suggestions.

Bibliography: An Endangered Skill?

June 10, 2010

Recently Jennifer Howard, a reporter for the Chronicle of Higher Education, posted a query on SHARP-L asking whether bibliography was an endangered skill or art in the academy. She sought thoughts from teachers and students about this question as well as “where the field bibliography might be headed.”

Her query generated a number of responses, ranging from ones that indicated bibliographic training was alive and well in the responder’s particular program to ones that indicated students’ exposure to the topic was highly dependent upon the faculty member they had for a given course or the climate within the department. That Howard added a note later that afternoon in which she clarified what she meant by bibliography–“I’m interested in the book-history side of bibliography, not in how to prepare correct bibliographic citations”–is telling, to my mind. While responses posted to the list before Howard’s clarification primarily addressed the “book-history side,” I do wonder if off-list comments suggested possible confusion about what Howard meant by “bibliography.” Bibliographic citations, annotated bibliographies, and the like are still the standard staples of what is taught in first-year writing courses and even in more advanced courses. So it would seem odd, to me at least, if someone had misinterpreted her query, especially one posted on a listserv devoted to the history of the book.

Many of our discussions on emob have noted the important relationship between traditional bibliographic knowledge and electronic resources such as EEBO, ECCO, and Burney. (See for instance the discussion that emerged in the collaborative reading of Ian Gadd’s “The Use and Misuse of Early English Books Online.”) But we have not had an extended discussion about the state of bibliographic training. Rather, some comments have considered it to be a given that descriptive and analytical bibliographic skills are not taught as regularly or as vigorously in graduate programs (with admitted exceptions), while others have stressed the need for such knowledge. Thus, I would like to hear more about if and how we teach these skills in our undergraduate and graduate classrooms as well as whether students respond well to such lessons. How do colleagues respond? (One SHARP commentator made mention of “sneaking” this material into courses.) What tools and materials do people use? And what is the context or type of course(s) in which such skills are taught? Some SHARP-L responses to Howard’s query favored teaching bibliographical skills within a textual studies context, while others preferred a “book-history” context.

I have tended to use both approaches, but it depends upon the course. In methods/skills courses, I have used Oxford University’s manuscript exercise, Wilfred Owen’s “Dulce et Decorum Est.” While some students found the process of editing tedious, almost all appreciated being exposed in a hands-on way to issues they had never considered. I also use videos and the workshop materials for the hand-press book from the University of Virginia’s Rare Book School to teach bibliography from a book-history standpoint.

Summary of EC/ASECS Roundtable: Bibliography, the ESTC, and 18th-Century Electronic Databases

October 24, 2009

Bibliography, the ESTC, and 18th-Century Electronic Databases:  A Roundtable

Chair: Eleanor F. Shevlin (West Chester University)   Participants: James E. May (Penn State University—DuBois), James Tierney (University of Missouri—St. Louis), David Vander Meulen (University of Virginia), Benjamin Pauley (Eastern Connecticut State University), Brian Geiger (ESTC, University of California, Riverside), and Scott Dawson (Gale/Cengage).

The following offers a summary of the roundtable that took place, Saturday, October 10, 2009, at the EC/ASECS 2009 conference hosted by Lehigh University and held at Bethlehem, Pennsylvania, October 8-11, 2009.

Jim May opened the roundtable, and his remarks highlighted and extended the discussion he offered in his essay, “Some Problems in ECCO (and ESTC),” in The Eighteenth-Century Intelligencer, 23.1 (Jan. 2009), the article that inspired this session and Anna Battigelli’s forthcoming roundtable at ASECS (March 18th, 9:45 am–11:15 am). Key issues Jim raised included the need to correct missing images, to address the “disappearance” of letters originally printed in red ink on title pages, and to bring the ESTC up to date. In addition, he noted that ECCO’s electronic index is not always representative of what is actually there digitally. Work is also needed on providing or revising information about subscription lists, textual history, and attributions in ESTC. While noting that he had already addressed problems with Burney in his The Eighteenth-Century Intelligencer article, 23.2 (May 2009) and that Jim Tierney would be discussing this tool next, Jim commented on the usefulness of Burney, particularly to those working on the history of a publication.

Turning to the Burney collection, Jim Tierney drew attention to the potentially confusing name for this electronic collection because it is not by any means restricted to newspapers. Instead, it includes a good number of periodicals as well. Specifically, the collection consists of 237 newspapers and 161 periodicals, and, furthermore, some of the titles included are neither newspapers nor periodicals. That the Burney digitized collection follows the Anglo-American cataloguing procedure of creating a new entry every time a newspaper undergoes a title change results in the illusion of more titles than actually exist as well as confusion about the history of a given newspaper. Jim also provided a detailed handout (posted here as a page) listing the digitized periodicals (note: not newspapers) in Burney. The handout includes notes about missing issues, other locations where titles in Burney can be found, and a tentative list of Burney titles duplicated by other digitization projects. The two overarching points Jim made were the failure to have scholars involved in the planning of Burney and other digitization projects and the need for far greater collaboration among the creators/purveyors of these databases, librarians, and scholars. That given titles in Burney often include only a few issues when other issues were available elsewhere and, if digitized, would have approached a more complete run, exemplifies the need for far better coordination and collaboration.

While David Vander Meulen serves on the ESTC board, his remarks for the roundtable were offered in his role as a researcher and user of these tools. He began by noting that ESTC is an evolving tool (a work in progress) and that ECCO follows ESTC. Moreover, even as it progresses, the ESTC is still “functional and valuable” even though it is incomplete. Nonetheless, “any addition to ESTC will change the context.” An important development occurred in 2006 when the British Library initiated free access to this tool. As for problems, the ESTC had made the decision to truncate titles and places. Yet ECCO generally offers the full titles, while expanded locations can occasionally be found by going to public library catalogues. To improve these resources, David explained, we need to have an easier way to convey corrections to the British Library or University of California Riverside (the North American home of the ESTC) and, equally important, an ongoing staff to process editorial changes and comments. In discussing this need for a means of processing updates, David also drew attention to whether the uncontrolled notes field should be visible. Unfortunately, agencies that have funded the ESTC, as he explained in his closing remarks, have decided the project is complete. Obviously, given ESTC’s status as a work-in-progress, such a decision presents additional obstacles to continued updating and correcting.

Ben Pauley spoke next about a project he has initiated. He began by noting the lack of access that many institutions (and thus their scholars and students) have to paid databases such as EEBO and ECCO. Both the Internet Archive and Google Books, however, have a number of eighteenth-century books in their freely accessible databases. Yet it is typically very hard to identify properly what text one has accessed. Viewing these freely available texts as an opportunity, Ben established The Eighteenth-Century Book Tracker, a project in which he is supplying the bibliographic data so sorely lacking in eighteenth-century texts found in Google Books. Doing so has compelled him to become a textual scholar or an “accidental bibliographer.” Thus far, he has recorded about 150 copies not appearing in ESTC. At present, the project features 480 texts and 4 periodicals. Ben has been asked to write an article on the Eighteenth-Century Book Tracker for The Eighteenth-Century Intelligencer that will detail much more about his undertaking.

Speaking as the Associate Director and Resident Manager of the Center (University of California Riverside), the North American home of the ESTC, Brian Geiger explained that the British Library’s ESTC role has focused on cataloguing its own collection and that the Univ. of California Riverside has handled everything else. In addition to reiterating points about the problem with truncated titles, he also discussed the lack of subject headings as a shortcoming. Turning to the digital surrogates of early modern imprints, he explained that the ECCO and Adam Matthew collections are based on ESTC, but EEBO is not. Next Brian addressed the need to foster better communication between ESTC and scholars. While the channels of communication between ESTC and librarians have remained strong, that has not been the case with scholars. Like Ben, Brian will also be writing an article on the ESTC for The Eighteenth-Century Intelligencer.

Scott Dawson from Gale-Cengage concluded the presentations by roundtable panelists. He first supplied an historical overview of ECCO and Burney. In 1982 Research Publications began to microfilm the “Eighteenth Century” microform collection. By 2002 twenty-six million pages of eighteenth-century titles had been filmed. This microfilm collection is the basis for ECCO, but using the ESTC in conjunction with the microfilm has been overall a real plus for the project. ECCO II, released at the start of this year, features 50,000 additional titles. By mid-2010 ECCO II, representing holdings from fifteen libraries, will be completed (titles from the Harry Ransom Center are still being prepared). ECCO and ECCO II, combined, will have made 185,000 eighteenth-century titles available to subscribers. As for the digitization of Burney, that project was handled by the British Library and not Gale-Cengage. Scott also addressed some of the problems that can and cannot be corrected. When pages are blurred, for instance, the microfilm plays a key role in what can be done. If the microfilm is clear, then the page is re-filmed. Yet if the problem occurred because the page is blurred in the microfilm, then, from the perspective of Gale, nothing can be done. When duplications of a title are discovered, however, the duplications can be deleted.

After all six panelists had offered opening statements, the discussion was opened to the audience’s questions and comments. The point perhaps most stressed in the discussion with the audience was a need for far greater involvement by scholars in the creation and improvement of digital resources. In terms of updating or correcting resources, questions arose about how this might be done and what types of controls are needed. In subsequent discussions, the creation of advisory boards and (or) the involvement of a committee representing ASECS arose as possible avenues for communicating and addressing the scholar’s perspective more effectively. The establishment of an advisory board and/or ties with ASECS could play a vital role in future projects, and members of a board or ASECS committee could also devise potential solutions to some of the shortcomings with existing tools.  The resurrection of Factotum, the now defunct ESTC news publication of the British Library (ceased with issue no. 40 in 1995), or the initiation of a similar publication would be a way of establishing regular, ongoing communication with a broader base of scholars. (For those interested in the content of previous issues, see the index for Factotum.) Of course, an obstacle here is staffing and funding. Questions also arose about plans to make Burney more complete by digitizing issues not included for a particular newspaper or periodical title but available elsewhere. Yet that this digitization project had been undertaken by the British Library (see final report) and not Gale complicates the issue. Also, when asked about any plans for an ECCO III, Scott explained that the creation of ECCO II caused surprise among many libraries that had purchased ECCO because they believed that ECCO was complete at the time. When ECCO II was introduced for purchase, libraries were promised that there would not be any additional forms of ECCO.  
(Depending on the discovery of additional eighteenth-century titles, however, I see no reason that another collection could not be pursued; if enough material for another collection becomes available, then scholars need to insert and assert themselves in conversations with vendors and librarians and make the need and value of a third collection known.)

Another very real, pressing concern was the large number of scholars who do not have access to these databases and whose institutions are not likely to be able to afford these resources even in the future. The point was raised that all universities in the U.K. have access to ECCO and ECCO II for an annual hosting fee through the auspices of the Joint Information Systems Committee (JISC), “established by the UK further and higher education funding councils in 2006 to negotiate with publishers and owners of digital content.” Because the situation differs greatly in the U.S. (we have no higher education government council overseeing all our universities), we do not have such a prospect here. While Ben Pauley’s Eighteenth-Century Book Tracker promises to bring some order to the current anarchy that characterizes freely available eighteenth-century texts, his valuable project can’t and won’t solve the inequity of access in the United States.

Trial Access for Burney Collection and Search Methods

August 12, 2009

Gale/Cengage has generously agreed to offer a free trial of the Burney Collection for readers of this blog at http://access.gale.com/emob.  This provides us with an opportunity for an open discussion of the Burney Collection’s merits, both as a scholarly resource and as a pedagogical tool. 

In preparation for the two sessions on digital text-bases, it would be interesting to hear more about how users search Burney.  Search results can be overwhelming and show the need for the Library of Congress cataloguing and classification system to help categorize and make sense of the wealth of data that emerges from any given search.  Thomas Mann, a Reference Librarian at the Library of Congress, has a still useful 2005 discussion on the limits of computerized searching for research at http://www.guild2910.org/searching.htm.  Mann’s site might be particularly helpful in discussing computerized searching with students.  His example is that the 11,000,000 results for the word “Afghanistan” are unclassified, whereas under the LC system, they are neatly parsed into “Antiquities,” “Bibliography,” “Biography,” “Boundaries,” “Civilization,” and so forth.  So the argument in favor of LC classification and cataloguing is clear.

On the other hand, it would be foolish to overlook the value of non-classified search results.  Matthew’s post on machine reading makes clear the value of understanding more about what computers can do.  But how best to search Burney isn’t necessarily clear from the outset.  It would be very interesting to hear more about how individuals use search methods within ECCO, EEBO, and particularly Burney.  We are grateful to Gale/Cengage for making this collective review possible.

Digital Textbases and Optical Character Recognition (OCR)

July 16, 2009

Experienced users of ECCO know about the limits of its full-text capability. The long s in eighteenth-century fonts is one of many peculiarities that can wreck an automated effort at optical character recognition (OCR). Though I’m grateful that I can search ECCO and other databases using full text, I often wonder how complete my search is. I usually get a sense of how many false hits I find, but how many true hits am I missing? How accurate are the full-text capabilities of these resources?

A recent article presents a method for assessing the accuracy of OCR using the British Library’s 19th Century Newspaper Project as a case study:

Simon Tanner, Trevor Muñoz, and Pich Hemy Ros, “Measuring Mass Text Digitization Quality and Usefulness: Lessons Learned from Assessing the OCR Accuracy of the British Library’s 19th Century Online Newspaper Archive,” D-Lib Magazine 15.7/8 (2009).

This is available at:

http://www.dlib.org/dlib/july09/munoz/07munoz.html

The article briefly mentions Gale’s Burney newspapers project. One of the good points in this article concerns how we should measure accuracy:

Given a newspaper page of 1,000 words with 5,000 characters if the OCR engine yields a result of 90% character accuracy, this equals 500 incorrect characters. However, looked at in word terms this might convert to a maximum of 900 correct words (90% word accuracy) or a minimum of 500 correct words (50% word accuracy), assuming for this example an average word length of 5 characters. The reality is somewhere in between and probably more at the higher extent than the lower. The fact is: character accuracy of itself does not tell us word accuracy nor does it tell us the usefulness of the text output. Depending on the number of “significant words” rendered correctly, the search results could still be almost 100% or near zero with 90% character accuracy.

The term “significant words” refers to words that users are likely to search for, in contrast to function words (pronouns, prepositions, etc.). A textbase’s accuracy in terms of “significant words” is an appropriate yardstick for how useful its full-text search is.
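The arithmetic in the quoted passage can be checked with a few lines of code. What follows is only a minimal sketch of the two extremes the authors describe, not their assessment method; the function name and its parameters are my own illustrative choices. It assumes every word has the same length, that each wrong character spoils the word containing it, and that a word counts as correct only if all its characters are correct.

```python
def word_accuracy_bounds(total_words, avg_word_len, char_accuracy_pct):
    """Return (min_correct_words, max_correct_words) for a page.

    Hypothetical helper for illustration: assumes uniform word length
    and an integer character-accuracy percentage.
    """
    total_chars = total_words * avg_word_len
    # Number of characters the OCR engine got wrong.
    bad_chars = total_chars * (100 - char_accuracy_pct) // 100

    # Best case: errors cluster together, so each damaged word absorbs
    # as many bad characters as it can hold (ceiling division).
    fewest_bad_words = -(-bad_chars // avg_word_len)

    # Worst case: every bad character lands in a different word.
    most_bad_words = min(bad_chars, total_words)

    return total_words - most_bad_words, total_words - fewest_bad_words


# The article's example: 1,000 words, 5 characters each, 90% character accuracy.
low, high = word_accuracy_bounds(1000, 5, 90)
print(low, high)
```

For the article's page this gives between 500 and 900 correct words, which matches the 50% and 90% word-accuracy figures in the quotation. The spread between the two is the reason character accuracy alone says so little about how well a full-text search will perform.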

The full article merits reading. The authors found that for significant word accuracy, the 19th Century Newspaper Project was 68.4% accurate and the Burney Newspapers was 48.4% accurate. Eighteenth-century newspapers can be astonishingly difficult to read even in the originals, so this low percentage is not that surprising. I suspect that ECCO is somewhere in between these two percentages.

Roundtable Discussion at ASECS, 2010

June 25, 2009

ASECS conference, Albuquerque, N.M., 18-21 March, 2010

EEBO, ECCO, and Burney Collection Online:
Some “Noisy Feedback” 

In a 2009 article in the Eighteenth-Century Intelligencer, James May suggested that “scholars need to provide a little noisy feedback to corporate ventures like ECCO if future projects are to benefit from their expertise.”  This roundtable discussion is designed to provide constructive scholarly feedback for ECCO, EEBO, and the Burney Collection Online.  Brief (5-minute) presentations on these databases’ bibliographical problems should focus on ways in which they might be strengthened.  Possible topics include how to correct attribution errors, strengthen search mechanisms, detect and improve digital images that are insufficiently clear or in some cases illegible, augment and clarify holdings information, eliminate duplicate records, signal the existence of listings not reproduced, and so forth.  Following the brief presentations, panelists will consider the issues raised and invite members of the audience to participate in the discussion.  All participants are encouraged to read the set of related readings on the bibliography below, suggest additions to it, and join in discussions on this blog leading up to the session. 

Chair: Anna Battigelli, SUNY Plattsburgh

Panelists: James E. May (Penn State University—DuBois); Sayre Greenfield (University of Pittsburgh at Greensburg); Eleanor Shevlin (West Chester University of Pennsylvania); Stephen Karian (Marquette University); Michael F. Suarez, S.J. (Rare Book School, University of Virginia)

Respondents:  Scott Dawson (Gale/Cengage); Brian Geiger (ESTC); Jo-Ann Hogan (ProQuest)