Archive for the ‘Proquest’ Category

Commercial Databases: Greater Access to JStor, EEBO, ECCO, Burney, and more in 2014?

January 2, 2014

As EMOB readers know, equal access to various subscription databases has been one of our key concerns over the years. Posts such as Unequal Access and Commercial Databases have addressed this problem in detail, while other entries have suggested arguments to present to administrators and librarians as to why subscribing to these resources is crucial for scholars and students alike. From time to time we have been able to obtain trial subscriptions to commercial databases—EEBO, ECCO, WWO, Burney, Orlando—for EMOB readers. Most recently, Anna has detailed a Cengage-Gale trial granted to a SUNY institution and the results of that trial.

Issues of access, however, continue to affect many—both those whose institutions do not subscribe to these digital resources and those whose status as independent scholars, retired, or seeking employment  means that they lack the necessary affiliation to gain access. Yet some recent developments indicate that 2014 might be a turning point in gaining greater albeit not equal access for scholars.

JStor, for instance, has launched a number of initiatives.

  • Two years ago JStor instituted Early Journal Content, which made its holdings of material “published prior to 1923 in the United States and prior to 1870 elsewhere freely available to anyone, anywhere in the world.”
  • After a three-year pilot, JStor established the Alumni Access program for institutions participating in JStor. This video features a presentation on Alumni Access given at the Fall 2012 Coalition for Networked Information (CNI) conference. SAGE journals also has a similar program.
  • In March 2012, as a follow-up of sorts to its Early Journal Content, JStor commenced its Register & Read program. This program enables those without institutional access to gain access to a subset of JStor—to articles in roughly 700 journals; the program, however, does not enable access to current material. See FAQs for more information.

Most promising, perhaps, is JStor’s JPass, launched this past fall. JPass offers individuals access to 83% of JStor’s database for a fee ranging from $19.50 a month to $199.00 a year. The JPass enables unlimited access for reading articles contained in 1,500 journals and published up until 3 to 5 years prior. The program also allows JPass holders to download a limited number of articles each month. Equally promising, in late October the Modern Language Association (MLA) announced that it had just added discounts on the JPass as a member benefit. Rather than pay $199 for an annual subscription to JPass, MLA members can obtain this pass for only $99 per year.  This model resembles to some degree that of The British Newspaper Archive, which offers annual, monthly, weekly, and daily access plans.

MLA, however, is not the only scholarly society to add access to databases as a member benefit.  Other societies and scholarly organizations (including the Society for the History of Authorship, Reading & Publishing [SHARP]) are, or will be shortly, making this a new member benefit.

Most impressive is the initiative by the Renaissance Society of America (RSA). This past November RSA announced that all members would enjoy full access to Early English Books Online.  RSA evidently secured an institutional subscription to EEBO, thus enabling all its members to have free access to EEBO. An experiment of sorts by ProQuest and RSA, this model of a society acting as an institutional subscriber could serve as an example to others. At the same time, such subscriptions are costly to the society and databases would need to be ones that were relevant to most if not all members. Another potential risk that has arisen entails cancellation of database subscriptions by academic libraries based on the rationale that faculty members have access to a given database because of their membership in a professional organization. Such cancellations are extremely shortsighted and ignore entirely the pedagogical benefits of these databases for undergraduate and graduate students alike. Similarly, such a move seems particularly irrational given the large-scale push to promote undergraduate research and in light of the unusual opportunities that access to these primary texts offers undergraduates. Understandably, such cancellations are not conducive to inspiring confidence in publishers of these databases to engage in such experiments.

To date Cengage-Gale has no plans to embark on individual plans or the like. For more than a few years, it has been investigating possible models that would allow it to do so, but it has yet to discover one that is financially viable or that would not conflict with existing contracts (this latter issue is one often overlooked, but these contracts carry many clauses and can complicate opening up access given existing agreements with subscribing institutions). It has, however, been successful in lowering the costs of such databases as ECCO and the 17th and 18th Century Burney Collection, enabling more academic libraries to afford subscriptions.

This overview has not even touched upon the issues surrounding green and gold standards of open access, nor has it discussed the policies related to these standards announced in 2012-2013 in the UK, Australia, and continental Europe. Yet these issues deserve an independent post in the future.

In the meantime, it would be interesting to hear what others think of these initiatives and what they might signal for better if not full equal access in the future.  Do these various plans seem affordable? What other solutions might be offered?

CFP: JEMCS Special Issue on the Early Modern Digital

August 11, 2012
The following call for papers, posted on SHARP-L, may be of interest to readers.  Contact Devoney Looser for additional information (contact information below).

Journal for Early Modern Cultural Studies:  Special Issue on the Early Modern Digital (due 15 Jan 2013)

It is well understood that “the digital turn” has transformed the contemporary cultural, political and economic environment.  Less appreciated perhaps is its crucial importance and transformative potential for those of us who study the past.  Whether through newly—and differently—accessible data and methods (e.g. “distant reading”), new questions being asked of that new data, or recognizing how digital reading changes our access to the materiality of the past, the digital humanities engenders a particularized set of questions and concerns for those of us who study the early modern, broadly defined (mid-15th to mid-19th centuries).

For this special issue of JEMCS, we seek essays that describe the challenges and debates arising from issues in the early modern digital, as well as work that shows through its methods, questions, and conclusions the kinds of scholarship that ought best be done—or perhaps can only be done—in its wake.  We look for contributions that go beyond describing the advantages and shortcomings of (or problems of inequity of access to) EEBO, ECCO, and the ESTC to contemplate how new forms of information produce new ways of thinking.

We invite contributors to consider the broader implications and uses of existing and emerging early modern digital projects, including data mining, data visualization, corpus linguistics, GIS, and/or potential obsolescence, especially in comparison to insights possible through traditional archival research methods. Essays of 3000-8000 words are sought in .doc, .rtf, or .pdf format by January 15, 2013.  All manuscripts must include a 100-200 word abstract. JEMCS adheres to MLA format, and submissions should be prepared accordingly.

In addition, we would welcome brief reports (500-1500 words) that describe digital projects in progress in early modern studies (defined here as spanning from the mid-fifteenth to the mid-nineteenth centuries), whether or not these projects have yet reached completion.  These reports, too, should be submitted in .doc, .rtf, or .pdf format, using MLA style, by 15 January 2013 to:

Devoney Looser, Catherine Paine Middlebush Chair and Professor of English
Co-Editor, Journal for Early Modern Cultural Studies
Tate Hall 114
Department of English
University of Missouri
Columbia, MO 65211
FAX: 573-882-5785

JISC’s Historic Books: Searching EEBO, ECCO for meaning

March 6, 2012

This past fall JISC announced a new venture, the JISC eCollections, “a new community-owned content service for UK HE and FE institutions.” What might interest EMOB readers most is its Historic Books. This digital collection contains over 300,000 books from before 1800 and also makes over 65,000 19th-century first editions from the British Library available for the first time online. The entire corpus is accessible through institutional subscription and, most welcome, searchable over a single platform.

The pre-1800 material in the JISC Historic Books eCollection consists solely of ProQuest’s Early English Books Online (EEBO) and Gale’s Eighteenth Century Collections Online (ECCO) textbases, so some might wonder what this collection offers that is new for those working in the early modern period. One does not need to be in eCollections, for instance, to conduct searches simultaneously across both databases. Yet the Help page for the eCollections indicates that more than just the convenience of a single interface and platform is being offered:

JISC Historic Books uses meaning-based searching rather than traditional keyword searching, which is why you will notice you get different results to searching EEBO and ECCO on the publishers sites. Meaning-based searching enables you to find conceptual and contexual [sic] links betweeen [sic] related documents which aren’t possible using traditional keyword searching.

Besides returning traditional results, JISC Historic Books also delivers “meaning-based” concepts deemed relevant to the search in the form of a Concept Cloud:

Concept Cloud

The more prominent the word, the more relevant it is deemed to the search, and as the screenshot indicates, items in the cloud can be manipulated to narrow one’s search further.

Over the past three or four years (and maybe longer) I have been consistently struck by the transformations that traditional searches of ECCO, Burney, EEBO, as well as Google Books have had on the ways I think about searching, construct searches, and view my results. More specifically, these keyword searches, described here as traditional, were already encouraging me to view results in a more networked, contextual way and, as a consequence, to devise additional searches aimed at teasing out new potential relationships. The meaning-based search enabled by JISC’s Mimas platform, of course, is offering something quite different, but I wonder how its use might cause rethinking of what it means to search and research.

It would be interesting to hear from EEBO and ECCO users in the UK who have used JISC Historic Books, especially about the differences between results obtained from searching on the JISC platform and those obtained by searching on the original publishers’ platforms.


EEBO Interactions and Bibliography: Linking the Past to the Present

February 5, 2012

“Even as more and more texts become widely available through digital surrogates, studies of the book remain grounded in physical bibliography.”

–Stephen Tabor, “ESTC and the Bibliographical Community”

This is a heady time for literary scholars using digital tools.  Visualization and text tagging software offers new ways to analyze old texts’ rhetorical and linguistic features.  DocuScope, for example, is being used by Michael Witmore, Director of the Folger Shakespeare Library, to chart maps of Shakespeare’s plays using 1000-word strings.  The resulting maps posted on Witmore’s blog, Wine Dark Sea, reveal that Othello, for example, shares linguistic features, such as frequent first-person forms, with Shakespeare’s comedies.  Asking why this is so may provide a more detailed understanding of Shakespeare’s craft.

Other data mining projects, underway at Matthew Jockers and Franco Moretti’s Stanford Literary Lab, broaden and transform the practice of literary study, in part by advancing what Moretti calls “distant reading.”  These projects forgo traditional “close” reading of individual texts to analyze computer-generated data derived from running thousands of texts through specific programs.

Elsewhere, annotation tools, such as Digital Mappaemundi, allow annotation of digital artifacts such as, in DM’s case, medieval maps and geographic texts.

Aggregating platforms, including 18thConnect and NINES, create virtual environments where digital work can be shared.  Digital texts, images, maps, data, video, and audio can be collected and annotated for projects difficult to imagine just a few years ago.

Finally, the digital world has nourished new participatory models of scholarship, advanced, for example, by Kathleen Fitzpatrick’s Planned Obsolescence.

These new and often visually alluring scholarly ventures chart new avenues of inquiry and reshape literary studies as we know it.  Stanley Fish has blogged about them; Witmore has been interviewed by Forbes, introducing them to the commercial world; and granting agencies like the NEH have responded by dedicating specific funds for such projects.

But in the shadow of these projects runs a slower, methodical, far less glamorous digital task on which all other projects rely: ensuring that digital texts retain bibliographical integrity.  As Stephen Tabor put it in a 2007 comment used in the epigraph above, “even as more and more texts become widely available through digital surrogates, studies of the book remain grounded in physical bibliography” (The Library 8:4, 369).

EEBO Interactions offers a unique venue for scholarly dialogue about bibliographical matters.   Though it describes itself as a “social network for Early English Books Online,” it might be more accurate to think of it as a site for asynchronous conferencing about bibliographical matters.  A broad range of readers–Proquest editors, graduate students, theologians, literary scholars, historians, philosophers, independent scholars, curators, librarians and library administrators, digital editors,  undergraduates, bibliographers, and textual critics–have already posted queries or comments, often correcting bibliographical entries or expanding our understanding of a given text.  The comments appear under the following rubrics:

Comments about this copy: Comments include requests that missing title pages be restored, or that two variants counted as the same copy by both ESTC and EEBO be distinguished.  They range from providing resolutions of complex pagination problems, to asking general book history questions.

About this work:  This section allows readers to suggest the broader context of a given text.  Nick Poyntz of Mercurius Politicus fame identifies one pamphlet as an advertorial for a cup lined with antimony and notes that two customers died after using the cup.  Other readers correct publication dates, post questions about attribution, note additional authors not mentioned in the EEBO or ESTC entries, or track the evolution of a text from one edition to the next.

Notes:  Aliases can be discussed here, something helpful in reading recusant literature.  This is also the space to discuss a text’s plurality–its relation to other texts it cites or responds to, and its reception.

Suggest a link: This space allows for links to ODNB entries or to pertinent articles, particularly useful for acquiring a fuller understanding of little known works. 

Perhaps most innovatively, EEBO Interactions invites scholars and librarians to talk with one another and with representatives from the commercial world that produced EEBO.  EEBO Interactions is the only purpose-built space designed to bring together members of the bibliographical community–normally working in isolation and apart from one another–to collaborate for a moment or two on the joint endeavor of linking the past to the present.  This is the kind of experiment that benefits everyone.

It would be great to hear readers’ responses to EEBO Interactions.

EEBO Editions Now Available Through Amazon

October 10, 2010

In August,  Eleanor posted a piece on ECCO’s print on demand (POD) offerings through various online booksellers.  These POD copies are produced by companies such as Nabu, Bibliolife, BiblioBazaar, and others.

EEBO has also struck a deal with Bibliolife, making about 3,000 EEBO POD titles available through Amazon.  These can be found by searching Amazon for “EEBO Editions.”  According to Jo-Anne Hogan, Product Manager at ProQuest, this initial offering through Bibliolife is a trial stage; evaluating the response to and quality of the books will be necessary before ProQuest will expand the title list offered through POD.  It is thus a good moment to reflect on the nature of the entries.

Neither Gale nor ProQuest flags the status of the books it sells as digital reprints on or near the title line, though both companies include boilerplate marketing blurbs about the nature of digital reprints later in the entry.  A simple flag next to the initial title, something like [paperback digital reprint] or [paperback digital facsimile], would help all readers understand what these books are.

ECCO’s POD entries provide something like full bibliographical information only inconsistently.  EEBO entries on Amazon provide consistently fuller bibliographical information, though this information appears under “Editorial Reviews” rather than under “Product Details.”  By scrolling down Amazon’s entry for the digital reprint of a pirated copy of Lily’s Short Introduction of Grammar (1570), for example, we find the following information:

The below data was compiled from various identification fields in the bibliographic record of this title. This data is provided as an additional tool in helping to insure edition identification:

A shorte introduction of grammar generally to be vsed, compiled and sette forth, for the bringyng vp of all those that intende to attaine the knowledge of the Latine tongue.
Lily, William, 1468?-1522.
Colet, John, 1467?-1519.
Robertson, Thomas, fl. 1520-1561.
By William Lily, with contributions by John Colet, Thomas Robertson, and others.
Signatures: A-C D4, A-G H4, A-B4.
In three parts.
Part 2 has a separate title page, without imprint, reading: Brevissima institutio seu ratio grammatices cognoscendae, ad omnium puerorum vtilitatem praescripta, quam solam regia maiestatis in omnibus scholis profitendam praecipit.
Part 3 has a half title, reading: Nominum in regulis generum contentorum, tum heteroclitorum, ac verborum interpretatio aliqua.
Title pages for parts 1 and 2 within ornamental borders.
A pirated edition, probably printed in Holland.–STC.
Another edition of STC 15610.10, first published in 1548.
Some print faded and show-through; some pages marked and stained.
[192] p.
[Holland? : s.n., c. 1570]
STC (2nd ed.) / 15615
Reproduction of the original in the Cambridge University Library

This is, in fact, a slightly revised version of the EEBO entry for the same pirated edition of Lily’s Short Introduction of Grammar:

Title: A shorte introduction of grammar generally to be vsed, compiled and sette forth, for the bringyng vp of all those that intende to attaine the knowledge of the Latine tongue.
Author: Lily, William, 1468?-1522.
Other authors: Colet, John, 1467?-1519.
Robertson, Thomas, fl. 1520-1561.
Imprint: [Holland? : s.n., c. 1570]
Date: 1570
Bib name / number: STC (2nd ed.) / 15615
Physical description: [192] p.
Notes: By William Lily, with contributions by John Colet, Thomas Robertson, and others.
Signatures: A-C D4, A-G H4, A-B4.
In three parts.
Part 2 has a separate title page, without imprint, reading: Brevissima institutio seu ratio grammatices cognoscendae, ad omnium puerorum vtilitatem praescripta, quam solam regia maiestatis in omnibus scholis profitendam praecipit.
Part 3 has a half title, reading: Nominum in regulis generum contentorum, tum heteroclitorum, ac verborum interpretatio aliqua.
Title pages for parts 1 and 2 within ornamental borders.
A pirated edition, probably printed in Holland.–STC.
Another edition of STC 15610.10, first published in 1548.
Some print faded and show-through; some pages marked and stained.
Reproduction of the original in Cambridge University Library.
Copy from: Cambridge University Library
UMI Collection / reel number: STC / 1354:02
Subject: Latin language — Grammar — Early works to 1800.

While this bibliographical information is provided consistently for EEBO editions on Amazon and its affiliate, Abebooks, it does not get transferred to entries provided by other online booksellers, like Alibris.  It would be interesting to learn why the full bibliographical information fails to transfer.

ProQuest’s decision to make EEBO titles available through POD is a promising new development.  Its attempt to create a template providing fuller bibliographical information than has yet been attempted must be applauded.  Some questions remain:

  • Are the entries as functional as they need to be?  That is, can a scholar looking for a specific edition of an early modern text locate the exact POD copy, given the entries provided?
  • Can the layout be improved?
  • Is there a more efficient template (a different set of fields, for example) for bibliographical information than the fields currently envisioned?

I look forward to hearing readers’ reactions to these new POD offerings.

ASECS Summary of “Some Noisy Feedback” Roundtable, Albuquerque 3/18/10

March 27, 2010

ECCO, EEBO, and the Burney Collection: Some “Noisy Feedback” Roundtable

Chair: Anna Battigelli (SUNY Plattsburgh)   Panelists: Sayre Greenfield (University of Pittsburgh, Greensburg), Stephen Karian (Marquette University), James E. May (Penn State University—DuBois), Eleanor Shevlin (West Chester University), Michael Suarez (Rare Book School, University of Virginia).  Respondents: Jo-Anne Hogan, (ProQuest), Brian Geiger (ESTC, University of California, Riverside), and Scott Dawson (Gale/Cengage).

The following offers a summary of the roundtable that took place, Thursday,  March 18, 2010  at the ASECS 2010 conference in Albuquerque, N.M.  This session was the second part of a two-part series, the first part having been a roundtable discussion chaired by Eleanor Shevlin at the EC/ASECS meeting in Bethlehem, Pa in October 2009.  Copies of Eleanor’s summary of the EC/ASECS session (published in the Eighteenth-Century Intelligencer and also on this blog) were distributed at the outset of this session.  Many thanks to the members of the audience who so cheerfully presented themselves at an early hour on the conference’s first day.

Sayre Greenfield opened discussion with detailed working solutions to problems caused by ECCO’s OCR (optical character recognition) software.  He recommended that Gale provide an ECCO OCR troubleshooting page on their web site and noted that blogs like this one would be sure to start that process (see below).  Aided by Deidre Stuffer, he found ways to correct for errors stemming from the following letter combinations that OCR typically mistranslates: s, ss, and ct.  Using the word, fishmonger, he substituted for the s every other letter, then substituted numbers, and finally the wildcard question mark.  Advice from his search results, including how best to use the question mark as a wildcard, can be found on the ECCO OCR Troubleshooting Page on the “Pages” section of this blog.  He warned that using the question mark for any medial or initial s is problematic if one is using variables elsewhere, adding that ECCO does not allow wildcards for the first letter of a word.  Additionally, letters surrounding the s seem to affect how the OCR reads the s.  The double ss, for example, frequently morphs into fl, transforming passion into paflion. Word searching within a text also proved problematic.  Though he found 32 instances of passion or passions when he read John Tottie’s A View of Reason and Passion, his electronic search using passion* yielded only half of these.  Turning to ct, he found that OCR often reads ct as t, so that objection becomes objetion.  These results suggest that ECCO would help users by strengthening its web site, which currently recommends fuzzy searches to address OCR problems.  Fuzzy searches create too many false positive results.  Including a more robust help page on this issue is necessary.  (For now, see Sayre’s ECCO OCR Troubleshooting Page on this blog.)
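The substitutions Sayre describes can be turned into a small helper for building search terms. What follows is a minimal Python sketch, not part of ECCO or any Gale tool: the function names `ocr_variants` and `wildcard_queries` are hypothetical, and the only misreadings it knows about are the two reported above (ss read as fl, and ct read as t). It also follows the reported restriction that ECCO does not accept a wildcard as the first letter of a word.

```python
# OCR misreadings reported in the roundtable summary: "ss" -> "fl", "ct" -> "t".
# This table is an illustrative assumption, not an exhaustive list of ECCO errors.
REPORTED_MISREADINGS = {"ss": "fl", "ct": "t"}


def ocr_variants(word):
    """Return the word plus the spellings produced by the reported misreadings."""
    variants = {word}
    for correct, misread in REPORTED_MISREADINGS.items():
        # Apply each substitution to every variant collected so far.
        variants |= {v.replace(correct, misread) for v in variants}
    return sorted(variants)


def wildcard_queries(word):
    """Replace one non-initial 's' at a time with the '?' wildcard."""
    return [
        word[:i] + "?" + word[i + 1:]
        for i, ch in enumerate(word)
        if ch == "s" and i > 0  # no wildcard allowed in first position
    ]


print(ocr_variants("passion"))         # ['paflion', 'passion']
print(ocr_variants("objection"))       # ['objection', 'objetion']
print(wildcard_queries("fishmonger"))  # ['fi?hmonger']
```

Searching for each variant in turn is a narrower alternative to a fuzzy search, which is the point of Sayre's objection to false positives; whether it recovers more of the missing hits would have to be tested against ECCO itself.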

Steve Karian began by acknowledging the indispensability of ESTC for bibliometrics, but he also identified four problems that need to be addressed if the ESTC is to become the powerful tool it can be for the twenty-first century.  The first is the ESTC’s unit of measurement: the ESTC record.  Users often equate an ESTC record with an imprint, title, edition, or an issue.  Because of variations in the correlation of record to item, one cannot simply assume that two parallel sets of search “hits” can be compared reliably.  As he puts it, “one is constantly comparing apples to oranges.”  Additionally, field records vary, limiting or complicating the kinds of searches that can be done.  These need to be standardized if searching is to become reliable.  The two ESTCs—one at UC-Riverside, the other at the British Library—use the same data but different interfaces.  Dates are complicated because they appear in two MARC (Machine-Readable Cataloguing) fields.  Steve recommended deleting the MARC record entirely and replacing it with a new database structure, one designed to expand and grow.  He called for a new stage of innovation, allowing the ESTC to transform itself from a bibliographical catalogue into a bibliographical database.  Only through such a transformation will the ESTC become the powerful tool it promises to be.

Jim May discussed the Burney Collection, which he argued should be called the Burney Collection of Newspapers, Periodicals, and Other Printed Matter.  Its material was first collected by Charles Burney, subsequently increased by the British Library, and eventually microfilmed before being turned over to Gale/Cengage.  It includes material dating back to the 1620s and beyond 1800 and material printed in Barbados, India, Ireland, and North America.  Citing James Tierney’s comments at the Bethlehem meeting, Jim noted that the collection includes 237 newspapers and 161 periodicals, 60 of which are partially available in Adam Matthew’s Eighteenth-Century Journals series or ProQuest’s British Periodicals.  Burney allows one to read an entire issue or study issues by year or month, and it offers searching, though this is problematic.  According to Jim’s results, searching sometimes yields only 10% of the relevant items.  Searching for “Tatler” between 1708 and 1712 yields 80 hits.  Though he has found hundreds of advertisements of Smollett’s Continuation of the Complete History of England, only a few of these can be found through an electronic search.  Similarly, only a third or fewer of The London Evening Posts published 1760-61 turn up when you search for “London Evening”.  Robert Hume and Ashley Marshall have an essay forthcoming in Papers of the Bibliographical Society of America discussing Burney and noting, among other problems, how definite and indefinite articles interfere with searches.  Jim also cited Simon Tanner’s article in D-Lib Magazine (July/August 2009), which found the following accuracy rates for Burney: character 75%, word 65%, significant word 48.4%, capitalized word 47.4% and number 59.3%.  The magnification feature enlarges pages by 100% and would be more useful if it magnified by 33%.  Spread dates are misrepresented, due to the lack of editorial apparatus explaining when newspapers were actually issued.  Burney’s lack of editorial apparatus, cross references, comments, and so forth is a deficit.  Having a scholarly editor–perhaps a graduate student or postdoc internship–would improve its utility.  Also needed is a review of the entire database.  A page dedicated to errors encountered by users would help, something EEBO is now working on with its “EEBO Interactions, A Social Network.”
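To see how the per-word accuracy rates quoted above compound when a search contains more than one word, here is a rough Python sketch. It is a back-of-the-envelope estimate, not a description of how Burney's search works: the function name `phrase_hit_rate` is hypothetical, and the calculation assumes that OCR errors on different words are independent, which may well not hold in practice.

```python
# Accuracy rates quoted above from Simon Tanner's D-Lib Magazine article.
BURNEY_ACCURACY = {
    "character": 0.75,
    "word": 0.65,
    "significant word": 0.484,
    "capitalized word": 0.474,
    "number": 0.593,
}


def phrase_hit_rate(per_word_accuracy, word_count):
    """Estimate the chance that every word of a phrase was read correctly.

    Assumption (hypothetical): errors on different words are independent,
    so the per-word accuracies simply multiply.
    """
    if word_count < 1:
        raise ValueError("word_count must be at least 1")
    return per_word_accuracy ** word_count


two_capitalized = phrase_hit_rate(BURNEY_ACCURACY["capitalized word"], 2)
print(f"Two capitalized words: {two_capitalized:.1%}")  # Two capitalized words: 22.5%
```

Under that assumption, a two-word capitalized phrase such as “London Evening” would be matched only about 22% of the time, which is at least of the same order as the “third or fewer” Jim reports finding.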

Eleanor Shevlin identified three pressing needs: 1) fostering greater awareness of the context of texts; 2) encouraging collaboration among users; and 3) cultivating greater access to these electronic resources.  She pointed to the need for bibliographical training in order to use these resources accurately and called for an examination of the cognitive effects these tools have on research processes.  Specifically, she wondered how EEBO’s TCP transcriptions or ECCO’s searching mechanism affects research methodology.  Noting that these tools provide opportunities to correct bibliographical inaccuracies, she urged the need for a more standardized process through which corrections could be forwarded to the ESTC or to commercial databases.  She also cited examples of productive collaboration among members of the bibliographic community, including her own experience correcting an error in Kansas’s Spencer Research Library, a correction made possible by sending ECCO’s image of the British Library’s copy of a text to Kansas.  Finally, she noted that access continues to be a problem.  Scholars in the U.S. work at a notable disadvantage compared to scholars in the U.K. who typically have access to ECCO and ECCO II through the Joint Information Systems Committee (JISC).  ASECS President Peter Reill’s recent calls for feedback regarding access suggest that the issue is at least on the radar of those who can help, either through negotiations for large-scale access or individual subscriptions.

Michael Suarez warned against the illusion of comprehensiveness in database searches.  Users are frequently unaware of what is missing in these databases, and the databases’ selectivity impoverishes word searches as tools for analysis.  Turning to the task of text-mining, he expressed skepticism regarding the mentalities of mining.  Where sustained engagement with individual texts allows for work linking texts to their culture and to other texts, textual extraction can produce radically decontextualized results.  Because these database tools are easy to use, we are, he warned, insufficiently uneasy with what they actually accomplish.  Suarez insisted that textual analysis demands an effort to fuse horizons between text and reader, a fusion that involves a reader’s deep engagement with a text’s historical context and with a text’s relationship to other texts.  Such contextualization, as James Boyd White would agree, is essential to a functional and robust literary hermeneutics.  Additionally, text-mining tools encourage scholars to work in even greater isolation, away from libraries and other scholars.  Precisely because the digital future will change the way we think, Suarez called for a greater bibliographical literacy in order to make these promising tools work properly.

Panelists’ Responses:

Jo-Anne Hogan (ProQuest) agreed with Michael’s concern regarding the impact of these digitization projects.  She added that EEBO routinely receives emails pointing out errors, asking for missing items, and making recommendations, and that it works to incorporate these suggestions.  But she also noted a growing digital divide: concerns voiced at conferences like ASECS differed from those at conferences on the digital humanities.  At the latter, attendees ask EEBO to produce more tools for text-mining.  It is sometimes difficult to reconcile the competing requests received.  Money matters in these issues, and will always be a factor.  She agreed that more could be done to align the bibliographic data in EEBO with that in the ESTC and pointed out that efforts are under way to make that happen.  She also introduced the prospect of a social networking site for EEBO intended to facilitate communication between scholars and users so corrections can be reported and more contextual information can be made available.  We hope to hear more from her about this on this blog in the near future.  Access, she concluded, continues to be a concern, agreeing with Eleanor that it is unfortunate not to have a model for broad access in the U.S.  Personal subscriptions seem unlikely because such subscriptions cannot cover costs, at least not at subscription rates individuals are willing to pay. She hoped there might be a point in the future when ProQuest can provide broader access, but she could not guarantee such a thing.  More promising is the prospect that about half of the books in EEBO will soon be available for purchase at reasonable rates via Print on Demand.

Scott Dawson (Gale) agreed with Sayre’s suggestion that a Help screen dedicated to OCR problems is an idea to consider seriously.  He added that Gale would look into post-OCR checks that might correct results.  18thConnect will help by testing new OCR software on ECCO page images, and that might solve problems.  Turning to Steve’s comments about ESTC, Scott noted that ECCO depends on ESTC for metadata, and that Gale is working with ESTC to add a link within the ECCO Full Citation to report problems with a given record.  He agreed with Jim May that Burney presents additional obstacles to getting accurate OCR results.  Gale has been working with the British Library to resolve the issue of spread dates and hopes to have an update in the next few months.  On the issue of access raised by Eleanor, Scott mentioned that ECCO is concerned about the issue, but that by providing access to more than 500 institutions globally, it has helped make early modern printed material more accessible than is possible through hard copy or microfilm.  Tiered pricing and consortia-designed contracts help non-ARL institutions find ways to subscribe to ECCO.  He agreed with Michael Suarez that ECCO is incomplete, even with the 50,000 titles added through ECCO II.  Gale is not planning an ECCO III.  But the possibility of linking missing titles to ECCO is being considered.

Brian Geiger (ESTC) outlined two main areas of work at the Center for Bibliographical Studies and Research (CBSR), which manages the North American branch of the ESTC.  First, they continue to upgrade and add records to the ESTC.  They are processing OPAC extracts from libraries, and recently began on an extract from Oxford University that resulted in some 200,000 records that will be matched against the file.  These OPAC extracts provide shelf marks (or call numbers) for existing items, and have turned up tens of thousands of new copies and hundreds of entirely new items.  They are adding URLs from online collections.  EEBO, ECCO and TCP are matched, though not yet displayed by the public version at the British Library.  Brian has requested URLs from Google and will do the same from Internet Archive.  They are digitizing title pages from paper reports submitted over the last two decades and will attach those images to the appropriate records, allowing users to compare a title page to its MARC record.  They hope to have many of the title pages in the ESTC by 2011.  And they have enhanced some 180,000 MARC records from title pages in ECCO.  Second, the ESTC has started to assess how to transform the project from an online catalog to a flexible and interactive database-driven research tool.  Brian corroborated Steve Karian’s assessment that this new resource should be built on relational databases, and noted with appreciation the value of the kind of collaborative thinking Steve offered about the project’s future.  Brian emphasized that a number of partner projects and institutions should be involved in the redesign, to ensure that the new project meets a variety of user needs and to try to plan for the sharing of information across platforms.  He mentioned some of the features that he thought should be included, among them user editing of bibliographic data and metadata and tools to send information to users about updates or changes to records.
He ended by pointing out that development of the database will require resources and the next stage of the ESTC’s evolution will be contingent on funding.  The ESTC is currently engaged in grant development.  It will be in a better position to discuss specific solutions once funding is secured.

ASECS Session: “ECCO, EEBO, and the Burney Collection: Some ‘Noisy Feedback’” (roundtable)

March 13, 2010

Thursday, March 18,  9:45 – 11:15 a.m.

“ECCO, EEBO, and the Burney Collection: Some ‘Noisy Feedback’” (Roundtable)    Alvarado E

Chair:    Anna BATTIGELLI, State University of New York, Plattsburgh

1.    Sayre GREENFIELD, University of Pittsburgh, Greensburg

2.    Stephen KARIAN, Marquette University

3.    James E. MAY, Pennsylvania State University, DuBois

4.    Eleanor F. SHEVLIN, West Chester University

5.    Michael F. SUAREZ, S.J., Rare Book School, University of Virginia

RESPONDENTS: Scott DAWSON, Gale/Cengage; Brian GEIGER, ESTC; Jo-Anne HOGAN, ProQuest

