Summary of EC/ASECS Roundtable: Bibliography, the ESTC, and 18th-Century Electronic Databases

Bibliography, the ESTC, and 18th-Century Electronic Databases:  A Roundtable

Chair: Eleanor F. Shevlin (West Chester University)   Participants: James E. May (Penn State University—DuBois), James Tierney (University of Missouri—St. Louis), David Vander Meulen (University of Virginia), Benjamin Pauley (Eastern Connecticut State University), Brian Geiger (ESTC, University of California, Riverside), and Scott Dawson (Gale/Cengage).

The following offers a summary of the roundtable that took place on Saturday, October 10, 2009, at the EC/ASECS 2009 conference, hosted by Lehigh University and held in Bethlehem, Pennsylvania, October 8-11, 2009.

Jim May opened the roundtable, and his remarks highlighted and extended the discussion he offered in his essay, “Some Problems in ECCO (and ESTC),” in The Eighteenth-Century Intelligencer, 23.1 (Jan. 2009), the article that inspired this session and Anna Battigelli’s forthcoming roundtable at ASECS (March 18th, 9:45-11:15 am). Key issues Jim raised included the need to correct missing images, to address the “disappearance” of letters originally printed in red ink on title pages, and to bring the ESTC up to date. In addition, he noted that ECCO’s electronic index does not always accurately represent what is actually available digitally. Work is also needed on providing or revising information about subscription lists, textual history, and attributions in ESTC. While noting that he had already addressed problems with Burney in a second article in The Eighteenth-Century Intelligencer, 23.2 (May 2009), and that Jim Tierney would be discussing this tool next, Jim commented on the usefulness of Burney, particularly to those working on the history of a publication.

Turning to the Burney collection, Jim Tierney drew attention to the potentially confusing name for this electronic collection, which is by no means restricted to newspapers. Instead, it includes a good number of periodicals as well. Specifically, the collection consists of 237 newspapers and 161 periodicals, and, furthermore, some of the titles included are neither newspapers nor periodicals. Because the Burney digitized collection follows the Anglo-American cataloguing procedure of creating a new entry every time a newspaper undergoes a title change, it gives the illusion of more titles than actually exist and creates confusion about the history of a given newspaper. Jim also provided a detailed handout (posted here as a page) listing the digitized periodicals (note: not newspapers) in Burney. The handout includes notes about missing issues, other locations where titles in Burney can be found, and a tentative list of Burney titles duplicated by other digitization projects. The two overarching points Jim made were the failure to have scholars involved in the planning of Burney and other digitization projects and the need for far greater collaboration among the creators/purveyors of these databases, librarians, and scholars. That given titles in Burney often include only a few issues when other issues were available elsewhere and, if digitized, would have approached a more complete run exemplifies the need for far better coordination and collaboration.

While David Vander Meulen serves on the ESTC board, his remarks for the roundtable were offered in his role as a researcher and user of these tools. He began by noting that ESTC is an evolving tool, a work in progress, and that ECCO follows ESTC. Moreover, even though it is incomplete, the ESTC is still “functional and valuable.” Nonetheless, “any addition to ESTC will change the context.” An important development occurred in 2006 when the British Library initiated free access to this tool. As for problems, the ESTC had made the decision to truncate titles and places. Yet ECCO generally offers the full titles, while expanded locations can occasionally be found by going to public library catalogues. To improve these resources, David explained, we need to have an easier way to convey corrections to the British Library or University of California Riverside (the North American home of the ESTC) and, equally important, an ongoing staff to process editorial changes and comments. In discussing this need for a means of processing updates, David also raised the question of whether the uncontrolled notes field should be visible. Unfortunately, agencies that have funded the ESTC, as he explained in his closing remarks, have decided the project is complete. Obviously, given ESTC’s status as a work in progress, such a decision poses additional obstacles to continued updating and correcting.

Ben Pauley spoke next about a project he has initiated. He began by noting the lack of access that many institutions (and thus their scholars and students) have to paid databases such as EEBO and ECCO. Both the Internet Archive and Google Books, however, have a number of eighteenth-century books in their freely accessible databases. Yet it is typically very hard to identify properly what text one has accessed. Viewing these freely available texts as an opportunity, Ben established The Eighteenth-Century Book Tracker, a project in which he is supplying the bibliographic data so sorely lacking for eighteenth-century texts found in Google Books. Doing so has compelled him to become a textual scholar or an “accidental bibliographer.” Thus far, he has recorded about 150 copies not appearing in ESTC. At present, the project features 480 texts and 4 periodicals. Ben has been asked to write an article on the Eighteenth-Century Book Tracker for The Eighteenth-Century Intelligencer that will detail much more about his undertaking.

Speaking as the Associate Director and Resident Manager of the Center (University of California Riverside), the North American home of the ESTC, Brian Geiger explained that the British Library’s ESTC role has focused on cataloguing its own collection and that the University of California Riverside has handled everything else. In addition to reiterating points about the problem with truncated titles, he also discussed the lack of subject headings as a shortcoming. Turning to the digital surrogates of early modern imprints, he explained that the ECCO and Adam Matthew collections are based on ESTC, but EEBO is not. Next Brian addressed the need to foster better communication between ESTC and scholars. While the channels of communication between ESTC and librarians have remained strong, that has not been the case with scholars. Like Ben, Brian will also be writing an article on the ESTC for The Eighteenth-Century Intelligencer.

Scott Dawson from Gale-Cengage concluded the presentations by roundtable panelists. He first supplied an historical overview of ECCO and Burney. In 1982 Research Publications began to microfilm the “Eighteenth Century” microform collection. By 2002 twenty-six million pages of eighteenth-century titles had been filmed. This microfilm collection is the basis for ECCO, and using the ESTC in conjunction with the microfilm has been overall a real plus for the project. ECCO II, released at the start of this year, features 50,000 additional titles. By mid-2010 ECCO II, representing holdings from fifteen libraries, will be completed (titles from the Harry Ransom Center are still being prepared). ECCO and ECCO II, combined, will have made 185,000 eighteenth-century titles available to subscribers. As for the digitization of Burney, that project was handled by the British Library and not Gale-Cengage. Scott also addressed some of the problems that can and cannot be corrected. When pages are blurred, for instance, the microfilm plays a key role in what can be done. If the microfilm is clear, then the page is re-filmed. Yet if the problem occurred because the page is blurred in the microfilm, then, from the perspective of Gale, nothing can be done. When duplications of a title are discovered, however, the duplications can be deleted.

After all six panelists had offered opening statements, the discussion was opened to the audience’s questions and comments. The point perhaps most stressed in the discussion with the audience was a need for far greater involvement by scholars in the creation and improvement of digital resources. In terms of updating or correcting resources, questions arose about how this might be done and what types of controls are needed. In subsequent discussions, the creation of advisory boards and/or the involvement of a committee representing ASECS arose as possible avenues for communicating and addressing the scholar’s perspective more effectively. The establishment of an advisory board and/or ties with ASECS could play a vital role in future projects, and members of a board or ASECS committee could also devise potential solutions to some of the shortcomings of existing tools. The resurrection of Factotum, the now defunct ESTC news publication of the British Library (ceased with issue no. 40 in 1995), or the initiation of a similar publication would be a way of establishing regular, ongoing communication with a broader base of scholars. (For those interested in the content of previous issues, see the index for Factotum.) Of course, an obstacle here is staffing and funding. Questions also arose about plans to make Burney more complete by digitizing issues not included for a particular newspaper or periodical title but available elsewhere. Yet the fact that this digitization project had been undertaken by the British Library (see final report) and not Gale complicates the issue. Also, when asked about any plans for an ECCO III, Scott explained that the creation of ECCO II caused surprise among many libraries that had purchased ECCO because they had believed that ECCO was complete at the time. When ECCO II was introduced for purchase, libraries were promised that there would not be any additional forms of ECCO.
(Depending on the discovery of additional eighteenth-century titles, however, I see no reason that another collection could not be pursued; if enough material for another collection becomes available, then scholars need to insert and assert themselves in conversations with vendors and librarians and make the need and value of a third collection known.)

Another very real, pressing concern was the large number of scholars who do not have access to these databases and whose institutions are unlikely to be able to afford these resources even in the future. The point was raised that all universities in the U.K. have access to ECCO and ECCO II for an annual hosting fee through the auspices of the Joint Information Systems Committee (JISC), “established by the UK further and higher education funding councils in 2006 to negotiate with publishers and owners of digital content.” Because the situation differs greatly in the U.S. (we have no higher education government council overseeing all our universities), we do not have such a prospect here. While Ben Pauley’s Eighteenth-Century Book Tracker promises to bring some order to the current anarchy that characterizes freely available eighteenth-century texts, his valuable project can’t and won’t solve the inequity of access in the United States.

17 Responses to “Summary of EC/ASECS Roundtable: Bibliography, the ESTC, and 18th-Century Electronic Databases”

  1. Anna Battigelli Says:

    What an interesting discussion! It seems as if five major issues emerged: 1) access; 2) funding; 3) coordination between ESTC and these databases; 4) bibliographical problems of ECCO (a broad category with many sub-categories); and 5) the need for greater and more consistent scholarly contribution to the design of these databases and bibliographies, perhaps through some sort of committee.

    I wonder whether funding for ESTC might be enhanced through what I assume will be its considerable role within 18thConnect. Perhaps this is already being considered. It would be helpful to have a fuller account of the ESTC’s future plans.

    I’d also like to hear more about Jim May’s concern about disappearing letters originally printed in red ink.
    AB

  2. Eleanor Shevlin Says:

    Thanks, Anna…it was interesting, and I did not do justice to all the opening position statements. The speakers kept to their time, but still managed to mention quite a bit.

    I would just expand your point 4 to read bibliographical issues (errors as well as updated information) affecting both ESTC and databases that rely on ESTC as well as technical shortcomings such as poor or missing images or OCR (we could perhaps make this second point an issue in its own right). Also, in regard to point 4, offering the full title is one way that ECCO improves upon ESTC.

    18thConnect was mentioned briefly during the audience session, and it will be interesting to see what relationships develop here. I would think that independent financial resources also need to be found for ESTC work.

    Finally, I can clarify the “disappearing red letters.” The problem here is that when pages with red ink are filmed, anything in red ink does not show up on the film, so those letters/words do not appear on the microfilm. Digitizing the film only perpetuates the problem of missing letters/words.

  3. Anna Battigelli Says:

    I was also interested in the problem of blurred microfilm. Do I understand correctly that Gale will re-digitize a film page if the film is not blurry but that if the film (which they own) is itself blurry, they will not seek a copy of that page from a library with a clear paper version of the text? Is this a financial decision? I would be interested in hearing more.
    AB

  4. Eleanor Shevlin Says:

    Yes, if the page is fine on the microform, then Gale will re-digitize that page for ECCO. However, if the problem is with the microfilm–blurred, torn, spotted, only half-filmed or skewed, or any number of such problems–then Gale will not fix it. That decision is probably in part financial, but we need to remember that the basis of their whole digitization project was the microfilm and not the originals. A digitized page from an original work–as opposed to a microfilmed copy–will look quite different (and frankly much better–think of some of the works in the Internet Archive); in addition, there’s the issue of gaining access to, obtaining permission to digitize, and securing rights to reproduce a page for a commercial work.

  5. Benjamin Pauley Says:

    Thanks for this excellent summary, Eleanor. The various strands you highlight inform a question I’d like to pose to readers of this blog: What bibliographical matters would scholars like to see addressed that currently aren’t being addressed—and perhaps can’t be addressed—by the resources we have now?

    I was very impressed by the candor with which both Brian and Scott addressed the questions posed to the ESTC and to ECCO. Both of those projects are shaped by constraints (for lack of a better word) that I think most of us scholars aren’t very practiced at considering. (Scholars, to borrow from Dryden, tend to be a headstrong, moody, murmuring race.)

    As Brian noted, for instance, the ESTC is, in many ways, an adjunct of cataloguing efforts at the British Library. While the BL has made great commitments to the ESTC, there are things that scholars might well want that the ESTC simply isn’t in a position to deliver because they are at odds with the very practical demands of library cataloguing. (A case in point would be the matter of transcriptions of titles: as Kevin Berland noted, it sure would be nice if the ESTC could transcribe titles in the way scholars want them to be transcribed. As Brian candidly responded, though, the ESTC is bound by certain conventions of casing that arise from elsewhere.)

    Similarly, the issue of correcting faulty scans in ECCO is one that I think most scholars see as some sort of moral obligation for Gale/Cengage: if the pages are illegible, then, by goodness, fix them! But, as Scott rightly noted, ECCO was a viable business decision only because the microfilms already existed. Though Gale took on new filming for the creation of ECCO II, it’s not like they can realistically keep scanning stations set up at every library they ever partnered with indefinitely. Rescanning a page from sound film is one thing, but returning to, say, Austin, to re-do one bum page from a microfilm just isn’t something we can expect a for-profit company to commit to—much less to commit to doing at the drop of a hat every time somebody discovers an illegible page in some little-read tract or other.

    So I’d like to hear what sorts of things scholars would like to see that might fall outside the purview of projects like ESTC and ECCO. Obviously, we want the bibliographical record to be as accurate as possible, and we want it to be as consistent as possible across multiple sources: we want the state of the art to be reflected accurately. What are the projects that scholars ought to take on for ourselves? (Always with an eye toward coordinating and sharing information with established sources like ESTC and ECCO, of course.)

  6. Eleanor Shevlin Says:

    Thanks, Ben, for your very thoughtful, useful response and excellent question.

    Yes, the cataloguing procedures that the ESTC chose to follow have pluses and minuses (and there were debates at the time about which rules to follow–and I think also whether new rules should be created), and no doubt if something other than Anglo-American rules had won out, shortcomings would have later arisen. Thanks also for raising the issue of title transcription. It’s a good example. That one can typically obtain the full title from ECCO is quite a useful addition–IF one has access to ECCO.

    As for the problems with imperfect pages, I’ve been sympathetic to Gale’s inability to re-do imperfectly filmed pages; it just doesn’t seem feasible to do so–nor reasonable to expect such changes. But if people do not know what is involved, then of course it seems as if Gale is simply being unresponsive.

    I also failed to mention in my summary the availability of MARC records (for an additional fee) for ECCO that Gale has sourced from the ESTC through work with Brian’s North American ESTC office.

    As for bibliographic improvements, I think your list, Ben, offers a good overview. Your question about which projects scholars should take on seems more complicated. Your Eighteenth-Century Book Tracker exemplifies a highly praiseworthy effort that is providing an invaluable service to other scholars. But it requires quite a bit of effort and time, and it was only through this blog that I learned of your efforts. In other words, there needs to be not just coordination and communication with ESTC, ECCO, and Burney, but also better coordination and communication with other scholars (a point George Williams and others brought up this summer on the Long Eighteenth Century blog). 18thConnect promises to offer one means of a “collective glue,” but I think there is room for others (especially while that project is being developed). I also think that many scholars are not aware of the shortcomings of ECCO and ESTC because they are such wonderful, valuable tools, and increasing awareness (done in a way that would not denigrate these resources) would be helpful. I would like to see a new version of Factotum appear as one project/means of fostering communication. And I think a project that worked very closely with ESTC and Gale and that could offer corrections and updates to records would have the best chance of succeeding and becoming widely known. (I realize that ESTC has a procedure for reporting, but it is understaffed.) Another project could address the issue of access to these databases. I know I have not really addressed your very fine question, Ben, but these are some initial thoughts–I am sure I’ll have more to say during the coming week.

    Finally, while Anna and I started this blog to help generate discussion in advance of our roundtable, we also hoped that it would continue to help advance discussion after the sessions had been held.

  7. Anna Battigelli Says:

    Like Ben, I would also be interested in hearing “what sorts of things scholars would like to see that might fall outside the purview of projects like ESTC and ECCO.”

    I’d also like to see more rigorous discussion of the bibliographical issues we have mentioned but not fully addressed. Are there plans for trying to integrate ESTC and ECCO, or are the limitations noted above so overpowering that integration cannot be achieved? Is resurrecting Factotum a possibility, or, given our current financial climate, is a blog like this a more likely venue for addressing these issues? An advisory board is long overdue; we need scholarly input to get to ECCO and to be implemented by that database, but how would this work? Any single issue such a board took on would be complicated: trying to improve ECCO’s full-text search mechanism, for example, is linked to faulty OCR, faulty or misleading bibliographical entries, duplicate entries, ECCO’s relation to ESTC, and so forth. So making ECCO stronger is not a simple task. If we are serious about creating text-bases that serve us well, we will need to work collectively and patiently on this project, which is likely to take years. The benefits, however, are potentially enormous.
    AB

  8. Eleanor Shevlin Says:

    Here are some responses to Anna’s queries. I would be interested in hearing comments from others:

    1) Integration of ESTC and ECCO: ECCO, unlike EEBO, is based on ESTC. MARC records, offering subject headings, for ECCO titles are now available through the collaborative efforts of ECCO and ESTC. See the FAQs for both ECCO and MARC records as well as Jeffrey Garrett’s article “Subject Headings in Full-Text Environments: The ECCO Experiment.” Unfortunately, the link to sample records on Gale’s FAQs page does not seem to be working. Note that these records must be purchased separately.

    The example of the MARC records is one integration that has already transpired, but because it needs to be purchased separately and because it has only been recently released, many may not have experienced this feature.

    2) As for other types of integration, I think updates/comments need to be channeled through ESTC (just my opinion)–whether they come from Gale passing along comments it has received from users, from individual scholars, or from librarians who have discovered or been alerted to problems. I say this because of quality control. It seems as if Gale is willing to add updates and corrections to its titles. All of this said, that does not address the funding ESTC needs to process these changes in a timely fashion. Having ASECS, similarly relevant scholarly associations, and individual scholars alert the original bodies (NEH, for instance) that funded ESTC about the need for additional work (and the importance of this work to the overall value of the tool) *might* change attitudes that the project is complete and might see results.

    Projects such as Ben’s also have great potential for furthering updates, corrections, and so forth, but these efforts need to make direct corrections with ESTC as Ben has.

    3) Resurrecting Factotum was my suggestion after hearing Brian express his realization that better communication needs to be fostered with scholars. Factotum was originally published by the BL, and I was hoping that UC Riverside might consider hosting if it had reliable, dedicated help from volunteers. For example, Brian could find scholars interested in serving as contributors or advisory editors for such a newsletter (or this blog could help find volunteers, if Brian was at all inclined to consider launching one). The newsletter could perhaps be issued twice a year as a PDF distributed to those interested (perhaps available on this and similar blogs, the ESTC’s UC Riverside and BL ESTC websites, etc.) and also sent to libraries. It could also be fee-based, but that could be more complicated (perhaps it could move to a fee-based publication; I would subscribe). We could also see if an existing publication–Studies in Bibliography, Papers of the Bibliographical Society of America, and even (though less likely) Eighteenth-Century Studies–might be willing to include a section annually offering news and developments (in this scenario, the publication would need to find a scholar with the expertise and ties to assume duties as the main section editor/correspondent; this person could coordinate with Brian or someone else at ESTC). There are so many issues/obstacles/problems to a newsletter here that I have not addressed, but I offer this skeletal view to foster discussion. I don’t think that such a newsletter is an impossibility.

    4) A blog such as this one is also potentially an excellent means for communication (especially for ongoing, daily, weekly discussions), but I worry about its reach to those who have valuable expertise but do not blog or read blogs.

    5) A board and/or an ASECS committee (both ideas suggested to me by four or five people after thinking about the exchanges–several of whom would be in a position to make these happen) would probably be helpful in developing and/or ensuring ways to correct faulty or misleading bibliographic entries, duplicate titles, and so forth (see points 2 and 3 discussed above) as well as supplying informed advice for any future electronic projects that might be undertaken under the auspices of Gale and/or ESTC, BL, etc.–as well as those devised by other vendors. Of course, in order for a board to play a direct role in many of these issues in ECCO, Gale would need to want to have such an advisory body (I think there’s interest there). There’s already a board for ESTC, so there it would be a matter of adding a subcommittee or the like. It seems as if ASECS, as the main representative body of 18th-century scholars, should add an official committee to review, advise, lend assistance, give their “scholarly stamp of approval,” etc. to electronic resources related to our period. A group of scholars could submit a formal proposal to ASECS to do so.

    Frankly, like the problem with faulty pages, it would seem that improving ECCO’s search function through better OCR technology is just not feasible. It would require rescanning all the film with the better technology.

    As for other projects, my main, immediate interest is improving the accuracy of records/information in ESTC and ECCO as new information arises. I also am very interested in Ben’s project. Finally, though not necessarily of immediate relevance here, I am interested in projects that allow one to search online catalogues of archival records and to order desired documents online (and have them delivered online) for a fee (I am thinking of the British National Archives/A2A Wills service–a huge boon).

    • Benjamin Pauley Says:

      Lots and lots of good ideas here that should be a good jumping off point for some very big discussions.

      Just wanted to note, regarding the penultimate paragraph: improving search through better OCR actually is something that 18thConnect is looking to take on first.

      My understanding is that Gale/Cengage has given them access to all of the page images, which they’ll re-process (not rescanning the films, then, but in effect the same thing). Another key point, as I understand, is that 18thConnect has also been given access to the triple-keyed clean texts done by the Text Creation Partnership, which will greatly facilitate training the OCR scheme they’ll be developing. That’s where those one million hours of supercomputing time are going to come in handy. But Laura Mandell would be the one to shed light on this point.

      • Eleanor Shevlin Says:

        You are absolutely right about 18thConnect’s undertaking the OCR issue–I should have mentioned this point. Your understanding about Gale/Cengage’s willingness to work with 18thConnect and in what capacity matches mine based on Laura Mandell’s and Bob Markley’s descriptions of 18thConnect goals. I think I was too focused on what an advisory board to Gale might do (of course, such a board would no doubt include members of 18thConnect). Laura Mandell has posted earlier on emob, too.

      • Anna Battigelli Says:

        I would just add that until scanning is more reliable, which necessitates better OCR technology, ECCO won’t be what it can be. Access to a rich archive is, of course, great, but scanning is still the greatest asset ECCO offers. And it is the asset that most promises to transform literary studies. So 18thConnect is the great hope here.

  9. Anna Battigelli Says:

    Now that’s a helpful list! I particularly like the idea of a PDF file for Factotum. That wouldn’t be very expensive, and it might foster greater communication and clearer updates. It seems to me that building an audience for this kind of thing is the key. And part of building that audience is explaining the current situation clearly.

  10. Eleanor Shevlin Says:

    Alerting former subscribers to the new newsletter; having an article/announcement in the ASECS newsletter, the EC Intelligencer, and SHARP News (40% of its members are interested in 18th-century topics); and publicizing on SHARP-L, C18th-L, and blogs would be the first steps in alerting potential readers to such a resurrected “Factotum” (or whatever name is devised).

    I suspect that the audience is there–it’s just a means of spreading the word…

    • Anna Battigelli Says:

      Those are good suggestions for building an audience, Eleanor. I also think that demystifying these issues by writing about them clearly will on its own draw an audience.

      • Eleanor Shevlin Says:

        Your point about clearly articulating these issues, Anna, seems key, and doing so would also help convey why all scholars should care about these developments.

  11. Anna Battigelli Says:

    Though this is perhaps only an interim solution to the problems Jim Tierney pointed out, couldn’t the following be added to Burney’s front page sidebar:

    17th-18th Century
    Burney Collection Newspapers & Periodicals
    1 million newspaper pages
    Newspapers, newsbooks, Acts of Parliament, addresses, broadsides, pamphlets, proclamations, periodicals
    The most comprehensive collection of early English newspapers and periodicals
    Titles from London, British Isles, and colonies

    It would also help to have Jim’s list of periodicals accessible on the site itself. In fact, richer resource pages are needed for both ECCO and Burney.

    If an advisory committee gets established, looking at how user-friendly these sites are would be a good first step.
    AB

  12. Eleanor Shevlin Says:

    Anna,

    These are good suggestions–and would seem easy to implement in some ways (though Gale/Cengage might worry that altering the product name by adding “periodicals” at this point could cause confusion). Interestingly, the British Library, which oversaw the digitization of the collection, calls this project the British Newspapers Collection with no mention of “Burney” (see the BL’s final report).

    Gale/Cengage actually already has an advisory board for Burney that includes Brycchan Carey and Markman Ellis. (The board may have been organized by the BL, though).
