Posts Tagged ‘Proquest’

Collaborative Reading: Elizabeth Scott-Baumann and Ben Burton’s “Encoding form: A proposed database of poetic form”

March 8, 2010

Elizabeth Scott-Baumann and Ben Burton’s paper, “Encoding form: A proposed database of poetic form”, written for the recent E-Conference (February-March 2010) of APPOSITIONS: Studies in Renaissance / Early Modern Literature and Culture, suggests how new digital resources can be developed to augment the capabilities of existing tools such as EEBO and ECCO. Responding many years later to Heather Dubrow’s 1979 call for “new methodology in early modern studies,” Scott-Baumann and Burton are constructing a database devoted to poetic form. Their project will afford a means of studying poetic form both historically and formally by enabling queries about form and generic transformations that resemble those we can now pose about words, thanks to electronic databases such as EEBO and ECCO:

  • What is the origin (or origins) of a given form?
  • How does its structure, use, and meaning change over time?
  • Are there variations in use and meaning in different regions, or among different groups?
  • How does a given form relate to others, and how does this relationship change over time?

Concentrating on sixteenth- and seventeenth-century poetry, Scott-Baumann and Burton will use existing EEBO-TCP texts and enhance them with additional mark-up that builds upon Text Encoding Initiative (TEI) tags. As those familiar with TEI documentation will recall, its tags include ones designed for encoding verse: “stanza divisions, caesurae, enjambment, rhyme scheme, and metrical information, as well as a special purpose rhyme element to support the simple analysis of rhyming words.” Because encoding can go beyond marking general formal conventions and can also represent interpretive judgments, Scott-Baumann and Burton will experiment with both possibilities. The inevitably time-consuming nature of their task will probably mean building the database in stages.
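
To make the TEI verse vocabulary mentioned above a little more concrete, here is a minimal sketch in Python. It is not the project's actual encoding or tooling. It assumes a TEI-style fragment without namespaces that uses the `<lg>`, `<l>`, and `<rhyme>` elements with the `rhyme` and `label` attributes; the sample quatrain (the opening of Shakespeare's Sonnet 18) and the function names are my own choices for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative TEI-style fragment (no namespace, chosen for this sketch only).
TEI_FRAGMENT = """
<lg type="quatrain" rhyme="abab">
  <l>Shall I compare thee to a summer's <rhyme label="a">day</rhyme>?</l>
  <l>Thou art more lovely and more <rhyme label="b">temperate</rhyme>:</l>
  <l>Rough winds do shake the darling buds of <rhyme label="a">May</rhyme>,</l>
  <l>And summer's lease hath all too short a <rhyme label="b">date</rhyme>:</l>
</lg>
"""


def rhyme_words_by_label(xml_text):
    """Group the text of each <rhyme> element under its label attribute."""
    root = ET.fromstring(xml_text.strip())
    groups = {}
    for rhyme in root.iter("rhyme"):
        groups.setdefault(rhyme.get("label"), []).append(rhyme.text)
    return groups


def observed_scheme(xml_text):
    """Read the rhyme scheme off the line-level tags ('x' marks an unrhymed line)."""
    root = ET.fromstring(xml_text.strip())
    labels = []
    for line in root.iter("l"):
        rhyme = line.find("rhyme")
        labels.append(rhyme.get("label") if rhyme is not None else "x")
    return "".join(labels)


if __name__ == "__main__":
    print(rhyme_words_by_label(TEI_FRAGMENT))
    print(observed_scheme(TEI_FRAGMENT))
```

Comparing the scheme read from the individual lines with the scheme declared on the `<lg>` element is one simple way such mark-up could be checked for consistency.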

    As for publication plans for the database, its creators “aim to negotiate with EEBO and Chadwyck-Healey to find a form of publication which both respects intellectual property and commercial interests, while also making this rich new material accessible to the widest possible audience.” Scott-Baumann and Burton have clearly thought hard about issues of access and how to maximize this database’s availability for users. They present four different possible options, formulated with an eye to those lacking access to EEBO. As they note though, much will depend on what arrangements they are able to make with EEBO/Chadwyck-Healey.

    Noting that their database, once built, could be expanded beyond its present focus on the 1500s and 1600s to cover all periods of poetry, they then devote a section of their paper to its potential scholarly and pedagogical uses. Most obvious perhaps is the usefulness this planned tool could have in advancing work in historical formalism, an emerging approach that revisits “poetic form as historically specific, historically determined, and historically efficacious.” The ability to conduct specific searches across a significant number of poetic texts enables the quick capture of evidence to support or disprove what are currently only hypothetical propositions based on a small textual sample. Rightly claiming that this database “would change the way in which scholarship on poetic form is conducted,” Scott-Baumann and Burton detail a wealth of possible questions and issues it could serve. This section also offers a range of pedagogical uses for this tool and addresses a range of audiences from the undergraduate to the secondary student.

    Before a brief conclusion, the paper then turns to discussing the two-stage pilot project for the database:

    1. A small database containing information on the metrical structures and rhyme schemes of all verse in the first edition of 10 texts published between 1590 and 1599.
    2. A larger database containing information on the metrical structures and rhyme schemes of all verse in first editions of texts published during this period.
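
As a rough illustration of the kind of query such a pilot database might answer (for example, "What is the origin of a given form?"), here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, column names, and the four rows are placeholders invented for this example; they are not the authors' schema and not real bibliographic data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical one-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE poem ("
        "id INTEGER PRIMARY KEY, source_text TEXT, year INTEGER, "
        "rhyme_scheme TEXT, metre TEXT)"
    )
    # Placeholder rows, invented purely to exercise the queries below.
    rows = [
        ("Text A", 1591, "ababcc", "iambic pentameter"),
        ("Text B", 1593, "ababcc", "iambic pentameter"),
        ("Text C", 1595, "abab", "iambic tetrameter"),
        ("Text D", 1598, "ababcc", "iambic tetrameter"),
    ]
    conn.executemany(
        "INSERT INTO poem (source_text, year, rhyme_scheme, metre) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn


def first_appearance(conn, scheme):
    """Earliest year in which a rhyme scheme is recorded (None if absent)."""
    row = conn.execute(
        "SELECT MIN(year) FROM poem WHERE rhyme_scheme = ?", (scheme,)
    ).fetchone()
    return row[0]


def uses_per_year(conn, scheme):
    """Count of poems using a rhyme scheme, year by year."""
    return conn.execute(
        "SELECT year, COUNT(*) FROM poem WHERE rhyme_scheme = ? "
        "GROUP BY year ORDER BY year",
        (scheme,),
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(first_appearance(db, "ababcc"))
    print(uses_per_year(db, "ababcc"))
```

Even at this toy scale, the point is that questions about a form's first appearance or changing frequency become one-line queries once metre and rhyme scheme are recorded as data.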

    Scott-Baumann and Burton’s database plans present another way of thinking about EEBO and how to augment its value. That they have proposed to build their database using EEBO-TCP seems essentially a wise plan, notwithstanding unsettled questions about access.* For one, linking one’s project to an already well-established resource should ensure its visibility. Too often very worthy projects are launched but remain unknown to many who would benefit from them. In addition, such a tie-in helps ensure continuity among resources. This augmentation of EEBO’s capabilities and the efforts to provide continuity are similar to what NINES and 18thConnect are offering for later periods.

    *One of the access options does offer “[o]pen access to database and texts but not with mark up. …if we are not able to make the XML-encoded texts freely available, we would display the texts in their entirety [as users request them], but with the encoding invisible. … and display the verse with, for example, its stresses marked with accents, or its rhyme scheme colour-coded, rather than with visible tags.”

    Collaboration, Costs, and Digital Resources

    January 30, 2010

    On February 19 and 20 Yale will host a graduate student symposium, The Past’s Digital Presence Conference: Database, Archive and Knowledge Work in the Humanities. A quick survey of the conference program and available abstracts indicates several topics that dovetail with issues or subjects that have engaged emob. Jessica Weare’s paper, “The Dark Tide: Digital Preservation, Interpretive Loss, and the Google Books Project”, for instance, examines the discarding of material evidence in the process of digitizing Vera Brittain’s The Dark Tide. Similarly, Scott Spillman and Julia Mansfield’s presentation, “Mapping Eighteenth-Century Intellectual Networks”, discusses their work on Benjamin Franklin’s letters and their relationship within the Republic of Letters. The conference’s purpose also addresses many of the questions we have been posing on this blog:

    ■ How is digital technology changing methods of scholarly research with pre-digital sources in the humanities?
    ■ If the “medium is the message,” then how does the message change when primary sources are translated into digital media?
    ■ What kinds of new research opportunities do databases unlock and what do they make obsolete?
    ■ What is the future of the rare book and manuscript library and its use?
    ■ What biases are inherent in the widespread use of digitized material? How can we correct for them?
    ■ Amidst numerous benefits in accessibility, cost, and convenience, what concerns have been overlooked?

    Peter Stallybrass is offering the keynote, and Jacqueline Goldsby will be the colloquium speaker, while Willard McCarty, Rolena Adorno, and others will appear on the closing roundtable. Such a lineup points to the range of perspectives represented. The conference is free to all affiliated with a university.

    Among the places this conference has been announced is the JISC Digitisation News section of the UK Digitisation Programme website, and its announcement emphasizes the participation of students “from around the globe.”

    Collaboration as it occurs across boundaries is the implicit topic of this posting, and I wish to use reports from the JISC website both as a springboard and as a contrast in discussing the topic.

    A JISC report on the Enriching Digital Content program (a strand of the JISC Online Content Program), Enriching Digital Resources 2008-2009, features a podcast with Ben Showers. Because of the national nature of JISC, the program described offers a unified, coherent approach to advancing digital resources for the UK’s institutions of higher education; it represents a collaborative agenda. In this podcast Showers explains the purpose of the program: rather than fund the creation of new resources, the program invested £1.8 million to enhance and enrich existing digital content while also developing a system for universities and colleges to vet and recognize this work. He then explains the following four key benefits of this program:
    • “unlocking the hidden—making things that are hard to access easy” to obtain and preserve. To illustrate, he uses the CORRAL (UK Colonial Registers and Royal Navy Logbooks) project as an example of opening up primary data not only to make it much more available but also to preserve it.
    • enhancing experiences of students. Here Showers cites the Enlightening Science project at Sussex, which offers students opportunities to watch video re-enactments of Newton’s experiments and read original texts by Newton and others.
    • speeding up research—once a document has been digitized, there is no need to repeat the process. The document will now be available for all other researchers to use.
    • widening participation—engaging broader audiences including not only faculty and students within Britain’s educational community but also participants globally.

    Turning to the new goals for the 2009-2011 program cycle, Showers notes an emphasis on the “clustering” of content, that is, bringing various projects together and establishing, when appropriate, links among them. Another focus is further building skills and strategies within institutions to deliver digital content effectively. Finally, he mentions the strengthening of transatlantic partnerships, and here the US National Endowment for the Humanities (NEH) is given as an example. Of course, there is a long history of scholarly collaboration between the NEH and British institutions, perhaps most notably on the English Short Title Catalogue (ESTC).

    Indeed, through collaborative digital grants offered by JISC and NEH several transatlantic projects are underway or near completion, including the Shakespeare Quartos Archive, a collaborative effort involving Oxford University and the Folger Library, and the St Kitts-Nevis Digital Archaeology Initiative, undertaken by Southampton University and the Thomas Jefferson Foundation, Charlottesville, VA, to advance scholarship on slavery. There are several others as well.

    Both the goals and benefits detailed by Showers are ones that would attract the support of diverse parties, and they do parallel many arguments being made on this side of the Atlantic for such work, including ones advanced by the NEH. Moreover, this and other JISC reports suggest that JISC has also helped broker mutually beneficial relationships between British universities and commercial vendors such as Cengage-Gale and ProQuest. Yet another JISC report, The Value of Money, offers arguments that we need to be making and also points to the obstacles and divides affecting various types of collaboration in the United States.

    After offering the following figures on the return on money invested in JISC,

    • For each £1 spent by JISC on the provision of e-resources, the return to the community in value of time saved in information gathering is at least £18.

    • For every £1 of the JISC services budget, the education and research community receives £9 of demonstrable value.

    • For every £1 JISC spent on securing national agreements for e-resources, the saving to the community was more than £26.

    the report summary offers the following remarks:

    These are the figures revealed by a recently-published Value for Money report on JISC services. Although many countries have centrally provided research and education networks, and some have provided supplementary services, no other country has a comparable single body providing an integrated range of network services, content services, advice, support and development programmes.

    The cost-effectiveness of JISC is again highlighted in two sidebars:

    These figures suggest that for every £1 JISC spent on securing national agreements for e-resources, the saving to the community was more than £26
    and
    The added value, equivalent to more than £156m per year, suggests the community is gaining 1.4 million person/days, by using e-resources rather than paper-based information.

    The end of the summary further reinforces why investments in JISC benefit the UK as a whole:

    The value of JISC activities extends beyond the benefits identified here. Education and research are high-value commodities that play an important role in the UK economy and underpin the UK’s global economic position.

    The JISC’s “Value of Money” report contains the types of arguments and data that we in the US need to be presenting. While our system of higher education is not centralized in the way that the UK’s is, the push for more transparent reporting on and assessment of what our various universities and colleges are delivering perhaps provides an opportunity for new forms of collaboration. Through national scholarly societies, the NEH, Mellon Foundation, ALA, and more, we need to supply some “noisy feedback” from a dollars-and-cents/sense perspective about what investing in digital resources means not just for our institutions of higher learning but also for our society.

    Collaborative Readings #1: Ian Gadd’s “The Use and Misuse of Early English Books Online”

    July 7, 2009
    We are launching a series of “Collaborative Readings,” borrowing the model popularized so successfully by David Mazella and Carrie Shanafelt on The Long Eighteenth, to discuss some of the items on our bibliography.  “Collaborative Readings” can run concurrently with other postings.

    To begin this series, I’ll summarize Ian Gadd’s lucid “The Use and Misuse of Early English Books Online,” which argues that using EEBO properly requires an understanding of its evolution and of the evolution of the catalogues on which it relies.  Particularly crucial, Gadd argues, is an understanding of EEBO’s historical reliance on ESTC.

    Gadd’s article falls into three parts.  Part 1 describes the three catalogues on which EEBO and ECCO are based: 

    • STC: Pollard and Redgrave’s Short-title Catalogue of Books Printed in England, Scotland, & Ireland, and of English Books Printed Abroad, 1475-1640
    • WING: Donald Wing’s Short-title Catalogue of Books Printed in England, Scotland, Ireland, Wales, and British America, and of English Books printed in other Countries, 1641-1700
    • ESTC: English Short Title Catalogue, which began its history as The Eighteenth Century Short Title Catalogue, but eventually incorporated material from the previous two catalogues to become The English Short Title Catalogue, retaining its acronym.

    Each of these catalogues uses different cataloguing principles and different criteria of inclusion.  The former two differ in what they include, but both catalogue books that have been located (as opposed to copies known to have existed).  The ESTC, on the other hand, began as a computerized and comprehensive union catalogue, merging “together the existing catalogue records of other libraries.”  Because the ESTC includes items in the previous two catalogues, it is, as Gadd puts it,

    a hybrid database consisting of three sets of catalogue records, each constructed on different principles.  Searching across these record sets, therefore, poses problems: the unsuspecting student, for example, interested in Stationers’ Company registrations of works might assume that registrations all but dried up after 1640 when in fact this is simply a consequence of information that STC recorded but Wing and ESTC routinely did not.
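
    Gadd's warning about searching across record sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field name `stationers_entry`, and the four sample records are hypothetical, and `None` stands for "this catalogue did not routinely record the information" rather than "the work was not registered."

```python
# Hypothetical records: STC entries carry registration data, Wing entries do not.
RECORDS = [
    {"source": "STC", "year": 1635, "stationers_entry": True},
    {"source": "STC", "year": 1638, "stationers_entry": False},
    {"source": "Wing", "year": 1645, "stationers_entry": None},
    {"source": "Wing", "year": 1660, "stationers_entry": None},
]


def period_of(record):
    """Split records at the STC/Wing boundary of 1640."""
    return "to 1640" if record["year"] <= 1640 else "after 1640"


def naive_registered_count(records):
    """Count registrations per period, silently treating missing data as 'no'."""
    counts = {"to 1640": 0, "after 1640": 0}
    for record in records:
        if record["stationers_entry"]:
            counts[period_of(record)] += 1
    return counts


def coverage_aware_summary(records):
    """Keep 'not registered' and 'not recorded' apart for each period."""
    summary = {}
    for record in records:
        bucket = summary.setdefault(
            period_of(record),
            {"registered": 0, "not_registered": 0, "not_recorded": 0},
        )
        value = record["stationers_entry"]
        if value is None:
            bucket["not_recorded"] += 1
        elif value:
            bucket["registered"] += 1
        else:
            bucket["not_registered"] += 1
    return summary


if __name__ == "__main__":
    print(naive_registered_count(RECORDS))
    print(coverage_aware_summary(RECORDS))
```

    The naive count shows registrations apparently vanishing after 1640, while the second summary makes clear that the later records are simply silent on the question.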

    Part 2 details the evolution of microfilm collections based on these catalogues and their eventual digitization.  Two companies oversaw this process, eventually producing first EEBO then ECCO.

    • UMI: University Microfilms used STC and Wing to produce two series of microfilm collections known as “Early English Books, 1475-1640” and “Early English Books, 1641-1700.”  In 1998, UMI (now ProQuest) digitized copies from these collections to produce EEBO.
    • Research Publications produced a rival microfilm set based on the ESTC.  In 2003, Thomson Gale (now Gale/Cengage) digitized copies from this collection to produce ECCO.

    EEBO was permitted to use the bibliographical records of the ESTC, but

    it did so for its own purposes: certain categories of data were removed (e.g. collations, Stationers’ Register entrances), some information was amended (e.g. subject headings), and some was added (e.g. microfilm-specific details).

    Additionally, there was no formal mechanism for synchronizing the data between the two resources.  Consequently, two divergent holding records exist in EEBO’s and ESTC’s respective catalogues. 

    Gadd’s cautionary note pertains to the divergence between these two catalogues:

    As both resources continue to amend and expand their bibliographical data for their own purposes, there is an increasing likelihood of significant discrepancy between the two resources. . . . there is no absolute one-to-one correspondence between the pre-1701 entries in ESTC and the materials on EEBO; there are—and will always be—items on ESTC not available on EEBO.

    Because different copies in the same edition can vary, there is, Gadd explains,

    a vital difference between any single bibliographical record on EEBO and the corresponding ‘image set’: the former describes the particular edition (or issue), the latter is taken from one copy from that particular edition. Moreover, unlike scholarly facsimile editions, the selection process for microfilming was often arbitrary.  Copies were selected primarily by reference to the copies listed in STC and WING, with particular preference for certain major collections; they were not selected because they were considered representative of a particular edition.

    Gadd suggests that EEBO refer to itself as “a library of copies, rather than a catalogue of titles.”

    Gadd commends ProQuest for its receptivity toward the scholarly community.  Part 3 briefly reviews ECCO, noting its “underlying text-transcription,” which allows for searches but is flawed by the inaccuracy of the OCR software it uses. 
