Among the text databases included in the Monk (Metadata Offer New Knowledge) Project are ECCO and EEBO, both of which are part of the Text Creation Partnership (TCP). Although the project does not address bibliographic errors, it is relevant to our discussions about improving these tools. In particular, its efforts appear to be aimed at giving scholars the means to work more effectively, and simultaneously, with texts created and housed in different databases.
A recent PowerPoint presentation about the Monk Project by John Unsworth, “Tools for Textual Data” (May 20, 2009), sketches such issues as treating text as data, the project’s efforts to let users “mix and match” texts that reside in different databases, the development of features that will support the searches users may wish to conduct (for example, which adjectives does a given author favor most?), and the acceptable level of curatorial and user intervention. The tools being developed, both for posing such questions and for mining the data to answer them, seem highly promising.
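To make the adjective example concrete, here is a minimal sketch of what such a query might look like once a text has been treated as data. It assumes the words have already been labeled with their parts of speech; the sample tokens, the `ADJ` tag name, and the `favorite_adjectives` function are all invented for illustration and are not taken from the Monk Project itself.

```python
from collections import Counter

# Hypothetical pre-tagged tokens: (author, word, part-of-speech tag).
# Both the data and the tag names are invented for this illustration.
TOKENS = [
    ("Author A", "Gentle", "ADJ"),
    ("Author A", "reader", "NOUN"),
    ("Author A", "gentle", "ADJ"),
    ("Author A", "sweet", "ADJ"),
    ("Author A", "gentle", "ADJ"),
    ("Author A", "sweet", "ADJ"),
    ("Author B", "gentle", "ADJ"),
]


def favorite_adjectives(tokens, author, top_n=3):
    """Return the author's most frequent adjectives with their counts."""
    counts = Counter(
        word.lower()  # treat "Gentle" and "gentle" as the same word
        for tok_author, word, tag in tokens
        if tok_author == author and tag == "ADJ"
    )
    return counts.most_common(top_n)


print(favorite_adjectives(TOKENS, "Author A", top_n=2))
```

Even in this toy form, the sketch shows why the quality of the underlying tagging and metadata matters: the answer is only as reliable as the labels attached to each word.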
Under “Questions for Discussion” (slide 22), I was interested in the two-part query, “Should users be allowed to change, correct, or improve data? If so, under what constraints or conditions?” This question set seems directly pertinent to our discussion of how to address bibliographic issues in these databases, and it rightly also asks what sorts of constraints should (or need to) be in place, the answer to which would speak to issues of quality control. Another question, “Should those who provide collections also collect the results of work done on their collections? Why or why not?”, was surprising to me. While I could see how gathering information about the ways that the collections were being used and the results obtained could help developers improve these databases’ functionality and accuracy, the collection of this information, especially by the owners of databases that are commercial enterprises, seemed far more worrisome to me.