Teaching Digital Computation?


Suppose one were to ask undergraduates to undertake the most basic of digital computational projects: say, measuring the percentage of a novel's words that appear in dialogue compared to the percentage devoted to narration.  How would one calculate that, and what tools would be needed?

What other elementary computational exercises might one use in an undergraduate classroom?

Finally, how many of you ask undergraduates to explore digital computation of any kind?  Which digital tools are necessary for such projects?  And how useful is this kind of exercise in the classroom?


21 Responses to “Teaching Digital Computation?”

  1. Eleanor Shevlin Says:

    Ideally students would have access to a marked-up text (one with embedded coding in which “dialogue” portions are tagged, e.g., a TEI-encoded text). Then the exercise would be fairly simple. If not, then the students could mark the dialogue portions themselves. Work with just the body of the text, of course; that is, delete paratextual items such as prefaces, footnotes, any critical variant apparatus, and so forth, so you are dealing with just the actual text of the novel.

    Obtain the total word count, and then search for the marked dialogue portion and obtain the word count of this portion.
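
    The two counts described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method anyone in this thread has tested: it assumes the edition is TEI-encoded in the standard TEI namespace, that speech is tagged with the TEI `<said>` element (some editions use `<q>` or another tag instead), and that tagged speech is not nested inside other tagged speech. The sample text and the function names are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Standard TEI namespace; an edition that omits it would need TEI_NS = "".
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# A "word" here is a run of letters, optionally joined by apostrophes or hyphens.
WORD_RE = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")


def count_words(text):
    """Count word-like tokens in a string."""
    return len(WORD_RE.findall(text))


def dialogue_share(tei_xml, speech_tag="said"):
    """Return (dialogue_words, total_words, percent) for the <body> only.

    Front and back matter (prefaces, notes) sit outside <body> in TEI,
    so they are excluded from both counts.
    """
    root = ET.fromstring(tei_xml)
    body = root.find(f".//{TEI_NS}body")
    if body is None:
        raise ValueError("no <body> element found")
    total = count_words("".join(body.itertext()))
    # Assumes speech elements are not nested; nested ones would be counted twice.
    dialogue = sum(
        count_words("".join(el.itertext()))
        for el in body.iter(f"{TEI_NS}{speech_tag}")
    )
    percent = 100.0 * dialogue / total if total else 0.0
    return dialogue, total, percent


# Invented sample, not taken from any real edition.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <front><p>A preface that should not be counted.</p></front>
    <body>
      <p><said>Do you like novels?</said> she asked him quietly.</p>
      <p>He paused. <said>Very much indeed.</said></p>
    </body>
  </text>
</TEI>"""

if __name__ == "__main__":
    dialogue, total, percent = dialogue_share(SAMPLE)
    print(f"{dialogue} of {total} words are dialogue ({percent:.1f}%)")
```

    For a different tagging scheme, pass another element name, for example `dialogue_share(xml_text, speech_tag="q")`.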


  2. Anna Battigelli Says:

    Thanks! Where would I find a TEI encoded copy of, say, Northanger Abbey?


  3. shgregg Says:

    Hi Anna. I’ve been working with my undergraduates and using word-frequency analysis too. For your particular task, I’m not sure there is a TEI/XML encoded version of Northanger Abbey, so I’d go for getting a plain text file from, say, Project Gutenberg and follow Eleanor’s advice. It will be a tiresome, but enlightening, process to separate dialogue from the rest, but perhaps it could be a collaborative task, splitting the novel up across teams?

    But there are other fun things to do with computers and texts. I’d highly recommend http://voyant-tools.org/. This is a reliable online tool that can count and visualise word frequencies in a variety of ways. It can also calculate the number of unique words against the total, giving you a ratio which might start a discussion about linguistic complexity (esp. good for poetry and drama). I’ve also had students create a corpus of a largish number of novels (12-20) and analyse keywords across this mini-corpus, and others analyse dozens of titles (great with the long titles of 18thC novels). NB. I’ve just seen that this is down for maintenance, which is unusual.
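
    For anyone who wants to show students the arithmetic behind the unique-words-to-total ratio (often called a type-token ratio), here is a minimal Python sketch. It is not how Voyant computes its figures; the tokenising rule, the function names, and the sample sentence are all assumptions made for illustration. Note that this ratio falls as texts get longer, so it is best compared across passages of similar length.

```python
import re
from collections import Counter

# Lower-cased runs of letters, optionally joined by apostrophes or hyphens.
WORD_RE = re.compile(r"[a-z]+(?:['’-][a-z]+)*")


def tokenize(text):
    """Split a text into lower-cased word tokens."""
    return WORD_RE.findall(text.lower())


def type_token_ratio(text):
    """Unique words (types) divided by total words (tokens)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def top_words(text, n=5):
    """The n most frequent words with their counts."""
    return Counter(tokenize(text)).most_common(n)


# Invented sample sentence.
SAMPLE = "The servant told the Lord that the servant's work was done."

if __name__ == "__main__":
    print(f"type-token ratio: {type_token_ratio(SAMPLE):.2f}")
    print(top_words(SAMPLE, 3))
```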

    What I’ve found with word-frequency analysis is that it gets students to think about historically-precise terms (e.g. early searches for words like ‘class’ or ‘race’ didn’t work, but ‘Lord’ or ‘servant’ or ‘negroe’ or ‘moor’ might). And constructing a corpus gets them to think about genre, periodicity, and literary history. By all means borrow my ideas for an undergrad module I ran on DH and the 18C here https://dlsatbsu.wordpress.com/ (click ‘Worksheets’ and see weeks 15 to 18).


    • Eleanor Shevlin Says:

      Some wonderful ideas here, Stephen. Your remarks about titles made me think of the work I did too long ago by hand: creating charts of the keywords, first names and surnames found in titles of all works identified in various bibliographies of novels/prose fiction from 1740 through 1774. Very tedious, but the analysis and findings made it worth it. These new tools have certainly made our lives easier.


    • Eleanor Shevlin Says:


      I particularly like your focus on class, and I’ve done some work on that concept using occupations and terms of status as well as names within small sets of texts. As you well know with all your work on Defoe, his use of occupations in lieu of proper names offers an excellent starting point. The News-paper Wedding project that my seminar created last spring (we are actually still working on it, and we hope to make it public before too long) dealt a lot with searching and cross-searching this text with Campbell’s The London Tradesman.


  4. Dave Mazella Says:

    This person has an interesting approach. https://twitter.com/jenterysayers/status/814499300407410688


  5. Dave Mazella Says:

    I haven’t taught with them systematically, but the voyant tools look like they’d be useful, too. https://voyant-tools.org/


    • Eleanor Shevlin Says:

      I’ve not used the voyant tools for classes, but I had almost suggested them. Yet I also think that you then need to devote more time to teaching skills. It would be good if we had more courses such as Jentery Sayers’s two-week immersion course that you just referenced, Dave. It seems to be a graduate one, but there’s no reason that it could not be designed for undergraduates, too. We have started a DH minor at WCU, but while it is proving attractive, we are not getting as many students creating literary digital projects.

      I will be requiring a digital project in my seminar this spring, but this time I am introducing digital skills at the very first meeting.


  6. Anna Battigelli Says:

    Here’s what I’m doing. I’m cutting and pasting the text of a novel, highlighting dialogue, deleting it, and dividing the narrative word count by the total word count. I think you’d have to do it this way with Austen because quotation marks so often denote reported dialogue, not actual dialogue.

    If there are better ways to do this, I’d like to hear about them, but this works.
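
    A rough first pass at the same calculation can be automated, with a large caveat. The Python sketch below treats everything inside double quotation marks as dialogue, which is exactly the assumption that fails for Austen's reported speech, so its output is best used as a starting estimate to check against hand-marked passages. It also assumes each opening quotation mark has a matching close and ignores single-quote dialogue. The function names and the sample are invented for illustration.

```python
import re

# A "word" is a run of letters, optionally joined by apostrophes or hyphens.
WORD_RE = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")

# Text between curly double quotes or between straight double quotes.
QUOTE_RE = re.compile(r'“[^”]*”|"[^"]*"')


def count_words(text):
    """Count word-like tokens in a string."""
    return len(WORD_RE.findall(text))


def narration_share(text):
    """Return (narration_words, total_words, percent_narration).

    Quoted spans are counted as dialogue and subtracted from the total,
    mirroring the delete-the-dialogue-then-divide procedure.
    """
    total = count_words(text)
    quoted = sum(count_words(span) for span in QUOTE_RE.findall(text))
    narration = total - quoted
    percent = 100.0 * narration / total if total else 0.0
    return narration, total, percent


# Invented sample, not a quotation from any novel.
SAMPLE = "“Very well,” said he, “I shall go.” She made no reply."

if __name__ == "__main__":
    narration, total, percent = narration_share(SAMPLE)
    print(f"{narration} of {total} words are narration ({percent:.1f}%)")
```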


    • Eleanor Shevlin Says:

      That makes sense if you don’t have texts in which the dialogue (in all forms) is encoded. If you wanted to assess other aspects of the dialogue (all its forms, its placement at various points in a novel, and so forth), then it would make sense to mark up the dialogue, which would probably take about the same amount of time as your cut-paste-highlight approach.


  7. shgregg Says:

    Hi Anna. I tried posting a comment that discusses other examples of word-frequency tasks for undergrads (on Dec 29) but it seems to be still ‘awaiting moderation.’


    • Eleanor Shevlin Says:

      Hi, Stephen–

      Apologies. We did not receive any notifications that we had comments awaiting approval. I have just approved yours and another that were waiting.


  8. Snakeweight Says:

    Those working on Austen (who don’t already know of this resource) might find this a helpful jumping off point: http://www.janeausten.ac.uk/manuscripts/index.html. I don’t think that the texts are encoded, but they’re at least closely tied to the manuscripts, which is interesting.

    A wealth of digital tools and approaches are listed in the DiRT Directory: http://dirtdirectory.org/. I’m writing an article about some of these tools and digital approaches to the archive (17th and 18thC focus) for Literature Compass which I think will be out in the spring.

    If you’d like to take a collaborative approach to marking up text, say with a class or with a broader community, you can build your own project on the free crowdsourcing platform Zooniverse using the Project Builder (https://www.zooniverse.org/lab). There is a host of how-to information on the site about setting up a project, but if you have any questions you can email me at victoria@zooniverse.org. The Project Builder allows you to download your own data in csv format. If you have time in the course of a semester, your students could have the experience of setting up a project and using the data for their research.


  9. Eleanor Shevlin Says:


    Thanks for all this. We discussed DiRT (and Bamboo) a long time ago, but I see it does not appear as a resource here. We’ll add it.

    I look forward to your Compass article.


  10. Anna Battigelli Says:

    Thanks, Victoria and Stephen, for these recent excellent suggestions. One thing that is emerging from this discussion is that having students code Austen novels would be a great way to teach free indirect discourse.


  11. Arno Bosse Says:

    Hi Anna, since you’re also interested in giving undergraduate students a structured framework for exploring computational thinking more generally, you might find Wolfram’s Programming Lab of value (http://www.wolfram.com/programming-lab/). See, for example, the various examples listed under the explorations (http://www.wolfram.com/programming-lab/explorations/). The free version will be more than enough to get started with the Wolfram Language. If they are interested, a good next step would be working through the examples and exercises in the Elementary Introduction textbook (http://www.wolfram.com/language/elementary-introduction/). Students aren’t likely to go on to use the Wolfram Language for their next digital humanities project, but if I understood you correctly, that wasn’t the point of your exercise either. It was instead to introduce them to computational methods for text analysis that are simple to use and simple to expand, and for that the WL is certainly well suited.


    • Anna Battigelli Says:

      Many thanks, Arno, for calling my attention to Wolfram’s Programming Lab.

      The three links are useful introductions to this program.

      Because FID is complex, it still seems necessary to code the text myself, cut out spoken dialogue, and do a word count of respective narrative and dialogue sections.


    • Eleanor Shevlin Says:

      Yes, many thanks, Arno. I had not heard of the Wolfram Language, and it seems it would enable me to try more things with undergraduates than I have to date.

