Corpora Resources

In my research on recognizing children's understanding of science concepts, I led the development of an annotated corpus of elementary students' responses to assessment questions.

We acquired grade 3-6 responses to 287 questions from the Assessing Science Knowledge (ASK) project, conducted by the University of California, Berkeley (Lawrence Hall of Science, 2006). The responses, which range in length from moderately short verb phrases to several sentences, cover sixteen diverse teaching and learning modules spanning life science, physical science, earth and space science, scientific reasoning, and technology. We generated the corpus by transcribing a random sample of approximately 15,400 of the students' handwritten responses.

The ASK assessments included a reference answer for each constructed-response question. We decomposed these reference answers into fine-grained facets and annotated each facet according to the student's apparent understanding of it. For more detail on the corpus, please see my Publications page, in particular:

Rodney D. Nielsen, Wayne Ward, James H. Martin, and Martha Palmer. (2008). Annotating Students' Understanding of Science Concepts. In Proceedings of the Sixth International Language Resources and Evaluation Conference (LREC'08), Marrakech, Morocco, May 28-30, 2008. Published by the European Language Resources Association (ELRA), Paris, France.

See downloads below.

Training Data

Download the annotated learner answer corpus.

Download the reference answer markup.