SENSEVAL-3 Scoring

When you submit your answers to a task, they will be scored automatically using the program scorer2 (unless otherwise indicated by the task organizers). You can download the C source code for this program to use during training and/or to test whether your answers are in the right format.

Your system will be scored for precision and recall using three different schemes:

  • Fine-grained: Your answers must match the gold-standard tags exactly.
  • Coarse-grained: Your answers will be mapped to coarse-grained senses and compared to the gold-standard tags, also mapped to coarse-grained senses. (This will not be available for all tasks, since a 'sense map' is required.  The mapping will not be published beforehand.)
  • Mixed-grained: If a sense subsumption hierarchy is available, mixed-grained scoring gives partial (not full) credit for choosing a sense that is more coarse-grained than the gold-standard tag.  See Melamed and Resnik's proposal (below) for more information.
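
As a rough illustration of the first two schemes, the sketch below computes precision and recall for a toy task. It is not scorer2 and does not read its file formats. It assumes one sense tag per instance, takes precision as correct answers over attempted instances, and takes recall as correct answers over all instances in the key. The function name score, the dictionary layout, and the sense tags s1, s2, s3 are all hypothetical.

```python
def score(answers, key, sense_map=None):
    """Return (precision, recall) for single-tag answers.

    answers:   dict of instance id -> sense tag chosen by the system
    key:       dict of instance id -> gold-standard sense tag
    sense_map: optional dict of fine sense -> coarse sense
               (hypothetical layout, not the scorer2 sense map format)
    """
    def coarsen(tag):
        # Without a sense map, or for tags it does not list, keep the tag as is.
        return sense_map.get(tag, tag) if sense_map else tag

    # Only instances that appear in the key count as attempted.
    attempted = [i for i in answers if i in key]
    correct = sum(1 for i in attempted if coarsen(answers[i]) == coarsen(key[i]))

    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(key) if key else 0.0
    return precision, recall


# Toy data: four instances in the key, three attempted by the system.
key = {"w.1": "s1", "w.2": "s2", "w.3": "s3", "w.4": "s1"}
answers = {"w.1": "s1", "w.2": "s3", "w.3": "s3"}

# Hypothetical sense map: fine senses s2 and s3 share one coarse sense.
sense_map = {"s2": "s2", "s3": "s2"}

fine = score(answers, key)                # fine-grained: tags compared directly
coarse = score(answers, key, sense_map)   # coarse-grained: both sides mapped first
print("fine-grained   P=%.3f R=%.3f" % fine)
print("coarse-grained P=%.3f R=%.3f" % coarse)
```

In this toy run the answer for w.2 is wrong under fine-grained scoring but counts as correct once both tags are mapped to the same coarse sense, so the coarse-grained figures are higher. Mixed-grained scoring is not covered here, since it depends on the subsumption hierarchy and the credit scheme described in the proposal.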

Downloads:

  1. Documentation
  2. Format for answers
  3. Melamed and Resnik's proposal for Senseval scoring
  4. Scorer source code (scorer2.c)
  5. Example files: sampletask.system.answers, sampletask.key, sampletask.sensemap

Archive containing 1-5 above: tar.gz


Note: These guidelines were adapted from the Senseval-2 scoring guidelines.