Detection of Errors and Correction
in Corpus Annotation
People
Detmar Meurers
is an associate professor in the Department of
Linguistics at the Ohio State University. His research interests
focus on the intersection of linguistic insight and computational
linguistics. He became interested in the detection of errors in corpus
annotation after teaching a seminar on "Corpora and
Linguistic Knowledge" in Spring 2002. Together with Markus Dickinson,
who had participated in the seminar, he started developing an
automatic error detection method based on detecting variation of
annotation across linguistically comparable instances. As reported in
several papers since then (cf. publications), they
have successfully applied this and related ideas to detect (and
correct) errors in part-of-speech annotation as well as in continuous
and discontinuous syntactic annotation. Current project work
focuses on dependency annotation and data outside linguistics.
Markus Dickinson is
an assistant professor at Indiana University in the Department of
Linguistics. Much of his research has focused on the detection and
correction of annotation errors across various levels of linguistic
structure. Currently, his main interests lie in what annotation errors
reveal about the design of annotation schemes and in how annotation can be
optimized for data-driven natural language processing.
Adriane Boyd is a
Ph.D. student in computational linguistics at Ohio State
University. She is interested in syntactic annotation and data-driven
parsing for languages with freer word order.