CLIN 2005 Abstracts
  • Parsing partially bracketed input
    Martijn Wieling (University of Groningen)
    Mark-Jan Nederhof (University of Groningen)
    Gertjan van Noord (University of Groningen)
    A method is proposed to convert a Context Free Grammar into a Bracket Context Free Grammar (BCFG). A BCFG can parse input strings that are, in part or in whole, annotated with structure information (brackets). The need to parse partially bracketed strings arises naturally in several cases. One interesting application is semi-automatic treebank construction. Another is the parsing of input strings that have first been annotated by an NP-chunker. Three ways of annotating an input string with structure information are introduced: identifying a complete constituent with a pair of round brackets, identifying the start or the end of a constituent with square brackets, and identifying the type of a constituent by subscripting the brackets with that type. An important, non-trivial property of the proposed transformation is that it does not generate spuriously ambiguous parse trees. If an input string is annotated with structure information and parsed with the BCFG, the number of generated parse trees can be reduced: only parse trees that comply with the indicated structure are generated.
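
    To illustrate what it means for a parse tree to comply with round-bracket annotation, the following is a minimal sketch and not the transformation proposed in the abstract. It checks candidate trees after parsing, whereas a BCFG builds the constraint into the grammar itself. The function names, the space-separated bracket notation and the nested-tuple tree format are all assumptions made for this example, and square brackets and typed brackets are not covered.

```python
def read_brackets(annotated):
    """Split a space-separated, round-bracketed string into words and required spans.

    Assumed notation: "(" and ")" are separate tokens. A span (i, j) covers
    words[i:j] and must appear as a constituent in any compliant tree.
    """
    words, spans, stack = [], set(), []
    for tok in annotated.split():
        if tok == "(":
            stack.append(len(words))
        elif tok == ")":
            if not stack:
                raise ValueError("unmatched closing bracket")
            spans.add((stack.pop(), len(words)))
        else:
            words.append(tok)
    if stack:
        raise ValueError("unmatched opening bracket")
    return words, spans


def tree_spans(tree, start=0):
    """Return (end_position, constituent_spans) for a nested-tuple tree.

    Assumed tree format: (label, child, child, ...), where each leaf is a word string.
    """
    if isinstance(tree, str):
        return start + 1, set()
    pos, spans = start, set()
    for child in tree[1:]:
        pos, child_spans = tree_spans(child, pos)
        spans |= child_spans
    spans.add((start, pos))
    return pos, spans


def complies(tree, required):
    """True if every required span is a constituent of the tree."""
    return required <= tree_spans(tree)[1]


# Two readings of an ambiguous sentence (hypothetical example trees).
np_attach = ("S", ("NP", "the", "man"),
             ("VP", "saw",
              ("NP", ("NP", "the", "girl"),
               ("PP", "with", ("NP", "the", "telescope")))))
vp_attach = ("S", ("NP", "the", "man"),
             ("VP", "saw", ("NP", "the", "girl"),
              ("PP", "with", ("NP", "the", "telescope"))))

words, required = read_brackets("the man saw ( the girl with the telescope )")
compliant = [t for t in (np_attach, vp_attach) if complies(t, required)]
```

    In this sketch the bracket pair requires "the girl with the telescope" to be a single constituent, so only the first reading remains in `compliant`. A BCFG would reach the same outcome during parsing rather than by filtering afterwards.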