Parsing of Korean Based on CFG Using Sentence Pattern Information


Hyeon-Yeong Lee, Yi-Gyu Hwang, Yong-Seok Lee


Vol. 7  No. 7  pp. 57-62


Korean differs structurally from English. English has a more or less fixed word order, whereas Korean has a partially free word order and constrains its sentences by limiting the meaning of the predicate. It is therefore difficult to describe an appropriate grammar or appropriate syntactic constraints for Korean. This paper describes a CFG-based grammar and a way to resolve syntactic ambiguity using a syntactic constraint, namely sentence pattern information (SPI). SPI consists of structural patterns of sentences, classified according to the subcategorization of Korean predicates. In this work, 39 sentence patterns are used. SPI resolves the ambiguities of double-object and double-subject constructions and of noun phrase and adverb phrase attachment that arise in Korean. However, SPI cannot resolve every syntactic ambiguity. Sentences that remain ambiguous are parsed using semantic markers with semantic constraints. Semantic markers can be used to resolve ambiguity caused by an auxiliary particle or a comitative case particle. In an experiment parsing 1,000 sentences, we found that our method reduces syntactic ambiguity by 88.32% compared with a method that does not use SPI and split sentences into basic clauses.
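As an illustration of how a sentence pattern constraint can prune candidate analyses, the following is a minimal sketch and not the parser described in this paper. The pattern table, the role labels (SUBJ, OBJ, DAT), the romanized predicates, and the names `Candidate` and `filter_by_pattern` are all assumptions made for the example; the actual inventory of 39 patterns is not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical pattern table: predicate -> allowed argument frames.
# These entries are invented for illustration only.
PATTERNS = {
    "meokda": {("SUBJ", "OBJ")},                       # "eat"
    "juda": {("SUBJ", "OBJ"), ("SUBJ", "DAT", "OBJ")}, # "give"
}


@dataclass(frozen=True)
class Candidate:
    """One candidate analysis: a predicate and the roles attached to it."""
    predicate: str
    args: tuple


def filter_by_pattern(candidates, patterns=PATTERNS):
    """Keep only candidates whose roles match a pattern of the predicate.

    Roles are compared without regard to order, reflecting the
    partially free word order of Korean.
    """
    kept = []
    for cand in candidates:
        allowed = patterns.get(cand.predicate, set())
        if any(sorted(cand.args) == sorted(frame) for frame in allowed):
            kept.append(cand)
    return kept


# Two competing analyses of the same clause: the second attaches two
# objects to "juda", which no pattern in the table licenses.
analyses = [
    Candidate("juda", ("DAT", "SUBJ", "OBJ")),
    Candidate("juda", ("SUBJ", "OBJ", "OBJ")),
]
surviving = filter_by_pattern(analyses)
```

In this sketch the double-object analysis is discarded because no frame for the predicate licenses it, while the scrambled but well-formed analysis survives. Candidates that still remain ambiguous after this step would need a further constraint, such as the semantic markers mentioned above.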


Resolution of Syntactic Ambiguity, Unification-based CFG, Sentence Patterns Information (SPI), Semantic Marker, Parsing