The compilation of a sample PFR Chinese corpus of Skeleton-parsed sentences
Date
2005
Author
Wong, May Lai-Yin
Anuario del Seminario de Filología Vasca Julio de Urquijo 39(2) : 271-287 (2005)
Abstract
The approach taken in this paper to constructing a treebank is inspired by skeleton parsing. A sample text of some 100,000 word tokens was chosen from the PFR Chinese Corpus for the production of the treebank. The parsing scheme gives a clear account of the 17 non-terminal constituents that are defined and instantiated in the corpus texts. A set of parsing guidelines on practical issues that arise when mapping parses onto sentences in applying the parsing scheme is also considered. The major difficulties encountered in the course of skeleton parsing are discussed as well, since they illuminate some of the peculiarities of the Chinese language. The paper concludes with an evaluation of the success of the treebank compilation.