A corpus of K’iche’ annotated for morphosyntactic structure

Francis M. Tyers, Robert Henderson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

14 Scopus citations

Abstract

This article describes a collection of sentences in K’iche’ annotated for morphology and syntax. K’iche’ is a language in the Mayan language family, spoken in Guatemala. The annotation is done according to the guidelines of the Universal Dependencies project. The corpus consists of a total of 1,433 sentences containing approximately 10,000 tokens and is released under a free/open-source licence. We present a comparison of parsing systems for K’iche’ using this corpus and describe how it can be used for mining linguistic examples.

Original language: English (US)
Title of host publication: Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021
Editors: Manuel Mager, Arturo Oncevay, Annette Rios, Ivan Vladimir Meza Ruiz, Alexis Palmer, Graham Neubig, Katharina Kann
Publisher: Association for Computational Linguistics (ACL)
Pages: 10-20
Number of pages: 11
ISBN (Electronic): 9781954085442
State: Published - 2021
Event: 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021 - Virtual, Online
Duration: Jun 11 2021 → …

Publication series

Name: Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021

Conference

Conference: 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021
City: Virtual, Online
Period: 6/11/21 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Computational Theory and Mathematics
  • Information Systems
