Abstract
This research focuses on the automatic detection of the tone unit, one of the fundamental elements of Brazil’s prosody model. We compared two ways of delimiting tone units: using silent pause duration alone, and using relative pitch resets and slow pace (or post-boundary lengthening) together with silent pause duration. The corpus comprises academic lectures given by 18 highly proficient speakers in six varieties of English, representing the inner (American and British), outer (Indian and South African), and expanding (Chinese and Spanish) concentric circles of Kachru’s World Englishes. Performance was measured by computing Pearson’s correlation between the number of tone units in a trained linguist’s transcription of the corpus and the number detected automatically by the computer. The computer detected the tone units from phone sequences identified in the audio files by a large vocabulary spontaneous speech recognition (LVCSR) program. Including relative pitch resets and slow pace along with silent pause duration in the algorithm raised this correlation from 0.935 to 0.959.
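The pause-only baseline and the evaluation measure described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the `(label, start, end)` phone format, the `"sil"` silence label, the `min_pause` threshold value, and the function names. The pitch reset and slow pace cues are not modeled here.

```python
from math import sqrt


def count_tone_units(phones, min_pause=0.2):
    """Count tone units by splitting at long silent pauses.

    phones: list of (label, start_sec, end_sec) tuples, as a recognizer
    might emit them. The "sil" label and the 0.2 s default threshold
    are hypothetical choices, not values from the paper.
    """
    units = 0
    in_unit = False
    for label, start, end in phones:
        if label == "sil":
            # Only a sufficiently long silence closes the current unit.
            if end - start >= min_pause:
                in_unit = False
        elif not in_unit:
            # First speech phone after a boundary opens a new unit.
            units += 1
            in_unit = True
    return units


def pearson(xs, ys):
    """Pearson's correlation between two equal-length count sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)
```

In use, `count_tone_units` would be run once per lecture, and `pearson` would then compare the resulting list of counts with the linguist's per-lecture counts. A lower `min_pause` splits the speech into more units, which is the kind of sensitivity the additional prosodic cues are meant to compensate for.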
Original language | English (US) |
---|---|
Pages (from-to) | 287-291 |
Number of pages | 5 |
Journal | Proceedings of the International Conference on Speech Prosody |
Volume | 2016-January |
State | Published - 2016 |
Event | 8th Speech Prosody 2016 - Boston, United States. Duration: May 31 2016 → Jun 3 2016 |
Keywords
- Automatic speech recognition (ASR)
- Brazil’s prosody model
- Large vocabulary spontaneous speech recognition (LVCSR)
- Tone unit
- World Englishes
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language