An attention model for hypernasality prediction in children with cleft palate

Vikram C. Mathad, Nancy Scherer, Kathy Chapman, Julie Liss, Visar Berisha

Research output: Contribution to journal › Conference article › peer-review

2 Scopus citations


Hypernasality refers to the perception of abnormal nasal resonances in vowels and voiced consonants. Estimation of hypernasality severity from connected speech samples involves learning a mapping between the frame-level features and utterance-level clinical ratings of hypernasality. However, not all speech frames contribute equally to the perception of hypernasality. In this work, we propose an attention-based bidirectional long short-term memory (BLSTM) model that directly maps the frame-level features to utterance-level ratings by focusing only on specific speech frames carrying hypernasal cues. The model's performance is evaluated on the Americleft database containing speech samples of children with cleft palate and clinical ratings of hypernasality. We analyzed the attention weights over broad phonetic categories and found that the model yields results consistent with what is known in the speech science literature. Further, the correlation between the predicted and perceptual ratings is found to be significant (r = 0.684, p < 0.001) and better than that of conventional BLSTMs trained using frame-wise and last-frame approaches.
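The idea of mapping frame-level features to one utterance-level rating through attention can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the scoring function, feature set, or layer sizes, so the dot-product scoring, the function name `attention_pool`, and the weights `w_att`, `w_out`, and `b_out` are all assumptions made for illustration. The input `frames` stands in for per-frame BLSTM outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, w_att, w_out, b_out=0.0):
    """Pool T frame-level vectors into a single utterance-level score.

    frames: list of T vectors of dimension D (stand-ins for BLSTM outputs).
    w_att:  hypothetical attention vector (length D) scoring each frame.
    w_out:  hypothetical output weights (length D) mapping the pooled
            vector to a scalar rating.
    """
    # One relevance score per frame (assumed dot-product scoring).
    scores = [sum(w * h for w, h in zip(w_att, frame)) for frame in frames]
    # Attention weights: non-negative and summing to 1 across frames.
    alphas = softmax(scores)
    # Context vector: attention-weighted average of the frame vectors.
    dim = len(frames[0])
    context = [
        sum(a * frame[d] for a, frame in zip(alphas, frames))
        for d in range(dim)
    ]
    # Linear read-out to a scalar utterance-level rating.
    rating = sum(w * c for w, c in zip(w_out, context)) + b_out
    return rating, alphas


# Toy utterance with three frames; the middle frame has the largest score.
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
rating, alphas = attention_pool(frames, w_att=[1.0, 1.0], w_out=[0.5, 0.5])
```

In this sketch the returned `alphas` play the role of the attention weights that the abstract reports analyzing over broad phonetic categories. A last-frame baseline would instead read out only `frames[-1]`, ignoring the other frames.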

Original language: English (US)
Pages (from-to): 7248-7252
Number of pages: 5
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
State: Published - 2021
Event: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration: Jun 6 2021 → Jun 11 2021


Keywords

  • Attention
  • Cleft palate
  • Hypernasality
  • Recurrent neural networks

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

