Malmö University Publications
Modelling of a Speech-to-Text Recognition System for Air Traffic Control and NATO Air Command
Department of Electrical, Electronic and Computer Engineering, University of Pretoria, South Africa.
Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). Malmö University, Internet of Things and People (IOTAP). Department of Electrical, Electronic and Computer Engineering, University of Pretoria, South Africa. ORCID iD: 0000-0002-2763-8085
2022 (English) In: Journal of Internet Technology, ISSN 1607-9264, E-ISSN 2079-4029, Vol. 23, no 7, p. 1527-1539. Article in journal (Refereed), Published
Abstract [en]

Accent invariance in speech recognition is a challenging problem, especially in the area of aviation. In this paper, a speech recognition system is developed to transcribe accented speech between pilots and air traffic controllers. The system handles accents in continuous speech by modelling phonemes using Hidden Markov Models (HMMs) with Gaussian mixture model (GMM) probability density functions for each state. These phonemes are used to build word models of the NATO phonetic alphabet as well as the numerals 0 to 9, with transcriptions obtained from the Carnegie Mellon University (CMU) pronouncing dictionary. Mel-Frequency Cepstral Coefficients (MFCC) with delta and delta-delta coefficients are used for the feature extraction process. Amplitude normalisation and covariance scaling are implemented to improve recognition accuracy. A word error rate (WER) of 2% for seen speakers and 22% for unseen speakers is obtained.
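
The WER figures quoted in the abstract can be read against the conventional definition: the number of word substitutions, deletions and insertions needed to turn the recognised transcript into the reference, divided by the number of reference words. The record does not say which scoring tool the authors used, so the following is only a minimal sketch of that standard definition; the function name `word_error_rate` and the example phrases are illustrative assumptions, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Assumption: words are separated by whitespace and compared exactly;
    the paper's own scoring procedure is not described in this record.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical call sign: one of five words is misrecognised.
    print(word_error_rate("alpha bravo charlie one two",
                          "alpha bravo charlie nine two"))
```

Under this definition, a WER of 2% corresponds to roughly one word error per fifty reference words, and 22% to roughly eleven per fifty.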

Place, publisher, year, edition, pages
Angle Publishing Co., Ltd., 2022. Vol. 23, no 7, p. 1527-1539
Keywords [en]
Automatic Speech Recognition (ASR), Hidden Markov Model (HMM), Gaussian Mixture Model (GMM), Mel-Frequency Cepstral Coefficients (MFCC), Covariance scaling
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:mau:diva-59129
DOI: 10.53106/160792642022122307008
ISI: 000965724700008
Scopus ID: 2-s2.0-85146344089
OAI: oai:DiVA.org:mau-59129
DiVA, id: diva2:1749252
Available from: 2023-04-06 Created: 2023-04-06 Last updated: 2023-12-13 Bibliographically approved

Authority records

Malekian, Reza
