Publications from Malmö University
Modelling of a Speech-to-Text Recognition System for Air Traffic Control and NATO Air Command
Department of Electrical, Electronic and Computer Engineering, University of Pretoria, South Africa.
Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). Malmö University, Internet of Things and People (IOTAP). Department of Electrical, Electronic and Computer Engineering, University of Pretoria, South Africa. ORCID iD: 0000-0002-2763-8085
2022 (English). In: Journal of Internet Technology, ISSN 1607-9264, E-ISSN 2079-4029, Vol. 23, no. 7, pp. 1527-1539. Journal article (Refereed). Published
Abstract [en]

Accent invariance in speech recognition is a challenging problem, especially in the area of aviation. In this paper a speech recognition system is developed to transcribe accented speech between pilots and air traffic controllers. The system handles accents in continuous speech by modelling phonemes using Hidden Markov Models (HMMs) with Gaussian mixture model (GMM) probability density functions for each state. These phonemes are used to build word models of the NATO phonetic alphabet as well as the numerals 0 to 9, with transcriptions obtained from the Carnegie Mellon University (CMU) pronouncing dictionary. Mel-Frequency Cepstral Coefficients (MFCC) with delta and delta-delta coefficients are used for the feature extraction process. Amplitude normalisation and covariance scaling are implemented to improve recognition accuracy. A word error rate (WER) of 2% for seen speakers and 22% for unseen speakers is obtained.
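
The abstract reports results as a word error rate (WER). As a reading aid, the sketch below shows the standard definition of WER: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. It is not taken from the paper; the function name `word_error_rate` and the example utterance are illustrative assumptions, and the paper's own scoring procedure may differ in details such as text normalisation.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i] + [0] * len(hypothesis)
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical utterance built from NATO alphabet words and digits (assumption).
reference = "alpha bravo one two charlie".split()
hypothesis = "alpha bravo one too charlie".split()
print(word_error_rate(reference, hypothesis))  # one substitution in five words
```

Under this definition, a WER of 2% corresponds to roughly one word error per 50 reference words, and 22% to roughly one error per 4 to 5 reference words.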

Place, publisher, year, edition, pages
Angle Publishing Co., Ltd., 2022. Vol. 23, no. 7, pp. 1527-1539
Keywords [en]
Automatic Speech Recognition (ASR), Hidden Markov Model (HMM), Gaussian Mixture Model (GMM), Mel-Frequency Cepstral Coefficients (MFCC), Covariance scaling
National subject category
Computer Science
Identifiers
URN: urn:nbn:se:mau:diva-59129
DOI: 10.53106/160792642022122307008
ISI: 000965724700008
Scopus ID: 2-s2.0-85146344089
OAI: oai:DiVA.org:mau-59129
DiVA, id: diva2:1749252
Available from: 2023-04-06. Created: 2023-04-06. Last updated: 2023-12-13. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Person

Malekian, Reza
