Malmö University Publications
Känsloigenkänning och röstigenkänning i realtid gjort i en web-baserad videomötes-applikation
Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT).
2021 (English)
Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Alternative title
Emotion recognition and speech recognition in real-time done in a web based video conference application (English)
Abstract [en]

The ways in which people communicate with each other are being digitized worldwide. The Covid-19 pandemic has accelerated parts of this process, and meeting applications that enable communication where physical presence is not possible are increasingly used. Despite certain disadvantages, digital meetings have one advantage: they generate data. This data could be analyzed and visualized for the user in real time, during the conversation, in order to improve the interaction between the meeting participants.

The goal of this thesis is to explore how to build a web-based video conferencing application that can record and visualize meeting participants' facial emotions and also transcribe their conversation in real time. Such an application was built and then tested to examine whether it meets the requirements in the definition of a real-time system. The test investigates the system's RTF (real-time factor) by measuring the time taken to record facial emotions and speech utterances, as well as the time from recording until the data is rendered in the browser. The results show that the system's RTF for facial emotion recognition and automatic speech recognition is greater than 1 in all tests. Since the data is displayed to the user as soon as it is available and within a reasonable time, the system can be classified as a real-time system. The conclusion is that the described event-driven architecture is what enables the system to meet the requirements in the definition of a real-time system.

Place, publisher, year, edition, pages
2021, p. 35
Keywords [en]
Machine Learning, TensorFlow, Face-api.js, Automatic Speech Recognition, Facial Emotion Recognition, Video Conferencing Application, Web browser
National Category
Other Computer and Information Science
Identifiers
URN: urn:nbn:se:mau:diva-43300
OAI: oai:DiVA.org:mau-43300
DiVA, id: diva2:1564749
Educational program
TS Systemutvecklare
Presentation
2021-06-02, 10:20 (English)
Supervisors
Examiners
Available from: 2021-06-29 Created: 2021-06-12 Last updated: 2021-06-29 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Search in DiVA

By author/editor
Eriksson, Tom
By organisation
Department of Computer Science and Media Technology (DVMT)
Other Computer and Information Science