Malmö University Publications
Activity Recognition through Interactive Machine Learning in a Dynamic Sensor Setting
Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). Malmö University, Internet of Things and People (IOTAP). Malmö University. ORCID iD: 0000-0002-3155-8408
Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). Malmö University, Internet of Things and People (IOTAP). ORCID iD: 0000-0003-0998-6585
Malmö University, Internet of Things and People (IOTAP). Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). ORCID iD: 0000-0002-9471-8405
2024 (English). In: Personal and Ubiquitous Computing, ISSN 1617-4909, E-ISSN 1617-4917, Vol. 28, no 1, p. 273-286. Article in journal (Refereed). Published
Abstract [en]

Advances in the Internet of Things lead to an increased number of devices generating and streaming data. These devices can be useful data sources for activity recognition using machine learning. However, the set of available sensors may vary over time, e.g. due to mobility of the sensors and technical failures. Since the machine learning model uses the data streams from the sensors as input, it must be able to handle a varying number of input variables, i.e. the feature space might change over time. Moreover, the labelled data necessary for training is often costly to acquire. In active learning, the model is given a budget for requesting labels from an oracle, and aims to maximize accuracy by carefully selecting which data instances to label. It is generally assumed that the only role of the oracle is to respond to queries and that it will always do so. In many real-world scenarios, however, the oracle is a human user, and these assumptions are simplifications that might not properly depict the setting. In this work we investigate different interactive machine learning strategies, of which active learning is one, exploring the effects of an oracle that can be more proactive and the factors that might influence a user to provide or withhold labels. We implement five interactive machine learning strategies as well as hybrid versions of them and evaluate them on two datasets. The results show that a more proactive user can improve performance, especially when the user is influenced by the accuracy of earlier predictions. The experiments also highlight challenges related to evaluating performance when the set of classes changes over time.
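The budgeted active learning setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid model, the uncertainty measure, the threshold, and the rule that an unanswered query still consumes budget are all assumptions made for illustration. Instances are dictionaries keyed by sensor name, so the set of sensors may differ from one instance to the next.

```python
class CentroidModel:
    """Toy online nearest-centroid classifier over sparse sensor readings."""

    def __init__(self):
        self.sums = {}    # label -> {sensor: sum of observed values}
        self.counts = {}  # label -> {sensor: number of observed values}

    def learn(self, x, label):
        sums = self.sums.setdefault(label, {})
        counts = self.counts.setdefault(label, {})
        for sensor, value in x.items():
            sums[sensor] = sums.get(sensor, 0.0) + value
            counts[sensor] = counts.get(sensor, 0) + 1

    def distances(self, x):
        # Mean squared distance to each class centroid, using only the
        # sensors present in both the instance and the class history.
        result = {}
        for label, sums in self.sums.items():
            shared = [s for s in x if s in sums]
            if shared:
                result[label] = sum(
                    (x[s] - sums[s] / self.counts[label][s]) ** 2 for s in shared
                ) / len(shared)
        return result

    def predict(self, x):
        d = self.distances(x)
        return min(d, key=d.get) if d else None

    def uncertainty(self, x):
        # 1.0 when fewer than two classes are comparable; otherwise the
        # ratio of the best distance to the second best (near 1 = unsure).
        d = sorted(self.distances(x).values())
        if len(d) < 2:
            return 1.0
        return d[0] / d[1] if d[1] > 0 else 1.0


def run_active_learning(stream, oracle, budget, threshold=0.5):
    """Query the oracle only for uncertain instances, within the budget."""
    model = CentroidModel()
    queries = 0
    correct = 0
    for x, true_label in stream:
        correct += model.predict(x) == true_label
        if queries < budget and model.uncertainty(x) >= threshold:
            queries += 1                   # assumption: asking costs budget
            label = oracle(x, true_label)  # a human oracle may return None
            if label is not None:
                model.learn(x, label)
    return model, queries, correct


# Hypothetical stream: the "b" sensor is missing from the second instance.
stream = [
    ({"a": 0.0, "b": 0.0}, "sit"),
    ({"a": 10.0}, "walk"),
    ({"a": 0.5, "b": 0.2}, "sit"),
    ({"a": 9.0, "b": 8.0}, "walk"),
]
always_answers = lambda x, y: y
model, queries, correct = run_active_learning(stream, always_answers, budget=3)
print(queries, correct)
```

Replacing `always_answers` with an oracle that sometimes returns `None` gives a simple way to explore the case the abstract raises, where the user withholds labels.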

Place, publisher, year, edition, pages
Springer, 2024. Vol. 28, no 1, p. 273-286
Keywords [en]
machine learning, interactive machine learning, active learning, machine teaching, online learning, sensor data
National Category
Other Computer and Information Science; Computer Sciences
Identifiers
URN: urn:nbn:se:mau:diva-17434; DOI: 10.1007/s00779-020-01414-2; ISI: 000538990600002; Scopus ID: 2-s2.0-85086152913; OAI: oai:DiVA.org:mau-17434; DiVA, id: diva2:1436305
Note

Correction available: https://doi.org/10.1007/s00779-020-01465-5

Available from: 2020-06-07. Created: 2020-06-07. Last updated: 2024-03-06. Bibliographically approved
In thesis
1. Approaches to Interactive Online Machine Learning
2020 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

With the Internet of Things paradigm, the data generated by the rapidly increasing number of connected devices lead to new possibilities, such as using machine learning for activity recognition in smart environments. However, it also introduces several challenges. The sensors of different devices might be of different types, making the fusion of data non-trivial. Moreover, the devices are often mobile, so data from a particular sensor is not always available, i.e. there is a need to handle data from a dynamic set of sensors. From a machine learning perspective, the data from the sensors arrive in a streaming fashion, which calls for online learning, in contrast to many learning problems where a static dataset is assumed. Machine learning is in many cases a good approach for classification problems, but the performance is often linked to the quality of the data. Having a good dataset to train a model can be an issue in general, due to the often costly process of annotating the data. With dynamic and heterogeneous data, annotation can be even more problematic, because of the ever-changing environment. This means that there might be no annotated data, or only a very small amount, to train the model on at the start of learning, often referred to as the cold start problem.

To be able to handle these issues, adaptive systems are needed. By adaptive we mean that the model is not static over time, but is updated when, for instance, the environment changes. By including a human in the loop during the learning process, which we refer to as interactive machine learning, the input from users can be utilized to build the model. The type of input used is typically annotations of the data, i.e. user input in the form of correctly labelled data points. Generally, it is assumed that the user always provides correct labels in accordance with the chosen interactive learning strategy. In many real-world applications, however, these assumptions are not realistic, as users might provide incorrect labels or not provide labels at all in line with the chosen strategy.

In this thesis we explore which interactive learning strategies are possible in the given scenario and how they affect performance, as well as the effect of machine learning algorithms on performance. We also study how a user who is not always reliable, i.e. who does not always provide a correct label when expected to, can affect performance. We propose a taxonomy of interactive online machine learning strategies and test how the different strategies affect performance through experiments on multiple datasets. The findings show that the overall best performing interactive learning strategy is one where the user provides labels when previous estimations have been incorrect, but that the best performing machine learning algorithm depends on the problem scenario. The experiments also show that a decreased reliability of the user leads to decreased performance, especially when there is a limited amount of labelled data.
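A strategy in which the user supplies a label only after an incorrect estimate, combined with a user who is not fully reliable, can be sketched as follows. This is a hypothetical simulation, not the thesis code: the small categorical Naive Bayes model, the `reliability` parameter, and the choice to model unreliability as a wrong label (rather than a withheld one) are assumptions for illustration.

```python
import math
import random


class TinyNaiveBayes:
    """Categorical Naive Bayes with add-one smoothing, updated per instance."""

    def __init__(self):
        self.class_counts = {}  # label -> number of labelled instances
        self.value_counts = {}  # (label, sensor, value) -> count
        self.values = {}        # sensor -> set of values seen so far

    def learn(self, x, label):
        self.class_counts[label] = self.class_counts.get(label, 0) + 1
        for sensor, value in x.items():
            key = (label, sensor, value)
            self.value_counts[key] = self.value_counts.get(key, 0) + 1
            self.values.setdefault(sensor, set()).add(value)

    def predict(self, x):
        if not self.class_counts:
            return None  # cold start: nothing has been labelled yet
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for sensor, value in x.items():
                k = len(self.values.get(sensor, ())) + 1  # +1 for unseen values
                count = self.value_counts.get((label, sensor, value), 0)
                score += math.log((count + 1) / (n + k))
            if score > best_score:
                best, best_score = label, score
        return best


def simulate(stream, reliability, seed=0):
    """The simulated user labels an instance only when the estimate is wrong.

    With probability `reliability` the label is correct; otherwise the
    user gives a randomly chosen wrong label (an assumed noise model).
    """
    rng = random.Random(seed)
    labels = sorted({y for _, y in stream})
    model = TinyNaiveBayes()
    correct = 0
    provided = 0
    for x, y in stream:
        if model.predict(x) == y:
            correct += 1
            continue  # estimate was right, so the user stays silent
        if rng.random() < reliability:
            label = y
        else:
            label = rng.choice([l for l in labels if l != y])
        model.learn(x, label)
        provided += 1
    return correct / len(stream), provided


# Hypothetical two-activity stream with a single discretised sensor.
stream = [({"motion": "high"}, "walk"), ({"motion": "low"}, "sit")] * 20
acc_reliable, n_reliable = simulate(stream, reliability=1.0)
acc_unreliable, n_unreliable = simulate(stream, reliability=0.0)
print(acc_reliable, n_reliable, acc_unreliable, n_unreliable)
```

Sweeping `reliability` between 0 and 1 on a stream of interest is one way to reproduce, in miniature, the kind of comparison the abstract describes.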

Place, publisher, year, edition, pages
Malmö: Malmö universitet, 2020. p. 129
Series
Studies in Computer Science ; 10
Keywords
Machine Learning, Interactive Machine Learning, Online Learning, Active Learning, Machine Teaching
National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:mau:diva-17433 (URN); 10.24834/isbn.9789178770854 (DOI); 978-91-7877-084-7 (ISBN); 978-91-7877-085-4 (ISBN)
Presentation
2020-06-18, 10:15 (English)
Opponent
Supervisors
Funder
Knowledge Foundation, 20140035
Available from: 2020-06-09. Created: 2020-06-09. Last updated: 2024-03-05. Bibliographically approved
2. Interactive Online Machine Learning
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

With the Internet of Things paradigm, the data generated by the rapidly increasing number of connected devices lead to new possibilities, such as using machine learning for activity recognition in smart environments. However, it also introduces several challenges. The sensors of different devices might be mobile and of different types, i.e. there is a need to handle streaming data from a dynamic and heterogeneous set of sensors. In machine learning, the performance is often linked to the availability and quality of annotated data. Annotating data is in general costly, but it can be even more challenging if there is no annotated data, or only a very small amount, to train the model on at the start of learning. To handle these issues, we implement interactive and adaptive systems. By including a human in the loop, which we refer to as interactive machine learning, the input from users can be utilized to build the model. The type of input used in interactive machine learning is typically annotations of the data, i.e. correctly labelled data points. Generally, it is assumed that the user always provides correct labels in accordance with the chosen interactive learning strategy. In many real-world applications, however, these assumptions are not realistic, as users might provide incorrect labels or not provide labels at all in line with the chosen strategy.

In this thesis we explore which interactive learning strategy types are possible in the given scenario and how they affect performance, as well as the effect of machine learning algorithms on the performance. We also study how a user who is not always reliable, i.e. who does not always provide a correct label when expected to, can affect performance. We propose a taxonomy of interactive online machine learning strategies and test how the different strategies affect performance through experiments on multiple datasets. Simulated experiments are compared to experiments with human participants, to verify the results. The findings show that the overall best performing interactive learning strategy is one where the user provides labels when current estimations are incorrect, but that the best performing machine learning algorithm depends on the problem scenario. The experiments also show that a decreased reliability of the user leads to decreased performance, especially when there is a limited amount of labelled data. The robustness of the machine learning algorithms differs; e.g. the Naïve Bayes classifier is better at handling a lower reliability of the user. We also present a systematic literature review on machine teaching, a subfield of interactive machine learning where the human is proactive in the interaction. The study shows that the area of machine teaching is rapidly evolving, with an increased number of publications in recent years. However, as it is still maturing, there exist several open challenges that would benefit from further exploration, e.g. how human factors can affect performance.

Place, publisher, year, edition, pages
Malmö: Malmö universitet, 2022. p. 209
Series
Studies in Computer Science ; 18
Keywords
Interactive Machine Learning, Active Learning, Machine Teaching, Online Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:mau:diva-51987 (URN); 10.24834/isbn.9789178772810 (DOI); 978-91-7877-280-3 (ISBN); 978-91-7877-281-0 (ISBN)
Public defence
2022-06-23, HS aula (also livestreamed), Jan Waldenströms gata 25, Malmö, 10:00 (English)
Opponent
Supervisors
Note

In reference to IEEE copyrighted material which is used with permission in this thesis, the IEEE does not endorse any of Malmö University's products or services. Internal or personal use of this material is permitted.

Papers VI and VII appear in the dissertation as manuscripts.

Available from: 2022-06-03. Created: 2022-06-02. Last updated: 2023-09-05. Bibliographically approved

Open Access in DiVA

fulltext (1942 kB), 207 downloads
File information
File name: FULLTEXT01.pdf; File size: 1942 kB; Checksum (SHA-512):
583c2bee6a77ab9650303dd94c06fe38bdcbc6af321b4e215d7f7816a8d23ff79c65f7f6c026a76305097f5dbd7dd6a7831b97424a349083d024c1bd11dabe3d
Type: fulltext; Mimetype: application/pdf
Correction (358 kB), 44 downloads
File information
File name: ERRATA01.pdf; File size: 358 kB; Checksum (SHA-512):
cb6d155f74699ecffc207829384cc1db1fe1230538b802190428bb063517f8cc3e662c0da4f9960c92e75cc4c0bb82690a2371c213cb58a1de94fed437b72132
Type: errata; Mimetype: application/pdf

Authority records

Tegen, Agnes; Davidsson, Paul; Persson, Jan A.