Publications from Malmö University
Approaches to Interactive Online Machine Learning
Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). Malmö University, Internet of Things and People (IOTAP). ORCID iD: 0000-0002-3155-8408
2020 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

With the Internet of Things paradigm, the data generated by the rapidly increasing number of connected devices leads to new possibilities, such as using machine learning for activity recognition in smart environments. However, it also introduces several challenges. The sensors of different devices might be of different types, making the fusion of data non-trivial. Moreover, the devices are often mobile, so data from a particular sensor is not always available, i.e. there is a need to handle data from a dynamic set of sensors. From a machine learning perspective, the data from the sensors arrives in a streaming fashion, which calls for online learning, in contrast to many learning problems where a static dataset is assumed. Machine learning is in many cases a good approach for classification problems, but the performance is often linked to the quality of the data. Obtaining a good dataset to train a model can be an issue in general, due to the often costly process of annotating the data. With dynamic and heterogeneous data, annotation can be even more problematic because of the ever-changing environment. This means that there might be no annotated data, or only a very small amount, to train the model on at the start of learning, which is often referred to as the cold start problem.

To handle these issues, adaptive systems are needed. By adaptive we mean that the model is not static over time, but is updated if, for instance, there is a change in the environment. By including a human in the loop during the learning process, which we refer to as interactive machine learning, the input from users can be utilized to build the model. The type of input used is typically annotations of the data, i.e. user input in the form of correctly labelled data points. Generally, it is assumed that the user always provides correct labels in accordance with the chosen interactive learning strategy. In many real-world applications, however, these assumptions are not realistic, as users might provide incorrect labels or fail to provide labels in line with the chosen strategy.

In this thesis we explore which interactive learning strategies are possible in the given scenario and how they affect performance, as well as the effect of the choice of machine learning algorithm on performance. We also study how a user who is not always reliable, i.e. who does not always provide a correct label when expected to, can affect performance. We propose a taxonomy of interactive online machine learning strategies and test how the different strategies affect performance through experiments on multiple datasets. The findings show that the overall best performing interactive learning strategy is one where the user provides labels when previous estimations have been incorrect, but that the best performing machine learning algorithm depends on the problem scenario. The experiments also show that decreased reliability of the user leads to decreased performance, especially when there is a limited amount of labelled data.

Place, publisher, year, edition, pages
Malmö: Malmö University, 2020. p. 129
Series
Studies in Computer Science ; 10
Keywords [en]
Machine Learning, Interactive Machine Learning, Online Learning, Active Learning, Machine Teaching
National subject category
Other Computer and Information Science
Identifiers
URN: urn:nbn:se:mau:diva-17433
DOI: 10.24834/isbn.9789178770854
ISBN: 978-91-7877-084-7 (print)
ISBN: 978-91-7877-085-4 (digital)
OAI: oai:DiVA.org:mau-17433
DiVA, id: diva2:1437537
Presentation
2020-06-18, 10:15 (English)
Opponent
Supervisors
Research funder
KK-stiftelsen, 20140035
Available from: 2020-06-09 Created: 2020-06-09 Last updated: 2024-03-05 Bibliographically approved
List of papers
1. Collaborative Sensing with Interactive Learning using Dynamic Intelligent Virtual Sensors
2019 (English). In: Sensors, E-ISSN 1424-8220, Vol. 19, no. 3, article id 477. Article in journal (Refereed) Published
Abstract [en]

Although sensor data is becoming prevalent across many domains, it remains a challenge to make sense of the sensor data in an efficient and effective manner in order to provide users with relevant services. The concept of virtual sensors provides a step towards this goal; however, virtual sensors are often used to denote homogeneous types of data, generally retrieved from a predetermined group of sensors. The DIVS (Dynamic Intelligent Virtual Sensors) concept was introduced in previous work to extend and generalize the notion of a virtual sensor to a dynamic setting with heterogeneous sensors. This paper introduces a refined version of the DIVS concept by integrating an interactive machine learning mechanism, which enables the system to take input from both the user and the physical world. The paper empirically validates some of the properties of the DIVS concept. In particular, we are concerned with the distribution of different budget allocations for labelled data, as well as proactive user labelling strategies. We report on results suggesting that relatively good accuracy can be achieved despite a limited budget in an environment with dynamic sensor availability, while proactive labelling ensures further improvements in performance.

Place, publisher, year, edition, pages
MDPI, 2019
Keywords
virtual sensors, sensor fusion, machine learning, dynamic environments, Internet of Things
National subject category
Engineering and Technology
Identifiers
urn:nbn:se:mau:diva-2628 (URN)
10.3390/s19030477 (DOI)
000459941200040 ()
30682809 (PubMedID)
2-s2.0-85060551967 (Scopus ID)
30112 (Local ID)
30112 (Archive number)
30112 (OAI)
Available from: 2020-02-27 Created: 2020-02-27 Last updated: 2024-02-05 Bibliographically approved
2. Activity Recognition through Interactive Machine Learning in a Dynamic Sensor Setting
2024 (English). In: Personal and Ubiquitous Computing, ISSN 1617-4909, E-ISSN 1617-4917, Vol. 28, no. 1, p. 273-286. Article in journal (Refereed) Published
Abstract [en]

Advances in the Internet of Things lead to an increased number of devices generating and streaming data. These devices can be useful data sources for activity recognition using machine learning. However, the set of available sensors may vary over time, e.g. due to mobility of the sensors and technical failures. Since the machine learning model uses the data streams from the sensors as input, it must be able to handle a varying number of input variables, i.e. a feature space that might change over time. Moreover, the labelled data necessary for training is often costly to acquire. In active learning, the model is given a budget for requesting labels from an oracle, and aims to maximize accuracy by careful selection of which data instances to label. It is generally assumed that the only role of the oracle is to respond to queries and that it will always do so. In many real-world scenarios, however, the oracle is a human user, and these assumptions are simplifications that might not give a proper depiction of the setting. In this work we investigate different interactive machine learning strategies, of which active learning is one, to explore the effects of an oracle that can be more proactive and of factors that might influence a user to provide or withhold labels. We implement five interactive machine learning strategies as well as hybrid versions of them and evaluate them on two datasets. The results show that a more proactive user can improve the performance, especially when the user is influenced by the accuracy of earlier predictions. The experiments also highlight challenges related to evaluating performance when the set of classes is changing over time.
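One way to picture a model that tolerates a changing feature space is sketched below. This is only an illustration under stated assumptions, not the method used in the paper: the class name `DynamicCentroidClassifier`, the sensor names, and the choice of a nearest-centroid rule are all hypothetical. Each instance is a dictionary of sensor name to value, and a prediction compares only the sensors that the instance and a class centroid have in common.

```python
from collections import defaultdict


class DynamicCentroidClassifier:
    """Hypothetical online nearest-centroid classifier over named features."""

    def __init__(self):
        # class label -> feature name -> [running sum, count]
        self._stats = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))

    def learn(self, x, label):
        """Update running means with one labelled instance (dict: sensor -> value)."""
        for name, value in x.items():
            entry = self._stats[label][name]
            entry[0] += value
            entry[1] += 1

    def predict(self, x):
        """Return the label whose centroid is closest on the shared features."""
        best_label, best_dist = None, float("inf")
        for label, feats in self._stats.items():
            shared = [n for n in x if n in feats]
            if not shared:
                continue  # no sensor overlap with what this class has seen
            # Mean squared difference, so classes with more shared sensors
            # are not penalised for having more terms in the sum.
            dist = sum((x[n] - feats[n][0] / feats[n][1]) ** 2 for n in shared)
            dist /= len(shared)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


clf = DynamicCentroidClassifier()
clf.learn({"temp": 20.0, "motion": 0.0}, "idle")
clf.learn({"temp": 21.0, "motion": 1.0, "sound": 0.8}, "active")

# The temperature sensor is unavailable for this instance.
prediction = clf.predict({"motion": 0.9, "sound": 0.7})
print(prediction)
```

Keying features by sensor name rather than by position means a sensor that disappears or appears mid-stream needs no re-indexing; an instance with no sensors in common with any class yields `None`.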

Place, publisher, year, edition, pages
Springer, 2024
Keywords
machine learning, interactive machine learning, active learning, machine teaching, online learning, sensor data
National subject category
Other Computer and Information Science; Computer Sciences
Identifiers
urn:nbn:se:mau:diva-17434 (URN)
10.1007/s00779-020-01414-2 (DOI)
000538990600002 ()
2-s2.0-85086152913 (Scopus ID)
Note

Correction available: https://doi.org/10.1007/s00779-020-01465-5

Available from: 2020-06-07 Created: 2020-06-07 Last updated: 2024-03-06 Bibliographically approved
3. A Taxonomy of Interactive Online Machine Learning Strategies
2020 (English). In: ECML PKDD 2020: Machine Learning and Knowledge Discovery in Databases / [ed] Hutter F.; Kersting K.; Lijffijt J.; Valera I., Springer, 2020, p. 1-17. Conference paper, Published paper (Refereed)
Abstract [en]

In interactive machine learning, human users and learning algorithms work together in order to solve challenging learning problems, e.g. problems with limited or no annotated data, or with trust issues. As annotating data can be costly, it is important to minimize the amount of annotated data needed for training while still achieving high classification accuracy. This is done by attempting to select the most informative data instances for training, where the number of instances is limited by a labelling budget. In an online learning setting, the decision of whether or not to select an instance for labelling has to be made on the fly, as the data arrives in sequential order and is only valid for a limited time period. We present a taxonomy of interactive online machine learning strategies. An interactive learning strategy determines which instances to label in an unlabelled dataset. In the taxonomy we differentiate between interactive learning strategies where the computer controls the learning process (active learning) and those where human users control the learning process (machine teaching). We then make a distinction between what triggers the learning: active learning could be triggered by uncertainty, time, or randomly, whereas machine teaching could be triggered by errors, state changes, time, or factors related to the user. We also illustrate the taxonomy by implementing versions of the different strategies and performing experiments on a benchmark dataset as well as on a synthetically generated dataset. The results show that the choice of interactive learning strategy affects performance, especially in the beginning of the online learning process, when there is a limited amount of labelled data.
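Two of the triggers named in the taxonomy can be contrasted in a few lines. The sketch below is illustrative only and makes several assumptions that are not taken from the paper: the learner is a toy one-dimensional centroid model (`OnlineCentroid`), the budget is an absolute number of labels, the `margin` threshold is arbitrary, and the stream is synthetic. The uncertainty trigger stands for active learning (the learner decides to ask), while the error trigger stands for machine teaching (the user decides to label after seeing a wrong prediction).

```python
import random


class OnlineCentroid:
    """Toy online learner: one running mean per class over a scalar feature."""

    def __init__(self):
        self.sums, self.counts = {}, {}

    def learn(self, x, y):
        self.sums[y] = self.sums.get(y, 0.0) + x
        self.counts[y] = self.counts.get(y, 0) + 1

    def distances(self, x):
        return {y: abs(x - self.sums[y] / self.counts[y]) for y in self.sums}

    def predict(self, x):
        d = self.distances(x)
        return min(d, key=d.get) if d else None


def uncertainty_trigger(model, x, y, margin=0.5):
    """Active learning: query when the two closest classes are nearly tied.

    The true label y is ignored; the learner cannot see it before asking.
    """
    d = sorted(model.distances(x).values())
    return len(d) < 2 or (d[1] - d[0]) < margin


def error_trigger(model, x, y):
    """Machine teaching: the user provides a label when the prediction is wrong."""
    return model.predict(x) != y


def run_stream(stream, trigger, budget):
    """Process (x, y) pairs in order and return (accuracy, labels used)."""
    model, correct, used = OnlineCentroid(), 0, 0
    for x, y in stream:
        correct += model.predict(x) == y  # test first, then possibly train
        if used < budget and trigger(model, x, y):
            model.learn(x, y)
            used += 1
    return correct / len(stream), used


# Synthetic two-class stream (an assumption made for this illustration).
rng = random.Random(0)
stream = [(rng.gauss(mu, 1.0), label)
          for _ in range(200)
          for mu, label in [(0.0, "a"), (3.0, "b")]]

for name, trigger in [("uncertainty", uncertainty_trigger),
                      ("error", error_trigger)]:
    accuracy, used = run_stream(stream, trigger, budget=20)
    print(name, round(accuracy, 3), used)
```

Because the decision is taken per instance as it arrives, neither trigger can revisit earlier data, which is the constraint the online setting imposes.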

Place, publisher, year, edition, pages
Springer, 2020
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12458
Keywords
interactive machine learning, active learning, machine teaching, online learning, streaming data
National subject category
Other Computer and Information Science
Identifiers
urn:nbn:se:mau:diva-17435 (URN)
10.1007/978-3-030-67661-2_9 (DOI)
000717542900009 ()
2-s2.0-85103280211 (Scopus ID)
978-3-030-67660-5 (ISBN)
978-3-030-67661-2 (ISBN)
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
Available from: 2020-06-08 Created: 2020-06-08 Last updated: 2024-02-05 Bibliographically approved
4. The Effects of Reluctant and Fallible Users in Interactive Online Machine Learning
2020 (English). In: Proceedings of the Workshop on Interactive Adaptive Learning co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2020) / [ed] Daniel Kottke, Georg Krempl, Vincent Lemaire, Andreas Holzinger & Adrian Calma, CEUR Workshops, 2020, p. 55-71. Conference paper, Published paper (Refereed)
Abstract [en]

In interactive machine learning it is important to select the most informative data instances to label in order to minimize the effort of the human user. There are two main categories of interactive machine learning. In the first category, active learning, it is the computational learner that selects which data is to be labelled by the human user, whereas in the second, machine teaching, the selection is done by the human teacher. It is often assumed that the human user is a perfect oracle, i.e., that a label will always be provided in accordance with the interactive learning strategy and that this label will always be correct. In real-world scenarios, however, these assumptions typically do not hold. In this work, we investigate how the reliability of the user providing labels affects the performance of online machine learning. Specifically, we study reluctance, i.e., to what extent the user does not provide labels in accordance with the strategy, and fallibility, i.e., to what extent the provided labels are incorrect. We show results of experiments on a benchmark dataset as well as a synthetically created dataset. By varying the degree of reluctance and fallibility of the user, the robustness of the different interactive learning strategies and machine learning algorithms is explored. The experiments show that the robustness of the strategies and algorithms varies. Moreover, certain machine learning algorithms are more robust towards reluctance than towards fallibility, while the opposite is true for others.
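The two notions of unreliability can be illustrated with a simulated user. The sketch below is a hypothetical model, not the one used in the paper: it assumes that reluctance and fallibility are independent probabilities, that a fallible user picks uniformly among the wrong labels, and the function name `simulated_user` is invented for this example.

```python
import random


def simulated_user(true_label, labels, reluctance, fallibility, rng):
    """Return the label a non-ideal user gives, or None if no label is given.

    reluctance:  assumed probability that the user withholds the label
    fallibility: assumed probability that a provided label is incorrect
    """
    if rng.random() < reluctance:
        return None  # the strategy asked for a label, but none was provided
    if rng.random() < fallibility:
        wrong = [label for label in labels if label != true_label]
        # With a single class there is no wrong label to give.
        return rng.choice(wrong) if wrong else true_label
    return true_label


rng = random.Random(42)
labels = ["walking", "sitting", "cooking"]

# Tally the outcomes for a moderately reluctant, slightly fallible user.
outcomes = {"withheld": 0, "correct": 0, "incorrect": 0}
for _ in range(10_000):
    answer = simulated_user("walking", labels, 0.3, 0.1, rng)
    if answer is None:
        outcomes["withheld"] += 1
    elif answer == "walking":
        outcomes["correct"] += 1
    else:
        outcomes["incorrect"] += 1
print(outcomes)
```

Setting both probabilities to zero recovers the perfect oracle assumed in much of the active learning literature, so the same experiment loop can sweep from ideal to unreliable users by varying two parameters.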

Place, publisher, year, edition, pages
CEUR Workshops, 2020
Series
CEUR Workshop Proceedings, E-ISSN 1613-0073
National subject category
Software Engineering
Identifiers
urn:nbn:se:mau:diva-17673 (URN)
Conference
Interactive Adaptive Learning 2020, Ghent, Belgium, September 14th, 2020.
Available from: 2020-07-03 Created: 2020-07-03 Last updated: 2023-12-28 Bibliographically approved

Open Access in DiVA

fulltext (3317 kB), 381 downloads
File information
File name: FULLTEXT04.pdf. File size: 3317 kB. Checksum: SHA-512
73f4751fb83275a5429be5e45482e95483ced77d8df89468e34b2715ee8bb34bbb0969dee8fa4e9e446c1bc38f37254ed9d2c2b7bde8dfd2d2da70b58c1eba70
Type: fulltext. Mimetype: application/pdf


Person

Tegen, Agnes

Total: 492 downloads
The number of downloads is the sum of downloads for all full texts. It may include, e.g., previous versions that are no longer available.

Total: 1155 hits