Malmö University Publications
Data Labeling: An Empirical Investigation into Industrial Challenges and Mitigation Strategies
Chalmers University of Technology, Hörselgången 11, 417 56, Gothenburg, Sweden. ORCID iD: 0000-0001-8176-5846
Chalmers University of Technology, Hörselgången 11, 417 56, Gothenburg, Sweden. ORCID iD: 0000-0002-2501-9926
Chalmers University of Technology, Hörselgången 11, 417 56, Gothenburg, Sweden. ORCID iD: 0000-0003-2854-722X
Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). ORCID iD: 0000-0002-7700-1816
2020 (English). In: Product-Focused Software Process Improvement: 21st International Conference, PROFES 2020, Turin, Italy, November 25–27, 2020, Proceedings / [ed] Maurizio Morisio; Marco Torchiano; Andreas Jedlitschka, Springer, 2020, p. 202-216. Conference paper, Published paper (Refereed)
Abstract [en]

Labeling is a cornerstone of supervised machine learning. However, in industrial applications, data is often not labeled, which complicates using this data for machine learning. Although there are well-established labeling techniques such as crowdsourcing, active learning, and semi-supervised learning, these still do not provide accurate and reliable labels for every machine learning use case in industry. In this context, industry still relies heavily on manual annotation and labeling of its data. This study investigates the challenges that companies experience when annotating and labeling their data. We performed a case study using a semi-structured interview with data scientists at two companies to explore their problems when labeling and annotating their data. This paper provides two contributions: we identify industry challenges in the labeling process, and we propose mitigation strategies for these challenges.

Place, publisher, year, edition, pages
Springer, 2020. p. 202-216
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12562
National Category
Other Computer and Information Science
Identifiers
URN: urn:nbn:se:mau:diva-56802
DOI: 10.1007/978-3-030-64148-1_13
ISI: 000766320200013
ISBN: 978-3-030-64147-4 (print)
ISBN: 978-3-030-64148-1 (electronic)
OAI: oai:DiVA.org:mau-56802
DiVA, id: diva2:1720434
Conference
21st International Conference, PROFES 2020, Turin, Italy, November 25–27, 2020
Available from: 2022-12-19 Created: 2022-12-19 Last updated: 2022-12-19 Bibliographically approved

Open Access in DiVA

No full text in DiVA


Authority records

Olsson, Helena Holmström

Search in DiVA

By author/editor
Fredriksson, Teodor; Mattos, David Issa; Bosch, Jan; Olsson, Helena Holmström
By organisation
Department of Computer Science and Media Technology (DVMT)
Other Computer and Information Science
