Malmö University Publications
From precision to perception: Human-in-the-loop evaluation of keyword extraction for internet-scale contextual advertising
Cai, Jingwen. Department of Computing Science, Umeå University, Umeå, 90187, Sweden. ORCID iD: 0009-0004-0580-6270
Leckner, Sara. Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). ORCID iD: 0000-0002-1535-6195
Björklund, Johanna. Department of Computing Science, Umeå University, Umeå, 90187, Sweden. ORCID iD: 0000-0003-0596-627X
2026 (English). In: Information Systems, ISSN 0306-4379, E-ISSN 1873-6076, Vol. 138, article id 102665. Article in journal (Refereed). Published
Abstract [en]

Keyword extraction is a foundational task in natural language processing, underpinning countless real-world applications. One of these is contextual advertising, where keywords help predict the topical congruence between ads and their surrounding media contexts to enhance advertising effectiveness. Recent advances in artificial intelligence have improved keyword extraction capabilities but also introduced concerns about computational cost. Moreover, although the end-user experience is of vital importance, human evaluation of keyword extraction performance remains under-explored. This study provides a comparative evaluation of prevalent keyword extraction algorithms with different levels of complexity, represented by TF-IDF, KeyBERT, and Llama 2. To evaluate their effectiveness, a mixed-methods approach is employed, combining quantitative benchmarking with qualitative assessments from 855 participants through four survey-based experiments. The findings demonstrate that KeyBERT achieves an effective balance between user preferences and computational efficiency compared to the other algorithms. We observe a clear overall preference for gold-standard keywords, but there is a misalignment between algorithmic benchmark performance and user ratings. This reveals a long-overlooked gap between traditional precision-focused metrics and user-perceived algorithm efficiency. The study underscores the importance of human-in-the-loop evaluation methodologies and proposes analytical tools to facilitate their implementation.

Place, publisher, year, edition, pages
Elsevier, 2026. Vol. 138, article id 102665
Keywords [en]
Contextual advertising, Human evaluation, Human-in-the-loop, Keyword extraction, Language models, Statistical methods, Word embeddings
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:mau:diva-81473
DOI: 10.1016/j.is.2025.102665
ISI: 001641293500001
Scopus ID: 2-s2.0-105024445488
OAI: oai:DiVA.org:mau-81473
DiVA, id: diva2:2025575
Available from: 2026-01-07 Created: 2026-01-07 Last updated: 2026-01-08
Bibliographically approved

Open Access in DiVA

fulltext (4658 kB), 9 downloads
File information
File name: FULLTEXT01.pdf
File size: 4658 kB
Checksum (SHA-512): d3abc7884f2c4b8dcdf67a864b3130fab8064d49982df892c5767f06da46dea74e64b145db2127cad6692143ec20e659c11aa454f12943b8b1f0633c1c58f876
Type: fulltext
Mimetype: application/pdf

Authority records

Leckner, Sara

Search in DiVA

By author/editor
Cai, Jingwen; Leckner, Sara; Björklund, Johanna
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
