Malmö University Publications
From text to meaning: Semantic interpretation of non-standardized metadata in piping and instrumentation diagrams
Shteriyanov, Vasil. McDermott, Engineering, The Hague, The Netherlands; Eindhoven University of Technology, Mathematics and Computer Science, Eindhoven, The Netherlands. ORCID iD: 0009-0006-2515-6951
Dzhusupova, Rimma. McDermott, Engineering, The Hague, The Netherlands; Eindhoven University of Technology, Mathematics and Computer Science, Eindhoven, The Netherlands. ORCID iD: 0000-0003-3411-4084
Bosch, Jan. Eindhoven University of Technology, Mathematics and Computer Science, Eindhoven, The Netherlands; Chalmers University of Technology, Computer Science and Engineering, Gothenburg, Sweden. ORCID iD: 0000-0003-2854-722X
Olsson, Helena Holmström. Malmö University, Faculty of Technology and Society (TS), Department of Computer Science and Media Technology (DVMT). ORCID iD: 0000-0002-7700-1816
2026 (English). In: Computers and Chemical Engineering, ISSN 0098-1354, E-ISSN 1873-4375, Vol. 204, article id 109436. Article in journal (Refereed), Published
Abstract [en]

The extraction of structured metadata from Piping and Instrumentation Diagrams (P&IDs) is a major bottleneck for digitalization in the process industries. Existing methods, based on Optical Character Recognition (OCR), stop at raw text extraction, failing to interpret critical engineering information encoded within variable-format identifiers like pipeline numbers. This paper bridges this semantic gap by introducing a system for the format-aware interpretation of P&ID pipeline metadata. Our hybrid system architecture integrates deep learning for text recognition with domain interpretation rules that allow the system to adapt to new project formats without model retraining. These rules perform validation, error correction, and semantic mapping of raw text to structured data. We validated our system on a challenging dataset of real-world P&IDs from four distinct industrial projects, each with a unique and complex pipeline number format. Our method achieved 91.1% end-to-end accuracy, demonstrating a significant leap in performance over standard OCR tools, which proved insufficient for the task. This work presents a robust solution that unlocks valuable data from non-standardized engineering documents, providing a practical pathway for creating reliable digital twins and supporting plant lifecycle management in the chemical engineering sector.
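The kind of interpretation rule the abstract describes (validation, error correction, and semantic mapping of raw OCR text) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the pipeline number format `<size>"-<service>-<sequence>-<spec>`, the OCR confusion tables, and the names `LINE_PATTERN`, `PipelineNumber`, and `interpret` are hypothetical and are not taken from the paper or from any of its four project formats.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical project format, e.g. 6"-PW-1001-A1B:
# nominal size in inches, two-letter service code, four-digit sequence, spec class.
LINE_PATTERN = re.compile(
    r'^(?P<size>\d{1,2})"-(?P<service>[A-Z]{2})-(?P<sequence>\d{4})-(?P<spec>[A-Z]\d[A-Z])$'
)

# Assumed OCR confusions between visually similar characters.
TO_DIGIT = str.maketrans({"O": "0", "I": "1", "S": "5", "B": "8"})
TO_LETTER = str.maketrans({"0": "O", "1": "I", "5": "S", "8": "B"})


@dataclass(frozen=True)
class PipelineNumber:
    size_inch: int
    service: str
    sequence: int
    spec: str


def interpret(raw: str) -> Optional[PipelineNumber]:
    """Correct, validate, and map one raw OCR string; return None if it is invalid."""
    parts = raw.strip().upper().replace(" ", "").split("-")
    if len(parts) != 4:
        return None
    size, service, sequence, spec = parts

    # Error correction: each field knows whether it expects digits or letters.
    size = size.translate(TO_DIGIT)
    service = service.translate(TO_LETTER)
    sequence = sequence.translate(TO_DIGIT)
    if len(spec) == 3:  # letter, digit, letter
        spec = (
            spec[0].translate(TO_LETTER)
            + spec[1].translate(TO_DIGIT)
            + spec[2].translate(TO_LETTER)
        )

    # Validation: the corrected string must match the project format exactly.
    match = LINE_PATTERN.match("-".join([size, service, sequence, spec]))
    if match is None:
        return None

    # Semantic mapping: named, typed fields instead of a flat string.
    return PipelineNumber(
        size_inch=int(match["size"]),
        service=match["service"],
        sequence=int(match["sequence"]),
        spec=match["spec"],
    )


if __name__ == "__main__":
    # The letter O misread in place of zero is corrected before validation.
    print(interpret('6"-PW-1OO1-A1B'))
```

In a sketch like this, supporting a new project format would mean supplying a different pattern and field rules rather than retraining a recognition model, which is the adaptation property the abstract claims for its rule layer.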

Place, publisher, year, edition, pages
Elsevier, 2026. Vol. 204, article id 109436
Keywords [en]
Document analysis, Engineering automation, Engineering drawings, Hybrid AI systems, Information extraction
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:mau:diva-80171
DOI: 10.1016/j.compchemeng.2025.109436
ISI: 001593492700002
Scopus ID: 2-s2.0-105018195801
OAI: oai:DiVA.org:mau-80171
DiVA, id: diva2:2009239
Available from: 2025-10-27. Created: 2025-10-27. Last updated: 2025-11-04. Bibliographically approved

Open Access in DiVA

fulltext (3002 kB), 15 downloads
File information
File name: FULLTEXT01.pdf
File size: 3002 kB
Checksum (SHA-512): 45c4136f47c76a7dd76b9a3fe9c00a45c3b83530d1bee270bd88bcb8ea996b11d163c79718b24572657b291cfcbe46418d3f468b64ad8648a16cee2360fe4a9a
Type: fulltext
Mimetype: application/pdf

By author/editor
Shteriyanov, Vasil; Dzhusupova, Rimma; Bosch, Jan; Olsson, Helena Holmström
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
