LLM Pipeline for Mapping Heterogeneous Data: A Case Study in Food Classification

Kevin Nils Röhl; Rainer Alt; Jan Wirsam

doi:10.18690/um.fov.4.2025

Authors

Kevin Nils Röhl

HTW Berlin University of Applied Sciences

Rainer Alt

Leipzig University

https://orcid.org/0000-0002-6395-0658 (unauthenticated)

Jan Wirsam

HTW Berlin University of Applied Sciences

https://orcid.org/0009-0004-7083-178X (unauthenticated)

DOI: https://doi.org/10.18690/um.fov.4.2025.30

Abstract

Accurate food classification is essential for ensuring compliance with dietary regulations, nutritional standards, and sustainability guidelines, but it remains challenging due to fragmented data and semantic complexity. This study presents a pipeline leveraging large language model (LLM) embeddings, ontology mapping, and human-in-the-loop validation to enhance food classification in institutional food services. The pipeline achieves high accuracy in dietary-group mapping (precision 0.94, recall 0.91, F1-score 0.92), though precise FoodEx2 code matching remains challenging. A confidence-based validation strategy effectively balances automated processes with expert oversight to manage ambiguity. The proposed approach enables digital transformation of traditionally fragmented food service systems, enhancing transparency, operational efficiency, and alignment with dietary and public health guidelines. Future research should deploy this pipeline in operational canteen settings to refine embedding techniques, enhance accuracy, and support sustainable nutrition management.
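
The confidence-based validation idea mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`cosine`, `route`), the 0.85 threshold, the use of cosine similarity as the confidence score, and the toy three-dimensional vectors standing in for real LLM embeddings are all hypothetical choices made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(item_vec, group_vecs, accept_at=0.85):
    """Pick the most similar dietary group and decide who validates it.

    accept_at is an assumed threshold: scores at or above it are accepted
    automatically, everything else is sent to a human expert.
    """
    best_group, best_score = max(
        ((group, cosine(item_vec, vec)) for group, vec in group_vecs.items()),
        key=lambda pair: pair[1],
    )
    decision = "auto_accept" if best_score >= accept_at else "expert_review"
    return best_group, best_score, decision


# Toy vectors (assumption): real embeddings would come from an LLM.
GROUPS = {"vegetables": [1.0, 0.0, 0.0], "dairy": [0.0, 1.0, 0.0]}
clear_item = [0.9, 0.1, 0.0]      # close to one group
ambiguous_item = [0.6, 0.6, 0.1]  # equally close to both groups

print(route(clear_item, GROUPS))
print(route(ambiguous_item, GROUPS))
```

In this sketch the clear item is accepted automatically, while the ambiguous item falls below the threshold and is routed to expert review, which is the kind of balance between automation and oversight the abstract describes.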

Author Biographies

Kevin Nils Röhl, HTW Berlin University of Applied Sciences

Berlin, Germany. E-mail: roehl@htw-berlin.de

Rainer Alt, Leipzig University

Leipzig, Germany. E-mail: rainer.alt@uni-leipzig.de

Jan Wirsam, HTW Berlin University of Applied Sciences

Berlin, Germany. E-mail: wirsam@htw-berlin.de

Published in

38th Bled eConference: Empowering Transformation: Shaping Digital Futures for All: Conference Proceedings

Pages

483-498

Publication date

9 June 2025

Series

Bled eConference

License

This work is licensed under a Creative Commons Attribution 4.0 International License.
