LLM Pipeline for Mapping Heterogeneous Data: A Case Study in Food Classification

Authors

Kevin Nils Röhl
HTW Berlin University of Applied Sciences
Rainer Alt
Leipzig University
https://orcid.org/0000-0002-6395-0658
Jan Wirsam
HTW Berlin University of Applied Sciences
https://orcid.org/0009-0004-7083-178X

Synopsis

Accurate food classification is essential for ensuring compliance with dietary regulations, nutritional standards, and sustainability guidelines, but it remains challenging due to fragmented data and semantic complexity. This study presents a pipeline leveraging large language model (LLM) embeddings, ontology mapping, and human-in-the-loop validation to enhance food classification in institutional food services. The pipeline achieves high accuracy in dietary-group mapping (precision 0.94, recall 0.91, F1-score 0.92), though precise FoodEx2 code matching remains challenging. A confidence-based validation strategy effectively balances automated processes with expert oversight to manage ambiguity. The proposed approach enables digital transformation of traditionally fragmented food service systems, enhancing transparency, operational efficiency, and alignment with dietary and public health guidelines. Future research should deploy this pipeline in operational canteen settings to refine embedding techniques, enhance accuracy, and support sustainable nutrition management.
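The confidence-based validation strategy mentioned in the synopsis can be illustrated with a minimal sketch. It is not the authors' implementation: the toy three-dimensional vectors stand in for LLM embeddings, and the names `cosine`, `route`, `GROUPS` and the threshold value of 0.8 are assumptions chosen for illustration only. The idea shown is that an item is mapped to its most similar dietary group, and the mapping is accepted automatically only when the similarity is high enough; otherwise it is sent to a human expert.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def route(item_vec, group_vecs, threshold=0.8):
    """Pick the most similar group and decide who validates the mapping.

    The threshold is a hypothetical value, not one reported in the paper.
    """
    best_group, best_score = max(
        ((group, cosine(item_vec, vec)) for group, vec in group_vecs.items()),
        key=lambda pair: pair[1],
    )
    decision = "auto_accept" if best_score >= threshold else "expert_review"
    return best_group, best_score, decision

# Toy stand-ins for embeddings of dietary groups (assumed, not real data).
GROUPS = {
    "vegetables": [1.0, 0.0, 0.0],
    "dairy": [0.0, 1.0, 0.0],
}

clear = route([0.9, 0.1, 0.0], GROUPS)      # close to one group
ambiguous = route([0.6, 0.5, 0.2], GROUPS)  # sits between groups

print(clear[0], clear[2])
print(ambiguous[0], ambiguous[2])
```

In a setup like this, lowering the threshold sends fewer items to experts at the cost of more unchecked mappings, which is the trade-off between automation and oversight that the synopsis describes.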

Author Biographies

Kevin Nils Röhl, HTW Berlin University of Applied Sciences

Berlin, Germany. E-mail: roehl@htw-berlin.de

Rainer Alt, Leipzig University

Leipzig, Germany. E-mail: rainer.alt@uni-leipzig.de

Jan Wirsam, HTW Berlin University of Applied Sciences

Berlin, Germany. E-mail: wirsam@htw-berlin.de

Published

June 9, 2025

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Röhl, K. N., Alt, R., & Wirsam, J. (2025). LLM Pipeline for Mapping Heterogeneous Data: A Case Study in Food Classification. In A. Pucihar, M. Kljajić Borštnar, S. Blatnik, M. Marolt, R. W. H. Bons, K. Smit, & M. Glowatz (Eds.), 38th Bled eConference: Empowering Transformation: Shaping Digital Futures for All: Conference Proceedings (pp. 483-498). University of Maribor Press. https://press.um.si/index.php/ump/catalog/book/947/chapter/615