Post-Acute COVID-19 Syndrome; Human Phenotype Ontology; Narrative Medicine; Natural Language Processing; Large Language Models; General Practice; Observational Studies as Topic; Controlled Vocabulary; Belgium; Terminological Biomarker
Abstract :
Background:
Long COVID presents with complex and multisystemic symptoms that are difficult to recognize and document using traditional diagnostic classifications in primary care.
Research questions:
To explore how the Human Phenotype Ontology (HPO) can be used to index and analyze patient narratives in general practice, and to propose the concept of a "terminological biomarker" to describe the syndrome.
Method:
A four-year observational study (2021-2025) was conducted in a Belgian general practice, combining narrative interviews, ontology mapping, and a large language model (ChatGPT). Patient narratives were transcribed and indexed using ChatGPT-assisted prompts. HPO terms were extracted and validated using semantic similarity methods, and combined with clinical metadata and functional outcome scores. In parallel, peripheral blood samples were collected from each patient and analyzed transcriptomically to identify potential molecular signatures associated with viral persistence.
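As an illustration of the validation step, the sketch below scores how close an LLM-proposed term is to a reference term by comparing their ancestor sets (Jaccard similarity). The abstract does not say which semantic similarity method was used, so this is only one plausible choice. The hierarchy `TOY_PARENTS`, its labels and edges, and the function names are hypothetical and do not reproduce the real HPO graph.

```python
from typing import Dict, Set

# Hypothetical toy fragment of a phenotype hierarchy (child -> parents).
# Labels and edges are illustrative assumptions, not the real HPO graph.
TOY_PARENTS: Dict[str, Set[str]] = {
    "Fatigue": {"Constitutional symptom"},
    "Exertional intolerance": {"Constitutional symptom"},
    "Memory impairment": {"Cognitive abnormality"},
    "Constitutional symptom": {"Phenotypic abnormality"},
    "Cognitive abnormality": {"Phenotypic abnormality"},
    "Phenotypic abnormality": set(),
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the term itself plus every ancestor reachable via parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, set()))
    return seen


def jaccard_similarity(a: str, b: str, parents: Dict[str, Set[str]]) -> float:
    """Jaccard overlap of the two terms' ancestor sets (1.0 = identical)."""
    anc_a = ancestors(a, parents)
    anc_b = ancestors(b, parents)
    union = anc_a | anc_b
    return len(anc_a & anc_b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Compare an LLM-proposed term against a manually chosen reference term.
    print(jaccard_similarity("Fatigue", "Exertional intolerance", TOY_PARENTS))
    print(jaccard_similarity("Fatigue", "Memory impairment", TOY_PARENTS))
```

In this toy hierarchy, sibling terms under the same parent score higher than terms from different branches. In practice, a threshold on such a score could flag LLM-proposed terms for manual review.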
Results:
In a cohort of 307 patients, 1320 distinct HPO terms were identified. Fatigue, memory impairment, and exertional intolerance were most frequent. Manual verification confirmed the reliability of the LLM-HPO matching. A subset of 50 patients showed transcriptomic evidence of viral persistence.
Conclusions:
HPO enables structured representation of complex symptoms in Long COVID and supports narrative-informed documentation. The proposed "terminological biomarker" bridges lived experience and clinical semantics, providing a reproducible signal for emerging syndromes. Future studies will examine the correspondence between biological findings, particularly transcriptomic data, and the terminological patterns derived from patient narratives.
Points for discussion:
Narrative medicine can be transformed into a terminological biomarker, especially in the context of complex or poorly defined conditions like Long COVID.
By integrating patient expressions into structured medical vocabularies, previously dismissed or "medically unexplained" symptoms gain visibility and legitimacy.
Narrative medicine becomes a source of terminological biomarkers when patient language is extracted, normalized, and reintegrated into structured clinical terminologies.
Disciplines :
General & internal medicine
Author, co-author :
Jamoulle, Marc ; Université de Liège - ULiège > HEC Liège : UER > UER Opérations : Systèmes d'information de gestion
Soylu, Serhan ; ULB - Université Libre de Bruxelles > Département de Médecine Générale
Van Weyenbergh, Johan; KU Leuven - Katholieke Universiteit Leuven > Rega Institute
Language :
English
Title :
Identifying Terminological Biomarkers of Long COVID Through Narrative Medicine and Ontology Mapping