Poster (Scientific congresses and symposiums)
MARVIN: A Deep Generative Model for Flow Cytometry Analysis Informed by Biological Assumption
De Voeght, Adrien; Bodart, Fanny; Baron, Frédéric et al.
2026, 41st General Annual Meeting of the Belgian Hematology Society
Peer reviewed
 

Files


Full Text
MARVIN_Poster_BHS_GL_FB_final.pdf
Author postprint (2.33 MB) Creative Commons License - Public Domain Dedication

Details



Keywords :
Artificial intelligence; generative model; flow cytometry; MRD; Acute leukemia
Abstract :
[en] Introduction: Flow cytometry (FCM) is widely used in research and clinical practice to characterise complex cell populations, generating high-dimensional single-cell data across numerous markers. Despite technological advances, manual gating remains the standard approach for annotating cell populations, even though it is time-consuming and operator-dependent. Deep generative models offer the potential to perform classification and discovery tasks simultaneously, improving efficiency and consistency.
Methods, model: MARVIN is a semi-supervised deep generative model for cytometry analysis. Its architecture is structured around the biological assumption that the immune system consists of mixtures of cell populations. This assumption constrains its latent space to reflect the population structure, enabling biologically interpretable representations. MARVIN can perform multiple tasks: classification of known populations, discovery of novel or rare subpopulations, and exploration of immune system dynamics.
Methods, dataset and experiments: The dataset comprises 5,480,065 cells from three patients without active disease and 10,222 malignant lymphoblastic cells from four additional patients. In total, the dataset includes 12 annotated cell populations profiled with 8 markers. All measurements were transformed and standardised using an auto-logicle transformation. Classification task: the model was trained on a large dataset combining labelled cells from one patient with unlabelled cells from the others. Cell-discovery tasks: healthy and pathological cells were merged, and two analyses were conducted. (i) Subpopulation discovery: the number of clusters in the latent space was increased and malignant cells were masked during training, in order to evaluate whether MARVIN isolates pathological cells into the additional clusters. (ii) Anomaly detection: a previously unseen cell population was provided to the model without adding new clusters, and reconstruction error was used to assess its dissimilarity from the learned populations.
Results: Classification task: accuracy, F1 score and balanced accuracy were 99.21%, 94.83% and 96.83%, respectively, for patient 2, and 75.88%, 78.41% and 92.26%, respectively, for patient 3. Discovery and anomaly detection: MARVIN successfully highlighted rare pathological populations (<0.1%). Through cluster expansion, it identified new pathological populations as distinct from healthy cells. It grouped two small MRD populations (MRD2 and MRD4) into the same cluster while still detecting subtle differences, and it mapped the blast groups of patients 1 and 3 into separate clusters. MARVIN detected and correctly assigned 99.2% of leukemic cells to new clusters. Using reconstruction error, MARVIN identified all pathological populations as previously unseen and suitable for further characterisation.
Conclusion: MARVIN is a semi-supervised generative model grounded in biological assumptions for FCM data. It can be trained on routinely standardised datasets and applied across instruments, supporting broad laboratory implementation. MARVIN achieves high classification accuracy and detects novel populations through expanded clustering and reconstruction-loss evaluation. Ongoing work focuses on biological refinement to improve rare population clustering and on applying MARVIN to study MRD dynamics in acute leukemia.
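The three classification metrics quoted in the abstract (accuracy, F1 score and balanced accuracy) can diverge when populations differ greatly in size. The following is a minimal sketch of how they are commonly computed, not the authors' evaluation code. It assumes macro-averaged F1 (the abstract does not state the averaging), and the function name `classification_metrics` and the toy labels "A", "B", "C" are hypothetical and not taken from the poster's data.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro_f1, balanced_accuracy) for two label lists.

    Assumption: F1 is macro-averaged over the classes present in y_true.
    """
    labels = sorted(set(y_true))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fn[truth] += 1  # a cell of class `truth` was missed
            fp[pred] += 1   # and wrongly counted towards class `pred`

    accuracy = sum(tp.values()) / len(y_true)

    recalls, f1_scores = [], []
    for label in labels:
        recalls.append(tp[label] / (tp[label] + fn[label]))
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)

    # Balanced accuracy is the unweighted mean of per-class recall.
    balanced_accuracy = sum(recalls) / len(recalls)
    macro_f1 = sum(f1_scores) / len(f1_scores)
    return accuracy, macro_f1, balanced_accuracy


# Toy example (hypothetical): one large population "A" with 40% of its
# cells mislabelled as "B", and two small populations classified perfectly.
y_true = ["A"] * 100 + ["B"] * 10 + ["C"] * 10
y_pred = ["A"] * 60 + ["B"] * 40 + ["B"] * 10 + ["C"] * 10

accuracy, macro_f1, balanced_accuracy = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f}")
print(f"macro F1={macro_f1:.4f}")
print(f"balanced accuracy={balanced_accuracy:.4f}")
```

In this toy case the errors are concentrated in the largest population, so plain accuracy falls below balanced accuracy. This is one way such a gap can arise; it is not a statement about the cause of the gap reported for patient 3.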
Disciplines :
Hematology
Author, co-author :
De Voeght, Adrien  ;  Université de Liège - ULiège > Département des sciences cliniques
Bodart, Fanny ;  Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore) > Big Data
Baron, Frédéric  ;  Université de Liège - ULiège > Département des sciences cliniques
Louppe, Gilles  ;  Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore) > Big Data
Language :
English
Title :
MARVIN: A Deep Generative Model for Flow Cytometry Analysis Informed by Biological Assumption
Publication date :
06 February 2026
Event name :
41st General Annual Meeting of the Belgian Hematology Society
Event date :
6-7 February 2026
Audience :
International
Peer review/Selection committee :
Peer reviewed
Available on ORBi :
since 09 February 2026
