Abstract:
[en] A recent paradigm shift in our understanding of how photosynthesis was acquired proposes that an obligate intracellular pathogen was involved in primary plastid establishment, an event so far thought of as an association between a heterotrophic unicellular eukaryote and a cyanobacterium. This hypothesis, dubbed the Menage-a-trois Hypothesis (MATH), suggests an active and direct role for Chlamydiales in primary endosymbiosis: they would have provided many critical genes to the cyanobiont within a common inclusion vesicle. The expression and efficient localization of the products of these genes, such as key transporters and glucan transferases, would have initiated the biochemical fluxes of symbiosis. Although still controversial, the MATH is supported by molecular, biochemical and phylogenetic evidence. In particular, studies performed more than a decade ago concluded that 30-100 genes may have been transferred from Chlamydiales pathogens to the ancestor of Archaeplastida. In this work, we revisit the phylogenetic support for the MATH with the objective of updating the list of Chlamydiales genes found in modern Archaeplastida and identifying a congruent phylogenomic signal (if any) that points to the potential original pathogen. Starting from all relevant publicly available data, we assembled a representative set of primary algae and Chlamydiales genomes, supplemented by non-photosynthetic eukaryotic and bacterial genomes. The selected proteomes were compared to each other and their proteins clustered into orthologous groups. Single-gene phylogenetic analyses then allowed us to automatically identify trees suggesting a Chlamydiales origin of the Archaeplastida proteins. In this way, we identified about 150 genes (of which 40-50 were identified in the original studies) that may have been transferred from Chlamydiales during plastid establishment. Manual analyses are nevertheless necessary to confirm this number.
A second round of phylogenomic analyses is ongoing, either based on explicit subsets of concatenated congruent gene alignments or using a Bayesian model that automatically clusters genes based on shared histories.
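As an illustration only, the automated screening of single-gene trees mentioned above could be sketched as follows. This is not the pipeline used in this work; it is a minimal sketch under stated assumptions: each tree is a rooted Newick string without quoted labels, every leaf label carries a hypothetical group prefix (`ARCH|`, `CHLA|`, `CYAN|`, `OTHER|`), and a tree is flagged when it contains a clade made up exclusively of Archaeplastida and Chlamydiales sequences, with at least one of each. The function names `parse_newick`, `groups_under` and `suggests_chlamydial_origin` are invented for this example.

```python
from typing import List, Set, Union

# A tree is either a leaf label or a list of child subtrees.
Tree = Union[str, List["Tree"]]


def parse_newick(text: str) -> Tree:
    """Parse a simple Newick string into nested lists of leaf labels.

    Branch lengths and internal labels (e.g. support values) are read and
    discarded. Quoted labels and comments are not supported (assumption).
    """
    compact = "".join(text.split())  # drop all whitespace
    pos = 0

    def read_token() -> str:
        # Read characters up to the next Newick delimiter.
        nonlocal pos
        start = pos
        while pos < len(compact) and compact[pos] not in "(),:;":
            pos += 1
        return compact[start:pos]

    def read_node() -> Tree:
        nonlocal pos
        if compact[pos] == "(":
            pos += 1  # consume "("
            children = [read_node()]
            while compact[pos] == ",":
                pos += 1  # consume ","
                children.append(read_node())
            if compact[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
            read_token()  # internal label or support value, ignored
            node: Tree = children
        else:
            node = read_token()  # leaf label
        if pos < len(compact) and compact[pos] == ":":
            pos += 1
            read_token()  # branch length, ignored
        return node

    return read_node()


def groups_under(node: Tree) -> Set[str]:
    """Return the set of group prefixes found among the leaves of a subtree."""
    if isinstance(node, str):
        return {node.split("|", 1)[0]}
    found: Set[str] = set()
    for child in node:
        found |= groups_under(child)
    return found


def suggests_chlamydial_origin(node: Tree) -> bool:
    """True if some clade contains only ARCH and CHLA leaves, both present."""
    if isinstance(node, str):
        return False
    if groups_under(node) == {"ARCH", "CHLA"}:
        return True
    return any(suggests_chlamydial_origin(child) for child in node)


# Toy trees (made-up topologies, for illustration only).
candidate = (
    "(((ARCH|Arabidopsis:0.1,ARCH|Cyanidioschyzon:0.2)95:0.05,"
    "(CHLA|Chlamydia:0.1,CHLA|Protochlamydia:0.1)99:0.05)90:0.1,"
    "(CYAN|Synechocystis:0.3,OTHER|Escherichia:0.4)80:0.2);"
)
background = (
    "((ARCH|Arabidopsis:0.1,CYAN|Synechocystis:0.2)97:0.1,"
    "(CHLA|Chlamydia:0.1,OTHER|Escherichia:0.3)60:0.1);"
)
print(suggests_chlamydial_origin(parse_newick(candidate)))
print(suggests_chlamydial_origin(parse_newick(background)))
```

Because the criterion depends on where the tree is rooted and ignores branch support, a real screen would need an explicit rooting strategy and a support threshold, and flagged trees would still call for the manual inspection noted above.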