Abstract:
[en] The origin of the eukaryotic cell remains one of the most contentious puzzles in evolutionary biology. In the late 1970s, by discovering the domain Archaea, Woese put an end to the dichotomous view of life (eukaryotes vs prokaryotes) (C. R. Woese & Fox, 1977). His work, dubbed "Woese’s Revolution", showed that the living world is divided into three domains: Bacteria, Archaea and Eukaryota. Initially, Bacteria were considered the ancestral lineage, which gave rise to Archaea and Eukaryota. However, this initial view, from simple to complex, is still being reassessed, especially since we now know that some species can evolve by secondary simplification. Accordingly, rooting the tree of life has become a problem, because the inferred relationships among these three domains are not reliable. Moreover, the eukaryotic cell seems to have both bacterial operational genes and archaeal informational genes, so that it could have originated from a fusion event between a bacterium and an archaeon. Then, since the discovery of the Asgard group, it has been suggested that Eukaryota originate from within Archaea, which would make Archaea paraphyletic. Nevertheless, things are perhaps not that simple. Indeed, many artifacts can affect phylogenetic reconstructions, such as long-branch attraction and contamination.
Two main types of competing scenarios explain the origin of the eukaryotic cell. The first posits a symbiotic fusion between an archaeon (whose nature varies) and an α-proteobacterium at the origin of the mitochondrion, while the second considers the eukaryotes to be a lineage independent of both Archaea and Bacteria that would have phagocytosed an α-proteobacterium during its evolution. The recent discoveries on the Archaea, in particular the identification of the Asgard group, have thus revived the debate between the supporters of a two-domain tree of life (the eukaryotes being descended from the Archaea, which are therefore paraphyletic) and the supporters of a three-domain tree of life.
The aim of this thesis is to revisit the question of the relationship between Archaea and Eukaryotes, based on an original and rigorous methodology. The Archaea, which were very poorly represented until recently, now appear more and more as a very diverse domain of life. The discovery of new Archaea, the Asgard superphylum, possessing genes encoding proteins previously considered specific to eukaryotes, suggests that eukaryotes could be directly derived from Archaea and would thus be the result of a fusion between an Asgard archaeon and a bacterium. However, the reliability of the datasets and of the phylogenetic inference methods can sometimes be questionable. Phylogenetic inference models struggle to avoid artifacts, which often go unnoticed. This thesis focuses on these methodological biases in order to minimize systematic errors in phylogenomic reconstruction and to provide a more reliable phylogeny of Archaea and Eukaryotes.
In the first chapter, we revisited the question of the root of the tree of life using the Elongation Factor (EF) gene. We then took a critical look at the methods used in papers dealing with deep phylogenies, focusing on how phylogenetic reconstruction artifacts are taken into account and on the identification of methodological biases.
In a second study, we established a phylogeny of the Archaea through a "quadratic jackknife" procedure resampling both genes and species, in order to evaluate the robustness of the phylogenetic results to these variations in the data, but also to variations in the methods (supermatrices and supertrees) and in the models used (LG4X, C20, C60 and PMSF). We then performed Slow-Fast analyses to compare tree topologies based on supermatrices of sites with different substitution rates. Our analyses favor the Korarchaeota rather than the SCGC group as the sister group of Sulfolobales + Desulfurococcales + Thermofilaceae + Thermoproteaceae. Furthermore, we recover Hadesarchaeota as the sister group to Thermococci, Theionarchaea and Methanomicrobia_Arc. Finally, our results do not systematically recover the monophyly of Euryarchaeota. Indeed, the DPANN group, whose position is itself uncertain, tends to attract the Altiarchaeota, making the Euryarchaeota paraphyletic. We cannot decide between these two alternatives. In addition, we observe some polyphyly at the genus level, i.e. archaea classified in the same genus while belonging to different clades.
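As an illustration only, the double resampling described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the thesis: the function name `quadratic_jackknife`, the sampling fractions and the number of replicates are hypothetical, since the abstract does not specify them.

```python
import random


def quadratic_jackknife(genes, species, n_replicates,
                        gene_fraction, species_fraction, seed=0):
    """Draw replicates that subsample BOTH genes and species.

    All parameters are illustrative assumptions; the abstract does not
    give the fractions or the number of replicates actually used.
    """
    rng = random.Random(seed)
    n_genes = max(1, round(len(genes) * gene_fraction))
    n_species = max(1, round(len(species) * species_fraction))
    replicates = []
    for _ in range(n_replicates):
        # Sampling without replacement along each dimension of the matrix.
        gene_subset = sorted(rng.sample(genes, n_genes))
        species_subset = sorted(rng.sample(species, n_species))
        replicates.append((gene_subset, species_subset))
    return replicates


# Hypothetical toy input: 10 genes and 8 species.
genes = [f"gene{i}" for i in range(10)]
species = [f"sp{i}" for i in range(8)]
replicates = quadratic_jackknife(genes, species, n_replicates=5,
                                 gene_fraction=0.5, species_fraction=0.75)
```

Each replicate would then be turned into a supermatrix (or a set of gene trees for a supertree) and analyzed separately, so that the stability of a given clade can be assessed across replicates.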
In the third study, we included Eukaryotes in our datasets in order to test the hypotheses concerning a relationship between Asgard Archaea and Eukaryotes. To do so, we used multiple approaches to control systematic error. We controlled for problems related to paralogy by examining the topology obtained for each gene, and performed site removal to test for heterotachy and heteropecilly. For the slowest genes, the clustering of Asgard with eukaryotes receives only minimal support, and a clustering of Asgard with TACK is favored instead. Only a substantial addition of fast-evolving genes systematically groups Asgard with eukaryotes. We can therefore hypothesize that the grouping of Asgard with eukaryotes results from a phylogenetic reconstruction bias caused by the use of genes whose substitution rate is too fast.
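The progressive addition of faster genes mentioned above can be pictured with the following sketch. It is an assumption-laden illustration rather than the thesis pipeline: the function `nested_rate_datasets`, the `step` parameter and the per-gene rate values are all hypothetical, and how the rates were actually estimated is not stated in the abstract.

```python
def nested_rate_datasets(gene_rates, step):
    """Build nested gene sets, from the slowest genes to the full set.

    gene_rates maps a gene name to an estimated substitution rate
    (hypothetical values; the rate estimator is not specified here).
    """
    # Rank genes from slowest to fastest.
    ranked = sorted(gene_rates, key=lambda gene: gene_rates[gene])
    # Grow the dataset by `step` genes at a time, ending with all genes.
    sizes = list(range(step, len(ranked), step)) + [len(ranked)]
    return [ranked[:size] for size in sizes]


# Hypothetical toy rates for five genes.
rates = {"a": 0.5, "b": 0.1, "c": 0.9, "d": 0.3, "e": 0.7}
datasets = nested_rate_datasets(rates, step=2)
```

A tree would be inferred from each nested dataset, making it possible to see at which point, if any, the position of a group changes as faster genes are added.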
We also sought to root the archaeal tree, both with and without eukaryotes. Our findings support a root within the SANT group, and Euryarchaeota appear as a paraphyletic group. The uncertainty lies in which group is at the base of the Ouranosarchaea: DPANN, Altiarchaeota, or Hadesarchaeota. While the PMSF method of IQ-TREE favors a DPANN + Altiarchaeota grouping as the basal group, Bayesian inference analyses with PhyloBayes instead support Hadesarchaeota in this position.
We therefore conclude that it is difficult to support with certainty a model of eukaryogenesis based on a 2-domain model in which eukaryotes would be the sister group of Asgard. Without excluding an archaeal origin of eukaryotes, a 3-domain (or even 1-domain) scenario remains possible as long as doubts persist about the phylogenetic methods used. We discuss the possibility that the grouping of Asgard with eukaryotes is, as the history of the conception of the tree of life has too often shown, the result of a lack of reliable data and of analysis artifacts due to the particular biology of these species.