Automated tools for the generation and interpretation of single gene trees at a broad taxonomic scale

Baurain, Denis; Van Vlierberghe, Mick; Arnaud, Di Franco; Philipe, Hervé

Poster (Scientific congresses and symposiums)

2016 • ECCB 2016

Permalink
https://hdl.handle.net/2268/254874

Files

Full Text

ECCB2016.pdf

Publisher postprint (2.11 MB)

Details

Keywords :

bioinformatics; orthology; phylogeny

Abstract :

[en] Identifying orthology relationships among sequences is fundamental in phylogenomics; indeed, these relationships are essential to understanding evolution, the diversity of life and ancestry among organisms. To build alignments of orthologous sequences, phylogenomic pipelines often start with an all-vs-all similarity search followed by clustering with an algorithm such as OrthoFinder [Emms and Kelly (2015) Genome Biol 16:157]. For this clustering to be as accurate as possible, high-quality proteomes are needed, but these are available for only a small subset of living organisms. Therefore, phylogenomic analyses at a broad taxonomic scale require enriching preexisting orthologous groups with transcriptomic or genomic data, which in turn calls for robust tools for identifying orthologues from heterogeneous sequence data. To this end, we have developed a novel tool, "Forty-Two", along the lines of HaMStR [Ebersberger et al. (2009) BMC Evol Biol 9:157], whose aim is to add (and optionally align) sequences to thousands of preexisting multiple sequence alignments (MSAs) while controlling for orthology relationships and potentially contaminating sequences. "Forty-Two" uses advanced heuristics based on a multiple Best Reciprocal Hit (multi-BRH) strategy against reference proteomes to distinguish orthologous from paralogous sequences among homologues. It is fully functional and has already been used in two high-profile phylogenomic manuscripts (under review) dealing with the animal tree of life. Here, we present the principles and algorithms underlying "Forty-Two" as well as the results of an extensive test suite of its features, in order to support its release to the public.
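
The multi-BRH idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the "Forty-Two" implementation, which the abstract does not detail. It assumes that similarity scores have already been computed (for example by a BLAST-like search) and that, for each reference proteome, the proteins already known to belong to the orthologous group are given. The function names, the `min_fraction` threshold and the toy identifiers are all hypothetical.

```python
from typing import Dict, Optional, Set


def best_hit(hits: Dict[str, float]) -> Optional[str]:
    """Return the highest-scoring target, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=hits.get)


def passes_multi_brh(
    candidate_hits: Dict[str, Dict[str, float]],
    expected_orthologues: Dict[str, Set[str]],
    min_fraction: float = 1.0,
) -> bool:
    """Accept a candidate if its best hit in enough reference proteomes
    is one of the proteins already assigned to the orthologous group.

    candidate_hits: proteome name -> {protein id: similarity score}
    expected_orthologues: proteome name -> ids of the group's members
    min_fraction: hypothetical threshold on the share of informative
        proteomes that must agree (1.0 means all of them).
    """
    tested = 0
    agreeing = 0
    for proteome, expected in expected_orthologues.items():
        top = best_hit(candidate_hits.get(proteome, {}))
        if top is None:
            continue  # no hit in this proteome: treated as uninformative
        tested += 1
        if top in expected:
            agreeing += 1
    if tested == 0:
        return False  # no evidence at all, so do not add the candidate
    return agreeing / tested >= min_fraction


# Toy data (assumed): members of the group in two reference proteomes.
expected = {"human": {"HS_geneA"}, "yeast": {"SC_geneA"}}

# Candidate whose best hit in every proteome is a group member.
orthologue_hits = {
    "human": {"HS_geneA": 250.0, "HS_geneA2": 180.0},
    "yeast": {"SC_geneA": 140.0},
}

# Candidate that prefers the human paralogue HS_geneA2.
paralogue_hits = {
    "human": {"HS_geneA": 150.0, "HS_geneA2": 230.0},
    "yeast": {"SC_geneA": 120.0},
}

print(passes_multi_brh(orthologue_hits, expected))
print(passes_multi_brh(paralogue_hits, expected))
```

With the strict default threshold, the first candidate is accepted and the second is rejected, because its best human hit is a paralogue. Relaxing `min_fraction` trades specificity for sensitivity, which may be useful when some reference proteomes are incomplete.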

Disciplines :

Biochemistry, biophysics & molecular biology

Author, co-author :

Baurain, Denis ; Université de Liège - ULiège > Département des sciences de la vie > Phylogénomique des eucaryotes

Van Vlierberghe, Mick ; Université de Liège - ULiège > InBioS

Arnaud, Di Franco

Philipe, Hervé

Language :

English

Title :

Automated tools for the generation and interpretation of single gene trees at a broad taxonomic scale

Publication date :

05 September 2016

Event name :

ECCB 2016

Event date :

from 04-09-2016 to 07-09-2016

Audience :

International

Available on ORBi :

since 07 January 2021

Statistics

Number of views

159 (20 by ULiège)

Number of downloads

47 (4 by ULiège)
