Reference: Consensus assessment of the contamination level of publicly available cyanobacterial ...
Scientific journals: Article
Life sciences: Biochemistry, biophysics & molecular biology
Life sciences: Microbiology
Life sciences: Genetics & genetic processes
http://hdl.handle.net/2268/227075
Consensus assessment of the contamination level of publicly available cyanobacterial genomes.
English
Cornet, Luc [Université de Liège - ULiège > Département des sciences de la vie > Phylogénomique des eucaryotes]
Meunier, Loïc [Université de Liège - ULiège > Département des sciences de la vie > Phylogénomique des eucaryotes]
Van Vlierberghe, Mick [Université de Liège - ULiège > Département des sciences de la vie > Phylogénomique des eucaryotes]
Léonard, Raphaël [Université de Liège - ULiège > Département des sciences de la vie > Cristallographie des macromolécules biologiques]
Durieu, Benoit [Université de Liège - ULiège > Département des sciences de la vie > Centre d'ingénierie des protéines]
Lara, Yannick [Université de Liège - ULiège > Département de géologie > Paléobiogéologie - Paléobotanique - Paléopalynologie (PPP)]
Misztak, Agnieszka
Sirjacobs, Damien [Université de Liège - ULiège > Département des sciences de la vie > Phylogénomique des eucaryotes]
Javaux, Emmanuelle [Université de Liège - ULiège > Département de géologie > Paléobiogéologie - Paléobotanique - Paléopalynologie (PPP)]
Philippe, Herve
Wilmotte, Annick [Université de Liège - ULiège > Département des sciences de la vie > Physiologie et génétique bactériennes]
Baurain, Denis [Université de Liège - ULiège > Département des sciences de la vie > Phylogénomique des eucaryotes]
2018
PLoS ONE
13
7
e0200323
Yes (verified by ORBi)
International
1932-6203
United States
[en] Publicly available genomes are crucial for phylogenetic and metagenomic studies, in which contaminating sequences can cause major problems. This issue is expected to be especially important for Cyanobacteria because axenic strains are notoriously difficult to obtain and keep in culture. Yet, despite the great scientific interest of these organisms, no data are currently available concerning the quality of publicly available cyanobacterial genomes. As reliably detecting contaminants is a complex task, we designed a pipeline combining six methods in a consensus strategy to assess the contamination level of 440 genome assemblies of Cyanobacteria. Two methods are based on published reference databases of ribosomal genes (SSU rRNA 16S and ribosomal proteins), one is indirectly based on a reference database of marker genes (CheckM), and three are based on complete genome analysis. Among those genome-wide methods, Kraken and DIAMOND blastx share the same reference database, which we derived from Ensembl Bacteria, whereas CONCOCT does not require any reference database and instead relies on differences in DNA tetramer frequencies. Given that all six methods appear to have their own strengths and limitations, we used the consensus of their rankings to infer that >5% of cyanobacterial genome assemblies are highly contaminated by foreign DNA (i.e., contaminants were detected by 5 or 6 methods). Our results will help researchers to check the quality of publicly available genomic data before using them in their own analyses. Moreover, we argue that journals should make the submission of raw read data along with genome assemblies mandatory, in order to facilitate the detection of contaminants in sequence databases.
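The "5 or 6 methods" criterion in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which works from the consensus of method rankings; it only shows the final counting step under simplifying assumptions. The method keys, the boolean per-method calls, the function names, and the toy assemblies are all hypothetical and invented for illustration.

```python
from typing import Dict

# Hypothetical keys for the six methods named in the abstract.
METHODS = [
    "ssu_rrna_16s",
    "ribosomal_proteins",
    "checkm",
    "kraken",
    "diamond_blastx",
    "concoct",
]


def count_detections(calls: Dict[str, bool]) -> int:
    """Count how many of the six methods reported contaminants."""
    missing = [m for m in METHODS if m not in calls]
    if missing:
        raise ValueError(f"missing calls for: {missing}")
    return sum(1 for m in METHODS if calls[m])


def is_highly_contaminated(calls: Dict[str, bool], min_methods: int = 5) -> bool:
    """Apply the abstract's criterion: detected by 5 or 6 methods."""
    return count_detections(calls) >= min_methods


def fraction_highly_contaminated(assemblies: Dict[str, Dict[str, bool]]) -> float:
    """Fraction of assemblies meeting the criterion (0.0 if none given)."""
    if not assemblies:
        return 0.0
    flagged = sum(1 for c in assemblies.values() if is_highly_contaminated(c))
    return flagged / len(assemblies)


def make_calls(*flagging_methods: str) -> Dict[str, bool]:
    """Build a full call table where only the listed methods flag."""
    return {m: (m in flagging_methods) for m in METHODS}


# Made-up example data, not results from the study.
toy_calls = {
    "assembly_A": make_calls(*METHODS),        # flagged by all 6 methods
    "assembly_B": make_calls(*METHODS[:5]),    # flagged by 5 methods
    "assembly_C": make_calls("kraken", "checkm"),
    "assembly_D": make_calls(),                # flagged by none
}

if __name__ == "__main__":
    for name, calls in toy_calls.items():
        print(name, count_detections(calls), is_highly_contaminated(calls))
    print(fraction_highly_contaminated(toy_calls))
```

Requiring agreement from at least five independent methods is a conservative choice: a single method's false positive cannot place an assembly in the "highly contaminated" class on its own.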
F.R.S.-FNRS - Fonds de la Recherche Scientifique ; BELSPO - SPP Politique scientifique - Service Public Fédéral de Programmation Politique scientifique ; FRIA - Fonds pour la formation à la Recherche dans l'Industrie et dans l'Agriculture ; Université de Liège (Fédération Wallonie-Bruxelles) ; ANR - Agence Nationale de la Recherche ; WBI - Wallonie-Bruxelles International ; CECi - Consortium des Équipements de Calcul Intensif
Researchers ; Professionals ; Students
10.1371/journal.pone.0200323
FP7 ; 308074 - ELITE - Early Life Traces, Evolution, and Implications for Astrobiology

File(s) associated to this reference

Fulltext file(s):

File | Commentary | Version | Size | Access
Cornet_et_al_2018_PLOS_ONE_postprint_editor.pdf | | Publisher postprint | 6.8 MB | Open access

Additional material(s):

File | Commentary | Size | Access
Cornet_et_al_2018_PLOS_ONE_suppl_data.zip | Supplemental Data | 3.31 MB | Open access

