[en] BACKGROUND: Sitobion miscanthi is an ideal model for studying host plant specificity, parthenogenesis-based phenotypic plasticity, and interactions between insects and other species of various trophic levels, such as viruses, bacteria, plants, and natural enemies. However, the genome of this species has not yet been sequenced and published. Here, we analyzed the entire genome of a parthenogenetic female aphid colony using Pacific Biosciences long-read sequencing and Hi-C data to generate chromosome-length scaffolds and a highly contiguous genome assembly.
RESULTS: The final draft genome assembly from 33.88 Gb of raw data was ∼397.90 Mb in size, with a 2.05 Mb contig N50. Nine chromosomes were further assembled based on Hi-C data to a 377.19 Mb final size with a 36.26 Mb scaffold N50. The identified repeat sequences accounted for 26.41% of the genome, and 16,006 protein-coding genes were annotated. According to the phylogenetic analysis, S. miscanthi is closely related to Acyrthosiphon pisum, with S. miscanthi diverging from their common ancestor ∼25.0-44.9 million years ago.
CONCLUSIONS: We generated a high-quality draft of the S. miscanthi genome. This genome assembly should help promote research on the lifestyle and feeding specificity of aphids and on their interactions with each other and with species at other trophic levels. It can serve as a resource for accelerating genome-assisted improvements in insecticide resistance management and environmentally safe aphid management.
Disciplines :
Chemistry
Author, co-author :
Jiang, Xin ✱; Université de Liège - ULiège > TERRA Research Centre ; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Zhang, Qian ✱; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Qin, Yaoguo ✱; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Yin, Hang; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Zhang, Siyu; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Li, Qian; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Zhang, Yong ; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Fan, Jia ; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Chen, Julian ; State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
✱ These authors have contributed equally to this work.
Language :
English
Title :
A chromosome-level draft genome of the grain aphid Sitobion miscanthi.
NSFC - National Natural Science Foundation of China; China Postdoctoral Science Foundation; CSC - China Scholarship Council
Funding text :
This research was sponsored by the National Key R & D Plan of China (Nos. 2017YFD0200900, 2016YFD0300700, and 2017YFD0201700), the National Natural Science Foundation of China (Nos. 31871966 and 31871979), China Postdoctoral Science Foundation (No. 2018M631646), the State Modern Agricultural Industry Technology System (No. CARS-22-G-18), and the China Scholarship Council (No. 201703250048).
Jiang X, Zhang Q, Qin Y, et al. Supporting data for "A chromosome-level draft genome of the grain aphid Sitobion miscanthi." GigaScience Database 2019. http://dx.doi.org/10.5524/100635.