Sheynkman, G. M.; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, United States, Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, United States, Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, United States
Tuttle, K. S.; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, United States, Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, United States, Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, United States, Department of Biochemistry, Northeastern University, Boston, MA 02115, United States, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States, Icahn Institute of Data Science and Genomic Technology, New York, NY 10029, United States
Tseng, E.; Pacific Biosciences, Menlo Park, CA 94025, United States
Underwood, J. G.; Pacific Biosciences, Menlo Park, CA 94025, United States
Yu, L.; School of Computer Science and Technology, Xidian University, Xi’an, 710071, China
Dong, D.; School of Computer Science and Technology, Xidian University, Xi’an, 710071, China
Smith, M. L.; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States, Icahn Institute of Data Science and Genomic Technology, New York, NY 10029, United States
Sebra, R.; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States, Icahn Institute of Data Science and Genomic Technology, New York, NY 10029, United States
Willems, Luc ; Université de Liège - ULiège > Cancer-Cellular and Molecular Epigenetics
Hao, T.; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, United States, Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, United States, Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, United States
Calderwood, M. A.; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, United States, Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, United States, Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, United States
Hill, D. E.; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, United States, Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, United States, Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, United States
Vidal, M.; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, United States, Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, United States
ORF Capture-Seq as a versatile method for targeted identification of full-length isoforms
Publication date :
2020
Journal title :
Nature Communications
eISSN :
2041-1723
Publisher :
Nature Research
Volume :
11
Issue :
1
Peer reviewed :
Peer Reviewed verified by ORBi
Funders :
Melanoma Research Foundation, MRF; Belgian American Educational Foundation, BAEF; National Institutes of Health, NIH: T32CA009361; National Cancer Institute, NCI: U01CA232161; P50HG004233
Blencowe, B. J. Alternative splicing: new insights from global analyses. Cell 126, 37–47 (2006).
Yang, X. et al. Widespread expansion of protein interaction capabilities by alternative splicing. Cell 164, 805–817 (2016).
Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008).
Mudge, J. M. & Harrow, J. The state of play in higher eukaryote gene annotation. Nat. Rev. Genet. 17, 758–772 (2016).
Hayer, K. E., Pizarro, A., Lahens, N. F., Hogenesch, J. B. & Grant, G. R. Benchmark analysis of algorithms for determining and quantifying full-length mRNA splice forms from RNA-seq data. Bioinformatics 31, 3938–3945 (2015).
Steijger, T. et al. Assessment of transcript reconstruction methods for RNA-seq. Nat. Methods 10, 1177–1184 (2013).
Harrow, J. et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774 (2012).
Sharon, D., Tilgner, H., Grubert, F. & Snyder, M. A single-molecule long-read survey of the human transcriptome. Nat. Biotechnol. 31, 1009–1014 (2013).
Volden, R. et al. Improving nanopore read accuracy with the R2C2 method enables the sequencing of highly multiplexed full-length single-cell cDNA. Proc. Natl. Acad. Sci. USA 115, 9726–9731 (2018).
Tilgner, H. et al. Comprehensive transcriptome analysis using synthetic long-read sequencing reveals molecular co-association of distant splicing events. Nat. Biotechnol. 33, 736–742 (2015).
Stark, R., Grzelak, M. & Hadfield, J. RNA sequencing: the teenage years. Nat. Rev. Genet. 20, 631–656 (2019).
Spataro, N., Rodriguez, J. A., Navarro, A. & Bosch, E. Properties of human disease genes and the role of genes linked to Mendelian disorders in complex disease aetiology. Hum. Mol. Genet. 26, 489–500 (2017).
Djebali, S. et al. Landscape of transcription in human cells. Nature 489, 101–108 (2012).
Mamanova, L. et al. Target-enrichment strategies for next-generation sequencing. Nat. Methods 7, 111–118 (2010).
Gnirke, A. et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat. Biotechnol. 27, 182–189 (2009).
Halvardson, J., Zaghlool, A. & Feuk, L. Exome RNA sequencing reveals rare and novel alternative transcripts. Nucleic Acids Res. 41, e6 (2013).
Levin, J. Z. et al. Targeted next-generation sequencing of a cancer transcriptome enhances detection of sequence variants and novel fusion transcripts. Genome Biol. 10, R115 (2009).
Mercer, T. R. et al. Targeted sequencing for gene discovery and quantification using RNA CaptureSeq. Nat. Protoc. 9, 989–1009 (2014).
Ueno, T. et al. High-throughput resequencing of target-captured cDNA in cancer cells. Cancer Sci. 103, 131–135 (2012).
Mercer, T. R. et al. Targeted RNA sequencing reveals the deep complexity of the human transcriptome. Nat. Biotechnol. 30, 99–104 (2012).
Bragalini, C. et al. Solution hybrid selection capture for the recovery of functional full-length eukaryotic cDNAs from complex environmental samples. DNA Res. 21, 685–694 (2014).
Giolai, M. et al. Comparative analysis of targeted long read sequencing approaches for characterization of a plant’s immune receptor repertoire. BMC Genomics 18, 564 (2017).
Karamitros, T. & Magiorkinis, G. A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits. Nucleic Acids Res. 43, e152 (2015).
Wang, M. et al. PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations. BMC Genomics 16, 12 (2015).
Witek, K. et al. Accelerated cloning of a potato late blight-resistance gene using RenSeq and SMRT sequencing. Nat. Biotechnol. 34, 656–660 (2016).
Giolai, M. et al. Targeted capture and sequencing of gene-sized DNA molecules. Biotechniques 61, 315–322 (2016).
Lagarde, J. et al. High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing. Nat. Genet. 49, 1731–1740 (2017).
Deveson, I. W. et al. Universal alternative splicing of noncoding exons. Cell Syst. 6, 245–255 (2018).
ORFeome Collaboration. The ORFeome Collaboration: a genome-scale human ORF-clone resource. Nat. Methods 13, 191–192 (2016).
Jiang, L. C. et al. Synthetic spike-in standards for RNA-seq experiments. Genome Res. 21, 1543–1551 (2011).
Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).
Clark, M. B. et al. Quantitative gene profiling of long noncoding RNAs with targeted RNA sequencing. Nat. Methods 12, 339–342 (2015).
Paul, L. et al. SIRVs: Spike-In RNA Variants as external isoform controls in RNA-sequencing. Preprint at https://www.biorxiv.org/content/10.1101/080747v1 (2016).
Rodriguez, J. M. et al. APPRIS: annotation of principal and alternative splice isoforms. Nucleic Acids Res. 41, D110–D117 (2013).
Kelemen, O. et al. Function of alternative splicing. Gene 514, 1–30 (2013).
Lopez, A. J. Developmental role of transcription factor isoforms generated by alternative splicing. Dev. Biol. 172, 396–411 (1995).
Renaux, A. & UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 46, 2699–2699 (2018).
Gordon, S. P. et al. Widespread polycistronic transcripts in fungi revealed by single-molecule mRNA sequencing. PLoS ONE 10, 15 (2015).
Tardaguila, M. et al. SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification. Genome Res. 28, 396–411 (2018).
Dougherty, M. L. et al. Transcriptional fates of human-specific segmental duplications in brain. Genome Res. 28, 1566–1576 (2018).
Salehi-Ashtiani, K. et al. Isoform discovery by targeted cloning, ‘deep-well’ pooling and parallel sequencing. Nat. Methods 5, 597–600 (2008).
Li, Y. I. et al. Annotation-free quantification of RNA splicing using LeafCutter. Nat. Genet. 50, 151–158 (2018).
Teng, M. et al. A benchmark for RNA-seq quantification pipelines. Genome Biol. 17, 74 (2016).
Singh, M. et al. High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes. Nat. Commun. 10, 3120 (2019).
Noonan, J. P. et al. Sequencing and analysis of Neanderthal genomic DNA. Science 314, 1113–1118 (2006).
Maricic, T., Whitten, M. & Pääbo, S. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS ONE 5, e14004 (2010).
Tsangaras, K. et al. Hybridization capture using short PCR products enriches small genomes by capturing flanking sequences (CapFlank). PLoS ONE 9, e109101 (2014).
Portal, M. M., Pavet, V., Erb, C. & Gronemeyer, H. TARDIS, a targeted RNA directional sequencing method for rare RNA discovery. Nat. Protoc. 10, 1915–1938 (2015).
Alvarado, D. M., Yang, P., Druley, T. E., Lovett, M. & Gurnett, C. A. Multiplexed direct genomic selection (MDiGS): a pooled BAC capture approach for highly accurate CNV and SNP/INDEL detection. Nucleic Acids Res. 42, e82 (2014).
Bashiardes, S. et al. Direct genomic selection. Nat. Methods 2, 63–69 (2005).
Byron, S. A., Van Keuren-Jensen, K. R., Engelthaler, D. M., Carpten, J. D. & Craig, D. W. Translating RNA sequencing into clinical diagnostics: opportunities and challenges. Nat. Rev. Genet. 17, 257–271 (2016).
Rual, J. F. et al. Human ORFeome version 1.1: a platform for reverse proteomics. Genome Res. 14, 2128–2135 (2004).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
Li, H. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32, 2103–2110 (2016).
Forrest, A. R. R. et al. A promoter-level mammalian expression atlas. Nature 507, 462–470 (2014).
Haeussler, M. et al. The UCSC Genome Browser database: 2019 update. Nucleic Acids Res. 47, D853–D858 (2019).
Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
Lopez, F., Granjeaud, S., Ara, T., Ghattas, B. & Gautheret, D. The disparate nature of “intergenic” polyadenylation sites. RNA 12, 1794–1801 (2006).