The developmental and epileptic encephalopathies (DEE) are a group of rare, severe neurodevelopmental disorders in which even the most thorough sequencing studies leave 60–65% of patients without a molecular diagnosis. Here, we explore the incompleteness of the transcript models used for exome and genome analysis as one potential explanation for this diagnostic gap. To that end, we updated the GENCODE gene annotation for 191 epilepsy-associated genes, using human brain-derived transcriptomic libraries and other data to build 3,550 putative transcript models. Our annotations increase the transcriptional 'footprint' of these genes by over 674 kb. Using SCN1A as a case study because of its close phenotype/genotype correlation with Dravet syndrome, we screened 122 people with Dravet syndrome or a similar phenotype using a panel of exon sequences representing eight established genes, and identified two de novo SCN1A variants that, through improved gene annotation, now fall within exons of our updated annotation. These two molecular diagnoses (2 of 122 screened people, 1.6%) carry significant clinical implications. Furthermore, we identified a Dravet syndrome-associated SCN1A variant, previously classified as intronic, that now lies within a deeply conserved exon. Our findings illustrate the potential gains of thorough gene annotation in improving diagnostic yields for genetic disorders.
Disciplines:
Neurology; Pediatrics
Author, co-author:
Steward, Charles A.
Roovers, Jolien
Suner, Marie-Marthe
Gonzalez, Jose M.
Uszczynska-Ratajczak, Barbara
Pervouchine, Dmitri
Fitzgerald, Stephen
Viola, Margarida
Stamberger, Hannah
Hamdan, Fadi F.
Ceulemans, Berten
LEROY, Patricia ; Centre Hospitalier Universitaire de Liège - CHU > Département de Pédiatrie > Service de pédiatrie