Improving the annotation of the cattle genome by annotating transcription start sites in a diverse set of tissues and populations using CAGE sequencing.
Understanding the genomic control of tissue-specific gene expression and regulation can help to inform the application of genomic technologies in farm animal breeding programs. The fine mapping of promoters (transcription start sites (TSS)) and enhancers (divergent amplifying segments of the genome local to TSS) in different populations of cattle across a wide diversity of tissues provides information to locate and understand the genomic drivers of breed- and tissue-specific characteristics. To this end, we used Cap Analysis Gene Expression (CAGE) sequencing of 24 different tissues from three populations of cattle to define TSS and their co-expressed short-range enhancers (<1 kb) in the ARS-UCD1.2_Btau5.0.1Y reference genome (1000bulls run9), and analysed the tissue and population specificity of expressed promoters. We identified 51,295 TSS and 2,328 TSS-Enhancer regions shared across the three populations (dairy, dairy-beef cross, and Canadian Kinsella composite cattle; two individuals, one of each sex, per population). Cross-species comparative analysis of CAGE data from seven other species, including sheep, revealed a set of TSS and TSS-Enhancers that were specific to cattle. The CAGE dataset will be combined with other transcriptomic information for the same tissues to create a new high-resolution map of transcript diversity across tissues and populations in cattle for the BovReg Project. Here we provide the CAGE dataset and annotation tracks for TSS and TSS-Enhancers in the cattle genome. This new annotation information will improve our understanding of the drivers of gene expression and regulation in cattle and help to inform the application of genomic technologies in breeding programs.
Disciplines :
Genetics & genetic processes
Author, co-author :
Salavati, Mazdak ; The Roslin Institute, University of Edinburgh, EH25 9RG, Edinburgh, UK
Clark, Richard; Genetics Core, Edinburgh Clinical Research Facility, University of Edinburgh, EH4 2XU, Edinburgh, UK
Becker, Doreen; Institute of Genome Biology, Research Institute for Farm Animal Biology (FBN), 18196, Dummerstorf, Germany
Kühn, Christa; Institute of Genome Biology, Research Institute for Farm Animal Biology (FBN), 18196, Dummerstorf, Germany ; Faculty of Agricultural and Environmental Sciences, University Rostock, 18059, Rostock, Germany
Plastow, Graham; Livestock Gentec, Department of Agricultural, Food and Nutritional Science, University of Alberta, T6G 2H1, Edmonton, Canada
Dupont, Sébastien ; Université de Liège - ULiège > Département de gestion vétérinaire des Ressources Animales (DRA)
Costa Monteiro Moreira, Gabriel ; Université de Liège - ULiège > Département de gestion vétérinaire des Ressources Animales (DRA) ; Université de Liège - ULiège > GIGA > GIGA Medical Genomics - Unit of Animal Genomics
Charlier, Carole ; Université de Liège - ULiège > GIGA > GIGA Medical Genomics
Clark, Emily Louise; The Roslin Institute, University of Edinburgh, EH25 9RG, Edinburgh, UK
Language :
English