Article (Scientific journals)
Impact of U2-type introns on splice site prediction in A. thaliana species using deep learning.
Kabanga, Espoir; Jee, Seonil; Yun, Soeun et al.
2025 In BMC Bioinformatics, 26 (1), p. 288
Peer Reviewed verified by ORBi
 

Files


Full Text
Kabanga_et_al-2025-BMC_Bioinformatics.pdf
Publisher postprint (5.52 MB) Creative Commons License - Attribution, Non-Commercial, No Derivative
Details



Keywords :
Arabidopsis thaliana; CNN; Splice site prediction; U2-type introns; RNA Splice Sites; Genome, Plant; Computational Biology/methods; RNA Splicing; Introns; Deep Learning; Arabidopsis/genetics; Acceptor sites; Biological observations; Learning models; Plant genomes; Spatial complexity; Splice site; Thaliana; U2-type intron; Arabidopsis; Computational Biology; Structural Biology; Biochemistry; Molecular Biology; Computer Science Applications; Applied Mathematics
Abstract :
[en] BACKGROUND: Splice site prediction in plant genomes poses substantial challenges that can be addressed using deep learning models. U2-type introns are especially useful for such studies given their ubiquity in plant genomes and the availability of rich datasets. We formulated two hypotheses: one proposing that short introns may enhance prediction effectiveness due to reduced spatial complexity, and another suggesting that sequences with multiple introns provide a richer context for splicing events. RESULTS: Our findings demonstrate that (1) models trained on datasets containing shorter introns achieve improved effectiveness for acceptor splice sites, but not for donor splice sites, indicating a more nuanced relationship between intron length and splice site prediction than initially hypothesized, and (2) models trained on datasets with multiple introns per sequence show higher effectiveness compared to those trained on datasets with a single intron per sequence. Notably, among the 402 bp sequences analyzed, 72% contained single introns while 28% contained multiple introns for donor sites (36,399 versus 13,987 sequences), with similar proportions observed for acceptor sites (37,236 versus 14,112 sequences). These computational insights align with biological observations, particularly regarding the conserved spatial relationship between branch points and acceptor splice sites, as well as the synergistic effects of multiple introns on splicing efficiency. CONCLUSIONS: The obtained results contribute to a deeper understanding of how intronic features influence splice site prediction and suggest that future prediction models should consider factors such as intron length, multiplicity, and the spatial arrangement of splice-related signals.
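The single-intron versus multiple-intron proportions quoted in the abstract can be checked directly from the sequence counts it reports. The following is only a minimal sketch of that arithmetic, not code from the paper; the names `COUNTS` and `intron_shares` are hypothetical and introduced here for illustration.

```python
# Sequence counts quoted in the abstract for the 402 bp sequences.
# COUNTS and intron_shares are illustrative names, not from the paper.
COUNTS = {
    "donor": {"single": 36_399, "multiple": 13_987},
    "acceptor": {"single": 37_236, "multiple": 14_112},
}


def intron_shares(single, multiple):
    """Return (single %, multiple %) of the total, rounded to one decimal."""
    total = single + multiple
    return round(100 * single / total, 1), round(100 * multiple / total, 1)


for site, counts in COUNTS.items():
    single_pct, multiple_pct = intron_shares(counts["single"], counts["multiple"])
    print(f"{site}: {single_pct}% single intron, {multiple_pct}% multiple introns")
```

Both splice site types come out at roughly 72% single-intron and 28% multiple-intron sequences, consistent with the rounded figures in the abstract.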
Disciplines :
Computer science
Biotechnology
Biochemistry, biophysics & molecular biology
Author, co-author :
Kabanga, Espoir;  Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, 21985, Republic of Korea. espoir.kabanga@ghent.ac.kr ; IDLab, Department of Electronics and Information Systems, Ghent University, Ghent, 9000, Belgium. espoir.kabanga@ghent.ac.kr
Jee, Seonil;  Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, 21985, Republic of Korea
Yun, Soeun;  Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, 21985, Republic of Korea
Depuydt, Stephen;  Department of Health Care, HOGENT University of Applied Sciences and Arts, Ghent, 9000, Belgium
Van Messem, Arnout;  Université de Liège - ULiège > Mathematics
De Neve, Wesley;  Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, 21985, Republic of Korea ; IDLab, Department of Electronics and Information Systems, Ghent University, Ghent, 9000, Belgium
Language :
English
Title :
Impact of U2-type introns on splice site prediction in A. thaliana species using deep learning.
Publication date :
28 November 2025
Journal title :
BMC Bioinformatics
eISSN :
1471-2105
Publisher :
BioMed Central Ltd, England
Volume :
26
Issue :
1
Pages :
288
Peer reviewed :
Peer Reviewed verified by ORBi
Available on ORBi :
since 30 January 2026
