Abstract:
[en] Background: RNA velocity is a recent theoretical model that aims to predict the short-term future of a cell, in terms of its transcriptome, from single-cell RNA sequencing data. In addition, the production of single-cell data has increased drastically in recent years.
Objectives: Using single-cell RNA sequencing datasets produced to study development, the objective is to build a computational model that recapitulates differentiation trajectories using the concepts of RNA velocity and Markov chains.
Methods: The dataset comes from a study of retinal development (Georges et al., 2020) and contains murine retinal cells collected at four stages of development. By associating the transcriptomic profile of each cell with a state, the long-term evolution of the cells can be determined using Markov chains. Transition probabilities are defined from RNA velocities, which gives the predictions a biological basis. These velocities are calculated with the steady-state model (La Manno et al., 2018). Three models were developed to calculate the transition probabilities. Each takes into account the angle between the RNA velocity vector and the vector connecting the two states involved in the transition, as well as the distance between these two states.
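To illustrate the general idea, the following minimal Python sketch builds a Markov transition matrix from velocity vectors and then propagates a distribution over cells. It is not one of the three models developed in this work: the exponential weighting, the parameters `kappa` and `length_scale`, and the function names `transition_matrix` and `propagate` are assumptions chosen for illustration only.

```python
import numpy as np

def transition_matrix(X, V, kappa=5.0, length_scale=1.0):
    """Illustrative transition matrix (hypothetical weighting, not the thesis models).

    X: (n, d) array, one state (e.g. reduced transcriptomic profile) per cell.
    V: (n, d) array, RNA velocity vector of each cell in the same space.
    """
    n = X.shape[0]
    D = X[None, :, :] - X[:, None, :]        # D[i, j] = x_j - x_i
    dist = np.linalg.norm(D, axis=2)         # distance between states i and j
    vnorm = np.linalg.norm(V, axis=1)        # velocity magnitude of each cell
    denom = dist * vnorm[:, None]
    # Cosine of the angle between velocity v_i and displacement x_j - x_i.
    # Left at 0 where it is undefined (zero velocity or identical states).
    cos = np.zeros((n, n))
    np.divide(np.einsum("ijk,ik->ij", D, V), denom, out=cos, where=denom > 0)
    # Assumed weighting: favor small angles and short distances.
    W = np.exp(kappa * cos) * np.exp(-dist / length_scale)
    np.fill_diagonal(W, 0.0)                 # no self-transitions in this sketch
    return W / W.sum(axis=1, keepdims=True)  # each row sums to 1

def propagate(p0, P, steps):
    """Distribution over states after `steps` transitions of the Markov chain."""
    return p0 @ np.linalg.matrix_power(P, steps)

if __name__ == "__main__":
    # Toy data: three cells on a line, all with velocity pointing to the right.
    X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
    V = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
    P = transition_matrix(X, V)
    print(P)
    print(propagate(np.array([1.0, 0.0, 0.0]), P, steps=10))
```

In this toy example, a cell is more likely to move to a state that lies in the direction of its velocity and close by than to one that lies behind it or further away, which is the qualitative behavior the angle and distance terms are meant to capture.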
Results: None of the three models was able to completely recapitulate the process of retinal development. This is partly due to the inability of photoreceptor precursors to differentiate. However, the results do not depend only on the model used. Other factors may be responsible for the problems encountered, such as an insufficient number of cells in the dataset, biases in the calculation of RNA velocity, the fact that cell death is not accounted for in our models, incorrect gene filtering, poor capture of the transcriptome with the 10X method, and the difficulty of determining whether an RNA molecule is spliced.
Conclusion: To obtain more biologically consistent results, the models must be optimized and the external factors mentioned above must be taken into account. Once this is done, the early genes responsible for the distinct differentiation pathways could be identified by using principal curves to analyze the regions where the main trajectories split into several different trajectories.