Article (Scientific journals)
Multivariate Surprisal Analysis of Gene Expression Levels
Remacle, Françoise; Goldstein, S. Andrew; Levine, D. Raphael
2017, in Entropy, 18 (12), p. 445
Peer reviewed (verified by ORBi)
 

Documents

Full text
entropy-18-00445.pdf
Publisher postprint (1.72 MB), Creative Commons Attribution license
Details



Keywords:
multivariate analysis; surprisal analysis; high-order SVD
Abstract:
We consider here multivariate data, by which we mean data where each data point i is measured for two or more distinct variables. In a typical situation there are many data points i, while the range of the different variables is more limited. If there is only one variable, the data can be arranged as a rectangular matrix where i is the index of the rows while the values of the variable label the columns. We begin here with this case, but then proceed to the more general case, with special emphasis on two variables, when the data can be organized as a tensor. An analysis of such multivariate data by a maximal entropy approach is discussed and illustrated for gene expression in four different cell types of six different patients. The different genes are indexed by i, and there are 24 (4 by 6) entries for each i. We used an unbiased thermodynamic maximal-entropy-based approach (surprisal analysis) to analyze the multivariate transcriptional profiles. The measured microarray experimental data is organized as a tensor array where the two minor orthogonal directions are the different patients and the different cell types. The entries are the transcription levels on a logarithmic scale. We identify a disease signature of prostate cancer and determine the degree of variability between individual patients. Surprisal analysis determined a baseline expression level common to all cells and patients. We identify the transcripts in the baseline as the “housekeeping” genes that ensure cell stability. The baseline and two surprisal patterns satisfactorily recover (99.8%) the multivariate data. The two patterns characterize the individuality of the patients and, to a lesser extent, the commonality of the disease. The immune response was identified as the most significant pathway contributing to the cancer disease pattern.
Delineating patient variability is a central issue in personalized diagnostics, and it remains to be seen whether additional data will confirm the power of multivariate analysis to address this key point. The collapsed limits, where the data is compacted into two-dimensional arrays, are contained within the proposed formalism.
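To make the tensor layout in the abstract concrete, the following is a minimal sketch in Python with NumPy of a generic higher-order SVD applied to a genes x cell types x patients array. It is an illustration under stated assumptions and not the authors' implementation: the data are synthetic, the gene count of 50 is arbitrary, and the names `unfold`, `hosvd`, `reconstruct`, and `log_expr` are hypothetical. Only the 4 by 6 layout and the logarithmic scale come from the abstract.

```python
import numpy as np


def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows run over axis `mode`."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def hosvd(tensor):
    """Generic higher-order SVD: one orthonormal factor per axis plus a core."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        # Project axis `mode` onto the columns of its factor matrix.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors


def reconstruct(core, factors, ranks=None):
    """Rebuild the tensor, keeping only the leading `ranks` components per axis."""
    if ranks is None:
        ranks = core.shape
    approx = core[tuple(slice(r) for r in ranks)]
    for mode, (u, r) in enumerate(zip(factors, ranks)):
        approx = np.moveaxis(np.tensordot(u[:, :r], approx, axes=(1, mode)), 0, mode)
    return approx


# Synthetic log-scale data (an assumption, not the paper's measurements):
# a per-gene baseline plus one pattern that varies with cell type and patient.
rng = np.random.default_rng(0)
n_genes, n_cell_types, n_patients = 50, 4, 6
baseline = rng.normal(8.0, 1.0, size=n_genes)
gene_pattern = rng.normal(0.0, 1.0, size=n_genes)
cell_weights = rng.normal(0.0, 1.0, size=n_cell_types)
patient_weights = rng.normal(0.0, 1.0, size=n_patients)
log_expr = (
    baseline[:, None, None]
    + gene_pattern[:, None, None] * cell_weights[None, :, None] * patient_weights[None, None, :]
    + rng.normal(0.0, 0.05, size=(n_genes, n_cell_types, n_patients))
)

core, factors = hosvd(log_expr)


def relative_error(ranks):
    """Relative Frobenius error of the rank-truncated reconstruction."""
    diff = log_expr - reconstruct(core, factors, ranks)
    return np.linalg.norm(diff) / np.linalg.norm(log_expr)


print("error with 1 component per axis:", relative_error((1, 1, 1)))
print("error with 2 components per axis:", relative_error((2, 2, 2)))

# Collapsed limit: the same data as a genes x 24 matrix with an ordinary SVD.
collapsed = log_expr.reshape(n_genes, n_cell_types * n_patients)
singular_values = np.linalg.svd(collapsed, compute_uv=False)
```

The last lines show the collapsed limit mentioned in the abstract: reshaping the tensor to a genes x 24 matrix gives exactly the mode-0 unfolding used inside `hosvd`, so the two-dimensional analysis appears here as a special case of the tensor one.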
Research center/unit:
Theoretical Physical Chemistry
Disciplines:
Life sciences: Multidisciplinary, general & others
Author, co-author:
Remacle, Françoise; Université de Liège > Département de chimie (sciences) > Laboratoire de chimie physique théorique
Goldstein, S. Andrew
Levine, D. Raphael
Document language:
English
Title:
Multivariate Surprisal Analysis of Gene Expression Levels
Publication date:
2017
Journal title:
Entropy
eISSN:
1099-4300
Publisher:
MDPI, Basel, Switzerland
Volume:
18
Issue:
12
Pages:
445
Peer reviewed:
Peer reviewed (verified by ORBi)
European project:
FP7 - 618024 - BAMBI - Bottom-up Approaches to Machines dedicated to Bayesian Inference
Funder:
EC - European Commission
European Union
Available on ORBi:
since 04 July 2017
