Multiple-response regression analysis links magnetic resonance imaging features to de-regulated protein expression and pathway activity in lower grade glioma

Background and Purpose Lower grade gliomas (LGGs), lesions of WHO grades II and III, comprise 10-15% of primary brain tumors. In this first-of-a-kind study, we aim to carry out a radioproteomic characterization of LGGs using proteomics data from the TCGA and imaging data from the TCIA cohorts, to obtain an association between tumor MRI characteristics and protein measurements. The availability of linked imaging and molecular data permits the assessment of relationships between tumor genomic/proteomic measurements with phenotypic features. Materials and Methods Multiple-response regression of the image-derived, radiologist scored features with reverse-phase protein array (RPPA) expression levels generated correlation coefficients for each combination of image-feature and protein or phospho-protein in the RPPA dataset. Significantly-associated proteins for VASARI features were analyzed with Ingenuity Pathway Analysis software. Hierarchical clustering of the results of the pathway analysis was used to determine which feature groups were most strongly correlated with pathway activity and cellular functions. Results The multiple-response regression approach identified multiple proteins associated with each VASARI imaging feature. VASARI features were found to be correlated with expression of IL8, PTEN, PI3K/Akt, Neuregulin, ERK/MAPK, p70S6K and EGF signaling pathways. Conclusion Radioproteomics analysis might enable an insight into the phenotypic consequences of molecular aberrations in LGGs.


Model formulation
Consider a regression model of the following form: Y = XB + E, where Y is an n × q matrix of responses, X is an n × p matrix of predictors, E is an n × q matrix of regression errors and B is a p × q matrix of regression coefficients. Using the notation of Dawid (1981), we further assume E follows the matrix normal distribution MN n×q (0 n×q , I n , Σ G ), where 0 n×q is an n × q matrix of zeros, Σ G is the q × q covariance matrix of q possibly correlated responses and I n is an identity matrix of size n. We assume a separable covariance structure of E along the rows and columns and the matrix normal formulation gives Vec(E) ∼ N nq (0 nq , I n ⊗ Σ G ), a multivariate normal, with ⊗ denoting the Kronecker product. Specifically, our assumption is that the n samples are independent, but within each sample, the q responses share a common covariance structure encoded by Σ G . Conditional independence is modeled through an underlying (undirected) graph G = (V, E), where V corresponds to response variables Y 1 ,…, Y q , with the implication that {u, v} ∉ E ⇔ Σ G −1 (u, v) = 0, implying conditional independence of u and v given the rest, where u, v ∈ V. Clearly, when p and q are much larger than n, the model is not identifiable. Thus, following Bhadra and Mallick (30) and Feldman et al. (31), we now consider a sparse formulation: Let X γ be an n × p γ is a matrix of relevant predictors encoded by the vector γ = (γ 1 , … , γ p ) ∈ {0, 1} p with γ i = 1 if i th predictor is present in the model and γ i = 0 otherwise. Thus, p i p i γ γ = = ∑ 1 †is the number of active covariates. We form X γ by dropping (p − p γ ) columns corresponding to inactive predictors from the n × p matrix X; correspondingly, B γ is now a p γ × q matrix of regression coefficients for the selected features. E is distributed according to MN n×q (0 n×q , I n , Σ G ), as before. We consider the following hierarchical Bayesian model: In Equation (2), we restrict the set of permitted graphs to the set of all decomposable graphs with nodes V, and define a prior distribution with that support as: The hyper-inverse Wishart (HIW) prior is conjugate for the covariance matrix in a decomposable Gaussian graphical model (Dawid and Lauritzen, 1993). Here b, c, d are fixed, positive hyper-parameters. A symmetric matrix of parameters W = (w uv ) u, v∈V and w γ are fixed prior probabilities, presumably close to zero, that control the sparsity in G and γ respectively. Thus, the model specifies that the priors on Σ G and B γ are conjugate in a graphical setting, which allows analytic marginalization of these parameters. Table S1 gives a summary of the variables used.
A collapsed gibbs sampler Bhadra and Mallick (2013) demonstrated that one of the main advantages of the model above is that it allows a collapsed Gibbs sampler for the dependence structure G and a subset of features γ after analytically integrating out nuisance terms B γ and Σ G . The marginal data distribution summarizes the contributions of X and Y to the model. The hierarchical model collapses to: If the graphs G are decomposable, the distribution of T γ | γ, G is hyper-matrix t (33), a special type of t-distribution which, given the graph, splits into products and ratios over the cliques and separators of the graph. We recall that a decomposable graph G admits a (perfect) sequence of maximal cliques C 1 ,…, C k so that S j = (C 1 ∪ · · · ∪ C j−1 ) ∩ C j , j = 2,…,k (called separators) are complete subgraphs of G (33). The density of the hyper-matrix-t distribution HMT n×q (b, I n , dI q ) at T γ = t is and t A is a n×|A| sub-matrix of t with columns corresponding to cliques A ⊆ V in G (Equation (45) in Dawid and Lauritzen, 1993). The densities on the separators are defined similarly. This collapsed Gibbs sampler alleviates the need to sample B γ and Σ G in MCMC and allows for crucial computational advantages for scaling to high dimensions and faster mixing.

MCMC algorithm
We outline the MCMC sampler algorithm of Bhadra and Mallick (2013) below and refer the interested reader to that article for details. We also follow their recommendation for the choice of hyper-parameters b, c and d.
3. Accept the candidate γ * with probability

Updating G given γ and T γ
Similar to γ; G is searched by random addition or deletion of off-diagonal edges. Using w uv ∼ Uniform(0, 1) and integrating out w uv gives where r G is half the number of edges in the symmetric graph G. Two additional constraints for searching G are: (a) the proposed candidate G* must be decomposable. If not, propose again (a rejection scheme) and (b) the proposed candidate G* must be symmetric, since it encodes an MRF (Markov Random Field).

Ingenuity pathway analysis
Significantly-associated proteins from the RPPA dataset correlated with each VASARI feature were queried using the Ingenuity Pathway Analysis software package (IPA™ QIAGEN, Redwood City, CA, http://www.qiagen. com/ingenuity). Correlation co-efficients computed from the high-dimensional regression were used as a surrogate for fold-change. IPA Core Analyses were run on each list of mapped identifiers for each VASARI feature. In the IPA software, p-values were computed by applying the righttailed Fisher's exact test based on the number of functions/ pathways/molecules in the annotation as defined by the molecules in the selected Reference set, the number of molecules in the Reference set known to be associated with that function, the number of functions/pathways/ molecules in the Reference set, and the number of molecules in the Reference set (34). www.impactjournals.com/oncoscience www.impactjournals.com/oncoscience