# GraphSPME

**High dimensional precision matrix estimation with a known graphical structure**

- Works in very high dimensions
- Non-parametric
- Asymptotic regularization ?
- Lightning fast
- Both Python and R

*Note: Still work in progress. Waiting for Py*

## Installation

**R**: Install the development version from GitHub

`devtools::install_github("Blunde1/GraphSPME/GraphSPME")`

## Example code and documentation

Simulate a zero-mean AR1 process with a known graphical structure:

```
library(Matrix)
# Zero mean AR1
rar1 <- function(n, psi){
x <- numeric(n)
x[1] <- rnorm(1, 0, 1/sqrt(1-psi^2))
for(i in 2:n){
x[i] <- psi*x[i-1] + rnorm(1)
}
return(x)
}
# Simulate data
set.seed(123)
p <- 5
psi <- 0.8
n <- 200
x <- t(replicate(n, rar1(p,psi)))
Z <- bandSparse(p, p, (-1):1,
list(rep(1,p-1),
rep(1,p),
rep(1,p-1)))
```

The graphical structure of the data is contained in `Z`

, which shows

the non-zero elements of the precision matrix.

Such information is typically known in real-world problems.

```
Z
# 5 x 5 sparse Matrix of class "dgCMatrix"
#
# [1,] 1 1 . . .
# [2,] 1 1 1 . .
# [3,] . 1 1 1 .
# [4,] . . 1 1 1
# [5,] . . . 1 1
```

The exact dependence-structure is however typically unknown.

GraphSPME therefore estimates a non-parametric estimate of the precision matrix

using the `prec_sparse()`

function.

```
library(GraphSPME)
prec_est <- prec_sparse(x, Z)
prec_est
# 5 x 5 sparse Matrix of class "dgCMatrix"
#
# [1,] 0.94 -0.85 . . .
# [2,] -0.83 1.73 -0.78 . .
# [3,] . -0.86 1.58 -0.73 .
# [4,] . . -0.70 1.54 -0.78
# [5,] . . . -0.76 1.02
```

Note that GraphSPME allows working in very high dimensions:

```
frobenius_norm <- function(M) sum(M^2)
set.seed(123)
p <- 10000
x <- t(replicate(n, rar1(p,psi)))
Z <- bandSparse(p, p, (-1):1,
list(rep(1,p-1),
rep(1,p),
rep(1,p-1)))
dim(Z)
# [1] 10000 10000
start.time <- Sys.time()
prec_est <- prec_sparse(x, Z)
end.time <- Sys.time()
time.taken <- end.time - start.time
time.taken
# Time difference of 0.9 secs
# We can compare with the population precision
prec_pop <- bandSparse(p, p, (-1):1,
list(rep(-psi,p-1),
c(1, rep(1+psi^2,p-2), 1),
rep(-psi,p-1)))
frobenius_norm(prec_pop-prec_est)
# [1] 533
```

## Dependencies

GraphSPME is built on the linear algebra library Eigen

and utilizes Rcpp for bindings to R.

Bindings to Python are done via PyBind.

In particular the sparse matrix class in Eigen is extensively utilized to obtain efficient and scalable result.

## Main idea

GraphSPME combines the inversion scheme in Le (2021) with the regularized

covariance estimate of Touloumis (2015) as described in .

The package leverages Eigen to obtain scalable result by numerically taking advantage

of the graphical nature of the problem.

- Le (2021) introduces a sparse precision matrix estimate from the ml-estimate covariance matrix.

The method utilizes the knowledge of a graphical structure beneath the realized data. - Touloumis (2015) finds asymptotic closed form results for schrinkage of the frequentist covariance estimate.