Flattening and Folding With Covariance Matrices.

[1]:
import numpy as np
import paragami

In this example, we will consider flattening and folding a simple symmetric positive semi-definite matrix:

\[\begin{split}A = \left[ \begin{matrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ \end{matrix} \right].\end{split}\]

Of course, symmetry and positive semi-definiteness impose constraints on the entries \(a_{ij}\) of \(A\).

Flattening and Folding.

In the Original Space.

Let us first consider how to represent \(A\) as a vector, which we call simply flattening, and then as an unconstrained vector, which we call free flattening.

When a parameter is flattened, it is simply re-shaped as a vector. Every number that was in the original parameter will occur exactly once in the flattened shape. (In the present case of a matrix, this is exactly the same as np.flatten.)

\[\begin{split}A = \left[ \begin{matrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ \end{matrix} \right] \xrightarrow{flatten} A_{flat} = \left[ \begin{matrix} a_{flat,1} \\ a_{flat,2} \\ a_{flat,3} \\ a_{flat,4} \\ a_{flat,5} \\ a_{flat,6} \\ a_{flat,7} \\ a_{flat,8} \\ a_{flat,9} \\ \end{matrix}\right] = \left[ \begin{matrix} a_{11} \\ a_{12} \\ a_{13} \\ a_{21} \\ a_{22} \\ a_{23} \\ a_{31} \\ a_{32} \\ a_{33} \\ \end{matrix} \right]\end{split}\]

Converting to and from \(A\) and \(A_{flat}\) can be done with the flatten method of a paragami.PSDSymmetricMatrixPattern pattern.

For the moment, because we are flattening, not free flattening, we use the option free=False. We will discuss the free=True option shortly.

[2]:
# A sample positive semi-definite matrix.
a = np.eye(3) + np.random.random((3, 3))
a = 0.5 * (a + a.T)

# Define a pattern and fold.
a_pattern = paragami.PSDSymmetricMatrixPattern(size=3)
a_flat = a_pattern.flatten(a, free=False)

print('Now, a_flat contains the elements of a exactly as shown in the formula above.\n')
print('a:\n{}\n'.format(a))
print('a_flat:\n{}\n'.format(a_flat))
Now, a_flat contains the elements of a exactly as shown in the formula above.

a:
[[1.2712968  0.74536048 0.33203184]
 [0.74536048 1.91869072 0.4602062 ]
 [0.33203184 0.4602062  1.58338   ]]

a_flat:
[1.2712968  0.74536048 0.33203184 0.74536048 1.91869072 0.4602062
 0.33203184 0.4602062  1.58338   ]

We can also convert from \(A_{flat}\) back to \(A\) by ‘folding’.

[3]:
print('Folding the flattened value recovers the original matrix.\n')
a_fold = a_pattern.fold(a_flat, free=False)
print('a:\n{}\n'.format(a))
print('a_fold:\n{}\n'.format(a_fold))
Folding the flattened value recovers the original matrix.

a:
[[1.2712968  0.74536048 0.33203184]
 [0.74536048 1.91869072 0.4602062 ]
 [0.33203184 0.4602062  1.58338   ]]

a_fold:
[[1.2712968  0.74536048 0.33203184]
 [0.74536048 1.91869072 0.4602062 ]
 [0.33203184 0.4602062  1.58338   ]]

By default, flattening and folding perform checks to make sure the result is a valid instance of the parameter type – in this case, a symmetric positive definite matrix.

The diagonal of a positive semi-definite matrix must not be less than 0, and folding checks this when validate=True, which it is by default.

[4]:
a_flat_bad = np.array([-1, 0, 0,  0, 0, 0,  0, 0, 0])
print('A bad folded value: {}'.format(a_flat_bad))
try:
    a_fold_bad = a_pattern.fold(a_flat_bad, free=False)
except ValueError as err:
    print('Folding with a_pattern raised the following ValueError:\n{}'.format(err))
A bad folded value: [-1  0  0  0  0  0  0  0  0]
Folding with a_pattern raised the following ValueError:
Diagonal is less than the lower bound 0.0.

If validate_value is False, folding will produce an invalid matrix without an error.

[5]:
a_fold_bad = a_pattern.fold(a_flat_bad, free=False, validate_value=False)
print('Folding a non-pd matrix with validate=False:\n{}'.format(a_fold_bad))
Folding a non-pd matrix with validate=False:
[[-1  0  0]
 [ 0  0  0]
 [ 0  0  0]]

However, it will not produce a matrix of the wrong shape even when validate is False.

[6]:
a_flat_very_bad = np.array([1, 0, 0])
print('A very bad folded value: {}.'.format(a_flat_very_bad))
try:
    a_fold_very_bad = a_pattern.fold(a_flat_very_bad, free=False, validate_value=False)
except ValueError as err:
    print('Folding with a_pattern raised the following ValueError:\n{}'.format(err))
A very bad folded value: [1 0 0].
Folding with a_pattern raised the following ValueError:
Wrong length for PSDSymmetricMatrix flat value.

You can always check validity of a folded value with the validate_folded method of a pattern, which returns a boolean and an error message.

[7]:
valid, msg = a_pattern.validate_folded(a_fold)
print('Valid: {}.\tMessage: {}'.format(valid, msg))

valid, msg = a_pattern.validate_folded(a_fold - 10 * np.eye(3))
print('Valid: {}.\tMessage: {}'.format(valid, msg))
Valid: True.    Message:
Valid: False.   Message: Diagonal is less than the lower bound 0.0.

In an Unconstrained Space: “Free” Flattening and Folding.

Ordinary flattening converts a 3x3 symmetric PSD matrix into a 9-d vector. However, as seen above, not every 9-d vector is a valid 3x3 symmetric positive definite matrix. It is useful to have an “free” flattened representation of a parameter, where every finite value of the free flattened vector corresponds is guaranteed valid.

To accomplish this for a symmetric positive definite matrix, we consider the Cholesky decomposition \(A_{chol}\). This is an lower-triangular matrix with positive diagonal entries such that \(A = A_{chol} A_{chol}^T\). By taking the log of the diagonal of \(A_{chol}\) and stacking the non-zero entries, we can construct a 6-d vector, every value of which corresponds to a symmetric PSD matrix.

\[\begin{split}% A \xrightarrow{\textrm{free flatten}} A_{freeflat} \quad\quad \textrm{where} \\ A \xrightarrow{} A_{chol} = \left[ \begin{matrix} \alpha_{11} & 0 & 0 \\ \alpha_{21} & \alpha_{22} & 0 \\ \alpha_{31} & \alpha_{32} & \alpha_{33} \\ \end{matrix} \right] \xrightarrow{} A_{freeflat} = \left[ \begin{matrix} \log(\alpha_{11}) \\ \alpha_{21} \\ \alpha_{31} \\ \log(\alpha_{22})\\ \alpha_{32} \\ \log(\alpha_{33}) \end{matrix} \right].\end{split}\]

The details of the freeing transform aren’t important to the end user, as paragami takes care of the transformation behind the scenes with the option free=True. We denote the flattened \(A\) in the free parameterization as \(A_{freeflat}\).

The free flat value a_freeflat is not immediately recognizable as a.

[8]:
a_freeflat = a_pattern.flatten(a, free=True)
print('a:\n{}\n'.format(a))
print('a_freeflat:\n{}\n'.format(a_freeflat))
a:
[[1.2712968  0.74536048 0.33203184]
 [0.74536048 1.91869072 0.4602062 ]
 [0.33203184 0.4602062  1.58338   ]]

a_freeflat:
[0.12001874 0.66106306 0.19659043 0.2944803  0.21814513 0.18546238]

However, it transforms correctly back to a when folded.

[9]:
a_freefold = a_pattern.fold(a_freeflat, free=True)
print('a:\n{}\n'.format(a))
print('a_fold:\n{}\n'.format(a_freefold))
a:
[[1.2712968  0.74536048 0.33203184]
 [0.74536048 1.91869072 0.4602062 ]
 [0.33203184 0.4602062  1.58338   ]]

a_fold:
[[1.2712968  0.74536048 0.33203184]
 [0.74536048 1.91869072 0.4602062 ]
 [0.33203184 0.4602062  1.58338   ]]

Any length-six vector will free fold back to a valid PSD matrix up to floating point error. Let’s draw 100 random vectors, fold them, and check that this is true.

[10]:
# Draw random free vectors and confirm that they are positive semi definite.
def assert_is_pd(mat):
    eigvals = np.linalg.eigvals(mat)
    assert np.min(eigvals) >= -1e-8

for draw in range(100):
    a_rand_freeflat = np.random.normal(scale=2, size=(6, ))
    a_rand_fold = a_pattern.fold(a_rand_freeflat, free=True)
    assert_is_pd(a_rand_fold)