Linear trial spaces



To solve a density estimation problem numerically, the function $\phi$ has to be discretized. This is done by expanding $\phi$ in a basis $B_l$ (not necessarily orthonormal) and, for some chosen $l_{\rm max}$, truncating the sum to terms with $l\le l_{\rm max}$,

$\begin{displaymath} \phi = \sum_{l=1}^\infty c_l B_l \rightarrow \phi = \sum_{l=1}^{l_{\rm max}} c_l B_l . \end{displaymath}$

(380)

This procedure, also called Ritz's method, corresponds to a finite linear trial space and is equivalent to solving a projected stationarity equation. Using the discretization (380), the functional (187) becomes

$\begin{displaymath} E_{\rm Ritz} = -(\,\ln P(\phi),\,N\,) +\frac{1}{2}\sum_{kl} c_k c_l (\,B_k,\,{{\bf K}}\,B_l\,) + (\,P(\phi),\,\Lambda_X\,). \end{displaymath}$

(381)
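The truncated expansion (380) and the quadratic term of Eq. (381) can be illustrated numerically. The following is only a minimal sketch under assumptions that are not part of the text: a one-dimensional grid on $(0,1)$, a sine basis $B_l(x)=\sqrt{2}\sin(l\pi x)$, a finite-difference negative Laplacian standing in for ${\bf K}$, and arbitrary example coefficients $c_l$. It checks that $\frac{1}{2}\sum_{kl} c_k c_l (\,B_k,\,{\bf K}\,B_l\,)$ agrees with $\frac{1}{2}(\,\phi,\,{\bf K}\,\phi\,)$ evaluated directly on the grid.

```python
import numpy as np

# Assumed setup (not from the text): 1D grid on (0, 1), sine basis,
# and K = negative Laplacian discretized by finite differences.
n, l_max = 200, 8
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)            # interior grid points

# Columns of B hold the basis functions B_l(x) = sqrt(2) sin(l pi x), l = 1..l_max.
ls = np.arange(1, l_max + 1)
B = np.sqrt(2.0) * np.sin(np.pi * np.outer(x, ls))    # shape (n, l_max)

# Finite-difference negative Laplacian with zero boundary values.
K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2


def inner(f, g):
    """Grid approximation of the inner product (f, g) = integral of f*g dx."""
    return h * (f @ g)


# Truncated expansion, Eq. (380): phi = sum_l c_l B_l.
c = 1.0 / ls**2                           # arbitrary example coefficients
phi = B @ c

# Matrix of inner products (B_k, K B_l) and the quadratic term of Eq. (381).
K_B = h * (B.T @ K @ B)
quad_ritz = 0.5 * c @ K_B @ c             # evaluated in coefficient space
quad_grid = 0.5 * inner(phi, K @ phi)     # evaluated directly on the grid
```

Both numbers agree up to rounding because $\phi$ is linear in the coefficients; only the $l_{\rm max}\times l_{\rm max}$ matrix `K_B` is needed once the basis is fixed.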

Minimizing the error with respect to the coefficients $c_l$, $l\le l_{\rm max}$, yields, according to Eq. (355) and

$\begin{displaymath} \Phi^\prime (l;x,y) = B_l(x,y), \end{displaymath}$

(382)

$\begin{displaymath} 0 = (\, B_l ,\, {\bf P}^\prime {\bf P}^{-1}\,N\,) - \sum_k c_k (\,B_l,\, {{\bf K}}\,B_k\,) - (\, B_l,\, {\bf P}^\prime \, \Lambda_X\,) , \forall l\le l_{\rm max}, \end{displaymath}$

(383)

corresponding to the $l_{\rm max}$-dimensional equation

$\begin{displaymath} {{\bf K}}_B c = N_B (c) - \Lambda_B (c), \end{displaymath}$

(384)

with

$\begin{displaymath} c(l) = c_l, \end{displaymath}$

(385)

$\begin{displaymath} {{\bf K}}_B (l,k) = (\,B_l,\, {{\bf K}}\,B_k\,), \end{displaymath}$

(386)

$\begin{displaymath} N_B (c)(l) = (\,B_l,\,{\bf P}^\prime(\phi(c))\,{\bf P}^{-1}(\phi(c))\,N\,), \end{displaymath}$

(387)

$\begin{displaymath} \Lambda_B (c)(l) = (\, B_l,\, {\bf P}^\prime(\phi (c)) \,\Lambda_X\,). \end{displaymath}$

(388)

Thus, for an orthonormal basis $B_l$, Eq. (384) corresponds to Eq. (189) projected into the trial space by the projector $\sum_l B_l\,B_l^T$.
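The role of the projector $\sum_l B_l\,B_l^T$ can be checked in a finite-dimensional setting. The sketch below is an illustration under assumptions not made in the text: functions are replaced by vectors of length `n`, the inner product is the plain dot product, the orthonormal basis is generated by a QR decomposition of a random matrix, and a random symmetric positive definite matrix stands in for ${\bf K}$. It shows that the projector is idempotent and that projecting ${\bf K}\phi$ into the trial space gives the same vector as ${\bf K}_B c$ expanded in the basis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, l_max = 50, 5

# Orthonormal basis vectors as columns of B (assumed: QR of a random matrix).
B, _ = np.linalg.qr(rng.standard_normal((n, l_max)))

# Symmetric positive definite stand-in for the operator K (an assumption).
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)

P_proj = B @ B.T                 # projector sum_l B_l B_l^T
K_B = B.T @ K @ B                # K_B(l, k) = (B_l, K B_k), Eq. (386)

c = rng.standard_normal(l_max)   # arbitrary coefficients
phi = B @ c                      # element of the trial space

# K phi projected into the trial space versus K_B c expanded in the basis.
lhs_full = P_proj @ (K @ phi)
lhs_ritz = B @ (K_B @ c)
```

For a basis that is not orthonormal, `B @ B.T` is no longer a projector, which is why the statement above is restricted to the orthonormal case.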

The so-called linear models are obtained by the (very restrictive) choice

$\begin{displaymath} \phi(z) = \sum_{l=0}^{1} c_l B_l = c_0 + \sum_l c_l z_l \end{displaymath}$

(389)

with $z = (x,y)$, $B_0 = 1$ and $B_l = z_l$. Interactions, i.e., terms proportional to products of $z$-components like $c_{mn}z_mz_n$, can be included. Including all possible interactions would correspond to a multidimensional Taylor expansion of the function $\phi (z)$.
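As an illustration of the linear model of Eq. (389) and of second-order interaction terms, the following sketch evaluates $\phi(z)$ from a coefficient vector. The function names `basis_features` and `phi`, the ordering of the interaction terms ($m \le n$), and the example numbers are hypothetical choices made here, not taken from the text; the basis functions $1$ and $z_l$ are read off from the right-hand side of Eq. (389).

```python
import numpy as np
from itertools import combinations_with_replacement


def basis_features(z, interactions=False):
    """Evaluate the constant 1, the components z_l and, optionally,
    the products z_m * z_n with m <= n (hypothetical ordering)."""
    z = np.asarray(z, dtype=float)
    feats = [1.0, *z]
    if interactions:
        feats += [z[m] * z[n]
                  for m, n in combinations_with_replacement(range(len(z)), 2)]
    return np.array(feats)


def phi(z, c, interactions=False):
    """phi(z) = sum_l c_l B_l(z) for the linear or second-order model."""
    return float(np.dot(c, basis_features(z, interactions)))


z = [2.0, 3.0]
c_lin = np.array([0.5, 1.0, -1.0])                   # c_0, c_1, c_2
c_int = np.array([0.5, 1.0, -1.0, 0.1, 0.2, 0.3])    # adds c_11, c_12, c_22

print(phi(z, c_lin))                      # 0.5 + 2.0 - 3.0
print(phi(z, c_int, interactions=True))   # adds 0.1*4 + 0.2*6 + 0.3*9
```

The number of second-order terms grows quadratically with the dimension of $z$, and higher orders grow faster still, which is the practical cost of approaching a full multidimensional Taylor expansion.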

If the basis functions $B_l$ are themselves parameterized, this leads to mixture models for $\phi$ (see Section 4.4).


Joerg_Lemm 2001-01-21