
General parameterizations

Approximate solutions of the error minimization problem are obtained by restricting the search (trial) space for $h(x,y)$ = $\phi(x,y)$ (or $h(x)$ in regression). Functions $\phi $ which lie in the considered search space are called trial functions. Solving a minimization problem in such a restricted trial space is also called a variational approach [97,106,29,36,27]. Clearly, minimal values obtained by minimization within a trial space can only be larger than or equal to the true minimal value, and of two variational approximations the one with the smaller error is the better one.
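As a concrete illustration (the specific form is only an example, not required by the general formalism), a linear trial space may be spanned by a finite number of fixed basis functions $B_l$,
\begin{displaymath}
\phi (x,y;\,\xi) = \sum_{l=1}^{m} \xi_l \, B_l(x,y)
,
\end{displaymath}
while a typical nonlinear parameterization is, for instance, a neural network with weight vector $\xi$. Enlarging the trial space, e.g., by adding further basis functions, can only lower (or leave unchanged) the minimal error attainable within it.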

Alternatively, using parameterized functions $\phi $ can also be interpreted as implementing the a priori information that $\phi $ is known to have that specific parameterized form. (In cases where $\phi $ is only known to be approximately of a specific parameterized form, this should ideally be implemented using a prior with a parameterized template, the template parameters being treated as hyperparameters as in Section 5.) The following discussion holds for both interpretations.

Any parameterization $\phi $ = $\phi (\{\xi_l\})$ together with a range of allowed values for the parameter vector $\xi$ defines a possible trial space. Hence we consider the error functional

\begin{displaymath}
E_{\phi (\xi)} =
-(\,\ln P (\xi),\, N\,)
+\frac{1}{2} (\,\phi (\xi),\, {{\bf K}}\,\phi (\xi)\,)
+ (\,P( \xi ),\,\Lambda_X\,),
\end{displaymath} (352)

for $\phi $ depending on parameters $\xi$ and $P(\xi )$ = $P(\,\phi (\xi)\,)$. In the special case of Gaussian regression this reads
\begin{displaymath}
E_{h (\xi)} =
\frac{1}{2} (\,h (\xi)-t_D,\, {{\bf K}_D}\,(h (\xi)-t_D)\,)
+\frac{1}{2} (\,h (\xi),\, {{\bf K}}\,h (\xi)\,)
.
\end{displaymath} (353)
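For a linear parameterization $h(\xi)=B\xi$ the minimization of (353) reduces to a finite-dimensional linear problem. The following sketch is a toy discretization, not taken from the text: the basis matrix B (a few cosines), the operators K_D and K, and the data vector t_D are all assumed purely for illustration. It checks numerically that the minimal error within the trial space cannot fall below the unrestricted minimum.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 5                          # grid points, trial-space dimension

# Toy operators and data (assumed for illustration only)
x = np.linspace(0.0, 1.0, n)
K_D = np.eye(n)                       # data term operator
K = (np.diag(np.full(n, 2.0))         # Laplacian-like prior operator
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1))
t_D = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Linear trial space: h(xi) = B xi with a few cosine basis functions
B = np.stack([np.cos(np.pi * l * x) for l in range(m)], axis=1)

def E(h):
    """Error functional (353): 1/2 (h-t_D, K_D (h-t_D)) + 1/2 (h, K h)."""
    r = h - t_D
    return 0.5 * r @ K_D @ r + 0.5 * h @ K @ h

# Restricted (variational) minimum: stationarity in xi is a linear system
xi_star = np.linalg.solve(B.T @ (K_D + K) @ B, B.T @ K_D @ t_D)
h_restricted = B @ xi_star

# Unrestricted minimum of (353): (K_D + K) h = K_D t_D
h_full = np.linalg.solve(K_D + K, K_D @ t_D)

print(E(h_restricted), ">=", E(h_full))   # variational error >= true minimum
\end{verbatim}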

Defining the matrix
\begin{displaymath}
\Phi^\prime (l;x,y)
= \frac{\partial \phi (x,y)}{\partial \xi_l}
\end{displaymath} (354)

the stationarity equation for the functional (352) becomes
\begin{displaymath}
0 =
\Phi^\prime {\bf P}^\prime {\bf P}^{-1} N
- \Phi^\prime {{\bf K}}\phi
-\Phi^\prime {\bf P}^\prime \Lambda_X
.
\end{displaymath} (355)
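Written out, Eq. (355) is nothing but the chain rule applied to the functional (352): the derivative with respect to a parameter $\xi_l$ is the Jacobian (354) contracted with the functional derivative with respect to $\phi$,
\begin{displaymath}
0 = -\frac{\partial E_{\phi(\xi)}}{\partial \xi_l}
= \int \!dx\, dy\;
\frac{\partial \phi (x,y)}{\partial \xi_l}
\left(
{\bf P}^\prime {\bf P}^{-1} N
- {{\bf K}}\,\phi
- {\bf P}^\prime \Lambda_X
\right)\!(x,y)
,
\end{displaymath}
so that only the projection of the functional gradient onto the tangent space of the parameterized family has to vanish.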

Similarly, a parameterized functional $E_\phi$ with non-zero template $t$ as in (226) would give
\begin{displaymath}
0 =
\Phi^\prime {\bf P}^\prime {\bf P}^{-1} N
- \Phi^\prime {{\bf K}}\left( \phi - t\right)
- \Phi^\prime {\bf P}^\prime \Lambda_X
.
\end{displaymath} (356)

To have a convenient notation when solving for $\Lambda_X$ we introduce
\begin{displaymath}
{\bf P}^\prime_\xi
= \Phi^\prime (\xi) {\bf P}^\prime(\phi),
\end{displaymath} (357)

i.e.,
\begin{displaymath}
{\bf P}^\prime_\xi (l;x,y)
= \frac{\partial P(x,y)}{\partial \xi_l}
= \int \!dx^\prime dy^\prime \,
\frac{\partial \phi (x^\prime ,y^\prime )}{\partial \xi_l}\,
\frac{\delta P(x,y)}{\delta \phi (x^\prime ,y^\prime )}
,
\end{displaymath} (358)

and
\begin{displaymath}
G_{\phi(\xi)}
={\bf P}^\prime_\xi {\bf P}^{-1} N
- \Phi^\prime {{\bf K}}\phi
,
\end{displaymath} (359)

so that Eq. (355) takes the form
\begin{displaymath}
{\bf P}^\prime_\xi \Lambda_X
= G_{\phi(\xi)}
.
\end{displaymath} (360)

For a parameterization $\xi$ restricting the space of possible $P$, the matrix ${\bf P}^\prime_\xi$ is not square and cannot be inverted. Thus, let $({\bf P}^\prime_\xi)^{\char93 }$ denote the Moore-Penrose inverse of ${\bf P}^\prime_\xi$, i.e.,
\begin{displaymath}
({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi
({\bf P}^\prime_\xi)^{\char93 }
= ({\bf P}^\prime_\xi)^{\char93 }
,\qquad
{\bf P}^\prime_\xi
({\bf P}^\prime_\xi)^{\char93 }
{\bf P}^\prime_\xi
= {\bf P}^\prime_\xi
,
\end{displaymath} (361)

with $({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi$ and ${\bf P}^\prime_\xi ({\bf P}^\prime_\xi)^{\char93 }$ symmetric. A solution for $\Lambda_X$ exists if
\begin{displaymath}
{\bf P}^\prime_\xi ({\bf P}^\prime_\xi)^{\char93 }
G_{\phi(\xi)}
=G_{\phi(\xi)}
.
\end{displaymath} (362)

In that case, i.e., if $G_{\phi(\xi)}$ lies in the range of ${\bf P}^\prime_\xi$, the solution can be written
\begin{displaymath}
\Lambda_X =
({\bf P}^\prime_\xi)^{\char93 } G_{\phi(\xi)}
+ V_\Lambda
- ({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi V_\Lambda
,
\end{displaymath} (363)

with arbitrary vector $V_\Lambda$ and
\begin{displaymath}
\Lambda_X^0 = V_\Lambda
- ({\bf P}^\prime_\xi)^{\char93 }{\bf P}^\prime_\xi V_\Lambda
\end{displaymath} (364)

from the right null space of ${\bf P}^\prime_\xi$, representing a solution of
\begin{displaymath}
{\bf P}_\xi^\prime \Lambda_X^0 = 0
.
\end{displaymath} (365)
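The finite-dimensional analogue of Eqs. (360)-(365) can be checked directly with a numerical pseudoinverse. In the sketch below, random matrices stand in for ${\bf P}^\prime_\xi$ and $G_{\phi(\xi)}$; all specifics are assumed purely for illustration. It verifies the solvability condition (362), the general solution (363), and that the null-space component (364) indeed solves (365).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 6                            # fewer parameters than function values

P_xi = rng.standard_normal((m, n))     # stands in for the non-square P'_xi
P_xi_pinv = np.linalg.pinv(P_xi)       # Moore-Penrose inverse, Eq. (361)

# A right-hand side fulfilling the solvability condition (362):
G = P_xi @ rng.standard_normal(n)      # any G in the range of P'_xi
assert np.allclose(P_xi @ P_xi_pinv @ G, G)

# General solution (363): particular part plus right null-space part (364)
V = rng.standard_normal(n)             # arbitrary vector V_Lambda
Lambda0 = V - P_xi_pinv @ P_xi @ V     # Eq. (364)
Lambda = P_xi_pinv @ G + Lambda0       # Eq. (363)

assert np.allclose(P_xi @ Lambda0, 0.0)   # Eq. (365)
assert np.allclose(P_xi @ Lambda, G)      # Eq. (360)
print("pseudoinverse solution verified")
\end{verbatim}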

For $\Lambda_X(x) \ne 0$, inserting Eq. (363) into the normalization condition $\Lambda_X$ = ${\bf I}_X {\bf P} \Lambda_X$ gives
\begin{displaymath}
\Lambda_X =
{\bf I}_X {\bf P} \left(
({\bf P}^\prime_\xi)^{\char93 } G_{\phi(\xi)}
+ V_\Lambda
- ({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi V_\Lambda
\right)
.
\end{displaymath} (366)

Substituting this back into Eq. (355) eliminates $\Lambda_X$ and yields the stationarity equation
\begin{displaymath}
0 =
\left(
{\bf I} - {\bf P}^\prime_\xi {\bf I}_X {\bf P} ({\bf P}^\prime_\xi)^{\char93 }
\right) G_{\phi(\xi)}
- {\bf P}^\prime_\xi {\bf I}_X {\bf P}
\left(
V_\Lambda - ({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi V_\Lambda
\right)
,
\end{displaymath} (367)

where $G_{\phi(\xi)}$ has to fulfill Eq. (362). Eq. (367) may be written in a form similar to Eq. (193)
\begin{displaymath}
{{\bf K}}_{\phi(\xi)}(\xi) = T_{\phi(\xi)}(\xi)
\end{displaymath} (368)

with
\begin{displaymath}
T_{\phi(\xi)} (\xi)
= {\bf P}_\xi^\prime {\bf P}^{-1}N
-{\bf P}_\xi^\prime \Lambda_X
,
\end{displaymath} (369)

but with
\begin{displaymath}
{{\bf K}}_{\phi(\xi)} (\xi)
= \Phi^\prime {{\bf K}} \,\phi(\xi)
,
\end{displaymath} (370)

being in general a nonlinear operator in $\xi$, since both $\Phi^\prime$ and $\phi$ depend on $\xi$.
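Because ${{\bf K}}_{\phi(\xi)}$ is nonlinear in $\xi$, Eq. (368) in general has to be solved iteratively; the simplest possibility is to apply a generic optimizer directly to the parameterized error. The following sketch uses a hypothetical Gaussian-bump parameterization $h(x;\xi)=\xi_0\exp(-\xi_1 x^2)$ of the regression functional (353); the operators, data, and parameterization are assumed purely for illustration.

\begin{verbatim}
import numpy as np
from scipy.optimize import minimize

n = 50
x = np.linspace(-1.0, 1.0, n)
K_D = np.eye(n)                                   # toy data term operator
K = 0.1 * (np.diag(np.full(n, 2.0))               # toy prior operator
           - np.diag(np.ones(n - 1), 1)
           - np.diag(np.ones(n - 1), -1))
t_D = 0.8 * np.exp(-3.0 * x**2)                   # toy data vector

def h(xi):
    """Nonlinear trial function h(x; xi) = xi_0 * exp(-xi_1 * x^2)."""
    return xi[0] * np.exp(-xi[1] * x**2)

def E(xi):
    """Parameterized error (353) evaluated on the trial function."""
    r = h(xi) - t_D
    return 0.5 * r @ K_D @ r + 0.5 * h(xi) @ K @ h(xi)

# Iterative minimization over the parameters xi (here: quasi-Newton)
res = minimize(E, x0=np.array([1.0, 1.0]))
print(res.x, res.fun)
\end{verbatim}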

