
General parameterizations

Approximate solutions of the error minimization problem are obtained by restricting the search (trial) space for $h(x,y)$ = $\phi(x,y)$ (or $h(x)$ in regression). Functions $\phi $ which lie in the considered search space are called trial functions. Solving a minimization problem in such a restricted trial space is also called a variational approach [97,106,29,36,27]. Clearly, minimal values obtained by minimization within a trial space can only be larger than or equal to the true minimal value, and of two variational approximations the one with the smaller error is the better one.
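As a concrete illustration (the specific form is only an example, not required by the general formalism), a linear trial space may be spanned by a finite number of fixed basis functions $B_l$,
\begin{displaymath}
\phi (x,y;\,\xi) = \sum_{l=1}^{m} \xi_l \, B_l(x,y)
,
\end{displaymath}
while a typical nonlinear parameterization is, for instance, a neural network with weight vector $\xi$. Enlarging the trial space, e.g., by adding further basis functions, can only lower (or leave unchanged) the minimal error attainable within it.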

Alternatively, using parameterized functions $\phi $ can also be interpreted as implementing the a priori information that $\phi $ is known to have that specific parameterized form. (In cases where $\phi $ is only known to be approximately of a specific parameterized form, this should ideally be implemented using a prior with a parameterized template, the template parameters being treated as hyperparameters as in Section 5.) The following discussion holds for both interpretations.

Any parameterization $\phi $ = $\phi (\{\xi_l\})$ together with a range of allowed values for the parameter vector $\xi$ defines a possible trial space. Hence we consider the error functional

\begin{displaymath}
E_{\phi (\xi)} =
-(\,\ln P (\xi),\, N\,)
+\frac{1}{2} (\,\phi (\xi),\, {{\bf K}}\,\phi (\xi)\,)
+ (\,P( \xi ),\,\Lambda_X\,),
\end{displaymath} (352)

for $\phi $ depending on parameters $\xi$ and $P(\xi )$ = $P(\,\phi (\xi)\,)$. In the special case of Gaussian regression this reads
\begin{displaymath}
E_{h (\xi)} =
\frac{1}{2} (\,h (\xi)-t_D,\, {{\bf K}_D}\,(h (\xi)-t_D)\,)
+\frac{1}{2} (\,h (\xi),\, {{\bf K}}\,h (\xi)\,)
.
\end{displaymath} (353)
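For a linear parameterization $h(\xi)=B\xi$ the minimization of (353) reduces to a finite-dimensional linear problem. The following sketch is a toy discretization, not taken from the text: the basis matrix B (a few cosines), the operators K_D and K, and the data vector t_D are all assumed purely for illustration. It checks numerically that the minimal error within the trial space cannot fall below the unrestricted minimum.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 5                          # grid points, trial-space dimension

# Toy operators and data (assumed for illustration only)
x = np.linspace(0.0, 1.0, n)
K_D = np.eye(n)                       # data term operator
K = (np.diag(np.full(n, 2.0))         # Laplacian-like prior operator
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1))
t_D = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Linear trial space: h(xi) = B xi with a few cosine basis functions
B = np.stack([np.cos(np.pi * l * x) for l in range(m)], axis=1)

def E(h):
    """Error functional (353): 1/2 (h-t_D, K_D (h-t_D)) + 1/2 (h, K h)."""
    r = h - t_D
    return 0.5 * r @ K_D @ r + 0.5 * h @ K @ h

# Restricted (variational) minimum: stationarity in xi is a linear system
xi_star = np.linalg.solve(B.T @ (K_D + K) @ B, B.T @ K_D @ t_D)
h_restricted = B @ xi_star

# Unrestricted minimum of (353): (K_D + K) h = K_D t_D
h_full = np.linalg.solve(K_D + K, K_D @ t_D)

print(E(h_restricted), ">=", E(h_full))   # variational error >= true minimum
\end{verbatim}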

Defining the matrix
\begin{displaymath}
\Phi^\prime (l;x,y)
= \frac{\partial \phi (x,y)}{\partial \xi_l}
\end{displaymath} (354)

the stationarity equation for the functional (352) becomes
\begin{displaymath}
0 =
\Phi^\prime {\bf P}^\prime {\bf P}^{-1} N
- \Phi^\prime {{\bf K}}\phi
-\Phi^\prime {\bf P}^\prime \Lambda_X
.
\end{displaymath} (355)
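Written out, Eq. (355) is nothing but the chain rule applied to the functional (352): the derivative with respect to a parameter $\xi_l$ is the Jacobian (354) contracted with the functional derivative with respect to $\phi$,
\begin{displaymath}
0 = -\frac{\partial E_{\phi(\xi)}}{\partial \xi_l}
= \int \!dx\, dy\;
\frac{\partial \phi (x,y)}{\partial \xi_l}
\left(
{\bf P}^\prime {\bf P}^{-1} N
- {{\bf K}}\,\phi
- {\bf P}^\prime \Lambda_X
\right)\!(x,y)
,
\end{displaymath}
so that only the projection of the functional gradient onto the tangent space of the parameterized family has to vanish.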

Similarly, a parameterized functional $E_\phi$ with non-zero template $t$ as in (226) would give
\begin{displaymath}
0 =
\Phi^\prime {\bf P}^\prime {\bf P}^{-1} N
- \Phi^\prime {{\bf K}}\left( \phi - t\right)
- \Phi^\prime {\bf P}^\prime \Lambda_X
.
\end{displaymath} (356)

To have a convenient notation when solving for $\Lambda_X$ we introduce
\begin{displaymath}
{\bf P}^\prime_\xi
= \Phi^\prime (\xi) {\bf P}^\prime(\phi),
\end{displaymath} (357)

i.e.,
\begin{displaymath}
{\bf P}^\prime_\xi (l;x,y)
= \frac{\partial P(x,y)}{\partial \xi_l}
= \int \!dx^\prime dy^\prime \,
\frac{\partial \phi (x^\prime ,y^\prime )}{\partial \xi_l}\,
\frac{\delta P(x,y)}{\delta \phi (x^\prime ,y^\prime )}
,
\end{displaymath} (358)

and
\begin{displaymath}
G_{\phi(\xi)}
={\bf P}^\prime_\xi {\bf P}^{-1} N
- \Phi^\prime {{\bf K}}\phi
,
\end{displaymath} (359)

so that Eq. (355) takes the form
\begin{displaymath}
{\bf P}^\prime_\xi \Lambda_X
= G_{\phi(\xi)}
.
\end{displaymath} (360)

For a parameterization $\xi$ restricting the space of possible $P$, the matrix ${\bf P}^\prime_\xi$ is not square and cannot be inverted. Thus, let $({\bf P}^\prime_\xi)^{\char93 }$ denote the Moore-Penrose inverse of ${\bf P}^\prime_\xi$, i.e.,
\begin{displaymath}
({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi
({\bf P}^\prime_\xi)^{\char93 }
= ({\bf P}^\prime_\xi)^{\char93 }
,\qquad
{\bf P}^\prime_\xi
({\bf P}^\prime_\xi)^{\char93 }
{\bf P}^\prime_\xi
= {\bf P}^\prime_\xi
,
\end{displaymath} (361)

with $({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi$ and ${\bf P}^\prime_\xi ({\bf P}^\prime_\xi)^{\char93 }$ symmetric. A solution for $\Lambda_X$ exists if
\begin{displaymath}
{\bf P}^\prime_\xi ({\bf P}^\prime_\xi)^{\char93 }
G_{\phi(\xi)}
=G_{\phi(\xi)}
.
\end{displaymath} (362)

In that case, i.e., if $G_{\phi(\xi)}$ lies in the range of ${\bf P}^\prime_\xi$, the solution can be written
\begin{displaymath}
\Lambda_X =
({\bf P}^\prime_\xi)^{\char93 } G_{\phi(\xi)}
+ V_\Lambda
- ({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi V_\Lambda
,
\end{displaymath} (363)

with arbitrary vector $V_\Lambda$ and
\begin{displaymath}
\Lambda_X^0 = V_\Lambda
- ({\bf P}^\prime_\xi)^{\char93 }{\bf P}^\prime_\xi V_\Lambda
\end{displaymath} (364)

from the right null space of ${\bf P}^\prime_\xi$, representing a solution of
\begin{displaymath}
{\bf P}_\xi^\prime \Lambda_X^0 = 0
.
\end{displaymath} (365)
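The finite-dimensional analogue of Eqs. (360)-(365) can be checked directly with a numerical pseudoinverse. In the sketch below, random matrices stand in for ${\bf P}^\prime_\xi$ and $G_{\phi(\xi)}$; all specifics are assumed purely for illustration. It verifies the solvability condition (362), the general solution (363), and that the null-space component (364) indeed solves (365).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 6                            # fewer parameters than function values

P_xi = rng.standard_normal((m, n))     # stands in for the non-square P'_xi
P_xi_pinv = np.linalg.pinv(P_xi)       # Moore-Penrose inverse, Eq. (361)

# A right-hand side fulfilling the solvability condition (362):
G = P_xi @ rng.standard_normal(n)      # any G in the range of P'_xi
assert np.allclose(P_xi @ P_xi_pinv @ G, G)

# General solution (363): particular part plus right null-space part (364)
V = rng.standard_normal(n)             # arbitrary vector V_Lambda
Lambda0 = V - P_xi_pinv @ P_xi @ V     # Eq. (364)
Lambda = P_xi_pinv @ G + Lambda0       # Eq. (363)

assert np.allclose(P_xi @ Lambda0, 0.0)   # Eq. (365)
assert np.allclose(P_xi @ Lambda, G)      # Eq. (360)
print("pseudoinverse solution verified")
\end{verbatim}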

For $\Lambda_X(x) \ne 0$, inserting Eq. (363) into the normalization condition $\Lambda_X$ = ${\bf I}_X {\bf P} \Lambda_X$ gives
\begin{displaymath}
\Lambda_X =
{\bf I}_X {\bf P} \left(
({\bf P}^\prime_\xi)^{\char93 } G_{\phi(\xi)}
+ V_\Lambda
- ({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi V_\Lambda
\right)
.
\end{displaymath} (366)

Substituting this back into Eq. (355) eliminates $\Lambda_X$ and yields the stationarity equation
\begin{displaymath}
0 =
\left(
{\bf I} - {\bf P}^\prime_\xi {\bf I}_X {\bf P} ({\bf P}^\prime_\xi)^{\char93 }
\right) G_{\phi(\xi)}
- {\bf P}^\prime_\xi {\bf I}_X {\bf P}
\left(
V_\Lambda - ({\bf P}^\prime_\xi)^{\char93 } {\bf P}^\prime_\xi V_\Lambda
\right)
,
\end{displaymath} (367)

where $G_{\phi(\xi)}$ has to fulfill Eq. (362). Eq. (367) may be written in a form similar to Eq. (193)
\begin{displaymath}
{{\bf K}}_{\phi(\xi)}(\xi) = T_{\phi(\xi)}(\xi)
\end{displaymath} (368)

with
\begin{displaymath}
T_{\phi(\xi)} (\xi)
= {\bf P}_\xi^\prime {\bf P}^{-1}N
-{\bf P}_\xi^\prime \Lambda_X
,
\end{displaymath} (369)

but with
\begin{displaymath}
{{\bf K}}_{\phi(\xi)} (\xi)
= \Phi^\prime {{\bf K}} \,\phi(\xi)
,
\end{displaymath} (370)

being in general a nonlinear operator in $\xi$, since both $\Phi^\prime$ and $\phi$ depend on $\xi$.
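Because ${{\bf K}}_{\phi(\xi)}$ is nonlinear in $\xi$, Eq. (368) in general has to be solved iteratively; the simplest possibility is to apply a generic optimizer directly to the parameterized error. The following sketch uses a hypothetical Gaussian-bump parameterization $h(x;\xi)=\xi_0\exp(-\xi_1 x^2)$ of the regression functional (353); the operators, data, and parameterization are assumed purely for illustration.

\begin{verbatim}
import numpy as np
from scipy.optimize import minimize

n = 50
x = np.linspace(-1.0, 1.0, n)
K_D = np.eye(n)                                   # toy data term operator
K = 0.1 * (np.diag(np.full(n, 2.0))               # toy prior operator
           - np.diag(np.ones(n - 1), 1)
           - np.diag(np.ones(n - 1), -1))
t_D = 0.8 * np.exp(-3.0 * x**2)                   # toy data vector

def h(xi):
    """Nonlinear trial function h(x; xi) = xi_0 * exp(-xi_1 * x^2)."""
    return xi[0] * np.exp(-xi[1] * x**2)

def E(xi):
    """Parameterized error (353) evaluated on the trial function."""
    r = h(xi) - t_D
    return 0.5 * r @ K_D @ r + 0.5 * h(xi) @ K @ h(xi)

# Iterative minimization over the parameters xi (here: quasi-Newton)
res = minimize(E, x0=np.array([1.0, 1.0]))
print(res.x, res.fun)
\end{verbatim}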

