
Predictive density

Within a Bayesian approach, predictions about (e.g., future) events are based on the predictive probability density, i.e., the expectation of the probability of $y$ for a given (test) situation $x$, training data $D$, and prior data $D_0$:

\begin{displaymath}
p(y\vert x,f)
=
p(y\vert x,D,D_0)
=
\int \!d{h} \, p({h}\vert f) \, p(y\vert x,{h})
=
\,< p(y\vert x,{h})>_{{H}\vert f}
.
\end{displaymath} (27)

Here $< \cdots >_{{H}\vert f}$ denotes the expectation under the posterior $p({h}\vert f)$ = $p({h}\vert D,D_0)$, where the state of knowledge $f$ depends on prior and training data. Successful applications of Bayesian approaches rely strongly on an adequate choice of the model space ${H}$ and of the model likelihoods $p(y\vert x,{h})$.
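For a finite model space the integral in Eq. (27) becomes a sum, and the predictive density is simply the posterior-weighted mixture of the candidate likelihoods. The following minimal sketch illustrates this for a hypothetical space of three Gaussian regression models; the candidate functions, the posterior weights, and the test point are illustrative assumptions, not taken from the text.

\begin{verbatim}
import numpy as np

def gaussian(y, mean, sigma=1.0):
    """Likelihood p(y|x,h) of one candidate state of Nature h."""
    return np.exp(-0.5 * ((y - mean) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def predictive_density(y, x, regression_fns, posterior):
    """Discrete analogue of Eq. (27):
    p(y|x,f) = sum_k p(h_k|D,D_0) p(y|x,h_k)."""
    likelihoods = np.array([gaussian(y, h(x)) for h in regression_fns])
    return float(np.dot(posterior, likelihoods))

# Three hypothetical candidate regression functions h_k (assumed here).
regression_fns = [lambda x: 0.0 * x, lambda x: x, lambda x: x ** 2]
# Assumed posterior weights p(h_k|D,D_0); they must sum to one.
posterior = np.array([0.2, 0.5, 0.3])

# Predictive density at a test point (x, y): a convex combination of the
# three individual likelihoods, as discussed above.
print(predictive_density(y=1.0, x=1.0,
                         regression_fns=regression_fns, posterior=posterior))
\end{verbatim}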

Note that $p(y\vert x,f)$ lies in the convex hull of the likelihoods $p(y\vert x,{h})$, ${h}\in {H}$, and is typically not equal to any single one of them. The situation is illustrated in Fig. 2. During learning the predictive density $p(y\vert x,f)$ tends to approach the true $p(y\vert x,h)$. Because the training data are random variables, this approach is stochastic. (There exists an extensive literature analyzing the stochastic properties of learning and generalization from a statistical mechanics perspective [63,64,65,231,239,178].)

Figure 2: The predictive density $p(y\vert x,f)$ for a state of knowledge $f$ = $f(D,D_0)$ is in the convex hull spanned by the possible states of Nature $h_i$ characterized by the likelihoods $p(y\vert x,h_i)$. During learning the actual predictive density $p(y\vert x,f)$ tends to move stochastically towards the extremal point $p(y\vert x,h_{\rm true})$ representing the ``true'' state of Nature.
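The stochastic approach towards $p(y\vert x,h_{\rm true})$ can be made concrete with a small numerical sketch. In the following hypothetical example (the uniform prior, the unit-variance Gaussian likelihoods, and the choice of $h_{\rm true}$ are assumptions made only for illustration), the posterior over three candidate states of Nature is updated with sequentially drawn training data; its weight, and hence the predictive density, concentrates on the true model, but along a random path determined by the data.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Assumed example: three states of Nature h_k with unit-variance Gaussian
# likelihoods p(y|h_k) = N(means[k], 1) (the test situation x is suppressed),
# h_true the one with mean 1.0, and a uniform prior p(h_k|D_0).
means = np.array([-1.0, 0.0, 1.0])
true_k = 2
log_posterior = np.log(np.full(3, 1.0 / 3.0))

for step in range(1, 51):
    y_obs = rng.normal(means[true_k], 1.0)        # random training datum
    log_posterior += -0.5 * (y_obs - means) ** 2  # Bayes update (up to a constant)
    posterior = np.exp(log_posterior - log_posterior.max())
    posterior /= posterior.sum()
    if step % 10 == 0:
        # The weight of h_true grows, so p(y|x,f) drifts towards
        # p(y|x,h_true), but the path depends on the randomly drawn data.
        print(step, posterior.round(3))
\end{verbatim}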

