
Support vector machines and regression

Expanding the regression function $h(x)$ in a basis of eigenfunctions $\Psi_k$ of ${\bf K}_0$

\begin{displaymath}
{\bf K}_0 = \sum_k \lambda_k \Psi_k \Psi_k^T
,\quad
h(x) = \sum_k n_k \Psi_k (x)
\end{displaymath} (333)

yields for the functional of Eq. (247)
\begin{displaymath}
E_h = \sum_i \left(\sum_k n_k \Psi_k (x_i)-y_i\right)^2
+\sum_k \lambda_k \vert n_k \vert^2
.
\end{displaymath} (334)
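
The following Python fragment gives a minimal numerical sketch of Eqs. (333) and (334); it is an illustration only, with a discretized negative Laplacian standing in for ${\bf K}_0$, its eigenvectors standing in for the eigenfunctions $\Psi_k$, and a grid, training points, and targets chosen arbitrarily for the example.

\begin{verbatim}
# Minimal numerical sketch of Eqs. (333) and (334).  The regularization
# operator K_0 is represented by an (assumed) discretized negative
# Laplacian; its eigenvectors Psi_k stand in for the eigenfunctions and
# the coefficients n_k parameterize the regression function h.
import numpy as np

# grid, training points and targets: arbitrary choices for this example
x = np.linspace(0.0, 1.0, 50)
train_idx = np.array([5, 15, 25, 35, 45])
y = np.sin(2 * np.pi * x[train_idx])

# discretized negative Laplacian as a simple smoothness prior for K_0
K0 = 2 * np.eye(len(x)) - np.eye(len(x), k=1) - np.eye(len(x), k=-1)
lam, Psi = np.linalg.eigh(K0)        # K_0 = sum_k lam_k Psi_k Psi_k^T

def E_h(n):
    """Eq. (334): squared data misfit plus sum_k lam_k |n_k|^2."""
    h = Psi @ n                      # h(x) = sum_k n_k Psi_k(x)
    data_term = np.sum((h[train_idx] - y) ** 2)
    prior_term = np.sum(lam * n ** 2)
    return data_term + prior_term

# evaluate the functional for some (random) coefficient vector n
rng = np.random.default_rng(0)
print(E_h(rng.normal(size=len(x))))
\end{verbatim}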

Under the assumption of output noise for the training data, the data terms may, for example, be replaced by the negative logarithm of a mixture of Gaussians. Such mixtures with differing means can develop flat regions where the error is insensitive (robust) to changes of $h$. Analogously, Gaussians with differing means can be added to obtain error functions which, for large absolute errors, are flat compared to the quadratic error of a single Gaussian. Similarly to such Gaussian mixtures, the mean-squared error data term $(y_i-h(x_i))^2$ may be replaced by an $\epsilon $-insensitive error $\vert y_i-h(x_i)\vert _\epsilon$, which is zero for absolute errors smaller than $\epsilon $ and linear for larger absolute errors (see Fig. 5). This leads to a quadratic programming problem and is equivalent to Vapnik's support vector machine [225,74,226,214,215,49]. For a more detailed discussion of the relation between support vector machines and Gaussian processes see [229,208].

Figure 5: Three robust error functions which are insensitive to small errors. Left: negative logarithm of a mixture of two Gaussians with equal variance and different means. Middle: negative logarithm of a mixture of 11 Gaussians with equal variance and different means. Right: $\epsilon $-insensitive error.
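
As an illustration of the three error functions of Fig. 5, the following Python fragment evaluates the negative logarithm of an equal-variance Gaussian mixture with two and with 11 components and the $\epsilon $-insensitive error as functions of the residual $y-h(x)$; the mixture means, the common variance, and the value of $\epsilon $ are arbitrary choices for this sketch.

\begin{verbatim}
# Sketch of the robust error functions of Fig. 5; the means, variance and
# epsilon below are arbitrary illustrative values.
import numpy as np

def neg_log_gauss_mixture(r, means, var=1.0):
    """Negative log of an equal-weight Gaussian mixture with common variance."""
    comps = np.exp(-(r[:, None] - np.asarray(means)) ** 2 / (2.0 * var))
    return -np.log(comps.mean(axis=1) / np.sqrt(2.0 * np.pi * var))

def eps_insensitive(r, eps=1.0):
    """Zero for |r| < eps, linear in |r| for larger absolute errors."""
    return np.maximum(0.0, np.abs(r) - eps)

r = np.linspace(-5.0, 5.0, 201)                 # residual r = y - h(x)
two_gauss  = neg_log_gauss_mixture(r, means=[-1.0, 1.0])
many_gauss = neg_log_gauss_mixture(r, means=np.linspace(-2.5, 2.5, 11))
svm_error  = eps_insensitive(r, eps=1.0)
\end{verbatim}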

