Radial Basis Functions
- Instead of using a linear function, we try to learn a function of the form
\[ \hat{f}(x) = w_0 + \sum_{u=1}^{k} w_u K_u(d(x_u,x)) \]
where each $x_u$ is an instance from $X$ where the kernel function
$K_u(d(x_u,x))$ is defined so that it decreases as the distance
$d(x_u,x)$ increases.
- $k$ is a user-defined constant that specifies the number
of kernel functions to be included.
- Note that, even though $\hat{f}(x)$ is a global approximation to
$f(x)$, the contribution from each of the $K_u$ terms is
localized to a region near $x_u$.
- It is common to choose $K$ to be a Gaussian centered around $x_u$
\[
K_u(d(x_u,x)) = e^{-\frac{d^2(x_u,x)}{2\sigma_u^2}}
\]
- It has been shown that this $\hat{f}(x)$ can approximate
any function with arbitrarily small error, given a sufficiently
large $k$ and given that each variance $\sigma_u^2$ can be
separately specified.
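As a concrete illustration, the prediction $\hat{f}(x)$ above can be sketched in a few lines of NumPy. This is a minimal sketch, not a full RBF training procedure: the function names are ours, the distance $d$ is assumed Euclidean, and the centers, widths, and weights are supplied by hand rather than learned.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    # K_u(d) = exp(-d^2 / (2 sigma_u^2)): decreases as distance d grows
    return np.exp(-d**2 / (2 * sigma**2))

def rbf_predict(x, centers, sigmas, w0, weights):
    # f_hat(x) = w0 + sum_u w_u K_u(d(x_u, x)), with d Euclidean
    d = np.linalg.norm(centers - x, axis=1)
    return w0 + weights @ gaussian_kernel(d, sigmas)

# usage: k = 2 Gaussian kernels in one dimension (values chosen arbitrarily)
centers = np.array([[0.0], [1.0]])
sigmas = np.array([0.5, 0.5])
weights = np.array([1.0, 1.0])
print(rbf_predict(np.array([0.0]), centers, sigmas, 0.0, weights))
```

At $x = 0$ the first kernel contributes its full weight while the second is attenuated by $e^{-2}$, showing the localized influence of each $K_u$.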
José M. Vidal