Instance Based Learning

- If we choose Eq. 3, we have to re-derive a gradient descent rule using the same arguments as for neural nets.
- As such, we can adjust the weights of $\hat{f}(x)$ with \[ \Delta w_j \equiv \eta \sum_{x\,\in\, k\ \text{nearest nbrs of}\ x_q} K(d(x_q,x))\,(f(x)-\hat{f}(x))\,a_j(x) \]
- This rule performs gradient descent on the weights of $\hat{f}(x)$ so as to minimize its kernel-weighted error against $f(x)$ over the $k$ nearest neighbors of $x_q$.
- There are many more efficient methods for fitting linear functions to a fixed set of training examples (see the book for references).
- Locally Weighted Regression often uses linear or quadratic functions because more complex functions are too hard to fit and only provide marginal benefits, at best.
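The update rule above can be sketched in NumPy. This is a minimal illustration, not the book's implementation: it assumes a locally linear hypothesis $\hat{f}(x) = w \cdot a(x)$ with $a(x)$ prepending a bias term to $x$, and a Gaussian kernel $K(d) = e^{-d^2}$; the function name `lwr_gradient_step` and all parameter choices are hypothetical.

```python
import numpy as np

def lwr_gradient_step(w, X, y, x_q, k=5, eta=0.01):
    """One gradient-descent step for locally weighted regression.

    Hypothetical sketch: hypothesis f_hat(x) = w . a(x), where a(x)
    prepends a bias term to x. Weights are updated only over the k
    nearest neighbors of the query x_q, with each error term scaled
    by the (assumed) Gaussian kernel K(d) = exp(-d^2).
    """
    # Distances from the query point to every training example.
    d = np.linalg.norm(X - x_q, axis=1)
    nbrs = np.argsort(d)[:k]                # indices of the k nearest neighbors
    A = np.hstack([np.ones((k, 1)), X[nbrs]])  # a(x) with a leading bias term
    K = np.exp(-d[nbrs] ** 2)               # kernel weights K(d(x_q, x))
    err = y[nbrs] - A @ w                   # f(x) - f_hat(x) on the neighbors
    # Delta w_j = eta * sum_x K(d(x_q,x)) * (f(x) - f_hat(x)) * a_j(x)
    return w + eta * (K * err) @ A

# Usage: fit a local linear model near a query point on noiseless data.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
y = 2.0 * X[:, 0] + 1.0                     # target f(x) = 2x + 1
w = np.zeros(2)
x_q = np.array([0.2])
for _ in range(200):
    w = lwr_gradient_step(w, X, y, x_q, k=10, eta=0.05)
```

Because only the $k$ nearest neighbors enter the sum, each step touches a small slice of the data, which is the point of combining criterion-style kernel weighting with a neighborhood cutoff.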