Classify Naive Bayes Text
- $positions \leftarrow$ all word positions in $Doc$ that contain
tokens found in $Vocabulary$
- Return $v_{NB}$, where
\[v_{NB} = \argmax_{v_{j} \in V} P(v_{j}) \prod_{i \in positions}P(a_{i}\,|\,v_{j}) \]
- In experiments on articles drawn from 20 Usenet newsgroups, this algorithm
assigned articles to their correct newsgroup with 89% accuracy.
- Paul Graham proposed a similar approach for filtering spam email in
"A Plan for Spam". Several implementations exist, such as SpamBayes.
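
The classification step above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name, the argument layout (`priors` and `cond_probs` as dictionaries), and the toy probabilities are all assumptions made for the example. It also sums log-probabilities rather than multiplying probabilities directly. That gives the same $\argmax$ but avoids numerical underflow on long documents, and it assumes every estimated probability is nonzero (for example, because the estimates were smoothed).

```python
import math


def classify_naive_bayes_text(doc, vocabulary, priors, cond_probs):
    """Return the class v_j that maximizes P(v_j) * prod_i P(a_i | v_j).

    doc:        list of tokens (assumed already tokenized)
    vocabulary: set of known tokens
    priors:     dict mapping class -> P(v_j)
    cond_probs: dict mapping class -> {token: P(token | v_j)}, all nonzero
    """
    # positions: indices in doc whose token is found in the vocabulary
    positions = [i for i, token in enumerate(doc) if token in vocabulary]

    best_class, best_score = None, -math.inf
    for v, prior in priors.items():
        # log P(v_j) + sum_i log P(a_i | v_j); same argmax as the product
        score = math.log(prior) + sum(
            math.log(cond_probs[v][doc[i]]) for i in positions
        )
        if score > best_score:
            best_class, best_score = v, score
    return best_class


# Toy, made-up probabilities purely for illustration.
vocabulary = {"cheap", "pills", "meeting"}
priors = {"spam": 0.5, "ham": 0.5}
cond_probs = {
    "spam": {"cheap": 0.5, "pills": 0.4, "meeting": 0.1},
    "ham": {"cheap": 0.2, "pills": 0.1, "meeting": 0.7},
}

label_a = classify_naive_bayes_text(
    ["buy", "cheap", "pills", "now"], vocabulary, priors, cond_probs
)
label_b = classify_naive_bayes_text(
    ["meeting", "tomorrow"], vocabulary, priors, cond_probs
)
```

With these toy numbers, the first document scores $0.5 \cdot 0.5 \cdot 0.4 = 0.1$ for spam against $0.5 \cdot 0.2 \cdot 0.1 = 0.01$ for ham, so `label_a` is `"spam"`. Tokens outside the vocabulary ("buy", "now", "tomorrow") are skipped, exactly as the definition of $positions$ requires.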
José M. Vidal