Document name: parison of Event Models for Naive Bayes Text Classification.pdf

Format: PDF, 8 pages
A Comparison of Event Models for Naive Bayes Text Classification

Andrew McCallum‡†    Kamal Nigam†
mccallum@            ******@

‡Just Research
4616 Henry Street
Pittsburgh, PA 15213

†School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

Abstract

Recent approaches to text classification have used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey and Croft 1996; Koller and Sahami 1997). Others use a multinomial model, that is, a uni-gram language model with integer word counts (e.g. Lewis and Gale 1994; Mitchell 1997). This paper aims to clarify the confusion by describing the differences and details of these two models, and by

... learning, especially when the number of attributes is large.

Document classification is just such a domain with a large number of attributes. The attributes of the examples to be classified are words, and the number of different words can be quite large indeed. While some simple document classification tasks can be accurately performed with vocabulary sizes less than one hundred, many complex tasks on real-world data from the Web, UseNet and newswire articles do best with vocabulary sizes in the thousands. Naive Bayes has been successfully
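
The difference between the two event models named in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper: the function names, the three-word vocabulary, the toy document, and the probability values in the usage note are all assumptions, and parameter estimation and class priors are left out. It shows only how each model represents a document and scores it against one class.

```python
import math
from collections import Counter


def bernoulli_features(tokens, vocab):
    """Binary vector: 1 if the vocabulary word occurs in the document, else 0."""
    present = set(tokens)
    return [1 if w in present else 0 for w in vocab]


def multinomial_features(tokens, vocab):
    """Integer vector: how many times each vocabulary word occurs."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def bernoulli_log_likelihood(x, p):
    """log P(doc | class) under the multi-variate Bernoulli model.

    p[t] is an assumed probability that word t appears at least once in a
    document of the class. Absent words contribute log(1 - p[t]).
    """
    return sum(math.log(p_t) if x_t else math.log(1.0 - p_t)
               for x_t, p_t in zip(x, p))


def multinomial_log_likelihood(n, q):
    """log P(doc | class) under the multinomial (uni-gram) model.

    q[t] is an assumed probability of drawing word t at any position.
    The multinomial coefficient and document-length term are omitted
    because they do not depend on the class.
    """
    return sum(n_t * math.log(q_t) for n_t, q_t in zip(n, q))


# Hypothetical toy data, chosen only for illustration.
vocab = ["ball", "game", "vote"]
doc = "game ball game game".split()

x = bernoulli_features(doc, vocab)    # presence/absence per word
n = multinomial_features(doc, vocab)  # occurrence count per word
```

For this toy document, `x` records only that "ball" and "game" occur, while `n` also records that "game" occurs three times. Calling `bernoulli_log_likelihood(x, [0.5, 0.5, 0.5])` penalizes the absence of "vote" explicitly, whereas `multinomial_log_likelihood(n, [0.25, 0.5, 0.25])` ignores absent words and weights "game" by its count.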