WASET
	%0 Journal Article
	%A Takeru YOKOI and  Hidekazu YANAGIMOTO and  Sigeru OMATU
	%D 2009
	%J International Journal of Computer and Information Engineering
	%B World Academy of Science, Engineering and Technology
	%I Open Science Index 26, 2009
	%T Information Filtering using Index Word Selection based on the Topics
	%U https://publications.waset.org/pdf/14748
	%V 26
	%X We have proposed an information filtering system
using index word selection from a document set based on the
topics included in a set of documents. This method narrows
down the particularly characteristic words in a document set
and the topics are obtained by Sparse Non-negative Matrix
Factorization. In information filtering, a document is often
represented with the vector in which the elements correspond
to the weight of the index words, and the dimension of the
vector becomes larger as the number of documents is
increased. Therefore, it is possible that useless words as index
words for the information filtering are included. In order to
address the problem, the dimension needs to be reduced. Our
proposal reduces the dimension by selecting index words
based on the topics included in a document set. We have
applied the Sparse Non-negative Matrix Factorization to the
document set to obtain these topics. The filtering is carried out
based on a centroid of the learning document set. The centroid
is regarded as the user-s interest. In addition, the centroid is
represented with a document vector whose elements consist of
the weight of the selected index words. Using the English test
collection MEDLINE, thus, we confirm the effectiveness of
our proposal. Hence, our proposed selection can confirm the
improvement of the recommendation accuracy from the other
previous methods when selecting the appropriate number of
index words. In addition, we discussed the selected index
words by our proposal and we found our proposal was able to
select the index words covered some minor topics included in
the document set.
	%P 276 - 282