Paper

Basic Information

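Name 中藤 哲也
Name (Kana) ナカトウ テツヤ
Name (English) NAKATOU TETSUYA
Affiliation Nakamura Gakuen University, Faculty of Nutritional Sciences, Department of Nutritional Sciences
Position Associate Professor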

Title

Weighting of noun phrases based on local frequency of nouns

Sole or Joint Authorship

 

Authors

Yasuhiro Yamada
Yuusuke Himeno
Tetsuya Nakatoh

Role

 

Abstract

tf-idf is a well-known weighting measure for words in texts. It measures both the frequency and the locality of words, and it is often used in information retrieval and text mining. However, many infrequent words receive the same tf-idf value. In this study, the words being weighted are noun phrases. This paper proposes a novel weighting measure for noun phrases that uses the local frequency of the nouns that make up each phrase. The proposed measure combines the tf-idf of a noun phrase with the average difference between the phrase's frequency and the frequencies of the nouns within it. The measure was evaluated in experiments on two datasets: 19,997 newsgroup texts written in English and 206 Wikipedia pages written in Japanese. The experiments showed that fewer noun phrases share the same value under the proposed measure than under tf-idf.
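The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so several choices here are assumptions made only for illustration: the difference is taken as noun frequency minus phrase frequency, the two parts are combined by addition, and the weight `alpha`, the function names, and the toy corpus are all hypothetical. This is not the authors' implementation.

```python
import math
from collections import Counter


def tf_idf(phrase, doc, docs):
    """Plain tf-idf: raw count in the document times log(N / document frequency)."""
    tf = doc["phrases"][phrase]
    df = sum(1 for d in docs if d["phrases"][phrase] > 0)
    return tf * math.log(len(docs) / df)


def local_noun_term(phrase, doc):
    """Average of (noun frequency - phrase frequency) over the nouns in the phrase.

    ASSUMPTION: the sign and direction of the difference are not stated in the
    abstract; this sketch uses noun frequency minus phrase frequency.
    """
    phrase_freq = doc["phrases"][phrase]
    diffs = [doc["nouns"][noun] - phrase_freq for noun in phrase]
    return sum(diffs) / len(diffs)


def combined_weight(phrase, doc, docs, alpha=0.1):
    """ASSUMPTION: additive combination with a hypothetical weight alpha."""
    return tf_idf(phrase, doc, docs) + alpha * local_noun_term(phrase, doc)


# Toy corpus (made up): each document stores counts of noun phrases
# (as tuples of nouns) and counts of individual nouns.
d1 = {
    "phrases": Counter({("text", "mining"): 1, ("information", "retrieval"): 1}),
    "nouns": Counter({"text": 4, "mining": 2, "information": 1, "retrieval": 1}),
}
d2 = {
    "phrases": Counter({("neural", "network"): 2}),
    "nouns": Counter({"neural": 2, "network": 3}),
}
docs = [d1, d2]

p1 = ("text", "mining")
p2 = ("information", "retrieval")

for p in (p1, p2):
    print(p, tf_idf(p, d1, docs), combined_weight(p, d1, docs))
```

In this toy corpus both phrases occur once in the first document and in no other document, so their tf-idf values tie. The nouns "text" and "mining" also occur outside the phrase, so the local-frequency term separates the two phrases under the assumed combination.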

Journal or Publication Name

Advances in Intelligent Systems and Computing

Publisher

Springer Verlag

Volume

700

 

Start Page

436

End Page

445

Date of Publication or Presentation

2018

Peer Reviewed

Yes

Invited

No

Language

English

Publication Type

Research paper (international conference proceedings)

International or Domestic Journal

 

International Co-authorship

 

ISSN

 

eISSN

 

DOI

10.1007/978-3-319-72550-5_42

CiNii Articles ID

 

CiNii Books ID

 

PubMed ID

 

PubMed Central Article ID

 

Format

Free download

JGlobalID

 

arXiv ID

 

ORCID Put Code

 

DBLP ID