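Paper

Basic Information

Name 中藤 哲也
Name (kana) ナカトウ テツヤ
Name (English) NAKATOU TETSUYA
Affiliation Nakamura Gakuen University, Faculty of Nutritional Sciences, Department of Nutritional Sciences
Position Associate Professor

Title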

Mining pure patterns in texts

Sole or joint authorship

 

Authors

Yasuhiro Yamada
Tetsuya Nakatoh
Kensuke Baba
Daisuke Ikeda

Role

 

Abstract

We herein investigate finding unusual patterns from a given string as a text. In the present paper, the pattern is expressed as a substring of the string. The natural assumption with respect to the frequency of a pattern is that the shorter the length of the pattern, the larger the frequency of the pattern. We define a pattern to be pure if the frequencies of all of the substrings of the pattern are the same as the frequency of the pattern. This means that the substrings appear only within the pattern in the string. This condition is in contrast to the natural assumption. The present paper proposes three statistics for quantifying the purity of a pattern, i.e., probability, entropy, and difference, which are calculated based on the frequency of the pattern and its substrings. Experiments using DNA sequences reveal that patterns with large probability correspond to the features of the sequences. © 2012 IEEE.
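The purity condition stated in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `count_occurrences` and `is_pure` are hypothetical, overlapping occurrences are assumed to count toward frequency, and a pattern that never occurs is assumed to be not pure. The three statistics (probability, entropy, and difference) are not defined in the abstract, so they are not reproduced here.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text (overlaps included; an assumption)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Advance by one character so overlapping matches are also found.
        start = text.find(pattern, start + 1)
    return count


def is_pure(text: str, pattern: str) -> bool:
    """Return True if every non-empty substring of pattern occurs in text
    exactly as often as pattern itself (the abstract's purity condition)."""
    target = count_occurrences(text, pattern)
    if target == 0:
        # Assumption: a pattern absent from the text is treated as not pure.
        return False
    n = len(pattern)
    # All distinct non-empty substrings of the pattern, including itself.
    substrings = {pattern[i:j] for i in range(n) for j in range(i + 1, n + 1)}
    return all(count_occurrences(text, s) == target for s in substrings)
```

For example, under these assumptions `is_pure("xabcyyabcx", "abc")` holds because `a`, `b`, `c`, `ab`, and `bc` each occur only inside `abc`, whereas appending one more `a` to the text breaks the condition. The brute-force check is quadratic in the pattern length and is meant only to make the definition concrete.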

Name of journal or proceedings

Proceedings of the 2012 IIAI International Conference on Advanced Applied Informatics, IIAI-AAI 2012

Publisher

IEEE Computer Society

 

 

Start page

285

End page

290

Year and month of publication or presentation

2012

Peer reviewed

Yes

Invited

No

Language

English

Publication type

Research paper (international conference proceedings)

International or domestic journal

 

International co-authorship

 

ISSN

 

eISSN

 

DOI

10.1109/IIAI-AAI.2012.75

CiNii Articles ID

 

CiNii Books ID

 

PubMed ID

 

PubMed Central article ID

 

Format

Free download

JGlobalID

 

arXiv ID

 

ORCID Put Code

 

DBLP ID