Title: Mining Maximal Frequent Itemsets in Data Streams
Authors: Li, Hua-Fu
Lee, Suh-Yin
Shan, Man-Kwan
Keywords: Data mining
data streams
maximal frequent itemsets
online algorithm
single-pass mining
Journal/Conference: 2004 ICS Conference
Abstract: Mining streaming data brings not only unique opportunities but also new and difficult challenges for online algorithm design, such as a single scan of the streaming data, bounded memory requirements, fast processing time, and short response time. In this paper, we propose a single-pass algorithm, called DSM-MFI (Data Stream Mining for Maximal Frequent Itemsets), to mine the set of all maximal frequent itemsets (MFI) in a continuous stream of transactions. In a single scan of the incoming streaming data, an in-memory summary data structure, called IPM-Forest (Item-Prefix Maximal-itemset Forest), is developed to store the frequency information about the maximal frequent itemsets of the data stream. In DSM-MFI, two efficient mechanisms, namely Transaction Item-prefix Projection (TIP) and Top-Down Maximal frequent itemset Finding (TDMF), are used to improve the performance of mining MFI in data streams. More specifically, TIP makes the space requirement of DSM-MFI predictable and reconstructs the smallest parts of IPM-Forest. In addition, TDMF finds all maximal frequent itemsets by a “MaxTo3” approach from the IPM-Forest generated so far. To the best of our knowledge, DSM-MFI is the first algorithm for online mining of maximal frequent patterns in continuous data streams.
Date: 2006-10-16T05:43:21Z
Collection: 2004 ICS International Computer Symposium
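
Note on terminology: the abstract relies on the notion of a maximal frequent itemset, meaning a frequent itemset that has no frequent proper superset. The sketch below only illustrates that definition with a brute-force enumeration over a small, static list of transactions. It is not DSM-MFI and does not model IPM-Forest, TIP, or TDMF; the function name `maximal_frequent_itemsets`, the absolute-count `min_support` parameter, and the sample data are assumptions made for illustration.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force illustration of the MFI definition (not the DSM-MFI algorithm).

    transactions: iterable of iterables of hashable, sortable items.
    min_support: minimum number of transactions that must contain an itemset
                 (an absolute count; this convention is an assumption).
    """
    tx_sets = [frozenset(t) for t in transactions]
    items = sorted({item for t in tx_sets for item in t})

    # Enumerate every non-empty itemset and keep those meeting the threshold.
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(1 for t in tx_sets if cand <= t)
            if support >= min_support:
                frequent.append(cand)

    # An itemset is maximal if no other frequent itemset strictly contains it.
    maximal = [f for f in frequent if not any(f < g for g in frequent)]
    return sorted(sorted(m) for m in maximal)


# Hypothetical sample data, chosen only to show the definition at work.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
print(maximal_frequent_itemsets(transactions, min_support=3))
```

With a threshold of 3, every pair is frequent but the full triple appears in only two transactions, so the three pairs are the maximal frequent itemsets. The enumeration is exponential in the number of distinct items and reads all transactions repeatedly, which is the kind of cost a single-pass, bounded-memory stream algorithm is meant to avoid.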

Files in this item:
File: ce07ics002004000095.pdf (423.67 kB, Adobe PDF)