
Global Functions

Global functions.

Data Structures

class  ResizableArray
 A data structure that provides a resizable array. It is used within SPAM to avoid reading input data files multiple times, without having to use any STL classes.

class  StringMap
 A data structure that acts as a two-way, one-to-one mapping between integers and strings. It is optimized for SPAM to quickly retrieve customer IDs, transaction IDs, and item IDs from their corresponding string names, and vice versa.


Functions

DatasetInfo* ReadDataset (bool isBinaryFile, bool isStringFile, char *filename, double minSupPercent, StringMap *&custStrMap, StringMap *&transStrMap, StringMap *&itemStrMap)
 It reads the input file and finds the frequent-1 itemsets.

int compare (const void *arg1, const void *arg2)
void LogStdoutSequence (const int c)
void LogFileSequence (const int c)
void LogSequence (const int c)
void CreateOrBitmap (SeqBitmap **f1, int *indexList, int indexLength, SeqBitmap *&orBitmap)
 ORs all of the frequent-1 itemset bitmaps together. This is used to create the refBitmap for bitmap compression.

void Compress (SeqBitmap *refBitmap, SeqBitmap *tempAndBitmap, SeqBitmap *&returnBitmap, SeqBitmap **f1, SeqBitmap **newF1, int *indexList, int indexLength)
 Perform compression on a sequence bitmap.

void FindSequentialPatterns (TreeNode *curNode)
 A recursive call that goes down the search lattice to find sequential patterns.

void StartMining (DatasetInfo *info)
 Start the mining algorithm by generating the initial TreeNode to start recursing from.

void PrintError ()
int main (int argc, char **argv)

Detailed Description

Global functions.


Function Documentation

int compare (const void *arg1, const void *arg2)

Definition at line 140 of file Spam.cpp.

Referenced by FindSequentialPatterns() and StartMining().
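
The documentation does not say what compare orders, but its signature matches the comparator that the C library's qsort expects. The following is a minimal sketch of such a comparator, assuming the sorted elements are plain int values in ascending order. The names compareInts and sortAscending are hypothetical and are not part of SPAM.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical comparator with the same shape as compare(). It assumes the
// elements being sorted are plain ints, which the documentation does not confirm.
int compareInts(const void *arg1, const void *arg2)
{
    const int a = *static_cast<const int *>(arg1);
    const int b = *static_cast<const int *>(arg2);
    // Returns -1, 0, or 1 without the overflow risk of computing a - b.
    return (a > b) - (a < b);
}

// Sorts an int array in ascending order using std::qsort.
void sortAscending(int *values, int length)
{
    assert(length >= 0);
    std::qsort(values, static_cast<std::size_t>(length), sizeof(int), compareInts);
}
```

Calling sortAscending on an array such as {5, -2, 9, 0} leaves it ordered from smallest to largest.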

void Compress (SeqBitmap *refBitmap, SeqBitmap *tempAndBitmap, SeqBitmap *&returnBitmap, SeqBitmap **f1, SeqBitmap **newF1, int *indexList, int indexLength)

Perform compression on a sequence bitmap.

Compression converts as many Bitmap64 bits as possible to Bitmap32 bits, Bitmap32 to Bitmap16, and so on, with the goal of speeding up support counting. We can only get rid of a given bit in a bitmap if it will never be used further on down the tree. The refBitmap parameter should contain a 0 for every bit that is never again used, and a 1 for a bit that is used again. Note that the frequent-1 itemset bitmaps are also compressed so that their bits are still aligned with the sequence bitmap after compression.

Parameters:
refBitmap Bitmap specifying which bits are compressible
tempAndBitmap The sequence bitmap to compress
returnBitmap The compressed sequence bitmap
f1 Frequent-1 itemset bitmaps prior to compression
newF1 The compressed versions of the frequent-1 itemset bitmaps
indexList The list of item names for the frequent-1 itemsets that are still frequent
indexLength The length of indexList

Definition at line 481 of file Spam.cpp.

Referenced by FindSequentialPatterns().
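
The idea behind Compress can be illustrated on a single 64-bit word. This is a simplified sketch, not the SPAM implementation: the layout of SeqBitmap is not described here, so a bare std::uint64_t stands in for a bitmap, and the names PackKeptBits, CountKeptBits, and FitsInWidth are hypothetical. Bits marked 1 in the reference word are kept and packed toward the low end; bits marked 0 are dropped.

```cpp
#include <cassert>
#include <cstdint>

// Packs the bits of `source` whose positions are set in `ref` (1 = keep)
// into the low-order bits of the result, preserving their relative order.
std::uint64_t PackKeptBits(std::uint64_t source, std::uint64_t ref)
{
    std::uint64_t packed = 0;
    int outPos = 0;
    for (int bit = 0; bit < 64; ++bit) {
        if ((ref >> bit) & 1u) {
            packed |= ((source >> bit) & 1u) << outPos;
            ++outPos;
        }
    }
    return packed;
}

// Counts how many bit positions `ref` marks as still in use.
int CountKeptBits(std::uint64_t ref)
{
    int count = 0;
    while (ref != 0) {
        ref &= ref - 1;   // clears the lowest set bit
        ++count;
    }
    return count;
}

// Returns true if the bits kept by `ref` would fit in a bitmap of `width` bits.
bool FitsInWidth(std::uint64_t ref, int width)
{
    assert(width > 0 && width <= 64);
    return CountKeptBits(ref) <= width;
}
```

Packing the sequence bitmap and every frequent-1 itemset bitmap with the same reference word keeps their bit positions aligned, which is why the description above says the frequent-1 bitmaps are compressed as well.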

void CreateOrBitmap (SeqBitmap **f1, int *indexList, int indexLength, SeqBitmap *&orBitmap)

ORs all of the frequent-1 itemset bitmaps together. This is used to create the refBitmap for bitmap compression.

Parameters:
f1 The frequent-1 itemset bitmaps
indexList The list of item names for the frequent-1 itemsets that are still frequent
indexLength The length of indexList
orBitmap [output] The OR'd bitmap. This tells us which bits can be used in operations further down this branch of the sequence tree.

Definition at line 449 of file Spam.cpp.

Referenced by FindSequentialPatterns().
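
The OR step described for CreateOrBitmap can be sketched as follows. This is an illustration under stated assumptions, not the SPAM code: SimpleBitmap is a hypothetical stand-in for SeqBitmap with a fixed number of 64-bit words, OrFrequentBitmaps is a hypothetical name, and the entries of indexList are assumed to index directly into f1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for SeqBitmap: a fixed-length array of 64-bit words.
struct SimpleBitmap {
    enum { kWords = 2 };
    std::uint64_t words[kWords];
};

// ORs together the bitmaps f1[indexList[0]] .. f1[indexList[indexLength - 1]]
// and stores the result in orBitmap. A bit that is 0 in the result is 0 in
// every selected bitmap.
void OrFrequentBitmaps(SimpleBitmap **f1, const int *indexList, int indexLength,
                       SimpleBitmap &orBitmap)
{
    assert(indexLength >= 0);
    for (int w = 0; w < SimpleBitmap::kWords; ++w)
        orBitmap.words[w] = 0;
    for (int i = 0; i < indexLength; ++i) {
        const SimpleBitmap *bitmap = f1[indexList[i]];
        for (int w = 0; w < SimpleBitmap::kWords; ++w)
            orBitmap.words[w] |= bitmap->words[w];
    }
}
```

Only the bitmaps named in indexList contribute, so items that are no longer frequent on this branch do not keep their bits alive in the result.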

void FindSequentialPatterns (TreeNode *curNode)

A recursive call that goes down the search lattice to find sequential patterns.

Parameters:
curNode information about the current node

Definition at line 681 of file Spam.cpp.

Referenced by StartMining().

void LogFileSequence (const int c)

Definition at line 297 of file Spam.cpp.

Referenced by LogSequence().

void LogSequence (const int c)

Definition at line 149 of file Spam.cpp.

Referenced by FindSequentialPatterns() and StartMining().

void LogStdoutSequence (const int c)

Definition at line 158 of file Spam.cpp.

Referenced by LogSequence().

int main (int argc, char **argv)

Definition at line 1285 of file Spam.cpp.

void PrintError ()

Definition at line 1246 of file Spam.cpp.

Referenced by main().

DatasetInfo* ReadDataset (bool isBinaryFile, bool isStringFile, char *filename, double minSupPercent, StringMap *&custStrMap, StringMap *&transStrMap, StringMap *&itemStrMap)

Reads the input file and finds the frequent-1 itemsets.

Parameters:
isBinaryFile whether the input file is a binary data file
isStringFile whether the input file contains string names rather than integer IDs
filename the filename of the data file
minSupPercent the minimum support percentage
custStrMap [output] Maps cust IDs to strings (only used when isStringFile == true)
transStrMap [output] Maps trans IDs to strings (only used when isStringFile == true)
itemStrMap [output] Maps item IDs to strings (only used when isStringFile == true)
Returns:
DatasetInfo - the information gathered from the dataset

Definition at line 877 of file FileInput.cpp.

Referenced by main().
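
How a minimum support value selects the frequent-1 itemsets can be sketched as below. This is not the ReadDataset implementation. It assumes the support value is a fraction between 0 and 1, that support is counted as the number of customers whose data contains the item, and that the threshold is rounded up; none of these is confirmed by the documentation. The names MinSupportCount and FindFrequentItems are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Smallest number of customers an item must appear in to count as frequent.
// Assumes minSupFraction is a fraction in [0, 1] and rounds up.
int MinSupportCount(double minSupFraction, int numCustomers)
{
    assert(minSupFraction >= 0.0 && minSupFraction <= 1.0);
    assert(numCustomers >= 0);
    return static_cast<int>(std::ceil(minSupFraction * numCustomers));
}

// custCountPerItem[i] holds the number of customers whose data contains item i.
// Writes the IDs of the frequent items to frequentItems (which must have room
// for numItems entries) and returns how many were written.
int FindFrequentItems(const int *custCountPerItem, int numItems, int numCustomers,
                      double minSupFraction, int *frequentItems)
{
    const int threshold = MinSupportCount(minSupFraction, numCustomers);
    int found = 0;
    for (int item = 0; item < numItems; ++item) {
        if (custCountPerItem[item] >= threshold)
            frequentItems[found++] = item;
    }
    return found;
}
```

With four customers and a minimum support of 0.5, an item must appear for at least two customers to be kept.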

void StartMining (DatasetInfo *info)

Start the mining algorithm by generating the initial TreeNode to start recursing from.

Parameters:
info information about the data set

Definition at line 1163 of file Spam.cpp.

Referenced by main().


Generated on Thu Mar 11 12:01:54 2004 for SPAM by doxygen 1.3.4