Nt1310 Unit 7 Data Analysis Paper

The KDDCup99 dataset was introduced at the Third International Knowledge Discovery and Data Mining Tools Competition, held in 1999. KDDCup99 is a refined version of the DARPA 1998 dataset that contains only network data [3]. It is commonly used by developers and implementers of new intrusion detection systems (IDS) to evaluate their systems. An IDS takes the KDDCup99 dataset as input to train and test the system and to check its performance in classifying and detecting attack records. The dataset is widely used by researchers because it contains 22 different attack types, which can be classified into the four main network attack categories discussed in the previous section. The full dataset consists of approximately 4,900,000 connection vectors. Each connection vector has 41 features and is labeled either as normal or as an attack, with exactly one specific attack type [38]. Of the 41 features of a connection, only sixteen significant attributes are considered: A1, A5, A6, A8, A9, A10, A11, A13, A16, A17, A18, A19, A23, A24, A32, A33 [38]. The KDD 99 …
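The record layout described above (41 features followed by one label, with sixteen attributes kept) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: that each record is one comma-separated line, and that "A1" through "A41" number the features in the order they appear in the record. The function names and the placeholder record are hypothetical and are not taken from the cited work.

```python
# The sixteen attributes listed in the text, as 1-based feature numbers.
# Assumption: "A<n>" means the n-th of the 41 features in record order.
SELECTED = [1, 5, 6, 8, 9, 10, 11, 13, 16, 17, 18, 19, 23, 24, 32, 33]
NUM_FEATURES = 41


def split_record(line):
    """Split one comma-separated connection record into (features, label)."""
    fields = line.strip().split(",")
    if len(fields) != NUM_FEATURES + 1:
        raise ValueError(f"expected {NUM_FEATURES + 1} fields, got {len(fields)}")
    return fields[:NUM_FEATURES], fields[NUM_FEATURES]


def select_features(features):
    """Keep only the sixteen selected attributes (1-based to 0-based index)."""
    return [features[i - 1] for i in SELECTED]


# Placeholder record: v1..v41 stand in for real feature values.
demo_line = ",".join([f"v{i}" for i in range(1, NUM_FEATURES + 1)] + ["normal"])
demo_features, demo_label = split_record(demo_line)
demo_selected = select_features(demo_features)
```

With the placeholder record, `demo_selected` holds the sixteen values at the positions named in the text and `demo_label` holds the class label. A real run would read lines from the dataset file instead.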

IMPLEMENTATION PROCEDURE

In my implementation of the fuzzy genetic algorithm, I followed the steps of a genetic algorithm. I started by reading the KDDCup99 data and creating a population. The population is composed of individuals, which are the records in KDDCup99, so each individual has an array of genes that holds the features of an audit record. To build it, I first encoded the audit record data into binary, because some features are symbolic: protocol type, for example, has the value "TCP". Once I had created the initial population, I evaluated every individual in it to calculate its fitness using the function below, where:

Cur: value of a feature of the current individual
Max: maximum fitness value of an individual in the population
Min: minimum fitness value of an individual in
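The encoding and population steps described above might look roughly like the following Python sketch. Several details are assumptions made for illustration only: the 8-bit gene width, the way symbolic values such as "tcp" are mapped to integer codes, the three-column made-up records, and all function names. The fitness function from the text is not reproduced here, so `evaluate` takes a caller-supplied function, and the demo uses a trivial placeholder (the count of 1-bits) rather than the actual formula.

```python
BITS = 8  # assumed gene width; the text does not specify one


def build_codes(records, column):
    """Map each distinct symbolic value in a column to a small integer."""
    values = sorted({rec[column] for rec in records})
    return {value: code for code, value in enumerate(values)}


def to_bits(number, width=BITS):
    """Encode a non-negative integer as a fixed-width list of 0/1 genes."""
    number = min(number, 2 ** width - 1)  # clamp values that do not fit
    return [int(bit) for bit in format(number, f"0{width}b")]


def encode_record(record, codes):
    """Turn one record (a list of strings) into a flat list of binary genes."""
    genes = []
    for column, value in enumerate(record):
        if column in codes:   # symbolic feature such as "tcp"
            genes.extend(to_bits(codes[column][value]))
        else:                 # numeric feature
            genes.extend(to_bits(int(float(value))))
    return genes


def create_population(records, symbolic_columns):
    """One individual per record; each individual is an array of genes."""
    codes = {col: build_codes(records, col) for col in symbolic_columns}
    return [encode_record(rec, codes) for rec in records]


def evaluate(population, fitness_fn):
    """Score every individual with a caller-supplied fitness function."""
    return [fitness_fn(individual) for individual in population]


# Tiny made-up records: (duration, protocol type, source bytes).
records = [["0", "tcp", "181"], ["2", "udp", "45"], ["0", "icmp", "300"]]
population = create_population(records, symbolic_columns=[1])
scores = evaluate(population, fitness_fn=sum)  # placeholder fitness only
```

Keeping the fitness function as a parameter means the encoding and population code does not change when the placeholder is swapped for the real function.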
