This project forked from lucasduete/kdd99_feature_extractor

Utility for extraction of subset of KDD '99 features from realtime network traffic or .pcap file

License: MIT License

kdd99_feature_extractor

Utility for extraction of a subset of KDD '99 features [1] from realtime network traffic or a .pcap file. This utility is a part of our project at the University of Bergen.

Some features might not be calculated in exactly the same way as in KDD, because no documentation explaining the details of the KDD implementation was found. The algorithms are based on articles [2][3] and on observation of values in the KDD dataset.

Features in KDD should be the same as features introduced by Lee & Stolfo in their work [2].

Status

  • The current version is not guaranteed to match KDD '99 exactly: some features might be calculated with slightly different algorithms than those used for the KDD '99 dataset and by Lee & Stolfo. However, it is suitable for educational purposes.
  • Compiled & tested in the following environments:
    • Windows 7 x64, MSVC 2015 (14), WinPcap 4.1.3
    • Windows 7 x64, MSVC 2013 (12), WinPcap 4.1.3
    • Ubuntu 12.04 x64, gcc 4.6.3, libpcap 4.2

Features

  • Subset of KDD '99 features [1]
    • Content features (columns 10-22 of KDD) are not included
  • Optional extra features - IP addresses, ports, timestamp of last packet (option -e)
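To make the subset concrete, the sketch below lists the 41 feature names from the KDD Cup 1999 documentation [1] and drops the content features (columns 10-22). It only illustrates KDD's column numbering; the helper name `extracted_subset` is hypothetical, and the column order of this utility's actual output is an assumption not confirmed by this readme.

```python
# Feature names for columns 1-41, as listed in the KDD Cup 1999 documentation.
KDD_FEATURES = [
    "duration", "protocol_type", "service", "flag", "src_bytes",
    "dst_bytes", "land", "wrong_fragment", "urgent",
    # Columns 10-22: content features (not extracted by this utility)
    "hot", "num_failed_logins", "logged_in", "num_compromised",
    "root_shell", "su_attempted", "num_root", "num_file_creations",
    "num_shells", "num_access_files", "num_outbound_cmds",
    "is_host_login", "is_guest_login",
    # Columns 23-31: time-based traffic features
    "count", "srv_count", "serror_rate", "srv_serror_rate",
    "rerror_rate", "srv_rerror_rate", "same_srv_rate", "diff_srv_rate",
    "srv_diff_host_rate",
    # Columns 32-41: host-based traffic features
    "dst_host_count", "dst_host_srv_count", "dst_host_same_srv_rate",
    "dst_host_diff_srv_rate", "dst_host_same_src_port_rate",
    "dst_host_srv_diff_host_rate", "dst_host_serror_rate",
    "dst_host_srv_serror_rate", "dst_host_rerror_rate",
    "dst_host_srv_rerror_rate",
]

# 1-based column numbers of the content features that are left out.
CONTENT_COLUMNS = range(10, 23)


def extracted_subset(names=KDD_FEATURES):
    """Return (column_number, name) pairs for every non-content feature."""
    return [
        (number, name)
        for number, name in enumerate(names, start=1)
        if number not in CONTENT_COLUMNS
    ]


subset = extracted_subset()
```

With the 13 content features removed, `subset` holds 28 entries: columns 1-9 followed by columns 23-41.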

Main components

  1. Sniffer
    • Network traffic sniffer & frame parser
  2. IP reassembler
    • Only IP header "summaries"
    • Payload is not reassembled (content features are not extracted, so it is not needed)
  3. Connection/Conversation reconstructor
    • Reconstructs conversations
    • Computes intrinsic features (columns 1-9 of KDD)
  4. Statistical engine
    • Computes derived features (columns 23-41 of KDD)
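As an illustration of what a derived feature looks like, the sketch below computes two of KDD's time-based features, `count` (connections to the same destination host in the past two seconds) and `srv_count` (connections to the same service in the past two seconds). It is a minimal sketch and not this utility's implementation: the function name, the record layout, the inclusion of the current connection in its own count, and the handling of connections exactly two seconds old are all assumptions.

```python
from collections import deque

# KDD's time-based traffic features look back over a two-second window.
WINDOW_SECONDS = 2.0


def time_window_counts(connections):
    """Compute (count, srv_count) for each connection record.

    `connections` is an iterable of (timestamp, dst_host, service) tuples
    sorted by timestamp. Assumption: the current connection is counted too.
    """
    window = deque()  # connections seen within the last WINDOW_SECONDS
    results = []
    for timestamp, dst_host, service in connections:
        # Drop connections that fell out of the two-second window.
        while window and window[0][0] < timestamp - WINDOW_SECONDS:
            window.popleft()
        window.append((timestamp, dst_host, service))
        count = sum(1 for _, host, _ in window if host == dst_host)
        srv_count = sum(1 for _, _, srv in window if srv == service)
        results.append((count, srv_count))
    return results


example = time_window_counts([
    (0.0, "10.0.0.1", "http"),
    (0.5, "10.0.0.1", "ftp"),
    (1.0, "10.0.0.2", "http"),
    (3.5, "10.0.0.1", "http"),
])
# example == [(1, 1), (2, 1), (1, 2), (1, 1)]
```

The last record sees an empty history because all earlier connections are more than two seconds old. The scan over the window is linear per connection, which keeps the sketch short; per-host and per-service counters would avoid the rescans.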

Build instructions for Linux (tested on Ubuntu)

  1. Create a folder for temporary build files
    mkdir build-files

  2. Enter the folder and generate the build cache
    cd build-files
    cmake -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" ..

  3. Leave the build folder and compile the project
    cd ..
    cmake --build ./build-files --target kdd99extractor -- -j 4

  4. The compiled binary is located at:
    build-files/src/kdd99extractor

Planned sections in this readme

  • TODOs (e.g. IP checksum checking not implemented)
  • Known/possible problems, bugs & limitations

Main sources of feature documentation

[1] KDD Cup 1999 Data, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

[2] Lee, W. & Stolfo, S. J. (2000), 'A framework for constructing features and models for intrusion detection systems', ACM Transactions on Information and System Security 3 (4), 227-261.

[3] Dybey, D. & Dubey, J. (2014), 'A Survey Intrusion Detection with KDD99 Cup Dataset', International Journal of Computer Science and Information Technology Research 2 (3), 146-157.

Contributors

bittomix, zhiyuan-liao, lucasduete
