pcap-index's Introduction

UPDATE - APR 2012
It looks like your best bet for "production" performance (not that this was ever anywhere near prod...) is cxtracker (https://github.com/gamelinux/cxtracker). It turns out my approach of indexing every packet wasn't a winning one :) But I'll leave this code up as it might be interesting to someone else, especially the pyparsing code.


OVERVIEW
This is a set of scripts that index a PCAP file into a SQLite3 DB so that you can recover packets from it VERY quickly.
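
To make the idea concrete, here is a minimal Python sketch of per-packet indexing. It is not the code in this repo (the real indexer is index_pkts.c). The table name, the columns and the function names are assumptions made up for illustration, and it only stores the timestamp, file offset and captured length, whereas the real index also needs src, dst, ports and so on. It assumes a classic (non-pcapng) capture file.

```python
import sqlite3
import struct

PCAP_GLOBAL_HDR = 24   # size of the classic pcap file header
PCAP_RECORD_HDR = 16   # size of each per-packet record header


def index_pcap(pcap_path, db_path):
    """Store (time, offset, caplen) for every packet record in the file."""
    db = sqlite3.connect(db_path)
    # Hypothetical schema, for illustration only.
    db.execute("CREATE TABLE IF NOT EXISTS pkts "
               "(time INTEGER, offset INTEGER, caplen INTEGER)")
    rows = []
    with open(pcap_path, "rb") as f:
        header = f.read(PCAP_GLOBAL_HDR)
        if len(header) < PCAP_GLOBAL_HDR:
            raise ValueError("file too short to be a pcap")
        # The magic number tells us the byte order the file was written in.
        magic = struct.unpack("<I", header[:4])[0]
        if magic == 0xA1B2C3D4:
            endian = "<"
        elif magic == 0xD4C3B2A1:
            endian = ">"
        else:
            raise ValueError("not a classic microsecond pcap file")
        while True:
            offset = f.tell()
            rec = f.read(PCAP_RECORD_HDR)
            if len(rec) < PCAP_RECORD_HDR:
                break
            ts_sec, _ts_usec, caplen, _origlen = struct.unpack(endian + "IIII", rec)
            rows.append((ts_sec, offset, caplen))
            f.seek(caplen, 1)  # skip the packet bytes, we only need the offset
    db.executemany("INSERT INTO pkts VALUES (?, ?, ?)", rows)
    db.execute("CREATE INDEX IF NOT EXISTS pkts_time ON pkts (time)")
    db.commit()
    return db


def fetch_packets(pcap_path, db, where="1", params=()):
    """Yield raw records (record header + data) for rows matching `where`."""
    query = "SELECT offset, caplen FROM pkts WHERE " + where + " ORDER BY offset"
    with open(pcap_path, "rb") as f:
        for offset, caplen in db.execute(query, params):
            f.seek(offset)
            yield f.read(PCAP_RECORD_HDR + caplen)
```

Recovery is fast because each hit is a single seek into the original file. To turn the yielded records back into a valid PCAP, you would write the original 24-byte file header first and then the records.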


WHY
Going through PCAPs with tcpdump is slow. Precious minutes eat away at concentration.


HISTORY
The first results for PCAP indexing on Google are:
 - http://geek00l.blogspot.com/2008/03/sancp-pcap-index.html
 - http://blog.vorant.com/2008/04/pcap-indexing.html
This is the same idea I followed, and it showed very good speed improvements. The problem was too much coupling. SANCP is a nice tool, but I didn't want to couple a metadata extraction tool to direct packet recovery. One of the problems is SANCP's sessionizing: I'm not sure how good it is, as I've seen it do some very odd things when analyzing PCAP files.

Started this project because I was bored during a few presentations :) 


BUILDING
The only thing that needs to be built is the index_pkts.c file. Simply run:

	$ gcc -o index_pkts index_pkts.c -lpcap
	
	
USAGE
To index a pcap file
	$ /path/to/pcap_index/index_pkts.sh /path/to/pcap_file /path/to/new/sqlite_db
	
To retrieve packets
	$ /path/to/pcap_index/get_pkts.py -s /path/to/sqlite_db -f "src = 1.2.3.4 and dst = 5.6.7.8 and time < 6-apr-2009 12:03:05 and time > 6-apr-2009 12:03:00"
	
The -f flag sets the database filter. It's pretty free form: you can bracket expressions and combine them with and/or/not. A time span isn't required. If the -w flag isn't given, the PCAP data is written to stdout.

Possible filter keywords:
- time 
- src
- sport
- dst 
- dport
- ether_type
- ip_proto

Arguments to time, ether_type, ip_proto, sport and dport can be integers. They must be decimal; I haven't built it to parse hex.
Arguments to src and dst can be IPv4 or IPv6 addresses. For packets without an IP layer, you can recover them using their MAC addresses as src/dst.
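
For the curious, here is a rough sketch of how a filter string like the one above could be turned into a parameterized SQL WHERE clause. This is NOT the pyparsing grammar used by get_pkts.py; it is a simplified stand-in written with the standard library only. The function names, the assumption that column names match the filter keywords, the assumption that addresses are stored as text, and the choice to read dates as UTC are all mine. It doesn't check that brackets balance; SQLite would reject a malformed clause anyway.

```python
import calendar
import re
import time

INT_KEYS = {"sport", "dport", "ether_type", "ip_proto"}
ADDR_KEYS = {"src", "dst"}
ALL_KEYS = INT_KEYS | ADDR_KEYS | {"time"}
TOKEN = re.compile(r"\s*([()<>=]|[^\s()<>=]+)")


def tokenize(text):
    """Split a filter into brackets, operators and bare words."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def filter_to_sql(text):
    """Return (where_clause, params) for a filter string."""
    tokens = tokenize(text)
    sql, params, i = [], [], 0
    while i < len(tokens):
        tok = tokens[i].lower()
        if tok in ("(", ")", "and", "or", "not"):
            sql.append(tok.upper())
            i += 1
            continue
        if tok not in ALL_KEYS:
            raise ValueError("unknown keyword: " + tokens[i])
        if i + 2 >= len(tokens) or tokens[i + 1] not in ("=", "<", ">"):
            raise ValueError("expected '<keyword> <op> <value>' at " + tokens[i])
        op, value = tokens[i + 1], tokens[i + 2]
        i += 3
        if tok == "time" and not value.isdigit():
            # A date such as 6-apr-2009 is followed by a clock time token.
            if i >= len(tokens):
                raise ValueError("time needs both a date and a clock time")
            stamp = value + " " + tokens[i]
            i += 1
            # Assumption: dates are interpreted as UTC.
            value = calendar.timegm(time.strptime(stamp, "%d-%b-%Y %H:%M:%S"))
        elif tok in INT_KEYS or tok == "time":
            value = int(value)  # decimal only, as noted above
        sql.append(f"{tok} {op} ?")
        params.append(value)
    return " ".join(sql), params
```

Keeping the values out of the SQL text and passing them as parameters means the filter string can't inject arbitrary SQL into the query.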

The major filter option missing is a BPF-like 'host' option that checks for both src and dst. You can still do this manually by plugging in '(src = x and dst = y) or (src = y and dst = x)'.
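
If you build filters from a script, a tiny helper can do the expansion for you. This is only a sketch: host_filter is a made-up name, not something get_pkts.py provides, and the single-host form (src or dst) is my reading of what a BPF-like 'host' would mean here.

```python
def host_filter(x, y=None):
    """Build a -f filter for traffic involving host x, or between x and y."""
    if y is None:
        # One host: match it as either source or destination.
        return f"(src = {x} or dst = {x})"
    # Two hosts: match traffic in both directions between them.
    return f"((src = {x} and dst = {y}) or (src = {y} and dst = {x}))"
```

The returned string can be passed as the argument to -f, on its own or combined with other terms such as a time span.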


STATS
For a 13GB PCAP, it takes 2h 36m to create the index. SQLite DB file size: 4.6GB. 9s to recover 3k pkts between two IPs.
For a 1.8GB PCAP, it takes 2m 48s to create the index. SQLite DB file size: 764MB. A couple of seconds to recover approx 2k pkts between two hosts.


FUTURE WORK
This would likely be faster and more efficient using MySQL rather than SQLite. I avoided this for the first version in order to make it portable.

It would also be neat to integrate this into something like OpenFPC, I think.
