ymirsky / kitsune-py
A network intrusion detection system based on incremental statistics (AfterImage) and an ensemble of autoencoders (KitNET)
License: MIT License
Greetings YisroelMirsky,
I wish to use your datasets as input to my models. However, looking at the I/O graphs of the captured pcap files, I found that there are no spikes of attack packets after the first million packets in the following dataset (I downloaded all 9 pcaps from Google Drive in your GitHub Kitsune project):
In the SSL renegotiation pcap:
As can be seen, after the first million packets, there is no significant rise in the SSL filter line.
There is also no abnormal behavior in the UDP filter line; I presume that in the SSDP flood attack, UDP packets are the attack vector. (The abnormal behavior of UDP packets does not appear until the very end, after around 2,621,185 packets.)
Do I understand your statement that "clean network traffic was captured for the first 1 million packets" correctly? Or am I missing something?
Thanks,
Hieu
I ran an example of this project using Ubuntu 18, Python 3.6.5, and got the following result. However, this result graph is not consistent with the result graph you shared: it does not reflect the attack, and the data under attack appears to be drawn in the wrong place. I would appreciate it if you could tell me why, and how to get the right result!
Has anybody implemented this in pure C or C++?
I am having trouble understanding the code of KitNET.
I'm using network packets from my Elasticsearch instance, and I want to know how to fit my own features to this.
Is it possible to use data other than a pcap file, or is pcap necessary?
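On feeding your own features: the KitNET class in this repo can be used on its own, and its process(x) method takes one feature vector and returns an RMSE anomaly score. The sketch below uses a self-contained stand-in detector so it runs on its own; the real import and constructor arguments (noted in the comment) are an assumption to verify against the repo:

```python
import math

class StubDetector:
    """Stand-in for KitNET. Real usage (assumption, check KitNET/KitNET.py) would be:
        from KitNET.KitNET import KitNET
        detector = KitNET(n_features, ...)
    where detector.process(x) returns an RMSE anomaly score."""
    def process(self, x):
        # Toy score: root-mean-square of the feature vector.
        return math.sqrt(sum(v * v for v in x) / len(x))

# Feature vectors from any source (Elasticsearch, CSV, ...), one per record.
rows = [[0.1, 0.2, 0.1], [5.0, 4.8, 5.2]]
detector = StubDetector()
scores = [detector.process(r) for r in rows]
```

The point is only the control flow: as long as each record becomes a fixed-length numeric vector, the pcap feature extractor can be bypassed entirely.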
Does Kitsune continuously train the model while it executes anomaly detection, or does it train only once, during the training period? If not, why?
Note that we currently only look for tshark in the Windows directory "C:\Program Files\Wireshark\tshark.exe"
How can I change the code to look for tshark in the Linux directory /usr/bin/tshark?
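One portable way to replace the hard-coded Windows path is to resolve tshark from the PATH and fall back to known install locations. This is a hedged sketch, not the project's code; `shutil.which` is standard library:

```python
import os
import shutil

def find_tshark():
    # Prefer whatever 'tshark' is on the PATH (covers /usr/bin/tshark on Linux).
    path = shutil.which("tshark")
    if path:
        return path
    # Fall back to common install locations on other platforms.
    for candidate in (r"C:\Program Files\Wireshark\tshark.exe",
                      "/usr/local/bin/tshark"):
        if os.path.isfile(candidate):
            return candidate
    return None
```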
Compilation fails on Windows with 'C:\MinGW\bin\gcc.exe' failed with exit status 1
In the code, FMgrace is set to 5000, which is for 100 features. If I reduce the number of features, would it be better to reduce FMgrace or to increase it?
How do I display the results, and how do I know which traffic is malicious?
How can I determine whether a packet in a pcap file is malicious?
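On deciding which traffic is malicious: Kitsune only outputs an RMSE score per packet, so a common (but not prescribed by the repo) approach is to pick a threshold from the scores seen during the benign grace period and flag anything above it. A minimal sketch with made-up scores:

```python
# RMSE scores as Kitsune would emit them, one per packet (toy values).
train_scores = [0.10, 0.12, 0.09, 0.11, 0.13]   # benign/grace period
test_scores  = [0.11, 0.95, 0.10, 1.40]          # live traffic

# Threshold heuristic: max benign score with a safety margin. A high
# percentile of the training scores is another common choice; neither
# is mandated by the paper.
threshold = max(train_scores) * 2.0

# True where the packet's score exceeds the benign-derived threshold.
flags = [s > threshold for s in test_scores]
```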
I am still having this issue
Please, can anyone help?
Originally posted by @ymirsky in #4 (comment)
Hi, I have the following in the virtualenv
Package Version
Cython 0.29.6
numpy 1.16.2
pip 19.0.3
scapy 2.4.2
setuptools 40.9.0
wheel 0.33.1
When I run it, I get the following error:

Importing AfterImage Cython Library
Importing Scapy Library
Traceback (most recent call last):
  File "example.py", line 1, in <module>
    from Kitsune import Kitsune
  File "/home/pi/vir1/Kitsune-py/Kitsune.py", line 2, in <module>
    from KitNET.KitNET import KitNET
  File "/home/pi/vir1/Kitsune-py/KitNET/KitNET.py", line 2, in <module>
    import KitNET.dA as AE
ImportError: No module named dA
Please advise, thanks.
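That ImportError usually means the KitNET package is not resolvable from where the script runs (and the Python 2-style message "No module named dA" suggests the script may have been launched with python rather than python3). Assuming the search path is the problem, one hedged workaround is to put the repository root on sys.path before the imports; REPO_ROOT below is a placeholder to adjust:

```python
import os
import sys

# Hypothetical location; point this at your Kitsune-py checkout.
REPO_ROOT = os.path.expanduser("~/vir1/Kitsune-py")

# With the repo root on sys.path, "import KitNET.dA" can resolve
# regardless of the current working directory.
if REPO_ROOT not in sys.path:
    sys.path.insert(0, REPO_ROOT)
```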
In both AfterImage files, in the function getHeaders_2D(), we see this line:
hdrs = incStat_cov(incStat(Lambda,IDs[0]),incStat(Lambda,IDs[0]),Lambda).getHeaders(ver,suffix=False)
I think the second index should be 1, like this:
hdrs = incStat_cov(incStat(Lambda,IDs[0]),incStat(Lambda,IDs[1]),Lambda).getHeaders(ver,suffix=False)
The dataset on Google Drive is empty. Where can I download the dataset used in the paper?
Hello,
I am getting overflow warnings in the sigmoid function in the utils file. This gives me very large results for the RMSEs. The error only starts after 'training' is done and after the first error appears it keeps happening more and more often. I have changed the code a little so that instead of getting packets from a file, the FE receives them from a live capture. Before the error appears, the training seems to be going ok. Any ideas as to what could be going wrong? Could this be an issue with the way I parse the live capture packets?
Edit: Could this be happening because I am not reducing the value of my max int by a factor of 10? I just noticed this in the prep() method in the FE.
Thank you,
Miguel
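On the overflow itself: exp(-x) in a naive sigmoid overflows for large-magnitude inputs, which is exactly what very large RMSEs would trigger. A common, numerically safe variant clamps the argument first; this is a generic sketch, not the repo's utils code:

```python
import math

def stable_sigmoid(x):
    # Clamp the argument so math.exp never overflows (exp(709) is near
    # the float64 limit; 500 leaves comfortable headroom).
    x = max(-500.0, min(500.0, x))
    return 1.0 / (1.0 + math.exp(-x))
```

The clamp does not change any value a well-scaled input would produce, since sigmoid saturates to 0 or 1 long before |x| = 500.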
I don't see any call to cleanOutOldRecords in the code. When should it be called, and with what values?
In AfterImage.pyx, in the Queue class:
cdef insert(self,double v):
self.q[self.indx] = v
self.indx = (self.indx + 1) % 3
self.n += 1
cdef get_last(self):
return self.q[self.indx]
When n is 1 or 2, calling get_last always returns 0.
When n is 3 (or more), calling get_last returns the first element in the queue, not the last.
I'd like to ask whether it should instead be written as: return self.q[(self.indx-1)%3]
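A pure-Python re-creation of that 3-slot ring buffer makes the off-by-one easy to check; the proposed fix reads one slot behind the insertion cursor. This mirrors the Cython class but is not the repo's code:

```python
class Queue3:
    def __init__(self):
        self.q = [0.0, 0.0, 0.0]
        self.indx = 0
        self.n = 0

    def insert(self, v):
        self.q[self.indx] = v
        self.indx = (self.indx + 1) % 3
        self.n += 1

    def get_last_buggy(self):
        # Original: indx has already advanced past the newest element,
        # so this reads the oldest slot (or an empty one).
        return self.q[self.indx]

    def get_last_fixed(self):
        # Proposed fix: step back one slot to the most recent insert.
        return self.q[(self.indx - 1) % 3]

q = Queue3()
for v in (1.0, 2.0, 3.0):
    q.insert(v)
# buggy -> 1.0 (the first element); fixed -> 3.0 (the last element)
```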
I have a pcap file and I want to extract it to TSV/CSV with your framework. The result should look like this:
toriimallock.csv
Could you help me, please?
I have used the code to detect anomalies on one of my own datasets. When I use the active wiretap dataset, I manage to detect the anomaly when FMgrace is 100K and ADgrace is around 800K: the plot is almost horizontal, with two anomaly spikes.
But when I run it on my own dataset, the predicted RMSE values show more of a linear trend. I'm guessing this is because of overfitting.
I would like to know which parameters need to be changed, and how, to obtain a smoother result.
Thank you!
Hey, and thanks in advance for any help. My error description is probably not very helpful and rather broad, but maybe someone has a pointer as to how I could proceed in fixing it.
I am using a pcap file of IEC 104 communication (TCP/IP packets with a custom payload, which shouldn't matter since Kitsune only uses flow data, right?), but running example.py with my own pcap gives me an array of [0.0, ..., 0.0] as a result.
Is there any obvious thing I could be missing or does anyone have an idea on where I should start looking for the issue in order to debug this?
Best Regards
Hi, I have read your paper; it is wonderful work.
I am trying to use Kitsune as an online IDS on a Jetson Nano. However, I am new to this field, and I want to know how to do online detection.
Do I need to capture packets with Wireshark first, generate a file.pcap, and then pass the file.pcap to Kitsune? Or is there a way to pass the features to Kitsune directly?
Thank you so much for publishing such a wonderful paper and code, and thank you again for any advice!
I need to use PyTorch for backpropagation to compute adversarial samples, but I found that the torch framework is not used in this code. How can it be easily and quickly ported to torch?
Hi
I have noticed that some metrics are calculated differently from what is described in the paper, in particular:
Radius: the paper defines the radius as sqrt(var1^2 + var2^2); however, in AfterImage.py, line 88, it is calculated as sqrt(var1 + var2).
Covariance: the paper defines the covariance as SR_ij/(w_i + w_j), but AfterImage.py, line 203, defines a new weight w3 and divides SR_ij by w3.
Could you please clarify why these two metrics are calculated differently?
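For reference, the two radius expressions diverge whenever the variances differ from the degenerate cases; a tiny numeric check of both formulas (generic math, not the repo's code):

```python
import math

var1, var2 = 4.0, 9.0

# Paper's definition: Euclidean norm of the two variances.
radius_paper = math.sqrt(var1**2 + var2**2)  # sqrt(16 + 81)
# Code's computation (AfterImage.py, per the report above): sum under the root.
radius_code = math.sqrt(var1 + var2)         # sqrt(13)
```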
Could I know the attributes used in the dataset?
In the paper, a window has 23 features, but in the code (netStat.py), a window has only 20 features.
I do not understand why incStatDB.update_get_1D_Stats updates all incStat.covs, and incStatDB.update_get_2D_Stats then updates one of them again, in AfterImage.py.