file-extraction's Introduction

Module for File Extraction

This is a Zeek package that provides convenient extraction of files.

As a secondary goal, this script performs additional commonly requested file extraction and logging tasks, such as naming extracted files after their calculated file checksum or naming the file with its common file extension.

Installing with zkg (preferred)

This package can be installed through the Zeek package manager (zkg) with the following commands:

zkg install zeek/hosom/file-extraction

# you must separately load the package for it to actually do anything
zkg load zeek/hosom/file-extraction

Installing manually

While not preferred, this package can also be installed manually by following the steps below:

cd <prefix>/share/zeek/site

git clone https://github.com/hosom/file-extraction file-extraction

echo "@load file-extraction" >> local.zeek

Configuration

The package installs with the extract-common-exploit-types.zeek policy; however, additional functionality may be desired.

Configuration must always be done within the config.zeek file. Failure to isolate configuration to config.zeek will result in your configuration being overwritten.
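As a rough illustration, a config.zeek might look like the sketch below. The layout follows the configuration quoted in the issues further down this page; the directory value is only an example, and the plugin selection is an assumption about what a site might want rather than a recommended default.

```zeek
# config.zeek (sketch; adjust to your site)
module FileExtraction;

# Prefix prepended to extracted file names. Example value only.
redef path = "/usr/local/zeek/extracted/";

# Choose which files to extract.
@load ./plugins/extract-common-exploit-types.zeek

# Optionally rename extracted files after their checksum.
@load ./plugins/store-files-by-sha256.zeek
```

Because the redef and the @load lines live only in config.zeek, the rest of the package stays untouched.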

Advanced Configuration

For advanced configuration of file extraction, the best option available is to hook the FileExtraction::extract hook. For examples of this, look at the scripts in the plugins directory.
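A minimal sketch of such a hook handler follows. It assumes the hook takes (f: fa_file, meta: fa_metadata) and that breaking out of the hook is what marks a file for extraction, which is how the bundled plugins appear to work; check main.zeek and the plugins directory before relying on it. The archive_types set and its contents are hypothetical.

```zeek
module FileExtraction;

# Hypothetical set of MIME types to extract; not part of the package.
const archive_types: set[string] = {
    "application/zip",
    "application/x-7z-compressed",
} &redef;

hook FileExtraction::extract(f: fa_file, meta: fa_metadata)
    {
    # Assumption: a "break" tells the package to extract this file.
    if ( meta?$mime_type && meta$mime_type in archive_types )
        break;
    }
```

Placing this in config.zeek should keep it loaded after the package's own main.zeek, where the hook is declared.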

Plugins

extract-all-files.zeek

Attaches the extract files analyzer to every file that has a mime_type detected.

extract-java.zeek

Attaches the extract files analyzer to every JNLP and Java Archive file detected.

extract-pe.zeek

Attaches the extract files analyzer to every PE file detected.

extract-ms-office.zeek

Attaches the extract files analyzer to every Microsoft Office file detected.

extract-pdf.zeek

Attaches the extract files analyzer to every PDF file detected.

extract-common-exploit-types.zeek

Loads the following plugins:

  • extract-java.zeek
  • extract-pe.zeek
  • extract-ms-office.zeek
  • extract-pdf.zeek
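An aggregate plugin like this presumably amounts to a list of @load directives. The sketch below is an assumption about its shape, not a copy of the shipped file, and it can serve as a template for a site-specific bundle.

```zeek
# Sketch of an aggregate plugin; the relative paths assume this file
# sits next to the other plugins in the plugins directory.
@load ./extract-java.zeek
@load ./extract-pe.zeek
@load ./extract-ms-office.zeek
@load ./extract-pdf.zeek
```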

store-files-by-md5.zeek

Uses file_state_remove to rename extracted files based on the md5 checksum whenever it is available.

store-files-by-sha1.zeek

Uses file_state_remove to rename extracted files based on the sha1 checksum whenever it is available.

store-files-by-sha256.zeek

Uses file_state_remove to rename extracted files based on the sha256 checksum whenever it is available.
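The renaming technique can be sketched as follows. This is an illustration under stated assumptions, not the package's actual code: it assumes the extracted name recorded in f$info$extracted is relative to FileExtract::prefix, that a hash analyzer has been attached so f$info$sha256 is populated by the time file_state_remove fires, and it drops the file extension for brevity.

```zeek
@load base/files/extract
@load base/files/hash
@load base/utils/paths

event file_state_remove(f: fa_file)
    {
    # Nothing to do unless the file was extracted and a SHA256 is available.
    if ( ! f?$info || ! f$info?$extracted || ! f$info?$sha256 )
        return;

    # Assumption: extracted names are relative to FileExtract::prefix.
    local src = build_path(FileExtract::prefix, f$info$extracted);
    local dst = build_path(FileExtract::prefix, f$info$sha256);

    if ( ! rename(src, dst) )
        Reporter::warning(fmt("could not rename %s to %s", src, dst));
    }
```

If the checksum is missing when the event runs, the file keeps its default name, which matches the "whenever it is available" wording above.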

file-extraction's People

Contributors

evoxco, hosom, jeffgeiger, justinazoff, unusedphd

file-extraction's Issues

Storing extracted files by file hash name

macOS 10.14
zeek version 3.1.2
zeek/hosom/file-extraction (installed: 2.0.3) (installed from zeek pkg manager)

Having what seems like an issue getting the plugins store-files-by-*.zeek to work.

file-extraction is configured with site/packages/file-extraction/config.zeek containing:

module FileExtraction;
redef path = "";
@load ./plugins/extract-all-files.zeek
@load ./plugins/store-files-by-sha256.zeek

site/local.zeek is set to @load packages and site/packages/packages.zeek is set to @load ./file-extraction, and I can verify files are being extracted to disk.

With a run of Zeek against a PCAP, this is the result:

$ egrep 'file-extract' loaded_scripts.log 
    /usr/local/Cellar/zeek/3.1.2/share/zeek/site/packages/file-extraction/__load__.zeek
      /usr/local/Cellar/zeek/3.1.2/share/zeek/site/packages/file-extraction/main.zeek
        /usr/local/Cellar/zeek/3.1.2/share/zeek/site/packages/file-extraction/file-extensions.zeek
      /usr/local/Cellar/zeek/3.1.2/share/zeek/site/packages/file-extraction/config.zeek
        /usr/local/Cellar/zeek/3.1.2/share/zeek/site/packages/file-extraction/plugins/extract-all-files.zeek
        /usr/local/Cellar/zeek/3.1.2/share/zeek/site/packages/file-extraction/plugins/store-files-by-sha256.zeek

$ zeek-cut < files.log 
1588643444.082378	FbeEe53ts6OauGQ9B5	74.115.244.70	10.1.1.5	Cb4HP34zEptG6mo0Pc	HTTP	0	PE,MD5,EXTRACT,SHA1,SHA256	application/x-dosexec	-	4.443742	-	F	5265416	5265416	0	0	F	-	e4001cd6654ad6b53afd5018c57b3254	3743a5890931ce37d6e58a3a14b95bc614e54e85	bea9538673ef534043219b831887f9e14e3560e21243c517d97a82b20f2a7a85	HTTP-FbeEe53ts6OauGQ9B5.exe	F	-

$ ls extract_files/
HTTP-FbeEe53ts6OauGQ9B5.exe

I'm not clear why the file name is the default rather than the hash of the file.

plugin not working with Bro 2.5.2

Hello, this plugin is not working with Bro 2.5.2 and gives the following error on start (after running install at the broctl prompt):
[BroControl] > install
removing old policies in /opt/bro/spool/installed-scripts-do-not-touch/site ...
removing old policies in /opt/bro/spool/installed-scripts-do-not-touch/auto ...
creating policy directories ...
installing site policies ...
generating standalone-layout.bro ...
generating local-networks.bro ...
generating broctl-config.bro ...
generating broctl-config.sh ...
[BroControl] > start
starting bro (was crashed) ...
Error: error occurred while trying to send mail: send-mail: SENDMAIL-NOTFOUND not found

bro terminated immediately after starting; check output with "diag"
[BroControl] >
[BroControl] >
[BroControl] >
[BroControl] > diag
[bro]

No core file found and gdb is not installed. It is recommended to
install gdb so that BroControl can output a backtrace if Bro crashes.

Bro 2.5.2
Linux 4.9.26-1.mlos3.x86_64

Bro plugins: (none found)

==== No reporter.log

==== stderr.log
error in /opt/bro/spool/installed-scripts-do-not-touch/site/local.bro, line 103: Failed to open package '/opt/bro/spool/installed-scripts-do-not-touch/site/file-extraction': missing 'load.bro' file
fatal error in /opt/bro/spool/installed-scripts-do-not-touch/site/local.bro, line 103: can't open /opt/bro/spool/installed-scripts-do-not-touch/site/file-extraction/load.bro

==== stdout.log
max memory size (kbytes, -m) unlimited
data seg size (kbytes, -d) unlimited
virtual memory (kbytes, -v) unlimited
core file size (blocks, -c) unlimited

==== .cmdline
-i eth0 -U .status -p broctl -p broctl-live -p standalone -p local -p bro local.bro broctl broctl/standalone broctl/auto

==== .env_vars
PATH=/opt/bro/bin:/opt/bro/share/broctl/scripts:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/bro/bin:/opt/bro/bin
BROPATH=/opt/bro/spool/installed-scripts-do-not-touch/site::/opt/bro/spool/installed-scripts-do-not-touch/auto:/opt/bro/share/bro:/opt/bro/share/bro/policy:/opt/bro/share/bro/site
CLUSTER_NODE=

==== .status
TERMINATED [atexit]

==== No prof.log

==== No packet_filter.log

==== No loaded_scripts.log

[BroControl] >

Extract file with extension

Hi all,
This script extracts files and stores them without their extension.
How can I store the files with their extension?
Best regards

store files by md5 doesn't seem to be working

When adding store files by md5 to config.bro it doesn't seem to impact the file names of extracted files. Is there a way to change how the file is stored originally if md5 is present vs the fuid with f$info$md5?

@load ./plugins/store-files-by-md5

File Extraction in cluster mode

I have installed Zeek with PF_RING, but file extraction is not working. Please help. My node configuration is as follows:

[logger]
type=logger
host=localhost

[manager]
type=manager
host=localhost

[proxy-1]
type=proxy
host=localhost

[worker-1]
type=worker
host=localhost
interface=eno1
lb_method=pf_ring
lb_procs=5

Legacy office mime types missing

Some legacy Office MIME types are currently missing from extract-ms-office:

From fileext.com:

ext mime type
.doc application/msword
.dot application/msword
.docx application/vnd.openxmlformats-officedocument.wordprocessingml.document
.dotx application/vnd.openxmlformats-officedocument.wordprocessingml.template
.docm application/vnd.ms-word.document.macroEnabled.12
.dotm application/vnd.ms-word.template.macroEnabled.12
.xls application/vnd.ms-excel
.xlt application/vnd.ms-excel
.xla application/vnd.ms-excel
.xlsx application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
.xltx application/vnd.openxmlformats-officedocument.spreadsheetml.template
.xlsm application/vnd.ms-excel.sheet.macroEnabled.12
.xltm application/vnd.ms-excel.template.macroEnabled.12
.xlam application/vnd.ms-excel.addin.macroEnabled.12
.xlsb application/vnd.ms-excel.sheet.binary.macroEnabled.12
.ppt application/vnd.ms-powerpoint
.pot application/vnd.ms-powerpoint
.pps application/vnd.ms-powerpoint
.ppa application/vnd.ms-powerpoint
.pptx application/vnd.openxmlformats-officedocument.presentationml.presentation
.potx application/vnd.openxmlformats-officedocument.presentationml.template
.ppsx application/vnd.openxmlformats-officedocument.presentationml.slideshow
.ppam application/vnd.ms-powerpoint.addin.macroEnabled.12
.pptm application/vnd.ms-powerpoint.presentation.macroEnabled.12
.potm application/vnd.ms-powerpoint.template.macroEnabled.12
.ppsm application/vnd.ms-powerpoint.slideshow.macroEnabled.12

See also this gist

Configuring file extraction to write files in a directory when reading pcap files

Hello, I am also trying to use this plugin, but I can't get it to create a directory such as ./extracted-files to hold the extracted files when I am reading a pcap file. Do I also need the base policy extract-all-files? I do like the fact that I can add arrays of MIME types. Where can I specify the output path and directory?

I am using bro 2.5.1 and the current github version of bro.

Thanks for working on this.

How to filter files according to their contents

Hello, I wonder if you could help me with this.
I want to filter files with a regex and extract only the files whose contents match it.
I found a related field named bof_buffer in the fa_file type, but it is small.
I can't get the whole contents of the file before it is sent to the file analyzers.
How can I do this? Thank you very much.

File rotation

Hi!

Sorry for the stupid question, I'm a bit new to Zeek.
I've installed the plugin and configured it through the config.zeek file.
The extraction directory is set to "/usr/local/zeek/extracted/".
It's all ok and extraction is working fine.
I'd like to know if file rotation is managed and how it works.

Thanks in advance
