
mattockfs's Introduction

MattockFS Computer Forensics File-System

MattockFS is currently looking to expand from a single-developer M.Sc. project into a community-driven development effort. If you are a computer forensics professional with programming skills in either Python or C++, please have a look at the GitHub project Issues and see if there is any item you would like to contribute to. Also consider joining the currently still rather silent G+ group: https://plus.google.com/communities/102487198908055860744

MattockFS is a computer forensics actor-framework component, computer forensic data-repository and message-bus implemented as a FUSE-based user-space file system. It is based partially on CarvFS and the AnyCast-relay from the Open Computer Forensics Architecture (OCFA). MattockFS uses CarvPath annotations to designate frozen repository data in the same way that CarvFS does. MattockFS was designed to address some of the shortcomings of OCFA with respect to disk-cache misses and access control, and as such aims to become an essential foundational component in future actor-model based computer forensic frameworks. MattockFS is not a complete computer forensics framework; rather, it provides essential features that a computer forensics framework may build upon.
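To give a feel for what a CarvPath annotation expresses, the sketch below parses a single-layer annotation into a list of fragments. It assumes the notation used by CarvFS, where a fragment is written as offset+size, a sparse region as S followed by its size, and fragments are joined with underscores. The function names parse_carvpath and total_size are made up for this illustration and are not part of the MattockFS Python API.

```python
def parse_carvpath(text):
    """Parse a single-layer CarvPath string into (offset, size) tuples.

    Sparse fragments have no backing offset and are returned as (None, size).
    Assumed notation: "offset+size" fragments and "S<size>" sparse regions,
    joined with "_".
    """
    fragments = []
    for token in text.split("_"):
        if token.startswith("S"):
            fragments.append((None, int(token[1:])))
        else:
            offset, size = token.split("+")
            fragments.append((int(offset), int(size)))
    return fragments


def total_size(fragments):
    """Return the size of the entity that the fragments describe."""
    return sum(size for _, size in fragments)


if __name__ == "__main__":
    frags = parse_carvpath("0+512_S1024_4096+512")
    print(frags, total_size(frags))
```

Under that assumption, the example string describes a 2048 byte entity made of two 512 byte fragments of the underlying data with a 1024 byte sparse region in between.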

MattockFS provides the following facilities to future actor-model based computer forensic frameworks:

  • Lab-side privilege-separation equivalent of Sealed Digital Evidence Bags. After creation, repository data is made immutable, thus guarding the integrity of the data from unintended write access by untrusted modules.
  • Trusted provenance logs. The roles of actors/workers in the processing of digital evidence chunks are logged to a provenance log, leaving no opportunity for untrusted modules to falsify or corrupt provenance logs.
  • CarvPath based access to frozen (immutable) data. Multi-layer CarvPath based access in the same way as provided by CarvFS.
  • Domain specific actors oriented localhost message bus. MattockFS provides sparse-capability based access to an Anycast message bus aimed specifically at use by a computer forensics framework and the concept of toolchains. This is basically the same functionality that used to be provided by the Anycast-Relay in the Open Computer Forensics Architecture.
  • CarvPath based opportunistic hashing. MattockFS maps all low-level reads and writes to reads and writes on all active CarvPaths (either open files or part of an active tool-chain) and will opportunistically calculate BLAKE2 hashes for these CarvPaths when possible.
  • Page-cache friendly archive interaction. MattockFS keeps track of CarvPaths belonging to tool-chains that are not yet completely done. It will communicate with the kernel when a toolchain completes and as a result part of the archive should be considered to be no longer active (and thus can be flushed from page cache).
  • Actor job picking policies: MattockFS implements multiple CarvPath based job picking policies aimed either at opportunistic hashing or at page-cache load optimized strategies.
  • Load balancing support: MattockFS allows a special actor, a load-balancer, to steal jobs from other (overloaded) actors in order to redistribute the job to another node in a multi-host setup.
  • Throttle information: MattockFS provides the overlaying computer forensic framework with meta-data concerning potential page-cache load and per-actor queue size and volume. Based on this information, actors should throttle their new-data output in order to avoid spurious page-cache misses caused by too much active evidence data at a time.
  • Hooks for a distributed FIVES router. In the Open Computer Forensics Architecture a stateless router process was responsible for dynamic toolchain-path routing based on meta-data extracted from the evidence data. Later, the FIVES project created an alternative router process. This router carried state regarding the current location within a routing rule-list over to the next time the same data was processed by a router process. MattockFS provides a simple hook for use by an envisioned distributed version of FIVES-router like functionality.
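The idea behind opportunistic hashing can be sketched in a few lines: a hash for a byte range is advanced only when a read or write happens to deliver the next bytes that the hash still needs. The class below is a simplified illustration for a single contiguous entity and is not the MattockFS implementation. The class name and its methods are hypothetical; only hashlib.blake2b from the Python standard library is a real API.

```python
import hashlib


class OpportunisticHash:
    """Incrementally hash an entity of known size from whatever I/O passes by."""

    def __init__(self, size):
        self.size = size      # total number of bytes in the entity
        self.offset = 0       # number of bytes hashed so far
        self._hash = hashlib.blake2b(digest_size=32)

    def observe(self, offset, data):
        """Offer data seen at `offset`; use it only if it extends the hashed prefix."""
        end = offset + len(data)
        if offset <= self.offset < end:
            usable = data[self.offset - offset:]
            usable = usable[:self.size - self.offset]
            self._hash.update(usable)
            self.offset += len(usable)

    @property
    def done(self):
        return self.offset >= self.size

    def hexdigest(self):
        """Return the digest once the whole entity has been seen, else None."""
        return self._hash.hexdigest() if self.done else None
```

A read that starts beyond the hashed prefix is simply ignored in this sketch, so the digest only becomes available when the data happens to be touched in order. That is what makes the order in which jobs are picked relevant to how often hashing completes for free.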

MattockFS is not a complete forensic framework; it is a component that can be used as the foundation for a complete forensic framework. Currently MattockFS is in beta. MattockFS comes with a Python API aimed at usage by an overlaying computer forensics framework. Future APIs for other programming languages (C++ and others) are planned.

More detailed info can be found here:

http://pibara.github.io/MattockFS/

DFRWS MattockFS workshop

At the DFRWS-EU conference of March 21-23, 2017, there will be a workshop on MattockFS. More information, the slides of the presentations and hands-on session, and some relevant downloads are available at: http://dfrws.capibara.com/

Install

If you want to install MattockFS on your (Ubuntu) system, run the script ubuntu_setup. This script will do the following:

  • Build and install a python module named mattock containing 'carvpath', 'api' and all the MattockFS files except for the starter and stopper.
  • Create the user 'mattockfs' and its /var/mattock working directory.
  • Patch /etc/fuse.conf if needed to contain the user_allow_other directive.
  • Install the non-pip dependencies fuse, fuse-dev and redis-server.
  • Copy the start and stop scripts to /usr/local/bin

If you are on Ubuntu, running the script as follows should do the trick:

./ubuntu_setup

On any other Linux distro, follow these steps:

  • Install redis
  • Install fuse and the fuse libraries
  • Run 'python ./setup.py build'
  • Run 'sudo python ./setup.py install'
  • Create a new user named 'mattockfs' and make sure the user is allowed to run fuse file-systems (/dev/fuse access rights)
  • Update /etc/fuse.conf so that it contains the 'user_allow_other' directive
  • Copy all the files in the bin directory to /usr/local/bin
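When going through the manual steps, the fuse.conf change is easy to get wrong, for instance by leaving the directive commented out. The helper below is a small sketch for checking that; fuse_conf_allows_other is a hypothetical name and is not shipped with MattockFS.

```python
def fuse_conf_allows_other(conf_text):
    """Return True if the given fuse.conf text enables user_allow_other.

    Everything after a '#' is treated as a comment, so a commented-out
    directive does not count.
    """
    for line in conf_text.splitlines():
        stripped = line.split("#", 1)[0].strip()
        if stripped == "user_allow_other":
            return True
    return False


if __name__ == "__main__":
    with open("/etc/fuse.conf") as conf:
        print(fuse_conf_allows_other(conf.read()))
```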

If you wish to play with EWF files, you should also install the pyewf python module.

After successfully running this script or manually going through all the steps, you should be able to use start_mattockfs and stop_mattockfs to start and stop MattockFS respectively. You should be asked for your sudo password when you call these.

To check whether the installation was fully successful, you should first start MattockFS in the background:

start_mattockfs

And then run the base test script:

test_mattockfs.py

You may also want to try adding a disk image to MattockFS and play with the CarvPath section of MattockFS. Please note that ewf2mattock requires pyewf to be installed, a python module that is NOT part of the standard installation dependency checks for MattockFS!

ewf2mattock someimage.E01

It should return two relative paths for the meta and image data within MattockFS.

mattockfs's People

Contributors

pibara, robklpd


mattockfs's Issues

uid/gid based IPTABLES rules example

While AppArmor should take care of most of the file-system based concerns that MattockFS can not address on its own, there is still the threat of access to networking from potentially vulnerable workers.
IPTABLES allows for uid based firewall rules that could enhance the privsep for MattockFS beyond being purely file-system focused. Examples are disallowing network access to the redis servers used by MattockFS for longpath storage, to a database server used by some specific data storage module, or to an indexing server used by some text indexing module. We should provide a simple set of IPTABLES rules to demonstrate how uid and possibly gid based firewall rules can contribute to stricter privsep.
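As a starting point for discussion, the sketch below generates uid based rules that reject local connections to redis for a list of worker uids. It relies on the iptables owner match (-m owner --uid-owner), which applies to locally generated packets in the OUTPUT chain. The assumptions are that redis listens on the loopback interface on its default port 6379, and that the worker uids passed in are known; the function name and the uid in the usage example are hypothetical. The function only builds argument lists and does not run anything.

```python
def redis_block_rules(worker_uids, redis_port=6379):
    """Build iptables commands that reject loopback TCP traffic to redis per uid."""
    rules = []
    for uid in worker_uids:
        rules.append([
            "iptables", "-A", "OUTPUT",
            "-o", "lo",
            "-p", "tcp", "--dport", str(redis_port),
            "-m", "owner", "--uid-owner", str(uid),
            "-j", "REJECT",
        ])
    return rules


if __name__ == "__main__":
    # 1001 is a made-up uid for an untrusted worker account.
    for rule in redis_block_rules([1001]):
        print(" ".join(rule))
```

A gid based variant would use --gid-owner in the same position.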

AppArmor base + example config.

Provide a base AppArmor profile for modules, together with an example profile for an example actor. A worker should only be able to access its own .ctl, not that of other actors. It should have full access to sparse-cap controlled MattockFS subdirs as well as to the carvpath subsystem and some special inf and ctl files. Further, a worker should have only limited access to the /proc virtual file-system, which could otherwise allow it to steal capabilities from its worker peers or possibly even from other actors. A proper AppArmor profile for a module should allow that module access to everything it needs while maintaining proper privilege separation and elevating the sparse-cap based access control to the level of true capability based security.

Limit the power of actor reset.

Currently workers can revoke all authority of their peers by invoking reset on .ctl.
We should change this feature so that reset works only if the process that registered as worker is no longer an active process.
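One way such a liveness check could work on Linux is to probe the registered process with signal 0, which performs the existence and permission checks without delivering a signal. This is only a sketch of the idea: how MattockFS would learn and store the pid of the registering worker is left open, and process_is_alive is a hypothetical helper.

```python
import os


def process_is_alive(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True
```

Note that pids can be reused by the kernel, so a real implementation would probably need an additional check, such as comparing the process start time.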

C++ port

Port MattockFS to C++: While Python is a good programming language for prototyping, our evaluation has shown that the implementation is relatively slow with respect to import and possibly messaging.
This slowness could potentially partially nullify part of the performance benefits from the page-cache and opportunistic hashing. Rewriting it in C++ is likely to lead to a much faster implementation.

Update website

The website was made largely pre-implementation and needs some major updates.

Flexible FUSE read block size.

As we identified in our evaluation of file/digest order, reducing the read block-size for FUSE could potentially greatly reduce the number of spurious reads, but at the expense of file-system overhead. This should be evaluated and, if worth the price, implemented in MattockFS.

Create worker private working directories

In order to further reduce shared mutable state, we want to provide workers with a private working directory under $MNT/workdir/$CAP, so that workers running as the same uid can't corrupt each other's persistent mutable storage.

Implement tick

Given that MattockFS is a user-space file-system, its events come from file-system interaction only.
For some things, such as opportunistic-hashing Merkle tree logging at the end of a run, it is important to still get some 'tick' events when there is no regular file-system communication going on.

We either need to fit timer events into the MattockFS process itself, or implement an external 'tick' daemon that, for example, requests a 'tick' extended attribute.
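The external daemon variant could look roughly like the sketch below: a loop that periodically pokes the file-system by requesting an extended attribute. The attribute name user.tick and the choice of path are assumptions made for this example, since no such attribute is defined yet. The loop takes the poke and sleep functions as parameters so that it can be exercised without a mounted file-system.

```python
import os
import time


def xattr_poke(path, name="user.tick"):
    """Return a function that requests a (hypothetical) tick attribute on path."""
    def poke():
        try:
            os.getxattr(path, name)  # Linux only; the request itself is the event
        except OSError:
            pass                     # a missing attribute is fine for a tick
    return poke


def run_ticker(poke, interval=1.0, ticks=None, sleep=time.sleep):
    """Call poke() every `interval` seconds; run forever when ticks is None."""
    count = 0
    while ticks is None or count < ticks:
        poke()
        count += 1
        sleep(interval)
    return count
```

A daemon would then call something like run_ticker(xattr_poke(mountpoint), interval=5.0), where mountpoint is the MattockFS mount path.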

Config snapshotting

We should create a config file snapshotting facility under $MNT/etc/.
For forensic chain integrity, module config file versions used should be recorded in the archive and referred to from job meta data or provenance logs. This facility should provide an extended attribute for snapshotting config files from /etc/mattock.d/ and retrieving the carvpath of the snapshot.

Quarantine of "reset" linked jobs.

Implement a temporary-quarantine facility: In OCFA, the Anycast implemented a quarantine facility for data that would crash a module. This facility, which OCFA implemented in the form of the 'never' priority, allowed the investigation to keep running while maintenance programmers would fix the module. Research is needed into the possibility of implementing a similar data quarantine functionality in MattockFS.

Restore from journal

Testing (+ debugging) restore state from journal. This bastard has been with
me for a few months now and "I REALLY COULD USE A SECOND PAIR OF EYES ON THIS ONE".
After a restart, the anycast state should get restored from the journal. Currently
the restored state is somehow messed up, so I've disabled this crucial feature for
now.

Restrict stealing of jobs to load balancing actors.

Setting a module select policy allows one actor to STEAL jobs from other actors. This is meant for load balancing only and should not be a right usable by normal modules. This should be a privileged operation explicitly permitted to an actor through the /etc/mattockfs.json config.
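A config based permission check of this kind could be as small as the sketch below. The layout of the JSON used here, a "privileged" object that maps an operation name to a list of actor names, is purely an assumption for illustration and does not describe the current /etc/mattockfs.json format. The function name and the actor names are hypothetical as well.

```python
import json


def actor_may(config_text, actor, operation):
    """Return True if the config explicitly grants `operation` to `actor`."""
    config = json.loads(config_text)
    allowed = config.get("privileged", {}).get(operation, [])
    return actor in allowed


if __name__ == "__main__":
    # Hypothetical config: only the load balancer actor may steal jobs.
    example = '{"privileged": {"steal": ["loadbalance"]}}'
    print(actor_may(example, "loadbalance", "steal"))
    print(actor_may(example, "carver", "steal"))
```

The important property is that the default is deny: an actor or operation that is not mentioned in the config gets no privilege.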

C++ language bindings.

Write ports of the base API to C++. Next to Python, C++ should be considered an important language for writing modules in. The current Python wrapper API for access to the file-system based archive and messaging API should also be ported to C++.

Eventloop class for python API

Need to extend Python-API with a multi-mountpoint eventloop Class that has hooks for the main higher level API functions (but does not implement them).

Kickstarting as privileged operation.

Kickstarting creates jobs and data out of thin air from the S0 job. This should be a privileged operation explicitly permitted to an actor through the /etc/mattockfs.json config.

STEEM MerkleTree root posting daemon.

MattockFS will log full Merkle tree JSON representations to per-mountpoint Merkle tree logs.
For integrity purposes, we want a daemon to tail the logs and place the Merkle tree root hashes as memo into STEEM transactions.

The STEEM account and key, transaction size and currency (STEEM or SBD) should be configurable in a config JSON, but the transactions shall default to 0.01 STEEM sent to the @null account.
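For readers unfamiliar with the concept, the sketch below shows how a single Merkle tree root can be derived from a list of leaf values with BLAKE2b, so that one short hash commits to all of them. The tree layout used here (pairwise hashing, with the last node duplicated on odd levels) is a common textbook construction and an assumption for this example. It is not necessarily the layout that MattockFS logs, and the actual posting to STEEM is left out.

```python
import hashlib


def _blake2(data):
    return hashlib.blake2b(data, digest_size=32).digest()


def merkle_root(leaves):
    """Return the hex Merkle root over a non-empty list of byte strings."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_blake2(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [_blake2(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()
```

A daemon tailing the logs would only need to publish the resulting root as the transaction memo; anyone holding the full log can later recompute the root and compare.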

BLAKE2BP, multi-core version of BLAKE2

The current implementation uses a Python module without support for the parallel multi-core BLAKE2bp implementation. With just one file-system on a node for a given archive, this would mean at most one core could be working on opportunistic hashing for the whole node, possibly creating a hashing bottleneck for the system. Implementing BLAKE2bp, either by patching the BLAKE2 Python module code or by moving to C++ and an appropriate library, could remove that potential bottleneck.

Secondary Opportunistic hashing

Implement and evaluate secondary opportunistic hashing: Secondary in-core based opportunistic hashing could potentially greatly improve the interaction between carving and opportunistic hashing. This feature should only be enabled explicitly for specific modules in the config.
