
anssi-fr / polichombr

Collaborative malware analysis framework

License: Other

Python 65.80% HTML 12.59% Shell 0.06% Ruby 20.60% JavaScript 0.94%
malware-analysis reverse-engineering ida idapro malware-research security-tools ida-plugin

polichombr's Introduction

Polichombr


What is Polichombr?

This tool aims to provide a collaborative malware analysis framework.

Documentation

More detailed documentation is available in the docs folder.

Features

  • Sample storage and documentation
  • Semi automated malware analysis
  • IDA Pro collaboration
  • Online disassembly
  • Binary matching with the MACHOC fuzzy hash algorithm
  • Yara matching

Installation

Please see the corresponding file in the docs directory

Example scripts

The scripts in the examples folder perform some basic actions against a Polichombr instance.

Screenshots

  • Generic sample information
  • Family/Threat overview
  • Online disassembly
  • Sharing IDA Pro information from the WebUI or directly with other users
  • Automated hotpoint detection
  • Taking notes right from IDA

Feature documentation

Malware analysis

Polichombr provides an engine to automate the analysis tasks by identifying points of interest inside the malicious binary, and providing them both on a web interface and inside the analyst's tools via an API.

Plugins / tasks

Analysis tasks are loaded from the app/controllers/tasks directory and must inherit from the Task object. Several tasks are already implemented:

  • AnalyzeIt: a Ruby script based on metasm, which is used to identify interesting points in the binary. The goal is to help the analyst by giving hints about where to start. For example, we try to identify crypto loops and functions which call sensitive APIs (file, process, network, ...).

  • Peinfo: loads the PE metadata with the peinfo library.

  • Strings: extracts ASCII and Unicode strings.
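To illustrate the plugin pattern described above, here is a minimal sketch of a string-extraction task. The `Task` base class, its `execute` method, and `StringsTask` are hypothetical stand-ins written for this example; they are not Polichombr's real classes, whose interface lives in app/controllers/tasks.

```python
import re


class Task:
    """Hypothetical stand-in for Polichombr's Task base class."""

    def __init__(self, data: bytes):
        self.data = data

    def execute(self):
        raise NotImplementedError


class StringsTask(Task):
    """Extract printable ASCII and UTF-16LE strings of at least min_len chars."""

    def __init__(self, data: bytes, min_len: int = 4):
        super().__init__(data)
        self.min_len = min_len

    def execute(self):
        n = self.min_len
        # Runs of printable ASCII bytes.
        ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % n)
        # Runs of printable ASCII characters, each followed by a NUL byte.
        wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % n)
        found = [m.group().decode("ascii") for m in ascii_re.finditer(self.data)]
        found += [m.group().decode("utf-16-le") for m in wide_re.finditer(self.data)]
        return found
```

A real task would also need to store its results in the knowledge base; that part is left out here because it depends on the project's models.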

Signatures

We use several signature models to classify malware:

  • Yara
  • imphash
  • Machoc

Machoc

Machoc is a CFG-based algorithm to classify malware. For more information, please refer to its documentation.

IDA Collaboration: Skelenox

This is an IDAPython plugin which is used to synchronize names and comments with the knowledge base and with other users' databases.

Contributing

Contributions are welcome. Please read CONTRIBUTING.md for a quick start on how to get help or add features to Polichombr.

polichombr's People

Contributors

tpo-anssi, tpourcelot

polichombr's Issues

AnalyzeIt.rb is broken

There are two problems:

  • the encoding has been removed and metasm doesn't like it
  • there is a typo on line 948

Two stack traces are attached.

analysis_tools/AnalyzeIt.rb:949:in `block in <main>': undefined local variable or method `egexStr' for main:Object (NameError)
polichombr/metasm/metasm/os/main.rb:165:in `index': incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)
polichombr/metasm/metasm/os/main.rb:165:in `index
from analysis_tools/AnalyzeIt.rb:757:in `block (2 levels) in <main>'
from analysis_tools/AnalyzeIt.rb:753:in `each'
from analysis_tools/AnalyzeIt.rb:753:in `block in <main>'
from analysis_tools/AnalyzeIt.rb:750:in `each'
from analysis_tools/AnalyzeIt.rb:750:in `<main>'

skelenox config issues

When trying to use Skelenox with IDA Pro 6.95 on a Mac, there are a few issues:
1: Port number
The port number as defined in skelsettings.json is loaded as a unicode string. On the Mac this triggers a bug:

[11/03/2017 10:26] [ERROR] [MainThread]: The polichombr server seems down
Traceback (most recent call last):
File "/scripts/skelenox/skelenox.py", line 148, in get_online
self.__do_init()
File "/scripts/skelenox/skelenox.py", line 164, in __do_init
self.h_conn.connect()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
self.timeout, self.source_address)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 557, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
error: getaddrinfo() argument 2 must be integer or string

The solution is to use "poli_port": 5000, instead of "poli_port": "5000", in skelsettings.json.
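The workaround above could also be applied when the settings are loaded, so that both spellings are accepted. This is a minimal sketch, not Skelenox's actual loader: the `load_settings` helper is hypothetical, and only the `poli_port` key comes from the issue.

```python
import json


def load_settings(raw_json: str) -> dict:
    """Parse Skelenox-style settings and normalise the port to an int."""
    settings = json.loads(raw_json)
    port = int(settings["poli_port"])  # accepts 5000 as well as "5000"
    if not 0 < port < 65536:
        raise ValueError("poli_port out of range: %r" % port)
    settings["poli_port"] = port
    return settings
```

Normalising once at load time keeps the type problem away from the socket layer, whatever the user wrote in the JSON file.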

2: Communication with Polichombr
When loading the script in IDA, the following occurs:

[12/03/2017 05:55] [ERROR] [MainThread]: HTTPS is not managed at the moment...
[12/03/2017 05:55] [INFO] [MainThread]: Skelenox init finished
[12/03/2017 05:55] [ERROR] [SkelSyncAgent]: The GET request didn't go as expected
[12/03/2017 05:55] [ERROR] [SkelSyncAgent]: 
Traceback (most recent call last):
  File "/scripts/skelenox/skelenox.py", line 892, in run
    self.sync_names()
  File "/scripts/skelenox/skelenox.py", line 833, in sync_names
    comments = self.skel_conn.get_comments(timestamp=self.last_timestamp)
  File "/scripts/skelenox/skelenox.py", line 329, in get_comments
    res = self.poli_get(endpoint)
  File "/scripts/skelenox/skelenox.py", line 216, in poli_get
    result = self.poli_request(endpoint, data, method='GET')
  File "/scripts/skelenox/skelenox.py", line 208, in poli_request
    raise IOError
IOError

On the server side of things, with debugging turned on, the following is logged:

ERROR in webui [/web/polichombr/poli/views/webui.py:58]:
404 Not Found: The requested URL was not found on the server.  If you entered the URL manually please check your spelling and try again.
--------------------------------------------------------------------------------
192.168.2.55 - - [11/Mar/2017 17:37:38] "GET /2/comments/?timestamp=1970-01-01T01:00:00.000000 HTTP/1.1" 404 -

I haven't been able to track this down yet.

Tasks are not using the correct db handle.

This bug can be triggered by running the tests (tests/full_tests.py)
when the main db (app.db file) has not been created.
The tests then raise many SQLAlchemy OperationalError exceptions about the nonexistent Sample table.

Docker start error

Hello,

First, thank you for your work.

I would like to warn you that, when following the INSTALL_DOCKER.md guidelines, the Docker container starts successfully but the database does not seem to be created. This is probably because of the commented-out lines at the end of the Dockerfile (probably for a good reason!).

Here is the backtrace:

File "/opt/polichombr/flask/lib/python2.7/site-packages/sqlalchemy/pool.py", line 449, in __init__
self.connection = self.__connect()
File "/opt/polichombr/flask/lib/python2.7/site-packages/sqlalchemy/pool.py", line 607, in __connect
connection = self.__pool._invoke_creator(self)
File "/opt/polichombr/flask/lib/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 97, in connect
return dialect.connect(*cargs, **cparams)
File "/opt/polichombr/flask/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 385, in connect
return self.dbapi.connect(*cargs, **cparams)
OperationalError: (sqlite3.OperationalError) unable to open database file

Another problem is that ./utils/db_create.py requires the poli module, which out of the box can only be imported from the root folder.
Therefore, we get the error:

Traceback (most recent call last):
File "./utils/db_create.py", line 2, in <module>
from poli import db
ImportError: No module named poli

Regards,

Sending machoc hashes from Skelenox

Hi !

Sending "sub-information" (address and machoc hash) from Skelenox may be useful in several cases:

  • processors not supported by AnalyzeIt;
  • redefinition of subs;
  • the sample is not available (only the IDB file).

A murmurhash3 Python implementation can be found here: https://github.com/wc-duck/pymmh3

Cheers,

Add more analysis tools

Here is a list of analysis modules to implement:

  • Resources / overlay extraction
  • Section hashing/matching
  • File carving
  • XOR brute force to discover payloads hidden with simple ciphers
  • Parse authenticode signatures (metadata)
  • AnalyzeIt for ELFs
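As an illustration of the XOR brute force item in the list above, here is a minimal sketch that tries every single-byte key and looks for a known plaintext marker. The `xor_bruteforce` function and the choice of the DOS stub message as marker are assumptions made for this example; the issue does not specify how the module should work.

```python
DOS_STUB = b"This program cannot be run in DOS mode"


def xor_bruteforce(blob: bytes, marker: bytes = DOS_STUB):
    """Return (key, offset) pairs where single-byte XOR decoding reveals marker."""
    hits = []
    for key in range(1, 256):  # key 0 would mean the payload is not encoded
        # Encode the marker instead of decoding the whole blob for every key.
        encoded_marker = bytes(b ^ key for b in marker)
        offset = blob.find(encoded_marker)
        if offset != -1:
            hits.append((key, offset))
    return hits
```

Searching for the encoded marker keeps the cost at 255 substring searches rather than 255 full decodings. Multi-byte or rolling keys would need a different approach.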

Re-perform analysis button

Hi !

Adding a "re-perform analysis" button to the web interface (for use after server crashes, to re-run automatic classification, etc.) might be useful :)

Cheers,

Improve the web User Interface

The UI needs several improvements.

  • In the disassembly view, use the API to get the function names and comments
  • Create a view highlighting problems (unassociated Yara rules, machoc hashes without matches, false positives, ...)
  • Create an executive overview (family progress, who worked on what at this time?, ...)
  • Display the exports names by sample
  • In the sample view, display function names using the API
  • Improve family display using the API

Dependencies between samples

Relationships between samples, for example "dropper -> dropped" or "dropper -> decoy", should be implemented.

The family view is not hierarchical

There should be a tree structure to improve the family view.

For example, there should be something like
-> Family 1
---> Family 1.1
-> Family 2

instead of all the families and subfamilies on the same level.
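The layout requested above could be produced from a flat list of families that each reference a parent. This is a sketch under stated assumptions: the `(id, name, parent_id)` tuples and the `render_family_tree` helper are hypothetical and do not reflect Polichombr's actual family model.

```python
from collections import defaultdict


def render_family_tree(families):
    """families: iterable of (id, name, parent_id); parent_id is None for roots."""
    children = defaultdict(list)
    for fam_id, name, parent_id in families:
        children[parent_id].append((fam_id, name))

    lines = []

    def walk(parent_id, depth):
        # Siblings are listed alphabetically; each level adds two dashes.
        for fam_id, name in sorted(children[parent_id], key=lambda c: c[1]):
            lines.append("-" * (2 * depth + 1) + "> " + name)
            walk(fam_id, depth + 1)

    walk(None, 0)
    return "\n".join(lines)
```

Called with the three families from the example above, this returns the same three lines, with Family 1.1 nested under Family 1.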

Zip archive upload is broken

An exception is raised when uploading a zip file:

File "poli/controllers/api.py", line 74, in create_sample_and_run_analysis
    with zipfile.ZipFile(file_data, "r") as zcl:
  File "/usr/lib/python2.7/zipfile.py", line 756, in __init__
    self.fp = open(file, modeDict[mode])
TypeError: file() argument 1 must be encoded string without null bytes, not str
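The traceback suggests that `zipfile.ZipFile` received the raw archive content and tried to open it as a file name. Assuming `file_data` really holds the uploaded bytes, wrapping it in `io.BytesIO` gives `ZipFile` the file-like object it expects. The `extract_samples` helper below is a hypothetical sketch, not the project's fix.

```python
import io
import zipfile


def extract_samples(file_data: bytes):
    """Return {member name: content} for every file in an in-memory zip."""
    with zipfile.ZipFile(io.BytesIO(file_data), "r") as archive:
        return {
            info.filename: archive.read(info)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        }
```

Reading the archive from memory also avoids writing untrusted uploads to disk before they are unpacked.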

(minor) family data modifications

Hi,

It is currently impossible to update family-related data (files / detection items). The same result can be achieved by deleting and re-creating the item, so this is only a minor request given the time needed to implement it, but it may be useful to do it one day :)

Cheers,

Integrating MISP galaxies into polichombr (families)

Very nice and interesting tool. We will look into integrating it with MISP, like we did with viper.

On a side note, MISP galaxies contain machine-parsable information about threat actors and attacker tools. This could be a nifty extension that lets users of your tool automatically get candidate information for classifying their analysis with existing taxonomies.

https://github.com/MISP/misp-galaxy/blob/master/elements/adversary-groups.json

or

https://github.com/MISP/misp-galaxy/blob/master/elements/threat-actor-tools.json

HTTPS support in Skelenox

Hi,

HTTPS support in Skelenox may be useful. httplib exposes the HTTPSConnection class, and SSL certificate pinning could be done by loading the certificate directly from the configuration file, for instance.
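The suggestion above could be sketched as follows. This example uses Python 3's `http.client` (the successor of the `httplib` module named in the issue), and both helper functions are hypothetical; how the CA file or the pinned fingerprint would be stored in the Skelenox settings is left open.

```python
import hashlib
import hmac
import http.client
import ssl


def make_connection(host, port, ca_file=None):
    """Build (but do not open) an HTTPS connection that verifies the server."""
    # With ca_file=None the system trust store is used instead.
    context = ssl.create_default_context(cafile=ca_file)
    return http.client.HTTPSConnection(host, port, timeout=10, context=context)


def certificate_matches(der_cert, pinned_sha256):
    """Compare a DER-encoded certificate with a pinned SHA-256 fingerprint (hex)."""
    expected = pinned_sha256.replace(":", "").lower()
    actual = hashlib.sha256(der_cert).hexdigest()
    return hmac.compare_digest(actual, expected)
```

After `conn.connect()`, the server certificate can be obtained with `conn.sock.getpeercert(binary_form=True)` and passed to `certificate_matches`; the connection should be closed if the check fails.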

Cheers,

No feedback when failing to create a Yara

When an invalid Yara rule is submitted,
the user is not told that the rule creation failed.

ERROR in yara_rule [./poli/controllers/yara_rule.py:149]:
yara.compile(source=raw_data)
SyntaxError: undefined identifier "toto"

Block multiple skelenox instances in IDA Pro

Hi,

While patching skelenox.py, I loaded it multiple times in my IDA Pro instance. All of the instances were actually "working" (i.e. their hooks were receiving notifications), which led to several issues (multiple requests, connection errors, etc.).

Skelenox should find existing instances and safely unload them (i.e. remove their hooks) in order to avoid such issues. In the main procedure, just check for the "skel" object and, if it exists, call its "end_skelenox()" method, then overwrite it.
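The check described above could look like the following sketch. The `reload_skelenox` helper and the idea of passing the script namespace and a factory are assumptions made for this example; only the "skel" name and the `end_skelenox()` method come from the issue.

```python
def reload_skelenox(namespace: dict, factory):
    """Stop any previous instance stored under "skel" before creating a new one."""
    previous = namespace.get("skel")
    if previous is not None:
        try:
            previous.end_skelenox()  # method name taken from the issue text
        except Exception as exc:  # a half-dead instance must not block the reload
            print("could not stop previous instance: %s" % exc)
    namespace["skel"] = factory()
    return namespace["skel"]
```

In a script, the namespace could be `globals()` and the factory the plugin class itself, so that reloading the file always leaves exactly one active instance.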

Cheers,

importing several YARA rules

Hello,

Is it possible to import several YARA rules at the same time?

From what I can see, it seems we have to import them one by one...

Thanks !

Yara API

There is currently no API endpoint for creating yara rules

Compare redirection

Hi,

This is a minor bug : when I try to "machoc-compare" a sample I'm redirected to the sample's home page, not the "#matches" one. Results are correct, this is only a "UI" issue :)

Cheers,

Skelenox : exception when losing connectivity

Hi,

When connectivity is lost while viewing the "func Infos" tab (i.e. Error during request, retrying / Closing connection / Connecting using simple HTTP), Skelenox still fails: in poli_request, res = self.h_conn.getresponse() raises an httplib.BadStatusLine: '' exception.

Not sure why. The sample is actually pretty big (>50k subs), and I have an nginx setup; maybe that is related?

Cheers,

TypeError: Unicode-objects must be encoded before hashing

Hello, when I launch run.py I get these errors:

neolex@archlinux> polichombr$./run.py
INFO in analysis [/home/atlas/polichombr/app/controllers/analysis.py:29]:
Loading tasks
ERROR in analysis [/home/atlas/polichombr/app/controllers/analysis.py:51]:
Could not load task_yara : Parent module 'app.controllers.tasks' not loaded, cannot perform relative import
ERROR in analysis [/home/atlas/polichombr/app/controllers/analysis.py:51]:
Could not load task_peinfo : Parent module 'app.controllers.tasks' not loaded, cannot perform relative import
ERROR in analysis [/home/atlas/polichombr/app/controllers/analysis.py:51]:
Could not load task_analyzeitrb : Parent module 'app.controllers.tasks' not loaded, cannot perform relative import
ERROR in analysis [/home/atlas/polichombr/app/controllers/analysis.py:51]:
Could not load task_strings : Parent module 'app.controllers.tasks' not loaded, cannot perform relative import
 * Restarting with stat
INFO in analysis [/home/atlas/polichombr/app/controllers/analysis.py:29]:
Loading tasks
ERROR in analysis [/home/atlas/polichombr/app/controllers/analysis.py:51]:
Could not load task_yara : Parent module 'app.controllers.tasks' not loaded, cannot perform relative import
ERROR in analysis [/home/atlas/polichombr/app/controllers/analysis.py:51]:
Could not load task_peinfo : Parent module 'app.controllers.tasks' not loaded, cannot perform relative import
ERROR in analysis [/home/atlas/polichombr/app/controllers/analysis.py:51]:
Could not load task_analyzeitrb : Parent module 'app.controllers.tasks' not loaded, cannot perform relative import
ERROR in analysis [/home/atlas/polichombr/app/controllers/analysis.py:51]:
Could not load task_strings : Parent module 'app.controllers.tasks' not loaded, cannot perform relative import
 * Debugger is active!
 * Debugger pin code: 132-071-423

And when I register on the web interface:

builtins.TypeError
TypeError: Unicode-objects must be encoded before hashing

Traceback (most recent call last)
  File "/usr/lib/python3.5/site-packages/flask/app.py", line 2000, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/lib/python3.5/site-packages/flask/app.py", line 1991, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/lib/python3.5/site-packages/flask/app.py", line 1567, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3.5/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/lib/python3.5/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/lib/python3.5/site-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/lib/python3.5/site-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3.5/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/lib/python3.5/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python3.5/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/atlas/polichombr/app/views/webui.py", line 127, in register_user
    registration_form.poke_id.data)
  File "/home/atlas/polichombr/app/controllers/user.py", line 27, in create
    apikey_seed = apikey_seed + sha256(username).hexdigest()
TypeError: Unicode-objects must be encoded before hashing
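The last frame of the traceback shows `sha256(username)` receiving a text string while running under Python 3.5. On Python 3, hashlib only accepts bytes, so encoding the string first avoids this particular error. A minimal sketch, where the `username_digest` helper is hypothetical and not the project's code:

```python
from hashlib import sha256


def username_digest(username: str) -> str:
    """Hash a text username; hashlib only accepts bytes on Python 3."""
    return sha256(username.encode("utf-8")).hexdigest()
```

This only addresses the hashing call. The relative import errors in the log above are a separate problem and may point to a wider Python 2 versus Python 3 mismatch.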

Add more documentation

The API documentation is not up to date, and the rest of the doc could use some freshening and additions.

Add relationships between samples

The current model, based on matches such as iat_hash and machoc_hash, is not sufficient
to explain the possible links between samples.

An example could be to declare a relationship between a dropper and its payload, or a relationship between an orchestrator and its plugins.

Passing the tests : I have to add some "with poli.app.app_context():"

Good evening,

I just cloned the repository and tried to run the tests as described in "CONTRIBUTING.md". All tests fail with a "Working outside of application context." error. The errors correspond to the calls to poli.db in the setUp and tearDown methods of tests_api.py and tests_webui.py. To make the tests pass, I have to wrap the poli.db calls in a "with poli.app.app_context():" block.

See my commit: https://github.com/harfangeek/polichombr/commit/4367c88ca74d357497732f8a92fbbd74d9974500

Do you experience this issue?

Thank you.

Push a batch of names when starting skelenox

The synchronization mechanism in skelenox pushes the already defined names one by one,
which can take a long time for big samples.

Submitting a batch of names could be faster.
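One way to prepare such batches is to split the address-to-name mapping into fixed-size chunks and send each chunk in one request. The sketch below only covers the chunking; the `batch_names` helper, the payload shape, and the batch size are assumptions, since no batch endpoint is described here.

```python
def batch_names(names, batch_size=500):
    """Yield lists of {"address", "name"} dicts, at most batch_size per list."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    # Sort by address so that batches are reproducible between runs.
    items = [{"address": addr, "name": name} for addr, name in sorted(names.items())]
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

With 50,000 names and the default size, this would mean 100 requests instead of 50,000.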

[SKELENOX] Add a popup when a rename conflict happens

When a renaming conflict happens in Skelenox (i.e. another person has renamed something which is already renamed in the local IDB), a popup should be shown so the user can choose whether to keep the old name
or rename with the new one.

use relative addresses

Hi,

Disclaimer: we did not actually test this, so it is a hypothetical issue (we didn't want to pollute the production platform).

As addresses are virtual addresses and not relative ones, rebasing a program in IDA will probably trigger undefined behavior (duplicated names, addresses not found, etc.). Actions performed while debugging* (ASLR) might also cause issues.

Anyway, using relative virtual addresses might be a good solution ("just" subtract the OEP address?).
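The conversion itself is a single subtraction. Note that this sketch subtracts the image base rather than the OEP suggested above, which is the usual definition of a relative virtual address; the helper names are hypothetical, and in IDA the base would presumably come from `idaapi.get_imagebase()`.

```python
def to_rva(ea: int, image_base: int) -> int:
    """Convert a virtual address to an offset from the image base."""
    if ea < image_base:
        raise ValueError("address 0x%x is below the image base 0x%x" % (ea, image_base))
    return ea - image_base


def from_rva(rva: int, image_base: int) -> int:
    """Convert a stored relative address back for the current image base."""
    return image_base + rva
```

A name stored at RVA 0x1000 then resolves to 0x401000 in a database based at 0x400000 and to 0x10001000 in one rebased to 0x10000000.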

Cheers,

* : okay, debugging with IDA is not cool and no one should ever do it, but still.

sort/hide already defined subs in func infos

Hi,

The "func infos" tab can become quite large, even when few samples are present, so a sorting capability would be useful.

It may also be useful to hide already renamed subs (i.e. those not starting with "sub_") in order to save space.

Cheers,

Paginate APIs

Some API endpoints can be very verbose,
they should be paginated in order to extract only meaningful data.

The main endpoints are the following:

  • Families endpoints
  • Sample / funcinfos endpoints
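A simple page and page-size scheme would be enough for the endpoints listed above. The sketch below shows the slicing logic only; the `paginate` helper, its parameter names, and the response fields are hypothetical and not part of the existing API.

```python
def paginate(items, page=1, per_page=50):
    """Return one page of items together with paging metadata."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    total = len(items)
    start = (page - 1) * per_page
    return {
        "page": page,
        "per_page": per_page,
        "total": total,
        "pages": (total + per_page - 1) // per_page,  # ceiling division
        "items": items[start:start + per_page],
    }
```

For database-backed endpoints the same arithmetic would normally be pushed into the query (limit and offset) rather than applied to a fully loaded list.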

Skelenox hangs IDA when updating

Because Skelenox pulls a lot of information, it can hang IDA while updating.
Several improvements are possible:

  • Use a thread to pull information in background
  • Manage correctly the timestamps of the information server-side
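The first item in the list above could be sketched with a worker thread that fetches data and hands it to the main thread through a queue. The `BackgroundPuller` class and its `fetch` callable are hypothetical; this is not how Skelenox is implemented.

```python
import queue
import threading


class BackgroundPuller(threading.Thread):
    """Poll fetch() in a daemon thread and queue results for the UI thread."""

    def __init__(self, fetch, interval=1.0):
        super().__init__(daemon=True)
        self.fetch = fetch
        self.interval = interval
        self.results = queue.Queue()
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            self.results.put(self.fetch())
            # wait() returns early when stop() is called, unlike time.sleep().
            self._stop_event.wait(self.interval)

    def stop(self):
        self._stop_event.set()
```

The network wait then happens off the main thread. Applying the results to the database would still have to be done from IDA's main thread, in small steps, to keep the interface responsive.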

Modifying family name

A family name cannot be edited once it has been created.
A new form should be added to handle this case.

IDA commands authors

Hi,

The commands pushed by Skelenox are displayed as authored by "Analyst" in the Web UI, even if the skelsettings "username" is set to another value.

Cheers,

IDA v7.00 compatibility

Skelenox is not compatible with IDA 7.0 without the idaapi compatibility layer.

Skelenox should be ported to the new API to avoid problems with the next IDA update.

Impossible to change user password

With the current implementation of the set_pass method in the user controller,
the password is never changed, even if the user changes it in the user view.
