momalab / icsref

A tool for reverse engineering industrial control systems binaries.

License: MIT License

Python 99.65% PLSQL 0.35%
industrial-control-systems binaries reverse-engineering codesys wago

icsref's Introduction

ICSREF: ICS Reverse Engineering Framework

Overview

ICSREF is a modular framework that automates the reverse engineering process of CODESYS binaries compiled with the CODESYS v2 compiler.

    _______________ ____  ____________
   /  _/ ____/ ___// __ \/ ____/ ____/
   / // /    \__ \/ /_/ / __/ / /_
 _/ // /___ ___/ / _, _/ /___/ __/
/___/\____//____/_/ |_/_____/_/

by Tasos Keliris (@koukouviou)

Cite us!

If you find our work interesting and use it in your (academic or not) research, please cite our NDSS'19 paper describing ICSREF:

Anastasis Keliris and Michail Maniatakos, "ICSREF: A Framework for Automated Reverse Engineering of Industrial Control Systems Binaries", in NDSS'19.

Bibtex:

@inproceedings{keliris2019icsref,
title={{ICSREF}: A Framework for Automated Reverse Engineering of Industrial Control Systems Binaries},
author={Keliris, A. and Maniatakos, M.},
booktitle={Network and Distributed System Security Symposium (NDSS)},
year={2019}
}

Preview

Analyses

The framework can:

  • Perform core analysis of arbitrary PRG programs. Core analysis includes:
    1. Delimitation of binary blobs (i.e., functions/routines).
    2. Identification of calls to dynamic libraries.
    3. Identification of calls to static libraries (other locations in the same binary).
    4. Identification of how many and which physical I/Os the binary uses, provided a TRG file that contains the memory mappings of physical I/Os of the particular device the binary is compiled for.
  • Identify known library functions included statically in the binary:
    1. Using an opcode-based hash matching technique.
    2. Using experimental signature-based techniques. These are currently implemented only for Proportional-Integral-Derivative (PID) CODESYS library functions.
  • Extract arguments passed to static functions. This is currently implemented only for the PID_FIXCYCLE CODESYS library function, but it is trivial to extend to other functions of interest.
    1. Argument extraction is powered by symbolic execution with angr.
    2. It can handle cases where the arguments are not impacted by I/O measurements (i.e., defined globally or passed directly).
  • Plot SVG graphs of the analyzed binary, including:
    1. Calls between static functions
    2. Calls to dynamic functions
    3. Hyperlinks to the disassembly listings of each function from the SVG

Graphs are powered by Graphviz. Here's a neat example:

[image: example call graph of an analyzed binary]
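
As a rough sketch of how such a graph could be described for Graphviz, the Python snippet below emits DOT text for static and dynamic calls and attaches a URL attribute to each function node, which Graphviz turns into a hyperlink in SVG output. The function name build_dot, the node names, the listings directory, and the file naming are hypothetical. ICSREF's own graphing module may work differently.

```python
def build_dot(static_calls, dynamic_calls, listing_dir="listings"):
    """Build DOT text: solid edges for static calls, dashed for dynamic ones."""
    lines = ["digraph calls {", "    node [shape=box];"]

    # Every function that calls or is called statically gets a linked node.
    functions = set(static_calls) | set(dynamic_calls)
    for callees in static_calls.values():
        functions.update(callees)
    for name in sorted(functions):
        lines.append('    "{0}" [URL="{1}/{0}.txt"];'.format(name, listing_dir))

    for caller in sorted(static_calls):
        for callee in static_calls[caller]:
            lines.append('    "{}" -> "{}";'.format(caller, callee))

    # Dynamic library functions are drawn as ellipses without a listing link.
    for caller in sorted(dynamic_calls):
        for lib in dynamic_calls[caller]:
            lines.append('    "{}" [shape=ellipse];'.format(lib))
            lines.append('    "{}" -> "{}" [style=dashed];'.format(caller, lib))

    lines.append("}")
    return "\n".join(lines)


# Hypothetical call information for a small binary.
static_calls = {"MAIN": ["sub_1150", "sub_15d4"]}
dynamic_calls = {"sub_1150": ["ExampleLibCall"]}

dot_text = build_dot(static_calls, dynamic_calls)
print(dot_text)
```

Saving the output as graph.dot and running dot -Tsvg graph.dot -o graph.svg would render it, assuming Graphviz is installed.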

The framework supports an interactive mode, where all the processing modules are loaded. Users can further investigate and analyze their binaries by exploring the different options. The interactive environment also offers useful help docstrings.

(icsref) me@example:$ ./icsref.py

ICS Reverse Engineering Framework
    _______________ ____  ____________
   /  _/ ____/ ___// __ \/ ____/ ____/
   / // /    \__ \/ /_/ / __/ / /_    
 _/ // /___ ___/ / _, _/ /___/ __/    
/___/\____//____/_/ |_/_____/_/       

author: Tasos Keliris (@koukouviou)
Type <help> if you need a nudge
reversing@icsref:$ 
reversing@icsref:$ help

Documented commands (type help <topic>):
========================================
__changepid         changepid       exp_pid_match  history  pyscript  set      
__replace_callname  cleanup         graphbuilder   load     quit      shell    
_relative_load      cmdenvironment  hashmatch      pidargs  run       shortcuts
analyze             edit            help           py       save      show     

Installation

For the latest installation instructions see INSTALL.md. For the legacy installation instructions see here.

Documentation

The ICSREF API is documented in a Read the Docs style. Once you have downloaded the repository, go to the docs directory and open index.html in your favorite browser.

Acknowledgements

ICSREF, like all good things in life, stands on the shoulders of giants. The framework relies on symbolic execution using angr for performing the most interesting analyses, such as calculating offsets for static calls and the arguments to function calls. Disassembly listings for the graphing module are generated using the amazing r2. The interactive mode of the tool is powered by the cmd2 Python tool. Beautiful documentation is generated with Sphinx and the sphinx_rtd_theme.

Contributors

A big thank you to everyone contributing to this project. See CONTRIBUTORS.

icsref's People

Contributors

cedarctic, momalab, tkeliris, w00kong

icsref's Issues

(MAY HAVE) issue in hashmatch

Hi @tkeliris,
While trying to reproduce what you did in your demo video, I ran into a problem at the hashmatch stage:

Environment: Clean Ubuntu 16.04 and 18.04
reversing@icsref:$ analyze samples/PRG_binaries/In-house/TE.PRG
Working on samples/PRG_binaries/In-house/TE.PRG.
DONE: Hexdump generation
DONE: Header analysis
DONE: String analysis
DONE: I/O analysis
DONE: Find function boundaries
DONE: Function disassembly
DONE: Find dynamic calls
DONE: Find static calls
DONE: Call offsets renaming
Analysis of TE finished.
Total analysis time: 25.6079239845

reversing@icsref:$ !ls results/TE
TE_init_analysis.dat

reversing@icsref:$ hashmatch
Hashmatch module (MAY HAVE) found function maybe_DERIVATIVE_INIT at 0x1150
Hashmatch module (MAY HAVE) found function maybe_INTEGRAL_INIT at 0x15d4

reversing@icsref:$ pidargs
No PID_FIXCYCLE functions identified. Cannot extract arguments.

reversing@icsref:$ graphbuilder
Cleanup complete
Generated graph_TE.svg

Even though I could build the CFG, it was not identical to the figure in the paper.

https://ibb.co/TcznjcT

I am not sure how this happened, since everything works in your demo video.

Any help will be appreciated!

Using range

Hi! I would like to know whether ICSREF can be used on MC7 bytecode. If not, what would I need to do? Can you help me?

Targets with dynamic memory layout

Hi! I'm wondering how you would approach binaries for targets whose memory layout is dynamically allocated.

I've been playing around trying to reverse some HMI+PLC type devices.
To be more exact, Exor eTOP50x-series devices, which use an ARM core on an SoC.
They run project files generated by Exor's own "jMobile" to handle the HMI side, while almost everything else (anything not UI-related) consists of plain old CODESYS V2.3 binaries. Oh, and all this mess runs on top of WinCE6.
Those CODESYS files are compiled as ARMv7 binaries, but the only way to make any sense of them is to manually identify the allocated memory layout (CODESYS only reports that the memory is automatically allocated and nothing more). Most of the things ICSREF identifies automatically (such as function boundaries and header information) are there, mostly in the form one would expect, but the header addresses don't make any sense and the strings used as identifiers differ from the ones used in PRG_analysis.py. I made some progress by manually identifying the aforementioned addresses and strings and modifying PRG_analysis.py accordingly, but I never got the analysis to complete successfully. The furthest I got it to run was up to the 'find static libraries' routine. Nevertheless, the generated work-in-progress HEX proved to be very useful.

Anyway, I believe this situation / class of devices is out of scope for ICSREF, at least for now? And since I got satisfactory results anyway, this query is mostly just out of curiosity.

Anyway, very impressive & interesting work! I sure hope this project has a future!

Project recovering

I built an icsref environment on Ubuntu Desktop 16.04 LTS and everything went well. The problem is that running analyze on a real Wago 750-881 PRG file never ends. The system has been up for more than one day, with this result log:

author: Tasos Keliris (@koukouviou)
Type <help> if you need a nudge
reversing@icsref:$ history
reversing@icsref:$ analize /home/clientsam/ICSREF/PLC/default.prg
*** Unknown syntax: analize /home/clientsam/ICSREF/PLC/default.prg
reversing@icsref:$ analyze /home/clientsam/ICSREF/PLC/default.prg
Working on /home/clientsam/ICSREF/PLC/default.prg.
DONE: Hexdump generation
DONE: Header analysis
DONE: String analysis
DONE: I/O analysis
DONE: Find function boundaries
(and nothing more)

The command given was: analyze /home/clientsam/ICSREF/PLC/default.prg
default.prg was retrieved from the Wago PLC's memory via FTP get.
The program seems to be stuck in an endless loop, and top shows the python command using 100% CPU most of the time.
Can you give any suggestions?

Many thanks

KeyError

I hit a KeyError while running analyze on 2.PRG, and I found that many samples produce the same error. I wonder how to deal with this problem; it looks like an error in finding the function calls.
The error and traceback are below:

reversing@icsref:$ analyze /home/ye/ICSREF/samples/PRG_binaries/GitHub/2.PRG
Working on /home/ye/ICSREF/samples/PRG_binaries/GitHub/2.PRG.
DONE: Hexdump generation
DONE: Header analysis
DONE: String analysis
DONE: I/O analysis
DONE: Find function boundaries
DONE: Function disassembly
DONE: Find dynamic calls
DONE: Find static calls

Traceback (most recent call last):
  File "/home/ye/.virtualenvs/icsref/local/lib/python2.7/site-packages/cmd2.py", line 2476, in onecmd_plus_hooks
    stop = self.onecmd(statement)
  File "/home/ye/.virtualenvs/icsref/local/lib/python2.7/site-packages/cmd2.py", line 2675, in onecmd
    stop = func(statement)
  File "/home/ye/ICSREF/icsref/icsref.py", line 75, in do_analyze
    self.prg = Program(filename)
  File "/home/ye/ICSREF/icsref/PRG_analysis.py", line 107, in __init__
    self.__find_libcalls()
  File "/home/ye/ICSREF/icsref/PRG_analysis.py", line 281, in __find_libcalls
    lib_name = self.libs_dict[jump]
KeyError: 21676
EXCEPTION of type 'KeyError' occurred with message: '21676'
To enable full traceback, run the following command: 'set debug true'
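
The traceback shows a plain dictionary lookup failing because the call target 21676 is not a key in libs_dict. One way to investigate, sketched below in Python, is to fall back to a placeholder name so the analysis can continue and the unresolved targets become visible. This is only a diagnostic idea, not a confirmed fix. The helper resolve_lib_name and the sample dictionary contents are hypothetical, and it assumes that libs_dict maps integer call targets to library function names.

```python
def resolve_lib_name(libs_dict, jump):
    """Look up a call target, falling back to a placeholder for unknown ones."""
    name = libs_dict.get(jump)
    if name is None:
        # Unknown target: keep going, but make the gap visible in the output.
        name = "unknown_{:#x}".format(jump)
    return name


# Hypothetical table with a single known dynamic library entry.
libs_dict = {0x1000: "ExampleLib"}

print(resolve_lib_name(libs_dict, 0x1000))  # known target
print(resolve_lib_name(libs_dict, 21676))   # the target from the traceback
```

Whether an unresolved target is harmless or points to a misparsed library table in the binary cannot be told from the traceback alone, so any placeholder names in the results deserve a closer look.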

CODESYS v2.3 version

Hello, can you help me? I would like to know the specific CODESYS 2.3 version you used, because I cannot open your *.pro files successfully. I get the error "Description file for module 'Module.root' not found". Thanks.
