mandiant / fidl
A sane API for IDA Pro's decompiler. Useful for malware RE and vulnerability research
Home Page: https://fidl.readthedocs.io
License: MIT License
Do you have plans to support IDA 7.0?
Hi!
Great job on providing a higher level API for Hex-Rays decompiler!
There are a few bugs we've encountered when we first tried it and would like to contribute patches to fix them. However, the license isn't explicitly given. I know, it's GitHub, and you probably want people to fork, but legally we have no right to republish (fork) without your explicit consent. You probably also want to protect yourself and/or FireEye and keep a copyright notice and credit in forks.
It mentions MIT here: https://github.com/fireeye/FIDL/blob/master/setup.py#L22. Should it be MIT?
Once this is fixed we'll open pull requests with the fixes,
Thanks!
M-E
The call to get_func in find_all_calls_to may return a NoneType which causes the .start_ea attribute lookup to throw.
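A minimal sketch of the kind of guard that would avoid this, not FIDL's actual fix. The helper name `func_start_or_none` is hypothetical, and the lookup function is passed in as a parameter so the snippet runs outside IDA. Inside IDA one would pass `ida_funcs.get_func`, which returns None when the address does not belong to a function.

```python
from types import SimpleNamespace


def func_start_or_none(ea, get_func):
    """Return the start address of the function containing ea, or None.

    get_func is injected (hypothetical design choice) so the guard can be
    exercised without IDA; in IDA it would be ida_funcs.get_func.
    """
    func = get_func(ea)
    if func is None:
        # The address is not inside any defined function: do not touch .start_ea
        return None
    return func.start_ea


def fake_get_func(ea):
    """Stand-in for ida_funcs.get_func: one function covering 0x1000-0x10FF."""
    if 0x1000 <= ea < 0x1100:
        return SimpleNamespace(start_ea=0x1000)
    return None


print(func_start_or_none(0x1010, fake_get_func))
print(func_start_or_none(0x2000, fake_get_func))
```

Callers would then skip xrefs for which the helper returns None instead of raising.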
Hi,
I've noticed that in the latest version of IDA for Linux (7.5.200728) FIDL fails with the following error:
Python 3.8.5
[GCC 9.3.0]
IDAPython v7.4.0 final (serial 0) (c) The IDAPython Team <[email protected]>
--------------------------------------------------------------------------------------
Python>import FIDL.decompiler_utils as du
Python>c = du.controlFlowinator(ea=here(),fast=False)
...
-> OK
in method 'ctree_items_t___getitem__', argument 2 of type 'size_t'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/path/FIDL/FIDL/decompiler_utils.py", line 1403, in __init__
self._generate_better_cfg()
File "/path/FIDL/FIDL/decompiler_utils.py", line 1845, in _generate_better_cfg
hi = citem2higher(obj) # cinsn_t
File "/path/FIDL/FIDL/decompiler_utils.py", line 530, in citem2higher
if citem.is_expr():
AttributeError: 'NoneType' object has no attribute 'is_expr'
The error seems to occur because ida_hexrays.decompile() populates the ctree lazily: if body.cblock is accessed directly, before any decompiler API method that forces population of the structure has been called, it returns None. Calling refresh_func_ctext() after decompile() fixes the issue.
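The workaround described above can be sketched as follows. This is an illustration under assumptions, not the patch that went into FIDL: `decompile_populated` is a hypothetical helper, and the decompile callable is injected so the snippet runs outside IDA. In IDA one would pass `ida_hexrays.decompile`, whose result exposes `refresh_func_ctext()`.

```python
def decompile_populated(ea, decompile):
    """Decompile ea and force the pseudocode text (and ctree) to be built.

    decompile is injected (hypothetical design choice); in IDA it would be
    ida_hexrays.decompile.
    """
    cfunc = decompile(ea)
    if cfunc is None:
        # Defensive check: nothing to refresh if decompilation gave no result
        return None
    # Force population before anything reads cfunc.body directly
    cfunc.refresh_func_ctext()
    return cfunc


class FakeCfunc:
    """Stand-in that imitates a lazily populated decompilation result."""

    def __init__(self):
        self.body = None

    def refresh_func_ctext(self):
        self.body = "populated"


cf = decompile_populated(0x401000, lambda ea: FakeCfunc())
print(cf.body)
```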
Given some code like this:
33| v28 = a3;
34| v27 = objc_retain(CFSTR("/Library/MobileSubstrate/DynamicLibraries/libFLEX.dylib"));
35| v3 = objc_msgSend(&OBJC_CLASS___NSFileManager, "defaultManager");
36| v26 = (void *)objc_retainAutoreleasedReturnValue(v3);
with the cursor right here (column 38 in the code above) for example:
v
OBJC_CLASS__|_NSFileManager
funcEA = ...
line = 35
cursor = 38
item = fidl.lex_citem_at_pos(funcEA, line, cursor)
# Prints obj
item.cexpr.opname
# Prints address of _OBJC_CLASS_$_NSFileManager
item.cexpr.obj_ea
The explicit cursor position is important to me as I'm not working directly in IDA; I would like to be able to query arbitrary locations without using IDA's cursor API.
Referring to this code, used to add a comment:
When I saw this, I thought to myself, surely there is a better way!
According to the IDA CPP header,
/// Invisible COLOR_ADDR tags in the output text are used to refer to ctree items and variables
struct ctree_anchor_t
{
uval_t value;
#define ANCHOR_INDEX 0x1FFFFFFF
#define ANCHOR_MASK 0xC0000000
#define ANCHOR_CITEM 0x00000000 ///< c-tree item
#define ANCHOR_LVAR 0x40000000 ///< declaration of local variable
#define ANCHOR_ITP 0x80000000 ///< item type preciser
#define ANCHOR_BLKCMT 0x20000000 ///< block comment (for ctree items)
...
item_preciser_t get_itp(void)
bool is_valid_anchor(void)
bool is_citem_anchor(void)
bool is_itp_anchor(void)
...
};
… these other types of anchors are embedded in the string, and the citem_t anchor just happens to be all 0's. I do (think I) see them in a few places, such as this local variable anchor here:
�(0000000040000007��void *v7��� ;� // ��[xsp+48h] [xbp-8h]��
But I don't see them at all on some other lines where I would at least expect to see an ANCHOR_ITP for an ITP_SEMI item preciser, like this:
�(0000000000000031 �(0000000000000033��objc_release���(0000000000000032� (� �(0000000000000034��v1��� )� � ;� �(0000000000000031
which corresponds to this line:
objc_release(v1);
So, what gives? Why do these anchors appear only on some lines?
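For reference, the bit layout in the header excerpt above can be decoded with a few masks. A sketch in Python, assuming the hex digits that follow the `(` in the raw strings above are the anchor value; `decode_anchor` is a hypothetical helper, not part of FIDL or IDAPython.

```python
# Constants copied from the ctree_anchor_t excerpt quoted above
ANCHOR_INDEX = 0x1FFFFFFF
ANCHOR_MASK = 0xC0000000
ANCHOR_CITEM = 0x00000000
ANCHOR_LVAR = 0x40000000
ANCHOR_ITP = 0x80000000
ANCHOR_BLKCMT = 0x20000000

_KINDS = {ANCHOR_CITEM: "citem", ANCHOR_LVAR: "lvar", ANCHOR_ITP: "itp"}


def decode_anchor(hex_digits):
    """Split an anchor value into (kind, index, is_block_comment).

    hex_digits is the run of hex characters found after the address tag,
    e.g. "0000000040000007" (assumption about the raw string layout).
    """
    value = int(hex_digits, 16)
    kind = _KINDS.get(value & ANCHOR_MASK, "invalid")
    index = value & ANCHOR_INDEX
    is_block_comment = bool(value & ANCHOR_BLKCMT)
    return kind, index, is_block_comment


print(decode_anchor("0000000040000007"))  # ('lvar', 7, False)
print(decode_anchor("0000000000000031"))  # ('citem', 49, False)
```

On the two sample strings this yields a local-variable anchor with index 7 and a ctree-item anchor with index 0x31, which matches the observation that ctree-item anchors have all-zero type bits.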
Hello!
First of all, I'd like to thank you for making FIDL public, it really helps!
I have a question regarding the following code from FIDL decompiler_utils.py:
    if not has_cached_cfunc(ea):
        # Open the disassembly view here
        # to populate the cache
        pw = pseudoViewer()
        pw.show(ea=ea)
    try:
        cf = decompile(ea=ea, flags=ida_hexrays.DECOMP_NO_WAIT)
    except ida_hexrays.DecompilationFailure as e:
        print("Failed to decompile @ {:X}".format(ea))
        cf = None
This code creates a lot of decompiler windows and slows down or even hangs IDA when working with large databases. If I don't use pseudoViewer and just call decompile() for the target functions, the code works fine (at least as far as I can see).
Can you explain why you decided to use pseudoViewer? What are its benefits compared to omitting this code and just using decompile()?
I realize this repo doesn't strive to support Python 2 / IDA < 7.4, but I think a lot of it works out of the box except for this code right here: https://github.com/fireeye/FIDL/blob/e6ceb000cda43b450717eb171309c02dee06dd4f/FIDL/decompiler_utils.py#L1070-L1073
The version of this function present in 7.0 only accepts an address and a failure pointer, no flags. Would it be possible to somehow detect which one is available and call that? Or to add a new Python 2.7 and IDA < 7.4 release to call decompile() without the flags?
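One way to do the detection asked about here is to try the call with flags and fall back when the older signature rejects the keyword. A sketch only: `decompile_compat` is a hypothetical wrapper, the two stand-in functions merely imitate the 7.0-style and 7.4-style signatures described above, and whether the real binding raises TypeError for the extra argument is an assumption.

```python
def decompile_compat(decompile, ea, flags=0):
    """Call decompile with flags when supported, without flags otherwise.

    Caveat: a TypeError raised for an unrelated reason inside decompile
    would also trigger the fallback.
    """
    try:
        return decompile(ea, flags=flags)
    except TypeError:
        # Older signature: address (and failure object) only, no flags
        return decompile(ea)


def old_decompile(ea, hf=None):
    """Stand-in imitating the signature without flags."""
    return ("old", ea)


def new_decompile(ea, hf=None, flags=0):
    """Stand-in imitating the signature with flags."""
    return ("new", ea, flags)


print(decompile_compat(old_decompile, 0x10, flags=4))
print(decompile_compat(new_decompile, 0x10, flags=4))
```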
Python3 latest ida, hash: 3126280e968397a118b2e75e5349ef613e6ea5a1aca458c38a9271490458313c
import itertools
from idaapi import *
from idautils import *
from idc import *
import ida_hexrays
import FIDL.decompiler_utils as du

def string_decoder(args):
    if args and args[0].type == 'string':
        s = bytearray.fromhex(args[0].val)
        res = ""
        last = s[0]
        for a,b in zip(s[1:], itertools.cycle('IIDH47ETIBQRYOF258RYOSYW5XVMEYODH257Y')):
            x = a ^ ord(b)
            z = (x - last) % 255
            res+= chr(z)
            last = a
        return res

def get_func_start(xref):
    try:
        if len(get_func_name(xref.frm)) > 0:#get_func_start will crash IDA without this additional check
            if xref.iscode:
                func = get_func(xref.frm)
                find_func_bounds(func, idaapi.FIND_FUNC_DEFINE)
                return func.start_ea
        return 0
    except:
        return 0

def main():
    addr = 0x004808E0
    func_list = []
    if addr is not None:
        for xref in XrefsTo(addr, ida_xref.XREF_ALL):
            f = get_func_start(xref)
            if f == 0:
                print('xref outside of defined function %x' % xref.frm)
            else:
                func_list.append(f)
        func_list = set(func_list)
        count = 0
        for f in func_list:
            c = du.controlFlowinator(ea=f)
            for co in c.calls:
                if co.call_ea == addr:
                    t = string_decoder(co.args)
                    du.create_comment(co.c,co.ea,'%s' % (t))
                    count+=1
        print('calls found %d' % (count))

if __name__ == '__main__':
    main()
Traceback (most recent call last):
File "<string>", line 53, in <module>
File "<string>", line 44, in main
File "C:\Users\steve\AppData\Roaming\Python\Python38\site-packages\FIDL\decompiler_utils.py", line 1402, in __init__
self._generate_better_cfg()
File "C:\Users\steve\AppData\Roaming\Python\Python38\site-packages\FIDL\decompiler_utils.py", line 1844, in _generate_better_cfg
hi = citem2higher(obj) # cinsn_t
File "C:\Users\steve\AppData\Roaming\Python\Python38\site-packages\FIDL\decompiler_utils.py", line 530, in citem2higher
if citem.is_expr():
AttributeError: 'NoneType' object has no attribute 'is_expr'
As far as I understand, currently, if one wants to get all the calls to a certain function, they have two options:
display_all_calls_to (function)
https://github.com/fireeye/FIDL/blob/6b127946b704d4e5f027c48cdb02cbdbef4d8890/FIDL/decompiler_utils.py#L1302-L1307
find_all_calls_to (f_name, ea)
https://github.com/fireeye/FIDL/blob/6b127946b704d4e5f027c48cdb02cbdbef4d8890/FIDL/decompiler_utils.py#L2250-L2255
While display_all_calls_to prints all the calls to a function globally (across the entire binary), the latter, find_all_calls_to, returns only the xrefs to a function from one specific function (the one that ea belongs to).
The problem is that the name find_all_calls_to is confusing, because the search range is limited to a specific function, and that there is no option to get a list of callObj for all the calls to a function globally.
I suggest having three functions:
- display_all_calls_to (function) - keep as is
- find_all_calls_to_from_function (f_name, ea) - will behave as the current find_all_calls_to
- find_all_calls_to (function) - will return a list of callObj for all the xrefs, globally
On IDA 7.4, the LocByName function was removed and replaced by get_name_ea and get_name_ea_simple.
Using display_all_calls_to, which is the only API that uses LocByName, results in the following error:
Python>import FIDL.decompiler_utils as du
Python>du.display_all_calls_to("decryptString")
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "c:\apps\fidl\FIDL\decompiler_utils.py", line 1309, in display_all_calls_to
f_ea = LocByName(func_name)
NameError: name 'LocByName' is not defined
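A compatibility shim along these lines would keep the lookup working on both sides of the rename. This is a sketch, not the fix applied in FIDL: `resolve_name` is a hypothetical helper, and the module is passed in (with `SimpleNamespace` stand-ins below) so the snippet runs outside IDA. In IDA the argument would be the `idc` module.

```python
from types import SimpleNamespace


def resolve_name(idc_module, name):
    """Resolve a name to an address using whichever lookup the module has."""
    lookup = getattr(idc_module, "get_name_ea_simple", None)
    if lookup is None:
        # Older API name, only present where the legacy functions still exist
        lookup = idc_module.LocByName
    return lookup(name)


# Stand-ins for the idc module before and after the rename
new_idc = SimpleNamespace(get_name_ea_simple=lambda name: 0x401000)
old_idc = SimpleNamespace(LocByName=lambda name: 0x401000)

print(hex(resolve_name(new_idc, "decryptString")))
print(hex(resolve_name(old_idc, "decryptString")))
```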
File "<string>", line 62, in main
File "C:\Python27\lib\site-packages\FIDL\decompiler_utils.py", line 1388, in __init__
self._generate_i_cfg(blocks_to_expand=blocks)
File "C:\Python27\lib\site-packages\FIDL\decompiler_utils.py", line 1813, in _generate_i_cfg
self._generate_i_cfg(blocks_to_expand=blocks_to_expand)
File "C:\Python27\lib\site-packages\FIDL\decompiler_utils.py", line 1800, in _generate_i_cfg
new_blocks = self._expand_switch_block(block)
File "C:\Python27\lib\site-packages\FIDL\decompiler_utils.py", line 1609, in _expand_switch_block
self.i_cfg.add_edge(case_ins[-2].index, succ)
IndexError: list index out of range