google / binja-hexagon
License: GNU General Public License v2.0
decode_jmp_buf does not appear to be thread safe, yet it is shared between threads. If thread 1 is about to perform a longjmp through it and thread 2 calls setjmp on it before thread 1 jumps, thread 1 will start using thread 2's stack. On macOS this causes a lot of crashes, because setjmp uses the buffer to temporarily save x30 around its internal call to sigprocmask; the value is then reloaded and validated with an autibsp:
setjmp @ libsystem_platform.dylib:
-> 0x188a83b24: pacibsp
0x188a83b28: stp x21, x30, [x0]
0x188a83b2c: mov x21, x0
0x188a83b30: orr w0, wzr, #0x1
0x188a83b34: mov x1, #0x0
0x188a83b38: add x2, x21, #0xb0
0x188a83b3c: bl 0x188a885f8 ; symbol stub for: sigprocmask
0x188a83b40: mov x0, x21
0x188a83b44: ldp x21, x30, [x0]
0x188a83b48: autibsp
0x188a83b4c: eor x16, x30, x30, lsl #1
0x188a83b50: tbz x16, #0x3e, 0x188a83a54 ; _setjmp
If another thread writes to the buffer during the time of register save & restore, a wrong value will find its way into x30 and autibsp will crash the process. If _setjmp and _longjmp are used instead, skipping the register save & restore at the cost of not saving the signal mask, the crashes seemingly stop [^1]. I am not sure what the reason for this is.
Potential fixes are:
[^1] I do have one crash that does not occur in setjmp and that I was unable to reproduce, but I think it may also be related to decode_jmp_buf not being thread safe.
Here is the log output. It seems that, for some reason, the exported Binary Ninja API was not compiled properly.
[ 92%] Linking CXX shared library libarch_hexagon.dylib
Undefined symbols for architecture x86_64:
"_BNAreArgumentRegistersUsedForVarArgs", referenced from:
BinaryNinja::CoreCallingConvention::AreArgumentRegistersUsedForVarArgs() in libbinaryninjaapi.a(callingconvention.cpp.o)
"_BNCreateStructureFromOffsetAccess", referenced from:
BinaryNinja::BinaryView::CreateStructureFromOffsetAccess(BinaryNinja::QualifiedName const&, bool*) const in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNCreateStructureMemberFromAccess", referenced from:
BinaryNinja::BinaryView::CreateStructureMemberFromAccess(BinaryNinja::QualifiedName const&, unsigned long long) const in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNCreateWideCharType", referenced from:
BinaryNinja::Type::WideCharType(unsigned long, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in libbinaryninjaapi.a(type.cpp.o)
"_BNCreateWideCharTypeBuilder", referenced from:
BinaryNinja::TypeBuilder::WideCharType(unsigned long, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in libbinaryninjaapi.a(type.cpp.o)
"_BNFindAllConstantWithProgress", referenced from:
BinaryNinja::BinaryView::FindAllConstant(unsigned long long, unsigned long long, unsigned long long, BinaryNinja::Ref<BinaryNinja::DisassemblySettings>, BNFunctionGraphType, std::__1::function<bool (unsigned long, unsigned long)> const&, std::__1::function<bool (unsigned long long, BinaryNinja::LinearDisassemblyLine const&)> const&) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNFindAllDataWithProgress", referenced from:
BinaryNinja::BinaryView::FindAllData(unsigned long long, unsigned long long, BinaryNinja::DataBuffer const&, BNFindFlag, std::__1::function<bool (unsigned long, unsigned long)> const&, std::__1::function<bool (unsigned long long, BinaryNinja::DataBuffer const&)> const&) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNFindAllTextWithProgress", referenced from:
BinaryNinja::BinaryView::FindAllText(unsigned long long, unsigned long long, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, BinaryNinja::Ref<BinaryNinja::DisassemblySettings>, BNFindFlag, BNFunctionGraphType, std::__1::function<bool (unsigned long, unsigned long)> const&, std::__1::function<bool (unsigned long long, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, BinaryNinja::LinearDisassemblyLine const&)> const&) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNFreeTypeFieldReferenceSizeInfo", referenced from:
BinaryNinja::BinaryView::GetAllSizesReferenced(BinaryNinja::QualifiedName const&) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNFreeTypeFieldReferenceSizes", referenced from:
BinaryNinja::BinaryView::GetSizesReferenced(BinaryNinja::QualifiedName const&, unsigned long long) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNFreeTypeFieldReferenceTypeInfo", referenced from:
BinaryNinja::BinaryView::GetAllTypesReferenced(BinaryNinja::QualifiedName const&) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNFreeTypeFieldReferenceTypes", referenced from:
BinaryNinja::BinaryView::GetTypesReferenced(BinaryNinja::QualifiedName const&, unsigned long long) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNFreeTypeFieldReferences", referenced from:
BinaryNinja::BinaryView::GetCodeReferencesForTypeField(BinaryNinja::QualifiedName const&, unsigned long long) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNGetAllFieldsReferenced", referenced from:
BinaryNinja::BinaryView::GetAllFieldsReferenced(BinaryNinja::QualifiedName const&) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNGetAllSizesReferenced", referenced from:
BinaryNinja::BinaryView::GetAllSizesReferenced(BinaryNinja::QualifiedName const&) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNGetAllTypesReferenced", referenced from:
BinaryNinja::BinaryView::GetAllTypesReferenced(BinaryNinja::QualifiedName const&) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNGetBasicBlockInstructionContainingAddress", referenced from:
BinaryNinja::BasicBlock::GetInstructionContainingAddress(unsigned long long, unsigned long long*) in libbinaryninjaapi.a(basicblock.cpp.o)
"_BNGetFunctionAddressRanges", referenced from:
BinaryNinja::Function::GetAddressRanges() in libbinaryninjaapi.a(function.cpp.o)
"_BNGetFunctionHighestAddress", referenced from:
BinaryNinja::Function::GetHighestAddress() in libbinaryninjaapi.a(function.cpp.o)
"_BNGetFunctionLowestAddress", referenced from:
BinaryNinja::Function::GetLowestAddress() in libbinaryninjaapi.a(function.cpp.o)
"_BNGetInstructionContainingAddress", referenced from:
BinaryNinja::Function::GetInstructionContainingAddress(BinaryNinja::Architecture*, unsigned long long, unsigned long long*) in libbinaryninjaapi.a(function.cpp.o)
"_BNGetSizesReferenced", referenced from:
BinaryNinja::BinaryView::GetSizesReferenced(BinaryNinja::QualifiedName const&, unsigned long long) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNGetTypesReferenced", referenced from:
BinaryNinja::BinaryView::GetTypesReferenced(BinaryNinja::QualifiedName const&, unsigned long long) in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNHasInitialAnalysis", referenced from:
BinaryNinja::BinaryView::HasInitialAnalysis() in libbinaryninjaapi.a(binaryview.cpp.o)
"_BNSetAnalysisHold", referenced from:
BinaryNinja::BinaryView::SetAnalysisHold(bool) in libbinaryninjaapi.a(binaryview.cpp.o)
ld: symbol(s) not found for architecture x86_64
This issue was automatically created by Allstar.
Security Policy Violation
Project is out of compliance with Binary Artifacts policy: binaries present in source code
Rule Description
Binary Artifacts are an increased security risk in your repository. Binary artifacts cannot be reviewed, allowing the introduction of possibly obsolete or maliciously subverted executables. For more information see the Security Scorecards Documentation for Binary Artifacts.
Remediation Steps
To remediate, remove the generated executable artifacts from the repository.
Artifacts Found
Additional Information
This policy is drawn from Security Scorecards, which is a tool that scores a project's adherence to security best practices. You may wish to run a Scorecards scan directly on this repository for more details.
Allstar has been installed on all Google managed GitHub orgs. Policies are gradually being rolled out and enforced by the GOSST and OSPO teams. Learn more at http://go/allstar
This issue will auto resolve when the policy is in compliance.
Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.
I suspect this is because <cstdint> is no longer included implicitly, and the third_party absl version is too old to include it itself.
[ 39%] Building CXX object third_party/abseil-cpp/absl/strings/CMakeFiles/absl_str_format_internal.dir/internal/str_format/extension.cc.o
In file included from /home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.cc:16:
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:34:6: warning: elaborated-type-specifier for a scoped enum must not use the ‘class’ keyword
34 | enum class FormatConversionChar : uint8_t;
| ~~~~ ^~~~~
| -----
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:34:33: error: found ‘:’ in nested-name-specifier, expected ‘::’
34 | enum class FormatConversionChar : uint8_t;
| ^
| ::
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:34:12: error: ‘FormatConversionChar’ has not been declared
34 | enum class FormatConversionChar : uint8_t;
| ^~~~~~~~~~~~~~~~~~~~
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:35:6: warning: elaborated-type-specifier for a scoped enum must not use the ‘class’ keyword
35 | enum class FormatConversionCharSet : uint64_t;
| ~~~~ ^~~~~
| -----
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:35:36: error: found ‘:’ in nested-name-specifier, expected ‘::’
35 | enum class FormatConversionCharSet : uint64_t;
| ^
| ::
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:35:12: error: ‘FormatConversionCharSet’ has not been declared
35 | enum class FormatConversionCharSet : uint64_t;
| ^~~~~~~~~~~~~~~~~~~~~~~
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:173:8: warning: elaborated-type-specifier for a scoped enum must not use the ‘class’ keyword
173 | enum class Enum : uint8_t {
| ~~~~ ^~~~~
| -----
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:173:14: error: use of enum ‘Enum’ without previous declaration
173 | enum class Enum : uint8_t {
| ^~~~
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:173:21: error: ‘uint8_t’ was not declared in this scope
173 | enum class Enum : uint8_t {
| ^~~~~~~
/home/implr/dev/binja-hexagon/third_party/abseil-cpp/absl/strings/internal/str_format/extension.h:29:1: note: ‘uint8_t’ is defined in header ‘<cstdint>’; did you forget to ‘#include <cstdint>’?
28 | #include "absl/strings/internal/str_format/output.h"
+++ |+#include <cstdint>
29 | #include "absl/strings/string_view.h"
In Hexagon, when a predicate register is used as a scalar value, the LSB of the register is used to show true/false. This can be confirmed by testing with hexagon-sim or QEMU.
Consider the following randomly generated instruction packet:
{
R6|=asl(R3,#28u5)
if (P3) R7=zxth(R8)
if (!P1) R0=sxtb(R6)
if (P0) R8=sxth(R1)
}
It is lifted to the following LLIL:
temp6.d = R6
temp6.d = temp6.d | R3 << 0x1c
if (P3) then 3 else 5
temp7.d = R8 & ((1 << 0x10) - 1)
goto 5
if (not.b(P1)) then 6 else 8
temp0.d = ((R6 & ((1 << 8) - 1)) ^ 0x1 << (8 - 1)) - (1 << (8 - 1))
goto 8
if (P0) then 9 else 11
temp8.d = ((R1 & ((1 << 0x10) - 1)) ^ 0x1 << (0x10 - 1)) - (1 << (0x10 - 1))
goto 11
R8 = temp8.d
R0 = temp0.d
R7 = temp7.d
R6 = temp6.d
In the code, if (P3), if (not.b(P1)) and if (P0) show that the lifter is not considering only the LSB.
I assume this bug was inherited from the old version of QEMU, which has the same bug.
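To make the difference concrete, here is a minimal Python sketch comparing an LSB test with the way the lifted conditions appear to behave. It assumes that an LLIL `if (P)` is taken for any non-zero value and that `not.b` is a bitwise NOT on one byte; the function names are hypothetical and are not part of the plugin.

```python
def lsb_true(pred: int) -> bool:
    """Scalar use of a predicate as described above: only bit 0 decides."""
    return (pred & 1) == 1


def lifted_if(pred: int) -> bool:
    """Models `if (P)` under the assumption that any non-zero byte is true."""
    return (pred & 0xFF) != 0


def lifted_if_not(pred: int) -> bool:
    """Models `if (not.b(P))`: bitwise NOT of the byte, then a non-zero test."""
    return (~pred & 0xFF) != 0


if __name__ == "__main__":
    for pred in (0x00, 0x01, 0x02, 0xFF):
        print(
            f"P=0x{pred:02x}",
            f"lsb={lsb_true(pred)}",
            f"lifted_if={lifted_if(pred)}",
            f"not_lsb={not lsb_true(pred)}",
            f"lifted_if_not={lifted_if_not(pred)}",
        )
```

Under these assumptions the two disagree for values such as 0x02 (non-zero, but LSB clear), and the negated form disagrees for 0x01, where the bitwise NOT is still non-zero.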
The publicly available "Hexagon V5x Programmer’s Reference Manual" describes behavior for "auto-AND" predicates in section 6.2.3
If multiple compare instructions in a packet write to the same predicate register, the result is the logical AND of the individual compare results.
binja-hexagon does not seem to support this right now.
test_auto_and_predicates:
{ P0 = cmp.eq(r0,#1)
P0 = cmp.eq(r0,#2)
if (P0.new) jump:T 1f }
{ r0 = #0
jumpr r31 }
1:
{ r0 = #1
jumpr r31 }
Lifts to
...
1 @ 00020330 temp90.b = P0
2 @ 00020330 temp90.b = R0 == 1 # P0's temp written first
3 @ 00020330 temp90.b = R0 == 2 # P0's temp overwritten instead of AND'd
4 @ 00020330 if (temp90.b) then 5 else 7
...
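The rule quoted from the manual can be sketched as a small Python model. This is only an illustration of the intended semantics, not the plugin's lifter; `commit_last_write` and `commit_auto_and` are hypothetical names, and each write is represented as a `(register, result)` pair in packet order.

```python
from typing import Dict, Iterable, Tuple

PredWrite = Tuple[str, bool]


def commit_last_write(writes: Iterable[PredWrite]) -> Dict[str, bool]:
    """Behaviour seen in the lifted IL: a later write replaces an earlier one."""
    result: Dict[str, bool] = {}
    for name, value in writes:
        result[name] = value
    return result


def commit_auto_and(writes: Iterable[PredWrite]) -> Dict[str, bool]:
    """Auto-AND: several writes to one predicate in a packet are AND-ed together."""
    result: Dict[str, bool] = {}
    for name, value in writes:
        result[name] = result.get(name, True) and value
    return result


def packet_writes(r0: int):
    """The two compares from test_auto_and_predicates, both targeting P0."""
    return [("P0", r0 == 1), ("P0", r0 == 2)]


if __name__ == "__main__":
    for r0 in (1, 2, 3):
        print(r0, commit_last_write(packet_writes(r0)), commit_auto_and(packet_writes(r0)))
```

With r0 equal to 2, the last-write model takes the jump while the auto-AND model does not, since r0 cannot equal both 1 and 2.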
I followed the setup and build instructions on macOS with Python 3.8 and CMake 3.21.3, but I am stuck on this error.
[ 85%] Generating insn_text_funcs_generated.cc
Processing 2228 tags in parallel
Done processing tag # 0
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/Users/xxxxx/Desktop/xxx/decomplie_tools/binja-hexagon/plugin/gen_insn_text_funcs.py", line 602, in gen_insn_text_func
tokens = process_insn_tokens(tag, regs, imms)
File "/Users/xxxxx/Desktop/xxx/decomplie_tools/binja-hexagon/plugin/gen_insn_text_funcs.py", line 595, in process_insn_tokens
beh = behdict[tag]
KeyError: 'J2_jump'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/xxxxx/Desktop/xxx/decomplie_tools/binja-hexagon/plugin/gen_insn_text_funcs.py", line 681, in <module>
main()
File "/Users/xxxxx/Desktop/xxx/decomplie_tools/binja-hexagon/plugin/gen_insn_text_funcs.py", line 652, in main
tag_to_fbody = process_all_tags(tagregs, tagimms)
File "/Users/xxxxx/Desktop/xxx/decomplie_tools/binja-hexagon/plugin/gen_insn_text_funcs.py", line 622, in process_all_tags
tag_to_fbody[tag] = future.result() # myslef added reslut error
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
KeyError: 'J2_jump'
make[2]: *** [plugin/insn_text_funcs_generated.cc] Error 1
make[1]: *** [plugin/CMakeFiles/plugin_lib.dir/all] Error 2
make: *** [all] Error 2
I found that behdict has not had any elements added to it by the time gen_insn_text_funcs.py reads from it with 'beh = behdict[tag]'.
Undefined symbols for architecture arm64:
"_BNMergeUserAnalysis", referenced from:
BinaryNinja::FileMetadata::MergeUserAnalysis(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::function<bool (unsigned long, unsigned long)> const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >) in libbinaryninjaapi.a(filemetadata.cpp.o)
BNMergeUserAnalysis is gone in latest API.
Needs to be replaced.
I didn't see this caveat in the docs: when multiple Hexagon binaries are open in the same Binary Ninja process, the disassembly at a given address is the same for all of them, even though they contain different instructions at that address.
Looks like packets are being cached by address and the plugin does not take into account the current BinaryView for that address. When a user opens a different hexagon binary and goes to the same address that was previously cached, instead of attempting to disassemble the current bytes, it disassembles the already cached bytes from the other tab.
Since it is cached at the plugin level and not the view, closing all hexagon tabs and opening up a new file (that isn't the original file cached) will still show the cached data from the initial file.
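The effect described above, and one possible way around it, can be sketched with a toy cache in Python. This is a simplified model and not the plugin's code: `PacketCache`, `view_key` and the `decode` callback are hypothetical, and the only point is the choice of cache key.

```python
from typing import Callable, Dict, Hashable


class PacketCache:
    """Toy decode cache keyed either by address alone or by (view, address)."""

    def __init__(self, decode: Callable[[bytes], str], per_view: bool) -> None:
        self._decode = decode
        self._per_view = per_view
        self._entries: Dict[Hashable, str] = {}

    def lookup(self, view_key: Hashable, addr: int, raw: bytes) -> str:
        # With per_view=False, two files that share an address share an entry.
        key = (view_key, addr) if self._per_view else addr
        if key not in self._entries:
            self._entries[key] = self._decode(raw)
        return self._entries[key]


if __name__ == "__main__":
    shared = PacketCache(bytes.hex, per_view=False)
    scoped = PacketCache(bytes.hex, per_view=True)
    for cache in (shared, scoped):
        first = cache.lookup("file_a", 0x1000, b"\x01\x41\x01\xf3")
        second = cache.lookup("file_b", 0x1000, b"\xc0\x3f\x00\x48")
        print(first, second)
```

Keying by view alone would still keep stale entries alive after a tab is closed, so a real fix would probably also need to drop a view's entries when that view goes away.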
The autogenerated HLIL from this plugin generates lots of spurious code for conditional jumps, to the point where it is distracting. As I understand it, this is done for correctness, since jump targets must be resolved at the end of a packet.
Example code (test_dualjump_cond_jump() in bn_llil_test_app):
00000084 014101f3 { R1 = add(R1,R1)
00000088 06d0005c if (P0) jump:t data_90 }
0000008c c03f0048 { R0 = #0x0; jumpr LR }
00000090 c03f1048 { R0 = #0x1; jumpr LR }
Resulting HLIL:
00000084 char temp211 = 0
00000084 if (arg1)
00000084 temp211 = 1
00000090 if (temp211 == 1)
00000090 return 1
0000008c return 0
I'd expect output more like
00000090 if (arg1)
00000090 return 1
0000008c return 0
I guess this lifted IL is so situational that it doesn't make sense to ask the binja devs to optimize this particular construct... However it should be straightforward to fix it up manually in simple cases using the new Workflows API.
Binja-hexagon lifts this packet incorrectly:
00053d78 104100f5 { R17:R16 = combine(R0,R1)
00053d7c 301cf4eb memd(SP+#0xfffffff0) = R17:R16; allocframe(#0x18) }
I believe this is supposed to be interpreted like so
temp16 = combine(R0,R1)
memd(SP+#0xfffffff0) = R17:R16; // R17:R16 are callee-saved
allocframe(#0x18)
R17:R16 = temp16
However, binja-hexagon lifts this to:
// combine
0 @ 00053d78 temp16.q = (temp16.q & not.q(0xffffffff << 0)) | (R1 & 0xffffffff) << 0
1 @ 00053d78 temp16.q = (temp16.q & not.q(0xffffffff << 0x20)) | (R0 & 0xffffffff) << 0x20
// memd
2 @ 00053d78 temp16.q = R17:R16
3 @ 00053d78 temp100.d = SP - 0x10
4 @ 00053d78 [temp100.d {var_10}].q = temp16.q
// allocframe
5 @ 00053d78 temp100.d = SP - 8 {var_8}
6 @ 00053d78 [temp100.d {var_8}].q = LR:FP
7 @ 00053d78 FP = temp100.d {var_8}
8 @ 00053d78 SP = temp100.d - 0x18
// clobbered regs
9 @ 00053d78 R17:R16 = temp16.q
It looks like only line 2 is wrong: it overwrites the temp16 reg that was already written above by the combine call. From what I can tell, this comes from an incorrect lift in lift_S2_storerd_io.
void lift_S2_storerd_io(Architecture *arch, uint64_t pc, const Packet &pkt,
                        const Insn &insn, int insn_num, PacketContext &ctx) {
  LowLevelILFunction &il = ctx.IL();
  const int RsV = MapRegNum('R', insn.regno[0]);
  SourceReg tmp_RttV(8, MapRegNum('R', insn.regno[1]), il); // <-- SourceReg's constructor calls CopyToTemp()
  const int RttV = tmp_RttV.Reg();
  int siV = insn.immed[0];
  il.AddInstruction(il.SetRegister(
      4, EA_REG, il.Add(4, il.Register(4, RsV), il.Const(4, siV))));
  il.AddInstruction(il.Store(8, il.Register(4, EA_REG), il.Register(8, RttV)));
}
Besides correctness, this is one of the (maybe several) reasons that binja infers all typically callee-saved regs as arguments.
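The clobbering can be reproduced with a small Python model of the packet. It is a sketch under simplifying assumptions: allocframe is left out, registers and memory are plain dictionaries, and `lift_buggy` / `lift_expected` are hypothetical names that mirror the two IL sequences above rather than any function in the plugin.

```python
from typing import Dict, Tuple

Regs = Dict[str, int]
Mem = Dict[int, int]


def pair(hi: int, lo: int) -> int:
    """Pack two 32-bit values into one 64-bit register-pair value."""
    return ((hi & 0xFFFFFFFF) << 32) | (lo & 0xFFFFFFFF)


def commit_pair(regs: Regs, value: int) -> Regs:
    """Write a 64-bit value back to R17:R16 at the end of the packet."""
    out = dict(regs)
    out["R17"], out["R16"] = value >> 32, value & 0xFFFFFFFF
    return out


def lift_buggy(regs: Regs) -> Tuple[Regs, Mem]:
    temp16 = pair(regs["R0"], regs["R1"])    # combine writes the destination temp
    temp16 = pair(regs["R17"], regs["R16"])  # store's source copy reuses temp16
    mem = {regs["SP"] - 0x10: temp16}
    return commit_pair(regs, temp16), mem    # the combine result is lost


def lift_expected(regs: Regs) -> Tuple[Regs, Mem]:
    temp16 = pair(regs["R0"], regs["R1"])    # combine writes the destination temp
    mem = {regs["SP"] - 0x10: pair(regs["R17"], regs["R16"])}  # store reads old regs
    return commit_pair(regs, temp16), mem


if __name__ == "__main__":
    start = {"R0": 0xAAAA, "R1": 0xBBBB, "R16": 0x16, "R17": 0x17, "SP": 0x1000}
    print(lift_buggy(start))
    print(lift_expected(start))
```

In both variants the stored value is the pre-packet R17:R16, but only the second one leaves the combine result in R17:R16 afterwards, which matches the interpretation given earlier.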
The autogenerated LLIL for e.g. allocframe() and dealloc_return() manipulates LLIL_SPLIT_REG(LR, FP), and those refs don't seem to be elided in HLIL.
Example code (test_allocframe() in bn_llil_test_app):
00000020 01c09da0 { allocframe(SP,#0x8):raw } {var_8} {arg_0} {var_10}
00000024 00e00078 { R0 = #0x100 }
00000028 1ec01e96 { LR:FP = dealloc_return(FP):raw } {var_8}
LLIL:
// allocframe
0 @ 00000020 temp29.d = SP {arg_0}
1 @ 00000020 temp100.d = temp29.d - 8
2 @ 00000020 [temp100.d {var_8}].q = LR:FP
3 @ 00000020 FP = temp100.d
4 @ 00000020 temp29.d = temp100.d - 8 {var_10}
5 @ 00000020 SP = temp29.d
// r0 = 0x100
6 @ 00000024 temp0.d = 0x100
7 @ 00000024 R0 = temp0.d
// deallocframe
8 @ 00000028 temp100.d = FP {var_8}
9 @ 00000028 temp101.q = [temp100.d {var_8}].q
10 @ 00000028 temp30.q = temp101.q
11 @ 00000028 SP = temp100.d + 8
12 @ 00000028 LR:FP = temp30.q
13 @ 00000028 <return> jump(LR)
Resulting HLIL:
00000028 int32_t FP
00000028 int32_t FP_1
00000028 int32_t LR
00000028 int32_t LR_1
00000028 LR_1:FP_1 = LR:FP
00000028 return 0x100
I'd expect output more like
00000028 return 0x100
Looking at a different x86 binary, it seems that RBP and all callee-saved registers are eliminated somewhere between LLIL and MLIL.
During the build process, it fails with various undefined references, such as
undefined reference to BNCanArchitectureAssemble
when linking libbinaryninjaapi.a.
libarch_hexagon.so still builds, but it crashes Binja when loading a Hexagon ELF.
My version of Binja is a little old: 2.0.2166.
Is this a Binja version issue, as I suspect?