redballoonsecurity / ofrak

1.8K 26.0 127.0 14.1 MB

OFRAK: unpack, modify, and repack binaries.

Home Page: https://ofrak.com

License: Other

Makefile 0.30% Python 78.79% Shell 0.24% Java 1.92% C 0.28% Dockerfile 0.02% CSS 0.19% HTML 0.02% JavaScript 4.78% Svelte 7.71% Assembly 0.06% AngelScript 0.12% ActionScript 0.04% Jupyter Notebook 5.52%
ctf firmware firmware-tools repacking reverse-engineering unpacker hacktoberfest

ofrak's People

Contributors

alexreuter, andresito00, anogin, cclauss, dannyp303, dependabot[bot], edward-larson, edwardlarson, galbwe, gskip17, ivanov1ch, jaredgonzales, kibo-redballoon, kiboneu, marczalik, paulnoalhyt, rbs-afflitto, rbs-alexr, rbs-jacob, rls-rylan, saml98, sjossi, thejoelpatrol, whyitfor


ofrak's Issues

Run OFRAK on MacBook with M1 chip

What is the use case for the feature?
I would like to be able to run OFRAK on my MacBook with the M1 chip.

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
Ensure that Docker image can be built and run on this platform. Additionally, ensure that the Python install of OFRAK on the target platform works.

ofrak CLI tool does not work when binwalk is not installed

What is the problem? (Here is where you provide a complete Traceback.)
Running python3 -m ofrak deps or python3 -m ofrak list fails when the binwalk package is not installed:

% python3 -m ofrak list
Traceback (most recent call last):
 <REDACTED>/ofrak/ofrak_components/ofrak_components/binwalk.py", line 7, in <module>
    import binwalk
ModuleNotFoundError: No module named 'binwalk'

Please provide some information about your environment.
Running ofrak on 385ad3e.

If you've discovered it, what is the root cause of the problem?
There are two related problems:

  1. Binwalk is imported at the top of the file. If the package is not installed, this results in a ModuleNotFoundError.
  2. Binwalk requires a custom ComponentExternalTool.is_tool_installed implementation, since ofrak relies on the Python package being installed, not the CLI tool (which may be installed on a system but use a different Python version or environment).

How often does the issue happen?
Every time the ofrak CLI tool is run.

What are the steps to reproduce the issue?
Ideally, give us a short script that reproduces the issue.
python3 -m ofrak deps.

How would you implement this fix?

  1. Add a custom ComponentExternalTool.is_tool_installed implementation for binwalk to check if the python package is installed.
  2. Handle the ModuleNotFoundError so that OFRAK can run even when this tool is not installed.
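A sketch of both fixes (the function name here is illustrative and the real ComponentExternalTool interface may differ): guard the module-level import, and check for the Python package directly rather than shelling out to the CLI tool.

```python
import importlib.util

# 2. Guard the top-level import so the rest of OFRAK loads without binwalk.
try:
    import binwalk
except ModuleNotFoundError:
    binwalk = None

def is_binwalk_installed() -> bool:
    # 1. Check for the Python package rather than the CLI tool, since the
    # CLI may belong to a different Python interpreter or environment.
    return importlib.util.find_spec("binwalk") is not None
```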

Are there any (reasonable) alternative approaches?
Additionally, I would like to add a check that excludes BinwalkAnalyzer from discovery at runtime if binwalk is not installed. This is probably beyond the scope of this issue.

Are you interested in implementing it yourself?
Yes.

Improve asm patch testing

What is the use case for the feature?
Currently, our tests in toolchain_asm.py verify only that a FEM is created; they do not verify that the patch would work as intended. This is most evident from the fact that one can supply empty .as files to the test and it will still pass. The current version of the test likely only catches syntactically incorrect assembly code, if any code is present at all.

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
Use the angr backend to emulate applying the patch, ensuring that the intended functionality changes are present in the final executable. The test itself also needs to be fixed: currently the manual_map maps segments to .as files out of order, such that the resulting BOM is incorrect.

Are there any (reasonable) alternative approaches?
N/A

Are you interested in implementing it yourself?
Yes.

Linked to #236

Use pigz instead of gzip for Gzip components

pigz is a parallel version of gzip also written by Mark Adler (the co-creator of gzip/zlib). It acts as a sort of drop-in replacement for the gzip command line utility, but it's parallelized and much faster.

We should replace uses of gzip in OFRAK with pigz.
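A minimal sketch of the substitution (the function name is illustrative, not OFRAK's actual component code): prefer pigz when present and fall back to gzip, since pigz is command-line compatible for this invocation.

```python
import shutil
import subprocess

def gzip_compress(data: bytes) -> bytes:
    # Prefer pigz (parallel gzip) when available; fall back to gzip.
    # Both tools read stdin and write compressed output to stdout with -c.
    tool = "pigz" if shutil.which("pigz") else "gzip"
    result = subprocess.run(
        [tool, "-c"], input=data, stdout=subprocess.PIPE, check=True
    )
    return result.stdout
```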

Which files would be affected?
https://github.com/redballoonsecurity/ofrak/blob/master/ofrak_components/ofrak_components/gzip.py

Does the proposed maintenance include non-doc string functional changes to the Python code?
Yes -- changing the command used and the Dockerstub file.

Are you interested in implementing it yourself?
No. This is a great first contributor issue!

Minor: `make test` fails in `min` container

Which files would be affected?
ofrak_core/test_ofrak/unit/test_ofrak_server.py

Does the proposed maintenance include non-doc string functional changes to the Python code?
Only for tests.

Are you interested in implementing it yourself?
No ;)

ofrak-minimal.yml does not include frontend, and as a result ofrak_core/test_ofrak/unit/test_ofrak_server.py tests fail. See also #212. Not sure what's best - include frontend, or skip these tests when it's not there.

ERROR test_ofrak/unit/test_ofrak_server.py::test_get_index - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_get_resource - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_create_root_resource - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_get_data - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_error - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_unpack_recursively - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_get_root - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_unpack - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_get_ancestors - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_pack - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_pack_recursively - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_analyze - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_get_children - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_get_parent - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_data_summary - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_identify - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_delete_comment - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_queue_patch - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_create_mapped_child - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_add_comment - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_search_for_vaddr - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_find_and_replace - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_get_root_resources - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_server_main - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'
ERROR test_ofrak/unit/test_ofrak_server.py::test_get_data_range - ValueError: No directory exists at '/ofrak_core/ofrak/gui/public'

ofrak CLI tool raises an Attribute error when run with no arguments

What is the problem? (Here is where you provide a complete Traceback.)
As of 385ad3e, when running the ofrak CLI tool with no arguments, an AttributeError is raised:

% python3 -m ofrak
Traceback (most recent call last):
...
  <REDACTED>/ofrak/ofrak_core/ofrak/__main__.py", line 7, in <module>
    ofrak_cli.parse_and_run(sys.argv[1:])
  <REDACTED>/ofrak/ofrak_core/ofrak/ofrak_cli.py", line 294, in parse_and_run
    parsed.func(parsed)
AttributeError: 'Namespace' object has no attribute 'func'

If you've discovered it, what is the root cause of the problem?
The parser does not handle the situation where no subcommand is supplied.

What are the steps to reproduce the issue?
python3 -m ofrak.

How would you implement this fix?
The CLI should probably print a help message if no subcommand is provided.
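A sketch of the fix, assuming an argparse-based CLI like the one in the traceback (the subcommand shown is illustrative):

```python
import argparse

def parse_and_run(argv):
    parser = argparse.ArgumentParser(prog="ofrak")
    subparsers = parser.add_subparsers(dest="command")
    # One illustrative subcommand standing in for deps/list/etc.
    list_parser = subparsers.add_parser("list")
    list_parser.set_defaults(func=lambda parsed: "listing")

    parsed = parser.parse_args(argv)
    if not hasattr(parsed, "func"):
        # No subcommand given: parsed has no `func` attribute, so print
        # usage instead of raising AttributeError.
        parser.print_help()
        return None
    return parsed.func(parsed)
```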

Are there any (reasonable) alternative approaches?
Possibly.

Are you interested in implementing it yourself?
Possibly.

Binary Ninja license file may contain space between key and separator

What is the problem? (Here is where you provide a complete Traceback.)
My license.dat file for Binary Ninja contains a space between the "serial" key and the : separator. This causes the regex described in the INSTALL.md file to return the full line instead of just the serial number.

foo@bar:~$ cat license.dat | grep serial
		"serial" : "[MY SERIAL NUMBER]",
foo@bar:~$ echo $(grep "serial" license.dat | sed 's/"serial": "//g' | sed 's/",//g')
"serial" : "[MY SERIAL NUMBER]

This in turn causes the python3 build_image.py --config ofrak-binary-ninja.yml --base --finish command to fail while downloading Binary Ninja:

#36 3.910 usage: download_headless.py [-h] [--serial SERIAL] [--dev] [--output OUTPUT]
#36 3.910                             [-q] [-i] [-d DIR] [-c]
#36 3.910 download_headless.py: error: unrecognized arguments: : "[MY SERIAL NUMBER]
#36 3.925 unzip:  cannot find or open BinaryNinja-headless.zip, BinaryNinja-headless.zip.zip or BinaryNinja-headless.zip.ZIP.
#36 3.927 rm: cannot remove 'BinaryNinja-headless.zip': No such file or directory
#36 3.940 python: can't open file 'binaryninja/scripts/install_api.py': [Errno 2] No such file or directory
#36 3.942 /tmp/install_binary_ninja_headless_linux.sh: line 17: ./binaryninja/scripts/linux-setup.sh: No such file or directory
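A more tolerant extraction (a sketch; the exact layout of license.dat may vary) allows optional whitespace around the separator and keeps only the quoted value:

```shell
# Extract the serial regardless of whitespace around the ':' separator.
grep '"serial"' license.dat \
    | sed -E 's/.*"serial"[[:space:]]*:[[:space:]]*"([^"]*)".*/\1/'
```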

Add support for El Torito ISO images

Many ISOs from the Internet are El Torito images, which are currently not supported by the OFRAK ISO unpacker.

Often the El Torito images will unpack successfully, but there are no tests for them, and it is not clear what more would need to be done to support them.

What is the use case for the feature?
Support unpacking and modifying El Torito images.

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
Probably by extending current ISO support.

Are there any (reasonable) alternative approaches?

Are you interested in implementing it yourself?
No. This is a great first contributor issue!

Update ofrak-binary-ninja package on PyPI

What is the use case for the feature?
I'm using the OFRAK pip packages for all my work, but ofrak-binary-ninja has not been updated from 0.0.1 in August 2022.

Does the feature contain any proprietary information about another company's intellectual property?
No

How would you implement this feature?
Update PyPI package

Are there any (reasonable) alternative approaches?
I'm using the git checkout with make install for now.

Are you interested in implementing it yourself?
Yes

when an Elf is unpacked, update the LinkableBinary symbols

The Elf class will contain the symbols of the ELF, if they exist, after unpacking recursively.

Elf is also a subclass of Program, which is a subclass of LinkableBinary.

So an Elf resource view will have a get_symbols method, coming from LinkableBinary. But it currently returns an empty list even after recursively unpacking an ELF containing symbols.

Suggestion: when unpacking an ElfSymbolSection, convert the resulting ElfSymbol objects into LinkableSymbols and update the list of LinkableSymbols of the Elf as a LinkableBinary.

What is the use case for the feature?
See above.

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
See above.

Are there any (reasonable) alternative approaches?
N/A.

Are you interested in implementing it yourself?
No -- this is a good first contributor issue!

OpenWRT TRX unpacker fails to unpack when rootfs_offset is null

What is the problem? (Here is where you provide a complete Traceback.)
Many TRX images seem to zero out the rootfs_offset field in the TRX header, and store data in a UBIFS filesystem at kernel_offset within the image. When this happens, the OpenWrtTrxKernel is empty, and the OpenWrtTrxRootfs object contains all data from offset 0 to the end of the file.

Please provide some information about your environment.
redballoonsecurity/ofrak/binaryninja

If you've discovered it, what is the root cause of the problem?

How often does the issue happen?
This seems to affect all targets except bcm47xx/mips74k across all versions of OpenWrt.

What are the steps to reproduce the issue?
Running the following function on a [bcm53xx target](https://downloads.openwrt.org/releases/19.07.0/targets/bcm53xx/generic/openwrt-19.07.0-bcm53xx-buffalo-wxr-1900dhp-squashfs.trx) and a [bcm47xx target](https://downloads.openwrt.org/releases/21.02.0-rc1/targets/bcm47xx/mips74k/openwrt-21.02.0-rc1-bcm47xx-mips74k-asus_rt-n10u-b-squashfs.trx) will produce the output below

async def unpack_image(resource):
    resource.add_tag(OpenWrtTrx)
    await resource.save()
    await resource.unpack()

    trx_view = await resource.view_as(OpenWrtTrx)
    trx_header = await trx_view.get_header()
    print(f'loader_offset: {hex(trx_header.trx_loader_offset)}')
    print(f'kernel_offset: {hex(trx_header.trx_kernel_offset)}')
    print(f'rootfs_offset: {hex(trx_header.trx_rootfs_offset)}')

    kernel = await resource.get_only_descendant_as_view(
        OpenWrtTrxKernel, r_filter=ResourceFilter(tags=(OpenWrtTrxKernel,))
    )
    print(f'kernel data length: {await kernel.resource.get_data_length()}')

    print(await resource.summarize_tree())
openwrt-19.07.0-bcm53xx-buffalo-wxr-1900dhp-squashfs.trx
loader_offset: 0x1c
kernel_offset: 0x400000
rootfs_offset: 0x0
kernel data length: 0
┌f3ae23f495724d2da8dde702461de851: [caption=(File, OpenWrtTrx), attributes=(Magic), global_offset=(0x0-0x640000), parent_offset=(0x0-0x0), data_hash=46cdc765]
├────d712a6fd9582414b887b5561b657de70: [caption=(OpenWrtTrxHeader), attributes=(OpenWrtTrxHeaderAutoAttributes), global_offset=(0x0-0x1c), parent_offset=(0x0-0x1c), data_hex=48445230000064003c0f556b000001001c0000000000400000000000]
├────6a81c1d4de354793a6647fa639a3fb86: [caption=(OpenWrtTrxLzmaLoader), attributes=(), global_offset=(0x0-0x3fffe4), parent_offset=(0x0-0x0), data_hash=0d248e14]
├────fe9b61633a8e4deeb7ab2655b3a6fbbd: [caption=(OpenWrtTrxKernel), attributes=(), global_offset=(0x0-0x0), parent_offset=(0x0-0x0), data_ascii=""]
└────539a7426ad7846fd8f58ac3ff323bc47: [caption=(OpenWrtTrxRootfs), attributes=(), global_offset=(0x0-0x640000), parent_offset=(0x0-0x0), data_hash=46cdc765]

openwrt-21.02.0-rc1-bcm47xx-mips74k-asus_rt-n10u-squashfs.trx
loader_offset: 0x1c
kernel_offset: 0x948
rootfs_offset: 0x1d1000
kernel data length: 1902264
┌dadda0afd664423aa8adff0c92ebb3db: [caption=(File, OpenWrtTrx), attributes=(Magic), global_offset=(0x0-0x531000), parent_offset=(0x0-0x0), data_hash=812040c8]
├────2c37c975cd13402dadc6a0d1b083c775: [caption=(OpenWrtTrxHeader), attributes=(OpenWrtTrxHeaderAutoAttributes), global_offset=(0x0-0x1c), parent_offset=(0x0-0x1c), data_hex=4844523000105300e1851726000001001c0000004809000000101d00]
├────99ac34f3e7ab4db9b7aaff3f02de0878: [caption=(OpenWrtTrxLzmaLoader), attributes=(), global_offset=(0x0-0x92c), parent_offset=(0x0-0x0), data_hash=3bac3c22]
├────8edc7a6eb7864c668a217b7111fa0327: [caption=(OpenWrtTrxKernel), attributes=(), global_offset=(0x0-0x1d06b8), parent_offset=(0x0-0x0), data_hash=b4c54f04]
└────35106e38c1ee40779b06f147807c39b9: [caption=(OpenWrtTrxRootfs), attributes=(), global_offset=(0x0-0x360000), parent_offset=(0x0-0x0), data_hash=8b281388]

It took 1.363 seconds to run the OFRAK script

How would you implement this fix?

Are there any (reasonable) alternative approaches?

Are you interested in implementing it yourself?

GUI string replace modifier error.

What is the problem? (Here is where you provide a complete Traceback.)
User reported the following error when running the GUI's string replace modifier:

Error: Can't deserialize ofrak_components.string.StringFindReplaceConfig: module not already loaded

Please provide some information about your environment.
N/A.

If you've discovered it, what is the root cause of the problem?
Error is likely this line: https://github.com/redballoonsecurity/ofrak/blob/master/frontend/src/ofrak/remote_resource.js#L144

How often does the issue happen?

What are the steps to reproduce the issue?
Build the GUI, try to use the modifier

How would you implement this fix?
It should probably use ofrak.strings.StringFindReplaceConfig.

Are there any (reasonable) alternative approaches?

Are you interested in implementing it yourself?
Yes.

Nonfree license

I saw your tool mentioned in Wired. The article says OFRAK is open source, but the license says it is only for non-commercial usage.

Non-commercial (NC) clauses are nonfree so OFRAK is not free software according to the FSF or open source according to the OSI. I would recommend removing the non-commercial clause from your license.

OFRAK 2.2.1 incorrectly displays GUI version as 2.2.0

What is the problem? (Here is where you provide a complete Traceback.)
ofrak 2.2.1 from PyPI has the wrong version string in the GUI.

Please provide some information about your environment.
To reproduce:

% pip install ofrak
% ofrak gui

The version listed in bottom right of GUI is 2.2.0. It should be 2.2.1.

If you've discovered it, what is the root cause of the problem?
Possible that GUI wasn't rebuilt after #246.

What are the steps to reproduce the issue?
See above.

How would you implement this fix?
Probably should be addressed in a 2.2.2 release at some point.
Release process should also enforce rebuilding the GUI.

Elf.get_symbol_section fails when more than one symbol section exists

What is the problem? (Here is where you provide a complete Traceback.)
Elf.get_symbol_section fails when the target elf has more than one symbol section.

Please provide some information about your environment.
The following test (using the elf_executable_file fixture) reproduces the issue

import pytest

from ofrak import OFRAKContext, Resource
from ofrak.core import Elf, ElfSymbolSection, ElfSymbolBinding, ElfSymbolType, \
    ElfSymbolVisibility
from ofrak_type.range import Range


@pytest.fixture
async def elf_resource(elf_executable_file: str, ofrak_context: OFRAKContext) -> Resource:
    return await ofrak_context.create_root_resource_from_file(elf_executable_file)


async def test_elf_view(elf_resource: Resource):
    await elf_resource.unpack()
    elf = await elf_resource.view_as(Elf)
    for elf_section_header in await elf.get_section_headers():
        file_range = elf_section_header.get_file_range()
        assert isinstance(file_range, Range)

    symbol_section = await elf.get_symbol_section()
    assert isinstance(symbol_section, ElfSymbolSection)
    for symbol in await symbol_section.get_symbols():
        assert isinstance(symbol.get_binding(), ElfSymbolBinding)
        assert isinstance(symbol.get_type(), ElfSymbolType)
        assert isinstance(symbol.get_visibility(), ElfSymbolVisibility)
        symbol_section_index = symbol.get_section_index()
        if symbol_section_index is not None:
            assert isinstance(symbol_section_index, int)

If you've discovered it, what is the root cause of the problem?
The resulting binary has an ElfSymbolSection as well as an ElfDynaSymbolSection.

How often does the issue happen?
N/A.

What are the steps to reproduce the issue?
Run the above test in the OFRAK container.

How would you implement this fix?
TBD.

Are there any (reasonable) alternative approaches?

Are you interested in implementing it yourself?

improve the PatchMaker error message when the patch is too big

Currently, supplying a patch that is too big for the allotted space will fail at the linking step, with an error message like:

subprocess.CalledProcessError: Command '['/opt/rbs/toolchain/binutils-2.34/ld/ld-new', '--no-dynamic-linker',
'--error-unresolved-symbols', '--warn-section-align', '--nmagic', '--no-eh-frame-hdr', '--no-check-sections',
'-T/tmp/tmp7aw21v82/0000017e_00000002_66t_t6t1.ld', '-Map', '/tmp/tmp7aw21v82/output_exec.map',
'-o/tmp/tmp7aw21v82/output_exec', '/tmp/tmp7aw21v82/0000017e_00000002_bom_files/patch_2_too_big.c.o',
'/tmp/tmp7aw21v82/stubs_bom_files/stub_main.as.o']' returned non-zero exit status 1.

While we know this could mean the patch is too big, the same error could also result from any other linker failure.

It would be better to raise an explicit error when the patch is too big.
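One possible shape for such an error (class and function names are hypothetical, not PatchMaker's actual API): a pre-link size check that raises a descriptive exception instead of surfacing ld's opaque exit status.

```python
class PatchTooBigError(Exception):
    """Raised when a compiled patch exceeds its allocated memory region."""

def check_patch_fits(segment_name: str, segment_size: int, region_size: int) -> None:
    # Compare the compiled segment size against the space allotted in the
    # linker script before invoking ld, so the failure mode is explicit.
    if segment_size > region_size:
        raise PatchTooBigError(
            f"Patch segment {segment_name!r} needs {segment_size:#x} bytes "
            f"but only {region_size:#x} are available"
        )
```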

Which files would be affected?
ofrak_patch_maker package will be affected.

Does the proposed maintenance include non-doc string functional changes to the Python code?
Yes.

Are you interested in implementing it yourself?
No. This is a good first contributor issue!

Ghidra Server JSON Parsing does not escape control characters.

The JSON implementation in the ofrak_ghidra components does not escape control characters in its JSON output.

In my case, attempting to send string data recovered from a binary with newline (\n) characters produces:
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 31 (char 30)

Ghidra has its own JSON parser with almost no documentation:
https://ghidra.re/ghidra_docs/api/generic/json/JSONParser.html

And it seems only some people use it to parse JSON, not to construct it:
NationalSecurityAgency/ghidra#1982
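For reference, Python's json module demonstrates the behavior the Ghidra-side serializer would need to mimic: raw control characters are rejected by a strict parser, while proper escaping round-trips cleanly.

```python
import json

# A string recovered from a binary, containing a raw newline.
recovered = "first line\nsecond line"

# Embedding the raw control character (what an unescaping serializer does)
# produces invalid JSON that a strict parser rejects:
naive = '{"value": "' + recovered + '"}'
try:
    json.loads(naive)
except json.JSONDecodeError:
    pass  # "Invalid control character ..." -- the error reported above

# Escaping during serialization turns the newline into \n and round-trips:
escaped = json.dumps({"value": recovered})
assert json.loads(escaped)["value"] == recovered
```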

Improve syntax highlighting of Python scripts in the GUI

What is the use case for the feature?
As part of #265 we will show syntax highlighting of the generated scripts in the GUI. It would be good to further split up the highlighting so that there are fewer types mapped to the same color (i.e., use more colors), and to create a new color palette (potentially with the use of Huemint).

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
Modify code.css to apply different colors to different types generated by highlight.js.

Are there any (reasonable) alternative approaches?
No.

Are you interested in implementing it yourself?
I can modify the CSS to apply the colors appropriately. Perhaps Neil can assist with building a good color palette.

`make inspect` fails on 6 tutorial `.ipynb` files.

Which files would be affected?
ofrak_tutorial/notebooks_with_outputs/1_simple_string_modification.ipynb
ofrak_tutorial/notebooks_with_outputs/2_ofrak_internals.ipynb
ofrak_tutorial/notebooks_with_outputs/3_binary_format_modification.ipynb
ofrak_tutorial/notebooks_with_outputs/4_simple_code_modification.ipynb
ofrak_tutorial/notebooks_with_outputs/5_filesystem_modification.ipynb
ofrak_tutorial/notebooks_with_outputs/6_code_insertion_with_extension.ipynb

Does the proposed maintenance include non-doc string functional changes to the Python code?
Only in tutorial.

Are you interested in implementing it yourself?
Sure, trivial to change, but not sure:

  • Whether we want this changed.
  • Whether we want to do anything to reconcile the difference between what make inspect checks (when its jupyter support is installed?) vs what pre-commit and CI check.

LLVM Toolchain handling of assembler_target needs to know which GNU assembler will be used

The abstract method _get_assembler_target of Toolchain is supposed to figure out the "target" that will be passed to the assembler. How this "target" is actually used by/passed to the assembler is up to the Toolchain implementation to figure out.

The LLVM toolchain implementation assumes that it will be passing the returned target (stored as self._assembler_target) to the assembler via the option -march (llvm.py:40):

self._assembler_flags.append(f"-march={self._assembler_target}")

However, there are a few problems here:

  1. The LLVM toolchain assumes that the assembler which will be used has an -march option, which seems reasonable but the PPC toolchain we recently added doesn't actually have this, so I guess it's not universal.
  2. The way the Toolchain by default chooses the assembler is by looking in toolchain.conf, so we don't know for sure which assembler will be used. The LLVM toolchain does not override this default behavior, shown here (abstract.py:185):
    def _assembler_path(self) -> str:
        """
        Provides path to installed assembler given the ISA.

        :raises NotImplementedError: if an assembler for that ISA does not exist
        :returns: filepath to the assembler program
        """
        if self._processor.isa == InstructionSet.M68K:
            assembler_path = "M68K_ASM_PATH"
        elif (
            self._processor.isa == InstructionSet.X86
            and self._processor.bit_width == BitWidth.BIT_64
        ):
            assembler_path = "X86_64_ASM_PATH"
        else:
            assembler_path = f"{self._processor.isa.value.upper()}_ASM_PATH"
        return get_repository_config("ASM", assembler_path)
  3. The LLVM toolchain does not appear to really have its own assembler, so it's not a case where it should clearly just set its own assembler correctly.

Which files would be affected?
ofrak_patch_maker/toolchain/llvm_12.py

Does the proposed maintenance include non-doc string functional changes to the Python code?
Yes.

A simple fix is for the LLVM toolchain to store the entire assembler arch argument in _assembler_target, instead of storing only part of the argument and constructing the full flag later. The disadvantage is that a user who wants to pass in a specific assembler target has to pass the whole toolchain option (e.g. "-march=armv7" instead of "armv7"). Passing arbitrary command-line options through a config is... atypical in PatchMaker.

A more complex fix is to figure out whether we can give the LLVM toolchain an assembler it knows it will use, so that the arguments can be constructed consistently.

Are you interested in implementing it yourself?
Yes but it's not high-priority right now.

Un-covered resource service functions

PR #122 added two functions to the resource service (delete_resources and update_many), but it does not seem to have added any tests covering them. Tests should be added to cover these functions.

Which files would be affected?

  • ofrak_core/ofrak/service/resource_service.py
  • ofrak_core/test_ofrak/service/resource_service/*.py

Does the proposed maintenance include non-doc string functional changes to the Python code?

Only adding tests.

Are you interested in implementing it yourself?

Perhaps

Cloning repository on Windows adds carriage return characters which can break Docker builds

What is the problem? (Here is where you provide a complete Traceback.)
For example, building the ofrak tutorial image will fail with a "file not found" error that traces back to the interpreter on the first line of generate_stripped_notebook.sh not being found, because the system looks for "/bin/bash^M" (^M is the carriage return character that Windows expects and that git automatically added).

Please provide some information about your environment.
This is seen on Windows with the Windows Git GUI.

If you've discovered it, what is the root cause of the problem?
Git (and/or perhaps Visual Studio) adding Windows line endings to files which are only meant to run in Linux Docker containers.

How often does the issue happen?

What are the steps to reproduce the issue?
Git clone the repo on windows, run the command to build the tutorial image.

How would you implement this fix?
According to this, we might be able to set up .gitattributes so that git will not change the line endings when cloning on Windows. We could target specific files, or maybe we want it on all files. TBD
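A possible .gitattributes sketch (the file patterns are guesses; the actual set of affected files is TBD as noted above):

```
# Force LF endings for files that run inside Linux containers,
# regardless of the cloning platform's core.autocrlf setting.
*.sh       text eol=lf
Dockerfile text eol=lf
Makefile   text eol=lf
```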

Are there any (reasonable) alternative approaches?
Not sure, TBD

Are you interested in implementing it yourself?
Certainly not today

Use asyncio interface for subprocess calls

Many components use subprocess to make calls to external programs via the command line. These are currently blocking, non-asynchronous calls.

The asyncio.create_subprocess_exec and asyncio.create_subprocess_shell functions represent an asyncio-compatible API for making subprocess calls that don't block the event loop. We should replace existing subprocess calls in OFRAK with async calls via this API.
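A sketch of the replacement pattern (the helper name is illustrative, not OFRAK's actual code):

```python
import asyncio

async def run_tool(program: str, *args: str) -> bytes:
    # Unlike blocking subprocess.run, this does not stall the event loop:
    # other coroutines can make progress while the tool executes.
    proc = await asyncio.create_subprocess_exec(
        program,
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(f"{program} failed: {stderr.decode()}")
    return stdout
```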

Which files would be affected?
Any file that uses a subprocess call. (ofrak_patch_maker should not be updated since it is a synchronous package).

Does the proposed maintenance include non-doc string functional changes to the Python code?
Yes (see above)

Are you interested in implementing it yourself?
No -- this is a great first contributor issue!

LLVM_MACH_O_Parser

What is the use case for the feature?
The LLVM_MACH_O_Parser would give the ability to parse Mach-O files.

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
Continue working off of WIP code that was removed in #155.

Are you interested in implementing it yourself?
No. This is a good first contributor issue!

beartype not included with "pip install ofrak"

What is the problem? (Here is where you provide a complete Traceback.)
OFRAK does not list beartype in its install_requires list: https://github.com/redballoonsecurity/ofrak/blob/master/ofrak_core/setup.py.

As a result, pip install ofrak does not work if beartype is not installed.

What are the steps to reproduce the issue?

% python3 -m venv venv
% source venv/bin/activate
(venv) % python3 -m pip install ofrak
(venv) % python3 ex1_simple_string_modification.py
...
    from beartype import beartype
ModuleNotFoundError: No module named 'beartype'
(venv) %

If you've discovered it, what is the root cause of the problem?
beartype is not installed with ofrak.

Users can work around this by installing beartype.

How often does the issue happen?
Every time.

How would you implement this fix?
Include beartype in ofrak's install_requires: https://github.com/redballoonsecurity/ofrak/blob/master/ofrak_core/setup.py.
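The fix would be a one-line addition to the setup() call; a sketch of the relevant fragment (other entries elided):

```python
setup(
    name="ofrak",
    # ...
    install_requires=[
        # ... existing dependencies ...
        "beartype",
    ],
)
```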

make class Elf support multiple symbol sections (namely .dynsym and .symtab)

What is the use case for the feature?
The class Elf has a get_symbol_section method which returns the Elf's only child of type ElfSymbolSection.
But an ELF can have two symbol sections: .dynsym and .symtab.

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
Possibilities:

  1. create another get_symbol_sections method (and maybe improve the error message of get_symbol_section when there is more than one such section)
  2. modify the current method to return .symtab by default, since .dynsym, used only for dynamic linking, is generally (or always?) included in .symtab (source) (only if there is a .symtab though I guess)
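Option 1 could look roughly like the following sketch, using stand-in types rather than OFRAK's actual Elf/ElfSymbolSection classes:

```python
from typing import List

class Section:
    # Stand-in for an ELF section; only the name matters for this sketch
    def __init__(self, name: str):
        self.name = name

SYMBOL_SECTION_NAMES = {".symtab", ".dynsym"}

def get_symbol_sections(sections: List[Section]) -> List[Section]:
    # New method: return every symbol section instead of assuming there is one
    return [s for s in sections if s.name in SYMBOL_SECTION_NAMES]

def get_symbol_section(sections: List[Section]) -> Section:
    # Existing method, with the improved error message suggested above
    matches = get_symbol_sections(sections)
    if len(matches) != 1:
        raise ValueError(
            f"Expected exactly one symbol section, found {len(matches)}: "
            f"{[s.name for s in matches]}"
        )
    return matches[0]
```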

Are there any (reasonable) alternative approaches?
Possibly.

Are you interested in implementing it yourself?
No. This is a good first contributor issue!

Address PEP 585 & BeartypeDecorHintPep585DeprecationWarning

What is the issue?
When beartype is installed, OFRAK emits BeartypeDecorHintPep585DeprecationWarning warnings on python >= 3.9. See https://github.com/beartype/beartype#pep-585-deprecations for more information.

As the above link explains, the typing module will deprecate some type hints in 2025 or 2026.

Which files would be affected?
All files with the deprecated typehints.

What is the proposed fix?
https://github.com/beartype/beartype#pep-585-deprecations lists a couple options for addressing this.
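One of those options is migrating to the builtin generics that PEP 585 introduces; a minimal before/after (function names are illustrative):

```python
# On Python < 3.9 the builtin-generic annotations below would not evaluate,
# so defer annotation evaluation (PEP 563) to keep the example portable.
from __future__ import annotations

from typing import List, Dict

def name_lengths_old(names: List[str]) -> Dict[str, int]:
    # Deprecated typing generics; beartype warns about these on Python >= 3.9
    return {name: len(name) for name in names}

def name_lengths_new(names: list[str]) -> dict[str, int]:
    # PEP 585 builtin generics: identical meaning, no deprecation warning
    return {name: len(name) for name in names}
```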

For now, users can filter these warnings with:

# Do it globally for everyone, whether they want you to or not!
# This is the "Make Users Suffer" option.
from beartype.roar import BeartypeDecorHintPep585DeprecationWarning
from warnings import filterwarnings
filterwarnings("ignore", category=BeartypeDecorHintPep585DeprecationWarning)

The next step on this is picking an approach to addressing this that makes sense for OFRAK.

Support cross-platform compatibility by making xattr conditionally dependent

What is the use case for the feature?
OFRAK, and in particular FilesystemEntry, currently supports working with xattr attributes, which exist for Linux and macOS files. Rather than xattr, Windows uses Alternate Data Streams. As such, OFRAK has a platform-specific dependency for xattr that prevents Windows users from using OFRAK. Since we currently only manipulate xattr attributes for completeness (copying them to files OFRAK has modified), we can remove this dependency for now in order to enable OFRAK on Windows.

Rather than removing support for xattr, we will make it a dependency conditional upon the platform running OFRAK. On platforms which do not support xattr (e.g., Windows), we will use a stub xattr library that will warn users that extended attributes are not supported when using OFRAK on Windows.

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
Remove references to xattr from the FilesystemEntry class.

Make xattr dependency conditional on platform_system. Any file importing xattr will fall back to the stub library if xattr is not present.
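A sketch of what the stub fallback could look like (the class and method names here are illustrative, not a real library's API):

```python
import warnings

class XattrStub:
    """Fallback used on platforms without extended attributes (e.g. Windows).
    Warns and does nothing instead of raising, per the 'stub library' idea."""

    def _warn(self) -> None:
        warnings.warn("Extended attributes are not supported on this platform")

    def listxattr(self, path):
        self._warn()
        return []

    def getxattr(self, path, name):
        self._warn()
        return b""

    def setxattr(self, path, name, value):
        self._warn()

try:
    import xattr  # type: ignore  # real implementation on Linux/macOS
except ImportError:
    xattr = XattrStub()  # other platforms fall back to the warning stub
```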

Are there any (reasonable) alternative approaches?
It makes sense to remove xattr for now in order to get Windows users up and running on OFRAK. We should have design discussions around whether/how we will support xattr/ADS in the future.

Are you interested in implementing it yourself?
Yes.

UF2 unpacker/packer

What is the use case for the feature?
To unpack/pack UF2 file format (https://github.com/microsoft/uf2/).

Does the feature contain any proprietary information about another company's intellectual property?
No: https://github.com/microsoft/uf2.

How would you implement this feature?
Probably by wrapping https://github.com/microsoft/uf2/blob/master/utils/uf2conv.py .

Are there any (reasonable) alternative approaches?
Other uf2 tools probably exist.

Are you interested in implementing it yourself?
Maybe :).

BuildKit requirement is not clear – `cmake: command not found`

What is the problem? (Here is where you provide a complete Traceback.)
25e7e9a introduced a change that breaks CMake installation where BuildKit is not enabled.

$ python3 build_image.py --config ofrak-ghidra.yml --base --finish
...
[SNIP]
...
Step 23/45 : RUN cd /tmp &&     git clone https://github.com/rbs-forks/keystone.git &&     cd keystone &&     git checkout 2021.09.01 &&     ./install_keystone.sh &&     cd /tmp/keystone/bindings/python && python setup.py install &&     cd /tmp &&     rm -r keystone
 ---> Running in c4aa3830570a
Cloning into 'keystone'...
Note: switching to '2021.09.01'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at f67afb9 Merge remote-tracking branch 'ks_master/master' into upstream_pull_9.2021
../make-share.sh: 21: cmake: not found
./install_keystone.sh: line 6: cmake: command not found
make: *** No targets specified and no makefile found.  Stop.
make: *** No rule to make target 'install'.  Stop.
cp: cannot stat '/usr/local/lib/x86_64-linux-gnu/*': No such file or directory
rm -rf ./build src/
rm -rf prebuilt/win64/keystone.dll
rm -rf prebuilt/win32/keystone.dll
if test -n ""; then \
    python setup.py build -b ./build install --root=""; \
else \
    python setup.py build -b ./build install; \
fi
running build
Building C++ extensions
make[1]: Entering directory '/tmp/keystone/bindings/python'
rm -rf ./build src/ dist/ README
rm -f keystone/*.so
rm -rf prebuilt/win64/keystone.dll
rm -rf prebuilt/win32/keystone.dll
make[1]: Leaving directory '/tmp/keystone/bindings/python'
/tmp/keystone/bindings/python/../../CMakeLists.txt -> /tmp/keystone/bindings/python/src/CMakeLists.txt
/tmp/keystone/bindings/python/../../CMakeUninstall.in -> /tmp/keystone/bindings/python/src/CMakeUninstall.in
/tmp/keystone/bindings/python/../../CMakeLists.txt -> /tmp/keystone/bindings/python/src/CMakeLists.txt
/tmp/keystone/bindings/python/../../LICENSE-COM.TXT -> /tmp/keystone/bindings/python/src/LICENSE-COM.TXT
/tmp/keystone/bindings/python/../../AUTHORS.TXT -> /tmp/keystone/bindings/python/src/AUTHORS.TXT
/tmp/keystone/bindings/python/../../SPONSORS.TXT -> /tmp/keystone/bindings/python/src/SPONSORS.TXT
/tmp/keystone/bindings/python/../../CREDITS.TXT -> /tmp/keystone/bindings/python/src/CREDITS.TXT
/tmp/keystone/bindings/python/../../COPYING -> /tmp/keystone/bindings/python/src/COPYING
/tmp/keystone/bindings/python/../../LICENSE-COM.TXT -> /tmp/keystone/bindings/python/src/LICENSE-COM.TXT
/tmp/keystone/bindings/python/../../EXCEPTIONS-CLIENT -> /tmp/keystone/bindings/python/src/EXCEPTIONS-CLIENT
/tmp/keystone/bindings/python/../../README.md -> /tmp/keystone/bindings/python/src/README.md
/tmp/keystone/bindings/python/../../RELEASE_NOTES -> /tmp/keystone/bindings/python/src/RELEASE_NOTES
/tmp/keystone/bindings/python/../../ChangeLog -> /tmp/keystone/bindings/python/src/ChangeLog
/tmp/keystone/bindings/python/../../SPONSORS.TXT -> /tmp/keystone/bindings/python/src/SPONSORS.TXT
/tmp/keystone/bindings/python/../../pkg-config.pc.cmake -> /tmp/keystone/bindings/python/src/pkg-config.pc.cmake
/tmp/keystone/bindings/python/../../make-share.sh -> /tmp/keystone/bindings/python/src/make-share.sh
/tmp/keystone/bindings/python/../../install_keystone.sh -> /tmp/keystone/bindings/python/src/install_keystone.sh
/tmp/keystone/bindings/python/../../make-common.sh -> /tmp/keystone/bindings/python/src/make-common.sh
/tmp/keystone/bindings/python/../../make-afl.sh -> /tmp/keystone/bindings/python/src/make-afl.sh
/tmp/keystone/bindings/python/../../make-lib.sh -> /tmp/keystone/bindings/python/src/make-lib.sh
/tmp/keystone/bindings/python/../../nmake-dll.bat -> /tmp/keystone/bindings/python/src/nmake-dll.bat
/tmp/keystone/bindings/python/../../nmake-lib.bat -> /tmp/keystone/bindings/python/src/nmake-lib.bat
../make-share.sh: 21: cmake: not found
make[1]: Entering directory '/tmp/keystone/bindings/python/src/build'
make[1]: Leaving directory '/tmp/keystone/bindings/python/src/build'
make[1]: *** No targets specified and no makefile found.  Stop.
/usr/local/lib/python3.7/site-packages/setuptools/dist.py:700: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
  % (opt, underscore_opt))
error: [Errno 2] No such file or directory: '/tmp/keystone/bindings/python/src/build/llvm/lib64/libkeystone.so'
make: *** [Makefile:18: install] Error 1
rm -rf ./build src/
rm -rf prebuilt/win64/keystone.dll
rm -rf prebuilt/win32/keystone.dll
if test -n ""; then \
    python3 setup.py build -b ./build install --root=""; \
else \
    python3 setup.py build -b ./build install; \
fi
running build
Building C++ extensions
make[1]: Entering directory '/tmp/keystone/bindings/python'
rm -rf ./build src/ dist/ README
rm -f keystone/*.so
rm -rf prebuilt/win64/keystone.dll
rm -rf prebuilt/win32/keystone.dll
make[1]: Leaving directory '/tmp/keystone/bindings/python'
/tmp/keystone/bindings/python/../../CMakeLists.txt -> /tmp/keystone/bindings/python/src/CMakeLists.txt
/tmp/keystone/bindings/python/../../CMakeUninstall.in -> /tmp/keystone/bindings/python/src/CMakeUninstall.in
/tmp/keystone/bindings/python/../../CMakeLists.txt -> /tmp/keystone/bindings/python/src/CMakeLists.txt
/tmp/keystone/bindings/python/../../LICENSE-COM.TXT -> /tmp/keystone/bindings/python/src/LICENSE-COM.TXT
/tmp/keystone/bindings/python/../../AUTHORS.TXT -> /tmp/keystone/bindings/python/src/AUTHORS.TXT
/tmp/keystone/bindings/python/../../SPONSORS.TXT -> /tmp/keystone/bindings/python/src/SPONSORS.TXT
/tmp/keystone/bindings/python/../../CREDITS.TXT -> /tmp/keystone/bindings/python/src/CREDITS.TXT
/tmp/keystone/bindings/python/../../COPYING -> /tmp/keystone/bindings/python/src/COPYING
/tmp/keystone/bindings/python/../../LICENSE-COM.TXT -> /tmp/keystone/bindings/python/src/LICENSE-COM.TXT
/tmp/keystone/bindings/python/../../EXCEPTIONS-CLIENT -> /tmp/keystone/bindings/python/src/EXCEPTIONS-CLIENT
/tmp/keystone/bindings/python/../../README.md -> /tmp/keystone/bindings/python/src/README.md
/tmp/keystone/bindings/python/../../RELEASE_NOTES -> /tmp/keystone/bindings/python/src/RELEASE_NOTES
/tmp/keystone/bindings/python/../../ChangeLog -> /tmp/keystone/bindings/python/src/ChangeLog
/tmp/keystone/bindings/python/../../SPONSORS.TXT -> /tmp/keystone/bindings/python/src/SPONSORS.TXT
/tmp/keystone/bindings/python/../../pkg-config.pc.cmake -> /tmp/keystone/bindings/python/src/pkg-config.pc.cmake
/tmp/keystone/bindings/python/../../make-share.sh -> /tmp/keystone/bindings/python/src/make-share.sh
/tmp/keystone/bindings/python/../../install_keystone.sh -> /tmp/keystone/bindings/python/src/install_keystone.sh
/tmp/keystone/bindings/python/../../make-common.sh -> /tmp/keystone/bindings/python/src/make-common.sh
/tmp/keystone/bindings/python/../../make-afl.sh -> /tmp/keystone/bindings/python/src/make-afl.sh
/tmp/keystone/bindings/python/../../make-lib.sh -> /tmp/keystone/bindings/python/src/make-lib.sh
/tmp/keystone/bindings/python/../../nmake-dll.bat -> /tmp/keystone/bindings/python/src/nmake-dll.bat
/tmp/keystone/bindings/python/../../nmake-lib.bat -> /tmp/keystone/bindings/python/src/nmake-lib.bat
../make-share.sh: 21: cmake: not found
make[1]: *** No targets specified and no makefile found.  Stop.
make[1]: Entering directory '/tmp/keystone/bindings/python/src/build'
make[1]: Leaving directory '/tmp/keystone/bindings/python/src/build'
/usr/local/lib/python3.7/site-packages/setuptools/dist.py:700: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
  % (opt, underscore_opt))
error: [Errno 2] No such file or directory: '/tmp/keystone/bindings/python/src/build/llvm/lib64/libkeystone.so'
make: *** [Makefile:28: install3] Error 1
The command '/bin/sh -c cd /tmp &&     git clone https://github.com/rbs-forks/keystone.git &&     cd keystone &&     git checkout 2021.09.01 &&     ./install_keystone.sh &&     cd /tmp/keystone/bindings/python && python setup.py install &&     cd /tmp &&     rm -r keystone' returned a non-zero code: 2
Error running command: 'docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from redballoonsecurity/ofrak/ghidra-base:master -t redballoonsecurity/ofrak/ghidra-base:c0144b75 -t redballoonsecurity/ofrak/ghidra-base:latest -f base.Dockerfile .'
Exit status: 2

Please provide some information about your environment.

  • Platform:
    • Intel x86_64 running Ubuntu 20.04
    • Building redballoonsecurity/ofrak/ghidra

If you've discovered it, what is the root cause of the problem?
DOCKER_BUILDKIT=1 must be exported, or BuildKit must otherwise be enabled.

How often does the issue happen?
Every time I attempt to build the docker image without buildkit.

What are the steps to reproduce the issue?

$ DOCKER_BUILDKIT=0 python3 build_image.py --config ofrak-ghidra.yml --base --finish

How would you implement this fix?
Update docs to replace:

python3 build_image.py --config ofrak-ghidra.yml --base --finish

with:

DOCKER_BUILDKIT=1 python3 build_image.py --config ofrak-ghidra.yml --base --finish

Are there any (reasonable) alternative approaches?
Detect target architecture without the use of buildkit.

Are you interested in implementing it yourself?
Yes, after discussing solutions.

PowerPC PatchMaker Support

What is the use case for the feature?
Patching PPC binaries with PatchMaker requires support for PPC builds. The LLVM toolchain may be sufficient, but we should also have a GCC toolchain targeting PPC. For example, I'm not sure how good LLVM's support for the VLE encoding is.

Does the feature contain any proprietary information about another company's intellectual property?
Nope.

How would you implement this feature?
A Toolchain and tests that target PPC. The same tests could also exercise the PPC target for LLVM.

Are there any (reasonable) alternative approaches?
Just relying on LLVM is probably fine, but GCC toolchains give more flexibility.

Are you interested in implementing it yourself?
Not as the main contributor, but I'll definitely help out with blockers!

ElfPointerArraySectionAddModifier modifications do not propagate to children view (ElfVirtualAddress)

What is the problem? (Here is where you provide a complete Traceback.)
The following test, when added to TestElfPointerArraySectionModifier (from #71 ), fails.

    async def test_elf_pointer_array_section_modifier_virtual_address(self, elf_resource: Resource):
        """
        Test that `ElfPointerArraySectionModifier` results in updates to the children
        `ElfVirtualAddress`.
        """
        pointer_array_section = await self._unpack_and_get_first_pointer_array_section(elf_resource)
        original_values = list(await pointer_array_section.get_entries())

        await pointer_array_section.resource.run(
            ElfPointerArraySectionAddModifier,
            ElfPointerArraySectionAddModifierConfig(skip_list=(), add_value=self.add_value),
        )
        updated_pointer_array_section = await pointer_array_section.resource.view_as(
            ElfPointerArraySection
        )

        for i, entry in enumerate(await updated_pointer_array_section.get_entries()):
            assert entry.value - self.add_value == original_values[i].value

This test appears to fail because the ElfVirtualAddress.value is not updated when the section's entries are retrieved after ElfPointerArraySectionAddModifier is run.

Please provide some information about your environment.
At minimum we would like the following information on your platform and Python environment:
This was first observed when running the make image container on #71.

If you've discovered it, what is the root cause of the problem?
This could possibly be related to the way in which ElfPointerArraySectionAddModifier is implemented: it does not iterate over its children, but rather modifies its own data directly.

How often does the issue happen?
Every time the above-referenced test is run

Store known symbolic information in LinkableSymbols and generate assembly stub files

What is the use case for the feature?
LinkableSymbols represent symbols in a program that OFRAK can link against with PatchMaker (i.e. a C patch you inject can call functions and access data defined by those symbols). Previously these always had to be defined manually, even when some symbolic information was present that OFRAK could use. Automatically populating LinkableSymbols with symbolic information will allow patch authors to more quickly link against known symbols.

Does the feature contain any proprietary information about another company's intellectual property?
No.

How would you implement this feature?
Symbolic information (vaddr, name, type, and mode) about LinkableSymbols for functions is currently stored in ComplexBlock objects. These ComplexBlocks can be tagged as LinkableSymbols and the information will be copied to these new resources via the get_symbols() method of LinkableBinary. PatchFromSourceModifier gets linkage info from LinkableBinary's make_linkable_bom() method.

We will need to stress test these changes to ensure that performance bottlenecks are not introduced by providing the Linker with potentially hundreds or thousands of object files to link against. We should consider adding test cases to the CI pipeline.

Are there any (reasonable) alternative approaches?
This change will store LinkableSymbols as descendant resources of a LinkableBinary. We should consider whether this is the best structure for this information and whether there are better alternatives.

Symbolic information will only be taken from the disassembler backend, not directly from the symtab. Disassemblers may disagree with the symtab (if present) when a binary has been compiled with position-independent code. However, this should not be an issue to use the linkage info from the disassembler as long as patch authors are compiling their patches to be position-independent.

Are you interested in implementing it yourself?
Yes.

Potential post-MVP features

  • Creating LinkableSymbols for data. Currently it is not as easy to get symbolic information for data as it is for functions. Furthermore, we would need to handle each disassembler backend separately.

capture noisy ghidra output and log it with DEBUG level

The stdout and stderr streams of Ghidra contain output which, as far as OFRAK is concerned, is potentially useful but distracting most of the time. For example, in lesson 4 of the tutorial, this output is shown (scattered over stdout and stderr):

/opt/rbs/ghidra_10.1.2_PUBLIC/support/analyzeHeadless ghidra://localhost:13100/ofrak -connect root -p -import /tmp/tmpk4aq9pyc/f5cd089390a2de4123fb3b36d1d0e4c88f9c45625cd77b58a969bb932ea51bc5 -overwrite
openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment (build 11.0.15+10-post-Debian-1deb11u1)
OpenJDK 64-Bit Server VM (build 11.0.15+10-post-Debian-1deb11u1, mixed mode)
/opt/rbs/ghidra_10.1.2_PUBLIC/support/analyzeHeadless ghidra://localhost:13100/ofrak -connect root -p -process f5cd089390a2de4123fb3b36d1d0e4c88f9c45625cd77b58a969bb932ea51bc5 -readOnly -scriptPath /ofrak_components_ghidra/ofrak_components_ghidra/ghidra_scripts/ -postScript AnalysisServer.java
openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment (build 11.0.15+10-post-Debian-1deb11u1)
OpenJDK 64-Bit Server VM (build 11.0.15+10-post-Debian-1deb11u1, mixed mode)

To avoid confusing users, this output should be captured and logged with a DEBUG level.
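One way this could be done, sketched below (the helper name `run_quietly` is illustrative): capture the child process's streams instead of letting them reach the console, then re-emit them at DEBUG level.

```python
import logging
import subprocess
from typing import List

LOGGER = logging.getLogger(__name__)

def run_quietly(command: List[str]) -> subprocess.CompletedProcess:
    # Capture stdout/stderr so the child cannot write to the user's console...
    result = subprocess.run(command, capture_output=True, text=True)
    # ...then log whatever it printed, visible only when DEBUG logging is on
    if result.stdout:
        LOGGER.debug("stdout from %s: %s", command[0], result.stdout)
    if result.stderr:
        LOGGER.debug("stderr from %s: %s", command[0], result.stderr)
    return result

result = run_quietly(["echo", "hello"])
```

If the asyncio subprocess migration discussed earlier in this tracker lands first, the same capture-and-log pattern applies to `asyncio.create_subprocess_exec` with piped streams.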

Which files would be affected?
The disassemblers/ofrak_ghidra package would be affected.

Does the proposed maintenance include non-doc string functional changes to the Python code?
This might depend on how it is implemented.

Are you interested in implementing it yourself?
This is a great first contributor issue!

Add additional PPC toolchain test

As per this comment by @andresito00 on the PPC toolchain PR, this issue documents the need to translate the remaining patch maker test to PPC assembly.

Which files would be affected?

Does the proposed maintenance include non-doc string functional changes to the Python code?

Adding tests.

Are you interested in implementing it yourself?

Yes, someday.

Second code block in tutorial lesson 5 fails with CalledProcessError, UnpackerError

When running the second code block in Lesson 5 of the tutorial,

# second block of code in 5_filesystem_modification.ipynb
from ofrak import OFRAK

ofrak = OFRAK()
basic_context = await ofrak.create_ofrak_context()
root_resource = await basic_context.create_root_resource_from_file("image.sqsh")
unpack_result = await root_resource.unpack_recursively()

it fails with the following error:

# error message
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
/ofrak_core/ofrak/core/filesystem.py in unpack_with_command(command)
    665     try:
--> 666         subprocess.run(command, check=True, capture_output=True)
    667     except subprocess.CalledProcessError as error:

/usr/local/lib/python3.7/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    511             raise CalledProcessError(retcode, process.args,
--> 512                                      output=stdout, stderr=stderr)
    513     return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['unsquashfs', '-no-exit-code', '-force', '-dest', '/tmp/tmpz1j0oj3b', '/tmp/tmp1cqq0pq0']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

UnpackerError                             Traceback (most recent call last)
/ofrak_core/ofrak/service/job_service.py in _run_component(self, job_id, resource_id, component, job_context, config)
    136         self._active_component_tasks[component_task_id] = component_task
--> 137         component_result = await component_task
    138         del self._active_component_tasks[component_task_id]

/ofrak_core/ofrak/component/abstract.py in run(self, job_id, resource_id, job_context, resource_context, resource_view_context, config)
     81             config = self.get_default_config()
---> 82         await self._run(resource, config)
     83         deleted_resource_models: List[MutableResourceModel] = list()

/ofrak_core/ofrak/component/unpacker.py in _run(self, resource, config)
     82             )
---> 83         await self.unpack(resource, config)
     84         resource.add_component(self.get_id(), self.get_version())

/ofrak_components/ofrak_components/squashfs.py in unpack(self, resource, config)
     52                 ]
---> 53                 await unpack_with_command(command)
     54 

/ofrak_core/ofrak/core/filesystem.py in unpack_with_command(command)
    667     except subprocess.CalledProcessError as error:
--> 668         raise UnpackerError(format_called_process_error(error))
    669 

UnpackerError: Command '['unsquashfs', '-no-exit-code', '-force', '-dest', '/tmp/tmpz1j0oj3b', '/tmp/tmp1cqq0pq0']' returned non-zero exit status 1. Stderr: b'FATAL ERROR: Data queue size is too large\n'. Stdout: b''.

The above exception was the direct cause of the following exception:

ComponentAutoRunFailure                   Traceback (most recent call last)
/tmp/ipykernel_1369/1614249283.py in async-def-wrapper()

/ofrak_core/ofrak/resource.py in unpack_recursively(self, blacklisted_components, do_not_unpack)
    540             all_unpackers=True,
    541             blacklisted_components=blacklisted_components,
--> 542             blacklisted_tags=do_not_unpack,
    543         )
    544 

/ofrak_core/ofrak/resource.py in auto_run_recursively(self, components, blacklisted_components, blacklisted_tags, all_unpackers, all_identifiers, all_analyzers)
    509                 all_identifiers=all_identifiers,
    510                 all_analyzers=all_analyzers,
--> 511                 tags_ignored=tuple(blacklisted_tags),
    512             )
    513         )

/ofrak_core/ofrak/service/job_service.py in run_components_recursively(self, request)
    269                 _run_components_requests,
    270                 request.job_id,
--> 271                 job_context,
    272             )
    273             components_result.update(iteration_components_result)

/ofrak_core/ofrak/service/job_service.py in _auto_run_components(self, requests, job_id, job_context)
    414                         request_causing_run.component_filter,
    415                         component_name.encode(),
--> 416                     ) from component_run_error
    417             concurrent_run_tasks = pending
    418 

ComponentAutoRunFailure: Component SquashfsUnpacker failed when running on b5091fce80b945899027ef2680f77e90. Component was chosen because it matched filters (((ComponentTypeFilter(Unpacker) or ComponentTypeFilter(Identifier))) and ((ComponentTypeFilter(Identifier) and ComponentTargetFilter(SquashfsFilesystem, GenericBinary, FilesystemRoot)) or (not ComponentTypeFilter(Identifier) and (ComponentTargetFilter(SquashfsFilesystem) then ComponentTargetFilter(GenericBinary, FilesystemRoot)))))

My system:
Docker version: 20.10.17, build 100c70180f
OS: Linux 5.19.1-2-MANJARO
Browser: Mozilla Firefox 103.0.2

I successfully ran make tutorial-image and was using the Jupyter notebook URL provided by make tutorial-run.

FreeSpaceModifier and PartialFreeSpace Modifier Improvements

  1. The modifiers should be consolidated.

  2. The modifiers should not require knowledge of the instruction set architecture. We should opt for an ISA-agnostic, bytes-based contract for any custom code used as a replacement.

Which files would be affected?

  • ofrak_core/ofrak/core/free_space.py
  • ofrak_core/ofrak/service/job_service.py
  • ofrak_core/test_ofrak/unit/resource_view/test_view.py

Does the proposed maintenance include non-doc string functional changes to the Python code?
Yes

Are you interested in implementing it yourself?
Yes

Building from scratch

I am trying to build OFRAK from source and I am facing the following issue.

[screenshot: issue]

How can I rectify this issue?

PatchMaker VLE support

What is the use case for the feature?

Patching VLE binaries with the OFRAK PatchMaker.

Does the feature contain any proprietary information about another company's intellectual property?

Negative.

How would you implement this feature?

95% implemented on master...rbs-jacob:ofrak:feature/ppc-vle-toolchain. The remaining 5% was discussed briefly on #204, and is summarized below.

Though free to download, the NXP toolchain required for VLE patching is gated behind a login wall. It therefore can't be pulled automatically by the ofrak_patch_maker/Dockerstub install. A couple of strategies to overcome this:

  • Conditionally copy the ZIP into the Docker image and don't install it if the file doesn't exist
    • This would work for users, but would cause CI tests for the VLE PatchMaker to fail
  • Bundle the files directly
    • The GNU toolchain is licensed under the GPL, but I don't know if it is acceptable for us to distribute NXP's builds outside of their auth wall
  • Use a different toolchain with VLE support

Are you interested in implementing it yourself?

I (mostly) already have :)

Add ZStandard components

What is the use case for the feature?
ZStandard is a modern compression format known for its speed and high compression ratios.

It has a command line tool that can be installed with:

# Ubuntu / Debian / OFRAK Docker
sudo apt install zstd

Or:

# macOS
brew install zstd

The packer and unpacker components would probably end up being very similar to the LZO components in structure.
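The CLI calls such components would wrap are simple; a round-trip sketch (assumes zstd is installed, e.g. via the commands above):

```shell
# Work in a scratch directory
mkdir -p /tmp/ofrak_zstd_demo && cd /tmp/ofrak_zstd_demo
printf 'hello ofrak' > sample.txt
# Pack: roughly what a ZstdPacker-style component would invoke
zstd -q -f sample.txt -o sample.txt.zst
# Unpack: roughly what a ZstdUnpacker-style component would invoke
zstd -d -q -f sample.txt.zst -o sample.out
cmp -s sample.txt sample.out && echo "round-trip OK"
```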

There are some example ZStandard-compressed files here, but they may not make for great test files because they're fairly large, so it is probably worth finding better (or smaller) ones...

Does the feature contain any proprietary information about another company's intellectual property?
No, the format is open-source https://github.com/facebook/zstd/blob/dev/LICENSE.

How would you implement this feature?
See above.

Are there any (reasonable) alternative approaches?
N/A.

Are you interested in implementing it yourself?
No -- this is a great first contributor issue!

Convenience `from ofrak.core import *` causes namespace issues

What is the problem? (Here is where you provide a complete Traceback.)
The convenient from ofrak.core import * causes unintended namespace collisions:

>>> zip
<class 'zip'>
>>> from ofrak.core import *
>>> zip
<module 'ofrak.core.zip' from '/Users/wyatt/ofrak/vevn_new/lib/python3.9/site-packages/ofrak/core/zip.py'>
>>> strings
<module 'ofrak.core.strings' from '/Users/wyatt/ofrak/vevn_new/lib/python3.9/site-packages/ofrak/core/strings.py'>

Please provide some information about your environment.
Can be reproduced with ofrak==2.0.0

If you've discovered it, what is the root cause of the problem?
See above.

How often does the issue happen?
Cursory glance suggests only zip is problematic.

What are the steps to reproduce the issue?
See above

How would you implement this fix?
The zip issue could be sidestepped by renaming the ofrak zip.py file.

Another thing worth considering is creating an __all__ list (https://docs.python.org/3/tutorial/modules.html#importing-from-a-package) to more selectively limit what is exposed.

Are there any (reasonable) alternative approaches?
Not that I can think of.

Are you interested in implementing it yourself?
Maybe! Would like to discuss first.

We need contribution guidelines.

Which files would be affected?
Need some templates in a new .github directory.

I'll be shuffling around some markdown content such as the beginnings of a coding standard, readme, etc.

Does the proposed maintenance include non-doc string functional changes to the Python code?

No

Are you interested in implementing it yourself?

Yes :)

Flush files to disk in scripts generated by the GUI

What is the use case for the feature?

The current implementation of script generation in the GUI will process input binaries, but it has no side effects. Generated scripts should be able to flush files to disk when they have been downloaded in the GUI.

As an OFRAK user, I want to be able to extract a file from a deeply-nested structure in the GUI, and then generate a script that can repeat that extraction by dumping the extracted file to disk somewhere.

Does the feature contain any proprietary information about another company's intellectual property?

No.

How would you implement this feature?

Since the "Download" button in the GUI runs fully client-side, whoever implements this will need to create an API endpoint to add a flush_to_disk step to the script in the back end. Then the function run when the GUI "Download" button is pressed will need to be updated to make a request to this endpoint, and call resource.update_script().

Are there any (reasonable) alternative approaches?

None that I can think of.

Make FilesystemRoot.initialize_from_disk() fail when given an incorrect path, instead of not initializing anything

What is the problem? (Here is where you provide a complete Traceback.)
The current code is essentially:

for root, dirs, files in os.walk(root_path):
    <something>

This loop does nothing when root_path doesn't exist.

It also does nothing when the path given corresponds to a file and not a directory.

Please provide some information about your environment.
N/A

If you've discovered it, what is the root cause of the problem?
See above.

How often does the issue happen?
Whenever an incorrect path is passed to FilesystemRoot.initialize_from_disk.

Are you interested in implementing it yourself?
No. This is a good first contributor issue!

`_find_and_delete_overlapping_children` may have unintended behavior

The _find_and_delete_overlapping_children function (below) is used in the partial free space modifier.

async def _find_and_delete_overlapping_children(resource: Resource, freed_range: Range):
    # Note this filter calculation has the potential to be very expensive if, for instance,
    # the resource is an entire program segment...
    overlap_resources = list(
        await resource.get_children_as_view(
            MemoryRegion,
            r_filter=ResourceFilter(
                tags=(MemoryRegion,),
                attribute_filters=(
                    ResourceAttributeRangeFilter(MemoryRegion.VirtualAddress, max=freed_range.end),
                    ResourceAttributeRangeFilter(MemoryRegion.EndVaddr, min=freed_range.start),
                ),
            ),
        )
    )
    for overlapping_child in overlap_resources:
        await overlapping_child.resource.delete()
        await overlapping_child.resource.save()

As implied by its name, it finds and deletes children overlapping the range to be marked as free space. If a range partially overlaps a child, that entire child will be deleted even though only part of it will be marked as free space. This has the potential to lead to bugs.

In particular, it is rarely correct to delete a partially overlapping child. Instead, as an OFRAK user, I would like to know when such partial children are being deleted, because it may mean I have passed in an incorrect range to mark as free. I believe that if there is a partial overlap, an exception should be raised.
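The proposed check is pure range arithmetic. A sketch, with a local Range stand-in for self-containedness (OFRAK has its own Range type) and a hypothetical exception name:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Range:
    start: int
    end: int

class PartialOverlapError(ValueError):
    """Raised when a child only partially overlaps the range being freed."""

def check_fully_contained(child: Range, freed: Range) -> None:
    # Half-open ranges: a child overlaps if the intervals intersect at all.
    overlaps = child.start < freed.end and child.end > freed.start
    contained = freed.start <= child.start and child.end <= freed.end
    if overlaps and not contained:
        raise PartialOverlapError(
            f"Child {child} partially overlaps freed range {freed}"
        )
```

_find_and_delete_overlapping_children could run such a check on each overlapping child before deleting it, so a partially overlapping child surfaces as an error rather than a silent deletion.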

Which files would be affected?

ofrak_core/ofrak/core/free_space.py

Does the proposed maintenance include non-doc string functional changes to the Python code?

Minimally.

Are you interested in implementing it yourself?

Perhaps.

Add PyYAML dependency to docs and setup.py

OFRAK has PyYAML dependencies in three places that are not explicit in code or documentation:

  • ofrak_ghidra requires it
  • build_image.py requires it for the Docker build system
  • Building the OFRAK docs requires PyYAML

These three dependencies should be handled separately:

  • ofrak_ghidra's setup.py should include PyYAML
  • INSTALL.md should explain that PyYAML is required for the Docker build
  • PyYAML should be included in the docs requirements in the extras_require section for the core ofrak package.
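For the last point, the relevant setup.py fragment might look like this (the "docs" extra name is an assumption; the dict would be passed to setuptools.setup(..., extras_require=extras_require)):

```python
# Sketch of the extras_require fragment for the core ofrak package's setup.py.
extras_require = {
    "docs": [
        "PyYAML",  # required to build the OFRAK docs
    ],
}
```

Users building the docs would then install it with `pip install ofrak[docs]`.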

Originally posted by @joshterrill in #6 (comment)

Easy GUI installation and use

What is the use case for the feature?

As an OFRAK user, I want to be able to pip install the OFRAK GUI and run it outside of Docker on any OS supported by OFRAK. Moreover, I want to be able to start the GUI from the shell by running a command such as ofrak gui.

How would you implement this feature?

An MVP of the CLI for running the GUI might look something like this.

$ python3 -m pip install ofrak
$ ofrak gui --help
usage: ofrak gui [-H ADDR] [-p PORT]

Start the OFRAK GUI.

optional arguments:
  -h, --help
  -H ADDR, --hostname ADDR  hostname to bind to (default: 127.0.0.1)
  -p PORT, --port PORT      port to bind to (default: 8080)
$ ofrak gui
OFRAK web server started on 127.0.0.1:8080; press Ctrl-C to exit.
http://127.0.0.1:8080

Right now, there are two broad obstacles preventing the GUI from being installable with pip: removing non-Python dependencies of the GUI, and making it easy to run the GUI. These can be broken down into three main tasks:

  1. The GUI is compiled from Svelte to static HTML/CSS/JavaScript using this step of the Docker build, which runs this command.
    • These files should be pre-compiled and bundled in the pip package.
    • Shipping the pre-compiled, static front end enables users to run the GUI without installing NodeJS, npm, or the Svelte compiler.
  2. The static GUI front end files are served using nginx inside the container. To run the GUI outside of Docker, these files need to be statically served without depending on nginx.
  3. Users should be able to run the GUI from the command line.
    • This should involve integrating with the existing OFRAK CLI.
    • The ofrak gui command (or one like it) should run the OFRAK GUI server and open the GUI in the default browser.
    • As a default, INFO logs should be printed to standard out to make it easy for users to debug OFRAK.
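The CLI surface from the MVP transcript can be sketched with argparse (how the subcommand wires into the existing OFRAK CLI is left open; names below mirror the transcript, not a real implementation):

```python
import argparse

def build_gui_parser() -> argparse.ArgumentParser:
    # Mirrors the MVP usage: ofrak gui [-H ADDR] [-p PORT]
    parser = argparse.ArgumentParser(
        prog="ofrak gui", description="Start the OFRAK GUI."
    )
    # Uppercase -H avoids colliding with argparse's built-in -h/--help.
    parser.add_argument(
        "-H", "--hostname", metavar="ADDR", default="127.0.0.1",
        help="hostname to bind to (default: 127.0.0.1)",
    )
    parser.add_argument(
        "-p", "--port", metavar="PORT", type=int, default=8080,
        help="port to bind to (default: 8080)",
    )
    return parser

args = build_gui_parser().parse_args([])
print(f"OFRAK web server would bind {args.hostname}:{args.port}")
```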

Some post-MVP features include:

  • Optional argument to pass a binary to open when the GUI first loads
  • Optional logging level argument
  • Optional argument to specify the disassembler back end
  • Automation for compiling the Svelte files and pushing them to PyPI

Are you interested in implementing it yourself?

@marczalik is planning to take a look.
