Giter Site home page Giter Site logo

clearbluejar / ghidriff Goto Github PK

View Code? Open in Web Editor NEW
435.0 6.0 18.0 12.29 MB

Python Command-Line Ghidra Binary Diffing Engine

Home Page: https://clearbluejar.github.io/ghidriff/

License: GNU General Public License v3.0

Shell 0.48% Python 93.05% Dockerfile 0.22% JavaScript 4.39% CSS 1.86%
patchdiff ghidra python bindiff binary-diffing vulnerability-research

ghidriff's Introduction

GitHub Workflow Status (with event) PyPI - Downloads

Ghidriff - Ghidra Binary Diffing Engine

ghidriff provides a command-line binary diffing capability with a fresh take on diffing workflow and results.

It leverages the power of Ghidra's ProgramAPI and FlatProgramAPI to find the added, deleted, and modified functions of two arbitrary binaries. It is written in Python3 using pyhidra to orchestrate Ghidra and jpype as the Python to Java interface to Ghidra.

Its primary use case is patch diffing. Its ability to perform a patch diff with a single command makes it ideal for automated analysis. The diffing results are stored in JSON and rendered in markdown (optionally side-by-side HTML). The markdown output promotes "social" diffing, as results are easy to publish in a gist or include in your next writeup or blog post.

High Level

flowchart LR

a(old binary - rpcrt4.dll-v1) --> b[GhidraDiffEngine]
c(new binary - rpcrt4.dll-v2) --> b

b --> e(Ghidra Project Files)
b --> diffs_output_dir

subgraph diffs_output_dir
    direction LR
    i(rpcrt4.dll-v1-v2.diff.md)
    h(rpcrt4.dll-v1-v2.diff.json)
    j(rpcrt4.dll-v1-v2.diff.side-by-side.html)
end

Sample Diffs

image image

Features

  • Command Line (patch diffing workflow reduced to a single step)
  • Highlights important changes in the TOC
  • Fast - Can diff the full Windows kernel in less than a minute (after Ghidra analysis is complete)
  • Enables Social Diffing
    • Beautiful Markdown Output
    • Easily hosted in a GitHub or GitLab gist, blog, or anywhere markdown is supported
    • Visual Diff Graph Results
  • Supports both unified and side by side diff results (unified is default)
  • Provides unique Meta Diffs:
    • Binary Strings
    • Called
    • Calling
    • Binary Metadata
  • Batteries Included
    • Docker support
    • Automated Testing
    • Ghidra (No license required)

See below for CVE diffs and sample usage

Design Goals

  • Find all added, deleted, and modified functions
  • Provide foundation for automation
  • Simple, Fast, Accurate
  • Resilient
  • Extendable
  • Easy sharing of results
  • Social Diffing

Powered by Ghidra

The heavy lifting of the binary analysis is done by Ghidra and the diffing is possible via Ghidra's Program API. ghidriff provides a diffing workflow, function matching, and resulting markdown and HTML diff output.

Docs

Engine

An "engine" is a self-contained, but externally-controllable, piece of code that encapsulates powerful logic designed to perform a specific type of work.

ghidriff provides a core base class GhidraDiffEngine that can be extended to create your own binary diffing implementations.

The base class implements the first 3 steps of the Ghidra headless workflow:

  1. Create Ghidra Project - Directory and collection of Ghidra project files and data
  2. Import Binary to project - Import one or more binaries to the project for analysis
  3. Analyze Binary - Ghidra will perform default binary analysis on each binary

The base class provides the abstract method find_matches where the actual diffing (function matching) takes place.

Extending ghidriff

ghidriff can be used as is, but it offers developers the ability to extend the tool by implementing their own differ. The basic idea is create new diffing tools by implementing the find_matches method from the base class.

class NewDiffTool(GhidraDiffEngine):

    def __init__(self,verbose=False) -> None:
        super().__init__(verbose)

    @abstractmethod
    def find_matches(
            self,            
            old: Union[str, pathlib.Path],
            new: Union[str, pathlib.Path]
    ) -> dict:
        """My amazing differ"""

        # find added, deleted, and modified functions
        # <code goes here>

        return [unmatched, matched]

Implementations

There are currently 3 diffing implementations, which also display the evolution of diffing for the project.

  1. SimpleDiff - A simple diff implementation. "Simple" as in it relies mostly on known symbol names for matching.
  2. StructualGraphDiff - A slightly more advanced differ, beginning to perform some more advanced hashing (such as Halvar's Structural Graph Comparison)
  3. VersionTrackingDiff - The latest differ, with several correlators (an algorithm used to score specific associations based on code, program flow, or any observable aspect of comparison) for function matching. This one is fast.

Each implementation leverages the base class, and implements find_changes.

Usage

usage: ghidriff [-h] [--engine {SimpleDiff,StructualGraphDiff,VersionTrackingDiff}] [-o OUTPUT_PATH] [--summary SUMMARY] [-p PROJECT_LOCATION] [-n PROJECT_NAME] [-s SYMBOLS_PATH] [--threaded | --no-threaded] [--force-analysis] [--force-diff] [--no-symbols] [--log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}]
                [--file-log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}] [--log-path LOG_PATH] [--va] [--min-func-len MIN_FUNC_LEN] [--use-calling-counts USE_CALLING_COUNTS] [--max-ram-percent MAX_RAM_PERCENT] [--print-flags] [--jvm-args [JVM_ARGS]] [--sxs] [--max-section-funcs MAX_SECTION_FUNCS]
                [--md-title MD_TITLE]
                old new [new ...]

ghidriff - A Command Line Ghidra Binary Diffing Engine

positional arguments:
  old                   Path to old version of binary '/somewhere/bin.old'
  new                   Path to new version of binary '/somewhere/bin.new'. (For multiple new binaries add oldest to newest)

options:
  -h, --help            show this help message and exit
  --engine {SimpleDiff,StructualGraphDiff,VersionTrackingDiff}
                        The diff implementation to use. (default: VersionTrackingDiff)
  -o OUTPUT_PATH, --output-path OUTPUT_PATH
                        Output path for resulting diffs (default: ghidriffs)
  --summary SUMMARY     Add a summary diff if more than two bins are provided (default: False)

Extendend Usage

There are quite a few options here, and some complexity. Generally you can succeed with the defaults, but you can override the defaults as needed. One example might be to increase the JVM RAM used to run Ghidra to enable faster analysis of large binaries (--max-ram-percent 80). See help for details of other options.

Show Extended Usage
Ghidra Project Options:
  -p PROJECT_LOCATION, --project-location PROJECT_LOCATION
                        Ghidra Project Path (default: ghidra_projects)
  -n PROJECT_NAME, --project-name PROJECT_NAME
                        Ghidra Project Name (default: ghidriff)
  -s SYMBOLS_PATH, --symbols-path SYMBOLS_PATH
                        Ghidra local symbol store directory (default: symbols)

Engine Options:
  --threaded, --no-threaded
                        Use threading during import, analysis, and diffing. Recommended (default: True)
  --force-analysis      Force a new binary analysis each run (slow) (default: False)
  --force-diff          Force binary diff (ignore arch/symbols mismatch) (default: False)
  --no-symbols          Turn off symbols for analysis (default: False)
  --log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Set console log level (default: INFO)
  --file-log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Set log file level (default: INFO)
  --log-path LOG_PATH   Set ghidriff log path. (default: ghidriff.log)
  --va, --verbose-analysis
                        Verbose logging for analysis step. (default: False)
  --min-func-len MIN_FUNC_LEN
                        Minimum function length to consider for diff (default: 10)
  --use-calling-counts USE_CALLING_COUNTS
                        Add calling/called reference counts (default: True)

JVM Options:
  --max-ram-percent MAX_RAM_PERCENT
                        Set JVM Max Ram % of host RAM (default: 60.0)
  --print-flags         Print JVM flags at start (default: False)
  --jvm-args [JVM_ARGS]
                        JVM args to add at start (default: None)

Markdown Options:
  --sxs                 Include side by side code diff (default: False)
  --max-section-funcs MAX_SECTION_FUNCS
                        Max number of functions to display per section. (default: 200)
  --md-title MD_TITLE   Overwrite default title for markdown diff (default: None)

Quick Start Environment Setup

  1. Download and install Ghidra.
  2. Set Ghidra Environment Variable GHIDRA_INSTALL_DIR to Ghidra install location.
  3. Pip install ghidriff

Windows

PS C:\Users\user> [System.Environment]::SetEnvironmentVariable('GHIDRA_INSTALL_DIR','C:\ghidra_10.2.3_PUBLIC_20230208\ghidra_10.2.3_PUBLIC')
PS C:\Users\user> pip install ghidriff

Linux / Mac

export GHIDRA_INSTALL_DIR="/path/to/ghidra/"
pip install ghidriff

Ghidriff in a Box

Don't want to install Ghidra and Java on your host? Try "Ghidriff in a box". It supports multiple-platforms (x64 and arm64).

Docker

docker pull ghcr.io/clearbluejar/ghidriff:latest

This is a docker container with the latest PyPi version of Ghidriff installed. You can check the latest container here.

For Docker command-line diffing

You will need to map the binaries you want to compare into the container. See below for an example.

mkdir -p ghidriffs
wget https://msdl.microsoft.com/download/symbols/clfs.sys/9848245C6f000/clfs.sys -O ghidriffs/clfs.sys.x64.10.0.22621.2506
wget https://msdl.microsoft.com/download/symbols/clfs.sys/D929C6E56f000/clfs.sys -O ghidriffs/clfs.sys.x64.10.0.22621.2715
docker run -it --rm -v $(pwd)/ghidriffs:/ghidriffs ghcr.io/clearbluejar/ghidriff:latest  ghidriffs/clfs.sys.x64.10.0.22621.2506 ghidriffs/clfs.sys.x64.10.0.22621.2715

The result will produce the following.

tree ghidriffs
ghidriffs
├── clfs.sys.x64.10.0.22621.2506
├── clfs.sys.x64.10.0.22621.2506-clfs.sys.x64.10.0.22621.2715.ghidriff.md
├── clfs.sys.x64.10.0.22621.2715
├── ghidra_projects
│   └── ghidriff-clfs.sys.x64.10.0.22621.2506-clfs.sys.x64.10.0.22621.2715
│       ├── ghidriff-clfs.sys.x64.10.0.22621.2506-clfs.sys.x64.10.0.22621.2715.gpr
│       ├── ghidriff-clfs.sys.x64.10.0.22621.2506-clfs.sys.x64.10.0.22621.2715.lock
│       └── ghidriff-clfs.sys.x64.10.0.22621.2506-clfs.sys.x64.10.0.22621.2715.rep
├── ghidriff.log
├── json
│   └── clfs.sys.x64.10.0.22621.2506-clfs.sys.x64.10.0.22621.2715.ghidriff.json
└── symbols
    ├── 000admin
    ├── clfs.pdb
    │   ├── 6EAE8987F981603FEFA0E55DE0CE2C521
    │   │   └── clfs.pdb
    │   └── E3D1FEA241ECEC3DC6DB2B278A22A6A31
    │       └── clfs.pdb
    └── pingme.txt

Devcontainer - For Ghidriff development

Use the .devcontainer in this repo. If you don't know how, follow the detailed instructions here: ghidra-python-vscode-devcontainer-skeleton quick setup.

Use Cases

Diffing a full Windows Kernel

Download two versions of the kernel (older and latest binary):

wget https://msdl.microsoft.com/download/symbols/ntoskrnl.exe/F7E31BA91047000/ntoskrnl.exe -O ntoskrnl.exe.10.0.22621.1344
wget https://msdl.microsoft.com/download/symbols/ntoskrnl.exe/17B6B7221047000/ntoskrnl.exe -O ntoskrnl.exe.10.0.22621.1413
Console Output:
vscode ➜ /workspaces/ghidriff (main) $ wget https://msdl.microsoft.com/download/symbols/ntoskrnl.exe/F7E31BA91047000/ntoskrnl.exe -O ntoskrnl.exe.10.0.22621.1344
--2023-05-17 03:18:40--  https://msdl.microsoft.com/download/symbols/ntoskrnl.exe/F7E31BA91047000/ntoskrnl.exe
Resolving msdl.microsoft.com (msdl.microsoft.com)... 204.79.197.219
Connecting to msdl.microsoft.com (msdl.microsoft.com)|204.79.197.219|:443... connected.
HTTP request sent, awaiting response... 302 Found
Could not parse String-Transport-Security header
Location: https://vsblobprodscussu5shard72.blob.core.windows.net/b-4712e0edc5a240eabf23330d7df68e77/8BFC691F50434EC2DC87BBDFC06A6A5FBACE992E60062F9C8CE829F58E3BCFB300.blob?sv=2019-07-07&sr=b&si=1&sig=Kgrvf90Kc15ac%2FtHsgPPj9ztxxTfkQ0yHGQh8dLDwQs%3D&spr=https&se=2023-05-18T03%3A32%3A47Z&rscl=x-e2eid-420cea82-598a4a00-a990abf8-919be2ff-session-5e9eb5eb-195146cb-b123c222-30eef52e [following]
--2023-05-17 03:18:40--  https://vsblobprodscussu5shard72.blob.core.windows.net/b-4712e0edc5a240eabf23330d7df68e77/8BFC691F50434EC2DC87BBDFC06A6A5FBACE992E60062F9C8CE829F58E3BCFB300.blob?sv=2019-07-07&sr=b&si=1&sig=Kgrvf90Kc15ac%2FtHsgPPj9ztxxTfkQ0yHGQh8dLDwQs%3D&spr=https&se=2023-05-18T03%3A32%3A47Z&rscl=x-e2eid-420cea82-598a4a00-a990abf8-919be2ff-session-5e9eb5eb-195146cb-b123c222-30eef52e
Resolving vsblobprodscussu5shard72.blob.core.windows.net (vsblobprodscussu5shard72.blob.core.windows.net)... 20.209.34.36
Connecting to vsblobprodscussu5shard72.blob.core.windows.net (vsblobprodscussu5shard72.blob.core.windows.net)|20.209.34.36|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11990400 (11M) [application/octet-stream]
Saving to: ‘ntoskrnl.exe.10.0.22621.1344’

ntoskrnl.exe.10.0.22621.1344                       100%[===============================================================================================================>]  11.43M  2.47MB/s    in 5.5s    

2023-05-17 03:18:46 (2.08 MB/s) - ‘ntoskrnl.exe.10.0.22621.1344’ saved [11990400/11990400]

vscode ➜ /workspaces/ghidriff (main) $ wget https://msdl.microsoft.com/download/symbols/ntoskrnl.exe/17B6B7221047000/ntoskrnl.exe -O ntoskrnl.exe.10.0.22621.1413
--2023-05-17 03:18:58--  https://msdl.microsoft.com/download/symbols/ntoskrnl.exe/17B6B7221047000/ntoskrnl.exe
Resolving msdl.microsoft.com (msdl.microsoft.com)... 204.79.197.219
Connecting to msdl.microsoft.com (msdl.microsoft.com)|204.79.197.219|:443... connected.
HTTP request sent, awaiting response... 302 Found
Could not parse String-Transport-Security header
Location: https://vsblobprodscussu5shard75.blob.core.windows.net/b-4712e0edc5a240eabf23330d7df68e77/D946523F2726056CD289008C977D02C0C0FBBCBB89D9FA40ADBB42CDE8D5022A00.blob?sv=2019-07-07&sr=b&si=1&sig=KfYz9cB7cUPO9JVo0U8eIj0etpASEWOyvCv5NkwVkfw%3D&spr=https&se=2023-05-18T03%3A50%3A53Z&rscl=x-e2eid-4960dee3-47d94aa4-a2207913-b73825a4-session-2879fa10-75774ef4-93e39015-3be72abb [following]
--2023-05-17 03:18:59--  https://vsblobprodscussu5shard75.blob.core.windows.net/b-4712e0edc5a240eabf23330d7df68e77/D946523F2726056CD289008C977D02C0C0FBBCBB89D9FA40ADBB42CDE8D5022A00.blob?sv=2019-07-07&sr=b&si=1&sig=KfYz9cB7cUPO9JVo0U8eIj0etpASEWOyvCv5NkwVkfw%3D&spr=https&se=2023-05-18T03%3A50%3A53Z&rscl=x-e2eid-4960dee3-47d94aa4-a2207913-b73825a4-session-2879fa10-75774ef4-93e39015-3be72abb
Resolving vsblobprodscussu5shard75.blob.core.windows.net (vsblobprodscussu5shard75.blob.core.windows.net)... 20.209.34.36
Connecting to vsblobprodscussu5shard75.blob.core.windows.net (vsblobprodscussu5shard75.blob.core.windows.net)|20.209.34.36|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11990336 (11M) [application/octet-stream]
Saving to: ‘ntoskrnl.exe.10.0.22621.1413’

ntoskrnl.exe.10.0.22621.1413                       100%[===============================================================================================================>]  11.43M  1.02MB/s    in 12s     

2023-05-17 03:19:11 (1004 KB/s) - ‘ntoskrnl.exe.10.0.22621.1413’ saved [11990336/11990336]

Run ghidriff:

ghidriff ntoskrnl.exe.10.0.22621.1344 ntoskrnl.exe.10.0.22621.1413
Console Output
(.env) vscode ➜ /workspaces/ghidriff (main) $ ghidriff ntoskrnl.exe.10.0.22621.1344 ntoskrnl.exe.10.0.22621.1413
INFO | ghidriff | Init Ghidra Diff Engine...
INFO | ghidriff | Engine Console Log: INFO
INFO | ghidriff | Engine File Log:  .ghidriffs/ghidriff.log INFO
INFO | ghidriff | Starting Ghidra...
INFO  Using log config file: jar:file:/ghidra/Ghidra/Framework/Generic/lib/Generic.jar!/generic.log4j.xml (LoggingInitialization)  
INFO  Using log file: /workspaces/ghidriff/.ghidriffs/ghidriff.log (LoggingInitialization)  
INFO  Loading user preferences: /home/vscode/.ghidra/.ghidra_10.2.3_PUBLIC/preferences (Preferences)  
INFO  Class search complete (716 ms) (ClassSearcher)  
INFO  Initializing SSL Context (SSLContextInitializer)  
INFO  Initializing Random Number Generator... (SecureRandomFactory)  
INFO  Random Number Generator initialization complete: NativePRNGNonBlocking (SecureRandomFactory)  
INFO  Trust manager disabled, cacerts have not been set (ApplicationTrustManagerFactory)  
INFO | ghidriff | GHIDRA_INSTALL_DIR: /ghidra
INFO | ghidriff | GHIDRA 10.2.3  Build Date: 2023-Feb-08 1242 EST Release: PUBLIC
INFO | ghidriff | Engine Args:
INFO | ghidriff |       old:                ['ntoskrnl.exe.10.0.22621.1344']
INFO | ghidriff |       new:                [['ntoskrnl.exe.10.0.22621.1413']]
INFO | ghidriff |       engine:             VersionTrackingDiff
INFO | ghidriff |       output_path:        .ghidriffs
INFO | ghidriff |       summary:            False
INFO | ghidriff |       project_location:   .ghidra_projects
INFO | ghidriff |       project_name:       ghidriff
INFO | ghidriff |       symbols_path:       .symbols
INFO | ghidriff |       threaded:           True
INFO | ghidriff |       force_analysis:     False
INFO | ghidriff |       force_diff:         False
INFO | ghidriff |       no_symbols:         False
INFO | ghidriff |       log_level:          INFO
INFO | ghidriff |       file_log_level:     INFO
INFO | ghidriff |       log_path:           ghidriff.log
INFO | ghidriff |       va:                 False
INFO | ghidriff |       max_ram_percent:    60.0
INFO | ghidriff |       print_flags:        False
INFO | ghidriff |       jvm_args:           None
INFO | ghidriff |       side_by_side:       False
INFO | ghidriff |       max_section_funcs:  200
INFO | ghidriff |       md_title:           None
INFO | ghidriff | Setting Up Ghidra Project...
INFO  Creating project: /workspaces/ghidriff/.ghidra_projects/ghidriff-ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413/ghidriff-ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413 (DefaultProject)  
INFO | ghidriff | Created project: ghidriff-ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413
INFO | ghidriff | Project Location: /workspaces/ghidriff/.ghidra_projects/ghidriff-ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413
INFO | ghidriff | Importing ntoskrnl.exe.10.0.22621.1344
INFO  Starting cache cleanup: /tmp/vscode-Ghidra/fscache2 (FileCacheMaintenanceDaemon)  
INFO  Finished cache cleanup, estimated storage used: 0 (FileCacheMaintenanceDaemon)  
INFO  Using Loader: Portable Executable (PE) (AutoImporter)  
INFO | ghidriff | Importing ntoskrnl.exe.10.0.22621.1413
INFO  Using Loader: Portable Executable (PE) (AutoImporter)  
INFO | ghidriff | Setting up Symbol Server for symbols...
INFO | ghidriff | path: .symbols level: 1
INFO | ghidriff | Symbol Server Configured path: SymbolServerService:
        symbolStore: LocalSymbolStore: [ rootDir: /workspaces/ghidriff/.symbols, storageLevel: -1],
        symbolServers:
                HttpSymbolServer: [ url: https://msdl.microsoft.com/download/symbols/, storageLevel: -1]
                HttpSymbolServer: [ url: https://chromium-browser-symsrv.commondatastorage.googleapis.com/, storageLevel: -1]
                HttpSymbolServer: [ url: https://symbols.mozilla.org/, storageLevel: -1]
                HttpSymbolServer: [ url: https://software.intel.com/sites/downloads/symbols/, storageLevel: -1]
                HttpSymbolServer: [ url: https://driver-symbols.nvidia.com/, storageLevel: -1]
                HttpSymbolServer: [ url: https://download.amd.com/dir/bin/, storageLevel: -1]
INFO  Connecting to https://msdl.microsoft.com/download/symbols/ (ConsoleTaskMonitor)  
INFO  Success (ConsoleTaskMonitor)  
INFO  Storing ntkrnlmp.pdb in local symbol store (12.66MB) (ConsoleTaskMonitor)  
INFO | ghidriff | Pdb stored at: /workspaces/ghidriff/.symbols/ntkrnlmp.pdb/FB0913AF0585F234BD64A64A87C62DB11/ntkrnlmp.pdb
INFO  Connecting to https://msdl.microsoft.com/download/symbols/ (ConsoleTaskMonitor)  
INFO  Success (ConsoleTaskMonitor)  
INFO  Storing ntkrnlmp.pdb in local symbol store (12.66MB) (ConsoleTaskMonitor)  
INFO | ghidriff | Pdb stored at: /workspaces/ghidriff/.symbols/ntkrnlmp.pdb/797E613DB16DB6C0E57795A0CB03F4711/ntkrnlmp.pdb
INFO | ghidriff | Program: /ntoskrnl.exe.10.0.22621.1344 imported: True has_pdb: True pdb_loaded: False analyzed False
INFO | ghidriff | Program: /ntoskrnl.exe.10.0.22621.1413 imported: True has_pdb: True pdb_loaded: False analyzed False
INFO | ghidriff | Starting analysis for 2 binaries
INFO | ghidriff | Analyzing: ntoskrnl.exe.10.0.22621.1413 - .ProgramDB
INFO | ghidriff | Analyzing: ntoskrnl.exe.10.0.22621.1344 - .ProgramDB
WARNING| ghidriff | Turning off 'Shared Return Calls' for ntoskrnl.exe.10.0.22621.1344 - .ProgramDB
INFO | ghidriff | Starting Ghidra analysis of ntoskrnl.exe.10.0.22621.1344 - .ProgramDB...
INFO  PDB analyzer parsing file: /workspaces/ghidriff/.symbols/ntkrnlmp.pdb/FB0913AF0585F234BD64A64A87C62DB11/ntkrnlmp.pdb (PdbUniversalAnalyzer)  
WARNING| ghidriff | Turning off 'Shared Return Calls' for ntoskrnl.exe.10.0.22621.1413 - .ProgramDB
INFO | ghidriff | Starting Ghidra analysis of ntoskrnl.exe.10.0.22621.1413 - .ProgramDB...
INFO  PDB analyzer parsing file: /workspaces/ghidriff/.symbols/ntkrnlmp.pdb/797E613DB16DB6C0E57795A0CB03F4711/ntkrnlmp.pdb (PdbUniversalAnalyzer)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/<unnamed-tag_00001117> (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/<unnamed-tag_0000111B> (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/<unnamed-tag_0000111F> (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/_WMI_LOGGER_CONTEXT (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/_PPM_PLATFORM_STATE (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/_IOP_IRP_EXTENSION (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/_EX_HEAP_POOL_NODE (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/_BLOB (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/_MMPAGING_FILE (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/_MMCLONE_DESCRIPTOR (CppCompositeType)  
WARN  PDB STRUCTURE reconstruction failed to align /ntkrnlmp.pdb/_KUSER_SHARED_DATA (CppCompositeType)  
INFO  Resolve time: 1939 mS (DefaultPdbApplicator)  
INFO  resolveCount: 3644 (DefaultPdbApplicator)  
INFO  Resolve time: 1854 mS (DefaultPdbApplicator)  
INFO  resolveCount: 3644 (DefaultPdbApplicator)  
WARN  Decompiling 1402efc70, pcode error at 14000000c: Unable to resolve constructor at 14000000c (DecompileCallback)  
WARN  Decompiling 1402efc70, pcode error at 14000000c: Unable to resolve constructor at 14000000c (DecompileCallback)  
WARN  Decompiling 1402efc70, pcode error at 14000000c: Unable to resolve constructor at 14000000c (DecompileCallback)  
INFO  Packed database cache: /tmp/vscode-Ghidra/packed-db-cache (PackedDatabaseCache)  

 
INFO  -----------------------------------------------------
    ASCII Strings                              0.137 secs
    Apply Data Archives                        3.295 secs
    Call Convention ID                         2.037 secs
    Call-Fixup Installer                       0.998 secs
    Create Address Tables                      0.021 secs
    Create Address Tables - One Time           5.159 secs
    Create Function                            8.858 secs
    Data Reference                            17.246 secs
    Decompiler Switch Analysis               266.328 secs
    Demangler Microsoft                        3.514 secs
    Disassemble                                0.232 secs
    Disassemble Entry Points                  58.448 secs
    Disassemble Entry Points - One Time        1.802 secs
    Embedded Media                             0.123 secs
    External Entry References                  0.125 secs
    Function ID                              114.335 secs
    Function Start Search                      0.859 secs
    Non-Returning Functions - Discovered      25.594 secs
    Non-Returning Functions - Known            0.144 secs
    PDB Universal                            168.601 secs
    Reference                                  6.969 secs
    Scalar Operand References                 91.615 secs
    Shared Return Calls                        4.718 secs
    Stack                                    291.121 secs
    Subroutine References                     14.672 secs
    Subroutine References - One Time           0.027 secs
    Windows x86 PE Exception Handling          0.470 secs
    Windows x86 PE RTTI Analyzer               0.091 secs
    Windows x86 Thread Environment Block (TEB) Analyzer     0.115 secs
    WindowsResourceReference                   0.413 secs
    x86 Constant Reference Analyzer          261.728 secs
-----------------------------------------------------
     Total Time   1349 secs
-----------------------------------------------------
 (AutoAnalysisManager)  
INFO  -----------------------------------------------------
    ASCII Strings                              3.249 secs
    Apply Data Archives                        3.290 secs
    Call Convention ID                         1.984 secs
    Call-Fixup Installer                       0.947 secs
    Create Address Tables                      0.007 secs
    Create Address Tables - One Time           5.178 secs
    Create Function                            8.855 secs
    Data Reference                            17.320 secs
    Decompiler Switch Analysis               264.962 secs
    Demangler Microsoft                        3.649 secs
    Disassemble                                0.468 secs
    Disassemble Entry Points                  58.480 secs
    Disassemble Entry Points - One Time        1.805 secs
    Embedded Media                             0.102 secs
    External Entry References                  0.120 secs
    Function ID                              114.285 secs
    Function Start Search                      0.987 secs
    Non-Returning Functions - Discovered      25.826 secs
    Non-Returning Functions - Known            0.034 secs
    PDB Universal                            169.189 secs
    Reference                                  6.714 secs
    Scalar Operand References                 91.422 secs
    Shared Return Calls                        4.760 secs
    Stack                                    291.137 secs
    Subroutine References                     14.711 secs
    Subroutine References - One Time           0.018 secs
    Windows x86 PE Exception Handling          0.463 secs
    Windows x86 PE RTTI Analyzer               0.089 secs
    Windows x86 Thread Environment Block (TEB) Analyzer     0.119 secs
    WindowsResourceReference                   0.403 secs
    x86 Constant Reference Analyzer          262.810 secs
-----------------------------------------------------
     Total Time   1353 secs
-----------------------------------------------------
 (AutoAnalysisManager)  
INFO | ghidriff | Analysis for ghidriff-ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413:/ntoskrnl.exe.10.0.22621.1413 complete
INFO | ghidriff | Analysis for ghidriff-ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413:/ntoskrnl.exe.10.0.22621.1344 complete
INFO | ghidriff | Diffing bins: ntoskrnl.exe.10.0.22621.1344 - ntoskrnl.exe.10.0.22621.1413
INFO | ghidriff | Setup 16 decompliers
INFO | ghidriff | Loaded old program: ntoskrnl.exe.10.0.22621.1344
INFO | ghidriff | Loaded new program: ntoskrnl.exe.10.0.22621.1413
INFO | ghidriff | p1 sym count: reported: 244603 analyzed: 16772
INFO | ghidriff | p2 sym count: reported: 244606 analyzed: 16809
INFO | ghidriff | Found unmatched: 65 matched: 16758 symbols
INFO  Hashing symbols in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing symbols in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Eliminate non-unique matches (ConsoleTaskMonitor)  
INFO  Finding symbol matches (ConsoleTaskMonitor)  
INFO | ghidriff | Exec time: 2.1672 secs
INFO | ghidriff | Match count 54939
INFO | ghidriff | Counter({('SymbolsHash',): 27893})
INFO | ghidriff | Running correlator: ExactBytesFunctionHasher
INFO | ghidriff | name: ExactBytesFunctionHasher hasher: ghidra.app.plugin.match.ExactBytesFunctionHasher@7167d81b one_to_one: True one_to_many: False
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Finding function matches (ConsoleTaskMonitor)  
INFO | ghidriff | ExactBytesFunctionHasher Exec time: 0.8299 secs
INFO | ghidriff | Match count: 100
INFO | ghidriff | Counter({('SymbolsHash',): 27893, ('ExactBytesFunctionHasher',): 100})
INFO | ghidriff | Running correlator: ExactInstructionsFunctionHasher
INFO | ghidriff | name: ExactInstructionsFunctionHasher hasher: ghidra.app.plugin.match.ExactInstructionsFunctionHasher@3c9cfcde one_to_one: True one_to_many: False
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Finding function matches (ConsoleTaskMonitor)  
INFO | ghidriff | ExactInstructionsFunctionHasher Exec time: 0.4906 secs
INFO | ghidriff | Match count: 123
INFO | ghidriff | Counter({('SymbolsHash',): 27893, ('ExactInstructionsFunctionHasher',): 123, ('ExactBytesFunctionHasher',): 100})
INFO | ghidriff | Running correlator: StructuralGraphExactHash
INFO | ghidriff | name: StructuralGraphExactHash hasher: <jpype._jproxy.proxy.StructuralGraphExactHasher object at 0xffff26c53bf0> one_to_one: True one_to_many: False
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Finding function matches (ConsoleTaskMonitor)  
INFO | ghidriff | StructuralGraphExactHash Exec time: 1.3213 secs
INFO | ghidriff | Match count: 0
INFO | ghidriff | Counter({('SymbolsHash',): 27893, ('ExactInstructionsFunctionHasher',): 123, ('ExactBytesFunctionHasher',): 100})
INFO | ghidriff | Running correlator: ExactMnemonicsFunctionHasher
INFO | ghidriff | name: ExactMnemonicsFunctionHasher hasher: ghidra.app.plugin.match.ExactMnemonicsFunctionHasher@7533923b one_to_one: True one_to_many: False
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Finding function matches (ConsoleTaskMonitor)  
INFO | ghidriff | ExactMnemonicsFunctionHasher Exec time: 2.5697 secs
INFO | ghidriff | Match count: 0
INFO | ghidriff | Counter({('SymbolsHash',): 27893, ('ExactInstructionsFunctionHasher',): 123, ('ExactBytesFunctionHasher',): 100})
INFO | ghidriff | Running correlator: BulkInstructionHash
INFO | ghidriff | name: BulkInstructionHash hasher: <jpype._jproxy.proxy.BulkInstructionsHasher object at 0xffff26c53b50> one_to_one: True one_to_many: False
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Finding function matches (ConsoleTaskMonitor)  
INFO | ghidriff | BulkInstructionHash Exec time: 1.1462 secs
INFO | ghidriff | Match count: 2
INFO | ghidriff | Counter({('SymbolsHash',): 27893, ('ExactInstructionsFunctionHasher',): 123, ('ExactBytesFunctionHasher',): 100, ('BulkInstructionHash',): 2})
INFO | ghidriff | Running correlator: StructuralGraphHash
INFO | ghidriff | name: StructuralGraphHash hasher: <jpype._jproxy.proxy.StructuralGraphHasher object at 0xffff26c53ab0> one_to_one: True one_to_many: True
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Finding function matches (ConsoleTaskMonitor)  
INFO | ghidriff | StructuralGraphHash Exec time: 0.1894 secs
INFO | ghidriff | Match count: 693
INFO | ghidriff | Counter({('SymbolsHash',): 27893, ('StructuralGraphHash',): 693, ('ExactInstructionsFunctionHasher',): 123, ('ExactBytesFunctionHasher',): 100, ('BulkInstructionHash',): 2})
INFO | ghidriff | Running correlator: BulkBasicBlockMnemonicHash
INFO | ghidriff | name: BulkBasicBlockMnemonicHash hasher: <jpype._jproxy.proxy.BulkBasicBlockMnemonicHasher object at 0xffff26c53a10> one_to_one: True one_to_many: True
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Finding function matches (ConsoleTaskMonitor)  
INFO | ghidriff | BulkBasicBlockMnemonicHash Exec time: 0.1846 secs
INFO | ghidriff | Match count: 0
INFO | ghidriff | Counter({('SymbolsHash',): 27893, ('StructuralGraphHash',): 693, ('ExactInstructionsFunctionHasher',): 123, ('ExactBytesFunctionHasher',): 100, ('BulkInstructionHash',): 2})
INFO | ghidriff | Running correlator: SigCallingCalledHasher
INFO | ghidriff | name: SigCallingCalledHasher hasher: <jpype._jproxy.proxy.SigCallingCalledHasher object at 0xffff26c53970> one_to_one: True one_to_many: False
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1344 (ConsoleTaskMonitor)  
INFO  Hashing functions in ntoskrnl.exe.10.0.22621.1413 (ConsoleTaskMonitor)  
INFO  Finding function matches (ConsoleTaskMonitor)  
INFO | ghidriff | SigCallingCalledHasher Exec time: 0.1398 secs
INFO | ghidriff | Match count: 0
INFO | ghidriff | Counter({('SymbolsHash',): 27893, ('StructuralGraphHash',): 693, ('ExactInstructionsFunctionHasher',): 123, ('ExactBytesFunctionHasher',): 100, ('BulkInstructionHash',): 2})
INFO | ghidriff | p1 missing = 1
INFO | ghidriff | p2 missing = 1
INFO | ghidriff | Deduping symbols and functions...
INFO | ghidriff | Sorting symbols and strings...
INFO | ghidriff | Sorting functions...
INFO | ghidriff | Starting esym lookups for 71 symbols using 8 threads
INFO | ghidriff | Completed 4 at 5%
INFO | ghidriff | Completed 8 at 11%
INFO | ghidriff | Completed 12 at 16%
INFO | ghidriff | Completed 16 at 22%
INFO | ghidriff | Completed 20 at 28%
INFO | ghidriff | Completed 24 at 33%
INFO | ghidriff | Completed 28 at 39%
INFO | ghidriff | Completed 32 at 45%
INFO | ghidriff | Completed 36 at 50%
INFO | ghidriff | Completed 40 at 56%
INFO | ghidriff | Completed 44 at 61%
INFO | ghidriff | Completed 48 at 67%
INFO | ghidriff | Completed 52 at 73%
INFO | ghidriff | Completed 56 at 78%
INFO | ghidriff | Completed 60 at 84%
INFO | ghidriff | Completed 64 at 90%
INFO | ghidriff | Completed 68 at 95%
INFO | ghidriff | Finished diffing old program: ntoskrnl.exe.10.0.22621.1344
INFO | ghidriff | Finished diffing program: ntoskrnl.exe.10.0.22621.1413
INFO | ghidriff | {
  "added_funcs_len": 1,
  "deleted_funcs_len": 1,
  "modified_funcs_len": 11,
  "added_symbols_len": 12,
  "deleted_symbols_len": 8,
  "diff_time": 86.45361375808716,
  "deleted_strings_len": 6,
  "added_strings_len": 39,
  "match_types": {
    "SymbolsHash": 27893,
    "ExactBytesFunctionHasher": 100,
    "ExactInstructionsFunctionHasher": 123,
    "BulkInstructionHash": 2,
    "StructuralGraphHash": 693
  },
  "items_to_process": 33,
  "diff_types": {
    "code": 5,
    "length": 5,
    "refcount": 7,
    "calling": 6,
    "address": 6,
    "called": 3
  },
  "unmatched_funcs_len": 2,
  "total_funcs_len": 60664,
  "matched_funcs_len": 60662,
  "matched_funcs_with_code_changes_len": 5,
  "matched_funcs_with_non_code_changes_len": 6,
  "matched_funcs_no_changes_len": 60651,
  "match_func_similarity_percent": "99.9819%",
  "func_match_overall_percent": "99.9967%"
}
INFO | ghidriff | Writing md diff...
INFO | ghidriff | Generating markdown from {'added_funcs_len': 1, 'deleted_funcs_len': 1, 'modified_funcs_len': 11, 'added_symbols_len': 12, 'deleted_symbols_len': 8, 'diff_time': 86.45361375808716, 'deleted_strings_len': 6, 'added_strings_len': 39, 'match_types': Counter({'SymbolsHash': 27893, 'StructuralGraphHash': 693, 'ExactInstructionsFunctionHasher': 123, 'ExactBytesFunctionHasher': 100, 'BulkInstructionHash': 2}), 'items_to_process': 33, 'diff_types': Counter({'refcount': 7, 'calling': 6, 'address': 6, 'code': 5, 'length': 5, 'called': 3}), 'unmatched_funcs_len': 2, 'total_funcs_len': 60664, 'matched_funcs_len': 60662, 'matched_funcs_with_code_changes_len': 5, 'matched_funcs_with_non_code_changes_len': 6, 'matched_funcs_no_changes_len': 60651, 'match_func_similarity_percent': '99.9819%', 'func_match_overall_percent': '99.9967%'}
INFO | ghidriff | Known Command line: python ghidriff --project-location .ghidra_projects --project-name ghidriff --symbols-path .symbols --threaded --log-level INFO --file-log-level INFO --log-path ghidriff.log --max-ram-percent 60.0 --max-section-funcs 200 ntoskrnl.exe.10.0.22621.1344 ntoskrnl.exe.10.0.22621.1413
INFO | ghidriff | Extra Command line: --engine VersionTrackingDiff --output-path .ghidriffs
INFO | ghidriff | Writing pdiff json...
INFO | ghidriff | Wrote .ghidriffs/ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413_diff.md
INFO | ghidriff | Wrote .ghidriffs/json/ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413_diff.json

Analyze the Diff

Results in this beatiful markdown: ntoskrnl.exe.10.0.22621.1344-ntoskrnl.exe.10.0.22621.1413.diff.md

See if you can figure out what function was patched for CVE-2023-2342.

Prefer a side by side diff? Try out ghidriff's custom html viewer. https://diffpreview.github.io/?b95ae854a92ee917cd0b5c7055b60282

Results stored in ghidriffs folder
$ tree ghidriffs
ghidriffs
├── ghidra_projects
│   └── ghidriff-ntoskrnl.exe.10.0.22621.2215-ntoskrnl.exe.10.0.22621.2283
│       ├── ghidriff-ntoskrnl.exe.10.0.22621.2215-ntoskrnl.exe.10.0.22621.2283.gpr
│       └── ghidriff-ntoskrnl.exe.10.0.22621.2215-ntoskrnl.exe.10.0.22621.2283.rep
│           ├── idata
│           ├── project.prp
│           ├── user
│           └── versioned
├── ghidriff.log
├── json
│   └── ntoskrnl.exe.10.0.22621.2215-ntoskrnl.exe.10.0.22621.2283.ghidriff.json
├── ntoskrnl.exe.10.0.22621.2215-ntoskrnl.exe.10.0.22621.2283.ghidriff.md
└── symbols
    ├── ntkrnlmp.pdb
        ├── 69071F680ADFE36F178C6EC06E79E09C1
        │   └── ntkrnlmp.pdb
        └── 738ED8FF966E8502EFE17095B9F1F5481
            └── ntkrnlmp.pdb

Diffing CVE-2023-21768

Details of the CVE-2023-21768 (detailed in this blog post). What if you wanted to repeat this patch diff with ghidriff?

  1. Download two versions of AFD.sys (vulnerable and patched):
wget https://msdl.microsoft.com/download/symbols/afd.sys/0C5C6994A8000/afd.sys -O afd.sys.x64.10.0.22621.1028
wget https://msdl.microsoft.com/download/symbols/afd.sys/50989142A9000/afd.sys -O afd.sys.x64.10.0.22621.1415
  1. Run ghidriff:
ghidriff afd.sys.x64.10.0.22621.1028 afd.sys.x64.10.0.22621.1415
  1. Review results

The diff results are posted in this GitHub gist. The vulnerable function AfdNotifyRemoveIoCompletion was identified here with a single line change.

Want to see the entire diff in a side by side? https://diffpreview.github.io/?f6fecbc507a9f1a92c9231e3db7ef40d or jump to the single line change

Notes

Markdown Spec + MermaidJs

  • Striving to be compliant with GFM and cmark. Still working on it though. See issues.
  • MermaidJs requires your markdown renderer support.

ghidriff's People

Contributors

clearbluejar avatar l4ys avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

ghidriff's Issues

Ghidra 10.2 getSymbol Exception

Exception has occurred: java.lang.IllegalArgumentException
java.lang.IllegalArgumentException: namespace is from different program instance: Global

The above exception was the direct cause of the following exception:

File "/workspaces/winbindiff/ghidra_diff_engine/ghidra_diff_engine/simple_diff.py", line 115, in diff_bins
sym2 = p2.getSymbolTable().getSymbol(sym.getName(),sym.getAddress(),sym.getParentNamespace())
File "/workspaces/winbindiff/winbindiff.py", line 99, in
pdiff = diff_engine.diff_bins(diff[0], diff[1])

Perhaps fix with this: https://github.com/NationalSecurityAgency/ghidra/blob/47f76c78d6b7d5c56a9256b0666620863805ff30/Ghidra/Features/Base/src/main/java/ghidra/program/util/DiffUtility.java#L300-L310

java.lang.IllegalArgumentException: Attempted to acquire the domain object ...

Not sure what this is. Run on Mac M1.

  File "/opt/homebrew/lib/python3.10/site-packages/ghidriff/__main__.py", line 81, in main
    d.setup_project(binary_paths, args.project_location, project_name, args.symbols_path)
  File "/opt/homebrew/lib/python3.10/site-packages/ghidriff/ghidra_diff_engine.py", line 423, in setup_project
    program = self.project.openProgram("/", program_path.name, False)
java.lang.java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Attempted to acquire the domain object more than once by the same consumer: ghidra.base.project.GhidraProject@615e7fe7

It seems to only occur when the bins are not in the same file path. If they are in the same directory, then this does not occur.

Precision on markdown format

Having tried two markdown implementations that both did not even recognize the mermaid parts as figures (let alone would render them), I'm slowly zeroing down on cmark --extension table (or cmark-gfm?) (which still doesn't render mermaid but that could be a plugin issue).

Please consider adding a note in the README about which flavor of markdown the files are to make this easier to use.

Python 3.12 JPype1 installation error

There is an issue with python 3.12 not correctly installing Ghidriff.

The issue lies with JPype1 not being correctly installed using PIP.

Changing to python 3.10 works for. Perhaps add this into the README section, that the user should not use python 3.12 (maybe 3.11 too).

Within the setup.cfg file there is no mention of any version above 3.10:

classifiers =
    Development Status :: 3 - Alpha
    Intended Audience :: Developers
    License :: OSI Approved :: GNU General Public License v3 (GPLv3)
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
    Programming Language :: Python :: 3.8
    Programming Language :: Python :: 3.9
    Programming Language :: Python :: 3.10

investige decompiler options

Ghidra/Features/Decompiler/src/decompile/cpp/ghidra_process.cc

void SetAction::rawAction(void)

{
  res = false;
  
  if (actionstring.size() != 0)
    ghidra->allacts.setCurrent(actionstring);
  if (printstring.size() != 0) {
    if (printstring == "tree")
      ghidra->setSendSyntaxTree(true);
    else if (printstring == "notree")
      ghidra->setSendSyntaxTree(false);
    else if (printstring == "c")
      ghidra->setSendCCode(true);
    else if (printstring == "noc")
      ghidra->setSendCCode(false);
    else if (printstring == "parammeasures")
      ghidra->setSendParamMeasures(true);
    else if (printstring == "noparammeasures")
      ghidra->setSendParamMeasures(false);
    else if (printstring == "jumpload")
      ghidra->flowoptions |= FlowInfo::record_jumploads;
    else if (printstring == "nojumpload")
      ghidra->flowoptions &= ~((uint4)FlowInfo::record_jumploads);
    else
      throw LowlevelError("Unknown print action: "+printstring);
  }
  res = true;
}

syntaxtree needed for decomp?

Code Refs counts sometimes fail in Ghidra

--- FaxRouteSetRoutingInfo called
+++ FaxRouteSetRoutingInfo called
@@ -1,10 +1,10 @@
-CDeviceRoutingInfo::EnableEmail-1
-CDeviceRoutingInfo::EnablePrint-1
-CDeviceRoutingInfo::EnableStore-1
-CDeviceRoutingInfo::SetStringValue-3
-CDevicesMap::GetDeviceRoutingInfo-1
-GetMaskBit-1
-KERNEL32.DLL::SetLastError-1
-WPP_SF_-1
-WPP_SF_S-1
-WPP_SF_l-1
+CDeviceRoutingInfo::EnableEmail
+CDeviceRoutingInfo::EnablePrint
+CDeviceRoutingInfo::EnableStore
+CDeviceRoutingInfo::SetStringValue
+CDevicesMap::GetDeviceRoutingInfo
+GetMaskBit
+KERNEL32.DLL::SetLastError
+WPP_SF_
+WPP_SF_S
+WPP_SF_l

handle large diffs

Want to be able to handle creating markdown files with diffs with several 1000 items to process. Will likely implement an adjustable limit.

Handle two diff bin files with the same name

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/ghidriff/.env/bin/ghidriff", line 33, in <module>
    sys.exit(load_entry_point('ghidriff', 'console_scripts', 'ghidriff')())
  File "/workspaces/ghidriff/ghidriff/__main__.py", line 81, in main
    d.setup_project(binary_paths, args.project_location, project_name, args.symbols_path)
  File "/workspaces/ghidriff/ghidriff/ghidra_diff_engine.py", line 423, in setup_project
    program = self.project.openProgram("/", program_path.name, False)
java.lang.java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Attempted to acquire the domain object more than once by the same consumer: ghidra.base.project.GhidraProject@3c2188f

Add decompilation correlator

Some cases, the decomp matching is a good indicator.

Only use this after all other correlators have been used.

Bug: 10.2 Default Analysis change causes issue for large binaries

Since the change in ghidra 10.2:

before

.winbindiffs/ntoskrnl.exe-22621/ntoskrnl.exe.x64.10.0.22621.382-ntoskrnl.exe.x64.10.0.22621.525.json:        "diff_time": 576.9418759346008
.winbindiffs/ntoskrnl.exe-22621/ntoskrnl.exe.x64.10.0.22621.382-ntoskrnl.exe.x64.10.0.22621.674.json:        "diff_time": 1728.207862854004
.winbindiffs/ntoskrnl.exe-22621/ntoskrnl.exe.x64.10.0.22621.525-ntoskrnl.exe.x64.10.0.22621.608.json:        "diff_time": 1449.8174550533295
.winbindiffs/ntoskrnl.exe-22621/ntoskrnl.exe.x64.10.0.22621.608-ntoskrnl.exe.x64.10.0.22621.674.json:        "diff_time": 189.17932105064392
.winbindiffs/ntoskrnl.exe-22621-x64/ntoskrnl.exe.x64.10.0.22621.382-ntoskrnl.exe.x64.10.0.22621.525.json:        "diff_time": 648.3320813179016

after

.winbindiffs/ntoskrnl.exe-22621-x64-10.2/ntoskrnl.exe.x64.10.0.22621.382-ntoskrnl.exe.x64.10.0.22621.525.json:        "diff_time": 6480.4214379787445
.winbindiffs/ntoskrnl.exe-22621-x64-10.2/ntoskrnl.exe.x64.10.0.22621.382-ntoskrnl.exe.x64.10.0.22621.819.json:        "diff_time": 12475.87869977951
.winbindiffs/ntoskrnl.exe-22621-x64-10.2/ntoskrnl.exe.x64.10.0.22621.525-ntoskrnl.exe.x64.10.0.22621.608.json:        "diff_time": 6663.064095497131
.winbindiffs/ntoskrnl.exe-22621-x64-10.2/ntoskrnl.exe.x64.10.0.22621.608-ntoskrnl.exe.x64.10.0.22621.674.json:        "diff_time": 3357.6645364761353
.winbindiffs/ntoskrnl.exe-22621-x64-10.2/ntoskrnl.exe.x64.10.0.22621.674-ntoskrnl.exe.x64.10.0.22621.675.json:        "diff_time": 118.16926455497742
.winbindiffs/ntoskrnl.exe-22621-x64-10.2/ntoskrnl.exe.x64.10.0.22621.675-ntoskrnl.exe.x64.10.0.22621.755.json:        "diff_time": 3409.6262555122375
.winbindiffs/ntoskrnl.exe-22621-x64-10.2/ntoskrnl.exe.x64.10.0.22621.755-ntoskrnl.exe.x64.10.0.22621.819.json:        "diff_time": 9517.311124563217

add option to force analysis without using symbols

            # clear current PDB loaded Ghidra/Features/PDB/src/main/java/ghidra/app/util/pdb/PdbProgramAttributes.java#L132
            from ghidra.app.util.bin.format.pdb import PdbParserConstants
            self.set_proginfo_option_bool(program_to_fix, PdbParserConstants.PDB_LOADED, False)

            self.analyze_program(program_to_fix, require_symbols=True, force_analysis=True)

            from ghidra.app.util.bin.format.pdb import PdbParserConstants
            self.set_proginfo_option_bool(p1, PdbParserConstants.PDB_LOADED, False)
            self.analyze_program(p1, require_symbols=True, force_analysis=True)

            self.set_proginfo_option_bool(p2, PdbParserConstants.PDB_LOADED, False)
            self.analyze_program(p2, require_symbols=True, force_analysis=True)

Add md-index

  • Koka hash, update structural graph to actual md-index
  • develop method to test hashes

Improve pytest suite

  • integration tests
  • diff known cases and check results
  • test correlators to find matches
  • test correlator combinations
  • test across different version of Ghidra. ( will need to provide docker images for each version needed to test

'ghidra.program.util.GhidraProgramUtilities' has no attribute 'setAnalyzedFlag'.

Not sure what this is. Mac M1.

Run with no flags, just ghidriff <old> <new>

Traceback (most recent call last):
  File "/opt/homebrew/bin/ghidriff", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.10/site-packages/ghidriff/__main__.py", line 83, in main
    d.analyze_project()
  File "/opt/homebrew/lib/python3.10/site-packages/ghidriff/ghidra_diff_engine.py", line 757, in analyze_project
    prog = future.result()
  File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/ghidriff/ghidra_diff_engine.py", line 733, in analyze_program
    GhidraProgramUtilities.setAnalyzedFlag(program, True)
AttributeError: type object 'ghidra.program.util.GhidraProgramUtilities' has no attribute 'setAnalyzedFlag'. Did you mean: 'resetAnalysisFlags'?

handle 1000s of unmatched functions

"with a further 100,000 it could not correlate...."

This is mitigated somewhat by having a function limit (defaulted to 200), but want to handle this case generally.

Ability to diff already existing databases

In some cases it is much more useful to diff two pre-existing databases. For example, when ghidra's auto-analysis does not pick up certain functions especially when they are referred to using pointers and I typically have to go through the database and fix that by manually defining them functions. If not fixed, the diff will miss those pieces of code, of course.

If that is already supported, please let me know.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.