pe-parse

pe-parse is a principled, lightweight parser for Windows portable executable (PE) files. It was created to assist in the analysis of compiled programs, potentially of unknown origin. This means that it should be resistant to malformed or maliciously crafted PE files, and it should support the questions that analysis software would ask of an executable program container: for example, listing relocations, describing imports and exports, and reading bytes from virtual addresses as well as file offsets.

pe-parse supports these use cases via a minimal API that provides methods for

  • Opening and closing a PE file
  • Iterating over the imported functions
  • Iterating over the relocations
  • Iterating over the exported functions
  • Iterating over sections
  • Iterating over resources
  • Reading bytes from specified virtual addresses
  • Retrieving the program entry point

The interface is defined in parser-library/parse.h.

The program in dump-pe/main.cpp is an example of using the parser-library API to dump information about a PE file.

Internally, the parser-library uses a bounded buffer abstraction to access information stored in the PE file. This should help in constructing a sane parser that allows for detection of the use of bogus values in the PE that would result in out of bounds accesses of the input buffer. Once data is read from the file it is sanitized and placed in C++ STL containers of internal types.
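The bounded-buffer idea described above can be illustrated with a minimal sketch. The names BoundedView, readWordChecked and splitView are hypothetical and are not the library's actual types or functions; the point is only that every read and every sub-range is checked against the buffer length before any byte is touched.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a bounded buffer: a pointer plus a hard length.
struct BoundedView {
  const std::uint8_t *data;
  std::size_t len;
};

// Reads a little-endian 16-bit value at `offset`.
// Returns false instead of reading past the end of the view.
bool readWordChecked(const BoundedView &b, std::size_t offset,
                     std::uint16_t &out) {
  // Phrased as a subtraction so that `offset + 2` can never wrap around.
  if (b.len < 2 || offset > b.len - 2) {
    return false;
  }
  out = static_cast<std::uint16_t>(b.data[offset] | (b.data[offset + 1] << 8));
  return true;
}

// Carves the sub-view [from, to) out of `parent`, rejecting bogus ranges
// such as a section whose claimed end lies beyond the end of the file.
bool splitView(const BoundedView &parent, std::size_t from, std::size_t to,
               BoundedView &out) {
  if (from > to || to > parent.len) {
    return false;
  }
  out.data = parent.data + from;
  out.len = to - from;
  return true;
}
```

With this shape, a bogus offset taken from a malformed header shows up as a false return value that the caller has to handle, rather than as an out-of-bounds access.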

Installation

pe-parse can be installed via vcpkg:

$ vcpkg install pe-parse

pe-parse includes Python bindings via pepy, which can be installed via pip:

$ pip3 install pepy

More information about pepy can be found in its README.

Dependencies

CMake

  • Debian/Ubuntu: sudo apt-get install cmake
  • RedHat/Fedora: sudo yum install cmake
  • OSX: brew install cmake
  • Windows: Download the installer from the CMake page

Building

Generic instructions

git clone https://github.com/trailofbits/pe-parse.git
cd pe-parse

mkdir build
cd build

cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build .

# optional
cmake --build . --target install

Windows-specific

VS 2017 and VS 2019 are supported.

# Compile 64-bit binaries with Visual Studio 2017
cmake -G "Visual Studio 15 2017 Win64" ..

# Or, with VS 2019, use the -A flag to select the target platform
cmake -G "Visual Studio 16 2019" -A x64 ..

# Pass the build type at build time
cmake --build . --config Release

Testing

You can build the (catch2-based) tests by adding -DPEPARSE_ENABLE_TESTING=ON during CMake configuration. Build, and then run with ctest or cmake --build . --target test.

To run the full test suite with the Corkami test suite, you must clone the submodule with git submodule update --init.

Examples

You can build the included examples by adding -DPEPARSE_ENABLE_EXAMPLES=ON during CMake configuration.

Building with Sanitizers

If you are familiar with C++ sanitizers and any specific development environment requirements for them (compiler, instrumented standard library, etc.), you can choose to compile with any of the following sanitizers: Address, HWAddress, Undefined, Memory, MemoryWithOrigins, Leak, Address,Undefined.

For example, to compile with both Address and Undefined sanitizers, use the following (recommended for development and testing, and tested in CI):

mkdir build-san
cd build-san

cmake -DCMAKE_BUILD_TYPE=Debug -DPEPARSE_ENABLE_TESTING=ON -DPEPARSE_USE_SANITIZER=Address,Undefined ..
cmake --build .

Using the library

Once the library is installed, linking to it is easy! Add the following lines in your CMake project:

find_package(pe-parse REQUIRED)

target_link_libraries(your_target_name PRIVATE pe-parse::pe-parse)

You can see a full example in the examples/peaddrconv folder.

Authors

pe-parse was designed and implemented by Andrew Ruef, with significant contributions from Wesley Shields.

pe-parse is currently maintained by Eric Kilmer and William Woodruff.

pe-parse's People

Contributors

alessandrogario, armbues, artemdinaburg, awruef, bostick, cvspvr, dcnick3, dependabot[bot], dguido, ekilmer, gaasedelen, gsauthof, hobo-ru, jkolek, jonaski, lakor64, lanyizi, mike-myers-tob, noxwizard, passthecilantro, pgoodman, postmodern, reaperhulk, ret2libc, slashvar, stefangs0x90, tonytheodore, woodruffw, wxsbsd, yardenshafir

pe-parse's Issues

out of memory error

hey,

I was trying to read a large executable (300 MB), and it gives me an out of memory error, plus

readFileToFileBuffer:215

from pe error location

can you look into it?

awesome work though

Corkami PE Testing - Known Failure Fixes

Within #145, there are very simple tests to detect whether pe-parse would correctly identify the executables as PE, without erroring. Unfortunately (but not unexpectedly), there are a few executables that are not parsed correctly.

Ideally, we should at least test and enforce that we support parsing of any PE in the Corkami dataset.

  • There are no test exceptions (read: known failures) when processing the Corkami dataset of PEs

Reference to known failing tests:

static const std::unordered_set<std::string> kKnownPEFailure{
"virtsectblXP.exe", "maxsec_lowaligW7.exe",
"maxsecXP.exe", "nullSOH-XP.exe",
"tinyXP.exe", "tinydllXP.dll",
"virtrelocXP.exe", "foldedhdrW7.exe",
"maxvals.exe", "d_nonnull.dll",
"reloccrypt.exe", "d_resource.dll",
"fakerelocs.exe", "lfanew_relocW7.exe",
"bigSoRD.exe", "tinyW7.exe",
"reloccryptW8.exe", "standard.exe",
"exe2pe.exe", "tinygui.exe",
"dllfwloop.dll", "tinydrivXP.sys",
"tiny.exe", "tinydll.dll",
"foldedhdr.exe", "dllmaxvals.dll",
"reloccryptXP.exe", "dosZMXP.exe",
"tinyW7_3264.exe", "dllfw.dll",
"hdrcode.exe", "ibrelocW7.exe",
"d_tiny.dll", "sc.exe"};

Secondly, a much bigger task would be to confirm that pe-parse is correctly parsing all and only the information that the Corkami PEs claim to exhibit.

  • There are specific PE field tests/asserts for the Corkami PEs to ensure correctness

How to Start Investigating

First, run git submodule update --init to pull the Corkami dataset (We will be focusing on the PEs here https://github.com/corkami/pocs/tree/master/PE/bin).

Then, running the standalone dump-pe tool that is included in this repo should be an easy way to iterate on code changes, since the testing logic is basically the same.

$ ./build/dump-pe/dump-pe tests/assets/corkami-poc-dataset/PE/bin/virtsectblXP.exe
Error: 3 (Invalid section)
Location: ParsePEFromBuffer:2394

Use that information as a starting point for where to begin debugging. Moreover, most, if not all, of the PEs have a corresponding asm file that provides the source code for building the PE and how the file is constructed. Use this information to gain a better understanding of why pe-parse is having difficulty parsing it and what kind of fix would be needed. Here it is for our example https://github.com/corkami/pocs/blob/master/PE/virtsectblXP.asm

Segmentation fault in get_bytes

from test.py:
byts = p.get_bytes(ep, 8)

I got a segmentation fault in this function.
Here is my output:

[lxu@static1 python]$ python test.py /home/nfs/lxu/data/PE_binary/00001AA4444F539C1B63C4B4BECE8BA374757FEC49C17D218B6339EFD3EAA491
Magic: 0x10b
Signature: 0x4550
Machine: 0x14c
Number of sections: 3
Number of symbols: 0
Characteristics: 0x210e
Timedatestamp: 2006-12-26 06:48:21
Major linker version: 0x6
Minor linker version: 0x0
Size of code: 0xc7000
Size of initialized data: 0x2000
Size of uninitialized data: 0x0
Address of entry point: 0xc8fbe
Base address of code: 0x2000
Base address of data: 0xca000
Image base address: 0x45d60000
Section alignment: 0x2000
File alignment: 0x1000
Major OS version: 0x4
Minor OS version: 0x0
Win32 version: 0x0
Size of image: 0xce000
Size of headers: 0x1000
Checksum: 0xcde81
Subsystem: 0x3
DLL characteristics: 0x400
Size of stack reserve: 0x100000
Size of stack commit: 0x1000
Size of heap reserve: 0x100000
Size of heap commit: 0x1000
Loader flags: 0x0
Number of RVA and sizes: 0x10
Get entry point
Get bytes
Segmentation fault (core dumped)

dump-pe gives error when parsing CLI options

First found when trying:

$ dump-pe --version
Error: 7 (Unable to open)
Location: readFileToFileBuffer:217

Also happens with dump-pe --something.

These lines aren't processing the arguments correctly:

pe-parse/dump-pe/main.cpp

Lines 294 to 302 in 05676c1

int main(int argc, char *argv[]) {
  if (argc != 2 || (argc == 2 && std::strcmp(argv[1], "--help") == 0)) {
    std::cout << "dump-pe utility from Trail of Bits\n";
    std::cout << "Repository: https://github.com/trailofbits/pe-parse\n\n";
    std::cout << "Usage:\n\tdump-pe /path/to/executable.exe\n";
    return 1;
  }
  parsed_pe *p = ParsePEFromFile(argv[1]);

Side-note: It would be nice to have --version print the version of the pe-parse-library.
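One possible shape for a fix is to classify the single argument before treating it as a path. This is only a sketch: CliAction and classifyArgs are hypothetical names, and the --version branch reflects the side-note above rather than anything dump-pe is known to implement.

```cpp
#include <cstring>

// Hypothetical outcome of inspecting the command line.
enum class CliAction { ShowHelp, ShowVersion, BadOption, ParseFile };

// Decides what to do before any attempt is made to open argv[1] as a file.
CliAction classifyArgs(int argc, const char *const argv[]) {
  if (argc != 2 || std::strcmp(argv[1], "--help") == 0) {
    return CliAction::ShowHelp;
  }
  if (std::strcmp(argv[1], "--version") == 0) {
    return CliAction::ShowVersion;
  }
  // Any other argument starting with "--" is an unknown option, not a path.
  if (std::strncmp(argv[1], "--", 2) == 0) {
    return CliAction::BadOption;
  }
  return CliAction::ParseFile;
}
```

Only the ParseFile case would go on to call ParsePEFromFile(argv[1]); the other cases would print usage or version text and return. A file whose name really starts with "--" would then need to be passed as ./--name, which is the usual trade-off for this style of option handling.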

Calculate Rich header checksum

The library should calculate the checksum (decryption key) used in the Rich header. Comparing this checksum to the Rich decryption key (the DWORD following the "Rich" signature) indicates whether the DOS header, the DOS stub, or the Rich header has been modified.
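As a rough sketch of such a calculation, the following follows the algorithm as it is commonly described in public write-ups of the (undocumented) Rich header; it is an assumption, not something the library or Microsoft documents. The names rol32 and richChecksum are hypothetical, richOffset is assumed to be the file offset where the Rich header starts, and the entries are assumed to be already decoded (comp.id, count) pairs.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Rotates a 32-bit value left; the count is reduced modulo 32.
static std::uint32_t rol32(std::uint32_t v, std::uint32_t n) {
  n &= 31u;
  return n == 0 ? v : (v << n) | (v >> (32u - n));
}

// image:      bytes from the start of the file (DOS header and stub)
// richOffset: assumed file offset at which the Rich header begins
// entries:    decoded (comp.id, count) pairs from the Rich header
std::uint32_t richChecksum(
    const std::vector<std::uint8_t> &image, std::uint32_t richOffset,
    const std::vector<std::pair<std::uint32_t, std::uint32_t> > &entries) {
  std::uint32_t sum = richOffset;
  for (std::uint32_t i = 0; i < richOffset && i < image.size(); ++i) {
    // The four bytes of e_lfanew (0x3C..0x3F) are treated as zero.
    if (i >= 0x3C && i < 0x40) {
      continue;
    }
    sum += rol32(image[i], i);
  }
  for (std::size_t k = 0; k < entries.size(); ++k) {
    sum += rol32(entries[k].first, entries[k].second);
  }
  return sum;
}
```

A caller would compare the returned value with the DWORD stored after the "Rich" signature and treat a mismatch as a hint that one of the covered regions was edited after linking.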

dump-pe shows all PE32/PE32+ 64 bits

hello
From the README documentation, I know that VS2017 and VS2019 are supported. I compiled with VS2015. I tested with dump-pe.exe and found that the PE files (32-bit and 64-bit) all showed up as 64-bit. Is this due to the VS2015 compiler? Thanks!

Is boost really necessary?

Looking at the source code, I fail to see where the need for the boost dependency is.
For instance, boost/cstdint.hpp could be literally replaced by just cstdint, which is the C++ header of stdint.h.

As for boost::to_upper, it could be replaced by std::transform(str.begin(), str.end(),str.begin(), ::toupper), which would have the exact same effect. (Also, in C++, stdlib functions are required by the standard to be actual functions, and not just macros like it is the case in C).

Removing boost, unless it's really, absolutely necessary, would probably be a good idea.
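A minimal sketch of the standard-library replacement suggested above could look like this. The helper name toUpperAscii is hypothetical; the lambda casts through unsigned char because passing a negative char value to std::toupper is undefined behaviour.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns an upper-cased copy of `s` using only the standard library.
std::string toUpperAscii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    // std::toupper works on int; convert back to char for storage.
    return static_cast<char>(std::toupper(c));
  });
  return s;
}
```

For example, toUpperAscii("kernel32.dll") yields "KERNEL32.DLL" in the default "C" locale, and the fixed-width integer types come from <cstdint> with no Boost header involved.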

pe export table calculate wrong

I used this lib to test some PE files and compared the results with Python's pefile.

The lib gives wrong results when I input the Windows XP SP3 "ntoskrnl.exe" (x86 arch).

The export table function entry addresses are wrong.
The function count is different from what IDA Pro shows.

Thanks to the author for giving us a cross-platform library;
it works on win32 and win64 as well as Unix-like systems.

cmake failed with version 3.4.3

In CMakeLists.txt, line 20 if (${CMAKE_VERSION} VERSION_GREATER_EQUAL 3.4) breaks with cmake 3.4.3. It seems VERSION_GREATER_EQUAL is newly added in cmake 3.7

Easy to fix "possible loss of data" warnings

You have 17 (or so) warnings like that, which is not really acceptable.
(Absolutely great work otherwise!)

They all stem from incorrect work with 32-bit vs 64-bit targets, different pointer sizes.

The problem is easily solved by using the standard cross-platform types intptr_t / uintptr_t.
Will you fix it please?

Thank you.
—SA

Endianness

There are TODO comments for readWord, readDword and readQword in buffer.cpp:
// TODO: perform endian swap as needed

I would like to implement this, if that's fine.
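One way to approach this without an explicit swap is to assemble each value from individual bytes, which gives the same result on little-endian and big-endian hosts. This is a sketch only: loadLE16, loadLE32 and loadLE64 are hypothetical helpers, not the readWord, readDword or readQword functions from buffer.cpp.

```cpp
#include <cstdint>

// PE fields are stored little-endian. Building the value from single bytes
// avoids any dependence on the byte order of the machine running the parser.
std::uint16_t loadLE16(const std::uint8_t *p) {
  return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

std::uint32_t loadLE32(const std::uint8_t *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

std::uint64_t loadLE64(const std::uint8_t *p) {
  // Low dword first, then the high dword shifted into place.
  return static_cast<std::uint64_t>(loadLE32(p)) |
         (static_cast<std::uint64_t>(loadLE32(p + 4)) << 32);
}
```

The callers are still responsible for checking that enough bytes remain in the buffer before calling these helpers.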

Populate Rich header product names

The Rich header parser contains a map of build numbers to product names that needs to be populated. See peparse::ProductMap.

Each Rich header entry is made up of a product id, a build number, and a count of the number of objects of that type. The build number can be used to almost uniquely identify the product name and version used to create that object.

Useful references of mapping information:
http://bytepointer.com/articles/the_microsoft_rich_header.htm
https://github.com/dishather/richprint/blob/master/comp_id.txt
https://walbourn.github.io
https://dev.to/yumetodo/list-of-mscver-and-mscfullver-8nd
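To make the entry layout described above concrete, here is a small decoding sketch. It assumes the layout given in public write-ups such as the references listed: two DWORDs per entry, both XORed with the key, where the first holds the product id in its high word and the build number in its low word, and the second holds the count. RichEntry and decodeRichEntry are hypothetical names, not the library's.

```cpp
#include <cstdint>

// Hypothetical decoded form of one Rich header entry.
struct RichEntry {
  std::uint16_t productId;
  std::uint16_t buildNumber;
  std::uint32_t count;
};

// rawCompId and rawCount are the two DWORDs as stored in the file;
// key is the DWORD that follows the "Rich" signature.
RichEntry decodeRichEntry(std::uint32_t rawCompId, std::uint32_t rawCount,
                          std::uint32_t key) {
  const std::uint32_t compId = rawCompId ^ key;
  RichEntry e;
  e.productId = static_cast<std::uint16_t>(compId >> 16);
  e.buildNumber = static_cast<std::uint16_t>(compId & 0xFFFFu);
  e.count = rawCount ^ key;
  return e;
}
```

The decoded buildNumber (optionally together with productId) is what a populated product map would be keyed on.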

GetDataDirectoryEntry returns an incorrect buffer for DIR_SECURITY

GetDataDirectoryEntry currently assumes that each data directory vector entry's VirtualAddress corresponds to a (materializable) RVA. This is true for all known entries except DIR_SECURITY, which is a special case. Per MSDN:

The Certificate Table entry points to a table of attribute certificates. These certificates are not loaded into memory as part of the image. As such, the first field of this entry, which is normally an RVA, is a file pointer instead.

GetDataDirectoryEntry needs to special-case DIR_SECURITY and grab the buffer by file offset instead of by section + RVA lookup.
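The special case could be sketched roughly as follows. SectionRange and dataDirFileOffset are hypothetical names used for illustration and do not mirror the library's internals; the index 4 is the position of the Certificate Table in the PE data directory array. The RVA translation is deliberately simplified and ignores raw-size limits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a section header.
struct SectionRange {
  std::uint32_t virtualAddress;
  std::uint32_t virtualSize;
  std::uint32_t pointerToRawData;
};

// Index of the Certificate Table (security) entry in the data directory.
const std::uint32_t kDirSecurity = 4;

// Resolves a data directory's address field to a file offset.
// Returns false when an RVA does not fall inside any section.
bool dataDirFileOffset(std::uint32_t dirIndex, std::uint32_t address,
                       const std::vector<SectionRange> &secs,
                       std::uint32_t &fileOffset) {
  if (dirIndex == kDirSecurity) {
    // For this one entry the field is already a file pointer, not an RVA.
    fileOffset = address;
    return true;
  }
  for (std::size_t i = 0; i < secs.size(); ++i) {
    const SectionRange &s = secs[i];
    if (address >= s.virtualAddress &&
        address - s.virtualAddress < s.virtualSize) {
      fileOffset = s.pointerToRawData + (address - s.virtualAddress);
      return true;
    }
  }
  return false;
}
```

The resulting offset and the directory's Size field would still need to be bounds-checked against the file buffer before any bytes are read.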

Parse the Rich header

Most PEs (always if linked with link.exe, and potentially with some other linkers) contain an undocumented structure (unofficially termed the "Rich header") that contains compilation and compiler toolchain information (number of built objects, number of source files, toolchain versions, etc). It's obfuscated with a basic XOR (the key is the DWORD that follows the "Rich" signature).

Malware authors frequently modify this structure (or forget to modify it, leading to mismatches), so exposing it through pe-parse could be useful.

Resources:

Segmentation fault parsing notepad.exe

I'm trying to parse notepad.exe which is PE32+.

$ file executables/notepad.exe
executables/notepad.exe: PE32+ executable (GUI) x86-64, for MS Windows

Using the following code:

#include <...>
#include <...>
#include "parse.h"

using namespace std;

int main(void)
{
    parsed_pe *p = ParsePEFromFile("executables/notepad.exe");
    return 0;
}

gdb said:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f67be8f034d in std::string::push_back(char) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6

Compilation errors on macOS 10.12.4

I tried compiling pe-parse on macOS and ran into the following errors. What am I missing?

$ make
Scanning dependencies of target pe-parser-library
[ 20%] Building CXX object parser-library/CMakeFiles/pe-parser-library.dir/buffer.cpp.o
In file included from /Users/dan/github/pe-parse/parser-library/buffer.cpp:25:
In file included from /Users/dan/github/pe-parse/parser-library/parse.h:30:
/Users/dan/github/pe-parse/parser-library/nt-headers.h:35:1: error: unknown type
      name 'constexpr'
constexpr std::uint16_t MZ_MAGIC = 0x5A4D;
^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:35:16: error: cannot
      define or redeclare 'uint16_t' here because namespace 'peparse' does not
      enclose namespace 'std'
constexpr std::uint16_t MZ_MAGIC = 0x5A4D;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:35:16: error: no member
      named 'uint16_t' in namespace 'std'
constexpr std::uint16_t MZ_MAGIC = 0x5A4D;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:35:24: error: expected
      ';' after top level declarator
constexpr std::uint16_t MZ_MAGIC = 0x5A4D;
                       ^
                       ;
/Users/dan/github/pe-parse/parser-library/nt-headers.h:36:1: error: unknown type
      name 'constexpr'
constexpr std::uint32_t NT_MAGIC = 0x00004550;
^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:36:16: error: cannot
      define or redeclare 'uint32_t' here because namespace 'peparse' does not
      enclose namespace 'std'
constexpr std::uint32_t NT_MAGIC = 0x00004550;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:36:16: error: no member
      named 'uint32_t' in namespace 'std'
constexpr std::uint32_t NT_MAGIC = 0x00004550;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:36:24: error: expected
      ';' after top level declarator
constexpr std::uint32_t NT_MAGIC = 0x00004550;
                       ^
                       ;
/Users/dan/github/pe-parse/parser-library/nt-headers.h:37:1: error: unknown type
      name 'constexpr'
constexpr std::uint16_t NUM_DIR_ENTRIES = 16;
^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:37:16: error: cannot
      define or redeclare 'uint16_t' here because namespace 'peparse' does not
      enclose namespace 'std'
constexpr std::uint16_t NUM_DIR_ENTRIES = 16;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:37:16: error: no member
      named 'uint16_t' in namespace 'std'
constexpr std::uint16_t NUM_DIR_ENTRIES = 16;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:37:24: error: expected
      ';' after top level declarator
constexpr std::uint16_t NUM_DIR_ENTRIES = 16;
                       ^
                       ;
/Users/dan/github/pe-parse/parser-library/nt-headers.h:38:1: error: unknown type
      name 'constexpr'
constexpr std::uint16_t NT_OPTIONAL_32_MAGIC = 0x10B;
^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:38:16: error: cannot
      define or redeclare 'uint16_t' here because namespace 'peparse' does not
      enclose namespace 'std'
constexpr std::uint16_t NT_OPTIONAL_32_MAGIC = 0x10B;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:38:16: error: no member
      named 'uint16_t' in namespace 'std'
constexpr std::uint16_t NT_OPTIONAL_32_MAGIC = 0x10B;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:38:24: error: expected
      ';' after top level declarator
constexpr std::uint16_t NT_OPTIONAL_32_MAGIC = 0x10B;
                       ^
                       ;
/Users/dan/github/pe-parse/parser-library/nt-headers.h:39:1: error: unknown type
      name 'constexpr'
constexpr std::uint16_t NT_OPTIONAL_64_MAGIC = 0x20B;
^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:39:16: error: cannot
      define or redeclare 'uint16_t' here because namespace 'peparse' does not
      enclose namespace 'std'
constexpr std::uint16_t NT_OPTIONAL_64_MAGIC = 0x20B;
          ~~~~~^
/Users/dan/github/pe-parse/parser-library/nt-headers.h:39:16: error: no member
      named 'uint16_t' in namespace 'std'
constexpr std::uint16_t NT_OPTIONAL_64_MAGIC = 0x20B;
          ~~~~~^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
make[2]: *** [parser-library/CMakeFiles/pe-parser-library.dir/buffer.cpp.o] Error 1
make[1]: *** [parser-library/CMakeFiles/pe-parser-library.dir/all] Error 2
make: *** [all] Error 2
$ cc -v
Apple LLVM version 8.1.0 (clang-802.0.38)
Target: x86_64-apple-darwin16.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Remove Boost dependency

As far as I can see, the Boost library isn't required for the most part (C++11 has most of the functionality that this library uses). The problem with Boost is that it is a huge dependency and makes it very difficult to build this library on Windows (even though it isn't too big of an issue on Linux...). Instead of using the Boost integer types, I suggest going with stdint.h, as it is available with every modern compiler. The other functionality can be replaced by the C++11 standard library as well.
With best regards

Add Rich header/entry support to pepy

pe-parse has had Rich header support for a while now, but that hasn't been ported into pepy.

An interface like this would be good (and consistent with the others):

pe.get_rich_header()
pe.get_rich_entries()

Cannot access to parsed_pe::internal in IDEs in C++

I wanted to check the contents of internal with the code below, but I couldn't access them because some structs are defined in the cpp file rather than the header file.

	string path = ((AnalyzeInput*)input)->second;
	string file_name = path.substr(path.find_last_of("/") + 1);
	parsed_pe* pe = ParsePEFromFile(path.c_str());
	bool is_mallware = false;
	

	if (pe == nullptr) {
		handle->on_result(file_name, Results::RES_CANNOT_PARSE_PE);
		return;
	}

	if (pe->internal->secs.size() < 2) {
		handle->on_result(file_name, Results::RES_HAS_LESS_SEC);
	}

The definitions of the structs below have to move from parse.cpp to parse.h. parsed_pe_internal is declared in parse.h but has no member variables there, so its members cannot be accessed from client code either.

struct section {
    std::string sectionName;
    std::uint64_t sectionBase;
    bounded_buffer* sectionData;
    image_section_header sec;
};

struct importent {
    VA addr;
    std::string symbolName;
    std::string moduleName;
};

struct exportent {
    VA addr;
    std::string symbolName;
    std::string moduleName;
};

union symbol_name {
    std::uint8_t shortName[NT_SHORT_NAME_LEN];
    std::uint32_t zeroes;
    std::uint64_t data;
};

struct aux_symbol_f1 {
    std::uint32_t tagIndex;
    std::uint32_t totalSize;
    std::uint32_t pointerToLineNumber;
    std::uint32_t pointerToNextFunction;
};

struct aux_symbol_f2 {
    std::uint16_t lineNumber;
    std::uint32_t pointerToNextFunction;
};

struct aux_symbol_f3 {
    std::uint32_t tagIndex;
    std::uint32_t characteristics;
};

struct aux_symbol_f4 {
    std::uint8_t filename[SYMTAB_RECORD_LEN];
    std::string strFilename;
};

struct aux_symbol_f5 {
    std::uint32_t length;
    std::uint16_t numberOfRelocations;
    std::uint16_t numberOfLineNumbers;
    std::uint32_t checkSum;
    std::uint16_t number;
    std::uint8_t selection;
};

struct symbol {
    std::string strName;
    symbol_name name;
    std::uint32_t value;
    std::int16_t sectionNumber;
    std::uint16_t type;
    std::uint8_t storageClass;
    std::uint8_t numberOfAuxSymbols;
    std::vector<aux_symbol_f1> aux_symbols_f1;
    std::vector<aux_symbol_f2> aux_symbols_f2;
    std::vector<aux_symbol_f3> aux_symbols_f3;
    std::vector<aux_symbol_f4> aux_symbols_f4;
    std::vector<aux_symbol_f5> aux_symbols_f5;
};

struct reloc {
    VA shiftedAddr;
    reloc_type type;
};

struct parsed_pe_internal {
    std::vector<section> secs;
    std::vector<resource> rsrcs;
    std::vector<importent> imports;
    std::vector<reloc> relocs;
    std::vector<exportent> exports;
    std::vector<symbol> symbols;
};

IterSec should yield sections in file-offset order

IterSec currently yields sections to its callback based on the order in which they appear in the section header table, which is not guaranteed to reflect the file-offset order (i.e., the order according to each section header's PointerToRawData).

Making it yield in file-offset order should be easy (we can std::sort the sections after collecting them) and makes tasks like Authenticode hashing significantly simpler, so we should make this an API guarantee.
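A sketch of that sort, using a hypothetical SectionInfo record in place of the library's internal section type:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, reduced section record for illustration.
struct SectionInfo {
  std::string name;
  std::uint32_t pointerToRawData;
};

// Orders sections by where their raw data starts in the file.
// stable_sort keeps header-table order for sections that share an offset.
void sortByFileOffset(std::vector<SectionInfo> &secs) {
  std::stable_sort(secs.begin(), secs.end(),
                   [](const SectionInfo &a, const SectionInfo &b) {
                     return a.pointerToRawData < b.pointerToRawData;
                   });
}
```

Choosing std::stable_sort over std::sort is a judgment call here: it makes the yielded order deterministic even for unusual files in which several sections claim the same PointerToRawData.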

Extract delay-loaded libraries from import table

First

Thanks for your good library for parsing PE files.
We use your library to extract the dependencies of executable applications in the CQtDeployer project.

Trouble

Windows uses delay-loading to load some dependencies.
It is very likely that these libraries are not visible when parsing the import table.

The Microsoft documentation site has instructions for extracting delay-loaded libraries, but they use system functions.

Question

How can delay-loaded libraries be extracted using your library?

READ_WORD and READ_DWORD

Constantly defining and undefining READ_WORD and READ_DWORD makes the code a bit cumbersome to deal with. Would you be open to only defining them once? They would have to take some more arguments but I think it can be done. I'll do the leg work to come up with a prototype if you think it's worth doing.

Please consider using shorter/consistent names for library name / include directory.

When installing pe-parse on *nix systems, I noticed that headers get installed into PREFIX/include/parser-library and the library is named libpe-parser-library. This looks odd. ;-)

  1. parser-library is pretty generic, the name as used in #include <parser-library/parse.h> does not tell the reader anything what this is about.
  2. There is no need for the -library suffix in the library name?

How about using pe-parser in both cases?

fails to build on ubuntu xenial

I'm not able to build pe-parse following the build instructions on the main page. Here's the Dockerfile I'm using:

FROM ubuntu:xenial

WORKDIR /opt
RUN apt-get update && apt-get install -y cmake make build-essential git

RUN apt-get install -y gcc g++
RUN git clone https://github.com/trailofbits/pe-parse.git pe-parse
RUN cd /opt/pe-parse && \
    cmake . && \
    make

and the output:

λ docker build -t pe-parse .
Sending build context to Docker daemon   5.12kB
Step 1/6 : FROM ubuntu:xenial
 ---> 747cb2d60bbe
Step 2/6 : WORKDIR /opt
 ---> Using cache
 ---> aeba0b7ed96b
Step 3/6 : RUN apt-get update && apt-get install -y cmake make build-essential git
 ---> Using cache
 ---> 5bbf877668b4
Step 4/6 : RUN apt-get install -y gcc g++
 ---> Running in 7a26aa0d5c1a
Reading package lists...
Building dependency tree...
Reading state information...
g++ is already the newest version (4:5.3.1-1ubuntu1).
g++ set to manually installed.
gcc is already the newest version (4:5.3.1-1ubuntu1).
gcc set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.
 ---> 5bb62f59aa63
Removing intermediate container 7a26aa0d5c1a
Step 5/6 : RUN git clone https://github.com/trailofbits/pe-parse.git pe-parse
 ---> Running in f1b1f88daf0f
Cloning into 'pe-parse'...
 ---> f53ed2f52650
Removing intermediate container f1b1f88daf0f
Step 6/6 : RUN cd /opt/pe-parse &&     cmake . &&     make
 ---> Running in 23ce9dc2c7e7
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build type: RelWithDebInfo
-- Install prefix: /usr
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/pe-parse
/usr/bin/cmake -H/opt/pe-parse -B/opt/pe-parse --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /opt/pe-parse/CMakeFiles /opt/pe-parse/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/opt/pe-parse'
make -f pe-parser-library/CMakeFiles/pe-parser-library.dir/build.make pe-parser-library/CMakeFiles/pe-parser-library.dir/depend
make[2]: Entering directory '/opt/pe-parse'
cd /opt/pe-parse && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /opt/pe-parse /opt/pe-parse/pe-parser-library /opt/pe-parse /opt/pe-parse/pe-parser-library /opt/pe-parse/pe-parser-library/CMakeFiles/pe-parser-library.dir/DependInfo.cmake --color=
Scanning dependencies of target pe-parser-library
make[2]: Leaving directory '/opt/pe-parse'
make -f pe-parser-library/CMakeFiles/pe-parser-library.dir/build.make pe-parser-library/CMakeFiles/pe-parser-library.dir/build
make[2]: Entering directory '/opt/pe-parse'
[ 20%] Building CXX object pe-parser-library/CMakeFiles/pe-parser-library.dir/src/buffer.cpp.o
cd /opt/pe-parse/pe-parser-library && /usr/bin/c++    -I/opt/pe-parse/pe-parser-library/include  -O2 -g -DNDEBUG   -fPIC -pedantic -Wall -Wextra -Wcast-align -Wcast-qual -Wctor-dtor-privacy -Wdisabled-optimization -Wformat=2 -Winit-self -Wlong-long -Wmissing-declarations -Wmissing-include-dirs -Wcomment -Wold-style-cast -Woverloaded-virtual -Wredundant-decls -Wshadow -Wsign-conversion -Wsign-promo -Wstrict-overflow=5 -Wswitch-default -Wundef -Werror -Wunused -Wuninitialized -Wno-missing-declarations -gdwarf-2 -g3 -std=c++11 -o CMakeFiles/pe-parser-library.dir/src/buffer.cpp.o -c /opt/pe-parse/pe-parser-library/src/buffer.cpp
[ 40%] Building CXX object pe-parser-library/CMakeFiles/pe-parser-library.dir/src/parse.cpp.o
cd /opt/pe-parse/pe-parser-library && /usr/bin/c++    -I/opt/pe-parse/pe-parser-library/include  -O2 -g -DNDEBUG   -fPIC -pedantic -Wall -Wextra -Wcast-align -Wcast-qual -Wctor-dtor-privacy -Wdisabled-optimization -Wformat=2 -Winit-self -Wlong-long -Wmissing-declarations -Wmissing-include-dirs -Wcomment -Wold-style-cast -Woverloaded-virtual -Wredundant-decls -Wshadow -Wsign-conversion -Wsign-promo -Wstrict-overflow=5 -Wswitch-default -Wundef -Werror -Wunused -Wuninitialized -Wno-missing-declarations -gdwarf-2 -g3 -std=c++11 -o CMakeFiles/pe-parser-library.dir/src/parse.cpp.o -c /opt/pe-parse/pe-parser-library/src/parse.cpp
/opt/pe-parse/pe-parser-library/src/parse.cpp: In function 'bool peparse::readCString(const bounded_buffer&, uint32_t, std::__cxx11::string&)':
/opt/pe-parse/pe-parser-library/src/parse.cpp:153:1: error: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C2 -+ C1 [-Werror=strict-overflow]
 readCString(const bounded_buffer &buffer, std::uint32_t off, string &result) {
 ^
cc1plus: all warnings being treated as errors
pe-parser-library/CMakeFiles/pe-parser-library.dir/build.make:89: recipe for target 'pe-parser-library/CMakeFiles/pe-parser-library.dir/src/parse.cpp.o' failed
make[2]: Leaving directory '/opt/pe-parse'
make[2]: *** [pe-parser-library/CMakeFiles/pe-parser-library.dir/src/parse.cpp.o] Error 1
make[1]: *** [pe-parser-library/CMakeFiles/pe-parser-library.dir/all] Error 2
CMakeFiles/Makefile2:88: recipe for target 'pe-parser-library/CMakeFiles/pe-parser-library.dir/all' failed
make[1]: Leaving directory '/opt/pe-parse'
Makefile:130: recipe for target 'all' failed
make: *** [all] Error 2
The command '/bin/sh -c cd /opt/pe-parse &&     cmake . &&     make' returned a non-zero code: 2

Disable warning while ParsePEFromFile

Hi! I want to disable the warning messages printed while using ParsePEFromFile. When I call ParsePEFromFile, the messages below are printed to stdout. Is there any way to disable these messages?

Warning: Skipping auxiliary symbol of type CLASS_EXTERNAL at offset 0x76fc
Warning: Skipping auxiliary symbol of type CLASS_EXTERNAL at offset 0x7888
Warning: Skipping auxiliary symbol of type CLASS_EXTERNAL at offset 0x78f4
Warning: Skipping auxiliary symbol of type CLASS_EXTERNAL at offset 0x7960
Warning: Skipping auxiliary symbol of type CLASS_EXTERNAL at offset 0x7edc
Warning: Skipping auxiliary symbol of type CLASS_EXTERNAL at offset 0x7fb4
Warning: Invalid internal offset (current: 0x8c8a, expected: 0x8c92)
Warning: Invalid internal offset (current: 0x93a4, expected: 0x93ac)

Documentation request: Comparing of resource names is not obvious

I tried to extract a specific resource, lets call it "ABCD" out of an exe, so I used the dumper as a reference:

int printRsrc(void *N, resource r) {
  if (r.name_str.length())
    cout << "Name (string): " << r.name_str << endl;
  return 0;
}

Output: Name (string): ABCD (as expected)

Then I replaced the code with

int printRsrc(void *N, resource r) {
  if (r.name_str == "ABCD") {
    // print something
  }
  return 0;
}

I wondered why it never entered the if-block. The debugger also showed only "A", which was puzzling because it's a normal std::string and std::cout was somehow able to print it properly.

After 30 minutes of confusion I noticed that these are UTF-16 strings (obviously, that's Windows) and that the 0-bytes are silently dropped from the displayed std::cout output instead of being echoed as \0.

So for C++11, one way to do the comparison is:

if (r.name_str.length() == 8 && !memcmp(u"ABCD", r.name_str.c_str(), 8))

Could a note about this be added somewhere? I assume this also confuses other devs for a few minutes :)
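
A minimal sketch of the comparison described above, wrapped in a reusable function. The helper name rawUtf16Equals is hypothetical and not part of pe-parse; it assumes the std::string holds raw UTF-16 code units in the host's byte order, as the issue reports for name_str:

```cpp
#include <cstring>
#include <string>

// Hypothetical helper (not part of pe-parse): returns true when `raw` holds
// exactly the UTF-16 code units of `expected`, byte for byte.
// Assumption: the bytes in `raw` use the host's byte order, which is
// little-endian on the platforms where PE files are usually analyzed.
bool rawUtf16Equals(const std::string &raw, const std::u16string &expected) {
  const std::size_t byteLen = expected.size() * sizeof(char16_t);
  return raw.size() == byteLen &&
         std::memcmp(raw.data(), expected.data(), byteLen) == 0;
}
```

With this, the check in the callback could read rawUtf16Equals(r.name_str, u"ABCD"), which avoids hard-coding the byte length 8.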

Publish pepy to PyPI

We should publish both source and binary distributions of pepy to PyPI. This will make it easier for others to discover, use, and contribute to pe-parse.

Relocation information calculated incorrectly

In parse.cpp, around line 1067, the original code is:
//iter over all of the blocks
::uint32_t blockCount = blockSize/sizeof(::uint16_t);

The 8-byte block header should be subtracted from blockSize first:
//iter over all of the blocks
::uint32_t blockCount = (blockSize - 8) / sizeof(::uint16_t);

The PE file therefore has many reloc blocks; the list of reloc blocks ends with
pageRva = 0 and rvaofft = 0.
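
A small sketch of the corrected entry count, with a guard against block sizes smaller than the header. The function and constant names are hypothetical; the only assumptions are that the block size covers an 8-byte header (4-byte page RVA plus 4-byte block size) and that each entry is 16 bits:

```cpp
#include <cstdint>

// Size of the block header: 4-byte page RVA + 4-byte block size.
constexpr std::uint32_t kRelocBlockHeaderSize = 8;

// Hypothetical helper: number of 16-bit relocation entries in one block.
// Returns 0 when blockSize is too small to hold even the header, which
// avoids unsigned underflow on malformed input.
std::uint32_t relocEntryCount(std::uint32_t blockSize) {
  if (blockSize < kRelocBlockHeaderSize) {
    return 0;
  }
  return (blockSize - kRelocBlockHeaderSize) / sizeof(std::uint16_t);
}
```

The guard matters because blockSize comes straight from the file, so an unchecked subtraction could wrap around to a very large count.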

Windows 7 build

Hi

How can I build it for Windows 7 32-bit using VS2015? I generated the project, but there is a linking error involving unistd, which is a Unix-specific file.

Regards

Add support for debug entry iteration

We should add a high-level iter-style API for the debug data directory entry vector, similar to the other iter-style APIs.

One challenge: the contents of the debug entries are heterogeneously typed and have drastically different contents. Maybe a visitor pattern instead, where IterDebug is registered with a struct of callbacks, one per type.

See trailofbits/winchecksec#44.
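
One way the visitor idea could look, as a rough sketch only. Every name here (DebugEntry, DebugVisitor, iterDebugSketch) is hypothetical and simplified; none of it is existing pe-parse API. The two type tags use the standard IMAGE_DEBUG_TYPE_CODEVIEW (2) and IMAGE_DEBUG_TYPE_MISC (4) values:

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical, simplified debug entry: a type tag plus raw payload bytes.
struct DebugEntry {
  std::uint32_t type;
  std::vector<std::uint8_t> data;
};

constexpr std::uint32_t kDebugTypeCodeView = 2;  // IMAGE_DEBUG_TYPE_CODEVIEW
constexpr std::uint32_t kDebugTypeMisc = 4;      // IMAGE_DEBUG_TYPE_MISC

// Struct of callbacks, one per entry type, plus a catch-all.
// Callbacks left empty are simply skipped.
struct DebugVisitor {
  std::function<void(const DebugEntry &)> onCodeView;
  std::function<void(const DebugEntry &)> onMisc;
  std::function<void(const DebugEntry &)> onOther;
};

// Dispatches each entry to the callback registered for its type.
void iterDebugSketch(const std::vector<DebugEntry> &entries,
                     const DebugVisitor &visitor) {
  for (const auto &entry : entries) {
    const auto &callback =
        entry.type == kDebugTypeCodeView ? visitor.onCodeView
        : entry.type == kDebugTypeMisc   ? visitor.onMisc
                                         : visitor.onOther;
    if (callback) {
      callback(entry);
    }
  }
}
```

A real version would presumably hand each callback a type-specific parsed structure rather than raw bytes; the raw payload is used here only to keep the sketch short.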

Publish a stable pe-parse release

We haven't been versioning pe-parse. It's pretty stable, so we should cut a release.

We should also publish pepy to PyPI and pin its release version to that of the rest of pe-parse.

See #105.

Unable to parse pe image generated by mingw gcc

I am using ArchLinux. I have built pe-parse using the supplied PKGBUILD file.
I have installed mingw-w64-gcc-bin 7.3.0-1 from AUR.

$ x86_64-w64-mingw32-gcc hello.c -o hello.exe
$ dump-pe hello.exe
Error: 9 (Bad magic)
Location: getSymbolTable:1375
$ file hello.exe
hello.exe: PE32+ executable (console) x86-64, for MS Windows

hello.zip

Not able to install pe-parse in Windows

I am trying to install this library on Windows without success.

I installed cmake version for windows from:
https://cmake.org/download/

After that I had the problem 'Unable to find vcvarsall.bat'.
I fixed it by running:
SET VS90COMNTOOLS=%VS140COMNTOOLS%

This was based on the following advice:

Execute the following command based on the version of Visual Studio installed:

Visual Studio 2010 (VS10): SET VS90COMNTOOLS=%VS100COMNTOOLS%
Visual Studio 2012 (VS11): SET VS90COMNTOOLS=%VS110COMNTOOLS%
Visual Studio 2013 (VS12): SET VS90COMNTOOLS=%VS120COMNTOOLS%
Visual Studio 2015 (VS14): SET VS90COMNTOOLS=%VS140COMNTOOLS%

Now I have another problem: it can't find the file 'pepy.cpp':

c:>C:\Python27\python.exe C:\Python27\pe-parse-master\python\setup.py build
running build
running build_ext
building 'pepy' extension
creating build
creating build\temp.win32-2.7
creating build\temp.win32-2.7\Release
creating build\temp.win32-2.7\parser-library
creating build\temp.win32-2.7\Release\Python27
creating build\temp.win32-2.7\Release\Python27\pe-parse-master
creating build\temp.win32-2.7\Release\Python27\pe-parse-master\parser-library
creating build\temp.win32-2.7\Release\Python27\pe-parse-master\parser-library\python
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -I/usr/local/include -I/opt/local/include -I/usr/include -I../parser-library -IC:\Python27\include -IC:\Python27\PC /Tppepy.cpp /Fobuild\temp.win32-2.7\Release\pepy.obj -g -O0
cl : Command line warning D9002 : ignoring unknown option '-g'
cl : Command line warning D9002 : ignoring unknown option '-O0'
pepy.cpp
c1xx: fatal error C1083: Cannot open source file: 'pepy.cpp': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\cl.exe' failed with exit status 2

I tried to edit the setup.py file like this:

sources = ['pepy.cpp',
           '../parser-library/parse.cpp',
           '../parser-library/buffer.cpp',
           r'C:\Python27\pe-parse-master\parser-library\python\pepy.cpp',
           r'C:\Python27\pe-parse-master\parser-library\parse.cpp',
           r'C:\Python27\pe-parse-master\parser-library\buffer.cpp'
]

Any ideas?

Not able to execute dump-prog

Hi,

I have tried to get pe-parse working on a Mac and on RHEL/Fedora; in both cases the dump-prog program crashes.

Here is the output on Linux:
[ec2-user@ip-10-35-66-166 dump-prog]$ make;./dump-prog "/home/ec2-user/pe/pe-parse/dump-prog/putty.exe"
[ 66%] Built target pe-parser-library
[100%] Building CXX object dump-prog/CMakeFiles/dump-prog.dir/dump.cpp.o
Linking CXX executable dump-prog
[100%] Built target dump-prog
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct null not valid
Aborted

Here is the output on OS X:
arezafar@Alis-MacBook-Pro:dump-prog [master]$ make;./dump-prog "../../putty.exe"
Scanning dependencies of target pe-parser-library
[ 33%] Building CXX object parser-library/CMakeFiles/pe-parser-library.dir/buffer.cpp.o
[ 66%] Building CXX object parser-library/CMakeFiles/pe-parser-library.dir/parse.cpp.o
Linking CXX static library libpe-parser-library.a
[ 66%] Built target pe-parser-library
Scanning dependencies of target dump-prog
[100%] Building CXX object dump-prog/CMakeFiles/dump-prog.dir/dump.cpp.o
Linking CXX executable dump-prog
[100%] Built target dump-prog
Segmentation fault: 11

In both cases it seems to be a basic string library issue. How could I get around it?

Thanks,
Ali
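
For context, libstdc++ raises "basic_string::_S_construct null not valid" when a std::string is constructed from a null char pointer. Where in dump-prog that happens is not established in this report, so the following is only a generic guard, with a hypothetical helper name, for code paths that may receive a null C string:

```cpp
#include <string>

// Hypothetical helper: converts a possibly-null C string into a std::string,
// substituting `fallback` instead of constructing from a null pointer
// (which is undefined behavior and throws or crashes in practice).
std::string safeString(const char *s, const std::string &fallback = "") {
  return s != nullptr ? std::string(s) : fallback;
}
```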

Identifier/macro collisions with Windows.h

Including both parse.h and Windows.h in the same translation unit leads to hellish compilation errors, presumably because Windows.h defines IMAGE_FILE_* as macros and namespaces can't mask those.

Some potential solutions:

  • Provide official guidance to not use Windows.h and parse.h in the same source file
  • Prefix all of our duplicated constants

See #69
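
A self-contained sketch of why namespaces do not help and how prefixing does. The macro and constant names are illustrative stand-ins, not real Windows.h or pe-parse identifiers:

```cpp
#include <cstdint>

// Stand-in for a macro that a platform header might define (illustrative
// name, not a real Windows.h constant).
#define DEMO_FILE_EXECUTABLE 0x0002

namespace demo {
// A same-named constant would not compile here: the preprocessor runs before
// namespaces are considered, so the line below would expand to
// `constexpr std::uint16_t 0x0002 = 0x0002;`.
// constexpr std::uint16_t DEMO_FILE_EXECUTABLE = 0x0002;

// Prefixing the library's constant avoids the collision.
constexpr std::uint16_t LIB_DEMO_FILE_EXECUTABLE = 0x0002;
}  // namespace demo
```

The cost of the prefixing approach is a breaking rename for existing users, which is presumably why guidance-only is listed as the alternative.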
