
pyevtx-rs's Introduction

pyevtx-rs

Python bindings for https://github.com/omerbenamram/evtx/.

Installation

Available on PyPI - https://pypi.org/project/evtx/.

To install from PyPI - pip install evtx

Wheels

Wheels are currently built automatically for Python 3.7+ using the abi3 tag, which means they are compatible with all versions from 3.7 onwards.

Supported platforms are:

  • Linux x86_64
  • macOS x86_64
  • macOS arm64 (m1)
  • Windows x86_64

Installation from sources

Installation is possible for other platforms by installing from sources.

This requires a Rust compiler and sufficiently recent versions of setuptools and pip.

Run pip install -e .

Usage

The API surface is currently fairly limited (only yields events as XML/JSON documents), but is planned to be expanded in the future.

This will print each record as an XML string.

from evtx import PyEvtxParser


def main():
    parser = PyEvtxParser("./samples/Security_short_selected.evtx")
    for record in parser.records():
        print(f'Event Record ID: {record["event_record_id"]}')
        print(f'Event Timestamp: {record["timestamp"]}')
        print(record['data'])
        print(f'------------------------------------------')

And this will print each record as a JSON string.

from evtx import PyEvtxParser


def main():
    parser = PyEvtxParser("./samples/Security_short_selected.evtx")
    for record in parser.records_json():
        print(f'Event Record ID: {record["event_record_id"]}')
        print(f'Event Timestamp: {record["timestamp"]}')
        print(record['data'])
        print(f'------------------------------------------')

File-like objects are also supported.

from evtx import PyEvtxParser


def main():
    # io.BytesIO is also supported.
    # The context manager closes the file once iteration is done.
    with open("./samples/Security_short_selected.evtx", 'rb') as a:
        parser = PyEvtxParser(a)
        for record in parser.records_json():
            print(f'Event Record ID: {record["event_record_id"]}')
            print(f'Event Timestamp: {record["timestamp"]}')
            print(record['data'])
            print('------------------------------------------')

pyevtx-rs's People

Contributors

akx, ohadravid, omerbenamram


pyevtx-rs's Issues

Multi-thread support in Python bindings?

I can see that the binding references the multi-threaded parameter, but it seems to have no effect. Is this a planned feature, or a bug that needs solving?

Really grateful you developed this btw - incredible tool, used every day!

Evtx.parser

Hello,
It is impossible to import from evtx.parser. Why is that? And how can I use the parser to get JSON strings?

Add debug output / more detailed traceback to make handling malformed EVTX files easier

I have MS Security Event Logs in EVTX format. I'm able to read them using williballenthin/python-evtx, but it's incredibly slow.
Thus, I wanted to export the events using pyevtx-rs, but the EVTX data seems to be corrupt, since I get the following Traceback:

Traceback (most recent call last):
  File "test.py", line 7, in <module>
    for record in parser.records_json():
RuntimeError: Failed to parse chunk header

The code I'm using is the one for getting JSON from the EVTX written in your README.

I don't know why one library parses the EVTX file without issues while the other crashes, nor where exactly the malformed chunk header is. I'd therefore like to ask for some debug information, such as the EventNumber of the event that caused the crash, to make it easier to find the reason for the crash.

// Edit
I tested the original Rust code by running the current release executable on the Security Event Logs.
To be clear, these Event Logs were just exported from the Event Viewer and no additional changes have been made.
So either something is not handled correctly in the code, or the events are not formatted as expected, which would be an MS issue.
The code crashed as well due to invalid chunk headers. This is the error message I get:

Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2181
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[ 6, 24, 37, 3F, 47, FD, 37, 60]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2182
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[B4, 25, AB,  A,  2, 74, A7, 3B]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2183
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[6F, F7, 7E, 88, 83, D4, F7, D8]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2184
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[78, AA, B5, 63, 6A, D7, E4, F9]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2185
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[EA, 39, 57, 5A, 90,  C, 50, B5]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2186
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[F7, FB, B2, 9D, 20, E2, 78, 21]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2187
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[21, F3, 53, F3, A0, 40, AC, 32]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2188
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[AB, 63, B1, 65,  8, 29, 39, E9]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2189
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[5D, A4, 2F, 3D, 47, 1E, 6F, 54]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2190
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[A1, 2C, AE, 6A, 3C, 47, BE, 6B]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2191
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[21, 98,  0, 73, A8, 81,  0, 7F]`
Failed to dump the next record.

Caused by:
    0: Failed to parse chunk number 2192
    1: Failed to parse chunk header
    2: Invalid EVTX chunk header magic, expected `ElfChnk0`, found `[5B,  1, 9A, DC, E6, 37, E1, 45]`
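
To narrow down where the invalid chunk headers sit, the file can be scanned directly without any parser. The following is a minimal sketch, assuming the usual EVTX layout of a 4096-byte file header followed by 64 KiB chunks that each start with the magic `ElfChnk\x00`. The function name `find_bad_chunks` is hypothetical, and the zero-based indices it returns are not guaranteed to match the chunk numbers printed by the tool.

```python
import io
from typing import BinaryIO, List

FILE_HEADER_SIZE = 4096        # assumed size of the EVTX file header block
CHUNK_SIZE = 65536             # assumed size of each chunk (64 KiB)
CHUNK_MAGIC = b"ElfChnk\x00"   # expected first 8 bytes of every chunk


def find_bad_chunks(stream: BinaryIO) -> List[int]:
    """Return zero-based indices of chunks that do not start with the chunk magic."""
    bad = []
    index = 0
    while True:
        stream.seek(FILE_HEADER_SIZE + index * CHUNK_SIZE)
        magic = stream.read(len(CHUNK_MAGIC))
        if not magic:
            # Reached the end of the file.
            break
        if magic != CHUNK_MAGIC:
            bad.append(index)
        index += 1
    return bad
```

It can be called as `find_bad_chunks(open("Security.evtx", "rb"))`. A flagged chunk is not necessarily corruption; it may simply be space at the end of the file that was never written as a chunk.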

pip install evtx fails

C:> pip install evtx
Collecting evtx
  Downloading evtx-0.3.0.tar.gz (1.8 kB)
    ERROR: Command errored out with exit status 1:
     command: 'c:\python38-32\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\admin\\AppData\\Local\\Temp\\pip-install-0tgd_mdr\\evtx\\setup.py'"'"'; __file__='"'"'C:\\Users\\admin\\AppData\\Local\\Temp\\pip-install-0tgd_mdr\\evtx\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\admin\AppData\Local\Temp\pip-pip-egg-info-o1hcqjr5'
         cwd: C:\Users\admin\AppData\Local\Temp\pip-install-0tgd_mdr\evtx\
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\admin\AppData\Local\Temp\pip-install-0tgd_mdr\evtx\setup.py", line 5, in <module>
        from setuptools_rust import RustExtension
    ModuleNotFoundError: No module named 'setuptools_rust'
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
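
The log above shows pip downloading the source tarball under a `python38-32` interpreter, which suggests that no prebuilt wheel matched and pip fell back to a source build. As a rough sketch, the check below compares the running interpreter against the platform list in the README. The helper name `has_prebuilt_wheel` and the alias table are illustrative assumptions, and the wheel list may be out of date for newer releases.

```python
import platform
import struct

# (system, machine) pairs copied from the README's list of supported platforms.
PREBUILT_WHEELS = {
    ("Linux", "x86_64"),
    ("Darwin", "x86_64"),
    ("Darwin", "arm64"),
    ("Windows", "x86_64"),
}

# Different operating systems report the same architecture under different names.
_MACHINE_ALIASES = {"amd64": "x86_64", "aarch64": "arm64"}


def has_prebuilt_wheel(system: str, machine: str, pointer_bits: int) -> bool:
    """Guess whether a prebuilt wheel exists for this interpreter."""
    machine = _MACHINE_ALIASES.get(machine.lower(), machine.lower())
    # A 32-bit interpreter on a 64-bit OS still reports a 64-bit machine name.
    if machine == "x86_64" and pointer_bits != 64:
        return False
    return (system, machine) in PREBUILT_WHEELS


if __name__ == "__main__":
    bits = struct.calcsize("P") * 8  # pointer width of the running interpreter
    print(has_prebuilt_wheel(platform.system(), platform.machine(), bits))
```

When this prints `False`, installation needs a Rust compiler, as described under "Installation from sources".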

Misunderstanding of 'timestamp' field

While working with PyEvtxParser, I noticed two distinct fields that represent time. One of them is 'timestamp', directly under the root, and the second is 'TimeCreated', under the 'System' key. Event Viewer shows the time from 'TimeCreated', and in my example 'timestamp' is delayed by approximately 30 seconds.
Can you clarify this discrepancy?
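
To look at the two values side by side, the 'TimeCreated' value can be pulled out of the JSON document and printed next to 'timestamp'. This is only a sketch: it assumes the JSON produced by records_json() nests the value as Event / System / TimeCreated / #attributes / SystemTime, and the helper name `time_created` is made up for illustration.

```python
import json
from typing import Optional


def time_created(data_json: str) -> Optional[str]:
    """Return the SystemTime string of TimeCreated, or None if it is absent."""
    event = json.loads(data_json).get("Event", {})
    # Assumed layout: XML attributes are stored under a "#attributes" key.
    attributes = event.get("System", {}).get("TimeCreated", {}).get("#attributes", {})
    return attributes.get("SystemTime")
```

Inside the loop over `parser.records_json()`, printing `record["timestamp"]` together with `time_created(record["data"])` shows the difference for each record.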

Request for Compiled Version (No matching distribution found for evtx==0.8.3)

I am currently attempting to install and use evtx on a Linux Arch-based system (Docker on Mac). Unfortunately, there is no compiled version of the package available for this platform.
I understand that maintaining compiled versions for various architectures can be challenging, but having support for the Linux Arch architecture would greatly benefit users in this environment.
Thank you for your attention to this matter.

wrong ordering in records returned by records() iterator

Hi, I think there is a bug in event parsing regarding ordering.
The PyEvtxParser().records() iterator returns the records of each chunk in DESCENDING order instead of ASCENDING, so when iterating over chunks you obtain, for example:

(iterating chunk0): record10,record9,record8,record7,record6,record5,record4,record3,record2,record1
(iterating chunk1): record20,record19,record18,record17,record16,record15,record14,record13,record12,record11
and so on ...

Basically, it seems the records in each chunk are sorted in descending order before being returned by the iterator, so they do not come out in their original order. This may break uses of the library where the original order needs to be preserved.
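
Until the ordering is addressed in the library, one possible workaround is to sort on the caller's side. The sketch below assumes every record is a dict with an integer `event_record_id` key, as in the README examples. The helper name `in_record_order` is hypothetical, and it loads all records into memory, which may not suit very large files.

```python
from operator import itemgetter
from typing import Dict, Iterable, List


def in_record_order(records: Iterable[Dict]) -> List[Dict]:
    """Return all records sorted by ascending event_record_id."""
    return sorted(records, key=itemgetter("event_record_id"))
```

For example, `for record in in_record_order(parser.records()):` iterates in ascending record order regardless of how each chunk was emitted.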

Arguments to PyEvtxParser are ignored during iteration

Hello,
I am facing an issue where the arguments passed to PyEvtxParser are ignored when calling the iteration functions (records() and records_json()). The issue appears to be related to the implementation of the function records_iterator, where the configuration is overwritten with the default configuration.
The configuration is passed to PyEvtxParser, and an instance of ParserSettings is created with the options:

pyevtx-rs/src/lib.rs

Lines 146 to 178 in bdbbd41

impl PyEvtxParser {
    #[new]
    fn new(
        path_or_file_like: PyObject,
        number_of_threads: Option<usize>,
        ansi_codec: Option<String>,
    ) -> PyResult<Self> {
        let file_or_file_like = FileOrFileLike::from_pyobject(path_or_file_like)?;
        // Setup `ansi_codec`
        let codec = if let Some(codec) = ansi_codec {
            match encodings().iter().find(|c| c.name() == codec) {
                Some(encoding) => *encoding,
                None => {
                    return Err(PyErr::new::<PyValueError, _>(format!(
                        "Unknown encoding `[{}]`, see help for possible values",
                        codec
                    )));
                }
            }
        } else {
            ParserSettings::default().get_ansi_codec()
        };
        // Setup `number_of_threads`
        let n_threads = match number_of_threads {
            Some(number) => number,
            None => *ParserSettings::default().get_num_threads(),
        };
        let configuration = ParserSettings::new()
            .ansi_codec(codec)
            .num_threads(n_threads);

Then the configuration is overwritten here:

pyevtx-rs/src/lib.rs

Lines 226 to 244 in bdbbd41

impl PyEvtxParser {
    fn records_iterator(&mut self, output_format: OutputFormat) -> PyResult<PyRecordsIterator> {
        let inner = match self.inner.take() {
            Some(inner) => inner,
            None => {
                return Err(PyErr::new::<PyRuntimeError, _>(
                    "PyEvtxParser can only be used once",
                ));
            }
        };
        Ok(PyRecordsIterator {
            inner: inner.into_chunks(),
            records: None,
            settings: Arc::new(ParserSettings::new()),
            output_format,
        })
    }
}

JSON output support

I know it is stated in the README, but I would like to formally request that JSON output be added. Thanks for your work on this parser, it's awesome.

`RuntimeError: Failed to parse record number <num>`. Skip over in iterator if possible.

Firstly, would just want to say that the library is great and really fast for use in code -- thank you.

However, in my use the iterator ran into an issue. The error message was simply:
"RuntimeError: Failed to parse record number 6116193"
and the iterator stopped parsing the EVTX file.

The EVTX file was not corrupted as far as I am aware, and I believe this is an extreme edge case, considering that no one opened an issue on this.

I would like to ask for either:

  • More detailed error messages (i.e. why did it fail to parse), and/or
  • A message to indicate that the record was skipped (either as part of the JSON or a non-fatal error) and the iterator would not stop parsing the EVTX file (I think missing 1 line out of a few million others is an okay compromise)

Hopefully you'll be able to look into this.

Thank you
- Tim
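
A caller-side approximation of the second request can be sketched as a wrapper that logs a failed record and keeps going. This is not the library's behaviour: the name `skip_failed_records` and the `max_errors` cap are hypothetical, and the sketch only helps if the underlying iterator can still be advanced after it has raised RuntimeError, which is an assumption that would need checking against the installed version.

```python
import logging
from typing import Iterable, Iterator

log = logging.getLogger(__name__)


def skip_failed_records(records: Iterable, max_errors: int = 100) -> Iterator:
    """Yield records, logging and skipping those that raise RuntimeError."""
    iterator = iter(records)
    errors = 0
    while True:
        try:
            record = next(iterator)
        except StopIteration:
            return
        except RuntimeError as exc:
            errors += 1
            log.warning("Skipping record: %s", exc)
            if errors >= max_errors:
                # Give up instead of looping forever on a broken iterator.
                raise
            continue
        yield record
```

Usage would be `for record in skip_failed_records(parser.records_json()):`. The cap exists so that an iterator which fails on every call ends the loop with the original error instead of spinning indefinitely.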

file like object support

It would be awesome if file-like objects were supported so file handles could be passed in from pytsk, dfvfs, etc.

InstanceID missing from logs

EVTX files have a property "InstanceID" which is related to EventID:

InstanceID is not the same as EventID, although the two can match:
The InstanceId property uniquely identifies an event entry for a configured event source. The InstanceId for an event log entry represents the full 32-bit resource identifier for the event in the message resource file for the event source. The EventID property equals the InstanceId with the top two bits masked off. Two event log entries from the same source can have matching EventID values, but have different InstanceId values due to differences in the top two bits of the resource identifier. If the application wrote the event entry using one of the WriteEntry methods, the InstanceId property matches the optional eventId parameter. If the application wrote the event using WriteEvent, the InstanceId property matches the resource identifier specified in the InstanceId of the instance parameter. If the application wrote the event using the Win32 API ReportEvent, the InstanceId property matches the resource identifier specified in the dwEventID parameter.

Taken from here: https://evotec.xyz/powershell-everything-you-wanted-to-know-about-event-logs/

I would very much like to have InstanceID read in. It isn't in the XML data; XML data contains EventID

I don't know enough about evtx structure to offer a patch.
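
The relationship quoted above ("EventID equals the InstanceId with the top two bits masked off") can be written as a one-line mask. This sketch only illustrates that sentence for a 32-bit identifier; it does not show how to recover InstanceID from an EVTX record, and the function name is made up.

```python
def event_id_from_instance_id(instance_id: int) -> int:
    """Clear the top two bits of a 32-bit InstanceId to obtain the EventID."""
    return instance_id & 0x3FFFFFFF


# Two different InstanceId values that map to the same EventID.
assert event_id_from_instance_id(0xC0000005) == 5
assert event_id_from_instance_id(0x40000005) == 5
```

The reverse direction is not possible from EventID alone, because the two masked bits are lost.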
