Comments (3)

simsor commented on June 10, 2024

I believe the issue comes from Python defaulting to the cp1252 encoding when opening files on Windows (even under WSL), whereas Linux defaults to UTF-8 (I assume you're running Linux?).

The bug can be replicated by editing etl2xml to force utf8 encoding:

import codecs
...
    with codecs.open(input, "rb", "utf8") as input_file:
        etl_reader = build_from_stream(input_file.read())
        etl_reader.parse(logger)

I believe it can be fixed by forcing the cp1252 encoding for all platforms.

from etl-parser.

febrezo commented on June 10, 2024

Yes, I'm running it on Linux, so it could be related to the encoding. I ran a simple interactive test to check this, but I'm not sure it fixes the problem, since using cp1252 still threw an error:

>>> import codecs
>>> from etl.etl import IEtlFileObserver, build_from_stream
>>> with codecs.open("AMSITrace.etl", "rb", "cp1252") as input_file:
...     etl_reader = build_from_stream(input_file.read())
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib/python3.6/codecs.py", line 700, in read
    return self.reader.read(size)
  File "/usr/lib/python3.6/codecs.py", line 503, in read
    newchars, decodedbytes = self.decode(data, self.errors)
  File "/usr/lib/python3.6/encodings/cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 91: character maps to <undefined>

Meanwhile, I tried a simple modification of the script to read the file as binary data (I noticed that build_from_stream expects bytes rather than str) and got some extra output:

...
if __name__ == "__main__":
    try:
        file_name = sys.argv[1]
        with open(file_name, "rb") as etl_file:
            raw_data = etl_file.read()
            etl_reader = build_from_stream(raw_data)
            etl_reader.parse(EtlFileLogger())
    except IndexError:
        print("Not enough parameters. Add the .etl file as a parameter")

I then ran it as follows. It parses some events but crashes at the end:

$ python3 test.py AMSITrace.etl 
Message (trace logging): {'Engine': 'PowerShell_C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe_10.0.18362.1', 'Script': 'if ($this.Name.IndexOf(\'-\') -lt 0)\r\n          {\r\n          if ($this.ResolvedCommand -ne $null)\r\n          {\r\n          $this.Name + " -> " + $this.ResolvedCommand.Name\r\n          }\r\n          else\r\n          {\r\n          $this.Name + " -> " + $this.Definition\r\n          }\r\n          }\r\n          else\r\n          {\r\n          $this.Name\r\n          }', 'Raw Script': [105, 102, 32, 40, 36, 116, 104, 105, 115, 46, 78, 97, 109, 101, 46, 73, 110, 100, 101, 120, 79, 102, 40, 39, 45, 39, 41, 32, 45, 108, 116, 32, 48, 41, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 123, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 105, 102, 32, 40, 36, 116, 104, 105, 115, 46, 82, 101, 115, 111, 108, 118, 101, 100, 67, 111, 109, 109, 97, 110, 100, 32, 45, 110, 101, 32, 36, 110, 117, 108, 108, 41, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 123, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 36, 116, 104, 105, 115, 46, 78, 97, 109, 101, 32, 43, 32, 34, 32, 45, 62, 32, 34, 32, 43, 32, 36, 116, 104, 105, 115, 46, 82, 101, 115, 111, 108, 118, 101, 100, 67, 111, 109, 109, 97, 110, 100, 46, 78, 97, 109, 101, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 125, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 101, 108, 115, 101, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 123, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 36, 116, 104, 105, 115, 46, 78, 97, 109, 101, 32, 43, 32, 34, 32, 45, 62, 32, 34, 32, 43, 32, 36, 116, 104, 105, 115, 46, 68, 101, 102, 105, 110, 105, 116, 105, 111, 110, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 125, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 125, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 101, 108, 115, 101, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 123, 13, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 36, 116, 104, 105, 115, 46, 78, 97, 109, 101, 13, 10, 32, 
32, 32, 32, 32, 32, 32, 32, 32, 32, 125]}
Traceback (most recent call last):
  File "test.py", line 42, in <module>
    etl_reader.parse(EtlFileLogger())
  File "/home/felix/Documentos/Proyectos/etl-parser/etl/etl.py", line 141, in parse
    actions[event.type](event.value)
  File "/home/felix/Documentos/Proyectos/etl-parser/etl/etl.py", line 133, in <lambda>
    "EventRecord": lambda obj: observer.on_event_record(Event(obj)),
  File "test.py", line 27, in on_event_record
    message = event.parse_etw() # Invoke Manifest based parser
  File "/home/felix/Documentos/Proyectos/etl-parser/etl/event.py", line 123, in parse_etw
    return build_etw(guid, event_id, version, user_data)
  File "/home/felix/Documentos/Proyectos/etl-parser/etl/parsers/etw/core.py", line 100, in build_etw
    raise GuidNotFound(guid)
etl.error.GuidNotFound: No class handle this ETW provider : (8e805eb3-6a8f-4a1e-90fa-a831d94e54a1)

I'll check whether this only happens on Linux and let you know. Thanks for such a fast reply.

simsor commented on June 10, 2024

OK, after some more digging, I think I understand where your problem is.

First of all, build_from_stream does indeed expect bytes and not str. The ETL file must be opened in "rb" mode (that's what etl2xml does); the README is incorrect.

Because we are opening the file in binary mode, the codecs module is useless, if not actively harmful. I was wrong in my initial assessment, and I apologize!
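To illustrate why decoding is the wrong tool here, a minimal sketch using only the standard library. The sample bytes and the temporary file are made up for the demonstration and are not taken from a real ETL file. Byte 0x81 (the one in the traceback above) has no mapping in Python's cp1252 codec and is not valid UTF-8 on its own, so both text decodings fail, while a plain binary read returns the bytes untouched:

```python
import os
import tempfile

# Made-up binary payload; 0x81 is the byte reported in the traceback.
payload = bytes([0x45, 0x54, 0x81, 0x00, 0xFF])


def can_decode(data: bytes, encoding: str) -> bool:
    """Return True if `data` decodes cleanly with `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Write the payload to a temporary file, then read it back in binary mode.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name
try:
    with open(path, "rb") as etl_file:
        raw = etl_file.read()
finally:
    os.remove(path)

print(can_decode(payload, "cp1252"))  # False: 0x81 is undefined in cp1252
print(can_decode(payload, "utf-8"))   # False: 0x81 is not a valid start byte
print(raw == payload)                 # True: binary mode preserves every byte
```

Since the file content is binary, no text encoding is guaranteed to work on every file, which is why passing the raw bytes to the parser is the safer choice.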

Secondly, regarding your GuidNotFound error: this exception is raised because you called event.parse_etw() on an event that wasn't an ETW event. etl2xml first tries to read the event as Tracelogging, only switching to ETW if an error happens. The README example must be adapted to your use case; it all depends on what kind of message you are parsing.
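The fallback order described above could be sketched roughly as follows. This is not etl2xml's actual code: FakeEvent and the local GuidNotFound class are hypothetical stand-ins so the snippet runs on its own, and the method name parse_tracelogging is an assumption (only parse_etw appears in this thread).

```python
class GuidNotFound(Exception):
    """Local stand-in for etl.error.GuidNotFound seen in the traceback."""


class FakeEvent:
    """Hypothetical event; `kind` decides which parser succeeds."""

    def __init__(self, kind):
        self.kind = kind

    def parse_tracelogging(self):  # assumed method name
        if self.kind != "tracelogging":
            raise ValueError("not a Tracelogging event")
        return {"kind": "tracelogging"}

    def parse_etw(self):
        if self.kind != "etw":
            raise GuidNotFound("no class handles this provider")
        return {"kind": "etw"}


def parse_event(event):
    """Try Tracelogging first, fall back to ETW, return None if both fail."""
    try:
        return event.parse_tracelogging()
    except Exception:
        # Any failure here just means "try the next parser".
        pass
    try:
        return event.parse_etw()
    except GuidNotFound:
        # No known provider for this event: skip it instead of crashing.
        return None


print(parse_event(FakeEvent("tracelogging")))
print(parse_event(FakeEvent("etw")))
print(parse_event(FakeEvent("unknown")))
```

Returning None for unknown providers is a design choice of this sketch; depending on the use case, logging the GUID or re-raising may be more appropriate.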

I will close this issue, as I understand your initial issue will be solved by opening ETL files in binary mode. I will update the example code in the README.
