binary-parsing's Introduction

Awesome binary parsing

A list of generic tools for parsing binary data structures, such as file formats, network protocols or bitstreams.

Parser generators, parsing libraries and frameworks

  • Kaitai Struct (DSL): declarative language used to describe various binary data structures, laid out in files or in memory
  • Nom (Rust): Rust parser combinator framework
  • Hammer (C): bit-oriented parsing library
  • Construct (Python): library for parsing and building of data structures (binary or textual). Define your data structures in a declarative manner
  • Spicy (DSL, C/C++, Zeek): a next-generation parser generator for network protocols and file formats
  • Hachoir (Python): view and edit a binary stream field by field. Long list of parsers for all kinds of formats
  • Caterpillar (Python): Python 3.12+ library to pack and unpack structured binary data
  • RecordFlux: toolset for the formal specification of messages and the generation of verifiable binary parsers and message generators (Ada-inspired).
  • DataScript Tools (DSL): DataScript is a formal language for modelling binary datatypes, bitstreams or file formats (PDF)
  • Parsifal (OCaml): OCaml-based parsing engine. Paper: A pragmatic solution to the binary parsing problem. Olivier Levillain
  • Haka (Lua): open source, security-oriented language for describing protocols and applying security policies to (live) captured traffic
  • BinData (Ruby): provides a declarative way to read and write structured binary data
  • Binary-parser (JavaScript): binary parser builder library which enables you to write efficient parsers in a simple & declarative way
  • Gloss (Clojure): turn complicated byte formats into Clojure data structures and Clojure data structures into compact byte representations
  • Preon (Java): Bit syntax for Java. A declarative data binding framework for dealing with binary encoded data
  • attoparsec and attoparsec-binary (Haskell): fast parser combinator library, aimed particularly at dealing efficiently with network protocols and complicated text/binary file formats
  • Marpa (C/C++, Perl, Go): libmarpa (C)
  • Scapy (Python): send, sniff and dissect and forge network packets. Usable interactively or as a library
  • libtins (C++): crafting, sending, sniffing and interpreting raw network packets
  • libcrafter (C++): high level library for C++ designed to create and decode network packets
  • scodec (Scala): Combinator library for working with binary data
  • Apache Daffodil (Scala/Java, XML Schema): an open-source implementation of DFDL (Data Format Description Language), capable of describing many industry and military standards, parsing data into an infoset (most commonly represented as either XML or JSON), and writing it back to the native format.
  • binarylang (Nim, DSL): extensible Nim DSL for creating binary parsers/encoders in a symmetric fashion
  • binaryparse (Nim, DSL): In-language DSL for reading and writing binary data supporting all sorts of patterns. Generates an efficient stream based reader and writer for the runtime execution.
  • FlexT (DSL, Delphi): a DSL and a tool for generating parsers in Delphi.
  • FormatFuzzer (C++): framework for high-efficiency, high-quality generation and parsing of binary inputs
  • Deku (Rust): bit-level, symmetric, serialization/deserialization implementations for structs and enums
  • restruct (Go): library for reading and writing binary data
  • Mr. Crowbar (Python): Django-esque model framework for reading and writing binary file formats. Includes a suite of command-line tools for visualising and digging through binary data.
  • jBinary (JavaScript): high-level API for working with binary data.
  • Wuffs: a memory-safe programming language (and a standard library written in that language) for Wrangling Untrusted File Formats Safely. Wrangling includes parsing, decoding and encoding.
  • EverParse: a framework for generating verified secure parsers and formatters from domain-specific format specification languages
  • binrw (Rust): binrw helps you write maintainable & easy-to-read declarative binary data readers and writers using ✨macro magic✨.
  • Dogma (DSL): human-friendly metalanguage for describing data formats in documentation using the familiar patterns of Backus-Naur Form.
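Many of the libraries above share one idea: the layout is written down once as data, and both the reader and the writer are derived from it. The following is a minimal sketch of that idea using only Python's standard `struct` module; it does not use any of the listed libraries. The 8-byte header layout (`magic`, `version`, `flags`, `length`) and the helper names are hypothetical, invented purely for illustration.

```python
import struct
from collections import namedtuple

# Hypothetical header layout (an assumption for this sketch):
# 4-byte magic, u8 version, u8 flags, u16 payload length, little-endian.
HEADER_FIELDS = [
    ("magic", "4s"),
    ("version", "B"),
    ("flags", "B"),
    ("length", "H"),
]

# Derive the struct format string and the result type from the field table.
HEADER_FORMAT = "<" + "".join(code for _, code in HEADER_FIELDS)
Header = namedtuple("Header", [name for name, _ in HEADER_FIELDS])
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_header(data: bytes) -> Header:
    """Parse the fixed-size header from the start of data."""
    if len(data) < HEADER_SIZE:
        raise ValueError(f"need {HEADER_SIZE} bytes, got {len(data)}")
    return Header._make(struct.unpack_from(HEADER_FORMAT, data))


def build_header(header: Header) -> bytes:
    """Serialize a header using the same field table (the symmetric direction)."""
    return struct.pack(HEADER_FORMAT, *header)


# Sample input: header followed by a 5-byte payload.
sample = b"DEMO" + bytes([1, 0]) + (5).to_bytes(2, "little") + b"hello"
header = parse_header(sample)
payload = sample[HEADER_SIZE:HEADER_SIZE + header.length]
```

The dedicated libraries go well beyond this sketch, typically adding things such as nested structures, bit-level fields, and lengths or branches that depend on earlier fields.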

Stand-alone software

Hex editors with grammars

  • Synalyze It!
  • Hexinator
  • 010 Editor
  • Kiewtai: plugin for the Hiew hex editor that makes the Kaitai parsers available
  • Hobbits: multi-platform GUI for bit-based analysis, processing, and visualization. Has a Kaitai plugin.
  • ImHex: A Hex Editor for Reverse Engineers, Programmers and people who value their retinas when working at 3 AM.
  • fq: jq for binary formats - tool, language and decoders for working with binary and text formats.

Wireshark

Wireshark is a network protocol analyzer that includes dissectors for over two thousand protocols.

  • TShark: command line version, can easily be called from shell scripts.
  • Wireshark Generic Dissector: add-on, allows dissection of a protocol based on a text description of the protocol elements
  • Wireshark Lua: dissectors can be written in Lua (Examples)
  • pyreshark: plugin providing a simple interface for writing Wireshark dissectors in Python
  • Sharktools (Python, Matlab): Tools for programmatic parsing of packet captures using Wireshark functionality
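Because TShark is a command-line program, its dissectors can also be driven from a script. The sketch below builds a TShark invocation that reads a capture file and prints selected fields, using the `-r`, `-T fields`, `-e` and `-Y` options. The helper names and the file name `capture.pcap` are hypothetical, and the run step assumes TShark is installed and on the `PATH`.

```python
import shutil
import subprocess


def build_tshark_argv(capture_path, fields, display_filter=None):
    """Build a tshark command line that prints the given fields, one packet per line."""
    argv = ["tshark", "-r", capture_path, "-T", "fields"]
    for field in fields:
        argv += ["-e", field]
    if display_filter:
        argv += ["-Y", display_filter]
    return argv


def run_tshark(argv):
    """Run tshark and return one list of field values per packet.

    Returns None when tshark is not installed.
    """
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    # With "-T fields", values are tab-separated by default.
    return [line.split("\t") for line in result.stdout.splitlines()]


# "capture.pcap" is a hypothetical file name used only for illustration.
argv = build_tshark_argv("capture.pcap", ["ip.src", "ip.dst"], display_filter="tcp")
```

Calling `run_tshark(argv)` on a real capture would then return one row per matching packet.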

Other Stand-alone Software

  • GNU poke: The extensible editor for structured binary data
  • Netzob: open source tool for reverse engineering, traffic generation and fuzzing of communication protocols
  • Cat Karat Packet Builder: packet generation tool for building custom packets for firewall or target testing
  • radare2 (C, with bindings/pipe for almost all languages): Unix-like reverse engineering framework and command-line tools. See Parsing a fileformat with radare2 and Types.
  • Veles: open source tool for binary analysis

Research papers

  • LangSec Platform: Towards a Platform to Compare Binary Parser Generators. Olivier Levillain, Sébastien Naud, Aina Toky Rasoamanana (Video)
  • Interval Parsing Grammars for File Format Parsing. Jialun Zhang, Greg Morrisett, Gang Tan
  • Narcissus: Correct-By-Construction Derivation of Decoders and Encoders from Binary Formats. Benjamin Delaware, Sorawit Suriyakarn, Clément Pit-Claudel, Qianchuan Ye, Adam Chlipala
  • EverParse: Verified Secure Zero-Copy Parsers for Authenticated Message Formats. Tahina Ramananandro et al.
  • Nail: A Practical Tool for Parsing and Generating Data Formats. Julian Bangert and Nickolai Zeldovich
  • Generic packet descriptions: Verified parsing and pretty printing of low-level data. Marcell van Geest, Wouter Swierstra
  • GAPA: Generic Application-Level Protocol Analyzer and its Language. Nikita Borisov, David J. Brumley, Helen J. Wang, Chuanxiong Guo
  • PADS/ML: a functional data description language. Y. Mandelbaum, K. Fisher, D. Walker, M. F. Fernandez, and A. Gleyzer.
  • PacketTypes: Abstract Specification of Network Protocol Messages. P. J. McCann and S. Chandra
  • Zebu: A Language-Based Approach for Improving the Robustness of Network Application Protocol Implementations. Laurent Burgy et al.
  • Zebra: Improving the Performance of Message Parsers for Embedded Systems. Jigar Solanki et al.
  • z2z: Automatic Generation of Network Protocol Gateways. Yerom-David Bromberg, Laurent Reveillere, Julia L. Lawall, Gilles Muller
  • Yakker: Semantics and Algorithms for Data-dependent Grammars. Trevor Jim, Yitzhak Mandelbaum, David Walker
  • BinPAC: Superseded by BinPAC++, which is now known as Spicy
  • FlowSifter: High-Speed Application Protocol Parsing and Extraction for Deep Flow Inspection. Alex X. Liu, Chad R. Meiners, Eric Norige, and Eric Torng
  • TSN.1: Transfer Syntax Notation One (TSN.1). A formal notation for describing messages in binary protocols
  • NetPDL: Markup Language that aims to describe Protocols from OSI layer 2 to OSI layer 7
  • Tupni: Automatic Reverse Engineering of Input Formats. Weidong Cui et al.
  • Grammar-Based Specification and Parsing of Binary File Formats. William Underwood

Lists of interesting binary formats

This is obviously rather subjective and definitely not supposed to be a complete list:

Related topics

binary-parsing's Issues

Another declarative parser framework in Python

Hi, I developed another binary parsing framework for Python. Is it possible to include it in the current framework list?

  • Here's a small comparison across the listed Python frameworks targeting binary parsing.
  • Caterpillar: A Python 3.12+ library to pack and unpack structured binary data.

Netzob URL is no longer about Netzob

It seems the link to the Netzob tool under Other Stand-alone Software leads to www.netzob.org, whose contents are no longer about Netzob. Netzob's official GitHub repository hints at this: it refers to an archived version of the site, which used to be the original homepage for Netzob, the reverse-engineering tool. To clarify:

Netzob.org before (archived from 2016) ✅ | Netzob.org now ❌ (screenshots)

I believe the domain netzob.org is still under their control, but the contents are totally irrelevant (something about jobs?) and make no mention of Netzob as a reverse engineering tool, though I do not know why. I had to Google to find their actual repository.

For the purposes of this Awesome list, it'd be better if the link to Netzob pointed to their GitHub repository, github.com/netzob/netzob, instead of netzob.org (until their site is fixed).

GNU Poke

I think it should be added to the "Other Stand-alone Software" section.
Website: GNU Poke
