
maehw / microbit-pxt-code-extractor


Extract code from Universal Hex files generated by the PXT-based Microsoft MakeCode IDE for micro:bit

License: GNU General Public License v3.0

Python 100.00%
bbc-microbit intelhex makecode-programming microbit microsoft-makecode pxt-microbit makecode-project pxt makecode makecode-graphical-programming


BBC micro:bit PXT code extractor

This project attempts to extract the code from so-called Universal Hex files generated by the PXT-based Microsoft MakeCode IDE for micro:bit (web IDE). PXT uses a technique called source embedding to add the code as (possibly compressed) text to the 0x0D ("Custom Data") records of an Intel HEX file.
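To illustrate the idea of source embedding, here is a minimal sketch that collects the data bytes of all records of one type from Intel HEX text. It parses the records by hand and is not the extractor's own code (which relies on the intelhex module). The function names `custom_data_records` and `make_record` and the sample records are hypothetical. The record type 0x0D is taken from the description above.

```python
def custom_data_records(hex_text, record_type=0x0D):
    """Yield the data bytes of every Intel HEX record of the given type."""
    for line in hex_text.splitlines():
        line = line.strip()
        if not line.startswith(":"):
            continue  # skip blank or non-record lines
        raw = bytes.fromhex(line[1:])
        length, rtype = raw[0], raw[3]
        # Intel HEX checksum: all bytes of a record sum to 0 modulo 256.
        if sum(raw) & 0xFF != 0:
            raise ValueError(f"bad checksum in record: {line}")
        if rtype == record_type:
            yield raw[4:4 + length]


def make_record(address, rtype, data):
    """Build one Intel HEX record line (hypothetical helper for the demo)."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, rtype]) + data
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()


# Synthetic sample: one ordinary data record, two custom data records, EOF.
sample = "\n".join([
    make_record(0x0000, 0x00, b"\x01\x02\x03\x04"),
    make_record(0x0000, 0x0D, b'{"compres'),
    make_record(0x0010, 0x0D, b'sion":"LZMA"}'),
    ":00000001FF",
])

embedded = b"".join(custom_data_records(sample))
print(embedded)
```

Concatenating the payloads in file order is an assumption of this sketch; a real Universal Hex file may need additional handling, for example of padding or block boundaries.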

The code extractor itself is implemented as a Python 3 script.

This project does not support extracting Python code from the BBC micro:bit; use the uBitTool for that. Once the PXT code extractor is in a sufficiently working state, it may be added to the uBitTool. Feel free to create a pull request.

Usage

  • Clone this git repository
  • Make sure the Python module dependencies are met: pip3 install intelhex (argparse and lzma are part of the Python 3 standard library)
  • Run the script from a Python 3 environment (it should run under Windows, Linux and macOS):
  usage: extract.py [-h] [file]

  extract.py

  positional arguments:
    file        path to bbc micro:bit HEX input file

  options:
    -h, --help  show this help message and exit

Warning: The Python script automatically creates an output folder named after the input file (without extension).

The following files are created by the tool and contain data from intermediate extraction steps:

  1. _code_header.json
  2. _lzma_compressed_text.bin
  3. _packed_code.txt
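The intermediate steps behind these files can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the script's actual implementation. It assumes that the compressed text uses the legacy .lzma container (`lzma.FORMAT_ALONE`) and that the packed code is a JSON code header of `headerSize` bytes followed by a JSON object mapping file names to file contents. The variable names and sample contents are hypothetical.

```python
import json
import lzma

# Synthetic stand-in for the embedded data; a real HEX file would supply it.
code_header = json.dumps({"name": "demo", "editor": "blocksprj"}).encode("utf-8")
files = json.dumps({"main.ts": "basic.showString(\"Hi\")", "pxt.json": "{}"}).encode("utf-8")
compressed = lzma.compress(code_header + files, format=lzma.FORMAT_ALONE)

# Step 1: decompress the LZMA stream (compare _lzma_compressed_text.bin).
packed = lzma.decompress(compressed, format=lzma.FORMAT_ALONE)

# Step 2: split the packed code (compare _packed_code.txt) at headerSize.
# In a real file this value would come from the embedded JSON header.
header_size = len(code_header)
header = json.loads(packed[:header_size])    # compare _code_header.json
payload = json.loads(packed[header_size:])   # assumed: file name -> contents

print(header["name"], sorted(payload))
```

Each entry of `payload` would then be written to its own file in the output folder.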

Example usage and output files:

  $ python extract.py sound-device.hex       
Input file w/o extension: sound-device
           Output folder: /Users/matthias/local_repos/microbit-pxt-code-extractor/sound-device
-------------------------------------------------------------------------
Embedded source dump:
0000  41 14 0E 2F B8 2F A2 BB 9D 00 40 10 00 00 00 00  |A.././....@.....|
0010  7B 22 63 6F 6D 70 72 65 73 73 69 6F 6E 22 3A 22  |{"compression":"|
0020  4C 5A 4D 41 22 2C 22 68 65 61 64 65 72 53 69 7A  |LZMA","headerSiz|
0030  65 22 3A 32 39 34 2C 22 74 65 78 74 53 69 7A 65  |e":294,"textSize|
0040  22 3A 31 37 35 31 30 2C 22 6E 61 6D 65 22 3A 22  |":17510,"name":"|
0050  73 6F 75 6E 64 2D 64 65 76 69 63 65 22 2C 22 65  |sound-device","e|
0060  55 52 4C 22 3A 22 68 74 74 70 73 3A 2F 2F 6D 61  |URL":"https://ma|
0070  6B 65 63 6F 64 65 2E 6D 69 63 72 6F 62 69 74 2E  |kecode.microbit.|
0080  6F 72 67 2F 22 2C 22 65 56 45 52 22 3A 22 35 2E  |org/","eVER":"5.|
0090  30 2E 31 32 22 2C 22 70 78 74 54 61 72 67 65 74  |0.12","pxtTarget|
00A0  22 3A 22 6D 69 63 72 6F 62 69 74 22 7D 5D 00 00  |":"microbit"}]..|
00B0  80 00 95 45 00 00 00 00 00 00 00 3D 88 89 C6 54  |...E.......=...T|
00C0  36 C3 17 4F E4 F9 EC 0D 07 A9 22 3E D4 1C 7C B5  |6..O......">..|.|
00D0  AF A5 88 58 62 DF 18 4A B0 53 1D A2 B3 BA 13 --  |...Xb..J.S..... |
...
-------------------------------------------------------------------------
JSON header length: 0x9D00 (157)
       Text length: 0x40100000 (4160)
          Reserved: 0x0000
-------------------------------------------------------------------------
Embedded JSON header (pretty-printed):
{
    "compression": "LZMA",
    "headerSize": 294,
    "textSize": 17510,
    "name": "sound-device",
    "eURL": "https://makecode.microbit.org/",
    "eVER": "5.0.12",
    "pxtTarget": "microbit"
}
Header size: 294
Text size: 17510
-------------------------------------------------------------------------
Text meta data:
  Length of text before truncation: 4163
   Length of text after truncation: 4160
  Text is LZMA-compressed.
  Writing LZMA compressed output text...
  Decompressing LZMA text...
Writing packed code...
-------------------------------------------------------------------------
Code header dump (pretty-printed)
{
    "name": "sound-device",
    "comment": "",
    "status": "unpublished",
    "cloudId": "pxt/microbit",
    "editor": "blocksprj",
    "targetVersions": {
        "branch": "v5.0.12",
        "tag": "v5.0.12",
        "commits": "https://github.com/microsoft/pxt-microbit/commits/97491d6832cccab6b5bdc05b58e4c6b5dcc18cdd",
        "target": "5.0.12",
        "pxt": "8.0.7"
    }
}
Writing code header JSON file...
-------------------------------------------------------------------------
Code payload analysis (pretty-printed)
  Length: 17519
   Files: ['README.md', 'main.blocks', 'main.ts', 'pxt.json', 'test.ts']
Writing file 'README.md'...
Writing file 'main.blocks'...
Writing file 'main.ts'...
Writing file 'pxt.json'...
Writing file 'test.ts'...
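The 16 bytes that precede the embedded JSON header in the dump above can be decoded with a short sketch like the following. The field layout (8-byte magic, 16-bit JSON header length, 32-bit text length, 16-bit reserved field, all little-endian) is inferred from the dump and the labels printed by the tool. It is an assumption, not a documented format. Read as little-endian, the bytes the tool prints as 0x9D00 and 0x40100000 give the decimal values 157 and 4160 shown in parentheses.

```python
import struct

# First 16 bytes of the "Embedded source dump" in the example output.
prefix = bytes.fromhex("41140E2FB82FA2BB" "9D00" "40100000" "0000")

# "<" selects little-endian without padding:
# 8-byte magic, uint16 JSON length, uint32 text length, uint16 reserved.
layout = "<8sHIH"
magic, json_len, text_len, reserved = struct.unpack(layout, prefix)

print(f"magic:              {magic.hex().upper()}")
print(f"JSON header length: {json_len}")
print(f"text length:        {text_len}")
print(f"reserved:           {reserved:#06x}")

# The JSON header starts right after the prefix (offset 0x10 in the dump);
# the text follows directly after the JSON header.
json_start = struct.calcsize(layout)
json_end = json_start + json_len
print(f"JSON spans offsets {json_start:#06x}..{json_end - 1:#06x}")
```

With these values the JSON header ends at offset 0x00AC, which matches the closing brace in the dump; the byte after it (0x5D) begins the compressed text.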

And some details for the example:

$ cd sound-device
matthias@maehcbook sound-device % ll
total 128
drwxr-xr-x  10 matthias  staff    320 31 Aug 22:22 .
drwxr-xr-x  10 matthias  staff    320 31 Aug 22:22 ..
-rw-r--r--   1 matthias  staff   1433 31 Aug 22:22 README.md
-rw-r--r--   1 matthias  staff    314 31 Aug 22:22 _code_header.json
-rw-r--r--   1 matthias  staff   4160 31 Aug 22:22 _lzma_compressed_text.bin
-rw-r--r--   1 matthias  staff  17813 31 Aug 22:22 _packed_code.txt
-rw-r--r--   1 matthias  staff  13402 31 Aug 22:22 main.blocks
-rw-r--r--   1 matthias  staff   1991 31 Aug 22:22 main.ts
-rw-r--r--   1 matthias  staff    537 31 Aug 22:22 pxt.json
-rw-r--r--   1 matthias  staff    129 31 Aug 22:22 test.ts

Contribution

Feel free to make any changes and support this project. ;)
