ideas's People

Contributors

xvilka

ideas's Issues

nanoMIPS instruction set support

nanoMIPS™ Architecture

Designed for embedded devices, nanoMIPS is a variable-length instruction set architecture (ISA) offering high performance in substantially reduced code size. Under comparable compiler flags, it can deliver up to 40% smaller code than MIPS32. With smaller memory accesses and efficient use of the instruction cache, nanoMIPS also helps to reduce system power consumption.

The nanoMIPS ISA combines recoded and new 16-, 32-, and 48-bit instructions to achieve an ideal balance of performance and code density. It incorporates all MIPS32 instructions and architecture modules including MIPS DSP and MIPS MT, as well as new instructions for advanced code size reduction.

nanoMIPS is supported in release 6 of the MIPS architecture. It is first implemented in the new MIPS I7200 multi-threaded multi-core processor series. Compiler support is included in the MIPS GNU-based development tools.

It is different from the "standard" instruction set.

MIPS_nanomips32_ISA_TRM_01_01_MD01247.pdf

Toolchain: https://github.com/MediaTek-Labs/nanomips-gnu-toolchain/releases
QEMU TCG backend: https://www.spinics.net/linux/fedora/libvir/msg217107.html

QEMU own disassembler: https://gitlab.com/qemu-project/qemu/-/blob/master/disas/nanomips.c

See also the nmips plugin for the IDA Pro.

Add support for Pokemon Mini ROMs (S1C88)

The Pokémon mini is a handheld game console by Nintendo based on a custom Epson S1C88 processor. It uses ROM cartridges for gameplay. The S1C88 is an 8-bit microcontroller with 16-bit operations. The processor provides numerous addressing modes and a 24-bit address bus (with only 21 bits mapped externally).

Instruction set - https://wiki.sublab.net/index.php/S1C88_InstructionSet
Binary Ninja disassembler plugin - https://github.com/hgarrereyn/bn-pokemon-mini
Another disassembler - https://github.com/pokemon-mini/mindis2
Compiled files - https://github.com/ubuntor/ctf_problems/tree/main/minipokemon

Add support for public signature server

Rizin lacks support for public or private signature servers, and there is an open implementation of a Lumina server: https://github.com/naim94a/lumen . Maybe we should implement support for this server? Obviously, it should be a strictly optional feature, but I'm confident that it would help Rizin become more popular for collaborative RE.

Describe the solution you'd like
Implement a Lumina client in Rizin.

Describe alternatives you've considered
Create some other public server/client based on an open source Rizin protocol.

Restore and fix macOS KDP protocol support

Back in the Radare2 days there was an unmaintained KDP plugin. The idea is to restore it, test it, and fix it to support modern macOS versions on Intel and ARM.
Moreover, just like GDB and WinKD support, it belongs in the main Rizin repository, not in extras.

See:

Rizin/Cutter integration/cooperation with GNU Poke

http://www.jemarch.net/poke

GNU poke is an interactive, extensible editor for binary data. Not
limited to editing basic entities such as bits and bytes, it provides
a full-fledged procedural, interactive programming language designed
to describe data structures and to operate on them.

,----
| (poke) dump
| 76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF
| 00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
| 00000010: 0100 3e00 0100 0000 0000 0000 0000 0000  ..>.............
| 00000020: 0000 0000 0000 0000 0802 0000 0000 0000  ................
| 00000030: 0000 0000 4000 0000 0000 4000 0b00 0a00  ....@.....@.....
| 00000040: 5548 89e5 b800 0000 005d c300 4743 433a  UH.......]..GCC:
| 00000050: 2028 4465 6269 616e 2036 2e33 2e30 2d31   (Debian 6.3.0-1
| 00000060: 382b 6465 6239 7531 2920 362e 332e 3020  8+deb9u1) 6.3.0 
| 00000070: 3230 3137 3035 3136 0000 0000 0000 0000  20170516........
| (poke) load elf
| (poke) var ehdr = Elf64_Ehdr @ 0#B
| (poke) ehdr.e_ident
| struct {
|   ei_mag=[0x7fUB,0x45UB,0x4cUB,0x46UB],
|   ei_class=0x2UB,
|   ei_data=0x1UB,
|   ei_version=0x1UB,
|   ei_osabi=0x0UB,
|   ei_abiversion=0x0UB,
|   ei_pad=[0x0UB,0x0UB,0x0UB,0x0UB,0x0UB,...],
|   ei_nident=0x0UB#B
| }
`----
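
For comparison, here is a minimal Python sketch that decodes the same header with the standard `struct` module. The bytes are copied from the first four rows of the dump above; the field layout is the standard little-endian ELF64 header, and the names `Elf64Ehdr`, `EHDR_FORMAT` and `parse_ehdr` are made up for this illustration. It only shows the kind of mapping that `Elf64_Ehdr @ 0#B` performs, not how poke or Rizin implement it.

```python
import struct
from collections import namedtuple

# First 64 bytes of the file, copied from the poke dump above.
HEADER_HEX = (
    "7f454c46020101000000000000000000"
    "01003e00010000000000000000000000"
    "00000000000000000802000000000000"
    "0000000040000000000040000b000a00"
)

# Standard ELF64 header fields, in file order.
Elf64Ehdr = namedtuple(
    "Elf64Ehdr",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)

# "<" selects little-endian with no padding (ei_data == 1 in the dump).
EHDR_FORMAT = "<16sHHIQQQIHHHHHH"


def parse_ehdr(data):
    """Map the first 64 bytes of `data` onto the ELF64 header fields."""
    return Elf64Ehdr._make(struct.unpack_from(EHDR_FORMAT, data, 0))


ehdr = parse_ehdr(bytes.fromhex(HEADER_HEX))
print(ehdr.e_ident[:4], hex(ehdr.e_machine), hex(ehdr.e_shoff), ehdr.e_shnum)
```

The design question for an integration is whether such type descriptions should live in Rizin, in poke pickles, or be shared between the two.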

Add support for Fujitsu FR and FR-V architectures

The Fujitsu FR-V (Fujitsu RISC-VLIW) is one of the very few processors ever able to process both a very long instruction word (VLIW) and vector processor instructions at the same time, increasing throughput with high parallel computing while increasing performance per watt and hardware efficiency. The family was presented in 1999. Its design was influenced by the VPP500/5000 models of the Fujitsu VP/2000 vector processor supercomputer line.

Featuring a 1–8 way very long instruction word (VLIW, Multiple Instruction Multiple Data (MIMD), up to 256 bit) instruction set it additionally uses a 4-way single instruction, multiple data (SIMD) vector processor core. A 32-bit RISC instruction set in the superscalar core is combined with most variants integrating a dual 16-bit media processor also in VLIW and vector architecture. Each processor core is superpipelined as well as 4-unit superscalar.

It is often used in devices that work with images or audio/video: DSLRs, recorders, hardware codecs, etc. For example, the Milbeaut signal processors are specialized for image processing, with the newest version additionally including an FR-V based HD video H.264 codec engine.

The Milbeaut image engines are included in the Leica S2 and Leica M (Typ 240), Nikon DSLRs (see Nikon Expeed), some Pentax K mount cameras, and the Sigma True-II processor.

See also:

image

CIL/MSIL (.NET) bytecode support

Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL) or Intermediate Language (IL), is the intermediate language binary instruction set defined within the Common Language Infrastructure (CLI) specification. CIL instructions are executed by a CLI-compatible runtime environment such as the Common Language Runtime. Languages which target the CLI compile to CIL. CIL is object-oriented, stack-based bytecode. Runtimes typically just-in-time compile CIL instructions into native code.

CIL was originally known as Microsoft Intermediate Language (MSIL) during the beta releases of the .NET languages. Due to standardization of C# and the CLI, the bytecode is now officially known as CIL. Windows Defender virus definitions continue to refer to binaries compiled with it as MSIL.

See short CIL instructions list.

Rizin-extras has basic MSIL support, but it's broken and incomplete. It also doesn't implement an analysis plugin. Moreover, the PE parsing code in RzBin might need to be fixed or updated.

It also makes sense to rename it to CIL, since that is the official name.

See

The best reference implementations to compare with are:

See CHIMERA.ZIP as a test example.

GOST hash and crypto algorithms support

Currently there is no support in the mainstream OpenSSL for GOST hash and encryption algorithms. There is an external module for OpenSSL at https://github.com/gost-engine/engine

It would be awesome to add GOST support to Rizin, either in the core library or as an external plugin:

  • librz/hash/
  • librz/hash/p/
  • librz/crypto/
  • librz/crypto/p/

Since mainstream OpenSSL doesn't support it, it's better to provide it as an external installable module via rz-pm, where compilation would check the presence of the gost-engine.

Renesas RL78 architecture support

LLVM Assembler Plugin

There is already an assembler plugin using keystone: https://github.com/rizinorg/rizin-extras/tree/master/keystone
However at its core, keystone is really just a wrapper around code ripped from llvm.
So naturally it would be an interesting option to also have an external assembly plugin that directly uses llvm, especially since it's part of pretty much every distribution, well-tested and actively maintained.

Here is an example of using the llvm-mc tool:

florian-macbook:~ florian$ echo "mov r0, #0x42" | llvm-mc-mp-13 --arch=arm --filetype=obj -o tmp.o && rz -Qc "pd 1 @ 0" tmp.o
            ;-- segment.:
            ;-- section.0..__text:
            0x00000000      4200a0e3       mov   r0, 0x42              ; 'B' ; [00] -rwx section size 4 named 0..__text

The rizin plugin however should of course not call llvm-mc but be written in C++ and call the respective API from llvm libraries.

Motorola SREC file format support

Motorola SREC is another popular text representation of binary data, along with Intel's IHEX. Like ihex://, it should be implemented as an IO plugin.

It looks something like:

S00F000068656C6C6F202020202000003C
S11F00007C0802A6900100049421FFF07C6C1B787C8C23783C6000003863000026
S11F001C4BFFFFE5398000007D83637880010014382100107C0803A64E800020E9
S111003848656C6C6F20776F726C642E0A0042
S5030003F9
S9030000FC

It should be available both for manual selection and through autodetection.

See librz/io/p/io_ihex.c on how ihex:// is implemented.
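
As a rough illustration of what such a plugin has to do, here is a minimal Python sketch that parses S-records and checks their checksums. It assumes the usual S-record layout (type digit, byte count, big-endian address, data, one's-complement checksum); the function names are made up for this example, and the actual plugin would be written in C next to io_ihex.c.

```python
# Address field width in bytes for each record type (S4 is reserved).
ADDRESS_BYTES = {0: 2, 1: 2, 2: 3, 3: 4, 5: 2, 6: 3, 7: 4, 8: 3, 9: 2}

SAMPLE = """\
S00F000068656C6C6F202020202000003C
S11F00007C0802A6900100049421FFF07C6C1B787C8C23783C6000003863000026
S11F001C4BFFFFE5398000007D83637880010014382100107C0803A64E800020E9
S111003848656C6C6F20776F726C642E0A0042
S5030003F9
S9030000FC
"""


def parse_srec_line(line):
    """Parse one S-record and return (record_type, address, data)."""
    line = line.strip()
    if len(line) < 10 or line[0] != "S" or not line[1].isdigit():
        raise ValueError("not an S-record: %r" % line)
    rtype = int(line[1])
    if rtype not in ADDRESS_BYTES:
        raise ValueError("unsupported record type S%d" % rtype)
    raw = bytes.fromhex(line[2:])
    count, body, checksum = raw[0], raw[1:-1], raw[-1]
    # The count covers address, data and checksum bytes.
    if count != len(raw) - 1:
        raise ValueError("byte count mismatch")
    # Checksum: one's complement of the low byte of the sum.
    if (~(count + sum(body))) & 0xFF != checksum:
        raise ValueError("bad checksum")
    addr_len = ADDRESS_BYTES[rtype]
    address = int.from_bytes(body[:addr_len], "big")
    return rtype, address, body[addr_len:]


def load_srec(text):
    """Collect the data records (S1/S2/S3) into an address -> byte map."""
    memory = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        rtype, address, data = parse_srec_line(line)
        if rtype in (1, 2, 3):
            for i, value in enumerate(data):
                memory[address + i] = value
    return memory


memory = load_srec(SAMPLE)
print(len(memory), bytes(memory[a] for a in range(0x38, 0x46)))
```

The leading `S` plus a type digit and a valid checksum on the first line is probably enough of a signature for autodetection.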

Better support for HP PA-RISC architecture

PA-RISC is an instruction set architecture (ISA) developed by Hewlett-Packard. As the name implies, it is a reduced instruction set computer (RISC) architecture, where the PA stands for Precision Architecture. The design is also referred to as HP/PA for Hewlett Packard Precision Architecture.

The architecture was introduced on 26 February 1986, when the HP 3000 Series 930 and HP 9000 Model 840 computers were launched featuring the first implementation, the TS1.

PA-RISC has been succeeded by the Itanium (originally IA-64) ISA, jointly developed by HP and Intel. HP stopped selling PA-RISC-based HP 9000 systems at the end of 2008 but supported servers running PA-RISC chips until 2013.

A significant amount of valuable proprietary software was created for this platform, so it might be of interest to reverse engineers.

Rizin currently supports disassembly of PA-RISC, but the disassembler is based on binutils and there is no analysis support. Ideally, it should be reimplemented under an LGPL-compatible license: LGPL itself or MIT.

image
PA-RISC 1.1 Architecture and Instruction Set Reference Manual
PA-RISC 2.0 Architecture and Instruction Set Reference Manual

See also:

See also rizinorg/rizin#1927

Implement block and function-level folding in visual mode

For example, at function level it could look like the old implementation:

rizin /bin/ls
[0x00006b60]> aaa
[0x00006b60]> afF
[0x00006b60]> Vp

It marks the first function as "folded" and outputs it like:

image

As @thestr4ng3r suggested:

I think instead of folding individual blocks and functions, it might make more sense to be able to make groups of blocks and then fold/unfold such groups. So you can get a more high-level graph with maybe some complex parts being collapsed into a single folded block.

Intel i860/i960 support

Intel's i960 (or 80960) was a RISC design that became popular during the early 1990s as an embedded microcontroller. It became a best-selling CPU in that segment, along with the competing AMD 29000 (Supported by Rizin). In spite of its success, Intel stopped marketing the i960 in the late 1990s, as a result of a settlement with DEC whereby Intel received the rights to produce the StrongARM CPU.

Texas Instruments TMS320 C2000 series support

C2000 microcontroller family consists of 32-bit microcontrollers with performance integrated peripherals designed for real-time control applications. C2000 consists of 5 sub-families: the newer C28x + ARM Cortex M3 series, C28x Delfino floating-point series, C28x Piccolo series, C28x fixed-point series, and C240x, an older 16-bit line that is no longer recommended for new development. The C2000 series is notable for its high performance set of on-chip control peripherals including PWM, ADC, quadrature encoder modules, and capture modules. The series also contains support for I²C, SPI, serial (SCI), CAN, watchdog, McBSP, external memory interface and GPIO. Due to features like PWM waveform synchronization with the ADC unit, the C2000 line is well suited to many real-time control applications. The C2000 family is used for applications like motor drive and control, industrial automation, solar and other renewable energy, server farms, digital power, power-line communications, and lighting. A line of low cost kits are available for key applications including motor control, digital power, solar, and LED lighting.

They are common in on-grid and off-grid solar power inverters, and both the original PIP units and their clones have firmware issues.

You can find test firmwares: https://forums.aeva.asn.au/viewtopic.php?f=64&t=4332

There is also a datasheet:
https://www.ti.com/lit/ug/spruh18h/spruh18h.pdf?ts=1642277203844

https://www.ti.com/microcontrollers-mcus-processors/microcontrollers/c2000-real-time-control-mcus/overview.html

GOFF file format

The GOFF (Generalized Object File Format) specification was developed for IBM's MVS operating system to supersede the IBM OS/360 Object File Format to compensate for weaknesses in the older format.

It is currently used on z/OS and z/VM.

See these files in the LLVM tree for details about the format:

  • llvm/lib/Object/GOFFObjectFile.cpp
  • llvm/include/llvm/BinaryFormat/GOFF.h
  • llvm/include/llvm/Object/GOFFObjectFile.h
  • llvm/include/llvm/Object/GOFF.h

See also:

MIL-STD-1750A CPU architecture support

MIL-STD-1750A or 1750A is the formal definition of a 16-bit computer instruction set architecture (ISA), including both required and optional components, as described by the military standard document MIL-STD-1750A (1980). Since August 1996 it has been inactive for new designs.

image

In addition to the core ISA, the definition defines optional instructions, such as a FPU and MMU. Importantly, the standard does not define the implementation details of a 1750A processor.

Examples of MIL-STD-1750A implementations include:

  • CPU Technology, Inc. CPU1750A-FB, a high performance 1750A SOC designed to give existing applications a late life performance boost.
  • Delco Electronics Magic V 1750 Processor
  • Dynex Semiconductor MAS281. A radiation hardened SOC implementation on a 64-pin multichip module with an optional MMU.
  • GEC-Plessey RH1750, a radiation-hardened version for aerospace and space flight applications. GEC-Plessey, under its previous incarnation as Marconi Electronic Devices, also initially developed the MAS281 and MA31750A series of processors, later made available through Dynex Semiconductor.
  • Honeywell HX1750, fabricated on Honeywell's Silicon on Insulator CMOS (SOI-IV) process giving radiation hardness. The HX1750 includes an FPU and peripherals on chip.
  • Johns Hopkins University Applied Physics Laboratory (JHU/APL) MIL-STD-1750AAV space flight qualified processor. A multi-board silicon on sapphire implementation specifically designed for space flight.
  • Marconi Electronic Devices MIL-STD-1750A.
  • McDonnell-Douglas MD-281. A radiation hardened SoS three die implementation on a 64-pin multichip module.
  • National Semiconductor F9450 series.
  • Pyramid Semiconductor PACE P1750A. The family includes the P1750A CPU, the P1750AE Enhanced CPU, the P1753 Memory Management Unit (MMU), the P1754 Processor Interface Chip (PIC) and the P1757ME Multi-Chip Module. This line was acquired from Performance Semiconductor in 2003.
  • Royal Aircraft Establishment Farnborough MIL-STD-1750A implementation in AMD 2901 bit-slice technology.

HX1750-Datasheet.pdf
mil-std-1750a-1.7.pdf
ut1750micro.pdf

remote handling: explore and understand

This issue is to discuss and keep track of exploration/ideas about remote handling (all the = commands).
Right now there are multiple remote protocols supported: rap,tcp,http,raps (at least).

We should explore current state and see what exactly each one of these does and if we can remove custom protocols in favour of tcp/http, if nothing special is done in rap.

Also, it should be investigated how to deal with commands that should be executed through a remote connection without having a completely separated function. Right now when remote mode is enabled, all commands go through rz_core_rtr_cmd.

Support Intersil/Harris RTX2000 architecture

The RTX2010, manufactured by Intersil, is a radiation hardened stack machine microprocessor which has been used in numerous spacecraft.

It is a two-stack machine, each stack 256 words deep, that supports direct execution of Forth. Subroutine calls and returns take only one processor cycle, and it has a very low and consistent interrupt latency of only four processor cycles, which makes it well suited to realtime applications.

Example spacecraft that use the RTX2010:

See:

Support AmigaOS and MorphOS

AmigaOS

AmigaOS is a family of proprietary native operating systems of the Amiga and AmigaOne personal computers. It was developed first by Commodore International and introduced with the launch of the first Amiga, the Amiga 1000, in 1985. Early versions of AmigaOS required the Motorola 68000 series of 16-bit and 32-bit microprocessors. Later versions were developed by Haage & Partner (AmigaOS 3.5 and 3.9) and then Hyperion Entertainment (AmigaOS 4.0-4.1). A PowerPC microprocessor is required for the most recent release, AmigaOS 4.

image

AmigaOS is a single-user operating system based on a preemptive multitasking kernel called Exec.

MorphOS

MorphOS is an AmigaOS-like computer operating system. It is a mixed proprietary and open source OS produced for the Pegasos PowerPC (PPC) processor-based computer, PowerUP accelerator equipped Amiga computers, and a series of Freescale development boards that use the Genesi firmware, including the Efika and mobileGT. Since MorphOS 2.4, Apple's Mac mini G4 is supported as well, and with the release of MorphOS 2.5 and MorphOS 2.6 the eMac and Power Mac G4 models are respectively supported. The release of MorphOS 3.2 added limited support for Power Mac G5. The core, based on the Quark microkernel, is proprietary, although several libraries and other parts are open source, such as the Ambient desktop.

image

Renesas M32R architecture support

The M32R is a 32-bit RISC instruction set architecture (ISA) developed by Mitsubishi Electric for embedded microprocessors and microcontrollers. The ISA is now owned by Renesas Electronics Corporation, and the company designs and fabricates M32R implementations. M32R processors are used in embedded systems such as Engine Control Units, digital cameras, and PDAs.

Generate disassemblers from QEMU's decodetree

QEMU generates (some of) its disassembling C code from so-called decodetrees, specified for example at https://gitlab.com/qemu-project/qemu/-/blob/31f59af395922b7f40799e75db6e15ff52d8f94a/target/arm/a32.decode

In qemu, this then results in code like this:

...
    switch ((insn >> 25) & 0x7) {
    case 0x0:
        /* ....000. ........ ........ ........ */
        switch (insn & 0x01000010) {
        case 0x00000000:
            /* ....0000 ........ ........ ...0.... */
            disas_a32_extract_s_rrr_shi(ctx, &u.f_s_rrr_shi, insn);
            switch ((insn >> 21) & 0x7) {
            case 0x0:
                /* ....0000 000..... ........ ...0.... */
                /* ../target/arm/a32.decode:62 */
                if (trans_AND_rrri(ctx, &u.f_s_rrr_shi)) return true;
                break;
            case 0x1:
                /* ....0000 001..... ........ ...0.... */
                /* ../target/arm/a32.decode:63 */
                if (trans_EOR_rrri(ctx, &u.f_s_rrr_shi)) return true;
                break;
...

It is documented in more detail here: https://qemu.readthedocs.io/en/latest/devel/decodetree.html
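
To make the idea concrete, here is a minimal Python sketch of the matching step only. It takes the bit patterns shown in the comments of the generated code above and turns each into a mask/value pair, then matches an instruction word against a flat table. This is not QEMU's decodetree input syntax and not how decodetree.py is structured (the real generator emits nested switches and field extraction); all names are made up for the illustration.

```python
def pattern_to_mask_value(pattern):
    """Convert a 32-bit pattern of '0', '1' and '.' into (mask, value)."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 pattern bits, got %d" % len(bits))
    mask = value = 0
    for ch in bits:  # most significant bit first
        mask <<= 1
        value <<= 1
        if ch in "01":
            mask |= 1          # fixed bit: must match
            value |= int(ch)
        elif ch != ".":
            raise ValueError("bad pattern character %r" % ch)
    return mask, value


# Patterns copied from the comments in the generated code above.
PATTERNS = [
    ("AND_rrri", "....0000 000..... ........ ...0...."),
    ("EOR_rrri", "....0000 001..... ........ ...0...."),
]
TABLE = [(name,) + pattern_to_mask_value(p) for name, p in PATTERNS]


def decode(insn):
    """Return the name of the first matching pattern, or None."""
    for name, mask, value in TABLE:
        if (insn & mask) == value:
            return name
    return None


print(decode(0xE0010002), decode(0xE0210002), decode(0xE3A00042))
```

A generator for Rizin would additionally have to emit operand extraction and mnemonic formatting, which is where most of the work lies.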

The generator is here: https://gitlab.com/qemu-project/qemu/-/blob/31f59af395922b7f40799e75db6e15ff52d8f94a/scripts/decodetree.py
And they call it from their build system like so: https://gitlab.com/qemu-project/qemu/-/blob/31f59af395922b7f40799e75db6e15ff52d8f94a/meson.build#L2593-2599

For rizin, it will probably make more sense to have the generator somewhere outside the main repo and update a generated version in the rizin repo whenever needed, like rz-hexagon: https://github.com/rizinorg/rz-hexagon

Support SerenityOS

SerenityOS is a free and open source desktop operating system in continuous development since 2018. Initially the one-man project of Swedish programmer Andreas Kling, SerenityOS is now developed by a community of hobbyists. The system supports the x86 and x86-64 instruction sets, with an ARM port in progress, features a preemptive kernel, and hosts multiple complex applications, including its own web browser and integrated development environment. SerenityOS is written in C++.

image

Adding a port would amount to minor fixes in the code and adding a "rizin" package here: https://github.com/SerenityOS/serenity/tree/master/Ports

Use Random Words for constructing Names of Functions/Vars/...

Docker for example has the feature that it chooses a random name from readable words for any new container:
20201217_11h39m29s_grim
see for example https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go

This could be quite useful when constructing names for functions, variables and others when there are no symbols. So for example in this case:
random-before
You could instead see something like:
random-after
From the second screenshot, it becomes much easier to see that the first two calls are to the same function. Also, you can re-recognize functions in other code more easily.

Useful for:

  • Functions
  • Variables
  • ... what else?

Some aspects:

  • Names could be chosen deterministically by seeding on for example the address. This means any time you run aaa for example, you will get the same names, at least until the word list or algorithm is changed.
  • The actual address should probably still be part of the final name because otherwise:
    • it is hard to distinguish from functions whose names actually come from some evidence like a symbol
    • clashes could occur
  • Should probably be an opt-in feature as it might be undesired or confusing.
  • The word list should be chosen well. No profanity of course, but also no words that are too complicated. The entire goal is to make names as memorable as possible.
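
A minimal Python sketch of the deterministic variant described above, assuming the address is used as the seed and kept as a suffix. The word lists, the `fcn.` prefix layout and the function name `readable_name` are placeholders made up for this illustration; a real word list would need to be much larger to keep collisions rare.

```python
import hashlib

# Tiny placeholder word lists; a real list would be far larger.
ADJECTIVES = ["brave", "calm", "eager", "gentle", "happy", "quiet", "swift", "wise"]
NOUNS = ["otter", "falcon", "maple", "river", "lantern", "pebble", "comet", "harbor"]


def readable_name(address, prefix="fcn"):
    """Derive a stable, readable name from an address.

    The words depend only on the address and the word lists, so repeated
    analysis runs give the same name. The address stays in the name to
    avoid clashes and to mark the name as generated.
    """
    digest = hashlib.sha256(address.to_bytes(8, "little")).digest()
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    noun = NOUNS[digest[1] % len(NOUNS)]
    return "%s.%s_%s.%08x" % (prefix, adjective, noun, address)


for addr in (0x6B60, 0x6B60, 0x7A10):
    print(readable_name(addr))
```

Hashing the address rather than seeding a random generator keeps the result independent of the order in which functions are discovered.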

Support AIX operating system

AIX (Advanced Interactive eXecutive), is a proprietary Unix operating system developed and sold by IBM for several of its computer platforms. Originally released for the IBM RT PC RISC workstation in 1986, AIX has supported a wide variety of hardware platforms, including the IBM RS/6000 series and later Power and PowerPC-based systems, IBM System i, System/370 mainframes, PS/2 personal computers, and the Apple Network Server. It is currently supported on IBM Power Systems alongside IBM i and Linux.

image

AIX is based on UNIX System V with 4.3BSD-compatible extensions. It is certified to the UNIX 03 and UNIX V7 marks of the Single UNIX Specification, beginning with AIX versions 5.3 and 7.2 TL5, respectively. Older versions were previously certified to the UNIX 95 and UNIX 98 marks.

Depends also on #33

Elbrus (E2k) architecture support

The Elbrus 2000, E2K (Russian: Эльбрус 2000) is a Russian 512-bit wide VLIW microprocessor developed by Moscow Center of SPARC Technologies (MCST) and fabricated by TSMC.

image

It supports two instruction set architectures (ISA):

  • Elbrus VLIW
  • Intel x86 (a complete, system-level implementation with a software dynamic binary translation virtual machine, similar to Transmeta Crusoe)

elbrus_prog_2020-05-30.pdf

https://github.com/nrdmn/elbrus-docs

There is a QEMU-based emulator: https://git.mentality.rip/OpenE2K/qemu-e2k
Binutils: https://git.mentality.rip/OpenE2K/binutils-gdb
Linux Kernel: https://git.mentality.rip/OpenE2K/linux

More sources at https://github.com/free-src/free-src

A major challenge would be instruction addressing: instructions do not necessarily start and end at byte-aligned offsets, and a boundary can fall in the middle of a byte.

EBCDIC character support

Extended Binary Coded Decimal Interchange Code (EBCDIC; /ˈɛbsɪdɪk/) is an eight-bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems. It descended from the code used with punched cards and the corresponding six-bit binary-coded decimal code used with most of IBM's computer peripherals of the late 1950s and early 1960s. It is supported by various non-IBM platforms, such as Fujitsu-Siemens' BS2000/OSD, OS-IV, MSP, and MSP-EX, the SDS Sigma series, Unisys VS/9, Burroughs MCP and ICL VME.

Rizin should have

  • ASCII<->EBCDIC conversion
  • Unicode <-> EBCDIC conversion
  • EBCDIC strings autodetection
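
A minimal Python sketch of the three items above, using the `cp037` codec from the Python standard library as one concrete EBCDIC code page (an assumption; EBCDIC has many national variants, and Rizin would need its own C tables). The detection heuristic and all function names are made up for this illustration and are far too crude for real use.

```python
def text_to_ebcdic(text, codec="cp037"):
    """Unicode (and therefore ASCII) text -> EBCDIC bytes."""
    return text.encode(codec)


def ebcdic_to_text(data, codec="cp037"):
    """EBCDIC bytes -> Unicode text."""
    return data.decode(codec)


def looks_like_ebcdic(data, codec="cp037", threshold=0.9):
    """Crude heuristic: the buffer decodes to mostly printable characters
    as EBCDIC, but would not pass as plain ASCII text."""
    if not data:
        return False
    decoded = data.decode(codec)
    printable = sum(ch.isprintable() for ch in decoded) / len(decoded)
    ascii_like = sum(0x20 <= b < 0x7F for b in data) / len(data)
    return printable >= threshold and ascii_like < threshold


sample = text_to_ebcdic("HELLO WORLD")
print(sample.hex(), ebcdic_to_text(sample), looks_like_ebcdic(sample))
```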

See:

The contributed code should be licensed under the LGPL (or a more permissive license). There is some existing code in librz/magic/* but it's incomplete and not integrated into Rizin itself.

The desired location for code working with this encoding is librz/util/*

See also rizinorg/rizin#1052

Provide the mechanism for the API deprecation

While we are before the 1.x versions, it's relatively safe to change the API as we wish. After releasing the 1.x version, that will no longer be the case. Thus I propose the following way to deprecate the API:

  • Define the deprecation window. Usually it's one major release: the first release deprecates the API, the second removes the deprecated API
  • Keep the deprecated APIs as wrappers around the newer API whenever possible
  • Mark them somehow in the function signature - RZ_DEPRECATED macro or something similar
  • Add a preprocessor/compiler warning, e.g. #warning, so that developers who use this API notice
  • If there are non-C bindings, provide similar mechanisms to warn developers there too
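
For the last point, here is a minimal sketch of what such a mechanism could look like in a Python binding, using the standard `warnings` module. The decorator and both API functions (`open_file`, `open_readonly`) are hypothetical and exist only for this illustration; they are not part of any Rizin binding.

```python
import functools
import warnings


def deprecated(replacement, remove_in):
    """Mark a function as deprecated and name its replacement."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                "%s is deprecated and will be removed in %s; use %s instead"
                % (func.__name__, remove_in, replacement),
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def open_file(path, writable=False):
    """Hypothetical newer API."""
    return {"path": path, "writable": writable}


@deprecated(replacement="open_file", remove_in="the next major release")
def open_readonly(path):
    """Hypothetical older API, kept as a thin wrapper around the new one."""
    return open_file(path, writable=False)
```

This mirrors the C side of the proposal: the old entry point stays as a wrapper for one deprecation window and warns every caller.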

Add support for Itanium (IA-64) architecture

IA-64 (also called Intel Itanium architecture) is the instruction set architecture (ISA) of the Itanium family of 64-bit Intel microprocessors. The basic ISA specification originated at Hewlett-Packard (HP), and was evolved and then implemented in a new processor microarchitecture by Intel with HP's continued partnership and expertise on the underlying EPIC design concepts. In order to establish what was their first new ISA in 20 years and bring an entirely new product line to market, Intel made a massive investment in product definition, design, software development tools, OS, software industry partnerships, and marketing. To support this effort Intel created the largest design team in their history and a new marketing and industry enabling team completely separate from x86. The first Itanium processor, codenamed Merced, was released in 2001.

The Itanium architecture is based on explicit instruction-level parallelism, in which the compiler decides which instructions to execute in parallel. This contrasts with superscalar architectures, which depend on the processor to manage instruction dependencies at runtime. In all Itanium models, up to and including Tukwila, cores execute up to six instructions per clock cycle.

In 2008, Itanium was the fourth-most deployed microprocessor architecture for enterprise-class systems, behind x86-64, Power ISA, and SPARC.

IA-64 Application Instruction Set Architecture Guide

image

See also: capstone-engine/capstone#714

Implement binary `sed` alternative

To allow easy patching of the binary files.
A good example of such a feature is the bbe tool: https://sourceforge.net/projects/bbe-/

bbe - binary block editor
Synopsis
bbe [options]...
Description
bbe is a sed-like editor for binary files. It performs binary transformations on the blocks of input stream.
Options
bbe accepts the following options:

-b, --block=BLOCK
    Block definition. 
-e, --expression=COMMAND
    Add the COMMAND to the commands to be executed. 
-f, --file=script-file
    Add the contents of script-file to commands. 
-o, --output=name
    Write output to name instead of standard output. 
-s, --suppress
    Suppress normal output, print only block contents. 
-?, --help
    List all available options and their meanings. 
-V, --version
    Show version of program.

BLOCK can be defined as:

N:M
    Where N'th byte starts a M bytes long block (first byte is 0). 
:M
    Block length in input stream is M. 
/start/:M
    String start starts M bytes long block. 
/start/:/stop/
    String start starts the block and block ends to string stop. 
/start/:
    String start starts the block and block will end at next occurrence of start. Only the first start is included to the block. 
:/stop/
    Block starts at the beginning of input stream (or at the end of previous block) and ends at the next occurrence of stop. String stop will be included to the block.

Special value '$' of M means the end of stream.

Default value for block is 0:$, meaning the whole input stream.

Both start and stop strings are included to block. Nonprintable characters can be escaped as

\nnn
    decimal 
\xnn
    hexadecimal 
\0nnn
    octal

Character '\' can be escaped as '\\'. Escape codes '\a','\b','\t','\n','\v','\f','\r' and '\;' can also be used.

Length (N and M) can be defined as decimal (n), hexadecimal (xn) or octal (0n) value.
Command Synopsis
bbe has two types of commands: block and byte commands, both are always related to current block. That means that the input stream outside of block remains untouched.
Block commands

D [n]
    Delete the n'th block. Without n, all found blocks are deleted from the output stream. 
I string
    Insert the string string before the block. 
A string
    Append the string string at the end of block. 
J n
    Skip n blocks before executing commands after this command. 
L n
    Leave all blocks unmodified starting from block number n. Affects only commands after this command. 
N
    Before printing a block, the file name in which the block starts is printed. 
F f
    Before printing a block, the input stream offset at the beginning of the block is printed. f can be H, D or O for Hexadecimal, Decimal or Octal format of offset. 
B f
    Before printing a block, the block number is printed (first block == 1) f can be H, D or O for Hexadecimal, Decimal or Octal format of block number. 
> file
    Before printing a block, the contents of file file is printed. 
< file
    After printing a block, the contents of file file is printed.

Byte commands
n in byte commands is offset from the beginning of current block (starts from zero).

r n string
    Replace bytes starting at position n with string string. 
i n string
    Insert string starting at position n. 
p format
    The contents of block is printed in format defined by format. format can have any of the formats H, D, O, A and B for Hexadecimal, Decimal, Octal, Ascii and Binary. 
s/search/replace/
    Replace all occurrences of search with replace. 
y/source/dest/
    Translate bytes in source to the corresponding bytes in dest. Source and dest must have equal length. 
d n m|*
    Delete m bytes starting from the offset n. If * is defined instead of m, then all bytes starting from n are deleted. 
c from to
    Convert bytes from format from to to. Currently supported formats are: 
BCD
    Binary coded decimal 
ASC
    Ascii 
j n
    Commands after the j-command are ignored for first n bytes of the block. 
l n
    Commands after the l-command are ignored from n'th byte of the block. 
w file
    Write bytes from the current block to file file. Commands before w-command have effect to what will be written. %B or %nB in file will be replaced by current block number. n in %nB is field length, leading zero in n causes the block number to be left padded with zeroes. 
& c
    Performs binary and with c. 
| c
    Performs binary or with c. 
^ c
    Performs binary xor with c. 
~
    Performs binary negation. 
u n c
    All bytes from start of the block to offset n are replaced by c. 
f n c
    All bytes starting from offset n to end of the block are replaced by c. 
x
    Exchange the contents of nibbles (half an octet) of bytes.

Nonvisible characters in strings can be escaped the same way as in block definition strings. The '/' delimiter in the s and y commands can be replaced by any visible character.

Note that the D, A, I, F, B, c, s, i, y, p, <, > and d commands cause the length of input and output streams to be different.
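
To make the byte commands above more concrete, here is a minimal Python sketch of how a few of them (x, ^, ~, y and d) could act on one block. It is not bbe's implementation: the function names are hypothetical, and it assumes that the bitwise commands are applied to every byte of the block independently.

```python
from typing import Optional


def swap_nibbles(block: bytes) -> bytes:
    # x: exchange the high and low nibble of every byte
    return bytes(((b << 4) & 0xF0) | (b >> 4) for b in block)


def xor_with(block: bytes, c: int) -> bytes:
    # ^ c: binary xor of every byte with c (assumed to be per byte)
    return bytes(b ^ c for b in block)


def negate(block: bytes) -> bytes:
    # ~: binary negation of every byte, kept within 8 bits
    return bytes(~b & 0xFF for b in block)


def translate(block: bytes, source: bytes, dest: bytes) -> bytes:
    # y/source/dest/: map each byte of source to the byte at the same
    # index in dest; both must have equal length
    if len(source) != len(dest):
        raise ValueError("source and dest must have equal length")
    return block.translate(bytes.maketrans(source, dest))


def delete(block: bytes, n: int, m: Optional[int] = None) -> bytes:
    # d n m|*: delete m bytes starting at offset n; m=None stands in for '*'
    if m is None:
        return block[:n]
    return block[:n] + block[n + m:]
```

Note that delete is the only one of these that changes the block length, which matches the remark about d above.
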
Examples

bbe -e "s/c:\\temp\\data1.txt/c:\\temp\\data2.txt/" file1
    All occurrences of "c:\temp\data1.txt" in file file1 are changed to "c:\temp\data2.txt". 
bbe -b 0420:16 -e "r 4 \x12\x4a" file1
    Two bytes starting at fifth byte of a 16 byte long block starting at offset 0420 (octal) in file1 are changed to hexadecimal values 12 and 4a. 
bbe -b :16 -e "A \x0a" file1
    Newline is added after every block, block length is 16.
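
The last two examples can be mimicked in a few lines of Python. This is only a sketch of the described behaviour, not of bbe itself: the function names are hypothetical, and it assumes that the replacement string fits inside the block and that a trailing short block also gets the appended string.

```python
def replace_in_block(data: bytes, start: int, length: int,
                     n: int, string: bytes) -> bytes:
    # Like: bbe -b START:LENGTH -e "r N STRING"
    # Only the single block at [start, start + length) is edited.
    block = data[start:start + length]
    patched = block[:n] + string + block[n + len(string):]
    return data[:start] + patched + data[start + length:]


def append_after_each_block(data: bytes, length: int, suffix: bytes) -> bytes:
    # Like: bbe -b :LENGTH -e "A SUFFIX"
    # The whole input is divided into blocks of `length` bytes.
    blocks = (data[off:off + length] for off in range(0, len(data), length))
    return b"".join(block + suffix for block in blocks)


# Usage: patch two bytes in the 16-byte block at octal offset 0420,
# then put a newline after every 16 bytes.
image = bytes(0o420 + 16)
patched = replace_in_block(image, 0o420, 16, 4, b"\x12\x4a")
dumped = append_after_each_block(b"a" * 32, 16, b"\x0a")
```

The first function keeps the input length unchanged, while the second one grows the output by one suffix per block.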

Renesas RX architecture support

The RX microcontroller (MCU) family uses a 32-bit enhanced Harvard CISC architecture.

There is an LLVM port for RX: https://lists.llvm.org/pipermail/llvm-dev/2020-April/140546.html
Thus, it is theoretically possible to use the auto-sync procedure to add RX support to Capstone (see the TriCore support in Capstone as a good example).

It is also supported by QEMU: https://gitlab.com/qemu-project/qemu/-/tree/master/target/rx

XCOFF file format

XCOFF, for "eXtended COFF", is an improved and expanded version of the COFF object file format defined by IBM and used in AIX. Early versions of the PowerPC Macintosh also supported XCOFF, as did BeOS.

XCOFF additions include the use of CSECTs to provide subsection granularity of cross-references, and the use of stabs for debugging. Information for the handling of shared libraries is also more elaborate than for plain COFF.

More recently, IBM defined an XCOFF64 version supporting 64-bit AIX, and used XCOFF32 to mean the original file format.
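
A loader would first need to tell the two variants apart. The sketch below does this from the 16-bit big-endian magic number at the start of the file header. The magic values (0x01DF for XCOFF32, 0x01F7 for XCOFF64) are quoted from memory of IBM's AIX documentation and should be treated as an assumption to verify against the specification; the function name is hypothetical.

```python
import struct
from typing import Optional

# Assumed magic numbers; verify against the IBM XCOFF documentation.
XCOFF32_MAGIC = 0x01DF
XCOFF64_MAGIC = 0x01F7


def xcoff_flavor(header: bytes) -> Optional[str]:
    # Return "XCOFF32", "XCOFF64", or None if the magic is not recognised.
    if len(header) < 2:
        return None
    (magic,) = struct.unpack(">H", header[:2])
    return {XCOFF32_MAGIC: "XCOFF32", XCOFF64_MAGIC: "XCOFF64"}.get(magic)
```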

Support OpenVMS binaries

OpenVMS, often referred to as just VMS, is a multi-user, multiprocessing and virtual memory-based operating system. It is designed to support time-sharing, batch processing, transaction processing and workstation applications. Customers using OpenVMS include banks and financial services, hospitals and healthcare, telecommunications operators, network information services, and industrial manufacturers. During the 1990s and 2000s, there were approximately half a million VMS systems in operation worldwide.


It was first announced by Digital Equipment Corporation (DEC) as VAX/VMS (Virtual Address eXtension/Virtual Memory System) alongside the VAX-11/780 minicomputer in 1977. OpenVMS has subsequently been ported to run on DEC Alpha systems, the Itanium-based HPE Integrity Servers, and select x86-64 hardware and hypervisors. Since 2014, OpenVMS has been developed and supported by VMS Software Inc. (VSI). OpenVMS offers high availability through clustering, the ability to distribute the system over multiple physical machines. This allows clustered applications and data to remain continuously available while operating system software and hardware maintenance and upgrades are performed, or if part of the cluster is destroyed. VMS cluster uptimes of 17 years have been reported.

Also depends on #2

Add support for DEC Alpha architecture

DEC Alpha, originally known as Alpha AXP, is a 64-bit reduced instruction set computing (RISC) instruction set architecture (ISA) developed by Digital Equipment Corporation (DEC), designed to replace their 32-bit VAX complex instruction set computer (CISC) ISA. Alpha was implemented in microprocessors originally developed and fabricated by DEC. These microprocessors were most prominently used in a variety of DEC workstations and servers, which eventually formed the basis for almost all of their mid-to-upper-scale lineup. Several third-party vendors also produced Alpha systems, including PC form factor motherboards.

Operating systems that supported Alpha included OpenVMS (previously known as OpenVMS AXP), Tru64 UNIX (previously known as DEC OSF/1 AXP and Digital UNIX), Windows NT (discontinued after NT 4.0; and pre-release Windows 2000 RC1), Linux (Debian, SUSE, Gentoo and Red Hat), BSD UNIX (NetBSD, OpenBSD and FreeBSD up to 6.x), Plan 9 from Bell Labs, as well as the L4Ka::Pistachio kernel. The Alpha architecture was sold, along with most parts of DEC, to Compaq in 1998. Compaq, already an Intel customer, phased out Alpha in favor of the forthcoming Hewlett-Packard/Intel Itanium architecture, and sold all Alpha intellectual property to Intel in 2001, effectively killing the product. Hewlett-Packard purchased Compaq later that same year, continuing development of the existing product line until 2004, and selling Alpha-based systems, largely to the existing customer base, until April 2007.

A significant amount of proprietary and valuable software was created for this platform, so it may be of interest to reverse engineers.

Alpha Architecture Reference Manual


See also rizinorg/rizin#1925

Depends on capstone-engine/capstone#2071

Visual Basic P-code support

P-Code is a name for several of Microsoft's proprietary intermediate languages. They provided an alternate binary format to machine code. At various times, Microsoft have said p-code is an abbreviation for either packed code or pseudo code.

Microsoft p-code was used in Visual C++ and Visual Basic. Like other p-code implementations, Microsoft p-code enabled a more compact executable at the expense of slower execution.

https://decoded.avast.io/davidzimmer/vb6-p-code-disassembly/

Support HP-UX

HP-UX (from "Hewlett Packard Unix") is Hewlett Packard Enterprise's proprietary implementation of the Unix operating system, based on Unix System V (initially System III) and first released in 1984. Current versions support HPE Integrity Servers, based on Intel's Itanium architecture.

Earlier versions of HP-UX supported the HP Integral PC and HP 9000 Series 200, 300, and 400 computer systems based on the Motorola 68000 series of processors, the HP 9000 Series 500 computers based on HP's proprietary FOCUS architecture, and later HP 9000 Series models based on HP's PA-RISC instruction set architecture.


It can be easily downloaded and run in QEMU: https://supratim-sanyal.blogspot.com/2020/01/hpux-1020-on-hp-9000778-pa-risc-guest.html

Support OS/2, eComStation, and ArcaOS

OS/2 (Operating System/2) is a series of computer operating systems, initially created by Microsoft and IBM under the leadership of IBM software designer Ed Iacobucci. As a result of a feud between the two companies over how to position OS/2 relative to Microsoft's new Windows 3.1 operating environment, the two companies severed the relationship in 1992, and OS/2 development fell to IBM exclusively. The name stands for "Operating System/2" because it was introduced as part of the same generation change release as IBM's "Personal System/2 (PS/2)" line of second-generation personal computers. The first version of OS/2 was released in December 1987, and newer versions were released until December 2001.

OS/2 was intended as a protected-mode successor of PC DOS. Notably, basic system calls were modeled after MS-DOS calls; their names even started with "Dos," and it was possible to create "Family Mode" applications – text mode applications that could work on both systems. Because of this heritage, OS/2 shares similarities with Unix, Xenix, and Windows NT.

IBM discontinued its support for OS/2 on 31 December 2006. Since then, OS/2 has been developed, supported, and sold by two different third-party vendors under license from IBM – first by Serenity Systems as eComStation since 2001, and later by Arca Noae LLC as ArcaOS since 2017.

eComStation or eCS is an operating system based on OS/2 Warp for the 32-bit x86 architecture. It was originally developed by Serenity Systems and Mensys BV under license from IBM. It includes additional applications and support for new hardware which were not present in OS/2 Warp. It is intended to allow OS/2 applications to run on modern hardware and is used by a number of large organizations for this purpose. By 2014, approximately thirty to forty thousand licenses of eComStation had been sold.

Financial difficulties at Mensys in 2012 led to the development of eComStation stalling and ownership being transferred to a sister company named XEU.com (now known as PayGlobal Technologies BV), which continues to sell and support the operating system. The lack of a new release since 2011 was one of the motivations for the creation of the ArcaOS OS/2 distribution.


ArcaOS is an operating system based on OS/2, developed and marketed by Arca Noae, LLC under license from IBM. It was codenamed Blue Lion during its development. It builds on OS/2 Warp 4.52 by adding support for new hardware, fixing defects and limitations in the operating system, and by including new applications and tools. It is targeted at professional users who need to run their OS/2 applications on new hardware, as well as personal users of OS/2.

Like OS/2 Warp, ArcaOS is a 32-bit single-user, multiprocessing, preemptive multitasking operating system for the x86 architecture. It is supported on both physical hardware and virtual machine hypervisors.


Support for z/OS

z/OS is a 64-bit operating system for IBM z/Architecture mainframes, introduced by IBM in October 2000. It derives from and is the successor to OS/390, which in turn followed a string of MVS versions. Like OS/390, z/OS combines several formerly separate, related products, some of which are still optional. z/OS has the attributes of modern operating systems but also retains much of the older functionality that originated in the 1960s and is still in regular use—z/OS is designed for backward compatibility.

z/OS supports stable mainframe facilities such as CICS, COBOL, IMS, PL/I, IBM Db2, RACF, SNA, IBM MQ, record-oriented data access methods, REXX, CLIST, SMP/E, JCL, TSO/E, and ISPF, among others. However, z/OS also implements 64-bit Java, C, C++, and UNIX (Single UNIX Specification) APIs and applications through UNIX System Services (The Open Group certifies z/OS as a compliant UNIX operating system), with UNIX/Linux-style hierarchical HFS and zFS file systems. These compatibilities make z/OS capable of running a range of commercial and open source software. z/OS can communicate directly via TCP/IP, including IPv6, and includes standard HTTP servers (one from Lotus, the other Apache-derived) along with other common services such as SSH, FTP, NFS, and CIFS/SMB. z/OS is designed for high quality of service (QoS), even within a single operating system instance, and has built-in Parallel Sysplex clustering capability.

LuaJIT bytecode support

Rizin already supports Lua bytecode, but not the LuaJIT variant, which is becoming more and more popular because it performs better than mainstream Lua.

Different versions of LuaJIT produce different versions of the bytecode:

  1. Revision 1 (corresponding to LuaJIT 2.0.x)
  2. Revision 2 (corresponding to LuaJIT 2.1.x)
  3. Revision 3 (RaptorJIT fork)
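
A loader could pick the right decoder by reading the revision from the dump header. The sketch below assumes the header layout used by LuaJIT's bytecode dumper as I recall it (the three magic bytes 0x1B 'L' 'J' followed by one version byte); that layout and the function names are assumptions to check against the LuaJIT sources, and the revision-to-release mapping is simply the list above.

```python
from typing import Optional

# Assumed header: ESC 'L' 'J' followed by a one-byte bytecode revision.
LUAJIT_MAGIC = b"\x1bLJ"

# Mapping taken from the list above.
REVISIONS = {
    1: "LuaJIT 2.0.x",
    2: "LuaJIT 2.1.x",
    3: "RaptorJIT fork",
}


def luajit_revision(data: bytes) -> Optional[int]:
    # Return the bytecode revision, or None if this is not a LuaJIT dump.
    if len(data) < 4 or data[:3] != LUAJIT_MAGIC:
        return None
    return data[3]


def describe(data: bytes) -> str:
    revision = luajit_revision(data)
    if revision is None:
        return "not LuaJIT bytecode"
    return REVISIONS.get(revision, "unknown LuaJIT revision %d" % revision)
```

Plain Lua bytecode begins with the different signature 0x1B "Lua", so the same check also separates the two formats.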

https://www.mickaelwalter.fr/reverse-engineering-luajit/

Persistent state/buffer module

It would be nice to have a module that allows other modules (RzArch, for example) to persist their internal state for a certain lifetime.

I think it would be useful to implement this without any connection to an existing module.

Features could be:

  • Thread safe
  • Possibility to set lifetime
  • Access control maybe (so it is not used accidentally by other modules).

A practical example where it would be useful is token patterns.
Those can be compiled once when the asm module is initialized and then reused every time something is disassembled.
But currently those compiled patterns cannot easily be stored anywhere.
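
One possible shape for such a module, sketched in Python rather than in Rizin's C, is a lock-protected store whose entries are namespaced by owner and carry an optional expiry time. Every name here (StateStore, get_or_create, the "asm" owner string) is hypothetical and only illustrates the three features listed above; it is not an existing Rizin API.

```python
import re
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


class StateStore:
    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._lock = threading.Lock()  # thread safety
        self._clock = clock
        # (owner, key) -> (expiry time or None, value)
        self._entries: Dict[Tuple[str, str], Tuple[Optional[float], Any]] = {}

    def get_or_create(self, owner: str, key: str,
                      factory: Callable[[], Any],
                      lifetime: Optional[float] = None) -> Any:
        # Entries are keyed by owner, so one module cannot reach another
        # module's state by accident (a very simple form of access control).
        with self._lock:
            now = self._clock()
            entry = self._entries.get((owner, key))
            if entry is not None and (entry[0] is None or now < entry[0]):
                return entry[1]
            # Missing or expired: build the value once, under the lock.
            value = factory()
            expires = None if lifetime is None else now + lifetime
            self._entries[(owner, key)] = (expires, value)
            return value


# Usage: compile a token pattern once and reuse it on later calls.
store = StateStore()
pattern = store.get_or_create(
    "asm", "hex_token", lambda: re.compile(r"0x[0-9a-fA-F]+"))
```

Calling the factory while the lock is held keeps the sketch short and guarantees a single compilation, at the cost of blocking other callers during that time.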
