(Issues marked "VEX" relate to the regularization branch. Strictly speaking, regularization is not part of the ThinShell per se; it just historically happened that I doodled these experiments in this corner of the filesystem and wanted to commit them somewhere quickly.)
Phase 2 of Instruction Regularization partitions all the ground instruction instances of a given instruction declaration into a small number of equivalence classes I'll call "shapes". Intuitively, two instructions with the same opcode but different operands could do "the same thing modulo parametrization" or do "substantially different things". In analysis, these two situations are distinguished by whether the two VEX IRSB trees have the same shape (differing only in leaf Constants); here by "sameness" we mean definitional equality, not homotopy. For example, addis r3, r1, 0x1234 on POWER will have a GET(r1) node somewhere in its VEX, but addis r3, r0, 0x1234 will have no such node. In this example we tend to characterize the difference as "functional"; at other times it is due to the notational conventions of the ISA. For example, b on ARM is considered one instruction whose linking behavior depends on the H bit; in contrast, on POWER b and bl (which differ only in the LK bit) are notationally considered different instructions. The crucial point is that within a single given shape we arrive at a straight-line execution trace during IR interpretation (think Isla Jib with no Jumps).
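The shape test above can be sketched in a few lines. This is a toy model, not the real analysis: the nested-tuple tree encoding and the erase_constants helper are hypothetical stand-ins for the actual VEX IRSB structures, and only Const leaves are erased here (the real analysis may treat GET offsets as parameters too).

```python
# Toy model: a VEX expression tree is a nested tuple whose head is a node tag;
# ("Const", value) stands in for a VEX leaf Constant. This encoding is an
# illustrative assumption, not the real IRSB representation.
def erase_constants(tree):
    """Replace every leaf Constant's value with a placeholder, keeping structure."""
    if not isinstance(tree, tuple):
        return tree  # atoms (register names etc.) stay part of the shape here
    if tree[0] == "Const":
        return ("Const", None)  # forget the value, keep the node kind
    return (tree[0],) + tuple(erase_constants(child) for child in tree[1:])

def same_shape(tree_a, tree_b):
    """Same shape = definitionally equal after erasing leaf Constants.
    Plain structural equality; no rewriting, no homotopy."""
    return erase_constants(tree_a) == erase_constants(tree_b)

# addis r3, r1, 0x1234: contains a GET(r1) node
t1 = ("PUT", "r3", ("Add", ("GET", "r1"), ("Const", 0x1234)))
# same opcode, different immediate: same shape
t2 = ("PUT", "r3", ("Add", ("GET", "r1"), ("Const", 0x5678)))
# addis r3, r0, 0x1234: r0 reads as zero, so no GET node at all
t3 = ("PUT", "r3", ("Const", 0x1234))
```

Here same_shape(t1, t2) holds while same_shape(t1, t3) does not, mirroring the addis example.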
We are interested in when these shape partitions of the encoding space are rectangular. We want to express the partition function -- mapping an instruction encoding to its shape's number,
sh: BV32 → ℤ
-- as a composition going through a small number of functions over small bitslices of the encoding:
∀x ∀y. sh(x||y) = f(s(x), t(y)) [*]
where x stands for some bit positions of x||y and y for some other bit positions. The function vexshape.factorize_flock()
finds this decomposition by asserting [*] together with the known values of sh into Z3 and asking for a model of f, s, t.
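For intuition, here is a brute-force alternative to the Z3 query, over a toy encoding space: build canonical s and t as row/column indistinguishability classes, under which f is automatically well-defined; the partition is "rectangular" in an interesting sense when the class counts are small. The toy sh below and the function factorize are illustrative assumptions, not the actual factorize_flock implementation.

```python
# Sketch: factor sh(x||y) = f(s(x), t(y)) over a toy (xbits + ybits)-bit space.
def factorize(sh, xbits, ybits):
    xs, ys = range(1 << xbits), range(1 << ybits)
    enc = lambda x, y: (x << ybits) | y  # the concatenation x||y

    # s: collapse x values with identical rows, i.e. x ~ x' iff
    # sh(x||y) = sh(x'||y) for every y
    row_class, s = {}, {}
    for x in xs:
        row = tuple(sh(enc(x, y)) for y in ys)
        s[x] = row_class.setdefault(row, len(row_class))

    # t: likewise collapse y values with identical columns
    col_class, t = {}, {}
    for y in ys:
        col = tuple(sh(enc(x, y)) for x in xs)
        t[y] = col_class.setdefault(col, len(col_class))

    # With these canonical s and t, f is well-defined: equal classes mean
    # equal rows/columns, so sh agrees on every pair mapping to the same cell.
    f = {(s[x], t[y]): sh(enc(x, y)) for x in xs for y in ys}
    return s, t, f

# Toy sh over a 2+2-bit space: shape depends only on the low bit of y.
sh = lambda e: 1 if e & 1 else 0
s, t, f = factorize(sh, 2, 2)
```

Here s collapses to a single class and t to two, so sh factors through very small functions of the two bitslices; factorize_flock instead obtains f, s, t as a Z3 model, per the text above.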
The current proof-of-concept code is messy, not general, and very slow; this needs to be rewritten properly.