Comments (3)
Implementation note: In this design, the moment when an SNM pointer dies is semantically important. For example, consider the following code, in which all memory accesses are to SNM locations:
load from r0
store r0 at a
// drop r0
load r1 from a
store at r1
We can move the load and store at a
as early as we like, revealing a possible optimisation (which, for clarity, I will not do):
store r0 at a
load r1 from a
load from r0
// drop r0
store at r1
However, r0
is dead when we store at r1
, so we cannot exchange the load from r0
with the store at r1
. There is a dependency chain "load -> drop -> store".
from mijit.
Two years on, I have clearer ideas on this subject. Relying on instruction order to represent dependencies between loads and stores is a bad idea because (a) it doesn't do that very well, and (b) it constrains the optimizer too much. Instead, I'd like to lean into the "shared NAND mutable" (hereafter "shnam") idea.
Overview
Mijit's entire memory model should be shnam. There is no need to mark particular memory accesses as shnam because they all are. The memory model should be unsafe; the Mijit code includes hints about the way memory access instructions interact, and if the hints are not true the behaviour is undefined.
I'm satisfied that many common memory models can be encoded into such a model, including separate memories for code, stack and heap, virtual registers, immutable shared memory, heap allocated memory, stack allocated memory, memory protected by a lock, memory belonging to an actor, and so on. In particular, dumb flat memory can be encoded if necessary. And these various models compose nicely; for example you can have segregated code and stack within an otherwise flat memory model.
Concepts
The goal is to allow the optimizer to assume that when it sees a store to x.f
followed by a load from x.f
the value that is loaded is the same value that was stored. In particular, it must be allowed to ignore an intervening store to y.f
, on the grounds that x
and y
are not allowed to point to the same object.
This requires a disciplined way of using pointers. Each pointer notionally has "permissions" allowing it to be used to access certain areas of memory. Specifically, there is set of locations it can load from, and a set that it can store to. The latter is a subset of the former; there are no write-only locations. The shnam rule is that if one pointer can store to a location then no other pointer can load from the same location.
There are six main ways that one pointer can be used to compute another:
- Compute the address of an element of an array.
- Compute the address of a field of a struct.
- Use one pointer to load another from memory.
- Push an object onto a stack, and take its address.
- Allocate an object from a heap, and take its address.
- Send a message from one agent (thread?) to another.
In each of these cases, the permissions of the new pointer are carved out of those of the old pointer. When the new pointer is no longer needed, its permissions are usually returned to the old pointer. Thus, the permissions of a pointer change over time, and Mijit needs to be aware of these changes.
Mijit does not otherwise need to be aware of the permissions attached to each pointer. If challenged, the author of the Mijit code should be able to say what the permissions of every pointer are, and prove that the shnam rule is followed. However, Mijit won't ever make such a challenge. It will blindly trust the author. If that trust is misplaced, the behaviour is undefined.
The "send" primitive
We add one primitive send
to the Mijit language to inform Mijit of changes to the pointer permissions. It is syntactically an arithmetic instruction: it takes two pointers x
and y
and returns a fresh pointer that is numerically equal to x
but which has additional permissions taken from y
. In native code send
is a no-op; it's only purpose is to constrain the optimizer.
A typical use of send is as follows:
- Suppose
p
points to an array. - Based on
p
and an array index, compute a pointerq
that points to an array element.
Since there is a dataflow dependency fromp
toq
there is no need to use "send" yet; it is obvious that the permissions ofq
have been carved out ofp
. - Use
q
to access the array element. - Compute
p' = send(p, q)
.
Permissions ofq
(in this case all of them) are merged with those ofp
to make a new pointerp'
. - Thereafter, only access the array using
p'
, notp
.
Undefined behaviour
There are certain things you are not allowed to do. For example:
- Break the shnam rule, i.e. store to a location using one pointer and load from the same locations using another (in either order). You must use
send
to transfer permissions from one pointer to the other. - Store to a location twice using the same pointer. You must use
send
to obtain a numerically equal pointer that you can use for the second store operation. - Store to a location using a pointer, use the pointer to compute another pointer, and access the same location using the new pointer. This is in fact a special case of one of the first two violations; the memory accesses are unordered.
- Construct a pointer from an integer, and use it to access memory. You must use
send
to give the pointer some permissions, taking them from some other pointer (which could benull
).
from mijit.
I've now added Action::Send
and the optimizer knows what it means.
This is a backwards-incompatible change to the Mijit instruction set. Code using Mijit will require modifications in order to work on the new version (likely to be 0.2.4).
from mijit.
Related Issues (20)
- Spin off Beetle as a separate project that depends on Mijit HOT 1
- ARM backend HOT 1
- Extend the target abstraction layer to cover optimization
- Add profiling counters HOT 3
- Write tests for high register pressure
- `optimizer::Simulation` should understand `Push` and `Pop` instructions
- Exceptions
- Rename `is_true` to `invert` in conditionals HOT 1
- Changes to trait Lower HOT 1
- Test Lower HOT 1
- Make `Action::Push` and `Action::Pop` operate on pairs of words.
- `aarch64::Assembler::mem()` should take an `enum Address` HOT 1
- Rename `code::Value` to `code::Variable`.
- Examples of bad code
- Cannot build mijit 0.1.1 with rustc 1.51 HOT 3
- Store code in a more abstract form
- Redesign JIT API: Expose EBB HOT 1
- Do not call mprotect unnecessarily
- Peephole optimize code as it is added to `Dataflow`
- Separate `new_exit()`, `new_label()` and `new_entry()`
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from mijit.