sunjay / brainfuck Goto Github PK
View Code? Open in Web Editor NEWBrainfuck interpreter companion to the brain programming language
License: MIT License
Brainfuck interpreter companion to the brain programming language
License: MIT License
This Brainfuck interpreter is used for running code, so it isn't necessarily useful to simulate a Turing machine exactly. Some optimizations should be enabled by default.
In particular, grouping instructions before execution should be on by default. Batching instructions like this greatly speeds up interpretation. Grouping should be done once, lazily, as required for execution.
In debug mode, default to having optimizations off (unless an optimization flag is explicitly passed in). But make sure optimizations can be enabled afterwards if requested by the debugger.
Make sure optimizations can be turned off too if someone doesn't want them.
Turns out that Rust provides an equivalent to the NullWrite struct defined in the benchmark. That means that we can get rid of that code and use the standard library equivalent.
This is an easy, drop in fix that should not take very much time at all.
There are very few cases where unwinding is really useful here.
Try adding the following to Cargo.toml and measure the difference it makes in the release binary size.
[profile.release]
panic = "abort"
From: http://doc.crates.io/manifest.html#the-profile-sections
[exit 0]
instead of just [0]
to be more forwards compatible if we decide to use infinite loops for other FFI things.
In the Rust programming language, when a panic occurs, the program cleans up, prints an error message about the panic and then aborts (ends) the program.
Brainfuck has no sophisticated jump instruction, but these kinds of aborts are still necessary for writing useful programs.
To resolve this, I am proposing a new way to communicate panics using brainfuck instructions that is completely backwards compatible with current brainfuck. This is not a new instruction or anything that would require updating other brainfuck interpreters. If another interpreter does not implement this functionality, it can gracefully fallback to its default behaviour and everything will still function mostly as expected.
Aside: I should note that since brainfuck is turing complete, it is possible to implement this in brainfuck already. One naive implementation could be to simply wrap the remaining program in an
if else
statement (implemented in brainfuck of course). That way, if whatever the condition is fails, the rest of the program would not run at all. The downside to this is that it severely complicates the generated brainfuck. Imagine being inside several nested brainfuck loops and then having to somehow navigate your way out to the end of the program. It would be extremely messy and nearly impossible to do so. Every single thing from that point in the program onward would need to be wrapped in conditionals.
If a brainfuck program needs to abort, it should enter an infinite loop by using the instructions:
[]
This is an infinite loop. From a syntax point of view, an infinite loop is defined as a brainfuck [
instruction, followed by any number of non-instruction characters, concluded by a ]
instruction. Without any further instructions inside a loop, if the loop is entered, it is guaranteed to run forever. The runtime semantics of this are a bit more complicated and are described below.
Several important points about this:
+[]
as a replacement for the code above. The reason this is not specified that way above is because there is no guarantee that +
will not cause a overflow and reset the current cell back to zero. It is up to the programmer to ensure that the current cell is non-zero before the []
instructions if they wish for the abort to occur.
Printing out an error message and doing whatever clean up is necessary is already possible in brainfuck. Those things are left up to the programmer to do on their own.
Recommendation: once it is decided that the panic will occur, use some other cells to print the error message, then return back to the cell that was used to decide to panic and place the []
instructions.
On Unix-based operating systems and not-so-much Windows, it is possible to return an "exit code" from a program to indicate whether the program succeeded or failed. This is represented as an integer. The convention is to return 0
when the program succeeds, and a non-zero value when the program fails. The non-zero value is not specific and can be anything the program wants.
To facilitate this, an additional extension to this extension is proposed.
In addition to being able to abort with this syntax:
[]
It will also be possible to abort with the following syntax:
[0]
or:
[-1]
In this case, and only in this case, the inner number will be interpreted as the exit code.
The following cases and their variations will not be accepted for this.
[abc]
[-1 abc]
[abc 1]
[-1>>+]
These will be interpreted using the standard brainfuck interpretation and not used for the exit code.
If no exit code is specified, the default exit code will be 0. So just using []
results in the program exiting with a code of 0
.
As stated above, this code will work in a reasonable way on brainfuck interpreters that do not support abort semantics. If a user prints a message before the abort, the message will be displayed and then the program will loop infinitely until the user decides to abort it manually.
For exit codes, those characters are ignored by default anyway, so adding them causes no issues with existing implementations.
The actual content of the updates to the specification can largely be taken from the text above.
The code is currently all in one file. Some attempts have been made to better organize this, but have been reverted due to performance concerns.
We want to thoroughly test that this implementation of brainfuck meets the specification. That way we can have some guarantees about its functionality. This will also make it easier to extend the interpreter as time goes on.
When making changes to this codebase, it is critical that performance be maintained. The single file version isn't pretty, but it does perform quite well. We should build a benchmark to ensure that we aren't slowing down the interpreter and adding extra overhead as we refactor it.
print!
in output by adding a generic parameter that accepts any Write instance and using write directly since it supports bytes - this change will help with unit testing - check out: https://doc.rust-lang.org/std/io/struct.LineWriter.htmlRead
parameter in the functionAs it is implemented now, we perform a naive search every time we see a jump instruction. It would be better if as we searched for a matching jump, we cached the positions of other matching jump instructions we find. This would save us a lot of searching the same thing over and over again.
]
instruction (since there is no way to reach the beginning [
anymore)Instead of the string formatting we use for JSON now, use a good logging crate like slog. Logging each message as JSON and supporting different output formats like plaintext will make this tool easier to use with or without brain-debug.
Note that these changes only effect stderr. The output of the running program is unchanged. Part of this should also be replacing panics with well formatted error log messages.
The current precompilation step destroys most of the important information necessary to accurately report the position of instructions when we need to report an error. That's why this isn't already implemented properly.
We should store the instruction's position when we create an Instruction instance, before the other precompilation steps destroy the necessary information. Then we could report the position properly whenever an error occurs.
As far as error messages go, they are pretty rare. This interpreter is targeted towards the brain compiler which is designed to produce valid code.
--debug
is enabled instead of the index of the instructionThe fastest brainfuck interpreter bff4 uses some clever linear optimization techniques to execute certain brainfuck loops in one step. We could also take advantage of these and speed things up quite a bit.
Here's my proposal: after the precompile step, we should go through the precompiled instructions as lazily as possible and determine which loops match the linearizable form that we are looking for. If you look at mandel.bf, there are many examples of such loops. By the end of this, we should be able to beat bff4.
Make sure to test on simple programs to see how this impacts both simple and loop heavy programs.
This should only run when optimization is enabled.
One limitation of brainfuck is that you can't move left or right in the tape and arbitrary amount during runtime.
For example, you can't start from a cell, and move left or right by the amount given in that cell. This extension proposes two new instructions:
{
- move left by the unsigned 32-bit or 64-bit integer in the next 4 or 8 cells (the offset) starting at the current position of the pointer}
- move right by the unsigned 32-bit or 64-bit integer in the next 4 or 8 cells (the offset) starting at the current position of the pointerThis is not a total solution at the moment. There are issues that need to be worked out. For example, if I move to a given position via {
or }
, how do I get back? How can these instructions be used for things like copying if they always use up the current cell? It may be that copies will need to allocate some extra room for a pointer so that the offset can be copied. But then how do you copy the offset in the first place if it's already gone when you use it?
Setup a benchmark to run in the Travis build (or elsewhere). Run this for every PR and successful commit. For PRs, report the result as a comment.
The benchmark should use two Brainfuck scripts at least:
Starting resource: https://beachape.com/blog/2016/11/02/rust-performance-testing-on-travis-ci/
Can do this in an after_success hook if bench results are written to a file. Then just use the Github API to post a comment.
Test the PR and the branch it is based on
cargo bench
Cargo benchcmp may solve many problems: https://github.com/BurntSushi/cargo-benchcmp/
Travis script will be a bash script which will do the following:
cargo bench
on the PR branch - $TRAVIS_BRANCH
(save results to a file)$TRAVIS_PULL_REQUEST_BRANCH
cargo bench
on the base branch (with silent option so not too much build output is produced) (save results to a different file)$TRAVIS_PULL_REQUEST
with the output from each branch clearly labeling what the old results were and what the new results are
This script will only run if the build and tests succeed. If this script fails, nothing happens and the error is printed in the build log.
Maybe do this as a GitHub integration?
Related to sunjay/brain-debug#1
Since the interpreter is launched by the debugger, maybe the IPC scheme could be as simple as sending/receiving data on the stdin/stderr channels of the interpreter process. Normally, stdin is reserved for the interpreter to use as input for the interpreted program. stderr is already used for debug messages.
Since we already have a --debug-format
option, we can simply use that to communicate when to activate command support:
--debug-format text
- same behaviour as before, no debugger commands, input is directly used for the interpreted program--debug-format json
- when the debug format is json, all input is interpreted as a json command
--debug-format msgpack
- same as json but using the more compact msgpack formatFrom the spec:
Jump If Zero ([)
Jumps to the matching
]
instruction if the value of the current cell is zero.
...Jump Unless Zero (])
Jumps to the matching
[
instruction if the value of the current cell is not zero.
...
This is misleading because it implies that you should jump directly to the other jump instruction whenever you need to jump. This is incorrect. You should not jump to another jump instruction as that would result in the evaluation of the condition for a second time. This is wasteful, unnecessary and not what this reference implementation actually does. We clarified that the jump should be matching, but neglected to mention how to avoid that extra, unnecessary evaluation.
The examples in those sections are misleading as well.
[
instruction should jump to the instruction just after the matching ]
[
]
instruction should jump to the instruction just after the matching [
[
Much like the Python interpreter, we would benefit from caching the parsed, precompiled version of the program. This way, on multiple runs, loading the precompiled version would be much faster than reparsing and recompiling the same source over and over again.
We would have to check the file to see if it is older that the precompiled cache. If the precompiled cache isn't openable or parsable, it should be ignored and overwritten with a new cache. Failing to write the cache file is not an error and should not stop the program.
We should try to take what works and fix what doesn't work from our predecessors:
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.