Giter Site home page Giter Site logo

francisrstokes / 16bitjs Goto Github PK

View Code? Open in Web Editor NEW
487.0 16.0 53.0 227 KB

๐Ÿ’ป A 16-bit virtual machine, including assembly language with 37 instructions, binary assembler, and a step through debugger

License: MIT License

Assembly 5.11% JavaScript 94.76% Brainfuck 0.13%
javascript 16-bit vm virtual-machine assembly-language cpu assembly virtual

16bitjs's Introduction

16-bit Javascript VM

The project consists of:

  • The definition of an assembly language with 37 instructions
  • An assembler to transform a *.asm file into a binary executable format
  • A small virtual machine which simulates a basic computer architecture: A memory space, stack, and CPU with 4 general purpose registers and a fetch-decode-execute cycle
  • A compiler for the brainfuck language directly to the binary executable format

The virtual machine can run in two modes: run (default) and step. Run mode simply run` the entire program in sucession. Step mode runs the program in a debug environment, pausing before executing each instruction and displaying the entire state of the machine.

Running the VM

Running the assembler

node src/assembler -i {infile.asm} -o {outfile.bin}

Running the brainfuck compiler

node src/compilers/bf -i {infile.bf} -o {outfile.bin}

Running a program

node src -p {program.bin}

Assembly language

The assembly language consists of 16 distinct instructions which support all the basic features you would expect: Arithmetic, Loading values to and from memory/registers, conditional/non conditional jumps, functions, system calls for reading and and writing stdio etc.

Comments start with a ; character. Labels can be defined by :some_label_name on their own line and then referenced in an instruction like so: LDV16 A, :some_label_name.

Examples

A couple of examples illustrating the language can be found in the asm/ folder.

Instruction set

Real instructions

Instruction Arguments 16 bit representation Description
MVR D, S, V VVVVVVVVSSDDIIII Add a sign-extended byte to value at source register and move it to destination register
MVV D, V, O VVVVVVVVOODDIIII Move or add an immediate value into destination register, depending on O.
LDA D, M MMMMMMMMMMDDIIII Load a value from memory into destination register using direct address
STA D, M MMMMMMMMMMDDIIII Store the value in destination register into memory using direct address
LDR D, S[, V] VVVVVVVVSSDDIIII Load from memory into the destination register using the source register as a base address
STR D, S[, V] XXXXXXXXSSDDIIII Store the value in source register into the memory using the destination register as a base address
ATH D, S, O, M, B BBBMOOOOSSDDIIII Perform an arithmetic operation on the source and destination registers. O specifies the operation (listed below) and M is the mode, where 0 = place result in destination register and 1 = place result in source register. If the instruction is right or left shift then B specifies the shifting value
CAL D XXXXXXXXXXDDIIII Call a function in memory pointed at by the destination register
JCP D, S, A, O XXXOOOAASSDDIIII Jump to memory address pointed at by the address register, depending on the comparison specified by the O operation of the destination register and the source register. Operation table specified below.
PSH S XXXXXXXXSSXXIIII Push the value in source register onto the stack
POP D XXXXXXXXXXDDIIII Pop the stack into the destination register
JMP M VVVVVVVVVVVVIIII Add a signed 12-bit offset to the program counter.
JMR S XXXXXXXXSSXXIIII Jump to the address pointed at by the source register
NOA O XXXXXXXXOOOOIIII No Argument calls. This includes SYS, HLT and RET, which have pseudo instructions

Pseudo Instructions

Pseudo instructions are prepocessed by the assembler and expanded into combinations of the real instructions.

Instruction Arguments Expanded length Description
MVI D, V 1 Set a zero-extended immediate value to destination register
LDV D, S, V 1 Alias for MVI to keep retro-compatibility in assembly source
MUI D, V 1 Set a 8-bit left shifted immediate value to destination register
ADI D, S 1 Add a sign-extended immediate value to destination register
INC D 1 Alias for ADI 1
DEC D 1 Alias for ADI -1
AUI D, S 1 Add a 8-bit left shifted immediate value to destination register
MOV D, S 1 Copy value at source register to destination register
RET 1 Return from function
HLT 1 Program halt
SYS 1 Perform a system call. This is described below in more detail.
LDM D, S 1 Alias for STA to keep retro-compatibility in assembly source
LDP D, S 1 Alias for STR without offset to keep retro-compatibility in assembly source
ADD D, S 1 Add destination to source and store the result in destination
ADDS D, S 1 Add destination to source and store the result in source
SUB D, S 1 Subtract destination from source and store the result in destination
SUBS D, S 1 Subtract destination from source and store the result in source
MUL D, S 1 Multiply destination with source and store the result in destination
MULS D, S 1 Multiply destination with source and store the result in source
DIV D, S 1 Divide destination by source and store the result in destination
DIVS D, S 1 Divide destination by source and store the result in source
LSF D, A 1 Binary shift left the destination register by amount A (max 7)
LSR D, A 1 Binary shift right the destination register by amount A (max 7)
AND D, S 1 Binary and the destination and source, and store the result in the destination
OR D, S 1 Binary or the destination and source, and store the result in the destination
XOR D, S 1 Binary exclusive-or the destination and source, and store the result in the destination
NOT D 1 Binary not (invert) the destination
LDV16 D, V 2 Load a 16 bit value into destination
SWP D, S 3 Swap the values in the source and destination registers
JEQ D, S, A 1 Jump to address A if value in destination register is equal to the source register.
JNE D, S, A 1 Jump to address A if value in destination register is not equal to the source register.
JLT D, S, A 1 Jump to address A if value in destination register is less than the source register.
JGT D, S, A 1 Jump to address A if value in destination register is greater than the source register.
JLE D, S, A 1 Jump to address A if value in destination register is less than or equal to the source register.
JGE D, S, A 1 Jump to address A if value in destination register is greater than or equal to the source register.
JZE D, S, A 1 Jump to address A if value in destination register is zero.
JNZ D, S, A 1 Jump to address A if value in destination register is not zero.

Arithmetic Operation table

Operation Value
Add 0000
Subtract 0001
Multiply 0010
Divide 0011
Increment 0100
Decrement 0101
Left shift 0110
Right shift 0111
And 1000
Or 1001
Xor 1010
Not 1011

Conditional Jump Operation table

Operation Value
Equal 000
Not equal 001
Less than 010
Greater than 011
Less than or equal 100
Greater than or equal 101
Zero 110
Not zero 111

System calls

A system call in the VM allows the program to ask resources outside of it's context, such as communication with stdin and stdout. System calls are passed off to the os module and can return their results directly into the CPUs registers.

System call Call code
Write to stdout 0000
Read from stdin buffer 0001
I/O Modes
Output
Mode Description B C D
0 Output register in decimal Destination Mode
1 Output register in binary Destination Mode
2 Output register in hex Destination Mode
3 Output register as a character Destination Mode
4 Output string in memory address pointed to by register Start address Mode
Input
Mode Description B C D
0 Read single character value of input into register Destination - -

Debugger

Running with the step option (node src -p {program.bin} --step), enables the step through debugger, giving a overview of memory, stack and registers as the program executes.

Memory:
0000    08c1 0009 0008 0009 1d01 030b 1c81 030b 1d41 030b 1941 030b 0281 030b 000a 16c1
0010    0009 0008 0009 1d01 030b 1c81 030b 1d41 030b 1941 030b 0281 030b 000a 0081 000b
0020    2781 0009 0008 0281 0291 0009 1141 030b 1c41 030b 1d41 030b 1841 030b 1b01 030b
0030    0801 030b 18c1 030b 1a01 030b 1941 030b 18c1 030b 1ac1 030b 0e81 030b 000a 10d7
0040    11a1 0089 0008 1351 0049 0008 0049 0010 000a 1357 00e1 0089 0008 0009 1981 030b
0050    1841 030b 1b01 030b 1cc1 030b 1941 030b 0281 030b 000a 02c1 0291 0009 1381 030b
0060    1bc1 030b 1d01 030b 0801 030b 1141 030b 1c41 030b 1d41 030b 1841 030b 1b01 030b
0070    0801 030b 18c1 030b 1a01 030b 1941 030b 18c1 030b 1ac1 030b 0e81 030b 000a 20d7
0080    21a1 0089 0008 2351 0049 0008 0049 0010 000a 2357 2421 0089 0008 04a1 0089 0008
0090    0009 1981 030b 1841 030b 1b01 030b 1cc1 030b 1941 030b 0281 030b 000a 000c 0000
00a0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00b0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00c0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00d0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00e0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00f0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0100
Page 1/255

Stack:
0000    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0010    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0020    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0030    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0040    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0050    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0060    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0070    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0080    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0090    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00a0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00b0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00c0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00d0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00e0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
00f0    0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

Instruction: (LDV) 0000100011000001
Registers:
A: 0000 B: 0000 C: 0000 D: 0000 IP: 0000        SP: 0000

(s)tep (e)xit (n)ext / (p)revious memory page >>>

16bitjs's People

Contributors

hlide avatar molingyu avatar radiofreejohn avatar redfast00 avatar tanuck avatar tetsuo avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

16bitjs's Issues

Rename LDM to STA

Previously LoaD Memory with register because STore at Address.
Create a pseudo instruction LDM for compatibility.

An example (but incomplete) of ISA using almost the same rules of 16bitjs

Not a proposal, not an issue, just a way to show a different ISA which tries to use as much as possible all the field bits of an instruction. This ISA is not complete (no comparison instruction, no jump, missing stack alu, and so one).

  • ARM instruction inspiration as combining arithmetic and constant bit shifting in one instruction,
  • some look complex but they are indeed simple once you understand the underlying logic.
  • fix-point computation can be feasible in easier way and less instructions.
  • direct access to pushed arguments of a call or local variables.

Regarding stack operations, some ALU operations is missing (reserving/releasing N word in stack).

Now the example:
cpu16

Security issue: code execution when assembling files

There is an eval in this line. That might result in code execution, so I tried to exploit it and give a proof of concept. This was quite hard to exploit since any ')' and ';' characters will cause the payload to be messed up: ';' will be seen as comments and ')' will cause the regex to stop capturing the group. Here's the proof of concept:

.data
.text
  .global main:

main:
    ldv A, ( [10, console.log`pwnt`][0] )
    hlt

and here's the stdout of node src/assembler -i asm/poc.asm -o test.out:

[ 'pwnt' ]
Read 6 instructions, including labels.
Expanded to 5, including labels.
Removed labels. Final instruction count: 4/65535
{}
Successfully assembled to binary file test.out

I would normally include a more serious (for example writing a file) proof of concept, but it was extremely hard to code in JS without parentheses.

LDV16 B, value probably doesn't work

Hi, I really enjoyed your article and I am now doing a code-along (I read the specs on your blog post and try to implement them in Python). I just finished coding the LDV16 expander and when I looked at your code, I saw that the (temporary) B register was hardcoded, so setting the B register with LDV16 probably won't work as intended. I solved this like this

Merge SYS, HLT and RET since they no arguments

Those three instructions have only the 4-bit main opcode considered and no other relevant fields. Instead of having three different main opcode, we can have only one and used some X bits to carry the real opcode for SYS, HLT and RET and be able to add other possible instructions where no SS and DD are needed. As for SYS, I would have considered an immediate (8-bit ?) value to encode the system call code instead of relying on a register value (I believe it is A currently) . Two main opcodes would be freed for other usages that I can have in mind.

STACK - some RISC CPUs don't have a PUSH and POP instructions

There is a different way to use stack (supposedly it resides somewhere in memory).

push a
push b
push c

can be done this way:

sub sp, 3 // allocate 3 slots in word unit
str a,[sp + 0]
str b,[sp + 1]
str c,[sp + 2]
pop c
pop b
pop a

can be done this way:

str a,[sp + 0]
str b,[sp + 1]
str c,[sp + 2]
add sp, 3 // relaese 3 slots in word unit

Benefits are:

  • easy allocation of local variables in stack
  • possible dynamic allocation in stack (array of variable length)
  • direct access to local variables in many place to avoid too much pressure on the limited 4 registers
  • complex computations made easier (when involving more that 4 temporaries)

One way to call a program by RISC CPUs:

jal a, routine // call a routine by saving return in 'a' and jumping to routine
...
routine:
[sub sp, 1] only if you need to reuse 'a' in the body
[str a,[sp] only if you need to reuse 'a' in the body
... // do something
[ldr d,[sp] only if you needed to reuse 'a'
[add sp, 1] only if you needed to reuse 'a'
jr d

Sure, push, pop, call and ret allow more compact code but push and pop are not great with more complex expressions where you need to dup registers in stack: having a direct access to read/write in a stack slot makes complex expressions cleaner and shorter.

Refactor JLT to JCP (Jump ComPare)

JCP R1, R2, R3, Op

R1 is compared against R2, and the jump address is stored in R3. Operation will be in the list:

-Equal
-Not equal
-Less than
-Greater than
-Less than or equal
-Greater than or equal
-Zero (in which case R2 is ignored)
-Not zero (in which case R2 is ignored)

Create pseudo instructions for all the operations:

  • JEQ
  • JNE
  • JLT
  • JGT
  • JLE
  • JGE
  • JZE
  • JNZ

Extend LDR with offset mode

To make accessing data structures easier, LDR can take two additional arguments: MODE and OFFSET_REGISTER.

If MODE is 1, then SOURCE is treated as a base which is offset by the value in OFFSET_REGISTER.

Missing ESLint configuration

Hello @francisrstokes,

This is a very interesting project, thanks! greatly inspired me to read about some theory this weekend.

I want to contribute as well but the ESLint configuration file is missing in the repository. What is the policy on this? :)

Cheers!

Trouble to use "node src/assembler"

Under Windows 10 64-bit, latest version of Node.js: v8.2.1.

I'm not familiar with Node.js as it is pretty recent I'm using it at work.

So once Node.js installed, I made npm install and nothing else.

I tried this: C:\Dev\16bitjs>node src\assembler -i asm\factorial.asm -o bin\factorial.bin

And I got:

C:\Dev\16bitjs\src\assembler\preprocessor\get-data-table.js:70
      throw new Error`Unsupported data declaraction:\n${cur}\nExiting...`);
                                                                         ^

SyntaxError: Unexpected token )
    at createScript (vm.js:74:10)
    at Object.runInThisContext (vm.js:116:10)
    at Module._compile (module.js:533:28)
    at Object.Module._extensions..js (module.js:580:10)
    at Module.load (module.js:503:32)
    at tryModuleLoad (module.js:466:12)
    at Function.Module._load (module.js:458:3)
    at Module.require (module.js:513:17)
    at require (internal/module.js:11:18)
    at Object.<anonymous> (C:\Dev\16bitjs\src\assembler\preprocessor\index.js:3:22)
    at Module._compile (module.js:569:30)
    at Object.Module._extensions..js (module.js:580:10)
    at Module.load (module.js:503:32)
    at tryModuleLoad (module.js:466:12)
    at Function.Module._load (module.js:458:3)
    at Module.require (module.js:513:17)

Not sure if the issue is due to my way to install Node.js.

MOV - using the upper 8 bits to store a sign-extended immediate as an addend

Proposal

Instruction MOV

MOV is renamed as MVR and MOV, DEC, and INC become pseudos using MVR

Before:

|MOV| D, S | XXXXXXXXSSDD0000 | Move value at source register to destination register|

After:

|MVR| D, S, V | VVVVVVVVSSDD0000 | Add sign-extended immediate value to value at source register and move it to destination register|
|MOV| D, S | 00000000SSDD0000 | Move value at source register to destination register|
|INC| D | 00000001DDDD0000 | Increment value at destination register|
|DEC| D | 11111111DDDD0000 | Decrement value at destination register|

Retro-compatibility is kept for assembly code, not for machine code.

LDV - a better way to deal with 16-bit immediate value

Since you word is 16-bit wide, you may split it into 2ร—8-bit, and keep two bits of immediate (SS bits) to select a special operation.

In MIPS, you have LUI instruction which sets a register with imm16 left shifted by 16 bits. Then you can use ADDIU or ORI to add/or-ize the register with a signed 16-bit immediate value, allowing in two instructions to set a 32-bit immediate value to a register. ARM Cortex architecture also have a similar way.

So in your case, it could be:

  • OO:00 โ€“ MVI Dr, imm8 โ€“> Dr = unsigned(imm8)
  • OO:01 โ€“ ADI Dr, imm8 โ€“> Dr = Dr + signed(imm8)
  • OO:10 โ€“ MUI Dr, imm8 โ€“> Dr = unsigned(imm8) shl 8
  • OO:11 โ€“ AUI Dr, imm8 โ€“> Dr = Dr + signed(imm8) shl 8

And

|LDV| D, V | VVVVVVVVVVDD0001 | Load a value into destination register.

becomes

|LDV| D, V, O | VVVVVVVVOODD0001 | Load a value into destination register.

with pseudos

|MVI| D, V | VVVVVVVV00DD0001 | Set a zero-extended lower byte to destination register.
|ADI| D, V | VVVVVVVV01DD0001 | Set a zero-extended lower byte to destination register.
|MUI| D, V | VVVVVVVV10DD0001 | Set a byte left shifted by 8 bits to destination register.
|AUI| D, V | VVVVVVVV11DD0001 | Add a byte left shifted by 8 bits to destination register.

Mnemonics stand for:

  • MVI: MoVe Immediate,
  • ADI: ADd Immediate,
  • MUI: Move Upper Immediate,
  • AUI: Add Upper Immediate.

The rationale is:
โ€“ the 8-bit immediate is zero/signed-extended respectively when bit 0 of SS is 0/1.
โ€“ the effective immediate value is set with the extended 8-bit immediate left shifted by 0/8 bits respectively when bit 1 of SS is 0/1,
โ€“ the effective immediate value is set/added respectively to the destination register when bit 0 of SS is 0/1.

As a result,

  • you can ditch INC/DEC Dr because you can use ADI Dr,+1/-1,
  • you can use a loop step different to -1/+1 with only one instruction,
  • you can use that pair: MVI Dr, 0xLL; AUI Dr, 0xHH <=> Dr = 0x00LL + 0xHH00 <=> Dr = 0xHHLL and define a pseudo MVA Dr, 0xHHLL,
  • you can add/substract a big constant amount: AUI Dr, 0xHH;ADI Dr,0xLL <=> Dr = Dr + 0xHH00 + 0x00LL <=> Dr = Dr + 0xHHLL if 0xLL is inferior or equals to 0x7f,
  • you can add/substract a big constant amount: AUI Dr, 0xHH+1; ADI Dr, 0xLL <=> Dr = Dr + (0xHH00+0x0100) + (0xffLL) <=> Dr = Dr + 0xHHLL if 0xLL is greater than 0x7f.
  • you can define a pseudo ADA Dr, 0xHHLL for the previous two points and which issues the right pair according to the sign bit of 0xLL.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.