branlwyd / bdcpu16

DCPU-16 simulator based on version 1.7 of the DCPU-16 specification. See http://dcpu.com/.

bdcpu16

This is currently in active development and probably won't be useful to someone who just wants to play around with the DCPU-16. Fortunately there are plenty of more mature DCPU simulators out there. :-)

Building

This project is built using Apache Ant (http://ant.apache.org/). Once you have Ant installed, change to the root directory of the repository and type:

ant

This will produce a jar file in bin/bdcpu16.jar. If you want to precompile all of the instructions (see below), you can instead type:

ant make-precompiled

In addition to bin/bdcpu16.jar, this produces bin/bdcpu16-precompiled.jar. Note that instruction precompilation can take some time, as it produces a compiled class file for every possible DCPU-16 instruction.

Dependencies

Java JDK: The default instruction provider compiles each instruction into Java bytecode at run time, which requires the JDK to be present at run time. If this isn't acceptable, an alternative instruction provider lets you precompile every instruction; using the precompiled instruction provider removes the JDK requirement.

Instruction precompilation

By default, instructions are compiled to Java bytecode as they are run. However, the cc.bran.bdcpu16.codegen.InstructionPrecompiler class can be run to generate Java source code for every legal instruction, as well as an InstructionProvider implementation that provides these instructions to the CPU.

Due to the large number of generated classes, I recommend that you do not add these source files to your IDE, as you will likely run into out-of-memory errors. Instead, load them into your IDE as a compiled JAR file.

To produce a jar, you can run ant make-precompiled from the root directory of the repository. This will generate code for the instructions and produce a precompiled jar in bin/bdcpu16-precompiled.jar.

You can then use the precompiled instruction provider in cc.bran.bdcpu16.precompiled.PrecompiledInstructionProvider by referencing the JAR file. The instruction provider should be passed to the CPU object during construction.

Memory usage

Using the precompiled instructions (or using many different dynamically-compiled instructions) will use quite a bit of memory since more than 50K classes are used to represent the instructions. You will need to increase the size of the permanent generation memory in order to avoid out-of-memory issues. I find that 256MB works well. You can pass -XX:MaxPermSize=256M to the JVM to do this.

bdcpu16's Issues

Improve documentation

The code documentation is currently very poor. It should be improved.

Precompile instructions

Instead of dynamically compiling instructions at runtime, precompile all the instructions.

Hardware: separate monitor from keyboard/mouse devices

Refactor DebuggerFrame, InstructionDecoder, Operand value formatting to use a new ValueFormatter interface rather than using parameters to specify hex or dec. This would add flexibility if we wanted to add different numeric formats later, or if we add another debugger UI. 99dc358
~~The *Operand classes in cc.bran.bdcpu16.codegen are only ever intended to be used by Operand. Move these classes to be private internal classes to Operand.~~ 9470eca
Refactor int Operand.wordsUsed() into boolean usesWord(). Every operand uses 0 or 1 words, so this is still sufficiently expressive; and many places in code make the assumption that every operand will consume 0 or 1 words, so making this change will cause the interface to match the existing assumptions. 9470eca
Consider factoring the Keyboard/Mouse device pieces apart from the Terminal (UI) piece, and allowing Keyboard/Mouse devices to be arbitrarily attached/detached from Terminals.

Better error handling

Currently the DCPU fails silently on error conditions (or crashes!). A few areas that should have well-defined error conditions:

Illegal/unrecognized instructions (currently causes a NullPointerException!)
Filled interrupt queue (currently drops all queued interrupts)

Hardware: floppy disk should have bad sector support

The floppy disk hardware should have support for receiving disks with bad sectors.

Debug: improve interrupt support

Currently, the debugger will "skip" stopping on the first instruction of the interrupt handler when an interrupt is received. This is because of the way the debugger works: it can only pause when it is notified of activity by the CPU through the cyclesElapsed() function, but cyclesElapsed() is called in step() only after both handling an interrupt and executing an instruction.

Likely this bugfix will fall out of some other architectural change. (See issue #18.)

Cycle counting

The interpreter currently does no cycle counting whatsoever. As well as actually counting the number of cycles that have passed, a way to run for a certain number of cycles would be very helpful when using DCPU in a real-time context as part of a larger system (e.g. in games).
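
A minimal sketch of what a cycle-budgeted run loop could look like, assuming a step function that returns the number of cycles it consumed. `CycleCounter` and `runFor` are hypothetical names used for illustration and are not part of bdcpu16.

```java
import java.util.function.IntSupplier;

/** Hypothetical wrapper that counts cycles; not bdcpu16's actual API. */
final class CycleCounter {
    private final IntSupplier step; // executes one step, returns its cycle cost
    private long totalCycles;

    CycleCounter(IntSupplier step) {
        this.step = step;
    }

    /**
     * Steps until at least `budget` cycles have elapsed.
     * Returns the overshoot so the caller can deduct it from the next budget.
     */
    long runFor(long budget) {
        long used = 0;
        while (used < budget) {
            int cost = step.getAsInt();
            if (cost < 0) {
                throw new IllegalStateException("step() returned a negative cycle count");
            }
            used += cost;
        }
        totalCycles += used;
        return used - budget;
    }

    /** Total cycles executed across all runFor() calls. */
    long totalCycles() {
        return totalCycles;
    }
}
```

Because an instruction cannot be split, the loop can overshoot the budget by up to one instruction's cost; returning the overshoot lets a game loop keep long-run timing accurate.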

Debug: occasional exception thrown when stepping through debugger

Does not seem to be caused by repeatable circumstances. Possibly a race condition? -- earlier testing showed some internal ops on JTable were multithreaded.

Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
    at javax.swing.BufferStrategyPaintManager.flushAccumulatedRegion(BufferStrategyPaintManager.java:420)
    at javax.swing.BufferStrategyPaintManager.endPaint(BufferStrategyPaintManager.java:380)
    at javax.swing.RepaintManager.endPaint(RepaintManager.java:1270)
    at javax.swing.JComponent._paintImmediately(JComponent.java:5166)
    at javax.swing.JComponent.paintImmediately(JComponent.java:4971)
    at javax.swing.RepaintManager.paintDirtyRegions(RepaintManager.java:770)
    at javax.swing.RepaintManager.paintDirtyRegions(RepaintManager.java:728)
    at javax.swing.RepaintManager.prePaintDirtyRegions(RepaintManager.java:677)
    at javax.swing.RepaintManager.access$700(RepaintManager.java:59)
    at javax.swing.RepaintManager$ProcessingRunnable.run(RepaintManager.java:1621)
    at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:251)
    at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:721)
    at java.awt.EventQueue.access$200(EventQueue.java:103)
    at java.awt.EventQueue$3.run(EventQueue.java:682)
    at java.awt.EventQueue$3.run(EventQueue.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:691)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:244)
    at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:163)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:151)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:147)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:139)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:97)

Debug: conditional breakpoints/watches

Add conditional breakpoints and watch values to the debugger.

These two items are grouped because both can be implemented easily by integrating a script engine to provide expression evaluation, perhaps the Rhino engine included in recent versions of Java.

create glyph cache

Create a cache of Images of each glyph, so that drawing the screen is just blitting the appropriate images rather than manually drawing each rectangle.

This will require creating a cache of (glyphCount * foreColors * backColors) = 32768 Images. Alternatively, just create (glyphCount) images and use image transforms to set the colors appropriately.
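
A rough sketch of the first option, filled lazily so that only combinations actually drawn are rendered. The class name, the 4x8 glyph size, and the font layout (two 16-bit words per glyph, one byte per column, least significant bit at the top) are assumptions based on the LEM1802 monitor specification, not on bdcpu16's code.

```java
import java.awt.image.BufferedImage;

/** Hypothetical lazily-filled glyph cache; names and layout are assumptions. */
final class GlyphCache {
    static final int GLYPHS = 128;
    static final int COLORS = 16;
    static final int GLYPH_W = 4;
    static final int GLYPH_H = 8;

    // 128 glyphs * 16 foreground * 16 background = 32768 slots.
    private final BufferedImage[] cache = new BufferedImage[GLYPHS * COLORS * COLORS];
    private final int[] font;    // two 16-bit words per glyph
    private final int[] palette; // 16 colors as 0xRRGGBB

    GlyphCache(int[] font, int[] palette) {
        this.font = font;
        this.palette = palette;
    }

    /** Packs (glyph, fore, back) into a single array index. */
    static int index(int glyph, int fore, int back) {
        return (glyph << 8) | (fore << 4) | back;
    }

    /** Returns the cached image, rendering it on first use. */
    BufferedImage get(int glyph, int fore, int back) {
        int i = index(glyph, fore, back);
        BufferedImage image = cache[i];
        if (image == null) {
            image = render(glyph, palette[fore], palette[back]);
            cache[i] = image;
        }
        return image;
    }

    private BufferedImage render(int glyph, int foreRgb, int backRgb) {
        BufferedImage image = new BufferedImage(GLYPH_W, GLYPH_H, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < GLYPH_W; x++) {
            // Columns 0-1 live in the first word, columns 2-3 in the second.
            int word = font[glyph * 2 + x / 2];
            int column = (x % 2 == 0) ? (word >> 8) & 0xff : word & 0xff;
            for (int y = 0; y < GLYPH_H; y++) {
                boolean on = ((column >> y) & 1) != 0;
                image.setRGB(x, y, on ? foreRgb : backRgb);
            }
        }
        return image;
    }
}
```

Entries would have to be dropped whenever the program remaps the font or palette, which this sketch does not handle.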

Hardware devices

http://dcpu.com specifies several hardware devices:

Clock -- thought should be given on how to model passage of time if not using wallclock -- specify clockspeed of processor and simulate time based on number of cycles that have passed?
Floppy drive
Keyboard
Monitor
Sleep Chamber -- no implementation, not useful outside of 0x10c
3D Vector Display -- probably no implementation, seems like a toy

Debug: improve breakpoints on skipped instructions

Currently, if a breakpoint exists on an instruction that is skipped due to a failing IF* test, it will still trigger a breakpoint, even if the UI's "step over skipped instructions" option is set; the UI then skips over the instruction and it appears the next instruction triggered the breakpoint. Ideally, the breakpoint would be skipped if the instruction was skipped. This would require the addition of some logic in the Debugger class to support stepping over skipped instructions.

Debug: scrolling through memory causes corruption of the display

Scrolling with the mouse causes text to over-write itself. Scrolling with the keyboard causes multiple lines to appear to be selected simultaneously. Both seem to be an issue with old elements not redrawing correctly.

proper build system

Building is currently done using Eclipse. A proper build system (ant?) should be set up instead.

Codegen: stricter failures for HWI/HWQ

Currently, if the hardware index read by HWI/HWQ is out of bounds, these instructions will act as no-ops. Preferably, these instructions should fault the CPU as illegal instructions to ease debugging. This is possible now that the CPU is notified of illegal instructions through illegalInstructionExecuted().

not all monitor tests pass

The monitor testing suite from https://github.com/0x10c-crap/emulator-test/ has some failures. Specifically the "Border" and "Default Font" tests fail.

The Default Font test may be ignorable (if it's due to a difference in default font)--but should check that the format of our font dump matches what is expected, since we're using the default passed down by Notch.

The Border test is worrying--test result is only based on whether we push 'Y' or 'N'. That we get it wrong indicates a possible problem with instruction compilation.

skip doesn't work properly with interrupts

Currently, IF* instruction skipping is implemented by setting a skip flag and then checking that flag when executing an instruction. However if an interrupt is received while this is happening, the interrupt handler will continue to skip. This is incorrect.

The solution is to skip all at once before attempting to handle an interrupt.
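
One way this could look, as a sketch: advance PC past the skipped instruction, and past any chained IF* instructions, in a single operation so that no skip flag survives into the interrupt handler. The operand encodings follow version 1.7 of the DCPU-16 specification; `Skipper` and its method names are hypothetical.

```java
/** Hypothetical helper; operand encodings follow version 1.7 of the DCPU-16 spec. */
final class Skipper {
    /** True if the operand value consumes a "next word" after the instruction word. */
    static boolean usesNextWord(int operand) {
        return (operand >= 0x10 && operand <= 0x17) // [register + next word]
            || operand == 0x1a                      // PICK n
            || operand == 0x1e                      // [next word]
            || operand == 0x1f;                     // next word (literal)
    }

    /** Length in words of the instruction whose first word is `word`. */
    static int length(int word) {
        int opcode = word & 0x1f;
        int b = (word >> 5) & 0x1f;
        int a = (word >> 10) & 0x3f;
        int length = 1 + (usesNextWord(a) ? 1 : 0);
        // Special instructions (opcode 0) reuse the b field as their opcode.
        if (opcode != 0 && usesNextWord(b)) {
            length++;
        }
        return length;
    }

    /** IFB..IFU occupy basic opcodes 0x10 through 0x17. */
    static boolean isConditional(int word) {
        int opcode = word & 0x1f;
        return opcode >= 0x10 && opcode <= 0x17;
    }

    /** Skips one instruction plus any chained conditionals; returns the new PC. */
    static int skipFrom(char[] memory, int pc) {
        int word;
        do {
            word = memory[pc & 0xffff];
            pc = (pc + length(word)) & 0xffff;
        } while (isConditional(word));
        return pc;
    }
}
```

The extra cycle charged for each skipped conditional is left out here and would need to be added by the caller.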

Debug: improve decoded instruction cache

DebuggerFrame.DecodedInstructionCache is currently very inefficient:

Every time the CPU is paused, the entire cache is invalidated.
Invalidating a single instruction also invalidates every instruction after it.

Marked as low priority as, although this behavior is quite terrible, it only applies when single-stepping the debugger, where performance is not of extreme importance.

Improving this is likely going to require caching instructions using the current PC value as the "base" for decoding rather than 0. However this will pose a challenge as it's not immediately clear how to decode backwards--given an address, there may be multiple previous addresses that would put us where we currently are. Could scan backwards until we hit a point where there is no ambiguity--guaranteed to exist because we'll eventually hit 0. Not sure how often this would cause us to scan all the way back to 0 however--not sure how common ambiguities are. This would also be a fairly complicated algorithm to implement efficiently.

Codegen: re-add interpreter/"small" precompiled instruction provider

The current instruction providers are:

The dynamically-compiled provider, which requires the JDK to be installed at runtime and can use lots of RAM.
The precompiled provider, which requires ~30MB of additional code (when the rest of the library is 60KB!) and takes quite some time to load.

Neither one of these is ideal. The interpreted instruction provider should be re-added. Alternatively a "small precompiled" provider could be added: instead of generating one class per instruction, generate one class for each possible operator and operand and then plug them together. It would be slightly slower, but the code-size savings would be immense: instead of (operators * operands * operands) = ~55000 classes, we would instead have (operators + operands) = 99 classes.
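
The "small precompiled" idea could be sketched roughly as follows. All names here are hypothetical, machine state is reduced to the eight general registers, and EX handling is omitted; the point is only to show one object per operator and per operand being plugged together at decode time.

```java
import java.util.function.IntBinaryOperator;

/** Hypothetical sketch of composing operators and operands; not bdcpu16's classes. */
final class ComposedInstructions {
    /** Machine state, reduced to the eight general registers for brevity. */
    static final class State {
        final int[] registers = new int[8];
    }

    /** One implementation per operand kind, instead of one class per instruction. */
    interface Operand {
        int get(State state);
        void set(State state, int value);
    }

    static final class RegisterOperand implements Operand {
        private final int index;
        RegisterOperand(int index) { this.index = index; }
        @Override public int get(State state) { return state.registers[index]; }
        @Override public void set(State state, int value) { state.registers[index] = value & 0xffff; }
    }

    static final class LiteralOperand implements Operand {
        private final int value;
        LiteralOperand(int value) { this.value = value & 0xffff; }
        @Override public int get(State state) { return value; }
        // Per the specification, assigning to a literal fails silently.
        @Override public void set(State state, int value) { }
    }

    interface Instruction {
        void execute(State state);
    }

    /** Plugs an operator and two operands together at decode time. */
    static Instruction compose(IntBinaryOperator operator, Operand b, Operand a) {
        return state -> b.set(state, operator.applyAsInt(b.get(state), a.get(state)));
    }

    static final IntBinaryOperator SET = (b, a) -> a;
    static final IntBinaryOperator ADD = (b, a) -> b + a; // EX handling omitted
}
```

The cost is two interface calls per operand access instead of straight-line generated code, which is the "slightly slower" trade-off described above.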

Core: CPU performance ideas

A couple of performance ideas:

Make the CPU single-threaded again. Currently the only place that multithreading is supported is in interrupt() -- we can receive interrupts from separate threads. All interrupts come from either software (INT instruction) or hardware. Software instructions are already interrupting from same thread. By refactoring hardware to only interrupt in cyclesElapsed() we can ensure that all interrupts come from same thread.
Experiment with "step handler." That is, create a StepHandler interface that handles a single step. Two stephandlers: one for instructions/regular execution, one for interrupt checking. step() just calls the current step handler. Normally the instruction step handler is installed. On receiving an interrupt, or enabling interrupts (RFI/IAQ instructions), the interrupt step handler is installed instead. Once it has run, the interrupt step handler reinstalls the instruction step handler (as either it has handled an interrupt, and therefore interrupts are disabled; or it has discovered there are no interrupts to handle).

Things to look out for with this: need to be careful this actually provides an increase in performance (balancing the extra level of indirection per step with the lack of needing to check for interrupts every step). Need to make sure existing code is okay with having 0 cycles returned from step() without assuming it's an error.
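
A small sketch of the step-handler idea, assuming a single-threaded CPU. `StepHandlerCpu`, `StepHandler`, and the counters are hypothetical stand-ins; real instruction execution and interrupt dispatch are replaced by placeholders.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch of the step-handler idea; none of these names come from bdcpu16. */
final class StepHandlerCpu {
    interface StepHandler {
        int step(); // returns the number of cycles consumed
    }

    private final Queue<Integer> interrupts = new ArrayDeque<>();
    int instructionsRun;   // stand-in for real instruction execution
    int interruptsHandled; // stand-in for real interrupt dispatch

    private final StepHandler instructionStep = this::executeInstruction;
    private final StepHandler interruptStep = this::checkInterrupts;
    private StepHandler current = instructionStep;

    /** The hot path: one indirect call, no interrupt check. */
    int step() {
        return current.step();
    }

    /** Queues an interrupt and arranges for the next step to look at the queue. */
    void interrupt(int message) {
        interrupts.add(message);
        current = interruptStep;
    }

    private int executeInstruction() {
        instructionsRun++;
        return 1; // placeholder cycle cost
    }

    private int checkInterrupts() {
        if (interrupts.poll() != null) {
            interruptsHandled++;
        }
        // Either an interrupt was dispatched or there was nothing to do;
        // in both cases regular execution resumes.
        current = instructionStep;
        return 0; // this step consumed no cycles
    }
}
```

In a fuller version, the instructions that re-enable interrupts would also install the interrupt handler. The zero returned by `checkInterrupts()` is exactly the case callers of `step()` would need to tolerate.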

Assembler: promote label-values to immediate literals

Currently the assembler only promotes literal-values to immediate literals. When possible, the assembler should promote label-values to immediate literals as well.
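
For reference, the promotion rule itself is small. In version 1.7 of the specification, operand a can encode the values -1 through 30 inline as 0x20 through 0x3f; anything else needs a "next word" literal (0x1f). The sketch below uses a hypothetical `LiteralPromotion` helper. The harder part for labels, not shown, is that a label's address depends on the sizes of earlier instructions, so an assembler would likely have to repeat its sizing pass until the sizes stop changing.

```java
/** Hypothetical assembler helper; the encoding follows version 1.7 of the DCPU-16 spec. */
final class LiteralPromotion {
    static final int NEXT_WORD_LITERAL = 0x1f;

    /** Returns the operand-a field for `value`, using the inline form when possible. */
    static int encodeOperandA(int value) {
        int v = value & 0xffff;
        if (v == 0xffff) {
            return 0x20; // -1
        }
        if (v <= 30) {
            return 0x21 + v; // 0..30 map to 0x21..0x3f
        }
        return NEXT_WORD_LITERAL;
    }

    /** True if the value fits in the instruction word itself. */
    static boolean isInline(int value) {
        return encodeOperandA(value) != NEXT_WORD_LITERAL;
    }
}
```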

General: unit testing

Currently testing all of the instructions is tedious (and has not been done).

Create a unit test suite to rigorously test the functionality of the CPU. The areas of greatest concern are the arithmetic/shift operators, esp. when dealing with signed/unsigned & overflow/underflow semantics.
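
As an illustration of the kind of boundary cases such a suite might cover, here is a sketch of a small reference model for ADD, SUB, and unsigned MUL, following the EX rules in version 1.7 of the specification. `AluModel` and its methods are hypothetical; a real test would compare these expected values against the CPU's registers after executing the instruction.

```java
/** Hypothetical reference model for a few DCPU-16 (spec 1.7) arithmetic opcodes. */
final class AluModel {
    /** ADD b, a: returns {result, EX}; EX is 0x0001 on overflow, 0 otherwise. */
    static int[] add(int b, int a) {
        int sum = (b & 0xffff) + (a & 0xffff);
        return new int[] { sum & 0xffff, sum > 0xffff ? 0x0001 : 0x0000 };
    }

    /** SUB b, a: EX is 0xffff on underflow, 0 otherwise. */
    static int[] sub(int b, int a) {
        int diff = (b & 0xffff) - (a & 0xffff);
        return new int[] { diff & 0xffff, diff < 0 ? 0xffff : 0x0000 };
    }

    /** MUL b, a (unsigned): EX holds the high 16 bits of the product. */
    static int[] mul(int b, int a) {
        long product = (long) (b & 0xffff) * (a & 0xffff);
        return new int[] { (int) (product & 0xffff), (int) ((product >> 16) & 0xffff) };
    }

    /** Throws if a {result, EX} pair does not match the expected values. */
    static void expect(String name, int[] actual, int result, int ex) {
        if (actual[0] != result || actual[1] != ex) {
            throw new AssertionError(String.format(
                "%s: got %04x/%04x, want %04x/%04x", name, actual[0], actual[1], result, ex));
        }
    }

    /** Boundary cases around 16-bit wraparound. */
    static void runChecks() {
        expect("ADD wrap", add(0xffff, 0x0001), 0x0000, 0x0001);
        expect("ADD no carry", add(0x7fff, 0x0001), 0x8000, 0x0000);
        expect("SUB borrow", sub(0x0000, 0x0001), 0xffff, 0xffff);
        expect("MUL high word", mul(0xffff, 0xffff), 0x0001, 0xfffe);
    }
}
```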

Assembler: improve error messages

Currently the error messages generated by the assembler are very crude. They could be improved a lot. They should include the error column along with the line, and be much more specific in most cases.

Assembler: remove uses of Either<Character, String>

The assembler uses the Either<Character, String> type to represent values that may be literal values (characters) or labels (strings). There's no reason to use such a generic type here; should be replaced by a specific Value class instead.

make interrupts thread-safe

Currently, devices must wait until a safe time (e.g. when cyclesElapsed() is called) before they can call interrupt(). This means if a device wants to interrupt due to an out-of-band event (such as keyboard input) they have to keep their own queue of interrupts. This leads to added logic & storage.

Investigate whether it will hurt performance to add locking, or perhaps replace the current interrupt queue with a lockless/low-lock queue structure.
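
One candidate for the lock-free option is `java.util.concurrent.ConcurrentLinkedQueue` from the standard library. The sketch below assumes devices post from any thread and the CPU thread drains between instructions; `InterruptInbox` is a hypothetical name, and whether this is fast enough would still need measuring.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical thread-safe interrupt inbox; not bdcpu16's actual queue. */
final class InterruptInbox {
    private final ConcurrentLinkedQueue<Integer> pending = new ConcurrentLinkedQueue<>();

    /** Safe to call from any thread, e.g. a UI event thread delivering a key press. */
    void post(int message) {
        pending.add(message & 0xffff);
    }

    /** Called by the CPU thread between instructions; returns -1 if nothing is pending. */
    int poll() {
        Integer message = pending.poll();
        return message == null ? -1 : message;
    }
}
```

Note that this queue is unbounded, so a limit on queued interrupts would have to be enforced separately.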

Hardware: make terminal self-updating

Currently, the monitor on the terminal depends on the main loop to update itself every frame using the Terminal.update() method. It makes more sense to have the terminal keep track of when it should update itself.

Hardware: disk seek time intended to be per-track

Disk seek time is intended to be per-track. However it's currently either 0 (if the requested data is on the same track) or the seek time of a single track (if the data isn't on the same track).

The seek time should be proportional to the number of tracks between the last read and the current one.

Per the specification:

Track seeking time is about 2.4 ms per track.
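
A sketch of the proportional calculation, assuming 18 sectors per track (the figure given in the M35FD floppy drive specification) and sector-addressed reads. `SeekModel` and its constants are illustrative names, not bdcpu16's.

```java
/** Hypothetical seek-time model; constants are assumptions taken from the M35FD spec. */
final class SeekModel {
    static final int SECTORS_PER_TRACK = 18;
    static final long NANOS_PER_TRACK = 2_400_000L; // "about 2.4 ms per track"

    /** Seek time in nanoseconds when moving the head between two sectors. */
    static long seekNanos(int fromSector, int toSector) {
        int fromTrack = fromSector / SECTORS_PER_TRACK;
        int toTrack = toSector / SECTORS_PER_TRACK;
        return Math.abs(toTrack - fromTrack) * NANOS_PER_TRACK;
    }
}
```

For example, moving from sector 0 to sector 36 crosses two tracks, giving 4.8 ms rather than the flat single-track cost described above.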

General: all methods not intended to be used externally should be non-public

Currently, lots of methods are public just because they need to be called across package boundaries. They should all be internal.

Either merge the packages or use some glue classes to pass around access to the privileged methods.

Core: do not call Device.step() on every device for every CPU step

See comment below.

Currently, every call to Cpu.step() loops through the attached hardware and notifies each device that some cycles have elapsed. While this allows for accurate hardware simulation, this loop is slowing us down (I think).

~~Determine if this is actually a performance issue.~~
~~Add option to only occasionally call cyclesElapsed(). Should this be per-number-of-instructions or per-number-of-cycles?~~

Remove mmap/munmap

The CPU has a completely unnecessary mmap/munmap functionality. While it's fun to play with, it slows down every CPU memory access. It needs to go.

Debugger

Add support for debugging:

Stepped execution
Machine state display (memory/registers... devices?)
Machine state modification
Breakpoints
Performance metrics?

Bonus points for displaying the original source alongside the debugged program. This should probably be in a GUI--too much data for text.

CPU should track its own clock speed (constant for CPU, set at time of construction, in Hz)
Hardware should have standard method to receive a number of cycles that have passed
CPU should use this method each time step() is called
Existing cycle-counting hardware should be moved over to this functionality
CPU should have functionality for running for certain amount of "real time", calculate to cycles
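
The last item could be sketched as below: convert elapsed wall-clock time to whole cycles and carry the fractional remainder forward, so repeated short calls do not drift. `CpuClock` is a hypothetical name, and the 100 kHz figure in the usage note is only an example value.

```java
/** Hypothetical real-time-to-cycles converter; not part of bdcpu16. */
final class CpuClock {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    private final long hz; // fixed at construction time
    private long carry;    // leftover (nanoseconds * hz) below one full cycle

    CpuClock(long hz) {
        if (hz <= 0) {
            throw new IllegalArgumentException("clock speed must be positive");
        }
        this.hz = hz;
    }

    /**
     * Returns how many whole cycles fit into `elapsedNanos`, keeping the
     * fractional part for the next call. Assumes elapsedNanos * hz fits in a long.
     */
    long cyclesFor(long elapsedNanos) {
        long scaled = elapsedNanos * hz + carry;
        carry = scaled % NANOS_PER_SECOND;
        return scaled / NANOS_PER_SECOND;
    }
}
```

With `new CpuClock(100_000)`, three consecutive 60 Hz frames of 16,666,667 ns yield 1666, 1667, and 1667 cycles, for 5000 cycles in total.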