LC-3 Assembler

This project is an assembler, written in C (standard C11), for the Little Computer 3 (LC-3) as specified in Introduction to Computing Systems: From Bits and Gates to C and Beyond.

Several LC-3 virtual machine implementations are available online and can be used to run the resulting binary code, for example:

wget https://acg.cis.upenn.edu/milom/cse240-Fall05/handouts/code/LC3sim.jar
java -jar LC3sim.jar

Note: in addition to the instructions described in the specification, the assembler implemented in this project also supports JMPT and RRT. These instructions are variants of JMP and RET, respectively, that have the additional effect of setting the privilege bit in the PSR (Processor Status Register). There is no guarantee, though, that the virtual machines mentioned above will support them.

Here are some other learning resources and references:

Compilation

Run make lc3as CPPFLAGS=-DFAB_MAIN to create the executable lc3as.

When running lc3as on an assembly file (.asm), two files are generated (in the same folder as .asm):

  • binary with extension .obj
  • symbol table with extension .sym

Unit tests

To run the unit tests:

  • install cmocka:

    • brew install cmocka on macOS,
    • sudo apt-get install libcmocka-dev on Ubuntu
  • run make unittest

To get a test coverage report with gcov and lcov:

  • install lcov (brew install lcov on macOS)
  • run make coverage_report
  • report will open in the default browser

Support tools

The folder tools contains some debugging utilities used during the development of this assembler:

  • lc3objdump is a version of objdump to print the binary content of an object file generated by the LC3 assembler; Makefile shows how to run it

Appendix

The following sections contain a brief description of the LC-3 architecture and its assembly language.

LC-3 Instruction Set Architecture (ISA)

Memory organisation

The LC-3 memory has an address space of 2^16 (65,536) locations, and an addressability of 16 bits.

The normal unit of data processed in the LC-3 is 16 bits. We refer to 16 bits as one word, and we say the LC-3 is word-addressable.

Registers

The LC-3 specifies eight general purpose registers, each identified by a 3-bit register number. They are referred to as R0, R1 ... R7.

Registers are used as memory locations to store information. The number of bits stored in each register is 16 (one word).

Registers can be accessed in a single machine cycle, whereas data in memory normally requires more than one cycle to access.
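The memory and register layout described above can be modelled in C roughly as follows. This is only an illustrative sketch: the type lc3_machine and the helpers lc3_mem_read and lc3_mem_write are hypothetical names, not part of this project's source code.

```c
#include <stdint.h>

/* Hypothetical names, used for illustration only. */
enum {
    LC3_MEMORY_WORDS = 1 << 16, /* 2^16 addressable locations */
    LC3_NUM_REGISTERS = 8       /* R0..R7, selected by a 3-bit number */
};

typedef struct {
    uint16_t memory[LC3_MEMORY_WORDS]; /* one 16-bit word per address */
    uint16_t reg[LC3_NUM_REGISTERS];   /* general purpose registers */
} lc3_machine;

/* A uint16_t address can never index outside the 2^16-word array. */
uint16_t lc3_mem_read(const lc3_machine *m, uint16_t address)
{
    return m->memory[address];
}

void lc3_mem_write(lc3_machine *m, uint16_t address, uint16_t word)
{
    m->memory[address] = word;
}
```

Because the machine is word-addressable, each address selects a whole 16-bit word, so a plain array of uint16_t indexed by the address is enough.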

Instruction set

An instruction is made up of two things: opcode and operands.

The instruction set of an ISA is defined by its set of opcodes, data types, and addressing modes. The addressing modes determine where the operands are located.

The LC-3 ISA has 15 instructions, each identified by its unique opcode. The opcode is specified by bits [15:12] of the instruction. Since four bits are used to specify the opcode, 16 distinct opcodes are possible. However, the LC-3 ISA specifies only 15 opcodes. The code 1101 has been left unspecified, reserved for some future need.
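As a small illustration of the opcode field, the following C sketch extracts bits [15:12] from an instruction word and recognises the reserved code 1101. The function names are hypothetical and are not taken from this project.

```c
#include <stdint.h>

/* Extract the opcode stored in bits [15:12] of an instruction word. */
unsigned lc3_opcode(uint16_t instruction)
{
    return (instruction >> 12) & 0xFu;
}

/* 1101 is the only one of the 16 possible codes left unspecified. */
int lc3_opcode_is_reserved(unsigned opcode)
{
    return opcode == 0xDu;
}
```

For example, the word 0x1042 encodes ADD R0, R1, R2, and lc3_opcode(0x1042) yields 1 (binary 0001, the ADD opcode).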

There are three different types of instructions, which means three different types of opcodes:

  • operate instructions: process information
  • data movement instructions: move information between memory and the registers and between registers/memory and input/output devices
  • control instructions: change the sequence of instructions that will be executed (instead of processing them sequentially according to their location in memory)
    • conditional branch
    • unconditional jump
    • subroutine (function) call
    • TRAP (system calls, PC changes to a memory address that is part of the operating system so that the operating system will perform some task on behalf of the program)
    • return from interrupt

Data types

The data type of the operands is 16-bit 2's complement integers.

Addressing modes

An addressing mode is a mechanism for specifying where the operand is located. For instance, an arbitrary 16-bit integer does not fit inside a 16-bit instruction alongside the opcode, so the only way an instruction can operate on such an integer is to store it in memory or in a register and use a reference to that location as the operand.

An operand can generally be found in one of three places:

  • in memory,
  • in a register, or
  • as a part of the instruction (in this case, the operand is called a literal or an immediate)

The LC-3 supports five addressing modes:

  • immediate (or literal)
  • register
  • memory addressing modes:
    • PC-relative: bits [8:0] of the instruction specify an offset relative to the PC. The memory address is computed by sign-extending bits [8:0] to 16 bits and adding the result to the incremented PC
    • indirect: bits [8:0] are used to compute an address exactly as in PC-relative mode, but the memory location at that address holds the address of the operand rather than the operand itself
    • Base+offset: bits [5:0] of the instruction specify an offset relative to a base register. The memory address is computed by sign-extending bits [5:0] to 16 bits and adding the result to the base register.
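The three memory addressing modes can be sketched in C as shown below. The function names and the way the memory array and base register value are passed in are assumptions made for this example; they do not describe how any particular virtual machine is implemented.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of `value` to a 16-bit word. */
uint16_t sign_extend(uint16_t value, unsigned bits)
{
    uint16_t mask = (uint16_t)((1u << bits) - 1u);
    value &= mask;
    if (value & (1u << (bits - 1)))   /* sign bit of the field is set */
        value |= (uint16_t)~mask;     /* fill the upper bits with ones */
    return value;
}

/* PC-relative: incremented PC + sign-extended bits [8:0]. */
uint16_t pc_relative_address(uint16_t incremented_pc, uint16_t instruction)
{
    return (uint16_t)(incremented_pc + sign_extend(instruction & 0x1FFu, 9));
}

/* Indirect: the PC-relative location holds the operand's address. */
uint16_t indirect_address(const uint16_t *memory, uint16_t incremented_pc,
                          uint16_t instruction)
{
    return memory[pc_relative_address(incremented_pc, instruction)];
}

/* Base+offset: base register value + sign-extended bits [5:0]. */
uint16_t base_offset_address(uint16_t base, uint16_t instruction)
{
    return (uint16_t)(base + sign_extend(instruction & 0x3Fu, 6));
}
```

All arithmetic wraps around at 16 bits, which is why the sums are cast back to uint16_t. With an incremented PC of 0x3001 and an offset field of 0x1FF (that is, -1), pc_relative_address returns 0x3000.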

Condition codes

Condition codes allow the instruction sequencing to change on the basis of a previously generated result.

The LC-3 has three single-bit registers (condition codes) that are set (set to 1) or cleared (set to 0) each time one of the eight general purpose registers is written. The three single-bit registers are called N (negative), Z (zero) and P (positive).
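A minimal sketch of how the condition codes could be derived from a value written to a register is given below. It assumes the usual interpretation of the written word as a 16-bit 2's complement integer, where bit 15 is the sign bit; the lc3_cond type and lc3_set_cc function are hypothetical names.

```c
#include <stdint.h>

/* Three single-bit flags; exactly one is set after each register write. */
typedef struct {
    unsigned n : 1; /* value is negative (bit 15 set) */
    unsigned z : 1; /* value is zero */
    unsigned p : 1; /* value is positive */
} lc3_cond;

lc3_cond lc3_set_cc(uint16_t written_value)
{
    lc3_cond cc = {0, 0, 0};
    if (written_value == 0)
        cc.z = 1;
    else if (written_value & 0x8000u)
        cc.n = 1;
    else
        cc.p = 1;
    return cc;
}
```

A conditional branch can then test any combination of the three flags to decide whether to change the instruction sequence.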

LC-3 Assembly language

This is the specification of the assembly language corresponding to the previously described ISA.

The LC-3 assembler is the program that takes as input a computer program written in LC-3 assembly language and translates it into a program in the ISA of the LC-3.

Instructions

(LABEL) OPCODE OPERANDS (; COMMENTS)

  • The OPCODE is a symbolic name for the opcode of the corresponding LC-3 instruction.
  • The number of OPERANDS depends on the operation being performed; operands are separated by commas
  • Labels are symbolic names that are used to identify memory locations that are referred to explicitly in the program
  • Comments are identified by semicolons and are ignored by the assembler

Labels and comments can also appear on a line of their own (without an accompanying instruction). A label always refers to the memory location of the first instruction after the label. Two consecutive labels on the same line are illegal; two consecutive labels on different lines, however, are permitted.
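Since comments are ignored by the assembler, one of the first steps when processing a line can be to cut it at the semicolon. The sketch below shows one possible way to do this in C; strip_comment is a hypothetical helper, not a function from this project, and it assumes that a semicolon inside a double-quoted string (as used by .STRINGZ) is not a comment. Escaped quotes inside strings are not handled.

```c
#include <stddef.h>

/*
 * Truncate `line` in place at the first ';' that is not inside a
 * double-quoted string. Returns the length of the remaining text.
 */
size_t strip_comment(char *line)
{
    int in_string = 0;
    size_t i = 0;
    for (; line[i] != '\0'; ++i) {
        if (line[i] == '"') {
            in_string = !in_string;
        } else if (line[i] == ';' && !in_string) {
            line[i] = '\0'; /* drop the comment */
            break;
        }
    }
    return i;
}
```

Applied to the line `LOOP ADD R0, R0, #1 ; increment`, the function leaves `LOOP ADD R0, R0, #1 ` in the buffer, ready to be split into label, opcode and operands.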

Trap service routines

The assembly language provides some aliases for the TRAP instructions:

  • GETC: TRAP x20
  • OUT: TRAP x21
  • PUTS: TRAP x22
  • IN: TRAP x23
  • PUTSP: TRAP x24
  • HALT: TRAP x25
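The alias list above maps naturally onto a lookup table. The following C sketch translates an alias into the corresponding 16-bit TRAP instruction, using the TRAP opcode 1111 in bits [15:12] and the trap vector in bits [7:0]. The names trap_alias and trap_alias_encode are hypothetical, and the exact upper-case matching is an assumption made for brevity.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *name; /* alias mnemonic */
    uint8_t vector;   /* 8-bit trap vector */
} trap_alias;

static const trap_alias TRAP_ALIASES[] = {
    {"GETC", 0x20}, {"OUT", 0x21},   {"PUTS", 0x22},
    {"IN", 0x23},   {"PUTSP", 0x24}, {"HALT", 0x25},
};

/* Returns the encoded TRAP instruction, or 0 if `mnemonic` is not an alias. */
uint16_t trap_alias_encode(const char *mnemonic)
{
    size_t count = sizeof TRAP_ALIASES / sizeof TRAP_ALIASES[0];
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(mnemonic, TRAP_ALIASES[i].name) == 0)
            return (uint16_t)(0xF000u | TRAP_ALIASES[i].vector);
    }
    return 0;
}
```

For instance, HALT is encoded as 0xF025, exactly as TRAP x25 would be.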

Pseudo-ops (assembler directives)

An assembler directive is a message to help the assembler in the assembly process. Once the assembler handles the message, the pseudo-op is discarded.

  • .ORIG: tells the assembler where in memory to place the LC-3 program
  • .FILL: tells the assembler to set aside the next location in the program and initialize it with the value of the operand.
  • .BLKW: tells the assembler to set aside some number of sequential memory locations (BLocK of Words) in the program
  • .STRINGZ: tells the assembler to initialize a sequence of n + 1 memory locations; the argument is a sequence of n characters, inside double quotation marks; the first n words of memory are initialized with the zero-extended ASCII codes of the corresponding characters in the string; the final word of memory is initialized to 0.
  • .END: tells the assembler where the program ends; any characters that come after .END are ignored by the assembler.
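To make the .STRINGZ rule concrete, here is a short C sketch that expands a string of n characters into n + 1 words. The function stringz_emit and its capacity parameter are assumptions introduced for this example and do not reflect the actual code of this assembler; escape sequences are not processed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write the words generated by .STRINGZ for `text` into `out`.
 * Returns the number of words written (n + 1), or 0 if `out`
 * cannot hold them.
 */
size_t stringz_emit(const char *text, uint16_t *out, size_t capacity)
{
    size_t n = 0;
    while (text[n] != '\0')
        ++n;
    if (capacity < n + 1)
        return 0;
    for (size_t i = 0; i < n; ++i)
        out[i] = (uint16_t)(unsigned char)text[i]; /* zero-extended ASCII */
    out[n] = 0; /* terminating word */
    return n + 1;
}
```

For the argument "Hi", the function produces the three words x0048, x0069 and x0000.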
