fpu's Introduction

IEEE 754 floating point arithmetic

Synthesisable IEEE 754 floating point library in Verilog.

  • Provides Divider, Multiplier and Adder
  • Provides float_to_int and int_to_float
  • Supports Denormal Numbers
  • Round-to-nearest (ties to even)
  • Optimised for area
  • Over 100,000,000 test vectors (for each function)

Test

Dependencies

To run the test suite, you will need the g++ compiler and the Icarus Verilog simulator.

Procedure

For each arithmetic function, a testbench is provided. The testbench consists of a Python script, run_test.py, and a simple C model used as the reference for verification. The C reference model is in the c_test subfolder. To recompile the C model, run the following commands:

~$ cd c_test
~$ g++ -o test test.cpp

The test suite consists of corner cases, edge cases, and 100,000,000 constrained random vectors. It can take several days to run to completion. To run the test suite, run the following command:

~$ ./run_test.py
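
To give a feel for what "corner cases plus random vectors" can look like for a two-operand function, here is a minimal Python sketch. It is not taken from run_test.py: the corner-case list, the function name make_vectors and the use of unconstrained 32-bit random patterns are all assumptions made for illustration.

```python
import random

# Hypothetical corner-case list (an assumption, not the list used by run_test.py).
CORNER_CASES = [
    0x00000000,  # +0
    0x80000000,  # -0
    0x7F800000,  # +infinity
    0xFF800000,  # -infinity
    0x7FC00000,  # a quiet NaN
    0x00000001,  # smallest positive denormal
    0x007FFFFF,  # largest denormal
    0x00800000,  # smallest positive normal
    0x7F7FFFFF,  # largest finite value
    0x3F800000,  # 1.0
]


def make_vectors(n_random, seed=0):
    """Return (a, b) pairs of 32-bit patterns: all corner pairs, then random pairs."""
    rng = random.Random(seed)  # seeded so that a failing run can be reproduced
    pairs = [(a, b) for a in CORNER_CASES for b in CORNER_CASES]
    pairs += [(rng.getrandbits(32), rng.getrandbits(32)) for _ in range(n_random)]
    return pairs
```

Pairing every corner case with every other one keeps the directed part small (100 pairs here) while still exercising combinations such as infinity with zero.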

Interface

Each arithmetic module accepts two 32-bit data streams, a and b, and outputs a data stream z. The stream interface is described in the Chips manual.
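
As a rough illustration of a strobe/acknowledge stream, the Python sketch below models a single channel. It assumes the common convention that a value is transferred on any clock cycle in which the sender's stb and the receiver's ack are both high; the Chips manual remains the authority on the actual protocol, and the function name and arguments are hypothetical.

```python
def transfer_stream(values, ack_per_cycle):
    """Model one stb/ack channel, one loop iteration per clock cycle.

    Assumption: a value moves only in a cycle where stb and ack are both high.
    Returns a list of (cycle, value) pairs in the order they were received.
    """
    received = []
    pending = list(values)
    for cycle, ack in enumerate(ack_per_cycle):
        stb = bool(pending)  # the sender holds stb high while it has data
        if stb and ack:
            received.append((cycle, pending.pop(0)))
    return received
```

For example, transfer_stream([10, 20, 30], [0, 1, 0, 1, 1, 1]) models a receiver that is busy on cycles 0 and 2, so the three values are accepted on cycles 1, 3 and 4.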

fpu's People

Contributors

andreast271, dawsonjon

fpu's Issues

How much time does it require to divide two numbers?

I am asking because I am using Quartus II and cannot analyze the output: waveform observation is limited to 1 microsecond, so I cannot see anything beyond that point. Should division take longer than 1 microsecond?

Managing simulation time for divider circuit

This divider code works for us, and its output appears at 1165 ns. But our requirement is below 500 ns.
Can you suggest ways to reduce the simulation time to 500 ns?

Error(xxxxxxxxxxxxxxxxxxxxxxx) in Division output using the following testbench

testbench:-
`timescale 1ns / 1ps

module Divider_tb;

reg clk, rst;
reg [31:0] input_a;
reg input_a_stb;
reg [31:0] input_b;
reg input_b_stb;
reg output_z_ack;
//reg s_input_a_ack,s_input_b_ack;

wire input_a_ack;
wire input_b_ack;
wire [31:0] output_z;
wire output_z_stb;

Flaoting_32_Divider uut(

    .input_a(input_a),
    .input_b(input_b),
    .input_a_stb(input_a_stb),
    .input_b_stb(input_b_stb),
    //.s_input_a_ack(s_input_a_ack),
    //.s_input_b_ack(s_input_b_ack),
    .output_z_ack(output_z_ack),
    .clk(clk),
    .rst(rst),
   
    .output_z(output_z),
    .output_z_stb(output_z_stb),
    .input_a_ack(input_a_ack),
    .input_b_ack(input_b_ack)
    );

 always #5 clk=~clk;

 initial begin
 
         clk= 1'b0;
       
        end

initial
begin

rst=1'b1;
input_a_stb=1'b1;
input_b_stb=1'b1;
output_z_ack=1'b0;
//s_input_a_ack=1'b1;
//s_input_b_ack=1'b1;

#1 rst=1'b0;
#2 input_a=32'b01000010101101101011000000000000;

#1 input_b=32'b00111110000101000000000000000000;
//#2 s_input_a_ack=1'b1;
//#5 s_input_b_ack=1'b0;
end

initial
begin
$monitor("time=",$time,"input_a =%b,input_b=%b,output_z=%b",input_a,input_b,output_z);
end

endmodule

output:-
time= 0
input_a=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
input_b=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
output_z=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
time= 3
input_a =01000010101101101011000000000000,
input_b=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
output_z=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
time= 4
input_a=01000010101101101011000000000000,
input_b=00111110000101000000000000000000,
output_z=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

a question about multiplication and division algorithms

Thanks for sharing the code.
Can you tell me exactly which algorithms are used for multiplication and division? I was curious.
Single-precision division takes more than 110 cycles, so I guess these may not be the algorithms used in real processors?
(I could not find a contact email, so I thought of raising an issue.)
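
A divider that produces one quotient bit per iteration would account for a cycle count of this order. The Python sketch below shows such a shift-and-subtract (restoring) division on integers. That the library uses this algorithm is an assumption, not something this page confirms; the function name and the bit widths in the example are hypothetical.

```python
def divide_shift_subtract(dividend, divisor, bits):
    """Restoring division: one quotient bit per loop iteration.

    dividend must fit in `bits` bits and divisor must be non-zero.
    Returns (quotient, remainder, iterations).
    """
    quotient = 0
    remainder = 0
    iterations = 0
    for i in range(bits - 1, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor  # the subtraction succeeds: quotient bit is 1
            quotient |= 1
        iterations += 1
    return quotient, remainder, iterations
```

Dividing a 24-bit mantissa that has been shifted left to make room for extra quotient bits, for example divide_shift_subtract(0xC00000 << 26, 0x800000, 50), takes 50 iterations. In hardware each iteration may cost more than one clock cycle, which is how the total can exceed 100.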

ModuleNotFoundError: No module named 'streams'

:~$ python3
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import chips.api.api
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/naga/.local/lib/python3.6/site-packages/chips/__init__.py", line 3, in <module>
    import streams, sinks, process, instruction
ModuleNotFoundError: No module named 'streams'

A + -A = +=0 bug

Although fixed in the single fp codebase, this bug still exists in the double codebase. The fix that is in place ensures that A + -A works when A is positive, but not when A is negative (e.g. 0xff80000000000000 + 0x7f80000000000000).
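
The expected result for this case can be checked with a short Python sketch, assuming Python floats are IEEE 754 doubles with round-to-nearest (true on common platforms). Note that the two patterns are large finite doubles (-2^1017 and +2^1017), not infinities. Under round-to-nearest, IEEE 754 gives x + (-x) = +0 whichever operand is negative. The helper names are hypothetical.

```python
import struct


def double_from_bits(bits):
    """Reinterpret a 64-bit pattern as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def bits_from_double(x):
    """Reinterpret an IEEE 754 double as its 64-bit pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def add_bits(a_bits, b_bits):
    """Add two doubles given as bit patterns; return the sum's bit pattern."""
    return bits_from_double(double_from_bits(a_bits) + double_from_bits(b_bits))
```

With these helpers, add_bits(0xff80000000000000, 0x7f80000000000000) returns 0, the pattern for +0, and so does the same call with the operands swapped.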

missing resp_z

File "./run_test.py", line 60, in run_test
stim_z = open("resp_z");
IOError: [Errno 2] No such file or directory: 'resp_z'

multiplier mismatch

Hi, dawsonjon

I ran C-to-RTL formal verification on the multiplier and found a mismatch.

input:
a : 32'h7f80_0000 (infinity)
b : 32'h0

output:
C implementation : 32'hFFC00000 (NAN)
RTL implementation : 32'h7f80_0000 (infinity)

I googled the IEEE 754 standard and found this table :

Table 4. Multiplication of operands.
image

[ref: http://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview]

It seems that multiplying infinity by zero should give NaN.
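
This can be cross-checked against host floating-point arithmetic with a small Python sketch. The helper names are hypothetical, and the check only tests for NaN: the sign and payload of the NaN produced (such as the 32'hFFC00000 from the C model) depend on the platform.

```python
import math
import struct


def float32_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def multiply_bits(a_bits, b_bits):
    """Multiply two singles given as bit patterns, using host arithmetic."""
    return float32_from_bits(a_bits) * float32_from_bits(b_bits)


def is_nan_product(a_bits, b_bits):
    """True if the product of the two patterns is NaN."""
    return math.isnan(multiply_bits(a_bits, b_bits))
```

Here is_nan_product(0x7F800000, 0x00000000) is True, which agrees with the C model rather than the RTL.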

Add unary operators

  • abs
  • neg
  • ceil
  • floor
  • trunc
  • nearest
  • sqrt

Square root would be more difficult, although I've found that there are several (naïve?) implementations we could use.

License?

What is the project's license? The text in copying.txt looks very similar to MIT; could that be it?

multiplier slow converge

Hi, Jon

I just found another interesting trace in which the multiplier is stuck in state normalize_2 for a long time.

input:
a : 32'h5e_c72e (denormal)
b : 32'h1f_ddeb (denormal)

waveform

I have uploaded the testbench in here.
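
For reference, the expected result for these inputs can be worked out in Python. Both operands have a zero exponent field, so they are denormals, and their exact product is roughly 2^-254, far below the smallest single-precision denormal (2^-149), so it rounds to +0. The sketch only pins down the expected answer and does not explain the RTL behaviour; the helper names are hypothetical.

```python
import struct


def float32_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def float32_bits(x):
    """Round a Python float to single precision and return its bit pattern."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def is_denormal(bits):
    """True if the exponent field is zero and the fraction is non-zero."""
    return ((bits >> 23) & 0xFF) == 0 and (bits & 0x7FFFFF) != 0


def expected_product_bits(a_bits, b_bits):
    """Multiply in double precision, then round the result to single."""
    return float32_bits(float32_from_bits(a_bits) * float32_from_bits(b_bits))
```

Calling expected_product_bits(0x005EC72E, 0x001FDDEB) returns 0, the pattern for +0.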

Regarding Multiplication and exponent

Sir, I was going through your code for the IEEE 754 floating-point multiplier. I noticed that you used z_e <= a_e + b_e + 1; product <= a_m * b_m * 4; which confused me, as I think the code should be z_e <= a_e + b_e + 127; product <= a_m * b_m;. But your version gives the correct answer and mine does not. Can you explain the logic behind it?
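
One way the two lines can be consistent is sketched below in Python. It rests on assumptions this page does not confirm: that a_e and b_e are already unbiased (the 127 is removed when unpacking and added back when packing), that a_m and b_m are 24-bit mantissas with the hidden bit restored, and that the result mantissa is taken from the top 24 bits of the 50-bit product. Under those assumptions the factor of 4 and the +1 cancel exactly: (p * 4 / 2^26) * 2^(a_e + b_e + 1 - 23) = p * 2^(a_e + b_e - 46). The sketch handles normal inputs only and truncates instead of rounding.

```python
import struct


def unpack_normal(x):
    """Split a normal single-precision value into sign, unbiased exponent, mantissa."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127       # bias removed here (assumption)
    mantissa = (bits & 0x7FFFFF) | 0x800000      # hidden bit restored
    return sign, exponent, mantissa


def multiply_normals(a, b):
    """Multiply two normal singles; truncates, so only exact products match."""
    a_s, a_e, a_m = unpack_normal(a)
    b_s, b_e, b_m = unpack_normal(b)
    z_s = a_s ^ b_s
    z_e = a_e + b_e + 1
    product = a_m * b_m * 4                      # 50-bit value
    z_m = product >> 26                          # top 24 bits
    if not (z_m & 0x800000):                     # leading bit clear: normalise
        z_m = (product >> 25) & 0xFFFFFF
        z_e -= 1
    bits = (z_s << 31) | ((z_e + 127) << 23) | (z_m & 0x7FFFFF)  # bias added back
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For instance, multiply_normals(1.5, 2.5) returns 3.75 and multiply_normals(3.0, -0.5) returns -1.5.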

Merge `get_a` and `put_z` states

I've seen that the get_a and put_z states could be merged into a single state without too much hassle, saving some cycles: a new request can be accepted at the same time as a result is delivered, so requests can overlap at that stage. What do you think?

Unsigned integer to real

It seems that the conversions between integers and real numbers only support signed integers. How could we convert from/to unsigned integers and long numbers?
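
As a reference point for what an unsigned conversion would have to compute, here is a minimal Python sketch that turns an unsigned integer into a single-precision bit pattern with round-to-nearest, ties-to-even. It is a software model written for illustration, not code from this library, and the function name is hypothetical. Because Python integers are unbounded, the same function also covers 64-bit ("long") inputs.

```python
def uint_to_float32_bits(n):
    """Convert a non-negative integer (below 2**128) to float32 bits, ties to even."""
    if n == 0:
        return 0
    msb = n.bit_length() - 1            # position of the leading one
    exponent = msb + 127                # biased exponent
    if msb <= 23:
        mantissa = n << (23 - msb)      # exact: fits in 24 bits
    else:
        shift = msb - 23
        mantissa = n >> shift           # keep the top 24 bits
        remainder = n & ((1 << shift) - 1)
        half = 1 << (shift - 1)
        if remainder > half or (remainder == half and (mantissa & 1)):
            mantissa += 1               # round up (ties go to the even mantissa)
            if mantissa == 1 << 24:     # rounding overflowed into a 25th bit
                mantissa >>= 1
                exponent += 1
    return (exponent << 23) | (mantissa & 0x7FFFFF)
```

For example, uint_to_float32_bits(0xFFFFFFFF) gives 0x4F800000, which is 2^32: the largest unsigned 32-bit value is not representable in single precision and rounds up.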

$fopenr update

$fopenr("stim_a"); can be changed to $fopen("stim_a", "r")

Implement `fpu` module

Create a simple fpu module that hosts all the other components and can be used as a black box. It can be just a wrapper over the other components, routing the a, b and z data wires and their handshake signals according to an op selector, almost like a "kitchen sink" example of how to use the components.

In a future iteration, it might be nice to create a more advanced module that allows different operations to execute at the same time to increase performance, with some control using a FIFO or similar to guarantee the order of execution, but that could also be done in an independent project.

Pipelined Design?

Hi, Jon.
Thanks for your open-source FPU design. Amazing!
Having read the Verilog code of the computation units, I was impressed by how the computational flow is implemented as a finite state machine.
However, I am wondering how to pipeline your FPU design in order to improve its throughput. The problem has puzzled me for almost half a month. I want to figure out how to insert pipeline stages into a finite state machine. Is it feasible? I hope you can help. Thank you.

Divider Explanation

Hi! I am a student and I stumbled on your work trying to understand how a floating point divider works but my module is something like this:
image

where the exception code is:
image

So I was wondering how your divider.v relates to this module. I purely want to understand how your code works and the meaning of signals such as input_a_stb and input_a_ack.

Looking forward to a reply!
