
Comments (6)

shioyadan commented on June 23, 2024

Hello,

Basically, it is difficult to disable the I-cache, D-cache, or branch predictor in a simple way, and manual modification is required.

Simply disabling the I-cache and D-cache will greatly reduce the speed of RSD, since each access will reach the main memory. It is possible to replace the cache with a simple, one-cycle-accessible memory with some manual modifications.

Disabling branch prediction is more complicated, because it is not obvious what "disabling" should mean. It is relatively easy to rewrite the predictor so that it always predicts not-taken. It is much harder to modify the fetcher so that it halts instruction fetching at every branch until the branch resolves, and doing so causes significant performance degradation.

If you can tell me what your goal is, I may be able to suggest a better alternative.

from rsd.

YuanPol commented on June 23, 2024

Hello,

Thanks for your reply. I am developing a performance simulator based on a timing database. Accurate cycle-latency information obtained with Verilator will be used to build this database, and the cycle counts estimated by my simulator will ultimately be compared with the Verilator results. Unfortunately, because of limited time, I cannot add a cache model to my simulator, so the final performance results are not comparable with the Verilator results.

Adding a memory and reconnecting some ports would be one possible approach. I am writing to ask whether there is a configuration that can disable the cache simply, e.g., by writing some control registers. If so, it would take less effort on my part.

Thank you so much! :-)


shioyadan commented on June 23, 2024

In your case, a possible solution is to significantly increase the cache size. With such a setup, cache misses will not occur except for the first access to each line. In particular, benchmarks like CoreMark and Dhrystone that repeatedly run the same loop will almost always hit the cache from the second run onwards. By comparing the performance of running the loop twice with that of running it once, you should obtain results similar to a scenario where every access is a cache hit.
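The arithmetic behind this comparison can be illustrated with a small sketch. The cycle counts below are made-up numbers for illustration, not RSD measurements:

```python
# Estimate warm-cache cycles for a loop by timing one run and two runs,
# then subtracting. The first run pays all the compulsory cache misses;
# with a cache large enough to hold the whole working set, the second run
# is (almost) all hits, so the difference isolates the warm-cache cost.

def warm_cycles(cycles_one_run: int, cycles_two_runs: int) -> int:
    """Cycles of the second (warm) run = total for two runs - total for one."""
    return cycles_two_runs - cycles_one_run

# Hypothetical measurements from two Verilator simulations:
cold = 12_500    # one run: includes compulsory misses on every line
double = 22_700  # two runs back to back

print(warm_cycles(cold, double))  # 10200 cycles for the (almost) all-hit run
```

The same subtraction generalizes to any region of interest, as long as both measurements start and end at the same program points.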


YuanPol commented on June 23, 2024

Thank you for your suggestion, but it doesn't work in my case, because my timing database's granularity is individual instruction blocks of hundreds of instructions; it does not depend on the real context of the full program execution. If you have any other ideas, please tell me. If not, I can close it :-)


YuanPol commented on June 23, 2024

Thank you for your suggestions. Now I will close it :)


shioyadan commented on June 23, 2024

I'm sorry for the late reply.

Have you tried pipeline visualization with Konata? The logs for the pipeline visualizer contain most of the information for each cycle of the core pipeline.

If you want per-instruction-block statistics, analyzing the log may help you. From this log, you can see when each instruction was fetched and committed.

The log format is documented here:
https://github.com/shioyadan/Konata/blob/master/docs/kanata-log-format.md
Running "make kanata" in RSD will generate a log for Konata; see RSD's README. (Note that the make target is spelled "kanata", not "konata".)

By running CoreMark more than once and extracting the information from the second run, as I suggested at the beginning, you may be able to determine the number of execution cycles in each instruction block when most accesses hit the caches.
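As a sketch of how such per-instruction timing could be extracted from the log: the following is not RSD code, and it handles only a subset of the Kanata commands documented at the link above (`C=`/`C` for cycle bookkeeping, `I` for an instruction entering the pipeline, `R` for retirement); the sample log is made up.

```python
def parse_kanata(log_text: str):
    """Return {insn_id: (fetch_cycle, retire_cycle)} from a Kanata log.

    Commands handled (all others -- S, E, L, W, ... -- are ignored):
      C= <cycle>                 set the current cycle
      C  <delta>                 advance the current cycle
      I  <id> <sim_id> <tid>     an instruction enters the pipeline
      R  <id> <retire_id> <type> retire (type 0) or flush the instruction
    """
    cycle = 0
    fetched = {}   # insn id -> cycle at which it entered the pipeline
    timing = {}    # insn id -> (fetch_cycle, retire_cycle)
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        cmd = fields[0]
        if cmd == "C=":
            cycle = int(fields[1])
        elif cmd == "C":
            cycle += int(fields[1])
        elif cmd == "I":
            fetched[int(fields[1])] = cycle
        elif cmd == "R" and fields[3] == "0":  # type 0 = normal retirement
            insn = int(fields[1])
            timing[insn] = (fetched[insn], cycle)
    return timing

# A made-up two-instruction log fragment:
sample = """Kanata 0004
C= 100
I 0 0 0
L 0 0 addi a0, a0, 1
S 0 0 F
C 3
R 0 0 0
I 1 1 0
C 2
R 1 1 0
"""
print(parse_kanata(sample))  # {0: (100, 103), 1: (103, 105)}
```

From such a table, the cycles spent on an instruction block can be estimated, e.g., as the difference between the retire cycles of its last and first instructions.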

By the way, I think defining the number of cycles "consumed" by a fine-grained instruction block is itself not easy. For example, if you use the difference in commit cycles, it may not reflect the effect of instruction-cache misses.

