Giter Site home page Giter Site logo

deduplicate's People

Contributors

garypen avatar tatsuya6502 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

Forkers

tatsuya6502

deduplicate's Issues

moka's benchmarking issues

Hi. First of all, thank you for trying out moka in early days in Apollo Router. I wish you all the best with deduplicate!

I was quite surprised with the moka results. The performance didn't seem to improve when I varied the size of the cache.
...
It's also possible that I've not written my benchmarking code correctly, but I've tried to be consistent between the two crates, so please let me know if you spot any errors I can correct.

Actually, I spotted an error in your benchmarking code. I will create a pull request with the fix and some other enhancements.

In the PR, I did the followings:

  1. In moka benchmark, avoid calling get_text() in every iterations.
    • For moka, get_text() is called in every iterations even though the key already exists.
    • This is the error causing moka's performance was not improved when the cache size was increased.
    • Moving the line let words = get_text(); into the async block fixes the error.
      • After the fix, moka's performance became competitive to deduplicate's performance.
  2. Call moka's cache.sync() in every iterations to process pending eviction.
    • Without doing so, moka will only do batch eviction of entries.
      • This means moka will hold more entries than the specified max capacity, which will improve the hit ratio.
    • This will cause the benchmark to be unfair to deduplicate. So pending evictions in moka should be processed in every iterations.
  3. Show the number of get calls in both benchmarks.
    • This is to check if the benchmark is fair.
      • Since caches are stateful, differences in the number of get calls will cause skews in the hit ratio.
      • For this benchmark, hit ratio will be the most important metric, because the time spent on cache misses is much greater than the time spent on cache hits.
    • To be a fair benchmark, the number of get calls should be the same for both caches.
      • However, Criterion will run the benchmark multiple times to get a stable result. So the number of get calls can be different for each run.
    • I have not found a way to address this issue.
  4. Show the hit ratio (hit_count / get_count) in moka benchmark.
    • This requires moka v0.10.0 or later. So I updated the dependency to moka v0.10.0.
    • I could not found a way to track the hit ratio in deduplicate benchmark.
  5. Add more cache sizes 64, 128 and 256 to reduce the hit ratio.
    • moka's hit ratio do not change much when the cache size is greater than 1024 (hit ratio is ~65% in my env).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.