Got completion with error about sherman HOT 18 CLOSED

Dicridon avatar Dicridon commented on June 21, 2024
Got completion with error

from sherman.

Comments (18)

Transpeptidase avatar Transpeptidase commented on June 21, 2024

Hi, ./restartMemc.sh only needs to be executed once before each run: execute it on machine 1, but not on machine 2.
Besides, memcached consumes almost no system resources, so you can co-locate it with the Sherman processes.

Dicridon avatar Dicridon commented on June 21, 2024

Thanks for your quick reply, we can now run Sherman successfully!

Dicridon avatar Dicridon commented on June 21, 2024

Sorry for reopening this issue, but when running multi-machine benchmarks, we get the following errors when the number of threads exceeds 4:
on machine 0
image
on machine 1
image

And if we start the two servers at almost the same time, we hit an assertion failure: Assertion page->hdr.sibling_ptr != GlobalAddress::Null() failed

Transpeptidase avatar Transpeptidase commented on June 21, 2024

Can you provide a screenshot of the entire test?

Dicridon avatar Dicridon commented on June 21, 2024

Sorry for my late reply.
image

Transpeptidase avatar Transpeptidase commented on June 21, 2024

I cannot see the complete output of server 0 (the right part of the screenshot).

Dicridon avatar Dicridon commented on June 21, 2024

The missing part is below
image
The "registering 8589934592 memory region" lines are output we added to follow Sherman's execution process. (These outputs are too long and repetitive. I can capture them all.)

Transpeptidase avatar Transpeptidase commented on June 21, 2024

Can you check whether the error is triggered when performing

auto root_addr = dsm->alloc(kLeafPageSize);

or
dsm->write_sync(page_buffer, root_addr, kLeafPageSize);

?

Dicridon avatar Dicridon commented on June 21, 2024

bool res = dsm->cas_sync(root_ptr_ptr, 0, root_addr.val, cas_buffer);

The above line triggers the error
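
For readers unfamiliar with the pattern, the failing call installs the root pointer with a compare-and-swap from 0 to the new root address, so that only one server's root wins. Below is a minimal local sketch of that idea using std::atomic. It is not Sherman's RDMA code path: the function name try_install_root and the in-process slot are hypothetical stand-ins for the remote root-pointer location.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the remote root-pointer slot: 0 means "no root yet".
// Publishes new_root only if the slot still holds 0.
// Returns true for the single caller whose compare-and-swap succeeded.
bool try_install_root(std::atomic<std::uint64_t>& slot, std::uint64_t new_root) {
  assert(new_root != 0);  // 0 is reserved as the "empty" marker in this sketch
  std::uint64_t expected = 0;
  return slot.compare_exchange_strong(expected, new_root);
}
```

In this sketch, a second caller gets false and the slot keeps the first caller's value, which is the behaviour one would expect from the two servers racing to create the tree.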

Transpeptidase avatar Transpeptidase commented on June 21, 2024

Is it OK when the number of threads is 2?
Can you print the information of related variables?

Dicridon avatar Dicridon commented on June 21, 2024

Unfortunately, the 2-thread benchmark currently fails too, with the same error messages. (I wonder whether I should reboot the machines after each run?)
Here are the variables, built with -O0:

image
The root_addr.val's hex value is 0x20000000001. It doesn't look like a valid value.
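
Whether 0x20000000001 is valid depends on how the 64-bit value is packed. As a rough sketch, assume (this layout is not confirmed anywhere in this thread) that the low 16 bits hold a node ID and the upper 48 bits hold an offset. Under that assumption the value decodes to node 1 at offset 0x2000000 (32 MiB), which would be a plausible address rather than garbage. The names DecodedAddress, decode and encode are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (hypothetical): bits 0..15 = node ID, bits 16..63 = offset.
constexpr unsigned kNodeBits = 16;

struct DecodedAddress {
  std::uint16_t node_id;
  std::uint64_t offset;
};

// Split a packed 64-bit address into node ID and offset under the assumed layout.
DecodedAddress decode(std::uint64_t val) {
  return {static_cast<std::uint16_t>(val & 0xFFFF), val >> kNodeBits};
}

// Pack a node ID and a 48-bit offset back into one 64-bit value.
std::uint64_t encode(std::uint16_t node_id, std::uint64_t offset) {
  assert(offset < (1ULL << 48));  // the offset must fit in the upper 48 bits
  return (offset << kNodeBits) | node_id;
}
```

Printing the decoded fields next to the raw value makes it easier to judge whether an address is sensible before suspecting corruption.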

Transpeptidase avatar Transpeptidase commented on June 21, 2024

How about a single thread on each machine? Please also check the RDMA network state by running ibv_write_bw.

Dicridon avatar Dicridon commented on June 21, 2024

Running single-thread benchmarks is sometimes OK and occasionally produces the same error.

ibv_write_bw works fine and our own programs also work.

Dicridon avatar Dicridon commented on June 21, 2024

This issue is weird because we successfully ran the multithreaded benchmark on two machines once, but currently it doesn't work. Maybe it is due to some machine state issue?

Transpeptidase avatar Transpeptidase commented on June 21, 2024

Can you insert while(true) {} after

tree = new Tree(dsm);

?
Let's check whether these two servers can initialize the tree successfully.

Dicridon avatar Dicridon commented on June 21, 2024

Sorry for the very late reply; I'm currently busy with another project.
The two servers can init the tree successfully after adding the loop.

Transpeptidase avatar Transpeptidase commented on June 21, 2024

Hi, can you send your WeChat ID via [email protected]? We can communicate more efficiently through WeChat.

Dicridon avatar Dicridon commented on June 21, 2024

Thank you so much for your help and I've sent my ID to you.
