Giter Site home page Giter Site logo

Comments (15)

tgross avatar tgross commented on May 19, 2024

Yeah, there's definitely still some kind of race in the setup/teardown of the signal handlers. I gave justen bad advice on this. I'll have a PR with a fix shortly.

from containerpilot.

justenwalker avatar justenwalker commented on May 19, 2024

I ran it again and it succeeded - so its a race

https://travis-ci.org/justenwalker/containerbuddy/builds/94959813

Is it the order in which sendAndWaitForSignal blocks?

    <-sig
    runtime.Gosched()

vs.

    runtime.Gosched()
    <-sig

from containerpilot.

tgross avatar tgross commented on May 19, 2024

With that ordering it fails consistently when I run it locally. I suspect the mental model we're using here is flawed somehow about how signal handlers should be set up.

from containerpilot.

tgross avatar tgross commented on May 19, 2024

I've got a good workaround and I'm testing it now on a WIP branch... we'll set up the signal handler in a helper and then mark that as such in a global bool in the test module. This way we verify that we've only set it up once; the state should be safe to reuse between the signal tests so long as each test covers different signals.

from containerpilot.

justenwalker avatar justenwalker commented on May 19, 2024

I don't know why, but this seems to work:
Edit: Scratch that. I increased it to 1000 and it dies.

    runtime.Gosched()
    <-sig
    runtime.Gosched()
    signal.Stop(sig) //May not be needed

I added a test that re-runs the test 100 times, and it passes.
Dies with 1000 iterations.

func TestMaintenanceSignalRace(t *testing.T) {
    for i := 0; i < 100; i++ {
        TestMaintenanceSignal(t)
    }
}

from containerpilot.

tgross avatar tgross commented on May 19, 2024

The only problem with that is that it might only work on your machine (the test that's failing never failed on my machine before, for example, and it still doesn't). If we remove the need to reset the handlers, we can avoid the question entirely. I've got a PR coming to that effect.

from containerpilot.

justenwalker avatar justenwalker commented on May 19, 2024

Figured - thats what I get for coding by coincidence

from containerpilot.

tgross avatar tgross commented on May 19, 2024

Interesting... apparently I'm totally wrong. It worked the first time I pushed up the commit (on a very dirty work-in-progress branch) and then not afterwards: https://travis-ci.org/tgross/containerbuddy/builds/94967601

from containerpilot.

tgross avatar tgross commented on May 19, 2024

Working on this is a little tricky because I can't force the bug on my machine.

from containerpilot.

tgross avatar tgross commented on May 19, 2024

@justenwalker I tried what you'd suggested in https://travis-ci.org/tgross/containerbuddy/builds/94972692 and it results in the same kinds of errors on Travis.

I'm going to do some integration testing on this and make sure that the signal handler is actually working as I expect and that this isn't just a test artifact.

from containerpilot.

tgross avatar tgross commented on May 19, 2024

Ok, so fortunately in an integration test this seems to work as we'd expect. That doesn't eliminate a possible race in the implementation that happens when signals arrive close together, of course. But it does lead me to believe that the problem is in the test rig.

from containerpilot.

tgross avatar tgross commented on May 19, 2024

Think I got it: #33

from containerpilot.

tgross avatar tgross commented on May 19, 2024

@justenwalker I slept on this one a bit and woke up this morning with some new insights. We're making what is probably a classic mistake of trying to force behavior in the test by forcing the test thread to yield or by sleeping. We're trying to fix a concurrency bug within asynchronous signals, and we can't coerce the signals to be received by the signal package synchronously.

I've spent a little time trying to force synchronicity into this and I think the best approach will be to factor out the signal handlers from the signal receivers such that we can unit test the signal handling code properly without worrying about concurrency, and then just have a simple test that verifies we've wired up the handlers correctly. I'll have a PR inbound for this shortly.

from containerpilot.

tgross avatar tgross commented on May 19, 2024

I've opened #36 with what I believe to be the correct solution to the problem.

from containerpilot.

tgross avatar tgross commented on May 19, 2024

Fix merged and build is now marked as passing. Thanks for the assist on this, @justenwalker

from containerpilot.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.