Giter Site home page Giter Site logo

Comments (15)

bsergean avatar bsergean commented on May 14, 2024 1

I have a 100% reproducible hang on macOS now (in master ...). That should make the bug easy to repro.

from ixwebsocket.

bsergean avatar bsergean commented on May 14, 2024 1

So the hang was happening because we are polling in the background thread, without a timeout, and the control thread which is closing is not exiting of its main loop before calling poll.

(control thread is waiting on _thread.join()).

Last commit fixes this and we're back to a reliable green build -> 671c9f8

Unfortunately the close and friends unittest are not working now ... I think they are relying on code which can potentially hang. I wonder how we should fix this, but now we have some reliable code that triggers those problems.

from ixwebsocket.

AlexandreK38 avatar AlexandreK38 commented on May 14, 2024

This is the test we commented, it’s more like I said: you stop before connection is marked OPEN and the one after has enough time, then you have your tsan errors. Several stop before being open work, several stop after being connected work too; it seems the transition fwils

from ixwebsocket.

bsergean avatar bsergean commented on May 14, 2024

You're right it's a good test to use to track this down.
There's _requestInitCancellation which we can use to interrupt the code that tries to reconnect.
We probably need more synchronization between the 2 threads, it's clearly a race condition between stop and connect.

WARNING: ThreadSanitizer: data race (pid=3630)
  Write of size 8 at 0x7ffc53c08db8 by thread T5:
    #0 void std::swap<ix::Socket*>(ix::Socket*&, ix::Socket*&) /usr/include/c++/5/bits/move.h:187 (ixwebsocket_unittest+0x0000005d9444)
    #1 std::__shared_ptr<ix::Socket, (__gnu_cxx::_Lock_policy)2>::swap(std::__shared_ptr<ix::Socket, (__gnu_cxx::_Lock_policy)2>&) /usr/include/c++/5/bits/shared_ptr_base.h:1076 (ixwebsocket_unittest+0x0000005d90ce)
    #2 std::__shared_ptr<ix::Socket, (__gnu_cxx::_Lock_policy)2>::operator=(std::__shared_ptr<ix::Socket, (__gnu_cxx::_Lock_policy)2>&&) /usr/include/c++/5/bits/shared_ptr_base.h:1000 (ixwebsocket_unittest+0x0000005d8ef6)
    #3 std::shared_ptr<ix::Socket>::operator=(std::shared_ptr<ix::Socket>&&) /usr/include/c++/5/bits/shared_ptr.h:294 (ixwebsocket_unittest+0x0000005d8b90)
    #4 ix::WebSocketTransport::connectToUrl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocketTransport.cpp:154 (ixwebsocket_unittest+0x0000005ee93e)
    #5 ix::WebSocket::connect(int) /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocket.cpp:169 (ixwebsocket_unittest+0x0000005e36a1)
    #6 ix::WebSocket::checkConnection(bool) /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocket.cpp:251 (ixwebsocket_unittest+0x0000005e3f1d)
    #7 ix::WebSocket::run() /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocket.cpp:284 (ixwebsocket_unittest+0x0000005e45ad)
    #8 void std::_Mem_fn_base<void (ix::WebSocket::*)(), true>::operator()<, void>(ix::WebSocket*) const /usr/include/c++/5/functional:600 (ixwebsocket_unittest+0x0000005e9794)
    #9 void std::_Bind_simple<std::_Mem_fn<void (ix::WebSocket::*)()> (ix::WebSocket*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/include/c++/5/functional:1531 (ixwebsocket_unittest+0x0000005e96ab)
    #10 std::_Bind_simple<std::_Mem_fn<void (ix::WebSocket::*)()> (ix::WebSocket*)>::operator()() /usr/include/c++/5/functional:1520 (ixwebsocket_unittest+0x0000005e94dc)
    #11 std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (ix::WebSocket::*)()> (ix::WebSocket*)> >::_M_run() /usr/include/c++/5/thread:115 (ixwebsocket_unittest+0x0000005e9414)
    #12 <null> <null> (libstdc++.so.6+0x0000000b8c7f)
  Previous read of size 8 at 0x7ffc53c08db8 by main thread:
    #0 std::__shared_ptr<ix::Socket, (__gnu_cxx::_Lock_policy)2>::operator->() const /usr/include/c++/5/bits/shared_ptr_base.h:1055 (ixwebsocket_unittest+0x0000005aa4c0)
    #1 ix::WebSocketTransport::close(unsigned short, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, bool) /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocketTransport.cpp:1035 (ixwebsocket_unittest+0x0000005f320a)
    #2 ix::WebSocket::close() /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocket.cpp:217 (ixwebsocket_unittest+0x0000005e3d6a)
    #3 ix::WebSocket::stop() /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocket.cpp:147 (ixwebsocket_unittest+0x0000005e3502)
    #4 stop /home/travis/build/machinezone/IXWebSocket/test/cmd_websocket_chat.cpp:95 (ixwebsocket_unittest+0x0000005c4b42)
    #5 ____C_A_T_C_H____T_E_S_T____0 /home/travis/build/machinezone/IXWebSocket/test/cmd_websocket_chat.cpp:326 (ixwebsocket_unittest+0x0000005c6ab8)
    #6 Catch::TestInvokerAsFunction::invoke() const /home/travis/build/machinezone/IXWebSocket/test/Catch2/single_include/catch.hpp:11868 (ixwebsocket_unittest+0x000000426050)
    #7 Catch::TestCase::invoke() const /home/travis/build/machinezone/IXWebSocket/test/Catch2/single_include/catch.hpp:11769 (ixwebsocket_unittest+0x000000425462)
    #8 Catch::RunContext::invokeActiveTestCase() /home/travis/build/machinezone/IXWebSocket/test/Catch2/single_include/catch.hpp:10622 (ixwebsocket_unittest+0x00000041e755)
    #9 Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /home/travis/build/machinezone/IXWebSocket/test/Catch2/single_include/catch.hpp:10595 (ixwebsocket_unittest+0x00000041e3ba)
    #10 Catch::RunContext::runTest(Catch::TestCase const&) /home/travis/build/machinezone/IXWebSocket/test/Catch2/single_include/catch.hpp:10365 (ixwebsocket_unittest+0x00000041c4da)
    #11 runTests /home/travis/build/machinezone/IXWebSocket/test/Catch2/single_include/catch.hpp:10924 (ixwebsocket_unittest+0x000000420138)
    #12 Catch::Session::runInternal() /home/travis/build/machinezone/IXWebSocket/test/Catch2/single_include/catch.hpp:11119 (ixwebsocket_unittest+0x000000421831)
    #13 Catch::Session::run() /home/travis/build/machinezone/IXWebSocket/test/Catch2/single_include/catch.hpp:11076 (ixwebsocket_unittest+0x0000004214cd)
    #14 int Catch::Session::run<char>(int, char const* const*) <null> (ixwebsocket_unittest+0x00000048ca3f)
    #15 main /home/travis/build/machinezone/IXWebSocket/test/test_runner.cpp:25 (ixwebsocket_unittest+0x00000043c694)
  As if synchronized via sleep:
    #0 nanosleep <null> (libtsan.so.0+0x000000043c7a)
    #1 void std::this_thread::sleep_for<double, std::ratio<1l, 1000l> >(std::chrono::duration<double, std::ratio<1l, 1000l> > const&) /usr/include/c++/5/thread:292 (ixwebsocket_unittest+0x000000562956)
    #2 ix::WebSocket::checkConnection(bool) /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocket.cpp:247 (ixwebsocket_unittest+0x0000005e3eed)
    #3 ix::WebSocket::run() /home/travis/build/machinezone/IXWebSocket/ixwebsocket/IXWebSocket.cpp:284 (ixwebsocket_unittest+0x0000005e45ad)
    #4 void std::_Mem_fn_base<void (ix::WebSocket::*)(), true>::operator()<, void>(ix::WebSocket*) const /usr/include/c++/5/functional:600 (ixwebsocket_unittest+0x0000005e9794)
    #5 void std::_Bind_simple<std::_Mem_fn<void (ix::WebSocket::*)()> (ix::WebSocket*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/include/c++/5/functional:1531 (ixwebsocket_unittest+0x0000005e96ab)
    #6 std::_Bind_simple<std::_Mem_fn<void (ix::WebSocket::*)()> (ix::WebSocket*)>::operator()() /usr/include/c++/5/functional:1520 (ixwebsocket_unittest+0x0000005e94dc)
    #7 std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (ix::WebSocket::*)()> (ix::WebSocket*)> >::_M_run() /usr/include/c++/5/thread:115 (ixwebsocket_unittest+0x0000005e9414)
    #8 <null> <null> (libstdc++.so.6+0x0000000b8c7f)

from ixwebsocket.

AlexandreK38 avatar AlexandreK38 commented on May 14, 2024

from ixwebsocket.

AlexandreK38 avatar AlexandreK38 commented on May 14, 2024

I may have an idea of the problem and how to fix it, I will give it a try tomorrow

from ixwebsocket.

bsergean avatar bsergean commented on May 14, 2024

If you can quickly describe your idea that would be great.

I was thinking about having 2 different socket objects, but that sounds like a bandaid and a bad idea.

from ixwebsocket.

AlexandreK38 avatar AlexandreK38 commented on May 14, 2024

The idea is to prevent the checkConnection() or at least the connect() inside it to set the _socket var before the close() is done (as the close is called on the other thread).
My idea is to have a mutex in Websocket.h and lock it in .cpp, in close() method before the call to _ws.close(), and also in the checkConnection(); the place in the latter would probably be at the first line in the loop but to be check for the behavior
Anyway that’s the idea :)

from ixwebsocket.

AlexandreK38 avatar AlexandreK38 commented on May 14, 2024

The place of the mutex in the checkConnection has to be placed carefully because we need to be sure that we sent data on the background thread when closing on the old socket

from ixwebsocket.

AlexandreK38 avatar AlexandreK38 commented on May 14, 2024

And if this is working, it may also solve the same tsan errors we had in the other test of close code and reason, when server is closing; if not the fix will looks like this one but adapted for no thread

from ixwebsocket.

bsergean avatar bsergean commented on May 14, 2024

from ixwebsocket.

AlexandreK38 avatar AlexandreK38 commented on May 14, 2024

Okay, great 👍

from ixwebsocket.

AlexandreK38 avatar AlexandreK38 commented on May 14, 2024

I’ll be happy if all the errors preventing the PR to be merged and so on are finally gone

from ixwebsocket.

bsergean avatar bsergean commented on May 14, 2024

So I added a mutex (recursive_mutex actually), I've enabled the test that was causing trouble, and now the tsan errors are gone. I've put the mutex in WebSocketTransport, so that they are kindof fine grained.

The sad news is that I'd like not to use a recursive mutex. Now the failure on travis seems to be on macOS. If you can try to make a PR and try different strategies.

I think that some of the complications relate to the fact that we are more correct now when processing the close code. But I'm not too sure about that. Maybe that problem has always been there.

from ixwebsocket.

bsergean avatar bsergean commented on May 14, 2024

9315eb5

from ixwebsocket.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.