
Comments (9)

vancejc-mt commented on June 12, 2024

Apologies - I went on vacation shortly after I added this issue (didn't expect you guys would get back to me so fast - super awesome); I'll be looking at all your comments and working through where my issues may have been. If this doesn't rectify things then I'll open a new issue with the problem hopefully narrowed down.

Thanks again!

from amqplib.

cressie176 commented on June 12, 2024

My best guess is there's something very funky going on with your colleague's machine. Hopefully Wireshark will confirm whether you have actually connected to a real broker despite the docker container shutting down.


vancejc-mt commented on June 12, 2024

Alright - you were completely right; we were hitting a separate rabbitmq server when we were connecting to the company VPN (didn't realize my coworker was on it, and I wasn't - just random coincidence that we had an alias on the company network with the same name as our docker container, zzz) which explains why we were getting the reconnection; sorry for spending so much of your time. But thank you for all your help. Going to close the issue.


cressie176 commented on June 12, 2024

Hi @vancejc-mt,

Thanks for taking the time to post such a clear example. I'm very sure that the createChannel and checkExchange calls would fail if the broker was still unavailable, but will try it later to confirm.

In the meantime, I can see one problem with the code you've written (but I don't think it would cause the symptoms you describe). You have registered a permanent error handler on both the close and error events. I believe you can get both of these events, and also you can get multiple error events. With your code it will therefore be possible to reconnect multiple times for the same connection error. I see you've attempted to prevent this with a boolean flag, however if you were unlucky the first reconnection attempt might succeed before the second reconnection attempt was initiated.

Instead, I tend to do something like...

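const connection = this.connection;
// Both 'error' and 'close' may fire for the same failure, so funnel them into a single 'lost' event.
connection.on('error', (err) => {
  console.log('Connection error', err);
  connection.emit('lost');
});
connection.on('close', () => {
  console.log('Connection closed');
  connection.emit('lost');
});
// once() guarantees at most one reconnection attempt per connection.
connection.once('lost', () => {
  this.connection = null;
  this.connect(true).catch((err) => {
    console.log('Error reconnecting', err);
  });
});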
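
The effect of routing both events through a single once('lost') handler can be checked without a broker. The following is a minimal sketch that uses Node's built-in EventEmitter as a stand-in for the amqplib connection; the watch helper and the onLost callback are hypothetical names used only for illustration.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: wires 'error' and 'close' into one 'lost' event.
function watch(connection, onLost) {
  connection.on('error', () => connection.emit('lost'));
  connection.on('close', () => connection.emit('lost'));
  // once() removes the listener after the first 'lost', so a later
  // 'close' (or a second 'error') cannot trigger another reconnect.
  connection.once('lost', onLost);
}

// A bare EventEmitter stands in for the real connection here.
const fake = new EventEmitter();
let reconnects = 0;
watch(fake, () => { reconnects += 1; });

fake.emit('error', new Error('ECONNRESET'));
fake.emit('close');
console.log(reconnects); // 1
```

Even though both events fire, the reconnect callback runs only once.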

I will get back to you after I've had a chance to try your example. Something to try in the meantime is using Wireshark to debug what's going on.


cressie176 commented on June 12, 2024

I also notice you don't set a socket timeout, which you can do as follows:

await amqp.connect({
  protocol: this.protocol,
  hostname: this.hostname,
  port: this.port,
  username: this.username,
  password: this.password,
  heartbeat: 30
}, {
  timeout: 1000,
  clientProperties: {
    connection_name: this.connectionName
  }
});


cressie176 commented on June 12, 2024

"Once I kill the RabbitMQ server"

Out of interest, how are you killing the RabbitMQ server?

"When I reconnect the SDK sometimes appears to connect just fine, despite the RabbitMQ server being down, and will continue without error as I request him to create a channel, and even verify an exchange exists."

And how are you verifying that a channel was created and that the exchange was checked?


cressie176 commented on June 12, 2024

I'm unable to reproduce using docker kill $CONTAINER_ID. Your best option for debugging is Wireshark. When everything works and you filter by amqp, you should see something similar to the following:

5	0.000738	::1	::1	AMQP	84	Protocol-Header 0-9-1
7	0.003534	::1	::1	AMQP	589	Connection.Start 
9	0.006700	::1	::1	AMQP	426	Connection.Start-Ok 
11	0.008094	::1	::1	AMQP	96	Connection.Tune 
13	0.008683	::1	::1	AMQP	96	Connection.Tune-Ok 
15	0.008847	::1	::1	AMQP	92	Connection.Open vhost=/ 
17	0.010071	::1	::1	AMQP	89	Connection.Open-Ok 
19	0.014325	::1	::1	AMQP	89	Channel.Open 
21	0.015889	::1	::1	AMQP	92	Channel.Open-Ok 
23	0.016933	::1	::1	AMQP	111	Exchange.Declare x=issue737 
25	0.018190	::1	::1	AMQP	88	Exchange.Declare-Ok 

When the broker is killed you should not see any more traffic (assuming you keep filtering by amqp). If you remove this filter and instead filter by tcp.dstport == 5672 you should see the SYN packets attempting to establish a connection, i.e.

80	115.160883	::1	::1	TCP	88	60154 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=239402240 TSecr=0 SACK_PERM=1
82	116.165167	::1	::1	TCP	88	60155 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=1919497636 TSecr=0 SACK_PERM=1
84	117.168391	::1	::1	TCP	88	60156 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=1707563111 TSecr=0 SACK_PERM=1
86	118.174436	::1	::1	TCP	88	60157 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=2464711875 TSecr=0 SACK_PERM=1
88	119.180208	::1	::1	TCP	88	60158 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=2489639874 TSecr=0 SACK_PERM=1
90	120.183648	::1	::1	TCP	88	60159 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=3295603972 TSecr=0 SACK_PERM=1
92	121.187520	::1	::1	TCP	88	60160 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=4084392921 TSecr=0 SACK_PERM=1
94	122.190422	::1	::1	TCP	88	60162 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=365481072 TSecr=0 SACK_PERM=1
96	123.196929	::1	::1	TCP	88	60163 → 5672 [SYN] Seq=0 Win=65535 Len=0 MSS=16324 WS=64 TSval=4004402570 TSecr=0 SACK_PERM=1

What I am confident about, though, is that there isn't any caching within amqplib. However unlikely, maybe your code is connecting to a different broker? Alternatively, have you somehow mocked or memoized any functions?
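
One quick way to test the "different broker" theory is to check what the broker hostname actually resolves to on the affected machine. This is a minimal sketch using Node's built-in dns module; resolveBroker is a hypothetical helper, and 'localhost' is only a placeholder for whatever hostname is passed to amqp.connect.

```javascript
const dns = require('node:dns').promises;

// Hypothetical helper: resolve a hostname through the OS resolver
// (the same lookup Node's sockets use by default) and return every
// address it reports.
async function resolveBroker(hostname) {
  const records = await dns.lookup(hostname, { all: true });
  return records.map((record) => record.address);
}

// 'localhost' is a placeholder; substitute the real broker hostname.
resolveBroker('localhost').then((addresses) => {
  console.log('Broker hostname resolves to:', addresses);
});
```

If the addresses differ between two machines, or change when a VPN is connected, the client is probably not talking to the broker you think it is.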


cressie176 commented on June 12, 2024

@vancejc-mt any further update or OK to close?


vancejc-mt commented on June 12, 2024

Sorry to get back to you so late, it's been a crazy week.

So I modified my code to utilize the lost once event, preventing what could have been a race condition (though the boolean should have prevented a double connection; but this implementation is a lot cleaner, thanks!) and also added in a socket timeout. Unfortunately it still happens to work just fine on my machine, while a co-worker's machine is still exhibiting the same problems. My guess is there's something about the way that his RabbitMQ server (hosted on docker) is shutting down that is causing some kind of race condition. My first guess was that it was just reconnecting to RabbitMQ as it was going down, but even adding in a long enough pause before reconnection (15 seconds) to allow the server to fully go down doesn't seem to fix the issue.

Currently our setup is just running both our apollo server (using amqplib) and the RabbitMQ server in separate docker containers on the same network; in order to kill RabbitMQ we just stop the container with docker.

"And how are you verifying that a channel was created and that the exchange was checked?"

So.. currently I'm just trusting that amqplib.connect() throws if the connection failed; I don't see it throwing so I assumed that it's connecting correctly. Is there some follow-up check which I should do on the connection in order to verify that it's a working connection? For the exchange I'm using channel.checkExchange() in order to verify that the exchange exists; it succeeds without throwing. Is there a different call I should be using on the.. channel maybe to verify that things were set up correctly? connection.createChannel() similarly appears to succeed without throwing any errors.

The events that we get back when we shut the RabbitMQ docker container down are first the error event (ECONNRESET) followed by the close event. When these happen we're correctly just getting the single lost handler called; and after a waiting period we go back to connect and it still appears to succeed (on my coworker's machine; on my machine it correctly fails to connect and then just loops trying to reconnect).
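
For reference, the wait-then-retry loop described above looks roughly like the following. This is only a sketch under assumptions, not the actual code from our service: connectWithRetry and sleep are hypothetical helper names, and connect stands for any function that returns a promise, for example a wrapper around amqp.connect.

```javascript
// Hypothetical helper: resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: call `connect` until it resolves, pausing `delayMs`
// between failed attempts. Gives up after `maxAttempts` tries.
async function connectWithRetry(connect, delayMs, maxAttempts = Infinity) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await connect();
    } catch (err) {
      console.log(`Attempt ${attempt} failed: ${err.message}`);
      if (attempt < maxAttempts) await sleep(delayMs);
    }
  }
  throw new Error(`Gave up after ${maxAttempts} attempts`);
}
```

With a 15 second pause this would be called as connectWithRetry(() => this.connect(true), 15000), assuming connect rejects when the broker is unreachable.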

Going to dive into piecing apart what might be happening on his machine using Wireshark starting next week; if I get any more information I'll be posting it here.

Thank you again for all of your help,
Jeff
