Comments (13)

conniey commented on September 8, 2024

I think trying out the V2 beta would be an option for us. Where can I find the details on how to test it?

The beta is available as azure-messaging-eventhubs 5.19.0-beta.2. Setting the environment variable com.azure.messaging.eventhubs.v2=true when running your application will use the V2 stack. The same property can also be set in code through the builder's configuration:

new EventHubClientBuilder()
        .connectionString(properties.connectionString())
        .configuration(new com.azure.core.util.ConfigurationBuilder()
                .putProperty("com.azure.messaging.eventhubs.v2", "true")
                .build())
        .buildProducerClient();

from azure-sdk-for-java.

anuchandy commented on September 8, 2024

To clarify, the cause of Service Bus issue 41489 is unrelated, although both logs show link disconnects; Service Bus 41489 was a thread bottleneck problem.

Service Bus has been running on the V2 stack engine for several months and is generally available. Event Hubs is currently being integrated into the V2 stack engine and is in the testing and feature parity phase, with a beta release in progress. In our local testing, Event Hubs on the V2 stack engine does not hit this issue, which is observed in low-traffic environments.

By the way, @djarnis73, I'll clean up your comment in 41489 to prevent confusion for others who run into an issue similar to that Service Bus case or this Event Hubs issue. Hope that's alright with you.

djarnis73 commented on September 8, 2024

Sure, go ahead. I'm not too deep into our application or the services it uses; I just happened to be the one investigating the issues our log monitoring found. So there is a lot of guesswork on my side.

github-actions commented on September 8, 2024

@anuchandy @conniey @lmolkova

github-actions commented on September 8, 2024

Thank you for your feedback. Tagging and routing to the team member best able to assist.

clevelandcs commented on September 8, 2024

We're also encountering this issue. Judging by the other two issues mentioned above, it sounds pretty widespread for apps that have a low-volume period. The V2 beta opt-in suggested in the comment on #41535 isn't a great immediate fix for production environments. As it stands, we're looking at whether a version downgrade helps, or whether switching to the non-async client and periodically initializing/disposing of the client is a viable workaround. A fix for the async client would be very much preferred, and I'm surprised this made it through testing when it sounds like there are tests for low-frequency clients in place.

djarnis73 commented on September 8, 2024

I'm going to try to implement scheduled sending of keep-alive messages (once every minute) to see if that will stop the bleeding. This will of course require the receiving end to be able to cope with these messages. I will report back with my findings.
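
For readers who want to try the same workaround, here is a minimal sketch of such a scheduled keep-alive, not the code actually deployed in this thread. It assumes the send itself is wrapped in a plain Runnable; the class name KeepAliveScheduler and the thread name are made up for illustration, and no Event Hubs API is used in the sketch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs a keep-alive action at a fixed period on a daemon thread. */
public class KeepAliveScheduler implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "eventhub-keep-alive"); // illustrative thread name
                t.setDaemon(true); // do not keep the JVM alive just for keep-alives
                return t;
            });

    /**
     * @param sendKeepAlive the action that sends one keep-alive message (supplied by the caller)
     * @param period        time between sends, e.g. 1 with TimeUnit.MINUTES
     */
    public KeepAliveScheduler(Runnable sendKeepAlive, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                sendKeepAlive.run();
            } catch (RuntimeException e) {
                // scheduleAtFixedRate cancels all future runs if the task throws,
                // so failures are logged and swallowed here.
                System.err.println("keep-alive send failed: " + e);
            }
        }, period, period, unit);
    }

    /** Stops the schedule; an in-flight send is interrupted. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The Runnable would wrap whatever send call the application already uses, for example `new KeepAliveScheduler(() -> sendOneKeepAliveEvent(), 1, TimeUnit.MINUTES)`, where sendOneKeepAliveEvent is a placeholder for your own method. The receiving side still has to recognise and skip these messages.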

djarnis73 commented on September 8, 2024

I think trying out the V2 beta would be an option for us. Where can I find the details on how to test it?

djarnis73 commented on September 8, 2024

So I'm pretty clueless when it comes to reactive programming but did some googling and was wondering if adding something like this:

static {
    reactor.core.publisher.Hooks.onErrorDropped(t -> {
        log.error("default onErrorDropped received, rethrowing", t);
        throw new RuntimeException(t);
    });
}

would ensure that the exception travels up the reactive stack and is finally thrown to the caller, so that we are notified about the failure? In our case we are passing messages around, and if we got an exception we could put the original message on a dead-letter queue to avoid losing it.

anuchandy commented on September 8, 2024

Hi @djarnis73, unfortunately re-throwing the error from the global Reactor hook will not bubble it up to the send API call that the application made. If we look at the first call stack in the issue description, the send API call is not in it, so an error re-thrown from the hook is signaled to whichever thread happens to invoke the hook, and it likely ends up in the default uncaught exception handler associated with that thread.
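
The thread behaviour described here can be illustrated without Reactor. The following is a plain-Java analogy under that assumption, not the library's code: an exception thrown on another thread reaches that thread's uncaught exception handler and never the code that started the work. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo: an exception thrown on another thread never reaches the caller. */
public class HookThreadDemo {

    /**
     * Runs the task on a separate thread and returns whatever that thread's
     * uncaught exception handler received (null if the task did not throw).
     */
    public static Throwable runOnOtherThread(Runnable task) {
        AtomicReference<Throwable> seenByHandler = new AtomicReference<>();
        Thread worker = new Thread(task, "hook-thread"); // stands in for the thread that invokes the hook
        worker.setUncaughtExceptionHandler((thread, error) -> seenByHandler.set(error));
        worker.start();
        try {
            worker.join(); // wait for the worker; its exception is NOT rethrown here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return seenByHandler.get();
    }
}
```

Calling `HookThreadDemo.runOnOtherThread(() -> { throw new RuntimeException("rethrown"); })` returns normally on the calling thread, and the returned Throwable is the one the handler saw. A try/catch around the call would not catch it, which is why a rethrow inside the hook cannot be used to route a failed message to a dead-letter queue from the sending code.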

djarnis73 commented on September 8, 2024

I'm going to try to implement scheduled sending of keep-alive messages (once every minute) to see if that will stop the bleeding. This will of course require the receiving end to be able to cope with these messages. I will report back with my findings.

The initial result looks good: no exceptions since it was deployed around 18 hours ago. But I do feel it is only a matter of time before another onSessionRemoteClose occurs (perhaps triggered by something other than a timeout), so I still think I will try out V2.

djarnis73 commented on September 8, 2024

I deployed a version of our app to our test environment with V2 enabled and my manual keep-alive removed. The errors in the logs are back, so it looks like V2 has not fixed the issue. Any clues on how to investigate this further (for example, enabling debug logging for a specific logger to confirm that I have enabled V2 correctly)?

anuchandy commented on September 8, 2024

Hi @djarnis73 - regarding logging, a reference log4j2.xml and logback.xml can be found here with DEBUG logging enabled.

To verify that the V2 stack is enabled, we can check whether there are logs with the class name "com.azure.core.amqp.implementation.ReactorConnectionCache", which should be present at the start of the logs when a connection is established.

To give some additional context: even with V2, we will still see the disconnect events ("onSessionRemoteClose", "the connection was closed by container…"). The broker will disconnect if there is no activity, and the next send attempt by the application should force the library to reconnect.

Can we check:

  1. Whether V2 is enabled, by inspecting the logs for entries from the "ReactorConnectionCache" class,
  2. Whether we still see this error with V2: "Error: java.lang.NullPointerException: Cannot invoke java.util.List.add(Object) because this._sessions is null",
  3. Whether the next send attempt reconnects and sends events even though there were session/connection closes or disconnects.
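
The first two checks can be done with a text search over the application log. A minimal sketch, assuming the log is available as a list of lines; the class name V2LogCheck and both method names are hypothetical, and the matching is a plain substring test rather than anything the SDK provides.

```java
import java.util.List;

/** Hypothetical helper for checks 1 and 2: scans log lines for the markers named above. */
public class V2LogCheck {
    // Logger class name that should appear when the V2 stack establishes a connection.
    static final String V2_MARKER = "ReactorConnectionCache";

    /** Check 1: true if any line was logged by ReactorConnectionCache. */
    public static boolean v2Enabled(List<String> logLines) {
        return logLines.stream().anyMatch(line -> line.contains(V2_MARKER));
    }

    /** Check 2: number of lines that report the NullPointerException on _sessions. */
    public static long countSessionNpe(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains("NullPointerException") && line.contains("_sessions"))
                .count();
    }
}
```

A possible use is `V2LogCheck.v2Enabled(Files.readAllLines(Path.of("app.log")))`, where app.log is a placeholder path. The logger must be at a level that emits the ReactorConnectionCache lines, otherwise the first check yields a false negative. The third check (whether the next send reconnects) still needs to be confirmed by reading the log around the disconnect.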
