Giter Site home page Giter Site logo

Comments (15)

SergeyKanzhelev avatar SergeyKanzhelev commented on May 18, 2024

CC: @lmolkova

from applicationinsights-servicefabric.

lmolkova avatar lmolkova commented on May 18, 2024

It seems like a known issue: microsoft/ApplicationInsights-dotnet#693
Which is likely a side effect of microsoft/ApplicationInsights-dotnet-server#723, that reproduces when debugging the app in Visual Studio.

If it is this issue, it should be fixed in Microsoft.ApplicationInsights.DependencyCollector 2.6.0-beta3 and Microsoft.ApplicationInsights.AspNetCore 2.3.0-beta1.

@Tadimsky could you please check versions of DependencyCollector and AspNetCore (and update them).
Also, do you know if the issue reproduces locally under debug, or in production?

If it reproduces with latest versions, could you please check which if dependency telemetry was reported on the same machine immediately after this message? (unless you have very high-loaded service and this does not make sense?)

Could you please also share some of your telemetries for this operation?

Please use following Analytics queries

  1. requests | union dependencies | where id == "|1cafaa09-478a33c4492e47c3.a2a37cb4_"
  2. requests | union dependencies | where id == "'|1cafaa09-478a33c4492e47c3.a2a37cb4_1."
  3. requests | union dependencies | union traces | where operation_Id == "1cafaa09-478a33c4492e47c3.a2a37cb4"

Please remove any sensitive information like iKey (or anything else). You can also email this info to me.

from applicationinsights-servicefabric.

Tadimsky avatar Tadimsky commented on May 18, 2024

Hey @lmolkova this is happening for me in our Production services:

AI (Internal): Operation to stop does not match the current operation. Details: Telemetry Id '|fac5592a-4fd80273c9a61f11.80081fb1_' does not match current Activity '|fac5592a-4fd80273c9a61f11.80081fb1_1.' Ā 

I'll send you an email with the traces.

from applicationinsights-servicefabric.

lmolkova avatar lmolkova commented on May 18, 2024

@Tadimsky thanks for the info!

The error happens when telemetry we are going to track in StopOperation does not match current Activity. We have tracked such telemetries before (by mistake) and started to enforce this in the latest ApplicationInsights Base SDK. Otherwise, it leads to broken correlation.

So, what I see from the logs is:

Service Whiteboards API makes remoting call to SharesService (Iā€™m omitting not interesting pieces)

Id Type OperationId ParentId
|fac5592a-4fd80273c9a61f11.3.1. dependency fac5592a-4fd80273c9a61f11 |fac5592a-4fd80273c9a61f11.3.
|fac5592a-4fd80273c9a61f11.80081fb1_1. request fac5592a-4fd80273c9a61f11 |fac5592a-4fd80273c9a61f11.3.

Eventually when something with Id |fac5592a-4fd80273c9a61f11.80081fb1_ is being reported, it could not, because Activity with |fac5592a-4fd80273c9a61f11.80081fb1_1 is not stopped.

Some strange things:

  1. Dependency has ParentId |fac5592a-4fd80273c9a61f11.3. ā€“ it has never been reported
  2. Request parent is |fac5592a-4fd80273c9a61f11.3. too! While dependency id is |fac5592a-4fd80273c9a61f11.3.1.
  3. Why request Id is fac5592a-4fd80273c9a61f11.80081fb1_1? It should be fac5592a-4fd80273c9a61f11.80081fb1_

So my assumption is:
Activities are created multiple times: e.g. when incoming request starts, we receive several event source events about it. So we attempt to start several nested activities.
Same happens for outgoing requests. This could explain all above discrepancies.

According to @Tadimsky, each service is packed into it's own package and each instance runs on it's own machine, i.e. it does not look like SharedProcess SF issue.

/cc @brahmnes @yantang-msft

from applicationinsights-servicefabric.

an4286sc avatar an4286sc commented on May 18, 2024

@lmolkova I'm getting this error message too in my current project. But only in production. Not while debugging locally. I'm using this telemetry client methods:
TrackAction
TrackTrace
TrackException

No StartOperation or StopOperation.

The Application Insights message is: "AI (Internal): Operation to stop does not match the current operation." No further informations.

from applicationinsights-servicefabric.

yantang-msft avatar yantang-msft commented on May 18, 2024

@an4286sc What's the service fabric runtime and AI ServiceFabric version you are on?
Are you using Reliable Service or Actors? Remoting V1 or V2? Did you by any chance use Shared Process hosting model in Servicefabric.
The StartOperation and StopOperation will be called automatically when the request is coming to your service or when you make outgoing request.

Would you be able to share the project that can reproduce the issue? That would be the easiest way for us to trouble shoot.

Other than that, when you see the "AI (Internal): Operation to stop does not match the current operation.", it should give you the TelemetryId, e.g., '|fac5592a-4fd80273c9a61f11.80081fb1_', could you run analytic query like below, this will give us all telemetries happened during the request.

requests | union dependencies |
union traces | where operation_Id ==
"fac5592a-4fd80273c9a61f11"

from applicationinsights-servicefabric.

an4286sc avatar an4286sc commented on May 18, 2024

@yantang-msft
Thanks for your reply. The error occurs in the stateless serivce with remoting v2 and shared process hosting. We are using AI Service Fabric 2.1.1-beta1.

See the query results in the attached file.
ai_stop_operation_failure.zip

from applicationinsights-servicefabric.

yantang-msft avatar yantang-msft commented on May 18, 2024

@an4286sc I do see discrepancies of the operation ids. And what's the version of Microsoft.ServiceFabric.Services.Remoting? Can you try exclusive process model and see if that solves the problem?
I can try the shared process model later this week and see if I can reproduce the issue. If so, we will keep investigate in how to make it work with shared process model.

from applicationinsights-servicefabric.

yantang-msft avatar yantang-msft commented on May 18, 2024

@Tadimsky @an4286sc I have reproduced this issue and identified the root cause, it's happening for me after I upgraded Microsoft.ApplicationInsights.AspNetCore 2.3.0-beta2. Hopefully you're seeing this error for the same reason.

We are working on the fix, but for some process reason, it will take some time before we can release the fix. A temporary workaround would be downgrade the Microsoft.ApplicationInsights.AspNetCore to 2.2.1.

from applicationinsights-servicefabric.

Tadimsky avatar Tadimsky commented on May 18, 2024

@yantang-msft I am on 2.3.0-beta1, do you think this version is affected as well?

from applicationinsights-servicefabric.

yantang-msft avatar yantang-msft commented on May 18, 2024

@Tadimsky Yes, I confirm 2.3.0-beta1 is also affected.

from applicationinsights-servicefabric.

Tadimsky avatar Tadimsky commented on May 18, 2024

@yantang-msft is this expected to be fixed in 2.1.1?
I'm still seeing the issue:

image

image

from applicationinsights-servicefabric.

yantang-msft avatar yantang-msft commented on May 18, 2024

@Tadimsky Yes, the bug as AI SF side is fixed. Does the huge decrease after 6/25 happened after you upgrade the AI SF package?
It could be somewhere else unrelated to AI SF that is still causing this issue. If you can provide a reproduce, I can definitely take a look and see what's wrong.

from applicationinsights-servicefabric.

Tadimsky avatar Tadimsky commented on May 18, 2024

The huge decrease was due to some increased load we had to process for a few days.

This is total requests into the service:
image

We did deploy the fix on 6/27 though - it does seem to be lower, but it's still occurring:
image

The project is the same one I shared with you earlier.
Is there something I can try to do to debug this?

from applicationinsights-servicefabric.

yantang-msft avatar yantang-msft commented on May 18, 2024

@Tadimsky can you run the query as you did last time and share the result?
You can try remove the AI SF nuget package and see if this still happens. If so then it's could be some other AI package or if you manually called client.StartOperation but didn't call client.StopOperation properly.

from applicationinsights-servicefabric.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    šŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. šŸ“ŠšŸ“ˆšŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ā¤ļø Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.