
Comments (12)

pablogsal commented on May 20, 2024

I'm not gonna pretend to have a good handle on the thread state APIs, but - could we iterate through the interpreter's thread states and, for each of them, use PyThreadState_Swap to make it our current thread state, then call PyEval_SetProfile, then restore our old thread state?

If that's valid, it would only use public APIs, but there's an assertion I don't understand in _PyThreadState_Swap which makes me worry that it wouldn't be valid to do so.

I'm not at my computer so I cannot check but if I recall correctly _PyThreadState_Swap will crash if you try to pass a thread state that is not the one that is currently registered with the GIL, if that is set.

The other method we discussed is much more attainable than messing directly with the thread APIs.

from memray.

godlygeek commented on May 20, 2024

Right, any usage of memray run should work fine, because in those cases our hooks get installed before the application creates any threads.

It's only using with memray.Tracker(...) that isn't working, because doing that in some particular thread leaves any other already-running threads without our hooks installed.


godlygeek commented on May 20, 2024

Wow, I can reproduce that, and that is very, very wrong. The "stack trace unavailable" seems less wrong than the fact that the histogram thought the smallest bucket should be 225 petabytes and the largest one should be 434 million vigintillion yottabytes - even though every bucket except one is empty.

How very strange...


godlygeek commented on May 20, 2024

Well, I've figured out what's happening for each of those bugs. The histogram one is easy: we go off the rails when all of the allocation locations have allocated the same amount of memory (we're careful to avoid a divide by zero, but what we do instead isn't terribly helpful).
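A hedged sketch of that failure mode (this is not memray's actual stats code; `bucket_index` and its guard are illustrative): with log-scaled buckets, the bucket boundaries are derived from the span between the smallest and largest allocation totals, and when every location allocated the same amount that span is zero. Guarding the division badly yields nonsense boundaries like the ones reported above; collapsing to a single bucket is one sane alternative.

```python
import math

def bucket_index(value: int, low: int, high: int, n_buckets: int) -> int:
    """Map a value into one of n_buckets log-scaled buckets over [low, high]."""
    if low == high:
        # Degenerate range: every location allocated the same total.
        # Without this guard the span below would be zero and the division
        # would blow up (or, if "avoided" clumsily, produce absurd bounds).
        return 0
    span = math.log(high) - math.log(low)
    idx = int((math.log(value) - math.log(low)) / span * n_buckets)
    return min(idx, n_buckets - 1)

# Degenerate case: everything lands in one bucket instead of going off the rails.
assert bucket_index(4096, 4096, 4096, 10) == 0
# Normal case: values spread across the log-scaled buckets.
assert bucket_index(5000, 1024, 1 << 20, 10) == 2
```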

The other problem, the one that you raised the issue for, is trickier. What's happening is that, when the tracker is started in one thread, we're failing to install our profile function on the other threads, which means that we're failing to collect the Python stack of the main thread leading up to the allocations. I'm not sure what to do about that just yet.


pablogsal commented on May 20, 2024

The other problem, the one that you raised the issue for, is trickier. What's happening is that, when the tracker is started in one thread, we're failing to install our profile function on the other threads, which means that we're failing to collect the Python stack of the main thread leading up to the allocations. I'm not sure what to do about that just yet.

This is expected: sys.setprofile is per-thread, and threading.setprofile only registers the profile function for threads created after it is called. This means that when you activate the Tracker it will only track the stacks of threads created after it was initialized. We should either document this limitation or try to overcome it, which is tricky, because there isn't a supported way to "install a profile function in every running thread".
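This per-thread behaviour is easy to demonstrate from pure Python (the names `profiler` and `worker` here are illustrative, not memray's):

```python
import threading

seen = set()

def profiler(frame, event, arg):
    # Record which threads ever invoke our profile function.
    seen.add(threading.current_thread().name)
    return profiler

def worker():
    pass  # any Python call in a profiled thread fires 'call' events

# A thread started *before* installing the profile function...
before = threading.Thread(target=worker, name="before")
before.start()
before.join()

# threading.setprofile only registers the function for threads created
# from this point on; already-running (or finished) threads are untouched.
threading.setprofile(profiler)

after = threading.Thread(target=worker, name="after")
after.start()
after.join()

threading.setprofile(None)
print("before" in seen, "after" in seen)  # → False True
```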

Also, even if we manage to install a profile function on an existing thread, the main thread will remain untracked if it never enters or leaves a function while we are tracking from another thread: with no frame push or pop, the profile function never fires.

For the main thread we can call Py_AddPendingCall with the trace-function setter when we are invoked from a different thread, but that has limited usefulness: the pending call only runs once the main thread gets back to executing bytecode.
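The Py_AddPendingCall behaviour can even be sketched from pure Python via ctypes (a rough illustration only; real code would do this from C, and the callback body is where a real implementation would install the profile function). The scheduled callback runs in the main thread the next time its eval loop executes bytecode:

```python
import ctypes
import threading
import time

fired = []

# int (*)(void *) -- the callback signature Py_AddPendingCall expects.
PENDING_CB = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p)

@PENDING_CB
def pending(arg):
    # Executes later, in the *main* thread's eval loop.
    fired.append(threading.current_thread().name)
    return 0

def worker():
    # Schedule the callback from a different thread.
    ctypes.pythonapi.Py_AddPendingCall(pending, None)

t = threading.Thread(target=worker)
t.start()
t.join()

# Keep the main thread executing bytecode so the pending call can run.
deadline = time.time() + 5
while not fired and time.time() < deadline:
    time.sleep(0.01)

print(fired)  # → ['MainThread']
```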

I think we should:

  1. Document the limitation with Tracker and existing running threads.
  2. Fix the stats histogram.


godlygeek commented on May 20, 2024

because there isn't a supported way to "install a profile function in every running thread"

How about if we make install_trace_function do:

PyInterpreterState* interp = PyThreadState_GetInterpreter(PyThreadState_Get());
PyObject* arg = PyLong_FromLong(123);  // create the profile arg once, not per thread
PyThreadState* ts = PyInterpreterState_ThreadHead(interp);
while (ts) {
    // _PyEval_SetProfile takes its own reference to arg, so one object is enough
    if (_PyEval_SetProfile(ts, PyTraceFunction, arg) < 0) {
        PyErr_Clear();
    }
    ts = PyThreadState_Next(ts);
}
Py_DECREF(arg);

Everything this uses is public except for _PyEval_SetProfile.

even if we install a profile function, the main thread will continue to be untracked because it never entered new functions while we are tracking in another thread, so we won't see a frame push or a pop.

When we install the profile function on a thread, perhaps we could also capture its initial stack, since at the point where we would be installing the profile function, we're holding the GIL and could walk the stack backwards from the thread's current frame. We could then apply any later pushes/pops on top of our captured initial stack.
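This initial-stack capture can be prototyped from Python with sys._current_frames() (documented in the sys module, though flagged as implementation-specific); the helper name below is illustrative:

```python
import sys
import threading
import time

def capture_initial_stacks():
    """Snapshot every running thread's current Python stack.

    Mirrors the idea above: while we hold the GIL, walk each thread's
    current frame backwards via f_back, so that later profile events
    can be applied on top of this captured base stack.
    """
    stacks = {}
    for thread_id, frame in sys._current_frames().items():
        names = []
        while frame is not None:
            names.append(frame.f_code.co_name)
            frame = frame.f_back
        names.reverse()  # outermost frame first
        stacks[thread_id] = names
    return stacks

def inner(evt):
    evt.wait()

def outer(evt):
    inner(evt)

evt = threading.Event()
t = threading.Thread(target=outer, args=(evt,))
t.start()
time.sleep(0.2)  # let the worker reach inner() and block

stacks = capture_initial_stacks()
evt.set()
t.join()

# The worker's captured base stack includes outer -> inner (surrounded by
# the threading bootstrap frames and the Event.wait frames inside them).
names = stacks[t.ident]
assert names.index("outer") < names.index("inner")
```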


pablogsal commented on May 20, 2024

Everything this uses is public except for _PyEval_SetProfile.

This means it is not supported and there is no assurance we can do this in the future, that's why I said "supported way" 😉

OTOH, we can go with this for now; meanwhile I'll raise this issue with the rest of the core team and we can consider making _PyEval_SetProfile public or at least semi-private.


godlygeek commented on May 20, 2024

there is no assurance we can do this in the future

Sure, fair enough. It is at least available in 3.7 through 3.11, though. And I can't see much reason why it shouldn't be public: it's just PyEval_SetProfile applied to an arbitrary PyThreadState, and both PyEval_SetProfile and the ability to iterate over PyThreadStates are already public.

meanwhile I raise this issue with the rest of the core team and we can consider making _PyEval_SetProfile public or at least semi-private.

I think that's a good idea. If there's not resistance to that, then that seems like a reasonable way for us to proceed.

If there is resistance to it, then we could always interpose the pthreads functions used to acquire the GIL, giving us a hook into when the GIL is next picked up in whatever other thread is already running 😈


godlygeek commented on May 20, 2024

Another option would be for us to use the public and documented _PyInterpreterState_SetEvalFrameFunc to inject a frame evaluation function instead of a profile function - those are shared across all threads for an interpreter, and would let us detect that tracing has begun on the next Python function call within a thread. Though that seems much more likely to go away in the future than _PyEval_SetProfile would be.


godlygeek commented on May 20, 2024

I'm not gonna pretend to have a good handle on the thread state APIs, but - could we iterate through the interpreter's thread states and, for each of them, use PyThreadState_Swap to make it our current thread state, then call PyEval_SetProfile, then restore our old thread state?

If that's valid, it would only use public APIs, but there's an assertion I don't understand in _PyThreadState_Swap which makes me worry that it wouldn't be valid to do so.


godlygeek commented on May 20, 2024

(I'm guessing that the answer is "no", and that it's not valid to use a thread state on a different OS thread than the one that it was created for, but that doesn't seem to be documented anywhere...)


wilsonchai8 commented on May 20, 2024

If I use 'live' mode, I can get more information about every thread. For example:

import time
from threading import Thread

def hello():
    a = 'h' * 10240
    count = 0
    while count < 15:
        a += a
        count += 1
    time.sleep(10)

t = Thread(target=hello, args=())
t.start()
t.join()

memray run --live main.py

[two screenshots of the memray live TUI]

