Is multi-processing supported? about memray HOT 6 CLOSED

bloomberg commented on May 20, 2024

Is multi-processing supported?

from memray.

Comments (6)

semaphore-egg commented on May 20, 2024

This is pretty reasonable. Thank you guys so much!

godlygeek commented on May 20, 2024

Is there a way to observe multi-processing information?

Not with live mode. We don't have any way right now for one UI to be ingesting data from multiple processes.

What we do have is the --follow-fork option for memray run. That will cause it to write one output file per child process, and you can then inspect each of those output files individually, for instance by using memray flamegraph to generate a flame graph for each that you can open up in a browser.

This will only work if it's forking and not exec'ing - meaning that it will be able to gather meaningful data if you use a multiprocessing.Pool, but not if you use a subprocess.run() call. As far as I can tell at a quick glance, though, DataLoader seems to be using multiprocessing, and so this ought to work.
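As a rough illustration of the forking case, here is a minimal sketch of a script whose workers are forked rather than exec'd. The file name pool_example.py and the helpers allocate and run_pool are hypothetical, and the "fork" start method is requested explicitly because it is not the default on every platform.

```python
import multiprocessing as mp


def allocate(n_kib):
    """Allocate n_kib KiB in the calling process and return the byte count."""
    buf = bytearray(n_kib * 1024)
    return len(buf)


def run_pool(sizes, workers=2):
    """Run allocate() for each size in a pool of forked worker processes."""
    # Assumption: the "fork" start method is available (Linux, macOS).
    # Forked children are the case --follow-fork is described as covering.
    ctx = mp.get_context("fork")
    with ctx.Pool(workers) as pool:
        return pool.map(allocate, sizes)


if __name__ == "__main__":
    # Intended to be run as:
    #   memray run --follow-fork pool_example.py
    print(run_pool([256, 512, 1024]))
```

Each worker's allocations should then land in that worker's own capture file, separate from the parent's.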

--follow-fork mode is pretty new, so there may still be some kinks to work out - try it and let me know if you hit any issues.
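To inspect every capture file in turn, a small helper along these lines could build one memray flamegraph command per file. This is only a sketch: flamegraph_commands is a hypothetical helper, and the glob pattern "memray-*.bin*" is an assumption about how the capture files are named, so adjust it to whatever names you actually see on disk.

```python
from pathlib import Path


def flamegraph_commands(capture_dir, pattern="memray-*.bin*"):
    """Return one `memray flamegraph <file>` command per capture file.

    The default pattern is an assumption about capture file naming;
    already generated .html reports are skipped.
    """
    files = sorted(
        p for p in Path(capture_dir).glob(pattern) if p.suffix != ".html"
    )
    return [["memray", "flamegraph", str(p)] for p in files]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run(cmd, check=True)
    # to actually generate the reports.
    for cmd in flamegraph_commands("."):
        print(" ".join(cmd))
```

Each generated HTML report can then be opened in a browser, one per process.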

rossjp commented on May 20, 2024

Are there any plans to create/extend a reporter to accept and integrate data from multiple capture files? I'm wrapping a multi-worker gunicorn process with memray and I end up with a capture file per worker. Inspecting them separately is useful, but inspecting them all merged together would also provide some insights.

godlygeek commented on May 20, 2024

There aren't any such plans. When we discussed amongst ourselves, the consensus was that trying to analyze information from multiple processes at the same time was likely to cause more confusion than anything else, and we had trouble coming up with any cases where seeing, say, multiple workers at once would tell you anything that you wouldn't be able to identify by analyzing them individually.

In fact, for the gunicorn case, I would think that what would make the most sense is just to drop the number of workers down to 1 while you're investigating it, so that all requests are reaching the same worker instance.

But you might be seeing something we didn't - can you describe a case where there's some interesting feature of the memory usage of a pool of worker processes that would be difficult to identify by looking at their allocations individually, but easy to identify by looking at their allocations in aggregate?

semaphore-egg commented on May 20, 2024

Great, --follow-fork works! Here is another question.

The script I provided is meant to trace the copy-on-write (COW) caused by accessing Python objects from a forked process. Accessing a Python object from a forked process changes its reference count, which triggers page duplication. It seems that memray does not report memory consumption related to COW.

So does this mean we cannot use memray to trace COW?
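The reference-counting effect mentioned above can be seen without forking at all. The sketch below assumes a standard CPython build, where the reference count is stored in the object's own header; refcount_delta_on_reference is a hypothetical helper used only for illustration.

```python
import sys


def refcount_delta_on_reference(obj):
    """Return how much obj's reference count changes when one alias is made."""
    before = sys.getrefcount(obj)
    alias = obj  # new reference: CPython updates the count stored in obj itself
    after = sys.getrefcount(obj)
    del alias
    return after - before


if __name__ == "__main__":
    # A plain object() is used because some objects (None, small ints on
    # recent CPython versions) are immortal and keep a fixed count.
    print(refcount_delta_on_reference(object()))
```

Because the count lives inside the object, even a read-only use of an inherited object in a forked child writes to the memory page holding it, which is what makes the kernel copy that page.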

pablogsal commented on May 20, 2024

So does this mean we cannot use memray to trace COW?

Memray traces two things:

Requests for allocations made to the system allocators: these include malloc, mmap, calloc, realloc, valloc, and a number of others.
The resident size, sampled every few milliseconds directly from the kernel.

When a process is forked, the memory maps are shared between the parent and the child until a write happens, as you indicate. When the write happens, it triggers an implicit interrupt (a page fault) generated directly by the MMU, which in turn causes the kernel to update the page table with the new (writable) pages, decrement the number of references to the shared pages, and perform the write.

This means that all of this happens in kernel space, and therefore memray cannot really "see" anything here. The only thing memray will be able to see is that the resident size is increased by the kernel when that happens. We don't really have a way to know which operation causes this to happen, as this happens deep underneath us and would require instrumentation or similar.

So the answer is, sadly, that it is very unlikely that you can use many common profilers to properly trace COW unless they allow instrumentation (like valgrind does).
