Comments (6)
This is pretty reasonable. Thank you guys so much!
from memray.
Is there a way to observe multi-processing information?
Not with live mode. We don't have any way right now for one UI to be ingesting data from multiple processes.
What we do have is the `--follow-fork` option for `memray run`. That will cause it to write one output file per child process, and you can then inspect each of those output files individually, for instance by using `memray flamegraph` to generate a flame graph for each that you can open up in a browser.
This will only work if it's forking and not exec'ing, meaning that it will be able to gather meaningful data if you use a `multiprocessing.Pool`, but not if you use a `subprocess.run()` call. As far as I can tell at a quick glance, though, `DataLoader` seems to be using `multiprocessing`, and so this ought to work.
`--follow-fork` mode is pretty new, so there may still be some kinks to work out. Try it and let me know if you hit any issues.
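For anyone who wants to try this on something smaller than a `DataLoader`, here is a minimal sketch of a fork-based worker pool. The function names and sizes are made up for illustration, and it assumes a platform where the `fork` start method is available (Linux, for example).

```python
import multiprocessing as mp


def allocate(n_items: int) -> int:
    """Allocate a list of 1 KiB chunks in the worker so its capture has something to show."""
    data = [bytes(1024) for _ in range(n_items)]
    return sum(len(chunk) for chunk in data)


def run_pool(sizes, processes=2):
    # Ask for the "fork" start method explicitly: the workers are forked
    # from this process rather than started as fresh interpreters.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=processes) as pool:
        return pool.map(allocate, sizes)


if __name__ == "__main__":
    print(run_pool([1_000, 2_000, 3_000]))
```

Assuming the script is saved as `pool_demo.py` (a hypothetical name), running `memray run --follow-fork -o pool_demo.bin pool_demo.py` should leave one capture file for the parent plus one per forked worker, and each of them can be passed to `memray flamegraph` separately.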
from memray.
Are there any plans to create/extend a reporter to accept and integrate data from multiple capture files? I'm wrapping a multi-worker gunicorn process with memray and I end up with a capture file per worker. Inspecting them separately is useful, but inspecting them all merged together would also provide some insights.
from memray.
There aren't any such plans. When we discussed amongst ourselves, the consensus was that trying to analyze information from multiple processes at the same time was likely to cause more confusion than anything else, and we had trouble coming up with any cases where seeing, say, multiple workers at once would tell you anything that you wouldn't be able to identify by analyzing them individually.
In fact, for the gunicorn case, I would think that what would make the most sense is just to drop the number of workers down to 1 while you're investigating it, so that all requests are reaching the same worker instance.
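As a rough sketch of that suggestion, a gunicorn configuration file is plain Python, so the worker count can be pinned there while profiling. The file name `gunicorn.conf.py` is gunicorn's conventional default; whether this fits the reporter's setup is an assumption.

```python
# gunicorn.conf.py (used only while investigating memory usage)

# A single worker, so every request is handled by the same process
# and ends up in the same capture file.
workers = 1
```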
But you might be seeing something we didn't - can you describe a case where there's some interesting feature of the memory usage of a pool of worker processes that would be difficult to identify by looking at their allocations individually, but easy to identify by looking at their allocations in aggregate?
from memray.
Great, `--follow-fork` works! Here is another question.
The script I provided is meant to trace the copy-on-write (COW) caused by accessing Python objects from a forked process. Accessing a Python object from a forked process changes its reference count, which triggers page duplication. It seems that `memray` does not report memory consumption related to COW.
So does this mean we cannot use `memray` to trace COW?
from memray.
> So does this mean we cannot use `memray` to trace COW?
`Memray` traces two things:
- Requests for allocations to the system allocators: these include `malloc`, `mmap`, `calloc`, `realloc`, `valloc`... and a bunch more.
- The resident size, read directly from the kernel every few milliseconds.
When a process is forked, the memory maps are shared between the parent and the child until a write happens, as you indicate. When the write happens it triggers a page fault raised by the MMU, which in turn causes the kernel to copy the page, update the page table with the new (writable) page, decrement the number of references to the original, and perform the write.
This means that all of this happens in kernel space and therefore `memray` cannot really "see" anything here. The only thing `memray` will be able to see is that the resident size is increased by the kernel when that happens. We don't really have a way to know what operation causes this to happen, as this is deeply underneath us and would require instrumentation or similar.
So the answer is sadly that it is very unlikely that you can use many common profilers to properly trace COW unless they allow instrumentation (like `valgrind` does).
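To watch the effect from outside a profiler, one option is to ask the kernel how much of the child's memory has become private after it touches inherited objects. The sketch below is only an illustration, not something memray does: it assumes Linux with `/proc/self/smaps_rollup` available, and the helper names are invented for this example.

```python
import os
import re

SMAPS = "/proc/self/smaps_rollup"  # Linux-only; assumed to exist on this kernel


def private_dirty_kib(smaps_text: str) -> int:
    """Sum the Private_Dirty fields (in KiB) found in smaps-style text."""
    matches = re.findall(r"^Private_Dirty:\s+(\d+) kB", smaps_text, re.M)
    return sum(int(kib) for kib in matches)


def read_private_dirty() -> int:
    with open(SMAPS) as f:
        return private_dirty_kib(f.read())


def cow_growth_in_child(objects) -> int:
    """Fork, touch every object in the child, and return the child's
    Private_Dirty growth in KiB as reported by the kernel."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        try:
            os.close(read_fd)
            before = read_private_dirty()
            for obj in objects:
                pass  # binding obj updates its refcount, a write to its page
            after = read_private_dirty()
            os.write(write_fd, str(after - before).encode())
        finally:
            os._exit(0)  # never fall through into the parent's code path
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        growth = int(pipe.read())
    os.waitpid(pid, 0)
    return growth


if __name__ == "__main__":
    if os.path.exists(SMAPS):
        data = [object() for _ in range(200_000)]
        print("child Private_Dirty growth (KiB):", cow_growth_in_child(data))
    else:
        print(SMAPS, "is not available on this system")
```

The child never calls an allocator in the loop, so there is nothing for an allocation tracer to record; the growth only shows up in the kernel's per-process accounting.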
from memray.