Comments (2)
What do you want CPU profiling for? There are ways we could measure and incorporate that information, but it would help to know what it would offer over periodic sampling and GVL information.
I only ever used CPU sampling with stackprof because either I wanted to see data from all threads, or I wanted to focus on when the program was active. Vernier records all threads all the time and shows where the GVL is held (which is usually what I actually wanted out of "CPU" time). So far that has been better for me than what I would have used `mode: :cpu` for on stackprof. I also feel like CPU mode has been the cause of a lot of confusion for folks: until recently, looking at a stackprof CPU flamegraph in speedscope was nonsense because it still used the realtime timestamps as weights, and the resolution of `ITIMER_PROF` is poor enough that I think it has led people to misunderstand their programs. I'd like to help avoid making the same mistakes in vernier, but if we find a good way to use CPU utilization I'm open to adding it!
It looks like the firefox profiler will (when enabled in a configuration setting) record CPU time while a thread is awake and display it as a marker (the checkbox made it sound like it recorded CPU time between each sample, but I don't see how that data is used in the profile viewer; more investigation needed). I think it could make a lot of sense for us to include that in our GVL markers ("Thread running", etc.). It will cause some overhead, but probably not too much.
It looks like they also have an experimental feature to record and show CPU utilization as a track in the timeline. However, it requires running a line in the console to enable, so I wouldn't rush to support it until the profiler UI includes it by default.
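As a rough illustration of the per-thread CPU time idea, Ruby already exposes a thread CPU clock that could be snapshotted around a "thread running" span. The marker shape below is hypothetical, not vernier's or firefox-profiler's actual format:

```ruby
# Sketch: measure per-thread CPU time vs wall time around a span,
# the kind of data a profiler could attach to a "Thread running" marker.
def with_cpu_marker(name)
  cpu_before  = Process.clock_gettime(Process::CLOCK_THREAD_CPUTIME_ID)
  wall_before = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  marker = {
    name:   name,
    cpu_s:  Process.clock_gettime(Process::CLOCK_THREAD_CPUTIME_ID) - cpu_before,
    wall_s: Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_before,
  }
  [result, marker]
end

# Busy work accumulates CPU time; sleeping accumulates only wall time.
_, busy = with_cpu_marker("busy")  { 100_000.times { |i| i * i } }
_, idle = with_cpu_marker("sleep") { sleep 0.05 }
```

For a sleeping span, `cpu_s` stays near zero while `wall_s` covers the whole sleep, which is exactly the awake-vs-on-CPU distinction the firefox checkbox seems to be after.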
Thanks for your reply @jhawthorn ! Really cool project :)
> What do you want CPU profiling for?
The short answer is that I'd like to see if we can totally replace stackprof with something new, and we have already been using CPU profiling for continuous profiling data. It allows us to treat CPU time as a resource (which it is: one we're paying for in the cloud) and analyze where it is going. In wall-mode profiles, a lot of the profile tends to be just waiting on I/O.
I'll also add that we don't usually use it for one-off profiles. Rather, we tend to aggregate large numbers of CPU profiles across many instances and requests. This helps identify "hot spots" in the code that spend a lot of time on-CPU, and excludes the time spent waiting on I/O. Our current pipeline submits stackprof output to GCS; a pub/sub worker then converts it to pprof and submits it to Grafana Pyroscope.
I'll probably need to write a gecko-to-pprof converter for the existing timer-based profiles either way, and I already have some experience with the gecko format from writing a converter that went the other way. The extra thread and marker information can be added as labels on pprof samples, so tools like Pyroscope may still be able to query it and filter the samples.
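A minimal sketch of the aggregation step such a converter would do, assuming a heavily simplified gecko-like thread (real profiles add frameTable/funcTable indirection, and real pprof output would go through the protobuf encoding):

```ruby
# Sketch of a gecko -> pprof-style aggregation. Walks the stackTable
# prefix chain for each sample, folds identical stacks together, and
# attaches the thread name as a label a pprof encoder could emit.
def fold_samples(thread)
  stacks = thread["stackTable"]["data"]  # rows: [prefix, frame]
  names  = thread["stringTable"]
  counts = Hash.new(0)

  thread["samples"]["data"].each do |(stack_idx, _time)|
    frames = []
    while stack_idx
      prefix, frame = stacks[stack_idx]
      frames << names[frame]
      stack_idx = prefix
    end
    counts[frames.reverse] += 1
  end

  counts.map do |frames, n|
    { stack: frames, value: n, labels: { "thread" => thread["name"] } }
  end
end

thread = {
  "name" => "main",
  "stringTable" => ["main", "work"],
  "stackTable"  => { "data" => [[nil, 0], [0, 1]] },  # main; main>work
  "samples"     => { "data" => [[1, 0.0], [1, 10.0], [0, 20.0]] },
}
samples = fold_samples(thread)
```

The `labels` hash is where per-thread (or per-marker) metadata would live so Pyroscope-style tools can filter on it.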
Outside of our continuous profiling use case, though, I can see why the on-CPU time in a single profile (e.g. for a single request) probably isn't that interesting, especially for mostly short, I/O-bound requests. In a viewer like firefox-profiler, what you described seems much more useful and detailed.
> Vernier records all threads all the time, and shows where the GVL is held (which is usually what I actually wanted out of "CPU" time). So far that has been better for me than what I would have used mode: :cpu for on stackprof.
Very interesting, I can see why that could be better. If a thread is blocking on I/O, then maybe filtering the profile down to the thread holding the GVL is a better way to get what we are after here. In converting to pprof, perhaps we could label which samples hold the GVL, and discard or filter out the samples from the non-GVL-holding threads.
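The filtering idea could look something like this sketch, assuming converted samples carry a hypothetical `gvl` label (the field names here are made up for illustration):

```ruby
# Sketch: if each converted pprof sample is labeled with whether its
# thread held the GVL at that instant, on-CPU analysis just drops the rest.
samples = [
  { stack: ["app", "compute"], thread: "main",   gvl: true  },
  { stack: ["app", "db_wait"], thread: "main",   gvl: false },
  { stack: ["worker", "io"],   thread: "worker", gvl: false },
  { stack: ["worker", "json"], thread: "worker", gvl: true  },
]

# Keep only samples where the sampled thread was actually running Ruby.
on_cpu = samples.select { |s| s[:gvl] }
```

Keeping the label (rather than discarding at conversion time) has the advantage that the same pprof data can answer both wall-time and GVL-held queries.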
> until recently looking at a stackprof CPU flamegraph on speedscope was nonsense because it still used the realtime timestamps as weights
We ran into this too, and it was indeed very disappointing to see that folks (including myself) were reasoning from false premises because of it. I'm not sure whether the flamegraphs stackprof generates itself have the same problem.
On our internal instance of speedscope, we fixed this by weighting by the number of periods instead. It looks like my colleague Manuel is the one who upstreamed the fix in jlfwong/speedscope#425.
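For reference, the period-count weighting amounts to something like this sketch (the 10ms period and the sample data are illustrative):

```ruby
# Sketch of the fix: with SIGPROF firing once per `period` of CPU time,
# a stack's weight is (number of samples) * period. Weighting by the
# wall-clock timestamp span instead inflates stacks that bracket I/O gaps.
PERIOD_US = 10_000  # 10ms of CPU time per SIGPROF tick (illustrative)

samples = [
  { stack: "a;b", timestamp_us: 0       },
  { stack: "a;b", timestamp_us: 500_000 },  # long I/O gap in between
  { stack: "a;c", timestamp_us: 510_000 },
]

weights = samples.group_by { |s| s[:stack] }
                 .transform_values { |group| group.size * PERIOD_US }
```

Here `a;b` gets 20ms of CPU weight from its two samples; naive timestamp-delta weighting would have charged it the entire 500ms gap.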
> and the resolution of ITIMER_PROF is really poor in a way that I think has made people misunderstand their programs
Can you elaborate on this? Is this due to the delayed capturing of the sample in recent versions, or something in the OS implementation itself? Or, something else?
My understanding is that if a stack comes up often enough in on-CPU profiling based on SIGPROF, it should still be statistically reasonable to infer that it is a heavy user of CPU. I agree that abstracting from periods to time is a bit iffy, but for the stacks that actually do have a heavy weight, I'd still expect weighting by samples * period length to be approximately correct, if a bit of a lie. On the flip side, for infrequent stacks it is usually wrong to infer "I saw this stack once, so it must have used 10ms of CPU time": it could just be a blip, and we have no way of accounting for what was actually on-CPU for most of that period. Is this not the case?
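A back-of-the-envelope sketch of that intuition, treating sample counts as roughly Poisson so the relative error of `n * period` shrinks like `1/sqrt(n)`:

```ruby
# Sketch: estimated CPU time from sample counts, with a rough error bar.
# Frequent stacks get trustworthy estimates; one-off stacks do not.
PERIOD_MS = 10.0  # illustrative SIGPROF period

def estimate(n)
  { cpu_ms: n * PERIOD_MS, rel_error: 1.0 / Math.sqrt(n) }
end

hot  = estimate(400)  # seen 400 times: ~4000ms, ~5% relative error
blip = estimate(1)    # seen once: nominally "10ms", ~100% relative error
```

Which matches the point above: `samples * period` is a reasonable estimate exactly for the heavy stacks, and nearly meaningless for a single-sample blip.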
> It looks like the firefox profiler will (when enabled in a configuration setting) record CPU time for when a thread is awake and display that as a marker ... I think that could make a lot of sense for us to include in our GVL markers ("Thread running" etc)
This seems cool 👍 , I'll need to investigate this more too.