Comments (27)
For the record, I experimented with -fprof-late
a bit and the trick to get all dependencies to compile with -fprof-late
I came up with is to put into your cabal.project.local
:
package *
ghc-options: -fno-prof-auto -fprof-late
from optics.
Ok, I didn't see this bit:
As others already stated this doesn't apply prof-late to dependencies which is a shortcomming of cabal. To work around this you can use the cabal.project trick mentioned further up.
Looks like a bug in cabal to me if profiling
affects dependencies, but profiling-detail
doesn't.
from optics.
profiling does get into way of optimizations. It's results are unreliable. Try -fprof-late
with GHC 9.4
from optics.
I have built my project with ghc 9.4 and with
cabal run --enable-profiling --profiling-detail=none --ghc-option -fprof-late --ghc-option -with-rtsopts=-pj
It appears the prof-late
does not apply to the dependencies, but only to my project. Is this what you wanted me to do?
castOptic
and beneath it repIso
, Lens.first
, Lens.second
are still present in the most expensive function of my program.
I don’t believe that you can tell anything from a speedscope screenshot, but maybe it illustrates, vaguely, what I am talking about:
I am not sure if the simplifier could profit from inlining castOptic
, but I think it’s plausible that it could help.
(maybe for context: In this benchmark this processJsonMessage
function is being called on over 1M json messages, so squeezing even a tiny bit out of that would probably help perfomance.)
from optics.
This might be similar to #324.
from optics.
Also, @maralorn where does the optics code that doesn't optimize come from? Is it hand written?
from optics.
Also, optics
doesn't define repIso
. This function comes from generic-lens-core
. For the record, optics
provides built-in generic lenses and prisms, not sure why do you use generic-optics
.
from optics.
@arybczak Thank you for helping me understanding this.
You are right, I was using generic-optics
here. I didn’t know that optics has generic lenses, thx for the hint. I removed generic-optics
from my project. Replacing field
with gfield
actually made the function 25% faster (in this benchmark), so that’s a huge win!
Still castOptic
remains in the eventlog. My code in questions is the following function:
(gfield @"activities" %~ IntMap.adjust (_3 ?~ progress) id'.value)
I apply it to a record which has a field called activities
of type IntMap _
.
I wonder if the problem is, that I haven’t yet brought cabal to build optics
with -fprof-late
. I think the -fprof-late
currently only applies to my project. Maybe castOptic
will disappear when I get it to do that? Although apparently all the operators from optics
which I am using seem to get inlined correctly. Only castOptic
and #
are remaining.
from optics.
Replacing field with gfield actually made the function 25% faster (in this benchmark), so that’s a huge win!
That's great news 👍 FYI you can use gfield
via OverloadedLabels
, like this:
(#activities %~ IntMap.adjust (_3 ?~ progress) id'.value)
Regarding the topic, let's experiment. Can you try #470 and see whether it fixes the problem?
from optics.
(I am not sure, why I refrained from using OverloadedLabels, I think it’s because sometimes the error messages get more confusing.)
Well, yes, applying that patch makes castOptic disappear from the eventlog. One operator remained: #.
so I additionally created 4d7be0d and now presumably everything get’s inlined, at least there are no visible cost centers in the function anymore, I haven‘t looked at the core though.
Now I don‘t know if the function got faster again, I have run the benchmark muitiple times and it looks like it might be a improvement of about 10% but it might not be statistically significant.
So yes, castOptic
get’s inlined with that change, I am just not sure how important that is.
from optics.
I did an experiment with using review
in a code with profiling enabled and I didn't get castOptic
there, but I got #.
and .#
.
I don't see a harm in marking there functions INLINE because they and very small and thus always get inlined when profiling is off anyway, but the more fundamental issue here is that profiling in GHC interferes with optimizations, so other, similar problems will most likely pop in the future 😞
from optics.
A friendly reminder about -fprof-late
.
from optics.
Well, if this is really just being caused by the profiler preventing inlining, then let‘s just close this issue.
I hear you @phadej, but then the real issue here is that there are currently no resources on the internet about the relation between -fprof-late and dependencies build with cabal. I am actually not even sure what -fprof-late is supposed to do when building a library …
from optics.
the real issue here is that there are currently no resources on the internet about the relation between -fprof-late and dependencies build with cabal.
... which library maintainers cannot really address. Doing stuff with cabal
(or tbh, any build tool) which is just recently got added to GHC is often troublesome. Indeed, I have no good idea how to pass profiling specific flags to dependency package builds, and only when cabal
profiling build is done. (My first idea is to build a custom cabal
which does -fprof-late
instead of -fprof-auto
, but I wouldn't call that user friendly, nor really recommend to anyone).
from optics.
Yeah, it’s obviously not an optics issue. I just insisted, to make sure, that I haven’t been making a stupid mistake.
from optics.
Thx, anyway for taking the time to talk this through with me.
from optics.
@maralorn I'm running into very similar issues with https://discourse.haskell.org/t/debugging-why-late-profiling-detail-still-shows-optics-unexpectedly/6391. I'm trying to experiment with @arybczak's suggestion.
from optics.
I've created a minimal, reproducible example showing the performance differences between lens and optics while profiling here: https://github.com/siraben/lens-optics-profiling
from optics.
Very interesting. Thank you!
from optics.
@siraben can you check with #491?
from optics.
Ok, I checked it myself and it does indeed fix it.
from optics.
I am still likely confused whether this fixes a problem that only happens during profiling and if so, why?
from optics.
Looks like GHC doesn't inline small/trivial functions when profiling is enabled unless they are marked with INLINE pragmas. As for why this happens, it would be best to ask GHC devs.
from optics.
Looks like GHC doesn't inline small/trivial functions when profiling is enabled unless they are marked with INLINE pragmas. As for why this happens, it would be best to ask GHC devs.
I don't think this is the case. Rather cabal ends up invoking ghc with both -fprof-auto and -fprof-late which ends up inserting the cost centres implied by both modes.
I've created a minimal, reproducible example showing the performance differences between lens and optics while profiling here: https://github.com/siraben/lens-optics-profiling
This was very helpful. At least in the reproducer this issue is that cabal will pass "-fprof-auto" while you then later pass -fprof-late manually. These two don't cancel each other out. Rather ghc will insert both the cost-centres for "-fprof-auto" and for "-fprof-late". This is because there are edge cases where one might want to combine these profiling modes. But perhaps that's not worth the confusion.
Anyway to properly use prof-late for just your project either use:
cabal run --enable-profiling {lens|optics}-profiling --ghc-options='-O2 -fprof-late' --profiling-detail=none -- +RTS -p -RTS
Or for newer versions of cabal
cabal run --enable-profiling {lens|optics}-profiling --ghc-options='-O2' --profiling-detail=late-- +RTS -p -RTS
As others already stated this doesn't apply prof-late to dependencies which is a shortcomming of cabal. To work around this you can use the cabal.project trick mentioned further up. So something like:
package *
ghc-options: -fno-prof-auto -fprof-late
profiling-detail: none
If there is a reproducer showing clearly that -fprof-late actually prevents inlining I would be very interested to take a look at it. But using the flags as I described it seems to work as expected for this reproducer.
from optics.
@AndreasPK I've run this repo with the following cabal.project.local
:
with-compiler: ghc-9.4.5
profiling: True
profiling-detail: late
tests: True
and here's the difference:
lens-profiling:
Sun Jun 18 15:49 2023 Time and Allocation Profiling Report (Final)
lens-profiling +RTS -s -p -RTS
total time = 0.02 secs (19 ticks @ 1000 us, 1 processor)
total alloc = 40,051,984 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
$wloop Main <no location info> 100.0 99.9
individual inherited
COST CENTRE MODULE SRC no. entries %time %alloc %time %alloc
MAIN MAIN <built-in> 162 0 0.0 0.0 100.0 100.0
main Main app/Main-lens.hs:29:1-4 324 1 0.0 0.0 0.0 0.0
main7 Main <no location info> 325 1 0.0 0.0 0.0 0.0
main1 Main <no location info> 326 1 0.0 0.0 0.0 0.0
CAF Main <entire-module> 323 0 0.0 0.0 100.0 99.9
$fShowBox13 Main <no location info> 333 1 0.0 0.0 0.0 0.0
main2 Main <no location info> 327 1 0.0 0.0 0.0 0.0
main3 Main <no location info> 328 1 0.0 0.0 0.0 0.0
$w$cshowsPrec2 Main app/Main-lens.hs:9:17-20 329 1 0.0 0.0 0.0 0.0
main4 Main <no location info> 330 1 0.0 0.0 100.0 99.9
$wloop Main <no location info> 331 1000001 100.0 99.9 100.0 99.9
main5 Main <no location info> 332 1 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal <entire-module> 284 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding <entire-module> 269 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.Iconv <entire-module> 267 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD <entire-module> 259 0 0.0 0.1 0.0 0.1
optics-profiling:
Sun Jun 18 15:49 2023 Time and Allocation Profiling Report (Final)
optics-profiling +RTS -s -p -RTS
total time = 0.02 secs (24 ticks @ 1000 us, 1 processor)
total alloc = 40,051,784 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
increment3 Main <no location info> 58.3 39.9
$wloop Main <no location info> 25.0 59.9
% Optics.Internal.Optic src/Optics/Internal/Optic.hs:(120,1)-(130,54) 16.7 0.0
individual inherited
COST CENTRE MODULE SRC no. entries %time %alloc %time %alloc
MAIN MAIN <built-in> 139 0 0.0 0.0 100.0 100.0
main Main app/Main-optics.hs:36:1-4 278 1 0.0 0.0 0.0 0.0
main7 Main <no location info> 279 1 0.0 0.0 0.0 0.0
main1 Main <no location info> 280 1 0.0 0.0 0.0 0.0
CAF Main <entire-module> 277 0 0.0 0.0 100.0 99.9
$fShowBox13 Main <no location info> 295 1 0.0 0.0 0.0 0.0
increment2 Main <no location info> 286 1 0.0 0.0 0.0 0.0
castOptic Optics.Internal.Optic src/Optics/Internal/Optic.hs:(102,1)-(108,44) 288 1 0.0 0.0 0.0 0.0
% Optics.Internal.Optic src/Optics/Internal/Optic.hs:(120,1)-(130,54) 289 2 0.0 0.0 0.0 0.0
runFunArrow Data.Profunctor.Indexed src/Data/Profunctor/Indexed.hs:78:37-47 287 1 0.0 0.0 0.0 0.0
main2 Main <no location info> 281 1 0.0 0.0 0.0 0.0
main3 Main <no location info> 282 1 0.0 0.0 0.0 0.0
$w$cshowsPrec2 Main app/Main-optics.hs:16:26-29 283 1 0.0 0.0 0.0 0.0
main4 Main <no location info> 284 1 0.0 0.0 100.0 99.9
$wloop Main <no location info> 285 1000001 25.0 59.9 100.0 99.9
increment2 Main <no location info> 290 0 0.0 0.0 75.0 39.9
castOptic Optics.Internal.Optic src/Optics/Internal/Optic.hs:(102,1)-(108,44) 291 0 0.0 0.0 75.0 39.9
% Optics.Internal.Optic src/Optics/Internal/Optic.hs:(120,1)-(130,54) 292 0 16.7 0.0 75.0 39.9
increment3 Main <no location info> 293 1000000 58.3 39.9 58.3 39.9
main5 Main <no location info> 294 1 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal <entire-module> 265 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding <entire-module> 250 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.Iconv <entire-module> 248 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD <entire-module> 240 0 0.0 0.1 0.0 0.1
Only after #491 it becomes:
Sun Jun 18 15:53 2023 Time and Allocation Profiling Report (Final)
optics-profiling +RTS -s -p -RTS
total time = 0.02 secs (19 ticks @ 1000 us, 1 processor)
total alloc = 40,051,768 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
$wloop Main <no location info> 100.0 99.9
individual inherited
COST CENTRE MODULE SRC no. entries %time %alloc %time %alloc
MAIN MAIN <built-in> 127 0 0.0 0.0 100.0 100.0
main Main app/Main-optics.hs:36:1-4 254 1 0.0 0.0 0.0 0.0
main7 Main <no location info> 255 1 0.0 0.0 0.0 0.0
main1 Main <no location info> 256 1 0.0 0.0 0.0 0.0
CAF Main <entire-module> 253 0 0.0 0.0 100.0 99.9
$fShowBox13 Main <no location info> 263 1 0.0 0.0 0.0 0.0
main2 Main <no location info> 257 1 0.0 0.0 0.0 0.0
main3 Main <no location info> 258 1 0.0 0.0 0.0 0.0
$w$cshowsPrec2 Main app/Main-optics.hs:16:26-29 259 1 0.0 0.0 0.0 0.0
main4 Main <no location info> 260 1 0.0 0.0 100.0 99.9
$wloop Main <no location info> 261 1000001 100.0 99.9 100.0 99.9
main5 Main <no location info> 262 1 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal <entire-module> 248 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding <entire-module> 238 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.Iconv <entire-module> 236 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD <entire-module> 228 0 0.0 0.1 0.0 0.1
from optics.
@arybczak the new changes help, thank you! Would you be able to make a release to Hackage that incorporates these changes?
from optics.
I will after #492 is merged.
from optics.
Related Issues (20)
- Build failure in optics-extra-0.4.2 with mtl-2.3 and transformers < 0.6 HOT 8
- Drop ticks in type level constructors HOT 1
- Label optics misbehavior when defined with phantom type HOT 5
- Add a function to traverse the contents of a lens HOT 6
- Clarify makeFieldLabels prefix requirements
- over' and iover' pretend to work with incompatible optics
- Strictify traversals?
- Improve type errors
- Support GHC 9.6 HOT 5
- Tests failing on GHC 9.4 on current Hackage release (0.4.2) HOT 3
- Unwrapping newtype should be easier than `coercedTo`. HOT 5
- Document release process
- https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9411 seems to break us HOT 5
- Performance best practices regarding lens vs lensVL and Optics.TH? HOT 2
- Improve documentation on what names are generated by the various functions
- `makeFieldLabelsNoPrefix` throws error on `Network.Wai.Handler.Warp.Internal.Settings` HOT 6
- template-haskell-optics support for th-abstraction-0.6 HOT 1
- Allow dependencies bundled with GHC-9.8 HOT 3
- Versioning HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from optics.