Comments (13)
That's nice. I suggested a similar version, but without the GADTs tracking balance information, i.e.,
type Node a = Tip a | Node2 (Node a) (Node a) | Node3 (Node a) (Node a) (Node a)
type FingerTree a = ... | Deep ... (FingerTree a) ...
Do you know if there are any differences in performance of these two versions?
BTW, we could also add the size to the Digit
, as you suggested somewhere else.
from containers.
Yes, your implementation inspired mine. I have done no performance testing at all thus far. I think mine (and yours!) probably needs a bit of a representation change to avoid making the trees too big. I would be surprised if your version performed differently from the GADT one, but I've been surprised before.
from containers.
Note that despite the substantial change to the types, most of the horrifying ><
code stayed exactly the same. I'm trying to go slowly so I don't have to change too much of that at once.
from containers.
I think mine (and yours!) probably needs a bit of a representation change to avoid making the trees too big.
I do not understand. The new representation is isomorphic to the original one, is it not? So apart some constant factors (which are I would think better in the new version), the trees are as large as they used to be. Maybe I am not seeing something.
I would be surprised if your version performed differently from the GADT one, but I've been surprised before.
I was not sure how the optimizer works in the presence of GADTs, that's why I was asking. I spent some time looking at the code now and it seems not to be burdened by the GADTs at all.
What I am not sure about is whether we can change the code to use exclusively GADTs. Maybe we could create such a representation which would either use the GADTs or not, with the same code? If we created type synonyms for Digit n
, Tree23 n
and Tree23 (S n)
, which would leave out the n
? That would allow us to get static typing on GHC and not require GADTs for other compilers. But I have not think it through and maybe that is too crazy :-)
BTW, why not using the Node
name as in the original implementation?
from containers.
I tried briefly to do a bit of conditional stuff with the types but had a bit of trouble with the Nat
kind. I do think we can almost certainly do it with a bit of thinking and work, and that could obviously be important for portability. I didn't use the Node
name because it's a fairly different thing than the original Node
type, merging the original Node
with the original Elem
. Rather than representing a single pair or triple, it actually represents a tree of elements. This is not important, of course, and I don't much care, but that's what I did and why. The main reason for sticking with the Elem
name rather than changing to Tip
was just to make it easier to transform the "old" code.
from containers.
Yeah, that sounds reasonable.
It is just that the -23
methods look weird: we have lookupTree
/lookupDigit
/lookup23
. Do you have any idea for something different from Tree23
? I currently do not, I could only though of Pack
, Arbor
, Heap
or Pile
.
from containers.
If we decide to actually use any of this, it will likely end up looking different, and whatever it looks like I will give you carte blanche to name things as you see fit. I think Tree23
is actually quite reasonable because it's actually a 2-3 tree!
from containers.
Oh, I missed part of one of your messages, regarding my concern about tree size. Yes, the new is isomorphic to the old, but in the new representation, each leaf gets a real data constructor (currently called Elem
, maybe later called Tip
). In the old representation, each leaf got a fake Elem
constructor (just a newtype wrapper). I'll have to test to be sure, but this change seems likely to multiply the size of the tree by a substantial constant factor (somewhere around 1.5, probably). The most obvious fix is to store 2-leaves and 3-leaves instead of individual leaves, but I think that will take a good bit more work.
from containers.
It looks like https://github.com/sebastiaanvisser/msc-thesis/blob/master/HigherOrder.lhs is fairly similar to what we've been discussing, in some regards, but it's a little over my head.
from containers.
I did a tiny little experiment, and it's very clear that the GADT representation as implemented here significantly increases allocation and reduces speed. I'm pretty sure we can fix it; what I'm less sure about is whether we can fix it nicely. The simple-minded approach is to stop using singleton leaves, and instead use 2-3 leaves. The trouble is that then we need to add an extra layer right on top that duplicates a lot of the basic finger tree structure but that uses singleton leaves (serving as the 0th level of the finger tree). This actually works out just fine for ><
(essentially no more code), but that's not even close to being the case for other functions, and I don't know how to fix it.
from containers.
You are right. I also did some trivial experimenting with
module Main where
import Control.DeepSeq
import Criterion.Main
import qualified Data.Sequence as S
main = do
let s10 = S.replicate 10 1 :: S.Seq Int
s100 = S.replicate 100 1 :: S.Seq Int
s1000 = S.replicate 1000 1 :: S.Seq Int
s10000 = S.replicate 10000 1 :: S.Seq Int
rnf [s10, s100, s1000, s10000] `seq` return ()
defaultMain
[ bgroup "index"
[ bench "10" $ nf (iterate_index s10) 10
, bench "100" $ nf (iterate_index s100) 100
, bench "1000" $ nf (iterate_index s1000) 1000
, bench "10000" $ nf (iterate_index s10000) 10000
]
, bgroup "|>"
[ bench "10" $ nf iterate_snoc 10
, bench "100" $ nf iterate_snoc 100
, bench "1000" $ nf iterate_snoc 1000
, bench "10000" $ nf iterate_snoc 10000
]
]
iterate_index :: S.Seq Int -> Int -> ()
iterate_index xs n = go 0 n
where go i len | i < len = S.index xs i `seq` go (i+1) len
| otherwise = ()
iterate_snoc :: Int -> S.Seq ()
iterate_snoc n = go 0 n S.empty
where go i len xs | i < len = go (i+1) len (xs S.|> ())
| otherwise = xs
and the GADT representation is slower on both considerably comparing to the current zip
branch (25% on index
and (depending on size) 20%-0% on |>
).
I am not sure that we can really fix that -- the problem is increased number of pattern matches. For example consider lookupDigit
-- in the original representation, the lookupDigit
for Elem
is inlined directly to index
, so the size :: Digit (Elem a)
is inlined and return the size of the Digit
only using the Digit
constructor (One
-> 1, Two
-> 2, ...). In the new representation, size
of Digit has to match the Digit
constructor and then all the Elem
constructor, so it requires 4 additional pattern-matches for the Four
case. Also, it is not only the lowest level -- the new Tree23
(original Node
) has three constructors, which take longer to pattern-match when two (this can be observed by changing the order of Tree23
constructors to Node2
, Node3
, Elem
-- the index/10
becomes slower by 34%, but index/10000
faster by 7%, as there are more Node[23]
then Elem
s for larger tree; and this is only the ordering, the effect of removing one constructor is probably even more dramatic). If we for example add Elem2
and Elem3
, we could avoid some pattern matches, but on the other hand we will have four/five constructors of Tree23
, which would prevent pointer tagging on 32bit architectures (according to GHC docs, 10%-14% loss).
At this point I think my original idea was a bad one. Yes, GHC optimizer works a bit better for the GADTs, but the original representation makes a lot less work on the bottom of the Sequence, which I did not realize.
from containers.
What happens if you turn of the between-run garbage collection? It seems strange to me that the allocation in the nexted index isn't enough to overcome the added pattern-matching in the non-nested one. But I guess that's possible. I didn't know three constructors was a magic maximum. What exactly is done differently with fewer and more?
from containers.
See https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/PointerTagging for description and link to paper. In short, the last 2 bits (on 32-bit machine; 3 bits on 64-bit machine) of a pointer to a constructor contain the id of the constructor if it is evaluated, so pattern matching may happen without entering the constructor. The limit is 3 constructors on 32-bit machines (all 0 means unknown, values 1-3 mean evaluated, constructor number 1-3) and 7 on 64-bit machines.
I am not saying we cannot have more than 3 constructors, but it bears a penalty on 32-bit machines.
from containers.
Related Issues (20)
- Is graphFromEdges too lazy? HOT 4
- Should dfs be lazy? HOT 8
- Make foldMap1 for Data.Tree follow the structure HOT 4
- Add laziness tests for Tree foldMap1 HOT 2
- Check time bound for graphFromEdges HOT 2
- Use a custom array type for SetM? HOT 6
- Optimize scc as hard as possible HOT 3
- Different Links to OverloadedLists Extension HOT 1
- A more efficient Graph representation HOT 4
- Use generic map merging framework HOT 1
- Build fails with "Prelude.chr: bad argument" HOT 11
- Build of container-test fails with cabal HOT 8
- containers-0.6.4.1 and 0.6.3.1 fail to build with GHC-9.6.1 HOT 2
- Set difference and union in one HOT 4
- foldTree does not optimize well HOT 2
- Tree fusion HOT 16
- Fusible Set.fromDistinctAscList definition HOT 10
- Fusible IntSet.fromDistinctAscList definition HOT 3
- NonEmpty for CyclicSCC HOT 11
- better instance Hashable IntSet? HOT 8
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from containers.