Comments (23)
Interesting, thanks! I don't fully understand what you benchmarked, can you show the definition you used for the Alga
(new)?
Could it be that we should specialise ToGraph
code to Graph
or simply copy the code?
from alga.
Sorry if I wasn't clear.
For the FIRST table, I used (copy/pasted from ToGraph)
hasEdge s t g = case foldg e v o c g of (_, _, r) -> r
where
e = (False , False , False )
v x = (x == s , x == t , False )
o (xs, xt, xst) (ys, yt, yst) = (xs || ys, xt || yt, xst || yst)
c (xs, xt, xst) (ys, yt, yst) = (xs || ys, xt || yt, xs && yt || xst || yst)
For the SECOND, I used
hasEdge :: Ord a => a -> a -> Graph a -> Bool
hasEdge s t = (\g -> case foldg e v o c g of (_, _, r) -> r) . induce (`elem` [s, t])
where
e = (False , False , False )
v x = (x == s , x == t , False )
o (xs, xt, xst) (ys, yt, yst) = (xs || ys, xt || yt, xst || yst)
c (xs, xt, xst) (ys, yt, yst) = (xs || ys, xt || yt, xs && yt || xst || yst)
from alga.
Wow, that's pretty surprising to me. So, essentially, on clique
we get a 10x slowdown, and the only difference for clique
is the term xs && yt
.
What about making the third field (or maybe all of them) of the tuple strict? Perhaps, we are accumulating huge unevaluated trees of Boolean expressions in the process?
from alga.
One simple tweak could probably be as follows:
hasEdge s t g = case foldg e v o c g of (_, _, r) -> r
where
e = (False , False , False)
v x = (x == s , x == t , False)
o (xs, xt, xst) (ys, yt, yst) = (xs || ys, xt || yt, xst || yst)
c (_, _, True) (_, _, _) = (True, True, True)
c (_, _, _) (_, _, True) = (True, True, True)
c (xs, xt, xst) (ys, yt, yst) = (xs || ys, xt || yt, xs && yt)
This essentially forces the third field for the connect
case. But we could make it more elegant I'm sure.
from alga.
Sorry for taking so long to answer. Your implementation is faster but not enough:
Clique
1000 | |
---|---|
Alga | 692.3 ms |
AlgaOld | 94.69 ms |
Note that this is certainly the fault of edges
. I have forget it, but I am running benchmarks using edges
, and here not clique
. For example, with the clique
from alga
:
Clique
1000 | |
---|---|
ToGraph | 58.46 μs |
ForcingValues | 61.72 μs |
Actual | 48.52 μs |
Anyway, one advantage of the ToGraph
function is that it require only an Eq instance, and not an Ord one like the actual alga is requiring (because of the use of isSubgraphOf
from alga.
Note that this is certainly the fault of
edges
.
I don't understand what you mean: both algorithms are working on exactly the same input, right? It's not the fault of the input that one algorithm takes 7x longer than another :)
from alga.
Mmh, yes your are right, I wanted to say to we must not forget that we are only using a very big stack of overlayed edges from edges
:)
from alga.
I spent some time trying to improve your implementation, and I think that it is maybe not worth spending time here, the actual implementation is faster everywhere (on real graphs made using edges
and on using alga's clique
).
The only improvement that can be made I think is something like:
hasEdge :: Eq a => a -> a -> Graph a -> Bool
hasEdge u v = hasEdge' . induce (`elem` [u, v])
where
hasEdge' (Overlay x y) = hasEdge' x || hasEdge' y
hasEdge' (Connect x y) = let !isInLeft = hasVertex u x
!isInRight = hasVertex v y
in (isInLeft && isInRight) || (isInLeft && hasEdge' x) || (hasEdge' y && isInRight)
hasEdge' _ = False
Because it drop the Ord
Constraint.
It can also be great to define hasLoop
, which will be faster than hasEdge x x
:
hasLoop x = hasLoop' . induce (==x)
where
hasLoop' (Overlay x y) = hasLoop' x || hasloop' y
hasLoop' Connect{} = True
hasLoop' _ = False
Which I think is working because induce
remove Empty
leaves.
from alga.
@nobrakal I don't think your hasLoop
is correct, it will return True
for Connect Empty Empty
:) Unless, of course, you rely on the feature of (some) implementations of induce
that do constant folding to eliminate Empty
whenever possible.
from alga.
Yep totally, I was thinking with Algebra.Graph.induce
which throw away empty leaves
from alga.
@nobrakal Right, I see. I don't mind adding hasLoop
into the API, so feel free to do that, but consistently across various modules.
So, what's the conclusion regarding hasEdge
? I'm confused by so many different versions we discussed :) What is the most efficient implementation?
from alga.
I don't mind adding hasLoop into the API, so feel free to do that, but consistently across various modules.
You don't mind to do it yourself but will merge if I do it ? I prefer to ask ^^
So, what's the conclusion regarding hasEdge? I'm confused by so many different versions we discussed :)
I was also ^^.
The actual one is the fastest, with graphs constructed with edges
or with clique
.
But a somewhat better version is:
hasEdge :: Eq a => a -> a -> Graph a -> Bool hasEdge u v = hasEdge' . induce (`elem` [u, v])
where
hasEdge' (Overlay x y) = hasEdge' x || hasEdge' y
hasEdge' (Connect x y) =
let !isInLeft = hasVertex u x
!isInRight = hasVertex v y
in (isInLeft && isInRight) || (isInLeft && hasEdge' x) || (hasEdge' y && isInRight)
hasEdge' _ = False
This improve a bit the time of the function (it is about 1.07
times faster), and mostly drop the Ord
instance required on vertices because of isSubgraphOf
in the current implementation. It is now only requiring Eq
.
The "naive" implementation of hasEdge'
works because the induced subgraph is very tiny thanks to the absence of Empty
leaves
from alga.
You don't mind to do it yourself but will merge if I do it ? I prefer to ask ^^
I don't mind either, but I won't find time to do it myself in the next couple of weeks, so feel free to do it :-)
OK, let's switch to your improved implementation -- it's not immediately obvious it is correct but it actually looks quite nice. Please send a PR!
from alga.
I was going to send a PR, and while verifying my results, I benchmarked to hasEdge'
function alone. It is quite better with graphs constructed with edges
, but slower with clique
: Here is the detailed benchmarks:
hasEdge :: Eq a => a -> a -> Graph a -> Bool
hasEdge u v = hasEdge'
where
hasEdge' (Overlay x y) = hasEdge' x || hasEdge' y
hasEdge' (Connect x y) =
let !isInLeft = hasVertex u x
!isInRight = hasVertex v y
in (isInLeft && isInRight) || (isInLeft && hasEdge' x) || (isInRight && hasEdge' y)
hasEdge' _ = False
So without the suffixed induce
. It is really faster hand drop the Ord
instance:
### Clique
#### 1000
##### (0,1)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 62.98 ns | 1.000 |
| AlgaOld || 96.58 ms | 0.966 |
+---------++----------+-------+
##### (292,820)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 9.132 ms | 1.000 |
| AlgaOld || 96.28 ms | 0.984 |
+---------++----------+-------+
##### (997,999)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 20.61 ms | 1.000 |
| AlgaOld || 98.32 ms | 0.980 |
+---------++----------+-------+
##### (0,0)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 19.01 ms | 0.999 |
| AlgaOld || 92.89 ms | 0.982 |
+---------++----------+-------+
##### (1,0)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 26.18 ms | 0.999 |
| AlgaOld || 88.50 ms | 0.975 |
+---------++----------+-------+
##### (1,1)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 18.12 ms | 0.997 |
| AlgaOld || 91.48 ms | 0.983 |
+---------++----------+-------+
The first three tested edges are edges in the graph, the last three are not in the graph.
Basically, in addition to being faster to find edges not in the graph, it is really faster to find edges "in the front" of the graph.
Sadly this is "less" true for graph constructed using clique
, since it is slower for edges in the graph but "at the end", but faster in other cases:
### Clique
#### 1000
##### (0,1)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 62.20 ns | 1.000 |
| AlgaOld || 46.53 μs | 1.000 |
+---------++----------+-------+
##### (292,820)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 52.41 μs | 1.000 |
| Alga || 2.486 ms | 1.000 |
+---------++----------+-------+
##### (997,999)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 52.42 μs | 1.000 |
| Alga || 5.969 ms | 0.999 |
+---------++----------+-------+
##### (0,0)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 12.29 μs | 1.000 |
| AlgaOld || 46.69 μs | 1.000 |
+---------++----------+-------+
##### (1,0)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 12.20 μs | 1.000 |
| AlgaOld || 45.93 μs | 0.999 |
+---------++----------+-------+
##### (1,1)
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 12.34 μs | 0.998 |
| AlgaOld || 47.13 μs | 1.000 |
+---------++----------+-------+
from alga.
Wow, what happens here?
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 52.42 μs | 1.000 |
| Alga || 5.969 ms | 0.999 |
+---------++----------+-------+
It's 100 times slower here! Could you investigate why?
from alga.
The clique is in the form Connect 0 (Connect 1 (Connect 2))
, so because I run hasVertex
for every Connect
node, this take very long time if we search an edge "at then end".
That is why it works great with #80 representation ^^
I suspect that it will be great fo "balanced" graphs, with similar of nodes on each part of a "Connect"...
from alga.
Ah, of course -- the version that calls hasVertex
actually has quadratic complexity. That's why I used triples in my implementation -- to memoize the invocations of hasVertex
.
from alga.
I tried to do some memoization, but nothing great :p
So do you think we can approach the time of the version using hasVertex
but using a better memoization ? Or should we just stick to the previous version (here #84 (comment) ).
Or even propose the two versions ? The hasVertex
solution is still faster with graph constructed with edges
and by far
from alga.
So do you think we can approach the time of the version using hasVertex but using a better memoization ?
Well, you are comparing an O(n) implementation against an O(n^2) one. Of course the latter will always lose on large benchmarks.
Or even propose the two versions ?
I don't think there is any point in including a quadratic version into the library when there is a linear implementation. Think why the quadratic version manages to be faster on some examples -- perhaps that will help you figure out how to improve the linear version too!
from alga.
Think why the quadratic version manages to be faster on some examples
It runs hasVertex
only whenConnect
is encountered, and with a graph constructed with edges
, Connect
is always like Connect (Vertex a) (Vertex b)
. So hasVertex
is fast :)
I spent again some time trying to find a better version, but didn't find anything, so maybe just dropping the Ord
instance can be great for now (as discussed in #84 (comment) )
from alga.
@nobrakal Sorry, I'm not sure which implementation you refer to. Perhaps, just create a PR and we'll discuss it there :)
from alga.
I think this is closed with your super version of hasEdge
?
from alga.
@nobrakal Yes, thank you. I think we can close this.
from alga.
Related Issues (20)
- How to handle goodness inherited from other libraries HOT 2
- Bump upper bound for base-compat for version 0.4 on Hackage HOT 4
- Unable to use ghcid HOT 6
- Algorithms should be in terms of typeclasses not data structures HOT 15
- Acyclic graphs with Int labels HOT 5
- Tests failed to build with QuickCheck 2.14.2
- Multitree HOT 3
- Monoid instance HOT 6
- support for GraphViz "HTML strings" for node or edge labels HOT 1
- Add benchmarks to CI HOT 2
- Supporting different node and edge types HOT 7
- v0.6 fails to build with GHC 8.0 and 8.2 HOT 3
- Doubt about decorating a Graph HOT 4
- Compatibility with mtl-2.3 HOT 2
- Smart constructor for Symbol rather than requiring -XOverloadedLists HOT 1
- Proposal : foldgM HOT 2
- Algebra.Graph.Labelled.AdjacencyMap.edges appears to be broken HOT 2
- Semigroup version of Algebra.Graph.Labelled.AdjacencyMap HOT 7
- Bump "deepseq" dependency bounds to 1.5.0.0 HOT 2
- Warnings when building with ghc-9.10
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from alga.