Giter Site home page Giter Site logo

Edge labels about alga HOT 54 CLOSED

snowleopard avatar snowleopard commented on September 24, 2024
Edge labels

from alga.

Comments (54)

jaesharp avatar jaesharp commented on September 24, 2024 1

@snowleopard Excellent point -- now that you say it, I can't yet see that either. I suppose I've been coming at it from more of a data structure perspective and looking at how to cram the construction into the haskell typechecker while still maintaining invariants. Thanks heaps for taking a look and taking the time to provide some excellent feedback. Cheers!

from alga.

boggle avatar boggle commented on September 24, 2024 1

I meant the the normal "Set a" monoid. Except the monoid constraint isn't needed here. Actually the kind of graphs I'm interested in right now would only need "NonEmptySet a" which of course is just a semiring. Layering is possible, too, if the type of edge labels is a monoid, then "connect" can be used without specifying a label.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024 1

@alpmestan I used to think we need semirings, but I'm now leaning towards something lattice-like.

Let's try to list all requirements. Below I use x -{a}-> y to denote LabelledConnect.

  • We need 0 and 1 for defining overlay and connect, as above.
  • We'd like to compose edge labels as overlay (x -{a}-> y) (x -{b}-> y) = x -{a \/ b}-> y. This leads to 0 \/ x = x, x \/ x = x and 1 \/ x = 1, as otherwise overlay would lose its usual laws.

This looks like a join-semilattice (https://ncatlab.org/nlab/show/semilattice) with two bounds (0 and 1).

I guess we don't need the meet operator /\ for edge-labelled graphs, which is a bit strange. If I remember correctly, /\ is required for tagged graphs discussed here.

from alga.

alpmestan avatar alpmestan commented on September 24, 2024 1

My semiring obsession is almost cured, don't worry. So with your formulation, 0 is _|_, OK, but 1 is T (top) ? In which case we can't "go past 1"? I might be misunderstanding what you said though.

from alga.

boggle avatar boggle commented on September 24, 2024 1

This is interesting! On the one hand it is really elegant to have neutral elements for both overlay and connect, on the other hand, I don't quite see the need to have a full lattice with two bounds and both join and meet.

My use case would still be representing multigraphs by labeling edges with sets of tags. The empty set/lower bound would serve to represent overlay; any other set X would represent labeled connect but of course would label edges in the new graph with the union of X with any pre-existing edge label.

Using a full lattice certainly works (and I think the commutativity of join is a good requirement for edge labeling). For representing multigraphs using tag sets as labels this would require to artificially introduce an upper bound 1 (Perhaps by switching to Maybe (Set a)). I'm worried about the interpretation of that though: Such a bound would stand for the universe of all possible tags and thus imply that there are potentially infinitely many edges between a pair of nodes connected with 1 in the represented multigraph.

Regular connect could just be recovered by having a default label. I don't see the need for this to be the top/1. @snowleopard could you please explain why that's needed?

from alga.

snowleopard avatar snowleopard commented on September 24, 2024 1

So with your formulation, 0 is _|_, OK, but 1 is T (top) ? In which case we can't "go past 1"?

@alpmestan Yes, that's right. The idea is that x * y corresponds to maximum possible connection between x and y, whereas x + y corresponds to minimum possible connection (i.e. no connection).

This duality between overlay and connect exists in the unlabelled version of the algebra, and I think it may be useful to preserve it, but I admit that the justification is not particularly strong.

On the one hand it is really elegant to have neutral elements for both overlay and connect, on the other hand, I don't quite see the need to have a full lattice with two bounds and both join and meet.

@boggle Yes, I agree. Ideally we'd like a better justification (than just elegance) for having a full lattice. Here are just a couple of random thoughts on this:

  • As I mentioned above, I'd like to keep (and perhaps even reinforce) the duality between + and *.
  • It may be useful to have forall k. x + y <= labelledConnect k x y <= x * y.

Alas, not convincing enough.

Such a bound would stand for the universe of all possible tags and thus imply that there are potentially infinitely many edges between a pair of nodes connected with 1 in the represented multigraph.

This was my intention, but now that you mentioned Set a I realise that we probably don't want to deal with infinite sets. Going to Maybe (Set a) doesn't look attractive.

Regular connect could just be recovered by having a default label.

You may be right. I'm a bit uncomfortable about picking a default label, which feels rather arbitrary, but I guess one can argue that all possible choices lead to isomorphic results... which seems to bring us to initial and final objects: for sets, the empty set is the initial object (and we'll use it for +) and singleton sets are (isomorphic) final objects. Any singleton set appears to be a good candidate for *. So, we've started with one sort of duality (bottom-top), and came to another one (initial-final). Interesting indeed :)

For Hask category, we would have chosen Void and () labels for + and *. Is that useful in any sense? Probably not, because the composition (.) doesn't seem to interact with + in the right way.
Would the composition of labels a and b in this case be Either a b? Doesn't seem to be idempotent.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024 1

@boggle We can avoid making the choice and simply require the user to provide a semi-lattice, plus a default label that will be used for connect. For example:

-- We require additional laws for the underlying monoid: <> must be commutative and idempotent
class Monoid a => Label a where
    defaultLabel :: a

overlay :: Label b => LabelledGraph b a -> LabelledGraph b a -> LabelledGraph b a
overlay = labelledConnect mempty

connect :: Label b => LabelledGraph b a -> LabelledGraph b a -> LabelledGraph b a
connect = labelledConnect defaultLabel

This may be a temporary solution while we are still looking for the right existing abstraction. I believe this permits both interpretations (full lattice and initial-final).

The accumulation vs consolidation point of view is indeed useful for many applications, but I think the basic abstraction should be as general as possible. One can of course build the accumulation vs consolidation machinery on top of the basic one (another graph-like typeclass with an additional connect-like method?).

from alga.

boggle avatar boggle commented on September 24, 2024 1

Ah! That is of course possible and indeed allows exploring this space more gradually.

Might also be insightful to try to define how to encode an edge labeled graph into a node labeled graph which should be possible with a newtype, using either to distinguish between nodes used for nodes in the labeled graph, and nodes used for edges in the labeled graph, and defining connect accordingly.

from alga.

anfelor avatar anfelor commented on September 24, 2024 1

@boggle Furthermore, in a semiring 0 doesn't need to be 0 and 1 doesn't need to be 1 (of the underlying structure). Take for example the longest distance semiring in the excellent "Fun with semirings" paper:

data LongestDistance = LDistance Int | LUnreachable | LInfinite
instance Semiring LongestDistance where
  zero = Unreachable
  one = LDistance 0
  x @+ LUnreachable = x
  LUnreachable @+ x = x
  LDistance x @+ LDistance y = LDistance (max x y)
  ...

We would then define overlay (connect l a b) (connect k a b) = connect (l + k) a b. Furthermore overlay would connect nodes with the LUnreachable tag, which is sematically equivalent to no connection at all. I think this should be a strict superset of the (Nat, 0, max) monoid, while extending it to include a sensible operation for matrix multiplication. Would that work for you?

from alga.

anfelor avatar anfelor commented on September 24, 2024 1

@boggle The set of laws @snowleopard and I agreed on so far (I believe) are as follows:
A Graph is a quadruple (G, S, -{}->, e) with S an idempotent semiring and the following laws:

  1. forall. a,b,c in G and r,s in S:
  2. a -{r}-> e = e -{r}-> a = a
  3. a -{0}-> b = b -{0}-> a
  4. (a -{r}-> b) -{s}-> c = (a -{r}-> b) -{0}-> (a -{s}-> c) -{0}-> (b -{s}-> c)
  5. a -{r}-> (b -{s}-> c) = (a -{r}-> b) -{0}-> (a -{r}-> c) -{0}-> (b -{s}-> c)
  6. (a -{r}-> b) -{0}-> (a -{s}-> b) = a -{r+s}-> b

These imply:

  1. (a -{r}-> b) -{r}-> c = (a -{r}-> b) -{0}-> (a -{r}-> c) -{0}-> (b -{r}-> c) = a -{r}-> (b -{r}-> c) (Associativity of -{}->)
  2. a -{r}-> b = (a -{r}-> b) -{s}-> e = (a -{r}-> b) -{0}-> (a -{s}-> e) -{0}-> (b -{s}-> e) = (a -{r}-> b) -{0}-> a -{0}-> b (Absorption)
  3. (a -{0}-> b) -{r}-> c = (a -{0}-> b) -{0}-> (a -{r}-> c) -{0}-> (b -{r}-> c) = (a -{r}-> c) -{0}-> a -{0}-> c -{0}-> (b -{r}-> c) -{0}-> b -{0}-> c = (a -{r}-> c) -{0}-> (b -{r}-> c) (-{r}-> distributes over -{0}->)
  4. a = (a -{r}-> e) -{s}-> e = (a -{r}-> e) -{0}-> (a -{s}-> e) -{0}-> (e -{s}-> e) = a -{0}-> a (Idempotence)

That means that by using the bool semiring (0=False, 1=True, +=or, *=and) we can always recover the original algebra of graphs.

We can extend the above set of laws by demanding that every element in our semiring except 0 has a multiplicative inverse (^-1) and extend the 2. law to:

  1. a -{r}-> b = b -{r^-1}-> a with 0^-1 := 0

This fits well with some semirings like the above mentioned LongestDistance semiring:

instance Semiring LongestDistance where
  zero = LUnreachable
  one = LDistance 0
  x @+ LUnreachable = x
  LUnreachable @+ x = x
  LDistance x @+ LDistance y = LDistance (max x y) -- or alternatively 
    | (abs x) < (abs y) = LDistance y
    | otherwise = LDistance x

  ...

  x @. LUnreachable = LUnreachable
  LUnreachable @. x = LUnreachable
  LDistance x @. LDistance y = LDistance (x + y)

  inv LUnreachable = LUnreachable -- we don't want partial functions here => 0 has an inverse
  inv (LDistance x) = LDistance (-x)

Do you think that could be a useful addition? I think it could possibly help with augmented graphs; I will comment here when I have something presentable. I wish we could extend that to a ring, but unfortunately there are no idempotent rings so that is impossible without changing the algebra of graphs.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024 1

@anfelor Yes, this sounds good. Have you formally proved the theorems? I haven't yet, but plan to do this in the Agda repository: https://github.com/algebraic-graphs/agda.

By the way, I haven't thought about multiplicative inverses, that's pretty cool!

To have a fixed common vocabulary, I pushed a branch with a draft implementation of edge-labelled graphs: https://github.com/snowleopard/alga/blob/edge-labels/src/Algebra/Graph/Labelled.hs.

Note that I chose the class name Dioid as it is often used to refer to idempotent semirings, not only in the context of path-finding: https://ncatlab.org/nlab/show/idempotent+semiring. It is also much shorter to type compared to, say, IdempotentSemiring or TropicalSemiring.

from alga.

anfelor avatar anfelor commented on September 24, 2024 1

feel free to send a PR to the Agda repository

I will do that tomorrow or friday.

In Latex, it is x\overset{s}{\longrightarrow}u. Is this OK?

I prefer the shorter a\xrightarrow{s}b, but to each his own :)

from alga.

anfelor avatar anfelor commented on September 24, 2024 1

Sorry, messed up the link at first; I was referring to http://stedolan.net/research/semirings.pdf

from alga.

jaesharp avatar jaesharp commented on September 24, 2024

Perhaps an implementation based on line graphs (where the nodes are representative of edges and cliques uniquely represent nodes in the underlying edge-to-vertex dual graph). You might use multiparameter typeclass instances of the nodes to represent valid connection types overloading the type hierarchy for validity checking and label specification -- out of band, so speak. You could then treat the concrete linegraph as an edge 'schema' connected with a surjective relation of cliques to concrete edge-to-vertex dual graph nodes. Still, I'm just brainstorming.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

Thanks @justinlynn!

Line graphs are cool, but so far I couldn't come up with a good way to represent them algebraically (without allowing some of their forbidden subgraphs). I also couldn't figure out how to efficiently (and algebraically) obtain the line graph of a given graph. So, I am a bit stuck in this direction, but I agree, this is a potential option to consider when dealing with edge labels.

from alga.

alpmestan avatar alpmestan commented on September 24, 2024

Hello. I'm loving alga so far and am looking into ways to make this labelled edge story nice. Here are a few thoughts:

  • Maybe we could have the edge labelling part be an extension to the core graph algebra, so:

    -- mirrors your general `Label` primitive, but we could
    -- look into just a function that takes two `Vertex g`s and
    -- basically labels `Connect v1 v2`. We could have both with
    -- default definitions in at least one direction, maybe?
    class Graph g => LabelledGraph g where
        label :: b -> g -> g

    This way, we keep the core algebra intact, and each interpretation is free to implement (or not)
    the label combinator. This however does not really help in finding one or more implementations
    that just "fall out" nicely, especially not ones that e.g allow us to just label edges, not arbitrary
    subgraphs.

  • I have been looking into providing an adjacency matrix style interpretation with labelled edges. The point would be to have the ability to reuse Russell O'Connor's approach to solving a whole bunch of problems using just the semiring hammer. It however seems to be like a particularly unfriendly representation for alga-style graphs, as you possibly just keep discovering new vertices as you interpret a graph expression, which possibly requires that you resize the matrix unless using a sparse representation, etc, in addition to requiring that your vertex type implements Ix and Bounded, which is a bit annoying when all you have is "a", "b", "c" and you'd just like to have some graph involving those 3 vertices. If we do find a nice way to make this work, then going from labelled to unlabelled edges is just about using the standard Bool semiring instead of some other lbl semiring.

  • In the paper, you say:

    The above looks remarkably close to a semiring, with the only oddity being the shared identity of the two operations. The lack of the annihilating zero element (i.e. x → 0 = 0) and the following decomposition law is what makes the algebra of graphs different

    But the post I linked to above (and several papers/books) do admit at least one encoding of (labelled or unlabelled) graphs that is a semiring. I saw in some comments on your blog that other people asked about that kind of thing, but have you considered augmenting the algebra to make it basically one with two distinct identities (so that 0 absorbs →) ? So a graph would be semiring + decomposition law. What would change? Have you investigated this?

More thoughts might come later.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

@alpmestan Hello, and thank you for sharing your thoughts! Let me answer point-by-point.

  • I like your suggestion about adding a type class with an additional graph labelling method. I'd probably prefer not to use the term 'label' here to avoid confusion with edge labels. Perhaps, we can call such graphs tagged graphs? We'll also need to add an associated type:

    class Graph g => TaggedGraph g where
        type Tag g
        tag :: Tag g -> g -> g

    Tagged graphs have been discussed here: #19 (comment).

  • I agree, matrices seem to be quite unfriendly for Alga graphs. One possible workaround is to keep graph expressions in the tree form (i.e. using the deep embedding from Algebra.Graph) until they actually need to be consumed as a matrix. It seems to me that there is no need to mix construction of graphs (using the graph algebra) and their analysis (using matrices, semirings etc.). Then there will be a function like Graph a -> Matrix a that takes the whole expression, finds out all the vertices in it and builds the corresponding matrix, avoiding intermediate ones. Do you think this will work for your use case?

  • Yes, one can add annihilating zero into the algebra of graphs, but this zero is very unusual. Just as unusual as the identity (empty graph). Zero is just the fully connected graph. Indeed we have 0 + x = 0 and 0 * x = 0, i.e. the fully connected graph annihilates both overlay and connect! But this is still not a semiring, it's a different algebraic structure. It's probably not a satisfactory answer for you, but I don't know how to marry the algebra of graphs with traditional semiring-based graph analysis. Perhaps, they are just different tools useful in different circumstances.

from alga.

alpmestan avatar alpmestan commented on September 24, 2024

Tagged graphs as you introduce them in your last comment could work well indeed. I can certainly imagine this construct being useful. I however do think I need another solution for graphs with labelled edges, as I basically want a representation that'll only allow me to "tag" Connect (Vertex _) (Vertex _)-shaped graphs.

Regarding your suggestion of constructing the matrix "at the last minute", once we have described the entire graph and right before we run some algorithm on it, this would definitely work, I think. That's one possible approach. I have been thinking, since my last comment, about another route that doesn't go through the deep embedding. It is probably less efficient, but it is quite cute IMO.

So, having an adjacency matrix and growing it all the time is a no-go. And I want something that doesn't restrict me to Ix+Bounded-compliant vertex types (I want to be able to have String vertices for example). But we know about other representations of linear maps! Conal (here, here) and Ed (here) have talked about this. By putting this together with Russell's approach, it occured to me that I might be able to model a graph as a linear map from a free module generated by our set of vertices to itself, where the free modules are over an arbitrary semiring (Bool for unlabelled edges, something fancier otherwise). I believe that the set can in theory be infinite (but countable?), as long as any vector we ever consider is only a finite linear combination of elements from our set. The "adjacency linear map" representation of a graph could in addition to storing the linear map, store the set of vertices that we're actually dealing with (or that we've seen so far, when building the graph), so that we would only ever call (and memoize) the restriction of the linear map to just those vertices when using that representation. I have written a bit of code down but it's too early for me to know whether this will work in the end, and to know whether I can provide sane asymptotics for the construction and use of that representation.

Most likely, this is all fun but terribly inefficient and we'll just go with your suggestion of going through Graph and converting to an actual Matrix at the end. :)

Re: the algebra, it's interesting that this 0 would be the fully connected graph. That's a really weird thing to work with. It does feel like their ought to be a way to reconcile the two, but I don't see any.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

I basically want a representation that'll only allow me to "tag" Connect (Vertex _) (Vertex _)-shaped graphs.

@alpmestan Yes, I see. Thinking about this led me to the following type class:

class (Graph g, Monoid (Label g)) => LabelledGraph g where
    type Label g 
    labelledEdge :: Label g -> Vertex g -> Vertex g -> g

With the following two laws:

  • connect (vertex x) (vertex y) == labelledEdge mempty x y
  • overlay (labelledEdge a x y) (labelledEdge b x y) == labelledEdge (a <> b) x y

It doesn't look very nice though.

But we know about other representations of linear maps! [..]

That's very cool, I like this idea! I'm very curious to see what you come up with -- please share the code when/if you manage to make this approach work :)

It does feel like their ought to be a way to reconcile the two, but I don't see any.

Same here!

from alga.

boggle avatar boggle commented on September 24, 2024

Late to the thread... putting labels on connect is what I was thinking about, too. Does it have to be a monoid though? Shouldn't a semiring be enough? Or do you think unlabeled edges are important? With "SET" as the semiring, this directly gives labeled multigraphs which is what I would want most of the time.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

@boggle You're not late, I think we're still looking for the right approach.

Does it have to be a monoid though? Shouldn't a semiring be enough?

Good point, a semiring should be sufficient for combining edges. My previous attempts to extend the theory with labelled edges used monoid labels, but it's interesting to see what happens with a semiring.

With "SET" as the semiring, this directly gives labeled multigraphs which is what I would want most of the time.

Could you clarify what you mean by "SET" here? Plain sets do have the identity element (the empty set), so I guess you mean something else here.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

@boggle I see, thanks for clarifying!

Just to make sure I fully understand your use-case: you are looking for a graph type G a b where a is the type of vertices, and b is the type of labels that can be placed on edges, where multiple edges are allowed between a pair of vertices. Alternatively, we can think of this as having at most one edge between a pair of vertices, which is labelled by NonEmptySet b.

Interesting!

from alga.

boggle avatar boggle commented on September 24, 2024

Yes, that's what I had in mind. And by the way you've put it, I'm realizing if one actually works with a monoid bound, one could get a function "connected a b g" this gives mempty if a b are not connected in a b, the label otherwise. That might turn out to feel really natural and wouldn't require moving to NonEmptySet. It raisese the question though what "connect a b mempty" would mean - perhaps this is just a no-op.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

It raisese the question though what "connect a b mempty" would mean - perhaps this is just a no-op.

I think connect a b mempty should be the same as overlay a b in this case.

from alga.

boggle avatar boggle commented on September 24, 2024

Yes, I agree.

This made me now wonder for a moment if that would allow to not even have overlay, i.e. overlay becomes just a shorthand for connect a b mempty thus reducing the number of operators in the algebra.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

would allow to not even have overlay, i.e. overlay becomes just a shorthand for connect a b mempty

@boggle That's true! We've been experimenting with so-called 'conditional connect' which allows us to have the following encoding of overlay and connect for edge-unlabelled graphs:

data Graph a = Empty | Vertex a | ConditionalConnect Bool (Graph a) (Graph a)

overlay :: Graph a -> Graph a -> Graph a
overlay = ConditionalConnect False

connect :: Graph a -> Graph a -> Graph a
connect = ConditionalConnect True

This case is not particularly interesting, since it's isomorphic to the datatype with two separate constructors Overlay and Connect, but when we take this idea to edge-labelled graphs this seems to become more interesting, as you point out:

data Graph b a = Empty | Vertex a | LabelledConnect b (Graph a) (Graph a)

overlay :: Monoid b => Graph b a -> Graph b a -> Graph b a
overlay = LabelledConnect mempty

But there is more! If b is a lattice, then we can have the following symmetric encoding:

data Graph b a = Empty | Vertex a | LabelledConnect b (Graph a) (Graph a)

overlay :: BoundedLattice b => Graph b a -> Graph b a -> Graph b a
overlay = LabelledConnect bottom

connect :: BoundedLattice b => Graph b a -> Graph b a -> Graph b a
connect = LabelledConnect top

This brings us back to the 'conditional connect' since Boolean algebra is a bounded lattice with top = True and bottom = False.

Could this be the encoding we've been looking for?

from alga.

alpmestan avatar alpmestan commented on September 24, 2024

You could probably play the same trick with Semiring b, just to tie it all up to what I mentionned earlier in this thread. Possibly other structures? But yes, promising encoding, definitely.

from alga.

boggle avatar boggle commented on September 24, 2024

I've been thinking a bit more about the full lattice encoding for the Set case. What I like about this is that it leads to a information accumulation vs consolidation point of view:

  • overlay is essentially join in the lattice between the labels provided by both arguments ("union edge labels", "collect all available information"). This would treat disconnected edges missing in one argument graph/or to be created as being labeled with BOT.
  • one could define connect to be the inverse operation that is the meet between the labels provided by both arguments ("intersect edge labels", "consolidate available information"). This would treat disconnected edges missing in one argument graph/or to be created as being labeled with TOP.
  • this could perhaps be extended to vertex labels ("keep all nodes" vs "only keep shared nodes")

This seems a very useful picture and it reveals overlay to just be union/join and connect to just be intersect/meet.
Then again one could try to define these operations just atop another encoding (like the initial/final one).
It seems we're at a cross-roads with two possible encodings with different trade-offs.
What is the right way to proceed?

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

Might also be insightful to try to define how to encode an edge labeled graph into a node labeled graph

@boggle Indeed. We can do Graph (Either b a) with vertices labelled by a and edges labelled by b. The main difficulty here is that we'll need to somehow constrain connect to disallow direct vertex-vertex and edge-edge connections, i.e. the resulting graph must be bipartite -- I've just opened #38 for this.

from alga.

anfelor avatar anfelor commented on September 24, 2024

Hey all,

I am currently considering whether to participate in the next GSoC and so I am trying to familiarize myself with this project. There are a couple of things in this thread I didn't understand, maybe you could help my out?

  1. If Labels are a BoundedLattice, a * b would be a-{1}->b, right? But then x \/ 1 = 1 would mean that a-{x}->b + a * b = a * b , e. g. labels get deleted by unlabeled connections. This doesn't seem right to me.

  2. To have edge labels and node labels in the same graph, I am not sure it suffices to make the graph bipartite. If we define 1-{a}->2 = 1 * a + a * 2 and (1 + 2 + 3)-{a}->{2 + 3 + 4) = 1*a + 2*a + 3*a + a*2 + a*3 + a*4 , what would be the meaning of a*2, e.g. an edge from nowhere to 2? Or star a [1..10]?

  3. Using a monoid seems to be the most elegant solution, but it makes me wonder how to define something like the bipartite newtype wrapper over overlay and connect. Isn't it impossible to pattern match on mempty? So can one actually distinguish between connect and overlay in such a labeled graph?

  4. I haven't seen much from graph theory , but wouldn't labels be mostly interesting as weights for flow problems? Because then <> from Label being idempotent would mean that a-{2}->b + a-{3}->b = a-{5}->b but a-{3}->b + a-{3}->b = a-{3}->b. Maybe one would need a different type class for weights?

Thank you for your time!

from alga.

anfelor avatar anfelor commented on September 24, 2024
  1. Or maybe this isn't a problem because one could use something like this?
newtype Distinct a = Distinct a

instance Eq Distinct where _ == _ = False
instance Monoid a => Monoid (Distinct a) where
  ...

Then <> @(Distinct a) would be idempotent even if <> @a isn't?

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

@anfelor Hey, thanks for joining the conversation!

There are a couple of things in this thread I didn't understand

This thread has become quite confusing, because so many approaches have been intertwined in discussions. My current understanding is as follows (I am planning to push some code with this soon):

We label edges with elements from a semiring (0, 1, |+|, |*|) and define the following labelled graph datatype LG e a (I'm shortening everything here for brevity):

data LG e a = E | V a | C e (LG e a) (LG e a)

overlay :: Semiring e => LG e a -> LG e a -> LG e a
overlay x y = C 0 x y

connect :: Semiring e => LG e a -> LG e a -> LG e a
connect x y = C 1 x y

Then my understanding is that for LG e a to actually satisfy graph laws, we need some additional constraint on the semiring e. My guess is something like Tropical e should work well:

instance Tropical e => Graph (LG e a) where
    Vertex (LG e a) = a
    ...

Here is a cool paper about semirings and tropical semirings and labelled graphs:

http://stedolan.net/research/semirings.pdf

@anfelor Now, that we've fixed a concrete vocabulary, do you still have the same questions? Or are some of them resolved?

For example, I think your point (1) is no longer a concern, since 1 will actually mean we've connected two vertices with a path of length zero -- of course, adding any additional parallel edges doesn't matter in this case, since the shortest length is still zero.

from alga.

anfelor avatar anfelor commented on September 24, 2024

Wow, thank you for your explanation and the paper, it all makes a lot more sense to me now! A few open questions remain though:

  1. Do you define LabelledGraph as the type class from "Nov 4" above or as the data type in your last comment? Would an AdjacencyMap be an instance of LabelledGraph?

  2. Only if LabelledGraph is a type class: I think you can't pattern match on 0 and 1 from an arbitrary semiring? So when a newtype wrapper needs to distinguish between connect and overlay wouldn't that require an additional Eq constraint on the semiring or a function classify :: Semiring a => a -> "Is0 | IsLabel | Is1"?

  3. How would you define Tropical as a type class? In the paper and hackage it seems to be a data type?

from alga.

anfelor avatar anfelor commented on September 24, 2024

I spend yesterday afternoon thinking about this a little more and I think the answers on my questions are 1) Both and yes 2) yes and 3) idempotent semiring, correct?

I also tried to set some laws for labelled graphs, but unfortunately with little success. Especially

a-[k]>b-[m]>c = a-[k]>b -[0]> a-[?]>c -[0]> b-[m]>c

was a problem as the natural solution ?=k*m leads to the contradiction

(a-[k]>b-[m]>c)-[n]>d /= a-[k]>(b-[m]>c-[n]>d)

and ?=k+m makes me wonder why a semiring is needed at all. At least I am now all the more interested in what you came up with.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

@anfelor Apologies for the delay, I'm currently travelling.

Do you define LabelledGraph as the type class from "Nov 4" above or as the data type in your last comment? Would an AdjacencyMap be an instance of LabelledGraph?

I would prefer to start just with a data type. The type class is pretty straightforward, once we are sure the data type works as expected and satisfies the laws we want.

I don't think AdjacencyMap can be an instance of LabelledGraph since it has no labels. So, I guess you meant some labelled modification of it. If yes, what exactly did you have in mind?

So when a newtype wrapper needs to distinguish between connect and overlay wouldn't that require

Why would it need to distinguish between them? Could you give an example of the code you'd like to write that would use classify?

How would you define Tropical as a type class?

It could be as simple as class Semiring a => Tropical a where with no methods, but with the required laws, specified in a comment. All instances would have to satisfy these laws.

I also tried to set some laws for labelled graphs, but unfortunately with little success.

Unfortunately, I think the labelled connect operator will be neither associative, nor will it satisfy the decomposition axiom in general. There will only be some special cases, where the label is the same, e.g.:

  • (a -{x}-> b) -{x}-> c = a -{x}-> (b -{x}-> c)
  • a -{x}-> b -{x}-> c = a -{x}-> b + a -{x}-> c + b -{x}-> c

I don't think there is a way to have both labels and these laws.

from alga.

anfelor avatar anfelor commented on September 24, 2024

@snowleopard Apologies for the delay on my part as well.

I don't think AdjacencyMap can be an instance of LabelledGraph since it has no labels. So, I guess you meant some labelled modification of it. If yes, what exactly did you have in mind?

Yes, I thought about Tropical l => Map a (Map a l), but I am not sure about the efficiency of it.

with the required laws [of a tropical]

I already assumed you meant something like that, but what are the laws of tropical? The same as a semiring idempotent over +?

Unfortunately, I think the labelled connect operator will be neither associative, nor will it satisfy the decomposition axiom in general.

Yes, I agree, my initial confusion came from the fact that I assumed -> had something to do with matrix multiplication.

Why would it need to distinguish between them?

  1. Optimization: In a AdjacencyMap implemented as Tropical l => Map a (Map a l) a-{0}->b would be fromList [(a, fromList [(b, 0)]), (b, Map.empty)] instead of the less space intense fromList [(a, Map.empty), (b, Map.empty)]. I feel like not distinguishing between 0 and other labels could turn any graph into a dense graph with many zero labels and huge space usage.
  2. Laws: a + b = b + a but a -> b /= b -> a. I am not sure, where this would be needed given that simplify doesn't use this law.
  3. You are right about the code I linked to earlier, one could implement it without distinguishing between overlay and connect. But an underlying implementation would need to traverse all the newly created edges only to discover that they are all labelled with zero in the case of overlay. Maybe that isn't an issue, I haven't done any benchmarks.

Being a complete newbie to the GSoC, I realized only yesterday I needed a written proposal. I have written close to two pages on Google docs if you are interested. I can still change it over tomorrow, just leave some comments there, if you have suggestions (or write me an email).

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

I already assumed you meant something like that, but what are the laws of tropical? The same as a semiring idempotent over +?

@anfelor Indeed, it looks like it's just a semiring with an idempotent +. I actually expected there were more additional laws :)

I feel like not distinguishing between 0 and other labels could turn any graph into a dense graph with many zero labels and huge space usage.

Yes, if we do not treat 0 specially we essentially end up with a dense matrix representation. I'm curious whether algebraic graphs (the Graph data type) suffer from this too -- maybe not, since they are very efficient in representing cliques. In any case, it looks like we should treat 0 edges with care.

Laws: a + b = b + a but a -> b /= b -> a. I am not sure, where this would be needed given that simplify doesn't use this law.

I'm not sure what you mean: for directed graphs you surely have a -> b /= b -> a, but a -> b == b -> a holds for undirected graphs, and may potentially be used in simplification of undirected graph expressions.

Thanks for submitting a proposal to GSoC! I haven't yet found time to look at all proposals, but will do that soon.

from alga.

anfelor avatar anfelor commented on September 24, 2024

I'm not sure what you mean: for directed graphs you surely have a -> b /= b -> a, but a -> b == b -> a holds for undirected graphs, and may potentially be used in simplification of undirected graph expressions.

I meant it as forall. Graph g => a, b in g => a -{0}->b == b-{0}->a, but forall. Graph g => a, b in g => x /= 0 => a-{x}->b /= b-{x}->a in the general case. So if we don't distinguish between zero and other elements we might end up not being able to use this law in code.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

I see! Yes, without distinguishing 0 we seem to lose the operator + (cannot distinguish it from ->) as well as all its laws.

from alga.

boggle avatar boggle commented on September 24, 2024

I keep coming back to think about graphs with edge weights in terms of information loss/gain (where disconnected means "no information") and what that interpretation would require from connect/overlay.

This leads me to believe that its perhaps not good to try to unify two pieces of information here:

  • wether two nodes a, b are connected
  • with which weight/label a, b are connected

As an experiment, let's assume the algebra tracks both overlay and connect but uses only a commutative monoid for weights like (Nat, 0, max).

data LG e a = E | V a | O (LG e a) (LG e a) | C e (LG e a) (LG e a)

overlay :: Monoid e => LG e a -> LG e a -> LG e a
overlay x y = O x y 

connect :: Monoid e => LG e a -> LG e a -> LG e a
connect x y = C mempty x y

This leads to a definition of weight where overlay and connect both combine any weights found but connect will ensure nodes are at least "weakly connected" (with a small weight) while overlay will not.

weight :: (Eq a, Monoid e) => LG e a -> a -> a -> Maybe e
weight (O g h) a b = (weight g a b) <> (weight h a b)
weight (C w g h) a b = (weight g a b) <> (weight h a b) <> simple
  where
    simple = if (contains g a) && (contains h b) then (Just w) else None
    contains :: Eq a => LG e a -> a -> Bool
    contains E _ = False
    contains (V a) x = x == a
    contains (O g h) x = contains g x || contains h x
    contains (C _ g h) x = contains g x || contains h x
weight _ _ _ = None

So rather than having two dual monoids (like in a semiring), the option monoid and the edge label monoid are nested.
Is it possible to achieve a similar definition for weight by using a semiring?
I'd expect at least the following behaviour:

  • overlay combines weights (does not loose edge label information) but never connects unconnected nodes
  • connect ensures nodes are connected but also combines weights (does not loose edge label information, too)

This was written with the monoid (0, max) in mind (i.e. we start from no information and move to more information).
It's also interesting to consider the dual (+Inf, min) (i.e. we start from maximum information and want to consolidate/intersect it).

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

@boggle As far as I understand, you want O x y to be different from C 0 x y, but I don't see how this can be useful. Since you are using max as your operation, the weight represents some kind of 'connectedness' between vertices, for example, the width of the road between two cities. In this setting, the width of 0 is semantically indistinguishable from a non-existent road.

Can you give an example where the difference between O x y and C 0 x y matters?

from alga.

anfelor avatar anfelor commented on September 24, 2024

Have you formally proved the theorems?

No, I haven't yet, but I could do it if you want.

I pushed a branch with a draft implementation
TODO: Prove the C.Graph laws

  1. + := -{0}->. Commutativity is 1.2 and Associativity 2.1 above
  2. -> := -{1}->. Identity is 1.1 and Associativity is 2.1 above
  3. -> distributes over + is 2.3 above
  4. Decomposition is either of 1.3 or 1.4 with r=s=1

I chose the class name Dioid

Sounds good!

Related: -{s}-> looks pretty ugly, maybe we should switch to a different notation? <s> or something like that (({s}, e, <s>) forms a monoid for every element s)? The law 1.4 would become:
a <r> (b <s> c) = (a <r> b) <0> (a <r> c) <0> (b <s> c)

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

No, I haven't yet, but I could do it if you want.

@anfelor Sure, if you are familiar with Agda, feel free to send a PR to the Agda repository. Alternatively, if you'd like to prove it using your favourite prover, just share the link to the proof. A pen-and-paper proof is fine too, although it's harder to check :-)

Your proof sketch for Graph laws looks correct.

Related: -{s}-> looks pretty ugly, maybe we should switch to a different notation?

By x -{s}-> y I actually meant an arrow between x and y with s written on top of it, as in some of my earlier papers, e.g. see page 6 here: https://www.staff.ncl.ac.uk/andrey.mokhov/algebra.pdf.

This notation worked pretty well. In Latex, it is x\overset{s}{\longrightarrow}u. Is this OK?

For ASCII, your <s> notation is pretty neat, I like it. The only potential issue is that it is symmetric, but the operation is not commutative. But then <> is not commutative either and noone is bothered by this :-)

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

@anfelor I've been reviewing the definition and theorems of labelled algebraic graphs and realised that we never use distributivity of dioids (in fact, I don't think we even use the dioid multiplication at all). Perhaps, we should relax the constraint and talk about a semilattice with both least and greatest bounds? It feels a bit awkward, but is most general.

It's interesting that in order to introduce transitively-closed graphs, we will need the multiplication in order to define the following axiom:

b /= empty => a <s> b + b <t> c = a <s> b + b <t> c + a <s * t> c

We'll also need multiplication and distributivity to talk about the notions of reachability, paths, etc.

from alga.

anfelor avatar anfelor commented on September 24, 2024

@snowleopard Do you mean something like this?

data LabelledGraph {S eq} (d : Semilattice S eq) (A : Set) : Set where ...
connect, overlay, ... on LabelledGraph

data StructuredGraph {D eq} (d : Dioid D eq) (A : Set) : Set where ...
reachable, transitive-closure, ... on StructuredGraph

embed ::  {A D S eqd eqs} {d : Dioid D eqd} {s : Semilattice S eqs } -> StructuredGraph d A -> LabelledGraph s A

But I don't see yet how to embed a semilattice into a dioid. In Haskell:

class Semilattice a where
  high :: a -- For all x: x `join` high = high
  low :: a  -- For all x: x `join` low = x
  join :: a -> a -> a -- associative, idempotent, commutative
  
class Dioid a where
  zero :: a
  one :: a
  (+) :: a -> a -> a
  (*) :: a -> a -> a
  
instance (Dioid a) => Semilattice a where
  join = (+)
  low = zero
  high = ??

Now, high = one works for most dioids we have considered so far but not for all, for example the longest distance dioid on page 4 of the semirings paper. Do we add one + x = one as a dioid law and thus rule out this case?

PS: I like your implementation of x -<e>- y, I didn't know such syntax was possible in Haskell.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

Do you mean something like this?

@anfelor Yes: low corresponds to the lack of information about connectivity and high to knowing everything about it. For path-like interpretations, low means unreachable, and high means 0 distance. For set-like representations, low corresponds to the empty set and high to the full set. Perhaps, we don't necessarily need to require high, because it may sometimes be not defined. For example, if we consider graphs whose vertices are types, or in Haskell terms TypeReps, then there is no sensible notion of "the set of all functions" between any two types, but we could still create graphs Int + Bool and Int -<even>- Bool, allowing us to represent diagrams from the Hask category (although the representation will probably be untyped, i.e. we won't be able to typecheck our diagrams). I guess there is just no sensible definition of connect at all in this case.

Now, high = one works for most dioids we have considered so far but not for all, for example the longest distance dioid on page 4 of the semirings paper. Do we add one + x = one as a dioid law and thus rule out this case?

Aha, interesting! If we define zero = low = LUnreachable then zero + x = x and everything is fine, but then if also define one = high = Distance 0 then we no longer have the law one + x = one -- for example, Distance 0 + Infinite = Infinite. So, I guess this is another example where we might not want to dictate that the semilattice necessarily fixes a particular upper bound.

We could do the following, assuming Semilattice e => Dioid e and low = zero:

overlay :: Semilattice e => Graph e a -> Graph e a -> Graph e a
overlay = LConnect low

connect :: Dioid e => Graph e a -> Graph e a -> Graph e a
connect = LConnect one

It's a bit less symmetric, but perhaps most general. Does this sound better to you?

PS: I like your implementation of x -<e>- y, I didn't know such syntax was possible in Haskell.

Cool :) It does look nice, especially with the Hasklig font I'm using. The only (minor) issue is that -< clashes with the Arrow notation.

from alga.

anfelor avatar anfelor commented on September 24, 2024

It's a bit less symmetric, but perhaps most general. Does this sound better to you?

It sounds like a relatively easy change to make with potential benefits to users and much better than changing the dioid laws. It would also improve the usefulness of alga as a general graph representation.

Yes: low corresponds to the lack of information about connectivity and high to knowing everything about it.

Unfortunately, the semilattice class above has a name clash with monadic join and for meet-semilattices the relationship is reversed by convention, eg low = one. But maybe use a join-semilattice with a different name?

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

@anfelor What about using the following class hierarchy?

-- Associative, commutative and idempotent |+| with identity 'zero'
class Semilattice a where
    zero :: a
    (|+|) :: a -> a -> a

-- Associative |*| that distributes over |+| with identity 'one'
class Semilattice a => Dioid a where
    one :: a
    (|*|) :: a -> a -> a

from alga.

 avatar commented on September 24, 2024

Hi. I'm jumping into the conversation late but I wanted clarification on what is wrong with the naive labelled connect operations. I don't see how it makes sense to talk about associativity of two different operations. What makes sense to me is asking whether they commute, and they don't:

(a <x> b) <y> (c <x> d) /= (a <y> b) <x> (c <y> d)

However, they still seem to satisfy some decomposition laws:

a <x> (b <y> c) = a <x> b + a <x> c + b <y> c
(a <x> b) <y> c = a <x> b + a <y> c + b <y> c

These laws seem intuitive to me and I think they hold for labelled graphs in general although I haven't proved it. In particular, the two ways of decomposing the following expression agree:

(a <x> b) <y> (c <z> d)

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

Hi @danharaj, welcome!

I agree with you: different labels lead to different operations, so associativity makes little sense (but when the labels are the same, associativity does hold). Also, your 4-way decomposition is very nice, I'll use it in the examples/tests, thank you!

But I don't understand this remark:

what is wrong with the naive labelled connect operations

What do you mean by "naive labelled connect operations"?

As far as I'm concerned, there is nothing wrong with our current design, and it does look rather straightforward to me, one could even call it naive :-) Essentially, we just add labelled connect operations and describe how they work by postulating a few axioms. We also establish a link with unlabelled graphs by noting that overlay and connect correspond to two specific labels.

from alga.

 avatar commented on September 24, 2024

What do you mean by "naive labelled connect operations"?

Sorry, I got tripped up reading this thread; it seemed like the approach where you have one connect operator per edge label was inadequate and an alternative formulation was sought. I call it naive because it's the very first thing one would try and hope it works :) I see now that this is a special case of the formulation you all have been discussing via the lattice of subsets of edge labels.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

Aha, I see :) This thread did get pretty long and confusing. I guess I better open a new one, describing the chosen design, just to figure out the remaining bits. Will do that in a few days.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

I gave a Haskell eXchange 2018 talk about edge-labelled algebraic graphs:

https://skillsmatter.com/skillscasts/12361-labelled-algebraic-graphs

It mostly builds on the abstractions we discussed in this issue, but in the end I decided to go for a different hierarchy of type classes for edge labels: Monoid => Semiring => Dioid. A working implementation can be found in Algebra.Graph.Label and Algebra.Graph.Labelled modules in the master branch.

I intend to write a blog post about this, providing further details.

@anfelor Any comments? Do we need to modify your Agda proofs? I guess all results you proved for dioids still hold, but we might have some other interesting theorems for the case when the type of labels is not a dioid, but something weaker -- a monoid or a semiring.

from alga.

snowleopard avatar snowleopard commented on September 24, 2024

I think we can finally close this 🎉 I'm quite happy with the final design. There are a few unpolished spots, but I'll be opening separate issues for them if need be.

Many thanks to everyone for an amazing discussion!

from alga.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.