
Comments (10)

arogozhnikov commented on May 18, 2024

@p4perf4ce thanks for thinking out loud about this, with examples.

I've been poking around with an operation's semantics (I've dubbed it rechunk); it has some overlap with your suggestion.

One critical choice: how to specify which axis is modified?

When both input and output are represented in full shape (as in your suggestion), the pattern packs too much into a single operation and does not focus on the axis; it is unclear what a user should focus on.

In my experiments I've landed on something very similar: a "list" in the pattern, and a list as input/output.
In your suggestion there is an exceptional case of a single element in a list, and the introduction of special cases should be avoided.

I have converged on:

[result] = rechunk([a, b, c], '[x,y,z] -> [x+y+z]', axis='b h w *')

It is possible to also support something like:

result = rechunk([a, b, c], '[x,y,z] -> x+y+z', axis='b h w *')

...but static code analysis would go crazy; it is easier to always take and return lists.

The problem of an arbitrary number of inputs is a hard one. I've had something like this:

[result] = rechunk([a, b, c], '[*x] -> [concat(x)]', axis='b h w *')

...too complex, and there is no need for this flexibility. The following could completely cover all necessary cases:

[result] = rechunk([a, b, c], 'concatenate', axis='b h w *')

There is a natural requirement to have an "inversion" of concatenation (which can work properly only if the pattern contains information about a single axis).
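
For intuition, here is roughly what the concatenate form above and its "inversion" would amount to in plain PyTorch (just a sketch with made-up shapes; rechunk itself is only a proposal):

import torch

a, b, c = torch.randn(2, 4, 4, 3), torch.randn(2, 4, 4, 5), torch.randn(2, 4, 4, 8)

# proposed: [result] = rechunk([a, b, c], 'concatenate', axis='b h w *')
result = torch.cat([a, b, c], dim=-1)            # shape (2, 4, 4, 16)

# the inverse operation: split back into the original chunks along the same axis
a2, b2, c2 = torch.split(result, [3, 5, 8], dim=-1)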

I can post a more detailed RFC with the suggestion if that's something interesting to discuss, but I won't be able to dedicate time to implementing/supporting it.


arogozhnikov commented on May 18, 2024

Thanks for the discussion, folks!

The brand-new einops.pack and einops.unpack cover the common cases for concatenate and chunk, so I'm closing this:

https://github.com/arogozhnikov/einops/blob/master/docs/4-pack-and-unpack.ipynb
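
For reference, a minimal example of how pack/unpack covers the concatenate/chunk cases discussed in this thread (shapes are made up):

import numpy as np
from einops import pack, unpack

x = np.random.random((2, 10, 512))
y = np.random.random((2, 10, 128))
z = np.random.random((2, 10, 256))

# concatenate along the last axis; ps records the original widths
h, ps = pack([x, y, z], 'batch seq *')      # h.shape == (2, 10, 896)

# the inverse: split h back into the original pieces
x2, y2, z2 = unpack(h, ps, 'batch seq *')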


davidnvq commented on May 18, 2024

We could also add the name of the dimension for clarity, so that the arguments inside the square brackets belong to the dimension named right before them.

For example:

reordering

# perform [rg b a -> b rg a] on `c` dimension
einops.reorder(imgs, 'batch c [rg b a -> b rg a] h w', rg=2, b=1, a=1)

concatenation

inputs = [x, y, z]
# concatenate along `feat` dimension.
out = einops.reorder(inputs, 'batch feat [x y z -> x y z]', x=10, y=15, z=20)

chunk

train_len = int(len(dataset) * 0.8)
# split/chunk along `batch` dimension
train_split = einops.chunk(dataset, 'batch [train val -> train] c h w', train=train_len)
val_split = einops.chunk(dataset, 'batch [train val -> val] c h w', train=train_len)

It's up to you whether to keep or drop the space between the dimension and its square brackets, i.e., feat [x y z -> x y z] vs. feat[x y z -> x y z].
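
For comparison, the reordering example above would boil down to plain channel indexing in a specific framework, e.g. PyTorch (a sketch; einops.reorder is only a proposal, and the shapes are illustrative):

import torch

imgs = torch.randn(8, 4, 32, 32)   # batch, c = rg(2) + b(1) + a(1), h, w

# proposed: einops.reorder(imgs, 'batch c [rg b a -> b rg a] h w', rg=2, b=1, a=1)
reordered = imgs[:, [2, 0, 1, 3]]  # move the single b channel in front of the rg pair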


p4perf4ce commented on May 18, 2024

Sorry to ask: is davidnvq#1 too old to be merged, or are there other ways to work around this?
It's quite awkward when most of the work can be done using einops, but a native framework tool is still needed for tasks like this.


arogozhnikov commented on May 18, 2024

@davidnvq great write-up, thanks for the ideas and examples.

Big pros of your suggestion:

  • chunking is verbalized
  • patterns are concise
  • everything is uniquely interpretable

These are the issues I see:
Concatenate looks like it does nothing (the left and right parts are identical, while the inputs and outputs are different):

out = einops.reorder(inputs, 'batch feat [x y z -> x y z]', x=10, y=15, z=20)

The second thing is a conflict with the existing notation: 'x y z' looks like a 3D tensor (according to existing operations). This is easy to fix, e.g. by writing x+y+z.
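
With that fix, the concatenation example might read (hypothetical syntax, same proposed reorder as above):

out = einops.reorder(inputs, 'batch feat [x y z -> x+y+z]', x=10, y=15, z=20)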


arogozhnikov commented on May 18, 2024

@davidnvq I'm also poking around the same issues (#20, #50) and pre-testing some concepts. I'll try to post some thoughts on how that can be written soon (but no promises!)


davidnvq commented on May 18, 2024

These are the issues I see:
Concatenate looks like it does nothing (the left and right parts are identical, while the inputs and outputs are different):

out = einops.reorder(inputs, 'batch feat [x y z -> x y z]', x=10, y=15, z=20)

The second thing is a conflict with the existing notation: 'x y z' looks like a 3D tensor (according to existing operations). This is easy to fix, e.g. by writing x+y+z.

I'm sorry. In your tutorial, concatenation can be done with x y -> (x y); I just missed having some kind of indicator like (). Yes, x+y+z is a simple and easy-to-understand way.
I'm looking forward to your new release. If you need any support from the community, please kindly create some issues that we can help with.
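
For what it's worth, the existing rearrange already handles the equal-width case, since a list input is treated as an extra leading axis (a minimal sketch; shapes are illustrative):

import numpy as np
from einops import rearrange

x, y, z = (np.random.random((2, 10, 64)) for _ in range(3))

# stack the list along a new axis `n`, then fold it into the feature axis
out = rearrange([x, y, z], 'n batch seq d -> batch seq (n d)')   # out.shape == (2, 10, 192)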


mashrurmorshed commented on May 18, 2024

@p4perf4ce I was just about to say the same thing. If einops had concat, my code would finally be framework-independent and super neat.


p4perf4ce commented on May 18, 2024

It has always bugged me to fall back to torch.cat when most of the tensor manipulation is done using einops; it makes the code less semantic, for example when you need to concatenate tensors from different branches, or implement skip connections where the number of channels doesn't match.

However, some of the functionality needs to be discussed. For example, take a look at the concatenation proposed by @davidnvq in his fork:

# From davidnvq/einops/pull/1 
>>> x = torch.randn(2, 10, 512)
>>> y = torch.randn(2, 10, 128)
>>> z = torch.randn(2, 10, 256)
>>> h = concat([x, y, z], "batch seq [dx dy dz -> (dx dy dz)]", batch=2, seq=10, dx=512, dy=128, dz=256)

I think we don't actually need to separate concat from rearrange, since the number of dimensions of the input doesn't change at all.

# Suppose tensors x, y, z share the same structure, except for the axis that needs to be concatenated.
# We denote a variable-length axis using square brackets `[]`
# (a single pair of [] is allowed on each side of the pattern).
# E.g., x, y, z > (B, H, W, [C])
>>> h = rearrange([x, y, z], "B H W [Cx Cy Cz] -> B H W [Cx+Cy+Cz]")  # Like stacking a list of tensors
                                                                      # of the same shape, but using `[]` as
                                                                      # a variable-length trigger and
                                                                      # `+` for concatenation.
                                                                      # Thanks to @arogozhnikov for `+`.
>>> h.shape
(B, H, W, Cx+Cy+Cz)

# Additionally, this syntax allows us to split the tensor.
>>> batch_group = rearrange(h, "B H W [C1+C2] -> B H W [C1 C2]", C1=Cx, C2=Cy+Cz)
>>> batch_group.shape
((B, H, W, Cx), (B, H, W, Cy+Cz))

# While these do practically nothing (unlike `()`, which we can use to stack a list of tensors):
>>> [a, b, c] = rearrange([x, y, z], "B H W [Cx Cy Cz] -> B H W [Cx Cy Cz]") # a==x, b==y, c==z
>>> H = rearrange(h, "B H W [Cx+Cy+Cz] -> B H W [Cx+Cy+Cz]") # H==h

# The following looks strange but is still sensible, as long as the numbers add up.
# Here, we can still treat `Cb+Cc` on the left-hand side as a named variable
# whose validity we need to check.
# This is particularly useful when we use some black-box function/method that promises to do concatenation,
# but we want to make sure the output is valid.
>>> s = someBlackBoxMethod(b, c)
>>> K = rearrange(
    [a, s], "B H W [Ca Cb+Cc] -> B H W [Ca+Cb+Cc]",
    Ca=a.shape[-1], Cb=b.shape[-1], Cc=c.shape[-1]
)
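
In plain PyTorch, the check that the [Ca Cb+Cc] pattern would perform amounts to something like the following (a sketch; checked_concat is a made-up helper, not part of any proposal):

import torch

def checked_concat(tensors, expected_widths, dim=-1):
    # assert each tensor's width on `dim` before concatenating
    for t, w in zip(tensors, expected_widths):
        assert t.shape[dim] == w, f"expected width {w} on dim {dim}, got {t.shape[dim]}"
    return torch.cat(tensors, dim=dim)

# e.g. verify that the black-box output s really has width Cb + Cc before concatenating:
# K = checked_concat([a, s], [a.shape[-1], b.shape[-1] + c.shape[-1]])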

However, the proposed syntax shouldn't allow any operation on the variable-length axis other than concatenation and swapping axes around.

# This makes no sense.
>>> rearrange([x, y, z], "B H W ([Cx Cy Cz] K) -> B H W [Cx+Cy+Cz] K")
# This makes sense.
# Also, it wouldn't matter whether you rearrange each tensor and then concatenate, or concatenate and then rearrange.
>>> rearrange([x, y, z], "B (h1 h2) W [Cx Cy Cz] -> B h1 h2 [Cx+Cy+Cz] W", h1=N, h2=M)

For this proposed syntax:

# From davidnvq/einops/pull/1 
###  Example: concatenate
>>> h = concat([x, y, z], "batch seq [... -> ...]")
>>> h = concat([x, y, z], "batch seq [... -> d]")

# If we change into the usual rearrange syntax
>>> h = concat([x, y, z], "batch seq [...] -> batch seq [d]") # Makes sense, but looks like a reduction.
>>> h = rearrange([x, y, z], "batch seq [...] -> batch seq [...]") # What does this actually mean?
# Should it be interpreted as doing nothing, or as concatenation?

I personally think this ellipsis style violates the purpose of being strictly determined, because I can't immediately infer what's actually going on here.
If there are going to be that many tensors, say 10 or more, we can work around it like this:

>>> h = [x, y, z]
>>> dim_var = [f'_{i}' for i in range(len(h))]
>>> sep_dim = ' '.join(dim_var)
>>> cat_dim = '+'.join(dim_var)
>>> h = rearrange(h, f"batch seq [{sep_dim}] -> batch seq [{cat_dim}]")

It's a bit messy, but it still keeps the tensor manipulation down to just string manipulation.

Alternatively, we could just do this:

# This is allowed as long as there is only one variable-length axis.
# Also, no named variable gets dropped on the right-hand side.
# Note: `+` is not allowed in this case, since the number of named vars in `[]`
# doesn't match the number of elements in the list.
>>> h = rearrange([x, y, z], "B H W [C] -> B H W C") 

Thanks @arogozhnikov for this wonderful tool, which liberates us from cross-framework tensor manipulation, and for the '+' advice above. I would love to know your thoughts on this.


austinmw commented on May 18, 2024

Would really love this feature!

