arogozhnikov / einops
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
Home Page: https://einops.rocks
License: MIT License
The beginnings of the "Improving RNN language modelling" and "CNNs for text classification" blocks aren't visible in https://arogozhnikov.github.io/einops/pytorch-examples.html
And a typo: "Improving RNN language modilling" -> modelling. ^_^
This makes reduce() more flexible, e.g., you could just pass in tf.reduce_logsumexp instead of requiring it to be a built-in (#12).
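For illustration, a sketch of how passing a callable could look (hedged: this assumes reduce accepts a function called as fn(tensor, reduced_axes), as proposed):
import tensorflow as tf
from einops import reduce
x = tf.random.normal([8, 64, 32])
# hypothetical usage: a callable instead of a built-in reduction string;
# tf.reduce_logsumexp accepts a tuple of axes as its second argument
y = reduce(x, 'b t c -> b c', tf.reduce_logsumexp)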
My sense is that in many cases the size of a new axis should match the size of an existing axis on a different tensor.
I wonder if a helper function that used the same syntax as the rest of einops for extracting axis sizes into a dict would work well?
e.g.,
>>> einops.sizes(input, 'b h w c')
{'b': 32, 'h': 192, 'w': 192, 'c': 3}
This could be naturally extended into a multi-argument version that verifies consistent sizes, e.g.,
>>> einops.sizes(input, 'b h w c_in', weights, 'w h c_in c_out')
{'b': 32, 'h': 192, 'w': 192, 'c_in': 3, 'c_out': 16}
The alternative is manual unpacking of shape, e.g., b_size, h_size, w_size, c_size = input.shape. This is also pretty readable, but maybe a little harder to use reliably. For example, if you only care about the size of the batch axis, you would be tempted to write b_size, *_ = input.shape or b_size = input.shape[0], which doesn't include the explicit shape assertion. And there's no easy way to check sizes for multiple arguments.
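A minimal sketch of such a helper (the name sizes and its behaviour are hypothetical; einops.parse_shape covers the single-tensor case in a similar way):
def sizes(tensor, pattern):
    # map space-separated axis names to the corresponding sizes
    names = pattern.split()
    shape = tuple(tensor.shape)
    assert len(names) == len(shape), f"pattern {pattern!r} does not match rank {len(shape)}"
    return dict(zip(names, shape))
# sizes(input, 'b h w c') -> {'b': 32, 'h': 192, 'w': 192, 'c': 3}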
Hello, I'm just throwing out an idea. I'm not sure it fits in the scope of einops, and it will probably require a lot of work, but I think it would be useful: what about allowing manipulation of elements along a dimension, for example the r, g, b channels of an image?
I could imagine a syntax that would look like this, with everything inside brackets referring to elements rather than dimensions:
# reorder color channels rgb to bgr OpenCV-style:
rearrange(imgs, 'batch [r g b] h w -> batch [b g r] h w')
Extending the existing syntax, grouping elements would look like this:
# reorder color channels rgb to brg:
rearrange(imgs, 'batch [rg b] h w -> batch [b rg] h w', rg=2)
This could also allow dropping elements:
# remove alpha channel:
rearrange(imgs, 'batch [rgb a] h w -> batch [rgb] h w', rgb=3)
@arogozhnikov if you like einsum you will love capsule networks: https://github.com/michaelklachko/CapsNet/blob/master/capsnet_cifar.py#L68-L91 someone should do that in pytorch.
subj. For now concentrating on a single framework.
Is it possible to get a new release of einops out on PyPI? It seems like the version installable by pip doesn't include repeat (which is a very useful op).
This is admittedly a bit of a crazy idea that I think could be interesting to explore. Interested to hear anybody's thoughts on this.
Basically, I think it would be even more readable and concise if one could chain together multiple operations in a single string. Would also eliminate intermediate variables and multiple einops calls.
Let me explain with an example:
out = einops.chain("""
x1=x: b h w c -> b h w; mean
x2=y: b h w c -> b h w; mean
x1, x2: b h1 w, b h2 w -> b h1 h2
""", x=x, y=y)
So, here x and y are set by keyword args, which would let you pass any number of variables into the chain of expressions. You could potentially run an entire pipeline through a single einops chain.
The syntax x1=x: stores the output of the following computation in the "register" x1. That would be passed to the next calculations (shown on line 3 of the string). And similarly for x2.
The syntax ; mean would tell einops that a reduction using the mean operation should be performed.
The syntax x1, x2: would take the arrays in the temporary variables x1 and x2, and execute the following command using einsum. Since there's no =, it just means to output it as the result of the chain.
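For reference, the same computation spelled out with existing operations (a sketch; torch.einsum stands in for the final contraction, since einops itself has no einsum here):
import torch
from einops import reduce
# x and y are the input tensors from the example above
x1 = reduce(x, 'b h w c -> b h w', 'mean')
x2 = reduce(y, 'b h w c -> b h w', 'mean')
out = torch.einsum('bhw,biw->bhi', x1, x2)  # i.e. 'b h1 w, b h2 w -> b h1 h2'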
If the keyword arg is an integer rather than an array, it would be interpreted as a size, e.g.:
einops.chain("ims: (b1 b2) h w c -> h (b1 b2 w) c", ims=ims, b1=2)
would do the normal rearrange operation with b1 set to 2.
Excluding einsum, which is discussed in #73, I don't think this would require additional operations. It would just be string parsing and meta-programming of other operations.
This would also be a unified way of doing the other operations, as well as einsum.
Curious to hear opinions on this! And any other syntax ideas for doing this.
Cheers,
Miles
mxnet issue (MXNET_SPECIAL_MAX_NDIM). After digging into mxnet:
Great work!
I discovered in https://github.com/arogozhnikov/einops/blob/master/einops/einops.py#L199 that you also support ellipsis. It's an important feature, so you may want to add it to the documentation.
CI will not have GPUs, thus we need to test without cupy.
Need to decide on a policy for keeping reference documentation for previous releases.
In pytorch, we have 'expand_as', which checks dims before expanding.
I'm aware of the 'repeat' layer as a replacement for 'expand', but could you add 'repeat_as' as a replacement for 'expand_as'?
Thanks.
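A sketch of how expand_as-like behaviour can already be assembled from repeat and parse_shape (assuming a 1-D input and a 4-D reference; a real repeat_as would generalize this):
import torch
from einops import repeat, parse_shape
x = torch.randn(16)               # shape (c,)
ref = torch.randn(8, 16, 32, 32)  # shape (b, c, h, w)
# take the missing axis sizes from the reference tensor ('_' skips an axis)
y = repeat(x, 'c -> b c h w', **parse_shape(ref, 'b _ h w'))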
I just read through the pytorch 1.3 release notes and found their named tensor feature
https://pytorch.org/docs/stable/named_tensor.html
it looks similar to einops but quite limited and not as powerful - will continue to use einops
I was wondering what einops integration with the named tensor feature could look like; e.g. in
>>> imgs = torch.randn(1, 2, 2, 3, names=('N', 'C', 'H', 'W'))
>>> imgs.names
('N', 'C', 'H', 'W')
>>> rearrange(imgs, "() c h w -> c h w")
should einops check that the names of the input tensors are matching the pattern? What else can be done here?
I understand the named tensors are an experimental feature right now; this ticket is more about starting a discussion from the einops point of view. Thanks!
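One hypothetical form such a check could take (sketch only, not an existing einops API): compare the left side of the pattern against tensor.names.
def check_named_axes(tensor, pattern):
    # hypothetical sketch: verify tensor.names against the pattern's left side
    left = pattern.split('->')[0].split()
    for axis, name in zip(left, tensor.names):
        if name is None or axis in ('()', '1'):
            continue  # unnamed tensor axis or anonymous unit axis: nothing to check
        if axis.lower() != name.lower():
            raise ValueError(f"pattern axis {axis!r} does not match tensor name {name!r}")
# check_named_axes(imgs, '() c h w -> c h w')  # ('N','C','H','W') vs ('()','c','h','w')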
Some revisiting of recipes will be required, but it is a missing ingredient for complete uniformity.
Sometimes when the internet is slow, GitHub takes a long time to open Jupyter notebooks, and sometimes it fails to open them.
I would suggest adding nbviewer links for the docs/xxx.ipynb files, which open fast and also seem more pleasant (IMO).
can't access the einops fundamentals ipynb
Need to investigate whether backend packages make strides available for analysis (or at least as_contiguous). This may help with optimizations.
Is there a reason that einops does not support uppercase Latin letters? I would like to use both upper- and lowercase letters.
Documentation should live on a shorter link
Continuing discussion started in pull-request #25 .
So far: tf.keras and keras are different things now; they work on different inputs and have different recommendations for creating custom layers.
This version seems to work for me with tensorflow.
import tensorflow as tf
from einops.layers.keras import RearrangeMixin, ReduceMixin, UnknownSize
class Rearrange(RearrangeMixin, tf.keras.layers.Layer):
    def call(self, inputs):
        return self._apply_recipe(inputs)

class Reduce(ReduceMixin, tf.keras.layers.Layer):
    def call(self, inputs):
        return self._apply_recipe(inputs)
Example for eager execution
tf.enable_eager_execution()
x = tf.zeros([4, 5], dtype='float32')
Rearrange('i j -> j i')(x).shape
Reduce('i j -> j', 'max')(x).shape
And example without eager execution
import numpy
x = tf.placeholder('float32')
x.set_shape([None, None])
with tf.Session().as_default():
    y = Rearrange('i j -> j i')(x).eval({x: numpy.zeros([5, 6], dtype='float32')})
    y = Reduce('i j -> j', 'max')(x).eval({x: numpy.zeros([5, 6], dtype='float32')})
At least this seems to comply with tf guide
https://www.tensorflow.org/tutorials/eager/custom_layers
My env:
python 3.6 (should not affect)
In [2]: tensorflow.__version__
Out[2]: '1.10.0'
In [4]: keras.__version__ (should not affect)
Out[4]: '2.2.4'
This seems like it should be (intuitively) plausible:
rearrange(x, 'b -> a b c', a=1, c=1)
to essentially push a vector to be compatible with some other tensors (for broadcasting operations). Currently this throws an error.
One (sort of ugly) workaround is:
rearrange(x, '(a b c) -> a b c', a=1, c=1)
However, it seems like this is a bit redundant and it obfuscates the intent a bit. Thoughts?
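For what it's worth, later einops releases accept anonymous unit axes written as a literal 1, which covers this case without the grouping trick (a hedged sketch, assuming such a version):
import numpy as np
from einops import rearrange
x = np.random.randn(5)
# add leading and trailing singleton axes for broadcasting
y = rearrange(x, 'b -> 1 b 1')   # shape (1, 5, 1)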
Currently, concatenation as in the example is done by calling stack_on_zeroth_dimension() first and then rearranging the tensor into the appropriate shape. However, most backend.stack() implementations require that all dimensions except the stacked one be the same, so simple concatenation along a dimension with different lengths is not possible.
For example, if we were to stack an image with 3 channels with an image with a single channel to create a 4-channel image:
img1 = np.random.randn(300, 200, 3)
img2 = np.random.randn(300, 200, 1)
np.concatenate([img1, img2], axis=2).shape
# (300, 200, 4) as expected
rearrange([img1, img2], 'b w h c -> w h (b c)')
# np.stack error: all input arrays must have the same shape
It would be ideal if, when such cases occur, concatenation methods like np.concatenate or torch.cat were called instead of stack. I am not sure how this might break the simplicity of the rest of the code.
A way to split images into patches like im2col and col2im, where they're inverse operations of each other (unlike PyTorch, which does summation).
eg:
x = torch.rand(1, 3, 64, 64)
y = im2col(x, kernel_size=5, stride=1)
z = col2im(y, kernel_size=5, stride=1)
and x == z.
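Not an einops answer, but a sketch of how the exact inverse can be obtained in PyTorch today: normalize fold's summation by the per-pixel contribution count (assumes torch.nn.functional.unfold/fold):
import torch
import torch.nn.functional as F
x = torch.rand(1, 3, 64, 64)
# im2col: extract 5x5 patches -> (N, C*5*5, L)
y = F.unfold(x, kernel_size=5, stride=1)
# col2im: fold sums overlapping patches, so divide by how many patches touch each pixel
ones = torch.ones_like(x)
divisor = F.fold(F.unfold(ones, kernel_size=5, stride=1), output_size=(64, 64), kernel_size=5, stride=1)
z = F.fold(y, output_size=(64, 64), kernel_size=5, stride=1) / divisor
assert torch.allclose(x, z)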
Repo description contains the word "rethinked", which is grammatically incorrect. The correct form is "rethought".
Like torch.tensor.expand
Can we use this library only for numpy operations when we do not have tensorflow/torch/etc?
I was looking for the requirements.txt file and it was missing from the GitHub repo.
It would be helpful for starters if there were info about library requirements.
Thank you!
It would be nice to have it, but there are problems with backends.
A custom implementation through exp and max would probably take much more memory for a backward pass.
Integrating einsum with einops is a good direction.
Summary: currently, relying on backends is hard.
Other option: implement a minimalistic version for two operands based on rearrange / diagonal slicing and dot product. This may turn out to be inefficient.
I am using pytorch. Suppose I want to rearrange a tensor and change some of its elements in place. But I don't know whether rearrange will create a view or not. So I think there must either be an argument which means "raise an error iff this rearrange can't be performed using a view", or there must be an easy way to determine whether my rearrange will create a view or not.
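One heuristic that works today (a sketch, not an einops feature): compare the underlying storage of input and output. Pure permutations come back as views; compositions/decompositions of axes may require a copy.
import torch
from einops import rearrange
x = torch.arange(24).reshape(2, 3, 4)
y = rearrange(x, 'a b c -> a c b')
# shared storage means rearrange returned a view of x rather than a copy
shares_storage = y.storage().data_ptr() == x.storage().data_ptr()
print(shares_storage)  # True for a pure permutation like this one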
opt_einsum: https://optimized-einsum.readthedocs.io/en/latest/
Not sure what integration would look like.
Maybe with a module flag for an "einsum optimizer" (EINSUM_OPT in ['opt_einsum', None]). The einsum part should work the same for all backends it supports, but the rest of the operations need to be specified per-backend.
Just opening it for discussion :)
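For context (not an einops API), plain opt_einsum usage looks roughly like this; it accepts numpy, torch, and other backends' tensors:
import numpy as np
import opt_einsum as oe
a = np.random.rand(16, 32)
b = np.random.rand(32, 64)
c = np.random.rand(64, 8)
# opt_einsum picks an efficient contraction order for the chain of products
out = oe.contract('ij,jk,kl->il', a, b, c)
print(out.shape)  # (16, 8)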
Thank you for making our lives easier when working with tensors. I have the following suggestions based on #50 and #20.
As suggested in #50, it would indeed be useful to have an operation for reordering the elements of channels, especially for those working on images with different libraries (OpenCV, PIL). It is much better than doing it with boring indices.
I totally agree with @remisphere that we can use reorder without misleading users.
# instead of doing this
out = imgs[:, [2, 0, 1, 3], :, : ]
# we can use the below
einops.reorder(imgs, 'batch [rg b a -> b rg a] h w', rg=2, b=1, a=1)
Since we only perform operations on a single dimension, we can perform the concatenation of multiple items with different sizes on that dimension. This will easily handle the case mentioned in #20 and is extremely useful for those who use concatenate in their code. I use this function many times to concatenate tensors of different shapes. For example:
# three below tensors have different size on the 2nd dim
print(x.shape) # [b, 10]
print(y.shape) # [b, 15]
print(z.shape) # [b, 20]
# we can concatenate them as
inputs = [x, y, z]
out = einops.reorder(inputs, 'batch [x y z -> x y z]', x=10, y=15, z=20)
The above call is consistent with einops.rearrange for concatenating inputs whose items have the same shape.
It is possible to split out into its components x, y, z in three lines using the chunk function below:
x = einops.chunk(out, 'batch [x yz -> x]', x=10)
y = einops.chunk(out, 'batch [x y z -> y]', x=10, y=15)
z = einops.chunk(out, 'batch [xy z -> z]', z=20)
In contrast with #50, I don't think it is a good idea to merge chunking into reorder.
We can separate these functionalities into the above reorder and chunk. Chunking is used frequently when we want to sample parts of datasets and features.
Example in #50:
# remove the alpha channel and the bottom half of 256*256 images:
einops.chunk(imgs, 'batch [rg b a -> b rg] [top bottom -> top] w', rg=2, b=1, top=128, batch=10)
Split dataset into train and val:
train_len = int(len(dataset) * 0.8)
train_split = einops.chunk(dataset, '[train val -> train] c h w', train=train_len)
val_split = einops.chunk(dataset, '[train val -> val] c h w', train=train_len)
And we can get the full dataset given train_split and val_split:
dataset = einops.reorder([train_split, val_split], '[train val -> train val] c h w', train=len(train_split), val=len(val_split))
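To make the proposed semantics concrete, here is what these calls would reduce to in plain numpy today (a sketch; reorder and chunk themselves are hypothetical):
import numpy as np
x = np.random.randn(4, 10)
y = np.random.randn(4, 15)
z = np.random.randn(4, 20)
# proposed reorder([x, y, z], 'batch [x y z -> x y z]', x=10, y=15, z=20)
out = np.concatenate([x, y, z], axis=1)   # shape (4, 45)
# the proposed chunk calls are then plain slices along the same axis
x2 = out[:, :10]     # chunk(out, 'batch [x yz -> x]', x=10)
y2 = out[:, 10:25]   # chunk(out, 'batch [x y z -> y]', x=10, y=15)
z2 = out[:, 25:]     # chunk(out, 'batch [xy z -> z]', z=20)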
It would be nice to have independent separate guides (much better if kept separately), but it's better to start from one particular guide.
Hey! Loving einops, so much that now I feel a bit sad about standard einsum not being able to use descriptive names for dimensions. It would be amazing if einops implemented einsum with the same conveniences.
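For reference, a sketch of how such an API could look; this roughly matches the einops.einsum that appeared in later releases (tensors first, pattern last), though details may differ:
import torch
from einops import einsum  # available in recent einops versions
q = torch.randn(8, 16, 64)  # batch, query, channel
k = torch.randn(8, 32, 64)  # batch, key, channel
# descriptive axis names instead of single letters
attn = einsum(q, k, 'batch query channel, batch key channel -> batch query key')
print(attn.shape)  # torch.Size([8, 16, 32])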
Sometimes, I find myself working with lists of tensors in which one tensor has a shape (b, c) (for c classes) and another tensor has shape (b,) (for a single class). My current approach is to pad the tensors that have only one class with an additional channel dimension, use rearrange on the list, and then squeeze the dimensions that need to be squeezed.
A great alternative to this would be supporting optional channels. Perhaps you could notate them with a question mark: rearrange(x, "b c? -> (b c?)").
When all axes are known, it works nicely:
import tensorflow as tf
from einops import rearrange
x = tf.placeholder(tf.float32, shape=(2, 5))
print(x.shape)
y = rearrange(x, 'a b -> b a')
print(y.shape)
yields:
(2, 5)
(5, 2)
If some axes are not known (e.g. variable batch size, variable sequence length, ...), reshape does not preserve the shape information:
import tensorflow as tf
from einops import rearrange
x = tf.placeholder(tf.float32, shape=(None, 5))
print(x.shape)
y = rearrange(x, 'a b -> b a')
print(y.shape)
yields
(?, 5)
(?, ?)
Originated from patch #31
A new experimental layer WeightedEinsum was added recently (PR #70). Users are welcome to give it a try.
This issue is for collecting feedback on the API and possible issues with the current implementation.
WeightedEinsum resembles the usual einsum with two arguments:
output = einsum('<input_part>,<weight_part> -> <output_part>', input, layer.weight)
Corresponds to a layer
layer = WeightedEinsum('<input_part> -> <output_part>', weight_shape='<weight_part>')
weight_shape is passed as an additional argument to stress the difference between input and weight.
Note: all dimensions of the weight shape / bias shape should be specified in the parameters.
Simple linear layer with bias term. You have one like that in your framework (prefer framework built-in where possible)
WeightedEinsum('t b cin -> t b cout', weight_shape='cin cout', bias_shape='cout', cin=10, cout=20)
Linear layer applied to a different axis. Identical to Conv1x1
WeightedEinsum('b cin h w -> b cout h w', weight_shape='cin cout', bias_shape='cout', cin=10, cout=20)
Channel-wise multiplication (like one used in normalizations)
WeightedEinsum('t b c -> t b c', weight_shape='c', c=128)
Separate dense layer within each head, no connection between different heads
WeightedEinsum('t b head cin -> t b head cout', weight_shape='head cin cout', head=8, cin=128, cout=128)
Collapsing several axes into one is frequently followed by a linear layer. This should be one explicit step; also, all arithmetic is now done by the layer, not the user.
WeightedEinsum('b h w c_in -> b c', weight_shape='h w c_in c', h=6, w=6, c_in=64, c=256)
Composition and decomposition should be possible for input and output, as in einops.rearrange (to be implemented)
WeightedEinsum('t b (head cin) -> t b (head cout)', weight_shape='head cin cout', head=8, cin=128, cout=128)
Uniform He initialization is applied to the weight tensor.
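A usage sketch for the torch variant (hedged: the layer is experimental and was later renamed EinMix, so treat the import and class name as indicative rather than exact):
import torch
from einops.layers.torch import WeightedEinsum  # later renamed EinMix
# dense layer over the last axis, applied independently at every (t, b) position
layer = WeightedEinsum('t b cin -> t b cout', weight_shape='cin cout', bias_shape='cout', cin=10, cout=20)
x = torch.randn(5, 3, 10)
y = layer(x)
print(y.shape)  # torch.Size([5, 3, 20])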