keras-team / keras-cv
Industry-strength Computer Vision workflows with Keras
License: Other
Some comments on this:
The timm package provides some soft attention modules for building network blocks, and I think they would be a good fit here. For example:
and many others.
For example:
"Looking to preprocess image data for classification? Try this CutMix + RandAugment pipeline"
Random Erasing [1] is another essential augmentation transform for controlling the amount of regularization when training models on ImageNet. It's also used by the recent line of papers [2, 3]. Short explanation: https://paperswithcode.com/method/random-erasing.
[1] Random Erasing: https://arxiv.org/abs/1708.04896
[2] ResNet Strikes Back: https://arxiv.org/abs/2110.00476
[3] A ConvNet for the 2020s: https://arxiv.org/abs/2201.03545
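Random Erasing is simple enough to sketch directly. The following is a minimal NumPy illustration of the method from the paper (erase a random rectangle and fill it with noise); the function name, parameter ranges, and retry loop are my own choices for the sketch, not an existing KerasCV API.

```python
import numpy as np

def random_erasing(image, area_frac=(0.02, 0.33), aspect=(0.3, 3.3), rng=None):
    """Erase a random rectangle and fill it with uniform noise, per the paper."""
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = image.shape
    out = image.copy()
    for _ in range(10):  # retry a few times until a sampled box fits
        area = rng.uniform(*area_frac) * h * w
        ratio = rng.uniform(*aspect)
        eh = int(round(np.sqrt(area * ratio)))
        ew = int(round(np.sqrt(area / ratio)))
        if 0 < eh < h and 0 < ew < w:
            y = rng.integers(0, h - eh)
            x = rng.integers(0, w - ew)
            out[y:y + eh, x:x + ew] = rng.uniform(0, 1, size=(eh, ew, c))
            return out
    return out
```

A production layer would additionally vectorize this over the batch and support per-channel or constant fill modes.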
I'm very interested in this repository. Is it an upgraded version of keras.applications, like torchvision?
The nightly release will be published on every commit to master via a GitHub Action
This includes:
Please evaluate to extract common components/utils that could support Anchor free models:
tensorflow/hub#424
See also:
google-ai-edge/mediapipe#495 (comment)
google-coral/edgetpu#51
This can be done by hard coding some predictions, running through cocoeval.py, then testing the known values against the Keras COCO metrics
These are not the focus of KerasCV, but are in scope of creating state of the art models for one of the focus tasks.
A great example of this is model visualization
We can likely autogenerate this
Given the models requirements are being gathered in the discussion, is there a preferred way to implement them?
There are multiple ways to implement blocks and models:
The keras.applications way: models and blocks are built as functions with the functional API.
The subclassing way: blocks and models subclass keras.layers.Layer and keras.Model respectively, and both implement a call method.
Each way has its own benefits and drawbacks. Is one of the above preferred? Or maybe something entirely different?
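To make the two styles concrete, here is a minimal sketch contrasting them on a hypothetical ConvBlock (the block and model names are made up for illustration):

```python
import tensorflow as tf

# Style 1: the keras.applications way. Blocks are plain functions that
# wire layers together with the functional API.
def conv_block(x, filters):
    x = tf.keras.layers.Conv2D(filters, 3, padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)
    return tf.keras.layers.ReLU()(x)

inputs = tf.keras.Input(shape=(32, 32, 3))
outputs = tf.keras.layers.GlobalAveragePooling2D()(conv_block(inputs, 16))
functional_model = tf.keras.Model(inputs, outputs)

# Style 2: subclassing. Blocks subclass keras.layers.Layer, models subclass
# keras.Model, and both implement call().
class ConvBlock(tf.keras.layers.Layer):
    def __init__(self, filters, **kwargs):
        super().__init__(**kwargs)
        self.conv = tf.keras.layers.Conv2D(filters, 3, padding="same")
        self.bn = tf.keras.layers.BatchNormalization()

    def call(self, x, training=False):
        return tf.nn.relu(self.bn(self.conv(x), training=training))

class TinyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.block = ConvBlock(16)
        self.pool = tf.keras.layers.GlobalAveragePooling2D()

    def call(self, x, training=False):
        return self.pool(self.block(x, training=training))
```

The functional style is easy to plot and serialize; the subclassed style composes and configures more flexibly, which is partly what the question above is weighing.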
Are you interested in covering or migrating some of the CV components in the TF Addons namespace?
I suppose the starting point could be to review tfa.image,
as this repo already duplicates things like Cutout/RandomCutout etc.:
https://www.tensorflow.org/addons/api_docs/python/tfa/image
P.S. This is a parallel ticket of keras-team/keras-nlp#11
This is a commonly performed operation in image processing, making it a good fit for keras-cv. While it is not a super complicated operation (it's really just a lambda into a tf.gather), the resulting code is more readable. Let's include this in keras-cv.
Originally discussed here: keras-team/keras#15705
Re the design of the layer signature...
The current proposal is to implement a layer that looks like this:
rgb2bgr_layer = tf.keras.layers.ReorderChannel(order=[2, 1, 0], axis=-1)
Perhaps we may actually want to use an einsum-inspired syntax, such as:
rgb2bgr_layer = tf.keras.layers.ReorderChannel('rgb->bgr', axis=-1)
This would be really readable to anyone stumbling upon a new codebase, and generalizes quite well to any number of channels.
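Neither signature exists yet; as a minimal sketch of the list-of-indices proposal, the layer really can be little more than a tf.gather (the class name ReorderChannel is the proposal's, everything else here is assumption):

```python
import tensorflow as tf

class ReorderChannel(tf.keras.layers.Layer):
    """Hypothetical sketch: permute channels via tf.gather along `axis`."""

    def __init__(self, order, axis=-1, **kwargs):
        super().__init__(**kwargs)
        self.order = list(order)
        self.axis = axis

    def call(self, inputs):
        # Gather the channels in the requested order.
        return tf.gather(inputs, self.order, axis=self.axis)

# RGB -> BGR, as in the proposal above.
rgb2bgr = ReorderChannel(order=[2, 1, 0], axis=-1)
```

The einsum-style 'rgb->bgr' variant would only need a small parser that turns the string into the same index list before calling tf.gather.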
Please feel free to comment below with any additional thoughts
Paper: https://arxiv.org/abs/1912.02781 Cited by 308
Code:
Understanding which for loops are costing the most will guide the optimization/vectorization effort. Profiling can be done using the TensorFlow profiler:
I have explained my questions in more detail here. I would appreciate if you could answer them for this repository (i.e. keras-cv) as well.
Lots of the for loops in the COCORecall update_state method could be reduced to tf.einsum()
calls. Additionally, we should be able to vectorize area computation in the iou function, along with some other operations in the overall computation of the metric. tf.TensorArrays are quite slow, so we should look to remove those if possible.
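As an illustration of the vectorization being suggested, here is a loop-free pairwise IoU in NumPy using broadcasting; the same shapes translate directly to tf ops (the function name and box format are my choices for the sketch):

```python
import numpy as np

def pairwise_iou(boxes_a, boxes_b):
    """Vectorized IoU between (N, 4) and (M, 4) boxes in [x1, y1, x2, y2] format."""
    # Areas computed without any Python loop: (N,) and (M,).
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    # Broadcast to all (N, M) pairs of intersections.
    lt = np.maximum(boxes_a[:, None, :2], boxes_b[None, :, :2])  # top-left corners
    rb = np.minimum(boxes_a[:, None, 2:], boxes_b[None, :, 2:])  # bottom-right corners
    wh = np.clip(rb - lt, 0, None)          # clamp: disjoint boxes get 0 overlap
    inter = wh[..., 0] * wh[..., 1]
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / np.maximum(union, 1e-9)  # guard against zero-area unions
```

The per-pair for loop in the current iou helper collapses into the two broadcasted maximum/minimum calls above, which is the kind of rewrite the metric's hot path needs.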
These two well-known and very effective augmentation methods are widely used among ML practitioners. In Keras we can already find some general augmentation layers, and these two methods could be included as advanced augmentation layers.
CutMix - Paper - Cited by ~ 865
MixUp - Paper - Cited by ~ 2675
Let's add these as keras_cv.layers.RandomCutMix and keras_cv.layers.RandomMixUp.
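The core of MixUp is a few lines; here is a minimal NumPy sketch of what a RandomMixUp layer would compute per batch (function name and defaults are my assumptions, not the eventual KerasCV API):

```python
import numpy as np

def mixup(images, labels, alpha=0.2, rng=None):
    """Blend each sample with a shuffled partner; labels are mixed with the same lam."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing coefficient from Beta(alpha, alpha)
    idx = rng.permutation(len(images))    # random partner for each sample
    mixed_images = lam * images + (1 - lam) * images[idx]
    mixed_labels = lam * labels + (1 - lam) * labels[idx]
    return mixed_images, mixed_labels
```

CutMix follows the same pairing scheme but pastes a rectangular patch instead of blending pixels, with lam set to the pasted area fraction.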
System information.
TensorFlow version (you are using): 2.5
Are you willing to contribute it (Yes/No) : Yes
Describe the feature and the current behavior/state.
ResNeXt is a well-known classification model that is weirdly missing from keras.applications. The idea presented there is great, and I think the model is a good fit for the Keras package.
Will this change the current api? How?
Yes.
from tensorflow.keras.applications.resnext50 import ResNeXt50
from tensorflow.keras.applications.resnext101 import ResNeXt101
Who will benefit from this feature?
ML engineers and researchers who use tf.keras.
Other implementations: https://github.com/qubvel/classification_models
based on tensorflow/tensorflow#48371
This module came about due to some shared functionality that was originally tested within the layers. As long as this is shared, it would be good to unit test it.
[Reposting from here as a future reference for potential contributors.]
CutMix - Paper - Cited by ~ 865
MixUp - Paper - Cited by ~ 2675
Currently, these augmentations can be applied to classification tasks, but since KerasCV targets general vision tasks, supporting tasks like object detection may require adding a utility that accepts a bbox_params argument too.
example of mixup-object-detection - region=full-images
example of mixup-object-detection - region=random
example of cutmix-object-detection
ref: https://www.kaggle.com/ankursingh12/data-augmentation-for-object-detection
ref: https://www.kaggle.com/shonenkov/oof-evaluation-mixup-efficientdet
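To sketch what the region=full-images variant would have to return once boxes are involved, here is a minimal NumPy version: pixels are blended, boxes from both images are kept, and each box carries its mixing weight for the loss. All names and the weighting scheme are assumptions for illustration, not a settled bbox_params design.

```python
import numpy as np

def mixup_detection(img_a, boxes_a, img_b, boxes_b, alpha=0.2, rng=None):
    """Full-image mixup for detection: blend pixels, keep the union of boxes."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    image = lam * img_a + (1 - lam) * img_b
    # Boxes are not blended; both sets survive, each weighted by its image's lam.
    boxes = np.concatenate([boxes_a, boxes_b], axis=0)
    weights = np.concatenate([np.full(len(boxes_a), lam),
                              np.full(len(boxes_b), 1 - lam)])
    return image, boxes, weights
```

The region=random and CutMix variants additionally need to clip or drop boxes that fall inside the pasted patch, which is exactly the bookkeeping a bbox_params-aware utility would centralize.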
If you open a GitHub issue, here is our policy:
It must be a bug, a feature request, or a significant problem with the documentation (for small docs fixes please send a PR instead).
The form below must be filled out.
Here's why we have that policy:
Keras developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.
System information.
TensorFlow version (you are using): 2.6
Are you willing to contribute it (Yes/No): Yes
Describe the feature and the current behavior/state.
Describe the feature clearly here. Be sure to convey here why the requested feature is needed. Any brief description of the use-case would help.
Paper: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Original Code: https://github.com/microsoft/Swin-Transformer?utm_source=catalyzex.com
It's a variant of the transformer model and achieves state-of-the-art or comparable performance with the best CNN-based models. It also has enough citations (~250 at this moment) to justify addition to the package.
On ImageNet-1K and 22K, below are the comparable results with EfficientNet (CNN) models.

| Model | Img Size | Top-1 acc (1K) | Model | Img Size | Top-1 acc (1K) | Top-1 acc (22K) |
|---|---|---|---|---|---|---|
| E3 | 300 | 81.6 | EfficientNetV2-S | - | 83.9 | 84.9 |
| E5 | 456 | 83.6 | EfficientNetV2-M | - | 85.1 | 86.2 |
| E7 | 600 | 84.3 | EfficientNetV2-L | - | 85.7 | 86.8 |
| - | - | - | EfficientNetV2-XL | - | - | 87.3 |
| Swin-T | 224 | 81.3 | Swin-B | 224 | - | 85.2 |
| Swin-S | 224 | 83.0 | Swin-B | 384 | - | 86.4 |
| Swin-B | 224 | 83.5 | Swin-L | 384 | - | 87.3 |
| Swin-B | 384 | 84.5 | - | - | - | - |
Will this change the current API? How?
Yes. It will change as follows:
tensorflow.keras.applications.SwinT
tensorflow.keras.applications.SwinS
tensorflow.keras.applications.SwinB
tensorflow.keras.applications.SwinL
Who will benefit from this feature?
Keras users.
https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives
Two papers to reference:
Best resnet for Cifar10/Cifar100
Paper: GridMask Data Augmentation
Citation: ~70
Code: https://github.com/google/automl/blob/master/efficientdet/aug/gridmask.py
Demo:
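The essence of GridMask is easy to sketch. The following NumPy toy version zeroes out a regular grid of squares; the real augmentation (see the linked code) also randomizes the grid offset and rotation, and the parameter names here are mine:

```python
import numpy as np

def gridmask(image, d=8, drop_ratio=0.5):
    """Minimal GridMask sketch: drop an (l x l) square in every (d x d) cell."""
    h, w = image.shape[:2]
    l = int(d * drop_ratio)  # side of the dropped square within each grid cell
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    drop = ((yy % d) < l) & ((xx % d) < l)  # True where pixels are erased
    return image * (~drop)[..., None]
```

With drop_ratio=0.5, each cell loses a quarter of its area, giving a structured alternative to Cutout's single random hole.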
From a PR review:
Do we want to have a paper citation threshold or any other evaluation metric?
We want an adoption metric of some kind, and a citation threshold (~50) is a convenient metric.
What about the component maintainership policy? Can we scale up community maintainership/ownership of the components?
We'll try. Someone who contributes a component will be called to fix issues with it if any arise.
I also think that we should not just accumulate components over time; how are we going to evaluate and handle deprecations?
Deprecation decisions are always the result of a cost/benefit analysis. The picture varies from component to component and over the lifetime of the repo, so it's handled on a case-by-case basis.
I suppose that we could partially use the README.md for this and partially the CONTRIBUTING.md.
Yes, this is mostly information that should go in the contributors' guide.
Sayak is working on this
Current error when running in mirrored mode:
ValueError: SyncOnReadVariable does not support assign_add in cross-replica context when aggregation is set to tf.VariableAggregation.SUM
If we can compile to XLA we can also run on TPUs.
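Checking XLA compatibility can start small: wrap a piece of the metric computation in tf.function with jit_compile=True and run it. The function below is a toy stand-in (vectorized box areas), not the actual metric code, and assumes that part of the computation is expressible in XLA-supported ops:

```python
import tensorflow as tf

@tf.function(jit_compile=True)  # request XLA compilation of this function
def box_areas(boxes):
    # Toy stand-in for part of the metric: areas of [x1, y1, x2, y2] boxes.
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
```

If the real update_state body compiles the same way, the TPU path comes along for free; ops like tf.TensorArray are a common blocker, which is another reason to remove them.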
We don’t want people depending on, for example, fill_utils; it is intended for internal use only.
Note: Owen is already working on this.
Image Augment layer using SLIC
https://ieeexplore.ieee.org/document/6205760 cited by 7880
Implementation in skimage https://github.com/scikit-image/scikit-image/blob/v0.19.0/skimage/segmentation/slic_superpixels.py#L110-L385
Sharpen images by using an unsharp mask, or something better that I am unaware of.
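For reference, the unsharp mask is just image + amount * (image - blur(image)). Here is a plain-NumPy sketch with a separable Gaussian blur; a real layer would use a depthwise conv and handle borders more carefully, and all parameter names are my own:

```python
import numpy as np

def unsharp_mask(image, sigma=1.0, amount=1.0, radius=2):
    """Sharpen an (H, W, C) image: image + amount * (image - gaussian_blur(image))."""
    # Build a normalized 1-D Gaussian kernel.
    ax = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    # Separable blur: convolve along height, then width (zero-padded borders).
    blurred = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, image)
    blurred = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, blurred)
    return image + amount * (image - blurred)
```

The `amount` parameter controls sharpening strength, which is the obvious knob to randomize in an augmentation layer.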
I'm pretty sure we can remove this and rely on keras.losses.CategoricalCrossentropy's label_smoothing argument. I don't think doing it earlier in the pipeline makes any numerical difference.
Mosaic Augmentation
What is it?
The idea is to take 4 random samples and create a single sample of mosaic fashion with them. For example:
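The 2x2 stitching can be sketched in a few lines of NumPy. This fixed-midpoint version is a simplification: the real augmentation samples a random center point and crops/pads each quadrant, and for detection it must also shift the bounding boxes into mosaic coordinates.

```python
import numpy as np

def mosaic(images, rng=None):
    """Stitch 4 equally sized (H, W, C) images into one 2x2 mosaic, in random order."""
    rng = np.random.default_rng() if rng is None else rng
    a, b, c, d = (images[i] for i in rng.permutation(4))
    top = np.concatenate([a, b], axis=1)       # left | right
    bottom = np.concatenate([c, d], axis=1)
    return np.concatenate([top, bottom], axis=0)  # top over bottom
```

The output is twice the input size in each dimension; implementations typically resize or random-crop back to the training resolution afterwards.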