barisozmen / deepaugment
Discover augmentation strategies tailored for your dataset
License: MIT License
I'm experimenting with wrn-16-8 (WideResNet) in this repo. During training, the loss suddenly turned into NaN. I suspect it's a numerical-stability problem.
Black formats code in PEP 8 style:
Black ignores previous formatting and applies uniform horizontal and vertical whitespace to your code. The rules for horizontal whitespace can be summarized as: do whatever makes pycodestyle happy. The coding style used by Black can be viewed as a strict subset of PEP 8.
Hi, thank you for sharing this code!
I have been training on my own data with the following code:
cnn_config = {
    "model": model,
    "child_batch_size": 32,
    "child_epochs": 50,
}
deepaug = DeepAugment(images=x_train, labels=y_train, config=cnn_config)
best_policies = deepaug.optimize(100)
The column names in the CSV file are different from the column names you display in "notebooks/result-analyses/*".
My column names are:
acc | loss | val_acc | val_loss | trial_no | A_aug1_type | A_aug1_magnitude | A_aug2_type | A_aug2_magnitude | B_aug1_type | B_aug1_magnitude | B_aug2_type | B_aug2_magnitude | C_aug1_type | C_aug1_magnitude | C_aug2_type | C_aug2_magnitude | D_aug1_type | D_aug1_magnitude | D_aug2_type | D_aug2_magnitude | E_aug1_type | E_aug1_magnitude | E_aug2_type | E_aug2_magnitude | sample_no | mean_late_val_acc | epoch
while yours are:
acc | loss | val_acc | val_loss | trial_no | aug1_type | aug1_magnitude | aug2_type | aug2_magnitude | aug3_type | aug3_magnitude | portion | sample_no | mean_late_val_acc
Do I have to run additional code to get the same file as yours?
Explore raw data and report it here:
/notebooks/explore-raw-data.ipynb
Do an explorative analysis in a Jupyter notebook and put it in /notebooks/explore-raw-data
The notebook should cover the following:
Explore the raw training data from DOTA, and report the following:
Show some image samples
Write an overall summary of the explorative analysis, and add the necessary information from the DOTA paper to it.
Hello my friend,
Which version of TensorFlow is needed?
(For GPU support)
Which Python version works best?
pip install doesn't work
Hi @barisozmen
How can one use deepaugment for encoder-decoder network architectures like semantic and instance segmentation?
Thanks and Regards,
Deeksha.
Research Scholar (Data Science)
IIIT, Bangalore, India.
An SSD model pre-trained by the DOTA team would be a good start. It can be downloaded by following the links here (github.com/ringringyi/DOTA_models#training)
It's unclear to me whether this project only produces the augmented images or whether it also does the training and outputs the final score. Is it possible to use this project like Keras's ImageDataGenerator?
Using Dropout in the child model works well for preventing overfitting, but it also makes the final model performance vary significantly between training runs with the same hyperparameters. This randomness means we need more sampling runs to estimate the final performance for a given set of hyperparameters, which is very time-consuming. Any ideas for solving this problem?
They are listed here:
'''
import numpy as np
import PIL.Image
import PIL.ImageOps
import PIL.ImageEnhance
import PIL.ImageDraw


def ShearX(img, v):  # [-0.3, 0.3]
    return img.transform(img.size, PIL.Image.AFFINE, (1, v, 0, 0, 1, 0))

def ShearY(img, v):  # [-0.3, 0.3]
    return img.transform(img.size, PIL.Image.AFFINE, (1, 0, 0, v, 1, 0))

def TranslateX(img, v):  # [-150, 150] => percentage: [-0.45, 0.45]
    v = v * img.size[0]
    return img.transform(img.size, PIL.Image.AFFINE, (1, 0, v, 0, 1, 0))

def TranslateY(img, v):  # [-150, 150] => percentage: [-0.45, 0.45]
    v = v * img.size[1]
    return img.transform(img.size, PIL.Image.AFFINE, (1, 0, 0, 0, 1, v))

def Rotate(img, v):  # [-30, 30]
    return img.rotate(v)

def AutoContrast(img, _):
    return PIL.ImageOps.autocontrast(img)

def Invert(img, _):
    return PIL.ImageOps.invert(img)

def Equalize(img, _):
    return PIL.ImageOps.equalize(img)

def Flip(img, _):  # not from the paper
    return PIL.ImageOps.mirror(img)

def Solarize(img, v):  # [0, 256]
    return PIL.ImageOps.solarize(img, v)

def Posterize(img, v):  # [4, 8]
    return PIL.ImageOps.posterize(img, int(v))

def Contrast(img, v):  # [0.1, 1.9]
    return PIL.ImageEnhance.Contrast(img).enhance(v)

def Color(img, v):  # [0.1, 1.9]
    return PIL.ImageEnhance.Color(img).enhance(v)

def Brightness(img, v):  # [0.1, 1.9]
    return PIL.ImageEnhance.Brightness(img).enhance(v)

def Sharpness(img, v):  # [0.1, 1.9]
    return PIL.ImageEnhance.Sharpness(img).enhance(v)

def Cutout(img, v):  # [0, 60] => percentage: [0, 0.2]
    w, h = img.size
    v = v * img.size[0]
    x0 = np.random.uniform(w - v)
    y0 = np.random.uniform(h - v)
    xy = (x0, y0, x0 + v, y0 + v)
    color = (127, 127, 127)
    img = img.copy()
    PIL.ImageDraw.Draw(img).rectangle(xy, color)
    return img

def SamplePairing(imgs):  # [0, 0.4]
    def f(img1, v):
        i = np.random.choice(len(imgs))
        img2 = PIL.Image.fromarray(imgs[i])
        return PIL.Image.blend(img1, img2, v)
    return f
'''
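For readers without Pillow at hand, here is a numpy-only sketch of the Cutout operation listed above: paint a random square (sized as a fraction of the image width) with a constant gray fill. The function name and the (H, W, C) array layout are my own; the original operates on PIL images.

```python
import numpy as np

def cutout_np(img, v, fill=127, rng=None):
    """Numpy sketch of Cutout: overwrite a random square patch,
    with side length v * width, using a constant gray fill.
    img is an (H, W, C) uint8 array; v is a fraction in [0, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape[:2]
    size = int(v * w)  # square side, mirroring v * img.size[0] above
    x0 = int(rng.uniform(0, max(w - size, 1)))
    y0 = int(rng.uniform(0, max(h - size, 1)))
    out = img.copy()  # don't mutate the caller's image
    out[y0:y0 + size, x0:x0 + size] = fill
    return out

img = np.zeros((32, 32, 3), dtype=np.uint8)
aug = cutout_np(img, 0.2)  # a 6x6 patch of the 32x32 image becomes gray
```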
Data preprocessing should be as follows:
For pipeline v0.1, use only 20 images for the training set, 10 of which contain "planes". Use all images from the test set. The MVP targets only detecting planes.
Hi,
Could you please explain the sample script that uses the fashion_mnist dataset?
As I understand it, fashion_mnist is a grayscale dataset; how do I convert it to 3-channel images, as required?
I tried X_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 3), but it doesn't work.
The error I faced:
0, 0.1327777779367234, ['rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0]
trial: 1
['gamma-contrast', 0.8442657485810175, 'coarse-salt-pepper', 0.8472517387841256, 'brighten', 0.38438170729269994, 'translate-y', 0.056712977317443194, 'translate-y', 0.47766511732135, 'add-to-hue-and-saturation', 0.47997717237505744, 'emboss', 0.8360787635373778, 'sharpen', 0.6481718720511973, 'emboss', 0.9571551589530466, 'rotate', 0.8700872583584366]
/home/kaka/PycharmProjects/DeepAugment /venv/lib/python3.6/site-packages/imgaug/augmenters/color.py:448: UserWarning: Received an image with shape (H, W, C) and C=1 in ChangeColorspace._augment_image(). Expected C to usually be 3 -- any other value will likely result in errors. (Note that this function is e.g. called during grayscale conversion and hue/saturation changes.)
"changes.)" % (image.shape[2],)
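The reshape above can't produce a 3-channel image, because reshape never changes the number of array elements: a (N, 28, 28) array has no room to become (N, 28, 28, 3). The usual fix is to add a channel axis and repeat it. A numpy sketch (the array here is a zero-filled stand-in for the real fashion_mnist data, which has shape (60000, 28, 28)):

```python
import numpy as np

# Stand-in for fashion_mnist's x_train, which is (60000, 28, 28) uint8.
x_train = np.zeros((100, 28, 28), dtype=np.uint8)

# Add a trailing channel axis, then repeat the single channel 3 times.
x_rgb = np.repeat(x_train[..., np.newaxis], 3, axis=-1)  # (100, 28, 28, 3)
```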
Is it possible to simply add a ChildCNN for regression, using MSE instead of accuracy? Or will that also require changes to the Controller?
Many resources are referenced here (https://www.manifold.ai/blog/exploration-exploitation-reinforcement-learning)
UC Berkeley RL course (http://rail.eecs.berkeley.edu/deeprlcourse/)
A 30x-250x more efficient method to find augmentation policies automatically, relative to AutoAugment by Google.
Arxiv : https://arxiv.org/abs/1905.00397
Code : https://github.com/KakaoBrain/fast-autoaugment
We don't train child networks like AutoAugment or deepaugment do, and that is the key reason for the speed. But I really appreciate your work and I hope we can influence each other in a good way. I also want to make my repo easy to use, like yours.
From what I can see, deepaugment doesn't support large datasets (where I cannot load all images and labels at once and have to use a data generator). If I'm missing something, can you show how to use this repo with a custom dataset that requires a data generator?
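Since DeepAugment's constructor takes in-memory image and label arrays, one workaround (a sketch under that assumption; the helper name is my own, not part of the library) is to optimize policies on a random subsample small enough to fit in memory, then apply the found policies to the full dataset in your own pipeline:

```python
import numpy as np

def memory_sized_sample(x, y, n, seed=0):
    """Draw a random subsample of n examples without replacement,
    so it can be passed to DeepAugment as plain arrays.
    Hypothetical helper, not part of deepaugment itself."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(x), size=n, replace=False)
    return x[idx], y[idx]

# Stand-ins for a dataset too large to augment-search in full:
x = np.zeros((1000, 32, 32, 3), dtype=np.uint8)
y = np.zeros(1000, dtype=np.int64)
x_small, y_small = memory_sized_sample(x, y, 200)
```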
I wrote a simple script like this:
import os
from deepaugment import DeepAugment
from keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
deepaug = DeepAugment(x_train, y_train)
best_policies = deepaug.optimize(300)
After running it for about a minute, I got an AssertionError:
...
...
Epoch 48/50
- 1s - loss: 0.2101 - acc: 0.9382 - val_loss: 2.6119 - val_acc: 0.5540
Epoch 49/50
- 1s - loss: 0.2101 - acc: 0.9347 - val_loss: 2.7725 - val_acc: 0.5430
Epoch 50/50
- 1s - loss: 0.2075 - acc: 0.9388 - val_loss: 1.9880 - val_acc: 0.5510
fit()'s runtime: 55.3912 sec.
0, 0.567111111190584, ['rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0, 'rotate', 0.0]
('trial:', 1, '\n', ['gamma-contrast', 0.8442657485810175, 'coarse-salt-pepper', 0.8472517387841256, 'brighten', 0.38438170729269994, 'translate-y', 0.056712977317443194, 'translate-y', 0.47766511732135, 'add-to-hue-and-saturation', 0.47997717237505744, 'emboss', 0.8360787635373778, 'sharpen', 0.6481718720511973, 'emboss', 0.9571551589530466, 'rotate', 0.8700872583584366])
Traceback (most recent call last):
File "test.py", line 29, in <module>
best_policies = deepaug.optimize(300)
File "/data/ansheng/cv_strategy/autoML/deep_augment/deepaugment-master/deepaugment/deepaugment.py", line 151, in optimize
f_val = self.objective_func.evaluate(trial_no, trial_hyperparams)
File "/data/ansheng/cv_strategy/autoML/deep_augment/deepaugment-master/deepaugment/objective.py", line 44, in evaluate
self.data["X_train"], self.data["y_train"], *trial_hyperparams
File "/data/ansheng/cv_strategy/autoML/deep_augment/deepaugment-master/deepaugment/augmenter.py", line 166, in augment_by_policy
), "first transform is unvalid"
AssertionError: first transform is unvalid
The code that throws the error is:
X_portion_aug = transform(hyperparams[i], hyperparams[i+1], X_portion)  # first transform
assert (
    X_portion_aug.min() >= -0.1 and X_portion_aug.max() <= 255.1
), "first transform is unvalid"
It seems the data after augmentation is out of the range [0, 255].
Does the function augment_by_policy() in augmenter.py have a bug?
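As a workaround while the root cause is unclear, one could clip augmented pixels back into range before the assertion runs. This is a sketch of my own, not deepaugment's actual code:

```python
import numpy as np

def clip_to_pixel_range(x):
    """Clamp augmented pixel values into [0, 255] so range checks
    like the assertion quoted above don't fire on slight overshoot."""
    return np.clip(x, 0.0, 255.0)

# Example: an augmented batch that overshot the valid pixel range.
x_aug = np.array([[-3.2, 120.0], [260.5, 255.0]])
x_safe = clip_to_pixel_range(x_aug)  # in-range values are untouched
```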
Hi @barisozmen, thanks for sharing the code for deepaugment.
I would like to try this on my dataset.
Which value would you recommend monitoring?
Have you considered implementing TensorBoard/tensorboardX in the code, for easy validation of the process?
thanks!
Some suggestions:
http://josh-tobin.com/assets/pdf/troubleshooting-deep-neural-networks-01-19.pdf (Last slides are about Bayesian hyperparameter optimization)
Experiment data is at: https://github.com/barisozmen/deepaugmenter/tree/master/reports/experiments/2019-1-30_19-27
X : data as it is
X_aug: augmented version of X
Current plan:
Do an initial training of the child model (200 epochs) with X; then, using the trained weights:
Make an experiment for options 1 and 2, and see which one is better.
https://pypi.org/project/deepaugment/
In the Advanced usage section:
deepaug = DeepAugment(iamges=x_train, labels=y_train, config=my_config)
Note the typo: 'iamges' should be 'images'. I think the correct call is:
deepaug = DeepAugment(images=x_train, labels=y_train, config=my_config)
A nice usage example of skopt: https://geekyisawesome.blogspot.com/2018/07/hyperparameter-tuning-using-scikit.html
How to get the parrot.jpg?
I've just discovered your project and I would like to add it to my ML pipeline using TensorFlow 1.13.1, but my project's requirements don't match yours. Is it planned to support a newer version of TensorFlow?
Thanks !
When setting child_first_train_epochs (I tried with 15 and 20), the following error occurs after training 'child_first_train_epochs' number of epochs:
Traceback (most recent call last):
File "run_deepaugment.py", line 51, in <module>
deepaug = DeepAugment(images=x_train, labels=y_train.reshape(TRAIN_SET_SIZE, 1), config=my_config)
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/deepaugment/lib/decorators.py", line 106, in wrapper
return func(*args, **kwargs)
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/deepaugment/deepaugment.py", line 120, in __init__
self._do_initial_training()
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/deepaugment/deepaugment.py", line 202, in _do_initial_training
-1, ["first", 0.0, "first", 0.0, "first", 0.0, 0.0], 1, None, history
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/deepaugment/notebook.py", line 38, in record
new_df["B_aug2_magnitude"] = trial_hyperparams[7]
IndexError: list index out of range
Here is my config used:
my_config = {
'model': 'wrn_16_2',
'train_set_size': int(TRAIN_SET_SIZE*0.75),
'child_epochs': 60,
'child_batch_size': 64,
'child_first_train_epochs': 20,
'opt_samples': 1,
}
where TRAIN_SET_SIZE is the size of a custom dataset of 3000 examples.
The code runs fine if I omit the child_first_train_epochs setting.
I have a custom dataset to be used for object detection.
Can deepaugment be applied to this dataset?
I would be very grateful if you could reply.