Comments (20)
It is all done there:
https://github.com/albumentations-team/albumentations/blob/master/albumentations/core/composition.py
...
from torchio.
All contributions are welcome!
One of the main issues related to this is that the random parameters generated by each RandomTransform are stored in the sample. I thought this was nice for traceability. Adding these probabilistic compositions is nice, but if two samples don't share all their keys, it won't be possible to collate them into a batch with the default PyTorch DataLoader collate_fn. This is why I add the random parameters to a sample even if the transform is not applied, using the do_it variable, as shown here:
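The referenced snippet is not reproduced above; a minimal sketch of the do_it idea might look like this (class and key names are hypothetical, not the torchio implementation):

```python
import torch

class RandomNoiseSketch:
    """Hypothetical transform sketch: random parameters are written to
    the sample even when the transform is skipped, so every sample keeps
    the same keys and the default collate_fn can still batch them."""

    def __init__(self, p=0.5, std_range=(0.0, 0.25)):
        self.p = p
        self.std_range = std_range

    def __call__(self, sample):
        do_it = torch.rand(1).item() < self.p
        std = float(torch.empty(1).uniform_(*self.std_range)) if do_it else 0.0
        # Stored unconditionally, with default values when skipped
        sample['random_noise_do_it'] = do_it
        sample['random_noise_std'] = std
        if do_it:
            sample['image'] = sample['image'] + torch.randn_like(sample['image']) * std
        return sample
```

Whether applied or not, every output sample carries the same two extra keys.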
One solution is to force the user to define a custom collate function as in this example:
torchio/examples/example_heteromodal.py
Lines 57 to 64 in 224dbed
but this seems like unnecessary work and slightly moves away from standard PyTorch practice.
To keep the default collate function, the best solution I can think of is to add the random parameters only if a flag has been enabled. In that case, samples may end up with different keys and the user will need to use a custom collate function. By default, these entries won't be added and the samples can easily be composed into a batch, even if they have followed different paths through the transforms pipeline generated by these composing operations.
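For samples that do end up with different keys, a custom collate function could batch only the keys shared by every sample and gather the rest as per-sample metadata. A sketch under that assumption (the function name and the 'extras' key are invented for illustration):

```python
import torch

def collate_keep_common(samples):
    """Hypothetical collate_fn: stack tensors for keys present in every
    sample, and gather the remaining per-sample entries into dicts."""
    common = set(samples[0]).intersection(*samples[1:])
    batch = {}
    for key in common:
        values = [s[key] for s in samples]
        if isinstance(values[0], torch.Tensor):
            batch[key] = torch.stack(values)
        else:
            batch[key] = values
    # Keys only some samples have (e.g. parameters of a transform that
    # was not applied everywhere) are collected separately, unstacked
    batch['extras'] = [
        {k: v for k, v in s.items() if k not in common} for s in samples
    ]
    return batch
```

It would be passed to the DataLoader as `collate_fn=collate_keep_common`.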
Yes, I discovered this problem playing with https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L98
The problem is not related to the composition; it happens with every transform that has a probability of being applied or not.
My way to handle it is to always define all the parameters you want to add in the dictionary, with default values (the only difficulty is that they should have the same type and the same shape), and the parameter sample[image_name]['random_bias_do_augmentation'] will tell you whether the transform has been applied or not.
For instance, here is the default value of the parameter I want to keep track of:
https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L93
It will be defined a few lines later, if the transformation is applied:
https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L114
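The "same type, same shape" constraint is what makes this work with the default collate function. A sketch of the idea (function and key names are hypothetical, not the linked implementation):

```python
import torch
from torch.utils.data.dataloader import default_collate

def apply_maybe_motion(sample, p=0.5, n_timepoints=10):
    """Hypothetical sketch: a tensor-valued parameter stored with a
    fixed-shape default, overwritten only if the transform is applied,
    so mixed batches still collate with the default collate_fn."""
    fitpars = torch.zeros(6, n_timepoints)  # default: no motion
    do_it = torch.rand(1).item() < p
    if do_it:
        fitpars = torch.randn(6, n_timepoints)  # same type and shape
    sample['motion_fitpars'] = fitpars
    sample['motion_do_it'] = do_it
    return sample

# A mixed batch of applied / not-applied samples collates fine:
samples = [apply_maybe_motion({'image': torch.zeros(1, 4, 4)}) for _ in range(4)]
batch = default_collate(samples)
```

Because every sample stores a (6, n_timepoints) tensor regardless, `default_collate` can stack them into a single batched tensor.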
Yes, that's the same strategy I showed in my comment. But I think that polluting all samples with irrelevant information adds unnecessary complexity to the dictionaries. It's like adding an entry with "the parameters that would have been used, but never were".
These are two separate things:
1. The choice of the parameters to add to the dictionaries
2. The fact that they should be added with default values even if the transform is not applied
2 is necessary to have a mixed batch of applied and not-applied transforms, and 1 is indeed very dependent on the task in mind.
Personally, I like it to be as complete as possible, to be able to check what has been performed.
What happens if you have a list of 10 transforms and you compose them with OneOf? Do you still need to go through the other 9 and add some parameters to the sample?
Sure!
I see your point: 9 different parameter dicts are empty...
But in the end you want to retrieve them (in order to add conditions based on those parameters). To do so you need a complete dictionary. So yes, it may become hard to handle if parameters with the same name can have variable shapes.
Another difficult question is the exact structure to store them in. For instance, the order of the composition matters. How do you retrieve it?
"For instance, the order of the composition matters"
You're right, this is also an issue.
Something I've also thought of is to store the transforms information in the sample as plain text, adding a line every time a transform is applied. But this wouldn't be very readable, and the text would need to be parsed. Also, currently, some parameters are saved on a per-image basis and others (spatial transforms) are saved for the whole sample.
The text approach would be like a "history" of operations applied to the sample, a bit like the autograd history in PyTorch tensors.
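An ordered history can also be kept as structured entries instead of plain text, which preserves the application order without requiring parsing. A minimal sketch of this idea (the class and method names are invented, not torchio's API):

```python
class SampleWithHistory(dict):
    """Hypothetical sketch: a dict-like sample that records every
    applied transform as an ordered (name, params) entry."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.history = []

    def record(self, transform_name, params):
        # Appending preserves the order in which transforms were applied
        self.history.append((transform_name, params))
```

A transform would call `sample.record('RandomNoise', {'std': 0.1})` after applying itself, and the full pipeline can be reconstructed by iterating over `sample.history`.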
The last couple of commits will probably make retrieving random transform parameters difficult. I'm planning to create a Sample class that can be automatically collated by the DataLoader and stores the random parameters information.
@romainVala You can retrieve the parameters for random transforms like this:
torchio/examples/example_motion.py
Lines 17 to 31 in 6741adf
Yes, why not...
What happens when you are within a DataLoader? Where is the batch dimension?
Now transforms return instances of Subject, which inherits from dict. The random parameters are now stored in the history attribute of Subject. The batch is collated with the default collate function in the data loader. I'm not sure I understand your question.
Ok, I understand better when I can play with it, so I tried the following:

t = Compose([RandomNoise(), RandomElasticDeformation()])
dataset = ImagesDataset(suj, transform=t)
sample = dataset[0]

Then, as you wrote, I can get the parameters of each transform, since the history now has the same length as the number of composed transforms.
Great!
But now if I do

dl = DataLoader(dataset, batch_size=4)
sample_batch = next(iter(dl))

I cannot retrieve the history information:
type(sample) is torchio.data.subject.Subject
but
type(sample_batch) is dict
Maybe I'm missing something?
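The behaviour above can be reproduced without torchio: the default collate function rebuilds the mapping entry by entry, so a Python attribute such as history does not survive batching. A minimal sketch, with a hypothetical stand-in for the real Subject class:

```python
import torch
from torch.utils.data.dataloader import default_collate

class Subject(dict):
    """Minimal stand-in for a dict subclass that carries extra state
    in a Python attribute (not the actual torchio class)."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.history = []

subjects = [Subject(image=torch.zeros(1, 2, 2)) for _ in range(4)]
subjects[0].history.append(('RandomNoise', {'std': 0.1}))

# default_collate rebuilds the mapping key by key, so anything stored
# outside the dict entries (like .history) is dropped from the batch
batch = default_collate(subjects)
```

The tensor entries are batched correctly, but the recorded history is gone, which is why the thread below resorts to a custom collate_fn.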
You're right. I suppose that if you want to trace the random parameters, you'll need to collate the batch yourself:

import torch
import torchio
from torch.utils.data import DataLoader

dl = DataLoader(
    dataset,
    batch_size=4,
    collate_fn=lambda x: x,  # this creates a list of Subjects
)
samples = next(iter(dl))
inputs = torch.cat([sample['image'][torchio.DATA] for sample in samples])
histories = [sample.history for sample in samples]
Ok, I see.
Thanks for the code example, I now understand better what the collate_fn does.
Well, I preferred the previous version, where the extra parameters were added to the subject dict; in that case the batch concatenation was done automatically (for the input data and for the parameters). I don't understand what was wrong with it, but I can deal with the new version and do the torch.cat myself...
The problem with that was related to this issue; I mention it in the comments here. It adds unnecessary complexity to the code, the order in which transforms have been applied is lost, etc. I think the amount of work needed to trace the parameters (torch.cat) is tiny compared to the trouble that including them in the dictionaries would cause to the library.
I can add an option to save the parameters as strings in the dictionary, but the user would need to parse them somehow.
ok thanks