Comments (20)
It is all done there:
https://github.com/albumentations-team/albumentations/blob/master/albumentations/core/composition.py
...
from torchio.
All contributions are welcome!
One of the main issues related to this is that the random parameters generated by each RandomTransform are stored in the sample. I thought this was nice for traceability. Adding these probabilistic compositions is nice, but if two samples don't share all their keys, it won't be possible to collate them into a batch with the default PyTorch DataLoader collate_fn. This is why I add the random parameters to a sample even if the transform is not applied, using the do_it variable, as shown here:
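The referenced snippet is not reproduced above; a minimal sketch of the do_it idea might look like this (class and key names are hypothetical, not the torchio implementation):

```python
import torch

class RandomNoiseSketch:
    """Hypothetical transform sketch: random parameters are written to
    the sample even when the transform is skipped, so every sample keeps
    the same keys and the default collate_fn can still batch them."""

    def __init__(self, p=0.5, std_range=(0.0, 0.25)):
        self.p = p
        self.std_range = std_range

    def __call__(self, sample):
        do_it = torch.rand(1).item() < self.p
        std = float(torch.empty(1).uniform_(*self.std_range)) if do_it else 0.0
        # Stored unconditionally, with default values when skipped
        sample['random_noise_do_it'] = do_it
        sample['random_noise_std'] = std
        if do_it:
            sample['image'] = sample['image'] + torch.randn_like(sample['image']) * std
        return sample
```

Whether applied or not, every output sample carries the same two extra keys.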
One solution is to force the user to define a custom collate function as in this example:
torchio/examples/example_heteromodal.py
Lines 57 to 64 in 224dbed
but this seems like unnecessary work and slightly moves away from standard PyTorch practice.
To keep the default collate function, the best solution I can think of is to add the random parameters only if a flag has been enabled. In that case, samples may end up with different keys and the user will need to use a custom collate function. By default, these entries won't be added and the samples can easily be composed into a batch, even if they have followed different paths through the transforms pipeline generated by these composing operations.
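For samples that do end up with different keys, a custom collate function could batch only the keys shared by every sample and gather the rest as per-sample metadata. A sketch under that assumption (the function name and the 'extras' key are invented for illustration):

```python
import torch

def collate_keep_common(samples):
    """Hypothetical collate_fn: stack tensors for keys present in every
    sample, and gather the remaining per-sample entries into dicts."""
    common = set(samples[0]).intersection(*samples[1:])
    batch = {}
    for key in common:
        values = [s[key] for s in samples]
        if isinstance(values[0], torch.Tensor):
            batch[key] = torch.stack(values)
        else:
            batch[key] = values
    # Keys only some samples have (e.g. parameters of a transform that
    # was not applied everywhere) are collected separately, unstacked
    batch['extras'] = [
        {k: v for k, v in s.items() if k not in common} for s in samples
    ]
    return batch
```

It would be passed to the DataLoader as `collate_fn=collate_keep_common`.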
Yes, I discovered this problem playing with https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L98
The problem is not related to the composition; it happens with every transform that has a probability of being applied or not.
My way to handle it is to always define all the parameters you want to add in the dictionary, with default values (the only difficulty is that they should have the same type and the same shape), and the parameter sample[image_name]['random_bias_do_augmentation'] will tell you whether the transform has been applied or not.
For instance, here is the default value of the parameter I want to keep track of:
https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L93
It will be defined a few lines later, if the transformation is applied:
https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L114
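The "same type, same shape" constraint is what makes this work with the default collate function. A sketch of the idea (function and key names are hypothetical, not the linked implementation):

```python
import torch
from torch.utils.data.dataloader import default_collate

def apply_maybe_motion(sample, p=0.5, n_timepoints=10):
    """Hypothetical sketch: a tensor-valued parameter stored with a
    fixed-shape default, overwritten only if the transform is applied,
    so mixed batches still collate with the default collate_fn."""
    fitpars = torch.zeros(6, n_timepoints)  # default: no motion
    do_it = torch.rand(1).item() < p
    if do_it:
        fitpars = torch.randn(6, n_timepoints)  # same type and shape
    sample['motion_fitpars'] = fitpars
    sample['motion_do_it'] = do_it
    return sample

# A mixed batch of applied / not-applied samples collates fine:
samples = [apply_maybe_motion({'image': torch.zeros(1, 4, 4)}) for _ in range(4)]
batch = default_collate(samples)
```

Because every sample stores a (6, n_timepoints) tensor regardless, `default_collate` can stack them into a single batched tensor.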
Yes, that's the same strategy I showed in my comment. But I think that polluting all samples with irrelevant information adds unnecessary complexity to the dictionaries. It's like adding an entry with "the parameters that would have been used, but never were".
These are two separate things:
1. The choice of the parameters to add to the dictionaries
2. The fact that they should be added with default values even if the transform is not applied
2 is necessary to have a mixed batch of applied and not-applied transforms, and 1 is indeed very dependent on the task in mind.
Personally, I like it to be as complete as possible, to be able to check what has been performed.
What happens if you have a list of 10 transforms and you compose them with OneOf? Do you still need to go through the other 9 and add some parameters to the sample?
Sure!
I see your point: 9 different parameter dicts are empty...
But in the end you want to retrieve them (in order to add conditions based on those parameters). To do so you need a complete dictionary. So yes, it may become hard to handle if parameters with the same name can have variable shapes.
Another difficult question is the exact structure to store them in. For instance, the order of the composition matters. How do you retrieve it?
"For instance, the order of the composition matters"
You're right, this is also an issue.
Something I've also thought of is to store the transforms information in the sample as plain text, adding a line every time a transform is applied. But this wouldn't be very readable, and the text would need to be parsed. Also, currently, some parameters are saved on a per-image basis and others (spatial transforms) are saved for the whole sample.
The text approach would be like a "history" of operations applied to the sample, a bit like the autograd history in PyTorch tensors.
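An ordered history can also be kept as structured entries instead of plain text, which preserves the application order without requiring parsing. A minimal sketch of this idea (the class and method names are invented, not torchio's API):

```python
class SampleWithHistory(dict):
    """Hypothetical sketch: a dict-like sample that records every
    applied transform as an ordered (name, params) entry."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.history = []

    def record(self, transform_name, params):
        # Appending preserves the order in which transforms were applied
        self.history.append((transform_name, params))
```

A transform would call `sample.record('RandomNoise', {'std': 0.1})` after applying itself, and the full pipeline can be reconstructed by iterating over `sample.history`.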
The last couple of commits will probably make retrieving random transform parameters difficult. I'm planning to create a Sample class that can be automatically collated by the DataLoader and stores the random parameters information.
@romainVala You can retrieve the parameters for random transforms like this:
torchio/examples/example_motion.py
Lines 17 to 31 in 6741adf
Yes, why not...
What happens when you are within a DataLoader? Where is the batch dimension?
Now transforms return instances of Subject, which inherits from dict. The random parameters are now stored in the history attribute of Subject. The batch is collated with the default collate function in the data loader. I'm not sure I understand your question.
Ok, I understand better when I can play with it, so I tried the following:

t = Compose([RandomNoise(), RandomElasticDeformation()])
dataset = ImagesDataset(suj, transform=t)
sample = dataset[0]

Then, as you wrote, I can get the parameters of each transform, since the history now has the same length as the number of composed transforms.
Great!
But now if I do

dl = DataLoader(dataset, batch_size=4)
sample_batch = next(iter(dl))

I cannot retrieve the history information:
type(sample) is torchio.data.subject.Subject
but
type(sample_batch) is dict
Maybe I'm missing something?
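The behaviour above can be reproduced without torchio: the default collate function rebuilds the mapping entry by entry, so a Python attribute such as history does not survive batching. A minimal sketch, with a hypothetical stand-in for the real Subject class:

```python
import torch
from torch.utils.data.dataloader import default_collate

class Subject(dict):
    """Minimal stand-in for a dict subclass that carries extra state
    in a Python attribute (not the actual torchio class)."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.history = []

subjects = [Subject(image=torch.zeros(1, 2, 2)) for _ in range(4)]
subjects[0].history.append(('RandomNoise', {'std': 0.1}))

# default_collate rebuilds the mapping key by key, so anything stored
# outside the dict entries (like .history) is dropped from the batch
batch = default_collate(subjects)
```

The tensor entries are batched correctly, but the recorded history is gone, which is why the thread below resorts to a custom collate_fn.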
You're right. I suppose that if you want to trace the random parameters, you'll need to collate the batch yourself:

import torch
import torchio
from torch.utils.data import DataLoader

dl = DataLoader(
    dataset,
    batch_size=4,
    collate_fn=lambda x: x,  # this creates a list of Subjects
)
samples = next(iter(dl))
inputs = torch.cat([sample['image'][torchio.DATA] for sample in samples])
histories = [sample.history for sample in samples]
Ok, I see.
Thanks for the code example, I now understand better what the collate_fn does.
Well, I preferred the previous version, where the extra parameters were added to the subject dict; in that case the batch concatenation was done automatically (for the input data and for the parameters). I don't understand what was wrong with it, but I can deal with the new version and do the torch.cat myself...
The problem with that was related to this issue; I mention it in the comments here. It adds unnecessary complexity to the code, the order in which transforms have been applied is lost, etc. I think the amount of work needed to trace the parameters (torch.cat) is tiny compared to the trouble that including them in the dictionaries would cause to the library.
I can add an option to save the parameters as strings in the dictionary, but the user would need to parse them somehow.
ok thanks