
Comments (20)

romainVala avatar romainVala commented on July 4, 2024

It's all done there:
https://github.com/albumentations-team/albumentations/blob/master/albumentations/core/composition.py

...

from torchio.

fepegar avatar fepegar commented on July 4, 2024

All contributions are welcome!


fepegar avatar fepegar commented on July 4, 2024

One of the main issues related to this is that the random parameters generated by the RandomTransform are stored in the sample. I thought this was nice for traceability. Adding these probabilistic compositions is nice, but if two samples don't share all their keys, it won't be possible to collate them into a batch with the default PyTorch DataLoader collate_fn. This is why I add the random parameters to a sample even if the transform is not applied, with the do_it variable, as shown here:

do_augmentation, coefficients = self.get_params(
    self.order,
    self.coefficients_range,
    self.proportion_to_augment,
)
sample[image_name]['random_bias_coefficients'] = coefficients
sample[image_name]['random_bias_do_augmentation'] = do_augmentation
if not do_augmentation:
    continue

One solution is to force the user to define a custom collate function as in this example:

# This collate_fn is needed in the case of missing modalities.
# In this case, the batch will be composed of a *list* of samples instead of
# the typical Python dictionary that is collated by default in PyTorch.
batch_loader = DataLoader(
    queue_dataset,
    batch_size=batch_size,
    collate_fn=lambda x: x,
)

but this seems like unnecessary work and slightly moves away from standard PyTorch practice.

In order to keep the default collate function, the best solution I can think of is to add the random parameters only if a flag has been enabled for it. In that case, samples may have different keys and the user will need to use a custom collate function. By default, these entries won't be added and the samples will be easily composed into a batch, even if they have followed different paths in the transforms pipeline generated by these composing operations.
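A minimal sketch of that flag-based approach (the class, parameter names, and recording logic below are hypothetical illustrations, not torchio's actual implementation):

```python
import random

class RandomBiasSketch:
    """Hypothetical transform: records its random parameters in the
    sample dict only when record_parameters is enabled."""

    def __init__(self, proportion_to_augment=0.5, record_parameters=False):
        self.proportion_to_augment = proportion_to_augment
        self.record_parameters = record_parameters

    def __call__(self, sample):
        do_augmentation = random.random() < self.proportion_to_augment
        coefficients = [random.uniform(-0.5, 0.5) for _ in range(3)]
        if self.record_parameters:
            # Only now do samples acquire extra keys, so the user knows
            # a custom collate_fn is needed for batching.
            sample['random_bias_coefficients'] = coefficients
            sample['random_bias_do_augmentation'] = do_augmentation
        if do_augmentation:
            pass  # the bias field would be applied here
        return sample
```

With the flag off, every sample keeps the same keys and the default collate function works unchanged.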


romainVala avatar romainVala commented on July 4, 2024

Yes, I discovered this problem while playing with https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L98

The problem is not related to composition; it happens with every transform that has a probability of being applied or not.
My way to handle it is to always define all the parameters you want to add in the dictionary, with default values (the only difficulty is that they must have the same type and the same shape),
and the parameter sample[image_name]['random_bias_do_augmentation'] will tell you whether the transform has been applied or not.
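The default-value strategy can be sketched with a minimal stand-in for PyTorch's default collate behavior (the collate helper and sample keys below are illustrative, not torchio's API):

```python
def collate(samples):
    # Minimal stand-in for PyTorch's default collate_fn on dicts:
    # every sample must expose the same keys, and values are batched.
    keys = samples[0].keys()
    for s in samples:
        if s.keys() != keys:
            raise KeyError('samples have mismatched keys')
    return {k: [s[k] for s in samples] for k in keys}

# Samples where the transform was skipped still carry the keys,
# filled with defaults of the same type and shape:
applied = {'data': [1, 2], 'bias_coefficients': [0.1, 0.2, 0.3], 'do_augmentation': True}
skipped = {'data': [3, 4], 'bias_coefficients': [0.0, 0.0, 0.0], 'do_augmentation': False}
batch = collate([applied, skipped])
```

If the skipped sample omitted its keys, or gave them a different shape, the collation would fail; the do_augmentation flag is what distinguishes the two cases afterwards.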


romainVala avatar romainVala commented on July 4, 2024

For instance, here is the default value of the parameter I want to keep track of:
https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L93

It will be defined a few lines later,
https://github.com/romainVala/torchio/blob/038540ab94e6ea6d75fb288e8c6a2a641ed5396a/torchio/transforms/augmentation/intensity/random_motion_from_time_course.py#L114
if the transformation is applied.


fepegar avatar fepegar commented on July 4, 2024

Yes, that's the same strategy I showed in my comment. But I think that polluting all samples with irrelevant information adds unnecessary complexity to the dictionaries. It's like adding an entry with "the parameters that would have been used but never were".


romainVala avatar romainVala commented on July 4, 2024

These are two separate things:
1. the choice of which parameters to add to the dictionaries;
2. the fact that they should be added with default values even if the transform is not applied.

Point 2 is necessary to have a mixed batch of applied and non-applied transforms,
and point 1 is indeed very dependent on the task in mind.

Personally, I like it to be as complete as possible, to be able to check what has been performed.


fepegar avatar fepegar commented on July 4, 2024

What happens if you have a list of 10 transforms and you compose them with OneOf? Do you still need to go through the other 9 and add some parameters to the sample?


romainVala avatar romainVala commented on July 4, 2024

Sure!
I see your point: nine different parameter dicts are empty...
but in the end you want to retrieve them (in order to add conditions based on those parameters). To do so you need a complete dictionary. So yes.

It may become hard to handle if parameters with the same name can have variable shapes.

Another difficult question is about the exact structure to store them in. For instance, the order of the composition matters: how do you retrieve it?


fepegar avatar fepegar commented on July 4, 2024

"For instance, the order of the composition matters"

You're right, this is also an issue.

Something I've also thought of is storing the transforms information in the sample as plain text, adding a line every time a transform is applied. But this wouldn't be very readable, and the text would need to be parsed. Also, currently, some parameters are saved on a per-image basis and others (spatial transforms) are saved for the whole sample.


fepegar avatar fepegar commented on July 4, 2024

The text approach would be like a "history" of operations applied to the sample, a bit like the autograd history in PyTorch tensors.
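One way such a plain-text history could work is to append one serialized line per applied transform, which preserves the order of application at the cost of a parsing step. The record helper below is purely illustrative, not torchio code:

```python
import json

def record(sample, transform_name, params):
    # Hypothetical helper: append one JSON line per applied transform,
    # preserving the order in which transforms were applied.
    sample.setdefault('history', []).append(
        json.dumps({'transform': transform_name, 'params': params})
    )
    return sample

sample = {'data': [1, 2, 3]}
record(sample, 'RandomNoise', {'std': 0.1})
record(sample, 'RandomBiasField', {'coefficients': [0.1, 0.2]})

# The lines must be parsed back to recover the parameters:
entries = [json.loads(line) for line in sample['history']]
```

Since strings collate into lists without shape constraints, this sidesteps the same-type/same-shape problem, but the consumer must parse the history to use it.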


fepegar avatar fepegar commented on July 4, 2024

The last couple of commits will probably make retrieving random transform parameters difficult. I'm planning to create a Sample class that can be automatically collated by the DataLoader and stores the random parameters.


fepegar avatar fepegar commented on July 4, 2024

@romainVala You can retrieve the parameters for random transforms like this:

from pprint import pprint
from torchio import ImagesDataset, transforms

dataset = ImagesDataset(subjects_list)
sample = dataset[0]
transform = transforms.RandomMotion(
    seed=42,
    degrees=10,
    translation=10,
    num_transforms=3,
)
transformed = transform(sample)
_, random_parameters = transformed.history[0]
pprint(random_parameters['t1']['times'])
pprint(random_parameters['t1']['degrees'])
pprint(random_parameters['t1']['translation'])


romainVala avatar romainVala commented on July 4, 2024

Yes, why not...
What happens when you are within a DataLoader? Where is the batch dimension?


fepegar avatar fepegar commented on July 4, 2024

Now transforms return instances of Subject, which inherits from dict. The random parameters are now stored in the history attribute of Subject. The batch is collated with the default collate function in the data loader. I'm not sure I understand your question.
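Subject's relationship to dict, and why its history does not survive default collation, can be sketched as follows (a minimal stand-in, not torchio's actual implementation):

```python
class SubjectSketch(dict):
    """Minimal stand-in for torchio's Subject: a dict subclass that
    carries its applied-transform parameters in a history attribute."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.history = []

subject = SubjectSketch(t1=[1.0, 2.0, 3.0])
subject.history.append(('RandomNoise', {'std': 0.05}))

# A dict-based collate only sees the mapping interface, so plain
# attributes such as history are lost when samples become a batch:
batch = {key: [subject[key]] for key in subject}
```

This is why the sample obtained from the dataset exposes history, while the collated batch is a plain dict without it.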


romainVala avatar romainVala commented on July 4, 2024

OK, I understand better when I can play with it, so I tried the following:

t = Compose([RandomNoise(), RandomElasticDeformation()])
dataset = ImagesDataset(suj, transform=t)
sample = dataset[0]

Then, as you wrote, I can get the parameters of each transform, since the history now has the same length as the number of composed transforms.
Great!

But now if I do

dl = DataLoader(dataset, batch_size=4)
sample_batch = next(iter(dl))

I cannot retrieve the history information:
type(sample) is torchio.data.subject.Subject
but
type(sample_batch) is dict

Maybe I am missing something?


fepegar avatar fepegar commented on July 4, 2024

You're right. I suppose that if you want to trace the random parameters, you'll need to collate the batch yourself:

import torch
import torchio
from torch.utils.data import DataLoader

dl = DataLoader(
    dataset,
    batch_size=4,
    collate_fn=lambda x: x,  # this creates a list of Subjects
)
samples = next(iter(dl))
inputs = torch.cat([sample['image'][torchio.DATA] for sample in samples])
histories = [sample.history for sample in samples]


romainVala avatar romainVala commented on July 4, 2024

OK, I see.
Thanks for the code example, I now better understand what the collate_fn does.

Well, I preferred the previous version, where the extra parameters were added to the subject dict. In that case the batch concatenation was done automatically (for the input data and for the parameters). I don't understand what was wrong with it,

but I can deal with the new version and do the torch.cat myself...


fepegar avatar fepegar commented on July 4, 2024

The problem with that was related to this issue; I mentioned it in my comments here. It adds unnecessary complexity to the code, the order in which the transforms have been applied is lost, etc. I think the amount of work needed to trace the parameters (torch.cat) is tiny compared to the trouble that including them in the dictionaries would cause the library.

I can add an option to save the parameters as strings in the dictionary, but the user would need to parse them somehow.


romainVala avatar romainVala commented on July 4, 2024

ok thanks

