
bloodaxe / pytorch-toolbelt

PyTorch extensions for fast R&D prototyping and Kaggle farming

License: MIT License

Makefile 0.05% Python 99.95%
pytorch kaggle image-classification image-segmentation deep-learning segmentation python image-processing machine-learning focal-loss jaccard-loss tta test-time-augmentation augmentation object-detection pipeline

pytorch-toolbelt's Issues

SoftCrossEntropyLoss error

When I use SoftCrossEntropyLoss, I get this error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Could anyone help me? BTW, what paper proposed the SoftCrossEntropyLoss?

How to Implement TTA For binary segmentation

Would anyone be kind enough to share code showing how to use TTA for binary segmentation with this library?
I have my trained model weights but can't figure out how to use pytorch-toolbelt for this.

Thank you.
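Not an official answer, but going by the tta helpers shown in the project README, a minimal sketch for binary segmentation might look roughly like the following. The model, weights path and threshold are placeholders, not something prescribed by the library:

import torch
from pytorch_toolbelt.inference import tta

# Assumption: `model` is your trained binary segmentation network returning raw
# logits of shape (N, 1, H, W), and `image_batch` is a float tensor (N, 3, H, W).
model.load_state_dict(torch.load("weights.pth", map_location="cuda"))  # hypothetical path
model = model.cuda().eval()

with torch.no_grad():
    # Average logits over the D4 group (flips + 90-degree rotations), as in the README example.
    logits = tta.d4_image2mask(model, image_batch.cuda())
    mask = (logits.sigmoid() > 0.5).long()  # binarize with a 0.5 threshold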

integrate_batch throws error: RuntimeError: The size of tensor a (6) must match the size of tensor b (928) ...

Hi, I'm trying to use your tiling tools with my yolov5 model, but on the following line I get this error:

self.image[:, y : y + tile_height, x : x + tile_width] += tile * self.weight

RuntimeError: The size of tensor a (6) must match the size of tensor b (928) at non-singleton dimension 2

The debugger shows a tile tensor of size (52983, 6) and a weight tensor of size (1, 928, 928). What could be the reason for the size mismatch?

Some more info:
model size: 928x928
image size: 3840x2160
I am loading the model using DetectMultiBackend from yolov5

10 crop TTA

It would be nice to have an option for 10-crop TTA, which is widely used for classification tasks (a rough sketch follows the list below):

5 crops are:

  1. left top
  2. right top
  3. left bottom
  4. right bottom
  5. center

And 5 more with a horizontally flipped image.
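If it helps the discussion, here is a rough standalone sketch of what 10-crop TTA could look like; model and crop_size are placeholders and nothing here is taken from pytorch-toolbelt:

import torch

def ten_crop_tta(model, image, crop_size):
    # Average model predictions over 10 crops: 4 corners + center, plus their horizontal flips.
    c, h, w = image.shape          # image is a single CHW tensor
    ch, cw = crop_size
    top, left = (h - ch) // 2, (w - cw) // 2
    crops = [
        image[:, :ch, :cw],            # left top
        image[:, :ch, w - cw:],        # right top
        image[:, h - ch:, :cw],        # left bottom
        image[:, h - ch:, w - cw:],    # right bottom
        image[:, top:top + ch, left:left + cw],  # center
    ]
    crops += [torch.flip(crop, dims=[-1]) for crop in crops]  # the same 5, horizontally flipped
    with torch.no_grad():
        logits = model(torch.stack(crops))
    return logits.mean(dim=0)          # average the 10 predictions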

Focal loss error

Multiclass focal loss returns an error.

    loss = criterion(preds, target)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 32, in forward
    return self.first(*input) + self.second(*input)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 18, in forward
    return self.loss(*input) * self.weight
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/focal.py", line 89, in forward
    loss += self.focal_loss_fn(cls_label_input, cls_label_target)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/functional.py", line 45, in focal_loss_with_logits
    logpt = F.binary_cross_entropy_with_logits(output, target, reduction="none")
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2580, in binary_cross_entropy_with_logits
    raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
ValueError: Target size (torch.Size([5, 1, 256, 256])) must be the same as input size (torch.Size([5, 256, 256]))
Exception ignored in: <function tqdm.__del__ at 0x7fd03260d400>
Traceback (most recent call last):
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1128, in __del__
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1341, in close
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1520, in display
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1131, in __repr__
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1481, in format_dict
TypeError: cannot unpack non-iterable NoneType object

I think that line 83 in pytorch_toolbelt/losses/focal.py should be changed
from `cls_label_input = label_input[:, cls, ...]`
to `cls_label_input = label_input[:, cls, ...].unsqueeze(1)`.
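For what it's worth, a quick shape check with dummy tensors (not library code) showing why the unsqueeze(1) would make the shapes agree:

import torch

# Dummy logits of shape (N, C, H, W); slicing out one class drops the channel dim,
# while the corresponding target slice in the traceback still keeps it.
label_input = torch.randn(5, 3, 256, 256)
cls = 0

print(label_input[:, cls, ...].shape)               # torch.Size([5, 256, 256])
print(label_input[:, cls, ...].unsqueeze(1).shape)  # torch.Size([5, 1, 256, 256]), matches the target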

AttributeError: 'MulticlassDiceMetricCallback' object has no attribute 'order'

  File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/runner/supervised.py", line 197, in train
    monitoring_params=monitoring_params
  File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/experiment/base.py", line 40, in __init__
    self._callbacks = process_callback(callbacks)
  File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/utils/callbacks.py", line 23, in process_callback
    result = sorted(callbacks, key=lambda x: x.order)
  File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/utils/callbacks.py", line 23, in <lambda>
    result = sorted(callbacks, key=lambda x: x.order)
AttributeError: 'MulticlassDiceMetricCallback' object has no attribute 'order'

Conda installations

Hello,

Do you have a conda package? Somehow my Azure VM with Python 3.5 does not pick up the pip installation in the Jupyter kernels.

Please suggest.
Sayak

Question about tiled inference

🐛 Question about tiled inference

Hello, thank you for your excellent work. I understand the advantage of tiled inference, but the way the weighting is applied confuses me. For each tile, we multiply the inference result by the weight, yet at the final step we divide by the norm mask (in the merge function). To me, dividing the accumulated result by the norm mask seems to cancel the weighting. Could you explain this further? Perhaps the norm_mask would need to hold weights different from the ones applied to the inference results (for example, the number of inferences covering each pixel, which differs from pyramid_patch_weight_loss) to normalize the result correctly? Thank you in advance!

To Reproduce

  for tile, (x, y, tile_width, tile_height) in zip(batch, crop_coords):
      self.image[:, y : y + tile_height, x : x + tile_width] += tile * self.weight
      self.norm_mask[:, y : y + tile_height, x : x + tile_width] += self.weight

  def merge(self) -> torch.Tensor:
      return self.image / self.norm_mask
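For context, my reading of the quoted snippet as a tiny numeric example (made-up numbers, not library code) for a single pixel covered by two overlapping tiles:

# Hypothetical weights and predictions for one pixel seen by two tiles.
w1, w2 = 0.9, 0.3      # pyramid weight of this pixel inside each tile
p1, p2 = 0.8, 0.2      # model prediction for this pixel from each tile

accumulated = w1 * p1 + w2 * p2   # what integrate_batch adds into self.image
norm = w1 + w2                    # what accumulates in self.norm_mask
merged = accumulated / norm       # 0.65, i.e. a weighted average of p1 and p2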

UnetSegmentationModel dimension won't match

I want to try hrnet34_unet64 for image segmentation using:

encoder = E.HRNetV2Encoder34(pretrained=pretrained, layers=[0, 1, 2, 3, 4])
UnetSegmentationModel(encoder, num_classes=num_classes, unet_channels=[64, 128, 256, 512], dropout=dropout)

And got an error:
`RuntimeError: Sizes of tensors must match except in dimension 2. Got 128 and 256 (The offending index is 0)`

Could you please let me know what is wrong? Thanks!

Getting out of memory when running inference on huge images

I have tried pretty small slices but still get CUDA out of memory on ---> 23 pred_batch = best_model(tiles_batch)[:, 0:1, :, :]. As far as I can see, it proceeded a few steps and then failed. I have a GPU with 8 GB; the model is a UNet, but with a heavy encoder. Image shape is (6300, 6304, 3).

import numpy as np
import torch
import cv2
from tqdm import tqdm_notebook
from torch.utils.data import DataLoader
from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy


image = img_to_predict

# Cut large image into overlapping tiles
tiler = ImageSlicer(image.shape, tile_size=(64, 64), tile_step=(64, 64), weight='pyramid')

# HWC -> CHW. Optionally, do normalization here
tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]

# Allocate a CUDA buffer for holding entire mask
merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)

# Run predictions for tiles and accumulate them
for tiles_batch, coords_batch in tqdm_notebook(DataLoader(list(zip(tiles, tiler.crops)), batch_size=1, pin_memory=True)):
    tiles_batch = tiles_batch.float().cuda()
    pred_batch = best_model(tiles_batch)[:, 0:1, :,:] # taking only first channel

    merger.integrate_batch(pred_batch, coords_batch)

# Normalize accumulated mask and convert back to numpy
merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
merged_mask = tiler.crop_to_orignal_size(merged_mask)

can't install on windows with pip

  Could not find a version that satisfies the requirement torch>=0.4.1 (from pytorch_toolbelt) (from versions: 0.1.2, 0.1.2.post1)
No matching distribution found for torch>=0.4.1 (from pytorch_toolbelt)

TypeError: object of type 'int' has no len()

I am unable to create a basic UNet model from the library as shown in the README. Here's the code:

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class UNet(nn.Module):
    def __init__(self, input_channels, num_classes):
        super().__init__()
        self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
        self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])
    
model= UNet(input_channels= 3, num_classes= 1)

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-4e8064bebb83> in <module>
     15         return self.logits(x[0])
     16 
---> 17 model= UNet(input_channels= 3, num_classes= 1)

<ipython-input-1-4e8064bebb83> in __init__(self, input_channels, num_classes)
      7         super().__init__()
      8         self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
----> 9         self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
     10         self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
     11 

~/anaconda3/envs/dl_gpu/lib/python3.7/site-packages/pytorch_toolbelt/modules/decoders/unet.py in __init__(self, feature_maps, decoder_features, unet_block, upsample_block)
     38             decoder_features = [None] * num_blocks
     39         else:
---> 40             if len(decoder_features) != num_blocks:
     41                 raise ValueError(f"decoder_features must have length of {num_blocks}")
     42         in_channels_for_upsample_block = feature_maps[-1]

TypeError: object of type 'int' has no len()

Update for collections.abc in installation

🐛 Bug

Traceback (most recent call last):
  File "/home/sebasmos/Desktop/TRPD/segmentation_models_test.py", line 1, in <module>
    import segmentation_models_pytorch as smp
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/__init__.py", line 1, in <module>
    from .unet import Unet
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/unet/__init__.py", line 1, in <module>
    from .model import Unet
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/unet/model.py", line 3, in <module>
    from ..encoders import get_encoder
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/encoders/__init__.py", line 14, in <module>
    from .timm_efficientnet import timm_efficientnet_encoders
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/encoders/timm_efficientnet.py", line 4, in <module>
    from timm.models.efficientnet import EfficientNet
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/__init__.py", line 2, in <module>
    from .models import create_model, list_models, is_model, list_modules, model_entrypoint, \
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/__init__.py", line 1, in <module>
    from .cspnet import *
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/cspnet.py", line 20, in <module>
    from .helpers import build_model_with_cfg
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/helpers.py", line 17, in <module>
    from .layers import Conv2dSame, Linear
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/__init__.py", line 7, in <module>
    from .cond_conv2d import CondConv2d, get_condconv_initializer
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/cond_conv2d.py", line 16, in <module>
    from .helpers import to_2tuple
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/helpers.py", line 6, in <module>
    from torch._six import container_abcs
ImportError: cannot import name 'container_abcs' from 'torch._six' (/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/torch/_six.py)

To Reproduce

Steps to reproduce the behavior:

  1. Cloning as of today (12 dec): pip install -U git+https://github.com/jlcsilva/segmentation_models.pytorch

Solution - how it worked for me based on huggingface/pytorch-image-models@94ca140#diff-c7abf83bc43184f6101237b08d7c489c361f3d57b3538d633f6f01d35254b73c

""" Layer/Module Helpers

Hacked together by / Copyright 2020 Ross Wightman
"""
from itertools import repeat
import collections.abc

def _ntuple(n):
    def parse(x):
        if isinstance(x, collections.abc.Iterable):
            return x
        return tuple(repeat(x, n))
    return parse

to_1tuple = _ntuple(1)
to_2tuple = _ntuple(2)
to_3tuple = _ntuple(3)
to_4tuple = _ntuple(4)
to_ntuple = _ntuple

Error when calling the merge method?

Thank you for the convenient tools, but I don't quite understand in what form I should pass tiles to the merge method. My tiles look like [[512, 512, 1], [512, 512, 1], ...], and when I call merge on them I get this error:

image[y:y + tile_height, x:x + tile_width] += tile * w
ValueError: non-broadcastable output operand with shape (512,512,1) doesn't match the broadcast shape (512,512,512)

Could you tell me what might be causing this problem?

Is dependency on `opencv-python` necessary?

Depending on opencv-python makes it difficult to use the library in Docker environments, since there is typically no GUI. Would it be possible to depend on opencv-python-headless instead?

Thanks.

Dice loss is smaller when computed on entire batch

🐛 Bug

I noticed that when I compute the Dice loss on an entire batch, the loss is smaller than when I compute it separately for each sample and then average. Is this behavior intended?
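As a toy illustration (made-up numbers, not the segmentation_models_pytorch implementation), pooling the whole batch into one Dice computation generally gives a different number than averaging per-sample Dice scores:

import torch

pred = torch.tensor([[1., 1., 1., 1.],   # sample 1: perfect match
                     [1., 0., 0., 0.]])  # sample 2: complete miss
target = torch.tensor([[1., 1., 1., 1.],
                       [0., 1., 0., 0.]])

# Pooled over the batch: one intersection/union over all pixels of all samples.
pooled = 2 * (pred * target).sum() / (pred.sum() + target.sum())   # 0.8

# Per sample, then averaged.
inter = (pred * target).sum(dim=1)
union = pred.sum(dim=1) + target.sum(dim=1)
per_sample = (2 * inter / union).mean()                            # 0.5

print(pooled.item(), per_sample.item())  # the pooled score is dominated by the easy sample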

Expected behavior

Dice loss on a batch should be equivalent to the average of the per-sample Dice losses.

Environment

Using loss from segmentation_models_pytorch

I faced AttributeError: can't set attribute 'channels'

🐛 Bug

When I used pytorch_toolbelt.modules.decoders.FPNCatDecoder I got an AttributeError.

I think there is a duplicate usage of the channels attribute in the FPNCatDecoder object, which is causing the error. As a workaround, I renamed the variable used in FPNCatDecoder to channel_o, and it then executed without any issues. The duplication seems to involve the channels property of DecoderModule.

Dice Loss/Score question

Hey Eugene,

First of all, thank you for this very useful package. I'm migrating from TF to PyTorch and having your advanced losses is very helpful. However, when I trained the same model on the same data with the same loss functions in both frameworks, I got very different loss values (I'm using a multilabel approach). Digging a little deeper into your code, I noticed that when you calculate the Dice loss you always calculate the per-sample AND per-channel loss and then average it. I don't understand why you do the per-channel calculation and averaging rather than computing the Dice loss for all classes together. I can show what I mean with a dummy example below:

Let's prepare 2 dummy multilabel matrices - ground truth (d_gt) and prediction (d_pr) with 3 classes each, 0 Red, 1 Green and 2 Blue:
d_gt = np.zeros(shape=(20,20,3))
d_gt[5:10,5:10,0] =1
d_gt[10:15,10:15,1] =1
d_gt[:,:,2] = (1 - d_gt.sum(axis=-1, keepdims=True)).squeeze()
plt.imshow(d_gt)

[image of d_gt]

d_pr = np.zeros(shape=(20,20,3))
d_pr[4:9,4:9,0] =1
d_pr[11:14,11:14,1] =1
d_pr[:,:,2] = (1 - d_pr.sum(axis=-1, keepdims=True)).squeeze()
plt.imshow(d_pr)

[image of d_pr]

One can see that (using Dice Loss = 1 - Dice Score):

  • Dice Loss for Red is 1 - ((16+16)/(25+25)) = 0.36
  • Dice Loss for Green is 1 - ((9+9)/(9+25)) = 0.4706
  • Dice Loss for Blue is 1 - ((341+341)/(350+366)) = 0.0474

However, the total Dice Loss for the whole picture is 1 - (2*(16+9+341)/(2*400)) = 0.085

After wrapping them into tensors
d_gt_tensor = torch.from_numpy(np.transpose(d_gt,(2,0,1))).unsqueeze(0)
d_pr_tensor = torch.from_numpy(np.transpose(d_pr,(2,0,1))).unsqueeze(0)
what your Dice Loss (with from_logits=False) returns is 0.2927, which is the averaged loss of the individual channels instead of the total loss. The culprit seems to be passing dims=(0,2) to the soft_dice_score function; I think dims=(1,2) should be passed instead to get individual scores for each item in the batch. Unless this behaviour is intended, but then I'd need some more explanation why.
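For reference, a quick numpy check of the two aggregations, using the d_gt / d_pr arrays defined above (plain numpy, not the library code):

import numpy as np

def dice_score(gt, pr):
    return 2 * (gt * pr).sum() / (gt.sum() + pr.sum())

# Per-channel Dice loss, averaged over the 3 channels -> ~0.2927
per_channel = np.mean([1 - dice_score(d_gt[..., c], d_pr[..., c]) for c in range(3)])

# Dice loss with all channels pooled together -> ~0.085
overall = 1 - dice_score(d_gt, d_pr)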

A second, smaller question regarding your Dice Loss: why do you use from_logits=True by default?

Thanks in advance!

Tiled inference potentially generates wrong multi-class predictions

🐛 Bug

I believe the current implementation of tiled inference can produce erroneous predictions. If I understand correctly, in your tiled inference approach you accumulate predictions for each pixel and then divide them by a norm_mask (the total number of predictions for each pixel). This works well for the binary case, but not for multi-class classification. For example, if I have 4 classes to predict and I do tiled inference (e.g. tile_size=128, tile_step=64) using your moving-window approach, I can end up with a mix of predictions for a pixel (e.g. 1, 4, 4, 4), and the final prediction for this pixel (after applying the norm_mask) will be 3. Wouldn't it be more appropriate to take the mode of all predictions for this pixel to get the final prediction of 4?
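A toy illustration of the concern (purely illustrative, not taken from the library): averaging hard class indices mixes labels, while averaging per-class scores and taking the argmax afterwards keeps the majority class:

import torch
import torch.nn.functional as F

hard_preds = torch.tensor([1, 4, 4, 4])          # four overlapping predictions for one pixel
print(hard_preds.float().mean())                 # 3.25 -> rounds to a class nobody predicted

scores = F.one_hot(hard_preds, num_classes=5).float()
print(scores.mean(dim=0).argmax())               # tensor(4), the majority vote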

To Reproduce

Steps to reproduce the behavior:

for tile, (x, y, tile_width, tile_height) in zip(batch, crop_coords):
    self.image[:, y : y + tile_height, x : x + tile_width] += tile * self.weight
    self.norm_mask[:, y : y + tile_height, x : x + tile_width] += self.weight

def merge(self) -> torch.Tensor:
    return self.image / self.norm_mask

Environment

  • Pytorch-toolbelt version: 0.6.2
  • Pytorch version: 2.0.0
  • Python version: 3.10
  • OS: Windows 11
