
bloodaxe / pytorch-toolbelt

PyTorch extensions for fast R&D prototyping and Kaggle farming

License: MIT License

Makefile 0.05% Python 99.95%
pytorch kaggle image-classification image-segmentation deep-learning segmentation python image-processing machine-learning focal-loss jaccard-loss tta test-time-augmentation augmentation object-detection pipeline

pytorch-toolbelt's Issues

SoftCrossEntropyLoss error

When I use SoftCrossEntropyLoss, I get this error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Could anyone help me? BTW, what paper proposed the SoftCrossEntropyLoss?

How to Implement TTA For binary segmentation

Would anyone be kind enough to share code showing how to use TTA for binary segmentation with this library?
I have my trained model weights but can't figure out how to use pytorch-toolbelt for this.

Thank you.
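Not an official answer, but going by the tta helpers shown in the project README, a minimal sketch for binary segmentation might look roughly like the following. The model, weights path and threshold are placeholders, not something prescribed by the library:

import torch
from pytorch_toolbelt.inference import tta

# Assumption: `model` is your trained binary segmentation network returning raw
# logits of shape (N, 1, H, W), and `image_batch` is a float tensor (N, 3, H, W).
model.load_state_dict(torch.load("weights.pth", map_location="cuda"))  # hypothetical path
model = model.cuda().eval()

with torch.no_grad():
    # Average logits over the D4 group (flips + 90-degree rotations), as in the README example.
    logits = tta.d4_image2mask(model, image_batch.cuda())
    mask = (logits.sigmoid() > 0.5).long()  # binarize with a 0.5 threshold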

integrate_batch throws error: RuntimeError: The size of tensor a (6) must match the size of tensor b (928) ...

Hi, I'm trying to use your tiling tools with my yolov5 model, but on the following line I get this error:

self.image[:, y : y + tile_height, x : x + tile_width] += tile * self.weight

RuntimeError: The size of tensor a (6) must match the size of tensor b (928) at non-singleton dimension 2

The debugger shows a tile tensor of size (52983, 6) and a weight tensor of size (1, 928, 928). What could be the reason for the size mismatch?

Some more info:
model size: 928x928
image size: 3840x2160
I am loading the model using DetectMultiBackend from yolov5

10 crop TTA

It would be nice to have an option for 10-crop TTA, which is widely used for classification tasks (a rough sketch follows the list below):

5 crops are:

  1. left top
  2. right top
  3. left bottom
  4. right bottom
  5. center

And 5 more with a horizontally flipped image.
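If it helps the discussion, here is a rough standalone sketch of what 10-crop TTA could look like; model and crop_size are placeholders and nothing here is taken from pytorch-toolbelt:

import torch

def ten_crop_tta(model, image, crop_size):
    # Average model predictions over 10 crops: 4 corners + center, plus their horizontal flips.
    c, h, w = image.shape          # image is a single CHW tensor
    ch, cw = crop_size
    top, left = (h - ch) // 2, (w - cw) // 2
    crops = [
        image[:, :ch, :cw],            # left top
        image[:, :ch, w - cw:],        # right top
        image[:, h - ch:, :cw],        # left bottom
        image[:, h - ch:, w - cw:],    # right bottom
        image[:, top:top + ch, left:left + cw],  # center
    ]
    crops += [torch.flip(crop, dims=[-1]) for crop in crops]  # the same 5, horizontally flipped
    with torch.no_grad():
        logits = model(torch.stack(crops))
    return logits.mean(dim=0)          # average the 10 predictions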

Focal loss error

Multiclass focal loss returns an error.

    loss = criterion(preds, target)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 32, in forward
    return self.first(*input) + self.second(*input)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 18, in forward
    return self.loss(*input) * self.weight
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/focal.py", line 89, in forward
    loss += self.focal_loss_fn(cls_label_input, cls_label_target)
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/functional.py", line 45, in focal_loss_with_logits
    logpt = F.binary_cross_entropy_with_logits(output, target, reduction="none")
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2580, in binary_cross_entropy_with_logits
    raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
ValueError: Target size (torch.Size([5, 1, 256, 256])) must be the same as input size (torch.Size([5, 256, 256]))
Exception ignored in: <function tqdm.__del__ at 0x7fd03260d400>
Traceback (most recent call last):
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1128, in __del__
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1341, in close
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1520, in display
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1131, in __repr__
  File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1481, in format_dict
TypeError: cannot unpack non-iterable NoneType object

I think that line 83 in pytorch_toolbelt/losses/focal.py should be changed
from `cls_label_input = label_input[:, cls, ...]`
to `cls_label_input = label_input[:, cls, ...].unsqueeze(1)`.
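For what it's worth, a quick shape check with dummy tensors (not library code) showing why the unsqueeze(1) would make the shapes agree:

import torch

# Dummy logits of shape (N, C, H, W); slicing out one class drops the channel dim,
# while the corresponding target slice in the traceback still keeps it.
label_input = torch.randn(5, 3, 256, 256)
cls = 0

print(label_input[:, cls, ...].shape)               # torch.Size([5, 256, 256])
print(label_input[:, cls, ...].unsqueeze(1).shape)  # torch.Size([5, 1, 256, 256]), matches the target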

AttributeError: 'MulticlassDiceMetricCallback' object has no attribute 'order'

  File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/runner/supervised.py", line 197, in train
    monitoring_params=monitoring_params
  File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/experiment/base.py", line 40, in __init__
    self._callbacks = process_callback(callbacks)
  File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/utils/callbacks.py", line 23, in process_callback
    result = sorted(callbacks, key=lambda x: x.order)
  File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/utils/callbacks.py", line 23, in <lambda>
    result = sorted(callbacks, key=lambda x: x.order)
AttributeError: 'MulticlassDiceMetricCallback' object has no attribute 'order'

Conda installations

Hello,

Do you have a conda package? Somehow my Azure VM with Python 3.5 does not pick up the pip installation in the Jupyter kernels.

Please suggest.
Sayak

Question about tiled inference

🐛 Question about tiled inference

Hello, thank you for your excellent work. I understand the advantage of tiled inference, but the way the weighting is applied confuses me. For each tile, we multiply the inference result by the weight, yet at the final step we divide by the norm mask (in the merge function). To me, dividing the accumulated result by the norm mask seems to cancel the weighting. Could you explain this further? Perhaps the norm_mask would need to hold weights different from the ones applied to the inference results (for example, the number of inferences covering each pixel, which differs from pyramid_patch_weight_loss) to normalize the result correctly? Thank you in advance!

To Reproduce

  for tile, (x, y, tile_width, tile_height) in zip(batch, crop_coords):
      self.image[:, y : y + tile_height, x : x + tile_width] += tile * self.weight
      self.norm_mask[:, y : y + tile_height, x : x + tile_width] += self.weight

  def merge(self) -> torch.Tensor:
      return self.image / self.norm_mask
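For context, my reading of the quoted snippet as a tiny numeric example (made-up numbers, not library code) for a single pixel covered by two overlapping tiles:

# Hypothetical weights and predictions for one pixel seen by two tiles.
w1, w2 = 0.9, 0.3      # pyramid weight of this pixel inside each tile
p1, p2 = 0.8, 0.2      # model prediction for this pixel from each tile

accumulated = w1 * p1 + w2 * p2   # what integrate_batch adds into self.image
norm = w1 + w2                    # what accumulates in self.norm_mask
merged = accumulated / norm       # 0.65, i.e. a weighted average of p1 and p2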

UnetSegmentationModel dimension won't match

I want to try hrnet34_unet64 for image segmentation using:

encoder = E.HRNetV2Encoder34(pretrained=pretrained, layers=[0, 1, 2, 3, 4])
UnetSegmentationModel(encoder, num_classes=num_classes, unet_channels=[64, 128, 256, 512], dropout=dropout)

And got an error:
`RuntimeError: Sizes of tensors must match except in dimension 2. Got 128 and 256 (The offending index is 0)`

Could you please let me know what is wrong? Thanks!

Getting out of memory when running inference on huge images

I have tried pretty small slices but still get CUDA out of memory on ---> 23 pred_batch = best_model(tiles_batch)[:, 0:1, :, :]. As far as I can see, it proceeded a few steps and then failed. I have a GPU with 8 GB; the model is a UNet, but with a heavy encoder. Image shape is (6300, 6304, 3).

import numpy as np
import torch
import cv2
from tqdm import tqdm_notebook
from torch.utils.data import DataLoader
from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy


image = img_to_predict

# Cut large image into overlapping tiles
tiler = ImageSlicer(image.shape, tile_size=(64, 64), tile_step=(64, 64), weight='pyramid')

# HWC -> CHW. Optionally, do normalization here
tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]

# Allocate a CUDA buffer for holding entire mask
merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)

# Run predictions for tiles and accumulate them
for tiles_batch, coords_batch in tqdm_notebook(DataLoader(list(zip(tiles, tiler.crops)), batch_size=1, pin_memory=True)):
    tiles_batch = tiles_batch.float().cuda()
    pred_batch = best_model(tiles_batch)[:, 0:1, :,:] # taking only first channel

    merger.integrate_batch(pred_batch, coords_batch)

# Normalize accumulated mask and convert back to numpy
merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
merged_mask = tiler.crop_to_orignal_size(merged_mask)

can't install on windows with pip

  Could not find a version that satisfies the requirement torch>=0.4.1 (from pytorch_toolbelt) (from versions: 0.1.2, 0.1.2.post1)
No matching distribution found for torch>=0.4.1 (from pytorch_toolbelt)

TypeError: object of type 'int' has no len()

I am unable to create a basic UNet model from the library as shown in the README. Here's the code:

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class UNet(nn.Module):
    def __init__(self, input_channels, num_classes):
        super().__init__()
        self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
        self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])
    
model= UNet(input_channels= 3, num_classes= 1)

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-4e8064bebb83> in <module>
     15         return self.logits(x[0])
     16 
---> 17 model= UNet(input_channels= 3, num_classes= 1)

<ipython-input-1-4e8064bebb83> in __init__(self, input_channels, num_classes)
      7         super().__init__()
      8         self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
----> 9         self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
     10         self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
     11 

~/anaconda3/envs/dl_gpu/lib/python3.7/site-packages/pytorch_toolbelt/modules/decoders/unet.py in __init__(self, feature_maps, decoder_features, unet_block, upsample_block)
     38             decoder_features = [None] * num_blocks
     39         else:
---> 40             if len(decoder_features) != num_blocks:
     41                 raise ValueError(f"decoder_features must have length of {num_blocks}")
     42         in_channels_for_upsample_block = feature_maps[-1]

TypeError: object of type 'int' has no len()

Update for collections.abc in installation

🐛 Bug

Traceback (most recent call last):
  File "/home/sebasmos/Desktop/TRPD/segmentation_models_test.py", line 1, in <module>
    import segmentation_models_pytorch as smp
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/__init__.py", line 1, in <module>
    from .unet import Unet
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/unet/__init__.py", line 1, in <module>
    from .model import Unet
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/unet/model.py", line 3, in <module>
    from ..encoders import get_encoder
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/encoders/__init__.py", line 14, in <module>
    from .timm_efficientnet import timm_efficientnet_encoders
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/encoders/timm_efficientnet.py", line 4, in <module>
    from timm.models.efficientnet import EfficientNet
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/__init__.py", line 2, in <module>
    from .models import create_model, list_models, is_model, list_modules, model_entrypoint, \
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/__init__.py", line 1, in <module>
    from .cspnet import *
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/cspnet.py", line 20, in <module>
    from .helpers import build_model_with_cfg
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/helpers.py", line 17, in <module>
    from .layers import Conv2dSame, Linear
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/__init__.py", line 7, in <module>
    from .cond_conv2d import CondConv2d, get_condconv_initializer
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/cond_conv2d.py", line 16, in <module>
    from .helpers import to_2tuple
  File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/helpers.py", line 6, in <module>
    from torch._six import container_abcs
ImportError: cannot import name 'container_abcs' from 'torch._six' (/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/torch/_six.py)

To Reproduce

Steps to reproduce the behavior:

  1. Cloning as of today (12 dec): pip install -U git+https://github.com/jlcsilva/segmentation_models.pytorch

Solution - how it worked for me based on huggingface/pytorch-image-models@94ca140#diff-c7abf83bc43184f6101237b08d7c489c361f3d57b3538d633f6f01d35254b73c

""" Layer/Module Helpers

Hacked together by / Copyright 2020 Ross Wightman
"""
from itertools import repeat
import collections.abc

def _ntuple(n):
    def parse(x):
        if isinstance(x, collections.abc.Iterable):
            return x
        return tuple(repeat(x, n))
    return parse

to_1tuple = _ntuple(1)
to_2tuple = _ntuple(2)
to_3tuple = _ntuple(3)
to_4tuple = _ntuple(4)
to_ntuple = _ntuple

Error when calling the merge method?

Thank you for the convenient tools, but I don't quite understand in what form I should pass tiles to the merge method. My tiles look like [[512, 512, 1], [512, 512, 1], ...], and when I call merge on them I get this error:

image[y:y + tile_height, x:x + tile_width] += tile * w
ValueError: non-broadcastable output operand with shape (512,512,1) doesn't match the broadcast shape (512,512,512)

Could you tell me what might be causing this problem?

Is dependency on `opencv-python` necessary?

Depending on opencv-python makes it difficult to use the library in Docker environments, since there is typically no GUI. Would it be possible to depend on opencv-python-headless instead?

Thanks.

Dice loss is smaller when computed on entire batch

🐛 Bug

I noticed that when I compute the Dice loss on an entire batch, the loss is smaller than when I compute it separately for each sample and then average. Is this behavior intended?
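As a toy illustration (made-up numbers, not the segmentation_models_pytorch implementation), pooling the whole batch into one Dice computation generally gives a different number than averaging per-sample Dice scores:

import torch

pred = torch.tensor([[1., 1., 1., 1.],   # sample 1: perfect match
                     [1., 0., 0., 0.]])  # sample 2: complete miss
target = torch.tensor([[1., 1., 1., 1.],
                       [0., 1., 0., 0.]])

# Pooled over the batch: one intersection/union over all pixels of all samples.
pooled = 2 * (pred * target).sum() / (pred.sum() + target.sum())   # 0.8

# Per sample, then averaged.
inter = (pred * target).sum(dim=1)
union = pred.sum(dim=1) + target.sum(dim=1)
per_sample = (2 * inter / union).mean()                            # 0.5

print(pooled.item(), per_sample.item())  # the pooled score is dominated by the easy sample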

Expected behavior

Dice loss on a batch should be equivalent to the average of the per-sample Dice losses.

Environment

Using loss from segmentation_models_pytorch

I faced AttributeError: can't set attribute 'channels'

🐛 Bug

When I used pytorch_toolbelt.modules.decoders.FPNCatDecoder I got an AttributeError.

I think there is a duplicate usage of the channels attribute in the FPNCatDecoder object, which is causing the error. As a workaround, I renamed the variable used in FPNCatDecoder to channel_o, and it then executed without any issues. The duplication seems to involve the channels property of DecoderModule.

Dice Loss/Score question

Hey Eugene,

First of all, thank you for this very useful package. I'm migrating from TF to PyTorch and having your advanced losses is very helpful. However, when I trained the same model on the same data with the same loss functions in both frameworks, I got very different loss values (I'm using a multilabel approach). Digging a little deeper into your code, I noticed that when you calculate the Dice loss you always calculate the per-sample AND per-channel loss and then average it. I don't understand why you do the per-channel calculation and averaging rather than computing the Dice loss for all classes together. I can show what I mean with a dummy example below:

Let's prepare 2 dummy multilabel matrices - ground truth (d_gt) and prediction (d_pr) with 3 classes each, 0 Red, 1 Green and 2 Blue:
d_gt = np.zeros(shape=(20,20,3))
d_gt[5:10,5:10,0] =1
d_gt[10:15,10:15,1] =1
d_gt[:,:,2] = (1 - d_gt.sum(axis=-1, keepdims=True)).squeeze()
plt.imshow(d_gt)

[image of d_gt]

d_pr = np.zeros(shape=(20,20,3))
d_pr[4:9,4:9,0] =1
d_pr[11:14,11:14,1] =1
d_pr[:,:,2] = (1 - d_pr.sum(axis=-1, keepdims=True)).squeeze()
plt.imshow(d_pr)

[image of d_pr]

One can see that (using Dice Loss = 1 - Dice Score):

  • Dice Loss for Red is 1 - ((16+16)/(25+25)) = 0.36
  • Dice Loss for Green is 1 - ((9+9)/(9+25)) = 0.4706
  • Dice Loss for Blue is 1 - ((341+341)/(350+366)) = 0.0474

However, the total Dice Loss for the whole picture is 1 - (2*(16+9+341)/(2*400)) = 0.085

After wrapping them into tensors
d_gt_tensor = torch.from_numpy(np.transpose(d_gt,(2,0,1))).unsqueeze(0)
d_pr_tensor = torch.from_numpy(np.transpose(d_pr,(2,0,1))).unsqueeze(0)
what your Dice Loss (with from_logits=False) returns is 0.2927, which is the averaged loss of the individual channels instead of the total loss. The culprit seems to be passing dims=(0,2) to the soft_dice_score function; I think dims=(1,2) should be passed instead to get individual scores for each item in the batch. Unless this behaviour is intended, but then I'd need some more explanation why.
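For reference, a quick numpy check of the two aggregations, using the d_gt / d_pr arrays defined above (plain numpy, not the library code):

import numpy as np

def dice_score(gt, pr):
    return 2 * (gt * pr).sum() / (gt.sum() + pr.sum())

# Per-channel Dice loss, averaged over the 3 channels -> ~0.2927
per_channel = np.mean([1 - dice_score(d_gt[..., c], d_pr[..., c]) for c in range(3)])

# Dice loss with all channels pooled together -> ~0.085
overall = 1 - dice_score(d_gt, d_pr)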

A second, smaller question regarding your Dice Loss: why do you use from_logits=True by default?

Thanks in advance!

Tiled inference potentially generates wrong multi-class predictions

🐛 Bug

I believe the current implementation of tiled inference can produce erroneous predictions. If I understand correctly, in your tiled inference approach you accumulate predictions for each pixel and then divide them by a norm_mask (the total number of predictions for each pixel). This works well for the binary case, but not for multi-class classification. For example, if I have 4 classes to predict and I do tiled inference (e.g. tile_size=128, tile_step=64) using your moving-window approach, I can end up with a mix of predictions for a pixel (e.g. 1, 4, 4, 4), and the final prediction for this pixel (after applying the norm_mask) will be 3. Wouldn't it be more appropriate to take the mode of all predictions for this pixel to get the final prediction of 4?
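A toy illustration of the concern (purely illustrative, not taken from the library): averaging hard class indices mixes labels, while averaging per-class scores and taking the argmax afterwards keeps the majority class:

import torch
import torch.nn.functional as F

hard_preds = torch.tensor([1, 4, 4, 4])          # four overlapping predictions for one pixel
print(hard_preds.float().mean())                 # 3.25 -> rounds to a class nobody predicted

scores = F.one_hot(hard_preds, num_classes=5).float()
print(scores.mean(dim=0).argmax())               # tensor(4), the majority vote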

To Reproduce

Steps to reproduce the behavior:

for tile, (x, y, tile_width, tile_height) in zip(batch, crop_coords):
    self.image[:, y : y + tile_height, x : x + tile_width] += tile * self.weight
    self.norm_mask[:, y : y + tile_height, x : x + tile_width] += self.weight

def merge(self) -> torch.Tensor:
    return self.image / self.norm_mask

Environment

  • Pytorch-toolbelt version: 0.6.2
  • Pytorch version: 2.0.0
  • Python version: 3.10
  • OS: Windows 11
