bloodaxe / pytorch-toolbelt
PyTorch extensions for fast R&D prototyping and Kaggle farming
License: MIT License
When I use SoftCrossEntropyLoss, I get the error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
Could anyone help me? BTW, what paper proposed the SoftCrossEntropyLoss?
Would anyone be kind enough to share code showing how to use TTA for binary segmentation with this library?
I have my trained model weights but can't figure out how to use PyTorch Toolbelt.
Thank you.
Hi, I'm trying to use your tiling tools with my yolov5 model but in the following line I get following error:
RuntimeError: The size of tensor a (6) must match the size of tensor b (928) at non-singleton dimension 2
The debugger shows a tile tensor size of (52983,6) and a weight tensor size of (1, 928,928). What could be the reason for the difference in the tensor size?
Some more info:
model input size: 928x928
image size: 3840x2160
I am loading the model using DetectMultiBackend from yolov5
It would be nice to have an option for 10 crop TTA that is widely used for the classification tasks:
5 crops are:
And 5 more with a horizontally flipped image.
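A minimal NumPy sketch of the 10-crop idea, for illustration only. It assumes the usual convention that the five crops are the four corners plus the center (the list above does not spell them out), and the helper name `ten_crop` is hypothetical, not part of pytorch-toolbelt. torchvision also provides a `transforms.TenCrop` transform that follows the same convention.

```python
import numpy as np

def ten_crop(image: np.ndarray, crop_h: int, crop_w: int) -> np.ndarray:
    """Return 10 crops of an HWC image: 4 corners + center, then the same 5 from the flipped image."""
    h, w = image.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop size must not exceed image size")
    # (top, left) offsets: top-left, top-right, bottom-left, bottom-right, center
    offsets = [
        (0, 0),
        (0, w - crop_w),
        (h - crop_h, 0),
        (h - crop_h, w - crop_w),
        ((h - crop_h) // 2, (w - crop_w) // 2),
    ]
    crops = [image[t:t + crop_h, l:l + crop_w] for t, l in offsets]
    flipped = image[:, ::-1]  # horizontal flip
    crops += [flipped[t:t + crop_h, l:l + crop_w] for t, l in offsets]
    return np.stack(crops)

image = np.arange(6 * 8 * 3).reshape(6, 8, 3)
crops = ten_crop(image, 4, 4)
print(crops.shape)  # (10, 4, 4, 3)
```

For classification TTA, the model's predictions on the ten crops would then be averaged into one prediction per image.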
Multiclass focal loss raises an error.
loss = criterion(preds, target)
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 32, in forward
return self.first(*input) + self.second(*input)
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 18, in forward
return self.loss(*input) * self.weight
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/focal.py", line 89, in forward
loss += self.focal_loss_fn(cls_label_input, cls_label_target)
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/functional.py", line 45, in focal_loss_with_logits
logpt = F.binary_cross_entropy_with_logits(output, target, reduction="none")
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2580, in binary_cross_entropy_with_logits
raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
ValueError: Target size (torch.Size([5, 1, 256, 256])) must be the same as input size (torch.Size([5, 256, 256]))
Exception ignored in: <function tqdm.__del__ at 0x7fd03260d400>
Traceback (most recent call last):
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1128, in __del__
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1341, in close
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1520, in display
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1131, in __repr__
File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1481, in format_dict
TypeError: cannot unpack non-iterable NoneType object
I think that line 83 in pytorch_toolbelt/losses/focal.py should be changed
from cls_label_input = label_input[:, cls, ...]
to cls_label_input = label_input[:, cls, ...].unsqueeze(1)
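To illustrate the shape mismatch in the traceback, here is a small NumPy sketch (NumPy's `expand_dims` stands in for torch's `unsqueeze`; the sizes are placeholders). Indexing with an integer drops the class axis, which is why the input ends up as [5, 256, 256] while the target is [5, 1, 256, 256].

```python
import numpy as np

batch, num_classes, height, width = 5, 3, 8, 8
label_input = np.zeros((batch, num_classes, height, width))  # model output, B x C x H x W
cls_label_target = np.zeros((batch, 1, height, width))       # per-class target, B x 1 x H x W

cls = 0
sliced = label_input[:, cls, ...]          # an integer index drops the class axis
restored = np.expand_dims(sliced, axis=1)  # NumPy analogue of torch's .unsqueeze(1)
kept = label_input[:, cls:cls + 1, ...]    # a length-1 slice keeps the axis in the first place

print(sliced.shape)    # (5, 8, 8)
print(restored.shape)  # (5, 1, 8, 8)
print(kept.shape)      # (5, 1, 8, 8)
```

Whether the input should gain an axis or the target should lose one depends on what shape the loss expects from the caller, so treat the proposed one-line change as one of two possible fixes.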
AttributeError: module 'pytorch_toolbelt.losses' has no attribute 'JointLoss'
Steps to reproduce the behavior:
File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/runner/supervised.py", line 197, in train
monitoring_params=monitoring_params
File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/experiment/base.py", line 40, in __init__
self._callbacks = process_callback(callbacks)
File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/utils/callbacks.py", line 23, in process_callback
result = sorted(callbacks, key=lambda x: x.order)
File "/home/vladimir/anaconda3/lib/python3.6/site-packages/catalyst/dl/utils/callbacks.py", line 23, in <lambda>
result = sorted(callbacks, key=lambda x: x.order)
AttributeError: 'MulticlassDiceMetricCallback' object has no attribute 'order'
Loss.SoftCrossEntropyLoss() does not work.
Hello,
Do you have a conda package? For some reason, on my Azure VM with Python 3.5, the pip installation is not picked up by Jupyter kernels.
Please suggest.
Sayak
image.shape == (5632, 5120, 3)
tile_size=(1280, 1280)
tile_step=(1280, 1280)
After the split, predict, merge I got a mask with a shape
(6400, 5120)
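The reported shape is consistent with the image being padded up to a whole number of tiles before slicing. The sketch below is a simplified model of that arithmetic, not the library's actual code; `padded_size` is a hypothetical helper.

```python
import math

def padded_size(length: int, tile_size: int, tile_step: int) -> int:
    """Smallest extent >= length that is fully covered by tiles placed every tile_step pixels."""
    if length <= tile_size:
        return tile_size
    num_steps = math.ceil((length - tile_size) / tile_step)
    return num_steps * tile_step + tile_size

height, width = 5632, 5120
print(padded_size(height, 1280, 1280), padded_size(width, 1280, 1280))  # 6400 5120
```

5632 is not a multiple of 1280, so the height grows to 5 x 1280 = 6400, while 5120 = 4 x 1280 stays unchanged. The tiled-inference snippet further down this page calls `tiler.crop_to_orignal_size(merged_mask)` after merging, which looks like the intended way to get back to the original (5632, 5120) extent.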
Hello, thank you for your excellent work. I understand the advantage of tiled inference, but the way the weighting is applied confuses me. For each tile, the inference result is multiplied by the weight. However, in the final step (the merge function), the accumulated result is divided by the norm mask. It seems to me that dividing by the norm mask cancels the weighting. Could you please explain this further? Would we need a norm_mask built from something other than the tile weights (for example, the number of inferences covering each pixel, which differs from pyramid_patch_weight_loss) to normalize the result correctly? Thank you in advance!
for tile, (x, y, tile_width, tile_height) in zip(batch, crop_coords):
    self.image[:, y : y + tile_height, x : x + tile_width] += tile * self.weight
    self.norm_mask[:, y : y + tile_height, x : x + tile_width] += self.weight

def merge(self) -> torch.Tensor:
    return self.image / self.norm_mask
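A small 1-D NumPy sketch of the accumulate-then-divide scheme quoted above may help here. The weights and predictions are made-up numbers; the point is that dividing by the summed weights yields a weighted average, so the weight cancels only where a single tile covers a pixel and still matters wherever tiles overlap.

```python
import numpy as np

# Two overlapping 1-D "tiles" of length 4 on a canvas of length 6 (they overlap on pixels 2 and 3).
weight = np.array([1.0, 2.0, 2.0, 1.0])              # pyramid-like: the center counts more than the edges
tiles = {0: np.full(4, 10.0), 2: np.full(4, 20.0)}   # start offset -> constant prediction

image = np.zeros(6)
norm_mask = np.zeros(6)
for x, tile in tiles.items():
    image[x:x + 4] += tile * weight
    norm_mask[x:x + 4] += weight

merged = image / norm_mask
print(merged)  # approximately [10, 10, 13.33, 16.67, 20, 20]
```

A norm mask that only counted inferences per pixel would give 15 at both overlapping pixels. With the weights in the denominator, pixel 2 (near the center of the first tile) stays closer to 10 and pixel 3 (near the center of the second tile) stays closer to 20.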
I want to try hrnet34_unet64 for image segmentation using:
encoder = E.HRNetV2Encoder34(pretrained=pretrained, layers=[0, 1, 2, 3, 4])
UnetSegmentationModel(encoder, num_classes=num_classes, unet_channels=[64, 128, 256, 512], dropout=dropout)
And got an error:
RuntimeError: Sizes of tensors must match except in dimension 2. Got 128 and 256 (The offending index is 0)
Could you please let me know what is wrong? Thanks!
I have tried fairly small tiles but still get CUDA out of memory at ---> 23 pred_batch = best_model(tiles_batch)[:, 0:1, :,:]
As far as I can see, it ran for a few steps and then failed. I have a GPU with 8 GB; the model is a UNet with a heavy encoder. Image shape is (6300, 6304, 3).
import numpy as np
import torch
import cv2
from torch.utils.data import DataLoader
from tqdm import tqdm_notebook
from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy

image = img_to_predict

# Cut large image into overlapping tiles
tiler = ImageSlicer(image.shape, tile_size=(64, 64), tile_step=(64, 64), weight='pyramid')

# HWC -> CHW. Optionally, do normalization here
tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]

# Allocate a CUDA buffer for holding entire mask
merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)

# Run predictions for tiles and accumulate them
for tiles_batch, coords_batch in tqdm_notebook(DataLoader(list(zip(tiles, tiler.crops)), batch_size=1, pin_memory=True)):
    tiles_batch = tiles_batch.float().cuda()
    pred_batch = best_model(tiles_batch)[:, 0:1, :, :]  # taking only first channel
    merger.integrate_batch(pred_batch, coords_batch)

# Normalize accumulated mask and convert back to numpy
merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
merged_mask = tiler.crop_to_orignal_size(merged_mask)
Could not find a version that satisfies the requirement torch>=0.4.1 (from pytorch_toolbelt) (from versions: 0.1.2, 0.1.2.post1)
No matching distribution found for torch>=0.4.1 (from pytorch_toolbelt)
On the same dataset, mmseg reports a validation IoU of 0.8, but IoUMetricsCallback reports a validation IoU of 0.64. I am sure that mmseg's IoU value is correct.
Hi,
It was mentioned that we can get a list of tensors, ordered from fine (high-resolution, index 0) to coarse (low-resolution) feature maps. Do you have an example of how to do that?
thank you
I am unable to create a basic UNet model from the library as given on the readme. Here's the code for the same:
from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class UNet(nn.Module):
    def __init__(self, input_channels, num_classes):
        super().__init__()
        self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
        self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])

model = UNet(input_channels=3, num_classes=1)
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-1-4e8064bebb83> in <module>
15 return self.logits(x[0])
16
---> 17 model= UNet(input_channels= 3, num_classes= 1)
<ipython-input-1-4e8064bebb83> in __init__(self, input_channels, num_classes)
7 super().__init__()
8 self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
----> 9 self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
10 self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
11
~/anaconda3/envs/dl_gpu/lib/python3.7/site-packages/pytorch_toolbelt/modules/decoders/unet.py in __init__(self, feature_maps, decoder_features, unet_block, upsample_block)
38 decoder_features = [None] * num_blocks
39 else:
---> 40 if len(decoder_features) != num_blocks:
41 raise ValueError(f"decoder_features must have length of {num_blocks}")
42 in_channels_for_upsample_block = feature_maps[-1]
TypeError: object of type 'int' has no len()
Thank you very much for making such a good library. It would be nice to have a more detailed document, for example, https://smp.readthedocs.io/en/latest/
Should be:
probas = torch.movedim(probas, 1, -1) # [B, C, Di, Dj, ...] -> [B, Di, Dj, ..., C]
Traceback (most recent call last):
File "/home/sebasmos/Desktop/TRPD/segmentation_models_test.py", line 1, in <module>
import segmentation_models_pytorch as smp
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/__init__.py", line 1, in <module>
from .unet import Unet
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/unet/__init__.py", line 1, in <module>
from .model import Unet
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/unet/model.py", line 3, in <module>
from ..encoders import get_encoder
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/encoders/__init__.py", line 14, in <module>
from .timm_efficientnet import timm_efficientnet_encoders
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/segmentation_models_pytorch/encoders/timm_efficientnet.py", line 4, in <module>
from timm.models.efficientnet import EfficientNet
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/__init__.py", line 2, in <module>
from .models import create_model, list_models, is_model, list_modules, model_entrypoint,
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/__init__.py", line 1, in <module>
from .cspnet import *
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/cspnet.py", line 20, in <module>
from .helpers import build_model_with_cfg
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/helpers.py", line 17, in <module>
from .layers import Conv2dSame, Linear
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/__init__.py", line 7, in <module>
from .cond_conv2d import CondConv2d, get_condconv_initializer
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/cond_conv2d.py", line 16, in <module>
from .helpers import to_2tuple
File "/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/timm/models/layers/helpers.py", line 6, in <module>
from torch._six import container_abcs
ImportError: cannot import name 'container_abcs' from 'torch._six' (/home/sebasmos/anaconda3/envs/sebasmos/lib/python3.9/site-packages/torch/_six.py)
Steps to reproduce the behavior:
""" Layer/Module Helpers
Hacked together by / Copyright 2020 Ross Wightman
"""
from itertools import repeat
import collections.abc
def _ntuple(n):
    def parse(x):
        if isinstance(x, collections.abc.Iterable):
            return x
        return tuple(repeat(x, n))
    return parse
to_1tuple = _ntuple(1)
to_2tuple = _ntuple(2)
to_3tuple = _ntuple(3)
to_4tuple = _ntuple(4)
to_ntuple = _ntuple
I was wondering if there is a paper or even just a description of the BalancedBCEWithLogitsLoss?
Thank you for the convenient tools. However, I don't quite understand in what form I should pass tiles to the merge method. My tiles look like [[512, 512, 1], [512, 512, 1], ...]
and when I call the merge method on them
I get this error:
image[y:y + tile_height, x:x + tile_width] += tile * w
ValueError: non-broadcastable output operand with shape (512,512,1) doesn't match the broadcast shape (512,512,512)
Could you tell me what might be causing this problem?
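The error message can be reproduced with plain NumPy, which suggests the cause is broadcasting between an H x W x 1 tile and an H x W weight matrix. The sketch below uses 8 instead of 512 and assumes `w` is a 2-D weight array, as the shapes in the error imply; both workarounds are illustrations, not the library's prescribed usage.

```python
import numpy as np

size = 8                           # small stand-in for 512
image = np.zeros((size, size, 1))
tile = np.ones((size, size, 1))    # H x W x 1 prediction
w = np.full((size, size), 0.5)     # H x W weight matrix

# NumPy aligns shapes from the right: (8, 8, 1) * (8, 8) -> (8, 8, 8), hence the error on "+="
print((tile * w).shape)            # (8, 8, 8)

# Option 1: give the weight a trailing channel axis so the product stays H x W x 1
image += tile * w[..., np.newaxis]

# Option 2: drop the channel axis from the tile and accumulate into a 2-D canvas
canvas = np.zeros((size, size))
canvas += tile[..., 0] * w
```

Which option fits depends on whether the merge code is handed an image shape with or without the channel dimension.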
There are two types of focal loss here (BinaryFocalLoss and FocalLoss):
https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/losses/focal.py
Both of these functions are calling the focal_loss_with_logits function, while the second one should use softmax_focal_loss_with_logits.
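For readers comparing the two variants, here is a NumPy sketch of the textbook sigmoid-based and softmax-based focal losses. It is not the library's implementation; the function names are hypothetical, and the optional alpha balancing term is left out for brevity.

```python
import numpy as np

def binary_focal_loss(logits, targets, gamma=2.0):
    """Sigmoid-based focal loss: every logit is treated as an independent binary decision."""
    p = 1.0 / (1.0 + np.exp(-logits))
    pt = np.where(targets == 1, p, 1.0 - p)  # probability assigned to the true outcome
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt)))

def softmax_focal_loss(logits, labels, gamma=2.0):
    """Softmax-based focal loss: logits is N x C, labels holds one class index per row."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    log_pt = log_probs[np.arange(len(labels)), labels]    # log-probability of the true class
    return float(np.mean(-((1.0 - np.exp(log_pt)) ** gamma) * log_pt))

logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
labels = np.array([0, 2])
one_hot = np.eye(3)[labels]

print(binary_focal_loss(logits, one_hot))   # one-vs-rest formulation
print(softmax_focal_loss(logits, labels))   # mutually exclusive classes
```

The two give different values on the same logits, so which one a multiclass loss should call is a real modelling choice rather than a cosmetic one.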
Depending on opencv-python makes it difficult to use the library in a Docker environment, since there is typically no GUI. Would it be possible to depend on opencv-python-headless instead?
Thanks.
I noticed that when I compute the dice loss on an entire batch, the loss is smaller than when I compute it individually for each sample and then average. Is this behavior intended?
Dice loss on batch equivalent to average of dice losses
Using loss from segmentation_models_pytorch
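A tiny NumPy sketch with made-up masks shows why the two numbers can differ: pooling intersections and cardinalities over the batch lets large, well-predicted objects dominate, while per-sample averaging gives every sample equal weight. The `dice_loss` helper here is hypothetical and much simpler than either library's loss.

```python
import numpy as np

def dice_loss(pred, target, axes, eps=1e-7):
    """1 - Dice score, with sums taken over the given axes."""
    intersection = (pred * target).sum(axis=axes)
    cardinality = pred.sum(axis=axes) + target.sum(axis=axes)
    return 1.0 - 2.0 * intersection / np.maximum(cardinality, eps)

# Two flattened masks: sample 0 is a large object predicted perfectly,
# sample 1 is a single-pixel object that the prediction misses by one pixel.
target = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
                   [1, 0, 0, 0, 0, 0, 0, 0]], dtype=float)
pred = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
                 [0, 1, 0, 0, 0, 0, 0, 0]], dtype=float)

batch_loss = dice_loss(pred, target, axes=(0, 1))  # one score over the whole batch
per_sample = dice_loss(pred, target, axes=1)       # one score per sample

print(float(batch_loss))         # about 0.111
print(float(per_sample.mean()))  # 0.5
```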
When I used pytorch_toolbelt.modules.decoders.FPNCatDecoder I got AttributeError.
I think there is a duplicate usage of the "channel" variable in the FPNCatDecoder object, which is causing the error. As a workaround, I renamed the channel variable used in FPNCatDecoder to "channel_o", and it ran without any issues. The duplication seems to come from the channel variable of the DecoderModule.
Hey Eugene,
First of all, thank you for this very useful package. I'm transferring my environment from TF to PyTorch now, and having your advanced losses is very helpful. However, when I trained the same model on the same data using the same loss functions in both frameworks, I noticed that I get very different loss numbers (I'm using a multilabel approach). Digging a little deeper into your code, I noticed that when you calculate the Dice Loss you always calculate the loss per sample AND per channel and then average it. I don't understand why you do the per-channel calculation and averaging rather than computing the Dice loss for all classes together. I can show what I mean with the dummy example below:
Let's prepare 2 dummy multilabel matrices - ground truth (d_gt) and prediction (d_pr) with 3 classes each, 0 Red, 1 Green and 2 Blue:
d_gt = np.zeros(shape=(20,20,3))
d_gt[5:10,5:10,0] =1
d_gt[10:15,10:15,1] =1
d_gt[:,:,2] = (1 - d_gt.sum(axis=-1, keepdims=True)).squeeze()
plt.imshow(d_gt)
d_pr = np.zeros(shape=(20,20,3))
d_pr[4:9,4:9,0] =1
d_pr[11:14,11:14,1] =1
d_pr[:,:,2] = (1 - d_pr.sum(axis=-1, keepdims=True)).squeeze()
plt.imshow(d_pr)
One can see that (using Dice Loss = 1- Dice Score):
However, the total Dice Loss for the whole picture is 1 - 2*(16+9+341)/(2*400) = 0.085
After wrapping them into tensors
d_gt_tensor = torch.from_numpy(np.transpose(d_gt,(2,0,1))).unsqueeze(0)
d_pr_tensor = torch.from_numpy(np.transpose(d_pr,(2,0,1))).unsqueeze(0)
what your Dice Loss (with from_logits=False) returns is 0.2927, which is the average of the individual channel losses rather than the total loss. The culprit seems to be passing dims=(0,2) to the soft_dice_score function. I think dims=(1,2) should be passed instead to get an individual score for each item in the batch? Unless this behaviour is intended, in which case I'd appreciate some explanation of why.
A second, smaller question regarding your Dice Loss: why do you use from_logits=True by default?
Thanks in advance!
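The numbers in this report can be checked with NumPy alone. The sketch below rebuilds the two dummy masks from the post and computes both the per-channel average and the pooled ("total") Dice loss; it does not call the library's loss.

```python
import numpy as np

d_gt = np.zeros((20, 20, 3))
d_gt[5:10, 5:10, 0] = 1
d_gt[10:15, 10:15, 1] = 1
d_gt[:, :, 2] = 1 - d_gt.sum(axis=-1)   # background = everything not red or green

d_pr = np.zeros((20, 20, 3))
d_pr[4:9, 4:9, 0] = 1
d_pr[11:14, 11:14, 1] = 1
d_pr[:, :, 2] = 1 - d_pr.sum(axis=-1)

inter = (d_gt * d_pr).sum(axis=(0, 1))                 # per channel: 16, 9, 341
card = d_gt.sum(axis=(0, 1)) + d_pr.sum(axis=(0, 1))   # per channel: 50, 34, 716

per_channel_loss = 1 - 2 * inter / card                # about 0.36, 0.4706, 0.0475
mean_channel_loss = float(per_channel_loss.mean())     # about 0.2927
total_loss = float(1 - 2 * inter.sum() / card.sum())   # 0.085

print(per_channel_loss, mean_channel_loss, total_loss)
```

The gap comes from the large background channel: pooled over all classes it dominates the score, whereas per-channel averaging gives the two small foreground classes the same weight as the background.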
I believe the current implementation of the tiled inference could produce erroneous predictions. If I understand it correctly, in your tiled inference approach you accumulate predictions for each pixel and then divide them by a norm_mask (which is the total number of predictions for each pixel). This works well for a binary case, but not for a multi-class classification. For example, if I have 4 classes to predict and I do tiled inference (e.g. tile_size=128, tile_step=64) using your moving window approach I can end up with a mix of predictions for a pixel (e.g. 1,4,4,4), and the final prediction of this pixel (after applying norm_mask) will be 3. Wouldn't it be more appropriate to take the mode of all predictions for this pixel to get the final prediction of 4?
Steps to reproduce the behavior:
for tile, (x, y, tile_width, tile_height) in zip(batch, crop_coords):
    self.image[:, y : y + tile_height, x : x + tile_width] += tile * self.weight
    self.norm_mask[:, y : y + tile_height, x : x + tile_width] += self.weight

def merge(self) -> torch.Tensor:
    return self.image / self.norm_mask
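The problem described here appears when hard class ids are accumulated. The quoted snippet indexes a leading channel axis, so one way around it, sketched below in NumPy under the assumption that per-class scores (for example softmax outputs of shape C x H x W) are fed to the merger instead of label maps, is to average per class and take the argmax only at the end. Class ids 0..4 are used purely for illustration.

```python
import numpy as np

num_classes = 5                       # class ids 0..4, an assumption for this example
tile_votes = np.array([1, 4, 4, 4])   # hard labels predicted for one pixel by 4 overlapping tiles

# Averaging class ids treats them as magnitudes, which they are not
mean_label = float(tile_votes.mean())  # 3.25, which would be read as class 3

# Accumulate per-class scores (one-hot here, softmax probabilities in practice), then argmax
scores = np.eye(num_classes)[tile_votes].sum(axis=0)  # votes per class: 0, 1, 0, 0, 3
merged_label = int(scores.argmax())                   # 4

print(mean_label, merged_label)
```

With one-hot scores this reduces to the mode suggested above; with probabilities it additionally takes each tile's confidence into account.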
https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tta.py#L135
In many cases averaging logits works worse than averaging probabilities => would be nice to be able to pass user-defined activation function. For example softmax or sigmoid.
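A short NumPy sketch of the difference, using made-up logits for a single pixel seen through three augmented views. The two orders of operations are not equivalent because the sigmoid is non-linear.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical logits for one pixel from three augmented views
logits = np.array([4.0, -1.0, -1.0])

avg_logits_then_act = float(sigmoid(logits.mean()))  # about 0.661
act_then_avg = float(sigmoid(logits).mean())         # about 0.507

print(avg_logits_then_act, act_then_avg)
```

One very confident view pulls the averaged logit much further than it pulls the averaged probability, which is one reason an optional user-supplied activation applied before averaging could be useful.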
https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L33 can be deleted.
https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L28
https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L29
are never updated and stay zero?
P.S. Numpy is very slow. Replacing sqrt and square speeds things up a lot.
Could you point me to a reference blog post? As far as I can tell, this repo does not currently support TTA for object detection.