Do mobilevitv2's pre-trained weights expect input image tensors in the range 0 to 255?

apple commented on May 14, 2024

If I use mobilevitv2's pretrained weights for transfer learning, does the custom dataset need to be normalized to the range 0 to 1, i.e. divided by 255.0? And should the input image be in BGR or RGB format?

from ml-cvnets.

Comments (4)

chenying99 commented on May 14, 2024

I found the _training_transforms method in some of the data loading classes, for example in coco_segmentation.py. The code snippet is as follows:

def _training_transforms(self, size: tuple, ignore_idx: Optional[int] = 255):
        aug_list = [
            T.RandomResize(opts=self.opts),
            T.RandomCrop(opts=self.opts, size=size),
            T.RandomHorizontalFlip(opts=self.opts),
            T.ToTensor(opts=self.opts),
        ]

        return T.Compose(opts=self.opts, img_transforms=aug_list)

The last transform in that list, T.ToTensor(opts=self.opts), is the ToTensor(BaseTransformation) class in image_pil.py. Its __call__ method contains the following line:

data["image"] = img.to(dtype=self.img_dtype).div(255.0)

So the input image should be normalized to between 0 and 1?
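For readers following along, the effect of that division can be sketched without any dependencies. This is only an illustration under stated assumptions: the image is a plain nested list of 8-bit values, and `scale_to_unit_range` is a hypothetical helper name, not something from ml-cvnets.

```python
def scale_to_unit_range(pixels):
    """Map 8-bit pixel values (0..255) to floats in [0.0, 1.0].

    `pixels` is a nested list shaped [height][width][channels].
    Hypothetical helper for illustration only; not part of ml-cvnets.
    """
    return [[[value / 255.0 for value in pixel] for pixel in row] for row in pixels]


# A tiny 1x2 image with three channels per pixel.
image = [[[0, 128, 255], [255, 255, 0]]]
scaled = scale_to_unit_range(image)
```

After this step every value lies between 0.0 and 1.0, which mirrors what the quoted `.div(255.0)` call does to the tensor.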

chenying99 commented on May 14, 2024

At the same time, the __getitem__ method in coco_segmentation.py contains the following code:

rgb_img = self.read_image_opencv(os.path.join(self.img_dir, path))

The code of the read_image_opencv method in dataset_base.py is as follows:

    @staticmethod
    def read_image_opencv(path: str):
        return cv2.imread(
            path, cv2.IMREAD_COLOR
        )  # Image is read in BGR Format and not RGB format

So here the image is read in BGR format, not RGB.

But in the __getitem__ method of other data loading classes, such as the one in pascal_voc.py, the call is:

img = self.read_image_pil(self.images[img_index])

The code snippet of the read_image_pil method of dataset_base.py is as follows:

    def read_image_pil(self, path: str, *args, **kwargs):
        def convert_to_rgb(inp_data: Union[str, io.BytesIO]):
            try:
                rgb_img = Image.open(inp_data).convert("RGB")
            except:
                rgb_img = None
            return rgb_img

        if self.cached_data is not None:
            # code for caching data on RAM
            used_memory = float(psutil.virtual_memory().percent)

            if path in self.cached_data:
                img_byte = self.cached_data[path]

            elif (path not in self.cached_data) and (used_memory <= self.cache_limit):
                # image is not present in cache and RAM usage is less than the threshold, add to cache
                with open(path, "rb") as bin_file:
                    bin_file_data = bin_file.read()
                    img_byte = io.BytesIO(bin_file_data)
                    self.cached_data[path] = img_byte
            else:
                with open(path, "rb") as bin_file:
                    bin_file_data = bin_file.read()
                    img_byte = io.BytesIO(bin_file_data)  # in-memory data
            img = convert_to_rgb(img_byte)
        else:
            img = convert_to_rgb(path)
        return img

Here the image is read in RGB format.

My question is: should the input image be in BGR format or RGB format?

sacmehta commented on May 14, 2024

The initial version of this library used OpenCV, which reads images in BGR format. With v0.2, we migrated to PIL and now use only RGB format. OpenCV data loading is kept only for backward compatibility, so we recommend using RGB format.

We normalize images to the range 0 to 1, and recommend that you do the same.

Hope this helps.
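Putting the two recommendations together, a custom dataset's preprocessing might look roughly like the sketch below. It is dependency-free and works on nested lists purely for illustration. The function names and the `channel_order` parameter are hypothetical and are not taken from ml-cvnets.

```python
def bgr_to_rgb(pixels):
    """Reverse the channel order of every pixel: [B, G, R] -> [R, G, B]."""
    return [[pixel[::-1] for pixel in row] for row in pixels]


def prepare_for_pretrained_model(pixels, channel_order="RGB"):
    """Return RGB pixels scaled to [0.0, 1.0].

    Hypothetical helper: `pixels` is a nested list shaped
    [height][width][3] holding 8-bit values (0..255).
    """
    if channel_order == "BGR":
        pixels = bgr_to_rgb(pixels)
    elif channel_order != "RGB":
        raise ValueError(f"unsupported channel order: {channel_order!r}")
    return [[[value / 255.0 for value in pixel] for pixel in row] for row in pixels]


# A pure-red pixel as an OpenCV-style loader would return it (B=0, G=0, R=255).
bgr_image = [[[0, 0, 255]]]
prepared = prepare_for_pretrained_model(bgr_image, channel_order="BGR")
```

With a NumPy array returned by `cv2.imread`, the same channel reversal is commonly written as `img[..., ::-1]` or `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`. Images opened with PIL and converted with `.convert("RGB")` are already in RGB order and only need the scaling step.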

chenying99 commented on May 14, 2024

Thank you for your reply
