
If I use MobileViTv2's pretrained weights for transfer learning, does the custom dataset need to be normalized to [0, 1] (i.e. divided by 255.0), and should the input image be in BGR or RGB format? (ml-cvnets, closed)

apple commented on May 14, 2024
If I use MobileViTv2's pretrained weights for transfer learning, does the custom dataset need to be normalized to the range [0, 1], i.e. divided by 255.0? And should the input image be in BGR or RGB format?

from ml-cvnets.

Comments (4)

chenying99 commented on May 14, 2024

I found a _training_transforms method in some of the data loading classes, for example in coco_segmentation.py. The code snippet is as follows:

def _training_transforms(self, size: tuple, ignore_idx: Optional[int] = 255):
    aug_list = [
        T.RandomResize(opts=self.opts),
        T.RandomCrop(opts=self.opts, size=size),
        T.RandomHorizontalFlip(opts=self.opts),
        T.ToTensor(opts=self.opts),
    ]

    return T.Compose(opts=self.opts, img_transforms=aug_list)

The T.ToTensor(opts=self.opts) entry refers to the ToTensor(BaseTransformation) class in image_pil.py. Its __call__ method contains the following line:

data["image"] = img.to(dtype=self.img_dtype).div(255.0)

So should the input image be normalized to the range [0, 1]?
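
For reference, the effect of the .div(255.0) step quoted above can be sketched with NumPy standing in for the tensor. This is only an illustration: to_unit_range is a hypothetical helper name, not part of ml-cvnets.

```python
import numpy as np


def to_unit_range(img_uint8: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values (0..255) to float32 values in [0, 1].

    Hypothetical helper that mirrors the divide-by-255 step; it is not
    taken from ml-cvnets.
    """
    return img_uint8.astype(np.float32) / 255.0


# Three sample pixel values covering the ends and middle of the 8-bit range.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = to_unit_range(pixels)
print(scaled.min(), scaled.max())
```

The smallest and largest 8-bit values map to the two ends of the unit interval, which is the range the question asks about.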


chenying99 commented on May 14, 2024

Also, the __getitem__ method in coco_segmentation.py contains the following line:

rgb_img = self.read_image_opencv(os.path.join(self.img_dir, path))

The code of the read_image_opencv method in dataset_base.py is as follows:

    @staticmethod
    def read_image_opencv(path: str):
        return cv2.imread(
            path, cv2.IMREAD_COLOR
        )  # Image is read in BGR Format and not RGB format

So here the image is read in BGR format, not RGB.

But the __getitem__ method of other data loading classes, such as the one in pascal_voc.py, makes this call instead:

img = self.read_image_pil(self.images[img_index])

The code snippet of the read_image_pil method of dataset_base.py is as follows:

    def read_image_pil(self, path: str, *args, **kwargs):
        def convert_to_rgb(inp_data: Union[str, io.BytesIO]):
            try:
                rgb_img = Image.open(inp_data).convert("RGB")
            except:
                rgb_img = None
            return rgb_img

        if self.cached_data is not None:
            # code for caching data on RAM
            used_memory = float(psutil.virtual_memory().percent)

            if path in self.cached_data:
                img_byte = self.cached_data[path]

            elif (path not in self.cached_data) and (used_memory <= self.cache_limit):
                # image is not present in cache and RAM usage is less than the threshold, add to cache
                with open(path, "rb") as bin_file:
                    bin_file_data = bin_file.read()
                    img_byte = io.BytesIO(bin_file_data)
                    self.cached_data[path] = img_byte
            else:
                with open(path, "rb") as bin_file:
                    bin_file_data = bin_file.read()
                    img_byte = io.BytesIO(bin_file_data)  # in-memory data
            img = convert_to_rgb(img_byte)
        else:
            img = convert_to_rgb(path)
        return img

Here the image is read in RGB format.

My question is: should the input image be in BGR or RGB format?
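
To make the difference between the two loaders concrete: the same pixel data differs only in the order of the last axis. A minimal sketch, assuming the image is an H x W x 3 NumPy array as OpenCV returns it; bgr_to_rgb is a hypothetical helper name, not an ml-cvnets function.

```python
import numpy as np


def bgr_to_rgb(img_bgr: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an H x W x 3 image (BGR -> RGB).

    Hypothetical helper for illustration only.
    """
    return img_bgr[..., ::-1].copy()


# A 1x1 image holding a pure-blue pixel, stored in BGR order.
blue_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
blue_rgb = bgr_to_rgb(blue_bgr)
print(blue_rgb[0, 0].tolist())
```

Feeding a BGR array to a model that was trained on RGB input swaps the red and blue channels, which is why the channel order matters for pretrained weights.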

sacmehta commented on May 14, 2024

The initial version of this library used OpenCV, which reads images in BGR format. With v0.2, we migrated to PIL and now use only RGB format. OpenCV data loading is kept only for backward compatibility, so we recommend using RGB format.

We normalize images to the range [0, 1] and recommend that you do the same.

Hope this helps.
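
Putting the two recommendations together for a custom dataset, one possible sketch is shown below. It assumes Pillow and NumPy are available; load_rgb_unit_range is a hypothetical helper, and the channels-first output layout is an assumption about what a PyTorch model expects rather than something stated in this thread.

```python
import io

import numpy as np
from PIL import Image


def load_rgb_unit_range(source) -> np.ndarray:
    """Load an image as RGB and return a float32 C x H x W array in [0, 1].

    `source` may be a file path or a file-like object. Hypothetical helper,
    not part of ml-cvnets.
    """
    img = Image.open(source).convert("RGB")          # force RGB channel order
    arr = np.asarray(img, dtype=np.float32) / 255.0  # H x W x C, values in [0, 1]
    return arr.transpose(2, 0, 1)                    # C x H x W (assumed layout)


# Build a tiny solid-red PNG in memory so the sketch needs no file on disk.
buf = io.BytesIO()
Image.new("RGB", (4, 2), color=(255, 0, 0)).save(buf, format="PNG")
buf.seek(0)

chw = load_rgb_unit_range(buf)
print(chw.shape)
```

In practice the function would be called with the path of each training image; the in-memory PNG is only there to keep the example self-contained.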

chenying99 commented on May 14, 2024

Thank you for your reply.
