
🌎 [CVPR2023] Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions (GLMC)

by Fei Du, Peng Yang, Qi Jia, Fengtao Nan, Xiaoting Chen, Yun Yang

This is the official implementation of Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions

🎬 Video | 💻 Slide | 🔥 Poster

Update 2023/5/23

Thank you very much to @CxC-ssjg for this question. In our code for the Cifar10Imbalance and Cifar100Imbalance classes, when generating the imbalanced data we used np.random.choice to sample images, but we did not set the method's "replace" parameter to False, so the same sample could be drawn multiple times, reducing the diversity of the dataset. Following @CxC-ssjg's advice, we set replace to False and fine-tuned our model accordingly, and we observed a significant improvement over the results reported in the paper. We have updated the results below and made the models publicly available. Once again, thank you, @CxC-ssjg, for your valuable question.
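
For reference, the corrected per-class sampling step can be sketched as follows; the function and variable names (subsample_class_indices, targets, num_samples) are illustrative and not the repository's exact identifiers.

```python
import numpy as np

# Illustrative sketch of the fix described above: keep a subset of one class
# when building the imbalanced split, drawing each index at most once.
def subsample_class_indices(targets, class_id, num_samples):
    class_indices = np.where(np.asarray(targets) == class_id)[0]
    # replace=False guarantees every image is selected at most once; the
    # original code omitted this argument, so np.random.choice sampled with
    # replacement and could pick the same image several times.
    return np.random.choice(class_indices, num_samples, replace=False)
```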

| Dataset | IF (imbalance factor) | GLMC (paper) | GLMC (updated) | GLMC (updated) + MaxNorm |
| --- | --- | --- | --- | --- |
| CIFAR-100-LT | 100 | 55.88% | 57.97% | 58.41% |
| CIFAR-100-LT | 50 | 61.08% | 63.78% | 64.57% |
| CIFAR-100-LT | 10 | 70.74% | 73.40% | 74.28% |
| CIFAR-10-LT | 100 | 87.75% | 88.50% | 89.58% |
| CIFAR-10-LT | 50 | 90.18% | 91.04% | 92.04% |
| CIFAR-10-LT | 10 | 94.04% | 94.87% | 95.00% |

Update 2023/5/15

We apologize for an oversight in our paper: the CIFAR-10 results were uploaded incorrectly. We have updated our GitHub repository and report the final CIFAR-10-LT results below. Compared with the latest state-of-the-art method, BCL [1], our results are still 3% higher. We have also uploaded the latest version of the paper to arXiv; you can find it at the following link: Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions

The experimental setup was as follows:

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 4

CIFAR-10-LT

| Method | IF | Model | Top-1 Acc (%) |
| --- | --- | --- | --- |
| GLMC | 100 | ResNet-32 | 87.75 |
| GLMC | 50 | ResNet-32 | 90.18 |
| GLMC | 10 | ResNet-32 | 94.04 |
| GLMC + MaxNorm | 100 | ResNet-32 | 87.57 |
| GLMC + MaxNorm | 50 | ResNet-32 | 90.22 |
| GLMC + MaxNorm | 10 | ResNet-32 | 94.03 |

[1] Jianggang Zhu, Zheng Wang, Jingjing Chen, Yi-Ping Phoebe Chen, and Yu-Gang Jiang. Balanced contrastive learning for long-tailed visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6908–6917, 2022.

💥 Meanwhile, we supplemented the experiments with iNaturalist 2018 and achieved state-of-the-art results.

| Method | Model | Many | Med | Few | All | Pretrained model |
| --- | --- | --- | --- | --- | --- | --- |
| GLMC | ResNeXt-50 | 64.60 | 73.16 | 73.01 | 72.21 | Download |

Overview

An overview of our GLMC: two types of mixed-label augmented images are processed by an encoder network and a projection head to obtain the representations $h_g$ and $h_l$. A prediction head then transforms the two representations into the outputs $u_g$ and $u_l$. We minimize their negative cosine similarity as an auxiliary loss alongside the supervised loss. $sg(*)$ denotes the stop-gradient operation.

[Figure: GLMC framework overview]
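
As a concrete reading of the caption above, here is a minimal PyTorch-style sketch of the auxiliary consistency term, i.e. a symmetrized negative cosine similarity with stop-gradient. The function names are ours, and the SimSiam-style pairing of the two branches is our reading of the figure rather than code taken from this repository.

```python
import torch.nn.functional as F

def neg_cosine(p, z):
    # Negative cosine similarity; z.detach() plays the role of the sg(*)
    # stop-gradient operation in the figure, so gradients flow only through p.
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

def mixture_consistency_loss(u_g, u_l, h_g, h_l):
    # Symmetrized loss between the global (MixUp) and local (CutMix) branches:
    # each branch's prediction u is pulled toward the other branch's
    # projected representation h.
    return 0.5 * neg_cosine(u_g, h_l) + 0.5 * neg_cosine(u_l, h_g)
```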

We propose an efficient one-stage training strategy for long-tailed visual recognition called Global and Local Mixture Consistency cumulative learning (GLMC). Our core ideas are twofold: (1) a global and local mixture consistency loss improves the robustness of the feature extractor. Specifically, we generate two augmented batches from the same batch with global MixUp and local CutMix, respectively, and then minimize their difference via cosine similarity. (2) A cumulative head-tail soft-label reweighted loss mitigates the head-class bias problem. We use empirical class frequencies to reweight the mixed labels of head and tail classes on long-tailed data, and then balance the conventional loss and the rebalanced loss with a coefficient accumulated over epochs.
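
To make the cumulative reweighting concrete, below is a hedged sketch of how the class weights and the epoch-accumulated balancing coefficient could look. The inverse-frequency exponent and the parabolic schedule are assumptions for illustration (they mirror the --label_weighting flag and common cumulative-learning schedules), not necessarily the exact formulas used in this repository.

```python
import numpy as np

def class_reweighting(class_counts, label_weighting=1.2):
    # Down-weight head classes and up-weight tail classes from empirical
    # frequencies; the exponent stands in for --label_weighting
    # (assumed form, for illustration only).
    freq = np.asarray(class_counts, dtype=np.float64)
    weights = (1.0 / freq) ** label_weighting
    return weights / weights.sum() * len(freq)  # normalize to mean 1

def cumulative_alpha(epoch, total_epochs):
    # Coefficient accumulated over epochs: training starts dominated by the
    # conventional loss and gradually shifts toward the rebalanced loss.
    # The parabolic decay is an assumed schedule, not taken from the repo.
    return 1.0 - (epoch / total_epochs) ** 2

# total_loss = alpha * conventional_loss + (1 - alpha) * rebalanced_loss
```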

Getting Started

Requirements

All code is written in Python 3.9 with

  • PyTorch = 1.10.0

  • torchvision = 0.11.1

  • numpy = 1.22.0

Preparing Datasets

Download the datasets CIFAR-10, CIFAR-100, ImageNet, and iNaturalist18 to GLMC-2023/data. The directory should look like

GLMC-2023/data
├── CIFAR-100-python
├── CIFAR-10-batches-py
├── ImageNet
│   ├── train
│   └── val
├── train_val2018
└── data_txt
    ├── ImageNet_LT_val.txt
    ├── ImageNet_LT_train.txt
    ├── iNaturalist18_train.txt
    └── iNaturalist18_val.txt

Training

for CIFAR-10-LT

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 1

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.02 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 1

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.1 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.2 --label_weighting 1  --contrast_weight 2

for CIFAR-100-LT

python main.py --dataset cifar100 -a resnet32 --num_classes 100 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2  --contrast_weight 4

python main.py --dataset cifar100 -a resnet32 --num_classes 100 --imbanlance_rate 0.02 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.2  --label_weighting 1.2  --contrast_weight 6

python main.py --dataset cifar100 -a resnet32 --num_classes 100 --imbanlance_rate 0.1 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.2  --label_weighting 1.2  --contrast_weight 4

for ImageNet-LT

python main.py --dataset ImageNet-LT -a resnext50_32x4d --num_classes 1000 --beta 0.5 --lr 0.1 --epochs 135 -b 120 --momentum 0.9 --weight_decay 2e-4 --resample_weighting 0.2 --label_weighting 1.0 --contrast_weight 10

for iNaturalist 2018

python main.py --dataset iNaturelist2018 -a resnext50_32x4d --num_classes 8142 --beta 0.5 --lr 0.1 --epochs 120 -b 128 --momentum 0.9 --weight_decay 1e-4 --resample_weighting 0.2 --label_weighting 1.0 --contrast_weight 10

Testing

python test.py --dataset ImageNet-LT -a resnext50_32x4d --num_classes 1000 --resume model_path

Results and Pretrained Models

CIFAR-10-LT

| Method | IF | Model | Top-1 Acc (%) |
| --- | --- | --- | --- |
| GLMC | 100 | ResNet-32 | 87.75 |
| GLMC | 50 | ResNet-32 | 90.18 |
| GLMC | 10 | ResNet-32 | 94.04 |
| GLMC + MaxNorm | 100 | ResNet-32 | 87.57 |
| GLMC + MaxNorm | 50 | ResNet-32 | 90.22 |
| GLMC + MaxNorm | 10 | ResNet-32 | 94.03 |

CIFAR-100-LT

| Method | IF | Model | Top-1 Acc (%) |
| --- | --- | --- | --- |
| GLMC | 100 | ResNet-32 | 55.88 |
| GLMC | 50 | ResNet-32 | 61.08 |
| GLMC | 10 | ResNet-32 | 70.74 |
| GLMC + MaxNorm | 100 | ResNet-32 | 57.11 |
| GLMC + MaxNorm | 50 | ResNet-32 | 62.32 |
| GLMC + MaxNorm | 10 | ResNet-32 | 72.33 |

ImageNet-LT

| Method | Model | Many | Med | Few | All | Pretrained model |
| --- | --- | --- | --- | --- | --- | --- |
| GLMC | ResNeXt-50 | 70.1 | 52.4 | 30.4 | 56.3 | Download |
| GLMC + BS | ResNeXt-50 | 64.76 | 55.67 | 42.19 | 57.21 | Download |

iNaturalist 2018

| Method | Model | Many | Med | Few | All | Pretrained model |
| --- | --- | --- | --- | --- | --- | --- |
| GLMC | ResNeXt-50 | 64.60 | 73.16 | 73.01 | 72.21 | Download |

Citation

If you find this code useful for your research, please consider citing our paper

@inproceedings{du2023global,
  title={Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions},
  author={Fei Du and Peng Yang and Qi Jia and Fengtao Nan and Xiaoting Chen and Yun Yang},
  booktitle={Conference on Computer Vision and Pattern Recognition 2023},
  year={2023},
  url={https://arxiv.org/abs/2305.08661}
}
