1. School of Mathematics and Information Science, Hebei University, Baoding 071002, Hebei, China
2. Hebei Key Laboratory of Machine Learning and Computational Intelligence (Hebei University), Baoding 071002, Hebei, China
KONG Lingquan, male, research interest: deep learning. E-mail: 15530098103@163.com
ZHAI Junhai, male, professor and doctoral supervisor, research interests: deep learning, few-shot learning, and imbalanced learning. E-mail: mczjh@126.com
Print publication date: 2024-08-25
Received: 2023-10-20
KONG Lingquan, ZHAI Junhai. A long-tailed recognition method based on balanced contrastive learning strategy[J]. Journal of Northwest University (Natural Science Edition), 2024, 54(4): 677-688. DOI: 10.16152/j.cnki.xdxbzr.2024-04-010.
Long-tailed recognition is one of the most challenging problems in computer vision. It has a wide range of real-world applications, so studying it is of great significance. For long-tailed data, the imbalance in sample size between classes and the lack of sufficient training samples for the numerous tail classes make it difficult to learn clear boundaries between classes during training. To address this issue, we combine meta pre-training with supervised contrastive learning and propose MBCP-BB (meta balanced contrastive pre-training and batch balance), a long-tailed recognition method based on a balanced contrastive learning strategy. MBCP-BB adopts a decoupled learning scheme for model training: a feature extractor with strong representation ability is obtained through pre-training, and in the fine-tuning stage the feature extractor is frozen and the classifier is retrained. The method emphasizes the importance of feature learning and designs a balanced contrastive learning strategy to guide the feature learning process, so that supervised contrastive learning can be applied effectively to long-tailed recognition. During feature learning, the head-class samples are first reduced appropriately and new samples are generated for the tail classes with a few-shot image generation technique; the class prototype of each class is then used as a supplementary sample for training. Under the decoupled training mode, the potential of both the feature extractor and the classifier is fully exploited, which strengthens the model's feature learning ability while greatly simplifying classifier training. Extensive experiments on several long-tailed benchmark datasets, comparing the proposed method with seven representative algorithms from multiple perspectives, show that it outperforms the compared algorithms.
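To make the batch-balance idea concrete, the following is a minimal sketch in PyTorch (not the authors' released code) of how a class-balanced mini-batch and a prototype-augmented supervised contrastive loss could be assembled. The helper names balanced_indices and proto_supcon_loss and the parameter n_per_class are illustrative assumptions, and resampling with replacement merely stands in for the few-shot image generation step described above.

```python
# Sketch of the batch-balance + prototype-supplemented supervised contrastive idea.
# Assumptions: balanced_indices, proto_supcon_loss, n_per_class are illustrative names;
# oversampling with replacement stands in for few-shot image generation of tail samples.
import torch
import torch.nn.functional as F


def balanced_indices(labels: torch.Tensor, n_per_class: int) -> torch.Tensor:
    """Undersample head classes and oversample tail classes to n_per_class each."""
    picks = []
    for c in labels.unique():
        idx = (labels == c).nonzero(as_tuple=True)[0]
        if idx.numel() < n_per_class:
            # tail class: sample with replacement (stand-in for generated samples)
            choice = torch.randint(idx.numel(), (n_per_class,))
        else:
            # head class: random subset
            choice = torch.randperm(idx.numel())[:n_per_class]
        picks.append(idx[choice])
    return torch.cat(picks)


def proto_supcon_loss(feats: torch.Tensor, labels: torch.Tensor,
                      temperature: float = 0.1) -> torch.Tensor:
    """Supervised contrastive loss with class prototypes appended as extra samples."""
    feats = F.normalize(feats, dim=1)
    classes = labels.unique()
    # class prototype = mean feature of each class, used as a supplementary sample
    protos = F.normalize(torch.stack([feats[labels == c].mean(0) for c in classes]), dim=1)
    all_feats = torch.cat([feats, protos])
    all_labels = torch.cat([labels, classes])

    sim = all_feats @ all_feats.t() / temperature
    n = all_feats.size(0)
    self_mask = torch.eye(n, dtype=torch.bool)
    pos_mask = (all_labels.unsqueeze(0) == all_labels.unsqueeze(1)) & ~self_mask

    # log-probability of each pair, excluding self-similarity from the denominator
    log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, float('-inf')),
                                     dim=1, keepdim=True)
    # average over positives, then over anchors
    return -((log_prob * pos_mask.float()).sum(1)
             / pos_mask.sum(1).clamp(min=1)).mean()


# Toy usage: 128-d features for an imbalanced mini-batch (40 / 10 / 3 samples per class)
labels = torch.tensor([0] * 40 + [1] * 10 + [2] * 3)
feats = torch.randn(len(labels), 128)
idx = balanced_indices(labels, n_per_class=8)
loss = proto_supcon_loss(feats[idx], labels[idx])
print(loss.item())
```

In the second, decoupled stage described in the abstract, the feature extractor trained with such a loss would be frozen and only the classifier retrained on class-balanced batches.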
long-tailed recognition; meta-learning; pre-training; supervised contrastive learning; batch balance training