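1. National Museum of China, Beijing 100006
2. Key Laboratory of Collection Resource Revitalization Technology, Ministry of Culture and Tourism, Beijing 100006
YANG Fuyong, male, postdoctoral researcher, engineer. Research interests: cultural relic big data, artificial intelligence, machine learning, and image processing. E-mail: yfy317@163.com.
LI Huabiao, male, professor-level senior engineer. Research interests: image processing, cultural relic big data, and smart museum construction. E-mail: lihuabiao@chnmuseum.cn.
Print publication date: 2025-02-25
Received: 2024-10-15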
YANG Fuyong, LI Huabiao, MENG Ruiwei. Large-scale dataset generation technology for oracle object detection[J]. Journal of Northwest University (Natural Science Edition), 2025, 55(1): 36-49. DOI: 10.16152/j.cnki.xdxbzr.2025-01-003.
Oracle bone inscription object detection is an important part of oracle bone inscription digitization research. It relies mainly on deep learning models to locate and classify the characters in oracle bone inscription images. To avoid overfitting, training such models generally requires large-scale datasets, yet few large-scale datasets suitable for deep learning are currently available in this field. Many of the datasets used in existing studies depend on manual annotation and curation by experts, so oracle bone inscription object detection datasets suffer from high curation cost, small data volume, low data quality, and poor balance between categories. This study proposes a dynamic two-stage Mosaic algorithm and a large-scale dataset generation technique for oracle bone inscriptions. Together they address the shortcomings of the traditional Mosaic algorithm when it is applied to oracle bone images: a limited number of mosaic images, insufficient image diversity and difference, large blank backgrounds, and missing information. A complete dataset generation pipeline is designed that automates the path from single-character oracle bone inscription images to a large-scale dataset, fundamentally addressing the data shortage in oracle bone inscription object detection. Using this method, a large-scale oracle bone inscription dataset annotated with position and category information was generated: 570 000 images and 570 000 corresponding annotation files covering 416 character categories, with the smallest category containing 516 characters. The dataset size and the number of samples per category can be adjusted dynamically to avoid imbalance between categories. A YOLOv8 model was trained on the generated dataset. After 200 batches of training, the model precision reached 96.45%, mAP50 was 97.75%, and mAP50-95 was 96.96%. The training curves show good stability and efficiency, and the results indicate that the proposed dataset generation technique can be applied to oracle bone inscription object detection.
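The abstract does not give the details of the dynamic two-stage Mosaic algorithm, so the following is only a minimal, generic sketch of the underlying idea: compose several single-character glyphs onto one canvas, emit YOLO-format labels (class, x_center, y_center, width, height, all normalised), and pick classes so that per-class counts stay balanced. The function names, the 2x2 grid, the 640-pixel canvas, and the 60% to 90% glyph scale are all assumptions made for illustration, not the authors' settings. Only the layout and labels are computed; pasting actual pixel data is omitted.

```python
import random
from collections import Counter


def pick_balanced_classes(counts, num_classes, k, rng):
    """Pick k class ids, favouring the classes with the fewest samples so far."""
    ids = list(range(num_classes))
    rng.shuffle(ids)                    # break ties between equally rare classes randomly
    ids.sort(key=lambda c: counts[c])   # stable sort keeps the shuffled order among ties
    return ids[:k]


def compose_layout(class_ids, canvas=640, grid=2, rng=None):
    """Place one glyph per grid cell and return YOLO-format label rows.

    Each row is (class_id, x_center, y_center, width, height), normalised to [0, 1].
    """
    rng = rng or random.Random()
    cell = canvas / grid
    labels = []
    for idx, cls in enumerate(class_ids[: grid * grid]):
        row, col = divmod(idx, grid)
        # Glyph fills 60% to 90% of its cell (assumed range) to limit blank background.
        w = cell * rng.uniform(0.6, 0.9)
        h = cell * rng.uniform(0.6, 0.9)
        # Random offset that keeps the glyph fully inside its cell.
        x0 = col * cell + rng.uniform(0, cell - w)
        y0 = row * cell + rng.uniform(0, cell - h)
        labels.append(
            (cls, (x0 + w / 2) / canvas, (y0 + h / 2) / canvas, w / canvas, h / canvas)
        )
    return labels


def generate(num_images, num_classes, seed=0):
    """Generate label sets for num_images synthetic images with balanced classes."""
    rng = random.Random(seed)
    counts = Counter()
    dataset = []
    for _ in range(num_images):
        chosen = pick_balanced_classes(counts, num_classes, 4, rng)
        labels = compose_layout(chosen, rng=rng)
        counts.update(c for c, *_ in labels)
        dataset.append(labels)
    return dataset, counts
```

Calling `generate(104, 416)` in this sketch yields 104 label sets of four boxes each, with every one of the 416 classes used once. Raising `num_images` scales the dataset while the least-used-first selection keeps the per-class counts within one of each other.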
Keywords: oracle bone inscriptions; deep learning; object detection; dataset; YOLOv8 algorithm