Study on a Facial Paralysis Recognition Method Integrating SOLOv2 and Vision Transformer

庄哲笼; 丁有伟; 胡孔法; 陈科宏; 陈功

doi:10.14148/j.issn.1672-0482.2025.1399

Study on Facial Paralysis Recognition Method with SOLOv2-Vision Transformer

Abstract:
OBJECTIVE To establish an accurate and timely intelligent auxiliary diagnosis method for facial paralysis, so that patients and doctors can identify the condition sooner and achieve early detection, early diagnosis, and early treatment.
METHODS A method integrating SOLOv2 and Vision Transformer is proposed. The collected facial paralysis images are first segmented by a SOLOv2 model whose backbone network has been replaced, which removes interfering regions from each image. The segmented images are then fed into a Vision Transformer model for classification training. This segment-then-classify pipeline is intended to improve the classification of facial paralysis images.
RESULTS On the MEEI facial paralysis dataset, the proposed method achieved an accuracy of 0.982, a recall of 0.982, and an F1-score of 0.981, which are increases of 2%, 4%, and 4%, respectively, over the baseline model.
CONCLUSION The facial paralysis classification model integrating SOLOv2 and Vision Transformer achieves higher recognition accuracy than classification without prior segmentation, and offers a new method for the diagnosis of facial paralysis.
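
The segment-then-classify idea described under METHODS can be illustrated with a minimal sketch. This is not the authors' implementation: the names `apply_mask`, `segment_then_classify`, `toy_segmenter`, and `toy_classifier` are hypothetical, images are plain nested lists of grayscale values, and the two toy functions merely stand in for the SOLOv2 segmenter and the Vision Transformer classifier, whose actual configuration is not given in the abstract.

```python
from typing import Callable, List

Image = List[List[int]]   # grayscale image as rows of pixel values (assumption)
Mask = List[List[bool]]   # True marks pixels that belong to the region to keep


def apply_mask(image: Image, mask: Mask, fill: int = 0) -> Image:
    """Keep pixels where the mask is True and replace all others with `fill`."""
    if len(image) != len(mask) or any(
        len(row) != len(mask_row) for row, mask_row in zip(image, mask)
    ):
        raise ValueError("image and mask must have the same shape")
    return [
        [px if keep else fill for px, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


def segment_then_classify(
    image: Image,
    segmenter: Callable[[Image], Mask],
    classifier: Callable[[Image], str],
) -> str:
    """Run the segmenter first, blank out interfering pixels, then classify."""
    mask = segmenter(image)
    cleaned = apply_mask(image, mask)
    return classifier(cleaned)


# Toy stand-ins, purely illustrative. A real pipeline would call a trained
# segmentation model and a trained classification model here.
def toy_segmenter(image: Image) -> Mask:
    # Treat bright pixels as the region of interest.
    return [[px > 50 for px in row] for row in image]


def toy_classifier(image: Image) -> str:
    # Compare the summed intensity of the leftmost and rightmost columns.
    left = sum(row[0] for row in image)
    right = sum(row[-1] for row in image)
    return "symmetric" if left == right else "asymmetric"


image = [
    [200, 10, 200],
    [180, 20, 40],
]
label = segment_then_classify(image, toy_segmenter, toy_classifier)
```

Passing the segmenter and classifier in as callables keeps the masking step independent of any particular model, so either stage could be swapped without touching the other.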
