2026 Vol. 14, Issue 1

Research Article

31 March 2026. pp. 87-98
Abstract
The videofluoroscopic swallowing study (VFSS) is the standard diagnostic tool for evaluating dysphagia, but its interpretation is slow and varies between raters because it depends on subjective judgment. To address these limitations, recent domestic and international studies have applied artificial intelligence (AI) to automate the analysis and make the interpretation of VFSS images quantitative and standardized. In this study, we propose a hierarchical deep learning model that automatically classifies penetration and aspiration, the primary symptoms observed in VFSS images. From the VFSS data, we selected high-quality videos with adequate image quality and visibility for training and evaluation. The proposed model is a sequential pipeline of three stages: object detection (YOLOv8), region segmentation (U-Net), and image classification (ResNet18). Each stage was trained and evaluated independently, and these stage-wise results were used to verify the performance and practical effectiveness of the full pipeline. Our results demonstrate the potential of combining stage-wise deep learning modules as a diagnostic support tool for dysphagia. We expect that refining the object detection module toward a fully automated diagnostic system would substantially improve diagnostic efficiency and accuracy in clinical settings.
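The three-stage structure described in the abstract (detect a region, segment it, classify the result) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the data passed between stages, and the labels "positive", "negative", and "no_roi" are all hypothetical, and the three stub functions are simple thresholding stand-ins for YOLOv8, U-Net, and ResNet18 so that the example runs without any deep learning library.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]        # grayscale frame: rows of intensities in [0, 1]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


@dataclass
class PipelineResult:
    box: Optional[Box]     # region found by the detection stage, if any
    mask: Optional[Image]  # binary mask from the segmentation stage
    label: str             # output of the classification stage


def crop(frame: Image, box: Box) -> Image:
    """Cut the detected region out of the frame."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


def run_pipeline(
    frame: Image,
    detect: Callable[[Image], Optional[Box]],
    segment: Callable[[Image], Image],
    classify: Callable[[Image, Image], str],
) -> PipelineResult:
    """Run the stages in order; each stage consumes the previous stage's output."""
    box = detect(frame)
    if box is None:
        # Without a detected region, the later stages have nothing to work on.
        return PipelineResult(None, None, "no_roi")
    roi = crop(frame, box)
    mask = segment(roi)
    return PipelineResult(box, mask, classify(roi, mask))


# --- Hypothetical stand-ins for the trained models (assumptions, not the paper's) ---

def detect_stub(frame: Image, thr: float = 0.5) -> Optional[Box]:
    """Stand-in for the detector: bounding box of all pixels brighter than thr."""
    hits = [(r, c) for r, row in enumerate(frame)
            for c, v in enumerate(row) if v > thr]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def segment_stub(roi: Image, thr: float = 0.5) -> Image:
    """Stand-in for the segmenter: per-pixel threshold producing a 0/1 mask."""
    return [[1.0 if v > thr else 0.0 for v in row] for row in roi]


def classify_stub(roi: Image, mask: Image) -> str:
    """Stand-in for the classifier: placeholder label from the mask's filled fraction."""
    total = sum(len(row) for row in mask)
    filled = sum(sum(row) for row in mask)
    return "positive" if total and filled / total > 0.5 else "negative"


demo_frame: Image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
result = run_pipeline(demo_frame, detect_stub, segment_stub, classify_stub)
```

Because each stage is passed in as a separate callable, a stage can be trained, evaluated, or replaced on its own, which mirrors the independent per-stage evaluation the abstract describes. It also shows why the detection stage matters for full automation: if it returns no region, the later stages never run.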
Information
  • Publisher: The Society of Convergence Knowledge
  • Publisher (Ko): 융복합지식학회
  • Journal Title: The Society of Convergence Knowledge Transactions
  • Journal Title (Ko): 융복합지식학회논문지
  • Volume: 14
  • No: 1
  • Pages: 87-98