Design and Preliminary Application of a CV-Based Multimodal Teaching Support System for Higher Vocational Education
DOI:
https://doi.org/10.64583/b0j1xn67
Keywords:
Computer Vision, Multimodal Educational Technology, Higher Vocational Education, Teaching Assistance, Intelligent Classroom
Abstract
In recent years, steady progress in artificial intelligence has brought computer vision (CV) to the forefront of educational research. Unlike traditional approaches that depend mainly on text or speech, CV can process multiple information streams, such as images, recognized text, and structural features, and reorganize them into meaningful teaching resources. This makes it possible to support classroom instruction in a more visual, interactive, and efficient way. Such technologies are particularly relevant to higher vocational education, where learning tasks are strongly practice-oriented and course materials often contain diagrams or other graphical elements. This paper examines how CV-based multimodal techniques can be applied in vocational teaching, with specific attention to resource management, knowledge extraction, and interactive classroom support. A prototype framework was designed that integrates network topology identification, keyword extraction from slides, and digital capture of handwritten blackboard work. When tested in teaching scenarios, the system showed promise in improving students' understanding and participation while reducing repetitive tasks for instructors. The study therefore provides both a conceptual foundation and preliminary practical evidence for advancing the digital transformation of vocational education.
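To make the slide keyword-extraction function more concrete, the following is a minimal sketch rather than the prototype's actual method, which the abstract does not specify. It assumes the slide text has already been recognized (for example by an OCR step) and ranks words by simple frequency. The function name `extract_keywords`, the stop-word list `STOP_WORDS`, and the sample text are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (an assumption for illustration).
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
              "are", "for", "on", "with", "by", "each"}

def extract_keywords(slide_text: str, top_n: int = 3) -> list[str]:
    """Rank words in already-recognized slide text by raw frequency."""
    # Lowercase the text and keep alphabetic tokens only.
    words = re.findall(r"[a-z]+", slide_text.lower())
    # Drop stop words and very short tokens before counting.
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]

# Hypothetical text, standing in for the OCR output of one slide.
sample = ("The router forwards packets. Each router keeps a routing table. "
          "The router sends packets.")
print(extract_keywords(sample, top_n=2))
```

Frequency counting is only the simplest possible baseline. A working system would more plausibly weight terms across a whole slide deck or use a domain vocabulary, but those choices are outside what the abstract describes.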
License
Copyright (c) 2025 Xiaoxue Yang (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.