Design and Preliminary Application of a CV-Based Multimodal Teaching Support System for Higher Vocational Education

Xiaoxue Yang

doi:10.64583/b0j1xn67

Authors

Xiaoxue Yang, Wuhan Vocational College of Software and Engineering


Keywords:

Computer Vision, Multimodal Educational Technology, Higher Vocational Education, Teaching Assistance, Intelligent Classroom

Abstract

Steady advances in artificial intelligence have recently brought computer vision (CV) to the forefront of educational research. Unlike traditional approaches that depend mainly on text or speech, CV can process several information streams at once, such as images, recognized text, and structural features, and reorganize them into usable teaching resources. This makes it possible to support classroom instruction in a more visual, interactive, and efficient way. Such technologies are particularly relevant to higher vocational education, where learning tasks are strongly practice-oriented and course materials often contain diagrams or other graphical elements. This paper examines how CV-based multimodal techniques can be applied in vocational teaching, focusing on resource management, knowledge extraction, and interactive classroom support. A prototype framework was designed that integrates three functions: network topology identification, keyword extraction from slides, and digital capture of handwriting from blackboard work. In preliminary tests in teaching scenarios, the system showed promise in improving students' understanding and participation while reducing repetitive tasks for instructors. The study therefore provides both a conceptual foundation and preliminary practical evidence for advancing the digital transformation of vocational education.
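The abstract does not say how keywords are extracted from slides. As a purely illustrative sketch, the step could be approximated by a term-frequency ranking over text that has already been recognized from a slide image. The function name `extract_keywords`, the stopword list, and the sample slide text are all assumptions introduced here, not details from the paper.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stopword list (an assumption for this sketch).
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on",
    "for", "is", "are", "with", "by", "each", "this", "that",
}


def extract_keywords(slide_text: str, top_k: int = 3) -> list[str]:
    """Return the top_k most frequent non-stopword terms in slide_text.

    slide_text is assumed to be plain text already recognized from a slide;
    the recognition step itself is outside the scope of this sketch.
    """
    # Lowercase and keep alphanumeric tokens that start with a letter.
    tokens = re.findall(r"[a-z][a-z0-9-]*", slide_text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Made-up sample text standing in for a recognized networking slide.
    slide = (
        "Router configuration: each router forwards packets. "
        "A switch connects hosts; the router connects networks."
    )
    print(extract_keywords(slide))
```

A frequency ranking is only one simple option; a working system might instead weight terms against a course-wide vocabulary or use a trained model.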


Author Biography

Xiaoxue Yang, Wuhan Vocational College of Software and Engineering

Xiaoxue Yang (born January 1982), female, Han ethnicity, from Wuhan, Hubei. Associate Professor at Wuhan Vocational College of Software and Engineering; holds a master's degree. Research area: artificial intelligence.
E-mail: shirly520123@gmail.com
