Embedded Multimodal Perception Smart Glasses: Innovation and Practice of Barrier-Free Interaction Technology
DOI: https://doi.org/10.62517/jike.202604116
Author(s)
Xiangxuan Ji, Zhiyuan Li*
Affiliation(s)
School of Artificial Intelligence and Software, Kewen College of Jiangsu Normal University, Xuzhou, Jiangsu, China
*Corresponding Author
Abstract
To address the communication gap between Deaf and Hard of Hearing (DHH) individuals and the hearing population, as well as the many difficulties Blind and Low Vision (BLV) individuals face in perceiving their environment and traveling independently, this paper develops a smart-glasses-based barrier-free interaction system built on embedded multimodal perception technology. The system deeply integrates computer vision, intelligent audio analysis, tactile feedback control, and edge computing. It employs a lightweight spatiotemporal graph convolutional network, multimodal information fusion algorithms, and an optimized environmental perception model to realize real-time dynamic sign language recognition, precise speech-to-tactile semantic conversion, rapid environmental hazard warning, and high-precision navigation. By combining binocular vision with multi-sensor collaborative fusion, the system overcomes the performance bottleneck of traditional single-modality assistive devices and achieves low-power (1.2 W), low-latency (end-to-end delay under 500 ms) multimodal information processing on an embedded hardware platform. Test results show a dynamic sign language recognition accuracy of 87.6%, a spatial positioning accuracy of 0.3 m, and a hazard warning response time of only 95 ms. The system thus provides a comprehensive, inclusive, barrier-free interaction environment that helps these groups integrate deeply into the digital society.
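To make the recognition pipeline concrete, the sketch below shows the kind of lightweight spatiotemporal graph convolution block the abstract refers to for dynamic sign language recognition: a spatial graph convolution over skeleton joints followed by a temporal convolution over frames. This is a minimal illustrative PyTorch sketch, not the authors' implementation; the joint count, channel sizes, kernel widths, and the placeholder adjacency matrix are all assumptions.

# Illustrative sketch (assumptions throughout, not the paper's code):
# one spatiotemporal graph convolution block operating on skeleton
# keypoint sequences of shape (batch, channels, frames, joints).
import torch
import torch.nn as nn

class STGCNBlock(nn.Module):
    """One lightweight ST-GCN block: spatial graph conv + temporal conv."""

    def __init__(self, in_ch: int, out_ch: int, adjacency: torch.Tensor):
        super().__init__()
        # Normalized joint adjacency matrix (V x V), fixed per skeleton layout.
        self.register_buffer("A", adjacency)
        # 1x1 conv mixes channels; the graph structure enters through A.
        self.spatial = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        # Temporal conv aggregates a 9-frame window per joint (assumed width).
        self.temporal = nn.Conv2d(out_ch, out_ch, kernel_size=(9, 1), padding=(4, 0))
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, T, V) -> aggregate each joint's neighbors via A.
        x = torch.einsum("nctv,vw->nctw", x, self.A)
        x = self.spatial(x)    # mix channels
        x = self.temporal(x)   # aggregate over time
        return self.relu(self.bn(x))

# Usage: 21 hand joints, 3-channel (x, y, confidence) input, 32-frame clips.
V = 21
A = torch.eye(V)  # placeholder; a real model uses the hand-skeleton graph
block = STGCNBlock(in_ch=3, out_ch=64, adjacency=A)
out = block(torch.randn(1, 3, 32, V))  # -> (1, 64, 32, 21)

Stacking a few such blocks and pooling over frames and joints would yield per-clip sign-class logits; keeping the channel widths small is the kind of choice that lets such a model fit the 1.2 W embedded power budget the abstract reports.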
Keywords
Barrier-Free Interaction Technology; Multimodal Information Fusion; Dynamic Sign Language Recognition; Tactile Feedback Mechanism; Edge Intelligent Computing; VSLAM Navigation System