Session: 04-25-01: Thin-Film Materials/Electronics for Advanced Biochemical and Biophysical Sensing
Paper Number: 150959
Decoding Silent Speech Cues From Muscular Motions for Efficient Human-Robot Collaborations
Silent speech interfaces (SSIs) have been pursued to restore spoken communication for individuals with voice disorders and to enable intuitive communication when acoustic speech is unreliable, inappropriate, or undesired. However, current silent speech approaches face several challenges, including bulkiness, obtrusiveness, low accuracy, limited portability, and susceptibility to interference. In the first part of the presentation, we will describe a wireless, unobtrusive, and robust silent speech interface for tracking and decoding speech-relevant movements of the temporomandibular joint. Our solution employs a single soft magnetic skin placed behind the ear for wireless and socially acceptable silent speech recognition. The developed system alleviates several concerns associated with existing interfaces based on face-worn sensors, including the large number of sensors required, highly visible interfaces on the face, and obtrusive interconnections between sensors and data acquisition components. With machine learning-based signal processing techniques, good speech recognition accuracy is achieved (93.2% for phonemes and 87.3% for a list of words from the same viseme groups). Moreover, the reported silent speech interface demonstrates robustness against noise from both the ambient environment and users' daily motions. Finally, its potential in assistive technology and human-machine interaction is illustrated through two demonstrations: a silent-speech-enabled smartphone assistant and drone control.
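To make the signal-processing step concrete, the sketch below shows one plausible pipeline for this kind of single-sensor magnetic tracking: windows of tri-axis magnetometer data are reduced to simple statistical features and passed to an off-the-shelf classifier. This is a hypothetical illustration in Python with scikit-learn, not the authors' implementation; the window length, feature set, and random-forest model are all assumptions, and synthetic data stands in for real recordings.

# Hypothetical sketch (not the authors' code): phoneme classification from
# a single tri-axis magnetometer worn behind the ear. Window size, features,
# and classifier choice are assumptions for illustration only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def window_features(window):
    """Summarize one (time, 3) magnetometer window with basic statistics."""
    return np.concatenate([
        window.mean(axis=0),                          # per-axis mean field
        window.std(axis=0),                           # per-axis variability
        np.abs(np.diff(window, axis=0)).sum(axis=0),  # per-axis total motion
    ])

# Synthetic stand-in: 200 windows of 250 samples x 3 axes, 5 phoneme labels.
rng = np.random.default_rng(0)
windows = rng.normal(size=(200, 250, 3))
labels = rng.integers(0, 5, size=200)

X = np.stack([window_features(w) for w in windows])
X_train, X_test, y_train, y_test = train_test_split(X, labels, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(f"held-out accuracy: {clf.score(X_test, y_test):.2f}")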
In the second part of the presentation, we will present the material optimization, structural design, deep learning algorithm, and system integration of mechanically and visually unobtrusive silent speech interfaces that can perform both speaker identification and speech content identification. Conformal, transparent, and self-adhesive electromyography (EMG) electrode arrays are designed to capture speech-relevant muscle activities. Temporal convolutional networks (TCNs) are employed to recognize speakers and convert sensing signals into spoken content. The resulting silent speech interfaces achieve 97.5% speaker classification accuracy and 91.5% keyword classification accuracy using four electrodes. We further integrate the speech interface with an optical hand-tracking system and a robotic manipulator for human-robot collaboration (HRC) in both assembly and disassembly processes. The integrated system enables control of the robot manipulator by silent speech and facilitates the hand-over process through hand motion trajectory detection. The developed framework enables natural robot control in noisy environments and lays the groundwork for collaborative human-robot tasks involving multiple human operators. The main contributions of this work include (1) the development of a self-adhesive, integrated, mechanically and visually imperceptible EMG sensing array for tracking subtle speech-relevant muscle activities; (2) the exploration of a TCN model for both speaker identification and speech content identification, which allows multiple human operators to work alongside one robot; and (3) the design of an efficient HRC framework comprising the SSI and hand motion detection that can engage people with voice disorders and remains robust in noisy environments.
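For readers unfamiliar with TCNs, the sketch below outlines a minimal temporal convolutional classifier for multi-channel EMG windows: stacked dilated causal 1D convolutions with residual connections, followed by temporal average pooling and a linear classification head. This is an illustrative PyTorch sketch under assumed settings (4 electrodes, 10 keyword classes, one-second windows at 1 kHz), not the paper's actual architecture or hyperparameters.

# Minimal TCN sketch for EMG keyword classification. Channel count, class
# count, layer widths, and dilations are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalBlock(nn.Module):
    """Dilated causal 1D convolution block with a residual connection."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left padding for causality
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()
        self.down = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        out = self.relu(self.conv1(F.pad(x, (self.pad, 0))))
        out = self.relu(self.conv2(F.pad(out, (self.pad, 0))))
        return self.relu(out + self.down(x))  # residual connection

class EMGTCN(nn.Module):
    def __init__(self, n_channels=4, n_classes=10):
        super().__init__()
        self.tcn = nn.Sequential(
            TemporalBlock(n_channels, 32, kernel_size=3, dilation=1),
            TemporalBlock(32, 32, kernel_size=3, dilation=2),
            TemporalBlock(32, 32, kernel_size=3, dilation=4),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                     # x: (batch, channels, time)
        h = self.tcn(x)
        return self.head(h.mean(dim=-1))      # pool over time, then classify

# Example: a batch of 8 one-second, 4-channel EMG windows sampled at 1 kHz.
model = EMGTCN()
logits = model(torch.randn(8, 4, 1000))
print(logits.shape)  # torch.Size([8, 10])

Dilations of 1, 2, and 4 grow the receptive field exponentially with depth, which is the property that lets a TCN capture both fast muscle transients and slower articulation patterns within a fixed-size window.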
Presenting Author: Shanshan Yao, Stony Brook University
Presenting Author Biography: Dr. Shanshan Yao is an assistant professor in the Department of Mechanical Engineering at Stony Brook University. She received her B.S. and M.S. degrees from Xi'an Jiaotong University and her Ph.D. in Mechanical Engineering from North Carolina State University in 2016. Before joining Stony Brook University, she was a postdoctoral researcher at North Carolina State University from 2017 to 2019. Her research primarily lies in smart structures, wearable sensors, haptic interfaces, soft robotics, and integration techniques for wearable systems. Dr. Yao is a recipient of the Faculty Early Career Development (CAREER) Award from the National Science Foundation (NSF) and the Stony Brook Foundation Trustees Faculty Awards.
Authors:
Shanshan Yao, Stony Brook University
Paper Type
Technical Presentation