Wearable, gaze-directed beamforming to improve speech comprehension
**********
Lee Miller
Center for Mind & Brain
Project's details
Everyday auditory environments are cluttered, noisy, and distracting. This presents a complex perceptual and computational challenge known as the “cocktail party problem”: how to extract relevant acoustic information while filtering out the background. The cocktail party problem is a paradigmatic case of blind-source separation, a broad class of computational approaches with application in fields as varied as acoustics, machine vision, medical imaging, stock prediction, and seismic monitoring. In the case of sound perception, one of the most important mechanisms our brains use to accomplish this is selective attention, which typically correlates with eye gaze – we look where we’re listening.
Our lab is developing a system that boosts sounds from one direction while suppressing sounds from all others, using a beamformer algorithm guided by steering signals from an eye-tracker. It will also apply state-of-the-art speech isolation and automatic speech recognition (ASR) to present textual captions in the listener’s view using augmented reality (AR). Put simply, the system acts as a brain-guided filter for acoustic information – wherever the listener looks, she should hear sound from that direction best. The proposed project offers numerous instructive and industry-relevant computer science and engineering challenges, from programming to signal processing to device interfacing and communications, primarily using C++ and Python. It is structured so that team members can make progress on their own challenges while working closely with other team members to integrate the components. It has a clearly defined, feasible, and impactful deliverable.
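To make the gaze-steering idea concrete, here is a minimal sketch of a delay-and-sum beamformer, one of the simplest beamforming methods. It is not the lab's actual algorithm. It assumes a uniform linear microphone array, a far-field source, whole-sample delays, and a gaze angle already supplied by the eye-tracker in degrees; the function names, array geometry, and speed-of-sound constant are all illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 C (assumed constant)


def steering_delays(num_mics, spacing_m, gaze_deg, sample_rate):
    """Compensating delay, in whole samples, for each microphone.

    Assumes a uniform linear array and a far-field (plane-wave) source.
    gaze_deg is measured from straight ahead; positive angles point toward
    the higher-index microphones, which hear the wavefront first and so
    need the largest compensating delay.
    """
    theta = math.radians(gaze_deg)
    per_mic = spacing_m * math.sin(theta) * sample_rate / SPEED_OF_SOUND
    raw = [round(m * per_mic) for m in range(num_mics)]
    earliest = min(raw)
    # Shift so that every delay is non-negative (causal).
    return [d - earliest for d in raw]


def delay_and_sum(channels, delays):
    """Delay each channel by its steering delay and average the results.

    channels: list of equal-length sample lists, one per microphone.
    delays:   list of non-negative integer delays, one per microphone.
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        for i in range(d, n):
            out[i] += samples[i - d]
    return [v / len(channels) for v in out]
```

Sound arriving from the gaze direction lines up across channels after the delays and adds coherently, while sound from other directions stays misaligned and is partly averaged out. A real-time system would more likely use fractional delays or frequency-domain weights and update the delays whenever the eye-tracker reports a new angle.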
A near-real-time (<50 ms audio delay) gaze-directed beamformer with AR captioning that increases the target speech signal-to-noise ratio by 10 dB and significantly improves comprehension in background noise.
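The two numeric targets above can be put in perspective with a little arithmetic. The sketch below is only illustrative: the helper names, the 16 kHz sample rate, and the 512-sample frame are assumptions, not project specifications. The array-gain formula is the textbook ideal for delay-and-sum in spatially uncorrelated noise, which real rooms rarely match.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from two power values."""
    return 10.0 * math.log10(signal_power / noise_power)


def ideal_array_gain_db(num_mics):
    """Best-case SNR gain of delay-and-sum with num_mics microphones,
    assuming noise that is uncorrelated between microphones."""
    return 10.0 * math.log10(num_mics)


def buffer_latency_ms(frame_samples, sample_rate):
    """Delay contributed by collecting one audio frame before processing."""
    return 1000.0 * frame_samples / sample_rate


# Hypothetical example values (not from the project description).
print(snr_db(10.0, 1.0))              # a 10x power ratio corresponds to 10 dB
print(ideal_array_gain_db(10))        # ideal gain for a 10-microphone array
print(buffer_latency_ms(512, 16000))  # buffering delay for one 512-sample frame
```

Under these assumptions, a 10 dB improvement means a tenfold increase in the signal-to-noise power ratio, and a single 512-sample frame at 16 kHz already uses 32 ms of the 50 ms budget before any processing or output delay is counted.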
Programming in Python and/or C++ and/or Unity. A great attitude!
**********
30-60 min weekly or more
Client wishes to keep IP of the project
Attachment | N/A
Yes
Team members | Aman Ganapathy Sarah Yuniar Ashley Bilbrey Karim Abou Najm
Rex Liu
N/A