A Journey Through Personal Albums and Exploring the Intersection of Tech and Humanity
Welcome to my personal blog, where I share personal albums and explore the fascinating intersection of ever-evolving technology and humanity. Come along with me as we look at how creativity, state-of-the-art AI, and robotics are woven into our shared awareness.
Hover your mouse over an image to see its AI-generated caption and rating. Have fun!
AI Headphones Let You Listen to a Single Person in a Crowded Noisy Environment
Imagine being able to focus on a single person's voice in a crowded, noisy environment without any distractions. Apple’s AirPods Pro, for instance, use AI to automatically adjust sound levels by sensing when the wearer is in conversation. Apple calls these features Adaptive Noise Control, Personalized Volume, and Conversation Awareness.
Adaptive Noise Control can seamlessly blend Active Noise Cancellation and Transparency mode to adjust the level of noise control based on the changing noise conditions in your surroundings. This means that whether you're in a quiet library or a bustling city street, AirPods Pro will automatically optimize the noise cancellation or transparency to suit your environment.
Personalized Volume learns your listening preferences over time and adjusts the media volume on your AirPods Pro based on your surrounding environment. In a noisy environment, your earbuds automatically raise the volume so you can hear your media clearly; in quieter settings, they bring it back down.
These are great features, but the user has little control over whom to listen to or when the adjustment happens. Thanks to groundbreaking research from the University of Washington (UW), that kind of control is now possible with a new AI system called Target Speech Hearing (TSH).
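To make the idea concrete, here is a toy sketch of noise-aware volume. It is not Apple's algorithm, which is not public; the function names, the 50 dB reference level, and the gain and learning-rate values are all assumptions chosen for illustration.

```python
def personalized_volume(preferred, ambient_db, reference_db=50.0, gain_per_db=0.01):
    """Return a media volume in [0.0, 1.0], nudged up or down by ambient noise.

    All parameters are hypothetical: `preferred` is the listener's usual volume,
    and `reference_db` is the noise level at which no adjustment is applied.
    """
    adjusted = preferred + gain_per_db * (ambient_db - reference_db)
    return max(0.0, min(1.0, adjusted))  # clamp to the valid range


def update_preference(preferred, user_set_volume, rate=0.2):
    """Drift the stored preference toward what the user manually chose."""
    return (1 - rate) * preferred + rate * user_set_volume
```

With these made-up defaults, a listener who prefers 0.5 would get roughly 0.8 on an 80 dB street and roughly 0.3 in a 30 dB library, and each manual change shifts the stored preference a little toward the new setting.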
How Target Speech Hearing Works
The TSH system allows a user wearing headphones to "enroll" a speaker by looking at them for just three to five seconds. The headphones are fitted with microphones that capture the sound waves from the speaker's voice. This signal is then sent to an on-board embedded computer, where machine learning software learns the desired speaker's vocal patterns.
Once the speaker is enrolled, the TSH system cancels all other sounds in the environment and plays only the enrolled speaker's voice in real time, even as the listener moves around in noisy places and no longer faces the speaker. The system's ability to focus on the enrolled voice improves as the speaker continues talking, providing more training data.
(A) Two users are walking near a noisy street. (B) The wearer looks at the target speaker for a few seconds to capture a noisy binaural audio example, which is used to learn the speech traits of the target speaker. (C) The hearable extracts the target speaker and removes interference, even when the wearer is no longer looking at the target speaker.
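Why does looking at the speaker help? One plausible ingredient, assumed here for illustration rather than taken from the UW implementation, is that a voice coming from straight ahead reaches the left and right microphones at almost the same moment, while a voice from the side reaches one ear first. The sketch below estimates that left/right delay with a plain cross-correlation. The names `best_lag` and `is_facing`, and the `max_lag` and `tolerance` values, are hypothetical.

```python
import random


def best_lag(left, right, max_lag):
    """Return the shift d (in samples) that maximizes sum(left[i] * right[i + d]).

    A positive result means the right channel hears the sound later than the left.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def is_facing(left, right, max_lag=8, tolerance=1):
    """Treat the source as 'straight ahead' when both ears hear it nearly in sync."""
    return abs(best_lag(left, right, max_lag)) <= tolerance


if __name__ == "__main__":
    rng = random.Random(0)
    voice = [rng.gauss(0.0, 1.0) for _ in range(200)]  # stand-in for a speech clip
    delayed = [0.0] * 3 + voice[:-3]                   # same clip, 3 samples late
    print(best_lag(voice, delayed, max_lag=8))
    print(is_facing(voice, voice), is_facing(voice, delayed))
```

A real system would work on noisy recordings and feed the enrollment clip to a neural network, so treat this only as intuition for the "look to enroll" step.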
The UW team tested the TSH system on 21 subjects, who on average rated the clarity of the enrolled speaker's voice nearly twice as high as that of the unfiltered audio. This impressive result demonstrates the effectiveness of the AI-powered system in isolating a single voice in a noisy environment.
Potential Applications
The TSH system has numerous potential applications, including:
Enhancing communication in noisy environments such as restaurants, bars, or public transportation
Assisting individuals with hearing impairments in focusing on a single speaker
Improving the quality of remote meetings or conferences by isolating the main speaker's voice
Enabling journalists or researchers to conduct interviews in crowded places without background noise interference
Future Developments
The current TSH system can enroll only one speaker at a time and requires a clear line of sight to the target speaker. The UW team is working on bringing the system to earbuds and hearing aids in the future, and also aims to improve its ability to handle multiple speakers and more complex acoustic environments.
The Target Speech Hearing system represents a significant advancement in the field of auditory perception and AI-powered hearing devices. By allowing users to focus on a single speaker in a noisy environment, this technology has the potential to revolutionize communication and improve the quality of life for individuals with hearing impairments. As the UW team continues to refine and expand the system, we can expect to see even more exciting developments in the near future.