Researchers have revealed a system that uses radio frequency (RF) and Wi-Fi signals to read lips even when the speaker is wearing a mask. Their published results discuss how the technology can help people with hearing impairments communicate, and note that it may have wide-ranging applications in biometric security and voice-enabled technology, along with implications for privacy.
READING LIPS IS FUNDAMENTAL TO MILLIONS WORLDWIDE
Over 5% of the world’s population (a whopping 430 million people) suffers from some form of hearing loss. This number is expected to increase to over 700 million by 2050. For many of these folks, hearing aids are supplemented by the ability to read lips. Unfortunately, lip-reading is virtually impossible in a COVID-19 world, when the speaker is often wearing a mask.
A team of researchers has now demonstrated a method that utilizes Wi-Fi-based RF sensing for reading lips even if the speaker is wearing a mask.
Published in the journal Nature Communications, the University of Glasgow team’s research began with a search for alternatives to filming lips directly due to the privacy concerns and legalities around unwanted filming, as well as technological limitations to those systems.
“Most of the Lip-reading technologies developed so far are camera-based, which require video recording of the target,” they explain. “However, these technologies have well-known limitations of occlusion and ambient lighting with serious privacy concerns.”
Along with these lighting and privacy issues, including the fact that it is illegal to film someone without their consent in many countries, the team notes that these technologies are virtually useless in a COVID-19 world, “where face masks have become a norm.”
“Lip reading in the presence of face masks with camera-based technologies has become potentially impossible in the COVID-19 era,” they write. “Camera-based Lip-reading systems also fail in complete darkness when lip movements cannot be visually observed.”
These limitations, along with a growing population of those with hearing issues, led the researchers to look for alternatives that wouldn’t face the same legal or functional limitations.
“This paper aims to solve the fundamental limitations of camera-based systems by proposing a radio frequency (RF) based Lip-reading framework,” they write, including “having an ability to read lips under face masks.”
Specifically, their successfully tested method “employs Wi-Fi and radar technologies as enablers of RF sensing based Lip-reading.”
RF AND WI-FI ENABLE LIP-READING EVEN THROUGH MASKS
To accomplish this goal, the researchers first collected a data set of a masked speaker saying each of the English-language vowels A, E, I, O, and U, using Wi-Fi-enabled RF sensing and radar data capture. The team also collected data on closed-mouth positions when no words were spoken. This information was then used to train machine learning (ML) and deep learning (DL) models on human speech patterns, particularly when the speaker wears a mask.
For both experiments (radar and Wi-Fi), three participants, one male and two female, took part in the data collection process.
“A total of 3600 data samples were collected during both experiments for six classes, namely, A, E, I, O, U, and Emp, where Emp represents the lip posture of being silent,” the researchers explain. “In each experiment, a total of 1800 data samples were collected from three participants, 900 with face mask and 900 without a face mask, where 50 samples were collected in each class.”
During this data collection process, each participant repeated the speaking activity of each vowel 50 times with a mask and 50 times without a mask.
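The sample counts quoted above can be checked with a few lines of arithmetic. The short Python sketch below is only a sanity check, assuming that "50 samples in each class" means 50 repetitions per participant, per class, and per mask condition, which is the reading that matches the quoted totals. The variable names are illustrative and do not come from the paper.

```python
# Figures taken from the article's description of the data set.
# Assumption: 50 repetitions per participant, per class, per mask condition.
PARTICIPANTS = 3
CLASSES = ["A", "E", "I", "O", "U", "Emp"]  # Emp = silent lip posture
REPETITIONS = 50
MASK_CONDITIONS = ["mask", "no_mask"]
EXPERIMENTS = ["radar", "wifi"]

# Samples for one mask condition within one experiment.
per_condition = PARTICIPANTS * len(CLASSES) * REPETITIONS
# Samples for one experiment (with and without a mask).
per_experiment = per_condition * len(MASK_CONDITIONS)
# Samples across both experiments.
total = per_experiment * len(EXPERIMENTS)

print(per_condition, per_experiment, total)
```

Under that reading, the three values line up with the 900, 1800, and 3600 figures the researchers report.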
According to the published research, “The lip and mouth movements result in variations in the wireless channel state information (CSI) amplitudes, which are picked up by ML/DL algorithms as patterns belonging to spoken sounds and classified into their respective speech, words, phonemes or spoken letters.”
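To make the idea of "patterns in CSI amplitudes" concrete, here is a toy Python sketch. It is not the researchers' pipeline: the traces are synthetic, the ripple shapes are invented for illustration, and a simple nearest-centroid rule stands in for the ML/DL models described in the paper. All function and variable names are hypothetical.

```python
import math
import random

CLASSES = ["A", "E", "I", "O", "U", "Emp"]
TRACE_LEN = 64  # samples per synthetic amplitude trace (arbitrary choice)

def synthetic_trace(label, rng):
    """Made-up amplitude trace: flat baseline, class-specific ripple, noise."""
    k = CLASSES.index(label)
    depth = 0.0 if label == "Emp" else 0.3  # silent class has no ripple
    return [
        1.0
        + depth * math.sin(2 * math.pi * (k + 1) * t / TRACE_LEN)
        + rng.gauss(0, 0.05)
        for t in range(TRACE_LEN)
    ]

def centroid(traces):
    """Average the traces of one class sample by sample."""
    return [sum(column) / len(column) for column in zip(*traces)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(trace, centroids):
    """Return the label whose centroid is closest to the trace."""
    return min(centroids, key=lambda label: squared_distance(trace, centroids[label]))

rng = random.Random(0)  # fixed seed so the run is repeatable

# "Training": compute one average trace per class.
train = {c: [synthetic_trace(c, rng) for _ in range(20)] for c in CLASSES}
centroids = {c: centroid(traces) for c, traces in train.items()}

# "Testing": classify fresh synthetic traces and measure accuracy.
test_set = [(c, synthetic_trace(c, rng)) for c in CLASSES for _ in range(10)]
accuracy = sum(classify(t, centroids) == c for c, t in test_set) / len(test_set)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

Real CSI measurements are far noisier and less cleanly separated than these invented ripples, which is why the team turned to trained ML and DL models rather than a simple distance rule like this one.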
This process is particularly effective because RF and radar signals can penetrate the mask to capture visual cues, including lip and mouth movements, “which will otherwise be obscured from visual hearing aids.”
In fact, the researchers add, their approach is simple enough that RF-based lip reading “may just require the addition of a single antenna on the hearing aid.” They also point out that their system doesn’t need to be combined with a hearing aid but may be used as a stand-alone system.
“This provides an exciting opportunity to transform next-generation multi-modal hearing aids through RF sensing,” they write.
The team notes that their approach has many possible applications, “including hearing aid devices, biometric security, and voice-enabled control systems in smart homes and cars infotainment.”
LIP-READING THROUGH MASKS HAS THE POTENTIAL FOR ABUSE
What is not discussed in the published research is the possibility that this technology and approach could provide the intelligence community and law enforcement professionals with a previously unavailable tool to monitor human speech even when the speaker is masked. That may be useful and even necessary in some situations, but it is fraught with potential legal ramifications and seems ripe for abuse.
Still, this type of tool could become irreplaceable for the millions of folks suffering from hearing loss, especially in a world where mask-wearing is prevalent.
“Lip reading through RF sensing can provide highly accurate cues to the hearing aids by identifying spoken sounds and detecting speech patterns through machine learning (ML) and deep learning (DL) techniques,” the team concludes. “Furthermore, unlike vision-based systems, RF-sensing-based Lip reading does not suffer from limitations due to face masks.”
Follow and Connect with Author Christopher Plain on Twitter @plain_fiction