Is this phenomenon regulated by law in any way? And if so, are these regulations sufficient to protect the author? After all, the voice is not only an element of the image, but a unique biometric feature used for identification. Therefore, it should be protected in the same way as personal data.

Progress in artificial intelligence means that voice-based fraud (impersonating other people) is becoming increasingly common and is used for financial fraud, political attacks, data theft or unfair promotion. Do you remember the fake voice of US President Joe Biden used in automated telephone calls discouraging participation in the primary elections, or the use of Taylor Swift’s fake voice in an ad phishing for data under the guise of giving away pots? These are examples of how quickly and effectively technology can be used for unfair purposes. Unfortunately, the scale of this type of abuse will only increase.

Voice synthesis is also becoming more and more popular in film productions. One example is the cult Top Gun: Maverick, where Val Kilmer’s voice was synthesized. Another is the last season of the Polish series Rojst, in which Filip Pławiak (young Kociołek) speaks with the voice of Piotr Fronczewski (Kociołek). The effect of use, i.e. the procedure we are dealing with when the challenge is no longer used in the era of AI invasion. While this aspect was taken into account and regulated for the needs of Rojst’s production, the question arises about the use in films of synthesized voices of actors after their death. The analysis report also includes aspects of biometric testing manufacturers.

Having the appropriate tools at our disposal, we attempted to biometrically compare the voices of Fronczewski and Pławiak. The results of the analysis show that their voices are not biometrically consistent (Pławiak utterance vs. Fronczewski voiceprint (VP): only 15% agreement; Fronczewski utterance vs. Pławiak VP: 11%). Interestingly, these differences cannot be heard. To our ears, the voices of Pławiak and Fronczewski are almost identical. And that is the crux of the matter.

For both characters, gender and nationality were recognized with minimal uncertainty (score of almost 100%). An age difference between the characters was also detected, estimated at 20 years.
The study was carried out in our digital signal processing laboratory. For this purpose we used a total of 25 seconds of speech from both characters, composed of several fragments of their original speech taken from the original film track.
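
The cross-comparison described above (one speaker's utterance scored against the other speaker's voiceprint, in both directions) can be sketched as follows. This is only an illustration, not the tooling used in the study: the embedding vectors are made-up toy values, the use of cosine similarity, the mapping of similarity to a percentage and the 50% decision threshold are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def agreement_percent(utterance_embedding, voiceprint):
    """Map cosine similarity to a 0-100 score (assumed mapping, not a standard)."""
    return max(0.0, cosine_similarity(utterance_embedding, voiceprint)) * 100.0


def cross_compare(utt_a, vp_a, utt_b, vp_b, threshold=50.0):
    """Score A's utterance against B's voiceprint and vice versa.

    The threshold is a hypothetical value chosen for this example.
    """
    a_vs_b = agreement_percent(utt_a, vp_b)
    b_vs_a = agreement_percent(utt_b, vp_a)
    return {
        "a_vs_b": a_vs_b,
        "b_vs_a": b_vs_a,
        "same_speaker": a_vs_b >= threshold and b_vs_a >= threshold,
    }


# Toy 3-dimensional "embeddings"; real voiceprints have hundreds of dimensions.
vp_a, utt_a = [1.0, 0.0, 0.0], [0.9, 0.1, 0.0]
vp_b, utt_b = [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]
result = cross_compare(utt_a, vp_a, utt_b, vp_b)
print(result)
```

Requiring both directions to exceed the threshold mirrors the two-way comparison reported above: a low score in either direction is enough to reject the hypothesis that the voices come from the same speaker.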

The conclusions from this experiment indicate how helpful and effective biometrics can be in identifying the speaker, assessing the authenticity of their voice and, consequently, detecting voice-based fraud. Will this be enough to limit the unfair use of the voices of famous people in the future? And most importantly, are we able to regulate the market so as to protect the voices of famous people after their death?

What is accessibility?

Accessibility is a broad concept describing the extent to which a given system can be used by as large a group of people as possible.

It is a property of the environment (physical space, digital reality, information and communication systems, products or services) that allows people with functional (physical, cognitive) difficulties to use it on an equal basis with others.

Regulations in Poland – WCAG 2.0 and WCAG 2.1 Directives

With people with disabilities in mind, the WCAG 2.0 standard (Web Content Accessibility Guidelines) was created. It is an extensive set of recommendations on the accessibility of Internet content, organized into 12 guidelines that define the features of individual content elements affecting their accessibility. This applies to online stores, email clients and mobile applications.

WCAG 2.1 extends the WCAG 2.0 standard with 17 additional success criteria, many of them related to content on mobile devices. These include, among others, the ability to view content on mobile devices in both vertical and horizontal orientation, fitting the page content to the device window without forcing the user to scroll in two directions, appropriate spacing between lines, an appropriate contrast ratio, and the ability to turn off animation triggered by interaction.
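
To make the "appropriate contrast ratio" requirement concrete, here is a minimal sketch of the contrast calculation defined in WCAG 2.x: relative luminance of each colour, then the ratio (L1 + 0.05) / (L2 + 0.05). The function names are our own, and the 4.5:1 threshold shown is the level AA requirement for normal-size text; other thresholds apply to large text and to level AAA.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light, as specified in WCAG 2.x."""
    c = channel_8bit / 255.0
    if c <= 0.03928:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1.0 (no contrast) up to 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa_normal_text(foreground, background):
    """True if the colour pair reaches the 4.5:1 ratio for normal text (level AA)."""
    return contrast_ratio(foreground, background) >= 4.5


white = (255, 255, 255)
print(round(contrast_ratio((0, 0, 0), white), 2))
print(meets_aa_normal_text((118, 118, 118), white))
```

A check like this can be run over an interface's colour palette during design, before any manual accessibility review.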

Public institutions are obliged to apply these Directives, but when thinking about popularizing solutions, we should ensure that newly created products/services meet these requirements to the greatest extent possible.

What does accessibility mean to us as a company?

Accessibility is one of our 4 main values (along with innovation, cooperation and security), to which we attach great importance when designing solutions in the field of voice biometrics and improving systems of this type. We develop our technologies with universality in mind, including for people who have little or no knowledge about online threats. From the research stage onwards, we make sure that as many people as possible can use our solutions, giving priority to people with functional limitations, e.g. limited vision. In this way, we try to eliminate differences or, to put it bluntly, to counteract digital exclusion. We believe that this approach significantly improves the quality and comfort of life of people with disabilities. According to the Central Statistical Office, there are over 3 million such people (legally registered) in Poland, which constitutes 10% of the entire society.

Vesper voice communicator

We are currently working on a new, innovative solution for voice communication. Its advantage will be high accessibility for people with disabilities. Because the solution will also be available on mobile devices, we will base our design on the WCAG 2.1 standard. The product will feature a clear and default interface, ensuring simple and intuitive operation and high flexibility in use. Among other things, the voice communicator will let users clearly zoom in on text and use alternative descriptions. It is worth emphasizing that none of the instant messengers available on the market offers such functionality.

The project is implemented thanks to a grant from the European Union.


Is it possible? The research experiment opens new possibilities in this area.

Researchers in the UK have created a dataset of physical movements that generate speech sounds. This collection may be used in the future to develop speech recognition systems that synthesize the voices of people with speech impediments. This may also contribute to the development of a new method for recognizing silent speech and even new behavioral biometrics.

This means that in the future, voice-controlled devices such as smartphones will likely be able to read users’ lips and authenticate users of banking and other sensitive applications by identifying their unique facial expressions. In other words, a person could be authenticated based on the movements of their lips and face.

In this experiment, the database was built based on lip reading and facial movement analysis. Data from continuous wave radars were used to capture the movement of the skin on the face, tongue and larynx of the study participants while speaking. The scientists also used, among others, a laser speckle detection system with a super-fast camera to capture vibrations on the skin surface, as well as a Kinect V2 camera to read changes in the shape of the lips when forming various sounds.

The database, created based on the analysis of 400 minutes of speech, will be made available to researchers free of charge in order to further develop the technology.

The research group included scientists from the University of Dundee and University College London. The experiment also used technology from the Center for Communication, Sensing and Imaging at the University of Glasgow.


The fight against crime can become more effective thanks to voice biometrics.

Phonexia’s product for voice comparison in the field of computer forensics will soon be available on the market. The product in question, Voice Inspector 5.1, was designed especially for experts in this area.
The software is able to identify a person based on just 3 seconds of speech and offers the same voice comparison accuracy regardless of language. The new software meets international standards of judicial admissibility, in line with the guidelines of the European Network of Forensic Science Institutes (ENFSI).

The product also includes a set of supporting technologies, such as speaker diarization based on voice recognition, which allows marking individual speakers and separating them from the mono audio stream, a phoneme recognition module for identifying similar sound patterns in recordings, voice presence detection, and a spectrogram for analyzing audio files. 

Phonexia operates as part of the European Union-backed Roxanne consortium, which cooperates with law enforcement agencies in investigating criminal networks by providing voice biometrics data. The project was co-financed under the EU’s “Horizon 2020” program.

More on biometricupdate
https://www.biometricupdate.com/202401/phonexia-launches-voice-biometrics-product-for-forensic-investigations

We started 2024 by launching a new research and development project, “Vesper – safe voice communication platform with the integration of biometric services”, for which we received a European Union subsidy. The aim of this project is to develop and implement an innovative voice communicator with functional features that are unique on the market. What will make the Vesper voice communicator stand out?

In addition to strong transmission encryption, the communicator will have integrated voice biometrics technology and two unique functionalities developed as part of the research and development work of this project, i.e. technology for verifying the authenticity of the far-end voice stream emission source and technology for augmenting the voice stream received in the near-end device. It is worth emphasizing that no other instant messengers such as Skype or Teams currently available on the market have such functional features.

The technologies developed as part of the project are intended to protect the user against presentation attacks and prevent the interlocutor’s voice from being used to create an effective deepfake using voice synthesis techniques.

Additionally, the two technologies implemented in the communicator, i.e. the technology for verifying the authenticity of the far-end voice stream emission source and the technology for augmenting the voice stream received in the near-end device, will also be commercialized independently under a granted license.

Another advantage of the messenger will be its high accessibility for people with disabilities, prepared according to the WCAG2.1 standard. The product will feature a clear and default interface, ensuring simple and intuitive operation and high flexibility in use.

The target market for the introduced products is the international encrypted mobile communications market. The main recipients of the Vesper communicator will be enterprises requiring secure voice communication, administration, etc. The recipients of the two technologies resulting from the project will be primarily producers and suppliers of voice and related communication systems, for whom such technologies will constitute an added value that increases the safety of their users.

The project started on January 1, 2024 and will last 2.5 years. This is the fourth BiometrIQ project implemented with EU funds.

=> Project value: PLN 7,217,514.00

=> Amount of contribution from European Funds: PLN 4,779,500.00

=> Project number: FENG.01.01-IP.02-0769/23

VoiceDNA is a modern product for biometric voice authentication, also used for voice analysis and audio customer service, offered by the Vietnamese startup Namitech, also known as Nami Technology.

Ho Chi Minh City-based Namitech claims that VoiceDNA is more than twice as fast as Nuance biometrics, both in terms of registration and identity verification. The software can be implemented for text-dependent and text-independent verification, and detects presentation attacks with an effectiveness of up to 95%. The startup has just raised $2 million for its further development.

More https://www.biometricupdate.com/202310/vietnamese-voice-biometrics-startup-namitech-secures-2m-funding

Comparable to “Voice DNA” is the proprietary solution from BiometrIQ – VoiceToken, which provides strong, two-step voice authentication with a very high effectiveness of almost 99%, also in the case of attacks based on speech synthesis. It only takes a dozen or so seconds to verify your identity.

An interesting case of using voice biometrics in medicine. Voice biometrics restores the patient’s ability to generate speech. After a stroke, a patient (Ania) who has completely lost her speech can speak in her own voice, using a biometric avatar controlled by her mind.

This is made possible by a special implant, implanted outside the brain, that uses voice and facial biometrics to derive speech data, with inferences based on readings of brain activity. An artificial intelligence algorithm trained on a recording of the patient’s wedding speech is the basis of her new voice. The brain activity is read when the patient tries to talk or act.
Ania’s avatar is animated on a graphical grid using emotion signals expressing happiness, sadness and surprise.

The creators of the implant are scientists from the University of California in San Francisco and Berkeley.

More details
https://www.biometricupdate.com/202309/voice-biometrics-restore-patients-ability-to-generate-speech

How much should a car know about its driver? Qualcomm announced a collaboration with SoundHound to develop and test SoundHound Chat AI for automotive. The first available voice assistant with generative AI capabilities will be added to the Snapdragon Digital Chassis concept vehicle. For example, the voice assistant will be able to find a recipe, add the necessary ingredients to a digital shopping cart and have them ready for pickup at the driver’s local grocery store at a specific time.


While AI service providers believe megaplatforms increase convenience and will change the way people live, privacy researchers call them “data-guzzling machines” with an unparalleled power to see, listen and collect information about what drivers are doing and where they are driving. This is the conclusion of research by privacy protection organizations into how automotive brands collect and use data.

More
https://www.biometricupdate.com/202309/your-car-is-a-good-listener-maybe-too-good

Game developers strive to improve player experience by designing immersive environments. A key element of this immersion is the integration of seamless payment methods that allow players to purchase in-game items, upgrade their characters or access premium content without interrupting the flow of the game.

Fingerprint scanners and facial recognition systems are already common and provide a safe and convenient alternative to traditional systems based on passwords and PIN codes. With biometric authentication available at any stage, players can enjoy the game without interruption. However, the use of behavioral biometrics in the gaming industry will allow us to go a step further: recognizing people will be possible based on their unique behaviors, such as typing patterns or mouse movements. This means that the user’s natural interaction with the game interface will create a unique behavioral signature that can be used to authenticate the user in subsequent game sessions.
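
The idea of a behavioral signature built from typing patterns can be sketched in a few lines. The example below is a deliberately simplified illustration under our own assumptions: the signature consists of only two features (mean key hold time and mean gap between keys), the function names are hypothetical, and the 30 ms tolerance is arbitrary. Production systems use far richer features and statistical models.

```python
from statistics import mean


def timing_features(events):
    """Return [mean dwell time, mean flight time] for one typing session.

    events: list of (key, press_time, release_time) tuples in seconds,
    in typing order. At least two events are needed to compute flight time.
    """
    dwell = [release - press for _, press, release in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return [mean(dwell), mean(flight)]


def enroll(sessions):
    """Average the features of several sessions into a stored signature."""
    features = [timing_features(session) for session in sessions]
    return [mean(column) for column in zip(*features)]


def matches(signature, events, tolerance=0.03):
    """Accept the session if every feature is within `tolerance` seconds."""
    sample = timing_features(events)
    return all(abs(s - x) <= tolerance for s, x in zip(signature, sample))


# Hypothetical sessions: (key, press time, release time) in seconds.
session_1 = [("w", 0.00, 0.10), ("a", 0.15, 0.25), ("s", 0.30, 0.40)]
session_2 = [("w", 0.00, 0.11), ("a", 0.17, 0.28), ("s", 0.34, 0.45)]
signature = enroll([session_1, session_2])
print(matches(signature, session_1))
```

Because the signature is derived from ordinary interaction with the game, it can be refreshed in the background during play rather than through a separate login step.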

More https://www.biometricupdate.com/202308/the-intersection-of-gaming-and-biometrics-a-look-to-the-future

Daon’s newly patented set of ALX algorithms for voice, face and document verification has hit the market. This technology is intended to improve the detection of voice fraud, mainly supported by artificial intelligence.
https://www.biometricupdate.com/202306/daon-adds-algorithms-to-improve-deepfake-detection-for-voice-and-face-biometrics

BiometrIQ, also as part of one of its research projects, is working on a package of solutions to increase the security of biometric systems and protect against attacks based on voice theft. Thanks to the original solution, it will be possible to determine the authenticity of the voice source with a probability of up to 99%.