Do you know what watermarking is in voice biometrics? It’s a method of digitally tagging audio. It involves embedding an inaudible marker, called an identifier, into an audio file. The goal is to protect the recording from unauthorized use and verify its authenticity.
Watermarking is a tool that significantly improves the security of voice biometrics systems, mainly by preventing voice-based attacks, so-called deepfakes.
We have developed a proprietary watermarking method and built it into one of our tools. This unique technique protects audio recordings from being used for voice synthesis or to gain access. The method is already in operational use.
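To illustrate the general idea of embedding a marker and later checking for it, here is a minimal sketch of a textbook spread-spectrum watermark. It is not our proprietary method. The function names, the `key` and `strength` parameters, and the detection threshold in the usage note are all assumptions made for this example, and a production system would additionally shape the marker so that it stays inaudible.

```python
import random


def make_key_sequence(key: int, length: int) -> list[float]:
    """Pseudo-random +1/-1 sequence derived from a secret key (illustrative only)."""
    rng = random.Random(key)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(length)]


def embed_watermark(samples: list[float], key: int, strength: float = 0.01) -> list[float]:
    """Add a low-amplitude copy of the key sequence to the audio samples."""
    sequence = make_key_sequence(key, len(samples))
    return [s + strength * w for s, w in zip(samples, sequence)]


def detect_watermark(samples: list[float], key: int) -> float:
    """Correlate the audio with the key sequence.

    The score is close to `strength` when the marker for this key is
    present and close to 0 when it is absent or the key is wrong.
    """
    sequence = make_key_sequence(key, len(samples))
    return sum(s * w for s, w in zip(samples, sequence)) / len(samples)
```

In this sketch a recording would be treated as marked when `detect_watermark` returns a score above roughly half of the embedding strength. The longer the recording, the more reliable that decision becomes.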
Phase 2 of the Vesper project, a biometrics-based voice communicator, is nearing completion. During this phase, we worked on creating audio stream augmentation technology. We explained what this augmentation is here: https://biometriq.pl/en/voice-stream-augmentation-what-is-it/
Our proprietary voice stream augmentation engine is currently undergoing perceptual (listening) tests and blind tests. The goal of these tests is to provide an objective evaluation confirming that the engine operates correctly and within the established quality parameters. Furthermore, the voice stream augmentation technology built into the voice messenger is designed to help detect unauthorized use of a voice for later synthesis or conversion, without any degradation of the sound audible to the human ear. All of this is meant to prevent voice theft and ensure the service performs as effectively as possible.
It’s worth noting that solutions on the market such as Skype, Zoom, Discord, Google Meet, Teams, WhatsApp, Signal, Threema, Viber, and Telegram do not support biometric caller authentication.
We are pioneers in this regard.
Full completion of the project is scheduled for September 2026.
In the first stage of the project, we mainly tested the algorithm that verifies the authenticity of the far-end voice stream source, which we reported on here: https://biometriq.pl/en/tests-of-a-voice-communicator-with-a-source-authenticity-detection-module-are-underway/
You can read more about the project on the website https://biometriq.pl/en/vesper-save-voice-communication-platform-with-integration-of-biometric-services/
Project financed by EU funds.
Are you curious about the final solution?
The first international standard for age-assurance technology has been published – ISO/IEC 27566-1:2025. This document establishes a framework for age-assurance systems and describes their core features, including privacy and security, to enable age-based eligibility decisions.
Eligibility here means permission to access applications or services. Definitions of age verification, age estimation, age inference, and subsequent validation are available here.
The standard’s main initiator is Tony Allen, head of the UK Age Check Certification Scheme (ACCS), founder of the Global Age Assurance Standards Summit, and leader of the Australian Age Assurance Technology Trial (AATT). He calls the publication of ISO 27566-1:2025 (which he co-authored) “a significant breakthrough in age assurance at the global level.”
A sample of the ISO 27566-1:2025 standard is available free of charge, but access to the full version of the document requires purchase. https://www.iso.org/standard/88143.html
More about the standard: https://www.biometricupdate.com/202512/first-international-standard-on-age-assurance-sees-publication
Source and photo: https://www.biometricupdate.com
- The voice biometrics market is relatively young and is currently estimated at USD 2-3 billion (USD 2.6 billion according to the Mordor Intelligence report “Voice Biometrics Market Size, Forecast Report, Landscape 2025”).
- Depending on the source, forecasts assume growth of approximately USD 10-15 billion over the next 8-10 years.
- The leading region is North America – in the Fortune Business Insights analysis, the share in 2024 was nearly 37%.
- Asia-Pacific (APAC) is often cited as the fastest growing region in the coming years.
- The “Healthcare and Life Sciences” sector will be the leader in 2025 with a 40% market share.
- Growth is driven by: growing security requirements, the need for passwordless authentication, the development of voice and AI technologies, and the digitization of financial and contact services.
What distinguishes effective voice biometrics systems? The following four indicators determine the advantage of one system over another:
1. Accuracy rate, the overall effectiveness of a biometric system, which should be in the range of 95-99%.
2. FAR (False Acceptance Rate), a metric that measures how often a system incorrectly accepts an unauthorized person (e.g., someone impersonating a user) as a valid user. In the most accurate systems, this rate is less than 1%. The lower the rate, the more secure the system and the more difficult it is to impersonate.
3. FRR (False Rejection Rate), a metric that measures false rejections, or the number of times the system rejects a genuine user when it should accept them. Ideally, this figure is below 3%.
4. EER (Equal Error Rate), the operating point at which the FAR equals the FRR. This metric is often used to compare the quality of biometric systems.
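The relationship between FAR, FRR and EER can be sketched in a few lines of Python. This is a minimal illustration only. The score lists in the usage note are invented, the convention that a score at or above the threshold means "accept" is an assumption, and the EER is approximated by scanning the observed scores as candidate thresholds rather than by interpolation.

```python
def far_frr(genuine: list[float], impostor: list[float], threshold: float) -> tuple[float, float]:
    """Return (FAR, FRR) when every score >= threshold is accepted."""
    # FAR: share of impostor attempts that were wrongly accepted.
    far = sum(score >= threshold for score in impostor) / len(impostor)
    # FRR: share of genuine attempts that were wrongly rejected.
    frr = sum(score < threshold for score in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine: list[float], impostor: list[float]) -> tuple[float, float]:
    """Return (threshold, EER) at the threshold where FAR and FRR are closest."""
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, threshold, (far + frr) / 2)
    return best[1], best[2]
```

For example, with ten made-up genuine scores `[0.91, 0.85, 0.78, 0.66, 0.95, 0.88, 0.72, 0.59, 0.81, 0.93]` and ten made-up impostor scores `[0.12, 0.35, 0.48, 0.22, 0.61, 0.40, 0.30, 0.18, 0.70, 0.27]`, a threshold of 0.65 accepts one impostor and rejects one genuine user, so both FAR and FRR are 10%, which is also the EER for this toy data. Raising the threshold lowers the FAR at the cost of a higher FRR, and lowering it does the opposite.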
The most effective systems are generally considered to be those from Phonexia and ID R&D, due to their outstanding performance in comparative tests.
In our research, we primarily use Phonexia engines, but we also utilize others such as Kaldi (X-vector) and ECAPA. The goal is to test our algorithms as extensively as possible in a diverse environment. Security is our top priority.
Phase 1 of the Vesper project is nearing completion. We’ve launched a test version of the messenger with an implemented far-end voice stream authentication module. Tests are being conducted in three different environments: Windows, Android, and iOS. The results are consistent with the project’s KPIs. We’re working to ensure that quality indicators not only meet the design minimums but, where possible, exceed the established goals. Our priority is to develop a product that meets user needs and builds a positive user experience.
Our experiments are based on 40 speakers and 20-second recordings, with each recording tested across 5 channels, which yields over 171,500 embeddings. This number of recording configurations is intended to help us reach the target parameters and confirm the effectiveness of our messenger.
Vesper Messenger is intended to be a response to the growing problems of cybersecurity and identity theft.
More about the project https://biometriq.pl/en/vesper-save-voice-communication-platform-with-integration-of-biometric-services/
The exhibition was marked by the ubiquity of AI. Many companies presented their latest achievements in building systems that communicate autonomously with people. The humanoid robot Ameca (Etisalat), interacting with its interlocutors, aroused great interest. The stands with interactive agents (Amdocs) offered an almost unbelievable quality of image and speech generated by the systems.
Google unveiled Gemini Live, its response to ChatGPT’s voice mode. Gemini Live includes a Share Screen With Live function that allows Gemini to interact with the image displayed on the phone’s screen. Deutsche Telekom indicated a possible direction for the development of phones by turning the entire phone into a chatbot. The phone has no applications and acts as a personal assistant that communicates with the user by voice. The solution is based on a digital assistant from Perplexity AI, but it is also meant to be open to, among others, Google Cloud AI, ElevenLabs, and Picsart. South Korean startup Newnal presented a new operating system for mobile phones that uses historical and current user data to create a personalized AI assistant, which is eventually to become an AI avatar that behaves just like the user.
All of the above solutions, as well as many others, are connected by the use of voice technologies for two-way communication. The direction indicated at MWC 2025 is clear: our actions will be supported by avatars and bots communicating with us autonomously. Quick, machine-based confirmation of who we are talking to is therefore becoming more important than ever, because the quality of autonomous voice communication systems means that a human can no longer be relied on to verify the speaker correctly.
Photos by Andrzej Tymecki



How can voice-based fraud be detected effectively? How can a real voice be distinguished from a fake one, e.g. one generated by AI? The answer is simple: it requires advanced voice biometrics tools and a number of analyses. Here we publish two examples that we analyzed some time ago in our laboratory. Thanks to our proprietary algorithm, we assessed with very high probability whether a voice is real or fake and to what extent it is consistent with the voice of a given person.
The analyses concern:
● identifying the voice of one of the Russian pranksters who called President Duda while pretending to be President Macron
● assessing the similarity of the voices of actors Piotr Fronczewski and Filip Pławiak in the film Rojst, in which the two actors play the same character (Kociołek) in adulthood and in youth, respectively.
We share with you the conclusions from these experiments.
Biometric comparison of the voices of Fronczewski and Pławiak.
For this purpose, we used a total of 25 seconds of speech by both characters, assembled from several fragments of their original lines taken from the original film soundtrack. What level of agreement did we obtain?
The results of the analysis showed that the actors’ voices are NOT biometrically consistent. Pławiak’s utterance vs. Fronczewski’s voiceprint (VP): only 15% agreement. Fronczewski’s utterance vs. Pławiak’s VP: 11%. Interestingly, these differences are not noticeable to the ear. To a listener, in our opinion, the voices of Pławiak and Fronczewski are almost identical. And that is ultimately what this is all about.
For both characters, gender and nationality were recognized with minimal uncertainty (score of almost 100%). An age difference between the characters was also detected, estimated at 20 years.
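For readers wondering how an utterance can be compared with a voiceprint at all, the sketch below shows one common open approach: speaker embeddings (such as the x-vector or ECAPA embeddings we also work with) are averaged into a voiceprint and compared using cosine similarity. This is an assumption-laden illustration and not the proprietary algorithm used in the analysis above. The function names and the 0.7 decision threshold are hypothetical, and the agreement percentages reported here are not computed this way.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero vectors")
    return dot / (norm_a * norm_b)


def build_voiceprint(embeddings: list[list[float]]) -> list[float]:
    """Average several utterance embeddings of one speaker into a single voiceprint."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def same_speaker(voiceprint: list[float], utterance: list[float], threshold: float = 0.7) -> bool:
    """Accept the utterance when its similarity to the voiceprint reaches the threshold."""
    return cosine_similarity(voiceprint, utterance) >= threshold
```

In practice the embeddings would come from a speaker-recognition model and have a few hundred dimensions, and the threshold would be tuned on real data rather than fixed in advance.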
Analysis of the voices of Russian pranksters Vladimir Kuznetsov (Vovan) and Alexei Stolyarov (Lexus) impersonating President Macron.
In this case, we biometrically analyzed the recordings of the pranksters’ voices and compared them with the voice of the real Macron (in both Polish and English versions). We downloaded all voice samples in the form of individual recordings from the public domain on YouTube. Our goal was to confirm the effectiveness of biometric systems for this specific situation – identifying fraud.
It turned out that the voice of one of the pranksters, “Lexus”, was just over 50% consistent with the voice of the President of France and as much as 97% consistent with the voice of the fake president. The voice of the second one, “Vovan”, showed no similarity (0%) to the fake president.
This shows that, thanks to biometric analysis, we managed to:
● detect, after only 1 minute, that a fake president was involved in the conversation
● establish the identity of the fake president (Lexus)
● confirm that the public domain is a very good source of voice samples, which may not always be used for noble purposes
● strengthen the thesis that the most effective attacks are those using social engineering. In this case, it was the choice of the right moment, when the President was under increased stress (the rocket fall).
These are just selected examples of the use of specialized biometric tools to confirm the identity of people. If implemented in the future, they may help detect voice-based abuse.
