Voice
Voice service is a REST API that accepts audio files and returns the result of the voice recognition process. It provides two operations: one to enroll a new voice and another to authenticate a voice.
The product performs speaker verification and voice liveness detection. It is built around a voice template: a string that contains the voice biometric information and that can be used to authenticate the same voice in the future.
Supported operations are:
- Enrollment: enroll a new voice and obtain its voice template.
- Authentication: authenticate a voice against a voice template.
Enrollment
This endpoint enrolls a new voice. It receives one or more audio files and returns a voice template that can be used to authenticate the voice in the future. The audios must be encoded in base64 and may optionally be encrypted. The returned template is always encrypted and encoded in base64. The number of audios determines the enrollment type:
- 1 audio for text-independent enrollment.
- 3 to 5 audios for text-dependent enrollment.
| Field | Description |
|---|---|
| audios | Array of strings. Each element is a raw audio buffer encoded in base64 (RFC 4648). The array must contain 1 audio, or 3 to 5 audios. |
Supported audio formats are:
- WAV
- MP3
- Opus/OGG
- AAC
- WMA
- PCM ulaw and mulaw
- FLAC
- ALAC (mov)
- MP4
- AIFF
Example request:
curl --location '{IDENTITY_API_BASE_URL}/voice/enrollment' \
--header 'x-api-key: {IDENTITY_API_APIKEY}' \
--header 'Content-Type: application/json' \
--data '{
"audios": ["JVBERi0xLjQKJeLjz9MKNSAwIG9iago8P..."]
}'
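The request above can also be sketched in Python. This is a minimal illustration, not an official client: the function names build_enrollment_payload and enroll are hypothetical, and base_url and api_key stand in for the {IDENTITY_API_BASE_URL} and {IDENTITY_API_APIKEY} placeholders. It assumes unencrypted audio and checks the 1 or 3 to 5 audio rule on the client before sending.

```python
import base64
import json
import urllib.request


def build_enrollment_payload(audio_buffers):
    """Return the JSON body for /voice/enrollment from raw audio bytes."""
    count = len(audio_buffers)
    # 1 audio (text-independent) or 3 to 5 audios (text-dependent).
    if count != 1 and not 3 <= count <= 5:
        raise ValueError(f"expected 1 or 3 to 5 audios, got {count}")
    # Each raw buffer is sent as standard base64 (RFC 4648) text.
    return {
        "audios": [base64.b64encode(buf).decode("ascii") for buf in audio_buffers]
    }


def enroll(base_url, api_key, audio_buffers, timeout=30):
    """POST the enrollment request and return the decoded JSON response.

    Hypothetical helper: base_url and api_key are supplied by the caller.
    """
    body = json.dumps(build_enrollment_payload(audio_buffers)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/voice/enrollment",
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Rejecting an invalid audio count locally avoids a round trip for a request that the table above already rules out.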
Example response (text-dependent enrollment with three audios):
200 OK
{
  "serviceResultCode": 200,
  "serviceResultLog": "Service executed ok",
  "timestamp": "2024-07-12T09:43:36Z",
  "serviceTransactionId": "99999999-9999-9999-9999-999999999999",
  "serviceResult": {
    "operation_result": 3,
    "template": "BgEBAQIvimhg/Th98mTNID4BPHKsJsf...",
    "template_type": "text-dependent",
    "validate_audios_result": [
      {
        "audio_position": 0,
        "matching_score": 0.9999997019767761,
        "multiple_speakers_score_detected": -3.4028234663852886e+38,
        "result_code": 3,
        "snr_db_detected": 18.781143188476562,
        "speech_length_ms_detected": 4200,
        "speech_relative_length_detected": 0.65625
      },
      {
        "audio_position": 1,
        "matching_score": 1,
        "multiple_speakers_score_detected": -3.4028234663852886e+38,
        "result_code": 3,
        "snr_db_detected": 17.34685707092285,
        "speech_length_ms_detected": 5000,
        "speech_relative_length_detected": 0.6868131756782532
      },
      {
        "audio_position": 2,
        "matching_score": 1,
        "multiple_speakers_score_detected": -3.4028234663852886e+38,
        "result_code": 3,
        "snr_db_detected": 17.34685707092285,
        "speech_length_ms_detected": 5000,
        "speech_relative_length_detected": 0.6868131756782532
      }
    ]
  },
  "serviceTime": "638"
}
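A response with this shape could be reduced to the fields a client usually needs, as in the following sketch. The function name summarize_enrollment is hypothetical. The value reported for multiple_speakers_score_detected in the sample equals the lowest finite 32-bit float; treating it as "no score available" is an assumption, not documented behavior. The meaning of result_code values is not described here, so the code passes them through without interpreting them.

```python
import json

# Lowest finite float32 value, as it appears in the sample response.
# Interpreting it as "score not computed" is an assumption.
FLOAT32_LOWEST = -3.4028234663852886e+38


def summarize_enrollment(response_text):
    """Extract the template and per-audio quality figures from the response."""
    response = json.loads(response_text)
    result = response["serviceResult"]
    audios = []
    for item in result["validate_audios_result"]:
        score = item["multiple_speakers_score_detected"]
        audios.append({
            "position": item["audio_position"],
            "result_code": item["result_code"],  # passed through uninterpreted
            "snr_db": item["snr_db_detected"],
            "speech_ms": item["speech_length_ms_detected"],
            "multiple_speakers_score": None if score == FLOAT32_LOWEST else score,
        })
    return {
        "template": result["template"],
        "template_type": result["template_type"],
        "audios": audios,
    }
```

The returned template string is what a client would store and later send to the authentication endpoint.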
Authentication
This endpoint authenticates a voice. It receives an audio file and a voice template, and returns a boolean value (match) that indicates whether the voice belongs to the same person as the one in the voice template, together with a probability (matching_score) that indicates the similarity between the two voices. The audio must be encoded in base64 and may optionally be encrypted. The voice template must be encrypted and encoded in base64.
| Field | Description |
|---|---|
| audio | Raw audio buffer encoded in base64 (RFC 4648). |
| template | Biometric template buffer, obtained from the Enrollment endpoint, encrypted and encoded in base64 (RFC 4648). |
Example request:
curl --location '{IDENTITY_API_BASE_URL}/voice/authentication' \
--header 'x-api-key: {IDENTITY_API_APIKEY}' \
--header 'Content-Type: application/json' \
--data '{
"audio": "JVBERi0xLjQKJeLjz9MKNSAwIG9iago8P...",
"template": "BgEBAQI+d368i49ITeoPlmCi5zbYp3kdvTsk6otTOl...."
}'
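The body of this request could be assembled in Python as sketched below. The function name build_authentication_body is hypothetical, and unencrypted audio is assumed. The point it illustrates is that only the audio is base64-encoded by the client; the template is already base64 text returned by enrollment and is forwarded unchanged.

```python
import base64
import json


def build_authentication_body(audio_bytes, template_b64):
    """Return the JSON body for /voice/authentication as a string."""
    if not template_b64:
        raise ValueError("a template from enrollment is required")
    payload = {
        # Raw audio buffer, encoded here as standard base64 (RFC 4648).
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        # Already encrypted and base64-encoded by enrollment: do not re-encode.
        "template": template_b64,
    }
    return json.dumps(payload)
```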
Example response:
200 OK
{
  "serviceResultCode": 200,
  "serviceResultLog": "Service executed ok",
  "timestamp": "2024-07-13T19:43:36Z",
  "serviceTransactionId": "99999999-9999-9999-9999-999999999999",
  "serviceResult": {
    "liveness_score": 0,
    "match": true,
    "matching_score": 1,
    "operation_result": 3,
    "tracking_message": "",
    "tracking_status": -1
  },
  "serviceTime": "1708"
}
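A client might turn this response into an accept or reject decision as in the sketch below. The function name interpret_authentication and the optional min_score parameter are hypothetical: min_score is a stricter client-side threshold layered on top of the service's own match flag, not part of the API. Treating any serviceResultCode other than 200 as a failure is also an assumption based on the sample above.

```python
import json


def interpret_authentication(response_text, min_score=None):
    """Return (accepted, matching_score, liveness_score) from the response."""
    response = json.loads(response_text)
    # Assumption: 200 is the only success code, as in the sample response.
    if response.get("serviceResultCode") != 200:
        raise RuntimeError(
            f"service error: {response.get('serviceResultLog', 'unknown')}"
        )
    result = response["serviceResult"]
    accepted = bool(result["match"])
    # Optional client-side threshold (hypothetical, not defined by the API).
    if min_score is not None:
        accepted = accepted and result["matching_score"] >= min_score
    return accepted, result["matching_score"], result["liveness_score"]
```

Without min_score the decision simply follows the match field returned by the service.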