Connect a speech-to-text model
Follow these instructions to connect and configure a speech-to-text model for P4 Search.
To learn more about setting configurations manually, see Configure P4 Search locally.
Set up auto-speech recognition with Azure Speech
- Set the auto-speech model to AzureSpeechModel:
  com.perforce.p4search.auto-speech.model=AzureSpeechModel
- Specify the auto-speech recognition service hostname. For example:
  com.perforce.p4search.auto-speech.host=https://eastus.stt.speech.microsoft.com
- Specify the language of the audio you want transcribed. For example:
  com.perforce.p4search.auto-speech.lang=en-US
- Enter the API key for your auto-speech recognition service. For example:
  com.perforce.p4search.auto-speech.key=0123456789ABCDEF0123456789ABCDEF
- Set the automatic speech recognition threshold as a floating-point value, where 0.6 means 60%. The threshold is compared with the accuracy that your auto-speech service returns for a transcription. For example:
  com.perforce.p4search.auto-speech.threshold=0.6
  In this example, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.
- Set the short audio transcribe limit (in seconds). For example:
  com.perforce.p4search.auto-speech.short-audio=60
- Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:
  com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name
- Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:
  com.perforce.p4search.auto-speech.timeout=1200
All the configurations together:
com.perforce.p4search.auto-speech.model=AzureSpeechModel
com.perforce.p4search.auto-speech.host=https://eastus.stt.speech.microsoft.com
com.perforce.p4search.auto-speech.lang=en-US
com.perforce.p4search.auto-speech.key=0123456789ABCDEF0123456789ABCDEF
com.perforce.p4search.auto-speech.threshold=0.6
com.perforce.p4search.auto-speech.short-audio=60
com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name
com.perforce.p4search.auto-speech.timeout=1200
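The combined properties above can be sanity-checked before they are applied. The following is a minimal Python sketch, not part of P4 Search: the helper names parse_properties and check_auto_speech are hypothetical, and the rules it enforces (a threshold between 0 and 1, whole-number seconds) are assumptions drawn from the example values on this page.

```python
PREFIX = "com.perforce.p4search.auto-speech."

# Property names used in the Azure Speech example on this page.
REQUIRED_KEYS = ("model", "host", "lang", "key",
                 "threshold", "short-audio", "bucket", "timeout")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_auto_speech(props, required=REQUIRED_KEYS):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for name in required:
        if not props.get(PREFIX + name):
            problems.append(f"missing {PREFIX}{name}")

    # Assumption: the threshold is a fraction, so 0.6 means 60%.
    threshold = props.get(PREFIX + "threshold")
    if threshold is not None:
        try:
            if not 0.0 <= float(threshold) <= 1.0:
                problems.append("threshold should be between 0 and 1")
        except ValueError:
            problems.append("threshold is not a number")

    # Assumption: both limits are whole numbers of seconds.
    for name in ("short-audio", "timeout"):
        value = props.get(PREFIX + name)
        if value is not None and not value.isdigit():
            problems.append(f"{name} should be a whole number of seconds")
    return problems
```

Pass a different tuple as required for models that do not use every property, such as the Google Speech example, which has no bucket.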
Set up auto-speech recognition with Amazon Transcribe
- Set the auto-speech model to AmazonTranscribeModel:
  com.perforce.p4search.auto-speech.model=AmazonTranscribeModel
- Specify the auto-speech recognition service host. In this example, the value is an AWS Region name:
  com.perforce.p4search.auto-speech.host=us-east-2
- Specify the language of the audio you want transcribed. For example:
  com.perforce.p4search.auto-speech.lang=en-US
- Enter the API key for your auto-speech recognition service. The API key for the AmazonTranscribeModel is a combination of <aws_access_key_id> and <aws_secret_access_key>, separated by a colon. For example:
  com.perforce.p4search.auto-speech.key=<aws_access_key_id>:<aws_secret_access_key>
  where
  aws_access_key_id=ABCDEFGHIJKL12345678
  aws_secret_access_key=ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp
  gives
  com.perforce.p4search.auto-speech.key=ABCDEFGHIJKL12345678:ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp
- Set the automatic speech recognition threshold as a floating-point value, where 0.6 means 60%. The threshold is compared with the accuracy that your auto-speech service returns for a transcription. For example:
  com.perforce.p4search.auto-speech.threshold=0.6
  In this example, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.
- Set the short audio transcribe limit (in seconds). For example:
  com.perforce.p4search.auto-speech.short-audio=60
- Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:
  com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name
- Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:
  com.perforce.p4search.auto-speech.timeout=1200
All the configurations together:
com.perforce.p4search.auto-speech.model=AmazonTranscribeModel
com.perforce.p4search.auto-speech.host=us-east-2
com.perforce.p4search.auto-speech.lang=en-US
com.perforce.p4search.auto-speech.key=ABCDEFGHIJKL12345678:ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp
com.perforce.p4search.auto-speech.threshold=0.6
com.perforce.p4search.auto-speech.short-audio=60
com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name
com.perforce.p4search.auto-speech.timeout=1200
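The key value for AmazonTranscribeModel is the access key ID and the secret access key joined by a colon, as in the example above. A minimal Python sketch of that composition follows; the function names are hypothetical and only illustrate the format shown on this page.

```python
KEY_PROPERTY = "com.perforce.p4search.auto-speech.key"


def amazon_transcribe_key(access_key_id, secret_access_key):
    """Join the two AWS credential parts as <id>:<secret>."""
    if not access_key_id or not secret_access_key:
        raise ValueError("both credential parts are required")
    # A colon in the ID would make the combined value ambiguous.
    if ":" in access_key_id:
        raise ValueError("access key ID must not contain a colon")
    return f"{access_key_id}:{secret_access_key}"


def key_property_line(access_key_id, secret_access_key):
    """Build the full property line for the configuration file."""
    return f"{KEY_PROPERTY}={amazon_transcribe_key(access_key_id, secret_access_key)}"
```

Calling key_property_line with the placeholder credentials from this page produces the same key line as the combined configuration above.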
Set up auto-speech recognition with Google Speech
- Set the auto-speech model to GoogleSpeechModel:
  com.perforce.p4search.auto-speech.model=GoogleSpeechModel
- Specify the auto-speech recognition service hostname. For example:
  com.perforce.p4search.auto-speech.host=https://speech.googleapis.com
- Specify the language of the audio you want transcribed. For example:
  com.perforce.p4search.auto-speech.lang=en
- Enter the API key for your auto-speech recognition service. For example:
  com.perforce.p4search.auto-speech.key=AbcdEFG12345ZXvfe56210QWErtyui123456789
- Set the automatic speech recognition threshold as a floating-point value, where 0.6 means 60%. The threshold is compared with the accuracy that your auto-speech service returns for a transcription. For example:
  com.perforce.p4search.auto-speech.threshold=0.6
  In this example, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.
- Set the short audio transcribe limit (in seconds). For example:
  com.perforce.p4search.auto-speech.short-audio=60
- Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:
  com.perforce.p4search.auto-speech.timeout=1200
All the configurations together:
com.perforce.p4search.auto-speech.model=GoogleSpeechModel
com.perforce.p4search.auto-speech.host=https://speech.googleapis.com
com.perforce.p4search.auto-speech.lang=en
com.perforce.p4search.auto-speech.key=AbcdEFG12345ZXvfe56210QWErtyui123456789
com.perforce.p4search.auto-speech.threshold=0.6
com.perforce.p4search.auto-speech.short-audio=60
com.perforce.p4search.auto-speech.timeout=1200
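The effect of the threshold property can be illustrated with a short Python sketch. This is not how P4 Search is implemented; the Transcript class and the accepted function are hypothetical, and the sketch assumes the speech service reports accuracy as a value between 0.0 and 1.0.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    text: str
    accuracy: float  # assumed to be reported by the speech service, 0.0 to 1.0


def accepted(transcripts, threshold=0.6):
    """Keep transcripts at or above the threshold; the rest are discarded."""
    return [t for t in transcripts if t.accuracy >= threshold]
```

With a threshold of 0.6, a transcript scored 0.6 or 0.9 is kept and one scored 0.59 is dropped.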
Set up auto-speech recognition with Whisper Speech
- Run Whisper as a service in a Docker container. Here is an example of a Docker image:
  https://hub.docker.com/r/onerahmet/openai-whisper-asr-webservice
- Once the Docker container is running, set the auto-speech model to WhisperSpeechModel. For example:
  com.perforce.p4search.auto-speech.model=WhisperSpeechModel
- Specify the auto-speech recognition service hostname using the Whisper Docker container hostname and port number. For example:
  com.perforce.p4search.auto-speech.host=http://localhost:9000
- (Optional) Whisper has built-in language detection, but you can also specify the language of your audio and video files. For example:
  com.perforce.p4search.auto-speech.lang=en
- Set the short audio transcribe limit (in seconds). For example:
  com.perforce.p4search.auto-speech.short-audio=60
- Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:
  com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name
- Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:
  com.perforce.p4search.auto-speech.timeout=1200
All the configurations together:
com.perforce.p4search.auto-speech.model=WhisperSpeechModel
com.perforce.p4search.auto-speech.host=http://localhost:9000
com.perforce.p4search.auto-speech.lang=en
com.perforce.p4search.auto-speech.short-audio=60
com.perforce.p4search.auto-speech.timeout=1200
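Because the Whisper host value combines the container hostname and port, it can help to confirm that the value parses as expected and that something is listening there. The following Python sketch is an illustration only; whisper_endpoint and is_reachable are hypothetical helper names, and the check confirms only that a TCP connection opens, not that the service behind it is Whisper.

```python
import socket
from urllib.parse import urlsplit


def whisper_endpoint(host_value):
    """Split a host value such as http://localhost:9000 into (hostname, port)."""
    parts = urlsplit(host_value)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"expected an http(s) URL, got {host_value!r}")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_reachable(hostname, port, timeout=3.0):
    """Return True if a TCP connection to hostname:port can be opened."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_reachable(*whisper_endpoint("http://localhost:9000")) returns False when the container is not running or the port is not published.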