Connect a speech-to-text model

You can connect a speech-to-text model to P4 Search and configure it.

To learn more about setting configurations manually, see Configure P4 Search locally.

Set up auto-speech recognition with Azure Speech

Set the auto-speech model to AzureSpeechModel:

com.perforce.p4search.auto-speech.model=AzureSpeechModel

Specify the auto-speech recognition service hostname. For example:

com.perforce.p4search.auto-speech.host=https://eastus.stt.speech.microsoft.com

Specify the language of the audio you want to be transcribed. For example:

com.perforce.p4search.auto-speech.lang=en-US

Enter the API key for your auto-speech recognition service. For example:

com.perforce.p4search.auto-speech.key=0123456789ABCDEF0123456789ABCDEF

Set the automatic speech recognition threshold as a floating-point fraction, where 0.6 means 60%. The threshold is compared against the accuracy score that your auto-speech service returns for each transcription.

In this example, with the threshold set to 0.6, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.

com.perforce.p4search.auto-speech.threshold=0.6
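The accept-or-discard rule can be sketched in Python as follows. This is only an illustration of the comparison, not the P4 Search implementation: the `Transcription` type and the `accept` function are hypothetical names, and the score is assumed to be a value between 0 and 1 reported by the speech service.

```python
from dataclasses import dataclass


@dataclass
class Transcription:
    """Hypothetical container for one result returned by a speech service."""
    text: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def accept(result: Transcription, threshold: float = 0.6) -> bool:
    """Keep a transcription only if its score meets or exceeds the threshold."""
    return result.confidence >= threshold


# Made-up sample results, used only to show the comparison.
results = [
    Transcription("open the build log", 0.82),
    Transcription("opn thu bild lok", 0.41),
    Transcription("merge the release branch", 0.60),
]
kept = [r.text for r in results if accept(r)]
print(kept)
```

A higher threshold discards more low-quality transcriptions but may also drop usable ones, so the value is a trade-off you tune for your audio.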

Set the short audio transcription limit (in seconds). For example:

com.perforce.p4search.auto-speech.short-audio=60

Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:

com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:

com.perforce.p4search.auto-speech.timeout=1200

All the configurations together:

com.perforce.p4search.auto-speech.model=AzureSpeechModel

com.perforce.p4search.auto-speech.host=https://eastus.stt.speech.microsoft.com

com.perforce.p4search.auto-speech.lang=en-US

com.perforce.p4search.auto-speech.key=0123456789ABCDEF0123456789ABCDEF

com.perforce.p4search.auto-speech.threshold=0.6

com.perforce.p4search.auto-speech.short-audio=60

com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

com.perforce.p4search.auto-speech.timeout=1200
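Before applying a block like the one above, it can help to check the value formats. The following Python sketch is one way to do that, under stated assumptions: `parse_properties` and `check_auto_speech` are hypothetical helpers, not part of P4 Search, and the rule that the threshold lies between 0 and 1 is inferred from the "0.6=60%" description.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_auto_speech(props: dict[str, str]) -> list[str]:
    """Return a list of problems found in the auto-speech values."""
    prefix = "com.perforce.p4search.auto-speech."
    problems = []
    threshold = props.get(prefix + "threshold")
    if threshold is not None:
        try:
            # Assumption: the threshold is a fraction between 0 and 1.
            if not 0.0 <= float(threshold) <= 1.0:
                problems.append("threshold must be between 0 and 1")
        except ValueError:
            problems.append("threshold is not a number")
    for name in ("short-audio", "timeout"):
        value = props.get(prefix + name)
        if value is not None and not value.isdigit():
            problems.append(f"{name} must be a whole number of seconds")
    return problems


sample = """
com.perforce.p4search.auto-speech.model=AzureSpeechModel
com.perforce.p4search.auto-speech.threshold=0.6
com.perforce.p4search.auto-speech.short-audio=60
com.perforce.p4search.auto-speech.timeout=1200
"""
props = parse_properties(sample)
print(check_auto_speech(props))
```

An empty list means none of the checked values looks malformed. The sketch does not verify that the host, key, or bucket is valid for your cloud account.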

Set up auto-speech recognition with Amazon Transcribe

Set the auto-speech model to AmazonTranscribeModel:

com.perforce.p4search.auto-speech.model=AmazonTranscribeModel

Specify the auto-speech recognition service host. For Amazon Transcribe, the example value is an AWS Region code rather than a URL:

com.perforce.p4search.auto-speech.host=us-east-2

Specify the language of the audio you want to be transcribed. For example:

com.perforce.p4search.auto-speech.lang=en-US

Enter the API key for your auto-speech recognition service. The API key for the AmazonTranscribeModel is the <aws_access_key_id> and the <aws_secret_access_key> joined by a colon. For example:

com.perforce.p4search.auto-speech.key=<aws_access_key_id>:<aws_secret_access_key>

where

aws_access_key_id=ABCDEFGHIJKL12345678

aws_secret_access_key=ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp

gives

com.perforce.p4search.auto-speech.key=ABCDEFGHIJKL12345678:ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp
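The way the two credentials combine into one value can be sketched in Python. The function name `amazon_transcribe_key` is hypothetical and exists only for this illustration; the credentials are the placeholder values from the example above, not real keys.

```python
def amazon_transcribe_key(access_key_id: str, secret_access_key: str) -> str:
    """Join the two AWS credentials with a colon, as the key property expects."""
    if not access_key_id or not secret_access_key:
        raise ValueError("both the access key ID and the secret access key are required")
    return f"{access_key_id}:{secret_access_key}"


# Placeholder credentials copied from the example above.
line = "com.perforce.p4search.auto-speech.key=" + amazon_transcribe_key(
    "ABCDEFGHIJKL12345678",
    "ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp",
)
print(line)
```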

Set the automatic speech recognition threshold as a floating-point fraction, where 0.6 means 60%. The threshold is compared against the accuracy score that your auto-speech service returns for each transcription.

In this example, with the threshold set to 0.6, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.

com.perforce.p4search.auto-speech.threshold=0.6

Set the short audio transcription limit (in seconds). For example:

com.perforce.p4search.auto-speech.short-audio=60

Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:

com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:

com.perforce.p4search.auto-speech.timeout=1200

All the configurations together:

com.perforce.p4search.auto-speech.model=AmazonTranscribeModel

com.perforce.p4search.auto-speech.host=us-east-2

com.perforce.p4search.auto-speech.lang=en-US

com.perforce.p4search.auto-speech.key=ABCDEFGHIJKL12345678:ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp

com.perforce.p4search.auto-speech.threshold=0.6

com.perforce.p4search.auto-speech.short-audio=60

com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

com.perforce.p4search.auto-speech.timeout=1200

Set up auto-speech recognition with Google Speech

Set the auto-speech model to GoogleSpeechModel:

com.perforce.p4search.auto-speech.model=GoogleSpeechModel

Specify the auto-speech recognition service hostname. For example:

com.perforce.p4search.auto-speech.host=https://speech.googleapis.com

Specify the language of the audio you want to be transcribed. For example:

com.perforce.p4search.auto-speech.lang=en

Enter the API key for your auto-speech recognition service. For example:

com.perforce.p4search.auto-speech.key=AbcdEFG12345ZXvfe56210QWErtyui123456789

Set the automatic speech recognition threshold as a floating-point fraction, where 0.6 means 60%. The threshold is compared against the accuracy score that your auto-speech service returns for each transcription.

In this example, with the threshold set to 0.6, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.

com.perforce.p4search.auto-speech.threshold=0.6

Set the short audio transcription limit (in seconds). For example:

com.perforce.p4search.auto-speech.short-audio=60

Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:

com.perforce.p4search.auto-speech.timeout=1200

All the configurations together:

com.perforce.p4search.auto-speech.model=GoogleSpeechModel

com.perforce.p4search.auto-speech.host=https://speech.googleapis.com

com.perforce.p4search.auto-speech.lang=en

com.perforce.p4search.auto-speech.key=AbcdEFG12345ZXvfe56210QWErtyui123456789

com.perforce.p4search.auto-speech.threshold=0.6

com.perforce.p4search.auto-speech.short-audio=60

com.perforce.p4search.auto-speech.timeout=1200

Set up auto-speech recognition with Whisper Speech

Run Whisper as a service in a Docker container. Here is an example of a Docker image that you can use:

https://hub.docker.com/r/onerahmet/openai-whisper-asr-webservice

After the Docker container is running, set the auto-speech model to WhisperSpeechModel:

com.perforce.p4search.auto-speech.model=WhisperSpeechModel

Specify the auto-speech recognition service hostname using the Whisper Docker container hostname and port number. For example:

com.perforce.p4search.auto-speech.host=http://localhost:9000
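Before pointing P4 Search at the container, you may want to confirm that something is listening on the configured host and port. The Python sketch below is one possible check using only the standard library. `host_and_port` and `is_listening` are hypothetical helper names, and the check only confirms that a TCP connection succeeds, not that Whisper has finished loading its model.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(url: str) -> tuple[str, int]:
    """Extract the hostname and port from a host URL such as http://localhost:9000."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no hostname found in {url!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def is_listening(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(host_and_port("http://localhost:9000"))
```

Calling `is_listening("http://localhost:9000")` returns `True` once the container accepts connections on that port, and `False` if the connection is refused or times out.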

(Optional) Whisper has built-in language detection, but you can explicitly specify the language of your audio and video files. For example:

com.perforce.p4search.auto-speech.lang=en

Set the short audio transcription limit (in seconds). For example:

com.perforce.p4search.auto-speech.short-audio=60

Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:

com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:

com.perforce.p4search.auto-speech.timeout=1200

All the configurations together:

com.perforce.p4search.auto-speech.model=WhisperSpeechModel

com.perforce.p4search.auto-speech.host=http://localhost:9000

com.perforce.p4search.auto-speech.lang=en

com.perforce.p4search.auto-speech.short-audio=60

com.perforce.p4search.auto-speech.timeout=1200