Connect a speech-to-text model

Follow these instructions to connect and configure a speech-to-text model for P4 Search.

To learn more about setting configurations manually, see Configure P4 Search locally.

Set up auto-speech recognition with Azure Speech

  1. Set the auto-speech model to AzureSpeechModel:

    com.perforce.p4search.auto-speech.model=AzureSpeechModel

  2. Specify the auto-speech recognition service hostname. For example:

    com.perforce.p4search.auto-speech.host=https://eastus.stt.speech.microsoft.com

  3. Specify the language of the audio that you want to transcribe. For example:

    com.perforce.p4search.auto-speech.lang=en-US

  4. Enter the API key for your auto-speech recognition service. For example:

    com.perforce.p4search.auto-speech.key=0123456789ABCDEF0123456789ABCDEF

  5. Set the automatic speech recognition threshold as a floating-point fraction, where 0.6 corresponds to 60%. The threshold is the minimum accuracy, as returned by your auto-speech service, that a transcription must reach to be accepted.

    In this example, with the threshold set to 0.6, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.

    com.perforce.p4search.auto-speech.threshold=0.6

  6. Set the short audio transcription limit (in seconds). For example:

    com.perforce.p4search.auto-speech.short-audio=60

  7. Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:

    com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

  8. Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:

    com.perforce.p4search.auto-speech.timeout=1200

All the configurations together:

com.perforce.p4search.auto-speech.model=AzureSpeechModel

com.perforce.p4search.auto-speech.host=https://eastus.stt.speech.microsoft.com

com.perforce.p4search.auto-speech.lang=en-US

com.perforce.p4search.auto-speech.key=0123456789ABCDEF0123456789ABCDEF

com.perforce.p4search.auto-speech.threshold=0.6

com.perforce.p4search.auto-speech.short-audio=60

com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

com.perforce.p4search.auto-speech.timeout=1200
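
To illustrate the threshold setting described in the steps above, here is a minimal Python sketch of how a 0.6 threshold separates accepted from discarded transcriptions. The Segment class, the filter_by_threshold function, and the sample data are hypothetical and exist only for this illustration; this is not how P4 Search implements the check internally.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical transcription result: text plus an accuracy score in [0, 1]."""
    text: str
    confidence: float


def filter_by_threshold(segments, threshold):
    """Keep only the segments whose score is at or above the threshold."""
    return [s for s in segments if s.confidence >= threshold]


# Made-up sample results, for illustration only.
segments = [
    Segment("open the depot", 0.92),
    Segment("sync the workspace", 0.60),
    Segment("(unclear audio)", 0.41),
]

accepted = filter_by_threshold(segments, threshold=0.6)
print([s.text for s in accepted])
```

With a threshold of 0.6, the first two segments are kept and the third is discarded. Raising the threshold keeps fewer, more reliable transcriptions; lowering it keeps more text at the cost of accuracy.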

Set up auto-speech recognition with Amazon Transcribe

  1. Set the auto-speech model to AmazonTranscribeModel:

    com.perforce.p4search.auto-speech.model=AmazonTranscribeModel

  2. Specify the auto-speech recognition service hostname. In this example, the value is an AWS Region name:

    com.perforce.p4search.auto-speech.host=us-east-2

  3. Specify the language of the audio that you want to transcribe. For example:

    com.perforce.p4search.auto-speech.lang=en-US

  4. Enter the API key for your auto-speech recognition service. The API key for the AmazonTranscribeModel combines <aws_access_key_id> and <aws_secret_access_key>, separated by a colon:

    com.perforce.p4search.auto-speech.key=<aws_access_key_id>:<aws_secret_access_key>

    where

    aws_access_key_id=ABCDEFGHIJKL12345678

    aws_secret_access_key=ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp

    gives

    com.perforce.p4search.auto-speech.key=ABCDEFGHIJKL12345678:ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp

  5. Set the automatic speech recognition threshold as a floating-point fraction, where 0.6 corresponds to 60%. The threshold is the minimum accuracy, as returned by your auto-speech service, that a transcription must reach to be accepted.

    In this example, with the threshold set to 0.6, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.

    com.perforce.p4search.auto-speech.threshold=0.6

  6. Set the short audio transcription limit (in seconds). For example:

    com.perforce.p4search.auto-speech.short-audio=60

  7. Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:

    com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

  8. Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:

    com.perforce.p4search.auto-speech.timeout=1200

All the configurations together:

com.perforce.p4search.auto-speech.model=AmazonTranscribeModel

com.perforce.p4search.auto-speech.host=us-east-2

com.perforce.p4search.auto-speech.lang=en-US

com.perforce.p4search.auto-speech.key=ABCDEFGHIJKL12345678:ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp

com.perforce.p4search.auto-speech.threshold=0.6

com.perforce.p4search.auto-speech.short-audio=60

com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

com.perforce.p4search.auto-speech.timeout=1200
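
The key format for Amazon Transcribe is easy to get wrong by hand, so the following Python sketch shows one way to assemble it. The build_transcribe_key function is a hypothetical helper written for this example, and the credentials are the placeholder values from this page, not real keys.

```python
def build_transcribe_key(access_key_id: str, secret_access_key: str) -> str:
    """Join the two AWS credentials with a colon, as shown in the steps above."""
    if not access_key_id or not secret_access_key:
        raise ValueError("both the access key ID and the secret access key are required")
    if ":" in access_key_id:
        # A colon in the first part would make the combined value ambiguous.
        raise ValueError("the access key ID must not contain a colon")
    return f"{access_key_id}:{secret_access_key}"


# Placeholder credentials from this page, not real keys.
key = build_transcribe_key(
    "ABCDEFGHIJKL12345678",
    "ab0cd1ef2gh3IJ4aaaBBBcccWWW111rfc1234YERTpp",
)
print(f"com.perforce.p4search.auto-speech.key={key}")
```

In practice, you would likely read the two values from a secure source rather than hard-coding them in a script.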

Set up auto-speech recognition with Google Speech

  1. Set the auto-speech model to GoogleSpeechModel:

    com.perforce.p4search.auto-speech.model=GoogleSpeechModel

  2. Specify the auto-speech recognition service hostname. For example:

    com.perforce.p4search.auto-speech.host=https://speech.googleapis.com

  3. Specify the language of the audio that you want to transcribe. For example:

    com.perforce.p4search.auto-speech.lang=en

  4. Enter the API key for your auto-speech recognition service. For example:

    com.perforce.p4search.auto-speech.key=AbcdEFG12345ZXvfe56210QWErtyui123456789

  5. Set the automatic speech recognition threshold as a floating-point fraction, where 0.6 corresponds to 60%. The threshold is the minimum accuracy, as returned by your auto-speech service, that a transcription must reach to be accepted.

    In this example, with the threshold set to 0.6, any transcription with an accuracy of 60% or higher is accepted, and anything below that is discarded.

    com.perforce.p4search.auto-speech.threshold=0.6

  6. Set the short audio transcription limit (in seconds). For example:

    com.perforce.p4search.auto-speech.short-audio=60

  7. Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:

    com.perforce.p4search.auto-speech.timeout=1200

All the configurations together:

com.perforce.p4search.auto-speech.model=GoogleSpeechModel

com.perforce.p4search.auto-speech.host=https://speech.googleapis.com

com.perforce.p4search.auto-speech.lang=en

com.perforce.p4search.auto-speech.key=AbcdEFG12345ZXvfe56210QWErtyui123456789

com.perforce.p4search.auto-speech.threshold=0.6

com.perforce.p4search.auto-speech.short-audio=60

com.perforce.p4search.auto-speech.timeout=1200
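
Because these settings are plain key=value properties, they can be sanity-checked with a short script before use. The Python sketch below is one possible approach, not a P4 Search tool: parse_properties and check_auto_speech are hypothetical helpers, and the rules they enforce (model and host are required, the threshold lies between 0 and 1, the second-based limits are positive whole numbers) are assumptions drawn from the examples on this page.

```python
PREFIX = "com.perforce.p4search.auto-speech."


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_auto_speech(props):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # Assumption: every model needs at least these two properties.
    for name in ("model", "host"):
        if not props.get(PREFIX + name):
            problems.append(f"missing {PREFIX}{name}")
    threshold = props.get(PREFIX + "threshold")
    if threshold is not None:
        try:
            in_range = 0.0 <= float(threshold) <= 1.0
        except ValueError:
            in_range = False
        if not in_range:
            problems.append("threshold must be a number between 0 and 1")
    for name in ("short-audio", "timeout"):
        value = props.get(PREFIX + name)
        if value is None:
            continue
        is_whole_number = value.isascii() and value.isdigit()
        if not is_whole_number or int(value) <= 0:
            problems.append(f"{name} must be a positive whole number of seconds")
    return problems


sample = """
com.perforce.p4search.auto-speech.model=GoogleSpeechModel
com.perforce.p4search.auto-speech.host=https://speech.googleapis.com
com.perforce.p4search.auto-speech.lang=en
com.perforce.p4search.auto-speech.threshold=0.6
com.perforce.p4search.auto-speech.short-audio=60
com.perforce.p4search.auto-speech.timeout=1200
"""

print(check_auto_speech(parse_properties(sample)))
```

For the sample above, the script prints an empty list. A value such as threshold=60 would be reported, which catches the common mistake of entering a percentage instead of a fraction.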

Set up auto-speech recognition with Whisper Speech

  1. Run Whisper as a service in a Docker container. For example, you can use this Docker image:

    https://hub.docker.com/r/onerahmet/openai-whisper-asr-webservice

  2. When the Docker container is running, set the auto-speech model to WhisperSpeechModel:

    com.perforce.p4search.auto-speech.model=WhisperSpeechModel

  3. Specify the auto-speech recognition service hostname using the hostname and port number of the Whisper Docker container. For example:

    com.perforce.p4search.auto-speech.host=http://localhost:9000

  4. (Optional) Whisper has built-in language detection, but you can specify the language of your audio and video files explicitly. For example:

    com.perforce.p4search.auto-speech.lang=en

  5. Set the short audio transcription limit (in seconds). For example:

    com.perforce.p4search.auto-speech.short-audio=60

  6. Specify the name of the cloud storage bucket or container where the large speech files are stored. For example:

    com.perforce.p4search.auto-speech.bucket=cloud_storage_bucket_name

  7. Specify the timeout limit for short audio transcription processing from the cloud services (in seconds). For example:

    com.perforce.p4search.auto-speech.timeout=1200

All the configurations together:

com.perforce.p4search.auto-speech.model=WhisperSpeechModel

com.perforce.p4search.auto-speech.host=http://localhost:9000

com.perforce.p4search.auto-speech.lang=en

com.perforce.p4search.auto-speech.short-audio=60

com.perforce.p4search.auto-speech.timeout=1200
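
Since the Whisper service runs in a container that you manage, it can be useful to confirm that something is listening at the configured host before pointing P4 Search at it. The Python sketch below only tests that a TCP connection can be opened; it does not verify that the listener is actually Whisper. The host_and_port and is_reachable helpers are hypothetical names for this example, and the fallback ports (80 for http, 443 for https) are an assumption for host values that omit a port.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(host_property):
    """Split a host value such as http://localhost:9000 into (hostname, port)."""
    parts = urlsplit(host_property)
    if not parts.hostname:
        raise ValueError(f"not a URL with a hostname: {host_property!r}")
    # Assumption: fall back to the scheme's usual port when none is given.
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def is_reachable(host_property, timeout=3.0):
    """Return True if a TCP connection to the host and port succeeds."""
    hostname, port = host_and_port(host_property)
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Uses the example host value from this page.
    print(is_reachable("http://localhost:9000"))
```

If the check returns False, confirm that the container is running and that its port is published to the machine where P4 Search runs.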