Recently we had a requirement to extract text from audio or video files. These recordings were mostly customer calls with agents. Once we have the text of an audio file, we can review those conversations and check how agents are talking with customers, which helps improve customer satisfaction.
There are many solutions available for converting audio or video files to text. Some of the major solutions are below.
This post uses the Google Speech API to transcribe an audio file into text. The audio file is attached to a case record. The requirement is to transcribe the attached audio file and add the resulting text as a case comment on the same record.
To meet this requirement, we have to create two Lightning components. The first component gets an access token from the Google authorization service, and the second component uses this access token to generate text from the audio file.
Prerequisites for this solution:
- Create a user in Google Cloud at https://cloud.google.com/speech-to-text/
- Create a project in Google Cloud and enable the Cloud Speech-to-Text API at https://console.cloud.google.com/apis/library
- Authorize the domain in the OAuth consent screen. Use the Lightning URL of your org without the https:// prefix.
- Create a credential for this API. We need to set the redirect URL, which will be the URL of the Lightning component used for authenticating and getting the access token.
As mentioned above, to get a transcription from audio we have to create two Lightning components.
- First Lightning Component to get Access Token from Google Authentication Service
- Second Lightning Component to get Transcription from Speech API
1. Get Access Token from Google Authentication Service
Let us authenticate the Salesforce app and get an access token from the Google authentication service. The steps below are used to get the access token:
- Create an Apex class to generate the URL for getting the access token
- Create a Lightning component to get the access token
- Create custom metadata to store the access token for later use
- Make the Lightning component available as a tab
a. Create Apex class to generate URL for getting access token
Create the Apex class GoogleAuthService to get the authentication URL. This URL will generate a token after you authenticate using the Google credential created in step #1.
createAuthURL: This Apex method creates the authentication redirect URL.
getAccessToken: This Apex method generates the access token. The token is saved in custom metadata for further use.
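The two methods can be sketched outside Salesforce as follows. This is a minimal illustration in Python, not the Apex of GoogleAuthService, which is not reproduced in this post. The function names and the use of https://oauth2.googleapis.com/token for the code exchange are assumptions based on Google's standard OAuth 2.0 web-server flow. The authorization URL parameters mirror the ones visible in the debug output quoted in the comments below.

```python
from urllib.parse import urlencode

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/auth"
# Assumption: Google's standard OAuth 2.0 token endpoint.
TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"
SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def create_auth_url(client_id, redirect_uri):
    """Build the consent URL the user is redirected to (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "scope": SCOPE,
        "redirect_uri": redirect_uri,  # URL of the Lightning tab
        "access_type": "offline",
    }
    return AUTH_ENDPOINT + "?" + urlencode(params)


def build_token_request(code, client_id, client_secret, redirect_uri):
    """Return (url, form_body) for exchanging the authorization code for a token."""
    body = urlencode({
        "code": code,  # value of the "code" query parameter after the redirect
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,  # must match the one used in the consent URL
        "grant_type": "authorization_code",
    })
    return TOKEN_ENDPOINT, body
```

The form body is sent as an application/x-www-form-urlencoded POST. The JSON response carries the token in its access_token field, which is the value to store in custom metadata.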
b. Create Lightning component to get access token
The GoogleAuthComponent component is created for getting the access token. It uses the GoogleAuthService class described above to generate the token.
c. Create custom metadata to store access token for later use
Create a custom metadata type named GoogleAuthSetting and add a field AccessToken to it. Add the class MetadataService to update the access token value. This class is called from the GoogleAuthService class.
d. Make the Lightning component available as a tab
Create a tab for this Lightning component so that it is easily accessible. The URL of this tab should be added in step 3 of the prerequisites.
2. Get Transcription from Speech API
Let us create another Lightning component. This component takes a case ID and generates transcribed text from any audio file attached to that record. The transcribed text is added as a case comment.
The first Lightning component has already stored the access token in custom metadata, so let us read the token from there to create the transcribed text. The Apex class GoogleSpeechService generates the transcribed text using the https://speech.googleapis.com/v1p1beta1/speech:recognize API.
Add this component to the case page layout.
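The request and response handling for the recognize call can be sketched as below. This is a hedged illustration in Python, not the Apex of GoogleSpeechService. The function names are hypothetical, the config values mirror the code excerpt quoted in the comments below, and the response parsing assumes the documented shape of a recognize response (a list of results, each with ranked alternatives).

```python
import base64
import json

RECOGNIZE_URL = "https://speech.googleapis.com/v1p1beta1/speech:recognize"


def build_recognize_request(file_body, access_token):
    """Return (headers, json_body) for a speech:recognize call on raw audio bytes."""
    payload = {
        # The API expects the audio bytes as a base64 string.
        "audio": {"content": base64.b64encode(file_body).decode("ascii")},
        "config": {
            "encoding": "MP3",
            "sampleRateHertz": 16000,
            "languageCode": "en-US",
            "enableWordTimeOffsets": False,
        },
    }
    headers = {
        "Authorization": "Bearer " + access_token,  # token read from custom metadata
        "Content-Type": "application/json",
    }
    return headers, json.dumps(payload)


def extract_transcript(response_json):
    """Join the top alternative of each result into one string for the case comment."""
    response = json.loads(response_json)
    parts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            parts.append(alternatives[0].get("transcript", "").strip())
    return " ".join(p for p in parts if p)
```

The headers and body are posted to RECOGNIZE_URL. An empty string from extract_transcript means the API returned no results, in which case it is probably better to skip creating the case comment.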
Demo Time :
References:
https://cloud.google.com/speech-to-text
21 Comments
Hi Dhanik,
Excellent work on this project for speech-to-text conversion. I tried to run this and it failed at line 33 of the GoogleSpeechService.apxc class. I think some instructions are missing. Please share them, as this would help many developers in the community. Keep up the good work. Best regards, Iftekhar.
AudioData data = new AudioData();
AudioData.Audio audio = new AudioData.Audio();
audio.content = EncodingUtil.base64Encode(file_body);
data.audio = audio;
AudioData.Config config = new AudioData.Config();
config.encoding = 'MP3';
config.sampleRateHertz = 16000;
config.languageCode = 'en-US';
config.enableWordTimeOffsets = false;
data.config = config;
String jsondta = System.JSON.serialize(data);
Hello Ifte,
The AudioData class was missing. I have added that class file. Please check now.
Thank You,
Dhanik
Hi Dhanik,
It’s a nice tool to have, and I really appreciate your help to the dev community. The Google Speech API settings are not well defined on this page. An excellent example that defines everything in detail is at this link: http://santanuboral.blogspot.com/2020/01/LWC-FileUpload-AWS.html?m=1 . When it is convenient for you, please provide full details.
Best regards,
Iftekhar Uddin.
I am getting 17 errors on GoogleSpeechService.apxc.
Hello Linda, As per your latest comment, many of these are resolved. For the remaining 2 errors, I have replied on your other question.
Thank You,
Dhanik
I have it down to 2 errors on the GoogleSpeechService:
Invalid type: GoogleAuthSetting__mdt
and
Variable does not exist: mapping
Please advise, I am very much a novice at this.
Thanks in advance.
Hello Linda,
GoogleAuthSetting__mdt is a custom metadata type which you have to create. See section 1.c, Create custom metadata to store access token for later use, for how to create it.
Thank You,
Dhanik
Thank you for your help. I did set it up and have no errors in the code. In the GoogleAuthService, the call to the MetadataService does not seem to be saving the token to the custom metadata, so when I go to transcribe, the log gives:
11:29:10:325 FATAL_ERROR System.QueryException: List has no rows for assignment to SObject
I did a system debug on authuri and get:
11:42:17:010 USER_DEBUG [21]|DEBUG|authuri https://accounts.google.com/o/oauth2/auth?client_id=206717206662-jdhqo2p2cegn574vik31a1skr2pmfl5e.apps.googleusercontent.com&response_type=code&scope=https://www.googleapis.com/auth/cloud-platform&redirect_uri=https%3A%2F%2Fmultco--uat.lightning.force.com%2Flightning%2Fn%2FSpeech_to_Text&access_type=offline
System debug does not return anything on messageBody or response.
Hey Linda,
First check whether you are getting the token or not. If you are getting the token, you can try saving it in a custom setting or object. Let me know if that is not working, and we can connect to resolve your issue.
Thank You,
Dhanik
That would be very helpful. I believe I am getting the token, but cannot prove it. My Google person thinks things are set up correctly.
When would you be available?
Hey Linda, Please ping me on LinkedIn or Telegram for this.
Thank You
Dhanik
Unable to message in either place:
This site can’t be reached
t.me unexpectedly closed the connection.
Try:
Checking the connection
Checking the proxy and the firewall
Running Windows Network Diagnostics
ERR_CONNECTION_CLOSED
Hello Sir, I am also getting the same Error
FATAL_ERROR System.QueryException: List has no rows for assignment to SObject
Hello Saurabh,
As per the error, no record is being retrieved for some filter criteria. Please provide the complete error detail, such as the method and line where it occurs, so we can check where the issue is.
Thank You,
Dhanik
Hello LINDA,
It gives me the same error. Can you tell me the next step to resolve it?
FATAL_ERROR : System.QueryException: List has no rows for assignment to SObject
Hello Linda, Dhanik,
I hope your issue has been resolved. Can you please confirm? I am also facing the same issue:
System.QueryException: List has no rows for assignment to SObject
Saurabh
Hello Saurabh,
As per the error, no record is being retrieved for some filter criteria. Please provide the complete error detail, such as the method and line where it occurs, so we can check where the issue is.
Thank You,
Dhanik
Hello Dhanik Sir,
First of all, thanks for the solution to this very common use case. This is the closest solution I could find after searching hard for a few days. The steps look straightforward, but when I tried to authorise my sandbox’s domain in the OAuth consent screen, I got the error “Must be a top private domain”. I searched extensively but couldn’t find an answer to this. If you can help me with this issue, that would be great. Thanks.
Hello Pranshu,
Please check this post: Must be a top private domain
Thank You,
Dhanik
This Article is Awesome. It’s helped me a lot. Sir, Please keep up your good work.
https://revaalolabs.com/post/best-free-speech-to-text-apis-and-open-source-libraries