    Running Salesforce App using Voice command – Speech-To-Text API

    By Dhanik Lal Sahni | August 12, 2019 (Updated: June 30, 2023) | 12 Comments | 5 Mins Read

    Speech-To-Text API

    In the previous post, I explained the Text-to-Speech feature of the Web Speech API. In this post, I will cover the Speech-To-Text feature of the same API.

    We will create a demo Lightning component. This component will take a voice command and open the matching Salesforce object record.
    Voice commands can be integrated using many APIs. Below are some important APIs that can be used for speech recognition.

    • Web Speech API
    • Google Speech-To-Text
    • Microsoft Cognitive Services
    • Dialogflow
    • IBM Watson
    • Speechmatics

    We will use the Web Speech API for speech recognition. This API uses the browser’s audio stream to convert speech into text.

    SpeechRecognition: This is the speech recognition interface, available on the browser’s window object. It is exposed as SpeechRecognition in Firefox and as webkitSpeechRecognition in Chrome.

    The code below sets the recognition constructor on window.SpeechRecognition:

    window.SpeechRecognition = window.webkitSpeechRecognition || window.SpeechRecognition;

    After setting the constructor, create a SpeechRecognition object using window.SpeechRecognition():

    const recognition = new window.SpeechRecognition();
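    The constructor lookup above can be sketched as a small helper. The function name and the mock window objects below are purely illustrative, so the pattern can be shown outside a browser:

    ```javascript
    // Sketch: resolve the SpeechRecognition constructor from a window-like
    // object, preferring the prefixed Chrome name (as the post's code does).
    // In a real component, `win` would be the browser's `window`.
    function resolveSpeechRecognition(win) {
        return win.webkitSpeechRecognition || win.SpeechRecognition || null;
    }

    // Hypothetical constructors standing in for the real browser APIs:
    function StandardCtor() {}
    function PrefixedCtor() {}

    const firefoxLike = { SpeechRecognition: StandardCtor };
    const chromeLike = { webkitSpeechRecognition: PrefixedCtor };
    const unsupported = {};

    console.log(resolveSpeechRecognition(firefoxLike) === StandardCtor); // true
    console.log(resolveSpeechRecognition(chromeLike) === PrefixedCtor);  // true
    console.log(resolveSpeechRecognition(unsupported) === null);         // true
    ```

    Returning null for the unsupported case lets the caller bail out cleanly instead of calling `new` on undefined.
    
    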

    This recognition object has many properties, methods and event handlers.
    Methods:

    • abort() - Stops the speech recognition service from listening to incoming audio.
    • start() - Starts the speech recognition service listening to incoming audio, with intent to recognize grammars associated with the current SpeechRecognition.
    • stop() - Stops the speech recognition service from listening to incoming audio.

    Properties:

    • grammars - Gets and sets a collection of SpeechGrammar objects that represent the grammars understood by the current SpeechRecognition.
    • lang - Gets and sets the language of the current SpeechRecognition.
    • interimResults - Controls whether interim results should be returned (true) or not (false). The default value is false.
    • maxAlternatives - Sets the maximum number of SpeechRecognitionAlternatives provided per result. The default value is 1.
    • serviceURI - Specifies the location of the speech recognition service used by the current SpeechRecognition to handle the actual recognition.

    Events:

    • onstart - Fired when the speech recognition service has begun listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.
    • onaudiostart - Fired when the user agent has started to capture audio.
    • onaudioend - Fired when the user agent has finished capturing audio.
    • onend - Fired when the speech recognition service has disconnected.
    • onerror - Fired when a speech recognition error occurs.
    • onnomatch - Fired when the speech recognition service returns a final result with no significant recognition.
    • onresult - Fired when the speech recognition service returns a result: a word or phrase has been positively recognized and communicated back to the app.
    • onsoundstart - Fired when any sound (recognisable speech or not) has been detected.
    • onsoundend - Fired when any sound (recognisable speech or not) has stopped being detected.
    • onspeechstart - Fired when sound recognised by the speech recognition service as speech has been detected.
    • onspeechend - Fired when speech recognised by the speech recognition service has stopped being detected.
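    The handlers above can be wired up in a small framework-free sketch. The function name, the mock recognition object, and the fake event shape below are my own stand-ins so the flow can run outside a browser; in the component, a real SpeechRecognition instance takes their place:

    ```javascript
    // Sketch: attach the handlers this post uses and forward the latest
    // transcript to a callback.
    function wireRecognition(recognition, onTranscript, onFailure) {
        recognition.onresult = (event) => {
            // The newest result is last in the list; take its top alternative.
            const latest = event.results[event.results.length - 1][0];
            onTranscript(latest.transcript);
        };
        recognition.onerror = (event) => onFailure(event.error);
        recognition.onend = () => { /* service disconnected; restart if needed */ };
    }

    // Mock recognition object driven by a fake result event (illustrative only):
    const mockRecognition = {};
    const heard = [];
    wireRecognition(mockRecognition, (t) => heard.push(t), (e) => heard.push('error:' + e));

    mockRecognition.onresult({
        results: [[{ transcript: 'open account of dhanik' }]]
    });
    console.log(heard[0]); // "open account of dhanik"
    ```

    Keeping the wiring in one function makes it easy to unit-test the transcript handling with a mock, without a microphone.
    
    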

    Let us create a voice app that opens a record page based on what we speak, using the above methods, properties and events.

    voiceCommand.cmp

    <aura:component controller="VoiceCommandController" implements="force:appHostable,flexipage:availableForAllPageTypes,flexipage:availableForRecordHome,force:hasRecordId,forceCommunity:availableForAllPageTypes,force:lightningQuickAction" access="global" >
        <aura:attribute name="value" type="string" default=""></aura:attribute> 
        <lightning:card title="Speech-to-Text">
            <lightning:textarea aura:id="speechText1" value="{!v.value}"></lightning:textarea>
            <lightning:button variant="success" label="Click to Speak" title="Speak" onclick="{! c.handleSpeechText }"/>
        </lightning:card>   
    </aura:component>

    voiceCommandController.js

    ({
        handleSpeechText: function(component, event, helper) {
            // Chrome exposes the prefixed constructor, Firefox the standard one
            window.SpeechRecognition = window.webkitSpeechRecognition || window.SpeechRecognition;

            // If neither constructor exists, the property is undefined: bail out
            if (!window.SpeechRecognition) {
                console.error('speech not supported');
                return;
            }
            console.log('supported speech');

            const recognition = new window.SpeechRecognition();
            recognition.lang = 'en-IN';
            recognition.continuous = true;
            recognition.onresult = (event) => {
                var commandText = event.results[event.results.length - 1][0].transcript;
                component.set("v.value", commandText);
                var commands = commandText.split(' ');
                // Expect "open {object} for/of {name}", i.e. at least four terms
                if (commands.length >= 4) {
                    var obj = commands[1];
                    var condition = commands[3];
                    helper.getRecords(component, obj, condition);
                }
            };
            recognition.start();
        }
    });

    event.results[event.results.length - 1][0].transcript gives the voice transcript. We then split that transcript into terms that tell Salesforce which record to open.

    We will give commands in the format “open {object} for/of {name}”, for example “open account of dhanik” or “open lead for dhanik”. Each transcript has two important terms: the object name and the data value (here, account and dhanik). We split the transcript to get them using commandText.split(' ');.
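    This “open {object} for/of {name}” convention can be isolated into a small, testable parser. The function name and the null-on-mismatch behaviour below are my own choices for illustration, not part of the original component:

    ```javascript
    // Sketch: parse commands of the form "open {object} for/of {name}".
    // Returns null when the transcript does not match the expected shape.
    function parseCommand(transcript) {
        const commands = transcript.trim().toLowerCase().split(/\s+/);
        // Expect at least: "open", object, connector ("for"/"of"), name.
        if (commands.length < 4 || commands[0] !== 'open') {
            return null;
        }
        return {
            objectName: commands[1],
            // Everything after the connector is the record name,
            // so multi-word names like "acme corp" survive.
            recordName: commands.slice(3).join(' ')
        };
    }

    console.log(parseCommand('open account of dhanik'));
    // → { objectName: 'account', recordName: 'dhanik' }
    console.log(parseCommand('hello world')); // → null
    ```

    Pulling the parsing out of the onresult handler lets you unit-test it directly, without any speech input.
    
    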

    voiceCommandHelper.js

    The Apex method is called to get a specific record id. If multiple records match, it returns the first one. The application then navigates to that record.

    ({
        getRecords : function(cmp, objName, name) {
            // create a one-time use instance of the server-side action
            var action = cmp.get("c.getRecord");
            action.setParams({
                objectName: objName,
                names: name
            });

            // Create a callback that is executed after
            // the server-side action returns
            action.setCallback(this, function(response) {
                var state = response.getState();
                if (state === "SUCCESS") {
                    var data = response.getReturnValue();
                    if (data) {
                        var navEvt = $A.get("e.force:navigateToSObject");
                        navEvt.setParams({
                            "recordId": data.Id
                        });
                        navEvt.fire();
                    } else {
                        console.log("No matching record found");
                    }
                } else if (state === "INCOMPLETE") {
                    // offline or the server did not respond
                } else if (state === "ERROR") {
                    var errors = response.getError();
                    if (errors && errors[0] && errors[0].message) {
                        console.log("Error message: " + errors[0].message);
                    } else {
                        console.log("Unknown error");
                    }
                }
            });
            $A.enqueueAction(action);
        }
    })
    

    The above helper calls an Apex method to get the record id of the first matching record. This can be changed based on your requirement.

    public class VoiceCommandController {
        @AuraEnabled
        public static sObject getRecord(String objectName, String names) {
            String name = '%' + String.escapeSingleQuotes(names) + '%';
            // objectName comes from user speech; escape it before building dynamic SOQL
            String soql = 'SELECT Id FROM ' + String.escapeSingleQuotes(objectName) + ' WHERE Name LIKE :name';
            List<sObject> sobjList = Database.query(soql);
            if (sobjList.size() > 0) {
                return sobjList[0];
            }
            return null;
        }
    }

    Demo Time

    The Speech Recognition API can be useful for data entry, record navigation and other voice commands. We can also create applications that capture instant transcripts.

    12 Comments

    1. Gopal Giri on October 23, 2019 5:31 pm

      Hello DHANIK ,
      I am Facing API version issues while saving code because of web api in lightning component design .
      can you please let me know how to implement it in visual force page and lightning component using correct version

      • Dhanik Lal Sahni on October 31, 2019 2:36 am

        You can get lightning code in Visual Code from org. Change API version in Visual code and push change.It will work.

    2. SUCHARITA MONDAL on November 9, 2019 9:40 pm

      Hi Dhanik,
      I want to use Einstein Voice for something similar. Can you guide me on this that how should I approach to this.

      Thanks,

      • Dhanik Lal Sahni on November 11, 2019 9:22 pm

        Sucharita, We have implemented that using Alexa with Einstein Bots. You can also try with that.

    3. Harikesh on February 4, 2020 10:48 am

      Hi Dhanak,

      I am getting the below error if I try to use the similar code in lwc
      window.SpeechRecognition || window.webkitSpeechrecognition) is not a constructor
      Can you guide me on this.

      Thanks

      • Dhanik Lal Sahni on February 4, 2020 2:37 pm

        You can not use window in LWC. In LWC, better you use Google Speech-To-Text api or other API mentioned in post.

    4. Nevin on May 8, 2020 1:03 pm

      How to trigger the functionality on a keyword like “Hey Einstein” instead of manually clicking button to activate?

      • Dhanik Lal Sahni on May 8, 2020 5:15 pm

        Hey Nevin,

        Run method handleSpeechText in component load. Use init handler for this.

        Thank You,
        Dhanik Sahni
        http://salesforcecodex.com/

    5. Jayant on November 25, 2020 11:08 am

      Hi,
      I’m getting Invalid SecureWindow API, webkitSpeechRecognition was blacklisted in LC.
      Any idea?
      Thanks,
      Jayant

      • Dhanik Lal Sahni on November 25, 2020 1:26 pm

        Have you changed LC version to 39? This API only work till 39 after that it is blocked by locker service.

        Please check with LC version 39.

        Thank You,
        Dhanik

    6. Harshal on September 14, 2021 6:58 pm

      I have changed the LC API to 39 but it says,
      This page has an error. You might just need to refresh it.
      Action failed: c:helloWorld$controller$handleClick [window.SpeechRecognition is not a constructor]
      Failing descriptor: {c:helloWorld$controller$handleClick}

      • Dhanik Lal Sahni on September 15, 2021 12:06 pm

        Hello Harshal,

        Please confirm once again, your LC is updated to version 39 or not. This error will only throw when API is not set to version 39.

        Thank You,
        Dhanik
