Create OCR App using Salesforce Einstein OCR API

by Dhanik Lal Sahni June 26, 2020

Organizations increasingly automate repetitive work such as manually entering records from printed forms. Typical sources include application forms, insurance forms, doctors' prescription forms, examination forms, and business cards. This post explains how to build an OCR app that uses Salesforce Einstein to extract text from images and populate it into Salesforce objects.

Many API services are available for extracting text from images, and they give almost 95% accuracy. Refer to my other post related to this service.

Salesforce announced the Einstein OCR (Optical Character Recognition) service in April 2019. This API is now available for use.

Einstein Optical Character Recognition (OCR) leverages computer vision to analyze documents and extract relevant information, making repetitive tasks like data entry more efficient.

Let us integrate Einstein OCR into Salesforce to extract form data. The integration takes the following steps:

Create an account in Einstein Platform Services
Create a Private Key and Generate Token
Call Einstein OCR API from Apex
Extract image data in the Case object

1. Create an account in Einstein Platform Services

To consume the Einstein OCR API, first create an account at https://api.einstein.ai/signup. A confirmation email is sent to the address you provide. Confirm it to start working with OCR.

During registration you are asked to download a key file. Download it, because it is used to generate tokens. The file is saved as einstein_platform.pem. Upload this file to Salesforce Files. Below is a screenshot of the file record for the key file.

You can also follow the steps described at Einstein Vision and Language.

2. Create a Private Key and Generate Token

To call an external API from Apex, we need an API token to authenticate requests. The Einstein OCR API requires a valid JWT token, which is generated from the key file mentioned above.

A token can also be generated online at https://api.einstein.ai/token, but that does not work when we call the API from Salesforce. We have to generate tokens at runtime before calling the OCR API. We will use the https://api.einstein.ai/v2/oauth2/token endpoint to generate tokens from Apex.

Apex Class for generating Token

Call EinsteinController.getAccessToken() from Apex to generate a token before calling the API.
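The token exchange can be sketched roughly as follows. This is a minimal sketch and not the EinsteinController class from this post: the class name EinsteinTokenSketch, the file title 'einstein_platform', the issuer value 'developer.force.com', and the one-hour expiry are all assumptions, and it assumes Crypto.sign accepts the key body once the PEM markers are stripped.

```apex
// Hypothetical sketch; names and claim values are assumptions, not the post's own code.
public with sharing class EinsteinTokenSketch {
    private static final String TOKEN_URL = 'https://api.einstein.ai/v2/oauth2/token';

    // Base64url-encode without padding, as JWT segments require.
    public static String base64Url(Blob input) {
        return EncodingUtil.base64Encode(input)
            .replace('+', '-')
            .replace('/', '_')
            .replace('=', '');
    }

    // Build the unsigned "header.payload" part of the JWT.
    public static String unsignedJwt(String email, Long expiresAtSeconds) {
        String header = '{"alg":"RS256","typ":"JWT"}';
        Map<String, Object> claims = new Map<String, Object>{
            'iss' => 'developer.force.com', // assumed issuer value
            'sub' => email,                 // email used to sign up for Einstein
            'aud' => TOKEN_URL,
            'exp' => expiresAtSeconds
        };
        return base64Url(Blob.valueOf(header)) + '.'
            + base64Url(Blob.valueOf(JSON.serialize(claims)));
    }

    // Sign the JWT with the uploaded key and exchange it for an access token.
    public static String getAccessToken(String email) {
        // Assumes the key file was uploaded with the title 'einstein_platform'.
        ContentVersion keyFile = [
            SELECT VersionData FROM ContentVersion
            WHERE Title = 'einstein_platform' AND IsLatest = true
            LIMIT 1
        ];
        String key = keyFile.VersionData.toString()
            .replace('-----BEGIN RSA PRIVATE KEY-----', '')
            .replace('-----END RSA PRIVATE KEY-----', '')
            .replaceAll('\\s', '');

        Long expiresAt = DateTime.now().getTime() / 1000 + 3600;
        String unsigned = unsignedJwt(email, expiresAt);
        Blob signature = Crypto.sign('RSA-SHA256', Blob.valueOf(unsigned),
            EncodingUtil.base64Decode(key));
        String assertion = unsigned + '.' + base64Url(signature);

        HttpRequest req = new HttpRequest();
        req.setEndpoint(TOKEN_URL);
        req.setMethod('POST');
        req.setHeader('Content-Type', 'application/x-www-form-urlencoded');
        req.setBody('grant_type='
            + EncodingUtil.urlEncode('urn:ietf:params:oauth:grant-type:jwt-bearer', 'UTF-8')
            + '&assertion=' + EncodingUtil.urlEncode(assertion, 'UTF-8'));
        HttpResponse res = new Http().send(req);
        Map<String, Object> body = (Map<String, Object>) JSON.deserializeUntyped(res.getBody());
        return (String) body.get('access_token');
    }
}
```

The two helper methods are kept separate from the callout so that the encoding can be checked without a network call.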

3. Call Einstein OCR API from Apex

We now have an API token and the API URL https://api.einstein.ai/v2/vision/ocr for extracting text from images. Let us call the API from Apex with the required request data.

Request Details:

sampleLocation: The image URL. We can generate a downloadable URL for an uploaded image; refer to Extract License Plate Number from Image In Salesforce for how to create one.
modelId: Defines which type of text to extract from the image, such as tabular data or a business card. The value can be OCRModel (for unstructured data) or tabulatev2 (for tabular data).
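One possible way to produce the downloadable URL that sampleLocation needs is a ContentDistribution record for the uploaded file. The sketch below is an illustration only, not the code from the referenced post: the class name PublicLinkSketch, the link name, and the chosen preference flags are assumptions, and it assumes public links are enabled in the org.

```apex
// Hypothetical sketch; class name, link name, and preference choices are assumptions.
public with sharing class PublicLinkSketch {
    // Build (but do not insert) a public link record for one file version.
    public static ContentDistribution buildLink(Id contentVersionId, String name) {
        ContentDistribution dist = new ContentDistribution();
        dist.ContentVersionId = contentVersionId;
        dist.Name = name;
        dist.PreferencesAllowViewInBrowser = true;
        dist.PreferencesAllowOriginalDownload = true;
        dist.PreferencesNotifyOnVisit = false;
        return dist;
    }

    // Insert the link and read back the download URL that Salesforce generates.
    public static String createDownloadUrl(Id contentVersionId) {
        ContentDistribution dist = buildLink(contentVersionId, 'OCR image link');
        insert dist;
        return [
            SELECT ContentDownloadUrl FROM ContentDistribution WHERE Id = :dist.Id
        ].ContentDownloadUrl;
    }
}
```

The value returned by createDownloadUrl would then be passed as sampleLocation.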

The Einstein OCR API is called with a multipart/form-data request, and the request parameters are passed in the body as a Blob.

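// form64 is the base64-encoded multipart form body; req is the HttpRequest being prepared
Blob formBlob = EncodingUtil.base64Decode(form64);
String contentLength = String.valueOf(formBlob.size());
req.setHeader('Content-Length', contentLength);
req.setBodyAsBlob(formBlob);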

Apex Code for calling API
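A minimal sketch of such a call is shown here, under stated assumptions. The class name EinsteinOcrSketch, the boundary string, and the timeout are hypothetical. Because both request parameters are plain text, this sketch sends the multipart body as a String with setBody rather than the Blob approach described above.

```apex
// Hypothetical sketch; not the post's original class.
public with sharing class EinsteinOcrSketch {
    private static final String OCR_URL = 'https://api.einstein.ai/v2/vision/ocr';

    // Assemble a multipart/form-data body from text-only fields.
    public static String multipartBody(String boundary, Map<String, String> fields) {
        String body = '';
        for (String name : fields.keySet()) {
            body += '--' + boundary + '\r\n'
                + 'Content-Disposition: form-data; name="' + name + '"\r\n\r\n'
                + fields.get(name) + '\r\n';
        }
        return body + '--' + boundary + '--\r\n';
    }

    // Send the image URL to the OCR endpoint and return the raw JSON response.
    public static String extractText(String imageUrl, String token, String modelId) {
        String boundary = 'ocrSketchBoundary' + DateTime.now().getTime();
        HttpRequest req = new HttpRequest();
        req.setEndpoint(OCR_URL);
        req.setMethod('POST');
        req.setHeader('Authorization', 'Bearer ' + token);
        req.setHeader('Cache-Control', 'no-cache');
        req.setHeader('Content-Type', 'multipart/form-data; boundary=' + boundary);
        req.setBody(multipartBody(boundary, new Map<String, String>{
            'sampleLocation' => imageUrl,
            'modelId' => modelId
        }));
        req.setTimeout(120000); // assumed timeout, in milliseconds
        return new Http().send(req).getBody();
    }
}
```

A call might look like EinsteinOcrSketch.extractText(imageUrl, token, 'OCRModel'), where token comes from the token step and imageUrl is the public download URL of the file.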

Note: Add the API URL (https://api.einstein.ai) to Remote Site Settings, or use a named credential to avoid this.

4. Extract image data in the Case object

We are now ready to consume the API service to extract text from images. For this post, I created a sample image of a form that contains some field information. Using the Einstein OCR API described above, we will extract data from the image and put it in the Case object.

[Image: OCR App using Salesforce Einstein - SalesforceCodex]

We need to extract field data from the image above. Other forms or business cards can be handled the same way. To extract information from the image, we map this information in a mapping object.

Object Creation:

Create a custom object OCRTemplateMapping__c with the fields below.

Field Name	Data Type	Size
Name	Text	100
MinX__c	Number	5
MaxX__c	Number	5
MinY__c	Number	5
MaxY__c	Number	5

Below are sample records for the image above.

If you use a different form, set these X and Y coordinates accordingly. You can check X and Y coordinates at https://yangcha.github.io/iview/iview.html. For my sample image, the X and Y coordinates look like the image below.

Because we extract the image data into the Case object, add the fields below to the Case object.

Field Name	Data Type	Size
FirstName__c	Text	100
LastName__c	Text	100
Email__c	Text	100
Mobile__c	Text	12

The objects are now ready. Let us write Apex code that extracts the relevant image data and puts it in the Case record.
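The mapping step could look something like the following sketch. It is not the post's original code and rests on several assumptions: the class name OcrCaseMapperSketch is hypothetical, each OCRTemplateMapping__c record's Name is assumed to hold the API name of the target Case field, the OCR response is assumed to contain a probabilities list whose entries carry a label and a boundingBox with minX, minY, maxX, and maxY, and a label is assigned to a field when its box lies entirely inside the mapped region.

```apex
// Hypothetical sketch; matching rule and response shape are assumptions.
public with sharing class OcrCaseMapperSketch {
    // Read a numeric coordinate from the untyped JSON bounding box.
    private static Decimal coord(Map<String, Object> box, String key) {
        return Decimal.valueOf(String.valueOf(box.get(key)));
    }

    // Return field API name => extracted text for every region that contains a label.
    public static Map<String, String> mapFields(String ocrJson,
                                                List<OCRTemplateMapping__c> mappings) {
        Map<String, String> values = new Map<String, String>();
        Map<String, Object> root = (Map<String, Object>) JSON.deserializeUntyped(ocrJson);
        for (Object item : (List<Object>) root.get('probabilities')) {
            Map<String, Object> entry = (Map<String, Object>) item;
            Map<String, Object> box = (Map<String, Object>) entry.get('boundingBox');
            for (OCRTemplateMapping__c m : mappings) {
                Boolean inside = coord(box, 'minX') >= m.MinX__c
                    && coord(box, 'maxX') <= m.MaxX__c
                    && coord(box, 'minY') >= m.MinY__c
                    && coord(box, 'maxY') <= m.MaxY__c;
                if (inside) {
                    // Several words can fall in one region; join them in response order.
                    String soFar = values.get(m.Name);
                    String label = (String) entry.get('label');
                    values.put(m.Name, soFar == null ? label : soFar + ' ' + label);
                }
            }
        }
        return values;
    }

    // Copy the mapped values onto a Case; the caller performs the update.
    public static Case applyTo(Case c, Map<String, String> values) {
        for (String fieldName : values.keySet()) {
            c.put(fieldName, values.get(fieldName));
        }
        return c;
    }
}
```

Keeping the parsing separate from the update means all OCR callouts can finish before any DML runs, since Apex does not allow a callout after uncommitted DML in the same transaction.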

Flow to call above Apex Action:

Create a flow that calls the Apex method above. Refer to the post Extract License Plate Number from Image In Salesforce for creating the flow and the content URL.

Button to call Flow:

Add an action button on the Case object that calls the flow above.

Demo App:

References:

Einstein Vision and Language

https://developer.salesforce.com/docs/atlas.en-us.apexcode.meta/apexcode/apex_intro_what_is_apex.htm

61 comments

Dhananjay Kumar August 1, 2020 - 6:11 pm

Amazing write-up!
Very very Useful post to follow and learn as new concept.

Mr Dhanik is also very approachable on call/email, if you have any doubt or stuck in between implementation.

Jarosław Kołodziejczyk September 7, 2020 - 8:21 pm

Is there any way of testing this on a sandbox environment?

Dhanik Lal Sahni September 8, 2020 - 2:03 pm

Yes, we can test that in the sandbox. The steps mentioned will work. Let me know if you face any difficulty in implementing it.

Thank You,
Dhanik

Shayeedha September 9, 2020 - 3:58 pm

Getting this error on click of Action in record Level.

An unhandled fault has occurred in this flow
An unhandled fault has occurred while processing the flow. Please contact your system administrator for more information.

When checked on the Dev Console, this is the Exception I’m getting.
implementation restriction: ContentDocumentLink requires a filter by a single Id on Content Document or LinkedEntityId using the equals operator or multiple Id’s using the IN operator.

Dhanik Lal Sahni September 9, 2020 - 4:23 pm

Hello Shayeedha,
You have to use the WHERE clause in the SOQL, as in the code below, otherwise you will get that error.
List<ContentDocumentLink> links = [SELECT ContentDocumentId, LinkedEntityId FROM ContentDocumentLink WHERE LinkedEntityId = :recodId];

Thank You,
Dhanik

Shayeedha September 9, 2020 - 4:35 pm

Yes, I have already used the where class and added the same condition as you have stated above.

Dhanik Lal Sahni September 9, 2020 - 6:22 pm

Let us connect to resolve your issue. Check your email.

Thank You,
Dhanik

Dhanik Lal Sahni September 18, 2020 - 6:51 pm

This issue is resolved. The button was not placed in the proper location, so the record Id was not being fetched. An alternative solution was to use a Lightning component instead of a flow to extract the required information.

harish February 17, 2023 - 8:44 pm

Can you tell me how this issue is resolved.. Same error, Query is right may be button placement is wrong. Please help. Thank you.

Dhanik Lal Sahni February 26, 2023 - 7:25 pm

Hello Harish,
Please check that the record Id has an associated attachment. If you cannot find the issue, let us connect over LinkedIn.

Thank You,
Dhanik

Sandhya October 12, 2020 - 4:14 pm

I am uploading 2 attachments with different mappings. It gives an error

Dhanik Lal Sahni October 12, 2020 - 5:38 pm

Hello Sandhya,

What error are you getting? Please share a screenshot of it. We can connect to resolve your issue.

Thank You,
Dhanik

rihan August 28, 2023 - 12:56 pm

its throwing this error when we upload two documents ” FLOW_ELEMENT_ERROR An Apex error occurred: System.CalloutException: You have uncommitted work pending. Please commit or rollback before calling out”

Dhanik Lal Sahni September 3, 2023 - 5:12 pm

Hello Rihan,

This error can occur in getImageText when you have multiple files. The line EinsteinOCR.extractText(imageUrl, token, 'OCRModel') calls the API, and then we update the response details in the object. With multiple files this process repeats, so a callout follows an update. Try updating the response details once at the end, after the content of all files has been received. That way all the API calls are made first, and a single update statement at the end saves all the response details.
Try this; if you are unable to do it, we can connect to resolve your issue.

Thank You,
Dhanik

smriti December 6, 2020 - 9:57 pm

ContentDistribution is not fetching any value. Can you please guide.

smriti December 6, 2020 - 10:25 pm

@dhaniksahni thanks a lot for quick help.

Dhanik Lal Sahni December 7, 2020 - 6:15 pm

Glad to help you, Smriti.

Thank You,
Dhanik

Dhanik Lal Sahni December 7, 2020 - 6:14 pm

Hey Smriti,

Please check this link

Thank You,
Dhanik

Christian December 30, 2020 - 12:01 am

Hi Dhanik,

Thank you for the write-up. I’m having challenges with the extracted values populating correctly on the case. I think that it’s a problem with my flow, but I’m not sure where I have gone wrong. Would you be able to help me?

Dhanik Lal Sahni December 31, 2020 - 4:21 pm

As per our email communication, you need to change the email in EinsteinController. Instead of salesforcecodex@gmail.com, use the email address you used to set up Einstein.

Thank You,
Dhanik

arshad January 11, 2021 - 11:41 am

is it possible to create a model that can predict labels in invoices/bills or bank statements in einstein ocr?

Dhanik Lal Sahni January 11, 2021 - 3:48 pm

Yes, it can predict if we specify correct index of label.

Thank You,
Dhanik

Sunil February 4, 2021 - 12:43 pm

Hi Dhanik,

ContentDistribution is not fetching any value. Checked the Line provided above to Smriti. That Feature is Enabled in the Org and all options are checked. What am I missing? Please guide.

Sunil February 4, 2021 - 12:57 pm

It worked! Thank you! I had to Enable public link.

Raksha H R June 1, 2021 - 12:38 pm

Dear Dhanik,
How to get einstein RSA private key for a sandbox?
I get below error:

“An error occurred while serving your request
It looks like this org doesn’t allow access to the Einstein.ai connected app. Contact your Salesforce admin to allow access, or sign up with a different org.”

It says oauth error. Please help on this

Dhanik Lal Sahni June 5, 2021 - 7:28 pm

Hello Raksha,

It should work in all orgs. Please check the possible solutions at https://metamind.readme.io/page/troubleshooting

If that does not work, please ping me on LinkedIn or in the Telegram group. We can connect and resolve the issue.

Thank You,
Dhanik

Raksha H R June 1, 2021 - 6:15 pm

Dear Dhanik,
Thanks for explaining in detail. How to establish the connection with sandbox and generate RSA private key?

Regards,
Raksha

Viresh Patnaik June 30, 2021 - 6:18 pm

Hi Dhanik,

I am getting the following error–> “Error Occurred: An Apex error occurred: System.QueryException: Implementation restriction: ContentDocumentLink requires a filter by a single Id on ContentDocumentId or LinkedEntityId using the equals operator or multiple Id’s using the IN operator.”
Please help me out.

Regards.

Dhanik Lal Sahni July 3, 2021 - 1:33 pm

Hello Viresh,

Have you checked that you are getting recodId to filter the query? Check how many records that SOQL returns.

Thank You,
Dhanik

Viresh July 5, 2021 - 4:00 pm

Hi Dhanik,

I had issue with button placement, got it resolved, data is populated but the issue now i am facing is ; if the name is VIRESH PATNAIK, it is populating in the field as PATNAIKVIRESH.

Regards,
Viresh

Dhanik Lal Sahni July 8, 2021 - 5:05 pm

Hello Viresh, instead of merging here, add separate columns and then create a formula field.

Thank You,
Dhanik

Harish February 17, 2023 - 8:17 pm

Hi Dhanik,

Regards.
Same error as Viresh’s. If its resolved for him, Can you explain how to resolve the error.

Dhanik Lal Sahni February 26, 2023 - 7:23 pm

Hello Harish,
This issue appears because the SOQL cannot find the attachment record. Please check that you are passing the correct record Id to get the attachment.

Thank You,
Dhanik

Christian September 7, 2021 - 11:39 pm

Dhanik,

Thank you again for the write-up. Any guidance on the test classes that I’ll need in order to push this to production?

Dhanik Lal Sahni September 8, 2021 - 10:08 pm

Hey Christian,

What kind of guidance or support do you need for the test class?

Thank You,
Dhanik

Dhanik Lal Sahni November 6, 2021 - 5:16 am

Hello Matheus,
Have you tried using mock test class instead of using @isTest(SeeAllData=true)? It should work in this situation.

Thank You,
Dhanik

ram March 24, 2022 - 6:42 pm

@Dhanik , can you update with test classes to this solution. I set up solution and working properly extracting PDf Data.
But got validation proleems with no test data

Dhanik Lal Sahni March 25, 2022 - 8:55 pm

Hello Ram,

Can you share your test code that is not working so that I can help you?

Thank You,
Dhanik

Raja April 12, 2022 - 12:35 pm

Hi Dhanik,

This is helpful for document data extraction . As Salesforce introduced the “Intelligent Form Reader” for document data extraction, which one is the best appraoch? And do you have any samples for “Intelligent Form Reader”

Dhanik Lal Sahni April 18, 2022 - 11:15 am

Hello Raja,

Intelligent Form Reader is used especially for medical docs in the health cloud. So if your requirement is related to that then you can go ahead with Intelligent Form Reader, otherwise, you can proceed with the Einstein OCR API or any other appexchange app.

Thank You
Dhanik

Sai November 28, 2022 - 11:02 am

Hi Dhanik,
whenever we extracting the data from image to Text into Case , Iam getting the error
1. Image Url == Null
2. ContentDocument Id ={}

Dhanik Lal Sahni November 29, 2022 - 7:41 pm

Hello Sai,

Please check that you are getting the image URL that you use for extraction.

Thank You,
Dhanik

Anamika Shinde February 27, 2023 - 4:11 pm

Hello Dhanik

we are not receiving a token in response it return “403” error. can you please help me here.

Thanks
Anamika

Dhanik Lal Sahni February 28, 2023 - 12:14 pm

Hello Anamika,

It looks like you are not passing valid request information. Please check that you are passing the correct API details. If it is still not resolved, please ping me on LinkedIn.

Thank You,
Dhanik

Abishek Datta Porandla April 4, 2023 - 12:42 pm

Hi Dhanik,

We tried to implement it on the Partner Community site and it is throwing an error. When we checked the debug logs we observe that the user has no access to the einstein platform file which is causing the error. Can you please guide us here?

Regards & Thanks,
Abishek.

Dhanik Lal Sahni April 13, 2023 - 6:40 pm

Hello Abishek,

If your issue is not resolved, let us connect on LinkedIn.

Thank You,
Dhanik

Ashish Sakhare May 25, 2023 - 1:28 pm

hii dhanik ,
do we have to upload those jpg images in notes and attachment of case object?? and my another question is .. if instead of image we have to go with pdf.. what changes will it needed..

Dhanik Lal Sahni May 27, 2023 - 10:16 am

Hello Ashish,

You can also use notes/attachments for this. The only thing to take care of is generating a public URL for the API. Yes, you can use PDFs for this as well. Please refer to the documentation for this.

Thank You,
Dhanik

Ashish Sakhare May 26, 2023 - 6:58 am

Hii Dhanik,
u gave us reference for generating image url.. but its of license plate recognition.. so can we use License plate recognition api for this example also.. or we have to search for any other api on rapidapi.com..

Dhanik Lal Sahni May 27, 2023 - 10:10 am

Hello Ashish,

You need the code for the triggers ContentVersionExternalLink and ContentDocumentLinkTrigger and the trigger handler ContentTriggerHandler to generate a public URL for any uploaded image. We pass the public URL to the OCR API for processing.

Thank You,
Dhanik

Ashish Sakhare May 28, 2023 - 8:17 am

thanks dhanik .. it worked for image..

Ashish Sakhare May 28, 2023 - 8:13 am

hii dhanik ,,
i want to use this ocr for pdf also.. what changes shall i need to make

Dhanik Lal Sahni June 12, 2023 - 11:31 am

Hello Ashish,

Please check Detect Text in PDFs with Einstein OCR (Generally Available)

Thank You,
Dhanik

Ashish Sakhare May 28, 2023 - 8:24 am

Hii Dhanik..
I want to perform this functionality for pdf.. but when i upload pdf, i get downloadable pdf url and json data but fields in case objects are not updated.. So can u tell me what changes shall i need to make so that it work for pdf as well… Since i got requirement for pdf and storing data in case object..

thanks
Ashish Sakhare

Dhanik Lal Sahni September 3, 2023 - 5:38 pm

Hello Ashish,
What issue are you facing with the update? Are you getting values for the fields? Let us connect if you still want to discuss this issue.

Thank You,
Dhanik

namratha July 21, 2023 - 1:17 pm

Hi I am unable to get the contentDocument ID tried the solutions mentioned above but didn’t work

Dhanik Lal Sahni July 23, 2023 - 9:56 pm

Hello Namratha,
Please check our other post, Generate Public Link for Salesforce file, to resolve this issue.

Thank You,
Dhanik
