Extract Text from Image using Einstein OCR


Dear #Trailblazers,

Welcome back! In this blog post, we are going to learn how to read text from an image using Einstein OCR. Einstein OCR was made available in the Summer '20 release.

So, let’s start. If you missed my earlier post about JWT authentication, please go ahead and Read it Here, because Einstein Platform APIs such as Einstein OCR use JWT for authentication.

If you are new to this blog, please refer to this link to get an Einstein Platform API account.

Once you have the Einstein Platform API account, download the key file and upload it to Salesforce as a File (the code below reads it from ContentVersion by its title, einstein_platform or predictive_services). If you want to learn more about how it works, here is the API Doc for you.

Now, let’s see the parameters required for making the request.

curl -X POST -H "Authorization: Bearer " -F sampleLocation="https://www.publicdomainpictures.net/pictures/240000/velka/emergency-evacuation-route-signpost.jpg" -F modelId="OCRModel" https://api.einstein.ai/v2/vision/ocr

The request above takes the following parameters:

POST: The request we make to the Platform API uses the POST method.
Authorization: The access token, which we can get from here.
sampleLocation: The URL of the image.
sampleBase64Content: The image content, Base64 encoded. Send this instead of sampleLocation when the image is not available at a URL.
modelId: The model to use for reading the text. Accepted values are OCRModel and tabulatev2. Learn more
Endpoint: The service endpoint, which is https://api.einstein.ai/v2/vision/ocr

Because we are sending the image details to the Einstein Platform API, we need to make sure that we send the correct values in the correct format. To send the image, we use multipart/form-data; charset="UTF-8" as the Content-Type for all requests. This format works for both cases, because the image can be sent either as a URL or as Base64 content.

Now that we have discussed all the request parameters, let’s start development.

Below is the code for the HttpFormBuilder class, which prepares the request body with the various parameters.

public class HttpFormBuilder {
    
    private final static string Boundary = '1ff13444ed8140c7a32fc4e6451aa76d';

    public static string GetContentType() {
        return 'multipart/form-data; charset="UTF-8"; boundary="' + Boundary + '"';
    }

    /**
     *  Pad the value with spaces until the base64 encoding is no longer padded.
     */
    private static string SafelyPad(
        string value,
        string valueCrLf64,
        string lineBreaks) {
        string valueCrLf = '';
        blob valueCrLfBlob = null;

        while (valueCrLf64.endsWith('=')) {
            value += ' ';
            valueCrLf = value + lineBreaks;
            valueCrLfBlob = blob.valueOf(valueCrLf);
            valueCrLf64 = EncodingUtil.base64Encode(valueCrLfBlob);
        }

        return valueCrLf64;
    }

    /**
     *  Write a boundary between parameters to the form's body.
     */
    public static string WriteBoundary() {
        string value = '--' + Boundary + '\r\n';
        blob valueBlob = blob.valueOf(value);

        return EncodingUtil.base64Encode(valueBlob);
    }

    /**
     *  Write a boundary at the end of the form's body.
     */
    public static string WriteBoundary(
        EndingType ending) {
        string value = '';

        if (ending == EndingType.Cr) {
            value += '\n';
        } else if (ending == EndingType.None) {
            value += '\r\n';
        }

        value += '--' + Boundary + '--';

        blob valueBlob = blob.valueOf(value);

        return EncodingUtil.base64Encode(valueBlob);
    }

    /**
     *  Write a key-value pair to the form's body.
     */
    public static string WriteBodyParameter(
        string key,
        string value) {
        string contentDisposition = 'Content-Disposition: form-data; name="' + key + '"';
        string contentDispositionCrLf = contentDisposition + '\r\n\r\n';
        blob contentDispositionCrLfBlob = blob.valueOf(contentDispositionCrLf);
        string contentDispositionCrLf64 = EncodingUtil.base64Encode(contentDispositionCrLfBlob);
        string content = SafelyPad(contentDisposition, contentDispositionCrLf64, '\r\n\r\n');
        string valueCrLf = value + '\r\n';
        blob valueCrLfBlob = blob.valueOf(valueCrLf);
        string valueCrLf64 = EncodingUtil.base64Encode(valueCrLfBlob);
        content += SafelyPad(value, valueCrLf64, '\r\n');
        return content;
    }

    /**
     *  Helper enum indicating how a file's base64 padding was replaced.
     */
    public enum EndingType {
        Cr,
        CrLf,
        None
    }
}
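To see what HttpFormBuilder actually produces, here is a minimal sketch you could run as Anonymous Apex, assuming the class above is already deployed in your org. The image URL is a placeholder chosen for illustration, not a real sample. Each helper returns a Base64 fragment; the fragments are concatenated and decoded once at the end, which is the same approach the service class further down takes.

```apex
// Anonymous Apex sketch: build a two-field form with HttpFormBuilder
// and print the decoded multipart body to inspect its layout.
String form64 = '';
form64 += HttpFormBuilder.WriteBoundary();
form64 += HttpFormBuilder.WriteBodyParameter('modelId', 'OCRModel');
form64 += HttpFormBuilder.WriteBoundary();
// Placeholder URL (assumption): replace with a publicly reachable image
form64 += HttpFormBuilder.WriteBodyParameter('sampleLocation', 'https://example.com/sample.png');
form64 += HttpFormBuilder.WriteBoundary(HttpFormBuilder.EndingType.CrLf);

// Decode the concatenated Base64 fragments back into the raw form body
Blob formBlob = EncodingUtil.base64Decode(form64);
String decoded = formBlob.toString();
System.debug(decoded);
```

You may notice a few extra spaces before some line breaks in the output. SafelyPad adds them so that no fragment ends with Base64 "=" padding, which appears to be what keeps the concatenated string decodable in one pass.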

Below is the code for the utility class EinsteinAPIService, which sends the request for all API calls related to the Einstein Platform API. It is a reusable class for the Einstein Platform API.

public class EinsteinAPIService {

    public static FINAL STRING OAUTH_END_POINT = 'https://api.einstein.ai/v2/oauth2/token';
    
    private static String getAccessToken() {
        
        /* Get the key file, downloaded from the Einstein Platform API, from Salesforce Files */
        ContentVersion base64Content = [SELECT Title, VersionData 
                                        FROM ContentVersion
                                        WHERE Title='einstein_platform' OR 
                                        Title='predictive_services'
                                        ORDER BY Title 
                                        LIMIT 1];
        
        String keyContents = base64Content.VersionData.tostring();
        keyContents = keyContents.replace('-----BEGIN RSA PRIVATE KEY-----', '');
        keyContents = keyContents.replace('-----END RSA PRIVATE KEY-----', '');
        keyContents = keyContents.replace('\n', '');
        
        // Get a new token
        JWT jwt = new JWT('RS256');
        // jwt.cert = 'JWTCert'; 
        // Uncomment this if you used a Salesforce certificate to sign up for an Einstein Platform account
        jwt.pkcs8 = keyContents; // Comment this if you are using jwt.cert
        jwt.iss = 'developer.force.com';
        jwt.sub = 'sfdcpanther@gmail.com'; // Replace with the email of your own Einstein Platform account
        jwt.aud =  OAUTH_END_POINT;
        jwt.exp = '3600';
        String access_token = JWTBearerFlow.getAccessToken(OAUTH_END_POINT, jwt);
        return access_token;    
    }
    public static String imageOCR(String endPoint, String sample, String model, boolean isBase64){
        String result = einsteinAPICall(endPoint, sample, model, isBase64);
        return result;
    }
    public static String predictImage(String endPoint, String sample, String model, boolean isBase64){
        String result = einsteinAPICall(endPoint, sample, model, isBase64);
        return result;
    }
    private static String einsteinAPICall(String endPoint, String sample, String model, boolean isBase64) {
        string contentType = HttpFormBuilder.GetContentType();
        String access_token = getAccessToken();
        
        //  Compose the form
        string form64 = '';
		
        form64 += HttpFormBuilder.WriteBoundary();
        form64 += HttpFormBuilder.WriteBodyParameter('modelId', EncodingUtil.urlEncode(model, 'UTF-8'));
        form64 += HttpFormBuilder.WriteBoundary();
        if(isBase64) {
            form64 += HttpFormBuilder.WriteBodyParameter('sampleBase64Content', sample);
        } else {
            form64 += HttpFormBuilder.WriteBodyParameter('sampleLocation', sample);
        }
        form64 += HttpFormBuilder.WriteBoundary(HttpFormBuilder.EndingType.CrLf);

        blob formBlob = EncodingUtil.base64Decode(form64);
        string contentLength = string.valueOf(formBlob.size());
        
        HttpRequest httpRequest = new HttpRequest();

        httpRequest.setBodyAsBlob(formBlob);
        httpRequest.setHeader('Connection', 'keep-alive');
        httpRequest.setHeader('Content-Length', contentLength);
        httpRequest.setHeader('Content-Type', contentType);
        httpRequest.setMethod('POST');
        httpRequest.setTimeout(120000);
        httpRequest.setHeader('Authorization','Bearer ' + access_token);
        httpRequest.setEndpoint(endPoint);

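        Http http = new Http();
        try {
            HTTPResponse res = http.send(httpRequest);
            if (res.getStatusCode() == 200) {
                return res.getBody();
            }
            // Log the failure so that a non-200 response is visible in the debug log
            System.debug(System.LoggingLevel.ERROR, 'Einstein API error ' + res.getStatusCode() + ': ' + res.getBody());
        } catch(System.CalloutException e) {
            // Log the error and fall through to return null, so callers never try to parse a stack trace as JSON
            System.debug(System.LoggingLevel.ERROR, 'ERROR:' + e + ' ' + e.getStackTraceString());
        }
        return null;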
    }
}

Now we have both classes: the utility class and the request builder class. Let’s create the class that actually makes the request.

Before we go ahead, find an image and upload it to an Account record, or to a record of any other object you like.

Here is the code:

public class EinsteinOCRService {
    public static FINAL String  OCR_API         = 'https://api.einstein.ai/v2/vision/ocr';
    public static FINAL String  OCR_MODEL       = 'OCRModel';
    public static FINAL String  OCR_MODEL_TABEL = 'tabulatev2'; // table model, the other accepted modelId value
    
    public static void readTextFromImageByURL(){
        String sample = 'https://i1.wp.com/www.pantherschools.com/wp-content/uploads/2020/07/Day-1.png';
        String result = EinsteinAPIService.imageOCR(OCR_API, sample, OCR_MODEL, false);
        parseResponse(result);
    }
    
    public static void readTextFromImageByBase64(){
        // Replace 0010o00002KIY2SAAX with the Id of the record where you uploaded the file
        List<ContentDocumentLink> contentLink = [SELECT ContentDocumentId, LinkedEntityId  
                                                 FROM ContentDocumentLink where LinkedEntityId ='0010o00002KIY2SAAX'];
        if(!contentLink.isEmpty()){
            // Read the latest version of the first file linked to the record
            ContentVersion content = [SELECT Title,VersionData FROM 
                                      ContentVersion 
                                      where ContentDocumentId =: contentLink.get(0).ContentDocumentId 
                                      AND IsLatest = true
                                      LIMIT 1];
            String sample = EncodingUtil.base64Encode(content.VersionData);
            String result = EinsteinAPIService.imageOCR(OCR_API, sample, OCR_MODEL, true);
            parseResponse(result);
        }
    }
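    private static void parseResponse(String result){
        // The API call can return null or a non-JSON string when the request fails, so do not try to parse that
        if(String.isBlank(result) || !result.trim().startsWith('{')){
            System.debug(System.LoggingLevel.ERROR, 'No valid OCR response received: ' + result);
            return;
        }
        EinsteinOCRResponse response = (EinsteinOCRResponse)System.JSON.deserialize(result, EinsteinOCRResponse.class);
        for(EinsteinOCRResponse.Probabilities prob : response.probabilities){
            System.debug(System.LoggingLevel.DEBUG, prob.label);
        }
    }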
}

The above class contains two methods:

readTextFromImageByURL: reads the text from an image given its URL
readTextFromImageByBase64: reads the text from an image given its Base64 content

Below is the helper class, which is used to store the response.

public class EinsteinOCRResponse {
    public String task;	
    public Probabilities[] probabilities;
    public class Probabilities {
        public Double probability;
        public String label;	
        public BoundingBox boundingBox;
    }
    public class BoundingBox {
        public Integer minX;	
        public Integer minY;	
        public Integer maxX;	
        public Integer maxY;	
    }
}
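To check how this helper class maps onto a response without making a callout, here is a small Anonymous Apex sketch. The JSON string is a hand-written sample shaped after the fields of EinsteinOCRResponse, not a captured API response, so the label, probability, and coordinates are made up for illustration.

```apex
// Hand-written sample payload (assumption): mirrors the fields of EinsteinOCRResponse
String sampleJson = '{"task":"text","probabilities":['
    + '{"probability":0.99,"label":"EXIT",'
    + '"boundingBox":{"minX":10,"minY":20,"maxX":110,"maxY":60}}]}';

// Deserialize into the helper class, exactly as parseResponse does
EinsteinOCRResponse response =
    (EinsteinOCRResponse) JSON.deserialize(sampleJson, EinsteinOCRResponse.class);

for (EinsteinOCRResponse.Probabilities prob : response.probabilities) {
    // The bounding box gives the pixel rectangle where the text was found
    Integer width  = prob.boundingBox.maxX - prob.boundingBox.minX;
    Integer height = prob.boundingBox.maxY - prob.boundingBox.minY;
    System.debug(prob.label + ' (probability ' + prob.probability + '), '
        + width + 'x' + height + ' px');
}
```

Besides the label, each entry carries a probability and a bounding box, so you could, for example, skip low-confidence words or sort the labels by position before using them.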

See the working demo for the same

You can find the complete code here

Thanks for reading. Sharing is caring 🙂

If you have any doubts, please feel free to ask in the comment section or connect with me.

Amit Singh
https://www.pantherschools.com/
Amit Singh, aka @sfdcpanther/pantherschools, is a Salesforce Technical Architect and Consultant with 8+ years of experience in Salesforce technology. 21x Certified. Blogger, Speaker, and Instructor. DevSecOps Champion.

6 COMMENTS

  1. When I execute this code, I get the following error:

    Line: 15, Column: 1
    System.JSONException: Unexpected character (‘C’ (code 67)): expected a valid value (number, String, array, object, ‘true’, ‘false’ or ‘null’) at input location [1,2]
