Optical Character Recognition on Android – OCR


[Image: android ocr library - featured]

Android is a capable OS, yet for a long time it lacked a very basic feature: text recognition. That has changed. With the official text recognition API in the Mobile Vision library, Android can now perform OCR quickly and accurately. I ran a very basic feature test of the new functionality and found it fast and easy to use. In this Optical Character Recognition (OCR) example for Android, I simply import the library, take a picture of a piece of text and look for text blocks in it. But before doing so, let's take a brief look at the Android Mobile Vision API to better understand how the Text API works.

Android Mobile Vision Library

Consider a scenario where you wish to scan an image or a video stream and detect faces, barcodes, QR codes or text in it on Android. Until recently, no such framework from Google existed on Android. Now Google has introduced the Mobile Vision APIs. This set of APIs provides easy-to-use programming interfaces through which we can scan faces, barcodes, QR codes and text without writing a huge amount of code. The best part is that they can be used offline as well: the required dependencies are downloaded once, when the app is installed, and after that an internet connection is no longer required. Mobile Vision is delivered on Android through Google Play services, so to enable your app to use the Mobile Vision APIs you need to add this dependency to your build.gradle file:

implementation 'com.google.android.gms:play-services-vision:11.4.0'

Please Note: To learn more, see the guides on how to set up Google Play services and on the Mobile Vision API.

Introducing an Android OCR Library – Text Recognition API

Ever since Android reached production devices, Optical Character Recognition on it has been a common area of research. The Text Recognition API of the Mobile Vision suite should make much of that effort unnecessary for typical apps. This Google-powered API can recognize text in multiple languages, such as English, French, German, Spanish or any other language written in a Latin-based script. Text can also be parsed from a stream of frames, i.e. a video, and displayed on the screen in real time, as in the image above. To keep this Android OCR library example simple, we will scan text from a single image only, as this tutorial is targeted at beginners. The interesting part is that all of this is done offline by Google Play services itself, i.e. no internet connection is required once it has been set up in the app (shown in the steps ahead). When it comes to structuring the text, this Android OCR library not only recognizes the text but also divides it into the following categories:

  1. Block – TextBlock – A top level object where a scanned paragraph or column is captured.
  2. Line – Line – A line of text captured from a block of text.
  3. Word – Element – A single word recognized in a Line.
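
To make the nesting concrete, here is a minimal plain-Java sketch of the three levels. The classes Word, Line and Block below are hypothetical stand-ins written for illustration; they are not the Mobile Vision types, and the way text is joined (spaces within a line, newlines within a block) is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the Block -> Line -> Word nesting.
// These are NOT the Mobile Vision classes; they only mirror the structure.
final class OcrHierarchySketch {

    /** A single recognized word. */
    static final class Word {
        final String value;
        Word(String value) { this.value = value; }
    }

    /** A line of text: an ordered list of words. */
    static final class Line {
        final List<Word> words;
        Line(List<Word> words) { this.words = words; }

        /** The line's text: its words joined by single spaces. */
        String value() {
            List<String> parts = new ArrayList<>();
            for (Word word : words) {
                parts.add(word.value);
            }
            return String.join(" ", parts);
        }
    }

    /** A paragraph or column: an ordered list of lines. */
    static final class Block {
        final List<Line> lines;
        Block(List<Line> lines) { this.lines = lines; }

        /** The block's text: its lines joined by newlines. */
        String value() {
            List<String> parts = new ArrayList<>();
            for (Line line : lines) {
                parts.add(line.value());
            }
            return String.join("\n", parts);
        }
    }

    /** Walks blocks, then lines, then words, collecting every word in order. */
    static List<String> allWords(List<Block> blocks) {
        List<String> result = new ArrayList<>();
        for (Block block : blocks) {
            for (Line line : block.lines) {
                for (Word word : line.words) {
                    result.add(word.value);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Block block = new Block(Arrays.asList(
                new Line(Arrays.asList(new Word("Hello"), new Word("OCR"))),
                new Line(Arrays.asList(new Word("on"), new Word("Android")))));
        System.out.println(block.value());
        System.out.println(allWords(Arrays.asList(block)));
    }
}
```

The same block-then-line-then-word walk is what the activity later in this tutorial performs on the real detection results.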

Android OCR Example

Now that we have a basic understanding of the Android OCR library, particularly the Text Recognition API, I will demonstrate it with an example where we simply take a picture and scan it for text. As a rule of thumb, first set up the Android app to download the Play services dependency for Optical Character Recognition. To do so, include the block below inside the application element of your manifest to instruct the installer to download the OCR dependency when the app is installed.

<meta-data
            android:name="com.google.android.gms.vision.DEPENDENCIES"
            android:value="ocr"/>

Please Note: This is not a mandatory step, but helps in downloading the dependencies beforehand. Also the link to full source code is at the end of this tutorial.

Next, let's define a layout to display the scanned results:

<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout
    android:id="@+id/activity_main"
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context="com.truiton.mobile.vision.ocr.MainActivity">

    <ImageView
        android:id="@+id/imageView"
        android:layout_width="200dp"
        android:layout_height="200dp"
        android:layout_marginTop="16dp"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintTop_toTopOf="parent"
        app:srcCompat="@mipmap/truiton"
        tools:layout_constraintLeft_creator="1"
        tools:layout_constraintRight_creator="1"/>

    <Button
        android:id="@+id/button"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginBottom="8dp"
        android:text="Scan Text"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        tools:layout_constraintLeft_creator="1"
        tools:layout_constraintRight_creator="1"
        />

    <TextView
        android:id="@+id/textView"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginTop="40dp"
        android:text="Scan Results:"
        android:textAllCaps="false"
        android:textStyle="normal|bold"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/imageView"
        tools:layout_constraintLeft_creator="1"
        tools:layout_constraintRight_creator="1"/>

    <ScrollView
        android:layout_width="0dp"
        android:layout_height="0dp"
        android:layout_marginTop="8dp"
        android:paddingLeft="5dp"
        android:paddingRight="5dp"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintHorizontal_bias="1.0"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/textView"
        tools:layout_constraintTop_creator="1"
        tools:layout_constraintRight_creator="1"
        tools:layout_constraintBottom_creator="1"
        tools:layout_constraintLeft_creator="1">

        <LinearLayout
            android:layout_width="match_parent"
            android:layout_height="wrap_content"
            android:orientation="vertical">

            <TextView
                android:id="@+id/results"
                android:layout_width="wrap_content"
                android:layout_height="wrap_content"
                android:layout_marginTop="8dp"
                app:layout_constraintLeft_toLeftOf="parent"
                app:layout_constraintRight_toRightOf="parent"
                tools:layout_constraintLeft_creator="1"
                tools:layout_constraintRight_creator="1"
                tools:layout_constraintTop_creator="1"/>
        </LinearLayout>
    </ScrollView>
</android.support.constraint.ConstraintLayout>

I was playing around with ConstraintLayout in Android, hence the layout is built with it, although ConstraintLayout is not a requirement for this Android OCR library example. The layout above basically contains an ImageView, a Scan Text button and a ScrollView that holds the scanned text, so that long results can be read in full. Next, let's define the main Activity:

package com.truiton.mobile.vision.ocr;

import android.Manifest;
import android.content.Context;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.net.Uri;
import android.os.Bundle;
import android.os.Environment;
import android.provider.MediaStore;
import android.support.annotation.NonNull;
import android.support.v4.app.ActivityCompat;
import android.support.v4.content.FileProvider;
import android.support.v7.app.AppCompatActivity;
import android.util.Log;
import android.util.SparseArray;
import android.view.View;
import android.widget.Button;
import android.widget.TextView;
import android.widget.Toast;

import com.google.android.gms.vision.Frame;
import com.google.android.gms.vision.text.Text;
import com.google.android.gms.vision.text.TextBlock;
import com.google.android.gms.vision.text.TextRecognizer;

import java.io.File;
import java.io.FileNotFoundException;

public class MainActivity extends AppCompatActivity {
    private static final String LOG_TAG = "Text API";
    private static final int PHOTO_REQUEST = 10;
    private TextView scanResults;
    private Uri imageUri;
    private TextRecognizer detector;
    private static final int REQUEST_WRITE_PERMISSION = 20;
    private static final String SAVED_INSTANCE_URI = "uri";
    private static final String SAVED_INSTANCE_RESULT = "result";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        Button button = (Button) findViewById(R.id.button);
        scanResults = (TextView) findViewById(R.id.results);
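        if (savedInstanceState != null) {
            // The URI is only saved once a picture has been taken, so it may be absent.
            String savedUri = savedInstanceState.getString(SAVED_INSTANCE_URI);
            if (savedUri != null) {
                imageUri = Uri.parse(savedUri);
                scanResults.setText(savedInstanceState.getString(SAVED_INSTANCE_RESULT));
            }
        }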
        detector = new TextRecognizer.Builder(getApplicationContext()).build();
        button.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                ActivityCompat.requestPermissions(MainActivity.this, new
                        String[]{Manifest.permission.WRITE_EXTERNAL_STORAGE}, REQUEST_WRITE_PERMISSION);
            }
        });
    }

    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        switch (requestCode) {
            case REQUEST_WRITE_PERMISSION:
                if (grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                    takePicture();
                } else {
                    Toast.makeText(MainActivity.this, "Permission Denied!", Toast.LENGTH_SHORT).show();
                }
        }
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == PHOTO_REQUEST && resultCode == RESULT_OK) {
            launchMediaScanIntent();
            try {
                Bitmap bitmap = decodeBitmapUri(this, imageUri);
                if (detector.isOperational() && bitmap != null) {
                    Frame frame = new Frame.Builder().setBitmap(bitmap).build();
                    SparseArray<TextBlock> textBlocks = detector.detect(frame);
                    StringBuilder blocks = new StringBuilder();
                    StringBuilder lines = new StringBuilder();
                    StringBuilder words = new StringBuilder();
                    for (int index = 0; index < textBlocks.size(); index++) {
                        // Each TextBlock is a paragraph or column of text
                        TextBlock tBlock = textBlocks.valueAt(index);
                        blocks.append(tBlock.getValue()).append("\n\n");
                        for (Text line : tBlock.getComponents()) {
                            // Each component of a block is a line of text
                            lines.append(line.getValue()).append("\n");
                            for (Text element : line.getComponents()) {
                                // Each component of a line is a single word
                                words.append(element.getValue()).append(", ");
                            }
                        }
                    }
                    if (textBlocks.size() == 0) {
                        scanResults.setText("Scan Failed: Found nothing to scan");
                    } else {
                        // Append the three views of the result to whatever is already shown
                        scanResults.append("Blocks: \n" + blocks + "\n"
                                + "---------\n"
                                + "Lines: \n" + lines + "\n"
                                + "---------\n"
                                + "Words: \n" + words + "\n"
                                + "---------\n");
                    }
                } else {
                    scanResults.setText("Could not set up the detector!");
                }
            } catch (Exception e) {
                Toast.makeText(this, "Failed to load Image", Toast.LENGTH_SHORT)
                        .show();
                Log.e(LOG_TAG, e.toString());
            }
        }
    }

    private void takePicture() {
        Intent intent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
        File photo = new File(Environment.getExternalStorageDirectory(), "picture.jpg");
        imageUri = FileProvider.getUriForFile(MainActivity.this,
                BuildConfig.APPLICATION_ID + ".provider", photo);
        intent.putExtra(MediaStore.EXTRA_OUTPUT, imageUri);
        startActivityForResult(intent, PHOTO_REQUEST);
    }

    @Override
    protected void onSaveInstanceState(Bundle outState) {
        if (imageUri != null) {
            outState.putString(SAVED_INSTANCE_URI, imageUri.toString());
            outState.putString(SAVED_INSTANCE_RESULT, scanResults.getText().toString());
        }
        super.onSaveInstanceState(outState);
    }

    @Override
    protected void onDestroy() {
        super.onDestroy();
        // Free the native resources held by the text recognizer.
        detector.release();
    }

    private void launchMediaScanIntent() {
        Intent mediaScanIntent = new Intent(Intent.ACTION_MEDIA_SCANNER_SCAN_FILE);
        mediaScanIntent.setData(imageUri);
        this.sendBroadcast(mediaScanIntent);
    }

    private Bitmap decodeBitmapUri(Context ctx, Uri uri) throws FileNotFoundException {
        int targetW = 600;
        int targetH = 600;
        BitmapFactory.Options bmOptions = new BitmapFactory.Options();
        bmOptions.inJustDecodeBounds = true;
        BitmapFactory.decodeStream(ctx.getContentResolver().openInputStream(uri), null, bmOptions);
        int photoW = bmOptions.outWidth;
        int photoH = bmOptions.outHeight;

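        // Integer division gives 0 for photos smaller than the target; clamp to 1 (no downsampling).
        int scaleFactor = Math.max(1, Math.min(photoW / targetW, photoH / targetH));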
        bmOptions.inJustDecodeBounds = false;
        bmOptions.inSampleSize = scaleFactor;

        return BitmapFactory.decodeStream(ctx.getContentResolver()
                .openInputStream(uri), null, bmOptions);
    }
}

In the code above, I simply initialize a TextRecognizer and ask the user to grant permission to store the captured image on disk. After that, when an image like the one shown below is captured, the decodeBitmapUri method loads it at a reduced size so that it can be scanned faster. Once the image is scaled, we check that the TextRecognizer is operational, and if it is, we scan the text out of the picture.

[Image: android ocr library]
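
The downscaling arithmetic in decodeBitmapUri is easy to check in isolation. The sketch below is plain Java with no Android dependencies; the helper name sampleSize is made up for illustration, and the clamp to 1 is an addition of this sketch so that photos smaller than the target are left at full size.

```java
final class SampleSizeSketch {

    /**
     * Returns the integer subsampling factor for a photo, using the smaller of
     * the two per-axis ratios so that neither side drops below its target.
     * The result is clamped to at least 1, which means "do not downsample".
     */
    static int sampleSize(int photoW, int photoH, int targetW, int targetH) {
        int factor = Math.min(photoW / targetW, photoH / targetH);
        return Math.max(1, factor);
    }

    public static void main(String[] args) {
        // 4000 / 600 = 6 and 3000 / 600 = 5 with integer division, so the factor is 5.
        System.out.println(sampleSize(4000, 3000, 600, 600));
        // Both ratios are 0 for a photo smaller than the target, so the clamp yields 1.
        System.out.println(sampleSize(400, 300, 600, 600));
    }
}
```

Android's documentation for inSampleSize notes that the decoder rounds the value down to a power of two, so a computed factor of 5 would likely behave like 4. If recognition quality suffers on small print, raising the 600 pixel targets is one thing to try, since a heavily downsampled image leaves the recognizer fewer pixels per character.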

When the Android OCR library, Mobile Vision, returned the text from the picture above, it was very accurate. Have a look at the result below:

[Image: android ocr library]
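
The activity prints blocks in whatever order the detector returns them, and a reader in the comments below reports that this order does not always match the page when there are three or more blocks. One possible workaround, sketched here in plain Java, is to sort the blocks by the top-left corner of their bounding boxes before printing. PlacedBlock is a hypothetical stand-in; in the real app the coordinates would presumably come from each TextBlock's getBoundingBox().

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class BlockOrderSketch {

    /** Hypothetical stand-in for a detected block: its text plus the top-left corner of its bounding box. */
    static final class PlacedBlock {
        final String text;
        final int left;
        final int top;

        PlacedBlock(String text, int left, int top) {
            this.text = text;
            this.left = left;
            this.top = top;
        }
    }

    /** Returns a sorted copy: top to bottom, with ties broken left to right. */
    static List<PlacedBlock> inReadingOrder(List<PlacedBlock> blocks) {
        List<PlacedBlock> sorted = new ArrayList<>(blocks);
        sorted.sort(Comparator
                .comparingInt((PlacedBlock b) -> b.top)
                .thenComparingInt(b -> b.left));
        return sorted;
    }

    public static void main(String[] args) {
        List<PlacedBlock> detected = new ArrayList<>();
        detected.add(new PlacedBlock("footer", 10, 300));
        detected.add(new PlacedBlock("title", 10, 20));
        detected.add(new PlacedBlock("body", 10, 120));
        for (PlacedBlock block : inReadingOrder(detected)) {
            System.out.println(block.text);
        }
    }
}
```

This simple ordering assumes a single-column page; a multi-column layout would need the blocks grouped by column first.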

For full source code, please refer to the link below:

Full Source Code

I believe this is one of the best OCR libraries available for Android to date, as it offers offline text scanning without compromising on quality. So if you have to build an app that needs to scan text and process it, use the Text API of Mobile Vision for Optical Character Recognition. For more posts like this, please connect with us on Twitter, Facebook and Google+. Hope this helped.

About Mohit Gupt

An android enthusiast, and an iPhone user with a keen interest in development of innovative applications.



15 thoughts on “Optical Character Recognition on Android – OCR”

  • José

    Hi Mohit,

    First of all, congratulations on the great post. There isn’t much information about this topic on the internet and you explained it so well.

    I’ve been following your step-by-step guide and also deployed your code. However, even using a new Galaxy, the accuracy of the detected text is very poor (not even a word). I think the problem could be in the decode function, because I have deployed the Google labs one and that one works fine.

    Could you please give me some advice or tell me how to fine-tune the results?

    Thanks in advance and again, great great job!

  • Jay

    Hi Mohit,
    It works very well on an Asus Zenfone 2 (Ver. 5.0) but does not work on a Samsung Galaxy S3 (Ver. 4.3) or an LG Nexus 5X (Ver. 7.0). On the Samsung it takes the picture but shows the error “Scan Failed: Found nothing to scan”, and on the Nexus 5X the app crashes.

    So what should I do to solve these errors?
    Help as soon as possible.

    Thanks in Advance.

    • oza san

      Dear Mohit,
      First of all, thank you for your great work. Next, how can I use it for other languages which don’t use Latin characters? And how can I use images taken from videos?
      Thanks in advance!

  • Eleazer

    Hello Mohit,
    This is a nice tutorial. I tried it and it works fine, but there is one bug:
    the blocks are sometimes not arranged correctly; they are shuffled if there are 3 or more blocks.
    I don’t know why.
    Could you help with that? Thanks in advance.

  • supriya chauhan

    Hi,
    This code works so well, and the scanned words come out clearly, but can we customize the camera view? At the moment the complete screen is used for scanning. Is it possible to reduce the scanning window size?

  • samuel wainaina

    Thanks for such a clear and understandable guide. My question is: can I scan a two-sided document like an ID and store both sides in one file along with their corresponding text file? The storage will be a remote server.