Installing and using the open source OCR engine Tesseract on Ubuntu

You need basic command-line skills to use this open source OCR software.

First, we are going to install Tesseract on Ubuntu.

Open your terminal and run the following command as root (or prefix it with sudo):

root@nur-HP:~# apt-get install tesseract-ocr

This installs the Tesseract OCR engine on your Ubuntu operating system. Next, install the language packages you need. Remember that you do not need to install the English language package, because it is already installed along with Tesseract.

Here, I am going to install the Bangla language package. The general form of the command is:

apt-get install tesseract-ocr-[lang]

root@nur-HP:~# apt-get install tesseract-ocr-ben    (This command installs the Bangla language package)

If you would like to install all language packages, use the following command:

root@nur-HP:~# apt-get install tesseract-ocr-all
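
Before moving on, it can be worth checking that Tesseract actually sees the language you installed. Here is a minimal sketch, assuming a Tesseract version that supports the --list-langs option (3.02 or later, as far as I know). The function name has_lang is my own invention and not part of Tesseract.

```shell
#!/bin/sh
# has_lang LANG: succeed if LANG (for example ben) is in Tesseract's language list.
# has_lang is a hypothetical helper name, not something Tesseract provides.
has_lang() {
    # --list-langs prints a header line followed by one language code per line.
    # 2>&1 is there because some older versions print the list on stderr.
    # grep -x matches whole lines only, -F treats the code as a plain string.
    tesseract --list-langs 2>&1 | grep -qxF -- "$1"
}

# Example call (commented out so the file can be sourced safely):
# has_lang ben && echo "Bangla is installed" || echo "Bangla is missing"
```

If the check fails for a language you just installed, rerunning the apt-get command for that package is the first thing I would try.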

The installation is complete. Now we are going to use it. The general form of the command is:

tesseract [image_path] [file_name]

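Sample command (Tesseract adds .txt to the output name itself, so the text is saved in /home/nurahammad/Desktop/test.txt):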

root@nur-HP:~# tesseract /home/nurahammad/Dropbox/ForOCR/IMG_20171201_161244.jpg /home/nurahammad/Desktop/test
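
The sample command above uses the default language, which is English. To read Bangla text with the package installed earlier, Tesseract has to be told the language with the -l option. Below is a small sketch of a wrapper for that. The function name ocr_to_file and its argument order are my own assumptions; only the tesseract call itself is the real command.

```shell
#!/bin/sh
# ocr_to_file IMAGE OUTPUT_BASE [LANG]
# Hypothetical wrapper: runs Tesseract on IMAGE and writes OUTPUT_BASE.txt.
# LANG is a Tesseract language code such as ben; it defaults to eng.
ocr_to_file() {
    image=$1
    output_base=$2
    lang=${3:-eng}
    # -l selects the language data used for recognition.
    tesseract "$image" "$output_base" -l "$lang"
}

# Example call (commented out so the file can be sourced safely):
# ocr_to_file /home/nurahammad/Dropbox/ForOCR/IMG_20171201_161244.jpg /home/nurahammad/Desktop/test ben
```

For pages that mix Bangla and English, the language codes can be joined with a plus sign, for example ben+eng.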

If you would like to see the result in the terminal, use the command below:

tesseract [image_path] stdout

root@nur-HP:~# tesseract /home/nurahammad/Dropbox/ForOCR/IMG_20171201_161244.jpg stdout
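
Because stdout sends the recognized text to the terminal, it can also be piped into other tools. As a rough sketch, the function below checks whether a scanned page mentions a given word. The name ocr_contains is hypothetical, and the search is a plain case-insensitive text match, so it is only as good as the OCR result.

```shell
#!/bin/sh
# ocr_contains IMAGE WORD: succeed if the OCR text of IMAGE contains WORD.
# ocr_contains is a hypothetical helper name, not part of Tesseract.
ocr_contains() {
    # 2>/dev/null hides Tesseract's progress messages, which go to stderr.
    # grep -i ignores case, -F treats WORD as a plain string, -q prints nothing.
    tesseract "$1" stdout 2>/dev/null | grep -qiF -- "$2"
}

# Example call (commented out so the file can be sourced safely):
# ocr_contains /home/nurahammad/Dropbox/ForOCR/IMG_20171201_161244.jpg "library" && echo "found"
```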

I hope this helps you process your repository or digital library files.
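
For a repository with many scans, running the command by hand for each image gets tedious. Here is a minimal sketch of a loop over a folder, assuming the scans are .jpg files in one directory. The function name ocr_folder, its arguments, and the choice of one .txt file per image are my own assumptions, so adjust them to your collection.

```shell
#!/bin/sh
# ocr_folder INPUT_DIR OUTPUT_DIR [LANG]
# Hypothetical helper: runs Tesseract on every .jpg in INPUT_DIR and writes
# one .txt file per image into OUTPUT_DIR. LANG defaults to eng.
ocr_folder() {
    input_dir=$1
    output_dir=$2
    lang=${3:-eng}

    mkdir -p "$output_dir" || return 1

    for image in "$input_dir"/*.jpg; do
        # Skip the unexpanded pattern when the folder has no .jpg files.
        [ -e "$image" ] || continue
        # Strip the directory and the .jpg extension to get the output name.
        name=$(basename "$image" .jpg)
        # Tesseract appends .txt to the output name on its own.
        tesseract "$image" "$output_dir/$name" -l "$lang" \
            || echo "failed: $image" >&2
    done
}

# Example call (commented out so the file can be sourced safely):
# ocr_folder /home/nurahammad/Dropbox/ForOCR /home/nurahammad/Desktop/ocr_text ben
```

A failed image only prints a message and the loop carries on, which seemed the safer choice for a long batch.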

 
