Installing Tesseract 3.04+ in Ubuntu 14.04

Artem Malynovskyi
2 min read · Jan 5, 2017

The newest version of Tesseract available from the Ubuntu 14.04 repositories is 3.03, but some libraries require 3.04 or later (I ran into this with tesserocr, a Python library). This article describes how to work around that.

Tesseract 3.04 requires Leptonica 1.71 or later, but the highest version of Leptonica packaged for Ubuntu 14.04 is 1.70. To get a newer version, you need to compile it from source.

So, here is the plan:

  • Compile Leptonica 1.71+
  • Compile Tesseract 3.04+ (against the compiled Leptonica)
  • Install desired library that works with Tesseract
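Before starting, it can be useful to check whether the Tesseract you already have is new enough. The snippet below is only a sketch: `version_ge` is a hypothetical helper name, it relies on the `-V` (version sort) option of GNU `sort`, and it assumes the version number is the second word of the first line printed by `tesseract -v`.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Compare the installed Tesseract (if any) against the 3.04 minimum.
# Older releases print the version to stderr, hence the 2>&1.
if command -v tesseract >/dev/null 2>&1; then
    installed="$(tesseract -v 2>&1 | head -n1 | awk '{print $2}')"
    if version_ge "$installed" "3.04"; then
        echo "tesseract $installed is new enough"
    else
        echo "tesseract $installed is older than 3.04"
    fi
else
    echo "tesseract is not installed"
fi
```

The same helper works for Leptonica, for example `version_ge 1.70 1.71` fails while `version_ge 1.74.4 1.71` succeeds.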

1) Install the libraries required by Leptonica and Tesseract:

sudo apt-get install python-distutils-extra tesseract-ocr tesseract-ocr-eng libopencv-dev libtesseract-dev libleptonica-dev python-all-dev swig libcv-dev python-opencv python-numpy python-setuptools build-essential subversion
sudo apt-get install autoconf automake libtool
sudo apt-get install libpng12-dev libjpeg62-dev libtiff4-dev zlib1g-dev

2) Install the libraries required for Tesseract training (optional):

sudo apt-get install libicu-dev libpango1.0-dev libcairo2-dev

3) Download Leptonica (choose your preferred version and adjust the command to match):

wget http://www.leptonica.com/source/leptonica-1.74.4.tar.gz

4) Unpack and build the downloaded Leptonica archive:

tar xvf leptonica-1.74.4.tar.gz
cd leptonica-1.74.4
./configure
make
sudo make install
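The download URL, the archive name and the unpacked directory all depend on the Leptonica version, so one way to keep them in sync is to derive all three from a single argument. This is a sketch only: the function names are hypothetical, and it assumes the archive unpacks into a directory named `leptonica-<version>`, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helpers: build every version-dependent name from one value.
leptonica_archive() {
    echo "leptonica-$1.tar.gz"
}

leptonica_url() {
    echo "http://www.leptonica.com/source/$(leptonica_archive "$1")"
}

# Download, unpack, build and install the given Leptonica version.
# Runs in a subshell so the caller's working directory is left unchanged.
build_leptonica() {
    (
        wget "$(leptonica_url "$1")" &&
        tar xvf "$(leptonica_archive "$1")" &&
        cd "leptonica-$1" &&
        ./configure &&
        make &&
        sudo make install
    )
}
```

With these defined, `build_leptonica 1.74.4` runs the same steps as steps 3 and 4, and switching versions means changing one argument.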

5) Build and install Tesseract against the newly installed Leptonica:

git clone https://github.com/tesseract-ocr/tesseract.git
cd tesseract
./autogen.sh
./configure --enable-debug
LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make
sudo make install
sudo ldconfig

6) Check the Tesseract version by running tesseract -v. If all the steps succeeded, the output should look similar to this (the exact version numbers depend on what you built):

> tesseract -v
tesseract 4.00.00alpha-241-g6f83ba0
leptonica-1.74.1
libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8
Found AVX
Found SSE
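To confirm that the new Tesseract really picked up the compiled Leptonica rather than the packaged 1.70, the Leptonica version can be pulled out of that output. A minimal sketch, assuming the output contains a line starting with `leptonica-` as shown above; `leptonica_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `tesseract -v` output on stdin and print the
# Leptonica version found on the line that starts with "leptonica-".
leptonica_version() {
    sed -n 's/^ *leptonica-\([0-9][0-9.]*\).*/\1/p' | head -n1
}

# Only query tesseract if it is actually on the PATH.
# Some releases print the version to stderr, hence the 2>&1.
if command -v tesseract >/dev/null 2>&1; then
    echo "linked Leptonica: $(tesseract -v 2>&1 | leptonica_version)"
fi
```

If the reported version is still 1.70, Tesseract was most likely built against the packaged library instead of the one in /usr/local.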

7) Use pip (or the package manager for your programming language) to install the library you need:

pip install tesserocr
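A quick way to see whether the installation worked is to import the module and ask it which Tesseract it was built against. This is a sketch: `check_tesserocr` is a hypothetical wrapper, and it assumes the interpreter you pass in (default `python`) is the one pip installed tesserocr for.

```shell
#!/bin/sh
# Hypothetical wrapper: import tesserocr with the given Python interpreter
# (default: python) and print the Tesseract version it reports.
# Returns non-zero if the interpreter is missing or the import fails.
check_tesserocr() {
    "${1:-python}" -c 'import tesserocr; print(tesserocr.tesseract_version())'
}
```

Run `check_tesserocr` (or `check_tesserocr python3`) after the install. An import error at this point usually means the module cannot find the Tesseract or Leptonica shared libraries.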

