T-HOG: an Effective Gradient-Based Descriptor for Single Line Text Regions

Pattern Recognition (PR), Elsevier, vol. 46, pp. 1078-1090, 2013.

Rodrigo Minetto, Nicolas Thome, Matthieu Cord, Neucimar J. Leite, Jorge Stolfi

Federal University of Technology of Parana, DAINF, Curitiba, Brazil
Universite Pierre et Marie Curie, UPMC-Sorbonne Universities, LIP6, Paris, France
University of Campinas, Institute of Computing, Campinas, Brazil

We discuss the use of histogram of oriented gradients (HOG) descriptors as an effective tool for text description and recognition. Specifically, we propose a HOG-based texture descriptor (T-HOG) that uses a partition of the image into overlapping horizontal cells with gradual boundaries, to characterize single-line texts in outdoor scenes and video frames. The input of our algorithm is a rectangular image presumed to contain a single line of text in Latin-like characters. The output is a relatively short descriptor that provides an effective input to an SVM classifier. Extensive experiments show that the T-HOG recognizer is more accurate than Dalal and Triggs's original HOG-based classifier, for any descriptor size. In addition, we show that the T-HOG is an effective tool for text/non-text discrimination and can be used in various text detection applications. In particular, combining T-HOG with a permissive bottom-up text detector is shown to outperform state-of-the-art text detection systems on two major publicly available databases, as described in our journal paper [thog-paper]. Read also our text detector paper: [text detection].
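To make the idea of overlapping horizontal cells with gradual boundaries more concrete, here is a minimal Python sketch. It is not the code distributed below, and it does not reproduce the exact procedure of the paper. The function name `thog_sketch`, the Gaussian cell profiles, the central-difference gradient, the unsigned orientation bins, the final L2 normalization, and the default values of `n_cells` and `n_bins` are all assumptions made for illustration only.

```python
import math

def thog_sketch(img, n_cells=4, n_bins=8):
    """Illustrative T-HOG-like descriptor (hypothetical helper, not the authors' code).

    img is a grayscale image given as a list of rows of floats.
    Returns a list of n_cells * n_bins values with unit L2 norm
    (or all zeros if the image has no gradient).
    """
    h, w = len(img), len(img[0])
    desc = [0.0] * (n_cells * n_bins)

    # Assumption: each horizontal cell is a Gaussian profile over the row index,
    # so neighbouring cells overlap and have gradual (soft) boundaries.
    centers = [(c + 0.5) * h / n_cells for c in range(n_cells)]
    sigma = h / n_cells

    for y in range(1, h - 1):
        weights = [math.exp(-0.5 * ((y - cy) / sigma) ** 2) for cy in centers]
        for x in range(1, w - 1):
            # Central-difference gradient (assumed; other kernels are possible).
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            # Unsigned orientation folded into [0, pi) (assumed).
            theta = math.atan2(gy, gx) % math.pi
            b = min(int(theta / math.pi * n_bins), n_bins - 1)
            # Every cell receives a vote, scaled by its weight at this row.
            for c, wc in enumerate(weights):
                desc[c * n_bins + b] += wc * mag

    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc


# Usage: a synthetic 12x16 image with a single vertical edge.
img = [[0.0] * 8 + [1.0] * 8 for _ in range(12)]
d = thog_sketch(img)
```

In this sketch the descriptor length is `n_cells * n_bins`, so it stays short regardless of the width of the text region. A vector of this kind is what would then be passed to an SVM classifier for text/non-text discrimination.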

Source code (C-LIBSVM): [click here to download]

Source code (JAVA): [click here to download]