Commit 5abe3b5

Class-specific Extremal Region Filter algorithm as proposed in:
Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012. The high-level C++ interface and the implementation of the algorithm are in the objdetect module. Also adds a C++ example, a test image, and the default classifiers as XML files.
1 parent d81d3fc commit 5abe3b5

File tree

9 files changed

+9540
-1
lines changed

modules/objdetect/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 set(the_description "Object Detection")
-ocv_define_module(objdetect opencv_core opencv_imgproc OPTIONAL opencv_highgui)
+ocv_define_module(objdetect opencv_core opencv_imgproc opencv_ml OPTIONAL opencv_highgui)

modules/objdetect/include/opencv2/objdetect.hpp

Lines changed: 1 addition & 0 deletions
@@ -394,5 +394,6 @@ CV_EXPORTS_W void drawDataMatrixCodes(InputOutputArray image,
 }
 
 #include "opencv2/objdetect/linemod.hpp"
+#include "opencv2/objdetect/erfilter.hpp"
 
 #endif
Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
+/*M///////////////////////////////////////////////////////////////////////////////////////
+//
+// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
+//
+// By downloading, copying, installing or using the software you agree to this license.
+// If you do not agree to this license, do not download, install,
+// copy or use the software.
+//
+//
+// License Agreement
+// For Open Source Computer Vision Library
+//
+// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
+// Copyright (C) 2009, Willow Garage Inc., all rights reserved.
+// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
+// Third party copyrights are property of their respective owners.
+//
+// Redistribution and use in source and binary forms, with or without modification,
+// are permitted provided that the following conditions are met:
+//
+// * Redistribution's of source code must retain the above copyright notice,
+// this list of conditions and the following disclaimer.
+//
+// * Redistribution's in binary form must reproduce the above copyright notice,
+// this list of conditions and the following disclaimer in the documentation
+// and/or other materials provided with the distribution.
+//
+// * The name of the copyright holders may not be used to endorse or promote products
+// derived from this software without specific prior written permission.
+//
+// This software is provided by the copyright holders and contributors "as is" and
+// any express or implied warranties, including, but not limited to, the implied
+// warranties of merchantability and fitness for a particular purpose are disclaimed.
+// In no event shall the Intel Corporation or contributors be liable for any direct,
+// indirect, incidental, special, exemplary, or consequential damages
+// (including, but not limited to, procurement of substitute goods or services;
+// loss of use, data, or profits; or business interruption) however caused
+// and on any theory of liability, whether in contract, strict liability,
+// or tort (including negligence or otherwise) arising in any way out of
+// the use of this software, even if advised of the possibility of such damage.
+//
+//M*/
+
+#ifndef __OPENCV_OBJDETECT_ERFILTER_HPP__
+#define __OPENCV_OBJDETECT_ERFILTER_HPP__
+
+#include "opencv2/core.hpp"
+#include <vector>
+#include <deque>
+
+namespace cv
+{
+
+/*!
+    Extremal Region Stat structure
+
+    The ERStat structure represents a class-specific Extremal Region (ER).
+
+    An ER is a 4-connected set of pixels with all its grey-level values smaller than the values
+    in its outer boundary. A class-specific ER is selected (using a classifier) from all the ERs
+    in the component tree of the image.
+*/
+struct CV_EXPORTS ERStat
+{
+public:
+    //! Constructor
+    ERStat(int level = 256, int pixel = 0, int x = 0, int y = 0);
+    //! Destructor
+    ~ERStat() {}
+
+    //! seed point and the threshold (max grey-level value)
+    int pixel;
+    int level;
+
+    //! incrementally computable features
+    int area;
+    int perimeter;
+    int euler;                  //!< euler number
+    int bbox[4];
+    double raw_moments[2];      //!< order 1 raw moments to derive the centroid
+    double central_moments[3];  //!< order 2 central moments to construct the covariance matrix
+    std::deque<int> *crossings; //!< horizontal crossings
+
+    //! 1st stage features
+    float aspect_ratio;
+    float compactness;
+    float num_holes;
+    float med_crossings;
+
+    //! 2nd stage features
+    float hole_area_ratio;
+    float convex_hull_ratio;
+    float num_inflexion_points;
+
+    // TODO Other features can be added (average color, standard deviation, and such)
+
+
+    // TODO shall we include the pixel list whenever available (i.e. after 2nd stage) ?
+    std::vector<int> *pixels;
+
+    //! probability that the ER belongs to the class we are looking for
+    double probability;
+
+    //! pointers preserving the tree structure of the component tree
+    ERStat* parent;
+    ERStat* child;
+    ERStat* next;
+    ERStat* prev;
+
+    //! whether the region is a local maximum of the probability
+    bool local_maxima;
+    ERStat* max_probability_ancestor;
+    ERStat* min_probability_ancestor;
+};
+
+/*!
+    Base class for 1st and 2nd stages of Neumann and Matas scene text detection algorithms
+    Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012
+
+    Extracts the component tree (if needed) and filters the extremal regions (ERs) by using a given classifier.
+*/
+class CV_EXPORTS ERFilter : public cv::Algorithm
+{
+public:
+
+    //! The classifier callback is a class so that the concrete classifier (SVM, Boost, etc.) stays hidden.
+    class CV_EXPORTS Callback
+    {
+    public:
+        virtual ~Callback() {}
+        //! The classifier must return a probability measure for the region.
+        virtual double eval(const ERStat& stat) = 0; // TODO why cannot use "const = 0" here?
+    };
+
+    /*!
+        The key method. Takes an image as input and returns the selected regions in a vector of ERStat.
+        Only distinctive ERs which correspond to characters are selected by a sequential classifier.
+        \param image   is the input image
+        \param regions is output for the first stage, input/output for the second one.
+    */
+    virtual void run( cv::InputArray image, std::vector<ERStat>& regions ) = 0;
+
+
+    //! set/get methods for the algorithm properties
+    virtual void setCallback(const cv::Ptr<ERFilter::Callback>& cb) = 0;
+    virtual void setThresholdDelta(int thresholdDelta) = 0;
+    virtual void setMinArea(float minArea) = 0;
+    virtual void setMaxArea(float maxArea) = 0;
+    virtual void setMinProbability(float minProbability) = 0;
+    virtual void setMinProbabilityDiff(float minProbabilityDiff) = 0;
+    virtual void setNonMaxSuppression(bool nonMaxSuppression) = 0;
+    virtual int getNumRejected() = 0;
+};
+
+
+/*!
+    Create an Extremal Region Filter for the 1st stage classifier of N&M algorithm
+    Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012
+
+    The component tree of the image is extracted by a threshold increased step by step
+    from 0 to 255, incrementally computable descriptors (aspect_ratio, compactness,
+    number of holes, and number of horizontal crossings) are computed for each ER
+    and used as features for a classifier which estimates the class-conditional
+    probability P(er|character). The value of P(er|character) is tracked using the inclusion
+    relation of ER across all thresholds and only the ERs which correspond to local maximum
+    of the probability P(er|character) are selected (if the local maximum of the
+    probability is above a global limit pmin and the difference between local maximum and
+    local minimum is greater than minProbabilityDiff).
+
+    \param cb                 Callback with the classifier.
+                              If omitted, tries to load a default classifier from file trained_classifierNM1.xml
+    \param thresholdDelta     Threshold step in subsequent thresholds when extracting the component tree
+    \param minArea            The minimum area (as a fraction of the image size) allowed for retrieved ERs
+    \param maxArea            The maximum area (as a fraction of the image size) allowed for retrieved ERs
+    \param minProbability     The minimum probability P(er|character) allowed for retrieved ERs
+    \param nonMaxSuppression  Whether non-maximum suppression is done over the branch probabilities
+    \param minProbabilityDiff The minimum probability difference between local maxima and local minima ERs
+*/
+CV_EXPORTS cv::Ptr<ERFilter> createERFilterNM1(const cv::Ptr<ERFilter::Callback>& cb = NULL,
+                                               int thresholdDelta = 1, float minArea = 0.000025,
+                                               float maxArea = 0.13, float minProbability = 0.2,
+                                               bool nonMaxSuppression = true,
+                                               float minProbabilityDiff = 0.1);
+
+/*!
+    Create an Extremal Region Filter for the 2nd stage classifier of N&M algorithm
+    Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012
+
+    In the second stage, the ERs that passed the first stage are classified into character
+    and non-character classes using more informative but also more computationally expensive
+    features. The classifier uses all the features calculated in the first stage and the following
+    additional features: hole area ratio, convex hull ratio, and number of outer inflexion points.
+
+    \param cb             Callback with the classifier.
+                          If omitted, tries to load a default classifier from file trained_classifierNM2.xml
+    \param minProbability The minimum probability P(er|character) allowed for retrieved ERs
+*/
+CV_EXPORTS cv::Ptr<ERFilter> createERFilterNM2(const cv::Ptr<ERFilter::Callback>& cb = NULL,
+                                               float minProbability = 0.85);
+
+}
+#endif // __OPENCV_OBJDETECT_ERFILTER_HPP__
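
The first-stage selection rule described in the createERFilterNM1 comment (keep an ER only if it is a local maximum of P(er|character), the maximum is at least minProbability, and it exceeds the local minimum by more than minProbabilityDiff) can be sketched in isolation. The code below is a minimal illustration and not the implementation from this commit: selectLocalMaxima is a hypothetical helper, it works on a plain vector of probabilities along a single branch of the component tree instead of on ERStat nodes, and measuring the difference against the lower of the two adjacent local minima is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the commit): p holds P(er|character) for the
// regions along one branch of the component tree, ordered by increasing threshold.
// Returns the indices of the regions that would be kept.
std::vector<std::size_t> selectLocalMaxima(const std::vector<double>& p,
                                           double minProbability,
                                           double minProbabilityDiff)
{
    assert(minProbabilityDiff >= 0.0);

    std::vector<std::size_t> kept;
    const std::size_t n = p.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        // Local maximum: strictly above the left neighbour, not below the right one
        // (so only the first element of a plateau is considered).
        const bool leftOk  = (i == 0)     || p[i] >  p[i - 1];
        const bool rightOk = (i + 1 == n) || p[i] >= p[i + 1];
        if (!leftOk || !rightOk)
            continue;

        // Global limit on the probability.
        if (p[i] < minProbability)
            continue;

        // Walk downhill on both sides to reach the adjacent local minima.
        std::size_t l = i;
        while (l > 0 && p[l - 1] <= p[l])
            --l;
        std::size_t r = i;
        while (r + 1 < n && p[r + 1] <= p[r])
            ++r;

        // Assumption for this sketch: compare against the lower of the two valleys.
        const double valley = std::min(p[l], p[r]);
        if (p[i] - valley > minProbabilityDiff)
            kept.push_back(i);
    }
    return kept;
}
```

With the header's default values (minProbability = 0.2, minProbabilityDiff = 0.1), a branch with probabilities 0.1, 0.5, 0.3, 0.9, 0.2 keeps the regions at indices 1 and 3 under these assumptions.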
