python recognition - How to classify blurry numbers with openCV
You clarified in comments that you've already isolated the number part of the image pre-detection, so I'll start under that assumption.
Perhaps you can approximate the perspective effects and "blurriness" of the number by treating it as a handwritten digit. In that case, there is a famous dataset of handwritten numerals for classification training called MNIST.
Yann LeCun has enumerated the state of the art on this dataset here: MNIST hand-written dataset.
At the far end of the spectrum, convolutional neural networks yield outrageously low error rates (fractions of 1% error). For a simpler solution, k-nearest neighbours using deskewing, noise removal, blurring, and a 2 pixel shift yielded about 1% error, and is significantly faster to implement. OpenCV's Python bindings include a k-nearest-neighbours implementation. Neural networks and support vector machines with deskewing also have some pretty impressive performance rates.
Note that convolutional networks don't have you pick your own features, so the important color-differential information here might just be used for narrowing the region-of-interest. Other approaches, where you define your feature space, might incorporate the known color difference more precisely.
Python supports a lot of machine learning techniques in the terrific package sklearn - here are examples of sklearn applied to mnist. If you're looking for a tutorial-style explanation of machine learning in python, sklearn's own tutorial is very thorough.
Those are the kinds of items you're trying to classify if you learn using this approach. To emphasize how easy it is to start training some of these machine learning-based classifiers, here is an abridged section from the example code in the linked sklearn package:
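    from sklearn import datasets, svm

    digits = datasets.load_digits()  # built-in to sklearn!
    n_samples = len(digits.images)
    # Flatten each 8x8 image into a 64-element feature vector
    data = digits.images.reshape((n_samples, -1))

    # Create a classifier: a support vector classifier
    classifier = svm.SVC(gamma=0.001)

    # We learn the digits on the first half of the digits
    # (integer division keeps the slice index an int in Python 3)
    classifier.fit(data[:n_samples // 2], digits.target[:n_samples // 2])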
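To get a feel for how well this works, you can score the classifier on the half of the data it has not seen. This is a minimal sketch, assuming the same first-half/second-half split; the k-nearest-neighbours model and `n_neighbors=3` are my own additions for comparison and are not part of the linked example.

```python
from sklearn import datasets, svm
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))
half = len(data) // 2

# Train on the first half, evaluate on the unseen second half
X_train, y_train = data[:half], digits.target[:half]
X_test, y_test = data[half:], digits.target[half:]

models = {
    "svc": svm.SVC(gamma=0.001),
    # n_neighbors=3 is an arbitrary choice for illustration
    "knn": KNeighborsClassifier(n_neighbors=3),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() reports mean accuracy on the held-out samples
    scores[name] = model.score(X_test, y_test)
    print(name, round(scores[name], 3))
```

Accuracy on these clean 8x8 digits says little about blurry camera crops, so treat it as a sanity check only.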
If you're wedded to OpenCV (possibly because you want to port to a real-time system in the future), opencv3/python has a tutorial on this exact topic too! Their demo uses k-nearest-neighbor (listed in the LeCun page), but they also have SVMs and many of the other tools in sklearn. Their OCR page using SVMs uses deskewing, which might be useful with the perspective effect in your problem.
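For reference, moment-based deskewing can be sketched roughly like this. As far as I know the OpenCV tutorial does it with `cv2.moments` and `cv2.warpAffine`; the version below is a numpy-only approximation that shifts each row by a whole number of pixels, so take it as an illustration of the idea and not as the tutorial's code. The function name `deskew` and the zero fill at the borders are my own choices.

```python
import numpy as np

def deskew(img):
    """Straighten a slanted glyph by shearing rows around its centroid."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    total = img.sum()
    if total == 0:
        return img.copy()

    ys, xs = np.mgrid[0:h, 0:w]
    cy = (ys * img).sum() / total
    cx = (xs * img).sum() / total

    # Second-order central moments of the intensity distribution
    mu11 = ((xs - cx) * (ys - cy) * img).sum()
    mu02 = (((ys - cy) ** 2) * img).sum()
    if abs(mu02) < 1e-2:
        return img.copy()

    # Horizontal drift per row of vertical distance from the centroid
    skew = mu11 / mu02

    out = np.zeros_like(img)
    cols = np.arange(w)
    for y in range(h):
        # Each output row samples the input row at x + shift
        shift = int(np.rint(skew * (y - cy)))
        src = cols + shift
        valid = (src >= 0) & (src < w)
        out[y, valid] = img[y, src[valid]]
    return out
```

You would apply this to each isolated digit before extracting features, for both the training set and the image you want to classify.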
UPDATE: I used the out-of-the-box sklearn approach outlined above on your image, heavily cropped, and it correctly classified it. A lot more testing would be required to see if this is robust in practice.
^^ That tiny image is the 8x8 crop of the image you embedded in your question. The sklearn digits dataset consists of 8x8 images (the full MNIST images are 28x28). That's why it trains in less than a second with default arguments in sklearn.
I converted it to the correct format by scaling it up to the value range of the digits dataset using
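    import scipy.misc  # note: scipy.misc.imread was removed in SciPy 1.2; newer setups need another image loader

    number = scipy.misc.imread("cropped_image.png")
    # Use only the first channel, scale by 15, and flatten the 8x8 crop to 64 features
    datum = (number[:, :, 0] * 15).astype(int).reshape((64,))
    # Recent sklearn versions expect a 2D array of shape (n_samples, n_features)
    classifier.predict(datum.reshape(1, -1))  # returned 8 for this crop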
I didn't change anything else from the example; here, I'm only using the first channel for classification, and no smart feature computation. A factor of 15 looked about right to me; you'll need to tune it to get within the target range or (ideally) provide your own training and testing set.
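If your crops are not already 8x8, or your loader returns 8-bit values, a small helper can do the conversion more explicitly. This is a minimal sketch, assuming a single-channel crop with values 0..255 whose sides are multiples of 8, and assuming bright pixels are the digit (which matches a red number on black when you take the red channel). The name `to_digits_features` is hypothetical.

```python
import numpy as np

def to_digits_features(gray_u8):
    """Turn an 8-bit grayscale crop into one row of 64 features in 0..16."""
    gray = np.asarray(gray_u8, dtype=float)
    h, w = gray.shape
    if h % 8 or w % 8:
        raise ValueError("crop height and width must be multiples of 8")

    # Average each (h/8 x w/8) block so the result is an 8x8 image
    small = gray.reshape(8, h // 8, 8, w // 8).mean(axis=(1, 3))

    # The sklearn digits images use integer intensities from 0 to 16
    scaled = np.rint(small * 16.0 / 255.0)

    # Shape (1, 64): one sample, 64 features, as predict() expects
    return scaled.reshape(1, -1)
```

The result can be passed straight to `classifier.predict(...)` from the example above.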
If you haven't isolated the number in the image you'll need an object detector. The literature space on this problem is gigantic and I won't start down that rabbit hole (google Viola and Jones, maybe?). This blog covers the fundamentals of a "sliding window" detector in python. Adrian Rosebrock looks like he's even a contributor on SO, and that page has some good tutorial-style examples of opencv and python-based object detectors (you actually linked to that blog in your question, I didn't realize).
In short, classify windows across the image and pick the window of highest confidence. Narrowing down the search space with a region of interest will of course yield huge improvements in all areas of performance.
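The sliding-window idea can be sketched in a few lines. This is a minimal illustration, assuming a single scale and a scoring function you supply yourself; `mean_brightness` below is a hypothetical stand-in for a real classifier confidence, and the window size and step are arbitrary.

```python
import numpy as np

def sliding_windows(image, size, step):
    """Yield (x, y, patch) for every window position that fits in the image."""
    win_h, win_w = size
    for y in range(0, image.shape[0] - win_h + 1, step):
        for x in range(0, image.shape[1] - win_w + 1, step):
            yield x, y, image[y:y + win_h, x:x + win_w]

def best_window(image, size, step, score_fn):
    """Return (score, x, y) of the highest-scoring window, or None."""
    best = None
    for x, y, patch in sliding_windows(image, size, step):
        score = score_fn(patch)
        if best is None or score > best[0]:
            best = (score, x, y)
    return best

def mean_brightness(patch):
    # Hypothetical placeholder: a real detector would return the
    # classifier's confidence that the patch contains the number.
    return float(patch.mean())
```

A real detector would also repeat this over several image scales, since the size of the number in the frame is usually not known in advance.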
I would like to capture the number from this kind of picture.
I tried multi-scale matching from the following link.
All I want to know is the red number. But the problem is that the red number is too blurry for OpenCV to recognize with template matching. Is there another possible way to detect this red number on the black background?
You have a couple of things you can use to your advantage:
- The number is within the black rectangular bezel and one colour
- The number appears to be a segmented LCD type display, if so there are only a finite number of segments which are off or on.
So I suggest you:
- Calibrate your camera and preprocess the image to remove lens distortion
- Rectify the display rectangle:
  - Detect the display rectangle using either the intersection of hough lines, or edge detection followed by contour detection and then pick the biggest, squarest contours
  - Use GetPerspectiveTransform to get the transform between image coordinates and an ideal rectangle, then warp the input image with that transform
- Split the image into R, G and B channels and work out r - avg(g, b); this is a bit lighting dependent, but it should give an image in which the red digits stand out
- Then either try pattern matching on this, or perhaps re-segment the image and attempt to find which display segments are lit, or run through an OCR package.
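The last two steps above (the channel difference and checking which segments are lit) can be sketched as follows. This is only an illustration under several assumptions: the digit has already been rectified and cropped to a single character (for example with `cv2.getPerspectiveTransform` and `cv2.warpPerspective`), the array is in RGB order (OpenCV loads BGR, so swap the channels first), and the segment boxes in `SEGMENTS` are a made-up layout that you would have to measure on your own display.

```python
import numpy as np

# Fractional boxes (y0, y1, x0, x1) for segments a-g of one digit.
# Hypothetical layout: measure these on your own display.
SEGMENTS = {
    "a": (0.00, 0.15, 0.25, 0.75),    # top
    "b": (0.15, 0.45, 0.80, 1.00),    # top right
    "c": (0.55, 0.85, 0.80, 1.00),    # bottom right
    "d": (0.85, 1.00, 0.25, 0.75),    # bottom
    "e": (0.55, 0.85, 0.00, 0.20),    # bottom left
    "f": (0.15, 0.45, 0.00, 0.20),    # top left
    "g": (0.425, 0.575, 0.25, 0.75),  # middle
}

# Standard seven-segment encoding of the ten digits
DIGITS = {
    frozenset("abcdef"): 0, frozenset("bc"): 1, frozenset("abdeg"): 2,
    frozenset("abcdg"): 3, frozenset("bcfg"): 4, frozenset("acdfg"): 5,
    frozenset("acdefg"): 6, frozenset("abc"): 7, frozenset("abcdefg"): 8,
    frozenset("abcdfg"): 9,
}

def red_dominance(rgb):
    """Compute r - avg(g, b) per pixel, clipped at zero."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.clip(r - (g + b) / 2.0, 0.0, None)

def segment_slice(box, shape):
    """Convert a fractional box into row/column slices for an image shape."""
    h, w = shape
    y0, y1, x0, x1 = box
    rows = slice(int(round(y0 * h)), int(round(y1 * h)))
    cols = slice(int(round(x0 * w)), int(round(x1 * w)))
    return rows, cols

def decode_digit(mask):
    """Map a boolean mask of one rectified digit to 0-9, or None if unknown."""
    lit = set()
    for name, box in SEGMENTS.items():
        region = mask[segment_slice(box, mask.shape)]
        # A segment counts as lit if most of its box is set
        if region.size and region.mean() > 0.5:
            lit.add(name)
    return DIGITS.get(frozenset(lit))
```

Usage would look like `decode_digit(red_dominance(digit_rgb) > threshold)`, where the threshold is something you tune for your lighting. Because there are only seven on/off decisions per digit, this tends to tolerate blur better than matching the whole glyph, as long as the rectification is reasonably accurate.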
The first thing I would look for is color, such as red. When doing red-eye detection in an image, there is a certain color range to detect, plus some characteristics of the surrounding area, such as the distance from the other eye if it is indeed visible in the image.
1: First characteristic is color, and red is very dominant. After detecting the Coca-Cola red there are several items of interest. 1A: How big is this red area (is it large enough to decide whether it is a true can or not; 10 pixels is probably not enough)? 1B: Does it contain the color of the label, the "Coca-Cola" text or wave? 1B1: Is there enough of it to consider it highly probable that it is a label?
Item 1 is kind of a shortcut, a pre-processing step: if that does not exist in the image, move on.
So if that is the case, I can then take that segment of my image and zoom out of the area in question a little bit, basically looking at the surrounding region / edges...
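As a rough illustration of item 1 and the "zoom out" step, the check could look something like this. It is a sketch under my own assumptions: the image is an RGB array, "red" means the red channel exceeds both other channels by `margin`, and the values of `min_pixels`, `margin` and `pad` are placeholders to tune. The function name `red_region_roi` is hypothetical.

```python
import numpy as np

def red_region_roi(rgb, min_pixels=200, margin=60, pad=0.25):
    """Return a padded box (x0, y0, x1, y1) around the red pixels, or None."""
    rgb = np.asarray(rgb, dtype=int)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    # A pixel is "red" if red clearly dominates both other channels
    red = (r - np.maximum(g, b)) > margin

    # Shortcut: too little red means there is nothing to analyse further
    if red.sum() < min_pixels:
        return None

    ys, xs = np.nonzero(red)
    y0, y1 = int(ys.min()), int(ys.max()) + 1
    x0, x1 = int(xs.min()), int(xs.max()) + 1

    # Zoom out a little so the surrounding edges are included
    pad_y = int((y1 - y0) * pad)
    pad_x = int((x1 - x0) * pad)
    h, w = red.shape
    return (max(0, x0 - pad_x), max(0, y0 - pad_y),
            min(w, x1 + pad_x), min(h, y1 + pad_y))
```

The later checks (can top, cap, lettering) would then only run inside the returned box.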
2: Given the image area identified in 1, verify the surrounding points [edges] of the item in question. A: Is there what appears to be a can top or bottom (silver)? B: A bottle might appear transparent, but so might a glass table, so is there a glass table/shelf or a transparent area? If so, there are multiple possible outcomes. A bottle MIGHT have a red cap, or it might not, but it should have either the shape of the bottle top / screw threads, or a cap. C: Even if this fails A and B it can still be a partial can. This is more complex when it is partial, because a partial bottle and a partial can might look the same, so some more processing is needed, such as measuring the red region edge to edge; a small bottle might be similar in size.
3: After the above analysis, that is when I would look at the lettering and the wave logo, because I can orient my search for some of the letters in the words. As you might not have all of the text due to not having all of the can, the wave would align at certain points to the text (distance-wise), so I could search for that probability and know which letters should exist at that point of the wave at distance x.