Training Phase: Very similar to Randomized Trees. Only a few training images are required. A set of stable key-points is chosen by applying many transformations (e.g. 300) to the input image and keeping the points that are re-detected consistently. These stable key-points become the class labels. Each image is then transformed many more times (e.g. 1000) to obtain the view-set. For each class label, the classifier keeps a count of each Fern pattern (a vector of binary intensity differences over a group of pixel-pairs). The counts are used to estimate the class-conditional probabilities of each pattern.
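The counting scheme above can be sketched from scratch as a toy (this is an illustration of the idea only, not the OpenCV FernClassifier API; the patch size, fern size, and noise model are all made-up values):

```python
import math
import random

# Toy sketch of Fern training/classification (NOT the OpenCV API).
# A fern is S random pixel-pair tests; a patch maps to an S-bit pattern.
# Training counts each pattern per class over many perturbed patch views;
# classification sums smoothed log-probabilities over all ferns.

random.seed(0)
PATCH = 64      # pixels per (flattened) patch, toy size
S = 6           # fern size: binary tests per fern
K = 5           # number of ferns
CLASSES = 2     # number of stable key-points, i.e. class labels

# Each binary test compares the intensities at two random pixel positions.
ferns = [[(random.randrange(PATCH), random.randrange(PATCH)) for _ in range(S)]
         for _ in range(K)]

def pattern(patch, tests):
    """Encode one patch as an S-bit integer for one fern."""
    bits = 0
    for a, b in tests:
        bits = (bits << 1) | (1 if patch[a] > patch[b] else 0)
    return bits

# counts[k][c][p]: times fern k saw pattern p for class c (init 1 = smoothing)
counts = [[[1] * (1 << S) for _ in range(CLASSES)] for _ in range(K)]

def train(patch, label):
    for k in range(K):
        counts[k][label][pattern(patch, ferns[k])] += 1

def classify(patch):
    """Semi-naive Bayes: sum per-fern log-probabilities, pick the best class."""
    scores = []
    for c in range(CLASSES):
        s = 0.0
        for k in range(K):
            s += math.log(counts[k][c][pattern(patch, ferns[k])] / sum(counts[k][c]))
        scores.append(s)
    return scores.index(max(scores))

# "View set": noisy copies of two synthetic patches stand in for warped views.
base = [[random.randrange(256) for _ in range(PATCH)] for _ in range(CLASSES)]
for label, b in enumerate(base):
    for _ in range(300):
        view = [min(255, max(0, v + random.randint(-10, 10))) for v in b]
        train(view, label)

print(classify(base[0]), classify(base[1]))
```

In the real method the perturbed views come from random affine warps of the keypoint patch, not additive noise, but the bookkeeping is the same.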
The training and testing for 2D matching are done on a video frame sequence. The frame with the object upright and front-facing is chosen for training.
An implementation decision has to be made on how to divide the input vector into groups, i.e. the Fern size. Increasing the Fern size yields better handling of 'variations' (perspective and lighting variants, presumably?). Care must be taken with memory usage: the space required to store the distributions grows exponentially with Fern size, and more training samples are needed (to populate the distributions over a much bigger set of possible patterns?). On the other hand, increasing the number of Ferns while keeping the Fern size fixed also improves the recognition rate, with only a linear memory increase, but at a higher run-time cost.
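The exponential-versus-linear trade-off can be made concrete with a little arithmetic (assuming the simple model of one table entry per fern, per class, per possible pattern; the example numbers are made up):

```python
# Memory model: K ferns x C classes x 2^S patterns, one entry each.
def table_entries(num_ferns, fern_size, num_classes):
    return num_ferns * num_classes * (2 ** fern_size)

C = 200  # e.g. 200 stable key-points (classes)

# Growing the fern size S doubles the table with every extra test...
for S in (8, 10, 12, 14):
    print(f"S={S:2d}, 30 ferns: {table_entries(30, S, C):,} entries")

# ...while adding ferns at a fixed size grows it only linearly.
for K in (30, 60, 90):
    print(f"K={K}, size 10:  {table_entries(K, 10, C):,} entries")
```

So going from S=10 to S=14 costs 16x the memory, whereas tripling the number of ferns costs only 3x.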
There is a paper on a mobile AR application using Ferns - Citation 34, "Pose Tracking from Natural Features on Mobile Phones", Wagner et al.
This demo uses the LDetector class to detect object keypoints, and the PlanarObjectDetector class to do the matching. FernClassifier is one of PlanarObjectDetector's members.
- Determine the most stable key-points from the object image (by recovering the key-points from affine-transformed object images).
- Build a 3-level image pyramid for the object image.
- Train the stable key-points with FernClassifier and save the result to a file. The image pyramid is also supplied for training. Parameters include the Fern size, number of Ferns, patch size, and patch generator.
- Load the PlanarObjectDetector from the file obtained in the last step.
- Use the LDetector to find keypoints in the scene image pyramid. Match them against the object key-points using the PlanarObjectDetector. The results are represented as index-pairs between the model-keypoints and the scene-keypoints. The model-keypoints are the stable keypoints of the object image; the list is available from the loaded PlanarObjectDetector instance.
- Draw the correspondences on screen.
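The matching step in the list above can be sketched as follows (a toy illustration of the index-pair output; the `classify` callback and confidence threshold are hypothetical stand-ins, not the actual PlanarObjectDetector interface):

```python
# Hypothetical sketch: classify each scene keypoint patch against the model
# classes and emit (model_keypoint_index, scene_keypoint_index) pairs,
# mirroring the index-pair correspondences described above.
def match_keypoints(scene_patches, classify, min_confidence=0.5):
    """classify(patch) -> (best_model_class_index, confidence in [0, 1])."""
    pairs = []
    for scene_idx, patch in enumerate(scene_patches):
        model_idx, conf = classify(patch)
        if conf >= min_confidence:          # drop weak, ambiguous matches
            pairs.append((model_idx, scene_idx))
    return pairs

# Toy stand-in classifier: a "patch" here is just a label plus a confidence.
toy_scene = [("a", 0.9), ("b", 0.3), ("a", 0.7)]
pairs = match_keypoints(toy_scene, lambda p: ({"a": 0, "b": 1}[p[0]], p[1]))
print(pairs)  # -> [(0, 0), (0, 2)]
```

The surviving pairs are what get drawn on screen as correspondence lines.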
The simplest way to exercise Ferns matching is to use the FernDescriptorMatcher class. The demo program is very straightforward; the find_obj_ferns demo app is more informative.
Results and Observations
Using ball.pgm (the book is pictured sideways) from the Dundee test set as the training image.
In most cases, it is able to find and locate the object correctly in the scene images where it appears. The worst result is TestImg010.jpg: it cannot locate the upside-down book. I suppose that is because of the lack of detected keypoints; the book title "Rivera" is obscured.
Tested for false-positives using TestImg02.pgm. The detector returned a status of 'found', but the result was obviously wrong: half of it was out of the picture.
Fast Keypoint Recognition using Random Ferns, Ozuysal et al.