|
The images below show a few frames of tracking results from each
of three test runs with my Camshift implementation.
Run 1: moving slowly toward and away from the camera.
The Camshift algorithm includes an adaptive resizing
step. It assumes a fixed ratio between width
and height and scales each dimension in proportion
to the square root of the skin-probability area inside
its search region. This resizing worked fairly well,
as the frames from Run 1 show, although Camshift sized the
region in the lower center frame of this run a bit too small.
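The resizing rule described above can be sketched roughly as follows. This is a minimal illustration, not the code used for these runs: the function name `resize_window` and the values of `aspect`, `scale`, and `max_prob` are assumptions chosen for the example.

```python
import math

def resize_window(m00, aspect=1.2, scale=2.0, max_prob=255.0):
    """Return (width, height) for the next search window.

    m00      : sum of skin-probability values inside the current window
    aspect   : assumed fixed height/width ratio (illustrative value)
    scale    : assumed proportionality constant (illustrative value)
    max_prob : largest value a probability pixel can take
    """
    # Dividing by max_prob turns the probability sum into an
    # effective area measured in pixels.
    area = m00 / max_prob
    # Each dimension grows with the square root of that area.
    side = scale * math.sqrt(area)
    width = max(1, round(side))
    height = max(1, round(side * aspect))
    return width, height
```

Because each dimension follows the square root of the area, quadrupling the skin-probability mass only doubles the window's width and height.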
Run 2: swaying side to side.
In Run 2,
I deliberately sized and positioned the initial region
poorly. Within 5 frames, the tracker adapted automatically
to capture the face region well (second image).
The left and right frames in the bottom row of this run
illustrate a limitation of a naive Mean Shift tracker: it
always converges to the nearest mode. It
doesn't look beyond that mode's location to see whether there's a
better match nearby.
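The nearest-mode behavior can be reproduced on a synthetic probability map. The sketch below is a simplified fixed-window Mean Shift, not my tracker; the `mean_shift` helper, the map size, and the two blobs are made up for illustration.

```python
import numpy as np

def mean_shift(prob, center, half, max_iter=20):
    """Shift a fixed-size window to the centroid of prob until it stops moving."""
    cy, cx = center
    h, w = prob.shape
    for _ in range(max_iter):
        # Clip the window to the image bounds.
        y0, y1 = max(cy - half, 0), min(cy + half + 1, h)
        x0, x1 = max(cx - half, 0), min(cx + half + 1, w)
        patch = prob[y0:y1, x0:x1]
        m00 = patch.sum()
        if m00 == 0:
            break  # nothing to track inside the window
        ys, xs = np.mgrid[y0:y1, x0:x1]
        ny = int(round((ys * patch).sum() / m00))
        nx = int(round((xs * patch).sum() / m00))
        if (ny, nx) == (cy, cx):
            break  # converged
        cy, cx = ny, nx
    return cy, cx

# Synthetic map: a weak blob on the left, a much stronger one on the right.
prob = np.zeros((40, 80))
prob[18:23, 10:15] = 0.3   # weak 5x5 blob centered at (20, 12)
prob[15:26, 55:66] = 1.0   # strong 11x11 blob centered at (20, 60)

# Starting next to the weak blob, the window settles on it.
result = mean_shift(prob, center=(20, 16), half=6)
```

The window never sees the stronger blob because it lies outside the search region, so the tracker stays on the weaker mode.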
Run 3: approaching from a distance.
In Run 3, I tried to test resizing over a wider range of distances,
but after about 50 frames, the tracker started sliding away
toward a region of warm lighting in the background. Interestingly,
if I run this video sequence in reverse, the tracker does very
well. I believe that's because the search region for each
frame is larger when the face image recedes than when it approaches.
When the search region is too small, the tracker drifts away more easily.
Bradski mentions this issue in his article on the algorithm.
|