Mean Shift Segmentation in OpenCV

I’ve posted a new repository on GitHub for doing mean shift segmentation in C++ using OpenCV: see here.

OpenCV contains a mean shift filtering function, cv::pyrMeanShiftFiltering(…), and has a GPU (I think CUDA) implementation of mean shift segmentation. I didn’t evaluate the GPU implementation because I’m not interested in GPU code for the project I am working on. I did look at turning cv::pyrMeanShiftFiltering(…) output into a segmentation but didn’t pursue it because pyrMeanShiftFiltering seems broken to me. This is my gut instinct and I can’t quantify it, but basically I agree with this guy: the output just doesn’t seem as good as the output generated by codebases elsewhere online.

I have no idea why. One interesting possibility is that OpenCV is doing mean shift on RGB rather than in one of the color spaces that are supposed to be better at modeling human vision. Everybody always says that anything that treats colors as points in Euclidean space should use L*a*b* or L*u*v* rather than RGB but, to be honest, in practice it never seems to matter to me. Maybe this is an example of where it does. I don’t know, but in any case cv::pyrMeanShiftFiltering, in my opinion, sucks.

The “elsewhere online” I mention above is the codebase of EDISON, “Edge Detection and Image SegmentatiON”, made freely available by Rutgers University’s “Robust Image Understanding Laboratory”. EDISON is a command line tool that parses a script specifying a sequence of computer vision operations. I wasn’t really interested in any of it except the mean shift segmentation, whose output seems really good to me. What I have done is extract the mean shift code, which was C, wrap it thinly in C++, and port it to use OpenCV types, e.g. cv::Mat, and OpenCV operations where possible. I also refactored for concision and removed C-isms where possible, e.g. I replaced naked memory allocations with std::vectors and so forth.
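As a rough illustration of the kind of C-ism removal I mean, here is a before-and-after sketch. It is not code from EDISON or from the repository: `makeLabelsRaw` and `makeLabels` are hypothetical names, and the "label buffer" is just a stand-in for any per-pixel scratch array.

```cpp
#include <cstddef>
#include <vector>

// C-style: a naked allocation. The caller owns the buffer and has to
// remember to delete[] it on every code path.
int* makeLabelsRaw(std::size_t width, std::size_t height) {
    int* labels = new int[width * height];
    for (std::size_t i = 0; i < width * height; ++i) {
        labels[i] = -1;  // -1 marks "not yet assigned to a region"
    }
    return labels;
}

// Refactored: the vector owns its memory, knows its own size, and is
// released automatically when it goes out of scope.
std::vector<int> makeLabels(std::size_t width, std::size_t height) {
    return std::vector<int>(width * height, -1);
}
```

The second version is shorter and cannot leak, which is most of what I was after with this kind of change.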

The most significant change to come out of this refactoring, in terms of functionality and/or performance, was replacing the EDISON codebase’s L*u*v*-to-RGB and RGB-to-L*u*v* conversion routines with OpenCV calls. This actually changes the output of this code relative to EDISON because OpenCV and EDISON give different L*u*v* values for the same image. I’m not sure who is right or what the difference means, but OpenCV is an industry standard so I am erring on the side of OpenCV. Further, the segmentation this code outputs is, in my opinion, better than what results from EDISON’s L*u*v* routines, while performance is unchanged.
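For reference, here is a sketch of an RGB-to-L*u*v* conversion using the textbook CIE formulas, assuming sRGB primaries and a D65 white point. It reproduces neither OpenCV's nor EDISON's code. The `decodeGamma` flag is a hypothetical knob I added to show how one convention choice shifts the numbers; I am not claiming it is the actual source of the difference between the two libraries.

```cpp
#include <cmath>

struct Luv {
    double L, u, v;
};

// Undo the sRGB transfer curve for one channel in [0, 1].
double srgbToLinear(double c) {
    return c <= 0.04045 ? c / 12.92 : std::pow((c + 0.055) / 1.055, 2.4);
}

// Convert 8-bit RGB to CIE L*u*v* (assumed: sRGB primaries, D65 white).
// decodeGamma = false skips linearization, purely to illustrate how a
// different convention produces different values for the same pixel.
Luv rgbToLuv(int r8, int g8, int b8, bool decodeGamma = true) {
    double r = r8 / 255.0, g = g8 / 255.0, b = b8 / 255.0;
    if (decodeGamma) {
        r = srgbToLinear(r);
        g = srgbToLinear(g);
        b = srgbToLinear(b);
    }
    // Linear RGB -> CIE XYZ.
    const double X = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b;
    const double Y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b;
    const double Z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b;
    // Reference white is the matrix applied to RGB = (1, 1, 1).
    const double Xn = 0.4124564 + 0.3575761 + 0.1804375;
    const double Yn = 0.2126729 + 0.7151522 + 0.0721750;
    const double Zn = 0.0193339 + 0.1191920 + 0.9503041;
    const double denomN = Xn + 15.0 * Yn + 3.0 * Zn;
    const double un = 4.0 * Xn / denomN;
    const double vn = 9.0 * Yn / denomN;
    // Lightness: cube root above the CIE threshold, linear below it.
    const double yr = Y / Yn;
    const double eps = 216.0 / 24389.0;
    const double kappa = 24389.0 / 27.0;
    const double L = yr > eps ? 116.0 * std::cbrt(yr) - 16.0 : kappa * yr;
    const double denom = X + 15.0 * Y + 3.0 * Z;
    if (denom <= 0.0) return {L, 0.0, 0.0};  // pure black: chroma undefined
    const double up = 4.0 * X / denom;
    const double vp = 9.0 * Y / denom;
    return {L, 13.0 * L * (up - un), 13.0 * L * (vp - vn)};
}
```

With these formulas a mid gray of (128, 128, 128) comes out at an L* of about 53.6 when the gamma curve is decoded and about 76.2 when it is not. Choices like that, or a different white point or rounding scheme, are enough to make two correct-looking implementations disagree.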

Below is some output: