Archive for the Category image processing


Mean Shift Segmentation in OpenCV

I’ve posted a new repository on GitHub for doing mean shift segmentation in C++ using OpenCV: see here.

OpenCV contains a mean shift filtering function, cv::pyrMeanShiftFiltering, and also has a GPU (CUDA, I think) implementation of mean shift segmentation. I didn’t evaluate the GPU implementation because I’m not interested in GPU for the project I am working on. I did look at turning cv::pyrMeanShiftFiltering(…) output into a segmentation but didn’t bother trying because pyrMeanShiftFiltering seems broken to me. This is gut instinct; I can’t quantify it, but basically I agree with this guy. The output just doesn’t seem as good as the output generated by codebases elsewhere online. I have no idea why. One interesting possibility is that OpenCV is doing mean shift on RGB rather than in one of the color spaces that are supposed to be better at modeling human vision. Everybody always says that anything that treats colors as points in Euclidean space should use L*a*b* or L*u*v* rather than RGB, but in practice, to be honest, it never seems to matter to me. Maybe this is an example of where it does. I don’t know, but in any case cv::pyrMeanShiftFiltering, in my opinion, sucks.

The “elsewhere online” I mention above is the codebase of EDISON, “Edge Detection and Image SegmentatiON”, made freely available by Rutgers University’s “Robust Image Understanding Laboratory”. EDISON is a command line tool that parses a script specifying a sequence of computer vision operations. I wasn’t really interested in most of it, only in the part that does mean shift segmentation, because its mean shift output seems really good to me. What I have done is extract the mean shift code, which was C, wrap it thinly in C++, and port it to use OpenCV types, e.g. cv::Mat, and OpenCV operations where possible. I also refactored for concision and removed C-isms where possible, e.g. I replaced naked memory allocations with std::vectors and so forth.

The most significant change to come out of this refactoring, in terms of functionality and/or performance, was replacing the EDISON codebase’s L*u*v*-to-RGB and RGB-to-L*u*v* conversion routines with OpenCV calls. This actually changes the output of this code relative to EDISON because OpenCV and EDISON give different L*u*v* values for the same image. I’m not sure who is right or what the difference means, but OpenCV is an industry standard so I am erring on the side of OpenCV. Further, the segmentation this code outputs is, in my opinion, better than what results from EDISON’s L*u*v* routines, while performance is unchanged.
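To make the “different L*u*v* values for the same image” point concrete, here is a minimal, OpenCV-free sketch of the textbook RGB-to-L*u*v* formula. Everything in it is an assumption for illustration: the name RgbToLuv, the sRGB/D65 matrix, and the linearize switch are hypothetical and not taken from EDISON or from OpenCV’s source, and nothing here has been checked against either codebase. It only shows that one implementation choice, such as whether the sRGB gamma curve is undone before the conversion, is enough to move the numbers.

```cpp
#include <array>
#include <cmath>

struct Luv { double L, u, v; };

// Undo the sRGB transfer curve for one channel in [0, 1].
static double SrgbToLinear(double c) {
    return (c <= 0.04045) ? c / 12.92 : std::pow((c + 0.055) / 1.055, 2.4);
}

// RGB -> CIE XYZ with the sRGB primaries and a D65 white point (assumed).
static std::array<double, 3> RgbToXyz(double r, double g, double b) {
    return {
        0.412453 * r + 0.357580 * g + 0.180423 * b,
        0.212671 * r + 0.715160 * g + 0.072169 * b,
        0.019334 * r + 0.119193 * g + 0.950227 * b
    };
}

// Textbook CIE 1976 L*u*v* for one 8-bit pixel.
// linearize = false skips the gamma step, which is one way two
// implementations could disagree (an assumption, not a diagnosis).
Luv RgbToLuv(unsigned char r8, unsigned char g8, unsigned char b8, bool linearize = true) {
    double r = r8 / 255.0, g = g8 / 255.0, b = b8 / 255.0;
    if (linearize) {
        r = SrgbToLinear(r);
        g = SrgbToLinear(g);
        b = SrgbToLinear(b);
    }
    const auto xyz = RgbToXyz(r, g, b);
    const auto white = RgbToXyz(1.0, 1.0, 1.0);

    const double denom = xyz[0] + 15.0 * xyz[1] + 3.0 * xyz[2];
    if (denom == 0.0)
        return {0.0, 0.0, 0.0};  // pure black: chromaticity is undefined

    const double white_denom = white[0] + 15.0 * white[1] + 3.0 * white[2];
    const double u_prime = 4.0 * xyz[0] / denom;
    const double v_prime = 9.0 * xyz[1] / denom;
    const double un = 4.0 * white[0] / white_denom;
    const double vn = 9.0 * white[1] / white_denom;

    const double y = xyz[1] / white[1];
    const double L = (y > 0.008856) ? 116.0 * std::cbrt(y) - 16.0 : 903.3 * y;
    return {L, 13.0 * L * (u_prime - un), 13.0 * L * (v_prime - vn)};
}
```

With the gamma step, a mid-gray of (128, 128, 128) lands at an L* of roughly 53.6; without it the same pixel comes out around 76. Differences in white point or rounding would shift things further.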

Below is some output:

Floodfilling in OpenCV with multiple seeds

One irritating thing about OpenCV is that as a computer vision library it doesn’t actually offer a lot of routines for dealing with connected components easily and efficiently.

There’s cv::findContours and two versions of cv::connectedComponents: the regular one and one “WithStats”. The trouble is that findContours returns polygons when what you often want is raster blob masks. connectedComponents returns a label image, but OpenCV doesn’t offer many routines for doing anything with label images, and connectedComponentsWithStats is pretty limited in what it will give you. For example, there is no option to get back a pixel location contained in each connected component. The other issue is that even if you have a pixel location inside each connected component of interest, there is no version of floodFill that takes more than one seed. I really think this kind of floodFill function is something that should be added to OpenCV.
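Getting one pixel per component out of a label image is a single scan. A minimal sketch, assuming the label image has been flattened into a row-major std::vector<int> (a stand-in for the CV_32S Mat that connectedComponents produces) with labels running from 0 to num_labels - 1; the helper name FirstPixelPerLabel is hypothetical:

```cpp
#include <utility>
#include <vector>

// Returns, for each label, the (x, y) of the first pixel carrying that label
// in raster order, or (-1, -1) if the label never occurs.
// labels: row-major, rows * cols entries, values expected in [0, num_labels).
std::vector<std::pair<int, int>> FirstPixelPerLabel(const std::vector<int>& labels,
                                                    int rows, int cols, int num_labels) {
    std::vector<std::pair<int, int>> seeds(num_labels, std::pair<int, int>(-1, -1));
    int remaining = num_labels;
    for (int y = 0; y < rows && remaining > 0; y++) {
        for (int x = 0; x < cols; x++) {
            const int lbl = labels[y * cols + x];
            // Record only the first hit for each in-range label.
            if (lbl >= 0 && lbl < num_labels && seeds[lbl].first < 0) {
                seeds[lbl] = std::make_pair(x, y);
                remaining--;
            }
        }
    }
    return seeds;
}
```

The outer loop stops as soon as every label has a representative, so on images with a few large blobs it usually touches only a handful of rows.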

The following assumes single channel input and returns the results of the fills as a separate Mat rather than by modifying the input, but it could easily be extended to be polymorphic and support all the different variations that regular floodFill supports. Basically if we view the input as monochrome blobs, what it is doing is returning the union of all the connected components in the source bitmap that have a non-null intersection with the seed bitmap:

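#include <opencv2/imgproc.hpp>
using namespace cv;

Mat FloodFillFromSeedMask(const Mat& image, const Mat& seeds, uchar src_val = 255, uchar target_val = 255, uchar connectivity = 4)
{
	auto sz = image.size();
	// floodFill wants a mask 2 pixels wider and taller than the image; the non-zero border stops the fill at the edges.
	Mat output;
	copyMakeBorder(Mat::zeros(sz, CV_8U), output, 1, 1, 1, 1, BORDER_CONSTANT, target_val);
	for (int y = 0; y < seeds.rows; y++) {
		const uchar* img_ptr = image.ptr<uchar>(y);
		const uchar* seeds_ptr = seeds.ptr<uchar>(y);
		uchar* output_ptr = output.ptr<uchar>(y + 1) + 1;
		// Advance all three row pointers together with x.
		for (int x = 0; x < seeds.cols; x++, img_ptr++, seeds_ptr++, output_ptr++) {
			// Fill only from seed pixels that sit on the source value and that no earlier fill has reached.
			if (*img_ptr == src_val && *seeds_ptr > 0 && *output_ptr != target_val)
				floodFill(image, output, Point(x, y), target_val, nullptr, 0, 0, connectivity | (target_val << 8) | FLOODFILL_MASK_ONLY);
		}
	}
	return Mat(output, Rect(Point(1, 1), sz));
}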

And, yes, walking raw row pointers instead of calling cv::Mat::at(y,x) is noticeably faster, and this function is substantially faster than calling findContours, iterating over the polygons returned, painting them, and testing for an intersection with the seed mask. It would be nice to get rid of that call to copyMakeBorder() but there doesn’t seem to be a way to create a bordered Mat directly. I didn’t feel like writing a function like that and then testing whether mine is faster than the above…
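For checking results, the “union of all connected components that touch the seed bitmap” behaviour can also be written without OpenCV as one multi-source breadth-first search. This is a minimal reference sketch, not a replacement for the function above: the Bitmap alias and the name FloodFillFromSeeds are hypothetical, images are assumed to be row-major byte vectors of equal size, and target_val is assumed to be non-zero.

```cpp
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

using Bitmap = std::vector<std::uint8_t>;  // row-major, rows * cols bytes

// Marks with target_val every pixel of `image` that equals src_val and is
// connected (4- or 8-way) to at least one such pixel where `seeds` is non-zero.
// target_val must be non-zero because 0 doubles as "not filled yet".
Bitmap FloodFillFromSeeds(const Bitmap& image, const Bitmap& seeds, int rows, int cols,
                          std::uint8_t src_val = 255, std::uint8_t target_val = 255,
                          int connectivity = 4) {
    Bitmap output(image.size(), 0);
    std::queue<std::pair<int, int>> frontier;

    // Start the search from every valid seed at once.
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            const int idx = y * cols + x;
            if (image[idx] == src_val && seeds[idx] > 0) {
                output[idx] = target_val;
                frontier.push(std::make_pair(x, y));
            }
        }
    }

    // First four offsets are the 4-neighbourhood; the rest are the diagonals.
    static const int dx[8] = {1, -1, 0, 0, 1, 1, -1, -1};
    static const int dy[8] = {0, 0, 1, -1, 1, -1, 1, -1};
    const int neighbours = (connectivity == 8) ? 8 : 4;

    while (!frontier.empty()) {
        const std::pair<int, int> p = frontier.front();
        frontier.pop();
        for (int i = 0; i < neighbours; i++) {
            const int nx = p.first + dx[i];
            const int ny = p.second + dy[i];
            if (nx < 0 || ny < 0 || nx >= cols || ny >= rows)
                continue;
            const int idx = ny * cols + nx;
            if (image[idx] == src_val && output[idx] != target_val) {
                output[idx] = target_val;
                frontier.push(std::make_pair(nx, ny));
            }
        }
    }
    return output;
}
```

Each pixel enters the queue at most once, so the cost is linear in the image size no matter how many seeds there are.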

How to Convert a GDI+ Image to an OpenCV Matrix in C++…

You have to convert the GDI+ image to a GDI+ bitmap and then to an OpenCV matrix. As far as I know, there is no easier way to do the first conversion than creating a bitmap and painting the image into it.

To perform the second conversion (Gdiplus::Bitmap -> cv::Mat), note that Mat constructors order their parameters rows then columns so that is Bitmap height then width. The actual memory layout is row major, however, so we can just copy the data, but there is no need to do the actual copy yourself. You can use one of the Mat constructors that will wrap existing data without copying and then force it to copy by calling the clone member function.
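The reason a wrap-then-clone is needed rather than one flat memcpy is that locked bitmap rows are usually padded: the stride (bytes from the start of one row to the start of the next) can be larger than width times bytes per pixel. A minimal sketch of the row-by-row copy that produces a tightly packed buffer, which is the effect the clone has on a Mat wrapped with an explicit step; the helper name CopyRowsTight is hypothetical and the sketch uses plain byte buffers rather than GDI+ or OpenCV types:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copies `rows` rows of `cols` pixels from a padded buffer into a tightly
// packed one. `stride` is the byte distance between row starts in the source;
// it is signed so that a bottom-up source (negative stride) also works.
std::vector<unsigned char> CopyRowsTight(const unsigned char* scan0, int rows, int cols,
                                         int bytes_per_pixel, std::ptrdiff_t stride) {
    const std::size_t row_bytes = static_cast<std::size_t>(cols) * bytes_per_pixel;
    std::vector<unsigned char> tight(row_bytes * rows);
    if (row_bytes == 0)
        return tight;  // nothing to copy
    for (int y = 0; y < rows; y++) {
        // Only the first row_bytes of each source row are pixels; the rest is padding.
        std::memcpy(tight.data() + y * row_bytes, scan0 + y * stride, row_bytes);
    }
    return tight;
}
```

For example, a 1-pixel-wide 24-bit image has 3 bytes of pixel data per row but a 4-byte stride, so a flat copy would smear the padding byte into the second row.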

The other trouble here, however, is that GDI+ supports, in theory at least, loads of exotic pixel formats, so handling them all would be a chore. The following handles the basic case (24-bit RGB):

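#include <windows.h>
#include <gdiplus.h>
#include <memory>
#include <opencv2/core.hpp>

// Paints img into a new bitmap of the same size and pixel format. The caller owns the returned pointer.
Gdiplus::Bitmap* GdiplusImageToBitmap(Gdiplus::Image* img, Gdiplus::Color bkgd = Gdiplus::Color::Transparent)
{
	int wd = img->GetWidth();
	int hgt = img->GetHeight();
	auto format = img->GetPixelFormat();
	Gdiplus::Bitmap* bmp = new Gdiplus::Bitmap(wd, hgt, format);

	if (bmp == nullptr)
		return nullptr; // this might happen if format is something exotic, not sure.

	auto g = std::unique_ptr<Gdiplus::Graphics>(Gdiplus::Graphics::FromImage(bmp));
	g->Clear(bkgd); // bkgd was otherwise unused; fill with it before painting
	g->DrawImage(img, 0, 0, wd, hgt);

	return bmp;
}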

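cv::Mat GdiPlusBitmapToOpenCvMat(Gdiplus::Bitmap* bmp)
{
	auto format = bmp->GetPixelFormat();
	if (format != PixelFormat24bppRGB)
		return cv::Mat();

	int wd = bmp->GetWidth();
	int hgt = bmp->GetHeight();
	Gdiplus::Rect rcLock(0, 0, wd, hgt);
	Gdiplus::BitmapData bmpData;

	if (bmp->LockBits(&rcLock, Gdiplus::ImageLockModeRead, format, &bmpData) != Gdiplus::Ok)
		return cv::Mat();

	// Wrap the locked pixels without copying (rows first, then columns, with the
	// bitmap's stride as the step), then clone so the Mat owns its own data.
	// 24bppRGB is laid out blue, green, red in memory, matching OpenCV's usual channel order.
	cv::Mat mat = cv::Mat(hgt, wd, CV_8UC3, static_cast<unsigned char*>(bmpData.Scan0), bmpData.Stride).clone();

	// Release the lock now that the pixels have been copied.
	bmp->UnlockBits(&bmpData);

	return mat;
}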

cv::Mat GdiplusImageToOpenCvMat(Gdiplus::Image* img)
{
	auto bmp = std::unique_ptr<Gdiplus::Bitmap>(GdiplusImageToBitmap(img));
	return (bmp != nullptr) ? GdiPlusBitmapToOpenCvMat(bmp.get()) : cv::Mat();
}