# Quintics Redux

A couple of years ago I wrote a blog post about finding the closest point on a cubic bezier to a given point in which I showed how to reduce the problem to solving a quintic equation and provided an implementation in C# that finds the roots of the quintic using Laguerre’s Method. I also mentioned at the end of that post that there is a fancier algorithm for solving specifically quintics described in Doyle and McMullen’s “Solving the Quintic by Iteration” [1] and speculate that it might be the fastest way to find the closest point to a bezier.

Recently I spent the time to actually implement the Doyle and McMullen algorithm in C++ as a numeric algorithm. I don’t know of any other implementations. My code is here.

It is an interesting algorithm. Its existence implies that although there is no general formula, akin to the quadratic formula, that you can plug into to get the roots of a 5th degree equation, and in fact although the roots of most 5th degree equations cannot even be represented as expressions composed of ordinary arithmetic operations plus radicals — you can represent numbers with radical expressions that are arbitrarily close to these roots. Each of these the roots can be defined exactly in terms of the limit a recursive function converges on as the depth of recursion approaches infinity; each step along the way approximates with increasing accuracy the root of an unsolvable quintic expressed using arithmetic and radicals.

Above I specify my implementation is “a numeric algorithm”, I mean as opposed to an implementation using a symbolic math package; I use ordinary floating point numbers. This is an important point: [1] is from the wilds of number theory, not from a CS text book, and it thus defines an algorithm over real numbers which of course have arbitrary precision. It was an open question to me whether the algorithm has value as a numeric algorithm. The issue I saw is that it is a solution to quintics in the single parameter Brioschi form. The Brioschi form, by magic, collapses every general quintic equation defined by six complex coefficients into a single complex number; it is impossible for such a reduction to not be limited by finite precision. That is, to some extent the finite precision of floating point numbers must determine to which general quintics a numeric implementation of [1] will return sensible results.

I will leave an analysis of this question to someone who is an actual mathematician or computer scientist but empirically the Doyle and McMullen algorithm does seem to me to have merit as a numeric algorithm at double precision. On randomly generated general quintics with coefficients uniformly distributed between -1000 and 1000, my implementation of solving the quintic by iteration returns results that look to me to be about as good or better than the implementation of Laguerre’s Method from Data Structures and Algorithms in C++ but is about 500 times faster. Both algorithms perform better in terms of speed and correctness when the coefficients are lower. When they are between -10 and 10 my implementation is more accurate and about 100 times faster than the Data Structures and Algorithms code.

My implementation works as follows:

1. Given a general quintic, convert to principal form. I do this conversion as explained here. I found the resultant and solved for the c₁ and c₂ being canceled out using SageMath.
2. Convert the principal quintic to Brioschi form. I followed the explanation here.
3. Find two solutions to the Brioschi form quintic via iteration as described in [1] pages 32 to 33. The only difficulty here was that I needed to find the derivative to the function g(Z,w). I did this symbolically again via SageMath; however, a speed optimization would be to replace use of an explicit C++ function for g’ and instead evaluate g(Z,w) and g'(Z,w) simultaneously as described in the chapter in Numerical Recipes: The Art of Scientific Computing on polynomials. Also just as a note to anyone else who may want to write an implementation of this algorithm, the wikipedia article here has a mistake in the definition of h(Z,w). Use the original paper. (Or better yet cut-and-paste from Peter Doyle’s macsyma output and overload C++ such that the caret operator is exponentiation, which is what I did)
4. Convert the two roots back to the general quintic.
5. Test both roots. If the error is less than threshold k pass along both roots. If one is less than k pass along the one good root. If both roots yield errors greater than k perform n iterations of Halley’s Method and retest for one or two good roots. If neither root has error less than k pass both along anyway.
6. If you have two good roots v₁ and v₂, perform synthetic division by (zv₁ )(zv₂ ) yielding a cubic and solve the cubic via radicals. If you have only one good root divide the quintic by (z-v) and solve the resulting quartic. I’m using the cubic solving procedure as described in Numerical Recipes and the quartic formula as described here.

My implementation is a header-only C++17 library (“quintic.hpp” in the github repo i linked to above) parametrized on the specific floating point type you want to use. Single precision is not good enough for this algorithm. Double precision works. I didn’t test on long doubles because Visual Studio does not support them.

# Basic signals & slots in C++

I put up some code on github that implements basic signals and slots functionality using standards compliant C++. It’s one header file “signals.hpp”. See here: link to github

This implementation of signals is a re-work of some code by a user _pi on the forums for the cross-platform application framework Juce — that thread is here — which was itself a re-work of some example code posted by Aardvajk in the GameDev.net forums (here). I just put it all together, cleaned things up, fixed some bugs, and added lambda support.

The basic idea is that given variadic templates being added to C++ it became possible for a more straight-forward implementation of signal and slot functionality beyond what was done for boost::signals et. al. — that is what Aardvajk’s original article was about. _pi changed Aardvajk’s code to make it such that slots are a thing you inherit from, which makes more sense to me. I changed _pi’s code so that it uses std::functions to store the handlers thus allowing lambdas with captures to be attached to a signal.

Usage is like the following:

```#include "signals.hpp"
#include <iostream>

// a handler is a kind of slot, that, as an implementation detail, requires usage of the
// "curiously recurring template pattern". That is, the intention is for instances of
// a class C that will react to a signal firing to have an is-a relationship with a slot
// parametrized on class C itself.

class CharacterHandler : public Slot<CharacterHandler>
{
public:
void HandleCharacter(char c)
{
std::cout << "The user entered '" << c << "'" << std::endl;
}
};

class DigitHandler : public Slot<DigitHandler>
{
public:
void HandleCharacter(char c)
{
if (c >= '0' && c <= '9') {
int n = static_cast<int>(c - '0');
std::cout << "  " << n << " * " << n << " = " << n*n << std::endl;
}
}
};

int main()
{
bool done = false;

char c;
Signal<char> signal;
CharacterHandler character_handler;
DigitHandler digit_handler;

// can attach a signal to a slot with matching arguments
signal.connect(character_handler, &CharacterHandler::HandleCharacter);

// can also attach a lambda, associated with a slot.
// (the lambda could capture the slot and use it like anything else it captures
// however technically all the associated slot is doing is allowing you to have
// a way of disconnecting the lambda e.g. in this case signal.disconnect(characterHandler)
signal.connect(character_handler,
[&](char c) -> void {
if (c == 'q')
done = true;
}
);

// can also attach a slot to a signal ... this means the same as the above.
digit_handler.connect(signal, &DigitHandler::HandleCharacter);

do
{
std::cin >> c;
signal.fire(c);

} while (! done);

// can disconnect like this
character_handler.disconnect(signal);

// or this
signal.disconnect(digit_handler);

//although disconnecting wasnt necessary here in that just gaving everything go out of scope
// wouldve done the right thing.

return 0;
}
```

# Mean Shift Segmentation in OpenCV

I’ve posted a new repository on GitHub for doing mean shift segmentation in C++ using OpenCV: see here.

OpenCV contains a mean shift filtering function and has a GPU, I think CUDA, implementation of mean shift segmentation. I didn’t evaluate the GPU implementation because I’m personally not interested in GPU for the project I am working on. I did take a look at turning cv::pyrMeanShiftFiltering(…) output into a segmentation but didn’t bother trying because pyrMeanShiftFiltering seems broken to me. This is my gut instinct — I can’t quantify it but basically I agree with this guy. The output just seems to not be as good as the output generated by codebases elsewhere online. I have no idea why … one interesting reason might be that OpenCV is doing mean shift on RGB rather than one of the color spaces that are supposed to be better at modeling human vision. Everybody always says to do things that involve treating colors as points in Euclidean space using L*a*b* or L*u*v* rather than RGB, but in practice, to be honest, it never seems to matter to me. Maybe this is an example of where it does. I don’t know but in any case cv::pyrMeanShiftFiltering in my opinion sucks.

The “elsewhere online” I mention above is the codebase of EDISON, “Edge Detection and Image SegmentatiON”, made freely available by Rutgers University’s “Robust Image Understanding Laboratory”. EDISON is a command line tool that parses a script specifying a sequence of computer vision operations that I wasn’t really interested in except for the part in which it does mean shift segmentation, as its mean shift output seems really good to me. What I have done is extracted the mean shift code, which was C, wrapped it thinly in C++, and ported it to use OpenCV types, e.g. cv::Mat, and OpenCV operations where possible. I also re-factored for concision and removed C-isms where possible, e.g. I replace naked memory allocations with std::vectors and so forth.

The most significant change coming out of this re-factoring work in terms of functionality and/or performance was replacing the EDISON codebase’s L*u*v*-to-RGB/RGB-to-L*u*v* conversion routines with OpenCV calls. This actually changes the output of this code relative to EDISON because OpenCV and EDISON give different L*u*v* values for the same image. Not sure who is right or the meaning of the difference but OpenCV is an industry standard so am erring on the side of OpenCV and further the segmentation this code outputs is in my opinion better that what results from EDISON’s L*u*v* routines while performance is unchanged.

Below is some output:

# Floodfilling in OpenCV with multiple seeds

One irritating thing about OpenCV is that as a computer vision library it doesn’t actually offer a lot of routines for dealing with connected components easily and efficiently.

There’s cv::findContours and two versions of cv::connectedComponents — the regular one and one “WithStats”. The trouble is findContours returns polygons when what you often want is raster blob masks. connectedComponents returns a label image but OpenCV doesn’t offer a lot of routines for doing anything with a label images, and further connectedComponentsWithStats is pretty limited in what it will give you. For example, there is no option to be returned a pixel location contained by each connected component. The other issue is that even if you have a pixel location contained by each connected component of interest there is no version of floodFill that takes more than one seed. I really think this kind of floodFill function is something that should be added to OpenCV.

The following assumes single channel input and returns the results of the fills as a separate Mat rather than by modifying the input, but it could easily be extended to be polymorphic and support all the different variations that regular floodFill supports. Basically if we view the input as monochrome blobs, what it is doing is returning the union of all the connected components in the source bitmap that have a non-null intersection with the seed bitmap:

```Mat FloodFillFromSeedMask(const Mat& image, const Mat& seeds, uchar src_val = 255, uchar target_val = 255, uchar connectivity = 4)
{
auto sz = image.size();
Mat output;
copyMakeBorder(Mat::zeros(sz, CV_8U), output, 1, 1, 1, 1, BORDER_CONSTANT, target_val);
for (int y = 0; y < seeds.rows; y++) {
const uchar* img_ptr = image.ptr<uchar>(y);
const uchar* seeds_ptr = seeds.ptr<uchar>(y);
uchar* output_ptr = output.ptr<uchar>(y + 1) + 1;
for (int x = 0; x < seeds.cols; x++) {
if ( *img_ptr == src_val && *seeds_ptr > 0 && *output_ptr != target_val)
floodFill(image, output, Point(x, y), target_val, nullptr, 0, 0, connectivity | (target_val << 8) | FLOODFILL_MASK_ONLY);
img_ptr++;
seeds_ptr++;
output_ptr++;
}
}
return Mat(output, Rect(Point(1, 1), sz));
}
```

and, yes, not using cv::Mat::at(y,x) is actually noticeably faster than using it and this function is substantially faster than calling findContours, iterating over the polygons returned, painting them, and testing for an intersection with the seed mask. It would be nice to get rid of that call to copyMakeBorder() but there doesn’t seem to be a way to create a bordered Mat directly. Didn’t feel like writing a function like that and then testing that mine is faster than the above…

# Jesus Christ, internet…

Center window on primary screen — Win32, C/C++ — below, featuring workiness. Our long national nightmare is now over:

```void CenterWindowOnScreen(HWND hwnd)
{
RECT wnd_rect;
GetWindowRect(hwnd, &wnd_rect);

RECT screen_rect;
SystemParametersInfo(SPI_GETWORKAREA, 0, reinterpret_cast<PVOID>(&screen_rect), 0);

int scr_wd = screen_rect.right - screen_rect.left;
int scr_hgt = screen_rect.bottom - screen_rect.top;
int wnd_wd = wnd_rect.right - wnd_rect.left;
int wnd_hgt = wnd_rect.bottom - wnd_rect.top;

int x = (scr_wd - wnd_wd) / 2;
int y = (scr_hgt - wnd_hgt) / 2;

SetWindowPos(hwnd, 0, x, y, 0, 0, SWP_NOZORDER | SWP_NOSIZE);
}
```

I’ve been doing some Win32 programming lately and just wanted to flag this old Raymond Chen post in which he defines a simple function that fills a hole in the Win32 API and has a nice why-didnt-I-think-of-that quality to it.

```BOOL UnadjustWindowRectEx(
LPRECT prc,
DWORD dwStyle,
DWORD dwExStyle)
{
RECT rc;
SetRectEmpty(&rc);
if (fRc) {
prc->left -= rc.left;
prc->top -= rc.top;
prc->right -= rc.right;
prc->bottom -= rc.bottom;
}
return fRc;
}
```

Note, Chen says that AdjustWindowRect handles scroll bars but I think that is wrong. AdjustWindowRect works without a window handle so it is not possible for it to get scroll bars right. It can treat any window with the WS_VSCROLL and WS_HSCROLL styles as having two scroll bars (which I think it does) but a window can have those styles and have it’s scroll bars hidden (if all the content fits in the client area, say). You could write another function AdjustWindowRectNoReally that takes a window handle but this kind of defeats the purpose of AdjustWindowRect — you mostly use it before creating a window so don’t have a handle.

If you do have a handle, probably what you want is the following: (from here)

```void ClientResize(HWND hWnd, int nWidth, int nHeight)
{
RECT rcClient, rcWind;
POINT ptDiff;
GetClientRect(hWnd, &rcClient);
GetWindowRect(hWnd, &rcWind);
ptDiff.x = (rcWind.right - rcWind.left) - rcClient.right;
ptDiff.y = (rcWind.bottom - rcWind.top) - rcClient.bottom;
MoveWindow(hWnd, rcWind.left, rcWind.top, nWidth + ptDiff.x, nHeight + ptDiff.y, TRUE);
}
```

which sets a window’s size by its client rectangle rather than its window rectangle, and is more accurate than doing this via AdjustWindowRect anyway because it will do the right thing regarding menu wrapping space and scroll bar space.

# How to Convert a GDI+ Image to an OpenCV Matrix in C++…

You have to convert the gdi+ image to a gdi+ bitmap and then to an OpenCV matrix. There is no easier way to do the first conversion than creating a bitmap and painting the image into it, as far as I know.

To perform the second conversion (Gdiplus::Bitmap -> cv::Mat), note that Mat constructors order their parameters rows then columns so that is Bitmap height then width. The actual memory layout is row major, however, so we can just copy the data, but there is no need to do the actual copy yourself. You can use one of the Mat constructors that will wrap existing data without copying and then force it to copy by calling the clone member function.

The other trouble here however is that gdi+ supports, in theory at least, loads of exotic pixel formats so to handle them all would be a chore. The following handles the basic case:

```Gdiplus::Bitmap* GdiplusImageToBitmap(Gdiplus::Image* img, Gdiplus::Color bkgd = Gdiplus::Color::Transparent)
{
int wd = img->GetWidth();
int hgt = img->GetHeight();
auto format = img->GetPixelFormat();
Gdiplus::Bitmap* bmp = new Gdiplus::Bitmap(wd, hgt, format);

if (bmp == nullptr)
return nullptr; // this might happen if format is something exotic, not sure.

auto g = std::unique_ptr<Gdiplus::Graphics>(Gdiplus::Graphics::FromImage(bmp));
g->Clear(bkgd);
g->DrawImage(img, 0, 0, wd, hgt);

return bmp;
}

cv::Mat GdiPlusBitmapToOpenCvMat(Gdiplus::Bitmap* bmp)
{
auto format = bmp->GetPixelFormat();
if (format != PixelFormat24bppRGB)
return cv::Mat();

int wd = bmp->GetWidth();
int hgt = bmp->GetHeight();
Gdiplus::Rect rcLock(0, 0, wd, hgt);
Gdiplus::BitmapData bmpData;

if (!bmp->LockBits(&rcLock, Gdiplus::ImageLockModeRead, format, &bmpData) == Gdiplus::Ok)
return cv::Mat();

cv::Mat mat = cv::Mat(hgt, wd, CV_8UC3, static_cast<unsigned char*>(bmpData.Scan0), bmpData.Stride).clone();

bmp->UnlockBits(&bmpData);
return mat;
}

cv::Mat GdiplusImageToOpenCvMat(Gdiplus::Image* img)
{
auto bmp = std::unique_ptr<Gdiplus::Bitmap>(GdiplusImageToBitmap(img));
return (bmp != nullptr) ? GdiPlusBitmapToOpenCvMat(bmp.get()) : cv::Mat();
}
```