Quintics Redux

A couple of years ago I wrote a blog post about finding the closest point on a cubic bezier to a given point, in which I showed how to reduce the problem to solving a quintic equation and provided an implementation in C# that finds the roots of the quintic using Laguerre’s Method. I also mentioned at the end of that post that there is a fancier algorithm specifically for solving quintics, described in Doyle and McMullen’s “Solving the Quintic by Iteration” [1], and speculated that it might be the fastest way to find the closest point to a bezier.

Recently I spent the time to actually implement the Doyle and McMullen algorithm in C++ as a numeric algorithm. I don’t know of any other implementations. My code is here.

It is an interesting algorithm. Its existence implies that although there is no general formula, akin to the quadratic formula, that you can plug into to get the roots of a 5th degree equation (in fact, the roots of most 5th degree equations cannot even be represented as expressions composed of ordinary arithmetic operations plus radicals), you can represent numbers with radical expressions that are arbitrarily close to those roots. Each of the roots can be defined exactly as the limit a recursive function converges on as the depth of recursion approaches infinity; each step along the way approximates, with increasing accuracy, the root of an unsolvable quintic using only arithmetic and radicals.

Above, when I say my implementation is “a numeric algorithm”, I mean as opposed to an implementation using a symbolic math package: I use ordinary floating point numbers. This is an important point: [1] is from the wilds of number theory, not from a CS textbook, and it thus defines an algorithm over real numbers, which of course have arbitrary precision. It was an open question to me whether the algorithm has value as a numeric algorithm. The issue I saw is that it is a solution to quintics in the single-parameter Brioschi form, x⁵ − 10Cx³ + 45C²x − C² = 0. The Brioschi form, by magic, collapses every general quintic equation defined by six complex coefficients into a single complex number C; it is impossible for such a reduction not to be limited by finite precision. That is, to some extent the finite precision of floating point numbers must determine for which general quintics a numeric implementation of [1] will return sensible results.

I will leave an analysis of this question to someone who is an actual mathematician or computer scientist, but empirically the Doyle and McMullen algorithm does seem to me to have merit as a numeric algorithm at double precision. On randomly generated general quintics with coefficients uniformly distributed between -1000 and 1000, my implementation of solving the quintic by iteration returns results that look to me to be about as good as or better than those of the implementation of Laguerre’s Method from Data Structures and Algorithms in C++, but is about 500 times faster. Both algorithms perform better, in terms of both speed and correctness, when the coefficients are smaller: when they are between -10 and 10, my implementation is more accurate and about 100 times faster than the Data Structures and Algorithms code.

My implementation works as follows:

  1. Given a general quintic, convert it to principal form, i.e. eliminate the degree-4 and degree-3 terms. I do this conversion as explained here. I found the resultant and solved for the c₁ and c₂ that cancel those terms using SageMath.
  2. Convert the principal quintic to Brioschi form. I followed the explanation here.
  3. Find two solutions to the Brioschi form quintic via iteration as described in [1], pages 32 to 33. The only difficulty here was that I needed to find the derivative of the function g(Z,w). I did this symbolically, again via SageMath; however, a speed optimization would be to drop the explicit C++ function for g′ and instead evaluate g(Z,w) and g′(Z,w) simultaneously, as described in the chapter on polynomials in Numerical Recipes: The Art of Scientific Computing. Also, just as a note to anyone else who may want to write an implementation of this algorithm: the Wikipedia article here has a mistake in the definition of h(Z,w). Use the original paper. (Or better yet, cut-and-paste from Peter Doyle’s macsyma output and overload C++ such that the caret operator is exponentiation, which is what I did.)
  4. Convert the two roots back to the general quintic.
  5. Test both roots. If both errors are less than a threshold k, pass along both roots; if only one is, pass along the one good root. If both roots yield errors greater than k, perform n iterations of Halley’s Method and retest for one or two good roots (a sketch of this polishing step follows this list). If neither root then has error less than k, pass both along anyway.
  6. If you have two good roots v₁ and v₂, perform synthetic division by (z - v₁)(z - v₂), yielding a cubic, and solve the cubic via radicals. If you have only one good root v, divide the quintic by (z - v) and solve the resulting quartic. I’m using the cubic solving procedure as described in Numerical Recipes and the quartic formula as described here.
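
For steps 3 and 5, the relevant machinery looks roughly like the following sketch (illustrative names, not the actual code in quintic.hpp): Horner’s rule evaluating a polynomial and its derivatives in a single pass, which is the Numerical Recipes trick I mention in step 3, plus one step of Halley’s Method for the polishing in step 5.

#include <complex>
#include <vector>

using cplx = std::complex<double>;

// Evaluate p, p', and p'' at z in a single pass of Horner's rule.
// coeffs runs from the highest-degree coefficient down to the constant.
void eval_poly(const std::vector<cplx>& coeffs, cplx z, cplx& p, cplx& dp, cplx& ddp)
{
    p = coeffs.front();
    dp = ddp = 0.0;
    for (size_t i = 1; i < coeffs.size(); i++) {
        ddp = ddp * z + dp;
        dp = dp * z + p;
        p = p * z + coeffs[i];
    }
    ddp *= 2.0; // the recurrence accumulates p''/2
}

// One step of Halley's Method: z - 2pp' / (2p'^2 - pp'')
cplx halley_step(const std::vector<cplx>& coeffs, cplx z)
{
    cplx p, dp, ddp;
    eval_poly(coeffs, z, p, dp, ddp);
    return z - (2.0 * p * dp) / (2.0 * dp * dp - p * ddp);
}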

My implementation is a header-only C++17 library (“quintic.hpp” in the GitHub repo I linked to above) parametrized on the specific floating point type you want to use. Single precision is not good enough for this algorithm. Double precision works. I didn’t test on long doubles because Visual Studio does not support them.

Basic Convex Hull in C#

Transliterated from Java code found here:

// Computes 2D convex hulls via Andrew's monotone chain algorithm: sort the points
// lexicographically, then build the lower and upper hulls in one pass each.
class ConvexHull
{
    // cross product of vectors OA and OB: positive for a counter-clockwise turn,
    // negative for a clockwise turn, and zero when O, A, and B are collinear
    public static double cross(Point O, Point A, Point B)
    {
        return (A.X - O.X) * (B.Y - O.Y) - (A.Y - O.Y) * (B.X - O.X);
    }

    public static List<Point> GetConvexHull(List<Point> points)
    {
        if (points == null)
            return null;

        if (points.Count <= 1)
            return points;

        int n = points.Count, k = 0;
        List<Point> H = new List<Point>(new Point[2 * n]);

        // note: sorts the caller's list in place, by X then Y
        points.Sort((a, b) =>
             a.X == b.X ? a.Y.CompareTo(b.Y) : a.X.CompareTo(b.X));

        // Build lower hull
        for (int i = 0; i < n; ++i)
        {
            while (k >= 2 && cross(H[k - 2], H[k - 1], points[i]) <= 0)
                k--;
            H[k++] = points[i];
        }

        // Build upper hull
        for (int i = n - 2, t = k + 1; i >= 0; i--)
        {
            while (k >= t && cross(H[k - 2], H[k - 1], points[i]) <= 0)
                k--;
            H[k++] = points[i];
        }

        // the last point in H is the same as the first, so drop it
        return H.Take(k - 1).ToList();
    }
}

“Triangular Life”

Recently I looked through a bunch of triangular cellular automata in which each (1) uses two states, (2) uses the simple alive-cell-count type of rule, and (3) uses the neighborhood around a cell c consisting of all the triangles that share a vertex with c; that is, the 12 shaded triangles below are the neighborhood around the yellow triangle:
12-cell triangular neighborhood
These cellular automata have state tables that can be thought of as 13 rows by 2 columns: there are 12 possible non-zero alive cell counts plus the zero count, and each of these counts can map to either alive or dead in the next generation depending on whether the cell in the current generation is alive or dead (column 1 or column 2). I looked at each of the 4096 cellular automata you get by filling the third through eighth rows of these state tables with each possible allocation of 0s and 1s and letting all other rows contain zeros.

A handful of these 4096 feature the spontaneous generation of gliders, but one rule is clearly the triangular analog of Conway’s Life. I have no idea whether this rule has been described before in the literature, but it is the following:

On a triangular grid

  • If a cell is dead and it has exactly four or six vertex-adjacent alive neighbors then it is alive in the next generation.
  • If a cell is alive and it has four to six vertex-adjacent alive neighbors, inclusive, then it remains alive in the next generation.
  • Otherwise it is dead in the next generation.
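
Here is a sketch of one generation of this rule. This is my own illustrative code, not anything from the CA player linked below, and it assumes a common encoding: cell (x, y) points up when x + y is even; an up-pointing cell’s 12 vertex-adjacent neighbors are then x±1 and x±2 in its own row, x-1 through x+1 in the row beyond its apex, and x-2 through x+2 in the row beyond its base, with a down-pointing cell being the mirror image.

#include <vector>

using grid = std::vector<std::vector<int>>;

// Count the 12 vertex-adjacent neighbors of the triangle at (x, y), where
// cell (x, y) points up when (x + y) is even: 4 neighbors in the same row,
// 3 in the row beyond the apex, and 5 in the row beyond the base.
int count_alive_neighbors(const grid& g, int x, int y)
{
    int rows = static_cast<int>(g.size());
    int cols = static_cast<int>(g[0].size());
    bool points_up = (x + y) % 2 == 0;
    int apex_row = points_up ? y - 1 : y + 1; // 3 neighbors: x-1 .. x+1
    int base_row = points_up ? y + 1 : y - 1; // 5 neighbors: x-2 .. x+2
    auto alive = [&](int cx, int cy) {
        return cy >= 0 && cy < rows && cx >= 0 && cx < cols && g[cy][cx] != 0;
    };
    int count = 0;
    for (int dx = -2; dx <= 2; dx++) {
        if (dx != 0)
            count += alive(x + dx, y);        // same row: x-2, x-1, x+1, x+2
        count += alive(x + dx, base_row);     // base side: x-2 .. x+2
        if (dx >= -1 && dx <= 1)
            count += alive(x + dx, apex_row); // apex side: x-1 .. x+1
    }
    return count;
}

// One generation: born on a count of exactly 4 or 6, survives on 4, 5, or 6.
grid next_generation(const grid& g)
{
    grid next(g.size(), std::vector<int>(g[0].size(), 0));
    for (int y = 0; y < static_cast<int>(g.size()); y++)
        for (int x = 0; x < static_cast<int>(g[0].size()); x++) {
            int n = count_alive_neighbors(g, x, y);
            next[y][x] = g[y][x] ? (n >= 4 && n <= 6) : (n == 4 || n == 6);
        }
    return next;
}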

The above rule has a glider, shown below, that is often randomly generated, and the rule exhibits bounded growth.
Here it is running in Jack Kutilek’s web-based CA player:

Tri Life gliders are slightly rarer than gliders in Conway’s Life because they are bigger in terms of the number of alive cells in each glider “frame”. If you don’t see a glider in the above, stir things up by dragging in the window.

frames of the glider in canonical triangular life

Rhombo

Over the past couple of weeks I wrote some code in C# to generate dissections of the rhombic triacontahedron into golden rhombohedrons. George Hart discusses these types of dissections here and also talks about the problem of enumerating them in an appendix here. Briefly, all this material by Hart and others is about how the fact that the rhombic triacontahedron and the rhombic enneacontahedron are zonohedra leads to both having interesting combinatoric properties, which can be explored by coloring their dissections.

I was, however, more interested in how such dissections could be turned into an interlocking puzzle, akin to a traditional burr puzzle, and as such needed code to generate 3D models of the dissections. My generation code is a dumb, constructive, brute force approach: I traverse the search space, adding rhombohedrons to a candidate dissection in progress and backtracking upon reaching a state in which it is impossible to add a rhombohedron without intersecting one already added or the containing triacontahedron, keeping track of configurations that have already been explored.

Dissections of the rhombic triacontahedron into golden rhombohedrons (hereafter “blocks”) turn out to always require 10 each of the two types of blocks that Hart refers to in the above as the “pointy” and “flat” varieties (and which I refer to as yellow and blue). Further, it turns out that in all of these dissections there are four blocks that are completely internal, i.e. sharing no face with the triacontahedron; I also believe that the four internal blocks are always three blue and one yellow, but I’m not sure about that.

My strategy for finding an interlocking puzzle was the following:

  • Generate a bunch of raw dissections into blocks
  • For each dissection, search the adjacency graph for four pieces, each a set of five blocks, together covering the dissection, such that
    • Each piece forms a simple path in the dissection; that is, each block in the piece
      • is either an end block that is face adjacent to a next or previous block in the piece, or a non-end block that is face adjacent to both a next block and a previous block,
      • and does not share any edges with other blocks in the piece except for the edges of the face adjacencies.
    • Each piece contains at least one fully internal block.
    • Each piece is “single axis disentangle-able” from each other piece, by which we mean that there exists some edge e in the complete construction such that, given pieces p1 and p2, if you offset p1 in the direction of e by a small amount then p1 does not intersect p2 (a sketch of this test follows this list).
    • Each piece is not single axis disentangle-able from the union of the other three pieces.
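
In code the single-axis test is roughly the following sketch, where Piece, translate, and intersects are hypothetical stand-ins for the corresponding types and mesh intersection helpers in my actual code:

#include <vector>

struct Vec3 { double x, y, z; };
struct Piece { /* five glued blocks */ };

// assumed helpers, standing in for the real mesh operations
Piece translate(const Piece& p, const Vec3& offset);
bool intersects(const Piece& a, const Piece& b);

// p1 is single axis disentangle-able from p2 if offsetting p1 by a small
// amount along some edge direction of the construction avoids p2 entirely.
bool single_axis_disentangleable(const Piece& p1, const Piece& p2,
                                 const std::vector<Vec3>& edge_dirs)
{
    const double eps = 0.001; // small relative to the block size
    for (const auto& e : edge_dirs) {
        Vec3 offset{ eps * e.x, eps * e.y, eps * e.z };
        if (!intersects(translate(p1, offset), p2))
            return true;
    }
    return false;
}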

I never managed to do a complete enumeration generating all of the dissections, for reasons that I don’t feel like going into. (As I said above, I did not do anything fancy, and it would be easier to be smarter about how I do the generation than to make what I have more efficient; i.e., I could have used the George Hart algorithm had I known about it, or used the ways of transforming one dissection into another that I don’t do. I do an exhaustive search, period. I never did the smarter stuff because I found what I was looking for; see below.)

But from about 10 dissections I found one set of pieces that uniquely satisfies all of the above:

Here’s some video. (The individual blocks were 3D printed and super glued together)

I’m calling the above “rhombo”. Those pieces are rough because I only 3D printed the individual rhombohedrons and then superglued them together into the pieces, which is imprecise. I had to sand them heavily to get them to behave nicely. I’ll eventually put full piece models up on Shapeways.

In the course of doing this work, it became apparent that there is no good computational geometry library for C# for something like this. There is one called Math.Net Numerics, along with Math.Net Spatial, that will get you vectors and matrices, but not with all the convenience routines you’d expect for treating vectors like 3D points and so forth. What I ended up doing was extracting the vectors and matrices out of MonoGame and search-and-replacing “float” with “double” to get double precision. Here is that code on github. I also included in there 3D line segment/line segment intersection code and 3D triangle/triangle intersection code, which I transliterated to C#. The line segment intersection code came from Paul Bourke’s web site, and the triangle intersection code came from running Tomas Moller’s C code through just a C preprocessor, to resolve all the macros, and then transliterating the result to C#.

Basic signals & slots in C++

I put up some code on github that implements basic signals and slots functionality in standards-compliant C++. It’s one header file, “signals.hpp”. See here: link to github

This implementation of signals is a re-work of some code by a user _pi on the forums for the cross-platform application framework Juce (that thread is here), which was itself a re-work of some example code posted by Aardvajk in the GameDev.net forums (here). I just put it all together, cleaned things up, fixed some bugs, and added lambda support.

The basic idea is that with the addition of variadic templates to C++ it became possible to implement signal and slot functionality more straightforwardly than was done for boost::signals et al.; that is what Aardvajk’s original article was about. _pi changed Aardvajk’s code to make slots a thing you inherit from, which makes more sense to me. I changed _pi’s code so that it uses std::functions to store the handlers, thus allowing lambdas with captures to be attached to a signal.

Usage is like the following:

#include "signals.hpp"
#include <iostream>

// a handler is a kind of slot that, as an implementation detail, requires usage
// of the "curiously recurring template pattern": the intention is for instances
// of a class C that will react to a signal firing to have an is-a relationship
// with a slot parametrized on class C itself.

class CharacterHandler : public Slot<CharacterHandler>
{
public:
	void HandleCharacter(char c)
	{
		std::cout << "The user entered '" << c << "'" << std::endl;
	}
};

class DigitHandler : public Slot<DigitHandler>
{
public:
	void HandleCharacter(char c)
	{
		if (c >= '0' && c <= '9') {
			int n = static_cast<int>(c - '0');
			std::cout << "  " << n << " * " << n << " = " << n*n << std::endl;
		}
	}
};

int main()
{
	bool done = false;

	char c;
	Signal<char> signal;
	CharacterHandler character_handler;
	DigitHandler digit_handler;

	// can attach a signal to a slot with matching arguments
	signal.connect(character_handler, &CharacterHandler::HandleCharacter);

	// can also attach a lambda, associated with a slot.
	// (the lambda could capture the slot and use it like anything else it captures;
	// however, all the associated slot is really doing is giving you a way of
	// disconnecting the lambda, e.g. in this case signal.disconnect(character_handler))
	signal.connect(character_handler,
		[&](char c) -> void {
			if (c == 'q')
				done = true;
		}
	);

	// can also attach a slot to a signal ... this means the same as the above.
	digit_handler.connect(signal, &DigitHandler::HandleCharacter);

	do
	{
		std::cin >> c;
		signal.fire(c);

	} while (! done);
	
	// can disconnect like this
	character_handler.disconnect(signal);

	// or this
	signal.disconnect(digit_handler);

	// although disconnecting wasn't necessary here: just letting everything
	// go out of scope would've done the right thing.

    return 0;
}

I put pypacker up on github

Probably the most read post on this blog is about sprite packing in Python, here.

At the time I didn’t have a github account and was planning to make a small desktop application out of the algorithm as a way of learning Qt, which maybe I really will do some day, but in the meantime here is the latest version of this code, which I mention in the comments of the original post:

https://github.com/jwezorek/pypacker

The main thing this version adds, besides fixing the bug with growing that I had mentioned, is a “--padding” command line option that will pad each sprite by an integer amount (usually you want 1) in the sprite sheet. This turns out to often be necessary for cocos2d-x/iOS games: if you don’t pad, one sprite can bleed into adjacent sprites. The issue may have been something that Apple has since fixed at the OpenGL ES layer, or that was fixed in cocos2d-x; I was never sure whose bug it was, but padding by a pixel makes it go away.

So usage would now be like:

pypacker.py -i C:\foo\sprites -o C:\foo\output\spritesheet -m grow -p 1

which would do packing on the images in C:\foo\sprites and make two files in C:\foo\output, spritesheet.png and spritesheet.plist, with each sprite padded by 1 pixel. If you need the output image padded out to a power-of-two square, include a “-x” option on the command.

Mean Shift Segmentation in OpenCV

I’ve posted a new repository on GitHub for doing mean shift segmentation in C++ using OpenCV: see here.

OpenCV contains a mean shift filtering function and has a GPU (I think CUDA) implementation of mean shift segmentation. I didn’t evaluate the GPU implementation because I’m personally not interested in GPU for the project I am working on. I did take a look at turning cv::pyrMeanShiftFiltering(…) output into a segmentation but didn’t bother trying, because pyrMeanShiftFiltering seems broken to me. This is my gut instinct; I can’t quantify it, but basically I agree with this guy. The output just seems to not be as good as the output generated by codebases elsewhere online. I have no idea why. One interesting reason might be that OpenCV is doing mean shift on RGB rather than one of the color spaces that are supposed to be better at modeling human vision. Everybody always says to do things that involve treating colors as points in Euclidean space using L*a*b* or L*u*v* rather than RGB, but in practice, to be honest, it never seems to matter to me. Maybe this is an example of where it does. I don’t know, but in any case cv::pyrMeanShiftFiltering, in my opinion, sucks.

The “elsewhere online” I mention above is the codebase of EDISON (“Edge Detection and Image SegmentatiON”), made freely available by Rutgers University’s Robust Image Understanding Laboratory. EDISON is a command line tool that parses a script specifying a sequence of computer vision operations, which I wasn’t really interested in except for the part in which it does mean shift segmentation, as its mean shift output seems really good to me. What I have done is extract the mean shift code, which was C, wrap it thinly in C++, and port it to use OpenCV types, e.g. cv::Mat, and OpenCV operations where possible. I also re-factored for concision and removed C-isms where possible, e.g. I replaced naked memory allocations with std::vectors and so forth.

The most significant change coming out of this re-factoring work, in terms of functionality and performance, was replacing the EDISON codebase’s L*u*v*-to-RGB/RGB-to-L*u*v* conversion routines with OpenCV calls. This actually changes the output of this code relative to EDISON, because OpenCV and EDISON give different L*u*v* values for the same image. I am not sure who is right or what the difference means, but OpenCV is an industry standard, so I am erring on the side of OpenCV; further, the segmentation this code outputs is, in my opinion, better than what results from EDISON’s L*u*v* routines, while performance is unchanged.
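
For reference, the OpenCV side of that swap is just cv::cvtColor with the Luv conversion codes; a minimal sketch (the round trip through 32-bit floats is my choice here, since for 8-bit images OpenCV scales L*u*v* into the 0 to 255 range):

cv::Mat bgr = cv::imread("input.png"); // hypothetical input image
cv::Mat bgr32, luv, back;
bgr.convertTo(bgr32, CV_32F, 1.0 / 255.0);   // floating point BGR in [0, 1]
cv::cvtColor(bgr32, luv, cv::COLOR_BGR2Luv); // RGB-to-L*u*v*
// ... mean shift happens in L*u*v* space ...
cv::cvtColor(luv, back, cv::COLOR_Luv2BGR);  // L*u*v*-to-RGB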

Below is some output:

Floodfilling in OpenCV with multiple seeds

One irritating thing about OpenCV is that as a computer vision library it doesn’t actually offer a lot of routines for dealing with connected components easily and efficiently.

There’s cv::findContours and two versions of cv::connectedComponents: the regular one and one “WithStats”. The trouble is that findContours returns polygons when what you often want is raster blob masks, while connectedComponents returns a label image, and OpenCV doesn’t offer a lot of routines for doing anything with a label image; further, connectedComponentsWithStats is pretty limited in what it will give you. For example, there is no option to be returned a pixel location contained by each connected component. The other issue is that even if you have a pixel location contained by each connected component of interest, there is no version of floodFill that takes more than one seed. I really think this kind of floodFill function is something that should be added to OpenCV.

The following assumes single-channel input and returns the results of the fills as a separate Mat rather than modifying the input, but it could easily be extended to be polymorphic and to support all the different variations that regular floodFill supports. Basically, if we view the input as monochrome blobs, it returns the union of all the connected components in the source bitmap that have a non-null intersection with the seed bitmap:

Mat FloodFillFromSeedMask(const Mat& image, const Mat& seeds, uchar src_val = 255, uchar target_val = 255, uchar connectivity = 4)
{
	auto sz = image.size();
	Mat output;
	// floodFill wants a mask one pixel larger than the image on every side;
	// making the border target_val also stops fills at the image's edges
	copyMakeBorder(Mat::zeros(sz, CV_8U), output, 1, 1, 1, 1, BORDER_CONSTANT, target_val);
	for (int y = 0; y < seeds.rows; y++) {
		const uchar* img_ptr = image.ptr<uchar>(y);
		const uchar* seeds_ptr = seeds.ptr<uchar>(y);
		uchar* output_ptr = output.ptr<uchar>(y + 1) + 1;
		for (int x = 0; x < seeds.cols; x++) {
			// seed a fill on source pixels that are under the seed mask and not yet filled
			if ( *img_ptr == src_val && *seeds_ptr > 0 && *output_ptr != target_val)
				floodFill(image, output, Point(x, y), target_val, nullptr, 0, 0, connectivity | (target_val << 8) | FLOODFILL_MASK_ONLY);
			img_ptr++;
			seeds_ptr++;
			output_ptr++;
		}
	}
	// crop the one-pixel border off of the mask
	return Mat(output, Rect(Point(1, 1), sz));
}

And, yes, not using cv::Mat::at(y,x) is actually noticeably faster than using it, and this function is substantially faster than calling findContours, iterating over the polygons returned, painting them, and testing for an intersection with the seed mask. It would be nice to get rid of that call to copyMakeBorder(), but there doesn’t seem to be a way to create a bordered Mat directly, and I didn’t feel like writing a function like that and then testing that mine is faster than the above…
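
Usage would look something like the following (the file name and seed coordinates are made up for illustration):

// fill every blob in "blobs.png" that one of the two seed pixels touches
Mat blobs = imread("blobs.png", IMREAD_GRAYSCALE);
Mat seeds = Mat::zeros(blobs.size(), CV_8U);
seeds.at<uchar>(40, 100) = 255;  // (row, col) of a pixel inside one blob of interest
seeds.at<uchar>(200, 180) = 255; // ... and another
Mat filled = FloodFillFromSeedMask(blobs, seeds); // mask of the union of touched blobs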

Jesus Christ, internet…

Center window on primary screen (Win32, C/C++) below, featuring workiness. Our long national nightmare is now over:

void CenterWindowOnScreen(HWND hwnd)
{
	RECT wnd_rect;
	GetWindowRect(hwnd, &wnd_rect);

	RECT screen_rect;
	SystemParametersInfo(SPI_GETWORKAREA, 0, reinterpret_cast<PVOID>(&screen_rect), 0);

	int scr_wd = screen_rect.right - screen_rect.left;
	int scr_hgt = screen_rect.bottom - screen_rect.top;
	int wnd_wd = wnd_rect.right - wnd_rect.left;
	int wnd_hgt = wnd_rect.bottom - wnd_rect.top;

	// offset by the work area's origin; it isn't (0, 0) when the
	// taskbar is docked on the left or top edge of the screen
	int x = screen_rect.left + (scr_wd - wnd_wd) / 2;
	int y = screen_rect.top + (scr_hgt - wnd_hgt) / 2;

	SetWindowPos(hwnd, 0, x, y, 0, 0, SWP_NOZORDER | SWP_NOSIZE);
}