Archive for the Category Programming

 
 

Mean Shift Segmentation in OpenCV

I’ve posted a new repository on GitHub for doing mean shift segmentation in C++ using OpenCV: see here.

OpenCV contains a mean shift filtering function and has a GPU, I think CUDA, implementation of mean shift segmentation. I didn’t evaluate the GPU implementation because I’m personally not interested in GPU for the project I am working on. I did take a look at turning cv::pyrMeanShiftFiltering(…) output into a segmentation but didn’t bother trying because pyrMeanShiftFiltering seems broken to me. This is my gut instinct — I can’t quantify it but basically I agree with this guy. The output just seems to not be as good as the output generated by codebases elsewhere online. I have no idea why … one interesting reason might be that OpenCV is doing mean shift on RGB rather than one of the color spaces that are supposed to be better at modeling human vision. Everybody always says to do things that involve treating colors as points in Euclidean space using L*a*b* or L*u*v* rather than RGB, but in practice, to be honest, it never seems to matter to me. Maybe this is an example of where it does. I don’t know but in any case cv::pyrMeanShiftFiltering in my opinion sucks.

The “elsewhere online” I mention above is the codebase of EDISON, “Edge Detection and Image SegmentatiON”, made freely available by Rutgers University’s “Robust Image Understanding Laboratory”. EDISON is a command line tool that parses a script specifying a sequence of computer vision operations that I wasn’t really interested in except for the part in which it does mean shift segmentation, as its mean shift output seems really good to me. What I have done is extracted the mean shift code, which was C, wrapped it thinly in C++, and ported it to use OpenCV types, e.g. cv::Mat, and OpenCV operations where possible. I also re-factored for concision and removed C-isms where possible, e.g. I replace naked memory allocations with std::vectors and so forth.

The most significant change coming out of this re-factoring work in terms of functionality and/or performance was replacing the EDISON codebase’s L*u*v*-to-RGB/RGB-to-L*u*v* conversion routines with OpenCV calls. This actually changes the output of this code relative to EDISON because OpenCV and EDISON give different L*u*v* values for the same image. Not sure who is right or the meaning of the difference but OpenCV is an industry standard so am erring on the side of OpenCV and further the segmentation this code outputs is in my opinion better that what results from EDISON’s L*u*v* routines while performance is unchanged.

Below is some output:

Jesus Christ, internet…

Center window on primary screen — Win32, C/C++ — below, featuring workiness. Our long national nightmare is now over:

void CenterWindowOnScreen(HWND hwnd)
{
	RECT wnd_rect;
	GetWindowRect(hwnd, &wnd_rect);

	RECT screen_rect;
	SystemParametersInfo(SPI_GETWORKAREA, 0, reinterpret_cast<PVOID>(&screen_rect), 0);

	int scr_wd = screen_rect.right - screen_rect.left;
	int scr_hgt = screen_rect.bottom - screen_rect.top;
	int wnd_wd = wnd_rect.right - wnd_rect.left;
	int wnd_hgt = wnd_rect.bottom - wnd_rect.top;

	int x = (scr_wd - wnd_wd) / 2;
	int y = (scr_hgt - wnd_hgt) / 2;

	SetWindowPos(hwnd, 0, x, y, 0, 0, SWP_NOZORDER | SWP_NOSIZE);
}

UnadjustWindowRectEx etc.

I’ve been doing some Win32 programming lately and just wanted to flag this old Raymond Chen post in which he defines a simple function that fills a hole in the Win32 API and has a nice why-didnt-I-think-of-that quality to it.

BOOL UnadjustWindowRectEx(
    LPRECT prc,
    DWORD dwStyle,
    BOOL fMenu,
    DWORD dwExStyle)
{
  RECT rc;
  SetRectEmpty(&rc);
  BOOL fRc = AdjustWindowRectEx(&rc, dwStyle, fMenu, dwExStyle);
  if (fRc) {
    prc->left -= rc.left;
    prc->top -= rc.top;
    prc->right -= rc.right;
    prc->bottom -= rc.bottom;
  }
  return fRc;
}

Note, Chen says that AdjustWindowRect handles scroll bars but I think that is wrong. AdjustWindowRect works without a window handle so it is not possible for it to get scroll bars right. It can treat any window with the WS_VSCROLL and WS_HSCROLL styles as having two scroll bars (which I think it does) but a window can have those styles and have it’s scroll bars hidden (if all the content fits in the client area, say). You could write another function AdjustWindowRectNoReally that takes a window handle but this kind of defeats the purpose of AdjustWindowRect — you mostly use it before creating a window so don’t have a handle.

If you do have a handle, probably what you want is the following: (from here)

void ClientResize(HWND hWnd, int nWidth, int nHeight)
{
	RECT rcClient, rcWind;
	POINT ptDiff;
	GetClientRect(hWnd, &rcClient);
	GetWindowRect(hWnd, &rcWind);
	ptDiff.x = (rcWind.right - rcWind.left) - rcClient.right;
	ptDiff.y = (rcWind.bottom - rcWind.top) - rcClient.bottom;
	MoveWindow(hWnd, rcWind.left, rcWind.top, nWidth + ptDiff.x, nHeight + ptDiff.y, TRUE);
}

which sets a window’s size by its client rectangle rather than its window rectangle, and is more accurate than doing this via AdjustWindowRect anyway because it will do the right thing regarding menu wrapping space and scroll bar space.

How to Convert a GDI+ Image to an OpenCV Matrix in C++…

You have to convert the gdi+ image to a gdi+ bitmap and then to an OpenCV matrix. There is no easier way to do the first conversion than creating a bitmap and painting the image into it, as far as I know.

To perform the second conversion (Gdiplus::Bitmap -> cv::Mat), note that Mat constructors order their parameters rows then columns so that is Bitmap height then width. The actual memory layout is row major, however, so we can just copy the data, but there is no need to do the actual copy yourself. You can use one of the Mat constructors that will wrap existing data without copying and then force it to copy by calling the clone member function.

The other trouble here however is that gdi+ supports, in theory at least, loads of exotic pixel formats so to handle them all would be a chore. The following handles the basic case:

Gdiplus::Bitmap* GdiplusImageToBitmap(Gdiplus::Image* img, Gdiplus::Color bkgd = Gdiplus::Color::Transparent)
{
	int wd = img->GetWidth();
	int hgt = img->GetHeight();
	auto format = img->GetPixelFormat();
	Gdiplus::Bitmap* bmp = new Gdiplus::Bitmap(wd, hgt, format);

	if (bmp == nullptr)
		return nullptr; // this might happen if format is something exotic, not sure.

	auto g = std::unique_ptr<Gdiplus::Graphics>(Gdiplus::Graphics::FromImage(bmp));
	g->Clear(bkgd);
	g->DrawImage(img, 0, 0, wd, hgt);

	return bmp;
}

cv::Mat GdiPlusBitmapToOpenCvMat(Gdiplus::Bitmap* bmp)
{
	auto format = bmp->GetPixelFormat();
	if (format != PixelFormat24bppRGB)
		return cv::Mat();

	int wd = bmp->GetWidth();
	int hgt = bmp->GetHeight();
	Gdiplus::Rect rcLock(0, 0, wd, hgt);
	Gdiplus::BitmapData bmpData;

	if (!bmp->LockBits(&rcLock, Gdiplus::ImageLockModeRead, format, &bmpData) == Gdiplus::Ok)
		return cv::Mat();

	cv::Mat mat = cv::Mat(hgt, wd, CV_8UC3, static_cast<unsigned char*>(bmpData.Scan0), bmpData.Stride).clone();

	bmp->UnlockBits(&bmpData);
	return mat;
}

cv::Mat GdiplusImageToOpenCvMat(Gdiplus::Image* img)
{
	auto bmp = std::unique_ptr<Gdiplus::Bitmap>(GdiplusImageToBitmap(img));
	return (bmp != nullptr) ? GdiPlusBitmapToOpenCvMat(bmp.get()) : cv::Mat();
}

Lifelike, or the Joy of Killing Time via Breeding Little Squiggles

A couple of weeks ago I got interested in this project but wanted full control of the code, wanted to know exactly what it is doing, wanted a bunch of features like the ability to import and export CA rules, and wanted to have the process not be seeded by cellular automata already featuring gliders (which the web app seems to be). To this end I have pushed a project to github that does something similar but, I hope, more transparently.

I’m calling it Lifelike Cellular Automata Breeder. It is a (C# WinForms) application in which given some settings a user can artificially select and breed cellular automata; i.e., it performs a genetic algorithm in which the user manually provides the fitness criteria interactively.

I decided I wanted to only allow a reproduction step in which I scramble together state tables in various ways, guessing that using “DNA” more complex than commensurate 2D tables of numbers wouldn’t work well for a genetic process in this case. I characterize CA rules as applying only relative to a given number of states and a given, what I call, “cell structure” and “neighborhood function”. Cell structure just means a lattice type and neighborhood –e.g. square, with four neighbors; square, with eight neighbors; hex, with six, etc. “Neighborhood function” is an arbitrary function that given the  states of the n cells in some cell’s neighborhood returns an integer from 0 to r where r is dependent on the neighborhood function and possibly number of states. For example, Conway Life uses the neighborhood function I refer to as “alive cell count”, and for an n-cell neighborhood, r equals n because the greatest number of alive cells that can surround a cell is just the size of the neighborhood. If the user has selected s total states, the state tables will be s by r.

Lifelike works as follows

  1. The user selects a number of states, cellular structure, and neighborhood function and kicks off the genetic process.
  2. Lifelike sets the current generation to nil, where by “generation” we just mean a set of cellular automata that have been tagged with fitness values.
  3. While the user has not clicked the “go to the next generation” button,
    • If the current generation is nil, Lifelike randomly generates a cellular automata, CA, from scratch by making an s by r state table filled with random numbers from 0 to s. (The random states are generated via a discrete distribution controllable by the user). If the generation is not nil, Lifelike selects a reproduction method requiring k parents, selects k parents from the current generation such that this selection is weighted on the fitness of the automata, generates CA using the reproduction method and parents, and then possibly selects a random mutation function and mutates CA, selecting the mutation function via a discrete distribution controllable by the user and applying it with a “temperature” controlled by the user.
    • Lifelike presents CA in a window.
    • The user either skips CA in which case it no longer plays a role in the algorithm or applies a fitness value to it and adds it to the next generation.
  4. When the user decides to go to next generation, the selections the user just made become the new parent generation and processing continues.

Here’s a video of me playing with an early version of the application.


Results, Musings, etc.

Briefly, Lifelike works.

You can produce interesting cellular automata with it and there is a weird feeling that is hard to describe when you first see a tiny glider wiggle across the screen; however, the way it works is somehow more mundane than I thought that it would be. I wonder if all genetic algorithms are like this. Most of what it produces is garbage. When you see something that isn’t garbage you can select for it. However Lifelike doesn’t do magic. It doesn’t magically find phenomena you want just because you have a fancy framework implemented to find such phenomena. For example, it is my belief at this point that there is no simple hex grid analog of Conway Life using alive cell count, a six cell neighborhood, and 2-states. I think you could probably prove the non-existence of gliders in this configuration but it would be a boring proof by exhaustion and running Lifelike on that configuration is boring as well. Just because you, the user, are a step in a “genetic algorithm” doesn’t somehow make it interesting.

The simple hex neighborhood negative result led me to ask the following question: What is the smallest change you can make to hex/6-cell/alive cell count/2 states to allow Conway Life-like behavior? If you google “Hex Life” you will find that it is well-known that interesting things can happen if you go to a 12-cell  Star of David shaped neighborhood, but this seems inelegant to me because the simple hex neighborhood is so nice. The question then is are gliders possible in the simple hex lattice and neighborhood if we add one state and modify “alive cell count” in a trivial way? The answer to this question turns out be yes. There are beautiful rules that live in the hex lattice with the natural neighborhood if we have 0 = death, 1 = Alive-A, 2 = Alive-B and instead of the simple alive cell count we use its natural analog when states can have values greater than one: sum of states.

Below is a such a rule set and is probably the best thing that has come out of my work with Lifelike as far as I am concerned (So I am naming it Joe Life, assuming that it is unknown in the literature).

The above has a nice quality that Conway Life also has that I call “burn”. This a qualitative thing that is really hard to define but it is what I look for now when I play with Lifelike: burn + gliders = a Life-like cellular automaton. “Burn” is the propensity of a cellular automata configuration to descend into segregated regions of chaos that churn for awhile before ultimately decaying into gliders, oscillators, and still lifes. Some CAs burn faster than others; the above has a nice slow burn. CAs that exhibit steady controlled burn turn out to be rare. Most CAs either die or devolve instantly into various flavors of unbounded chaos.

However there does turn out to be another quality that is not death or unbounded chaos that is sort of like the opposite of burn. See for example

(The above is hex 6-cell, four states, and using a neighborhood function I call “state-based binary”)  which I have been calling “Armada” and generally have been referring to these kind of CAs as being armada-like. Armada-like cellular automata quickly decay completely into only weakly interacting gliders. For example, one from the literature that I would characterize as armada-like is Brian’s Brain. Armada-like rules turn out to be more common than life-like rules. They’re impressive when you first start finding them but they are ultimately less interesting, to me at least. The best thing about armada-like rules is that they indicate that life-like rules are probably “nearby” in the space you are exploring in the genetic process.

Also they can breed weird hybrids that defy classification, such as the following which are all burn with large blob-like gliders and seem sometimes to live around the boundary between armada-like rules and life-like rules.

Magic Carpets (square, 4-cell/ 4 states / sum of states)

or Ink Blots (hex, 6-cell/ 3 states / “0-low-med-high”)

My other major result is that life-like rules exist in the square 4-cell neighborhood if we allow an extra state and use the simple sum of states as the neighborhood function, but they can be boring looking so instead here is an armada-like square 4-cell CA that is on the edge of being lifelike:

The above uses the neighborhood function I call “2-state count” which enumerates all possible combinations of c1, number of neighboring cells in state 1 and c2, number of neighboring cells in state 2 or above, in an n-cell neighborhood i.e. c1 + c2 ≤ n.

Draak Update

Below is gameplay video of a prototype of the puzzle game “Draak” that I am working on. Draak will be an iPad-only iOS game; the prototype below is written to Windows in C# on top of MonoGame.

The initial idea for this game was about the art. I thought it would be cool to make a game that looks like an animated version of one of M. C. Escher’s tessellations, particularly one that involves a similarity symmetry. So I lifted mechanics from a 1990s era game that I always felt was a good game trapped in the typical square lattice for which it was particularly ill-suited. The old game’s Euclidean lay-out wasted a lot of space.

A lot of the work in the above may not be immediately apparent (which is good I think). Specifically, there are actually two shapes of butterflies in the above not just one. They look like this:
draak butterfly tiles
and are arrayed as in a checkerboard with a 90 degree rotation applied to cells of opposite parity — the geometry might be clearer here in which I create these butterfly tiles unadorned with my tool EscherDraw (before I had beziers working in EscherDraw) Thus any time a butterfly moves to a cell with opposite parity I need to not only handle the scale and rotation tweening imposed by the spiral lattice, I also have to simulatenously do the 90 degree rotation imposed by the butterfly tiles and simultaneously morph the actual sprites above from one to the other.

I created the above sprites as vector art using tile outlines exported from Escher draw as SVG and wrote a Python script that morphs SVG as the publically available software for morphing vector graphics is surprisingly non-existent. I am going to post about my vector morphing code here soon, I just need to clean it up for use by people other than me.

My Code for Doing Two Things That Sooner or Later You Will Want to do with Bezier Curves

I just added full cubic bezier curve support to the vector tessellation creation tool I am developing (see here). I’ll include some video at a later date (Update: here) showing this feature in action but first I want to document and make publicly available some of the code involved: namely, a bezier class in C# that supports two functions that are not the easiest things in the world to implement. Both of these functions come up as soon as you try to do anything non-trivial with beziers and the literature online can be unhelpful.

My code is here: link.

The first function is splitting a cubic bezier into two curves at a given t parameter such that the concatenation of the two curves is congruent to the original. This turns out to be easy; however, I could find no actual code on offer, rather just descriptions of code. There is good coverage elsewhere on the theory so I won’t go into it deeply here, but, briefly, splitting a bezier is easy because how one does so follows naturally from the sort of recursive definition of the bezier that is implied by De Casteljau’s algorithm. Anyway here is my code:

...
 public Tuple<Bezier,Bezier> Split(double t) {
     PointD E = Interpolate(t, CtrlPoint1, CtrlPoint2);
     PointD F = Interpolate(t, CtrlPoint2, CtrlPoint3);
     PointD G = Interpolate(t, CtrlPoint3, CtrlPoint4);
     PointD H = Interpolate(t, E, F);
     PointD J = Interpolate(t, F, G);
     PointD K = Interpolate(t, H, J);

     return new Tuple<Bezier, Bezier>(
         new Bezier(CtrlPoint1, E, H, K),
         new Bezier(K, J, G, CtrlPoint4)
     );
 }

 static private PointD Interpolate(double t, PointD pt1, PointD pt2) {
     double x = (1.0 - t) * pt1.X + t * pt2.X;
     double y = (1.0 - t) * pt1.Y + t * pt2.Y;
     return new PointD(x, y);
 }
....

The other function is considerably harder: finding the nearest point on a bezier curve to a given point. If you scan the literature (i.e. perform the google search) you will find that what most people end up using is some code from an article in the book Graphics Gems I, “Solving the Nearest-Point-On-Curve Problem” by Philip J. Schneider. I, however, have a couple of problems with this code (1) I don’t really understand it, (2) it is long and I need to port it to C#, and (3) someone on the internet is claiming that it is incorrect but given (1) I can’t really evaluate the claim; see the comment that begins “There seem to be errors in the ControlPolygonFlatEnough function […]” in the C code here.

But more relevantly I just don’t get this code on a fundamental level. The nearest point on a bezier problem is difficult to solve mathematically but it is easy to formulate mathematically. I think that the Graphics Gems code obscures the straightforward aspect of how this problem can be formulated. Further, I don’t have a problem using code that I don’t understand; however, if I am going to port an opaque block of code from one language to another I’d like to keep that portion of code to a minimum and I can do this easily by structuring my code around the formulation of the problem that I understand.

The formulation I am talking about is as follows, if B(t) is a particular cubic bezier function, B’(t) is its derivative, and P is an arbitrary point then [B(t) – P] ⋅ B’(t) = 0 when t is the parameter of the closest point on B to P. [B(t) – P] is a vector that points from P towards some point on the curve and B’(t) is a vector that is tangent to the curve at this point, if the distance is going to be a minimum then these two vectors will be perpendicular and thus their dot product will be zero. Since B(t) is cubic, its derivative will be quadratic and thus when we take the scalar product of [B(t) – P] and B’(t) we will end up with a quintic equation of one variable. This leads naturally to a nearest point function that looks like the following on the top level:

 public Tuple<PointD, double> FindNearestPoint(PointD pt)
 {
     var polyQuintic = GetNearestBezierPtQuintic(
         _points[0].X, _points[0].Y,
         _points[1].X, _points[1].Y,
         _points[2].X, _points[2].Y,
         _points[3].X, _points[3].Y,
         pt.X, pt.Y
     );
     List<Complex> roots = FindAllRoots(polyQuintic);

     // Filter out roots with nonzero imaginary parts and roots
     // with real parts that are not between 0 and 1.
     List<double> candidates = roots.FindAll(
         root => root.Real > 0 && root.Real <= 1.0 && Math.Abs(root.Imaginary) < ROOT_EPS
     ).Select(
         root => root.Real
     ).ToList();

     // add t=0 and t=1 ... the edge cases.
     candidates.Add(0.0);
     candidates.Add(1.0);

     // find the candidate that yields the closest point on the bezier to the given point.
     double t = double.NaN;
     PointD output = new PointD(double.NaN,double.NaN);
     double minDistance = double.MaxValue;
     foreach (double candidate in candidates)
     {
         var ptAtCandidate = GetPoint(candidate);
         double distance = DistSqu(ptAtCandidate, pt);
         if (distance < minDistance)
         {
             minDistance = distance;
             t = candidate;
             output = ptAtCandidate;
         }
     }

     return new Tuple<PointD, double>(output, t);
 }

which reduces the problem to implementing GetNearestBezierPtQuintic() and a numeric root finding function for polynomials, given that quintic equations cannot be solved via a closed-form formula like the quadratic equation.

GetNearestBezierPtQuintic() –which returns the coefficients of [B(t) – P] ⋅ B’(t) when fully expanded given P and the control points of — turns out to be a little two much algebra to work out comfortably by hand, so I ran the open source symbolic mathematics application Sage on this Sage script yielding the following when translated into C#: (For the first time on the internet! … as far as I can tell)

static private List<Complex> GetNearestBezierPtQuintic(double x_0, double y_0, double x_1, double y_1,
            double x_2, double y_2, double x_3, double y_3, double x, double y)
{
    double t5 = 3 * x_0 * x_0 - 18 * x_0 * x_1 + 27 * x_1 * x_1 + 18 * x_0 * x_2 - 54 * x_1 * x_2 + 27 * x_2 * x_2 -
        6 * x_0 * x_3 + 18 * x_1 * x_3 - 18 * x_2 * x_3 + 3 * x_3 * x_3 + 3 * y_0 * y_0 - 18 * y_0 * y_1 +
        27 * y_1 * y_1 + 18 * y_0 * y_2 - 54 * y_1 * y_2 + 27 * y_2 * y_2 - 6 * y_0 * y_3 + 18 * y_1 * y_3 -
        18 * y_2 * y_3 + 3 * y_3 * y_3;
    double t4 = -15 * x_0 * x_0 + 75 * x_0 * x_1 - 90 * x_1 * x_1 - 60 * x_0 * x_2 + 135 * x_1 * x_2 -
        45 * x_2 * x_2 + 15 * x_0 * x_3 - 30 * x_1 * x_3 + 15 * x_2 * x_3 - 15 * y_0 * y_0 + 75 * y_0 * y_1 -
        90 * y_1 * y_1 - 60 * y_0 * y_2 + 135 * y_1 * y_2 - 45 * y_2 * y_2 + 15 * y_0 * y_3 - 30 * y_1 * y_3 +
        15 * y_2 * y_3;
    double t3 = 30 * x_0 * x_0 - 120 * x_0 * x_1 + 108 * x_1 * x_1 + 72 * x_0 * x_2 - 108 * x_1 * x_2 +
        18 * x_2 * x_2 - 12 * x_0 * x_3 + 12 * x_1 * x_3 + 30 * y_0 * y_0 - 120 * y_0 * y_1 +
        108 * y_1 * y_1 + 72 * y_0 * y_2 - 108 * y_1 * y_2 + 18 * y_2 * y_2 - 12 * y_0 * y_3 + 12 * y_1 * y_3;
    double t2 = 3 * x * x_0 - 30 * x_0 * x_0 - 9 * x * x_1 + 90 * x_0 * x_1 - 54 * x_1 * x_1 + 9 * x * x_2 -
        36 * x_0 * x_2 + 27 * x_1 * x_2 - 3 * x * x_3 + 3 * x_0 * x_3 + 3 * y * y_0 - 30 * y_0 * y_0 - 9 * y * y_1 +
        90 * y_0 * y_1 - 54 * y_1 * y_1 + 9 * y * y_2 - 36 * y_0 * y_2 + 27 * y_1 * y_2 - 3 * y * y_3 + 3 * y_0 * y_3;
    double t1 = -6 * x * x_0 + 15 * x_0 * x_0 + 12 * x * x_1 - 30 * x_0 * x_1 + 9 * x_1 * x_1 - 6 * x * x_2 +
        6 * x_0 * x_2 - 6 * y * y_0 + 15 * y_0 * y_0 + 12 * y * y_1 - 30 * y_0 * y_1 + 9 * y_1 * y_1 -
        6 * y * y_2 + 6 * y_0 * y_2;
    double t0 = 3 * x * x_0 - 3 * x_0 * x_0 - 3 * x * x_1 + 3 * x_0 * x_1 + 3 * y * y_0 - 3 * y_0 * y_0 -
        3 * y * y_1 + 3 * y_0 * y_1;

    return new List<Complex> { (Complex)t0/t5, (Complex)t1/t5, (Complex)t2/t5, (Complex)t3/t5, (Complex)t4/t5, (Complex)1.0 };
}

and decided to use Laguerre’s Method to solve the equation.

I chose Laguerre’s Method because it is optimized for polynomials, is less flaky than Newton-Raphson, is relatively concise, and is easy to port from C++ given .Net’s System.Numerics’s implementation of complex numbers. I ported the implementation of Laguerre’s Method found in Data Structures and Algorithms in C++ by Goodrich et. al. (here). This solver works well enough for my purposes; however, I think the optimal thing to do to get the optimal nearest-point-on-a-bezier algorithm would be to implement one of a class of newish algorithms that are specifically targeted at solving the quintic, e.g. “Solving the Quintic by Iteration” by Doyle and McMullen. No implementations of these algorithms seem to be publicly available, however, and it would be a research project that I don’t have time for right now to translate that paper from academese into code.

An Application for Creating Novel Escher-like Tessellations of the Plane

Over the past several weeks I have been investigating desktop applications for creating novel Escher-like tilings of the plane. Basically I’ve determined that none of what is publicly available is useful to me. The game I have in development will involve an animated Escher-like tessellation of a double logarithmic spiral — see here. The “logarithmic spiral” part of it means the publicly available applications can’t help me: they are too limited in what they will do and I can’t extend them because they are proprietary, so I have created my own tool.

These applications are “too limited” in the sense that none encompass everything possible that one would regard as a 2D tessellation. You have some applications that simplistically shoehorn you into a few kinds of geometries ignoring the richness of the whole domain of tiled patterns. You have other applications that try to take the domain seriously and follow the mathematical literature on the subject. What this means, generally, is that they allow the user to construct a tessellation in terms of its “wallpaper group” which is the mathematical classification of a plane pattern based on the symmetries the pattern admits.

The latter seems like what one would want but in practice it isn’t, at least it is not what I want. It is too heavy of a constraint to work only in terms of wallpaper group because there are all kinds of patterns one might be interested in that fall outside of this formalism: Escher’s tessellations of the Poincare disk model of the hyperbolic plane, for example, or, say, any kind of aperiodic tiling or any tessellations that admit similarity as a symmetry, i.e. scaling, such as Escher’s square limit etchings, or my case: various spiral patterns.

It is instructive that Escher himself did not work in terms of symmetry groups. He created his own classification system which Doris Schattschneider refers to as Escher’s “layman’s theory” of tilings in her book Visions of Symmetry (the authoritative text on Escher’s tessellations). Escher’s classification system comes from a pragmatic perspective rather than a formal mathematical perspective. He starts with basic cells — a square, a rectangle, an isosceles right triangle, etc. — and then enumerates the ways in which one can surround the cell with itself that result in tilings of the plane. There is a good overview of his system here.

What I like about Escher’s approach is that it is geared towards construction. When you try to make one of these things you start with a tile or some a small set of simple tiles that tile the plane. You try to modify them until you get something you like. I decided to implement software that captures that approach. I didn’t want to have to be too concerned with “wallpaper group” or other abstractions which I don’t personally find to be intuitive anyway.

What I came up with is the following application that is written in C#. Below is video in which I use the application to construct something like Escher’s butterfly tessellation — system IX-D in his layman’s theory — but on a double logarithmic spiral (This is going to be the basis for the art of the first level of my game; the butterflies will be animated, flapping their wings, etc.)

The way the application works is that there are two kinds of tiles: normal tiles and tile references. Normal tiles are composed of “sides”. There are two kinds of sides: normal sides and side references. Tile references and side references point to a another tile or side and say basically, “I am just like that tile or side, except apply this transformation to me”. The transformations that tile references can apply to a tile are defined in terms of affine transformation matrices. The transformations that side references can apply are just two kinds of flipping relative to the sides’ end points, either flipping “horizontally” or flipping “vertically” or both. The application then allows the user to add and move vertices on any sides that are on tiles that are marked as editable and then resolves all the references in real-time.

Right now, all of the information about tiles and sides and which is a reference to which has to be hand coded as an XML file, which is a pain (for the above I wrote a separate Python program that generated the XML defining a double logarithmic spiral of tiles that interact with each other according to Escher’s IX-D system), but it is an extremely flexible formulation (it could, for example, handle hyperbolic tessellations in theory … if you added a kind of side that is a circular arc and moebius transformations to the set of transformations the applications knows how to apply to referenced tiles).

Eventually I’d like to release this application  as open source but I am not sure it will be useful to anyone but me in its current form. I need to incorporate the part that one has to hand code in XML into the actual GUI … fix bugs and bezier curves and so forth, but please feel free to contact me if you are interested in the application in its current state or otherwise.

What I am working on…

I’m working on a puzzle game for iPads that takes the mechanics of a somewhat obscure game from the 1990s(*) and puts them on a double logarithmic spiral.

The final app will be styled similar to one of M. C. Escher’s tessellations of the plane with the tessellation changing at level breaks. I don’t have a name yet…

Gameplay will be like the following, which is video of a prototype I wrote in C# (just WinForms to the GDI, nothing fancy). I’m going to write the real thing in Swift using Sprite Kit, unless Swift turns out to be too immature in which case I’ll use C++ and Cocos2d-x again.

*: The identity of which I leave as an exercise for the reader.

Discretely Distributed Random Numbers in C#

Random numbers generated from a discrete distribution are a commonly needed thing in game development.

By “discrete distribution” we just mean the roll you get from something like an unfair die, e.g. you want a random number from 0 to 5 but you want 4 and 5 to be twice as likely as 0, 1, 2, or 3. If we think of each possible random value as having a weight, in this case 0, 1, 2, and 3 would have a weight of 1 and 4 and 5 would have a weight of 2.

A simple way to generate these kinds of random values is the following. Given some n such that we want random values ranging from 0 to n-1 where for each 0 ≤ i < n we have a weight w(i):

  1. Build a data structure mapping cumulative weight to each value i. By cumulative weight we mean for each i the sum of w(0), … , w(i-1).
  2. Generate a random number r from 0 to W-1 inclusive where W is the total weight i.e. the sum of w(i) for all i.
  3. If r is a cumulative weight in our data structure return the value associated with it; otherwise, find the value v that is the first item in the data structure that has a cumulative weight greater than r and return v-1.

Obviously the data structure in the above could just be an unordered array but a better way to do it is to use a binary search tree because 3. will then be O(log n) rather than linear. In Java you can do this with a TreeMap. In C++ you can do this with an std::map (or in C++ you can do the whole thing with boost::random::discrete_distribution).

However, in C# you can’t just use a SortedDictionary, which is a binary search tree under the hood. You can’t use a SortedDictionary because it does not expose the equivalent of C++’s std::lower_bound and std::upper_bound or the equivalient of Java’s TreeMap.floorEntry(…) and TreeMap.ceilingEntry(…). In order to perform the “otherwise” part of 3. above you need to efficiently be able to find the spot in the data structure where a key would go if it was in the data structure when it is in fact not in the data structure. There is no efficient way to do this with a SortedDictionary.

However, C#’s List does support a BinarySearch method that will return the bitwise complement of the index of the next element that is larger than the item you searched for so you can use that. The downside of the above is that there will be no way to efficiently add or remove items to the discrete distribution, but often you don’t need this functionality anyway and the code to do the whole algorithm is very concise:

class DiscreteDistributionRnd
    {
        private List<int> m_accumulatedWeights;
        private int m_totalWeight;
        private Random m_rnd;

        public DiscreteDistributionRnd(IEnumerable<int> weights, Random rnd = null)
        {
            int accumulator = 0;
            m_accumulatedWeights = weights.Select(
                (int prob) => {
                    int output = accumulator;
                    accumulator += prob;
                    return output;
                }
            ).ToList();

            m_totalWeight = accumulator;
            m_rnd = (rnd != null) ? rnd : new Random();
        }

        public DiscreteDistributionRnd(Random rnd, params int[] weights) : 
            this(weights, rnd) { }

        public DiscreteDistributionRnd(params int[] weights) : 
            this(weights, null) { }

        public int Next()
        {
            int index = m_accumulatedWeights.BinarySearch(m_rnd.Next(m_totalWeight));
            return (index >= 0) ? index : ~index - 1;
        }
    }

where usage would be like the following:

            DiscreteDistributionRnd rnd = new DiscreteDistributionRnd(3,1,2,6);
            int[] ary = new int[4] {0,0,0,0};
            for (int i = 0; i < 100000; i++)
                ary[rnd.Next()]++;
            System.Diagnostics.Debug.WriteLine(
                "0 => {0}, 1 => {1}, 2 => {2}, 3 => {3}",
                (float)(ary[0] / 100000.0),
                (float)(ary[1] / 100000.0),
                (float)(ary[2] / 100000.0),
                (float)(ary[3] / 100000.0)
            );