Archive for the Category Programming

 
 

The World’s Simplest 2D Curve Library

For the game I’m working on I need to have sprites that travel along curving paths.

I’m talking about the sprites traveling along somewhat arbitrary curves, meaning curves that look good, not curves that result from gravity or other physical forces. If you need those kinds of curves, e.g. the parabolic trajectories of cannonballs, you need to simulate the forces acting on the sprites and that is not what I’m talking about in this post.

Caveat aside, an arbitrary curving path is a pretty common thing to need but I think is unnecessarily headache-inducing because curves in graphics are just confusing. Maybe you’ve found yourself thinking

  • I don’t know what the difference between a spline, a bezier curve, a bezier curve of various degrees, a B-spline, a t-spline, etc. is.
  • I don’t know which of the things mentioned above I need.
  • Every time I try to read the Wikipedia article on these things the math gets heavy and my eyes glaze over.

etc.?

So assuming it’s not just me, as a public service I’m going to try to clear this up.

Short version: if you need a sprite that travels along a curving path from point A to point B in some amount of time, you probably need a cubic bezier curve. Generally, in 2D game programming, all you will probably ever need is n cubic bezier curves, possibly concatenated together. You can concatenate them yourself if you need to, so what you need is a way to define a cubic bezier and a function to get the point along the bezier at some time t. Despite what you would think from trying to read the literature, this turns out to be trivial; I mean less than a dozen lines of code.

More thoroughly, explaining away my bulleted list above:

  • A spline is a more general term than a “bezier curve”: a bezier curve is a particular polynomial function (that I will implement below) that defines a curve that goes from point A to point B given some control points. A bezier spline is an aggregation of n of these. A general spline can be an aggregation of other kinds of curves, e.g. a B-spline is composed of a bunch of curves that are generalizations of bezier curves.
  • The only kinds of beziers you need to be concerned with are quadratic and cubic beziers. Quadratic beziers are just parabolas and are not interesting. Cubic beziers are curves that go from point A to point B and are tangent to a given line at A and tangent to a given line at B. They are defined by A and B plus two other control points that define the tangent lines and the weight they have on the curve.
  • Cubic bezier curves are easy to implement. See below.

So here is my curve “library”:
Bezier.h

#pragma once

#include <utility>

class Bezier {
	private:
		float x1_, y1_, x2_, y2_, x3_, y3_, x4_, y4_;
	public:
		Bezier(float x1, float  y1, float x2, float y2, float x3, float y3, float x4, float y4);
		std::pair<float,float> getPoint(float t) const;
};

Bezier.cpp

#include "Bezier.h"

Bezier::Bezier(float x1, float  y1, float x2, float y2, float x3, float y3, float x4, float y4) :
		x1_(x1), y1_(y1),
		x2_(x2), y2_(y2),
		x3_(x3), y3_(y3),
		x4_(x4), y4_(y4) {
}

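std::pair<float,float> Bezier::getPoint(float t) const {
	// Cubic Bernstein form, expanded and grouped by control point:
	// (1-t)^3*P1 + 3t(1-t)^2*P2 + 3t^2(1-t)*P3 + t^3*P4, evaluated per coordinate.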
	float x = (x1_+t*(-x1_*3+t*(3*x1_ - x1_*t))) + t*(3*x2_+t*(-6*x2_ + x2_*3*t)) + t*t*(x3_*3-x3_*3*t) + x4_*t*t*t;
	float y = (y1_+t*(-y1_*3+t*(3*y1_ - y1_*t))) + t*(3*y2_+t*(-6*y2_ + y2_*3*t)) + t*t*(y3_*3-y3_*3*t) + y4_*t*t*t;
	return std::pair<float,float>(x,y);
}

You define a cubic bezier by making a Bezier object, giving the constructor four points. (x1,y1) and (x4,y4) will be the start and end of the curve. The curve will be tangent to line segment (x1,y1)-(x2,y2) at its start and tangent to (x3,y3)-(x4,y4) at the end. To get a point along the curve call getPoint(t), where t=0.0 gives you (x1,y1), t=1.0 gives you (x4,y4), and 0.0 < t < 1.0 gives you a point in between, e.g. 0.5 is roughly the middle. Note that t is the curve’s parameter, not distance traveled: equal steps in t do not in general cover equal lengths of the curve, so a sprite driven by a steadily increasing t will speed up and slow down somewhat depending on how the control points are spaced.
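Since the post says you can concatenate cubic beziers yourself, here is a minimal sketch of one way that might look. The names `Point`, `CubicBezier`, `BezierPath`, and `timeToParameter` are hypothetical and are not part of the Bezier class above. The sketch assumes every segment gets an equal share of the overall parameter range, which is an arbitrary choice made for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not part of the Bezier class above.
struct Point { float x, y; };

struct CubicBezier {
    Point p1, p2, p3, p4;   // p1 and p4 are the endpoints, p2 and p3 the control points

    // Evaluate the cubic Bernstein form at parameter t in [0, 1].
    Point getPoint(float t) const {
        const float u  = 1.0f - t;
        const float b1 = u * u * u;
        const float b2 = 3.0f * u * u * t;
        const float b3 = 3.0f * u * t * t;
        const float b4 = t * t * t;
        return Point{ b1 * p1.x + b2 * p2.x + b3 * p3.x + b4 * p4.x,
                      b1 * p1.y + b2 * p2.y + b3 * p3.y + b4 * p4.y };
    }
};

// A path made of n cubic beziers laid end to end. Each segment gets an equal
// share of the overall parameter range (an assumption of this sketch).
class BezierPath {
public:
    void add(const CubicBezier& segment) { segments_.push_back(segment); }

    Point getPoint(float t) const {
        assert(!segments_.empty());
        if (t <= 0.0f) return segments_.front().getPoint(0.0f);
        if (t >= 1.0f) return segments_.back().getPoint(1.0f);
        const float scaled = t * static_cast<float>(segments_.size());
        std::size_t index = static_cast<std::size_t>(scaled);
        if (index >= segments_.size()) index = segments_.size() - 1;
        return segments_[index].getPoint(scaled - static_cast<float>(index));
    }

private:
    std::vector<CubicBezier> segments_;
};

// Map elapsed time onto the curve parameter, clamped to [0, 1].
float timeToParameter(float elapsed, float duration) {
    assert(duration > 0.0f);
    const float t = elapsed / duration;
    return t < 0.0f ? 0.0f : (t > 1.0f ? 1.0f : t);
}
```

In a game loop you would call something like `path.getPoint(timeToParameter(elapsed, duration))` each frame. For the joins to look smooth, the last control point of one segment, the shared endpoint, and the first control point of the next segment should lie on one line.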

So that’s it. Code is here. I also included a Win32 GDI project that draws cubic beziers, screenshot below. (The sample program is also a little example of how to write a very basic Win32 program, which these days younger programmers seem to appreciate as a sort of parlor trick…)

Syzygy Update

So, looking at the early entries of this blog, I must have started working on Syzygy around the beginning of the year 2012 because by March 2012 I had the Win32 prototype done. At that time, I didn’t own a Macintosh, didn’t own an iOS device, had never heard of cocos2d-x, and, professionally, was still writing image processing code for Charles River Labs / SPC. Since then SPC was killed, and I moved from Seattle to Los Angeles … but anyway as of today, about a year later, I have the primary functionality of Syzygy running on my iPad, re-using the source code, mostly, from that prototype. I haven’t really been working on it the whole time (there was a lot of moving-to-California in there somewhere), but here’s a screenshot (click for full-size):

There’s a common question in the mobile games forum of gamedev.net, “How can I make a game for iOS only using Windows?”. The answer to this is either (1) you can’t or (2) write your game to Marmalade or cocos2d-x on Windows and then when you are done get a friend with a Mac to let you register as an Apple developer, build under Xcode, and submit to the App store. I always say (1) is the serious answer and if you are unserious, or want to develop a really simple game, then go with (2). Basically I say this because you need to run your game on a device frequently and early, and I’m seeing the truth to this now.

Now that I have Syzygy running on a device I’m seeing issues with input which are artifacts of running on an iPad. The prototype implemented mouse input as a stand-in for touch input. It turns out touch screens and mice aren’t the same thing. The game plays on the device, but when you drag tiles your finger is in the way of the tile visually. You can’t see the tile you are dragging — this seems like it wouldn’t be a big deal, but it kind of is. … This sort of thing is the reason, in my opinion, that if you are not testing on a device during primary development then you are not really serious…

So not sure what I’m going to do about this, I’m thinking of making the tile the user is dragging larger and offset to the upper-left while the user is dragging it. The problem with this is to make it look nice I’d have to have large versions of all the relevant art and some of it I don’t even really remember how I rendered in the first place…

Marble Madness Madness

gamedev.net recently linked to this video about the making of Marble Madness, which got me thinking about the raster-to-vector via contour extraction script I wrote in Python last year and the fact that, it being the future and all, I can probably find all of the art from Marble Madness unrolled into a single image file. So three clicks later and, oh yeah:

(Click the image for full size)

So I ran the above through my raster-to-vector converter. Here are the results (zipped; this is a huge file, over 20,000 SVG paths). This file kills Adobe Illustrator. It took 15 minutes just to open it.

SVG for a single level is more manageable. Here’s the 2nd level as unzipped SVG … (curious to see if various browsers can handle this). Illustrator could handle this one pretty well, so I experimented with applying various vector filters. Below is a bit of it with the corners rounded on the paths (click for a larger version):


Not sure what this all amounts to … just some stuff I did today. However I did learn

  • Python is slow. It took a really long time to generate the big file, seemed too long. I think C++ would’ve been like an order of magnitude faster — that’s my intuition anyway.
  • My contour extraction program really works which is kind of surprising — I thought for sure running it on something like this would crash it. (It does still have the problem that it can’t handle paletted raster image formats, but that’s the only bug I encountered)

Sprite Packing in Python…

I’ve been working on my puzzle game Syzygy again, after a long hiatus, and am now writing to iOS/cocos2d-x rather than just working on the prototype I had implemented to Win32.

The way that you get sprite data into cocos2d is by including as a resource a sprite sheet image and a .plist file which is XML that specifies which sprite is where. Plists are apparently an old Mac thing — I had never heard of this format. .plists describing a lot of sprites would be a chore to write by hand so there is a cottage industry of sprite packing applications.

I tried out one called TexturePacker and liked it a lot — except that it is crippleware; I need a few features that are only in the full version; plus I can’t stand crippleware; and I think $30 is too much for something that I can write myself over the weekend. So I decided to write my own sprite packer over the weekend.

The result is pypacker, a python script: source code here. Usage is like

pypacker -i [input] -o [output] -m [mode] -p

where

  • [input] = a path to a directory containing image files. (In any format supported by the python PIL module.)
  • [output] = a path + filename prefix for the two output files, e.g. given C:\foo\bar the script will generate C:\foo\bar.png and C:\foo\bar.plist
  • [mode] = the packing mode. Can be either “grow” or fixed dimensions such as “256×256”. “grow” tells the algorithm to begin packing rectangles from a blank slate, expanding the packing as necessary. “256×256” et al. tell the algorithm to start with the given image size and pack sprites into it by subdivision, throwing an error if they won’t all fit.
  • -p = optional flag indicating you want the output image file dimensions padded to the nearest power-of-two-sized square.
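For what it’s worth, the padding that the -p flag describes can be sketched in a few lines. This is not taken from pypacker: `nextPowerOfTwo` and `paddedSquareSide` are hypothetical names, and the sketch assumes “nearest power-of-two-sized square” means the smallest power of two that is at least as large as the longer side of the packed image.

```cpp
#include <algorithm>
#include <cassert>

// Smallest power of two that is >= n (treats n == 0 as 1).
// Hypothetical helper for illustration; not taken from pypacker.
unsigned nextPowerOfTwo(unsigned n) {
    assert(n <= 0x80000000u);   // anything larger would overflow the shift below
    unsigned p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

// Side length of the padded square for a packed image of the given size.
unsigned paddedSquareSide(unsigned width, unsigned height) {
    return nextPowerOfTwo(std::max(width, height));
}
```

Under that assumption a 300 by 120 packing would be padded out to 512 by 512.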

The algorithm I used is a recursive bin packing algorithm in which sprites are placed one-by-one into a binary tree. I based it directly on Jake Gordon’s work in Javascript for generating sprite sheets for use in CSS, described here, only my algorithm is sort of like version 2 of his i.e. I fixed an issue that bugged me about his algorithm.

The core of the algorithm is a function that looks like this:

def pack_images( named_images, grow_mode, max_dim):
    root=()
    while named_images:
        named_image = named_images.pop()
        if not root:
            if (grow_mode):
                root = rect_node((), rectangle(0, 0, named_image.img.size[0], named_image.img.size[1]))
            else:
                root = rect_node((), rectangle(0, 0, max_dim[0], max_dim[1]))
            root.split_node(named_image)
            continue
        leaf = find_empty_leaf(root, named_image.img)
        if (leaf):
            leaf.split_node(named_image)
        else:
            if (grow_mode):
                root.grow_node(named_image)
            else:
                raise Exception("Can't pack images into a %d by %d rectangle." % max_dim)
    return root

We iterate through the images we want to pack. For each image, try to find a rectangular node in the tree that can contain the image. If one exists, place the image in the node and subdivide the node such that the remaining space, not taken up by the image, is available in the tree (this is what ‘split_node’ does). If such a node cannot be found, throw an exception if we are not in ‘grow’ mode or expand the root rectangle node to accommodate the new image if we are in ‘grow’ mode.

This routine is very similar to the Javascript implementation I linked to above. The difference is in the details about the structure of the binary tree. Jake Gordon’s Javascript implementation uses a node type that stores an image in the upper left and has children that he calls ‘right’ and ‘down’  like this:

Since actual data is always burnt into the upper left, it means that the tree can never subdivide into this space; we can never recurse into the upper left. This results in the grow_node routine being awkward to write. When we grow the root we either want to extend to the right or extend down; if the upper left can be a node and not image data, this is a simple matter of creating a new node and making the existing root its upper or left child. Anyway, Jake Gordon’s implementation results in a packing tree that cannot both grow right and grow down simultaneously because it would have been complicated to implement this. This limitation is not a problem practically as long as you sort the images from largest to smallest before running the packing algorithm, which is a standard heuristic from the bin packing literature.

I however wanted to see if the standard sorting heuristic is really accomplishing anything. I wanted to be able to pack rectangles in random order. I therefore simplified the trinary node structure of the Javascript implementation into true binary nodes either oriented horizontally or vertically like this:

Furthermore, now only leaves can contain images, and if a node is not a leaf it always has two valid, that is non-null, children. Using this type of tree structure makes the full grow_node routine more or less trivial.
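To make the tree shape concrete, here is a rough sketch in C++ of that kind of binary node, plus the search for a free leaf. It is an illustration only, not a translation of pypacker: the names `Rect`, `Node`, `findEmptyLeaf`, and `splitSideBySide` are hypothetical, and the real script’s classes may be organized differently.

```cpp
#include <cassert>
#include <memory>

struct Rect { int x, y, w, h; };

// Hypothetical node layout for illustration; pypacker's actual classes may differ.
// A node is either a leaf (no children) or has exactly two non-null children.
struct Node {
    Rect rect;
    bool occupied = false;          // only meaningful for leaves
    std::unique_ptr<Node> first;    // left child or top child
    std::unique_ptr<Node> second;   // right child or bottom child

    explicit Node(const Rect& r) : rect(r) {}
    bool isLeaf() const { return !first && !second; }
};

// Depth-first search for an unoccupied leaf big enough for a w-by-h image.
Node* findEmptyLeaf(Node* node, int w, int h) {
    if (node == nullptr) return nullptr;
    if (node->isLeaf()) {
        const bool fits = !node->occupied && w <= node->rect.w && h <= node->rect.h;
        return fits ? node : nullptr;
    }
    if (Node* found = findEmptyLeaf(node->first.get(), w, h)) return found;
    return findEmptyLeaf(node->second.get(), w, h);
}

// Turn a leaf into an interior node with two children placed side by side:
// the first child takes the left `leftWidth` columns, the second takes the rest.
void splitSideBySide(Node& leaf, int leftWidth) {
    assert(leaf.isLeaf() && leftWidth > 0 && leftWidth < leaf.rect.w);
    const Rect r = leaf.rect;
    leaf.first.reset(new Node(Rect{ r.x, r.y, leftWidth, r.h }));
    leaf.second.reset(new Node(Rect{ r.x + leftWidth, r.y, r.w - leftWidth, r.h }));
}
```

A stacked (top and bottom) split would be the same thing with the roles of width and height swapped. Growing the root then amounts to making a new interior node whose first child is the old root.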

Beyond that, I’m using the following heuristics:

  • If the orientation (horizontally or vertically) of a split is not forced, split with the orientation that will result in the new empty node having the largest area
  • If the orientation of growing the root rect is not forced, grow in the direction that leads to the smallest increase in the maximum side length of the root rectangle. (This heuristic enforces squarishness and is extremely important. Without doing this the grow version of the algorithm is basically unusable, and in this sense this grow heuristic can be considered part of the algorithm rather than a heuristic that can be swapped out)
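The two heuristics above might be sketched as follows. This is my reading of them, not code from pypacker. The names `chooseGrowDirection` and `chooseSplitOrientation` are hypothetical, both functions assume neither option is forced, and ties are broken arbitrarily. For the split, the sketch assumes the “new empty node” is the larger free rectangle left over after the first cut.

```cpp
#include <algorithm>
#include <cassert>

enum class GrowDirection { Right, Down };
enum class SplitOrientation { SideBySide, Stacked };

// Grow in the direction that keeps the longer side of the root as short as
// possible. Growing right yields (rootW + imgW) x max(rootH, imgH); growing
// down yields max(rootW, imgW) x (rootH + imgH). Ties go to Right (arbitrary).
GrowDirection chooseGrowDirection(int rootW, int rootH, int imgW, int imgH) {
    assert(rootW > 0 && rootH > 0 && imgW > 0 && imgH > 0);
    const int maxSideIfRight = std::max(rootW + imgW, std::max(rootH, imgH));
    const int maxSideIfDown  = std::max(std::max(rootW, imgW), rootH + imgH);
    return (maxSideIfRight <= maxSideIfDown) ? GrowDirection::Right
                                             : GrowDirection::Down;
}

// Placing an imgW x imgH image in the corner of a leaf leaves an L-shaped hole.
// Cutting side by side first frees a (leafW - imgW) x leafH rectangle; cutting
// stacked first frees a leafW x (leafH - imgH) rectangle. Pick the bigger one.
SplitOrientation chooseSplitOrientation(int leafW, int leafH, int imgW, int imgH) {
    assert(imgW > 0 && imgH > 0 && imgW <= leafW && imgH <= leafH);
    const long areaBeside = static_cast<long>(leafW - imgW) * leafH;
    const long areaBelow  = static_cast<long>(leafW) * (leafH - imgH);
    return (areaBeside >= areaBelow) ? SplitOrientation::SideBySide
                                     : SplitOrientation::Stacked;
}
```

The grow rule is what pushes the packing toward a square: a wide root prefers to grow down and a tall root prefers to grow right.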

Sorting by size (max side length) turns out to be about a 6% improvement with this algorithm. Here’s 500 rects packed with sorting (top) and without (bottom):

The Cocos2d-x Device Orientation Bug…

This took me all morning to figure out. I’m posting here so there is clear information on this subject in at least one place on the internet.

The situation is this: when targeting iOS6 using Cocos2d-x v2.0.2, there is a bug in the auto-generated “Hello, World” code that shows up in a fresh project such that the compiled game will not display in landscape orientation even after following the steps in this item from Cocos2d-x documentation (such as it exists).

The solution is to follow the steps enumerated in this note from Walzer Wang. This fix is probably already in v2.1 but 2.1 is still beta, as far as I know, so this issue is probably still in a lot of code out there…

Thoughts on porting Syzygy to iOS

I’ve started trying to figure out the way in which I’m going to port Syzygy to iOS. I don’t actually own a Mac — though I may get one this weekend — so this is all theoretical at this point.

What I have right now is an implementation written in C++ to the Win32 API. Part of this implementation is a very basic 2D game framework. This 2D framework has an abstract widget class that has render and update methods. I didn’t call this class “sprite” because it is more general (and basic) than a sprite class: it can be implemented as anything that knows how to update and draw itself. For example, I have text widgets that call the Win32 DrawText function in the draw method. Widgets are contained in GamePhase objects; GamePhases have a vector of lists of widgets, where each widget list represents a layer, so the order of a GamePhase’s list vector is effectively enforcing a z-order. GamePhases have update and render functions that can do phase specific rendering (e.g. draw a background) and then call the render and update methods of the widgets contained in the layers. GamePhases also have a predicate “IsPhaseComplete” and an accessor “GetNextPhase” which are used along with update and render to implement the game loop.
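Roughly, a framework of that shape could look like the sketch below. None of this is the actual Syzygy code. `RenderContext` is a hypothetical platform-neutral stand-in for the HDC, `runFrame` is a made-up name for one pass of the game loop, and `CounterWidget` and `TimedPhase` exist only to show the pieces working together.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

// Hypothetical stand-in for whatever the renderer needs (an HDC on Win32).
struct RenderContext { int drawCalls = 0; };

class Widget {
public:
    virtual ~Widget() {}
    virtual void update(float dt) = 0;
    virtual void render(RenderContext& ctx) = 0;
};

class GamePhase {
public:
    explicit GamePhase(std::size_t numLayers) : layers_(numLayers) {}
    virtual ~GamePhase() {}

    void addWidget(std::size_t layer, std::shared_ptr<Widget> widget) {
        assert(layer < layers_.size());
        layers_[layer].push_back(widget);
    }
    virtual void update(float dt) {
        for (auto& layer : layers_)
            for (auto& widget : layer) widget->update(dt);
    }
    // Layers are drawn in vector order, so layer 0 ends up at the back.
    virtual void render(RenderContext& ctx) {
        for (auto& layer : layers_)
            for (auto& widget : layer) widget->render(ctx);
    }
    virtual bool isPhaseComplete() const = 0;
    virtual std::shared_ptr<GamePhase> getNextPhase() = 0;

private:
    std::vector<std::list<std::shared_ptr<Widget>>> layers_;
};

// One pass of the game loop: returns the phase to run on the next frame.
std::shared_ptr<GamePhase> runFrame(std::shared_ptr<GamePhase> phase,
                                    float dt, RenderContext& ctx) {
    phase->update(dt);
    phase->render(ctx);
    return phase->isPhaseComplete() ? phase->getNextPhase() : phase;
}

// Minimal concrete types, only to show how the pieces fit together.
class CounterWidget : public Widget {
public:
    int updates = 0;
    void update(float) override { ++updates; }
    void render(RenderContext& ctx) override { ++ctx.drawCalls; }
};

class TimedPhase : public GamePhase {
public:
    explicit TimedPhase(float duration) : GamePhase(2), remaining_(duration) {}
    void update(float dt) override { GamePhase::update(dt); remaining_ -= dt; }
    bool isPhaseComplete() const override { return remaining_ <= 0.0f; }
    std::shared_ptr<GamePhase> getNextPhase() override { return nullptr; }

private:
    float remaining_;
};
```

Nothing in the sketch mentions Win32, which is the point: only the body of a widget’s render method would need to know about the platform.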

That’s basically it as far as a game engine goes. This code clearly is not logically platform-dependent. In practice Windows leaked into the implementation in the Render method which takes an HDC as a formal parameter and elsewhere where I wasn’t being careful in avoiding Win32 types. So my initial plan on porting to iOS was to refactor the 2D game framework part of the codebase to be truly platform independent and then to find an open source 2D drawing library that someone else implemented on top of OpenGL ES and reimplement the rendering code in terms of that 2d library.

The trouble is the 2D OpenGL-based library surprisingly doesn’t seem to exist. There is a project that someone did called Gles2d which is what I want but it is orphaned and was never adapted to iOS anyway. It was implemented for the GamePark32 hardware, I believe. So my options are

(1) Stick with the original plan and adapt the Gles2D codebase to iOS myself.
(2) Stick with the original plan and write my own 2D graphics layer in OpenGL ES.
(3) Throw out everything and reimplement the application to Cocos2d in Objective-C.
(4) Keep whatever I can of my code and re-factor to use the Marmalade framework in C++.
(5) Keep whatever I can of my code and re-factor to use Cocos2d-x in C++.
(6) Stick with the original plan and write the platform dependent drawing stuff to SDL 1.3.

Long story short, I think I’m going to do (5).

(1) and (2) are just not work I feel like doing at this time. (3) would be a good solution but I’d be locked into iOS and would have to gain more competence at Objective-C development than I feel like investing time-wise at this point; however, I may end up doing things this way if it becomes clear that it is the easiest approach. (4) is out because I don’t think I need a very powerful game engine, Marmalade costs money, and I wouldn’t be using most of it. I’m ruling out (6) because I don’t really trust SDL 1.3 on iOS; maybe I’m wrong about this but SDL doesn’t officially support iOS and it just seems like there would be problems.

So (5) … Cocos2d-x is a reimplementation of the Cocos2d API but to C++ rather than Objective-C. It is designed to be cross-platform (i.e. across iOS and Android specifically) and the project looks alive and well. There is even a Win32 build of it that uses PowerVR’s GLES emulator for Windows so I could in theory start work without actually owning a Macintosh. The only problem I see with Cocos2d-x is that documentation seems to be non-existent and it is being developed by guys who are clearly speaking English as a second language so I may have trouble finding answers to questions and so forth … we will see.

Anyway, … thoughts?

Syzygy for Win32, pre-pre-alpha release

I’m releasing a prototype version of a puzzle game, Syzygy, that I eventually intend to port to iOS and possibly Android. The prototype is written to the Win32 API and should run on basically any Windows system without installing anything.

Syzygy can be downloaded here. Just unzip these three files into a directory and run the executable. I have the Syzygy prototype parametrized such that a single XML file defines its gameplay. I’m looking for play testers who are interested in abstract puzzle games to play the game and provide feedback regarding good values for the definable parameters. If I get multiple helpful submissions I’ll give $60 via paypal to whoever has the best revised XML file. Here’s a brief explanation of the XML file.

The game is a Scrabble-like word game re-imagined as a one-player action puzzle game. Here’s a screenshot (click on the image for a full-sized version):

Basically the game works as follows:

  • The bar on the left is the game timer. When it is empty the game is over.
  • Letter tiles randomly appear and the player must position the tiles in a legal crossword-style grid by dragging them with the mouse pointer.
  • When the player has positioned tiles such that they form two or more legal connected words, the player can double-click on one of the tiles to “lock them in” and the two or more words are then scored as follows (This is a modified version of the scoring used in the game Literaxx, which is the public domain Scrabble variant):
    • Yellow tiles are 1 point, green tiles are 2, blue tiles are 3, and red tiles are 5
    • A tile on a board cell of matching color receives triple its point value.
    • The 2x and 3x board cells are double and triple word scores.
    • There are two levels of parameter controlled bonuses for long words (see the readme file in the game directory)
  • The remaining time in the game timer is increased proportionally to the point value earned by a successful lock in and the player’s score is increased by the score value of a successful lock-in times a level multiplier.
  • Locked in tiles can be played off of but cannot be moved.
  • Each tile has a bar timer widget on its right. When this timer expires, the tile disappears, negatively affecting the global timer if the tile that expires is not locked in.
  • There are three kinds of special tiles
    • Random tiles: Random tiles look like gray transparent letter tiles (the weird looking ‘M’ tile above is one). They cycle through the alphabet until they are dragged the first time at which point they behave like normal letter tiles with no point value.
    • Bomb tiles: (pictured above) When the user drags a bomb tile onto a group of connected locked-in or non-locked-in letter tiles, the target tiles will be destroyed without affecting the user’s score or game timer.
    • Juice tiles: (appear as lightning bolt icons, not shown above) When the user drags a juice tile onto a group of connected locked-in or non-locked-in letter tiles, the tiles’ local timer widgets receive additional time.
  • The game levels up after a certain number of tiles are locked in. The game timer is re-filled at level transitions.
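The basic word scoring described in the list above can be sketched like this. It is a guess at the rules as written, not the game’s actual code. The names `Color`, `PlacedTile`, `tilePoints`, and `scoreWord` are hypothetical. The sketch assumes that several 2x or 3x cells under one word multiply together (the Scrabble convention), and it leaves out the parameter-controlled long word bonuses and the level multiplier.

```cpp
#include <cassert>
#include <vector>

enum class Color { None, Yellow, Green, Blue, Red };

struct PlacedTile {
    Color tileColor;      // Color::None for a tile with no point value
    Color cellColor;      // Color::None for an uncolored board cell
    int wordMultiplier;   // 1 for a plain cell, 2 or 3 for a 2x or 3x cell
};

// Point values from the rules: yellow 1, green 2, blue 3, red 5.
int tilePoints(Color c) {
    switch (c) {
        case Color::Yellow: return 1;
        case Color::Green:  return 2;
        case Color::Blue:   return 3;
        case Color::Red:    return 5;
        default:            return 0;
    }
}

// Score one word. Assumes word multipliers stack multiplicatively and ignores
// the long word bonuses and level multiplier, which are parameter controlled.
int scoreWord(const std::vector<PlacedTile>& word) {
    int letterSum = 0;
    int multiplier = 1;
    for (const PlacedTile& tile : word) {
        assert(tile.wordMultiplier >= 1 && tile.wordMultiplier <= 3);
        int points = tilePoints(tile.tileColor);
        if (tile.cellColor != Color::None && tile.cellColor == tile.tileColor) {
            points *= 3;   // tile sits on a cell of matching color
        }
        letterSum += points;
        multiplier *= tile.wordMultiplier;
    }
    return letterSum * multiplier;
}
```

For example, a yellow tile on a yellow cell (3), a red tile on a 2x cell (5), and a green tile on a plain cell (2) would score (3 + 5 + 2) × 2 = 20 under these assumptions.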

Raster-to-Vector plus not bitching about significant whitespace

So this weekend I learned Python and implemented a basic raster to vector converter in it. It converts whatever file formats the Python PIL module supports to SVG.

Here’s the little guys from Joust embedded as SVG (This probably doesn’t work on Internet Explorer, but if you’re using IE you have bigger problems, brother)

and here’s the fruits from Pac-Man: (My favorite is the Galaxian fruit)

Specifically I implemented classical contour extraction as described in the 1985 paper “Topological Structural Analysis of Digitized Binary Images by Border Following”, which is the algorithm that OpenCV’s cvFindContours uses, modified such that it supports color images rather than black-and-white bitmaps. (This is something that I may eventually have to do at work for real, i.e. in C++, so I thought it would be a good way to learn Python and make sure my modified algorithm was actually going to work. It’s not a trivial change because color images allow two contours to be adjacent, which can’t happen in a bitmap image.)

Here’s the code. Usage is like:

python Sprite2Svg.py "c:\work\fruits.png" "c:\work\fruits.svg" 4 "#000000"

where the last two arguments are optional. The 3rd argument is a scale factor in pixels. The 4th argument is a background color that will be removed from the output. I think there’s a bug right now in which the code doesn’t support paletted images; trivial to fix, but I wanted to fix it in some general way and then forgot about it.

Anyway, things I like about Python:

  • Significant whitespace turns out to not be annoying. (Who knew?)
  • Coroutines!… about twenty years ago I was in a class in which I had to write a compiler in a language called CLU. All I remember about CLU is that (a) I once apparently wrote a compiler in it and (b) Coroutines! — well it had the generator/yield construct, anyway. I wish C++ had generator/yield
  • It isn’t Perl. Can’t stress this one enough.

Things I don’t like about Python:

  • The thing with version 3 being better but nobody using it.
  • The issue I’m talking about here is annoying and I think the “nonlocal” declaration isn’t the best solution in the world.

Playing simultaneous sounds in Win32

There is a nice unscary Win32 API call for playing sounds that is unpretentiously called “PlaySound”. It is extremely easy to use. Unfortunately, PlaySound(…) has one problem that makes it unsuitable for even casual game development: although it will play audio asynchronously, it will not play two sounds simultaneously. Depending on what you are doing this may not be a big deal, e.g. warning beeps in an editor app and such, but for anything that is media rich it basically means that you can’t use PlaySound.

This leaves you with several options:

  1. Use a third party library.
  2. Use the new Windows Audio Session API which came in with Vista.
  3. Use the (legacy) low-level Win32 MCI routines.
  4. Use the (legacy) low-level Win32 waveOut interface

So if you have serious audio needs The Right Thing is 1. or 2. above.

A little background on what I’m doing: I’m working on a Win32 C++ implementation of a puzzle game that I am eventually going to directly port to iOS. I don’t really need serious audio as I just want something that is playable for game design debugging, i.e. balancing. I do want simultaneous sound effects though because I plan on releasing this Win32 prototype to gather feedback and I think sound effects add playability in this case.

Anyway, I looked into 1. and 2. On 1., I couldn’t find a library that was dirt simple enough to justify its use in my project given all I am really looking for is a drop-in replacement for PlaySound that supports asynchronous simultaneous audio. On 2., well, the API is dense; I’m not looking for a research project and I couldn’t find a lot of sample code — maybe, it’s too new? or maybe I wasn’t looking in the right places.

On 3., if you don’t know what I’m talking about, I’m talking about this function and friends. I remember using the MCI (Media Control Interface) routines back in the day, probably around 1997, and they are pretty easy to use. It’s a weird little API relative to the rest of Win32. You send command strings to a device object that look like “open foo.wav alias Foo” and so forth, and I think that they will do simultaneous audio output. The problem is the MCI routines don’t let you play sounds from memory and I want to play from WAVE resources that are embedded into the executable. To use MCI I would have to write my resources to a temp directory at start up and then play the serialized files. This seemed too ugly.

Which leaves 4. The problem with 4. is that waveOut et al. are like the opposite of PlaySound: super low-level and absolutely un-user friendly. Fortunately there is a lot of code out there. In particular I found CWaveBox on CodeProject by a CodeProject user named Zenith__ that does basically what I need. I had lots of problems with this code, however, and ended up significantly re-working some code that was itself a re-work of the original CWaveBox that is posted in the CodeProject comments. I basically chose to work with the comments version because it executed the demo app as well as the original, but it was much shorter.

The code I started with is verbose, baroque, and in C rather than C++. I re-factored it in the following ways:

  • Added the ability to load WAV’s from resources
  • Replaced two C-style arrays: one with a std::vector and the other with a std::map
  • Cleaned up the .h file by making things that should be private private and moving as many implementation details into the .cpp file as I could
  • Simplified some of the thread synchronization stuff by using WaitForSingleObject instead of looping and polling
  • Got rid of all C-isms i.e. mallocs, callocs, memsets, memcpys, etc. Replaced with C++-style allocation, std::copy, std::fill, etc.
  • Got rid of magic constants by noticing many of them served as booleans, so replaced them with booleans
  • Changed names of crazily named variables and functions
  • Generally got rid of craziness … we’re talking goto’s.

I honestly don’t understand the code involved, which is weird given the amount of work I did on it. I mean, I get that there’s a thread that’s running and that it is playing chunks of waves in a loop, etc. That’s the level at which I understand it … kind of reminds me of this time I was working a DARPA contract and ported a function for converting from Military Grid Reference Numbers to longitude and latitude from Fortran to C without actually understanding the algorithm or knowing Fortran…

My code is here. Usage goes like this:

#include "WaveManager.h"
#include "resource.h"
// ...
{
   WaveManager wave_mgr( kNumSimultaneousWaves );
   
   wave_mgr.LoadFromResource(ID_SND_BUZZ, MAKEINTRESOURCE(ID_SND_BUZZ));
   wave_mgr.LoadFromResource(ID_SND_CLICK, MAKEINTRESOURCE(ID_SND_CLICK));

   // ...
   wave_mgr.Play(ID_SND_BUZZ);
   // ...
   // etc.
}

It actually works remarkably well.

However, the code still needs work if anyone is interested. There are a lot of functions returning error codes that are never checked. I think most of these should be changed to return void and should just throw if something serious happens. Also I think in the original implementation there could have been a race condition if you tried to load a wave after the playing thread was running. I wrapped the loading stuff in the critical section the playing thread uses, but this still might be a problem — I’m not sure. If you use this code it is safest to load all your sound assets at start up before you start playing anything. But in terms of fixing it up more, I think the main thing is that someone who actually knows what they are doing vis-à-vis wave output could probably cut the verbosity in half or so.

Oh, and one more note on my work, I used some C++11 stuff in there … basically I implemented loading from memory by taking the guts of the existing LoadFromFile implementation and making it into a function template that takes a handle type as the template parameter and has an additional formal parameter that is a std::function used for loading data. In the LoadFromFile case I instantiate the template with a file HANDLE as the template parameter and pass the function the Win32 call ReadFile wrapped thinly in a lambda. In the load from resource case, the template parameter is a custom buffer struct and the functor argument becomes, basically, a lambda wrapper around std::copy. But, anyway, just a heads up that this code will only compile under Visual Studio 2010 or later (or another compiler with C++11 lambda support) because of the lambdas. If you’re interested in using it but don’t do lambdas, it would be pretty easy to replace them with regular function pointers.
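The shape of that template-plus-std::function arrangement can be sketched without any Win32 in it. This is an illustration of the pattern, not the WaveManager code: `MemoryBuffer`, `loadBytes`, and `makeMemoryReader` are hypothetical names, and only the in-memory case is shown. A file version would pass a lambda that wraps the platform’s read call instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical in-memory source, standing in for the "custom buffer struct".
struct MemoryBuffer {
    const unsigned char* data;
    std::size_t size;
    std::size_t pos;   // read cursor
};

// Shared loading logic, written once. `read` pulls `count` bytes out of
// `source` into the destination and returns false on a short read.
// An empty result signals failure (or a zero-byte request).
template <typename Source>
std::vector<unsigned char> loadBytes(
        Source& source, std::size_t count,
        const std::function<bool(Source&, unsigned char*, std::size_t)>& read) {
    assert(read);
    std::vector<unsigned char> bytes(count);
    if (count > 0 && !read(source, bytes.data(), count)) {
        bytes.clear();
    }
    return bytes;
}

// Reader for the in-memory case: a thin lambda around std::copy.
std::function<bool(MemoryBuffer&, unsigned char*, std::size_t)> makeMemoryReader() {
    return [](MemoryBuffer& buf, unsigned char* dest, std::size_t count) -> bool {
        if (buf.size - buf.pos < count) return false;
        std::copy(buf.data + buf.pos, buf.data + buf.pos + count, dest);
        buf.pos += count;
        return true;
    };
}
```

Note that the template argument has to be spelled out at the call site, e.g. `loadBytes<MemoryBuffer>(buf, 4, makeMemoryReader())`, because the compiler cannot deduce `Source` through the std::function parameter when it is handed a bare lambda.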

Blitting with per-pixel alpha in Win32

I don’t know exactly how long the Win32 API has included an AlphaBlend() function. I mean, it came in whenever Msimg32.lib did, which I believe was back in the Windows 98 / Windows 2000 era.

It’s always been a pain in the ass to use; easy to use global alpha but per-pixel is a chore. You see a lot of people asking how to use this call at StackOverflow and other sites but basically never see a comprehensive reply. Generally the replies are variants of “It’s easy … you just need to pre-multiply the alpha”, which is true but unhelpful for two reasons: (1) doing so is a pain in the ass i.e. show me some code, buddy, and more importantly (2) in order to burn the alpha values into the RGB you need to actually have image data that contains an alpha channel but Win32 only natively supports loading BMP’s which generally don’t.

So on (2), for completeness, I should say I think that it is possible to get Photoshop to spit out a BMP file with alpha information. I haven’t tried it but the advanced options when saving a BMP have an option for the format “a8 r8 g8 b8”; I always see it grayed out but am guessing that it’s possible to do this somehow. Also I think that you can load PNG’s using GDI+. I know next to nothing about GDI+, but if that’s what you use I’m not sure the solution I propose below is worth it just to get out of having to write the pre-multiply function yourself.
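For reference, writing the pre-multiply step by hand is not much code once you have the pixels. Here is a minimal sketch, assuming a tightly packed buffer of 32-bit pixels in B, G, R, A byte order (the layout of a 32-bit DIB). The function name `premultiplyAlpha` is made up for illustration, and this is not what FreeImage does internally as far as I know; it is just the arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Premultiply a buffer of 32-bit pixels in place. Each pixel is assumed to be
// four bytes in B, G, R, A order with no padding between pixels.
void premultiplyAlpha(std::uint8_t* pixels, std::size_t pixelCount) {
    assert(pixels != nullptr || pixelCount == 0);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        std::uint8_t* px = pixels + 4 * i;
        const unsigned alpha = px[3];
        for (int c = 0; c < 3; ++c) {
            // Scale the color channel by alpha/255; +127 rounds to nearest.
            px[c] = static_cast<std::uint8_t>((px[c] * alpha + 127) / 255);
        }
    }
}
```

The alpha byte itself is left alone. Fully opaque pixels come out unchanged and fully transparent ones come out black, which is what AlphaBlend expects when AC_SRC_ALPHA is set.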

However, the above aside, if you want alpha-blended images in your application, you want to use PNG files, and if you are writing to Win32 you need to use a 3rd-party library. The two 3rd-party graphics libraries that people commonly use in Windows applications for things like loading PNG’s and JPEG’s are DevIL and FreeImage. I have no experience with DevIL and, frankly, it looks orphaned to me. What I suggest for blitting with semi-transparency in Win32 is using FreeImage, which seems tailor-made for doing this.

So below is an implementation on top of FreeImage demonstrating

  • Loading PNG’s (and other formats) as FreeImage data structures from Win32 resources.
  • Converting from FreeImage to HBITMAPs with alpha burned in.
  • Blitting the HBITMAPs with per-pixel alpha.

Here’s the code I use for loading a PNG from a resource to FreeImage’s data structures. FimgUtil::MemPtr and FimgUtil::Ptr are defined as

boost::shared_ptr<FIMEMORY>
boost::shared_ptr<FIBITMAP>

That is, I’m using smart pointers with custom deleters to call the appropriate FreeImage clean-up code on destruction. If this isn’t your style the following code can easily be modified to use raw pointers.

namespace {
    bool GetResourceData(const char* name, BYTE*& ptr, unsigned long& size) {
	    HRSRC hrsrc = FindResource(NULL, name, RT_RCDATA);
	    if (hrsrc == NULL) {
	        return false;
	    }
	    HGLOBAL handle = LoadResource(NULL, hrsrc);
	    if (handle == NULL) {
	        return false;
	    }
	    ptr = static_cast<BYTE*>(LockResource(handle));
	    size = SizeofResource(NULL, hrsrc);
	    return true;
    }
}

FimgUtil::MemPtr FimgUtil::GetMemoryPtr(FIMEMORY* fimem) {
    return FimgUtil::MemPtr(fimem, FreeImage_CloseMemory);
}

FimgUtil::Ptr FimgUtil::GetBitmapPtr(FIBITMAP* fibmp) {
    return FimgUtil::Ptr(fibmp, FreeImage_Unload);
}

FimgUtil::Ptr FimgUtil::LoadFiBitmapFromResource(const char* rsrc_name, FREE_IMAGE_FORMAT format) {
    BYTE* data;
    unsigned long size = 0;
    if (! GetResourceData(rsrc_name, data, size)) {
        return FimgUtil::Ptr ();
    }
    FimgUtil::MemPtr  buff = GetMemoryPtr(FreeImage_OpenMemory(data, size));
    if (buff.get() == 0) {
        return FimgUtil::Ptr ();
    }
    if (format == FIF_UNKNOWN) {
        format = FreeImage_GetFileTypeFromMemory(buff.get(), 0);
        if (format == FIF_UNKNOWN) {
            return FimgUtil::Ptr ();
        }
    }
    return GetBitmapPtr(
        FreeImage_LoadFromMemory(format, buff.get(), 0)
    );
}

To convert from an FIBITMAP* to an HBITMAP, and rolled together with the above:

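HBITMAP FimgUtil::FiBitmapToWin32Bitmap(const FimgUtil::Ptr & src_ptr, bool premultiply_alpha) {
    // Loading can fail and hand us an empty pointer; bail out rather than pass NULL to FreeImage.
    if (src_ptr.get() == 0) {
        return NULL;
    }
    if (premultiply_alpha) {
        FreeImage_PreMultiplyWithAlpha( src_ptr.get() );
    }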
    HDC hdc_scr = GetDC(NULL);
    FIBITMAP* src = src_ptr.get();
    HBITMAP hbm = CreateDIBitmap( hdc_scr, FreeImage_GetInfoHeader(src),
        CBM_INIT, FreeImage_GetBits(src), 
        FreeImage_GetInfo(src), 
        DIB_RGB_COLORS);
    ReleaseDC(NULL, hdc_scr);
    return hbm;
}

HBITMAP FimgUtil::LoadPngResource( const char* rsrc_name, bool premultiply_alpha) {
    FimgUtil::Ptr fibmp = LoadFiBitmapFromResource( rsrc_name, FIF_PNG ); 
    return FiBitmapToWin32Bitmap( fibmp, premultiply_alpha);
}

and a wrapper for blitting:

void FimgUtil::BlitWithAlpha( HDC dst, int dst_x, int dst_y, int wd, int hgt, HDC src, int src_x, int src_y, float alpha ) {
    BLENDFUNCTION bf;

    ZeroMemory( &bf, sizeof(BLENDFUNCTION) );
    bf.BlendOp = AC_SRC_OVER;
    bf.BlendFlags = 0;
    bf.AlphaFormat = AC_SRC_ALPHA;
    bf.SourceConstantAlpha = static_cast<BYTE>( 255 * alpha );

    AlphaBlend( dst, dst_x, dst_y, wd, hgt, src, src_x, src_y, wd, hgt, bf);
}

Source for my FreeImage utility functions is here.