Archive for the ‘Box2D AS3’ Category

Refactoring Box2D

Have you ever wondered whether your code is performing as best it can? What are your benchmarks, anyhow? How do you determine what “as best it can” means? These are all questions of heavy existential weight, but usually we have a reasonable idea of what it is we’d like to improve and why. Are you looking to turn a mess of unreadable and unmaintainable code into something more pleasurable to behold? Are you tasked with putting a processor pig on a diet? Usually you’ll have a sense of what you really need.

The real question is “How?”

I’m going to examine a concept called “Refactoring” and I’m going to give you a case study in it as I refactor parts of the Box2DAS3 (version 2.0.1) engine with the goal of improving runtime performance and memory consumption.

To refactor code is to change its inner workings without destructively changing its interfaces or altering the output of its functions and expressions in any way. You may wonder “How can this possibly help? Doesn’t this mean that the code is just doing what it always did?”

No, not in the slightest.

I had an itch to see if I could manage to squeeze some more juice out of the ol’ physics engine for the sake of the banner up above. If it’s running more smoothly on your machine now (and it should be. It certainly is on mine), this is the reason why.

Before we begin, there is one thing to keep in mind when you decide to refactor a 3rd party library : thier updates can break code you’re relying on! Your refactored code may not be compatible with the vision the creators originally conceived when they wrote it. They might have code they were intending to implement but simply haven’t, and their ideas might be better than yours. In this case you’re stuck with a lot of time sunk into your own custom branch of the project which might fall far behind the official branch. I will do my best to point out the danger zones as I encounter them.

I will admit to you that what follows is not the most formal method of refactoring. You can use a debugger to step through your code, or you can simply do as I have done here and take the initiative to “run through” yourself, from function to function and “execute” the code in your head. We’ll take this latter approach for now, as I find it tells an interesting story.

A reasonable place to look when identifying performance bottlenecks is anything that happens in a loop, or on an interval. The most obvious interval in the world of Box2D (and the one that is called 30 times per second in my banner) is b2World.Step(); Let’s examine this function. I’ll call out some places in the code via comments:

	public function Step(dt:Number, iterations:int) : void{
		m_lock = true;
		// *** An instantiation happens 30 times per second *** //
		var step:b2TimeStep = new b2TimeStep();
		step.dt = dt;
		step.maxIterations	= iterations;
		if (dt > 0.0)
			step.inv_dt = 1.0 / dt;
			step.inv_dt = 0.0;
		step.dtRatio = m_inv_dt0 * dt;
		step.positionCorrection = m_positionCorrection;
		step.warmStarting = m_warmStarting;
		// Update contacts.
		// Integrate velocities, solve velocity constraints, and integrate positions.
		if (step.dt > 0.0)
		// Handle TOI events.
		if (m_continuousPhysics && step.dt > 0.0)
		// Draw debug information.
		//*** Regardless of whether it's useful, or whether we're debugging,
		         this function is called 30 times per second ***//
		m_inv_dt0 = step.inv_dt;
		m_lock = false;

First, note the instantiation of a b2TimeStep each time we call the Step function. What is a b2TimeStep? It’s this:

package Box2D.Dynamics{
public class b2TimeStep
	public var dt:Number;			// time step
	public var inv_dt:Number;		// inverse time step (0 if dt == 0).
	public var dtRatio:Number;		// dt * inv_dt0
	public var maxIterations:int;
	public var warmStarting:Boolean;
	public var positionCorrection:Boolean;

It’s simply a data structure. There are no functions at all. Now, in case you did not know, instantiation is expensive. It is not something you want to do willy-nilly if you can possibly help it, and from the look of the code above it seems like all the variables are set immediately after instantiation within the scope of the Step function. So instead of instantiating from scratch, let’s simply make one local Class-level variable to contain our Step function’s b2TimeStep object, and do some refactoring:

	private var m_stepScopeTimeStep:b2TimeStep = new b2TimeStep();
	public function Step(dt:Number, iterations:int) : void{
		m_lock = true;
		//var step:b2TimeStep = new b2TimeStep();
		var step:b2TimeStep = m_stepScopeTimeStep;

“Why did he choose to to simply assign the old step variable with the object referenced by m_stepScopeTimeStep? Why not just find/replace all instances of the word “step” with “m_stepScopeTimeStep”?

When refactoring it is critical to take small steps. We know that the code works as it was written, so the goal at least for now is to modify it as little as possible while still making it better. Yes, we are still allocating memory for an unneccessary variable at the start of the Step function, but what’s more important right now is to stop instantiating a needless variable every time Step is called.

We now dutifully confirm that our change has not broken the code. It is best to do this with a debugger, but for our purposes we’ll simply execute the code and ensure that it still runs correctly.

This same sort of redundant instantiation happens in other places in the library as well. notably, there are a great many b2Island objects instantiated every frame when only a single one is ever needed. It can simply be re-initialized and reused.

Cutting down on instantiations of b2Islands and b2TimeSteps alone helps save several MB of memory over time, which can be better spent rendering the awesomeness of physics.

The next thing we’ll do in the Step function is examine the call to the DrawDebugData function. Here’s what’s going on inside of it:

public function DrawDebugData() : void{
		if (m_debugDraw == null)
		// snipped ...

One thing to remember about ActionScript is that function calls are relatively expensive to perform. You shouldn’t do it without a good reason, and particularly not on a loop. So what we’ll do instead is this:

if(m_debugDraw) DrawDebugData();

In the event that you’re not debugging the application, you’ll save 30 function calls per second here just by confirming that there’s even a reason to call the function in the first place. Again, we’re not going to change anything inside the DrawDebugData function. It’s not quite in the scope of the refactor… I really couldn’t care less right now about how well it runs in debug mode, as I’m only displaying content in production mode.

So, that’s all well and good. Where should we go next? Let’s scan our way down and see what functions are being called in Step

// I wonder what's happening in this function call?

We’ve seen our first call here in this line. m_contactManager is an instance of b2ContactManager, so let’s open it up and see what this function does:

	public function Collide() : void
		// Update awake contacts.
		for (var c:b2Contact = m_world.m_contactList; c; c = c.m_next)
			var body1:b2Body = c.m_shape1.m_body;
			var body2:b2Body = c.m_shape2.m_body;
			if (body1.IsSleeping() && body2.IsSleeping())

Doesn’t look like anything suspicious is happening here, does it?


    // getter functions for "IsSleeping"?  Is this just a boolean we can retrieve for ourselves?
    if (body1.IsSleeping() && body2.IsSleeping())

Let’s open up b2Body and have a look-see:

/// A rigid body.
public class b2Body
	/// Creates a shape and attach it to this body.
	/// @param shapeDef the shape definition.
	/// @warning This function is locked during callbacks.
	public function CreateShape(def:b2ShapeDef) : b2Shape{
           /*** snip ***/
	/// Is this body sleeping (not simulating).
	public function IsSleeping() : Boolean{
		return (m_flags & e_sleepFlag) == e_sleepFlag;
           /*** snip ***/
	public var m_flags:uint;
           /*** snip ***/
	// m_flags
		static public var e_frozenFlag:uint			= 0x0002;
		static public var e_islandFlag:uint			= 0x0004;
		static public var e_sleepFlag:uint			= 0x0008;
		static public var e_allowSleepFlag:uint		= 0x0010;  // this is the one!
		static public var e_bulletFlag:uint			= 0x0020;
		static public var e_fixedRotationFlag:uint	= 0x0040;
		static public var e_sleepFlag:uint			= 0x0008;

So the b2Body not only has a function call for IsSleeping, but it does a bitwise operation based on its current flags uint to determine whether or not it counts as “asleep”. Changing the internal guts of how Box2D determines whether an object sleeps might be worthwhile, but that would require some benchmarking. For now what we’ll do is take advantage of the fact that the variables are all public and perform the comparison without calling the function:

	public function Collide() : void
		// Update awake contacts.
		for (var c:b2Contact = m_world.m_contactList; c; c = c.m_next)
			var body1:b2Body = c.m_shape1.m_body;
			var body2:b2Body = c.m_shape2.m_body;
			//if (body1.IsSleeping() && body2.IsSleeping())
			if ((body1.m_flags & b2Body.e_sleepFlag) == b2Body.e_sleepFlag && 
				(body2.m_flags & b2Body.e_sleepFlag) == b2Body.e_sleepFlag)

We’ve now eliminated two needless function calls, each of which were being called by Step. Testing this code shows that the application continues to behave properly. But before we go on…

Danger : While we have done nothing to change the actual results of the code on execution we have definitely “painted ourselves into a corner” in a sense. The code becomes vastly more efficient by removing needless get/set function calls, but by breaking encapsulation we’re now at the mercy of fate. By no longer relying on the IsSleeping() function, we’ve lost out on the potential that the function itself may be made more efficient… or more disastrously, if in a future version of the library the way “sleep” is determined changes we’re no longer protected from it by a function that abstracts it away from us. It’s a potentially future-breaking change. In this particular case, it’s not likely that there would be a change, but that’s not the case for other parts of the library. Particularly, there are a number of functions of b2Vec2 that simply return new “cloned” instances of the b2Vec2 with certain transformations or maths applied to them. These functions, called repeatedly, are quite inefficient as they not only are a wasteful function call but also an instantiation. We *could* simply instantiate our own new b2Vec2 instances and apply the simple math functions ourselves, which would save us a function call. Potentially a better solution though, would be to make the function call itself less wasteful. If the authors (or we) choose to create an object pooling scheme that allows them to generate a limited number of b2Vec2 instances and recycle them it would justify keeping the function around. At times like those, it’s completely up to your own judgment and what risks you’re willing to live with. In the case of Box2D, the library is overly encapsulated in a number of ways that hamper performance. More frustratingly, there are several function calls that happen repeatedly where the function body has nothing inside other than a //TODO. While I’m sure in the future there is something to be done in those stub functions, in the meaintime those function calls should be commented. An empty function call iterated for every single “body” in the simulation, 30 times every second can quickly deteriorate your performance. Do you ever wonder how much reality there is in those contrived 100,000 iteration for-loops? This is one of those cases where it actually happens.

And there you have a very basic rundown of what a refactoring is. These changes seem small, but they add up very quickly and the more of them you implement in performance-critical areas the more juice you get from your code. They can be combined with microoptimizations, such as more efficient for-loop declarations, factoring out the use of b2Math convenience functions such as min/max, or the simple vector arithmetic functions. Another place for improvement is to use array-literal style instantiations where a pre-determined array length is unimportant. Truly though, I got my biggest gains from eliminating needless instantiation and allowing direct property access instead of using the get functions for properties that have no actual mutations applied to them in the process. I’d say that one of the biggest potential roads for improvement would be to implement object pools for commonly instantiated trump objects that do not persist, and that serve little value other than as fodder for calculation.

In this particular refactoring you’ll note that I’m tearing down certain OOP constructs for the sake of performance. This should not be taken to mean that well-designed systems and Object Orientation is by default a hog. It’s in general better to start with a system that is well-designed and possibly slower and to selectively break encapsulation to gain boosts than to start with a mess and try to organize it later.

Tags: , , , ,


What the b2World can teach you about C++ (and ActionScript, too)

I’m going to let you in on a secret. The Scriptocalypse banner at the time of this writing uses the Box2DFlashAS3 physics engine, a port of Erin Catto’s Box2D, written in C++. This is not the first time The Horseman has used Box2DFlashAS3. In fact, it’s one of those things that he learned to work with in the midst of a professional project “under the gun” as it were.

If you’ve never worked with a physics engine before, Box2D looks intimidating. To make matters ‘worse’ for an ActionScripter, the Box2DFlashAS3 port of the library breaks almost every standard ActionScript code convention. This made the task of parsing the library for comprehension a bit more difficult than other AS3 libraries I’ve encountered, though obviously it’s doable. Without knowing anything about the pedigree of the library, at first I wondered why the author chose to do this. I clued in fairly quickly after skimming the source code and finding references to “native” datatypes that Flash just doesn’t support (int32, anyone?) that had been commented out and re-written a line below with datatypes Flash recognizes. Box2DFlashAS3, as a port, is very much a “transliteration” of the library from its native code as opposed to a smooth “localization.” It would be like translating literally:

“Ik kom uit Amerika” -> “I come out America”

Instead of translating the general meaning…

“Ik kom uit Amerkia” -> “I’m from America”

Everything from the naming conventions, to the placement of the variables, to the slavish insistence on strictly defining the length of Arrays on instantiation, to the way you create instances of b2Body and b2Joint, to the ubiquity of the LinkedList structures used for iteration looked like it came straight from a different paradigm. The linked lists alone are peculiar in ActionScript. Virtually every physics body, shape definition, and joint is capable of containing a reference to a “next” and “previous” object of the same datatype, and almost nowhere are you as the user given an Array to sift via the API.

“Why would I want or need to explicitly declare Array lengths on instantiation? Why would I need to specifically create bodies and joints by calling the b2World rather than just instantiating them like I would do for a MovieClip object? Let’s read the user’s manual…”

From the manual:

C++ is great for encapsulation and polymorphism, but it’s not so great for API design. There are always significant trade-offs when creating a C++ library.

Should we use abstract factories or the pimpl idiom? These make the API look cleaner, but they ultimately get in the way of debugging and efficient development.

Should we use private data and friends as necessary? Perhaps, but eventually the number of friends can become ridiculous.

Should we just wrap the C++ code with a C-API? Perhaps, but this is extra work and may lead to internal choices that are non-optimal. Also, C-APIs are harder to debug and maintain. A C-API also breaks encapsulation.

For Box2D I have chosen the path of least resistance. For some cases a class is well contained in its design and function, so I use public functions and private data. For everything else I use classes or structs with all public members. These choices let me develop the code rapidly, it is easy to debug, and it creates minimal internal clutter while maintaining tight encapsulation. The downside is that you don’t see a clean, simple API. Of course, you have this nice manual to help you out. 🙂

Remember, the above excerpt is from the ActionScript-specific port of the manual!

Here’s some more:

Joint Factory

Joints are created and destroyed using the world factory methods. This brings up an old issue:


Don’t try to create a body or joint on the stack or on the heap using new or malloc. You must create and destroy bodies and joints using the create and destroy methods of the b2World class.

Heap? Malloc? In ActionScript?

Box2D doesn’t use reference counting. So if you destroy a body it is really gone. Accessing a pointer to a destroyed body has undefined behavior. In other words, your program will likely crash and burn. To help fix these problems, the debug build memory manager fills destroyed entities with FDFDFDFD. This can help find problems more easily in some cases.

If you destroy a Box2D entity, it is up to you to make sure you remove all references to the destroyed object. This is easy if you only have a single reference to the entity. If you have multiple references, you might consider implementing a handle class to wrap the raw pointer.

Often when using Box2D you will create and destroy many bodies, shapes, and joints. Managing these entities is somewhat automated by Box2D. If you destroy a body then all associated shapes and joints are automatically destroyed. This is called implicit destruction.

When you destroy a body, all its attached shapes, joints, and contacts are destroyed. This is called implicit destruction. Any body connected to one of those joints and/or contacts is woken. This process is usually convenient. However, you must be aware of one crucial issue:


When a body is destroyed, all shapes and joints attached to the body are automatically destroyed. You must nullify any pointers you have to those shapes and joints. Otherwise, your program will die horribly if you try to access or destroy those shapes or joints later.

Ahah! Any seasoned ActionScript developer should know and understand the concept of reference counting, as it’s a long-standing garbage collection technique used in the Flash Player (though the player now uses mark-and-sweep). But even though this documentation is clearly still “in another language” so to speak, things are definitely coming together here.

As pretty much anyone who’s done more than scratch the surface of Box2DFlashAS3 can tell you, trying to perform an action on a body or a joint that has either been destroyed or has “fallen out of the world” by leaving the confines of the AABB is asking for trouble. That manual is not kidding when it tells you to “nullify your pointers” when you destroy the object (or in ActionScript parlance, set your variable containing the b2Body/b2Joint = null) , and the more places in which you’re storing references to those bodies the more diligent you must be about cleaning up after yourself. The most disastrous behavior I found was calling b2World.DestroyBody() / DestroyJoint() on a body or joint that had already been destroyed. Completely threw off the internal count and broke all kinds of madness loose.

The paranoia about how you create and destroy the bodies and joints, and about Array length pre-definition is directly related to the fact that in straight C++ you are responsible for your own memory management and garbage collection. The implications of this are somewhat alien to the ActionScripter who has never worked with code in an unmanaged environment before, but become blindingly obvious when you have to handle it yourself.

Suddenly, the length to which you define your Array really, really matters! We’re a bit spoiled here in Flash-land with our dynamic-length Arrays. Those are more traditionally thought of as “Lists,” as they can grow or shrink in size on-the-fly at runtime. The worst that happens in ActionScript when you try to access an Array location that stores no data is that flash returns a literal datatype of “undefined”. In C++, may the Four Horsemen of the Scriptocalypse have mercy on your soul for daring to step off the end of an Array. This is why Box2D is so heavily reliant on Linked List structures when returning collections of bodies and joints. At least when you walk off the end of one of those, it can tell you explicitly what you’ve done rather than executing undefined behavior (which is a far different animal than returning a literal datatype defined as “undefined”).

So, what’s the point of all this rambling? I think what I found enlightening about my experience working with Box2DFlashAS3 was that it really gives you an appreciation for the care with which you must treat your resources. In our little world of managed code, we get away with a lot of things every time we compile a swf that just wouldn’t fly in C or C++. As a consequence, we can “afford” to be a little sloppy with our resources… but can we really afford it? Shouldn’t we always take care to implement a scheme for thorough destruction of all object references when they’re no longer needed? Isn’t it true that the more different places in which an object is stored, the harder this becomes?

Before snagging and holding onto a reference to an object whose life and death are somehow integral to the program, even in the highly managed world of ActionScript, maybe it’s a good idea to stop and ask yourself “Do I really need to store a reference to this object here? How much extra work will I create for myself when I have to destroy it? What harm will come when I fail to destroy it?

Tags: , , ,

No Comments