Archive for the ‘ActionScript 3’ Category

Mommy, where do framescripts come from?

When writing ActionScript you have your choice of using external class files or attaching code to a timeline by simply typing it into the .fla code editor window. I’m going to ask you to follow four simple steps and observe the results.

  1. Create a new .fla in Flash, set to use AS3
  2. On frame 1, type gotoAndStop(3);
  3. On frame 2, type trace("I am a script on frame 2");
  4. On frame 3, type this["frame2"]();

This should result in your trace window showing “I am a script on frame 2”.

“No way, Mr. Horseman! How could this have happened? The playhead never touched frame 2!” I hear your cries of astonishment even from here. Indeed, how did this happen? Let’s stop to think for a minute about what happens when Flash actually compiles a swf. The first thing to remember is that in a published swf frame scripts do not exist. In a previous entry I talked briefly about what the compiler does with instances on the stage. Note the following line of code from that old entry:

public var foo_mc:NewMovieClip0; // Where did this come from?????

Something similar happens every time you use the Flash Authoring Tool to write a framescript. There is an undocumented (but by now very well known) method of the MovieClip class called addFrameScript, which allows you to attach “frame scripts” at runtime to a MovieClip or any class derived from MovieClip. Indeed, one thing you’ll find is that if you try to subclass Sprite and link that to a library symbol with frame scripts, you’ll receive an error to the effect that Sprite does not have a function called addFrameScript. This should tell you that internally Flash uses this function to give the illusion that scripts you write in the Flash editor just “exist” as part of the movieclip. If we decompile the swf we just created, we will see exactly that.

package DecompileMe_fla
{
    import flash.display.*;
 
    dynamic public class MainTimeline extends MovieClip
    {
 
        public function MainTimeline()
        {
            addFrameScript(0, frame1, 1, frame2, 2, frame3);
            return;
        }// end function
 
        function frame3()
        {
            var _loc_1:String;
            _loc_1.this["frame2"]();
            return;
        }// end function
 
        function frame1()
        {
            gotoAndStop(3);
            return;
        }// end function
 
        function frame2()
        {
            trace("I am a script on frame 2");
            return;
        }// end function
 
    }
}

When Flash needs to attach frame scripts to a class (whether you explicitly created the class, or are implicitly creating one by attaching a framescript to a movieclip in the editor), it implicitly adds functions to that class, each named after the frame on which the code appears, and then uses addFrameScript in the constructor to register those functions at runtime. Since these are just ordinary, freely accessible methods of the class, it should be quite obvious why the “playhead” never needs to “touch” a frame in order for that frame’s script to be called.
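The same mechanism can be driven by hand. What follows is a minimal sketch and not what the Flash compiler emits: the class name ManualFrameScripts and the method onFirstFrame are made up for illustration, and because addFrameScript is undocumented, the argument order (zero-based frame index, then the function) is inferred from the decompiled constructor above.

```actionscript
package
{
    import flash.display.MovieClip;

    // Hypothetical document class that registers its own frame script.
    public class ManualFrameScripts extends MovieClip
    {
        public function ManualFrameScripts()
        {
            // Frame indices are zero-based here: 0 means frame 1.
            addFrameScript(0, onFirstFrame);
        }

        // An ordinary method; the player calls it when frame 1 is entered,
        // but nothing stops other code from calling it directly.
        private function onFirstFrame():void
        {
            trace("I am a script attached by hand");
            stop();
        }
    }
}
```

Set as the document class of an otherwise empty .fla, this should behave as if the trace had been typed on frame 1 in the editor.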

“You’re making my head hurt, Horseman! I implore you, cease this haunting with accursed visions from beyond the abyss!”

Piercing the veil of abstraction is always a perception-altering experience. You can expect to feel a slight discomfort, but more than that you should feel a distinct thrill about uncovering another of the world’s great mysteries.



Flashing your C String

On a project I worked on a few months ago, I was tasked with taking data from a binary source file, reading it into Flash as UTF bytes (e.g. a ByteArray) and parsing the resulting ByteArray as a String. Straightforward enough, right?

What I know of the file to parse is that it is a very long list of words compressed and encoded, and that it needs to be decoded at runtime inside the swf. Fortunately, I have the algorithm for decoding the string on-hand, so I know I can implement this with relative ease. All I have to do is load the .dat file containing the list and then run the resulting byte array through the parser and get my list of words.

var ba:ByteArray = myImportedByteArray;
ba.position = 0;
var encodedString:String = ba.readUTFBytes(ba.length);
trace(encodedString);
/* My expected result was something akin to:
"Header Metadata - (Gibberish representing encoded list goes here)"
*/

So funny enough, when I do that what do I get instead?

"Header Metadata"

Huh.

I spent a good long time puzzling over where the encoded word list was in all of this. I could open the .dat file in Eclipse, and plainly see the list right there. I wondered if perhaps there were some sort of problem reading the data. Maybe the ByteArray was incomplete? No, because when I traced ba.length it indicated that the array was well over 90,000 bytes in size. Definitely too many to simply be “Header Metadata”. Next, I did this:

// 200 chosen simply for the sake of an easier to read log...
ba.position = 0;
var byte:Number;
while(ba.position < 200){
	byte = ba.readByte();
	trace(byte, String.fromCharCode(byte));	
}

What I received in the output window was in fact the first 200 characters of the file just as I’d expected them to be!

72 H
101 e
97 a
100 d
101 e
114 r
32  
77 M
101 e
116 t
97 a
100 d
97 a
116 t
97 a
0
32  
63 ?
32  
0
102 f
111 o
111 o
98 b
97 a
46 .
46 .
46 .

This gets more and more curious by the moment. At some point though, the lightbulb goes on. Look at this right here….

100 d
97 a
116 t
97 a
0        // <-- see that 0?
32  
63 ?
32  
0        // <-- and that 0?

Anatomy of a String

So, in Flash-land we very seldom have to know about what might be going on under the hood of the String class. I mean, we just declare Strings, read them, print them, concatenate them, and in general just don’t really care much about how they work. It is not like this in all languages, however. Hop aboard the black stallion, for the Horseman is about to take you to a magical world known as “C Strings.”

The C language generally operates “Close to the metal” of a machine. You can think of it as an abstraction that allows you to write code that is then directly translated into Assembly-level instructions, that are then executed by the machine’s processor. In that sense it is a “High level” language (at least in comparison to writing Assembly by hand) but it is lower level than what an ActionScript or Java programmer would have to deal with as our respective Virtual Machines abstract over the nitty-gritty of memory allocation and deallocation, and direct pointer manipulation inherent in C. So what does a String look like in this lower level world? It is ultimately nothing more than an Array of char values.

char someString[10]; // an array of char values with a length of 10

It needs to be pointed out that there is a very important distinction between C Arrays and ActionScript Arrays. C Arrays are fixed-length. If you declare an Array to have a length of 10, then a contiguous set of memory addresses, of a size capable of storing 10 units of the data (in this case, char values) is allocated for that Array. That’s all. Nothing else happens. An Array in C is not an Object, so it doesn’t have a notion of a “length” property, nor does it have .push() or .pop() functions. Unless you know the length already, you can iterate your way past the end of the Array and keep on going into some other variable’s data… possibly even into undefined space.

This is considered “a very bad thing to do”. (As an existential aside about walking off the end of an array, please read this blog post by Steve Yegge)

But wait, if you can’t know the length of an Array in C then how can you possibly know when you’ve reached the “end” of a string?

Good question! Here’s what the string “Hello World!” looks like in C:

char *foo = "Hello World!";
// looks to the machine like this in ASCII....
// {72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33, 0}

Each of those numbers is the ASCII code of a character (for this range, the same value as its Unicode code point), so for example 72 maps to “H” while 33 maps to “!”. If you want to see this in the Flash context, you can set up a KeyboardEvent listener and trace the event. You’ll find that the keyboardEvent.charCode property matches the values above.

But do you notice anything funny here? “Hello World!” has 12 chars, but that array above has 13 entries! The very last one of course being 0. Now why is that?

In C, when you want to know if you’ve reached the end of a string, you’ll look for what is called a null terminator. In C, 0 can be considered a null value and so when a string is declared it has a length equal to the length of the characters entered into it plus one for the terminating 0. Any code that wants to parse the string can know that it should stop when it hits a 0 because to keep going after that could lead anywhere! And we certainly wouldn’t want to add some instruction set for connecting to the printer to your string… or worse, overwrite it with some random garbage!
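For comparison, here is a small frame-script sketch of how the same text looks from the ActionScript side. The variable names are mine; charCodeAt and length are standard String members. Note that ActionScript reports 12 characters and never shows you a terminator.

```actionscript
var s:String = "Hello World!";
var codes:Array = [];

// Collect the numeric code of every character in the string.
for (var i:int = 0; i < s.length; i++)
{
    codes.push(s.charCodeAt(i));
}

trace(s.length);         // 12
trace(codes.join(", ")); // 72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33
```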

And so this brings us back to our little ByteArray in Flash and the encoded word list. As you can see, the header metadata is immediately followed by a 0. In C terms, the start of this file is in fact three strings instead of one. Knowing this, and observing that Flash’s String seems to believe this 90,000+ byte ByteArray is a string of a mere 15 or so characters, we can deduce that under the hood, at least as far as readUTFBytes and trace are concerned, Flash’s String behaves like C’s string.

My algorithm, then, had to account for this fact. Once I knew that Flash would treat a 0 byte as a null terminator when calling readUTFBytes, I was able to successfully reach the word list.
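I’m not reproducing the project’s parser here, so take the following only as a sketch of one way to step past the terminators: walk the bytes, and every time a 0 turns up, read the run before it as its own string. The class name CStringReader and the method readSegments are hypothetical, and the sketch assumes each run is text that readUTFBytes can decode, which may not hold for arbitrary encoded data.

```actionscript
package
{
    import flash.utils.ByteArray;

    // Hypothetical helper: splits a ByteArray into its null-terminated runs.
    public class CStringReader
    {
        public static function readSegments(ba:ByteArray):Array
        {
            var segments:Array = [];
            var start:uint = 0;
            ba.position = 0;

            while (ba.bytesAvailable > 0)
            {
                if (ba.readByte() == 0)
                {
                    // readByte moved us one past the 0, so the 0 sits at position - 1.
                    var end:uint = ba.position - 1;
                    ba.position = start;
                    segments.push(ba.readUTFBytes(end - start));
                    ba.position = end + 1; // skip the terminator itself
                    start = ba.position;
                }
            }

            // Whatever follows the last 0 has no terminator of its own.
            if (start < ba.length)
            {
                ba.position = start;
                segments.push(ba.readUTFBytes(ba.length - start));
            }
            return segments;
        }
    }
}
```

With the bytes shown in the log above, the first entry of the returned Array would be "Header Metadata" and the later entries would hold whatever follows each 0.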



Refactoring Box2D

Have you ever wondered whether your code is performing as best it can? What are your benchmarks, anyhow? How do you determine what “as best it can” means? These are all questions of heavy existential weight, but usually we have a reasonable idea of what it is we’d like to improve and why. Are you looking to turn a mess of unreadable and unmaintainable code into something more pleasurable to behold? Are you tasked with putting a processor pig on a diet? Usually you’ll have a sense of what you really need.

The real question is “How?”

I’m going to examine a concept called “Refactoring” and I’m going to give you a case study in it as I refactor parts of the Box2DAS3 (version 2.0.1) engine with the goal of improving runtime performance and memory consumption.

To refactor code is to change its inner workings without destructively changing its interfaces or altering the output of its functions and expressions in any way. You may wonder “How can this possibly help? Doesn’t this mean that the code is just doing what it always did?”

No, not in the slightest.

I had an itch to see if I could manage to squeeze some more juice out of the ol’ physics engine for the sake of the banner up above. If it’s running more smoothly on your machine now (and it should be. It certainly is on mine), this is the reason why.

Before we begin, there is one thing to keep in mind when you decide to refactor a 3rd party library: their updates can break code you’re relying on! Your refactored code may not be compatible with the vision the creators originally conceived when they wrote it. They might have code they were intending to implement but simply haven’t, and their ideas might be better than yours. In this case you’re stuck with a lot of time sunk into your own custom branch of the project which might fall far behind the official branch. I will do my best to point out the danger zones as I encounter them.

I will admit to you that what follows is not the most formal method of refactoring. You can use a debugger to step through your code, or you can simply do as I have done here and take the initiative to “run through” yourself, from function to function and “execute” the code in your head. We’ll take this latter approach for now, as I find it tells an interesting story.

A reasonable place to look when identifying performance bottlenecks is anything that happens in a loop, or on an interval. The most obvious interval in the world of Box2D (and the one that is called 30 times per second in my banner) is b2World.Step(); Let’s examine this function. I’ll call out some places in the code via comments:

	public function Step(dt:Number, iterations:int) : void{
 
		m_lock = true;
 
		// *** An instantiation happens 30 times per second *** //
		var step:b2TimeStep = new b2TimeStep();
 
		step.dt = dt;
		step.maxIterations	= iterations;
		if (dt > 0.0)
		{
			step.inv_dt = 1.0 / dt;
		}
		else
		{
			step.inv_dt = 0.0;
		}
 
		step.dtRatio = m_inv_dt0 * dt;
 
		step.positionCorrection = m_positionCorrection;
		step.warmStarting = m_warmStarting;
 
		// Update contacts.
		m_contactManager.Collide();
 
		// Integrate velocities, solve velocity constraints, and integrate positions.
		if (step.dt > 0.0)
		{
			Solve(step);
		}
 
		// Handle TOI events.
		if (m_continuousPhysics && step.dt > 0.0)
		{
			SolveTOI(step);
		}
 
		// Draw debug information.
 
		// *** Regardless of whether it's useful, or whether we're debugging,
		//     this function is called 30 times per second ***
		DrawDebugData();
 
		m_inv_dt0 = step.inv_dt;
		m_lock = false;
	}

First, note the instantiation of a b2TimeStep each time we call the Step function. What is a b2TimeStep? It’s this:

package Box2D.Dynamics{
 
 
public class b2TimeStep
{
	public var dt:Number;			// time step
	public var inv_dt:Number;		// inverse time step (0 if dt == 0).
	public var dtRatio:Number;		// dt * inv_dt0
	public var maxIterations:int;
	public var warmStarting:Boolean;
	public var positionCorrection:Boolean;
};
 
 
}

It’s simply a data structure. There are no functions at all. Now, in case you did not know, instantiation is expensive. It is not something you want to do willy-nilly if you can possibly help it, and from the look of the code above it seems like all the variables are set immediately after instantiation within the scope of the Step function. So instead of instantiating from scratch, let’s simply make one class-level (member) variable to contain our Step function’s b2TimeStep object, and do some refactoring:

 
	private var m_stepScopeTimeStep:b2TimeStep = new b2TimeStep();
 
	public function Step(dt:Number, iterations:int) : void{
 
		m_lock = true;
 
		//var step:b2TimeStep = new b2TimeStep();
		var step:b2TimeStep = m_stepScopeTimeStep;

“Why did he choose to simply assign the object referenced by m_stepScopeTimeStep to the old step variable? Why not just find/replace all instances of the word ‘step’ with ‘m_stepScopeTimeStep’?”

When refactoring it is critical to take small steps. We know that the code works as it was written, so the goal at least for now is to modify it as little as possible while still making it better. Yes, we still declare an unnecessary local variable at the start of the Step function, but what’s more important right now is to stop instantiating a needless object every time Step is called.

We now dutifully confirm that our change has not broken the code. It is best to do this with a debugger, but for our purposes we’ll simply execute the code and ensure that it still runs correctly.

This same sort of redundant instantiation happens in other places in the library as well. Notably, there are a great many b2Island objects instantiated every frame when only a single one is ever needed. It can simply be re-initialized and reused.

Cutting down on instantiations of b2Islands and b2TimeSteps alone helps save several MB of memory over time, which can be better spent rendering the awesomeness of physics.

The next thing we’ll do in the Step function is examine the call to the DrawDebugData function. Here’s what’s going on inside of it:

public function DrawDebugData() : void{
 
		if (m_debugDraw == null)
		{
			return;
		}
 
		// snipped ...

One thing to remember about ActionScript is that function calls are relatively expensive to perform. You shouldn’t make one without a good reason, and particularly not in a loop. So what we’ll do instead is this:

//DrawDebugData();
if(m_debugDraw) DrawDebugData();

In the event that you’re not debugging the application, you’ll save 30 function calls per second here just by confirming that there’s even a reason to call the function in the first place. Again, we’re not going to change anything inside the DrawDebugData function. It’s not quite in the scope of the refactor… I really couldn’t care less right now about how well it runs in debug mode, as I’m only displaying content in production mode.
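If you’d rather measure the cost of those calls than take my word for it, a rough frame-script sketch along these lines will do. Everything in it is a stand-in: debugDraw plays the role of m_debugDraw, drawDebugData mimics the early return shown above, and the iteration count is arbitrary. The only API used is flash.utils.getTimer, and the numbers will vary by machine and player version.

```actionscript
import flash.utils.getTimer;

var debugDraw:Object = null; // stand-in for m_debugDraw when not debugging
var N:int = 1000000;         // arbitrary iteration count
var i:int;
var t0:int;

// Mimics DrawDebugData(): bails out immediately when there is nothing to draw.
function drawDebugData():void
{
    if (debugDraw == null)
    {
        return;
    }
    // drawing would happen here
}

// 1) Always call, and let the function bail out by itself.
t0 = getTimer();
for (i = 0; i < N; i++)
{
    drawDebugData();
}
var alwaysCall:int = getTimer() - t0;

// 2) Check first, and only call when there is something to draw.
t0 = getTimer();
for (i = 0; i < N; i++)
{
    if (debugDraw) drawDebugData();
}
var guarded:int = getTimer() - t0;

trace("always call:", alwaysCall, "ms | guarded:", guarded, "ms");
```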

So, that’s all well and good. Where should we go next? Let’s scan our way down and see what functions are being called in Step:

 
// I wonder what's happening in this function call?
m_contactManager.Collide();

We’ve seen our first call here in this line. m_contactManager is an instance of b2ContactManager, so let’s open it up and see what this function does:

	public function Collide() : void
	{
		// Update awake contacts.
		for (var c:b2Contact = m_world.m_contactList; c; c = c.m_next)
		{
			var body1:b2Body = c.m_shape1.m_body;
			var body2:b2Body = c.m_shape2.m_body;
			if (body1.IsSleeping() && body2.IsSleeping())
			{
				continue;
			}
 
			c.Update(m_world.m_contactListener);
		}
	}

Doesn’t look like anything suspicious is happening here, does it?

Wait!

    // getter functions for "IsSleeping"?  Is this just a boolean we can retrieve for ourselves?
    if (body1.IsSleeping() && body2.IsSleeping())

Let’s open up b2Body and have a look-see:

/// A rigid body.
public class b2Body
{
	/// Creates a shape and attach it to this body.
	/// @param shapeDef the shape definition.
	/// @warning This function is locked during callbacks.
	public function CreateShape(def:b2ShapeDef) : b2Shape{
 
           /*** snip ***/
 
	/// Is this body sleeping (not simulating).
	public function IsSleeping() : Boolean{
		return (m_flags & e_sleepFlag) == e_sleepFlag;
	}
 
           /*** snip ***/
 
	public var m_flags:uint;
 
           /*** snip ***/
 
	// m_flags
	//enum
	//{
		static public var e_frozenFlag:uint			= 0x0002;
		static public var e_islandFlag:uint			= 0x0004;
		static public var e_sleepFlag:uint			= 0x0008;  // this is the one!
		static public var e_allowSleepFlag:uint		= 0x0010;
		static public var e_bulletFlag:uint			= 0x0020;
		static public var e_fixedRotationFlag:uint	= 0x0040;
	//};

So the b2Body not only has a function call for IsSleeping, but it does a bitwise operation based on its current flags uint to determine whether or not it counts as “asleep”. Changing the internal guts of how Box2D determines whether an object sleeps might be worthwhile, but that would require some benchmarking. For now what we’ll do is take advantage of the fact that the variables are all public and perform the comparison without calling the function:

	public function Collide() : void
	{
		// Update awake contacts.
		for (var c:b2Contact = m_world.m_contactList; c; c = c.m_next)
		{
			var body1:b2Body = c.m_shape1.m_body;
			var body2:b2Body = c.m_shape2.m_body;
			//if (body1.IsSleeping() && body2.IsSleeping())
			if ((body1.m_flags & b2Body.e_sleepFlag) == b2Body.e_sleepFlag && 
				(body2.m_flags & b2Body.e_sleepFlag) == b2Body.e_sleepFlag)
			{
				continue;
			}
 
			c.Update(m_world.m_contactListener);
		}
	}

We’ve now eliminated two needless function calls for every contact, on every call to Step. Testing this code shows that the application continues to behave properly. But before we go on…

Danger: While we have done nothing to change the actual results of the code on execution, we have definitely “painted ourselves into a corner” in a sense. The code becomes more efficient when we remove needless get/set function calls, but by breaking encapsulation we’re now at the mercy of fate. By no longer relying on the IsSleeping() function, we’ve lost out on the possibility that the function itself may be made more efficient… or, more disastrously, if a future version of the library changes the way “sleep” is determined, we’re no longer protected by a function that abstracts it away from us. It’s a potentially future-breaking change.

In this particular case it’s not likely that there would be a change, but that’s not true of other parts of the library. In particular, there are a number of functions of b2Vec2 that simply return new “cloned” instances of the b2Vec2 with certain transformations or maths applied to them. These functions, called repeatedly, are quite inefficient, as each one costs not only a function call but also an instantiation. We *could* simply instantiate our own new b2Vec2 instances and apply the simple math ourselves, which would save us a function call. A potentially better solution, though, would be to make the function call itself less wasteful. If the authors (or we) chose to create an object pooling scheme that generates a limited number of b2Vec2 instances and recycles them, it would justify keeping the function around. At times like those, it’s completely up to your own judgment and what risks you’re willing to live with.

In the case of Box2D, the library is overly encapsulated in a number of ways that hamper performance. More frustratingly, there are several function calls that happen repeatedly where the function body has nothing inside other than a //TODO. While I’m sure there is something to be done in those stub functions in the future, in the meantime those function calls should be commented out. An empty function call iterated for every single “body” in the simulation, 30 times every second, can quickly deteriorate your performance. Do you ever wonder how much reality there is in those contrived 100,000 iteration for-loops? This is one of those cases where it actually happens.

And there you have a very basic rundown of what a refactoring is. These changes seem small, but they add up very quickly, and the more of them you implement in performance-critical areas the more juice you get from your code. They can be combined with micro-optimizations, such as more efficient for-loop declarations, or factoring out the use of b2Math convenience functions such as min/max and the simple vector arithmetic functions. Another place for improvement is to use array-literal style instantiations where a pre-determined array length is unimportant. Truly though, I got my biggest gains from eliminating needless instantiation and allowing direct property access instead of using the get functions for properties that have no actual mutations applied to them in the process. I’d say that one of the biggest potential roads for improvement would be to implement object pools for commonly instantiated temporary objects that do not persist, and that serve little value other than as fodder for calculation.
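To make the object pool idea a little more concrete, here is a minimal sketch. It is not part of Box2D and not something I’ve wired into the engine: the class name ObjectPool and its two methods are hypothetical, and it assumes the pooled class can be constructed with no arguments.

```actionscript
package
{
    // Hypothetical minimal pool: hands out recycled instances when it has
    // them, and only instantiates when the free list is empty.
    //
    // Possible usage, with flash.geom.Point as the pooled type:
    //   var pool:ObjectPool = new ObjectPool(Point);
    //   var p:Point = pool.acquire();
    //   p.x = 3; p.y = 4;   // recycled objects keep old values, so set them
    //   pool.release(p);    // hand it back once the calculation is done
    public class ObjectPool
    {
        private var _type:Class;
        private var _free:Array = [];

        public function ObjectPool(type:Class)
        {
            _type = type;
        }

        // Returns a recycled instance if one is available, otherwise a new one.
        public function acquire():*
        {
            return _free.length > 0 ? _free.pop() : new _type();
        }

        // Puts an instance back on the free list; the caller must stop using it.
        public function release(obj:*):void
        {
            _free.push(obj);
        }
    }
}
```

The trade-off is the usual one: fewer instantiations in exchange for the discipline of releasing objects and never touching them afterwards.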

In this particular refactoring you’ll note that I’m tearing down certain OOP constructs for the sake of performance. This should not be taken to mean that well-designed systems and Object Orientation are by default a hog. It’s in general better to start with a system that is well-designed and possibly slower and to selectively break encapsulation to gain boosts than to start with a mess and try to organize it later.
