Friday, August 16, 2013

Processing: Optimizations and FireFox

I have done a lot of commits to the processing GitHub repo since the last Hackage release. I have discovered dozens of bugs that are now fixed. I have also implemented some new nice features, like conditionals with custom values or arrays. However, the nicest change has been done in the Substitution Optimization Algorithm. At least, this is the one I am more enthusiast about! It took me a while to get arrays working correctly, but it wasn't so exciting. Well, maybe arrays with custom values are a bit magical! They handle a set a of arrays of potentially different types under the hood, so you think that you are dealing with a single array.

In any case, the optimization process has been improved greatly. Now it looks for common sub-expressions inside a set of optimizable values, which are instances of a class. Instances of this class are automatically generated using Template Haskell. The nice thing is that, an expression of type Float can be inside an expression of type Integer (for example, in round pi), however, the optimizer will find expressions that are nested even in expressions of other types. Furthermore, the optimizer is now able to take greater pieces of code to work with, since it is now able to detect which variables are mutated. Therefore, it is able to stick to the biggest piece of code where variables can be treated as constants. Neat.

Of course, since I am the developer of the library, I find all these words easy to understand, but for the reader they may be quite opaque. In the section below I will try to explain the Optimization by Substitution Algorithm which is fully implemented in the processing package.

Optimization by Substitution Algorithm

A problem that arises when you are automatically generating code for a language you cannot evaluate is that the output code is usually slow and repetitive. Consider the following code example in Haskell:

foo :: Int -> Int
foo x = 2*x + 2*x + 2*x

Apparently (perhaps some cool GHC optimization you should no rely on is avoiding it) we are computing 2*x three times. To avoid it, the general technique is to create a let or where clause and include the repeated computation in it.

foo :: Int -> Int
foo x = let y = 2*x
        in  y + y + y

Now it seems clear that 2*x is computed only once. What we have done is to observe a repeated pattern/computation and gave it a name for its result. We then have substituted all the appearances of the repeated computation with that name. This is an intuitive way to avoid repeat computations, and this is exactly the idea behind the Optimization by Substitution Algorithm. I am pretty sure that this has been done before, but I give it here my own perspective, applied to my library.

In processing.js (JavaScript), we can apply a similar technique to that we applied to Haskell. Consider the following assignment.

v = 2*x + 2*x + 2*x ;

We can create an auxiliar variable, store the result of 2*x in this variable, and then use it to calculate the value of v.

y = 2*x ;
v = y + y + y ;

And this is how we adapt the above technique to JavaScript code.

Unlike Haskell, processing.js variables are mutable. The same variable x may have different values, depending on the last assignment done to that variable. Each time an assignment is done, the value of the variable may change, and we can't know if the new value is going to be the same or not (probably not). For example, consider the following code.

v = 2 ;
v = 2*v ;
v = 2*v + 2*v ;

At first sight, it seems that we are doing the same operation three times. However, the value of v has changed over time. If we replace all the appearances of 2*v we alter the result of the program.

v = 2 ;
y = 2*v ;
v = y ;
v = y + y ;

In the first program, the final value of v is 16. In the second program is 8. We cannot make substitutions freely in this situation. The solution I came out is to keep track of every mutated variable. When we reach an expression that uses one of the mutated variables, we apply substitutions over the code we have traversed so far, without including the expression that uses the mutated variable. Once we are done with the substitutions in that fragment of code, we repeat the process starting from the expression containing the mutated variable. This way, each piece of code where we are applying the Optimization by Substitution is safe.

Is this enough?

After my testings I am very happy with the results. However, in some browsers, like FireFox, it seems that my current test example is not running smoothly. I have made a Pac-Man game (a slightly simplification of the original) and looks like FireFox is not able to run the script smoothly. Either we need further optimizations or FireFox is just not good with processing.js. Perhaps the problem comes from processing.js, which may be not efficient enough. In any case, the game runs perfectly in Chrome/Chromium. I haven't tested with other browsers yet. Try it yourself here.

In the near future

The development of the Haskell processing library is not going to stop yet! More features are to come, and more bugs are to be found (and fixed!). Please, do not hesitate in try it yourself and report any annoyances in the issue tracker.

No comments: