mjs: Simple Vector and Matrix Math for JS
Published by vladimir February 5th, 2010 in Canvas 3D, Firefox, Mozilla
One common thread running through the many different and interesting WebGL projects out there is that they all need to do vector and matrix math, do it quickly, and do it in JavaScript. To date, developers have either rolled their own, or they’ve used Sylvester, a fairly featureful vector and matrix JavaScript library.
One of the problems with Sylvester is that while it’s fully featured (arbitrary NxN matrices and vectors can be created and manipulated), it suffers in performance because of it. Since this is such a crucial part of a successful WebGL program, I’ve put together a small package that I’m calling mjs.
mjs is designed around speed and simplicity. For example, it doesn’t attempt to stuff vectors and matrices into JavaScript objects. Because the language offers no operator overloading, there’s very little benefit in treating these types as discrete objects, and lots of performance and memory usage downsides. Instead, it provides a set of functions for performing operations on vectors and matrices, which can be any array-like object. For any function that returns a vector or matrix, an existing array can be passed in to take the result, or the function can create a new one. Array reuse ends up being important because of the potential for expensive garbage collection churn eating away at performance.
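To make the "pass in an existing array or get a new one" idea concrete, here is a minimal sketch of the pattern on plain arrays. The `addV3` helper is hypothetical and is not part of mjs; it only illustrates the optional destination argument described above.

```javascript
// Hypothetical helper (not an mjs function): adds two 3-component vectors.
// If the destination `r` is omitted, a new array is allocated;
// otherwise `r` is overwritten and returned.
function addV3(a, b, r) {
  if (r === undefined) {
    r = new Array(3);
  }
  r[0] = a[0] + b[0];
  r[1] = a[1] + b[1];
  r[2] = a[2] + b[2];
  return r;
}

// No destination given: the function allocates the result.
var fresh = addV3([1, 2, 3], [4, 5, 6]);

// Destination given: the result lands in `scratch`, with no new allocation.
var scratch = [0, 0, 0];
var reused = addV3([1, 2, 3], [4, 5, 6], scratch);
```

Inside a render loop, the second form keeps the same scratch array alive across frames, which is what avoids the garbage collection churn.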
Here’s a sample of the API:
var r = M4x4.rotate(Math.PI/2, V3.$(0, 1, 0), M4x4.I);
Note that V3.$ and M4x4.$ are shorthand for creating a new V3 or M4x4 (I wanted to use V3() and M4x4(), but that didn’t work out too well since functions have a length property). However, because all they return are just new array-like objects, you could also write:
var r = M4x4.rotate(Math.PI/2, [0, 1, 0], M4x4.I);
If the WebGL types are available, those will be used for newly created vectors/matrices. They are a significant performance boost especially for repeated operations; but for specifying one-off vectors such as the above, literal array syntax is fine.
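A rough sketch of how that kind of fallback could look is below. It assumes `Float32Array` as the typed array (the type name mjs actually checks for may differ), and `MatrixArray`, `newVec3`, and `dot3` are hypothetical names used only for illustration.

```javascript
// Assumption: Float32Array stands in for "the WebGL types".
// Fall back to a plain Array when typed arrays are unavailable.
var MatrixArray = (typeof Float32Array !== "undefined") ? Float32Array : Array;

// Hypothetical constructor helper: builds a 3-component vector
// backed by whichever array type was selected above.
function newVec3(x, y, z) {
  var v = new MatrixArray(3);
  v[0] = x;
  v[1] = y;
  v[2] = z;
  return v;
}

// Functions only index into their arguments, so typed arrays and
// literal arrays can be mixed freely.
function dot3(a, b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

var up = newVec3(0, 1, 0);
var d = dot3(up, [0, 2, 0]); // typed array on the left, literal on the right
```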
The rotate function internally makes a rotation matrix, and then multiplies it by the given matrix. So the above could also be written as:
var rotation = M4x4.makeRotate(Math.PI/2, [0, 1, 0]);
var r = M4x4.mul(M4x4.I, rotation);
(The last line being redundant given that we’re multiplying by the identity matrix.)
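For readers who want to see what "make a rotation matrix, then multiply" involves, here is a self-contained sketch. It is not the mjs source: `makeRotation`, `mul4x4`, and `rotate` are hypothetical stand-ins, and it assumes column-major storage (the usual OpenGL convention) and the standard axis-angle rotation formula.

```javascript
// Hypothetical stand-in for a "make rotation" function: builds a 4x4
// rotation matrix (column-major, 16 elements) for `angle` radians about `axis`.
function makeRotation(angle, axis) {
  var x = axis[0], y = axis[1], z = axis[2];
  var len = Math.sqrt(x * x + y * y + z * z);
  x /= len; y /= len; z /= len; // normalize the axis
  var c = Math.cos(angle), s = Math.sin(angle), t = 1 - c;
  return [
    t * x * x + c,     t * x * y + s * z, t * x * z - s * y, 0, // column 0
    t * x * y - s * z, t * y * y + c,     t * y * z + s * x, 0, // column 1
    t * x * z + s * y, t * y * z - s * x, t * z * z + c,     0, // column 2
    0,                 0,                 0,                 1  // column 3
  ];
}

// Hypothetical stand-in for matrix multiply: returns a * b (column-major).
function mul4x4(a, b) {
  var r = new Array(16);
  for (var col = 0; col < 4; col++) {
    for (var row = 0; row < 4; row++) {
      var sum = 0;
      for (var k = 0; k < 4; k++) {
        sum += a[k * 4 + row] * b[col * 4 + k];
      }
      r[col * 4 + row] = sum;
    }
  }
  return r;
}

// "rotate" is then just the composition of the two steps.
function rotate(angle, axis, m) {
  return mul4x4(m, makeRotation(angle, axis));
}

var IDENTITY = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];
var viaRotate = rotate(Math.PI / 2, [0, 1, 0], IDENTITY);
var viaMake = makeRotation(Math.PI / 2, [0, 1, 0]);
```

Because the input matrix is the identity, `viaRotate` and `viaMake` hold the same values, up to floating-point rounding.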
All methods that return a vector or matrix take an optional final argument, that of an existing object to reuse. For example:
var m0 = M4x4.$();
r = M4x4.mul(someMatrixA, someMatrixB, m0); // r == m0, so the assignment isn't necessary, but it's handy for chaining
// .... do something with r ...
r = M4x4.mul(someMatrixB, someMatrixC, m0); // r == m0 still
// ... do something else with new results ...
All of this happens without allocating any additional temporary objects.
As mentioned before, one of the goals of mjs is performance. Matrix multiplication is one of the most common tasks, so here are some numbers comparing mjs, Sylvester, and native C++ code. This was run on a Core i7 desktop using a local build of SpiderMonkey, which included one patch that’s about to go into the tree that fixes the no-reuse tracing case. (Without it, the no-reuse tracing time is much larger because the code is never actually jitted.) The test is simple: it multiplies two matrices together in a loop 1,000,000 times.
| Test | Time |
|---|---|
| mjs, JIT, matrix reuse | 140ms |
| mjs, JIT, no reuse | 533ms |
| Sylvester, JIT, no reuse | 5,280ms |
| mjs, no JIT, matrix reuse | 25,833ms |
| mjs, no JIT, no reuse | 26,681ms |
| Sylvester, no JIT, no reuse | 41,996ms |
| Native C++, SSE2, matrix reuse | 71ms |
| Native C++, SSE2, no reuse | 142ms |
(I also have numbers for MSVC without the SSE2 compile flag, but the numbers vary greatly depending on whether the values eventually go to infinity or not; if the values end up trending towards 0, the non-SSE2 code tends to win at around 52ms vs. 71ms; if the values trend to infinity, the non-SSE2 code takes around 11,000ms!)
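The actual benchmark code isn't reproduced in this post, but the shape of the test loop can be sketched as follows. Everything here is an assumption made for illustration: `mul4x4` is a hypothetical loop-based multiply (not the mjs implementation), the inputs are fixed matrices, and timing uses `Date.now()`. The numbers it prints will not match the table above.

```javascript
// Hypothetical multiply with an optional destination `r` (column-major).
// Note: `r` must not be the same array as `a` or `b`, since elements are
// written while the inputs are still being read.
function mul4x4(a, b, r) {
  if (r === undefined) {
    r = new Array(16);
  }
  for (var col = 0; col < 4; col++) {
    for (var row = 0; row < 4; row++) {
      var sum = 0;
      for (var k = 0; k < 4; k++) {
        sum += a[k * 4 + row] * b[col * 4 + k];
      }
      r[col * 4 + row] = sum;
    }
  }
  return r;
}

// Multiplies the same two matrices `iterations` times, either writing
// into one scratch matrix (reuse) or allocating a result each time.
function timeLoop(iterations, reuse) {
  var a = [2, 0, 0, 0,  0, 2, 0, 0,  0, 0, 2, 0,  0, 0, 0, 1]; // uniform scale by 2
  var b = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1]; // identity
  var scratch = new Array(16);
  var result;
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    result = reuse ? mul4x4(a, b, scratch) : mul4x4(a, b);
  }
  return { ms: Date.now() - start, result: result };
}

var ITERATIONS = 1000000;
var withReuse = timeLoop(ITERATIONS, true);
var withoutReuse = timeLoop(ITERATIONS, false);
console.log("matrix reuse: " + withReuse.ms + "ms, no reuse: " + withoutReuse.ms + "ms");
```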
Those numbers are pretty encouraging: having JavaScript be only about 2x slower than native code for something like this is pretty nice to see. Granted, this is only a very isolated test, and I’m sure there are some tricks to optimizing the native code case (it’s currently just a fully unrolled set of multiplies and adds). The “no JIT” case is less nice, but I’m sure that our Jaegermonkey folks will be all over this testcase (right, guys?). In any case, ideally most WebGL rendering loops will be fully traced in Firefox, so it would be less of an issue.
mjs is still very much a work in progress; it’s missing a test suite and a whole bunch of features. You can find it hosted at Google Code, at webgl-mjs. (Side note: I couldn’t just call the project mjs because a project called mjs was abandoned on Sourceforget 5 years ago, and Google Code complained.) There’s also some documentation, viewable online here.
Bugs and contributions welcome!
Vlad, thank you. I updated https://adblockplus.org/trash/orbital/index.html to use mjs; it made my code simpler and works great. Only one bug report: in M4x4.rotate one might be tempted to use the same matrix for both m and r, but this will produce wrong results. For some reason the translation functions are not affected by this issue.
Oh, and it would be nice to have a shortcut to multiply a 3×3 matrix with a vector – something like V3.transform(m, v):
r[0] = V3.dot(v, [m[0], m[1], m[2]]);
r[1] = V3.dot(v, [m[3], m[4], m[5]]);
r[2] = V3.dot(v, [m[6], m[7], m[8]]);
That’s for the case where a 3×3 matrix is passed (in my case the inverse rotation matrix) – for a 4×4 matrix it might take only the top-left part.
This is an excellent idea! I’ll try porting a few of the tutorials I’ve done over to it and let you know how I get on.
Success! I got my upcoming lesson 14 working using mjs, just needed to fix one minor bug (which I’ve raised an issue for on the Google Code tracker).
Cheers,
Giles
Great stuff, Vlad! Could we get you to check-in the benchmark code so that we can do some additional comparisons?
Wow! Pretty nice and surprisingly close to CPP performance!
Are you doing the same thing for ARM?