Some of you may have noticed that the Firefox 3 nightly builds have felt a lot snappier since a few weeks ago. There’s an interesting story in that, one that I finally have time to write up. We’ve had a number of bugs on the Mac where people were complaining of bad performance compared to Firefox 2, usually involving a test where a page was scrolled by a small step 100 or so times, and the time from start to finish was recorded. In many of these tests, Fx3 was coming in at 50% to 500%+ slower. This was odd, because in theory the graphics layer (which is what scrolling is mostly exercising) in Firefox 3 should be faster, given that it’s talking almost directly to Quartz.
I noticed a few odd things. Even with drawing of the native theme (used for the scrollbar) disabled, and even when I tested just a tall page with a solid colored background and a scrollbar, I saw the same performance issues. There was a problem here, one that I initially blamed on the native theme drawing, because disabling native theme rendering made benchmark performance shoot up, but there were still some inconsistencies. I collected some numbers: for a given test, the minimum time that it can report is the number of iterations multiplied by the setTimeout delay used. For example, with 200 iterations and a 50ms setTimeout, there’s no way that test can finish faster than 10,000ms. The difference between that time and the time the test actually reports is the time spent doing interesting things. By running my test for different timing values, I ended up with this graph:
The horizontal axis is the setTimeout value, and the vertical axis is in milliseconds. That plateau there? That’s not supposed to be there. Looking more closely at that number, it’s roughly 6500ms. With 200 iterations, that’s 6500/200 = ~32.5ms per iteration. So, per second, that gives us 1000/32.5 ~= 30.77. We’re capping at roughly 30 frames per second here, which is way too round of a number. Even more suspicious, if I disabled native theme rendering, the test hit a plateau at around 3300ms, which works out to about 60fps.
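The arithmetic above can be sketched in a few lines. This is just an illustration of the reasoning, not the actual benchmark; the function names are made up for this example, and it assumes the plateau value is the total time the test reports.

```python
def timing_floor_ms(iterations: int, timeout_ms: float) -> float:
    """Minimum time a test can report: iterations times the setTimeout delay."""
    return iterations * timeout_ms


def overhead_ms(reported_ms: float, iterations: int, timeout_ms: float) -> float:
    """Time spent doing interesting things: reported time minus the floor."""
    return reported_ms - timing_floor_ms(iterations, timeout_ms)


def implied_fps(plateau_ms: float, iterations: int) -> float:
    """Frame rate implied by a plateau in total time over a number of iterations."""
    ms_per_iteration = plateau_ms / iterations
    return 1000.0 / ms_per_iteration


# 200 iterations with a 50ms setTimeout cannot finish faster than 10,000ms.
print(timing_floor_ms(200, 50))          # 10000

# Plateau with native theme rendering enabled: roughly 6500ms.
print(round(implied_fps(6500, 200), 2))  # 30.77

# Plateau with native theme rendering disabled: roughly 3300ms.
print(round(implied_fps(3300, 200), 2))  # 60.61
```

Both plateaus land almost exactly on 30 and 60 frames per second, which is what makes them look like a cap rather than real rendering cost.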
It’s at this point that I started to get an idea of what was going on. I was aware of Beam Sync on OSX, but assumed that each app had to opt in to it, given that it didn’t seem to affect Firefox 2. Quartz Debug lets you disable Beam Sync on a global basis; I did that, and the benchmark numbers dropped: the line above kept nicely following the blue line down, and I was able to peg the fps needle in Quartz Debug over to the right. So, we were being throttled by the OS, which was forcing us to wait for the next frame interval before allowing us to draw again. This was a pretty serious problem, because at this point I thought that the only way to disable this was on a system-wide basis, which wouldn’t be acceptable. Firefox 2 didn’t suffer from this, though, so I did some more digging.
Eventually I came to this tech note from Apple. The reason why Firefox 2 wasn’t affected was that Fx2 was not a Cocoa app; it was a Carbon app, and as such was exempt from coalesced updates. The key thing showed up in the “last resort” section of that tech note: how to disable coalesced updates for an individual app! This seems to be available only on 10.4.4 or later, but that was fine; OS minor updates are free. I verified that adding the plist entry fixed the problem for me locally, and checked this in to become part of the build. See if you can spot when this change hit our performance testing infrastructure:
Not bad for two lines of XML.
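For the curious, here is a rough sketch of what such a plist entry looks like, generated with Python's standard plistlib. The key name CGDisableCoalescedUpdates is an assumption on my part, since the post doesn't spell it out; check the tech note for the authoritative name. The bundle name is a made-up placeholder.

```python
import plistlib

# Assumption: this is the per-app key for opting out of coalesced updates.
# It is not named in the text above; verify it against Apple's tech note.
COALESCED_UPDATES_KEY = "CGDisableCoalescedUpdates"


def disable_coalesced_updates(info: dict) -> dict:
    """Return a copy of an Info.plist dictionary with the opt-out key set to true."""
    updated = dict(info)
    updated[COALESCED_UPDATES_KEY] = True
    return updated


# Hypothetical minimal Info.plist contents, for illustration only.
info = {"CFBundleName": "ExampleApp"}

# plistlib.dumps writes XML by default; the new entry shows up as a
# <key>...</key> line followed by a <true/> line.
xml = plistlib.dumps(disable_coalesced_updates(info)).decode("utf-8")
print(xml)
```

The entry itself is a key line and a value line in the app's Info.plist, which is where the "two lines of XML" come from.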
While figuring all this out, I noticed that Safari/WebKit didn’t seem to be affected by this framerate cap: the fps meter when Safari was running the same benchmark happily went up beyond 60fps. After I found the plist entry, I checked Safari’s plist and was surprised to discover that they didn’t have this disabling in there. Doing some more searching, I found this code in WebKit. Apparently, there is a way to do this programmatically, along with some other interesting things like enabling window update display throttling (though it’s unclear what that means!), but only if you’re Apple.
All these WK* methods are undocumented, and they appear in binary blobs shipped along with the WebKit source (see the WebKitLibraries directory). There are now over 100 private “OS-secrets-only-WebKit-knows” methods in the library, many of which are referred to in a mostly comment-free header file. Reading the WebKit code is pretty interesting; there are all sorts of potentially useful Cocoa internals bits you can pick up, more easily on the Objective-C side (e.g. search for “AppKitSecretsIKnow” in the code), but also in other areas, where a pile of these WK* methods are used in quite a few places. Would any other apps like to take advantage of some of that functionality? I’m pretty sure the answer there is yes, but they can’t. It’s not even clear under what license libWebKitSystemInterface is provided, so other apps can’t know whether they are allowed to link to it.
Despite my frustrations with Linux, this type of hiding isn’t really possible in a real open source environment. In the end, I really do hope that Linux can rise to the technical challenge and compete in desktop performance and features, but it’s not there yet. However, I’m glad that there was a workaround for this issue for us on OSX, because the performance benefits are huge – Firefox 3 on the Mac (everywhere, really) is going to be a kick-ass release!
Edit: Slashdot seems to have picked up on this, and in typical style, has completely misunderstood the post. To be clear, I do not think that Apple is in any way trying to purposely “cripple” non-Apple software. I also do not think that undocumented APIs give Safari any kind of “significant performance advantage” (as Firefox 3 should show!). However, as I said, the undocumented functionality could be useful for Firefox and other apps to implement things in a simpler (and potentially more efficient) manner. I don’t think this is malicious; it’s just an unfortunate cutting of corners that is way too easy for a company that’s not fully open to do.