Vladimir Vukićević — Words
 

It’s SIGGRAPH time, and this means all sorts of interesting announcements in the graphics world. One of these came today from AMD, who announced that they plan on shipping both EGL and OpenGL ES drivers on Windows for their recent GPUs.

One of the most challenging things in getting Firefox working with WebGL and hardware graphics acceleration has been dealing with platform-specific pieces to get access to OpenGL. In many cases similar functionality works differently (often in subtle ways), requiring both lots of testing and lots of very specific codepaths. EGL replaces all of these with a modern system designed with portability in mind. Until now, however, EGL has only been adopted in the mobile space. On the desktop, the older GLX, CGL, and WGL subsystems have held this role; in the case of GLX and WGL in particular, they bring along years of accumulated cruft.

Having a native EGL driver will allow us to ship one particular hardware acceleration provider that will work and be tested across various desktop and mobile platforms. Additionally, the same provider can connect to the ANGLE project, which implements EGL and OpenGL ES on top of Direct3D 9. Having OpenGL ES will allow us to test and develop truly identical code across desktop and mobile. As mobile graphics development has become important (not just to Mozilla, but in general!), having the same API implemented on the desktop will make it easier to catch problems and portability issues in an environment that’s much more conducive to development and debugging.

Native OpenGL ES on the desktop will also mean that we can tie our WebGL implementation directly to it, instead of going through the desktop OpenGL driver. Because WebGL follows the OpenGL ES specification, the native ES driver on the desktop will allow us to make a more efficient binding between WebGL and the underlying platform, potentially leading to higher performance.

As with any such change, it will be a while before we can depend on the presence of these APIs on the desktop. These first steps are important to making that change happen. I’m looking forward to seeing other vendors follow AMD here, both on Windows and on other platforms.

WinDbg Image Viewer Extension

Since I often work on graphics code, one of the things I frequently want to do while debugging is take a chunk of memory and view it as an image.  No standard debugger seems to do this, which is surprising given how useful it is.  I’ve made do with other tools in the past; there’s a great debugging image viewer somewhere for win32 (I can’t find it via Google this time around) that lets you attach to a process and manually put in an address, dimensions, etc. to grab the data from the process and view it.  It’s somewhat buggy though, and hasn’t been updated in a while.  There’s also the image debugger, which is handy, but requires you to link it into your program.
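To make "view a chunk of memory as an image" concrete, here is a minimal sketch in Python. It is not how any of the tools above work internally. It assumes the memory holds 32-bit BGRA pixels laid out row by row with a given stride, and the function name memory_to_ppm is made up for illustration. It turns the raw bytes into a binary PPM file that most image viewers can open.

```python
def memory_to_ppm(buf: bytes, width: int, height: int, stride: int) -> bytes:
    """Interpret `buf` as 32-bit BGRA rows (an assumed layout) and
    return the same picture as a binary PPM (P6) image."""
    if stride < width * 4:
        raise ValueError("stride is smaller than one row of pixels")
    if len(buf) < stride * (height - 1) + width * 4:
        raise ValueError("buffer is too small for the given dimensions")

    out = bytearray(b"P6\n%d %d\n255\n" % (width, height))
    for y in range(height):
        # Rows may be padded, so step by stride rather than width * 4.
        row = buf[y * stride : y * stride + width * 4]
        for x in range(width):
            b, g, r = row[4 * x], row[4 * x + 1], row[4 * x + 2]
            out += bytes((r, g, b))  # PPM wants RGB; alpha is dropped
    return bytes(out)
```

The stride parameter matters in practice: surfaces are often padded at the end of each row, and getting it wrong produces the familiar sheared image.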

So, I finally wrote the (very simple) debug extension I wish I’d had.  You can find the source for imext here (it’s a hg repo), and a binary DLL built for x86 here.  Rename it to imext.dll and drop it alongside your x86 WinDbg.  Use “!imext.help” to get some basic instructions.  It’s pretty rough code and only does exactly what I needed yesterday, but it can become more useful pretty quickly.  There are some weird bugs with WinDbg’s expression evaluators that make it difficult to use with actual expressions; passing addresses directly works better.  Also remember that WinDbg’s default expression evaluator is MASM, not C++; most importantly, this means that bare numbers are interpreted as hex (use 0n123 to get decimal).
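The hex-by-default rule is easy to trip over, so here is a small Python sketch of how MASM-style numeric literals are read. It is only an illustration, not WinDbg's parser: the helper name parse_masm_number is made up, and besides the 0n prefix mentioned above it assumes the usual 0x (hex), 0t (octal) and 0y (binary) prefixes.

```python
def parse_masm_number(text: str, default_radix: int = 16) -> int:
    """Parse a number the way a MASM-style evaluator would:
    an explicit prefix wins, otherwise the default radix (hex) applies."""
    prefixes = {"0x": 16, "0n": 10, "0t": 8, "0y": 2}
    lowered = text.strip().lower()
    radix = prefixes.get(lowered[:2])
    if radix is not None:
        return int(lowered[2:], radix)
    return int(lowered, default_radix)


print(parse_masm_number("123"))    # bare number, read as hex
print(parse_masm_number("0n123"))  # explicit decimal
```

So a width typed as 640 is really 0x640, which is 1600 in decimal. That is an easy way to end up with a garbage image.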

I’ll probably extend this quite a bit over time, including teaching it about Mozilla-specific things like gfxImageSurfaces.

Has that happened to you? No? It’s just me? Really? Huh.

(Somewhat technical OpenGL post follows; you’ve been warned.)

So over the past week, I’ve been trying to make our OpenGL stuff in Firefox a little more coherent and easier to use, especially when it comes to doing things like rendering to a texture. This is mostly for WebGL, and it is key to making WebGL fast: we want it all to live in video hardware, and we want to render it straight from there, instead of doing the horrible readback that it does currently.

Our current implementation uses PBuffers for each WebGL context. PBuffers are old and crufty, but they mostly work ok. But, everyone kept telling me, “PBuffers are old and crufty! FBOs are the new hotness! Why do you hate unicorns?” Problem is, WebGL really wants its own context — one of the advantages of FBOs over PBuffers is that they don’t need a separate context.  That’s great, if what you need is to render to a texture in your game or visualization app or whatever.  But, WebGL needs a context.

Ok, I thought, I can just use OpenGL’s (technically, WGL, EGL, GLX, and CGL’s) context sharing feature, have everything live in one shared global namespace, and things will just work.  This has one somewhat scary problem — if you don’t clean up any resources, they stick around forever, or at least until all the contexts in the sharing group are gone.  But, I figured we can keep track and clean up.
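The "sticks around forever" problem is easier to see in a toy model. The Python sketch below is not GL code. ShareGroup and its methods are made-up names that only mimic the lifetime rule described above: a resource created in a shared namespace survives the context that created it, and only goes away when it is deleted explicitly or when the last context in the group is destroyed.

```python
class ShareGroup:
    """Toy model of a GL share group's resource lifetime (illustrative only)."""

    def __init__(self):
        self.contexts = set()
        self.resources = set()

    def create_context(self, name):
        self.contexts.add(name)

    def create_texture(self, name):
        # Textures live in the group's namespace, not in any one context.
        self.resources.add(name)

    def delete_texture(self, name):
        self.resources.discard(name)

    def destroy_context(self, name):
        self.contexts.discard(name)
        if not self.contexts:
            # Only when the whole group is gone are leftovers reclaimed.
            self.resources.clear()


group = ShareGroup()
group.create_context("window")
group.create_context("webgl")
group.create_texture("webgl-backbuffer")
group.destroy_context("webgl")
print(group.resources)  # the texture is still alive: a leak unless tracked
```

This is why "keep track and clean up" becomes the browser's job once everything shares one namespace.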

So I did all this work and set up a global context to use as a “shared” context.  Everything worked great; creating an FBO is certainly easier than creating a PBuffer and all that.

And then I decided to test it on Android.  One of the reasons why I wanted to do this was that on a number of mobile devices, among them Nokia’s N900 and Nvidia’s Android port, PBuffers cannot be bound as a texture; they just don’t support that.  The shared-context FBO approach worked great.  Then I plugged in a Nexus One.  Failure.

What gives?  It turns out, a bunch of current Android devices simply don’t support GL context sharing.  My plan?  Ruined.  The full table of tears actually looks like this:

Target                   PBuffers   Sharing   Other
Desktop – WGL            YES        YES
Desktop – GLX            no         YES       YES
Desktop – CGL            YES        YES       YES
Maemo (N900 + others)    no         YES       YES
Android – Tegra          no         YES       YES
Android – Nexus One      YES*       no
Android – Droid          YES*       no
Android – EVO4G          YES*       no

The “Other” category means that the platform has an alternate approach that doesn’t involve either PBuffers or context sharing. These other approaches are:

Desktop – GLX: X11 Pixmaps can be rendered to and used as textures via texture_from_pixmap

Desktop – CGL: I think there’s some CoreAnimation thing that can be used here?

Maemo: X11 Pixmaps, as on GLX; potentially EGL_KHR_gl_texture_2D_image

Android – Tegra: EGL_KHR_gl_texture_2D_image, which allows a texture to be exported as an EGLImage.  This is ideal, since it gives all the benefit of FBOs without any of the downsides of context sharing.  Unfortunately, this is the only place where it’s supported, and as best I can tell nothing like this exists on the desktop.

The * next to some of the Android entries indicates that while they do support pbuffers, they only support power-of-two dimensions, at least for pbuffers that can be bound as textures.  This is annoying and caused me a bunch of grief until I realized that quirk.
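One way to live with the power-of-two restriction is to round the requested size up and only use part of the surface. The sketch below, in Python for brevity, shows the rounding. The function names are made up for illustration, and it assumes the caller is happy to allocate more than it needs.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("dimension must be positive")
    return 1 << (n - 1).bit_length()


def pbuffer_size(width: int, height: int) -> tuple:
    """Dimensions to request for a pbuffer that must be power-of-two."""
    return next_power_of_two(width), next_power_of_two(height)


print(pbuffer_size(300, 150))
```

A 300x150 canvas ends up in a 512x256 pbuffer, so texture coordinates have to be scaled by 300/512 and 150/256 when it is drawn.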

So, with that information, the new plan is to attempt to share all window contexts’ resources — there are advantages here in cleanup operations and being able to do texture uploads and other things on different threads.  For WebGL and other offscreen contexts though, we want to avoid sharing if we can, so the order in which we’ll try things goes like this:

  1. EGL_KHR_gl_texture_2D_image + GL_OES_EGL_image.  Even though it’s only supported on one target, I still want to make sure that we use this where we can; it really is exactly what we want.
  2. PBuffers.  Yes, they may be old and busted and difficult to create and all that, but if supported, they still do basically exactly what’s needed.
  3. X11 Pixmaps + texture_from_pixmap.  Basically like PBuffers, but even more annoying and actually supported on X11.
  4. Dummy window or other drawable and an FBO, plus full sharing with the windowed contexts.  This is only possible if sharing is possible, and is a little risky from a resource management perspective, but it works.
  5. Can’t share, can’t texture from any renderable target?  Then we call glReadPixels and take the slow boat through system memory.
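The ordering above boils down to a chain of capability checks. Here is a minimal Python sketch of that decision. It is not the Firefox code. The Caps fields and the returned labels are hypothetical names, and a real implementation would fall through to the next option if creating a surface fails, not just when a feature flag is missing.

```python
from dataclasses import dataclass


@dataclass
class Caps:
    """What the platform reports (field names are made up for this sketch)."""
    egl_image: bool = False            # EGLImage export of a 2D texture
    pbuffers: bool = False
    texture_from_pixmap: bool = False  # X11 only
    context_sharing: bool = False


def choose_offscreen_path(caps: Caps) -> str:
    """Pick an offscreen rendering strategy, in the preference order above."""
    if caps.egl_image:
        return "egl_image"
    if caps.pbuffers:
        return "pbuffer"
    if caps.texture_from_pixmap:
        return "texture_from_pixmap"
    if caps.context_sharing:
        return "shared_fbo"
    return "readpixels"  # slow path through system memory


# Rows taken from the table above.
print(choose_offscreen_path(Caps(egl_image=True, context_sharing=True)))  # Tegra
print(choose_offscreen_path(Caps(pbuffers=True)))                         # Nexus One
```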

So, where I was hoping to just write one path — #4 in the list above — I now have to write 5.  On the plus side, 1-3 all don’t have the sharing problem.  On the minus side, it’s 5 separate code paths instead of just 1.

Hopefully the above information saves someone some pain while trying to do offscreen GL rendering on various platforms, especially mobile ones.  I wish the Android EGL implementations were higher quality; the non-Nvidia ones seem to support an identical set of extensions and report identical version strings, which makes me wonder if they’re just based on some generic code that Google provides.  If so, it would be nice to see that updated with support for texture_2d_image/EGL_image.

Losing My Memory

With the work going on to bring Firefox to mobile devices, and with desktop users demanding more and more from their web browser, memory usage is a concern.  Even with 4GB on desktops and laptops becoming commonplace, and 8GB, 12GB, 16GB etc. becoming not all that unusual, it’s unnerving to see a web browser eating up a large chunk of that.  I’ve been spending time figuring out how we can improve our memory usage, which starts with finding out where Firefox uses memory to begin with.

Let’s get one thing out of the way up front.  Today’s web browser is in many ways acting like a miniature full operating system.  It runs multiple applications at once (whether in multiple windows or tabs).  It might do a lot of background processing.  It can work with large data sets, for example large images on flickr or large spreadsheets on Google Documents.  But, the final memory usage number that the user sees when they open up the Task Manager or Process Viewer is the aggregate memory usage of the entire system.  So, the goal of improving our memory usage is not to get that number as low as possible (doing that would be an unacceptable tradeoff in performance for users) but instead to understand where memory is being used, and then use that data to improve those areas where possible.

One comment that I’ve heard is that Firefox 3.6 seems to use more memory than Firefox 3.0.  My initial tests show this to not be true; specifically, I looked at the “Private Bytes” value in the Windows 7 task manager shortly after startup with about:blank, and also after opening a number of tabs (gmail, google docs, cnn.com, front page of the boston.com big picture blog, engadget, and a few others).  Here are the results of a typical run:


(in kb)          Firefox 3.0   Firefox 3.6
Blank Page       20,052        21,740
Multiple Tabs    115,532       109,128

The next question is figuring out where all the memory goes. I’ve been adding some instrumentation to Firefox to figure out in more detail where memory is being used. For a sample run with the multiple tabs shown above, here’s what some of that reporter data looks like:


Component                        Memory (in kb)
Windows – Private Bytes          111,616
jemalloc – Commit Size           91,684
JavaScript – GC Chunks           11,534
JavaScript – NJ Trace Code       128
JavaScript – js_malloc Other     30,142
Images (uncompressed)            53,811
Graphics Surfaces (win32)        53,967
PresShell Arenas                 6,373

Or, graphically: [chart of the per-component memory numbers above]

There’s some overlap in those numbers — for example, the jemalloc commit size is a subset of the Windows Private Bytes number, and most of the rest is a subset of the jemalloc commit size. Likewise, the uncompressed images number is a subset of the Win32 graphics surfaces number; that is, ~53MB is in use by win32 surfaces, and almost all of that is due to live images in pages (remember that we’ve got some image heavy sites in that tab set, including the Big Picture blog which has around 10-11 large images on it… those should account for about 20-25MB just by themselves).
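The 20-25MB estimate for the large images comes from simple arithmetic: a decoded image costs width x height x bytes per pixel. A quick Python check follows, assuming 4 bytes per pixel and a hypothetical 900x600 size for each photo; the post does not give the real dimensions.

```python
def uncompressed_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Memory taken by one fully decoded image."""
    return width * height * bytes_per_pixel


def total_mb(count: int, width: int, height: int) -> float:
    """Total for `count` same-sized images, in MB (1 MB = 1024 * 1024 bytes)."""
    return count * uncompressed_bytes(width, height) / (1024 * 1024)


# 900x600 is an assumed, illustrative size for one large photo.
for count in (10, 11):
    print(count, "images:", round(total_mb(count, 900, 600), 1), "MB")
```

With those assumed dimensions, 10 to 11 images come out at roughly 20.6 to 22.7MB, which sits inside the range quoted above.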

There are some things that don’t make sense in the above, which means that my instrumentation isn’t quite correct… for example, adding up the JS numbers, the Images number, and the PresShell arenas number brings us beyond the jemalloc commit size, which shouldn’t be true. However, some of the Image data is allocated by GDI, likely bypassing jemalloc, so we have to take that into account. There are also some other large chunks of the browser that have yet to be instrumented, which should provide additional insight.
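The mismatch is easy to reproduce from the table. This short Python snippet just redoes the sum with the reported numbers; the variable names are mine.

```python
# Reporter values from the table above, in kb.
reported_kb = {
    "JavaScript - GC Chunks": 11_534,
    "JavaScript - NJ Trace Code": 128,
    "JavaScript - js_malloc Other": 30_142,
    "Images (uncompressed)": 53_811,
    "PresShell Arenas": 6_373,
}
JEMALLOC_COMMIT_KB = 91_684

subtotal_kb = sum(reported_kb.values())
overshoot_kb = subtotal_kb - JEMALLOC_COMMIT_KB

print("sum of components:", subtotal_kb, "kb")
print("beyond jemalloc commit size by:", overshoot_kb, "kb")
```

The components add up to 101,988 kb, about 10MB more than the jemalloc commit size, which fits the idea that part of the image data is allocated outside jemalloc.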

Two initial observations: one, keeping images compressed in memory and only decompressing them briefly when we need to draw them is a potential huge memory win. We have the infrastructure and code to do this in place; it was disabled recently while some of the internals changed, and it needs to be reenabled.
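The idea of keeping images compressed and decoding only around a draw can be sketched in a few lines. This is not Firefox's image code. CompressedImage is a made-up class, and zlib stands in for a real image codec such as PNG or JPEG so that the example stays self-contained.

```python
import zlib


class CompressedImage:
    """Holds only the compressed bytes; decodes for the duration of a draw."""

    def __init__(self, pixels: bytes):
        self.decoded_size = len(pixels)
        self._compressed = zlib.compress(pixels)  # stand-in for PNG/JPEG data

    def resident_bytes(self) -> int:
        """What the image costs while it is not being drawn."""
        return len(self._compressed)

    def draw(self, paint) -> None:
        pixels = zlib.decompress(self._compressed)  # decode just in time
        paint(pixels)
        # `pixels` goes out of scope here, so the decoded copy can be freed.


image = CompressedImage(bytes(900 * 600 * 4))  # a blank 900x600 RGBA image
print(image.decoded_size, "bytes decoded vs", image.resident_bytes(), "resident")
```

The tradeoff is CPU time: every draw pays for a decode, so a real implementation would likely keep recently drawn images decoded for a while before discarding them.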

Two, the 30MB or so in the “js_malloc Other” bucket is also pretty curious. We need to do some more work to figure out what exactly is in here. (This contains things like data structures for tracking array contents and, potentially a big one, string data.)

I’ll be blogging more as the instrumentation takes shape, and as it gets landed into trunk nightly builds. Much of this information will be visible in about:memory, and eventually we’ll be able to give some per-tab memory information as well.

Fennec on Android

Over the last few months, we’ve made some great progress on bringing Firefox to Android.  Michael Wu, Brad Lassey, Alex Pakhotin and I have been focusing on getting a build ready that’s usable by a broader set of people, and we’re now ready to get that build out there.  This build should be considered “pre-alpha”, so there are some warnings and caveats:

  • We’ve only really tested this on the Motorola Droid and the Nexus One.
  • It will likely not eat your phone, but bugs might cause your phone to stop responding, requiring a reboot.
  • Memory usage of this build isn’t great — in many ways it’s a debug build, and we haven’t really done a lot of optimization yet.  This could cause some problems with large pages, especially on low memory devices like the Droid.
  • You’ll see the app exit and relaunch on first start, as well as on add-on installs; this is a quirk of our install process, and we’re working to get rid of it.
  • You can’t open links from other apps using Fennec; we should have this for the next build.
  • This build requires Android 2.0 or above, and likely an OpenGL ES 2.0 capable device.
  • Edit: This build must be installed to internal memory, not to a SD card.

There also aren’t yet any automated nightly developer builds or automated updates to this build; it’s even more of a pre-nightly build (even earlier than pre-alpha).  But, it’s usable enough that we wanted to get some feedback on it as we continue to develop.

Weave Sync

There is an experimental version of Weave that is compatible with this build: from within Fennec on your phone, open the Mozilla Labs weave page at https://mozillalabs.com/weave/ and click on “Experimental Version”.  (It’s to the right of the big Download Weave now! link — don’t click on that one though, it’s an older version.)  Install the add-on, then you’ll need to restart Fennec (swipe the screen left and then click on the “gear” icon to open the browser tools panel, then click on addons and click the Restart button at the top).  Follow the instructions when Fennec restarts.

Troubleshooting

Should you run into problems, such as the app not responding or just giving you a black screen, you can force it to quit by going into the Android Settings, selecting Applications, selecting Manage Applications, then selecting Fennec, and tapping Force Stop.  (A utility called White Killer, available from the Market, can do the same job with fewer clicks.)  Worst case, uninstalling and reinstalling would clear out your profile and any saved data.

Installation & Feedback

So, now that you’ve read all that, you can download the build here — the easiest way is to download it using your phone’s browser, and then click on it in the downloads list to install it.  If you’re reading this on your desktop, you can scan the QR code here on your phone, or type in the following address in your phone’s browser: bit.ly/fennec-android.  You may need to enable installation of non-Market applications by going to Settings, Applications, and checking “Unknown Sources”.

We’ve created a temporary Google Group for feedback about this pre-alpha build.  In the future we’ll have a more permanent way for user feedback and comments, but for now, please use the group to let us know what you think!