A while ago, I put together a presentation about the graphics architecture in Mozilla, and what we were working towards with Thebes and Cairo. It was supposed to be a video presentation, but I was never happy with the way the video turned out, so I never put it up. After seeing John Resig’s post about putting his presentations online, I decided to do the same.
I plan to go over the reasoning behind switching Gecko’s rendering engine to Cairo, what purpose Thebes serves, and how it all interacts with the rest of the layout engine. First, let’s look at how Gecko used to render pre-1.9.
Each platform had its own implementation of the widget code (which manages windows, input events, repaint events, and so on), along with its own implementation of the old gfx interfaces nsIDeviceContext and nsIRenderingContext. These interfaces provided the set of basic rendering capabilities that all platforms had in common.
These capabilities were very limited. Besides text, the old gfx implementations could draw non-antialiased lines, rectangles, polygons, and ellipses. They could also draw images (scaled, but not rotated), and could clip content to rectangular regions. Images could also be alpha blended, but at a cost on most platforms.
With Gecko 1.9, things have changed significantly.
Switching to Cairo and Thebes lets us unify a large chunk of code. The widget implementations still remain separate, as this is where code has to work in terms of the platform native concepts of windows, widgets, and events (HWND, GdkWindow, and so on). However, everything below that is shared common code.
Because we didn’t want to have to rewrite all the code that still used the previous interfaces, there is a shared implementation of nsIDeviceContext and nsIRenderingContext that uses Thebes under the hood. This code lives in mozilla/gfx/src/thebes, and is used on all platforms. There is still some platform-specific code left in there, most of it dealing with images and optimizing images for fast rendering. In Mozilla 2, we hope to remove these old interfaces and have everything speak directly to Thebes.
Let’s take a look at Thebes itself.
Thebes consists of two parts. One part is a direct C++ wrapper for Cairo that simplifies reference counting (Thebes objects can be used with the standard nsRefPtr<> found in Mozilla) and provides some Mozilla-specific helpers. For example, in many places in our code we want to draw pixel-aligned rectangles. There are convenience functions on gfxContext to perform this task (gfxContext::Rectangle with the snapped parameter set to TRUE). The core Thebes code lives in mozilla/gfx/thebes.
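To give a feel for what "pixel-aligned" means here, the following is a minimal sketch of one plausible way to snap a rectangle to whole device pixels. It is not the actual gfxContext::Rectangle code: the Rect type and the SnapToDevicePixels function are hypothetical names made up for illustration, and the sketch assumes the rectangle is already in device space.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a rectangle type; this is NOT the real gfxRect.
struct Rect {
    double x, y, width, height;
};

// Snap a device-space rectangle so that all four edges land on whole pixels.
// Each edge is rounded on its own (rather than rounding x and width), so two
// rectangles that share an edge before snapping still share it afterwards.
inline Rect SnapToDevicePixels(const Rect& r) {
    assert(r.width >= 0 && r.height >= 0);
    const double left   = std::floor(r.x + 0.5);
    const double top    = std::floor(r.y + 0.5);
    const double right  = std::floor(r.x + r.width + 0.5);
    const double bottom = std::floor(r.y + r.height + 0.5);
    return Rect{left, top, right - left, bottom - top};
}
```

Rounding the edges instead of the origin and size is a design choice in this sketch: it avoids one-pixel gaps or overlaps between neighbouring boxes, which is usually what matters when drawing backgrounds and borders.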
gfxContext, gfx*Surface (with gfxASurface being the generic abstract implementation), and gfxPattern map directly to cairo_t, cairo_surface_t, and cairo_pattern_t. Additional geometry helper classes such as gfxMatrix, gfxPoint, and gfxRect are also provided.
The core Thebes rendering functionality is directly provided by Cairo and the full feature set of Cairo is available.
Instead of the basic capabilities exposed by the previous gfx implementations, Cairo provides antialiased path fills and clips, along with generic support for transforms, Porter-Duff compositing operators, native support for gradients, and more. Most of these capabilities are exposed to the web only through SVG and the HTML Canvas, but some are beginning to appear as proposals and experiments within CSS.
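For readers who have not met the Porter-Duff operators before, here is a small sketch of the most common one, OVER, applied to a single premultiplied-alpha pixel. This is an illustration of the math only, not Cairo's code; the Pixel type and the MulDiv255 and Over functions are names invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Premultiplied RGBA pixel, 8 bits per channel (illustrative type only).
struct Pixel {
    std::uint8_t r, g, b, a;
};

// Multiply two 8-bit values treated as fractions of 255, with rounding.
inline std::uint8_t MulDiv255(unsigned a, unsigned b) {
    const unsigned t = a * b + 128;
    return static_cast<std::uint8_t>((t + (t >> 8)) >> 8);
}

// Porter-Duff OVER for premultiplied pixels:
//   result = src + dst * (1 - src.alpha)
// Assumes both inputs are valid premultiplied pixels (each color <= alpha),
// so the per-channel sums below cannot exceed 255.
inline Pixel Over(Pixel src, Pixel dst) {
    assert(src.r <= src.a && src.g <= src.a && src.b <= src.a);
    const unsigned inv = 255u - src.a;
    return Pixel{
        static_cast<std::uint8_t>(src.r + MulDiv255(dst.r, inv)),
        static_cast<std::uint8_t>(src.g + MulDiv255(dst.g, inv)),
        static_cast<std::uint8_t>(src.b + MulDiv255(dst.b, inv)),
        static_cast<std::uint8_t>(src.a + MulDiv255(dst.a, inv)),
    };
}
```

An opaque source simply replaces the destination, a fully transparent source leaves it untouched, and anything in between blends the two. The other operators differ only in the factors applied to the source and destination terms.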
As Cairo itself does not provide a text rendering implementation, a larger and more significant part of Thebes is the text API.
Cairo exposes only a method of drawing glyphs in a platform-specific font at specific locations. (The toy font API that’s present in Cairo can be ignored, as it does not support international text and is there only as a convenience for smaller projects.) Cairo font objects such as cairo_font_face_t (listed incorrectly as cairo_font_t on the slide) and cairo_scaled_font_t are used internally by the Thebes text classes, but are not otherwise exposed. Thebes works with each platform’s native complex text handling APIs (Uniscribe on Windows, Pango on Linux, and ATSU on Mac OS X) and presents a unified interface to the rest of Gecko.
The Thebes text API provides high performance cross platform text rendering capabilities, including full support for ligatures, kerning, and complex script shaping. In conjunction with Gecko’s text layout capabilities, all of these are supported across DOM element boundaries.
Because not all platforms have the same 2D capabilities in their graphics APIs, Thebes leverages Cairo’s fallback mechanism whereby operations that cannot be handled natively are rendered using a pure software backend.
Each of the fundamental Cairo operations (such as fill, stroke, show_glyphs, etc.) is first passed to the platform-specific Cairo surface implementation. The surface then performs the operation, if possible. If the surface indicates that it cannot perform the operation, Cairo’s fallback mechanism is invoked, which attempts to decompose the operation into a set of simpler steps. For example, paths are tessellated into trapezoids, and the resulting set of trapezoids is given to the surface for rendering. Eventually, if the surface cannot handle the simpler steps either, the software renderer is invoked, which will read the pixels from the native surface, perform the rendering, then update the contents of the surface.
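The control flow described above can be sketched roughly as follows. This is a simplified model, not Cairo's internal backend interface: Surface, Status, Route, Tessellate, and Fill are all hypothetical names, and the "path" is just a string standing in for real geometry.

```cpp
#include <cassert>
#include <string>

// All names below are hypothetical; they are not Cairo's internal API.
enum class Status { Success, Unsupported };
enum class Route { NativePath, NativeTrapezoids, Software };

struct Surface {
    virtual ~Surface() = default;
    // By default a surface handles nothing natively.
    virtual Status FillPath(const std::string&) { return Status::Unsupported; }
    virtual Status FillTrapezoids(int) { return Status::Unsupported; }
};

// A backend that can fill arbitrary paths itself.
struct PathCapableSurface : Surface {
    Status FillPath(const std::string&) override { return Status::Success; }
};

// A backend that can only render trapezoids.
struct TrapezoidSurface : Surface {
    Status FillTrapezoids(int) override { return Status::Success; }
};

// Stand-in tessellator: pretends each path command yields one trapezoid.
inline int Tessellate(const std::string& path) {
    return static_cast<int>(path.size());
}

// Try the most capable route first, then progressively simpler ones.
inline Route Fill(Surface& surface, const std::string& path) {
    assert(!path.empty());
    if (surface.FillPath(path) == Status::Success) {
        return Route::NativePath;
    }
    const int trapezoids = Tessellate(path);
    if (surface.FillTrapezoids(trapezoids) == Status::Success) {
        return Route::NativeTrapezoids;
    }
    // Last resort: read the pixels back, render in software, write them back.
    return Route::Software;
}
```

The point of the sketch is the ordering: the surface always gets the first chance at the high-level operation, and the expensive read-render-write path is only reached when every simpler decomposition has also been refused.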
The various Cairo backends provide different levels of support for native operations.
Due to the dependence on GDI, the Windows platform provides the fewest native operations. However, the ones that it does provide are the most common operations used by Gecko to render web content. On Linux, the RENDER capabilities of the X server are used. On Mac OS X, Cairo ends up mapping directly to Quartz, the native 2D API. Thus, all rendering operations result in native Quartz methods being used, and no software fallback within Cairo is ever used on Mac OS X. In addition to these platform-specific surfaces, Cairo can also generate PDF and PostScript output.
The final component of the low-level rendering bundle is native rendering. In some cases, platform-specific code needs to render directly to one of Gecko’s widgets. This is most commonly seen in the native theme drawing code, where the widgets themselves are drawn using the platform’s existing theme API.
The approach is to check whether the current Cairo state (destination surface, transformation, compositing operator, etc.) can be represented in the native API. If it can, the appropriate setup happens and drawing occurs normally to the underlying platform drawing target. If the state cannot be represented, Gecko creates a temporary surface with no complex transforms, performs the drawing operation there, and then applies transforms and operators as this surface is rendered to the destination. In this way, we can do things such as render rotated and/or scaled widgets on Windows or print widgets with the native theme with Gtk on Linux.
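As a rough sketch of that decision, the check might look something like the code below. The exact criteria differ per platform and are an assumption here: this example pretends the native API can only cope with a whole-pixel translation, the OVER operator, and a native destination. Matrix (whose field names mirror cairo_matrix_t), Operator, DrawPath, IsSimpleTranslation, and ChooseDrawPath are hypothetical names, not the real Gecko classes.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types for illustration; field names mirror cairo_matrix_t.
struct Matrix {
    double xx, yx, xy, yy, x0, y0;
};
enum class Operator { Over, Other };
enum class DrawPath { Direct, ViaTemporarySurface };

// Assumption: the native API can only handle an unscaled, unrotated
// transform whose translation is a whole number of pixels.
inline bool IsSimpleTranslation(const Matrix& m) {
    assert(std::isfinite(m.x0) && std::isfinite(m.y0));
    return m.xx == 1.0 && m.yy == 1.0 && m.xy == 0.0 && m.yx == 0.0 &&
           std::floor(m.x0) == m.x0 && std::floor(m.y0) == m.y0;
}

// Decide whether to draw straight to the platform target, or to draw into
// an untransformed temporary surface and composite that afterwards.
inline DrawPath ChooseDrawPath(const Matrix& m, Operator op,
                               bool destinationIsNative) {
    if (destinationIsNative && op == Operator::Over && IsSimpleTranslation(m)) {
        return DrawPath::Direct;
    }
    return DrawPath::ViaTemporarySurface;
}
```

With these assumed rules, an ordinary unrotated widget takes the direct path, while a rotated widget or one drawn with an unusual operator goes through the temporary surface, where Cairo can apply the transform and operator during the final composite.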
In the future, it will be possible to transform plugins and other content over which we don’t have control using this same mechanism. Such transformed content won’t be as sharp as it could be, but we could mitigate that by rendering at a higher resolution than we need (assuming we control the rendering, as in the native theme case).
That’s it for the first part. In future parts I plan on giving example code that uses Thebes to do rendering, to serve as a cookbook for anyone writing rendering code in Gecko. This should be especially useful as legacy code slowly gets converted from nsIRenderingContext to Thebes. I also plan on discussing how painting and invalidation are handled from the OS/widget level down to the Thebes level. In addition, a separate presentation on the text API would probably be useful, since there is significant complexity there, especially regarding the interaction of the Thebes font code and the layout TextFrame code.
Any mistakes here are almost certainly mine; please leave me a comment with any corrections, and I’ll update the post!