Vladimir Vukićević — Words
 



For the past while, Theora decoding wasn’t enabled on any of our ARM Firefox builds because it simply stopped compiling after an update.  I finally got around to looking into why, and the fix was fairly straightforward.  With this patch, we have rough baseline Theora decode capability on any platform.  However, it’s not fast, since it runs pure C code even in the tight inner loops for which there are MMX/SSE/etc. variants on x86.

One of the big bottlenecks is often YUV-to-RGB conversion, which has to be done for every frame.  On some of our mobile platforms, such as Tegra-based Windows CE devices, we use OpenGL to render the Firefox UI.  We should be able to fairly easily take advantage of hardware acceleration here to do the conversion (and certainly the scaling), which should get us in decent shape for reasonably-sized videos.  Another option is to use OpenMAX DL, if present, for doing much of this work.  However, we’d still be hurting on platforms where we’re not (yet?) using hardware graphics acceleration, or where OpenMAX is not available.
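
To give a feel for the per-pixel work involved, here is a minimal sketch of a plain C conversion loop.  It is not the code in our tree; it assumes 4:2:0 planar input (the layout most Theora video uses), the common video-range BT.601 fixed-point coefficients, and packed 24-bit RGB output.  The function name `yuv420p_to_rgb24` and its parameters are made up for illustration.

```c
#include <stdint.h>

/* Clamp an intermediate result into the 0..255 range of one color channel. */
static uint8_t clamp_u8(int x)
{
    return (uint8_t)(x < 0 ? 0 : (x > 255 ? 255 : x));
}

/*
 * Hypothetical helper: convert one 4:2:0 planar YUV frame to packed RGB24.
 * Assumes video-range BT.601 (Y in 16..235, U/V centered on 128).
 * Each chroma sample is shared by a 2x2 block of luma samples.
 */
void yuv420p_to_rgb24(const uint8_t *y_plane, const uint8_t *u_plane,
                      const uint8_t *v_plane, int width, int height,
                      int y_stride, int uv_stride, uint8_t *rgb)
{
    for (int row = 0; row < height; row++) {
        const uint8_t *y = y_plane + row * y_stride;
        const uint8_t *u = u_plane + (row / 2) * uv_stride;
        const uint8_t *v = v_plane + (row / 2) * uv_stride;
        uint8_t *out = rgb + row * width * 3;

        for (int col = 0; col < width; col++) {
            int c = 298 * (y[col] - 16);   /* scaled luma, 8.8 fixed point */
            int d = u[col / 2] - 128;      /* blue-difference chroma */
            int e = v[col / 2] - 128;      /* red-difference chroma */

            /* +128 rounds to nearest before dropping the 8 fraction bits. */
            out[0] = clamp_u8((c + 409 * e + 128) / 256);            /* R */
            out[1] = clamp_u8((c - 100 * d - 208 * e + 128) / 256);  /* G */
            out[2] = clamp_u8((c + 516 * d + 128) / 256);            /* B */
            out += 3;
        }
    }
}
```

Every output pixel costs several multiplies, adds and clamps, and this runs over the whole frame for every frame of video, which is why this loop is such an obvious candidate for SIMD or for a GPU shader.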

Whether OpenGL is in use or not, we would still benefit from accelerated baseline ARM decoding here.  If anyone’s interested in doing some ARM acceleration work, using both ARMv6 SIMD and NEON (or just ARMv6 SIMD if it’s enough, but I suspect NEON could be used to good effect as well), let me know!


5 Comments to “Theora Video, soon in Firefox on ARM CPUs”  

  1. dave

    Minor oddity: If liboggplay was only doing x86 then how did it manage to work on Mac PPC?

  2. -

    This is hardly a problem of ARM only – performance on x86 is terrible too, even on trunk with the decoder asm enabled. The decoder, however, is the least troublesome thing. The pipeline doesn’t help (I hear there are several malloc()s per frame and several copies going on), but the stuff that really kills it is getting the pixels to the screen (colorspace conversion, which I think is supposed to be optimized already, and especially resizing and maybe painting).

    If the video is resized in any way, performance goes through the floor. This is not just a video problem, it also happens with images (but then it’s less noticeable). Fx would benefit a lot from having a decent scaler. There are some GPL ones you can look at (but not use, I guess) – one in VirtualDub/AviSynth and another one in swscale (I know the swscale one has dynamic code generation; I also think it can resize and color-convert in a single pass.)

    Compositing is also very bad: when the controls are shown, CPU usage increases significantly. Right now, Flash decodes and shows H.264 video much, much faster than Firefox does Theora, even though Theora is less expensive to decode (and that is without using hardware acceleration, and in any situation – resizing is ultrafast in Flash, and drawing stuff over the video doesn’t affect it much.)

    Using a standalone player such as mplayer is also much faster, even when forced to use software color conversion and resizing, and showing an OSD over the video.

    There are also other problems (audio is much more likely to skip, I guess Fx uses the main thread and Flash a dedicated thread or something) but performance should be up there. Please fix it!

  3. Anonymous

    Last I checked, FFMPEG had NEON support in various codecs, including (I think) Theora.

  4. Robin Watts

    Hi,

    I’ve been looking at Theora and Vorbis decode for ARM – I have just released my work as .

    The Theora code includes some fast YUV2RGB code, and baseline ARMv4 code for all the code (and more) that x86 has MMX for.

    Similarly, the Vorbis decode has all the important loops entirely hand-coded in ARM assembly for speed.

    There may well be more to come from some sections by using NEON or v6/v7 extensions (but I don’t have a device to test on!), but I cannot see how to accelerate the software YUV2RGB any more even with SIMD (which is a shame, as it’s 50% of the CPU time).

    Full timings are on that website.

    Robin

  5. Robin Watts

    The URL appears to have dropped out of my previous comment. I’ll repeat it here:

    http://www.wss.co.uk/pinknoise/theorarm

    I’d love to speak to the firefox developers about it…

    Robin