Monday, October 25, 2010

Texture Mapper: Accelerated compositing evolved

In the last year I've been developing and maintaining the Qt implementation of Accelerated Compositing. The accelerated compositing code path in WebKit was originally written by Apple to allow WebKit ports to accelerate animations, transitions and 3D transforms using the GPU. It originally used Core Animation on Mac and on the iPhone, and something similar on Windows. In January 2010 we introduced the Qt implementation, which uses QGraphicsView and the Qt animation framework; it was then followed by the Chromium implementation, for, as you can guess, Chromium.

Accelerated Compositing optimizes for cases where an element would be painted to the screen multiple times without its content changing. For example, a menu sliding into the screen, or a static toolbar on top of a video. It does so by creating a scene graph: a tree of objects (graphics layers) that carry properties such as a transformation matrix, opacity, position and effects, along with a notification for when a layer's content needs to be re-rendered.
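To make the model concrete, here is a minimal, hypothetical sketch of such a layer node in C++; the names are illustrative and are not WebKit's actual GraphicsLayer API:

#include <memory>
#include <vector>

// Hypothetical scene-graph node, loosely modeled on the description above.
struct Layer {
    // Compositing properties: animating these does not repaint the content.
    float transform[16] = { 1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1 };
    float opacity = 1.0f;        // multiplied with ancestor opacities
    float x = 0, y = 0;          // position relative to the parent layer
    bool contentsDirty = true;   // true when the content must be re-rendered
    std::vector<std::unique_ptr<Layer>> children;

    // The engine calls this when the layer's content itself changes.
    void setNeedsDisplay() { contentsDirty = true; }
};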

How Qt Accelerated Compositing Works Today
If you want to know the basics of WebKit's graphics rendering, I recommend reading this entry from the Chromium blog from a few months ago: https://sites.google.com/a/chromium.org/dev/developers/design-documents/gpu-accelerated-compositing-in-chrome

Our architecture right now is slightly simpler than Chromium's, since we don't use their cross-process architecture. The idea is that each graphics layer is backed by a QGraphicsItem and a QPixmap, and the graphics scene is responsible for knowing which items to draw, which items clip which, applying effects, and so on. The hardware acceleration is achieved by QtOpenGL, which knows how to accelerate repeated painting of QPixmaps by caching them as OpenGL textures.
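As a rough sketch of that backing-store idea (LayerItem and renderContents here are made up for illustration; the real implementation also handles transforms, opacity and animations):

#include <QGraphicsItem>
#include <QPainter>
#include <QPixmap>

// Stripped-down illustration: the layer's content is rendered once into a
// QPixmap, and every subsequent paint just blits the cached pixmap. Under
// the QtOpenGL paint engine the pixmap is cached as a GL texture, so the
// blit is hardware accelerated.
class LayerItem : public QGraphicsItem {
public:
    LayerItem(const QSize& size) : m_backingStore(size), m_contentsDirty(true) {}

    QRectF boundingRect() const { return QRectF(QPointF(0, 0), m_backingStore.size()); }

    void paint(QPainter* painter, const QStyleOptionGraphicsItem*, QWidget*)
    {
        if (m_contentsDirty) {
            QPainter p(&m_backingStore);
            renderContents(&p);       // expensive: repaint the web content
            m_contentsDirty = false;
        }
        painter->drawPixmap(0, 0, m_backingStore); // cheap on repeated frames
    }

private:
    void renderContents(QPainter*);  // would delegate to WebCore's painting
    QPixmap m_backingStore;
    bool m_contentsDirty;
};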

Why this was a good idea, and why it needs to change
Since the Qt APIs are so easy to use, getting Accelerated Compositing running on top of them was quite simple. We were up and running with this approach within days, and the feature became stable rather quickly. We learned a lot about how to do this right (and how not to) during this time, and that wouldn't have been possible without the API.

But now, with more and more people using this feature and pushing its limits, the approach is showing its limitations. For example, getting 3D transformations to work correctly on top of the 2D graphics view was a pain, and it still doesn't look exactly right because there is no support for depth buffering, and it isn't as fast as it could be, due to the software 3D-to-2D projection. Mask and reflection effects are also difficult to optimize, and texture uploads are a bottleneck. But the main problem is that there are too many abstractions: CSS goes to WebKit, which goes to QGraphicsView, then QPixmap, QPainter, the GL paint engine, and finally OpenGL and the driver. Too many assumptions are made along the way that are not always correct in the context of the application.

Introducing Texture Mapper
The idea was to remove most of the abstractions and go from GraphicsLayer directly to OpenGL, making the code that traverses the scene graph and draws the texture-backed layers a cross-platform WebKit implementation that can also be used by other ports. The original code did exactly that: it walked the tree and issued GL calls, keeping one texture per graphics layer, as sketched below. But the need for some abstraction layer became apparent: some GPUs require different optimizations than others, some have DirectFB rather than OpenGL, some can benefit from EGL texture sharing and other proprietary extensions, and we still want this to work in pure software when hardware acceleration is not available.
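In spirit, that original walk looked something like the following sketch (using the hypothetical Layer type from above; multiplyMatrices, uploadContentsToTexture and drawTexturedQuad stand in for the real matrix math and GL calls):

// Hypothetical helpers, declared here only to keep the sketch readable.
void multiplyMatrices(const float a[16], const float b[16], float out[16]);
void uploadContentsToTexture(Layer&);                 // repaint + texture upload
void drawTexturedQuad(const Layer&, const float transform[16], float opacity);

// Recursively composite a layer subtree: accumulate transform and opacity
// down the tree, refresh dirty textures, and issue one draw per layer.
void paintLayerRecursively(Layer& layer, const float parentTransform[16], float parentOpacity)
{
    float transform[16];
    multiplyMatrices(parentTransform, layer.transform, transform);
    float opacity = parentOpacity * layer.opacity;

    if (layer.contentsDirty) {
        uploadContentsToTexture(layer);
        layer.contentsDirty = false;
    }

    drawTexturedQuad(layer, transform, opacity);

    for (size_t i = 0; i < layer.children.size(); ++i)
        paintLayerRecursively(*layer.children[i], transform, opacity);
}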

The current implementation uses an abstraction called TextureMapper: a couple of thin abstract classes, one called BitmapTexture and one called TextureMapper. A BitmapTexture is equivalent to a GL texture or a QPixmap. Like a QPixmap, it gives you an API to paint into it in software via a GraphicsContext, but it also lets you use it as the render target for hardware-accelerated drawing, like an FBO. This is important when using effects or opacity, for example: to render masks or opacity correctly on an element that has several descendant elements, the subtree has to be rendered into an intermediate buffer before the effect is applied. With QGraphicsEffect this is only possible in software, because of the limitations imposed by the many abstraction layers.

The TextureMapper class is equivalent to a QPainter or a GraphicsContext, but it's very limited in scope: you can bind a BitmapTexture as the current rendering target, set the clip, and draw BitmapTextures.
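Condensed into code, the two interfaces boil down to roughly the following. This is an approximation inferred from the description above, not the verbatim headers; the real ones live under WebCore/platform/graphics/texmap and differ in names and detail:

// Minimal stand-ins so the sketch is self-contained; WebCore has its own
// IntRect, FloatRect, TransformationMatrix and GraphicsContext types.
struct IntRect { int x, y, width, height; };
struct FloatRect { float x, y, width, height; };
struct TransformationMatrix { float m[16]; };
class GraphicsContext;

// Equivalent to a GL texture or a QPixmap: paintable in software, and
// usable as a hardware render target, like an FBO.
class BitmapTexture {
public:
    virtual ~BitmapTexture() {}
    virtual GraphicsContext* beginPaint(const IntRect& dirtyRect) = 0; // software path
    virtual void endPaint() = 0;
};

// Equivalent to a QPainter or GraphicsContext, but deliberately tiny.
class TextureMapper {
public:
    virtual ~TextureMapper() {}
    virtual BitmapTexture* createTexture() = 0;            // caller owns the result
    virtual void bindSurface(BitmapTexture* surface) = 0;  // 0 means the screen
    virtual void setClip(const IntRect&) = 0;
    virtual void drawTexture(const BitmapTexture&, const FloatRect& target,
                             const TransformationMatrix&, float opacity) = 0;
};

An OpenGL implementation can back BitmapTexture with an FBO-attached texture, while a software implementation can back it with a plain image buffer; either way, rendering a masked or translucent subtree into an intermediate BitmapTexture and then compositing it works the same.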

Abstraction layers are usually a mistake
Abstraction layers are something I'm usually very careful about - people implement things on both sides of the layer, and you become tied to concepts you invented before the scope of the problem was completely clear. TextureMapper is an abstraction layer - which holds the same perils - but I chose to do it here anyway because of the other constraints I had.

To reduce the risks of having an abstraction layer, I limited its scope. The main limitation is that this abstraction layer is not public API and is not pluggable. That means that, at least for now, the header files for that abstraction might change, as long as all the implementations inside WebKit trunk can live with the change. The abstraction layer just helps us separate parts of the code; it's not an "API".

Acknowledgements

This endeavor became real thanks to Sam Magnuson from Netflix, who did a lot of the work with me and helped test and optimize it with a real-world application on TV hardware. Also, it was his idea and I stole it :)

Working across companies in the context of an open source project made a huge difference: they got a better user experience for their application, and we got a QtWebKit feature that functions much better in a productized context.


Status
This code is now in WebKit trunk (as of changeset 70487). To enable it, build WebKit trunk with Qt (see http://trac.webkit.org/wiki/QtWebKit), adding the texmap CONFIG option. For example:

WebKit/WebKitTools/Scripts/build-webkit --qt --qmakearg="CONFIG+=texmap"

Most of the code resides under WebCore/platform/graphics/texmap, and the OpenGL implementation is under WebCore/platform/graphics/opengl.

Comments:

  1. Are there any performance numbers/stats as a result of this change?

  2. Girish: on most common tests both solutions show 60 FPS on relevant hardware with OpenGL. On some DirectFB-based hardware the texture mapper is maybe 50% faster, and on both it renders reflections, masks and opacity correctly, unlike the QGraphicsView-based solution.

  3. Is it possible to use accelerated compositing with overlay video?

    By overlay I mean clearing a hole rect to show the video playing in the video plane.
