[maemo-developers] N800 & Video playback
From: Siarhei Siamashka siarhei.siamashka at gmail.com
Date: Thu May 3 08:15:13 EEST 2007
On Wednesday 02 May 2007 12:54, Daniel Stone wrote:
> > The 'framebuffer' is just the ordinary system memory, converting color
> > format and copying data to framebuffer will be done with the same
> > performance as simulated in this test. RFBI performance is only critical
> > for asynchronous DMA data transfer to LCD controller which does not
> > introduce any overhead and is performed at the same time as ARM core is
> > doing some other work (decoding the next frame). RFBI performance matters
> > only if data transfer to LCD is still not complete at the time when the
> > next frame is already decoded and is ready to be displayed. When playing
> > video, ARM core and LCD controller are almost always working at the same
> > time performing different tasks in parallel. I think I had already
> > explained these details in [1]
>
> Right. My point is that the numbers you're showing -- while very good,
> don't get me wrong -- won't necessarily have a huge direct impact on
> video playback. Particularly if you want to avoid tearing.

I have no idea what other proof would be enough for you. You already have
all the numbers, and even benchmarks with a patched xserver. They all
confirm the video output performance improvement.

> > So now the results of the tests are consistent - when doing video output,
> > most of ARM core cycles are spent in this 'omapCopyPlanarDataYUV420'
> > function.
>
> Well, either that, or just waiting for RFBI transfers to complete.

You need to wait a bit before displaying the next frame anyway, and the
period between frames for 30 fps video usually eclipses the transfer
completion time. If you want some numbers, a 640x480 YUV420 (12bpp) screen
update now takes 25ms without the tearsync flag enabled
(OMAPFB_FORMAT_FLAG_TEARSYNC for the OMAPFB_UPDATE_WINDOW ioctl) and
25-42ms with tearsync. For 30 fps video, the period between screen updates
is normally 33ms.
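As a rough illustration of the numbers above, here is a minimal sketch that compares the frame period with the measured screen update times. The function and constant names are made up for this example; only the 25ms and 42ms figures and the 30 fps rate come from the measurements quoted in this message.

```python
# Measured 640x480 YUV420 screen update times from this message (milliseconds).
UPDATE_MS_NO_TEARSYNC = 25.0      # without OMAPFB_FORMAT_FLAG_TEARSYNC
UPDATE_MS_TEARSYNC_WORST = 42.0   # upper end of the 25-42ms range with tearsync


def frame_period_ms(fps):
    """Time between two consecutive screen updates, in milliseconds."""
    return 1000.0 / fps


def headroom_ms(fps, update_ms):
    """Frame period minus screen update time.

    A negative value means a single update takes longer than one frame period.
    """
    return frame_period_ms(fps) - update_ms


if __name__ == "__main__":
    cases = (
        ("no tearsync", UPDATE_MS_NO_TEARSYNC),
        ("tearsync, worst case", UPDATE_MS_TEARSYNC_WORST),
    )
    for label, update_ms in cases:
        print(f"{label}: {headroom_ms(30, update_ms):+.1f} ms headroom at 30 fps")
```

The sign of the result is the interesting part: the update fits inside one 30 fps frame period without tearsync, but the worst case with tearsync does not.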

For playing video, we initiate the RFBI transfer, wait till it completes,
perform YV12->YUV420 color format conversion (which should take less than
4ms for 640x480 considering the benchmark results), wait till it is time to
display the next frame and start the RFBI transfer again. For 30 fps video
25ms+4ms is less than 33ms, so without tearsync enabled, any 640x480 video
should play fine (as far as video output performance is concerned).

With tearsync enabled, we have to add the time needed for vertical sync in
the LCD controller, which breaks our nice numbers. The worst case (17ms
wait for retrace + 25ms for the actual data transfer) takes more time than
the 33ms between frames. We can be saved if the LCD controller internal
refresh rate is really 60Hz. In this case video playback will automagically
synchronize to the LCD refresh rate and each frame will be processed in
exactly 2 LCD refresh cycles (by the time we want to display a video frame,
the next vertical retrace will be near and we will not lose much time
waiting for it). If the decoding time for each frame never exceeds 28-29ms
(which is a tough limitation, cpu usage is not uniform), video playback
without dropping any frames will be possible even with tearsync enabled.
That's what I'm investigating now. In any case, getting ideal 24 fps
playback will be a bit easier.

I hope all these explanations are clear now. And this is not just a theory,
but already confirmed by some experiments and practical tests.

> I'm still using Scratchbox 0.9.8.5 for day-to-day stuff ...

Thanks, that is what I would consider 'additional tips and tricks' :) It is
good to know that maemo 3.x development can also be done with an older
scratchbox (I have 0.9.8.8 installed now), I'll try it without upgrading
scratchbox then.

> > Well, anyway, everything worked perfectly and I could play 640x480 video
> > on N800 with the following statistics:
> >
> > VIDEO: [DIVX] 640x480 12bpp 23.976 fps 886.7 kbps (108.2 kbyte/s)
> > ...
> > BENCHMARKs: VC: 87,757s VO: 8,712s A: 1,314s Sys: 3,835s = 101,618s
> > BENCHMARK%: VC: 86,3592% VO: 8,5736% A: 1,2932% Sys: 3,7740% = 100,0000%
> > BENCHMARKn: disp: 2044 (20,11 fps) drop: 355 (14%) total: 2399 (23,61 fps)
> >
> > As you see, mplayer took 8.712 seconds to display 2044 VGA resolution
> > frames. If we do the necessary calculations, that's 72 millions pixels
> > per second, quite close to 'yv12_to_yuv420_line_armv6' capabilities
> > limit, so this function is the only major contributor to video output
> > time. Video output took much less time than decoding, so it proves that
> > video output overhead can be reduced to minimum (in this test tearsync
> > was not used though).
>
> I'd be curious to see the results from this with tearsync _enabled_?
> i.e., after your OMAPFB_UPDATE_WINDOW call, issue an OMAPFB_SYNC_GFX
> ioctl before you start writing to memory again. This is basically the
> limiter for us at this stage.

That's exactly how MPlayer works. It always waits on OMAPFB_SYNC_GFX before
filling the framebuffer with the data for the next frame. Not issuing
OMAPFB_SYNC_GFX would introduce *artificial* tearing not related to sync
with LCD refresh. Actually, for this 24 fps video, OMAPFB_SYNC_GFX is not a
problem. The detailed explanation with some numbers was posted above. When
I'm talking about tearsync, I'm talking exclusively about
OMAPFB_FORMAT_FLAG_TEARSYNC for screen update ioctls.

> > When tearsync comes into action, everything gets a bit more complicated.
> > I'm still investigating its impact on video playback performance.
>
> 'Not good'. :)

Video quality is still quite good even without tearsync (in my definition),
but not perfect. With your definition, tearsync is always enabled in
MPlayer anyway, on Nokia 770 too :)
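
For anyone who wants to check the throughput figure quoted from the benchmark (2044 VGA frames displayed in 8.712 seconds of video output time), here is a small sketch of the arithmetic. The helper names are made up for this example; the frame count, resolution and VO time are the ones from the MPlayer statistics quoted above.

```python
def pixels_per_second(frames, width, height, seconds):
    """Average number of pixels pushed through video output per second."""
    return frames * width * height / seconds


def ms_per_frame(frames, seconds):
    """Average video output time spent on one displayed frame, in ms."""
    return seconds * 1000.0 / frames


if __name__ == "__main__":
    # Figures from the quoted MPlayer benchmark: disp: 2044, VO: 8,712s, 640x480.
    rate = pixels_per_second(2044, 640, 480, 8.712)
    print(f"{rate / 1e6:.1f} million pixels per second")
    print(f"{ms_per_frame(2044, 8.712):.2f} ms of video output per frame")
```

This only restates the benchmark averages, it says nothing about how the time is distributed between individual frames.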