[maemo-developers] Profiling on Nokia N810
From: Eero Tamminen eero.tamminen at nokia.com
Date: Fri Sep 5 18:29:30 EEST 2008
- Previous message: Profiling on Nokia N810
- Next message: Moderator speak up - is maemo multitouch thread ok or not
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,

Please keep the maemo-devel on CC.

ext Bruno wrote:
> I tried to use opreport, but the results are shown on the Nokia
> screen, and unreadable.

Sorry, I don't understand. How are they unreadable?
(You're using ssh to the device, aren't you?)

> On top of that it doesn't seem to give much
> information (I did opreport -c -l as said in the doc).

You need to have the debug symbol packages (available from the maemo
repositories) installed if you want function names.

If you want Oprofile call-graphs from ARM, you need to re-compile
all the related software with frame-pointers, but that's really too
much pain, especially considering how much nicer the Kcachegrind UI[1]
is for analysing them and the source code:
        http://kcachegrind.sourceforge.net/cgi-bin/show.cgi

Btw. Oprofile also has a UI:
        http://maemo.org/development/tools/doc/diablo/oprofileui/

You can run the UI on the desktop and connect to the oprofile-server
running on the device (the debug symbol files will still need to be
on the device for function names).

> Is it possible to have a text output with opreport ?

It outputs text by default.

> Concerning profiling on x86, the result will be similar to what I
> would get profiling on Nokia ?

Yes, in general that seems to be true.

> What's the point of profiling in a X86 scratchbox environment ? It
> would be the same on any normal linux x86 computer isn't it ?

The point is to get the same environment as on the target: the same
versions of all the libraries, an X server with a 16-bit display
(Xephyr), a certain set of X extensions etc. The more differences
you have in your environment, the less the results will correspond.

> I wanted to profile it on ARM because some part will be harder to
> process with that kind of architecture, especially floating point
> calculations parts I guess.
>
> But if this results are proportional to what I would get on the arm,
> then I will probably do that on x86 !

I don't know how true that is for floating point operations; ARM VFP
is a bit limited compared to Intel FPUs.

So, I would do speed measurements on ARM (you need those tests anyway
to validate the optimizations), find bottleneck functions on ARM with
Oprofile + debug symbol packages (or, instead of installing debug
packages, re-build the sources with "-g" added to the compiler flags)
and do the main work in performance analysis, i.e. getting an
understanding of how the code really works :-), with Kcachegrind
(AFAIK the best available open source perf tool, and important when
reading lots of code written by others).

It's important to use different methods as they can point out
different things. Valgrind/cachegrind profiles only a single process,
but Oprofile profiles the whole system. I.e. from Oprofile data you
can also see if you're stressing some other part of the system than
your own program, and then try to optimize your use of that.


	- Eero

> Bruno
>
>
> 2008/9/5, Eero Tamminen <eero.tamminen at nokia.com>:
>> Hi,
>>
>> ext Bruno wrote:
>>
>>> I've been trying to profile my program on the nokia N810 for some
>>> times now, and I'm not able to get good results. I installed the
>>> oprofile package and the oprofile modified kernel, and then tried to
>>> run gprof.
>>>
>>> I tried 2 ways to get my profiling information (my program is called
>>> src ... not really explicit I know ! ) :
>>>
>>> compile with normal compiler parameters, no -pg. Then :
>>>
>>> Nokia-N810:~# opcontrol --init
>>> Nokia-N810:~# opcontrol --no-vmlinux
>>> Nokia-N810:~# opcontrol -e=CPU_CYCLES:100000
>>> Nokia-N810:~# opcontrol --start
>>> Nokia-N810:~# ./src
>>> Nokia-N810:~# opcontrol --stop
>>> Nokia-N810:~# opgprof src
>>> Nokia-N810:~# gprof src > lala.txt
>>> gprof: gmon.out is file is missing call-graph data
>>> Nokia-N810:~# gprof -Q src > lala.txt
>>>
>> Why not just use "opreport" like suggested in the documentation:
>>
>> http://maemo.org/development/tools/doc/diablo/oprofile/
>> ?
>>
>> FYI: if I want callgraphs, I'll profile on x86 with Valgrind+callgrind
>> (in Scratchbox) and view the results with Kcachegrind (outside
>> Scratchbox). Callgrind gives *much* better callgraphs and UI/usability
>> than oprofile or gprof.
>>
>> This of course assumes that your source code works the same on ARM
>> and x86.
>>
>>
>> Summary:
>> - Oprofile for finding ARM bottleneck functions
>> - Timing code to measure the performance on ARM (profiling disturbs
>>   the code functionality so it's not to be trusted too much)
>> - Valgrind/Callgrind/Kcachegrind on x86 to _analyze_ the bottlenecks
>>   (why/how the bottlenecks are used by the running code)
>>
>> I've found that x86 profiling results are mostly accurate even for
>> ARM, it's rare for major bottlenecks to differ between these two
>> architectures (although that may happen due to cache size
>> differences and VFP vs. FPU) if you've otherwise guaranteed
>> that the execution environments match.
>>
>>
>> - Eero
>>
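
The three steps in the summary can be sketched as a small shell script.
This is an illustration only, not something posted in the thread: the
function names (oprofile_run, time_runs, callgrind_run) and the report
file name are made up for the example, the opcontrol settings are simply
the ones quoted in the thread, and time_runs assumes a "date" that
supports +%s, which gives only one-second resolution.

```shell
#!/bin/sh
# Sketch of the three-step workflow. Function and file names are
# hypothetical; only the tool invocations come from the thread or
# from the tools' standard usage.

# 1. On the device: find bottleneck functions with Oprofile.
#    Function names in the report need the debug symbol packages.
oprofile_run() {
    prog=$1; shift
    opcontrol --init
    opcontrol --no-vmlinux
    opcontrol -e=CPU_CYCLES:100000
    opcontrol --start
    "$prog" "$@"
    opcontrol --stop
    # opreport prints plain text on stdout, so it can be redirected
    opreport -l > oprofile-report.txt
}

# 2. On the device: time the program with no profiler attached.
#    Runs the given command N times and prints the elapsed whole seconds.
time_runs() {
    runs=$1; shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" > /dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}

# 3. On x86 (inside Scratchbox): record a call graph with Callgrind.
#    The result lands in callgrind.out.<pid>, to be opened with
#    kcachegrind outside Scratchbox.
callgrind_run() {
    valgrind --tool=callgrind "$@"
}
```

With these defined, "oprofile_run ./src" and "time_runs 20 ./src" would be
run on the N810, and "callgrind_run ./src" on the x86 target. Because of
the one-second resolution, the run count for time_runs should be high
enough that the total comfortably exceeds a few seconds.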