[maemo-developers] debugging spontaneous reboot issues with N800/N810

From: Eero Tamminen eero.tamminen at nokia.com
Date: Mon Nov 26 18:20:49 EET 2007

ext Aleksandr Koltsoff wrote:
> Thanks Kalle & Eero for the enlightening comments & info, much appreciated.
> Eero Tamminen wrote:
>> ext Kalle Valo wrote:
>>>> 3) Asterisk marks the application that caused the last wd timeout
>>>> operation? (so in this case, the spontaneous reboot was caused by dsp_dld
>>>> or it at least seems so?)
> Any ideas on the asterisk though? It's only ever set on one name at a time.

AFAIK it means that restarting the service (10 times in a row) didn't
help, so the device was rebooted.

dsp_dld (along with other DSP clients) exits each time the DSP is reset.

>>> There is no way to know what cause hardware watchdog reboot. It can be
>>> a problem in kernel, or some userspace application taking all the CPU
>>> time.
> Right. So debugging a spontaneous system reboot is pretty impossible
> after the hw watchdog triggers, if I understand correctly.

If you have syslog installed, it might contain some information from just
before the reset (e.g. if there was a kernel Oops).
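
As a rough sketch of how one could look for that, the snippet below scans
saved syslog lines for text that looks like a kernel Oops and keeps a few
lines of context before each hit. The patterns and the helper name
find_oops_context are assumptions made for illustration, not something the
device or syslog provides; adjust them to what your kernel actually logs.

```python
import re
from collections import deque

# Assumed patterns for Oops-like kernel messages; extend as needed.
OOPS_PATTERN = re.compile(r"Oops|Unable to handle kernel|Kernel panic")


def find_oops_context(lines, before=5):
    """Return (line_number, context) pairs for each Oops-like line.

    `context` holds up to `before` preceding lines plus the matching line.
    """
    hits = []
    recent = deque(maxlen=before)  # sliding window of preceding lines
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if OOPS_PATTERN.search(line):
            hits.append((number, list(recent) + [line]))
        recent.append(line)
    return hits
```

To use it, open the log file (its location depends on how syslog is set up
on the device) and pass the file object to find_oops_context(); the last
hit before the reboot is the interesting one.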

> Any ideas on how to proceed though? I'd like to post a bug, but without
> any ideas on what is causing the reboot, the bug report isn't going to
> be very useful to anyone.

An easily repeatable use-case would be best... :-)

>> Some process just taking all CPU doesn't cause HW watchdog reboot.
>> Some OOM-protected[1] process (such as Desktop) taking all memory
>> (e.g. due to a leak in an applet) so that kernel spends all time
>> just shuffling memory pages can cause it though.
>> [1] OOM-protected = processes ignored by the kernel out of memory
>>     killer. Applications (such as Browser) are not OOM-protected.
>>     If they use too much memory, kernel kills them to protect
>>     rest of the system.  (before this allocations are denied to
>>     them, but this doesn't help if other processes in the system
>>     need more memory too)
> Only processes on the OOM-protected list will be protected, right? This
> would imply that by default any other process would fall under the
> OOM-killer in the kernel (which makes sense).

Right.  The processes having -17 in their /proc/PID/oom_adj are
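
To see which processes currently carry that value, something along these
lines should work. It is only a sketch: it assumes Python is available on
the device and that /proc/PID/oom_adj is readable, and the helper names
(is_oom_protected, oom_protected_processes) are made up for this example.

```python
import os

OOM_DISABLE = -17  # oom_adj value that exempts a process from the OOM killer


def is_oom_protected(oom_adj_text):
    """Return True if the contents of an oom_adj file equal -17."""
    try:
        return int(oom_adj_text.strip()) == OOM_DISABLE
    except ValueError:
        return False


def read_first_line(path):
    """Read the first line of a file, or return '' if it cannot be read."""
    try:
        with open(path, errors="replace") as f:
            return f.readline()
    except OSError:
        return ""


def oom_protected_processes(proc_root="/proc"):
    """Return sorted (pid, name) pairs for processes with oom_adj == -17."""
    protected = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        base = os.path.join(proc_root, entry)
        if is_oom_protected(read_first_line(os.path.join(base, "oom_adj"))):
            # /proc/PID/cmdline separates arguments with NUL bytes
            cmdline = read_first_line(os.path.join(base, "cmdline"))
            name = cmdline.split("\0")[0] or "?"
            protected.append((int(entry), name))
    return protected and sorted(protected)
```

Calling oom_protected_processes() and printing each pair lists the
protected PIDs with the first word of their command line; a process that
exits while being read is simply skipped.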

	- Eero
