[maemo-users] N900 microSD card I/O errors and corruption

From: Paul Hartman paul.hartman+maemo at gmail.com
Date: Sun Apr 3 02:43:37 EEST 2011
On Wed, Mar 30, 2011 at 11:24 AM, Eero Tamminen <eero.tamminen at nokia.com> wrote:
> Hi,

Hi Eero, thanks for your response.

> On 03/29/2011 06:32 PM, ext Paul Hartman wrote:
>> I've got three microSD cards. They work fine on my PCs, I've done
>> read/write tests and data is not corrupted. But, in my N900, two of
>> the three are not stable, leading to corruption.
> Does it afterwards show as corrupted on the PC too?


> You aren't by any chance changing the cards by taking the back cover
> out without powering off your device first?

Definitely not that. The only time I ever remove my cover is to change
the card when I buy a new one, and I fully shut down the phone
before doing so. My USB port still works, so I don't need to remove
the battery on a regular basis.

> Opening the back cover does an emergency shutdown on disks in case
> user rips battery out next (that's apparently a common way to get
> "phone not reachable" message back to your boss/wife/dog when they
> call you, at least in some parts of the world).

I would never do that. I rarely make or receive phone calls. I just
checked call logs on my N900 and I have 6 calls since 1st of January.
Those "pull the battery" people should learn about offline mode, hey.

> Also, the back cover has a magnetic latch that's used for detecting
> when it's opened. If you have something magnetic next to your phone,
> it may cause phone to think that back cover is being opened. See:
>        https://bugs.maemo.org/show_bug.cgi?id=8235#c15

I'm aware of the magnetic switch, but I don't think it's that. I keep my
N900 on my desk and in my pocket, and I don't wear magnets or use cases
for it. The magnets on the back cover appear to be in place, and I don't
see any of the dmesg lines about the cover being opened/closed.

But I have some new information!

I've experimented more with the Adata card, and now I notice that the
errors are 100% reproducible. If I mkfs it, I get exactly the same
errors in dmesg (same blocks) every single time! So this makes me
really think it's a problem with the kernel drivers (or maybe the SD
controller itself) on the N900. I think if the card truly had
bad blocks, the wear leveling would cause the errors not to be the
same every time, and I'd also see them on my PC.
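To check the "same blocks every time" claim more rigorously, here is a
rough sketch that compares the failing block numbers from two saved
dmesg captures. It assumes the block layer reports failures in the
usual "end_request: I/O error, dev ..., sector N" or "Buffer I/O error
on device ..., logical block N" form; that is an assumption about the
log format, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the log line formats matched here are assumptions.

# failing_blocks FILE
# Print the sorted, de-duplicated sector/block numbers found in FILE.
failing_blocks() {
    sed -n \
        -e 's|.*I/O error.*sector \([0-9][0-9]*\).*|\1|p' \
        -e 's|.*logical block \([0-9][0-9]*\).*|\1|p' \
        "$1" | sort -n -u
}

# same_failures FILE_A FILE_B
# Succeed only if both captures contain errors and list the same blocks.
same_failures() {
    a=$(failing_blocks "$1")
    b=$(failing_blocks "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Usage would be something like saving "dmesg > run1.txt" after one mkfs,
rebooting, saving "dmesg > run2.txt" after the next one, and then
running: same_failures run1.txt run2.txt && echo "identical blocks"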

I formatted the cards using the SD Association's official formatting
tool on a MS Windows box, with full write erase over the whole device.
No errors and all went normally. Testing on my PC with 2 different
microSDHC card-readers I was not able to reproduce any corruption
after filling up the card, unmounting & removing, reinserting and
reading md5sum of the contents.
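For anyone who wants to repeat this kind of test, the following is a
minimal sketch of the write-then-verify procedure, not the exact
commands I ran. The function names, file names and the MD5SUMS list
are all made up for illustration; the mount point is whatever the
card is mounted on.

```shell
#!/bin/sh
# Sketch only: names and sizes are illustrative assumptions.

# write_test_files DIR COUNT SIZE_KB
# Fill DIR with COUNT files of random data and record their checksums.
write_test_files() {
    dir=$1 count=$2 size_kb=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        dd if=/dev/urandom of="$dir/test_$i.bin" bs=1024 \
            count="$size_kb" 2>/dev/null || return 1
        i=$((i + 1))
    done
    # Relative paths, so the list still works if the card is mounted
    # somewhere else next time.
    (cd "$dir" && md5sum test_*.bin > MD5SUMS) && sync
}

# verify_test_files DIR
# Re-read every file and compare against the recorded checksums.
# Returns non-zero if any file does not match.
verify_test_files() {
    (cd "$1" && md5sum -c MD5SUMS)
}
```

The important part is to unmount and physically reinsert the card
between write_test_files and verify_test_files, so that the data is
read back from the card and not from the page cache.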

A simple test, on my PC:

$ sudo mkfs.ext3 -v -Ldebian -m0 /dev/sdg4
mke2fs 1.41.14 (22-Dec-2010)
fs_types for mke2fs.conf resolution: 'ext3'
Calling BLKDISCARD from 0 to 2612002816 failed.
Filesystem label=debian
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
159680 inodes, 637696 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=654311424
20 block groups
32768 blocks per group, 32768 fragments per group
7984 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 30 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

$ sudo fsck -v -f /dev/sdg4
fsck from util-linux 2.19
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

      11 inodes used (0.01%)
       0 non-contiguous files (0.0%)
       0 non-contiguous directories (0.0%)
         # of inodes with ind/dind/tind blocks: 0/0/0
   27369 blocks used (4.29%)
       0 bad blocks
       1 large file

       0 regular files
       2 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
       0 symbolic links (0 fast symbolic links)
       0 sockets
       2 files

So there are no errors found, and nothing shows up in dmesg. The new
partition is intact and works normally if I copy files, flush caches,
read back and checksum them on my PC.

However, when I perform the same thing on my N900, dmesg is full of
"-110" errors (that I posted in my first message), and fsck
immediately following mkfs finds errors in the new filesystem! That's
not good...

With my SanDisk Class 2 card, the test works successfully on the N900.
I even use it for a swap partition, so there's a constant and heavy
I/O load on the card, and I've never had any problems with it. So I
don't think it's a hardware problem with the cover or magnets.

When I googled it, I found some very similar reports with the
same dmesg lines on similar hardware (such as Pandora), along with
discussions about SD card problems related to voltages. That's why I
thought it might be relevant. But I have no idea how to determine the
voltage in use at any given time. Maybe it's a red herring. Maybe it's a
timing issue, where the sdhci driver cannot properly identify some
cards' capabilities (or the card misrepresents its own attributes)
and so doesn't set the proper clock speed/bandwidth/whatever.
Based on the error messages I got in dmesg, I think something
like this might be responsible.
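One possible way to see the voltage and clock actually in use: kernels
built with debugfs support expose the MMC host state in a per-host
"ios" file. I don't know whether the stock N900 kernel has this
enabled, so treat the following as a sketch under that assumption. The
function name is made up, and the optional argument is only there so a
different debugfs mount point can be given.

```shell
#!/bin/sh
# Sketch only: assumes CONFIG_DEBUG_FS and a mounted debugfs.

# show_mmc_ios [DEBUGFS_ROOT]
# Print clock, voltage, bus width and timing for every MMC host found.
# Returns non-zero if no readable ios file exists.
show_mmc_ios() {
    root=${1:-/sys/kernel/debug}
    found=1
    for f in "$root"/mmc*/ios; do
        [ -r "$f" ] || continue
        echo "== $f"
        grep -E '^(clock|vdd|bus width|timing spec):' "$f"
        found=0
    done
    return "$found"
}
```

If debugfs is not mounted yet, "mount -t debugfs none /sys/kernel/debug"
as root should do it. Running show_mmc_ios with the good card and then
with the bad card inserted would show whether the host negotiates a
different clock or voltage for each.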

The N900 kernel is about 2 years old, I think, so there have probably
been a lot of patches and glitch workarounds added to the sdhci driver
since then. Maybe they can be backported to the 2.6.28 kernel.
However, if the SD controller itself in the N900 is unable to handle
these cards, then I don't know if any driver will help.

MeeGo has a recent kernel (in fact, the reason I bought this fast card
was for testing MeeGo), so maybe someday, if I can install MeeGo to the
internal memory of the N900, I can test the SD card with it. But for
now I don't think I am able to do that.

And then, to confuse the matter even more, we have this person's story
about having 2 N900s, where one specific SD card works fine in one
device but doesn't work in the other, which might indicate a hardware
problem (or a different revision?):

My N900 is the USA version, purchased in December 2009, made in Korea,
and has revision 2101. In case IMEI can reveal anything useful about
the manufacturing batch I can tell you in private email.

Thanks for your suggestions.