Recovery Mode: can't mount btrfs filesystem [answered]

asked 2017-12-29 13:09:52 +0200

Direc

Hi

My sister's phone, the original Jolla 1, hung while taking a photo and has not booted since. It shows the Jolla logo but goes no further. Free space shouldn't be an issue: I've "monitored" the phone every now and then and run balancing a few times. No hiccups until now, just this sudden death.

I can enter the Recovery Mode, open a shell and enter the device lock code, but that's about it. Trying to mount the home partition with mount -t btrfs -o subvol=@home /dev/mmcblk0p28 /mnt/my_home just hangs and doesn't return to the prompt. Pushing it into the background with & and running dmesg gave me this:

[  231.041995] device label sailfish devid 1 transid 1171428 /dev/mmcblk0p28
[  231.044010] btrfs: disk space caching is enabled
[  231.047886] btrfs: bdev /dev/mmcblk0p28 errs: wr 0, rd 0, flush 0, corrupt 0, gen 0
[  231.059056] Btrfs detected SSD devices, enabling SSD mode
[  231.059453] btrfs: free space inode generation (0) did not match free space cache generation (1151214)
[  231.060552] btrfs: free space inode generation (0) did not match free space cache generation (1151214)
[  231.060582] btrfs: free space inode generation (0) did not match free space cache generation (1151214)
[  231.060643] btrfs: free space inode generation (0) did not match free space cache generation (1151214)
[  231.060674] btrfs: free space inode generation (0) did not match free space cache generation (1151214)
[  231.060704] btrfs: free space inode generation (0) did not match free space cache generation (1151214)
[  231.061437] btrfs: free space inode generation (0) did not match free space cache generation (1151214)
[  231.062658] btrfs: free space inode generation (0) did not match free space cache generation (1151214)
[  231.135632] btrfs: corrupt leaf, slot offset bad: block=116940800,root=1, slot=20
[  231.135693] ------------[ cut here ]------------
[  231.135723] WARNING: at fs/btrfs/super.c:221 __btrfs_abort_transaction+0x38/0x98()
[  231.135723] btrfs: Transaction aborted
[  231.135723] Modules linked in:
[  231.135784] [<c010b73c>] (unwind_backtrace+0x0/0x118) from [<c0173780>] (warn_slowpath_common+0x4c/0x64)
[  231.135784] [<c0173780>] (warn_slowpath_common+0x4c/0x64) from [<c01737c4>] (warn_slowpath_fmt+0x2c/0x3c)
[  231.135815] [<c01737c4>] (warn_slowpath_fmt+0x2c/0x3c) from [<c032566c>] (__btrfs_abort_transaction+0x38/0x98)
[  231.135846] [<c032566c>] (__btrfs_abort_transaction+0x38/0x98) from [<c03335e0>] (__btrfs_free_extent+0x688/0x6b4)
[  231.135876] [<c03335e0>] (__btrfs_free_extent+0x688/0x6b4) from [<c0337d54>] (run_clustered_refs+0x7fc/0x90c)
[  231.135907] [<c0337d54>] (run_clustered_refs+0x7fc/0x90c) from [<c033808c>] (btrfs_run_delayed_refs+0x228/0x398)
[  231.135937] [<c033808c>] (btrfs_run_delayed_refs+0x228/0x398) from [<c0347620>] (btrfs_commit_transaction+0x98/0x914)
[  231.135968] [<c0347620>] (btrfs_commit_transaction+0x98/0x914) from [<c038120c>] (btrfs_recover_log_trees+0x348/0x3a0)
[  231.135998] [<c038120c>] (btrfs_recover_log_trees+0x348/0x3a0) from [<c03462a4>] (open_ctree+0x125c/0x13dc)
[  231.136029] [<c03462a4>] (open_ctree+0x125c/0x13dc) from [<c03250e8>] (btrfs_mount+0x4fc/0x87c)
[  231.136151] [<c03250e8>] (btrfs_mount+0x4fc/0x87c) from [<c023e734>] (mount_fs+0x10/0xb4)
[  231.136181] [<c023e734>] (mount_fs+0x10/0xb4) from [<c0253f88>] (vfs_kern_mount+0x48/0xb8)
[  231.136181] [<c0253f88>] (vfs_kern_mount+0x48/0xb8) from [<c0324e48>] (btrfs_mount+0x25c/0x87c)
[  231.136212] [<c0324e48>] (btrfs_mount+0x25c/0x87c) from [<c023e734>] (mount_fs+0x10/0xb4)
[  231.136242] [<c023e734>] (mount_fs+0x10/0xb4) from [<c0253f88>] (vfs_kern_mount+0x48/0xb8)
[  231.136273] [<c0253f88>] (vfs_kern_mount+0x48/0xb8) from [<c0254544>] (do_kern_mount+0x30/0xd4)
[  231.136303] [<c0254544>] (do_kern_mount+0x30/0xd4) from [<c0255c94>] (do_mount+0x534/0x674)
[  231.136303] [<c0255c94>] (do_mount+0x534/0x674) from [<c0255f30>] (sys_mount+0x84/0xc4)
[  231.136334] [<c0255f30>] (sys_mount+0x84/0xc4) from [<c0105c40>] (ret_fast_syscall+0x0/0x30)
[  231.136364] ---[ end trace da227214a82491b9 ]---
[  231.136364] BTRFS error (device mmcblk0p28) in __btrfs_free_extent:5186: IO failure
[  231.136395] btrfs: run_one_delayed_ref returned -5
[  231.136395] BTRFS error (device mmcblk0p28) in btrfs_run_delayed_refs:2466: IO failure
[  231.136395] BTRFS warning (device mmcblk0p28): Skipping commit of aborted transaction.
[  231.136425] BTRFS error (device mmcblk0p28) in cleanup_transaction:1226: IO failure
[  231.189683] btrfs: corrupt leaf, slot offset bad: block=116940800,root=1, slot=20
[  231.190019] btrfs: corrupt leaf, slot offset bad: block=116940800,root=1, slot=20

At this point I started to freak out on the inside a bit. Btrfs segfaulting can't be good. Running btrfs check /dev/mmcblk0p28 gives this:

Checking filesystem on /dev/mmcblk0p28
UUID: 86180ca0-d351-4551-b262-22b49e1adf47
checking extents
incorrect offsets 2924 4197228   # This line repeats some 40 times
bad block 116940800
Chunk[256, 228, 0]: length(4194304), offset(0), type(2) is not found in block group
Chunk[256, 228, 4194304]: length(8388608), offset(4194304), type(4) is not found in block group
Chunk[256, 228, 12582912]: length(8388608), offset(12582912), type(1) is not found in block group
Chunk[256, 228, 606076928]: length(230686720), offset(606076928), type(1) is not found in block group
Chunk[256, 228, 836763648]: length(230686720), offset(836763648), type(1) is not found in block group
Chunk[256, 228, 3676307456]: length(33554432), offset(3676307456), type(34) is not found in block group
Chunk[256, 228, 18742247424]: length(1073741824), offset(18742247424), type(1) is not found in block group
Chunk[256, 228, 25184698368]: length(1073741824), offset(25184698368), type(1) is not found in block group
Chunk[256, 228, 28405923840]: length(1073741824), offset(28405923840), type(1) is not found in block group
Errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
incorrect offsets 2924 4197228   # This line, too, repeats some 40 times
checking csums
checking root refs
checking quota groups
Segmentation fault

Running dmesg then gave me this:

[  870.416786] btrfs: unhandled page fault (11) at 0x026acb3a, code 0x005
[  870.416786] pgd = e88cc000
[  870.416816] [026acb3a] *pgd=00000000
[  870.416816] 
[  870.416816] Pid: 244, comm:                btrfs
[  870.416847] CPU: 1    Tainted: G        W     (3.4.108.20171017.1 #1)
[  870.416847] PC is at 0x47b00
[  870.416877] LR is at 0x1000
[  870.416877] pc : [<00047b00>]    lr : [<00001000>]    psr: 400f0030
[  870.416877] sp : bed78800  ip : 00400b39  fp : 00070c04
[  870.416908] r10: 00400b9e  r9 : 022ac1fe  r8 : 022abf30
[  870.416908] r7 : 026acb26  r6 : 00000272  r5 : 00000000  r4 : 06bff000
[  870.416938] r3 : 00000000  r2 : 00001000  r1 : 000000a8  r0 : 00000033
[  870.416938] Flags: nZcv  IRQs on  FIQs on  Mode USER_32  ISA Thumb  Segment user
[  870.416969] Control: 10c5787d  Table: a8acc06a  DAC: 00000015
[  870.416999] [<c010b73c>] (unwind_backtrace+0x0/0x118) from [<c010f658>] (__do_user_fault+0x6c/0xb4)
[  870.417030] [<c010f658>] (__do_user_fault+0x6c/0xb4) from [<c08a3624>] (do_page_fault+0x358/0x3e8)
[  870.417060] [<c08a3624>] (do_page_fault+0x358/0x3e8) from [<c01002f8>] (do_DataAbort+0x134/0x1a8)
[  870.417060] [<c01002f8>] (do_DataAbort+0x134/0x1a8) from [<c08a1e94>] (__dabt_usr+0x34/0x40)
[  870.417091] Exception stack(0xeefe1fb0 to 0xeefe1ff8)
[  870.417091] 1fa0:                                     00000033 000000a8 00001000 00000000
[  870.417121] 1fc0: 06bff000 00000000 00000272 026acb26 022abf30 022ac1fe 00400b9e 00070c04
[  870.417152] 1fe0: 00400b39 bed78800 00001000 00047b00 400f0030 ffffffff
[  870.417152] btrfs(244) send signal 11 to btrfs(244)

One thing worth mentioning: I do believe the uptime was several months, perhaps half a year. But there can't be that silly overflow bug in btrfs code that would ruin the whole filesystem, can there? :)

Hardware failure feels like a valid possibility, too, but is there anything even the service desk can do?

So, the phone in its current state doesn't boot, and I can neither mount nor check the filesystem in order to make any backups. Is there anything else I could try?


The question has been closed for the following reason "the question is answered, an answer was accepted" by Direc
close date 2017-12-29 17:26:50.754807

Comments

The output of your btrfs check command mentions a bad block. Have you tried running badblocks e.g. badblocks -n /dev/mmcblk0p28? I do not have a definitive answer for you, but you may want to read this thread if you haven't already.

Good luck!

joachim (2017-12-29 14:38:38 +0200)

Good idea, but badblocks isn't available in the recovery mode. Using btrfs scrub isn't an option either, since the filesystem can't be mounted... I ended up using dd to copy the partition to a microSD card, so I can try to fix/mount it using newer tools. Stay tuned...

Direc (2017-12-29 16:38:47 +0200)
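
A dd copy like the one mentioned above can be double-checked before any repair is attempted on it. The following is only a minimal sketch, assuming dd, wc, head and sha256sum are all available (the recovery shell may lack some of them, just as it lacks badblocks). copy_and_verify is a hypothetical helper name, and the paths in the usage comment are the ones from this thread and may differ on your device.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to DST with dd, then compare checksums of
# the first SIZE bytes of each, where SIZE is the length of SRC.
# DST may be larger than SRC (e.g. a bigger microSD partition).
copy_and_verify() {
    src=$1
    dst=$2
    # conv=notrunc only matters when DST is a regular file: keep its length.
    dd if="$src" of="$dst" bs=2M conv=notrunc 2>/dev/null || return 1
    sync
    # wc -c reads the whole source; slow on a large partition but portable.
    size=$(($(wc -c < "$src")))
    src_sum=$(sha256sum < "$src" | cut -d ' ' -f 1)
    dst_sum=$(head -c "$size" "$dst" | sha256sum | cut -d ' ' -f 1)
    if [ "$src_sum" = "$dst_sum" ]; then
        echo "copy verified: $size bytes"
    else
        echo "copy differs from source" >&2
        return 1
    fi
}

# Usage (assumed device names, run as root):
#   copy_and_verify /dev/mmcblk0p28 /dev/mmcblk1p1
```

A matching checksum only shows that the copy is faithful; it says nothing about whether the filesystem inside it is healthy.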

1 Answer

answered 2017-12-29 17:26:29 +0200

Direc

I managed to get the user data back using a newer set of btrfs tools. This is the basic idea of what I did.

Please note that your device and partition numbers may vary, and it is easy to brick your device in recovery mode, so proceed with thought and caution.

  1. Insert a large enough, partitioned microSD card and boot into recovery mode
  2. Copy the data partition to the microSD card with dd if=/dev/mmcblk0p28 of=/dev/mmcblk1p1 bs=2M
  3. Insert the microSD card into a computer running an up-to-date Linux with current btrfs tools
  4. Repair the filesystem with btrfs check --repair /dev/mmcblk0p1 (the card's device name on your computer may differ)
  5. Try to mount the filesystem with mount -t btrfs -o subvol=@home /dev/mmcblk0p1 /mnt/sailfish
  6. If the mount fails, go back to step 4
  7. If the mount succeeds, back up the files and/or folders
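
The repair-and-retry part of the steps above (4 to 6) can be sketched as a bounded loop. This is only an illustration, assuming it runs as root on the desktop against the microSD copy and not against the phone itself. repair_until_mounted is a hypothetical helper name, the limit of three passes is an arbitrary assumption, and the device and mount point are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: repeat "repair, then try to mount" up to MAX times.
# DEV is the copied partition on the microSD card, MNT an existing directory.
repair_until_mounted() {
    dev=$1
    mnt=$2
    max=${3:-3}   # assumed upper bound so the loop cannot run forever
    i=1
    while [ "$i" -le "$max" ]; do
        # btrfs check may exit non-zero when it finds errors, so its
        # status is not used to decide whether to continue.
        btrfs check --repair "$dev"
        if mount -t btrfs -o subvol=@home "$dev" "$mnt"; then
            echo "mounted after $i repair pass(es)"
            return 0
        fi
        i=$((i + 1))
    done
    echo "still not mountable after $max passes" >&2
    return 1
}

# Usage (assumed names):
#   repair_until_mounted /dev/mmcblk0p1 /mnt/sailfish
```

Because btrfs check --repair rewrites metadata, it is worth keeping the untouched original on the phone until the files have been backed up from the repaired copy.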

That's it. In the end it was the newer version of the btrfs tools that was able to repair the filesystem enough for the mount to finally succeed.


Comments

(mad scientist laugh here) My experiment worked! I assumed that just the filesystem was corrupted, because getting the partition out with dd didn't throw anything questionable to dmesg. Then I made a dd copy as a file from my empty, working Jolla 1 /dev/mmcblk0p28 to a microSD card and then just wrote it back to the broken one, overwriting the corrupted partition completely. Sync, unmount, exit, reboot... Lo and behold, it booted up!

Does anyone know if there are any problems with this? I checked the IMEI code from the settings and it's correct. My sister's Jolla account creation worked fine. I just installed Sailfish Utilities and I'm now installing Android support, so it seems that communication and authentication with the store (about which I know next to nothing) work as intended...

Next step, data restore.

Direc (2017-12-29 20:18:41 +0200)

@Direc, thank you very much. My Jolla 1 showed exactly these symptoms and mounting /dev/mmcblk0p28 in the recovery menu failed with a segfault.

  1. I copied the partition to an SD card partition that was large enough: dd if=/dev/mmcblk0p28 of=/dev/mmcblk1p1 bs=2M
  2. Inserted the SD card into my Linux computer (Xubuntu 16.04) and repaired the file system: btrfs check --repair /dev/mmcblk0p1
  3. Put the SD card back into the Jolla and restored the partition: dd if=/dev/mmcblk1p1 of=/dev/mmcblk0p28 bs=2097152 count=7038 followed by dd skip=14759755776 seek=14759755776 if=/dev/mmcblk1p1 of=/dev/mmcblk0p28 bs=1 count=2080256

I split the dd operation into 2 steps because the partition on the SD card was larger than the original one on the Jolla and I wanted to match the original size exactly. But copying single bytes is quite slow. It should be easier if the partition sizes match exactly.
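
The split described above follows from integer division of the target partition size by the block size. Here is a minimal sketch of that arithmetic, assuming the Jolla partition is 14761836032 bytes (the total implied by the two dd commands above) and assuming a 512-byte sector size for the tail pass. plan_exact_copy is a hypothetical helper name; it only prints dd parameters and copies nothing.

```shell
#!/bin/sh
# Hypothetical helper: plan a two-pass dd copy of exactly SIZE bytes.
# SIZE is the target partition size in bytes, e.g. as reported by
# `blockdev --getsize64 /dev/mmcblk0p28` where that tool is available.
plan_exact_copy() {
    size=$1
    bs=${2:-2097152}                    # bulk block size: 2 MiB, as above
    sector=512                          # assumed sector size for the tail
    count=$((size / bs))                # number of whole bulk blocks
    done_bytes=$((count * bs))          # bytes covered by the bulk pass
    tail_bytes=$((size - done_bytes))   # bytes left for the tail pass
    echo "bulk: bs=$bs count=$count"
    echo "tail: bs=$sector skip=$((done_bytes / sector)) seek=$((done_bytes / sector)) count=$((tail_bytes / sector))"
}

plan_exact_copy 14761836032
# bulk: bs=2097152 count=7038
# tail: bs=512 skip=28827648 seek=28827648 count=4063
```

The bulk line reproduces the count=7038 used above. Copying the tail in 512-byte blocks instead of bs=1 should be considerably faster, provided the partition size is a multiple of 512. A shell with 32-bit arithmetic would overflow on these numbers, so this is better run on the desktop than in the recovery shell.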

rweickelt (2019-08-04 16:53:54 +0200)

Stats

Asked: 2017-12-29 13:09:52 +0200

Seen: 2,386 times

Last updated: Dec 29 '17