diff mbox

UBIFS oops after remount ro

Message ID 1290785057.2552.14.camel@localhost
State New, archived
Headers show

Commit Message

Artem Bityutskiy Nov. 26, 2010, 3:24 p.m. UTC
On Fri, 2010-11-26 at 14:50 +0100, Wolfgang Wegner wrote:
> [<c0371e40>] (mutex_lock+0x4/0x14) from [<c015cfc4>] (make_reservation+0x74/0x364)
> [<c015cfc4>] (make_reservation+0x74/0x364) from [<c015d79c>] (ubifs_jnl_write_inode+0x80/0x1e4)
> [<c015d79c>] (ubifs_jnl_write_inode+0x80/0x1e4) from [<c0163c14>] (ubifs_write_inode+0x5c/0xbc)
> [<c0163c14>] (ubifs_write_inode+0x5c/0xbc) from [<c00cfbec>] (writeback_single_inode+0x120/0x238)
> [<c00cfbec>] (writeback_single_inode+0x120/0x238) from [<c00d08a8>] (writeback_inodes_wb+0x3d4/0x4a8)
> [<c00d08a8>] (writeback_inodes_wb+0x3d4/0x4a8) from [<c00d0ad4>] (wb_writeback+0x158/0x1ec)
> [<c00d0ad4>] (wb_writeback+0x158/0x1ec) from [<c00d0cb8>] (wb_do_writeback+0x6c/0x1cc)
> [<c00d0cb8>] (wb_do_writeback+0x6c/0x1cc) from [<c00d0e38>] (bdi_writeback_task+0x20/0x98)
> [<c00d0e38>] (bdi_writeback_task+0x20/0x98) from [<c0099020>] (bdi_start_fn+0x8c/0x100)
> [<c0099020>] (bdi_start_fn+0x8c/0x100) from [<c00543bc>] (kthread+0x7c/0x84)
> [<c00543bc>] (kthread+0x7c/0x84) from [<c002744c>] (kernel_thread_exit+0x0/0x8)
> Code: ebfffd21 e28dd014 e8bd80f0 e3a03000 (e1001093)
> ---[ end trace 162376f104dd0abc ]---

Well, for some reason the write-back code thinks UBIFS still has dirty
inodes, but UBIFS was re-mounted R/O and it should not have dirty
inodes. I do not know where the bug is.

> I am a bit puzzled about this all.
> - is the flush-ubifs_0_1 process expected to run after the filesystem
>   has been mounted read-only?

Yes, in .32 it wakes up every 5 seconds. But it should find that there
is nothing to do and go sleep. In newer kernels it does not wake up
unless there is something to do.

> - What can I do to further debug this?

Difficult to say.

Firs of all, try to enable UBIFS debugging - just CONFIG_UBIFS_FS_DEBUG,
not messages. Is the problem still reproducible?

Also, it is interesting where exactly wb_writeback() is called - there
are 2 places.

And it is interesting which inode number is being written by write-back
code. And is it the same every time you oops or not?

Try to reproduce with the following patch:

Comments

Wolfgang Wegner Nov. 26, 2010, 5:29 p.m. UTC | #1
Hi,

thank you for the hints!

I am currently trying to reproduce the error even without modified
kernel - it seems I changed enough of my init procedure to somehow
avoid the trigger condition. :-(

As soon as I can reproduce the error, I will try what you suggested.

Best regards,
Wolfgang
Wolfgang Wegner Nov. 29, 2010, 1:18 p.m. UTC | #2
Hi Artem,

I can now reproduce the Oops with CONFIG_UBIFS_FS_DEBUG and your
patch. However, I had to manually apply the hunks because my tree
seems to differ enough for patch to not apply it automatically...

My observations so far:
- I have to re-flash the whole UBI image, simply removing the
  directory that is created with the script is not sufficient
  to trigger the Oops on subsequent re-start.
  My guess is that not only the single remount r/w->r/o  cycle
  from the first script is needed but the complete sequence of
  2 r/w->modify->r/o "cycles" currently in my startup scripts.
  Hopefully it's not really depending on filesystem internal state...
- I have to power cycle the board after flashing.
  (I flash from an NFS-root system) Simply issuing a reboot from
  the NFS-root system does result in a clean restart. When stopping
  the boot progress in the U-Boot prompt and issuing 'reset' from
  there sometimes also triggers the Oops on the following re-start
  with the newly flashed image, but doing a power cycle gave a 100%
  chance to get the oops in my tests.

The inode in question is always the same, but I guess this is not
a surprise as I always start with the same UBI Image?

I have to think about this a bit, still having no clue what is
going on and not too much a clue about what I am looking for either,
so sorry for just posting this weird information...

Regards,
Wolfgang

Here's the oops in the current state with all debugging output:

qqq: kupdated for ubifs
qqq: writing inode 3298
Unable to handle kernel NULL pointer dereference at virtual address 000000ac
pgd = c0004000
[000000ac] *pgd=00000000
Internal error: Oops: 817 [#1] PREEMPT
last sysfs file: /sys/devices/virtual/net/lo/flags
Modules linked in:
CPU: 0    Not tainted  (2.6.32-svn21303 #3)
PC is at mutex_lock+0x4/0x14
LR is at make_reservation+0x74/0x6b0
pc : [<c038cd20>]    lr : [<c015d23c>]    psr: 60000013
sp : df061d28  ip : df82e150  fp : 000000ac
r10: 00000000  r9 : 00000000  r8 : 00000001
r7 : 00000088  r6 : df82e000  r5 : df4f8310  r4 : df060000
r3 : 00000000  r2 : 00000000  r1 : df061d28  r0 : 000000ac
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0005397f  Table: 1f090000  DAC: 00000017
Process flush-ubifs_0_1 (pid: 552, stack limit = 0xdf060270)
Stack: (0xdf061d28 to 0xdf062000)
1d20:                   00000001 c002d058 00000000 dfb4a07c 00000080 00000000
1d40: df82e14c 000000a0 00000088 00000001 00000000 dfa1c600 00000000 df82e000
1d60: df4f8310 000000a0 00000000 00000000 00000000 dfa1c600 00000000 c015de84
1d80: c05b2798 00000002 c05b2840 00000000 df922680 00000021 df4f8310 00000001
1da0: df82e000 df4f8458 00000010 00000000 00000000 c0166dd0 df4f8310 00000001
1dc0: 00000b2d c099eb40 00000010 df4f8310 df060000 c0162310 c099eb40 df061f10
1de0: df4f83ac df4f83ac 00000000 c099eb40 df82e008 ffffffff 00000400 df061e3c
1e00: 00000000 c00899f4 df061f10 c008a3a4 0000000e df979900 df4f83ac 00000000
1e20: 00000001 00000001 00000001 c00899e0 df4f83ac 00000000 00000009 00000001
1e40: 00000000 c099eb40 df978030 c04d1290 8f468953 c04d177c 0000000a c05b2798
1e60: 0000000a 00000018 00000017 c0206b44 0000000a c0503cfe c05b2798 00000001
1e80: 0000000a df4f8310 df061f10 df061f10 00000000 df4f83ac 00000000 00000007
1ea0: df82e090 c00cfb64 20000013 df82e068 df061f10 df060000 df4f8310 df82e088
1ec0: 00000000 c00d0868 00000000 dfa8a840 00000000 ffff9cc8 00000007 df4f84a0
1ee0: df4f8318 dfa8a800 df978000 df060000 df061f70 df82e068 00000000 df82e090
1f00: 00000014 df82e008 00000000 c00d0a94 df82e008 00000000 00000000 df061f4c
1f20: 00000400 00000000 00000000 00000000 00000000 00000000 00000014 00000000
1f40: 00000000 df82e0a8 00000000 ffff9110 00000000 df82e068 df061f70 0000092e
1f60: 00000000 00000000 df82e0a8 c00d0dd0 0000092e 00000000 00000000 00000003
1f80: c0048bd4 000001f4 ffff8c14 df82e068 df82e068 00000000 00000000 00000000
1fa0: 00000000 c00d0e1c df060000 df82e008 df82e068 c0098fe0 00000000 df8d9f40
1fc0: df061fd4 c0098f54 df82e068 c00543bc 00000000 00000000 df061fd8 df061fd8
1fe0: 00000000 00000000 00000000 00000000 00000000 c002744c 00000000 00000000
[<c038cd20>] (mutex_lock+0x4/0x14) from [<c015d23c>] (make_reservation+0x74/0x6b0)
[<c015d23c>] (make_reservation+0x74/0x6b0) from [<c015de84>] (ubifs_jnl_write_inode+0xfc/0x270)
[<c015de84>] (ubifs_jnl_write_inode+0xfc/0x270) from [<c0166dd0>] (ubifs_write_inode+0x110/0x19c)
[<c0166dd0>] (ubifs_write_inode+0x110/0x19c) from [<c0162310>] (ubifs_writepage+0x234/0x284)
[<c0162310>] (ubifs_writepage+0x234/0x284) from [<c00899f4>] (__writepage+0x14/0x64)
[<c00899f4>] (__writepage+0x14/0x64) from [<c008a3a4>] (write_cache_pages+0x234/0x384)
[<c008a3a4>] (write_cache_pages+0x234/0x384) from [<c00cfb64>] (writeback_single_inode+0xd8/0x238)
[<c00cfb64>] (writeback_single_inode+0xd8/0x238) from [<c00d0868>] (writeback_inodes_wb+0x3d4/0x4a8)
[<c00d0868>] (writeback_inodes_wb+0x3d4/0x4a8) from [<c00d0a94>] (wb_writeback+0x158/0x1ec)
[<c00d0a94>] (wb_writeback+0x158/0x1ec) from [<c00d0dd0>] (wb_do_writeback+0x1c4/0x1f0)
[<c00d0dd0>] (wb_do_writeback+0x1c4/0x1f0) from [<c00d0e1c>] (bdi_writeback_task+0x20/0x98)
[<c00d0e1c>] (bdi_writeback_task+0x20/0x98) from [<c0098fe0>] (bdi_start_fn+0x8c/0x100)
[<c0098fe0>] (bdi_start_fn+0x8c/0x100) from [<c00543bc>] (kthread+0x7c/0x84)
[<c00543bc>] (kthread+0x7c/0x84) from [<c002744c>] (kernel_thread_exit+0x0/0x8)
Code: ebfffd21 e28dd014 e8bd80f0 e3a03000 (e1001093) 
---[ end trace 60c4657087d3ea99 ]---

And here is the complete console log in case I missed some
important information:

UBIFS: mounted UBI device 0, volume 1, name "rootfs_live"
UBIFS: mounted read-only
UBIFS: file system size:   65802240 bytes (64260 KiB, 62 MiB, 510 LEBs)
UBIFS: journal size:       9033728 bytes (8822 KiB, 8 MiB, 71 LEBs)
UBIFS: media format:       w4/r0 (latest is w4/r0)
UBIFS: default compressor: zlib
UBIFS: reserved for root:  0 bytes (0 KiB)
VFS: Mounted root (ubifs filesystem) readonly on device 0:13.
Freeing init memory: 120K
UBIFS: mounted UBI device 0, volume 2, name "configfiles"
UBIFS: mounted read-only
UBIFS: file system size:   9160704 bytes (8946 KiB, 8 MiB, 71 LEBs)
UBIFS: journal size:       9033728 bytes (8822 KiB, 8 MiB, 71 LEBs)
UBIFS: media format:       w4/r0 (latest is w4/r0)
UBIFS: default compressor: zlib
UBIFS: reserved for root:  0 bytes (0 KiB)
UBIFS: mounted UBI device 0, volume 3, name "configfilesdefault"
UBIFS: mounted read-only
UBIFS: file system size:   9160704 bytes (8946 KiB, 8 MiB, 71 LEBs)
UBIFS: journal size:       9033728 bytes (8822 KiB, 8 MiB, 71 LEBs)
UBIFS: media format:       w4/r0 (latest is w4/r0)
UBIFS: default compressor: zlib
UBIFS: reserved for root:  0 bytes (0 KiB)
UBIFS: mounted UBI device 0, volume 4, name "userdata"
UBIFS: mounted read-only
UBIFS: file system size:   9160704 bytes (8946 KiB, 8 MiB, 71 LEBs)
UBIFS: journal size:       9033728 bytes (8822 KiB, 8 MiB, 71 LEBs)
UBIFS: media format:       w4/r0 (latest is w4/r0)
UBIFS: default compressor: zlib
UBIFS: reserved for root:  0 bytes (0 KiB)
  PID USER       VSZ STAT COMMAND
    1 root      3284 S    init       
    2 root         0 SW   [kthreadd]
    3 root         0 SW   [ksoftirqqqq: ubifs_remount_rw
d/0]
    4 root[<c002b91c>] (unwind_backtrace+0x0/0xd0) from [<c0165e40>] (ubifs_remount_fs+0x138/0x89c)
         0 SW   [<c0165e40>] (ubifs_remount_fs+0x138/0x89c) from [<c00b449c>] (do_remount_sb+0xa4/0xe0)
[watchdog/0]
  [<c00b449c>] (do_remount_sb+0xa4/0xe0) from [<c00cbde8>] (do_mount+0x20c/0x74c)
  5 root        [<c00cbde8>] (do_mount+0x20c/0x74c) from [<c00cc3ac>] (sys_mount+0x84/0xc4)
 0 SW   [events/[<c00cc3ac>] (sys_mount+0x84/0xc4) from [<c0026a00>] (ret_fast_syscall+0x0/0x28)
0]
    6 root         0 SW   [khelper]
    9 root         0 SW   [async/mgr]
  139 root         0 SW   [sync_supers]
  141 root         0 SWUBIFS DBG (pid 547): ubifs_bg_thread: background thread "ubifs_bgt0_1" started, PID 547
   [bdi-default]
  143 root         0 SW   [kblockd/0]
  150 root         0 SW   [ata/0]
  151 root         0 SW   [ata_aux]
  159 root         0 SW   [khubd]
  162 root         0 SW   [kseriod]
  165 root         0 SW   [kmmcd]
  185 root         0 SW   [rpciod/0]
  193 root         0 SW   [khungtaskd]
  194 root         0 SW   [kswapd0]
  243 root         0 SW   [aio/0]
  251 root         0 SW   [nfsiod]
  257 rootqqq: wb work for ubifs
         0 SW   qqq: wb work for ubifs
[crypto/0]
  39qqq: kupdated for ubifs
7 root         0 SW   [scsi_eh_0]
  400 root         0 SW   [scsi_eh_1]
  407 root         0 SW   [mtdblockd]
  432 root         0 SW   [ubi_bgt0d]
  433 root         0 SW   [spi_gpio.0]
  490 root         0 SW   [mv_crypto]
  527 root         0 SW   [usbhid_resumerqqq: wb work for ubifs
]
  537 root   qqq: kupdated for ubifs
   2344 S    /biqqq: wb work for ubifs
n/sh /etc/init.dqqq: kupdated for ubifs
/rcS 
  545 root      3288 R    ps 
adjusting modules
qqq: wb work for ubifs
qqq: kupdated for ubifs
qqq: wb work for ubifs
qqq: kupdated for ubifs
UBIFS DBG (pid 547): ubifs_bg_thread: background thread "ubifs_bgt0_1" stops
  PID USER       VSZ STAT COMMAND
    1 root      3284 S    init       
    2 root         0 SW   [kthreadd]
    3 root         0 SW   [ksoftirqd/0]
    4 root         0 SW   [watchdog/0]
    5 root         0 SW   [events/0]
    6 root         0 SW   [khelper]
    9 root         0 SW   [async/mgr]
  139 root         0 SW   [sync_supers]
  141 root         0 SW   [bdi-default]
  143 root         0 SW   [kblockd/0]
  150 root         0 SW   [ata/0]
  151 root         0 SW   [ata_aux]
  159 root         0 SW   [khubd]
  162 root         0 SW   [kseriod]
  165 root         0 SW   [kmmcd]
  185 root         0 SW   [rpciod/0]
  193 root         0 SW   [khungtaskd]
  194 root         0 SW   [kswapd0]
  243 root         0 SW   [aio/0]
  251 root         0 SW   [nfsiod]
  257 rqqq: ubifs_remount_rw
oot         0 SW[<c002b91c>] (unwind_backtrace+0x0/0xd0) from [<c0165e40>] (ubifs_remount_fs+0x138/0x89c)
   [crypto/0]
 [<c0165e40>] (ubifs_remount_fs+0x138/0x89c) from [<c00b449c>] (do_remount_sb+0xa4/0xe0)
 397 root       [<c00b449c>] (do_remount_sb+0xa4/0xe0) from [<c00cbde8>] (do_mount+0x20c/0x74c)
  0 SW   [scsi_e[<c00cbde8>] (do_mount+0x20c/0x74c) from [<c00cc3ac>] (sys_mount+0x84/0xc4)
h_0]
  400 root[<c00cc3ac>] (sys_mount+0x84/0xc4) from [<c0026a00>] (ret_fast_syscall+0x0/0x28)
         0 SW   [scsi_eh_1]
  407 root         0 SW   [mtdblockd]
  432 root         0 SW   [ubi_bgt0d]
  433 root         0 SW   [spi_gpio.0UBIFS DBG (pid 563): ubifs_bg_thread: background thread "ubifs_bgt0_2" started, PI3
]
  490 root         0 SW   [mv_crypto]
  527 qqq: ubifs_remount_rw
root         0 SW   [usbhid_resu[<c002b91c>] (unwind_backtrace+0x0/0xd0) from [<c0165e40>] (ubifs_remount_fs+0x138/0x89)
mer]
  537 root[<c0165e40>] (ubifs_remount_fs+0x138/0x89c) from [<c00b449c>] (do_remount_sb+0xa4/0xe0)
      2344 S    [<c00b449c>] (do_remount_sb+0xa4/0xe0) from [<c00cbde8>] (do_mount+0x20c/0x74c)
/bin/sh /etc/ini[<c00cbde8>] (do_mount+0x20c/0x74c) from [<c00cc3ac>] (sys_mount+0x84/0xc4)
t.d/rcS 
  552 [<c00cc3ac>] (sys_mount+0x84/0xc4) from [<c0026a00>] (ret_fast_syscall+0x0/0x28)
root         0 SW   [flush-ubifs_0_1]
  555 root      3288 R    ps 
/etc/localconfig/deviceid exists: 454d32e8-397e-4c33-9356-5595c550b07b
/etc/localconfig/dUBIFS DBG (pid 565): ubifs_bg_thread: background thread "ubifs_bgt0_3" started, PID 565
isplayid exists: 454d32e8-397e-4c33-9356-5595c550b07b
Warning! remounting /etc/localconfig rw!
Warning! remounting /etc/localconfigdefault rw!
new /etc/localconfig/hostname:Artista_0A:D6:13
Warning! remounting /etc/localconfigdefault ro!
qqq: kupdated for ubifs
qqq: wb work for ubifs
qqq: wb work for ubifs
qqq: kupdated for ubifs
UBIFS DBG (pid 565): ubifs_bg_thread: background thread "ubifs_bgt0_3" stops
Warning! remounting /etc/localconfig ro!
qqq: wb work for ubifs
qqq: kupdated for ubifs
qqq: wb work for ubifs
qqq: kupdated for ubifs
UBIFS DBG (pid 563): ubifs_bg_thread: background thread "ubifs_bgt0_2" stops
Setting up network interfaces...
##################### Setting up network #####################
Setting up loopback device...done
Setting hostname to Artista_0A:D6:13...Setting up network interface eth0...
given parameter is:  static
cat: can't open '/etc/localconfig//nameserver': No such file or directory
cat: can't open '/etc/localconfig//domain': No such file or directory
Using static configuration for eth0
        IP Address:     192.168.2.16
        Netmask:        255.255.255.0
        Nameserver:
        Hostname:       Artista_0A:D6:13
        Domain:
        Gateway:        192.168.2.10
deleting routes...
route: SIOCDELRT: No such process
adding routes
qqq: ubifs_remount_rw
[<c002b91c>] (unwind_backtrace+0x0/0xd0) from [<c0165e40>] (ubifs_remount_fs+0x138/0x89c)
[<c0165e40>] (ubifs_remount_fs+0x138/0x89c) from [<c00b449c>] (do_remount_sb+0xa4/0xe0)
[<c00b449c>] (do_remount_sb+0xa4/0xe0) from [<c00cbde8>] (do_mount+0x20c/0x74c)
[<c00cbde8>] (do_mount+0x20c/0x74c) from [<c00cc3ac>] (sys_mount+0x84/0xc4)
[<c00cc3ac>] (sys_mount+0x84/0xc4) from [<c0026a00>] (ret_fast_syscall+0x0/0x28)
UBIFS DBG (pid 597): ubifs_bg_thread: background thread "ubifs_bgt0_2" started, PID 597
qqq: wb work for ubifs
qqq: kupdated for ubifs
eth0: link up, 1000 Mb/s, full duplex, flow control disabled
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: wb work for ubifs
qqq: kupdated for ubifs
UBIFS DBG (pid 597): ubifs_bg_thread: background thread "ubifs_bgt0_2" stops
set value gateway to 
192.168.2.10
cat: can't open '/etc/localconfig//domain': No such file or directory
cat: can't open '/etc/localconfig//resolv.conf': No such file or directory
cat: can't open '/etc/localconfig//host.conf': No such file or dqqq: ubifs_remount_rw
irectory
[<c002b91c>] (unwind_backtrace+0x0/0xd0) from [<c0165e40>] (ubifs_remount_fs+0x138/0x89c)
[<c0165e40>] (ubifs_remount_fs+0x138/0x89c) from [<c00b449c>] (do_remount_sb+0xa4/0xe0)
[<c00b449c>] (do_remount_sb+0xa4/0xe0) from [<c00cbde8>] (do_mount+0x20c/0x74c)
[<c00cbde8>] (do_mount+0x20c/0x74c) from [<c00cc3ac>] (sys_mount+0x84/0xc4)
[<c00cc3ac>] (sys_mount+0x84/0xc4) from [<c0026a00>] (ret_fast_syscall+0x0/0x28)
UBIFS DBG (pid 606): ubifs_bg_thread: background thread "ubifs_bgt0_2" started, PID 606
qqq: wb work for ubifs
qqq: kupdated for ubifs
qqq: wb work for ubifs
qqq: kupdated for ubifs
UBIFS DBG (pid 606): ubifs_bg_thread: background thread "ubifs_bgt0_2" stops
set value host.conf to multi on
cat: can't open '/etc/localconfig//hosts': No such file or direcqqq: ubifs_remount_rw
tory
[<c002b91c>] (unwind_backtrace+0x0/0xd0) from [<c0165e40>] (ubifs_remount_fs+0x138/0x89c)
[<c0165e40>] (ubifs_remount_fs+0x138/0x89c) from [<c00b449c>] (do_remount_sb+0xa4/0xe0)
[<c00b449c>] (do_remount_sb+0xa4/0xe0) from [<c00cbde8>] (do_mount+0x20c/0x74c)
[<c00cbde8>] (do_mount+0x20c/0x74c) from [<c00cc3ac>] (sys_mount+0x84/0xc4)
[<c00cc3ac>] (sys_mount+0x84/0xc4) from [<c0026a00>] (ret_fast_syscall+0x0/0x28)
UBIFS DBG (pid 612): ubifs_bg_thread: background thread "ubifs_bgt0_2" started, PID 612
qqq: wb work for ubifs
qqq: kupdated for ubifs
qqq: wb work for ubifs
qqq: kupdated for ubifs
UBIFS DBG (pid 612): ubifs_bg_thread: background thread "ubifs_bgt0_2" stops
set value hosts to #/etc/hosts
127.0.0.1 localhost
127.0.1.1 Artista_0A:D6:13
Interface eth0 is set up
##################### Network is set up #####################
done
Starting udev/mdev...
done
Starting dbus...
done
Starting hal...
done
Starting webserver...
 mini_httpd started
done
Starting ftp server...
chmod: /opt/persistent/shows/push: Read-only file system
link /var/www/.htpasswd exists
Warning! remounting / rw!
qqq: ubifs_remount_rw
[<c002b91c>] (unwind_backtrace+0x0/0xd0) from [<c0165e40>] (ubifs_remount_fs+0x138/0x89c)
[<c0165e40>] (ubifs_remount_fs+0x138/0x89c) from [<c00b449c>] (do_remount_sb+0xa4/0xe0)
[<c00b449c>] (do_remount_sb+0xa4/0xe0) from [<c00cbde8>] (do_mount+0x20c/0x74c)
[<c00cbde8>] (do_mount+0x20c/0x74c) from [<c00cc3ac>] (sys_mount+0x84/0xc4)
[<c00cc3ac>] (sys_mount+0x84/0xc4) from [<c0026a00>] (ret_fast_syscall+0x0/0x28)
warning: `proftpd' uses 32-bit capabilities (legacy support in use)
UBIFS DBG (pid 702): ubifs_bg_thread: background thread "ubifs_bgt0_1" started, PID 702
link /var/www/cgi-bin/.htpasswd was created
Warning! remountingqqq: wb work for ubifs
 / ro!
qqq: kupdated for ubifs
qqq: wb work for ubifs
qqq: kupdated for ubifs
done
WARNING: starting telnetd! (only for lab usage!)
UBIFS DBG (pid 702): ubifs_bg_thread: background thread "ubifs_bgt0_1" stops
done
starting netclient
done
socket: Address family not supported by protocol
/sbin/mini_httpd: started as root without requesting chroot(), warning only
sh-3.2# qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: kupdated for ubifs
qqq: writing inode 3298
Unable to handle kernel NULL pointer dereference at virtual address 000000ac
pgd = c0004000
[000000ac] *pgd=00000000
Internal error: Oops: 817 [#1] PREEMPT
last sysfs file: /sys/devices/virtual/net/lo/flags
Modules linked in:
CPU: 0    Not tainted  (2.6.32-svn21303 #3)
PC is at mutex_lock+0x4/0x14
LR is at make_reservation+0x74/0x6b0
pc : [<c038cd20>]    lr : [<c015d23c>]    psr: 60000013
sp : df061d28  ip : df82e150  fp : 000000ac
r10: 00000000  r9 : 00000000  r8 : 00000001
r7 : 00000088  r6 : df82e000  r5 : df4f8310  r4 : df060000
r3 : 00000000  r2 : 00000000  r1 : df061d28  r0 : 000000ac
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0005397f  Table: 1f090000  DAC: 00000017
Process flush-ubifs_0_1 (pid: 552, stack limit = 0xdf060270)
Stack: (0xdf061d28 to 0xdf062000)
1d20:                   00000001 c002d058 00000000 dfb4a07c 00000080 00000000
1d40: df82e14c 000000a0 00000088 00000001 00000000 dfa1c600 00000000 df82e000
1d60: df4f8310 000000a0 00000000 00000000 00000000 dfa1c600 00000000 c015de84
1d80: c05b2798 00000002 c05b2840 00000000 df922680 00000021 df4f8310 00000001
1da0: df82e000 df4f8458 00000010 00000000 00000000 c0166dd0 df4f8310 00000001
1dc0: 00000b2d c099eb40 00000010 df4f8310 df060000 c0162310 c099eb40 df061f10
1de0: df4f83ac df4f83ac 00000000 c099eb40 df82e008 ffffffff 00000400 df061e3c
1e00: 00000000 c00899f4 df061f10 c008a3a4 0000000e df979900 df4f83ac 00000000
1e20: 00000001 00000001 00000001 c00899e0 df4f83ac 00000000 00000009 00000001
1e40: 00000000 c099eb40 df978030 c04d1290 8f468953 c04d177c 0000000a c05b2798
1e60: 0000000a 00000018 00000017 c0206b44 0000000a c0503cfe c05b2798 00000001
1e80: 0000000a df4f8310 df061f10 df061f10 00000000 df4f83ac 00000000 00000007
1ea0: df82e090 c00cfb64 20000013 df82e068 df061f10 df060000 df4f8310 df82e088
1ec0: 00000000 c00d0868 00000000 dfa8a840 00000000 ffff9cc8 00000007 df4f84a0
1ee0: df4f8318 dfa8a800 df978000 df060000 df061f70 df82e068 00000000 df82e090
1f00: 00000014 df82e008 00000000 c00d0a94 df82e008 00000000 00000000 df061f4c
1f20: 00000400 00000000 00000000 00000000 00000000 00000000 00000014 00000000
1f40: 00000000 df82e0a8 00000000 ffff9110 00000000 df82e068 df061f70 0000092e
1f60: 00000000 00000000 df82e0a8 c00d0dd0 0000092e 00000000 00000000 00000003
1f80: c0048bd4 000001f4 ffff8c14 df82e068 df82e068 00000000 00000000 00000000
1fa0: 00000000 c00d0e1c df060000 df82e008 df82e068 c0098fe0 00000000 df8d9f40
1fc0: df061fd4 c0098f54 df82e068 c00543bc 00000000 00000000 df061fd8 df061fd8
1fe0: 00000000 00000000 00000000 00000000 00000000 c002744c 00000000 00000000
[<c038cd20>] (mutex_lock+0x4/0x14) from [<c015d23c>] (make_reservation+0x74/0x6b0)
[<c015d23c>] (make_reservation+0x74/0x6b0) from [<c015de84>] (ubifs_jnl_write_inode+0xfc/0x270)
[<c015de84>] (ubifs_jnl_write_inode+0xfc/0x270) from [<c0166dd0>] (ubifs_write_inode+0x110/0x19c)
[<c0166dd0>] (ubifs_write_inode+0x110/0x19c) from [<c0162310>] (ubifs_writepage+0x234/0x284)
[<c0162310>] (ubifs_writepage+0x234/0x284) from [<c00899f4>] (__writepage+0x14/0x64)
[<c00899f4>] (__writepage+0x14/0x64) from [<c008a3a4>] (write_cache_pages+0x234/0x384)
[<c008a3a4>] (write_cache_pages+0x234/0x384) from [<c00cfb64>] (writeback_single_inode+0xd8/0x238)
[<c00cfb64>] (writeback_single_inode+0xd8/0x238) from [<c00d0868>] (writeback_inodes_wb+0x3d4/0x4a8)
[<c00d0868>] (writeback_inodes_wb+0x3d4/0x4a8) from [<c00d0a94>] (wb_writeback+0x158/0x1ec)
[<c00d0a94>] (wb_writeback+0x158/0x1ec) from [<c00d0dd0>] (wb_do_writeback+0x1c4/0x1f0)
[<c00d0dd0>] (wb_do_writeback+0x1c4/0x1f0) from [<c00d0e1c>] (bdi_writeback_task+0x20/0x98)
[<c00d0e1c>] (bdi_writeback_task+0x20/0x98) from [<c0098fe0>] (bdi_start_fn+0x8c/0x100)
[<c0098fe0>] (bdi_start_fn+0x8c/0x100) from [<c00543bc>] (kthread+0x7c/0x84)
[<c00543bc>] (kthread+0x7c/0x84) from [<c002744c>] (kernel_thread_exit+0x0/0x8)
Code: ebfffd21 e28dd014 e8bd80f0 e3a03000 (e1001093) 
---[ end trace 60c4657087d3ea99 ]---
qqq: kupdated for ubifs
qqq: kupdated for ubifs
Artem Bityutskiy Nov. 29, 2010, 2:21 p.m. UTC | #3
On Mon, 2010-11-29 at 14:18 +0100, Wolfgang Wegner wrote:
> Hi Artem,
> 
> I can now reproduce the Oops with CONFIG_UBIFS_FS_DEBUG and your
> patch.

OK, I thought one of ubi_asserts() would trigger. They do not.

>  However, I had to manually apply the hunks because my tree
> seems to differ enough for patch to not apply it automatically...

I sent the patch against the ubifs-v2.6.32 tree, which has all the UBIFS
patches back-ported, and which I recommend to use:

http://linux-mtd.infradead.org/doc/ubifs.html#L_source

(hmm, need to update that page, now there are more back-port trees)

> And here is the complete console log in case I missed some
> important information:

Well, because of the ps output, it is difficult to read kernel prints.
Also, the situation becomes more difficult because you have several
UBIFS file-systems, so many messages are irrelevant. Would be nice to
print them only for the instance which oopses.

> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs
> qqq: kupdated for ubifs

So, not 100% sure, but probably this is kuptdated writeback. For some
reason mm thinks ubifs has dirty data, although it should not have,
because re-mounting path has full sync.

> qqq: writing inode 3298

Does this always happen for inode 3298? Or the inode number changes?
Wolfgang Wegner Nov. 29, 2010, 5:02 p.m. UTC | #4
Hi Artem,

On Mon, Nov 29, 2010 at 06:45:57PM +0200, Artem Bityutskiy wrote:
> 
> I cannot judge for sure, but this looks like non-UBIFS issue in 2.6.32,
> so merging the stable 2.6.32.X tree can help. I make sense in general to
> merge stable tree as well.

the problem is I still have some "marvell" tree
(from git.marvell.com/orion.git) and am not sure if there is anything
special in it.

Would the "stock" 2.6.35.x from
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
be an option in your opinion, too? I seem to remember the marvell tree
was completely merged in between (right now digging through the commits
to make it clear), so I could get an all shiny, new kernel when having
to do any kind of merge at all...

Regards,
Wolfgang
Wolfgang Wegner Nov. 29, 2010, 5:23 p.m. UTC | #5
On Mon, Nov 29, 2010 at 06:02:23PM +0100, Wolfgang Wegner wrote:
> 
> Would the "stock" 2.6.35.x from
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

oops... s/2.6.35/2.6.36/g
thanks

Wolfgang
Artem Bityutskiy Dec. 2, 2010, 3:35 a.m. UTC | #6
On Mon, 2010-11-29 at 18:02 +0100, Wolfgang Wegner wrote:
> Hi Artem,
> 
> On Mon, Nov 29, 2010 at 06:45:57PM +0200, Artem Bityutskiy wrote:
> > 
> > I cannot judge for sure, but this looks like non-UBIFS issue in 2.6.32,
> > so merging the stable 2.6.32.X tree can help. I make sense in general to
> > merge stable tree as well.
> 
> the problem is I still have some "marvell" tree
> (from git.marvell.com/orion.git) and am not sure if there is anything
> special in it.

Mostly vendors change drivers and board files. They usually do not touch
core functionality. So you can try to merge 2.6.32

> Would the "stock" 2.6.35.x from
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> be an option in your opinion, too? I seem to remember the marvell tree
> was completely merged in between (right now digging through the commits
> to make it clear), so I could get an all shiny, new kernel when having
> to do any kind of merge at all...

Probably 2.6.35 would be good.

Anyway, the problem is that I do not really have time to drive you
through debugging of your issue. I only have time to get "big" results
from you and provide some help in form of my opinion :-(
Artem Bityutskiy Dec. 2, 2010, 3:41 a.m. UTC | #7
On Thu, 2010-12-02 at 05:35 +0200, Artem Bityutskiy wrote:
> On Mon, 2010-11-29 at 18:02 +0100, Wolfgang Wegner wrote:
> > Hi Artem,
> > 
> > On Mon, Nov 29, 2010 at 06:45:57PM +0200, Artem Bityutskiy wrote:
> > > 
> > > I cannot judge for sure, but this looks like non-UBIFS issue in 2.6.32,
> > > so merging the stable 2.6.32.X tree can help. I make sense in general to
> > > merge stable tree as well.
> > 
> > the problem is I still have some "marvell" tree
> > (from git.marvell.com/orion.git) and am not sure if there is anything
> > special in it.
> 
> Mostly vendors change drivers and board files. They usually do not touch
> core functionality. So you can try to merge 2.6.32

Err, I meant 2.6.32.x, of course, the so-called -stable tree which Greg
KH maintains.
Wolfgang Wegner Dec. 2, 2010, 9:17 a.m. UTC | #8
Hi Artem,

On Thu, Dec 02, 2010 at 05:35:34AM +0200, Artem Bityutskiy wrote:
> 
> Mostly vendors change drivers and board files. They usually do not touch
> core functionality. So you can try to merge 2.6.32

there had in fact been some special ARM and/or kirkwood things in this
2.6.32 release, but meanwhile this is all integrated into mainline, as
far as I can see.

> Probably 2.6.35 would be good.

I merged our (very few) local changes into 2.6.36 and am now running
this.

> Anyway, the problem is that I do not really have time to drive you
> through debugging of your issue. I only have time to get "big" results
> from you and provide some help in form of my opinion :-(

Thank you very much for your hints and help so far!

The big result I can give is that I already got this oops once with
2.6.36, too - but could not reproduce it. I also have some other thing
to fix right now, but in case I get back to the problem, I will of
course post results (if any) - thanks to you I now got some pointers
where I can start debugging, this is already great and valuable help.

Regards,
Wolfgang
Artem Bityutskiy Dec. 2, 2010, 12:20 p.m. UTC | #9
On Thu, 2010-12-02 at 10:17 +0100, Wolfgang Wegner wrote:
> Hi Artem,
> 
> On Thu, Dec 02, 2010 at 05:35:34AM +0200, Artem Bityutskiy wrote:
> > 
> > Mostly vendors change drivers and board files. They usually do not touch
> > core functionality. So you can try to merge 2.6.32
> 
> there had in fact been some special ARM and/or kirkwood things in this
> 2.6.32 release, but meanwhile this is all integrated into mainline, as
> far as I can see.
> 
> > Probably 2.6.35 would be good.
> 
> I merged our (very few) local changes into 2.6.36 and am now running
> this.
> 
> > Anyway, the problem is that I do not really have time to drive you
> > through debugging of your issue. I only have time to get "big" results
> > from you and provide some help in form of my opinion :-(
> 
> Thank you very much for your hints and help so far!
> 
> The big result I can give is that I already got this oops once with
> 2.6.36, too - but could not reproduce it. I also have some other thing
> to fix right now, but in case I get back to the problem, I will of
> course post results (if any) - thanks to you I now got some pointers
> where I can start debugging, this is already great and valuable help.

The same oops? If yes, then this is not 2.6.32-specific issue, then you
should preserve your 2.6.32 setup and dig further, I think. Having a
setup where you can reproduce the bug is very nice thing :-)
diff mbox

Patch

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 9d5360c..44ebcdf 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -908,6 +908,7 @@  long wb_do_writeback(struct bdi_writeback *wb, int force_wait)
 		if (args.sync_mode == WB_SYNC_NONE)
 			wb_clear_pending(wb, work);
 
+		printk(KERN_DEBUG "qqq: wb work for %s\n", bdi->name);
 		wrote += wb_writeback(wb, &args);
 
 		/*
@@ -921,6 +922,7 @@  long wb_do_writeback(struct bdi_writeback *wb, int force_wait)
 	/*
 	 * Check for periodic writeback, kupdated() style
 	 */
+	printk(KERN_DEBUG "qqq: kupdated for %s\n", bdi->name);
 	wrote += wb_check_old_data_flush(wb);
 
 	return wrote;
diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index 914f1bd..6b0e41c 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -124,6 +124,7 @@  static int reserve_space(struct ubifs_info *c, int jhead, int len)
 	 */
 	ubifs_assert(!c->ro_media && !c->ro_mount);
 	squeeze = (jhead == BASEHD);
+
 again:
 	mutex_lock_nested(&wbuf->io_mutex, wbuf->jhead);
 
@@ -779,6 +780,11 @@  int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode)
 	if (!ino)
 		return -ENOMEM;
 
+	if (!c->jheads) {
+		/* We are about to oops, so here we can print useful info */
+		printk(KERN_DEBUG "qqq: writing inode %lu\n", inode->i_ino);
+	}
+
 	/* Make reservation before allocating sequence numbers */
 	err = make_reservation(c, BASEHD, len);
 	if (err)
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 52f5627..752f4e4 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -1533,6 +1533,9 @@  static int ubifs_remount_rw(struct ubifs_info *c)
 		return -EROFS;
 	}
 
+	printk(KERN_DEBUG "qqq: %s\n", __func__);
+	dump_stack();
+
 	mutex_lock(&c->umount_mutex);
 	dbg_save_space_info(c);
 	c->remounting_rw = 1;
@@ -1678,6 +1681,8 @@  static void ubifs_remount_ro(struct ubifs_info *c)
 
 	ubifs_assert(!c->need_recovery);
 	ubifs_assert(!c->ro_mount);
+	printk(KERN_DEBUG "qqq: %s\n", __func__);
+	dump_stack();
 
 	mutex_lock(&c->umount_mutex);
 	if (c->bgt) {