Patchwork [BUG] fatal hang untarring 90GB file, possibly writeback related.

login
register
mail settings
Submitter Mel Gorman
Date April 28, 2011, 7:21 p.m.
Message ID <20110428192104.GA4658@suse.de>
Download mbox | patch
Permalink /patch/93278/
State Not Applicable
Headers show

Comments

Mel Gorman - April 28, 2011, 7:21 p.m.
On Thu, Apr 28, 2011 at 01:30:36PM -0500, James Bottomley wrote:
> On Thu, 2011-04-28 at 18:18 +0100, Mel Gorman wrote:
> > On Thu, Apr 28, 2011 at 11:56:17AM -0500, James Bottomley wrote:
> > > # Events: 6K cycles
> > > #
> > > # Overhead      Command        Shared Object                                   Symbol
> > > # ........  ...........  ...................  .......................................
> > > #
> > >     20.41%      kswapd0  [kernel.kallsyms]    [k] shrink_slab
> > >                 |
> > >                 --- shrink_slab
> > >                    |          
> > >                    |--99.91%-- kswapd
> > >                    |          kthread
> > >                    |          kernel_thread_helper
> > >                     --0.09%-- [...]
> > > 
> > 
> > Ok. I can't see how the patch "mm: vmscan: reclaim order-0 and use
> > compaction instead of lumpy reclaim" is related unless we are seeing
> > two problems that happen to manifest in a similar manner.
> > 
> > However, there were a number of changes made to dcache in particular
> > for 2.6.38. Specifically thinks like dentry_kill use trylock and is
> > happy to loop around if it fails to acquire anything. See things like
> > this for example;
> 
> OK, so for this, I tried a 2.6.37 kernel.  It doesn't work very well,
> networking is hosed for no reason I can see (probably systemd / cgroups
> problems).
> 
> However, it runs enough for me to say that the tar proceeds to
> completion in a non-PREEMPT kernel.  (I tried several times for good
> measure).  That makes this definitely a regression of some sort, but it
> doesn't definitively identify the dcache code ... it could be an ext4
> bug that got introduced in 2.6.38 either.
> 

True, it could be any shrinker and dcache is just a guess.

> > <SNIP>
> > 
> > Way hey, cgroups are also in the mix. How jolly.
> > 
> > Is systemd a common element of the machines hitting this bug by any
> > chance?
> 
> Well, yes, the bug report is against FC15, which needs cgroups for
> systemd.
> 

Ok although we do not have direct evidence that it's the problem yet. A
broken shrinker could just mean we are also trying to aggressively
reclaim in cgroups.

> > The remaining traces seem to be follow-on damage related to the three
> > issues of "shrinkers are bust in some manner" causing "we are not
> > getting over the min watermark" and as a side-show "we are spending lots
> > of time doing something unspecified but unhelpful in cgroups".
> 
> Heh, well find a way for me to verify this: I can't turn off cgroups
> because systemd then won't work and the machine won't boot ...
> 

Same testcase, same kernel but a distro that is not using systemd to
verify if cgroups are the problem. Not ideal I know. When I'm back
online Tuesday, I'll try reproducing this on a !Fedora distribution. In
the meantime, the following untested hatchet job might spit out
which shrinker we are getting stuck in. It is also breaking out of
the shrink_slab loop so it'd even be interesting to see if the bug
is mitigated in any way.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley - April 28, 2011, 7:59 p.m.
On Thu, 2011-04-28 at 20:21 +0100, Mel Gorman wrote:
> On Thu, Apr 28, 2011 at 01:30:36PM -0500, James Bottomley wrote:
> > > Way hey, cgroups are also in the mix. How jolly.
> > > 
> > > Is systemd a common element of the machines hitting this bug by any
> > > chance?
> > 
> > Well, yes, the bug report is against FC15, which needs cgroups for
> > systemd.
> > 
> 
> Ok although we do not have direct evidence that it's the problem yet. A
> broken shrinker could just mean we are also trying to aggressively
> reclaim in cgroups.
> 
> > > The remaining traces seem to be follow-on damage related to the three
> > > issues of "shrinkers are bust in some manner" causing "we are not
> > > getting over the min watermark" and as a side-show "we are spending lots
> > > of time doing something unspecified but unhelpful in cgroups".
> > 
> > Heh, well find a way for me to verify this: I can't turn off cgroups
> > because systemd then won't work and the machine won't boot ...
> > 
> 
> Same testcase, same kernel but a distro that is not using systemd to
> verify if cgroups are the problem. Not ideal I know. When I'm back
> online Tuesday, I'll try reproducing this on a !Fedora distribution. In
> the meantime, the following untested hatchet job might spit out
> which shrinker we are getting stuck in. It is also breaking out of
> the shrink_slab loop so it'd even be interesting to see if the bug
> is mitigated in any way.

Actually, talking to Chris, I think I can get the system up using
init=/bin/bash without systemd, so I can try the no cgroup config.

> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c74a501..ed99104 100644

In the mean time, this patch produces:

(that's nothing ... apparently the trace doesn't activate when kswapd
goes mad).

James


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mel Gorman - April 28, 2011, 8:27 p.m.
On Thu, Apr 28, 2011 at 02:59:27PM -0500, James Bottomley wrote:
> On Thu, 2011-04-28 at 20:21 +0100, Mel Gorman wrote:
> > On Thu, Apr 28, 2011 at 01:30:36PM -0500, James Bottomley wrote:
> > > > Way hey, cgroups are also in the mix. How jolly.
> > > > 
> > > > Is systemd a common element of the machines hitting this bug by any
> > > > chance?
> > > 
> > > Well, yes, the bug report is against FC15, which needs cgroups for
> > > systemd.
> > > 
> > 
> > Ok although we do not have direct evidence that it's the problem yet. A
> > broken shrinker could just mean we are also trying to aggressively
> > reclaim in cgroups.
> > 
> > > > The remaining traces seem to be follow-on damage related to the three
> > > > issues of "shrinkers are bust in some manner" causing "we are not
> > > > getting over the min watermark" and as a side-show "we are spending lots
> > > > of time doing something unspecified but unhelpful in cgroups".
> > > 
> > > Heh, well find a way for me to verify this: I can't turn off cgroups
> > > because systemd then won't work and the machine won't boot ...
> > > 
> > 
> > Same testcase, same kernel but a distro that is not using systemd to
> > verify if cgroups are the problem. Not ideal I know. When I'm back
> > online Tuesday, I'll try reproducing this on a !Fedora distribution. In
> > the meantime, the following untested hatchet job might spit out
> > which shrinker we are getting stuck in. It is also breaking out of
> > the shrink_slab loop so it'd even be interesting to see if the bug
> > is mitigated in any way.
> 
> Actually, talking to Chris, I think I can get the system up using
> init=/bin/bash without systemd, so I can try the no cgroup config.
> 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index c74a501..ed99104 100644
> 
> In the mean time, this patch produces:
> 
> (that's nothing ... apparently the trace doesn't activate when kswapd
> goes mad).
> 

Or is looping there for shorter than we expect. HZ/10?
James Bottomley - April 28, 2011, 9:12 p.m.
On Thu, 2011-04-28 at 14:59 -0500, James Bottomley wrote:
> Actually, talking to Chris, I think I can get the system up using
> init=/bin/bash without systemd, so I can try the no cgroup config.

OK, so a non-PREEMPT non-CGROUP kernel has survived three back to back
runs of untar without locking or getting kswapd pegged, so I'm pretty
certain this is cgroups related.  The next steps are to turn cgroups
back on but try disabling the memory and IO controllers.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley - April 28, 2011, 10:43 p.m.
On Thu, 2011-04-28 at 16:12 -0500, James Bottomley wrote:
> On Thu, 2011-04-28 at 14:59 -0500, James Bottomley wrote:
> > Actually, talking to Chris, I think I can get the system up using
> > init=/bin/bash without systemd, so I can try the no cgroup config.
> 
> OK, so a non-PREEMPT non-CGROUP kernel has survived three back to back
> runs of untar without locking or getting kswapd pegged, so I'm pretty
> certain this is cgroups related.  The next steps are to turn cgroups
> back on but try disabling the memory and IO controllers.

I tried non-PREEMPT CGROUP but disabled GROUP_MEM_RES_CTLR.

The results are curious:  the tar does complete (I've done three back to
back).  However, I did get one soft lockup in kswapd (below).  But the
system recovers instead of halting I/O and hanging like it did
previously.

The soft lockup is in shrink_slab, so perhaps it's a combination of slab
shrinker and cgroup memory controller issues?

James

---
[  670.823843] BUG: soft lockup - CPU#2 stuck for 67s! [kswapd0:46]
[  670.825472] Modules linked in: netconsole configfs cpufreq_ondemand acpi_cpufreq freq_table mperf snd_hda_codec_hdmi snd_hda_codec_conexant arc4 snd_hda_intel btusb snd_hda_codec snd_hwdep iwlagn snd_seq mac80211 bluetooth snd_seq_device uvcvideo snd_pcm cfg80211 wmi microcode e1000e videodev xhci_hcd rfkill snd_timer iTCO_wdt v4l2_compat_ioctl32 iTCO_vendor_support pcspkr i2c_i801 snd soundcore snd_page_alloc joydev uinput ipv6 sdhci_pci sdhci mmc_core i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[  670.830864] CPU 2 
[  670.830881] Modules linked in: netconsole configfs cpufreq_ondemand acpi_cpufreq freq_table mperf snd_hda_codec_hdmi snd_hda_codec_conexant arc4 snd_hda_intel btusb snd_hda_codec snd_hwdep iwlagn snd_seq mac80211 bluetooth snd_seq_device uvcvideo snd_pcm cfg80211 wmi microcode e1000e videodev xhci_hcd rfkill snd_timer iTCO_wdt v4l2_compat_ioctl32 iTCO_vendor_support pcspkr i2c_i801 snd soundcore snd_page_alloc joydev uinput ipv6 sdhci_pci sdhci mmc_core i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[  670.838385] 
[  670.840289] Pid: 46, comm: kswapd0 Not tainted 2.6.39-rc4+ #3 LENOVO 4170CTO/4170CTO
[  670.842193] RIP: 0010:[<ffffffff810e07cb>]  [<ffffffff810e07cb>] shrink_slab+0x86/0x166
[  670.844063] RSP: 0018:ffff88006eea5da0  EFLAGS: 00000206
[  670.845881] RAX: 0000000000000000 RBX: ffff88006eea5de0 RCX: 0000000000000002
[  670.847652] RDX: 0000000000000000 RSI: ffff88006eea5d60 RDI: ffff88006eea5d60
[  670.849394] RBP: ffff88006eea5de0 R08: 000000000000000c R09: 0000000000000000
[  670.851091] R10: 0000000000000001 R11: 000000000000005f R12: ffffffff8147b50e
[  670.852733] R13: ffff8801005e6e00 R14: 0000000000000010 R15: 0000000000017fb6
[  670.854351] FS:  0000000000000000(0000) GS:ffff880100280000(0000) knlGS:0000000000000000
[  670.855968] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  670.857555] CR2: 00000037d90ae040 CR3: 0000000001a03000 CR4: 00000000000406e0
[  670.859138] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  670.860720] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  670.862320] Process kswapd0 (pid: 46, threadinfo ffff88006eea4000, task ffff88006eeb0000)
[  670.863932] Stack:
[  670.865477]  0000000000000001 0000000000000080 ffff880000000002 ffff8801005e6e00
[  670.867023]  ffff8801005e6000 0000000000000002 0000000000000000 000000000000000c
[  670.868558]  ffff88006eea5ee0 ffffffff810e308c 0000000000000003 ffff88006eeb0000
[  670.870120] Call Trace:
[  670.871652]  [<ffffffff810e308c>] kswapd+0x4f0/0x774
[  670.873218]  [<ffffffff810e2b9c>] ? try_to_free_pages+0xe5/0xe5
[  670.874786]  [<ffffffff8106ce57>] kthread+0x84/0x8c
[  670.876327]  [<ffffffff8147bc64>] kernel_thread_helper+0x4/0x10
[  670.877871]  [<ffffffff8106cdd3>] ? kthread_worker_fn+0x148/0x148
[  670.879403]  [<ffffffff8147bc60>] ? gs_change+0x13/0x13
[  670.880932] Code: 83 eb 10 e9 ce 00 00 00 44 89 f2 31 f6 48 89 df ff 13 48 63 4b 08 4c 63 e8 48 8b 45 c8 31 d2 48 f7 f1 31 d2 49 0f af c5 49 f7 f7 
[  670.881086]  03 43 20 48 85 c0 48 89 43 20 79 18 48 8b 33 48 89 c2 48 c7 
[  670.884285] Call Trace:
[  670.885884]  [<ffffffff810e308c>] kswapd+0x4f0/0x774
[  670.887462]  [<ffffffff810e2b9c>] ? try_to_free_pages+0xe5/0xe5
[  670.889031]  [<ffffffff8106ce57>] kthread+0x84/0x8c
[  670.890578]  [<ffffffff8147bc64>] kernel_thread_helper+0x4/0x10
[  670.892130]  [<ffffffff8106cdd3>] ? kthread_worker_fn+0x148/0x148
[  670.893653]  [<ffffffff8147bc60>] ? gs_change+0x13/0x13


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley - April 29, 2011, 3:02 p.m.
On Thu, 2011-04-28 at 21:27 +0100, Mel Gorman wrote:
> On Thu, Apr 28, 2011 at 02:59:27PM -0500, James Bottomley wrote:
> > On Thu, 2011-04-28 at 20:21 +0100, Mel Gorman wrote:
> > > On Thu, Apr 28, 2011 at 01:30:36PM -0500, James Bottomley wrote:
> > > > > Way hey, cgroups are also in the mix. How jolly.
> > > > > 
> > > > > Is systemd a common element of the machines hitting this bug by any
> > > > > chance?
> > > > 
> > > > Well, yes, the bug report is against FC15, which needs cgroups for
> > > > systemd.
> > > > 
> > > 
> > > Ok although we do not have direct evidence that it's the problem yet. A
> > > broken shrinker could just mean we are also trying to aggressively
> > > reclaim in cgroups.
> > > 
> > > > > The remaining traces seem to be follow-on damage related to the three
> > > > > issues of "shrinkers are bust in some manner" causing "we are not
> > > > > getting over the min watermark" and as a side-show "we are spending lots
> > > > > of time doing something unspecified but unhelpful in cgroups".
> > > > 
> > > > Heh, well find a way for me to verify this: I can't turn off cgroups
> > > > because systemd then won't work and the machine won't boot ...
> > > > 
> > > 
> > > Same testcase, same kernel but a distro that is not using systemd to
> > > verify if cgroups are the problem. Not ideal I know. When I'm back
> > > online Tuesday, I'll try reproducing this on a !Fedora distribution. In
> > > the meantime, the following untested hatchet job might spit out
> > > which shrinker we are getting stuck in. It is also breaking out of
> > > the shrink_slab loop so it'd even be interesting to see if the bug
> > > is mitigated in any way.
> > 
> > Actually, talking to Chris, I think I can get the system up using
> > init=/bin/bash without systemd, so I can try the no cgroup config.
> > 
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index c74a501..ed99104 100644
> > 
> > In the mean time, this patch produces:
> > 
> > (that's nothing ... apparently the trace doesn't activate when kswapd
> > goes mad).
> > 
> 
> Or is looping there for shorter than we expect. HZ/10?

Still doesn't print anything, even with HZ/10.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mel Gorman - May 3, 2011, 9:13 a.m.
On Thu, Apr 28, 2011 at 05:43:48PM -0500, James Bottomley wrote:
> On Thu, 2011-04-28 at 16:12 -0500, James Bottomley wrote:
> > On Thu, 2011-04-28 at 14:59 -0500, James Bottomley wrote:
> > > Actually, talking to Chris, I think I can get the system up using
> > > init=/bin/bash without systemd, so I can try the no cgroup config.
> > 
> > OK, so a non-PREEMPT non-CGROUP kernel has survived three back to back
> > runs of untar without locking or getting kswapd pegged, so I'm pretty
> > certain this is cgroups related.  The next steps are to turn cgroups
> > back on but try disabling the memory and IO controllers.
> 
> I tried non-PREEMPT CGROUP but disabled GROUP_MEM_RES_CTLR.
> 
> The results are curious:  the tar does complete (I've done three back to
> back).  However, I did get one soft lockup in kswapd (below).  But the
> system recovers instead of halting I/O and hanging like it did
> previously.
> 
> The soft lockup is in shrink_slab, so perhaps it's a combination of slab
> shrinker and cgroup memory controller issues?
> 

So, kswapd is still looping in reclaim and spending a lot of time in
shrink_slab but it must not be the shrinker itself or that debug patch
would have triggered. It's curious that cgroups are involved with
systemd considering that one would expect those groups to be fairly
small. I still don't have a new theory but will get hold of a Fedora 15
install CD and see can I reproduce it locally.

One last thing, what is the value of /proc/sys/vm/zone_reclaim_mode? Two
of the reporting machines could be NUMA and if that proc file reads as
1, I'd be interested in hearing the results of a test with it set to 0.
Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley - May 3, 2011, 2:13 p.m.
On Tue, 2011-05-03 at 10:13 +0100, Mel Gorman wrote:
> On Thu, Apr 28, 2011 at 05:43:48PM -0500, James Bottomley wrote:
> > On Thu, 2011-04-28 at 16:12 -0500, James Bottomley wrote:
> > > On Thu, 2011-04-28 at 14:59 -0500, James Bottomley wrote:
> > > > Actually, talking to Chris, I think I can get the system up using
> > > > init=/bin/bash without systemd, so I can try the no cgroup config.
> > > 
> > > OK, so a non-PREEMPT non-CGROUP kernel has survived three back to back
> > > runs of untar without locking or getting kswapd pegged, so I'm pretty
> > > certain this is cgroups related.  The next steps are to turn cgroups
> > > back on but try disabling the memory and IO controllers.
> > 
> > I tried non-PREEMPT CGROUP but disabled GROUP_MEM_RES_CTLR.
> > 
> > The results are curious:  the tar does complete (I've done three back to
> > back).  However, I did get one soft lockup in kswapd (below).  But the
> > system recovers instead of halting I/O and hanging like it did
> > previously.
> > 
> > The soft lockup is in shrink_slab, so perhaps it's a combination of slab
> > shrinker and cgroup memory controller issues?
> > 
> 
> So, kswapd is still looping in reclaim and spending a lot of time in
> shrink_slab but it must not be the shrinker itself or that debug patch
> would have triggered. It's curious that cgroups are involved with
> systemd considering that one would expect those groups to be fairly
> small. I still don't have a new theory but will get hold of a Fedora 15
> install CD and see can I reproduce it locally.

I've got a ftrace output of kswapd ... it's 500k compressed, so I'll
send under separate cover.

> One last thing, what is the value of /proc/sys/vm/zone_reclaim_mode? Two
> of the reporting machines could be NUMA and if that proc file reads as
> 1, I'd be interested in hearing the results of a test with it set to 0.
> Thanks.

It's zero, I'm afraid

James


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mel Gorman - May 6, 2011, 7:42 a.m.
On Tue, May 03, 2011 at 09:22:33AM -0500, James Bottomley wrote:
> On Tue, 2011-05-03 at 09:13 -0500, James Bottomley wrote:
> > I've got a ftrace output of kswapd ... it's 500k compressed, so I'll
> > send under separate cover.
> 
> Here it is ... it's under 2.6.38.4 vanilla, but the code is similar. 
> 

I was quiet because I was off trying to reproduce this but not having
much luck. It doesn't seem directly related to filesystems or
cgroups. For example, here is what I see with ext4 without cgroups

                2.6.34-vanilla    2.6.37-vanilla    2.6.38-vanilla       rc6-vanilla
download tar           70 ( 0.00%)   68 ( 2.94%)   69 ( 1.45%)   70 ( 0.00%)
unpack tar            601 ( 0.00%)  605 (-0.66%)  604 (-0.50%)  605 (-0.66%)
copy source files     319 ( 0.00%)  321 (-0.62%)  320 (-0.31%)  332 (-3.92%)
create tarfile       1368 ( 0.00%) 1372 (-0.29%) 1371 (-0.22%) 1363 ( 0.37%)
delete source dirs     21 ( 0.00%)   21 ( 0.00%)   23 (-8.70%)   22 (-4.55%)
expand tar            263 ( 0.00%)  261 ( 0.77%)  257 ( 2.33%)  259 ( 1.54%)

(all results are in seconds)

When running in cgroups, the results are similar - bit slower but
not remarkably so. ext3 is slower but not enough to count as the bug.

The trace you posted is very short but kswapd is not going to sleep
in it. It's less than a seconds worth on different cpus so it's hard
to draw any conclusion from it other than sleeping_prematurely()
is often deciding that kswapd should not sleep.

So lets consider what keeps it awake.

1. High-order allocations? You machine is using i915 and RPC, something
   neither of my test machine uses. i915 is potentially a source for
   high-order allocations. I'm attaching a perl script. Please run it as
   ./watch-highorder.pl --output /tmp/highorders.txt
   while you are running tar. When kswapd is running for about 30
   seconds, interrupt it with ctrl+c twice in quick succession and
   post /tmp/highorders.txt

2. All unreclaimable is not being set or we are not balancing at all.
   Can you post the output of sysrq+m while the machine is struggling
   please?

3. Slab may not be shrinking for some reason. Can you run a shell
   script like this during the whole test and record its output please?

   #!/bin/bash
   while [ 1 ]; do
	   echo time: `date +%s`
	   cat /proc/vmstat
	   sleep 2
   done

   Similarly if this is a slab issue, it'd be nice to know who it is so

   #!/bin/bash
   while [ 1 ]; do
	   echo time: `date +%s`
	   cat /proc/slabinfo
	   sleep $MONITOR_UPDATE_FREQUENCY
   done

4. Lets get a better look at what is going on in kswapd

   echo 1 > /sys/kernel/debug/tracing/events/vmscan/enable
   cat /sys/kernel/debug/tracing/trace_pipe > vmscan-ftrace.txt

Out of curiousity, what's the exact machine you are testing on
because I want to see what sort of hardware combination triggers
this problem? Is the tar local or is it copied over NFS? What is the
configuration of the disk or partition you are copying to?

Thanks
Mel Gorman - May 6, 2011, 8:07 a.m.
On Fri, May 06, 2011 at 08:42:24AM +0100, Mel Gorman wrote:
> 1. High-order allocations? You machine is using i915 and RPC, something
>    neither of my test machine uses. i915 is potentially a source for
>    high-order allocations. I'm attaching a perl script. Please run it as
>    ./watch-highorder.pl --output /tmp/highorders.txt
>    while you are running tar. When kswapd is running for about 30
>    seconds, interrupt it with ctrl+c twice in quick succession and
>    post /tmp/highorders.txt
> 

Attached this time :/
Mel Gorman - May 6, 2011, 11:42 a.m.
On Fri, May 06, 2011 at 08:42:24AM +0100, Mel Gorman wrote:
> On Tue, May 03, 2011 at 09:22:33AM -0500, James Bottomley wrote:
> > On Tue, 2011-05-03 at 09:13 -0500, James Bottomley wrote:
> > > I've got a ftrace output of kswapd ... it's 500k compressed, so I'll
> > > send under separate cover.
> > 
> > Here it is ... it's under 2.6.38.4 vanilla, but the code is similar. 
> > 
> 
> I was quiet because I was off trying to reproduce this but not having
> much luck. It doesn't seem directly related to filesystems or
> cgroups. For example, here is what I see with ext4 without cgroups
> 
>                 2.6.34-vanilla    2.6.37-vanilla    2.6.38-vanilla       rc6-vanilla
> download tar           70 ( 0.00%)   68 ( 2.94%)   69 ( 1.45%)   70 ( 0.00%)
> unpack tar            601 ( 0.00%)  605 (-0.66%)  604 (-0.50%)  605 (-0.66%)
> copy source files     319 ( 0.00%)  321 (-0.62%)  320 (-0.31%)  332 (-3.92%)
> create tarfile       1368 ( 0.00%) 1372 (-0.29%) 1371 (-0.22%) 1363 ( 0.37%)
> delete source dirs     21 ( 0.00%)   21 ( 0.00%)   23 (-8.70%)   22 (-4.55%)
> expand tar            263 ( 0.00%)  261 ( 0.77%)  257 ( 2.33%)  259 ( 1.54%)
> 
> (all results are in seconds)
> 
> When running in cgroups, the results are similar - bit slower but
> not remarkably so. ext3 is slower but not enough to count as the bug.
> 
> The trace you posted is very short but kswapd is not going to sleep
> in it. It's less than a seconds worth on different cpus so it's hard
> to draw any conclusion from it other than sleeping_prematurely()
> is often deciding that kswapd should not sleep.
> 
> So lets consider what keeps it awake.
> 
> 1. High-order allocations? You machine is using i915 and RPC, something
>    neither of my test machine uses. i915 is potentially a source for
>    high-order allocations. I'm attaching a perl script. Please run it as
>    ./watch-highorder.pl --output /tmp/highorders.txt
>    while you are running tar. When kswapd is running for about 30
>    seconds, interrupt it with ctrl+c twice in quick succession and
>    post /tmp/highorders.txt
> 
> 2. All unreclaimable is not being set or we are not balancing at all.
>    Can you post the output of sysrq+m while the machine is struggling
>    please?
> 
> 3. Slab may not be shrinking for some reason. Can you run a shell
>    script like this during the whole test and record its output please?
> 
>    #!/bin/bash
>    while [ 1 ]; do
> 	   echo time: `date +%s`
> 	   cat /proc/vmstat
> 	   sleep 2
>    done
> 
>    Similarly if this is a slab issue, it'd be nice to know who it is so
> 
>    #!/bin/bash
>    while [ 1 ]; do
> 	   echo time: `date +%s`
> 	   cat /proc/slabinfo
> 	   sleep $MONITOR_UPDATE_FREQUENCY
>    done
> 
> 4. Lets get a better look at what is going on in kswapd
> 
>    echo 1 > /sys/kernel/debug/tracing/events/vmscan/enable
>    cat /sys/kernel/debug/tracing/trace_pipe > vmscan-ftrace.txt
> 

Also, could you test the patch at https://lkml.org/lkml/2011/3/5/121
please?
Mel Gorman - May 6, 2011, 3:44 p.m.
On Fri, May 06, 2011 at 08:42:24AM +0100, Mel Gorman wrote:
> 
> 1. High-order allocations? You machine is using i915 and RPC, something
>    neither of my test machine uses. i915 is potentially a source for
>    high-order allocations. I'm attaching a perl script. Please run it as
>    ./watch-highorder.pl --output /tmp/highorders.txt
>    while you are running tar. When kswapd is running for about 30
>    seconds, interrupt it with ctrl+c twice in quick succession and
>    post /tmp/highorders.txt
> 

Colin send me this information for his test case at least and I see

11932 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC
 => alloc_pages_current+0xa5/0x110 <ffffffff81149ef5>
 => new_slab+0x1f5/0x290 <ffffffff81153645>
 => __slab_alloc+0x262/0x390 <ffffffff81155192>
 => kmem_cache_alloc+0x115/0x120 <ffffffff81155ab5>
 => mempool_alloc_slab+0x15/0x20 <ffffffff8110e705>
 => mempool_alloc+0x59/0x140 <ffffffff8110ea49>
 => bio_alloc_bioset+0x3e/0xf0 <ffffffff811976ae>
 => bio_alloc+0x15/0x30 <ffffffff81197805>

Colin and James: Did you happen to switch from SLAB to SLUB between
2.6.37 and 2.6.38? My own tests were against SLAB which might be why I
didn't see the problem. Am restarting the tests with SLUB.
James Bottomley - May 6, 2011, 7:14 p.m.
On Fri, 2011-05-06 at 16:44 +0100, Mel Gorman wrote:
> Colin and James: Did you happen to switch from SLAB to SLUB between
> 2.6.37 and 2.6.38? My own tests were against SLAB which might be why I
> didn't see the problem. Am restarting the tests with SLUB.

Aargh ... I'm an idiot.  I should have thought of SLUB immediately ...
it's been causing oopses since debian switched to it.

So I recompiled the 2.6.38.4 stable kernel with SLAB instead of SLUB and
the problem goes away ... at least from three untar runs on a loaded
box ... of course it could manifest a few ms after I send this email ...

There are material differences, as well: SLAB isn't taking my system
down to very low memory on the untar ... it's keeping about 0.5Gb listed
as free.  SLUB took that to under 100kb, so it could just be that SLAB
isn't wandering as close to the cliff edge?

James


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mel Gorman - May 6, 2011, 7:37 p.m.
On Fri, May 06, 2011 at 02:14:37PM -0500, James Bottomley wrote:
> On Fri, 2011-05-06 at 16:44 +0100, Mel Gorman wrote:
> > Colin and James: Did you happen to switch from SLAB to SLUB between
> > 2.6.37 and 2.6.38? My own tests were against SLAB which might be why I
> > didn't see the problem. Am restarting the tests with SLUB.
> 
> Aargh ... I'm an idiot.  I should have thought of SLUB immediately ...
> it's been causing oopses since debian switched to it.
> 
> So I recompiled the 2.6.38.4 stable kernel with SLAB instead of SLUB and
> the problem goes away ... at least from three untar runs on a loaded
> box ... of course it could manifest a few ms after I send this email ...
> 
> There are material differences, as well: SLAB isn't taking my system
> down to very low memory on the untar ... it's keeping about 0.5Gb listed
> as free.  SLUB took that to under 100kb, so it could just be that SLAB
> isn't wandering as close to the cliff edge?
> 

A comparison of watch-highorder.pl with SLAB and SLUB may be
enlightening as well as testing SLUB altering allocate_slab() to read

alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;

i.e. try adding the __GFP_NO_KSWAPD. My own tests are still in progress
but I'm still not seeing the problem. I'm installing Fedora on another
test machine at the moment to see if X and other applications have to be
running to pressure high-order allocations properly.
James Bottomley - May 9, 2011, 6:16 p.m.
On Fri, 2011-05-06 at 09:07 +0100, Mel Gorman wrote:
> On Fri, May 06, 2011 at 08:42:24AM +0100, Mel Gorman wrote:
> > 1. High-order allocations? You machine is using i915 and RPC, something
> >    neither of my test machine uses. i915 is potentially a source for
> >    high-order allocations. I'm attaching a perl script. Please run it as
> >    ./watch-highorder.pl --output /tmp/highorders.txt
> >    while you are running tar. When kswapd is running for about 30
> >    seconds, interrupt it with ctrl+c twice in quick succession and
> >    post /tmp/highorders.txt
> > 
> 
> Attached this time :/

Here's the output (loaded with tar, evolution and firefox).  The top
trace is different this time because your perl script perturbs the
system quite a bit.  This was with your slub allocation fix applied.

James

---
1 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc+0xb1/0x10d <ffffffff8110dfbb>
 => sg_kmalloc+0x24/0x26 <ffffffff81230eb9>
 => __sg_alloc_table+0x63/0x11c <ffffffff81230dbf>

1 instances order=1 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __page_cache_alloc+0x77/0x7e <ffffffff810d5778>
 => __do_page_cache_readahead+0x9b/0x177 <ffffffff810dd74b>
 => ra_submit+0x21/0x25 <ffffffff810dda8d>
 => ondemand_readahead+0x1c9/0x1d8 <ffffffff810ddc5a>
 => page_cache_sync_readahead+0x3d/0x40 <ffffffff810ddd49>
 => generic_file_aio_read+0x27d/0x5e0 <ffffffff810d64d0>

177 instances order=2 normal gfp_flags=GFP_NOFS|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_RECLAIMABLE|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => radix_tree_preload+0x31/0x81 <ffffffff81229399>
 => add_to_page_cache_locked+0x56/0x118 <ffffffff810d57d5>

1 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_vma+0xf5/0xfa <ffffffff81107001>
 => do_wp_page+0x348/0x560 <ffffffff810ef711>
 => handle_pte_fault+0x73b/0x7a5 <ffffffff810f13b7>
 => handle_mm_fault+0x1bb/0x1ce <ffffffff810f1798>
 => do_page_fault+0x358/0x37a <ffffffff8147416e>
 => page_fault+0x25/0x30 <ffffffff81471415>

46 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => prepare_creds+0x26/0xae <ffffffff81074d4b>
 => sys_faccessat+0x37/0x162 <ffffffff8111d255>

1 instances order=1 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc_node+0x93/0x12d <ffffffff8110e199>
 => __alloc_skb+0x40/0x133 <ffffffff813b5df7>
 => __netdev_alloc_skb+0x1f/0x3b <ffffffff813b5f4a>

1 instances order=2 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => alloc_inode+0x30/0x78 <ffffffff811317f8>
 => new_inode+0x1b/0x4b <ffffffff8113185b>

16 instances order=9 normal gfp_flags=GFP_HIGHUSER_MOVABLE|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_vma+0xf5/0xfa <ffffffff81107001>
 => do_huge_pmd_anonymous_page+0xbf/0x261 <ffffffff81115b6a>
 => handle_mm_fault+0x113/0x1ce <ffffffff810f16f0>
 => do_page_fault+0x358/0x37a <ffffffff8147416e>
 => page_fault+0x25/0x30 <ffffffff81471415>

5 instances order=3 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc_node_track_caller+0xcf/0x131 <ffffffff8110ffa6>
 => __alloc_skb+0x75/0x133 <ffffffff813b5e2c>
 => __netdev_alloc_skb+0x1f/0x3b <ffffffff813b5f4a>

252 instances order=2 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => radix_tree_preload+0x31/0x81 <ffffffff81229399>
 => add_to_page_cache_locked+0x56/0x118 <ffffffff810d57d5>

1 instances order=1 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => skb_clone+0x50/0x70 <ffffffff813b5d97>
 => packet_rcv+0x101/0x2b2 <ffffffff81441e80>

1 instances order=2 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => alloc_inode+0x30/0x78 <ffffffff811317f8>
 => iget_locked+0x61/0xdd <ffffffff8113239a>

1 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc+0xb1/0x10d <ffffffff8110dfbb>
 => sg_kmalloc+0x24/0x26 <ffffffff81230eb9>
 => __sg_alloc_table+0x63/0x11c <ffffffff81230dbf>

1 instances order=3 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => copy_process+0x66e/0x10c5 <ffffffff81053961>
 => do_fork+0x104/0x286 <ffffffff810544f7>

1 instances order=1 normal gfp_flags=GFP_KERNEL
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __get_free_pages+0xe/0x4a <ffffffff810d89d2>
 => copy_process+0xe1/0x10c5 <ffffffff810533d4>
 => do_fork+0x104/0x286 <ffffffff810544f7>
 => kernel_thread+0x75/0x77 <ffffffff81010bb6>
 => wait_for_helper+0x6b/0xa3 <ffffffff81068fca>
 => kernel_thread_helper+0x4/0x10 <ffffffff8100a9a4>

133 instances order=9 normal gfp_flags=GFP_HIGHUSER_MOVABLE|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_vma+0xf5/0xfa <ffffffff81107001>
 => khugepaged+0x543/0xf2d <ffffffff81113cc1>
 => kthread+0x84/0x8c <ffffffff8106f2df>
 => kernel_thread_helper+0x4/0x10 <ffffffff8100a9a4>

593 instances order=3 normal gfp_flags=GFP_NOFS|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_RECLAIMABLE|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => ext4_alloc_inode+0x1a/0x111 <ffffffff8119f498>
 => alloc_inode+0x1d/0x78 <ffffffff811317e5>

2 instances order=3 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc+0xb1/0x10d <ffffffff8110dfbb>
 => drm_malloc_ab+0x3b/0x53 [i915] <ffffffffa007fbd2>
 => i915_gem_execbuffer2+0x4f/0x12e [i915] <ffffffffa0080f81>

14 instances order=1 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => d_alloc+0x26/0x18d <ffffffff8112e4c5>
 => shmem_file_setup+0xb6/0x160 <ffffffff810e70ad>

38 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => mmap_region+0x1b7/0x446 <ffffffff810f6555>
 => do_mmap_pgoff+0x298/0x2f2 <ffffffff810f6a7c>

1 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => idr_pre_get+0x2d/0x6f <ffffffff812266c4>
 => drm_gem_handle_create+0x2f/0x82 [drm] <ffffffffa0024f75>

400 instances order=1 normal gfp_flags=GFP_KERNEL
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __get_free_pages+0xe/0x4a <ffffffff810d89d2>
 => copy_process+0xe1/0x10c5 <ffffffff810533d4>
 => do_fork+0x104/0x286 <ffffffff810544f7>
 => sys_clone+0x28/0x2a <ffffffff8101150e>
 => stub_clone+0x13/0x20 <ffffffff81009ea3>

9 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc_node+0x93/0x12d <ffffffff8110e199>
 => __alloc_skb+0x40/0x133 <ffffffff813b5df7>
 => sock_alloc_send_pskb+0xb4/0x2d7 <ffffffff813b238a>

1 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_REPEAT|GFP_COMP
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_vma+0xf5/0xfa <ffffffff81107001>
 => handle_pte_fault+0x16f/0x7a5 <ffffffff810f0deb>
 => handle_mm_fault+0x1bb/0x1ce <ffffffff810f1798>
 => do_page_fault+0x358/0x37a <ffffffff8147416e>
 => page_fault+0x25/0x30 <ffffffff81471415>
 => unix_stream_recvmsg+0x40f/0x536 <ffffffff8143dbe0>
 => sock_aio_read.part.7+0x10d/0x121 <ffffffff813afa54>

2 instances order=1 normal gfp_flags=GFP_KERNEL
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __get_free_pages+0xe/0x4a <ffffffff810d89d2>
 => copy_process+0xe1/0x10c5 <ffffffff810533d4>
 => do_fork+0x104/0x286 <ffffffff810544f7>
 => kernel_thread+0x75/0x77 <ffffffff81010bb6>
 => __call_usermodehelper+0x43/0x76 <ffffffff81069132>
 => process_one_work+0x186/0x298 <ffffffff8106b3ca>

14 instances order=3 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc_node_track_caller+0xcf/0x131 <ffffffff8110ffa6>
 => __alloc_skb+0x75/0x133 <ffffffff813b5e2c>
 => sock_alloc_send_pskb+0xb4/0x2d7 <ffffffff813b238a>

781 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_REPEAT|GFP_COMP
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => kmalloc_large_node+0x56/0x95 <ffffffff8146a55d>
 => __kmalloc_node_track_caller+0x31/0x131 <ffffffff8110ff08>
 => __alloc_skb+0x75/0x133 <ffffffff813b5e2c>
 => sock_alloc_send_pskb+0xb4/0x2d7 <ffffffff813b238a>
 => sock_alloc_send_skb+0x15/0x17 <ffffffff813b25c2>
 => unix_stream_sendmsg+0x11e/0x2ec <ffffffff8143d217>
 => __sock_sendmsg+0x69/0x76 <ffffffff813af778>

1 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __get_free_pages+0xe/0x4a <ffffffff810d89d2>
 => __pollwait+0x5e/0xcc <ffffffff8112c9e7>
 => sock_poll_wait+0x18/0x1d <ffffffff8143cc50>
 => unix_poll+0x1c/0x9a <ffffffff8143cc71>
 => sock_poll+0x1a/0x1c <ffffffff813ad41a>
 => do_sys_poll+0x1fa/0x386 <ffffffff8112d75d>

48 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => __split_vma+0x6c/0x21b <ffffffff810f5099>
 => split_vma+0x20/0x22 <ffffffff810f59a6>

3 instances order=3 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => sk_prot_alloc+0x37/0x13a <ffffffff813b27d4>
 => sk_alloc+0x2c/0x8e <ffffffff813b2935>

24 instances order=2 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => proc_alloc_inode+0x20/0x91 <ffffffff8116829c>
 => alloc_inode+0x1d/0x78 <ffffffff811317e5>

1 instances order=1 normal gfp_flags=GFP_NOFS|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_RECLAIMABLE|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => ext4_mb_new_blocks+0x118/0x3c2 <ffffffff811b73a7>
 => ext4_ext_map_blocks+0x192b/0x1b5b <ffffffff811b0a5d>

5 instances order=2 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => sock_alloc_inode+0x1d/0xaa <ffffffff813ad605>
 => alloc_inode+0x1d/0x78 <ffffffff811317e5>

3 instances order=2 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => dst_alloc+0x48/0xaa <ffffffff813c6652>
 => __ip_route_output_key+0x561/0x764 <ffffffff813ee4f3>

3 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __get_free_pages+0xe/0x4a <ffffffff810d89d2>
 => __pollwait+0x5e/0xcc <ffffffff8112c9e7>
 => sock_poll_wait+0x18/0x1d <ffffffff8143cc50>
 => unix_poll+0x1c/0x9a <ffffffff8143cc71>
 => sock_poll+0x1a/0x1c <ffffffff813ad41a>
 => do_select+0x2fb/0x4f5 <ffffffff8112cf13>

4 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => __split_vma+0x6c/0x21b <ffffffff810f5099>
 => do_munmap+0x194/0x30b <ffffffff810f57ad>

229 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => dup_mm+0x1f8/0x486 <ffffffff81053039>
 => copy_process+0x917/0x10c5 <ffffffff81053c0a>

1 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => bprm_mm_init+0x70/0x1a0 <ffffffff811239aa>
 => do_execve+0xd6/0x277 <ffffffff81123dd4>

2 instances order=2 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc_node_track_caller+0xcf/0x131 <ffffffff8110ffa6>
 => __alloc_skb+0x75/0x133 <ffffffff813b5e2c>
 => find_skb+0x3a/0x82 <ffffffff813d1996>

2 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => __split_vma+0x6c/0x21b <ffffffff810f5099>
 => do_munmap+0x15f/0x30b <ffffffff810f5778>

1 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_vma+0xf5/0xfa <ffffffff81107001>
 => handle_pte_fault+0x16f/0x7a5 <ffffffff810f0deb>
 => handle_mm_fault+0x1bb/0x1ce <ffffffff810f1798>
 => do_page_fault+0x358/0x37a <ffffffff8147416e>
 => page_fault+0x25/0x30 <ffffffff81471415>

501 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => get_empty_filp+0x7a/0x141 <ffffffff8111f2d1>
 => do_filp_open+0xe7/0x60a <ffffffff81129bcf>

18 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => bvec_alloc_bs+0xae/0xcc <ffffffff81144d61>
 => bio_alloc_bioset+0x75/0xc3 <ffffffff81144df4>

1370 instances order=1 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => d_alloc+0x26/0x18d <ffffffff8112e4c5>
 => d_alloc_and_lookup+0x2c/0x6b <ffffffff81126d0e>

1 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc+0xb1/0x10d <ffffffff8110dfbb>
 => kzalloc.constprop.1+0x13/0x15 [i915] <ffffffffa007fbfd>
 => i915_gem_do_execbuffer+0x306/0x1116 [i915] <ffffffffa007ff05>

1 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => taskstats_exit+0x5b/0x2e3 <ffffffff810b0810>
 => do_exit+0x22a/0x6ef <ffffffff8105841f>

17 instances order=1 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => d_alloc+0x26/0x18d <ffffffff8112e4c5>
 => proc_fill_cache+0x82/0x135 <ffffffff8116903b>

99 instances order=1 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => mempool_alloc_slab+0x15/0x17 <ffffffff810d6e81>
 => mempool_alloc+0x68/0x116 <ffffffff810d70f6>

1 instances order=1 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => inet_twsk_alloc+0x31/0x11e <ffffffff813f8a16>
 => tcp_time_wait+0xb9/0x29a <ffffffff8140e24f>

26 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __page_cache_alloc+0x77/0x7e <ffffffff810d5778>
 => grab_cache_page_write_begin+0x54/0x9e <ffffffff810d5943>
 => ext4_da_write_begin+0x13f/0x20e <ffffffff81196376>
 => generic_file_buffered_write+0x109/0x23a <ffffffff810d5408>
 => __generic_file_aio_write+0x242/0x272 <ffffffff810d617d>
 => generic_file_aio_write+0x58/0xa6 <ffffffff810d6205>

1 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc+0xb1/0x10d <ffffffff8110dfbb>
 => drm_malloc_ab+0x3b/0x53 [i915] <ffffffffa007fbd2>
 => i915_gem_do_execbuffer+0x64a/0x1116 [i915] <ffffffffa0080249>

6 instances order=1 normal gfp_flags=GFP_KERNEL
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __get_free_pages+0xe/0x4a <ffffffff810d89d2>
 => copy_process+0xe1/0x10c5 <ffffffff810533d4>
 => do_fork+0x104/0x286 <ffffffff810544f7>
 => kernel_thread+0x75/0x77 <ffffffff81010bb6>
 => kthreadd+0xe7/0x124 <ffffffff8106f61f>
 => kernel_thread_helper+0x4/0x10 <ffffffff8100a9a4>

2 instances order=1 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc_node+0x93/0x12d <ffffffff8110e199>
 => __alloc_skb+0x40/0x133 <ffffffff813b5df7>
 => find_skb+0x3a/0x82 <ffffffff813d1996>

131 instances order=1 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => scsi_pool_alloc_command+0x29/0x68 <ffffffff812f3047>
 => scsi_host_alloc_command+0x1f/0x6b <ffffffff812f30cd>

6 instances order=1 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => d_alloc+0x26/0x18d <ffffffff8112e4c5>
 => d_alloc_pseudo+0x18/0x46 <ffffffff8112e6a7>

6 instances order=3 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => bvec_alloc_bs+0xae/0xcc <ffffffff81144d61>
 => bio_alloc_bioset+0x75/0xc3 <ffffffff81144df4>

11 instances order=3 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => shmem_alloc_inode+0x1a/0x2f <ffffffff810e58a0>
 => alloc_inode+0x1d/0x78 <ffffffff811317e5>

4 instances order=3 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc+0xb1/0x10d <ffffffff8110dfbb>
 => drm_malloc_ab+0x3b/0x53 [i915] <ffffffffa007fbd2>
 => i915_gem_do_execbuffer+0x64a/0x1116 [i915] <ffffffffa0080249>

680 instances order=3 atomic gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => mempool_alloc_slab+0x15/0x17 <ffffffff810d6e81>
 => mempool_alloc+0x68/0x116 <ffffffff810d70f6>

8 instances order=3 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => copy_process+0xc6/0x10c5 <ffffffff810533b9>
 => do_fork+0x104/0x286 <ffffffff810544f7>

4 instances order=9 normal gfp_flags=GFP_HIGHUSER_MOVABLE|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_vma+0xf5/0xfa <ffffffff81107001>
 => do_huge_pmd_wp_page+0x15a/0x637 <ffffffff81114a01>
 => handle_mm_fault+0x169/0x1ce <ffffffff810f1746>
 => do_page_fault+0x358/0x37a <ffffffff8147416e>
 => page_fault+0x25/0x30 <ffffffff81471415>

193 instances order=2 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => mempool_alloc_slab+0x15/0x17 <ffffffff810d6e81>
 => mempool_alloc+0x68/0x116 <ffffffff810d70f6>

140358 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => mempool_alloc_slab+0x15/0x17 <ffffffff810d6e81>
 => mempool_alloc+0x68/0x116 <ffffffff810d70f6>

3 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc_node_track_caller+0xcf/0x131 <ffffffff8110ffa6>
 => __alloc_skb+0x75/0x133 <ffffffff813b5e2c>
 => sock_alloc_send_pskb+0xb4/0x2d7 <ffffffff813b238a>

17 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __page_cache_alloc+0x77/0x7e <ffffffff810d5778>
 => __do_page_cache_readahead+0x9b/0x177 <ffffffff810dd74b>
 => ra_submit+0x21/0x25 <ffffffff810dda8d>
 => ondemand_readahead+0x1c9/0x1d8 <ffffffff810ddc5a>
 => page_cache_async_readahead+0x7b/0xa3 <ffffffff810ddce4>
 => generic_file_aio_read+0x2bd/0x5e0 <ffffffff810d6510>

9 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => __kmalloc+0xb1/0x10d <ffffffff8110dfbb>
 => kzalloc.constprop.15+0x13/0x15 [i915] <ffffffffa007a956>
 => i915_gem_alloc_object+0x27/0x111 [i915] <ffffffffa007e4b3>

14 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
 => new_slab+0x50/0x199 <ffffffff8110dc24>
 => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
 => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
 => get_empty_filp+0x7a/0x141 <ffffffff8111f2d1>
 => alloc_file+0x1e/0xbf <ffffffff8111f3b6>

1 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_COMP|GFP_ZERO
 => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
 => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
 => __get_free_pages+0xe/0x4a <ffffffff810d89d2>
 => kmalloc_order_trace+0x2c/0x5e <ffffffff8110c718>
 => __kmalloc+0x37/0x10d <ffffffff8110df41>
 => kzalloc.constprop.19+0xe/0x10 <ffffffff810b8d8c>
 => tracing_open_pipe+0x40/0x147 <ffffffff810b907a>
 => __dentry_open+0x161/0x283 <ffffffff8111c95c>

High-order normal allocations: 145450
High-order atomic allocations: 927
 


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Colin Ian King - May 10, 2011, 5:37 a.m.
On Fri, 2011-05-06 at 16:44 +0100, Mel Gorman wrote:
> On Fri, May 06, 2011 at 08:42:24AM +0100, Mel Gorman wrote:
> > 
> > 1. High-order allocations? You machine is using i915 and RPC, something
> >    neither of my test machine uses. i915 is potentially a source for
> >    high-order allocations. I'm attaching a perl script. Please run it as
> >    ./watch-highorder.pl --output /tmp/highorders.txt
> >    while you are running tar. When kswapd is running for about 30
> >    seconds, interrupt it with ctrl+c twice in quick succession and
> >    post /tmp/highorders.txt
> > 
> 
> Colin send me this information for his test case at least and I see
> 
> 11932 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC
>  => alloc_pages_current+0xa5/0x110 <ffffffff81149ef5>
>  => new_slab+0x1f5/0x290 <ffffffff81153645>
>  => __slab_alloc+0x262/0x390 <ffffffff81155192>
>  => kmem_cache_alloc+0x115/0x120 <ffffffff81155ab5>
>  => mempool_alloc_slab+0x15/0x20 <ffffffff8110e705>
>  => mempool_alloc+0x59/0x140 <ffffffff8110ea49>
>  => bio_alloc_bioset+0x3e/0xf0 <ffffffff811976ae>
>  => bio_alloc+0x15/0x30 <ffffffff81197805>
> 
> Colin and James: Did you happen to switch from SLAB to SLUB between
> 2.6.37 and 2.6.38? My own tests were against SLAB which might be why I
> didn't see the problem. Am restarting the tests with SLUB.

So I tested with SLAB instead of SLUB and I reliably ran my copy test
for 4+ hours with several hundred iterations of the test.  (Apologies
for taking time to respond, but I was travelling).
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mel Gorman - May 10, 2011, 10:21 a.m.
On Mon, May 09, 2011 at 01:16:20PM -0500, James Bottomley wrote:
> On Fri, 2011-05-06 at 09:07 +0100, Mel Gorman wrote:
> > On Fri, May 06, 2011 at 08:42:24AM +0100, Mel Gorman wrote:
> > > 1. High-order allocations? You machine is using i915 and RPC, something
> > >    neither of my test machine uses. i915 is potentially a source for
> > >    high-order allocations. I'm attaching a perl script. Please run it as
> > >    ./watch-highorder.pl --output /tmp/highorders.txt
> > >    while you are running tar. When kswapd is running for about 30
> > >    seconds, interrupt it with ctrl+c twice in quick succession and
> > >    post /tmp/highorders.txt
> > > 
> > 
> > Attached this time :/
> 
> Here's the output (loaded with tar, evolution and firefox).  The top
> trace is different this time because your perl script perturbs the
> system quite a bit.  This was with your slub allocation fix applied.
> 

I note that certain flags like __GFP_NO_KSWAPD are not recognised by
tracing which might explain why they are missing from the script output.
I regret the script perturbs the system quite a bit. It's possible it
can be made better by filtering events but it's not high on the list of
things to do.

How does the output compare without the fix? I can't find a similar
report in my inbox.

Does the fix help the system when the perl script is not running?

> 177 instances order=2 normal gfp_flags=GFP_NOFS|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_RECLAIMABLE|
>  => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
>  => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
>  => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
>  => new_slab+0x50/0x199 <ffffffff8110dc24>
>  => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
>  => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
>  => radix_tree_preload+0x31/0x81 <ffffffff81229399>
>  => add_to_page_cache_locked+0x56/0x118 <ffffffff810d57d5>
> 

Ouch.

> 46 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
>  => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
>  => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
>  => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
>  => new_slab+0x50/0x199 <ffffffff8110dc24>
>  => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
>  => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
>  => prepare_creds+0x26/0xae <ffffffff81074d4b>
>  => sys_faccessat+0x37/0x162 <ffffffff8111d255>
> 

Less ouch, but still.

> 252 instances order=2 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
>  => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
>  => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
>  => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
>  => new_slab+0x50/0x199 <ffffffff8110dc24>
>  => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
>  => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
>  => radix_tree_preload+0x31/0x81 <ffffffff81229399>
>  => add_to_page_cache_locked+0x56/0x118 <ffffffff810d57d5>
> 

Ouch again.

> 593 instances order=3 normal gfp_flags=GFP_NOFS|GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_RECLAIMABLE|
>  => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
>  => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
>  => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
>  => new_slab+0x50/0x199 <ffffffff8110dc24>
>  => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
>  => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
>  => ext4_alloc_inode+0x1a/0x111 <ffffffff8119f498>
>  => alloc_inode+0x1d/0x78 <ffffffff811317e5>
> 

Again, filesystem-related calls are hitting high-order paths quite a
bit.

> 781 instances order=2 normal gfp_flags=GFP_KERNEL|GFP_REPEAT|GFP_COMP
>  => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
>  => kmalloc_large_node+0x56/0x95 <ffffffff8146a55d>
>  => __kmalloc_node_track_caller+0x31/0x131 <ffffffff8110ff08>
>  => __alloc_skb+0x75/0x133 <ffffffff813b5e2c>
>  => sock_alloc_send_pskb+0xb4/0x2d7 <ffffffff813b238a>
>  => sock_alloc_send_skb+0x15/0x17 <ffffffff813b25c2>
>  => unix_stream_sendmsg+0x11e/0x2ec <ffffffff8143d217>
>  => __sock_sendmsg+0x69/0x76 <ffffffff813af778>
> 

A number of network paths are also being hit although this is the worst.

> 501 instances order=1 normal gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
>  => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
>  => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
>  => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
>  => new_slab+0x50/0x199 <ffffffff8110dc24>
>  => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
>  => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
>  => get_empty_filp+0x7a/0x141 <ffffffff8111f2d1>
>  => do_filp_open+0xe7/0x60a <ffffffff81129bcf>
> 

More filesystem impairment.

> 1370 instances order=1 normal gfp_flags=GFP_TEMPORARY|GFP_NOWARN|GFP_NORETRY|GFP_COMP|
>  => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
>  => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
>  => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
>  => new_slab+0x50/0x199 <ffffffff8110dc24>
>  => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
>  => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
>  => d_alloc+0x26/0x18d <ffffffff8112e4c5>
>  => d_alloc_and_lookup+0x2c/0x6b <ffffffff81126d0e>
> 

*cries*

> 140358 instances order=1 normal gfp_flags=GFP_NOWARN|GFP_NORETRY|GFP_COMP|GFP_NOMEMALLOC|
>  => __alloc_pages_nodemask+0x737/0x772 <ffffffff810dc0bd>
>  => alloc_pages_current+0xbe/0xd8 <ffffffff81105435>
>  => alloc_slab_page+0x1c/0x4d <ffffffff8110c5da>
>  => new_slab+0x50/0x199 <ffffffff8110dc24>
>  => __slab_alloc+0x24a/0x328 <ffffffff8146ab66>
>  => kmem_cache_alloc+0x77/0x105 <ffffffff8110e42c>
>  => mempool_alloc_slab+0x15/0x17 <ffffffff810d6e81>
>  => mempool_alloc+0x68/0x116 <ffffffff810d70f6>

Wonder which pool this is!

It goes on. A number of filesystem and network paths are being hit
with high-order allocs. i915 was a red herring, it's present but not
in massive numbers. The filesystem, network and mempool allocations
are likely to be kicking kswapd awake frequently and hurting overall
system performance as a result.

I really would like to hear if the fix makes a big difference or
if we need to consider forcing SLUB high-order allocations bailing
at the first sign of trouble (e.g. by masking out __GFP_WAIT in
allocate_slab). Even with the fix applied, kswapd might be waking up
less but processes will still be getting stalled in direct compaction
and direct reclaim so it would still be jittery.

> High-order normal allocations: 145450
> High-order atomic allocations: 927
>  

I bet a shiny penny that the high-order allocations for SLAB are lower
than this
Pekka Enberg - May 10, 2011, 10:33 a.m.
On Tue, May 10, 2011 at 1:21 PM, Mel Gorman <mgorman@novell.com> wrote:
> It goes on. A number of filesystem and network paths are being hit
> with high-order allocs. i915 was a red herring, it's present but not
> in massive numbers. The filesystem, network and mempool allocations
> are likely to be kicking kswapd awake frequently and hurting overall
> system performance as a result.
>
> I really would like to hear if the fix makes a big difference or
> if we need to consider forcing SLUB high-order allocations bailing
> at the first sign of trouble (e.g. by masking out __GFP_WAIT in
> allocate_slab). Even with the fix applied, kswapd might be waking up
> less but processes will still be getting stalled in direct compaction
> and direct reclaim so it would still be jittery.

Yes, sounds reasonable to me.

                         Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley - May 10, 2011, 2:01 p.m.
On Tue, 2011-05-10 at 11:21 +0100, Mel Gorman wrote:
> I really would like to hear if the fix makes a big difference or
> if we need to consider forcing SLUB high-order allocations bailing
> at the first sign of trouble (e.g. by masking out __GFP_WAIT in
> allocate_slab). Even with the fix applied, kswapd might be waking up
> less but processes will still be getting stalled in direct compaction
> and direct reclaim so it would still be jittery.

"the fix" being this

https://lkml.org/lkml/2011/3/5/121

In addition to your GFP_KSWAPD one?

OK, will retry with that.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c74a501..ed99104 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -225,6 +225,7 @@  unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 {
 	struct shrinker *shrinker;
 	unsigned long ret = 0;
+	unsigned long shrink_expired = jiffies + HZ;
 
 	if (scanned == 0)
 		scanned = SWAP_CLUSTER_MAX;
@@ -270,6 +271,14 @@  unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 								gfp_mask);
 			if (shrink_ret == -1)
 				break;
+			if (time_after(jiffies, shrink_expired)) {
+				printk(KERN_WARNING "Slab shrinker %p gone mental"
+						" comm=%s nr=%ld\n",
+					shrinker->shrink,
+					current->comm,
+					shrinker->nr);
+				break;
+			}
 			if (shrink_ret < nr_before)
 				ret += nr_before - shrink_ret;
 			count_vm_events(SLABS_SCANNED, this_scan);