Message ID | 20130701181742.GB8937@order.stressinduktion.org |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Mon, Jul 01, 2013 at 08:17:42PM +0200, Hannes Frederic Sowa wrote: > If the socket had an IPV6_MTU value set, ip6_append_data_mtu lost track > of this when appending the second frame on a corked socket. This results > in the following splat: > > [...] > > While there, also check if path mtu discovery is activated for this > socket. The logic was adapted from ip6_append_data when first writing > on the corked socket. > > This bug was introduced with commit > 0c1833797a5a6ec23ea9261d979aa18078720b74 ("ipv6: fix incorrect ipsec > fragment"). > > Cc: Gao feng <gaofeng@cn.fujitsu.com> > Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> > Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> > --- > net/ipv6/ip6_output.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) I would love to have additional review for this patch. I am fairly sure that this patch is correct, but... Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/02/2013 02:17 AM, Hannes Frederic Sowa wrote: > If the socket had an IPV6_MTU value set, ip6_append_data_mtu lost track > of this when appending the second frame on a corked socket. This results > in the following splat: > > [37598.993962] ------------[ cut here ]------------ > [37598.994008] kernel BUG at net/core/skbuff.c:2064! > [37598.994008] invalid opcode: 0000 [#1] SMP > [37598.994008] Modules linked in: tcp_lp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media vfat fat usb_storage fuse ebtable_nat xt_CHECKSUM bridge stp llc ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat > +nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi > +scsi_transport_iscsi rfcomm bnep iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant arc4 iwldvm mac80211 snd_hda_intel acpi_cpufreq mperf coretemp snd_hda_codec microcode cdc_wdm cdc_acm > [37598.994008] snd_hwdep cdc_ether snd_seq snd_seq_device usbnet mii joydev btusb snd_pcm bluetooth i2c_i801 e1000e lpc_ich mfd_core ptp iwlwifi pps_core snd_page_alloc mei cfg80211 snd_timer thinkpad_acpi snd tpm_tis soundcore rfkill tpm tpm_bios vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc > +dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core wmi video > [37598.994008] CPU 0 > [37598.994008] Pid: 27320, comm: t2 Not tainted 3.9.6-200.fc18.x86_64 #1 LENOVO 27744PG/27744PG > [37598.994008] RIP: 0010:[<ffffffff815443a5>] [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330 > [37598.994008] RSP: 0018:ffff88003670da18 EFLAGS: 00010202 > [37598.994008] RAX: ffff88018105c018 RBX: 0000000000000004 RCX: 00000000000006c0 > [37598.994008] RDX: ffff88018105a6c0 RSI: ffff88018105a000 RDI: ffff8801e1b0aa00 > [37598.994008] RBP: ffff88003670da78 R08: 0000000000000000 R09: ffff88018105c040 > [37598.994008] R10: ffff8801e1b0aa00 R11: 0000000000000000 R12: 000000000000fff8 > [37598.994008] R13: 00000000000004fc R14: 00000000ffff0504 R15: 0000000000000000 > [37598.994008] FS: 00007f28eea59740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000 > [37598.994008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [37598.994008] CR2: 0000003d935789e0 CR3: 00000000365cb000 CR4: 00000000000407f0 > [37598.994008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [37598.994008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [37598.994008] Process t2 (pid: 27320, threadinfo ffff88003670c000, task ffff88022c162ee0) > [37598.994008] Stack: > [37598.994008] ffff88022e098a00 ffff88020f973fc0 0000000000000008 00000000000004c8 > [37598.994008] ffff88020f973fc0 00000000000004c4 ffff88003670da78 ffff8801e1b0a200 > [37598.994008] 0000000000000018 00000000000004c8 ffff88020f973fc0 00000000000004c4 > [37598.994008] Call Trace: > [37598.994008] [<ffffffff815fc21f>] ip6_append_data+0xccf/0xfe0 > [37598.994008] [<ffffffff8158d9f0>] ? ip_copy_metadata+0x1a0/0x1a0 > [37598.994008] [<ffffffff81661f66>] ? _raw_spin_lock_bh+0x16/0x40 > [37598.994008] [<ffffffff8161548d>] udpv6_sendmsg+0x1ed/0xc10 > [37598.994008] [<ffffffff812a2845>] ? sock_has_perm+0x75/0x90 > [37598.994008] [<ffffffff815c3693>] inet_sendmsg+0x63/0xb0 > [37598.994008] [<ffffffff812a2973>] ? selinux_socket_sendmsg+0x23/0x30 > [37598.994008] [<ffffffff8153a450>] sock_sendmsg+0xb0/0xe0 > [37598.994008] [<ffffffff810135d1>] ? __switch_to+0x181/0x4a0 > [37598.994008] [<ffffffff8153d97d>] sys_sendto+0x12d/0x180 > [37598.994008] [<ffffffff810dfb64>] ? __audit_syscall_entry+0x94/0xf0 > [37598.994008] [<ffffffff81020ed1>] ? syscall_trace_enter+0x231/0x240 > [37598.994008] [<ffffffff8166a7e7>] tracesys+0xdd/0xe2 > [37598.994008] Code: fe 07 00 00 48 c7 c7 04 28 a6 81 89 45 a0 4c 89 4d b8 44 89 5d a8 e8 1b ac b1 ff 44 8b 5d a8 4c 8b 4d b8 8b 45 a0 e9 cf fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 48 > [37598.994008] RIP [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330 > [37598.994008] RSP <ffff88003670da18> > [37599.007323] ---[ end trace d69f6a17f8ac8eee ]--- > > While there, also check if path mtu discovery is activated for this > socket. The logic was adapted from ip6_append_data when first writing > on the corked socket. > > This bug was introduced with commit > 0c1833797a5a6ec23ea9261d979aa18078720b74 ("ipv6: fix incorrect ipsec > fragment"). > > Cc: Gao feng <gaofeng@cn.fujitsu.com> > Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> > Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> > --- > net/ipv6/ip6_output.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c > index dae1949..7701735 100644 > --- a/net/ipv6/ip6_output.c > +++ b/net/ipv6/ip6_output.c > @@ -1097,7 +1097,8 @@ static void ip6_append_data_mtu(int *mtu, > int *maxfraglen, > unsigned int fragheaderlen, > struct sk_buff *skb, > - struct rt6_info *rt) > + struct rt6_info *rt, > + struct ipv6_pinfo *np) > { > if (!(rt->dst.flags & DST_XFRM_TUNNEL)) { > if (skb == NULL) { > @@ -1109,7 +1110,11 @@ static void ip6_append_data_mtu(int *mtu, > * this fragment is not first, the headers > * space is regarded as data space. > */ > - *mtu = dst_mtu(rt->dst.path); > + *mtu = (np->pmtudisc == IPV6_PMTUDISC_DO) ? > + rt->dst.dev->mtu : > + dst_mtu(rt->dst.path); > + if (np->frag_size && np->frag_size < *mtu) > + *mtu = np->frag_size; No need to pass ipv6_pinfo to ip6_append_data_mtu, the below is enough. *mtu = min(*mtu, (np->pmtudisc == IPV6_PMTUDISC_DO) ? rt->dst.dev->mtu : dst_mtu(rt->dst.path); This patch looks good go me. Acked-by: Gao feng <gaofeng@cn.fujitsu.com> Thanks! > } > *maxfraglen = ((*mtu - fragheaderlen) & ~7) > + fragheaderlen - sizeof(struct frag_hdr); > @@ -1287,8 +1292,7 @@ alloc_new_skb: > /* update mtu and maxfraglen if necessary */ > if (skb == NULL || skb_prev == NULL) > ip6_append_data_mtu(&mtu, &maxfraglen, > - fragheaderlen, skb, rt); > - > + fragheaderlen, skb, rt, np); > skb_prev = skb; > > /* > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 02, 2013 at 10:46:38AM +0800, Gao feng wrote: > On 07/02/2013 02:17 AM, Hannes Frederic Sowa wrote: > No need to pass ipv6_pinfo to ip6_append_data_mtu, the below is enough. > > *mtu = min(*mtu, (np->pmtudisc == IPV6_PMTUDISC_DO) ? > rt->dst.dev->mtu : > dst_mtu(rt->dst.path); > > This patch looks good go me. > > Acked-by: Gao feng <gaofeng@cn.fujitsu.com> > > Thanks! Thanks for your feedback, I will incooperate it in my next version of the patch. Actually, I seem to have done an error in the above code, i need to replace IPV6_PMTUDISC_DO with IPV6_PMTUDISC_WANT. Does your Acked-by still hold? Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 02, 2013 at 07:13:21AM +0200, Hannes Frederic Sowa wrote: > On Tue, Jul 02, 2013 at 10:46:38AM +0800, Gao feng wrote: > > On 07/02/2013 02:17 AM, Hannes Frederic Sowa wrote: > > No need to pass ipv6_pinfo to ip6_append_data_mtu, the below is enough. > > > > *mtu = min(*mtu, (np->pmtudisc == IPV6_PMTUDISC_DO) ? > > rt->dst.dev->mtu : > > dst_mtu(rt->dst.path); > > > > This patch looks good go me. > > > > Acked-by: Gao feng <gaofeng@cn.fujitsu.com> > > > > Thanks! > > Thanks for your feedback, I will incooperate it in my next version of > the patch. Actually, I seem to have done an error in the above code, > i need to replace IPV6_PMTUDISC_DO with IPV6_PMTUDISC_WANT. Does your > Acked-by still hold? Erk, err, IPV6_PMTUDISC_PROBE was what i meant. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/02/2013 01:18 PM, Hannes Frederic Sowa wrote: > On Tue, Jul 02, 2013 at 07:13:21AM +0200, Hannes Frederic Sowa wrote: >> On Tue, Jul 02, 2013 at 10:46:38AM +0800, Gao feng wrote: >>> On 07/02/2013 02:17 AM, Hannes Frederic Sowa wrote: >>> No need to pass ipv6_pinfo to ip6_append_data_mtu, the below is enough. >>> >>> *mtu = min(*mtu, (np->pmtudisc == IPV6_PMTUDISC_DO) ? >>> rt->dst.dev->mtu : >>> dst_mtu(rt->dst.path); >>> >>> This patch looks good go me. >>> >>> Acked-by: Gao feng <gaofeng@cn.fujitsu.com> >>> >>> Thanks! >> >> Thanks for your feedback, I will incooperate it in my next version of >> the patch. Actually, I seem to have done an error in the above code, >> i need to replace IPV6_PMTUDISC_DO with IPV6_PMTUDISC_WANT. Does your >> Acked-by still hold? > > Erk, err, IPV6_PMTUDISC_PROBE was what i meant. > > Yes, IPV6_PMTUDISC_PROBE is more reasonable. you can hold my ack. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index dae1949..7701735 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1097,7 +1097,8 @@ static void ip6_append_data_mtu(int *mtu, int *maxfraglen, unsigned int fragheaderlen, struct sk_buff *skb, - struct rt6_info *rt) + struct rt6_info *rt, + struct ipv6_pinfo *np) { if (!(rt->dst.flags & DST_XFRM_TUNNEL)) { if (skb == NULL) { @@ -1109,7 +1110,11 @@ static void ip6_append_data_mtu(int *mtu, * this fragment is not first, the headers * space is regarded as data space. */ - *mtu = dst_mtu(rt->dst.path); + *mtu = (np->pmtudisc == IPV6_PMTUDISC_DO) ? + rt->dst.dev->mtu : + dst_mtu(rt->dst.path); + if (np->frag_size && np->frag_size < *mtu) + *mtu = np->frag_size; } *maxfraglen = ((*mtu - fragheaderlen) & ~7) + fragheaderlen - sizeof(struct frag_hdr); @@ -1287,8 +1292,7 @@ alloc_new_skb: /* update mtu and maxfraglen if necessary */ if (skb == NULL || skb_prev == NULL) ip6_append_data_mtu(&mtu, &maxfraglen, - fragheaderlen, skb, rt); - + fragheaderlen, skb, rt, np); skb_prev = skb; /*
If the socket had an IPV6_MTU value set, ip6_append_data_mtu lost track of this when appending the second frame on a corked socket. This results in the following splat: [37598.993962] ------------[ cut here ]------------ [37598.994008] kernel BUG at net/core/skbuff.c:2064! [37598.994008] invalid opcode: 0000 [#1] SMP [37598.994008] Modules linked in: tcp_lp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media vfat fat usb_storage fuse ebtable_nat xt_CHECKSUM bridge stp llc ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat +nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi +scsi_transport_iscsi rfcomm bnep iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant arc4 iwldvm mac80211 snd_hda_intel acpi_cpufreq mperf coretemp snd_hda_codec microcode cdc_wdm cdc_acm [37598.994008] snd_hwdep cdc_ether snd_seq snd_seq_device usbnet mii joydev btusb snd_pcm bluetooth i2c_i801 e1000e lpc_ich mfd_core ptp iwlwifi pps_core snd_page_alloc mei cfg80211 snd_timer thinkpad_acpi snd tpm_tis soundcore rfkill tpm tpm_bios vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc +dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core wmi video [37598.994008] CPU 0 [37598.994008] Pid: 27320, comm: t2 Not tainted 3.9.6-200.fc18.x86_64 #1 LENOVO 27744PG/27744PG [37598.994008] RIP: 0010:[<ffffffff815443a5>] [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330 [37598.994008] RSP: 0018:ffff88003670da18 EFLAGS: 00010202 [37598.994008] RAX: ffff88018105c018 RBX: 0000000000000004 RCX: 00000000000006c0 [37598.994008] RDX: ffff88018105a6c0 RSI: ffff88018105a000 RDI: ffff8801e1b0aa00 [37598.994008] RBP: ffff88003670da78 R08: 0000000000000000 R09: ffff88018105c040 [37598.994008] R10: ffff8801e1b0aa00 R11: 0000000000000000 R12: 000000000000fff8 [37598.994008] R13: 00000000000004fc R14: 00000000ffff0504 R15: 0000000000000000 [37598.994008] FS: 00007f28eea59740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000 [37598.994008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [37598.994008] CR2: 0000003d935789e0 CR3: 00000000365cb000 CR4: 00000000000407f0 [37598.994008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [37598.994008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [37598.994008] Process t2 (pid: 27320, threadinfo ffff88003670c000, task ffff88022c162ee0) [37598.994008] Stack: [37598.994008] ffff88022e098a00 ffff88020f973fc0 0000000000000008 00000000000004c8 [37598.994008] ffff88020f973fc0 00000000000004c4 ffff88003670da78 ffff8801e1b0a200 [37598.994008] 0000000000000018 00000000000004c8 ffff88020f973fc0 00000000000004c4 [37598.994008] Call Trace: [37598.994008] [<ffffffff815fc21f>] ip6_append_data+0xccf/0xfe0 [37598.994008] [<ffffffff8158d9f0>] ? ip_copy_metadata+0x1a0/0x1a0 [37598.994008] [<ffffffff81661f66>] ? _raw_spin_lock_bh+0x16/0x40 [37598.994008] [<ffffffff8161548d>] udpv6_sendmsg+0x1ed/0xc10 [37598.994008] [<ffffffff812a2845>] ? sock_has_perm+0x75/0x90 [37598.994008] [<ffffffff815c3693>] inet_sendmsg+0x63/0xb0 [37598.994008] [<ffffffff812a2973>] ? selinux_socket_sendmsg+0x23/0x30 [37598.994008] [<ffffffff8153a450>] sock_sendmsg+0xb0/0xe0 [37598.994008] [<ffffffff810135d1>] ? __switch_to+0x181/0x4a0 [37598.994008] [<ffffffff8153d97d>] sys_sendto+0x12d/0x180 [37598.994008] [<ffffffff810dfb64>] ? __audit_syscall_entry+0x94/0xf0 [37598.994008] [<ffffffff81020ed1>] ? syscall_trace_enter+0x231/0x240 [37598.994008] [<ffffffff8166a7e7>] tracesys+0xdd/0xe2 [37598.994008] Code: fe 07 00 00 48 c7 c7 04 28 a6 81 89 45 a0 4c 89 4d b8 44 89 5d a8 e8 1b ac b1 ff 44 8b 5d a8 4c 8b 4d b8 8b 45 a0 e9 cf fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 48 [37598.994008] RIP [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330 [37598.994008] RSP <ffff88003670da18> [37599.007323] ---[ end trace d69f6a17f8ac8eee ]--- While there, also check if path mtu discovery is activated for this socket. The logic was adapted from ip6_append_data when first writing on the corked socket. This bug was introduced with commit 0c1833797a5a6ec23ea9261d979aa18078720b74 ("ipv6: fix incorrect ipsec fragment"). Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> --- net/ipv6/ip6_output.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)