Message ID | 20110307224338.GU11864@gospo.rdu.redhat.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
Le 07/03/2011 23:43, Andy Gospodarek a écrit : > On Mon, Mar 07, 2011 at 01:51:00PM +0100, Jiri Pirko wrote: >> Recent patch "bonding: move processing of recv handlers into >> handle_frame()" caused a regression on following net scheme: >> >> eth0 - bond0 - bond0.5 >> >> where arp monitoring is happening over vlan. This patch fixes it by >> reinjecting the arp packet into bonding slave device so the bonding >> rx_handler can pickup and process it. >> >> Signed-off-by: Jiri Pirko<jpirko@redhat.com> >> --- >> net/core/dev.c | 8 ++++---- >> 1 files changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/net/core/dev.c b/net/core/dev.c >> index c71bd18..3d88458 100644 >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -3094,12 +3094,12 @@ void netdev_rx_handler_unregister(struct net_device *dev) >> } >> EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister); >> >> -static void vlan_on_bond_hook(struct sk_buff *skb) >> +static void vlan_on_bond_hook(struct sk_buff *skb, struct net_device *orig_dev) >> { >> /* >> * Make sure ARP frames received on VLAN interfaces stacked on >> * bonding interfaces still make their way to any base bonding >> - * device that may have registered for a specific ptype. >> + * device by reinjecting the frame into bonding slave (orig_dev) >> */ >> if (skb->dev->priv_flags& IFF_802_1Q_VLAN&& >> vlan_dev_real_dev(skb->dev)->priv_flags& IFF_BONDING&& >> @@ -3108,7 +3108,7 @@ static void vlan_on_bond_hook(struct sk_buff *skb) >> >> if (!skb2) >> return; >> - skb2->dev = vlan_dev_real_dev(skb->dev); >> + skb2->dev = orig_dev; >> netif_rx(skb2); >> } >> } >> @@ -3202,7 +3202,7 @@ ncls: >> goto out; >> } >> >> - vlan_on_bond_hook(skb); >> + vlan_on_bond_hook(skb, orig_dev); >> >> /* deliver only exact match when indicated */ >> null_or_dev = deliver_exact ? skb->dev : NULL; > > This patch doesn't work. > > My setup has bond0.100 -> bond0 -> eth2 and eth3. ARP monitoring is > enabled as is arp_valiate. > > The initial problem was just that just before vlan_on_bond_hook is > called, skb->dev = bond0.100 and orig_dev = eth2. (This is after > running goto another_route and having been called back through > __netif_receive_skb since vlan_hwaccel_do_receive it true.) > > Now vlan_on_bond_hook is called and we have 2 skbs. > > The original skb still have skb->dev = bond0.100 and orig_dev = eth2. > Since bond_arp_rcv is registered for traffic only to bond0, the handler > is not hit and the frame is dropped (or processed by another handler). After Jiri's last patch series, bond_arp_rcv() is not registered anymore as a protocol handler on bond0, but directly called from inside bond_handle_frame(), through bond->recv_probe. Because bond_handler_frame() is a rx_handler for the slave interfaces, bond_arp_rcv() is now called at the slave level and not a the master level anymore. Hence this patch and the reason I thought it should work. Did you tested this patch with Jiri's previous patches applied before? > The cloned skb has skb->dev = bond0 and is put back on the receive queue > and comes back through __netif_receive_skb. This frame will match the > ptype entry for bond_arp_rcv, but since orig_dev = bond0 in this case, > the code in bond_arp_rcv will not handle the frame. I definitely hate all those unnecessary reinjects from rx_handler. The another_round loop is designed to allow for stacking inside __netif_receive_skb(). Jiri apparently has another (better) solution in mind. I hope to see it, but Jiri arguably want some of the patchs in the queue to flow before adding more. Does someone had a look at my proposal of a late_delivery property for packet_type, previously in this thread, to handle the situation where a given protocol handler registered on a particular device would like to receive the final skb instead of the one at the time it crossed that particular device? > > If we truly want to track the original interface that received the > frame, the following is a better option. With the recursive nature of > __netif_receive_skb at this point, we should really consider setting > orig_dev from skb_iif rather than just from skb->dev. Or we can remove all this orig_dev stuff... Nicolas. > diff --git a/net/core/dev.c b/net/core/dev.c > index 30440e7..500fdbc 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -3135,7 +3135,6 @@ static int __netif_receive_skb(struct sk_buff *skb) > > if (!skb->skb_iif) > skb->skb_iif = skb->dev->ifindex; > - orig_dev = skb->dev; > > skb_reset_network_header(skb); > skb_reset_transport_header(skb); > @@ -3145,6 +3144,7 @@ static int __netif_receive_skb(struct sk_buff *skb) > > rcu_read_lock(); > > + orig_dev = dev_get_by_index_rcu(dev_net(skb->dev),skb->skb_iif); > another_round: > > __this_cpu_inc(softnet_data.processed); > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 08, 2011 at 12:09:48AM +0100, Nicolas de Pesloüan wrote: > > After Jiri's last patch series, bond_arp_rcv() is not registered anymore > as a protocol handler on bond0, but directly called from inside > bond_handle_frame(), through bond->recv_probe. > > Because bond_handler_frame() is a rx_handler for the slave interfaces, > bond_arp_rcv() is now called at the slave level and not a the master > level anymore. > > Hence this patch and the reason I thought it should work. > > Did you tested this patch with Jiri's previous patches applied before? > No, I have not tested them yet. I looked at this problem first as Jiri's patch did not fix a regression in code that is actually in net-next right now. I don't have a problem with the work you and Jiri are doing on this latest patch series, but the frequency of patch revisions and now regressions concerns me. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le 08/03/2011 03:43, Andy Gospodarek a écrit : > On Tue, Mar 08, 2011 at 12:09:48AM +0100, Nicolas de Pesloüan wrote: >> >> After Jiri's last patch series, bond_arp_rcv() is not registered anymore >> as a protocol handler on bond0, but directly called from inside >> bond_handle_frame(), through bond->recv_probe. >> >> Because bond_handler_frame() is a rx_handler for the slave interfaces, >> bond_arp_rcv() is now called at the slave level and not a the master >> level anymore. >> >> Hence this patch and the reason I thought it should work. >> >> Did you tested this patch with Jiri's previous patches applied before? >> > > No, I have not tested them yet. I looked at this problem first as > Jiri's patch did not fix a regression in code that is actually in > net-next right now. For as far as I understand, the regression was introduced by Jiri's patches that are not yet accepted in net-next-2.6, so this path definitely need the full patch series to be applied before. And there is no special urgency to fix this regression, because it doesn't "exist" yet anywhere except in some sandboxes. This patch change the reinjection to inject at the slave level instead of the master level, because in Jiri's patch series, ARP handling moved from a normal protocol handler registered at the master level to the rx_handler at the slave level. Jiri, do I miss something ? > I don't have a problem with the work you and Jiri are doing on this > latest patch series, but the frequency of patch revisions and now > regressions concerns me. I agree with your concerns, and this is the reason why Jiri and I are working hard to try and fight every possible problems in the whole package of patch. I think that, at the end, Jiri and I will have to properly resubmit the whole thing, for the convenient of those who will have to decide to accept it or not. Nicolas. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/dev.c b/net/core/dev.c index 30440e7..500fdbc 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3135,7 +3135,6 @@ static int __netif_receive_skb(struct sk_buff *skb) if (!skb->skb_iif) skb->skb_iif = skb->dev->ifindex; - orig_dev = skb->dev; skb_reset_network_header(skb); skb_reset_transport_header(skb); @@ -3145,6 +3144,7 @@ static int __netif_receive_skb(struct sk_buff *skb) rcu_read_lock(); + orig_dev = dev_get_by_index_rcu(dev_net(skb->dev),skb->skb_iif); another_round: __this_cpu_inc(softnet_data.processed);