bridge: Superfluous skb->nfct check in br_nf_dev_queue_xmit

Message ID 5360BA5A.7020200@parallels.com
State Superseded

Commit Message

Vasily Averin April 30, 2014, 8:54 a.m. UTC
Currently the bridge can silently drop IPv4 fragments.
If a node has the nf_defrag_ipv4 module loaded but not nf_conntrack_ipv4,
br_nf_pre_routing defragments incoming IPv4 fragments,
but the nfct check in br_nf_dev_queue_xmit does not allow the combined packet to be re-fragmented,
so it is dropped in br_dev_queue_push_xmit without incrementing any failure counters.

Signed-off-by: Vasily Averin <vvs@openvz.org>
---
 net/bridge/br_netfilter.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Pablo Neira Ayuso April 30, 2014, 9:39 a.m. UTC | #1
On Wed, Apr 30, 2014 at 12:54:50PM +0400, Vasily Averin wrote:
> Currently the bridge can silently drop IPv4 fragments.
> If a node has the nf_defrag_ipv4 module loaded but not nf_conntrack_ipv4,
> br_nf_pre_routing defragments incoming IPv4 fragments,
> but the nfct check in br_nf_dev_queue_xmit does not allow the combined packet to be re-fragmented,
> so it is dropped in br_dev_queue_push_xmit without incrementing any failure counters.

Patrick already mentioned that bridges should not defragment unless
conntrack is enabled.

Please, see: http://marc.info/?l=netfilter-devel&m=139878065822267&w=2

I think we have to consider some alternative way to fix what you
report.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal April 30, 2014, 10:02 a.m. UTC | #2
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Wed, Apr 30, 2014 at 12:54:50PM +0400, Vasily Averin wrote:
> > Currently the bridge can silently drop IPv4 fragments.
> > If a node has the nf_defrag_ipv4 module loaded but not nf_conntrack_ipv4,
> > br_nf_pre_routing defragments incoming IPv4 fragments,
> > but the nfct check in br_nf_dev_queue_xmit does not allow the combined packet to be re-fragmented,
> > so it is dropped in br_dev_queue_push_xmit without incrementing any failure counters.
> 
> Patrick already mentioned that bridges should not defragment unless
> conntrack is enabled.
> 
> Please, see: http://marc.info/?l=netfilter-devel&m=139878065822267&w=2
> 
> I think we have to consider some alternative way to fix what you
> report.

Could you explain how br_nf_dev_queue_xmit could encounter
an IP packet that doesn't fit the link MTU without netfilter
defrag?

Because I do not see any way of this happening, all bridge
ports must have the same MTU?
Pablo Neira Ayuso May 4, 2014, 12:54 p.m. UTC | #3
On Wed, Apr 30, 2014 at 12:54:50PM +0400, Vasily Averin wrote:
> Currently the bridge can silently drop IPv4 fragments.
> If a node has the nf_defrag_ipv4 module loaded but not nf_conntrack_ipv4,
> br_nf_pre_routing defragments incoming IPv4 fragments,
> but the nfct check in br_nf_dev_queue_xmit does not allow the combined packet to be re-fragmented,
> so it is dropped in br_dev_queue_push_xmit without incrementing any failure counters.

If no further objections, I'll push this original patch appending this comment
to the description:

[ It seems the only way to hit the ip_fragment code in the bridge xmit
  path is to have a fragment list whose reassembled fragments go over
  the mtu. This only happens if nf_defrag is enabled. Thanks to
  Florian Westphal for providing feedback to clarify this. ]
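
The effect of dropping the nfct test can be illustrated with a small userspace model of the condition in br_nf_dev_queue_xmit. This is only a sketch: struct pkt and both helper functions are hypothetical stand-ins and not kernel code, and the len field is assumed to already include the value of nf_bridge_mtu_reduction(skb).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the sk_buff fields the check reads. */
struct pkt {
	const void *nfct;   /* models skb->nfct; NULL when conntrack is not in use */
	bool is_ipv4;       /* models skb->protocol == htons(ETH_P_IP) */
	unsigned int len;   /* models skb->len + nf_bridge_mtu_reduction(skb) */
	unsigned int mtu;   /* models skb->dev->mtu */
	bool is_gso;        /* models skb_is_gso(skb) */
};

/* Condition before the patch: refragment only if conntrack touched the packet. */
static bool needs_refrag_old(const struct pkt *p)
{
	return p->nfct != NULL && p->is_ipv4 && p->len > p->mtu && !p->is_gso;
}

/* Condition after the patch: any oversized non-GSO IPv4 packet is refragmented. */
static bool needs_refrag_new(const struct pkt *p)
{
	return p->is_ipv4 && p->len > p->mtu && !p->is_gso;
}
```

For a reassembled 2980-byte IPv4 packet on a 1500-byte MTU port with nfct == NULL (nf_defrag_ipv4 loaded, no conntrack), needs_refrag_old() is false, which corresponds to the silent drop described in the report, while needs_refrag_new() is true.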
Vasily Averin May 4, 2014, 7:04 p.m. UTC | #4
On 05/04/2014 04:54 PM, Pablo Neira Ayuso wrote:
> On Wed, Apr 30, 2014 at 12:54:50PM +0400, Vasily Averin wrote:
>> Currently the bridge can silently drop IPv4 fragments.
>> If a node has the nf_defrag_ipv4 module loaded but not nf_conntrack_ipv4,
>> br_nf_pre_routing defragments incoming IPv4 fragments,
>> but the nfct check in br_nf_dev_queue_xmit does not allow the combined packet to be re-fragmented,
>> so it is dropped in br_dev_queue_push_xmit without incrementing any failure counters.
> 
> If no further objections, I'll push this original patch appending this comment
> to the description:
> 
> [ It seems the only way to hit the ip_fragment code in the bridge xmit
>   path is to have a fragment list whose reassembled fragments go over
>   the mtu. This only happens if nf_defrag is enabled. Thanks to
>   Florian Westphal for providing feedback to clarify this. ]

I have no objections; however, I still do not understand why #if IS_ENABLED(CONFIG_NF_CONNTRACK_IPV4)
is required in br_dev_queue_push_xmit()?

If IPv4 defragmentation is required not only for conntrack but also for the TPROXY target and the xt_socket match,
I think we need to use NF_DEFRAG_IPV4 instead.
Pablo Neira Ayuso May 4, 2014, 7:25 p.m. UTC | #5
On Sun, May 04, 2014 at 11:04:29PM +0400, Vasily Averin wrote:
> On 05/04/2014 04:54 PM, Pablo Neira Ayuso wrote:
> > On Wed, Apr 30, 2014 at 12:54:50PM +0400, Vasily Averin wrote:
> >> Currently the bridge can silently drop IPv4 fragments.
> >> If a node has the nf_defrag_ipv4 module loaded but not nf_conntrack_ipv4,
> >> br_nf_pre_routing defragments incoming IPv4 fragments,
> >> but the nfct check in br_nf_dev_queue_xmit does not allow the combined packet to be re-fragmented,
> >> so it is dropped in br_dev_queue_push_xmit without incrementing any failure counters.
> > 
> > If no further objections, I'll push this original patch appending this comment
> > to the description:
> > 
> > [ It seems the only way to hit the ip_fragment code in the bridge xmit
> >   path is to have a fragment list whose reassembled fragments go over
> >   the mtu. This only happens if nf_defrag is enabled. Thanks to
> >   Florian Westphal for providing feedback to clarify this. ]
> 
> I have no objections; however, I still do not understand why #if
> IS_ENABLED(CONFIG_NF_CONNTRACK_IPV4) is required in
> br_dev_queue_push_xmit()?
> 
> If IPv4 defragmentation is required not only for conntrack but also
> for the TPROXY target and the xt_socket match, I think we need to use
> NF_DEFRAG_IPV4 instead.

Before your patch, this was checking for skb->nfct which is defined by

#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)

in include/linux/skbuff.h.

But after removing that skb->nfct check, we can safely change it to
CONFIG_NF_DEFRAG_IPV4.

You can send me a new patch version including this change.
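
As a rough illustration of that guard change, the sketch below compares the two compile-time conditions in userspace. It is not the kernel source: the helper names are hypothetical, the CONFIG_ macro is defined by hand here (normally Kconfig generates it), and defined(X) || defined(X_MODULE) is spelled out in place of IS_ENABLED(X), in the same form as the skbuff.h condition quoted above.

```c
#include <stdbool.h>

/*
 * Assumption for this sketch: nf_defrag_ipv4 is built as a module and
 * IPv4 conntrack is not built at all.
 */
#define CONFIG_NF_DEFRAG_IPV4_MODULE 1

/* Guard as discussed above: code is built only when IPv4 conntrack is. */
static bool refrag_compiled_in_old(void)
{
#if defined(CONFIG_NF_CONNTRACK_IPV4) || defined(CONFIG_NF_CONNTRACK_IPV4_MODULE)
	return true;
#else
	return false;
#endif
}

/* Proposed guard: code is built whenever IPv4 defragmentation is. */
static bool refrag_compiled_in_new(void)
{
#if defined(CONFIG_NF_DEFRAG_IPV4) || defined(CONFIG_NF_DEFRAG_IPV4_MODULE)
	return true;
#else
	return false;
#endif
}
```

With the configuration assumed here, refrag_compiled_in_old() returns false and refrag_compiled_in_new() returns true, so only the NF_DEFRAG_IPV4 guard would keep the refragmentation path available to users such as TPROXY and xt_socket.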
Patch

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 80e1b0f..6a8407c 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -864,7 +864,7 @@  static int br_nf_dev_queue_xmit(struct sk_buff *skb)
 {
 	int ret;
 
-	if (skb->nfct != NULL && skb->protocol == htons(ETH_P_IP) &&
+	if (skb->protocol == htons(ETH_P_IP) &&
 	    skb->len + nf_bridge_mtu_reduction(skb) > skb->dev->mtu &&
 	    !skb_is_gso(skb)) {
 		if (br_parse_ip_options(skb))