diff mbox series

[net-next] flow_dissector: Add IPv6 flow label to symmetric keys

Message ID 20200804155642.52766-1-tom@herbertland.com
State Changes Requested
Delegated to: David Miller
Headers show
Series [net-next] flow_dissector: Add IPv6 flow label to symmetric keys | expand

Commit Message

Tom Herbert Aug. 4, 2020, 3:56 p.m. UTC
The definition for symmetric keys does not include the flow label so
that when symmetric keys is used a non-zero IPv6 flow label is not
extracted. Symmetric keys are used in functions to compute the flow
hash for packets, and these functions also set "stop at flow label".
The upshot is that for IPv6 packets with a non-zero flow label, hashes
are only based on the address two tuple and there is no input entropy
from transport layer information. This patch fixes this bug.
---
 net/core/flow_dissector.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

David Miller Aug. 4, 2020, 11:12 p.m. UTC | #1
Kernel patch submissions must have proper signoffs.
Willem de Bruijn Aug. 5, 2020, 8:26 a.m. UTC | #2
On Tue, Aug 4, 2020 at 5:57 PM Tom Herbert <tom@herbertland.com> wrote:
>
> The definition for symmetric keys does not include the flow label so
> that when symmetric keys is used a non-zero IPv6 flow label is not
> extracted. Symmetric keys are used in functions to compute the flow
> hash for packets, and these functions also set "stop at flow label".
> The upshot is that for IPv6 packets with a non-zero flow label, hashes
> are only based on the address two tuple and there is no input entropy
> from transport layer information. This patch fixes this bug.

If this is a bug fix, it should probably target net and have a Fixes tag.

Should the actual fix be to remove the
FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL argument from
__skb_get_hash_symmetric?

The original commit mentions the symmetric key flow dissector to
compute a flat symmetric hash over only the protocol, addresses and
ports.

Autoflowlabel uses symmetric __get_hash_from_flowi6 to derive a flow
label, but this cannot be generally relied on. RFC 6437 suggests
even a PRNG as input, for instance.
Tom Herbert Aug. 17, 2020, 2:44 p.m. UTC | #3
On Wed, Aug 5, 2020 at 1:27 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Tue, Aug 4, 2020 at 5:57 PM Tom Herbert <tom@herbertland.com> wrote:
> >
> > The definition for symmetric keys does not include the flow label so
> > that when symmetric keys is used a non-zero IPv6 flow label is not
> > extracted. Symmetric keys are used in functions to compute the flow
> > hash for packets, and these functions also set "stop at flow label".
> > The upshot is that for IPv6 packets with a non-zero flow label, hashes
> > are only based on the address two tuple and there is no input entropy
> > from transport layer information. This patch fixes this bug.
>
> If this is a bug fix, it should probably target net and have a Fixes tag.
>
> Should the actual fix be to remove the
> FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL argument from
> __skb_get_hash_symmetric?
>
> The original commit mentions the symmetric key flow dissector to
> compute a flat symmetric hash over only the protocol, addresses and
> ports.
>
> Autoflowlabel uses symmetric __get_hash_from_flowi6 to derive a flow
> label, but this cannot be generally relied on. RFC 6437 suggests
> even a PRNG as input, for instance.

Willem,

Yes, the "reliability" of flow labels argument is brought up often
especially by routing vendors that seemingly want to continue doing
their expensive DPI. I'm not exactly what "reliability" means in this
context. I think it refers to the fact that some network devices might
change the flow label in the middle of a flow (AFAIK we squashed all
the instances in Linux stack where flow labels can change mid flow).
In reality, our ability to do "reliable" DPI is dwindling anyway due
to crypto, UDP encapsulation, and transports like QUIC that don't
identify flows solely based 4-tuple.

In any case, IMO the flow label should be part of the input entropy to
the hash even if we do DPI. It's sufficient entropy along with the
IPv6 addresses to produce a flow hash with conformant properties in
more cases than DPI alone can provide due to the limitations above.
One obvious problem with DPI is cost, especially in cases we need to
parse over a bunch of headers just to locate transport ports (i.e.
potential DOS). If the cost is shown negligible or there's other
benefits then maybe we could eliminate
FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL.

Tom
Willem de Bruijn Aug. 17, 2020, 3:35 p.m. UTC | #4
On Mon, Aug 17, 2020 at 4:44 PM Tom Herbert <tom@herbertland.com> wrote:
>
> On Wed, Aug 5, 2020 at 1:27 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > On Tue, Aug 4, 2020 at 5:57 PM Tom Herbert <tom@herbertland.com> wrote:
> > >
> > > The definition for symmetric keys does not include the flow label so
> > > that when symmetric keys is used a non-zero IPv6 flow label is not
> > > extracted. Symmetric keys are used in functions to compute the flow
> > > hash for packets, and these functions also set "stop at flow label".
> > > The upshot is that for IPv6 packets with a non-zero flow label, hashes
> > > are only based on the address two tuple and there is no input entropy
> > > from transport layer information. This patch fixes this bug.
> >
> > If this is a bug fix, it should probably target net and have a Fixes tag.
> >
> > Should the actual fix be to remove the
> > FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL argument from
> > __skb_get_hash_symmetric?
> >
> > The original commit mentions the symmetric key flow dissector to
> > compute a flat symmetric hash over only the protocol, addresses and
> > ports.
> >
> > Autoflowlabel uses symmetric __get_hash_from_flowi6 to derive a flow
> > label, but this cannot be generally relied on. RFC 6437 suggests
> > even a PRNG as input, for instance.
>
> Willem,
>
> Yes, the "reliability" of flow labels argument is brought up often
> especially by routing vendors that seemingly want to continue doing
> their expensive DPI. I'm not exactly what "reliability" means in this
> context. I think it refers to the fact that some network devices might
> change the flow label in the middle of a flow (AFAIK we squashed all
> the instances in Linux stack where flow labels can change mid flow).
> In reality, our ability to do "reliable" DPI is dwindling anyway due
> to crypto, UDP encapsulation, and transports like QUIC that don't
> identify flows solely based 4-tuple.
>
> In any case, IMO the flow label should be part of the input entropy to
> the hash even if we do DPI. It's sufficient entropy along with the
> IPv6 addresses to produce a flow hash with conformant properties in
> more cases than DPI alone can provide due to the limitations above.
> One obvious problem with DPI is cost, especially in cases we need to
> parse over a bunch of headers just to locate transport ports (i.e.
> potential DOS). If the cost is shown negligible or there's other
> benefits then maybe we could eliminate
> FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL.

Yes, I suppose it's fair to assume that if the flowlabel is set, this
sufficiently identifies the flow.

The existing code that breaks out of dissection with
FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL with only network layer addresses
as flowhash input is clearly undesirable. Sounds good to fix it by
using the flowlabel instead of explicit fields.

Then this should go to net with a proper Fixes, or did you have a
specific reason to target net-next?
diff mbox series

Patch

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 29806eb765cf..d72e13d125fb 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1765,6 +1765,10 @@  static const struct flow_dissector_key flow_keys_dissector_symmetric_keys[] = {
 		.key_id = FLOW_DISSECTOR_KEY_PORTS,
 		.offset = offsetof(struct flow_keys, ports),
 	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_FLOW_LABEL,
+		.offset = offsetof(struct flow_keys, tags),
+	},
 };
 
 static const struct flow_dissector_key flow_keys_basic_dissector_keys[] = {