Message ID | 1547207513-22802-3-git-send-email-pieter.jansenvanvuuren@netronome.com |
---|---|
State | Changes Requested |
Headers | show |
Series | extend ovs-tc offload for more pedit action | expand |
On Fri, Jan 11, 2019 at 11:51:53AM +0000, Pieter Jansen van Vuuren wrote: > +/* These functions specifically help shifting words in network > + * byte order, given that they are specified in host order. */ > +static inline uint32_t > +shift_ovs_be32_left(uint32_t word, int shift) > +{ > + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) << shift; > + > + return ntohl((OVS_FORCE ovs_be32)word_shifted); > +} > + > +static inline uint32_t > +shift_ovs_be32_right(uint32_t word, int shift) > +{ > + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) >> shift; > + > + return ntohl((OVS_FORCE ovs_be32)word_shifted); > +} I don't understand how these new operations make sense. Can you explain?
On 11/01/2019 16:49, Ben Pfaff wrote: > On Fri, Jan 11, 2019 at 11:51:53AM +0000, Pieter Jansen van Vuuren wrote: >> +/* These functions specifically help shifting words in network >> + * byte order, given that they are specified in host order. */ >> +static inline uint32_t >> +shift_ovs_be32_left(uint32_t word, int shift) >> +{ >> + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) << shift; >> + >> + return ntohl((OVS_FORCE ovs_be32)word_shifted); >> +} >> + >> +static inline uint32_t >> +shift_ovs_be32_right(uint32_t word, int shift) >> +{ >> + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) >> shift; >> + >> + return ntohl((OVS_FORCE ovs_be32)word_shifted); >> +} > > I don't understand how these new operations make sense. Can you > explain? > Sure, Background: The original issue is that pedit performs set action operation per word, and the way we are describing the translation from OvS to TC is by using byte offsets(flower_pedit_map). However some fields like Traffic Class in IPv6 span 2 pedit-words - one nibble of the field is in one word and the other nibble is in another word. In order to accommodate this a shift may be used. We need to do this shift in network order thus we need the conversion from host to network, but the types for htonl/ntohl either takes as parameter or returns ovs_be32. Doing the shift operation on ovs_be32 results in the compiler degrades to integer: error: restricted ovs_be32 degrades to integer Passing that back to ntohl would result in: error: incorrect type in argument 1 (different base types) expected restricted ovs_be32 [usertype] x got unsigned int therefore we introduced these new operations. Hope this helps. I'm open to alternate approaches to this problem.
On Fri, Jan 11, 2019 at 05:35:45PM +0000, Pieter Jansen van Vuuren wrote: > On 11/01/2019 16:49, Ben Pfaff wrote: > > On Fri, Jan 11, 2019 at 11:51:53AM +0000, Pieter Jansen van Vuuren wrote: > >> +/* These functions specifically help shifting words in network > >> + * byte order, given that they are specified in host order. */ > >> +static inline uint32_t > >> +shift_ovs_be32_left(uint32_t word, int shift) > >> +{ > >> + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) << shift; > >> + > >> + return ntohl((OVS_FORCE ovs_be32)word_shifted); > >> +} > >> + > >> +static inline uint32_t > >> +shift_ovs_be32_right(uint32_t word, int shift) > >> +{ > >> + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) >> shift; > >> + > >> + return ntohl((OVS_FORCE ovs_be32)word_shifted); > >> +} > > > > I don't understand how these new operations make sense. Can you > > explain? > > > > Sure, > > Background: > The original issue is that pedit performs set action operation per word, and > the way we are describing the translation from OvS to TC is by using byte > offsets(flower_pedit_map). However some fields like Traffic Class in IPv6 span > 2 pedit-words - one nibble of the field is in one word and the other nibble is > in another word. In order to accommodate this a shift may be used. > > We need to do this shift in network order thus we need the conversion from host > to network, but the types for htonl/ntohl either takes as parameter or returns > ovs_be32. Doing the shift operation on ovs_be32 results in the compiler > degrades to integer: > > error: restricted ovs_be32 degrades to integer > > Passing that back to ntohl would result in: > > error: incorrect type in argument 1 (different base types) > expected restricted ovs_be32 [usertype] x > got unsigned int > > therefore we introduced these new operations. Hope this helps. I'm open to > alternate approaches to this problem. The part I don't understand is why doing a shift of a word in network byte order makes any sense. It will have different semantics depending on whether the host is little-endian or big-endian. Why is that desirable?
On 11/01/2019 18:02, Ben Pfaff wrote: > On Fri, Jan 11, 2019 at 05:35:45PM +0000, Pieter Jansen van Vuuren wrote: >> On 11/01/2019 16:49, Ben Pfaff wrote: >>> On Fri, Jan 11, 2019 at 11:51:53AM +0000, Pieter Jansen van Vuuren wrote: >>>> +/* These functions specifically help shifting words in network >>>> + * byte order, given that they are specified in host order. */ >>>> +static inline uint32_t >>>> +shift_ovs_be32_left(uint32_t word, int shift) >>>> +{ >>>> + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) << shift; >>>> + >>>> + return ntohl((OVS_FORCE ovs_be32)word_shifted); >>>> +} >>>> + >>>> +static inline uint32_t >>>> +shift_ovs_be32_right(uint32_t word, int shift) >>>> +{ >>>> + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) >> shift; >>>> + >>>> + return ntohl((OVS_FORCE ovs_be32)word_shifted); >>>> +} >>> >>> I don't understand how these new operations make sense. Can you >>> explain? >>> >> >> Sure, >> >> Background: >> The original issue is that pedit performs set action operation per word, and >> the way we are describing the translation from OvS to TC is by using byte >> offsets(flower_pedit_map). However some fields like Traffic Class in IPv6 span >> 2 pedit-words - one nibble of the field is in one word and the other nibble is >> in another word. In order to accommodate this a shift may be used. >> >> We need to do this shift in network order thus we need the conversion from host >> to network, but the types for htonl/ntohl either takes as parameter or returns >> ovs_be32. Doing the shift operation on ovs_be32 results in the compiler >> degrades to integer: >> >> error: restricted ovs_be32 degrades to integer >> >> Passing that back to ntohl would result in: >> >> error: incorrect type in argument 1 (different base types) >> expected restricted ovs_be32 [usertype] x >> got unsigned int >> >> therefore we introduced these new operations. Hope this helps. I'm open to >> alternate approaches to this problem. > > The part I don't understand is why doing a shift of a word in network > byte order makes any sense. It will have different semantics depending > on whether the host is little-endian or big-endian. Why is that > desirable? > Oh, yes that makes sense. We would not want different semantics for little/big endian machines. One confusing aspect of the code is that for historical reasons the pedit deals with values that are in network byte order using uint32_t variables. So I think in this case the comment is incorrect and the usage of htonl/ntohl is inverted. What we actually want is a shift of a value stored in network byte order, and we want that shift done in host byte order (to avoid the problem you describe). In terms of run-time the result would be the same as the current patch. But it would be documented correctly. As a follow-up documentation of the byte order used by pedit would likely be a good idea.
diff --git a/lib/byte-order.h b/lib/byte-order.h index 66d29a2b3..7bbadb24e 100644 --- a/lib/byte-order.h +++ b/lib/byte-order.h @@ -116,6 +116,24 @@ bytes_to_be32(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3) return (OVS_FORCE ovs_be32) x; } +/* These functions specifically help shifting words in network + * byte order, given that they are specified in host order. */ +static inline uint32_t +shift_ovs_be32_left(uint32_t word, int shift) +{ + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) << shift; + + return ntohl((OVS_FORCE ovs_be32)word_shifted); +} + +static inline uint32_t +shift_ovs_be32_right(uint32_t word, int shift) +{ + uint32_t word_shifted = (OVS_FORCE uint32_t)htonl(word) >> shift; + + return ntohl((OVS_FORCE ovs_be32)word_shifted); +} + /* These functions zero-extend big-endian values to longer ones, * or truncate long big-endian value to shorter ones. */ #ifndef __CHECKER__ diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c index 90bd3c585..1c5bd22e4 100644 --- a/lib/netdev-tc-offloads.c +++ b/lib/netdev-tc-offloads.c @@ -84,6 +84,10 @@ static struct netlink_field set_flower_map[][4] = { offsetof(struct tc_flower_key, ipv6.rewrite_hlimit), MEMBER_SIZEOF(struct tc_flower_key, ipv6.rewrite_hlimit) }, + { offsetof(struct ovs_key_ipv6, ipv6_tclass), + offsetof(struct tc_flower_key, ipv6.rewrite_tclass), + MEMBER_SIZEOF(struct tc_flower_key, ipv6.rewrite_tclass) + }, }, [OVS_KEY_ATTR_ETHERNET] = { { offsetof(struct ovs_key_ethernet, eth_src), diff --git a/lib/tc.c b/lib/tc.c index 1696c9836..f93814fa3 100644 --- a/lib/tc.c +++ b/lib/tc.c @@ -73,6 +73,7 @@ struct flower_key_to_pedit { int offset; int flower_offset; int size; + int boundary_shift; }; static struct flower_key_to_pedit flower_pedit_map[] = { @@ -80,72 +81,92 @@ static struct flower_key_to_pedit flower_pedit_map[] = { TCA_PEDIT_KEY_EX_HDR_TYPE_IP4, 12, offsetof(struct tc_flower_key, ipv4.ipv4_src), - MEMBER_SIZEOF(struct tc_flower_key, ipv4.ipv4_src) + MEMBER_SIZEOF(struct tc_flower_key, ipv4.ipv4_src), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_IP4, 16, offsetof(struct tc_flower_key, ipv4.ipv4_dst), - MEMBER_SIZEOF(struct tc_flower_key, ipv4.ipv4_dst) + MEMBER_SIZEOF(struct tc_flower_key, ipv4.ipv4_dst), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_IP4, 8, offsetof(struct tc_flower_key, ipv4.rewrite_ttl), - MEMBER_SIZEOF(struct tc_flower_key, ipv4.rewrite_ttl) + MEMBER_SIZEOF(struct tc_flower_key, ipv4.rewrite_ttl), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_IP4, 1, offsetof(struct tc_flower_key, ipv4.rewrite_tos), - MEMBER_SIZEOF(struct tc_flower_key, ipv4.rewrite_tos) + MEMBER_SIZEOF(struct tc_flower_key, ipv4.rewrite_tos), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_IP6, 7, offsetof(struct tc_flower_key, ipv6.rewrite_hlimit), - MEMBER_SIZEOF(struct tc_flower_key, ipv6.rewrite_hlimit) + MEMBER_SIZEOF(struct tc_flower_key, ipv6.rewrite_hlimit), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_IP6, 8, offsetof(struct tc_flower_key, ipv6.ipv6_src), - MEMBER_SIZEOF(struct tc_flower_key, ipv6.ipv6_src) + MEMBER_SIZEOF(struct tc_flower_key, ipv6.ipv6_src), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_IP6, 24, offsetof(struct tc_flower_key, ipv6.ipv6_dst), - MEMBER_SIZEOF(struct tc_flower_key, ipv6.ipv6_dst) + MEMBER_SIZEOF(struct tc_flower_key, ipv6.ipv6_dst), + 0 + }, { + TCA_PEDIT_KEY_EX_HDR_TYPE_IP6, + 0, + offsetof(struct tc_flower_key, ipv6.rewrite_tclass), + MEMBER_SIZEOF(struct tc_flower_key, ipv6.rewrite_tclass), + 4 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_ETH, 6, offsetof(struct tc_flower_key, src_mac), - MEMBER_SIZEOF(struct tc_flower_key, src_mac) + MEMBER_SIZEOF(struct tc_flower_key, src_mac), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_ETH, 0, offsetof(struct tc_flower_key, dst_mac), - MEMBER_SIZEOF(struct tc_flower_key, dst_mac) + MEMBER_SIZEOF(struct tc_flower_key, dst_mac), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_ETH, 12, offsetof(struct tc_flower_key, eth_type), - MEMBER_SIZEOF(struct tc_flower_key, eth_type) + MEMBER_SIZEOF(struct tc_flower_key, eth_type), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_TCP, 0, offsetof(struct tc_flower_key, tcp_src), - MEMBER_SIZEOF(struct tc_flower_key, tcp_src) + MEMBER_SIZEOF(struct tc_flower_key, tcp_src), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_TCP, 2, offsetof(struct tc_flower_key, tcp_dst), - MEMBER_SIZEOF(struct tc_flower_key, tcp_dst) + MEMBER_SIZEOF(struct tc_flower_key, tcp_dst), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_UDP, 0, offsetof(struct tc_flower_key, udp_src), - MEMBER_SIZEOF(struct tc_flower_key, udp_src) + MEMBER_SIZEOF(struct tc_flower_key, udp_src), + 0 }, { TCA_PEDIT_KEY_EX_HDR_TYPE_UDP, 2, offsetof(struct tc_flower_key, udp_dst), - MEMBER_SIZEOF(struct tc_flower_key, udp_dst) + MEMBER_SIZEOF(struct tc_flower_key, udp_dst), + 0 }, }; @@ -832,8 +853,11 @@ nl_parse_act_pedit(struct nlattr *options, struct tc_flower *flower) int diff = flower_off + (keys->off - mf); uint32_t *dst = (void *) (rewrite_key + diff); uint32_t *dst_m = (void *) (rewrite_mask + diff); - uint32_t mask = ~(keys->mask); - uint32_t zero_bits; + uint32_t mask_word, data_word, mask, zero_bits; + + mask_word = shift_ovs_be32_left(keys->mask, m->boundary_shift); + data_word = shift_ovs_be32_left(keys->val, m->boundary_shift); + mask = ~(mask_word); if (keys->off < mf) { zero_bits = 8 * (mf - keys->off); @@ -844,7 +868,7 @@ nl_parse_act_pedit(struct nlattr *options, struct tc_flower *flower) } *dst_m |= mask; - *dst |= keys->val & mask; + *dst |= data_word & mask; } } @@ -1832,6 +1856,7 @@ nl_msg_put_flower_rewrite_pedits(struct ofpbuf *request, for (j = 0; j < cnt; j++, mask++, data++, cur_offset += 4) { uint32_t mask_word = *mask; + uint32_t data_word = *data; if (j == 0) { mask_word &= first_word_mask; @@ -1853,8 +1878,10 @@ nl_msg_put_flower_rewrite_pedits(struct ofpbuf *request, pedit_key_ex->cmd = TCA_PEDIT_KEY_EX_CMD_SET; pedit_key_ex->htype = m->htype; pedit_key->off = cur_offset; + mask_word = shift_ovs_be32_right(mask_word, m->boundary_shift); + data_word = shift_ovs_be32_right(data_word, m->boundary_shift); pedit_key->mask = ~mask_word; - pedit_key->val = *data & mask_word; + pedit_key->val = data_word & mask_word; sel.sel.nkeys++; err = csum_update_flag(flower, m->htype); diff --git a/lib/tc.h b/lib/tc.h index 04b08e298..6c909df21 100644 --- a/lib/tc.h +++ b/lib/tc.h @@ -109,6 +109,7 @@ struct tc_flower_key { struct in6_addr ipv6_src; struct in6_addr ipv6_dst; uint8_t rewrite_hlimit; + uint8_t rewrite_tclass; } ipv6; struct {