[ovs-dev] Reply: [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath

Message ID 7f2873367602497fb9f00c1df6b30b8f@inspur.com
State Superseded

Commit Message

Yi Yang (杨燚)-云服务集团 March 13, 2020, 1:04 a.m. UTC
Hi Ben, could you double-check whether include/linux/if_packet.h is necessary in OVS?

William, let me paste Ben's comments here. The original comments are at https://mail.openvswitch.org/pipermail/ovs-dev/2020-January/367085.html.

In addition, given that tpacket_v3 doesn't perform so well currently, I agree we needn't make it the default implementation. I think we can add an other-config option, userspace-use-tpacket-v3=true|false, and set it to false by default. What do you think? If you agree, I will change it this way in the next version.

"""
Thanks for the patch!

I am a bit concerned about version compatibility issues here.  There are
two relevant kinds of versions.  The first is the version of the
kernel/library headers.  This patch works pretty hard to adapt to the
headers that are available at compile time, only dealing with the
versions of the protocols that are available from the headers.  This
approach is sometimes fine, but an approach that can be better is to simply
declare the structures or constants that the headers lack.  This is
often pretty easy for Linux data structures.  OVS does this for some
structures that it cares about with the headers in ovs/include/linux.
This approach has two advantages: the OVS code (outside these special
declarations) doesn't have to care whether particular structures are
declared, because they are always declared, and the OVS build always
supports a particular feature regardless of the headers of the system on
which it was built.

The second kind of version is the version of the system that OVS runs
on.  Unless a given feature is one that is supported by every version
that OVS cares about, OVS needs to test at runtime whether the feature
is supported and, if not, fall back to the older feature.  I don't see
that in this code.  Instead, it looks to me like it assumes that if the
feature was available at build time, then it is available at runtime.
This is not a good way to do things, since we want people to be able to
get builds from distributors such as Red Hat or Debian and then run
those builds on a diverse collection of kernels.

One specific comment I have here is that, in acinclude.m4, it would be
better to use AC_CHECK_TYPE or AC_CHECK_TYPES than OVS_GREP_IFELSE.
The latter is for testing for kernel builds only; we can't use the
normal AC_* tests for those because we often can't successfully build
kernel headers using the compiler and flags that Autoconf sets up for
building OVS.

Thanks,
"""
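
The runtime check asked for in the review above could look roughly like the sketch below. It is an illustration only, not code from the patch: choose_rx_path(), enum rx_path and SKETCH_TPACKET_V3 are hypothetical names, while PACKET_VERSION and SOL_PACKET are the usual Linux packet-socket option names. The idea is to ask the running kernel for TPACKET_V3 and to fall back to the recvmmsg path if the request is refused.

```c
#include <sys/socket.h>        /* setsockopt(), SOL_PACKET on most libcs */
#include <linux/if_packet.h>   /* PACKET_VERSION */

#ifndef SOL_PACKET
#define SOL_PACKET 263         /* Linux value, in case libc does not define it */
#endif

/* Receive paths discussed in this thread; the enum itself is illustrative. */
enum rx_path { RX_PATH_RECVMMSG, RX_PATH_TPACKET_V3 };

/* Numeric value of TPACKET_V3 in the kernel's enum tpacket_versions
 * (TPACKET_V1 = 0, TPACKET_V2 = 1, TPACKET_V3 = 2).  Spelled out here so
 * the sketch still compiles against headers that predate TPACKET_V3. */
#define SKETCH_TPACKET_V3 2

/* Probes the running kernel through 'sock' (expected to be an AF_PACKET
 * socket).  Any failure, including an invalid descriptor, selects the
 * recvmmsg fallback instead of aborting. */
enum rx_path
choose_rx_path(int sock)
{
    int version = SKETCH_TPACKET_V3;

    if (setsockopt(sock, SOL_PACKET, PACKET_VERSION,
                   &version, sizeof version) == 0) {
        return RX_PATH_TPACKET_V3;
    }
    return RX_PATH_RECVMMSG;
}
```

A caller would run this once per socket at open time and remember the result, so that a binary built on one system still behaves sensibly on a kernel without the feature.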

As I understand it, Ben meant that a build system (probably not Linux, so it doesn't have include/linux/if_packet.h) should still be able to build the tpacket_v3 code, so that the resulting binary can work on a Linux system that has the tpacket_v3 feature. That is Ben's point, and it is why he wanted me to add include/linux/if_packet.h to the ovs repo.

Ben, could you double-check whether include/linux/if_packet.h is necessary in OVS?

-----Original Message-----
From: William Tu [mailto:u9012063@gmail.com]
Sent: March 13, 2020 2:35
To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
Cc: yang_y_yi@163.com; ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath

On Wed, Mar 11, 2020 at 6:14 PM Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com> wrote:
>
> > >
> > > TPACKET_V3 can support TSO, but its performance isn't good because 
> > > of
> > > TPACKET_V3 kernel implementation issue, so it falls back to
> >
> > What's the implementation issue? If we use latest kernel, does the 
> > issue still exist?
> >
> > [Yi Yang] From what I can see, the issue is that the kernel can't feed
> > enough packets to tpacket_recv, so in many cases no packets are
> > received or fewer than 32 are available. In the original non-tpacket
> > case, one recv gets 32 packets most of the time, and throughput is
> > more than twice as high for veth and more than three times as high
> > for tap. I read the kernel source code but couldn't find the root
> > cause; I'll check with the tpacket maintainer.
> >
> > > recvmmsg in case userspace-tso-enable is set to true, but its 
> > > performance is better than recvmmsg in case userspace-tso-enable 
> > > is set to false, so just use TPACKET_V3 in that case.
> > >
> > > Signed-off-by: Yi Yang <yangyi01@inspur.com>
> > > Co-authored-by: William Tu <u9012063@gmail.com>
> > > Signed-off-by: William Tu <u9012063@gmail.com>
> > > ---
> > > diff --git a/include/linux/if_packet.h b/include/linux/if_packet.h 
> > > new file mode 100644 index 0000000..e20aacc
> > > --- /dev/null
> > > +++ b/include/linux/if_packet.h
> >
> > if OVS_CHECK_LINUX_TPACKET returns false, can we simply fall back to 
> > recvmmsg?
> > So this is not needed?
> >
> > [Yi Yang] As you said, OVS supports Linux kernel 3.10.0 or above, so
> > that case can't occur, can it?
>
> I mean if kernel supports it AND if_packet.h header exists, then we enable it.
> If kernel supports it AND if_packet.h header does not exist, then just use recvmmsg.
>
> [Yi Yang] I'm confused here. Ben told me it should be built even if if_packet.h isn't there, which is why I added include/linux/if_packet.h. I mean the tpacket_v3 code should be built in this case.
>

My concern is that since there is not a lot of performance improvement, we don't necessarily need to use tpacket_v3. Or we should make tpacket_v3 an optional configuration, but not the default.

I removed the if_linux.h in the following diff, and travis works ok.
https://travis-ci.org/github/williamtu/ovs-travis/builds/661631098

---

Comments

Ben Pfaff March 13, 2020, 3:56 p.m. UTC | #1
On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Per my understanding, Ben meant a build system (which isn't Linux
> probably, it doesn't have include/linux/if_packet.h) should be able to
> build tpacket_v3 code in order that built-out binary can work on Linux
> system with tpacket_v3 feature, this is Ben's point, that is why he
> wanted me to add include/linux/if_packet.h in ovs repo.
> 
> Ben, can you help double confirm if include/linux/if_packet.h in ovs
> is necessary?

I think my meaning was misunderstood.  Linux always has if_packet.h.
Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system
is Linux but the TPACKET_V3 types and constants are not defined in
if_packet.h, then the build system should define them.
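
One way to read "the build system should define them" is a private copy of the missing declarations. The sketch below is an assumption about how that could look, not what the patch does: the compat_/COMPAT_ names are hypothetical and chosen so they cannot clash with a system header that already has TPACKET_V3. The field list mirrors struct tpacket_req3 from the kernel's uapi if_packet.h.

```c
/* Private mirror of the kernel's enum value TPACKET_V3
 * (TPACKET_V1 = 0, TPACKET_V2 = 1, TPACKET_V3 = 2). */
#define COMPAT_TPACKET_V3 2

/* Private mirror of struct tpacket_req3, the ring request passed to
 * setsockopt(PACKET_RX_RING) when the socket uses TPACKET_V3. */
struct compat_tpacket_req3 {
    unsigned int tp_block_size;       /* Minimal size of contiguous block. */
    unsigned int tp_block_nr;         /* Number of blocks. */
    unsigned int tp_frame_size;       /* Size of frame. */
    unsigned int tp_frame_nr;         /* Total number of frames. */
    unsigned int tp_retire_blk_tov;   /* Block retire timeout in msecs. */
    unsigned int tp_sizeof_priv;      /* Size of per-block private area. */
    unsigned int tp_feature_req_word; /* Requested feature flags. */
};
```

On a build host whose headers do have TPACKET_V3, a compile-time size comparison against the system struct tpacket_req3 would be a cheap guard against the mirror drifting from the kernel definition.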
William Tu March 13, 2020, 4:22 p.m. UTC | #2
On Fri, Mar 13, 2020 at 8:57 AM Ben Pfaff <blp@ovn.org> wrote:
>
> On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> > Per my understanding, Ben meant a build system (which isn't Linux
> > probably, it doesn't have include/linux/if_packet.h) should be able to
> > build tpacket_v3 code in order that built-out binary can work on Linux
> > system with tpacket_v3 feature, this is Ben's point, that is why he
> > wanted me to add include/linux/if_packet.h in ovs repo.
> >
> > Ben, can you help double confirm if include/linux/if_packet.h in ovs
> > is necessary?
>
> I think my meaning was misunderstood.  Linux always has if_packet.h.
> Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system
> is Linux but the TPACKET_V3 types and constants are not defined in
> if_packet.h, then the build system should define them.

Thanks!

My suggestion is that if the system is Linux but the TPACKET_V3 types
and constants are not defined in if_packet.h, then just skip using
TPACKET_V3 and use the current recvmmsg approach.  When we started the
TPACKET_V3 patch, af_packet performance on veth was about 200Mbps, so
tpacket_v3 had a huge performance benefit.

With YiYang's patch
"Use batch process recv for tap and raw socket in netdev datapath"
af_packet on veth improves to 1.47Gbps, and tpacket_v3 shows similar
or 7% better performance. So there isn't a huge benefit now.

William
Ilya Maximets March 13, 2020, 4:47 p.m. UTC | #3
On 3/13/20 5:22 PM, William Tu wrote:
> On Fri, Mar 13, 2020 at 8:57 AM Ben Pfaff <blp@ovn.org> wrote:
>>
>> On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
>>> Per my understanding, Ben meant a build system (which isn't Linux
>>> probably, it doesn't have include/linux/if_packet.h) should be able to
>>> build tpacket_v3 code in order that built-out binary can work on Linux
>>> system with tpacket_v3 feature, this is Ben's point, that is why he
>>> wanted me to add include/linux/if_packet.h in ovs repo.
>>>
>>> Ben, can you help double confirm if include/linux/if_packet.h in ovs
>>> is necessary?
>>
>> I think my meaning was misunderstood.  Linux always has if_packet.h.
>> Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system
>> is Linux but the TPACKET_V3 types and constants are not defined in
>> if_packet.h, then the build system should define them.
> 
> Thanks!
> 
> My suggestion is that if the system is Linux but the TPACKET_V3 types
> and constants are not defined in if_packet.h, then just skip using
> TPACKET_V3 and
> use the current recvmmsg approach.  Because when we start  TPACKET_V3 patch,
> the af_packet on veth performance is about 200Mbps, so tpacket_v3 has huge
> performance benefits.
> 
> With YiYang's patch
> "Use batch process recv for tap and raw socket in netdev datapath"
> the af_packet on veth improves to 1.47Gbps. And tpacket_v3 shows
> similar or 7% better performance. So there isn't a huge benefits now.

With such a small performance benefit, does it make sense to have
these 700 lines of code that are so hard to read and maintain?

Another point is that hopefully segmentation offloading in userspace
datapath will evolve so we could enable it by default and all this
code will become almost useless.

If you're looking for poll-mode/async-like solutions, we could try
the io_uring way of calling the same recvmsg/sendmsg.  That might
have more benefits, and it will support all the functionality supported
by these calls.  Even better, we could also make io_uring support
an internal library and reuse it for other OVS subsystems, e.g. for
async poll/timers/logging/etc. in the future.

Best regards, Ilya Maximets.
William Tu March 13, 2020, 5:04 p.m. UTC | #4
On Fri, Mar 13, 2020 at 9:48 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 3/13/20 5:22 PM, William Tu wrote:
> > On Fri, Mar 13, 2020 at 8:57 AM Ben Pfaff <blp@ovn.org> wrote:
> >>
> >> On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> >>> Per my understanding, Ben meant a build system (which isn't Linux
> >>> probably, it doesn't have include/linux/if_packet.h) should be able to
> >>> build tpacket_v3 code in order that built-out binary can work on Linux
> >>> system with tpacket_v3 feature, this is Ben's point, that is why he
> >>> wanted me to add include/linux/if_packet.h in ovs repo.
> >>>
> >>> Ben, can you help double confirm if include/linux/if_packet.h in ovs
> >>> is necessary?
> >>
> >> I think my meaning was misunderstood.  Linux always has if_packet.h.
> >> Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system
> >> is Linux but the TPACKET_V3 types and constants are not defined in
> >> if_packet.h, then the build system should define them.
> >
> > Thanks!
> >
> > My suggestion is that if the system is Linux but the TPACKET_V3 types
> > and constants are not defined in if_packet.h, then just skip using
> > TPACKET_V3 and
> > use the current recvmmsg approach.  Because when we start  TPACKET_V3 patch,
> > the af_packet on veth performance is about 200Mbps, so tpacket_v3 has huge
> > performance benefits.
> >
> > With YiYang's patch
> > "Use batch process recv for tap and raw socket in netdev datapath"
> > the af_packet on veth improves to 1.47Gbps. And tpacket_v3 shows
> > similar or 7% better performance. So there isn't a huge benefits now.
>
> With such a small performance benefit does it make sense to have
> these 700 lines of code that is so hard to read and maintain?

Agree.
I was hoping that using "tpacket_v3 + is_pmd=true + TSO" can show
much better performance. But TSO has some issue and this patch is
not there yet.

>
> Another point is that hopefully segmentation offloading in userspace
> datapath will evolve so we could enable it by default and all this
> code will become almost useless.
>
> If you're looking for poll mode/async -like solutions we could try and
> check io_uring way for calling same recvmsg/sendmsg.  That might
> have more benefits and it will support all the functionality supported
> by these calls.  Even better, we could also make io_uring support as
> an internal library and reuse it for other OVS subsystems like making
> async poll/timers/logging/etc in the future.

Thanks!
I will take a look.
William
Ben Pfaff March 13, 2020, 5:47 p.m. UTC | #5
On Fri, Mar 13, 2020 at 05:47:54PM +0100, Ilya Maximets wrote:
> On 3/13/20 5:22 PM, William Tu wrote:
> > On Fri, Mar 13, 2020 at 8:57 AM Ben Pfaff <blp@ovn.org> wrote:
> >>
> >> On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> >>> Per my understanding, Ben meant a build system (which isn't Linux
> >>> probably, it doesn't have include/linux/if_packet.h) should be able to
> >>> build tpacket_v3 code in order that built-out binary can work on Linux
> >>> system with tpacket_v3 feature, this is Ben's point, that is why he
> >>> wanted me to add include/linux/if_packet.h in ovs repo.
> >>>
> >>> Ben, can you help double confirm if include/linux/if_packet.h in ovs
> >>> is necessary?
> >>
> >> I think my meaning was misunderstood.  Linux always has if_packet.h.
> >> Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system
> >> is Linux but the TPACKET_V3 types and constants are not defined in
> >> if_packet.h, then the build system should define them.
> > 
> > Thanks!
> > 
> > My suggestion is that if the system is Linux but the TPACKET_V3 types
> > and constants are not defined in if_packet.h, then just skip using
> > TPACKET_V3 and
> > use the current recvmmsg approach.  Because when we start  TPACKET_V3 patch,
> > the af_packet on veth performance is about 200Mbps, so tpacket_v3 has huge
> > performance benefits.
> > 
> > With YiYang's patch
> > "Use batch process recv for tap and raw socket in netdev datapath"
> > the af_packet on veth improves to 1.47Gbps. And tpacket_v3 shows
> > similar or 7% better performance. So there isn't a huge benefits now.
> 
> With such a small performance benefit does it make sense to have
> these 700 lines of code that is so hard to read and maintain?

Rarely used code with minimal benefit is a burden, so I'd skip it for
now.  If we figure out some way to get a bigger benefit later, we can
revisit it.
Yi Yang (杨燚)-云服务集团 March 14, 2020, 3:35 a.m. UTC | #6
Got it. Then we can safely remove include/linux/if_packet.h from ovs, because the minimum Linux version OVS supports already has tpacket_v3. Thanks, Ben, for the clarification.

-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org]
Sent: March 13, 2020 23:57
To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
Cc: u9012063@gmail.com; yang_y_yi@163.com; ovs-dev@openvswitch.org
Subject: Re: Reply: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath

On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Per my understanding, Ben meant a build system (which isn't Linux 
> probably, it doesn't have include/linux/if_packet.h) should be able to 
> build tpacket_v3 code in order that built-out binary can work on Linux 
> system with tpacket_v3 feature, this is Ben's point, that is why he 
> wanted me to add include/linux/if_packet.h in ovs repo.
> 
> Ben, can you help double confirm if include/linux/if_packet.h in ovs 
> is necessary?

I think my meaning was misunderstood.  Linux always has if_packet.h.
Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system is Linux but the TPACKET_V3 types and constants are not defined in if_packet.h, then the build system should define them.
Yi Yang (杨燚)-云服务集团 March 14, 2020, 4:43 a.m. UTC | #7
io_uring is a feature introduced in Linux kernel 5.1, so it can't be used on Linux systems with a kernel version < 5.1. tpacket_v3 is the only way to avoid system calls on almost all Linux kernel versions; it is unique from this perspective. Maybe you will miss it if someone fixes the kernel-side issue :-)

In addition, according to what Flavio said, TSO can't support VXLAN currently, but in most cloud scenarios VXLAN is the only choice, so for such cases TSO can be ignored.

My point is that we can provide an option for such use cases. Once the kernel-side issue is fixed and the Linux distributions apply the fix, users can get immediate benefits without any change. So maybe adding a switch userspace-use-tpacket-v3 in other-config (set to false by default) is an acceptable way to handle this.
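
As a rough illustration of the proposed switch, the sketch below reads a boolean option that defaults to false from a list of key/value pairs. Only the key name userspace-use-tpacket-v3 comes from this thread; struct kv, get_bool_option() and use_tpacket_v3() are hypothetical stand-ins, and OVS would read other-config through its own helpers rather than this code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one other-config entry. */
struct kv {
    const char *key;
    const char *value;
};

/* Returns the boolean value stored under 'key', or 'def' if the key is
 * absent or its value is neither "true" nor "false". */
bool
get_bool_option(const struct kv *opts, size_t n, const char *key, bool def)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i].key, key) == 0) {
            if (strcmp(opts[i].value, "true") == 0) {
                return true;
            }
            if (strcmp(opts[i].value, "false") == 0) {
                return false;
            }
            return def;
        }
    }
    return def;
}

/* TPACKET_V3 stays off unless the user opts in explicitly. */
bool
use_tpacket_v3(const struct kv *opts, size_t n)
{
    return get_bool_option(opts, n, "userspace-use-tpacket-v3", false);
}
```

Defaulting to false keeps the existing recvmmsg path for everyone who does not set the option, which matches the "optional, not default" position taken earlier in the thread.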

-----Original Message-----
From: dev [mailto:ovs-dev-bounces@openvswitch.org] On Behalf Of Ilya Maximets
Sent: March 14, 2020 0:48
To: William Tu <u9012063@gmail.com>; Ben Pfaff <blp@ovn.org>
Cc: yang_y_yi@163.com; ovs-dev@openvswitch.org; i.maximets@ovn.org
Subject: Re: [ovs-dev] Reply: [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath

On 3/13/20 5:22 PM, William Tu wrote:
> On Fri, Mar 13, 2020 at 8:57 AM Ben Pfaff <blp@ovn.org> wrote:
>>
>> On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
>>> Per my understanding, Ben meant a build system (which isn't Linux 
>>> probably, it doesn't have include/linux/if_packet.h) should be able 
>>> to build tpacket_v3 code in order that built-out binary can work on 
>>> Linux system with tpacket_v3 feature, this is Ben's point, that is 
>>> why he wanted me to add include/linux/if_packet.h in ovs repo.
>>>
>>> Ben, can you help double confirm if include/linux/if_packet.h in ovs 
>>> is necessary?
>>
>> I think my meaning was misunderstood.  Linux always has if_packet.h.
>> Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the 
>> system is Linux but the TPACKET_V3 types and constants are not 
>> defined in if_packet.h, then the build system should define them.
> 
> Thanks!
> 
> My suggestion is that if the system is Linux but the TPACKET_V3 types 
> and constants are not defined in if_packet.h, then just skip using
> TPACKET_V3 and
> use the current recvmmsg approach.  Because when we start  TPACKET_V3 
> patch, the af_packet on veth performance is about 200Mbps, so 
> tpacket_v3 has huge performance benefits.
> 
> With YiYang's patch
> "Use batch process recv for tap and raw socket in netdev datapath"
> the af_packet on veth improves to 1.47Gbps. And tpacket_v3 shows 
> similar or 7% better performance. So there isn't a huge benefits now.

With such a small performance benefit does it make sense to have these 700 lines of code that is so hard to read and maintain?

Another point is that hopefully segmentation offloading in userspace datapath will evolve so we could enable it by default and all this code will become almost useless.

If you're looking for poll mode/async -like solutions we could try and check io_uring way for calling same recvmsg/sendmsg.  That might have more benefits and it will support all the functionality supported by these calls.  Even better, we could also make io_uring support as an internal library and reuse it for other OVS subsystems like making async poll/timers/logging/etc in the future.

Best regards, Ilya Maximets.
William Tu March 14, 2020, 2:17 p.m. UTC | #8
On Fri, Mar 13, 2020 at 9:45 PM Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com> wrote:
>
> Io_uring is a feature brought in by Linux kernel 5.1, so it can't be used on Linux system with kernel version < 5.1. tpacket_v3 is only one way to avoid system call on almost all the Linux kernel versions, it is unique from this perspective. Maybe you will miss it if someone fixes kernel side issue :-)
>
> In addition, according to what Flavio said, TSO can't support VXLAN currently, but in most cloud scenarios, VXLAN is only one choice, so for such cases, TSO can be ignored.
>
> My point is we can provide one option for such use cases, once kernel side issue is fixed, all the Linux distributions can apply this fix, users can get immediate benefits without change. So maybe adding a switch userspace-use-tpacket-v3 in other-config (set to False by default) is an acceptable way to handle this.
>

The tpacket_v3 patch now shows very little performance improvement.
So there is little incentive to merge and maintain this code.
Do you know whether tpacket_v3 will perform better once the kernel
side is fixed?

Or another way is to study io_uring and compare its performance with tpacket_v3.

William
Ben Pfaff March 14, 2020, 8:04 p.m. UTC | #9
There might still be a misunderstanding.

There can be a difference between the kernel that OVS runs on (version
A) and the kernel headers against which it is built (version B).  Often,
the latter are supplied by the distribution and they are not usually
kept as up to date, so B < A is common.

I don't know whether this is likely to be a problem in this particular
case.
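
The A-versus-B distinction can be made concrete with a small check. This is a sketch under assumptions: release_to_code() and headers_older_than_running_kernel() are hypothetical helpers, LINUX_VERSION_CODE from <linux/version.h> stands for the build-time headers (B), and uname() reports the running kernel (A).

```c
#include <stdio.h>
#include <sys/utsname.h>
#include <linux/version.h>     /* LINUX_VERSION_CODE of the build headers */

/* Packs "major.minor.patch..." the same way the kernel's KERNEL_VERSION()
 * macro does.  Returns 0 if 'release' does not start with "major.minor". */
unsigned long
release_to_code(const char *release)
{
    unsigned int major = 0, minor = 0, patch = 0;

    if (sscanf(release, "%u.%u.%u", &major, &minor, &patch) < 2) {
        return 0;
    }
    if (patch > 255) {
        patch = 255;           /* Only 8 bits are available for the patch. */
    }
    return ((unsigned long) major << 16) + (minor << 8) + patch;
}

/* Returns 1 if the headers used for the build (B) are older than the
 * running kernel (A), 0 if they are not, and -1 if A cannot be determined. */
int
headers_older_than_running_kernel(void)
{
    struct utsname u;
    unsigned long running;

    if (uname(&u) != 0) {
        return -1;
    }
    running = release_to_code(u.release);
    if (running == 0) {
        return -1;
    }
    return (unsigned long) LINUX_VERSION_CODE < running;
}
```

A version comparison like this is only diagnostic. For deciding whether to use a feature, probing the feature itself at runtime is more reliable, since distributions backport features without changing the version number.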

On Sat, Mar 14, 2020 at 03:35:46AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Got it, then we can safely remove include/linux/if_packet.h in ovs
> because the minimal Linux version OVS supports has supported
> tpacket_v3. Thanks Ben for clarification.
> 
> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org]
> Sent: March 13, 2020 23:57
> To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
> Cc: u9012063@gmail.com; yang_y_yi@163.com; ovs-dev@openvswitch.org
> Subject: Re: Reply: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath
> 
> On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> > Per my understanding, Ben meant a build system (which isn't Linux 
> > probably, it doesn't have include/linux/if_packet.h) should be able to 
> > build tpacket_v3 code in order that built-out binary can work on Linux 
> > system with tpacket_v3 feature, this is Ben's point, that is why he 
> > wanted me to add include/linux/if_packet.h in ovs repo.
> > 
> > Ben, can you help double confirm if include/linux/if_packet.h in ovs 
> > is necessary?
> 
> I think my meaning was misunderstood.  Linux always has if_packet.h.
> Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system is Linux but the TPACKET_V3 types and constants are not defined in if_packet.h, then the build system should define them.
Yi Yang (杨燚)-云服务集团 March 16, 2020, 12:48 a.m. UTC | #10
All the definitions/macros have been in include/linux/if_packet.h since 3.10.0, so that case will not occur.

-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org]
Sent: March 15, 2020 4:04
To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
Cc: u9012063@gmail.com; yang_y_yi@163.com; ovs-dev@openvswitch.org
Subject: Re: Reply: Reply: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath

There might still be a misunderstanding.

There can be a difference between the kernel that OVS runs on (version
A) and the kernel headers against which it is built (version B).  Often, the latter are supplied by the distribution and they are not usually kept as up to date, so B < A is common.

I don't know whether this is likely to be a problem in this particular case.

On Sat, Mar 14, 2020 at 03:35:46AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Got it, then we can safely remove include/linux/if_packet.h in ovs
> because the minimal Linux version OVS supports has supported
> tpacket_v3. Thanks Ben for clarification.
> 
> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org]
> Sent: March 13, 2020 23:57
> To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
> Cc: u9012063@gmail.com; yang_y_yi@163.com; ovs-dev@openvswitch.org
> Subject: Re: Reply: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for
> userspace datapath
> 
> On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> > Per my understanding, Ben meant a build system (which isn't Linux 
> > probably, it doesn't have include/linux/if_packet.h) should be able 
> > to build tpacket_v3 code in order that built-out binary can work on 
> > Linux system with tpacket_v3 feature, this is Ben's point, that is 
> > why he wanted me to add include/linux/if_packet.h in ovs repo.
> > 
> > Ben, can you help double confirm if include/linux/if_packet.h in ovs 
> > is necessary?
> 
> I think my meaning was misunderstood.  Linux always has if_packet.h.
> Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system is Linux but the TPACKET_V3 types and constants are not defined in if_packet.h, then the build system should define them.
Yi Yang (杨燚)-云服务集团 March 17, 2020, 9:08 a.m. UTC | #11
Hi, William

Finally my high-end server is available, so I can do the performance comparison again. tpacket_v3 clearly gives a big performance improvement; here is my data. By the way, to get stable performance data, please use taskset to pin ovs-vswitchd to a physical core (you shouldn't schedule other tasks on its logical sibling core), and have the iperf3 client and the iperf3 server use different cores. In my case, ovs-vswitchd is pinned to core 1, the iperf3 server to core 4, and the iperf3 client to core 5.

According to my test, tpacket_v3 gives about a 55% improvement (from 1.34 to 2.08 Gbits/sec, (2.08-1.34)/1.34 = 0.55). With my further optimization (zero copy on the receive side), the improvement is larger (from 1.34 to 2.21 Gbits/sec, (2.21-1.34)/1.34 = 0.65), so I still think the performance improvement is big. Please reconsider.

William, I can help you do performance checks on your servers if you'd like. From this data and the previous data, we can conclude that performance is very platform sensitive. You can schedule a meeting for further discussion if needed.
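
The percentages quoted above follow from the 60-second iperf3 averages. A minimal sketch of the arithmetic is below; relative_gain() is a hypothetical helper name, and the inputs are the Gbits/sec averages reported in this message.

```c
/* Relative throughput gain of 'improved' over 'baseline', as a fraction.
 * Both arguments must use the same unit and 'baseline' must be nonzero. */
double
relative_gain(double baseline, double improved)
{
    return (improved - baseline) / baseline;
}
```

With a baseline of 1.34, an input of 2.08 gives roughly 0.55 (tpacket_v3 alone) and 2.21 gives roughly 0.65 (tpacket_v3 plus receive-side zero copy).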

No zero copy and no tpacket_v3 (recvmmsg, sendmmsg)
===================================================
eipadmin@eip01:~$ sudo ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 43194 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-10.00  sec  1.58 GBytes  1.35 Gbits/sec  13851    103 KBytes
[  4]  10.00-20.00  sec  1.56 GBytes  1.34 Gbits/sec  14018   94.7 KBytes
[  4]  20.00-30.00  sec  1.56 GBytes  1.34 Gbits/sec  13942   94.7 KBytes
[  4]  30.00-40.00  sec  1.56 GBytes  1.34 Gbits/sec  13565    106 KBytes
[  4]  40.00-50.00  sec  1.54 GBytes  1.32 Gbits/sec  14567    106 KBytes
[  4]  50.00-60.00  sec  1.56 GBytes  1.34 Gbits/sec  13738   84.8 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-60.00  sec  9.35 GBytes  1.34 Gbits/sec  83681             sender
[  4]   0.00-60.00  sec  9.35 GBytes  1.34 Gbits/sec                  receiver

Server output:
Accepted connection from 10.15.1.2, port 43192
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 43194
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  1.57 GBytes  1.35 Gbits/sec
[  5]  10.00-20.00  sec  1.56 GBytes  1.34 Gbits/sec
[  5]  20.00-30.00  sec  1.56 GBytes  1.34 Gbits/sec
[  5]  30.00-40.00  sec  1.56 GBytes  1.34 Gbits/sec
[  5]  40.00-50.00  sec  1.54 GBytes  1.32 Gbits/sec
[  5]  50.00-60.00  sec  1.56 GBytes  1.34 Gbits/sec


iperf Done.
eipadmin@eip01:~$

No zero copy but with tpacket_v3
================================
eipadmin@eip01:~$ sudo ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 43174 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-10.00  sec  2.36 GBytes  2.02 Gbits/sec    0   3.04 MBytes
[  4]  10.00-20.00  sec  2.43 GBytes  2.09 Gbits/sec    0   3.04 MBytes
[  4]  20.00-30.00  sec  2.44 GBytes  2.09 Gbits/sec    0   3.04 MBytes
[  4]  30.00-40.00  sec  2.43 GBytes  2.09 Gbits/sec    0   3.04 MBytes
[  4]  40.00-50.00  sec  2.43 GBytes  2.09 Gbits/sec    0   3.04 MBytes
[  4]  50.00-60.00  sec  2.44 GBytes  2.10 Gbits/sec    0   3.04 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-60.00  sec  14.5 GBytes  2.08 Gbits/sec    0             sender
[  4]   0.00-60.00  sec  14.5 GBytes  2.08 Gbits/sec                  receiver

Server output:
Accepted connection from 10.15.1.2, port 43172
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 43174
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  2.35 GBytes  2.02 Gbits/sec
[  5]  10.00-20.00  sec  2.43 GBytes  2.09 Gbits/sec
[  5]  20.00-30.00  sec  2.44 GBytes  2.09 Gbits/sec
[  5]  30.00-40.00  sec  2.43 GBytes  2.09 Gbits/sec
[  5]  40.00-50.00  sec  2.43 GBytes  2.09 Gbits/sec
[  5]  50.00-60.00  sec  2.44 GBytes  2.10 Gbits/sec


iperf Done.
eipadmin@eip01:~$


With zero copy patch and tpacket_v3
===================================
eipadmin@eip01:~$ sudo ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 43182 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-10.00  sec  2.54 GBytes  2.18 Gbits/sec    0   3.03 MBytes
[  4]  10.00-20.00  sec  2.58 GBytes  2.22 Gbits/sec    0   3.03 MBytes
[  4]  20.00-30.00  sec  2.58 GBytes  2.22 Gbits/sec    0   3.03 MBytes
[  4]  30.00-40.00  sec  2.59 GBytes  2.22 Gbits/sec    0   3.03 MBytes
[  4]  40.00-50.00  sec  2.57 GBytes  2.21 Gbits/sec    0   3.03 MBytes
[  4]  50.00-60.00  sec  2.57 GBytes  2.21 Gbits/sec    0   3.03 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-60.00  sec  15.4 GBytes  2.21 Gbits/sec    0             sender
[  4]   0.00-60.00  sec  15.4 GBytes  2.21 Gbits/sec                  receiver

Server output:
Accepted connection from 10.15.1.2, port 43180
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 43182
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  2.53 GBytes  2.17 Gbits/sec
[  5]  10.00-20.00  sec  2.58 GBytes  2.22 Gbits/sec
[  5]  20.00-30.00  sec  2.58 GBytes  2.22 Gbits/sec
[  5]  30.00-40.00  sec  2.59 GBytes  2.22 Gbits/sec
[  5]  40.00-50.00  sec  2.57 GBytes  2.21 Gbits/sec
[  5]  50.00-60.00  sec  2.57 GBytes  2.21 Gbits/sec


iperf Done.
eipadmin@eip01:~$

-----Original Message-----
From: William Tu [mailto:u9012063@gmail.com]
Sent: March 14, 2020 22:18
To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
Cc: i.maximets@ovn.org; blp@ovn.org; yang_y_yi@163.com; ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] Reply: [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath

On Fri, Mar 13, 2020 at 9:45 PM Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com> wrote:
>
> Io_uring is a feature brought in by Linux kernel 5.1, so it can't be 
> used on Linux system with kernel version < 5.1. tpacket_v3 is only one 
> way to avoid system call on almost all the Linux kernel versions, it 
> is unique from this perspective. Maybe you will miss it if someone 
> fixes kernel side issue :-)
>
> In addition, according to what Flavio said, TSO can't support VXLAN currently, but in most cloud scenarios, VXLAN is only one choice, so for such cases, TSO can be ignored.
>
> My point is we can provide one option for such use cases, once kernel side issue is fixed, all the Linux distributions can apply this fix, users can get immediate benefits without change. So maybe adding a switch userspace-use-tpacket-v3 in other-config (set to False by default) is an acceptable way to handle this.
>

The tpacket_v3 patch now shows very little performance improvement.
So there is little incentive to merge and maintain this code.
Do you know if kernel side is fixed, will tpacket_v3 have better performance improvement?

Or another way is to study io_uring and compare its performance with tpacket_v3.

William
William Tu March 17, 2020, 2:58 p.m. UTC | #12
On Tue, Mar 17, 2020 at 2:08 AM Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com> wrote:
>
> Hi, William
>
> Finally, my high-end server is available, so I can do the performance comparison again. tpacket_v3 clearly shows a big performance improvement; here is my data. By the way, in order to get stable performance data, please use taskset to pin ovs-vswitchd to a physical core (you shouldn't schedule other tasks on its logical sibling core), and make the iperf3 client and iperf3 server use different cores. In my case, ovs-vswitchd is pinned to core 1, the iperf3 server to core 4, and the iperf3 client to core 5.
>
> According to my test, tpacket_v3 gives about a 55% improvement (from 1.34 to 2.08, (2.08-1.34)/1.34 = 0.55). With my further optimization (zero copy on the receive side), the improvement is larger (from 1.34 to 2.21, (2.21-1.34)/1.34 = 0.65), so I still think the performance improvement is big. Please reconsider it.
>

That's a great improvement.
What is your optimization "zero copy for receive side"?
Is it included in the patch?

Regards
William
Ben Pfaff March 17, 2020, 4:37 p.m. UTC | #13
Great.  Then we do not need any special case for if_packet.h.

On Mon, Mar 16, 2020 at 12:48:20AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> All the definitions/macros have been in include/linux/if_packet.h since 3.10.0, so that case will not arise.
> 
> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org]
> Sent: March 15, 2020 4:04
> To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
> Cc: u9012063@gmail.com; yang_y_yi@163.com; ovs-dev@openvswitch.org
> Subject: Re: RE: RE: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath
> 
> There might still be a misunderstanding.
> 
> There can be a difference between the kernel that OVS runs on (version A) and the kernel headers against which it is built (version B).  Often, the latter are supplied by the distribution and they are not usually kept as up to date, so B < A is common.
> 
> I don't know whether this is likely to be a problem in this particular case.
> 
> On Sat, Mar 14, 2020 at 03:35:46AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> > Got it, then we can safely remove include/linux/if_packet.h in ovs
> > because the minimum Linux version that OVS supports already supports
> > tpacket_v3. Thanks, Ben, for the clarification.
> > 
> > -----Original Message-----
> > From: Ben Pfaff [mailto:blp@ovn.org]
> > Sent: March 13, 2020 23:57
> > To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
> > Cc: u9012063@gmail.com; yang_y_yi@163.com; ovs-dev@openvswitch.org
> > Subject: Re: RE: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for
> > userspace datapath
> > 
> > On Fri, Mar 13, 2020 at 01:04:07AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> > > As I understand it, Ben meant that a build system (which probably
> > > isn't Linux and doesn't have include/linux/if_packet.h) should be able
> > > to build the tpacket_v3 code so that the resulting binary can work on
> > > a Linux system with the tpacket_v3 feature. That is why he wanted me
> > > to add include/linux/if_packet.h to the ovs repo.
> > > 
> > > Ben, can you help double confirm if include/linux/if_packet.h in ovs 
> > > is necessary?
> > 
> > I think my meaning was misunderstood.  Linux always has if_packet.h.
> > Only recent enough Linux has TPACKET_V3 in if_packet.h.  If the system is Linux but the TPACKET_V3 types and constants are not defined in if_packet.h, then the build system should define them.
> 
>
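Ben's distinction between the headers OVS is built against (version B) and the kernel it runs on (version A) means a build-time check alone cannot say whether TPACKET_V3 will work at runtime. A minimal sketch of a runtime probe follows, assuming a Linux build whose <linux/if_packet.h> defines TPACKET_V3. The helper name probe_tpacket_v3 is hypothetical and is not part of this patch; it only illustrates the idea of asking the running kernel.

```c
#include <linux/if_packet.h> /* PACKET_VERSION, TPACKET_V3 */
#include <sys/socket.h>      /* socket, setsockopt, SOL_PACKET */
#include <unistd.h>          /* close */

/* Hypothetical helper (not from the patch).
 *
 * Returns 1 if the running kernel accepts TPACKET_V3 on a packet socket,
 * 0 if it rejects the version, and -1 if the probe could not run at all
 * (for example, the process lacks CAP_NET_RAW). */
static int
probe_tpacket_v3(void)
{
    /* Protocol 0: the socket receives nothing until it is bound, so the
     * probe does not start capturing traffic. */
    int fd = socket(AF_PACKET, SOCK_RAW, 0);
    if (fd < 0) {
        return -1;
    }

    int version = TPACKET_V3;
    int rc = setsockopt(fd, SOL_PACKET, PACKET_VERSION,
                        &version, sizeof version);
    close(fd);
    return rc == 0 ? 1 : 0;
}
```

A caller could fall back to the existing receive path whenever the result is not 1.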
Yi Yang (杨燚)-云服务集团 March 18, 2020, 12:58 a.m. UTC | #14
William, would you like to try my zero-copy patch? I can send it to you to try on your platform. Looking at your af_xdp change, I found that dp_packet can use a pre-allocated buffer, so I took that approach: because tpacket_v3 has already set up the rx ring, dp_packet can use the rx ring buffers directly.
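For readers unfamiliar with the rx ring layout, here is a rough sketch of how one TPACKET_V3 block can be walked so that a consumer receives pointers straight into the ring instead of copies. This is not the code from the patch: walk_block, release_block, frame_cb and sum_len are hypothetical names, and ring setup (PACKET_RX_RING plus mmap) and dp_packet integration are left out.

```c
#include <assert.h>
#include <linux/if_packet.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: 'data' points into the mmap'ed ring. */
typedef void (*frame_cb)(const uint8_t *data, uint32_t len, void *aux);

/* Calls 'cb' for every frame in 'block' without copying frame data.
 * Returns the number of frames visited, or 0 if the kernel still owns
 * the block. */
static unsigned int
walk_block(struct tpacket_block_desc *block, frame_cb cb, void *aux)
{
    assert(block != NULL && cb != NULL);

    if (!(block->hdr.bh1.block_status & TP_STATUS_USER)) {
        return 0;
    }

    uint32_t n = block->hdr.bh1.num_pkts;
    struct tpacket3_hdr *ppd = (struct tpacket3_hdr *)
        ((uint8_t *) block + block->hdr.bh1.offset_to_first_pkt);

    for (uint32_t i = 0; i < n; i++) {
        /* tp_mac is the offset from the frame header to the packet. */
        cb((const uint8_t *) ppd + ppd->tp_mac, ppd->tp_snaplen, aux);
        ppd = (struct tpacket3_hdr *)
            ((uint8_t *) ppd + ppd->tp_next_offset);
    }
    return n;
}

/* Hands the block back to the kernel.  Pointers passed to the callback
 * must not be used after this. */
static void
release_block(struct tpacket_block_desc *block)
{
    block->hdr.bh1.block_status = TP_STATUS_KERNEL;
}

/* Example callback: adds up the captured bytes in '*aux'. */
static void
sum_len(const uint8_t *data, uint32_t len, void *aux)
{
    (void) data;
    *(uint64_t *) aux += len;
}
```

The main design constraint is lifetime: a block can only be released once every packet that points into it has been processed, which is the price of skipping the copy.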

Yi Yang (杨燚)-云服务集团 March 18, 2020, 2 a.m. UTC | #15
By the way, with tpacket_v3, the zero-copy optimization and is_pmd=true, the performance is much better: 3.77 Gbps, (3.77-1.34)/1.34 = 1.81, i.e. a 181% improvement. Here is the performance data.

is_pmd = true
=============
eipadmin@eip01:~$ sudo ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 43210 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-10.00  sec  4.34 GBytes  3.73 Gbits/sec    0   3.03 MBytes
[  4]  10.00-20.00  sec  4.40 GBytes  3.78 Gbits/sec    0   3.03 MBytes
[  4]  20.00-30.00  sec  4.40 GBytes  3.78 Gbits/sec    0   3.03 MBytes
[  4]  30.00-40.00  sec  4.40 GBytes  3.78 Gbits/sec    0   3.03 MBytes
[  4]  40.00-50.00  sec  4.40 GBytes  3.78 Gbits/sec    0   3.03 MBytes
[  4]  50.00-60.00  sec  4.40 GBytes  3.78 Gbits/sec    0   3.03 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-60.00  sec  26.3 GBytes  3.77 Gbits/sec    0             sender
[  4]   0.00-60.00  sec  26.3 GBytes  3.77 Gbits/sec                  receiver

Server output:
Accepted connection from 10.15.1.2, port 43208
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 43210
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  4.32 GBytes  3.71 Gbits/sec
[  5]  10.00-20.00  sec  4.40 GBytes  3.78 Gbits/sec
[  5]  20.00-30.00  sec  4.40 GBytes  3.78 Gbits/sec
[  5]  30.00-40.00  sec  4.40 GBytes  3.78 Gbits/sec
[  5]  40.00-50.00  sec  4.40 GBytes  3.78 Gbits/sec
[  5]  50.00-60.00  sec  4.40 GBytes  3.78 Gbits/sec


iperf Done.
eipadmin@eip01:~$
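
The improvement figures quoted in this thread all come from the same formula, (measured - baseline) / baseline, against the 1.34 Gbps baseline. A small sketch that reproduces the arithmetic is below; relative_improvement is a hypothetical helper written only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: fractional improvement of 'measured' over
 * 'baseline', e.g. 0.55 means 55% faster than the baseline. */
static double
relative_improvement(double baseline, double measured)
{
    assert(baseline > 0.0);
    return (measured - baseline) / baseline;
}
```

With the numbers reported above, relative_improvement(1.34, 2.08) is about 0.55, relative_improvement(1.34, 2.21) is about 0.65, and relative_improvement(1.34, 3.77) is about 1.81.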

William Tu March 18, 2020, 3:55 a.m. UTC | #16
On Tue, Mar 17, 2020 at 7:00 PM Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com> wrote:
>
> By the way, with tpacket_v3, the zero-copy optimization and is_pmd=true, the performance is much better: 3.77 Gbps, (3.77-1.34)/1.34 = 1.81, i.e. a 181% improvement. Here is the performance data.
>
Can you send out the tpacket_v3 patch together with these
optimizations to the mailing list?
Thanks
William
Yi Yang (杨燚)-云服务集团 March 18, 2020, 4:08 a.m. UTC | #17
Ok, I will send out v7 with these changes.

Patch

diff --git a/acinclude.m4 b/acinclude.m4
index 1488deda0371..4b11085ab190 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -1086,12 +1086,14 @@ dnl OVS_CHECK_LINUX_TPACKET
 dnl
 dnl Configure Linux TPACKET.
 AC_DEFUN([OVS_CHECK_LINUX_TPACKET], [
-  AC_COMPILE_IFELSE([
-    AC_LANG_PROGRAM([#include <linux/if_packet.h>], [
-        struct tpacket3_hdr x =  { 0 };
-    ])],
-    [AC_DEFINE([HAVE_TPACKET_V3], [1],
-    [Define to 1 if struct tpacket3_hdr is available.])])
+  AC_CHECK_HEADER([linux/if_packet.h],
+    [AC_COMPILE_IFELSE([
+      AC_LANG_PROGRAM([#include <linux/if_packet.h>], [
+          struct tpacket3_hdr x =  { 0 };
+      ])],
+      [AC_DEFINE([HAVE_TPACKET_V3], [1],
+      [Define to 1 if struct tpacket3_hdr is available.])])],
+    [])
 ])

 dnl Checks for buggy strtok_r.
diff --git a/include/linux/automake.mk b/include/linux/automake.mk
index a659e65abe27..8f063f482e15 100644
--- a/include/linux/automake.mk
+++ b/include/linux/automake.mk
@@ -1,5 +1,4 @@ 
 noinst_HEADERS += \
-       include/linux/if_packet.h \
        include/linux/netlink.h \
        include/linux/netfilter/nf_conntrack_sctp.h \
        include/linux/pkt_cls.h \
diff --git a/include/linux/if_packet.h b/include/linux/if_packet.h
deleted file mode 100644
index e20aaccb1e32..000000000000
--- a/include/linux/if_packet.h
+++ /dev/null
@@ -1,128 +0,0 @@ 
-#ifndef __LINUX_IF_PACKET_WRAPPER_H
-#define __LINUX_IF_PACKET_WRAPPER_H 1
-
-#ifdef HAVE_TPACKET_V3
-#include_next <linux/if_packet.h>
-#else
-#define HAVE_TPACKET_V3 1
-
-struct sockaddr_pkt {
-        unsigned short  spkt_family;
-        unsigned char   spkt_device[14];
-        uint16_t        spkt_protocol;
-};
-
-struct sockaddr_ll {
-        unsigned short  sll_family;
-        uint16_t        sll_protocol;
-        int             sll_ifindex;
-        unsigned short  sll_hatype;
-        unsigned char   sll_pkttype;
-        unsigned char   sll_halen;
-        unsigned char   sll_addr[8];
-};
-
-/* Packet types */
-#define PACKET_HOST                     0 /* To us                */
-#define PACKET_OTHERHOST                3 /* To someone else    */
-#define PACKET_LOOPBACK                 5 /* MC/BRD frame looped back */
-
-/* Packet socket options */
-#define PACKET_RX_RING                  5
-#define PACKET_VERSION                 10
-#define PACKET_TX_RING                 13
-#define PACKET_VNET_HDR                15
-
-/* Rx ring - header status */
-#define TP_STATUS_KERNEL                0
-#define TP_STATUS_USER            (1 << 0)
-#define TP_STATUS_VLAN_VALID      (1 << 4) /* auxdata has valid tp_vlan_tci */
-#define TP_STATUS_VLAN_TPID_VALID (1 << 6) /* auxdata has valid tp_vlan_tpid */
-
-/* Tx ring - header status */
-#define TP_STATUS_SEND_REQUEST    (1 << 0)
-#define TP_STATUS_SENDING         (1 << 1)
-
-struct tpacket_hdr {
-    unsigned long tp_status;
-    unsigned int tp_len;
-    unsigned int tp_snaplen;
-    unsigned short tp_mac;
-    unsigned short tp_net;
-    unsigned int tp_sec;
-    unsigned int tp_usec;
-};
-
-#define TPACKET_ALIGNMENT 16
-#define TPACKET_ALIGN(x) (((x)+TPACKET_ALIGNMENT-1)&~(TPACKET_ALIGNMENT-1))
-
-struct tpacket_hdr_variant1 {
-    uint32_t tp_rxhash;
-    uint32_t tp_vlan_tci;
-    uint16_t tp_vlan_tpid;
-    uint16_t tp_padding;
-};
-
-struct tpacket3_hdr {
-    uint32_t  tp_next_offset;
-    uint32_t  tp_sec;
-    uint32_t  tp_nsec;
-    uint32_t  tp_snaplen;
-    uint32_t  tp_len;
-    uint32_t  tp_status;
-    uint16_t  tp_mac;
-    uint16_t  tp_net;
-    /* pkt_hdr variants */
-    union {
-        struct tpacket_hdr_variant1 hv1;
-    };
-    uint8_t  tp_padding[8];
-};
-
-struct tpacket_bd_ts {
-    unsigned int ts_sec;
-    union {
-        unsigned int ts_usec;
-        unsigned int ts_nsec;
-    };
-};
-
-struct tpacket_hdr_v1 {
-    uint32_t block_status;
-    uint32_t num_pkts;
-    uint32_t offset_to_first_pkt;
-    uint32_t blk_len;
-    uint64_t __attribute__((aligned(8))) seq_num;
-    struct tpacket_bd_ts ts_first_pkt, ts_last_pkt;
-};
-
-union tpacket_bd_header_u {
-    struct tpacket_hdr_v1 bh1;
-};
-
-struct tpacket_block_desc {
-    uint32_t version;
-    uint32_t offset_to_priv;
-    union tpacket_bd_header_u hdr;
-};
-
-#define TPACKET3_HDRLEN \
-    (TPACKET_ALIGN(sizeof(struct tpacket3_hdr)) + sizeof(struct sockaddr_ll))
-
-enum tpacket_versions {
-    TPACKET_V1,
-    TPACKET_V2,
-    TPACKET_V3
-};
-
-struct tpacket_req3 {
-    unsigned int tp_block_size; /* Minimal size of contiguous block */
-    unsigned int tp_block_nr; /* Number of blocks */
-    unsigned int tp_frame_size; /* Size of frame */
-    unsigned int tp_frame_nr; /* Total number of frames */
-    unsigned int tp_retire_blk_tov; /* Timeout in msecs */
-    unsigned int tp_sizeof_priv; /* Offset to private data area */
-    unsigned int tp_feature_req_word;
-};
-#endif /* HAVE_TPACKET_V3 */
-#endif /* __LINUX_IF_PACKET_WRAPPER_H */
diff --git a/include/sparse/linux/if_packet.h b/include/sparse/linux/if_packet.h
index 0ac3fcefc895..3813892a0788 100644
--- a/include/sparse/linux/if_packet.h
+++ b/include/sparse/linux/if_packet.h
@@ -28,114 +28,4 @@  struct sockaddr_ll {
         unsigned char   sll_addr[8];
 };

-/* Packet types */
-#define PACKET_HOST                     0 /* To us                */
-#define PACKET_OTHERHOST                3 /* To someone else   */
-#define PACKET_LOOPBACK                 5 /* MC/BRD frame looped back */
-
-/* Packet socket options */
-#define PACKET_RX_RING                  5
-#define PACKET_VERSION                 10
-#define PACKET_TX_RING                 13
-#define PACKET_VNET_HDR                15
-
-/* Rx ring - header status */
-#define TP_STATUS_KERNEL                0
-#define TP_STATUS_USER            (1 << 0)
-#define TP_STATUS_VLAN_VALID      (1 << 4) /* auxdata has valid tp_vlan_tci */
-#define TP_STATUS_VLAN_TPID_VALID (1 << 6) /* auxdata has valid tp_vlan_tpid */
-
-/* Tx ring - header status */
-#define TP_STATUS_SEND_REQUEST    (1 << 0)
-#define TP_STATUS_SENDING         (1 << 1)
-
-#define tpacket_hdr rpl_tpacket_hdr
-struct tpacket_hdr {
-    unsigned long tp_status;
-    unsigned int tp_len;
-    unsigned int tp_snaplen;
-    unsigned short tp_mac;
-    unsigned short tp_net;
-    unsigned int tp_sec;
-    unsigned int tp_usec;
-};
-
-#define TPACKET_ALIGNMENT 16
-#define TPACKET_ALIGN(x) (((x)+TPACKET_ALIGNMENT-1)&~(TPACKET_ALIGNMENT-1))
-
-#define tpacket_hdr_variant1 rpl_tpacket_hdr_variant1
-struct tpacket_hdr_variant1 {
-    uint32_t tp_rxhash;
-    uint32_t tp_vlan_tci;
-    uint16_t tp_vlan_tpid;
-    uint16_t tp_padding;
-};
-
-#define tpacket3_hdr rpl_tpacket3_hdr
-struct tpacket3_hdr {
-    uint32_t  tp_next_offset;
-    uint32_t  tp_sec;
-    uint32_t  tp_nsec;
-    uint32_t  tp_snaplen;
-    uint32_t  tp_len;
-    uint32_t  tp_status;
-    uint16_t  tp_mac;
-    uint16_t  tp_net;
-    /* pkt_hdr variants */
-    union {
-        struct tpacket_hdr_variant1 hv1;
-    };
-    uint8_t  tp_padding[8];
-};
-
-#define tpacket_bd_ts rpl_tpacket_bd_ts
-struct tpacket_bd_ts {
-    unsigned int ts_sec;
-    union {
-        unsigned int ts_usec;
-        unsigned int ts_nsec;
-    };
-};
-
-#define tpacket_hdr_v1 rpl_tpacket_hdr_v1
-struct tpacket_hdr_v1 {
-    uint32_t block_status;
-    uint32_t num_pkts;
-    uint32_t offset_to_first_pkt;
-    uint32_t blk_len;
-    uint64_t __attribute__((aligned(8))) seq_num;
-    struct tpacket_bd_ts ts_first_pkt, ts_last_pkt;
-};
-
-#define tpacket_bd_header_u rpl_tpacket_bd_header_u
-union tpacket_bd_header_u {
-    struct tpacket_hdr_v1 bh1;
-};
-
-#define tpacket_block_desc rpl_tpacket_block_desc
-struct tpacket_block_desc {
-    uint32_t version;
-    uint32_t offset_to_priv;
-    union tpacket_bd_header_u hdr;
-};
-
-#define TPACKET3_HDRLEN \
-    (TPACKET_ALIGN(sizeof(struct tpacket3_hdr)) + sizeof(struct sockaddr_ll))
-
-enum rpl_tpacket_versions {
-    TPACKET_V1,
-    TPACKET_V2,
-    TPACKET_V3
-};
-
-#define tpacket_req3 rpl_tpacket_req3
-struct tpacket_req3 {
-    unsigned int tp_block_size; /* Minimal size of contiguous block */
-    unsigned int tp_block_nr; /* Number of blocks */
-    unsigned int tp_frame_size; /* Size of frame */
-    unsigned int tp_frame_nr; /* Total number of frames */
-    unsigned int tp_retire_blk_tov; /* Timeout in msecs */
-    unsigned int tp_sizeof_priv; /* Offset to private data area */
-    unsigned int tp_feature_req_word;
-};
 #endif
_______________________________________________
dev mailing list
dev@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev