diff mbox series

[ovs-dev,v4,14/14] userspace: Enable TSO if available.

Message ID 20220701035834.1851648-14-mkp@redhat.com
State Superseded
Headers show
Series [ovs-dev,v4,01/14] dp-packet: Rename flags with CKSUM to CSUM. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/intel-ovs-compilation fail test: fail

Commit Message

Mike Pattrick July 1, 2022, 3:58 a.m. UTC
From: Flavio Leitner <fbl@sysclose.org>

Now that there is a software fallback for segmentation in case a
netdev doesn't support TCP segmentation offloading (TSO), enable it
by default on all netdevs where possible.

The existing TSO control is inverted, so that it now disables TSO
globally, in case TSO is not desired for some deployment.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Co-authored-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Mike Pattrick <mkp@redhat.com>
---
 Documentation/topics/userspace-tso.rst |  21 ++--
 NEWS                                   |   4 +
 lib/netdev-linux.c                     | 133 ++++++++++++++++++++++++-
 lib/userspace-tso.c                    |   9 +-
 tests/ofproto-macros.at                |   1 +
 vswitchd/vswitch.xml                   |  17 +---
 6 files changed, 154 insertions(+), 31 deletions(-)

Comments

David Marchand July 6, 2022, 9 p.m. UTC | #1
On Fri, Jul 1, 2022 at 5:58 AM Mike Pattrick <mkp@redhat.com> wrote:
>
> [snip commit message and diffstat]

We have one issue with vhost user client ports.

Consider the case of an OVS running before this series is applied,
with userspace tso disabled (which is the case for existing OVS
installations).
I see that qemu negotiates TSO + ECN feature for a virtio port in the
vhost-user backend on OVS side:
2022-07-06T19:46:38.225Z|00175|dpdk|INFO|VHOST_CONFIG: negotiated
Virtio features: 0x17020a783

Next, I apply the whole series and restart OVS:
2022-07-06T19:53:29.121Z|00069|netdev_dpdk|INFO|vHost User device
'vhost1' created in 'client' mode, using client socket
'/var/lib/vhost_sockets/vhost1'
2022-07-06T19:53:29.122Z|00070|dpdk|INFO|VHOST_CONFIG: new device,
handle is 0, path is /var/lib/vhost_sockets/vhost1
2022-07-06T19:53:29.122Z|00001|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_GET_FEATURES
2022-07-06T19:53:29.122Z|00002|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_GET_PROTOCOL_FEATURES
2022-07-06T19:53:29.122Z|00003|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_PROTOCOL_FEATURES
2022-07-06T19:53:29.122Z|00004|dpdk|INFO|VHOST_CONFIG: negotiated
Vhost-user protocol features: 0xcbf
2022-07-06T19:53:29.122Z|00005|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_GET_QUEUE_NUM
2022-07-06T19:53:29.122Z|00006|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_SLAVE_REQ_FD
2022-07-06T19:53:29.122Z|00007|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_OWNER
2022-07-06T19:53:29.122Z|00008|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_GET_FEATURES
2022-07-06T19:53:29.122Z|00009|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_VRING_CALL
2022-07-06T19:53:29.123Z|00010|dpdk|INFO|VHOST_CONFIG: vring call idx:0 file:109
2022-07-06T19:53:29.123Z|00011|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_VRING_CALL
2022-07-06T19:53:29.123Z|00012|dpdk|INFO|VHOST_CONFIG: vring call idx:1 file:110
2022-07-06T19:53:29.123Z|00013|dpdk|INFO|VHOST_CONFIG: vhost peer closed

This happens for every vhost port I have, in a loop flooding OVS logs.
Looking at qemu logs:
2022-07-06T19:53:17.581363Z qemu-kvm: Unexpected end-of-file before
all data were read
2022-07-06T19:53:17.583183Z qemu-kvm: Unexpected end-of-file before
all data were read
2022-07-06T19:53:17.587613Z qemu-kvm: Unexpected end-of-file before
all data were read
2022-07-06T19:53:17.588464Z qemu-kvm: Unexpected end-of-file before
all data were read
2022-07-06T19:53:17.641010Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.641023Z qemu-kvm: vhost VQ 0 ring restore failed:
-1: Invalid argument (22)
2022-07-06T19:53:17.641035Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.641040Z qemu-kvm: vhost VQ 1 ring restore failed:
-1: Invalid argument (22)
2022-07-06T19:53:17.645027Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.645039Z qemu-kvm: vhost VQ 0 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.645047Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.645052Z qemu-kvm: vhost VQ 1 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.648953Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.648964Z qemu-kvm: vhost VQ 0 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.648971Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.648976Z qemu-kvm: vhost VQ 1 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.652951Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.652962Z qemu-kvm: vhost VQ 0 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.652970Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.652975Z qemu-kvm: vhost VQ 1 ring restore failed:
-1: Resource temporarily unavailable (11)
vhost lacks feature mask 8192 for backend
2022-07-06T19:53:29.122990Z qemu-kvm: failed to init vhost_net for queue 0
vhost lacks feature mask 8192 for backend
2022-07-06T19:53:29.259739Z qemu-kvm: failed to init vhost_net for queue 0
vhost lacks feature mask 8192 for backend

Afaiu, 8192 == 0x2000 which translates to bit 13.
VIRTIO_NET_F_HOST_ECN (13) Device can receive TSO with ECN.

Even though this feature was wrongly enabled, we end up in a kind of
live loop situation.
The only solution I found is to stop qemu and restart the vm.

I can see it with an OVS upgrade, and I guess it would be the same for
live migration.
David Marchand July 11, 2022, 9:06 p.m. UTC | #2
On Wed, Jul 6, 2022 at 11:00 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> [snip]
>
> I can see it with a OVS upgrade, and I guess it would be the same for
> live migration.

For now, the least ugly option is probably to keep incorrectly
announcing that OVS supports those ECN and UFO features; at least it
won't break the upgrade.
We may support TSO+ECN later (which would probably mean a DPDK API
update to expose such a hw offload request).

As for UFO, I don't know what to think.
Maxime Coquelin July 12, 2022, 7:54 a.m. UTC | #3
On 7/11/22 23:06, David Marchand wrote:
> On Wed, Jul 6, 2022 at 11:00 PM David Marchand
> <david.marchand@redhat.com> wrote:
>> [snip]
>>
>> I can see it with a OVS upgrade, and I guess it would be the same for
>> live migration.
> 
> For now, the less ugly is probably to incorrectly announce OVS
> supports those ECN and UFO features, at least it won't break the
> upgrade.

Sadly, I don't think we have other choices than keeping the broken
advertisement of unsupported ECN and UFO.

> We may support TSO+ECN later (which would probably mean a dpdk api
> update to expose such a hw offload request).
> 
> As of UFO, I don't know what to think.
David Marchand Sept. 9, 2022, 2:25 p.m. UTC | #4
On Tue, Jul 12, 2022 at 9:54 AM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
>
>
>
> On 7/11/22 23:06, David Marchand wrote:
> > On Wed, Jul 6, 2022 at 11:00 PM David Marchand
> > <david.marchand@redhat.com> wrote:
> >> [snip]
> >>
> >> I can see it with a OVS upgrade, and I guess it would be the same for
> >> live migration.
> >
> > For now, the less ugly is probably to incorrectly announce OVS
> > supports those ECN and UFO features, at least it won't break the
> > upgrade.
>
> Sadly, I don't think we have over choices than keeping the broken
> advertisement of unsupported ECN and UFO.
>
> > We may support TSO+ECN later (which would probably mean a dpdk api
> > update to expose such a hw offload request).
> >
> > As of UFO, I don't know what to think.

I posted a patch which tries to disable those features and, failing
that, falls back to incorrectly announcing them (but disabling TSO).

Let me know what you think.
https://patchwork.ozlabs.org/project/openvswitch/patch/20220909135710.3697046-1-david.marchand@redhat.com/
diff mbox series

Patch

diff --git a/Documentation/topics/userspace-tso.rst b/Documentation/topics/userspace-tso.rst
index 33a85965c..0245723af 100644
--- a/Documentation/topics/userspace-tso.rst
+++ b/Documentation/topics/userspace-tso.rst
@@ -27,8 +27,6 @@ 
 Userspace Datapath - TSO
 ========================
 
-**Note:** This feature is considered experimental.
-
 TCP Segmentation Offload (TSO) enables a network stack to delegate segmentation
 of an oversized TCP segment to the underlying physical NIC. Offload of frame
 segmentation achieves computational savings in the core, freeing up CPU cycles
@@ -48,16 +46,16 @@  refer to the `DPDK documentation`__.
 
 __ https://doc.dpdk.org/guides-21.11/nics/overview.html
 
-Enabling TSO
-~~~~~~~~~~~~
+Disabling TSO
+~~~~~~~~~~~~~
 
-The TSO support may be enabled via a global config value
-``userspace-tso-enable``.  Setting this to ``true`` enables TSO support for
-all ports.::
+TSO support is enabled by default, but it may be disabled via the global
+config value ``userspace-tso-enable``.  Setting this to ``false`` disables
+TSO support for all ports.::
 
-    $ ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true
+    $ ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=false
 
-The default value is ``false``.
+The default value is ``true``.
 
 Changing ``userspace-tso-enable`` requires restarting the daemon.
 
@@ -108,7 +106,10 @@  All kernel devices that use the raw socket interface (veth, for example)
 require the kernel commit 9d2f67e43b73 ("net/packet: fix packet drop as of
 virtio gso") in order to work properly. This commit was merged in upstream
 kernel 4.19-rc7, so make sure your kernel is either newer or contains the
-backport.
+backport. Network topologies that include both TSO and Linux VLAN ports
+require the kernel commit dfed913e8b55 ("net/af_packet: add VLAN support
+for AF_PACKET SOCK_RAW GSO") in order to work properly. This commit was
+merged in upstream kernel 5.19-rc1.
 
 ~~~~~~~~~~~~~~~~~~
 Performance Tuning
diff --git a/NEWS b/NEWS
index 994fdf6a9..e5e2f71c6 100644
--- a/NEWS
+++ b/NEWS
@@ -43,6 +43,10 @@  Post-v2.17.0
      * 'dpif-netdev/subtable-lookup-prio-get' appctl command renamed to
        'dpif-netdev/subtable-lookup-info-get' to better reflect its purpose.
        The old variant is kept for backward compatibility.
+     * Userspace TSO is now enabled by default, but can still be disabled by
+       setting other_config:userspace-tso-enable=false.
+       See Documentation/topics/userspace-tso.rst for more information.
+
 
 
 v2.17.0 - 17 Feb 2022
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 7f314b810..1d38d783e 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -542,6 +542,7 @@  static bool netdev_linux_miimon_enabled(void);
 static void netdev_linux_miimon_run(void);
 static void netdev_linux_miimon_wait(void);
 static int netdev_linux_get_mtu__(struct netdev_linux *netdev, int *mtup);
+static void netdev_linux_set_ol(struct netdev *netdev);
 
 static bool
 is_tap_netdev(const struct netdev *netdev)
@@ -943,11 +944,7 @@  netdev_linux_construct(struct netdev *netdev_)
     /* The socket interface doesn't offer the option to enable only
      * csum offloading without TSO. */
     if (userspace_tso_enabled()) {
-        netdev_->ol_flags |= NETDEV_OFFLOAD_TX_TCP_TSO;
-        netdev_->ol_flags |= NETDEV_OFFLOAD_TX_TCP_CSUM;
-        netdev_->ol_flags |= NETDEV_OFFLOAD_TX_UDP_CSUM;
-        netdev_->ol_flags |= NETDEV_OFFLOAD_TX_SCTP_CSUM;
-        netdev_->ol_flags |= NETDEV_OFFLOAD_TX_IPV4_CSUM;
+        netdev_linux_set_ol(netdev_);
     }
 
     error = get_flags(&netdev->up, &netdev->ifi_flags);
@@ -2348,6 +2345,132 @@  netdev_internal_get_stats(const struct netdev *netdev_,
     return error;
 }
 
+static int
+netdev_linux_read_stringset_info(struct netdev_linux *netdev, uint32_t *len)
+{
+    struct {
+        struct ethtool_sset_info hdr;
+        uint32_t buf[1];
+    } sset_info;
+
+    sset_info.hdr.cmd = ETHTOOL_GSSET_INFO;
+    sset_info.hdr.reserved = 0;
+    sset_info.hdr.sset_mask = 1ULL << ETH_SS_FEATURES;
+
+    int error = netdev_linux_do_ethtool(netdev->up.name,
+            (struct ethtool_cmd *)&sset_info,
+            ETHTOOL_GSSET_INFO, "ETHTOOL_GSSET_INFO");
+    if (error) {
+        return error;
+    }
+    *len = *(uint32_t *) sset_info.hdr.data;
+    return 0;
+}
+
+
+static int
+netdev_linux_read_definitions(struct netdev_linux *netdev,
+                              struct ethtool_gstrings **pstrings)
+{
+    int error = 0;
+    struct ethtool_gstrings *strings = NULL;
+    uint32_t len = 0;
+
+    error = netdev_linux_read_stringset_info(netdev, &len);
+    if (error || !len) {
+        return error;
+    }
+    strings = xcalloc(1, sizeof(*strings) + len * ETH_GSTRING_LEN);
+    if (!strings) {
+        return ENOMEM;
+    }
+
+    strings->cmd = ETHTOOL_GSTRINGS;
+    strings->string_set = ETH_SS_FEATURES;
+    strings->len = len;
+    error = netdev_linux_do_ethtool(netdev->up.name,
+            (struct ethtool_cmd *) strings,
+            ETHTOOL_GSTRINGS, "ETHTOOL_GSTRINGS");
+    if (error) {
+        goto out;
+    }
+
+    for (int i = 0; i < len; i++) {
+        strings->data[(i + 1) * ETH_GSTRING_LEN - 1] = 0;
+    }
+
+    *pstrings = strings;
+
+    return 0;
+out:
+    *pstrings = NULL;
+    free(strings);
+    return error;
+}
+
+static void
+netdev_linux_set_ol(struct netdev *netdev_)
+{
+    struct netdev_linux *netdev = netdev_linux_cast(netdev_);
+    struct ethtool_gstrings *names = NULL;
+    struct ethtool_gfeatures *features = NULL;
+    int error;
+
+    COVERAGE_INC(netdev_get_ethtool);
+
+    error = netdev_linux_read_definitions(netdev, &names);
+    if (error) {
+        return;
+    }
+
+    features = xmalloc(sizeof *features +
+                       DIV_ROUND_UP(names->len, 32) *
+                       sizeof features->features[0]);
+    if (!features) {
+        goto out;
+    }
+
+    features->cmd = ETHTOOL_GFEATURES;
+    features->size = DIV_ROUND_UP(names->len, 32);
+    error = netdev_linux_do_ethtool(netdev_get_name(netdev_),
+            (struct ethtool_cmd *) features,
+            ETHTOOL_GFEATURES, "ETHTOOL_GFEATURES");
+
+    if (error) {
+        goto out;
+    }
+
+#define FEATURE_WORD(blocks, index, field)  ((blocks)[(index) / 32U].field)
+#define FEATURE_FIELD_FLAG(index)       (1U << (index) % 32U)
+#define FEATURE_BIT_IS_SET(blocks, index, field)        \
+    (FEATURE_WORD(blocks, index, field) & FEATURE_FIELD_FLAG(index))
+
+    netdev->up.ol_flags = 0;
+    static const struct {
+        char * string;
+        int value;
+    } t_list[] = {
+        {"tx-checksum-ipv4", NETDEV_OFFLOAD_TX_IPV4_CSUM},
+        {"tx-sctp-segmentation", NETDEV_OFFLOAD_TX_SCTP_CSUM},
+        {"tx-udp-segmentation", NETDEV_OFFLOAD_TX_UDP_CSUM},
+    };
+
+    for (int i = 0; i < names->len; i++) {
+        char * name = (char *) names->data + i * ETH_GSTRING_LEN;
+        for (int j = 0; j < sizeof t_list / sizeof t_list[0]; j++) {
+            if (strcmp(t_list[j].string, name) == 0) {
+                if (FEATURE_BIT_IS_SET(features->features, i, active)) {
+                    netdev_->ol_flags |= t_list[j].value;
+                }
+            }
+        }
+    }
+
+out:
+    free(names);
+    free(features);
+}
+
 static void
 netdev_linux_read_features(struct netdev_linux *netdev)
 {
diff --git a/lib/userspace-tso.c b/lib/userspace-tso.c
index f843c2a76..27b9384d7 100644
--- a/lib/userspace-tso.c
+++ b/lib/userspace-tso.c
@@ -25,17 +25,18 @@ 
 
 VLOG_DEFINE_THIS_MODULE(userspace_tso);
 
-static bool userspace_tso = false;
+static bool userspace_tso = true;
 
 void
 userspace_tso_init(const struct smap *ovs_other_config)
 {
-    if (smap_get_bool(ovs_other_config, "userspace-tso-enable", false)) {
+    if (!smap_get_bool(ovs_other_config, "userspace-tso-enable", true)) {
         static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
 
         if (ovsthread_once_start(&once)) {
-            VLOG_INFO("Userspace TCP Segmentation Offloading support enabled");
-            userspace_tso = true;
+            VLOG_INFO("Userspace TCP Segmentation Offloading support "
+                      "disabled");
+            userspace_tso = false;
             ovsthread_once_done(&once);
         }
     }
diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
index b18f0fbc1..e91ef233e 100644
--- a/tests/ofproto-macros.at
+++ b/tests/ofproto-macros.at
@@ -256,6 +256,7 @@  check_logs () {
 /ovs_rcu.*blocked [[0-9]]* ms waiting for .* to quiesce/d
 /Dropped [[0-9]]* log messages/d
 /setting extended ack support failed/d
+/ETHTOOL_GSSET_INFO/d
 /|WARN|/p
 /|ERR|/p
 /|EMER|/p" ${logs}
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index cc1dd77ec..8aacddac9 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -761,22 +761,15 @@ 
       <column name="other_config" key="userspace-tso-enable"
               type='{"type": "boolean"}'>
         <p>
-          Set this value to <code>true</code> to enable userspace support for
-          TCP Segmentation Offloading (TSO). When it is enabled, the interfaces
-          can provide an oversized TCP segment to the datapath and the datapath
-          will offload the TCP segmentation and checksum calculation to the
-          interfaces when necessary.
+          Set this value to <code>false</code> to disable userspace support for
+          TCP Segmentation Offloading (TSO). When it is disabled, the datapath
+          will not offload the TCP segmentation and checksum calculation to
+          the interfaces.
         </p>
         <p>
-          The default value is <code>false</code>. Changing this value requires
+          The default value is <code>true</code>. Changing this value requires
           restarting the daemon.
         </p>
-        <p>
-          The feature only works if Open vSwitch is built with DPDK support.
-        </p>
-        <p>
-          The feature is considered experimental.
-        </p>
       </column>
     </group>
     <group title="Status">