[net] net: virtio: cap mtu when XDP programs are running

Message ID 20170102223031.11541.28717.stgit@john-Precision-Tower-5810
State Changes Requested, archived
Delegated to: David Miller

Commit Message

John Fastabend Jan. 2, 2017, 10:30 p.m. UTC
XDP programs can not consume multiple pages so we cap the MTU to
avoid this case. Virtio-net however only checks the MTU at XDP
program load and does not block MTU changes after the program
has loaded.

This patch sets/clears the max_mtu value at XDP load/unload time.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
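
For illustration only (this is not the patch itself): a minimal userspace sketch of the behaviour the commit message describes, assuming 4 KiB pages and a 16-byte stand-in for sizeof(struct padded_vnet_hdr). The model_* names and MODEL_* constants are hypothetical and do not exist in the driver.

```c
#include <stdbool.h>

/* Userspace model; all values are assumptions chosen for illustration. */
#define MODEL_PAGE_SIZE   4096u   /* assumes 4 KiB pages */
#define MODEL_HDR_SIZE    16u     /* hypothetical stand-in for sizeof(struct padded_vnet_hdr) */
#define MODEL_ETH_MAX_MTU 65535u  /* same value as the kernel's ETH_MAX_MTU */

struct model_dev {
	unsigned int mtu;      /* currently configured MTU */
	unsigned int max_mtu;  /* ceiling checked on every MTU change */
	bool xdp_attached;
};

/*
 * Attach or detach an XDP program. On attach the MTU ceiling is lowered so
 * that a packet plus its header fits in one page; on detach it is raised
 * again. Returns 0 on success, -1 if the current MTU is already too large.
 */
int model_xdp_set(struct model_dev *dev, bool attach)
{
	unsigned int max_sz = MODEL_PAGE_SIZE - MODEL_HDR_SIZE;

	if (attach) {
		if (dev->mtu > max_sz)
			return -1;
		dev->max_mtu = max_sz;
	} else {
		dev->max_mtu = MODEL_ETH_MAX_MTU;
	}
	dev->xdp_attached = attach;
	return 0;
}

/* MTU change path: rejected whenever the request exceeds the ceiling. */
int model_change_mtu(struct model_dev *dev, unsigned int new_mtu)
{
	if (new_mtu > dev->max_mtu)
		return -1;
	dev->mtu = new_mtu;
	return 0;
}
```

With this model, an MTU change to 9000 is refused while a program is attached and accepted again after detach, which is the gap the patch aims to close.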

Comments

Jason Wang Jan. 3, 2017, 6:14 a.m. UTC | #1
On 2017年01月03日 06:30, John Fastabend wrote:
> XDP programs can not consume multiple pages so we cap the MTU to
> avoid this case. Virtio-net however only checks the MTU at XDP
> program load and does not block MTU changes after the program
> has loaded.
>
> This patch sets/clears the max_mtu value at XDP load/unload time.
>
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> ---
>   drivers/net/virtio_net.c |    9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 5deeda6..783e842 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1699,6 +1699,9 @@ static void virtnet_init_settings(struct net_device *dev)
>   	.set_settings = virtnet_set_settings,
>   };
>   
> +#define MIN_MTU ETH_MIN_MTU
> +#define MAX_MTU ETH_MAX_MTU
> +
>   static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
>   {
>   	unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
> @@ -1748,6 +1751,9 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
>   			virtnet_set_queues(vi, curr_qp);
>   			return PTR_ERR(prog);
>   		}
> +		dev->max_mtu = max_sz;
> +	} else {
> +		dev->max_mtu = ETH_MAX_MTU;

Or use ETH_DATA_LEN here, considering that we only allocate
GOOD_PACKET_LEN bytes for each small buffer?

Thanks

>   	}
>   
>   	vi->xdp_queue_pairs = xdp_qp;
> @@ -2133,9 +2139,6 @@ static bool virtnet_validate_features(struct virtio_device *vdev)
>   	return true;
>   }
>   
> -#define MIN_MTU ETH_MIN_MTU
> -#define MAX_MTU ETH_MAX_MTU
> -
>   static int virtnet_probe(struct virtio_device *vdev)
>   {
>   	int i, err;
>
John Fastabend Jan. 3, 2017, 4:48 p.m. UTC | #2
On 17-01-02 10:14 PM, Jason Wang wrote:
> 
> 
> On 2017年01月03日 06:30, John Fastabend wrote:
>> XDP programs can not consume multiple pages so we cap the MTU to
>> avoid this case. Virtio-net however only checks the MTU at XDP
>> program load and does not block MTU changes after the program
>> has loaded.
>>
>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>
>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>> ---
>>   drivers/net/virtio_net.c |    9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 5deeda6..783e842 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1699,6 +1699,9 @@ static void virtnet_init_settings(struct net_device *dev)
>>       .set_settings = virtnet_set_settings,
>>   };
>>
>> +#define MIN_MTU ETH_MIN_MTU
>> +#define MAX_MTU ETH_MAX_MTU
>> +
>>   static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
>>   {
>>       unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
>> @@ -1748,6 +1751,9 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
>>               virtnet_set_queues(vi, curr_qp);
>>               return PTR_ERR(prog);
>>           }
>> +        dev->max_mtu = max_sz;
>> +    } else {
>> +        dev->max_mtu = ETH_MAX_MTU;
> 
> Or use ETH_DATA_LEN here consider we only allocate a size of GOOD_PACKET_LEN for
> each small buffer?
> 
> Thanks

OK, so this logic is a bit too simple. When it resets max_mtu, I guess it
needs to read the MTU via

   virtio_cread16(vdev, ...)

or we may break the negotiated mtu.
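
A rough userspace sketch of that idea, for illustration only: on XDP unload, restore the ceiling from whatever the device advertised rather than unconditionally using ETH_MAX_MTU. The struct, its fields and the function name are hypothetical; config_mtu merely stands in for the value a 16-bit config-space read would return.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ETH_MAX_MTU 65535u  /* same value as the kernel's ETH_MAX_MTU */

/* Hypothetical model of the device state relevant to the MTU ceiling. */
struct model_vdev {
	bool     mtu_feature;  /* assumption: device advertised an MTU */
	uint16_t config_mtu;   /* assumption: value a 16-bit config read would return */
};

/*
 * Ceiling to restore when the XDP program is removed: the device's
 * advertised MTU when there is one, otherwise the generic maximum.
 */
unsigned int model_restore_max_mtu(const struct model_vdev *vdev)
{
	if (vdev->mtu_feature)
		return vdev->config_mtu;
	return MODEL_ETH_MAX_MTU;
}
```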

As for capping it at GOOD_PACKET_LEN, this has the nice benefit of avoiding any
underestimates in EWMA predictions, because it appears min estimates are capped
at GOOD_PACKET_LEN via get_mergeable_buf_len().

Thanks,
John
Jason Wang Jan. 4, 2017, 3:16 a.m. UTC | #4
On 2017年01月04日 00:48, John Fastabend wrote:
> On 17-01-02 10:14 PM, Jason Wang wrote:
>>
>> On 2017年01月03日 06:30, John Fastabend wrote:
>>> XDP programs can not consume multiple pages so we cap the MTU to
>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>> program load and does not block MTU changes after the program
>>> has loaded.
>>>
>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>
>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>> ---
>>>    drivers/net/virtio_net.c |    9 ++++++---
>>>    1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index 5deeda6..783e842 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -1699,6 +1699,9 @@ static void virtnet_init_settings(struct net_device *dev)
>>>        .set_settings = virtnet_set_settings,
>>>    };
>>>
>>> +#define MIN_MTU ETH_MIN_MTU
>>> +#define MAX_MTU ETH_MAX_MTU
>>> +
>>>    static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
>>>    {
>>>        unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
>>> @@ -1748,6 +1751,9 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
>>>                virtnet_set_queues(vi, curr_qp);
>>>                return PTR_ERR(prog);
>>>            }
>>> +        dev->max_mtu = max_sz;
>>> +    } else {
>>> +        dev->max_mtu = ETH_MAX_MTU;
>> Or use ETH_DATA_LEN here consider we only allocate a size of GOOD_PACKET_LEN for
>> each small buffer?
>>
>> Thanks
> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
> needs to read the mtu via
>
>     virtio_cread16(vdev, ...)
>
> or we may break the negotiated mtu.

Yes, this is a problem (even if we use ETH_MAX_MTU). We may need a method to
notify the device about the MTU in this case, which virtio does not
support today.
>
> As for capping it at GOOD_PACKET_LEN this has the nice benefit of avoiding any
> underestimates in EWMA predictions because it appears min estimates are capped
> at GOOD_PACKET_LEN via get_mergeable_buf_len().

There seems to be a misunderstanding here. I meant using GOOD_PACKET_LEN
only for small buffers (which do not use EWMA).
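
To make the distinction concrete, a small userspace sketch (illustration only): pick the XDP ceiling by receive-buffer mode. GOOD_PACKET_LEN is modelled as ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN; the enum and function name are hypothetical, and the page-derived limit is passed in rather than computed.

```c
/* Userspace model of the constants involved. */
#define MODEL_ETH_HLEN        14u
#define MODEL_VLAN_HLEN       4u
#define MODEL_ETH_DATA_LEN    1500u
#define MODEL_GOOD_PACKET_LEN \
	(MODEL_ETH_HLEN + MODEL_VLAN_HLEN + MODEL_ETH_DATA_LEN)

/* Hypothetical names for the two receive-buffer modes under discussion. */
enum model_rx_mode {
	MODEL_RX_SMALL,      /* fixed buffers of GOOD_PACKET_LEN bytes, no EWMA */
	MODEL_RX_MERGEABLE   /* EWMA-sized buffers; limited by the page size */
};

/*
 * MTU ceiling while XDP is loaded. Small buffers only hold
 * GOOD_PACKET_LEN bytes, which leaves ETH_DATA_LEN for the payload;
 * otherwise the caller-supplied page-derived limit applies.
 */
unsigned int model_xdp_max_mtu(enum model_rx_mode mode, unsigned int page_limit)
{
	if (mode == MODEL_RX_SMALL)
		return MODEL_GOOD_PACKET_LEN - MODEL_ETH_HLEN - MODEL_VLAN_HLEN;
	return page_limit;
}
```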

Thanks

>
> Thanks,
> John
>
John Fastabend Jan. 4, 2017, 6:57 p.m. UTC | #5
[...]

> On 2017年01月04日 00:48, John Fastabend wrote:
>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>
>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>> program load and does not block MTU changes after the program
>>>> has loaded.
>>>>
>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>
>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>>> ---

[...]

>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>> needs to read the mtu via
>>
>>     virtio_cread16(vdev, ...)
>>
>> or we may break the negotiated mtu.
> 
> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
> the device about the mtu in this case which is not supported by virtio now.

Note this is not really an XDP-specific problem. The guest can change the MTU
after init time even without XDP, which I assume should ideally result in a
notification if the MTU is negotiated.

>>
>> As for capping it at GOOD_PACKET_LEN this has the nice benefit of avoiding any
>> underestimates in EWMA predictions because it appears min estimates are capped
>> at GOOD_PACKET_LEN via get_mergeable_buf_len().
> 
> This seems something misunderstanding here, I meant only use GOOD_PACKET_LEN for
> small buffer (which does not use EWMA).
> 

Yep, I think it's all cleared up now.

Thanks.
Jason Wang Jan. 5, 2017, 3:09 a.m. UTC | #6
On 2017年01月05日 02:57, John Fastabend wrote:
> [...]
>
>> On 2017年01月04日 00:48, John Fastabend wrote:
>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>> program load and does not block MTU changes after the program
>>>>> has loaded.
>>>>>
>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>
>>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>>>> ---
> [...]
>
>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>>> needs to read the mtu via
>>>
>>>      virtio_cread16(vdev, ...)
>>>
>>> or we may break the negotiated mtu.
>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
>> the device about the mtu in this case which is not supported by virtio now.
> Note this is not really a XDP specific problem. The guest can change the MTU
> after init time even without XDP which I assume should ideally result in a
> notification if the MTU is negotiated.

Yes. Michael, do you think we need to add some mechanism to notify the host
about an MTU change in this case?

Thanks
Michael S. Tsirkin Jan. 9, 2017, 11:05 p.m. UTC | #7
On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
> 
> 
> On 2017年01月05日 02:57, John Fastabend wrote:
> > [...]
> > 
> > > On 2017年01月04日 00:48, John Fastabend wrote:
> > > > On 17-01-02 10:14 PM, Jason Wang wrote:
> > > > > On 2017年01月03日 06:30, John Fastabend wrote:
> > > > > > XDP programs can not consume multiple pages so we cap the MTU to
> > > > > > avoid this case. Virtio-net however only checks the MTU at XDP
> > > > > > program load and does not block MTU changes after the program
> > > > > > has loaded.
> > > > > > 
> > > > > > This patch sets/clears the max_mtu value at XDP load/unload time.
> > > > > > 
> > > > > > Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> > > > > > ---
> > [...]
> > 
> > > > OK so this logic is a bit too simply. When it resets the max_mtu I guess it
> > > > needs to read the mtu via
> > > > 
> > > >      virtio_cread16(vdev, ...)
> > > > 
> > > > or we may break the negotiated mtu.
> > > Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
> > > the device about the mtu in this case which is not supported by virtio now.
> > Note this is not really a XDP specific problem. The guest can change the MTU
> > after init time even without XDP which I assume should ideally result in a
> > notification if the MTU is negotiated.
> 
> Yes, Michael, do you think we need add some mechanism to notify host about
> MTU change in this case?
> 
> Thanks

Why does host care?
John Fastabend Jan. 9, 2017, 11:13 p.m. UTC | #8
On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
>>
>>
>> On 2017年01月05日 02:57, John Fastabend wrote:
>>> [...]
>>>
>>>> On 2017年01月04日 00:48, John Fastabend wrote:
>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>>>> program load and does not block MTU changes after the program
>>>>>>> has loaded.
>>>>>>>
>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>>>
>>>>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>>>>>> ---
>>> [...]
>>>
>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>>>>> needs to read the mtu via
>>>>>
>>>>>      virtio_cread16(vdev, ...)
>>>>>
>>>>> or we may break the negotiated mtu.
>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
>>>> the device about the mtu in this case which is not supported by virtio now.
>>> Note this is not really a XDP specific problem. The guest can change the MTU
>>> after init time even without XDP which I assume should ideally result in a
>>> notification if the MTU is negotiated.
>>
>> Yes, Michael, do you think we need add some mechanism to notify host about
>> MTU change in this case?
>>
>> Thanks
> 
> Why does host care?
> 

Well, the guest will drop packets after the MTU has been reduced, although the
guest, by reducing its MTU, must in some sense expect this. Likewise, if the
host were to change the MTU after virtio_net probe time, the guest would not
learn about it.

I think at best negotiating the MTU is just a hint? If a system _really_ cares,
we could use LLDP or some other out-of-band mechanism to learn/set/adjust the
MTU on both systems, and it would be more robust. I'm not actually convinced
this is a problem: in bare-metal systems we have the same issue with physical
switches and solve it out of band via configuration, protocols, etc.

.John
Michael S. Tsirkin Jan. 9, 2017, 11:24 p.m. UTC | #9
On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
> > On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
> >>
> >>
> >> On 2017年01月05日 02:57, John Fastabend wrote:
> >>> [...]
> >>>
> >>>> On 2017年01月04日 00:48, John Fastabend wrote:
> >>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
> >>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
> >>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
> >>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
> >>>>>>> program load and does not block MTU changes after the program
> >>>>>>> has loaded.
> >>>>>>>
> >>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
> >>>>>>>
> >>>>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> >>>>>>> ---
> >>> [...]
> >>>
> >>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
> >>>>> needs to read the mtu via
> >>>>>
> >>>>>      virtio_cread16(vdev, ...)
> >>>>>
> >>>>> or we may break the negotiated mtu.
> >>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
> >>>> the device about the mtu in this case which is not supported by virtio now.
> >>> Note this is not really a XDP specific problem. The guest can change the MTU
> >>> after init time even without XDP which I assume should ideally result in a
> >>> notification if the MTU is negotiated.
> >>
> >> Yes, Michael, do you think we need add some mechanism to notify host about
> >> MTU change in this case?
> >>
> >> Thanks
> > 
> > Why does host care?
> > 
> 
> Well the guest will drop packets after mtu has been reduced.

I didn't know that. Where in the code does this happen?

> Although the guest
> by reducing its MTU in some sense must expect this. Likewise if the host were
> to change MTU after virtio_net probe time the guest would not learn about it.

The spec explicitly disallows this last one.

> I think at best negotiating the mtu is just a hint? If system _really_ cares
> we could use lldp or some other out of band mechanism to learn/set/adjust MTU
> on both systems and it would be more robust. I'm not actually convinced this
> is a problem in bare metal systems we have the same issue with physical
> switches and solve it out of band via configuration, protocols, etc.
> 
> .John

ATM we don't have negotiation in virtio, just a max MTU limit.
This doesn't free the guest from configuring the MTU correctly;
it just helps it avoid doing something clearly bogus.
John Fastabend Jan. 9, 2017, 11:49 p.m. UTC | #10
On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
>>>>
>>>>
>>>> On 2017年01月05日 02:57, John Fastabend wrote:
>>>>> [...]
>>>>>
>>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>>>>>> program load and does not block MTU changes after the program
>>>>>>>>> has loaded.
>>>>>>>>>
>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>>>>>
>>>>>>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>>>>>>>> ---
>>>>> [...]
>>>>>
>>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>>>>>>> needs to read the mtu via
>>>>>>>
>>>>>>>      virtio_cread16(vdev, ...)
>>>>>>>
>>>>>>> or we may break the negotiated mtu.
>>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
>>>>>> the device about the mtu in this case which is not supported by virtio now.
>>>>> Note this is not really a XDP specific problem. The guest can change the MTU
>>>>> after init time even without XDP which I assume should ideally result in a
>>>>> notification if the MTU is negotiated.
>>>>
>>>> Yes, Michael, do you think we need add some mechanism to notify host about
>>>> MTU change in this case?
>>>>
>>>> Thanks
>>>
>>> Why does host care?
>>>
>>
>> Well the guest will drop packets after mtu has been reduced.
> 
> I didn't know. What place in code does this?
> 

Hmm, in many of the drivers it is convention to use the MTU to set the RX
buffer sizes and a receive-side max length filter. For example, in the Intel
drivers, if a packet with length greater than MTU + some headroom is received,
we drop it. I guess nothing in the networking stack RX path forces this,
though, and virtio doesn't have any code to drop packets based on RX size.

In virtio I don't see any existing case currently. In the XDP case, though, we
need to ensure packets fit in a page for the time being, which is why I was
looking at this code and generated this patch.
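
As a rough illustration of the receive-side length filter convention mentioned above (a userspace sketch, not taken from any particular driver): the headroom is assumed here to be Ethernet header + one VLAN tag + FCS, and the function name is hypothetical.

```c
#include <stdbool.h>

#define MODEL_ETH_HLEN    14u  /* Ethernet header */
#define MODEL_VLAN_HLEN   4u   /* one VLAN tag */
#define MODEL_ETH_FCS_LEN 4u   /* frame check sequence */

/*
 * Accept a received frame only if it fits within MTU + headroom.
 * The exact headroom differs between drivers; the sum used here is
 * an assumption for the sake of the example.
 */
bool model_rx_accept(unsigned int frame_len, unsigned int mtu)
{
	unsigned int max_frame =
		mtu + MODEL_ETH_HLEN + MODEL_VLAN_HLEN + MODEL_ETH_FCS_LEN;

	return frame_len <= max_frame;
}
```

With this kind of filter, lowering the MTU on the receiver silently shrinks the largest frame it will accept, which is the drop behaviour described above.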

>> Although the guest
>> by reducing its MTU in some sense must expect this. Likewise if the host were
>> to change MTU after virtio_net probe time the guest would not learn about it.
> 
> The spec explicitly disallows this last one.

OK. By the way, where do I get the latest source? I see the published virtio 1.0
spec at the oasis-open.org site, but it doesn't mention the MTU logic.

> 
>> I think at best negotiating the mtu is just a hint? If system _really_ cares
>> we could use lldp or some other out of band mechanism to learn/set/adjust MTU
>> on both systems and it would be more robust. I'm not actually convinced this
>> is a problem in bare metal systems we have the same issue with physical
>> switches and solve it out of band via configuration, protocols, etc.
>>
>> .John
> 
> ATM we don't have negotiation in virtio, just a max mtu limit.
> This doesn't free guest from configuring mtu correctly,
> just helps it avoid doing something clearly bogus.
> 

Yep. I'm fine with calling it a misconfiguration if the guest reduces the MTU
and the host continues to send packets at the advertised MTU.
Michael S. Tsirkin Jan. 9, 2017, 11:58 p.m. UTC | #11
On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
> > On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
> >> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
> >>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
> >>>>
> >>>>
> >>>> On 2017年01月05日 02:57, John Fastabend wrote:
> >>>>> [...]
> >>>>>
> >>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
> >>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
> >>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
> >>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
> >>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
> >>>>>>>>> program load and does not block MTU changes after the program
> >>>>>>>>> has loaded.
> >>>>>>>>>
> >>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> >>>>>>>>> ---
> >>>>> [...]
> >>>>>
> >>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
> >>>>>>> needs to read the mtu via
> >>>>>>>
> >>>>>>>      virtio_cread16(vdev, ...)
> >>>>>>>
> >>>>>>> or we may break the negotiated mtu.
> >>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
> >>>>>> the device about the mtu in this case which is not supported by virtio now.
> >>>>> Note this is not really a XDP specific problem. The guest can change the MTU
> >>>>> after init time even without XDP which I assume should ideally result in a
> >>>>> notification if the MTU is negotiated.
> >>>>
> >>>> Yes, Michael, do you think we need add some mechanism to notify host about
> >>>> MTU change in this case?
> >>>>
> >>>> Thanks
> >>>
> >>> Why does host care?
> >>>
> >>
> >> Well the guest will drop packets after mtu has been reduced.
> > 
> > I didn't know. What place in code does this?
> > 
> 
> hmm in many of the drivers it is convention to use the mtu to set the rx
> buffer sizes and a receive side max length filter. For example in the Intel
> drivers if a packet with length greater than MTU + some headroom is received we
> drop it. I guess in the networking stack RX path though nothing forces this and
> virtio doesn't have any code to drop packets on rx size.
> 
> In virtio I don't see any existing case currently. In the XDP case though we
> need to ensure packets fit in a page for the time being which is why I was
> looking at this code and generated this patch.

I'd say just look at the hardware max mtu. Ignore the configured mtu.


> >> Although the guest
> >> by reducing its MTU in some sense must expect this. Likewise if the host were
> >> to change MTU after virtio_net probe time the guest would not learn about it.
> > 
> > The spec explicitly disallows this last one.
> 
> OK. By the way were do I get the latest source I see the published virtio1.0 at
> the oasis-open.org site but it doesn't mention the MTU logic.

You need to get it from svn.

> > 
> >> I think at best negotiating the mtu is just a hint? If system _really_ cares
> >> we could use lldp or some other out of band mechanism to learn/set/adjust MTU
> >> on both systems and it would be more robust. I'm not actually convinced this
> >> is a problem in bare metal systems we have the same issue with physical
> >> switches and solve it out of band via configuration, protocols, etc.
> >>
> >> .John
> > 
> > ATM we don't have negotiation in virtio, just a max mtu limit.
> > This doesn't free guest from configuring mtu correctly,
> > just helps it avoid doing something clearly bogus.
> > 
> 
> Yep. I'm fine with calling it a misconfiguration if the guest reduces the MTU
> and the host continues to send packets @ advertised MTU.
Jason Wang Jan. 10, 2017, 2:29 a.m. UTC | #12
On 2017年01月10日 07:58, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
>> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
>>> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
>>>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
>>>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
>>>>>> On 2017年01月05日 02:57, John Fastabend wrote:
>>>>>>> [...]
>>>>>>>
>>>>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
>>>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>>>>>>>> program load and does not block MTU changes after the program
>>>>>>>>>>> has loaded.
>>>>>>>>>>>
>>>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>>>>>>>>>> ---
>>>>>>> [...]
>>>>>>>
>>>>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>>>>>>>>> needs to read the mtu via
>>>>>>>>>
>>>>>>>>>       virtio_cread16(vdev, ...)
>>>>>>>>>
>>>>>>>>> or we may break the negotiated mtu.
>>>>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
>>>>>>>> the device about the mtu in this case which is not supported by virtio now.
>>>>>>> Note this is not really a XDP specific problem. The guest can change the MTU
>>>>>>> after init time even without XDP which I assume should ideally result in a
>>>>>>> notification if the MTU is negotiated.
>>>>>> Yes, Michael, do you think we need add some mechanism to notify host about
>>>>>> MTU change in this case?
>>>>>>
>>>>>> Thanks
>>>>> Why does host care?
>>>>>
>>>> Well the guest will drop packets after mtu has been reduced.
>>> I didn't know. What place in code does this?
>>>
>> hmm in many of the drivers it is convention to use the mtu to set the rx
>> buffer sizes and a receive side max length filter. For example in the Intel
>> drivers if a packet with length greater than MTU + some headroom is received we
>> drop it. I guess in the networking stack RX path though nothing forces this and
>> virtio doesn't have any code to drop packets on rx size.
>>
>> In virtio I don't see any existing case currently. In the XDP case though we
>> need to ensure packets fit in a page for the time being which is why I was
>> looking at this code and generated this patch.
> I'd say just look at the hardware max mtu. Ignore the configured mtu.
>
>

Does this work for small buffers, considering that we always allocate an skb
of size GOOD_PACKET_LEN? I think in any case, we should limit max_mtu to
GOOD_PACKET_LEN for small buffers.

Thanks
Michael S. Tsirkin Jan. 10, 2017, 2:51 a.m. UTC | #13
On Tue, Jan 10, 2017 at 10:29:39AM +0800, Jason Wang wrote:
> 
> 
> On 2017年01月10日 07:58, Michael S. Tsirkin wrote:
> > On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
> > > On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
> > > > On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
> > > > > On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
> > > > > > On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
> > > > > > > On 2017年01月05日 02:57, John Fastabend wrote:
> > > > > > > > [...]
> > > > > > > > 
> > > > > > > > > On 2017年01月04日 00:48, John Fastabend wrote:
> > > > > > > > > > On 17-01-02 10:14 PM, Jason Wang wrote:
> > > > > > > > > > > On 2017年01月03日 06:30, John Fastabend wrote:
> > > > > > > > > > > > XDP programs can not consume multiple pages so we cap the MTU to
> > > > > > > > > > > > avoid this case. Virtio-net however only checks the MTU at XDP
> > > > > > > > > > > > program load and does not block MTU changes after the program
> > > > > > > > > > > > has loaded.
> > > > > > > > > > > > 
> > > > > > > > > > > > This patch sets/clears the max_mtu value at XDP load/unload time.
> > > > > > > > > > > > 
> > > > > > > > > > > > > Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> > > > > > > > > > > > ---
> > > > > > > > [...]
> > > > > > > > 
> > > > > > > > > > OK so this logic is a bit too simply. When it resets the max_mtu I guess it
> > > > > > > > > > needs to read the mtu via
> > > > > > > > > > 
> > > > > > > > > >       virtio_cread16(vdev, ...)
> > > > > > > > > > 
> > > > > > > > > > or we may break the negotiated mtu.
> > > > > > > > > Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
> > > > > > > > > the device about the mtu in this case which is not supported by virtio now.
> > > > > > > > Note this is not really a XDP specific problem. The guest can change the MTU
> > > > > > > > after init time even without XDP which I assume should ideally result in a
> > > > > > > > notification if the MTU is negotiated.
> > > > > > > Yes, Michael, do you think we need add some mechanism to notify host about
> > > > > > > MTU change in this case?
> > > > > > > 
> > > > > > > Thanks
> > > > > > Why does host care?
> > > > > > 
> > > > > Well the guest will drop packets after mtu has been reduced.
> > > > I didn't know. What place in code does this?
> > > > 
> > > hmm in many of the drivers it is convention to use the mtu to set the rx
> > > buffer sizes and a receive side max length filter. For example in the Intel
> > > drivers if a packet with length greater than MTU + some headroom is received we
> > > drop it. I guess in the networking stack RX path though nothing forces this and
> > > virtio doesn't have any code to drop packets on rx size.
> > > 
> > > In virtio I don't see any existing case currently. In the XDP case though we
> > > need to ensure packets fit in a page for the time being which is why I was
> > > looking at this code and generated this patch.
> > I'd say just look at the hardware max mtu. Ignore the configured mtu.
> > 
> > 
> 
> Does this work for small buffers consider it always allocate skb with size
> of GOOD_PACKET_LEN?

The spec says hardware won't send packets larger than the max MTU in config space.

> I think in any case, we should limit max_mtu to
> GOOD_PACKET_LEN for small buffers.
> 
> Thanks

XDP seems to have a bunch of weird restrictions. I just
do not like that the logic spills out to all drivers.
What if someone decides to extend it to two pages in the future?
Recode it all in all drivers ...

Why can't net core enforce mtu?
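
One possible shape for such a core-level check, as a userspace sketch only: the driver would describe its constraints once, and a single function would validate every MTU change, including a page-count limit for XDP. All names and fields here are hypothetical; nothing like xdp_pages or rx_overhead is claimed to exist in the kernel.

```c
#define MODEL_PAGE_SIZE 4096u  /* assumes 4 KiB pages */

/* Hypothetical per-device limits a driver would fill in once. */
struct model_netdev {
	unsigned int min_mtu;
	unsigned int max_mtu;
	unsigned int xdp_pages;    /* pages an XDP frame may span; 0 = no program */
	unsigned int rx_overhead;  /* per-buffer header bytes in front of the packet */
};

/*
 * Central MTU validation: range check first, then the XDP constraint.
 * Extending XDP to two pages would only change xdp_pages, not the drivers.
 * Returns 0 if new_mtu is acceptable, -1 otherwise.
 */
int model_core_check_mtu(const struct model_netdev *dev, unsigned int new_mtu)
{
	if (new_mtu < dev->min_mtu || new_mtu > dev->max_mtu)
		return -1;

	if (dev->xdp_pages) {
		unsigned int limit =
			dev->xdp_pages * MODEL_PAGE_SIZE - dev->rx_overhead;

		if (new_mtu > limit)
			return -1;
	}
	return 0;
}
```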
John Fastabend Jan. 10, 2017, 3:30 a.m. UTC | #14
On 17-01-09 06:51 PM, Michael S. Tsirkin wrote:
> On Tue, Jan 10, 2017 at 10:29:39AM +0800, Jason Wang wrote:
>>
>>
>> On 2017年01月10日 07:58, Michael S. Tsirkin wrote:
>>> On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
>>>> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
>>>>> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
>>>>>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
>>>>>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
>>>>>>>> On 2017年01月05日 02:57, John Fastabend wrote:
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
>>>>>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>>>>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>>>>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>>>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>>>>>>>>>> program load and does not block MTU changes after the program
>>>>>>>>>>>>> has loaded.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: John Fastabend<john.r.fastabend@intel.com>
>>>>>>>>>>>>> ---
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>>>>>>>>>>> needs to read the mtu via
>>>>>>>>>>>
>>>>>>>>>>>       virtio_cread16(vdev, ...)
>>>>>>>>>>>
>>>>>>>>>>> or we may break the negotiated mtu.
>>>>>>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
>>>>>>>>>> the device about the mtu in this case which is not supported by virtio now.
>>>>>>>>> Note this is not really a XDP specific problem. The guest can change the MTU
>>>>>>>>> after init time even without XDP which I assume should ideally result in a
>>>>>>>>> notification if the MTU is negotiated.
>>>>>>>> Yes, Michael, do you think we need add some mechanism to notify host about
>>>>>>>> MTU change in this case?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>> Why does host care?
>>>>>>>
>>>>>> Well the guest will drop packets after mtu has been reduced.
>>>>> I didn't know. What place in code does this?
>>>>>
>>>> hmm in many of the drivers it is convention to use the mtu to set the rx
>>>> buffer sizes and a receive side max length filter. For example in the Intel
>>>> drivers if a packet with length greater than MTU + some headroom is received we
>>>> drop it. I guess in the networking stack RX path though nothing forces this and
>>>> virtio doesn't have any code to drop packets on rx size.
>>>>
>>>> In virtio I don't see any existing case currently. In the XDP case though we
>>>> need to ensure packets fit in a page for the time being which is why I was
>>>> looking at this code and generated this patch.
>>> I'd say just look at the hardware max mtu. Ignore the configured mtu.
>>>
>>>
>>
>> Does this work for small buffers consider it always allocate skb with size
>> of GOOD_PACKET_LEN?
> 
> Spec says hardware won't send in packets > max mtu in config space.
> 
>> I think in any case, we should limit max_mtu to
>> GOOD_PACKET_LEN for small buffers.
>>
>> Thanks
> 
> XDP seems to have a bunch of weird restrictions, I just
> do not like it that the logic spills out to all drivers.
> What if someone decides to extend it to two pages in the future?
> Recode it all in all drivers ...
> 
> Why can't net core enforce mtu?
> 

OK, I agree. I'll put most of the logic in rtnetlink.c, to run when the program
is added or removed.

But I'm looking at the non-XDP receive_small path now and wondering how
multiple-buffer receives work (e.g. a packet larger than GOOD_PACKET_LEN). I think
this is what Jason is looking at as well? The mergeable case clearly looks at
num_bufs in the descriptor to construct multi-buffer packets, but nothing like
that exists in the receive_small path as best I can tell.

.John
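
The idea of enforcing the cap outside the driver, when the program is added or
removed, can be sketched in user space. This is only an illustrative model, not
kernel code: struct model_dev, the model_* helpers and the 16-byte header
length used in the note below are assumptions made for the example. The model
refuses to attach when the current MTU is already too large rather than
lowering it silently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct net_device. */
struct model_dev {
	unsigned int mtu;        /* currently configured MTU */
	unsigned int max_mtu;    /* upper bound checked on every MTU change */
	unsigned int hw_max_mtu; /* bound that applies when no XDP program is attached */
	bool xdp_attached;
};

/* Largest MTU that still fits in one page after the per-buffer header. */
static unsigned int model_xdp_mtu_limit(size_t page_size, size_t hdr_len)
{
	assert(page_size > hdr_len);
	return (unsigned int)(page_size - hdr_len);
}

/* Attach: fail if the current MTU is too large, otherwise cap max_mtu. */
static int model_xdp_attach(struct model_dev *dev, size_t page_size, size_t hdr_len)
{
	unsigned int limit = model_xdp_mtu_limit(page_size, hdr_len);

	if (dev->mtu > limit)
		return -1;
	dev->max_mtu = limit < dev->hw_max_mtu ? limit : dev->hw_max_mtu;
	dev->xdp_attached = true;
	return 0;
}

/* Detach: restore the bound that applied before the program was attached. */
static void model_xdp_detach(struct model_dev *dev)
{
	dev->max_mtu = dev->hw_max_mtu;
	dev->xdp_attached = false;
}

/* MTU change: a single generic check against max_mtu, no per-driver XDP logic. */
static int model_change_mtu(struct model_dev *dev, unsigned int new_mtu)
{
	if (new_mtu > dev->max_mtu)
		return -1;
	dev->mtu = new_mtu;
	return 0;
}
```

With a 4096-byte page and an assumed 16-byte header, attaching caps max_mtu at
4080, so a later request for MTU 9000 is rejected until the program is removed.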
Jason Wang Jan. 10, 2017, 3:34 a.m. UTC | #15
On 2017-01-10 10:51, Michael S. Tsirkin wrote:
> On Tue, Jan 10, 2017 at 10:29:39AM +0800, Jason Wang wrote:
>>
>> On 2017年01月10日 07:58, Michael S. Tsirkin wrote:
>>> On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
>>>> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
>>>>> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
>>>>>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
>>>>>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
>>>>>>>> On 2017年01月05日 02:57, John Fastabend wrote:
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
>>>>>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>>>>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>>>>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>>>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>>>>>>>>>> program load and does not block MTU changes after the program
>>>>>>>>>>>>> has loaded.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: John Fastabend<john.r.fastabend@intel.com>
>>>>>>>>>>>>> ---
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>>>>>>>>>>> needs to read the mtu via
>>>>>>>>>>>
>>>>>>>>>>>        virtio_cread16(vdev, ...)
>>>>>>>>>>>
>>>>>>>>>>> or we may break the negotiated mtu.
>>>>>>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
>>>>>>>>>> the device about the mtu in this case which is not supported by virtio now.
>>>>>>>>> Note this is not really a XDP specific problem. The guest can change the MTU
>>>>>>>>> after init time even without XDP which I assume should ideally result in a
>>>>>>>>> notification if the MTU is negotiated.
>>>>>>>> Yes, Michael, do you think we need add some mechanism to notify host about
>>>>>>>> MTU change in this case?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>> Why does host care?
>>>>>>>
>>>>>> Well the guest will drop packets after mtu has been reduced.
>>>>> I didn't know. What place in code does this?
>>>>>
>>>> hmm in many of the drivers it is convention to use the mtu to set the rx
>>>> buffer sizes and a receive side max length filter. For example in the Intel
>>>> drivers if a packet with length greater than MTU + some headroom is received we
>>>> drop it. I guess in the networking stack RX path though nothing forces this and
>>>> virtio doesn't have any code to drop packets on rx size.
>>>>
>>>> In virtio I don't see any existing case currently. In the XDP case though we
>>>> need to ensure packets fit in a page for the time being which is why I was
>>>> looking at this code and generated this patch.
>>> I'd say just look at the hardware max mtu. Ignore the configured mtu.
>>>
>>>
>> Does this work for small buffers consider it always allocate skb with size
>> of GOOD_PACKET_LEN?
> Spec says hardware won't send in packets > max mtu in config space.

Yes, but if max mtu is greater than GOOD_PACKET_LEN, the packet will be dropped.

>
>> I think in any case, we should limit max_mtu to
>> GOOD_PACKET_LEN for small buffers.
>>
>> Thanks
> XDP seems to have a bunch of weird restrictions, I just
> do not like it that the logic spills out to all drivers.
> What if someone decides to extend it to two pages in the future?
> Recode it all in all drivers ...
>
> Why can't net core enforce mtu?
>

I'm not sure it's a good idea to change the MTU silently without notifying the user.

Thanks
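
The suggestion to limit max_mtu for small buffers can be illustrated with a
small user-space sketch. It is an assumption-laden model, not driver code:
model_pick_max_mtu and its parameters are hypothetical, and it interprets the
small-buffer limit as the 1500-byte payload that a GOOD_PACKET_LEN frame
carries. Only the numeric values of ETH_MIN_MTU, ETH_DATA_LEN and ETH_MAX_MTU
are taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ETH_MIN_MTU  68u     /* value of the kernel's ETH_MIN_MTU */
#define MODEL_ETH_DATA_LEN 1500u   /* value of the kernel's ETH_DATA_LEN */
#define MODEL_ETH_MAX_MTU  65535u  /* value of the kernel's ETH_MAX_MTU */

/*
 * Hypothetical helper: pick the max_mtu to advertise.
 * mergeable_rx: receive buffers can be chained, so large frames are possible.
 * has_dev_mtu / dev_mtu: optional maximum MTU read from device config space.
 */
static unsigned int model_pick_max_mtu(bool mergeable_rx, bool has_dev_mtu,
				       unsigned int dev_mtu)
{
	unsigned int limit = mergeable_rx ? MODEL_ETH_MAX_MTU : MODEL_ETH_DATA_LEN;

	assert(!has_dev_mtu || dev_mtu >= MODEL_ETH_MIN_MTU);

	/* Never advertise more than the device itself says it supports. */
	if (has_dev_mtu && dev_mtu < limit)
		limit = dev_mtu;
	return limit;
}
```

Under these assumptions a small-buffer device is capped at 1500 regardless of
what the device advertises, while a mergeable one follows the device value.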
Michael S. Tsirkin Jan. 10, 2017, 3:55 a.m. UTC | #16
On Mon, Jan 09, 2017 at 07:30:34PM -0800, John Fastabend wrote:
> On 17-01-09 06:51 PM, Michael S. Tsirkin wrote:
> > On Tue, Jan 10, 2017 at 10:29:39AM +0800, Jason Wang wrote:
> >>
> >>
> >> On 2017年01月10日 07:58, Michael S. Tsirkin wrote:
> >>> On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
> >>>> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
> >>>>> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
> >>>>>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
> >>>>>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
> >>>>>>>> On 2017年01月05日 02:57, John Fastabend wrote:
> >>>>>>>>> [...]
> >>>>>>>>>
> >>>>>>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
> >>>>>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
> >>>>>>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
> >>>>>>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
> >>>>>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
> >>>>>>>>>>>>> program load and does not block MTU changes after the program
> >>>>>>>>>>>>> has loaded.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Signed-off-by: John Fastabend<john.r.fastabend@intel.com>
> >>>>>>>>>>>>> ---
> >>>>>>>>> [...]
> >>>>>>>>>
> >>>>>>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
> >>>>>>>>>>> needs to read the mtu via
> >>>>>>>>>>>
> >>>>>>>>>>>       virtio_cread16(vdev, ...)
> >>>>>>>>>>>
> >>>>>>>>>>> or we may break the negotiated mtu.
> >>>>>>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
> >>>>>>>>>> the device about the mtu in this case which is not supported by virtio now.
> >>>>>>>>> Note this is not really a XDP specific problem. The guest can change the MTU
> >>>>>>>>> after init time even without XDP which I assume should ideally result in a
> >>>>>>>>> notification if the MTU is negotiated.
> >>>>>>>> Yes, Michael, do you think we need add some mechanism to notify host about
> >>>>>>>> MTU change in this case?
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>> Why does host care?
> >>>>>>>
> >>>>>> Well the guest will drop packets after mtu has been reduced.
> >>>>> I didn't know. What place in code does this?
> >>>>>
> >>>> hmm in many of the drivers it is convention to use the mtu to set the rx
> >>>> buffer sizes and a receive side max length filter. For example in the Intel
> >>>> drivers if a packet with length greater than MTU + some headroom is received we
> >>>> drop it. I guess in the networking stack RX path though nothing forces this and
> >>>> virtio doesn't have any code to drop packets on rx size.
> >>>>
> >>>> In virtio I don't see any existing case currently. In the XDP case though we
> >>>> need to ensure packets fit in a page for the time being which is why I was
> >>>> looking at this code and generated this patch.
> >>> I'd say just look at the hardware max mtu. Ignore the configured mtu.
> >>>
> >>>
> >>
> >> Does this work for small buffers consider it always allocate skb with size
> >> of GOOD_PACKET_LEN?
> > 
> > Spec says hardware won't send in packets > max mtu in config space.
> > 
> >> I think in any case, we should limit max_mtu to
> >> GOOD_PACKET_LEN for small buffers.
> >>
> >> Thanks
> > 
> > XDP seems to have a bunch of weird restrictions, I just
> > do not like it that the logic spills out to all drivers.
> > What if someone decides to extend it to two pages in the future?
> > Recode it all in all drivers ...
> > 
> > Why can't net core enforce mtu?
> > 
> 
> OK I agree I'll put most the logic in rtnetlink.c when the program is added
> or removed.
> 
> But, I'm looking at the non-XDP receive_small path now and wondering how does
> multiple buffer receives work (e.g. packet larger than GOOD_PACKET_LEN?)

I don't understand the question. Look at add_recvbuf_small:
it adds a tiny buffer for the header and then the skb.


> I think
> this is what Jason is looking at as well? The mergeable case clearly looks at
> num_bufs in the descriptor to construct multi-buffer packets but nothing like
> that exists in the small_receive path as best I can tell.
> 
> .John

There's always a single buffer there.
BTW it was always a legacy path, but if it's now important for people we
should probably check ANY_LAYOUT and put the header linearly with the packet
if it is set.
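
The difference between the two receive modes can be modelled in a few lines of
user-space C. This is a sketch under stated assumptions, not the driver's
logic: the helper names are hypothetical, and the 1536-byte mergeable buffer
size used in the note below is an arbitrary illustrative value. The
GOOD_PACKET_LEN arithmetic (Ethernet header + VLAN tag + 1500-byte payload)
follows the definition in virtio_net.c.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ETH_HLEN     14   /* Ethernet header */
#define MODEL_VLAN_HLEN     4   /* VLAN tag */
#define MODEL_ETH_DATA_LEN 1500 /* standard Ethernet payload */
#define MODEL_GOOD_PACKET_LEN \
	(MODEL_ETH_HLEN + MODEL_VLAN_HLEN + MODEL_ETH_DATA_LEN)

/*
 * Small-buffer mode: there is always exactly one buffer per frame.
 * Returns 1 if the frame fits, 0 if it cannot be received at all.
 */
static size_t model_small_bufs_needed(size_t frame_len)
{
	return frame_len <= MODEL_GOOD_PACKET_LEN ? 1 : 0;
}

/*
 * Mergeable mode: a frame may span several buffers, so the count is
 * the frame length divided by the buffer size, rounded up.
 */
static size_t model_mergeable_bufs_needed(size_t frame_len, size_t buf_len)
{
	assert(buf_len > 0);
	return (frame_len + buf_len - 1) / buf_len;
}
```

A ping with a 2048-byte payload produces a 2090-byte Ethernet frame
(2048 + 8 ICMP + 20 IPv4 + 14 Ethernet). In this model that frame does not fit
the single small buffer, whereas mergeable mode with 1536-byte buffers would
spread it over two.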
John Fastabend Jan. 10, 2017, 4:25 a.m. UTC | #17
On 17-01-09 07:55 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2017 at 07:30:34PM -0800, John Fastabend wrote:
>> On 17-01-09 06:51 PM, Michael S. Tsirkin wrote:
>>> On Tue, Jan 10, 2017 at 10:29:39AM +0800, Jason Wang wrote:
>>>>
>>>>
>>>> On 2017年01月10日 07:58, Michael S. Tsirkin wrote:
>>>>> On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
>>>>>> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
>>>>>>> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
>>>>>>>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
>>>>>>>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
>>>>>>>>>> On 2017年01月05日 02:57, John Fastabend wrote:
>>>>>>>>>>> [...]
>>>>>>>>>>>
>>>>>>>>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
>>>>>>>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>>>>>>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>>>>>>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>>>>>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>>>>>>>>>>>> program load and does not block MTU changes after the program
>>>>>>>>>>>>>>> has loaded.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: John Fastabend<john.r.fastabend@intel.com>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>> [...]
>>>>>>>>>>>
>>>>>>>>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>>>>>>>>>>>>> needs to read the mtu via
>>>>>>>>>>>>>
>>>>>>>>>>>>>       virtio_cread16(vdev, ...)
>>>>>>>>>>>>>
>>>>>>>>>>>>> or we may break the negotiated mtu.
>>>>>>>>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
>>>>>>>>>>>> the device about the mtu in this case which is not supported by virtio now.
>>>>>>>>>>> Note this is not really a XDP specific problem. The guest can change the MTU
>>>>>>>>>>> after init time even without XDP which I assume should ideally result in a
>>>>>>>>>>> notification if the MTU is negotiated.
>>>>>>>>>> Yes, Michael, do you think we need add some mechanism to notify host about
>>>>>>>>>> MTU change in this case?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>> Why does host care?
>>>>>>>>>
>>>>>>>> Well the guest will drop packets after mtu has been reduced.
>>>>>>> I didn't know. What place in code does this?
>>>>>>>
>>>>>> hmm in many of the drivers it is convention to use the mtu to set the rx
>>>>>> buffer sizes and a receive side max length filter. For example in the Intel
>>>>>> drivers if a packet with length greater than MTU + some headroom is received we
>>>>>> drop it. I guess in the networking stack RX path though nothing forces this and
>>>>>> virtio doesn't have any code to drop packets on rx size.
>>>>>>
>>>>>> In virtio I don't see any existing case currently. In the XDP case though we
>>>>>> need to ensure packets fit in a page for the time being which is why I was
>>>>>> looking at this code and generated this patch.
>>>>> I'd say just look at the hardware max mtu. Ignore the configured mtu.
>>>>>
>>>>>
>>>>
>>>> Does this work for small buffers consider it always allocate skb with size
>>>> of GOOD_PACKET_LEN?
>>>
>>> Spec says hardware won't send in packets > max mtu in config space.
>>>
>>>> I think in any case, we should limit max_mtu to
>>>> GOOD_PACKET_LEN for small buffers.
>>>>
>>>> Thanks
>>>
>>> XDP seems to have a bunch of weird restrictions, I just
>>> do not like it that the logic spills out to all drivers.
>>> What if someone decides to extend it to two pages in the future?
>>> Recode it all in all drivers ...
>>>
>>> Why can't net core enforce mtu?
>>>
>>
>> OK I agree I'll put most the logic in rtnetlink.c when the program is added
>> or removed.
>>
>> But, I'm looking at the non-XDP receive_small path now and wondering how does
>> multiple buffer receives work (e.g. packet larger than GOOD_PACKET_LEN?)
> 
> I don't understand the question. Look at add_recvbuf_small,
> it adds a tiny buffer for head and then the skb.
> 

Specifically, this seems to fail with mergeable buffers disabled:

On the host:

# ip link set dev tap0 mtu 9000
# ping 22.2 -s 2048

On the guest:

# insmod ./drivers/net/virtio_net.ko
# ip link set dev eth0 mtu 9000

With mergeable buffers enabled there are no problems; it works as I expect, at least.


> 
>> I think
>> this is what Jason is looking at as well? The mergeable case clearly looks at
>> num_bufs in the descriptor to construct multi-buffer packets but nothing like
>> that exists in the small_receive path as best I can tell.
>>
>> .John
> 
> There's always a single buffer there.
> BTW it was always a legacy path but if it's now important for people we
> should probably check ANY_LAYOUT and put header linearly with the packet
> if there.
>
Michael S. Tsirkin Jan. 10, 2017, 5 a.m. UTC | #18
On Mon, Jan 09, 2017 at 08:25:43PM -0800, John Fastabend wrote:
> On 17-01-09 07:55 PM, Michael S. Tsirkin wrote:
> > On Mon, Jan 09, 2017 at 07:30:34PM -0800, John Fastabend wrote:
> >> On 17-01-09 06:51 PM, Michael S. Tsirkin wrote:
> >>> On Tue, Jan 10, 2017 at 10:29:39AM +0800, Jason Wang wrote:
> >>>>
> >>>>
> >>>> On 2017年01月10日 07:58, Michael S. Tsirkin wrote:
> >>>>> On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
> >>>>>> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
> >>>>>>> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
> >>>>>>>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
> >>>>>>>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
> >>>>>>>>>> On 2017年01月05日 02:57, John Fastabend wrote:
> >>>>>>>>>>> [...]
> >>>>>>>>>>>
> >>>>>>>>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
> >>>>>>>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
> >>>>>>>>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
> >>>>>>>>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
> >>>>>>>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
> >>>>>>>>>>>>>>> program load and does not block MTU changes after the program
> >>>>>>>>>>>>>>> has loaded.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Signed-off-by: John Fastabend<john.r.fastabend@intel.com>
> >>>>>>>>>>>>>>> ---
> >>>>>>>>>>> [...]
> >>>>>>>>>>>
> >>>>>>>>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
> >>>>>>>>>>>>> needs to read the mtu via
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>       virtio_cread16(vdev, ...)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> or we may break the negotiated mtu.
> >>>>>>>>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
> >>>>>>>>>>>> the device about the mtu in this case which is not supported by virtio now.
> >>>>>>>>>>> Note this is not really a XDP specific problem. The guest can change the MTU
> >>>>>>>>>>> after init time even without XDP which I assume should ideally result in a
> >>>>>>>>>>> notification if the MTU is negotiated.
> >>>>>>>>>> Yes, Michael, do you think we need add some mechanism to notify host about
> >>>>>>>>>> MTU change in this case?
> >>>>>>>>>>
> >>>>>>>>>> Thanks
> >>>>>>>>> Why does host care?
> >>>>>>>>>
> >>>>>>>> Well the guest will drop packets after mtu has been reduced.
> >>>>>>> I didn't know. What place in code does this?
> >>>>>>>
> >>>>>> hmm in many of the drivers it is convention to use the mtu to set the rx
> >>>>>> buffer sizes and a receive side max length filter. For example in the Intel
> >>>>>> drivers if a packet with length greater than MTU + some headroom is received we
> >>>>>> drop it. I guess in the networking stack RX path though nothing forces this and
> >>>>>> virtio doesn't have any code to drop packets on rx size.
> >>>>>>
> >>>>>> In virtio I don't see any existing case currently. In the XDP case though we
> >>>>>> need to ensure packets fit in a page for the time being which is why I was
> >>>>>> looking at this code and generated this patch.
> >>>>> I'd say just look at the hardware max mtu. Ignore the configured mtu.
> >>>>>
> >>>>>
> >>>>
> >>>> Does this work for small buffers consider it always allocate skb with size
> >>>> of GOOD_PACKET_LEN?
> >>>
> >>> Spec says hardware won't send in packets > max mtu in config space.
> >>>
> >>>> I think in any case, we should limit max_mtu to
> >>>> GOOD_PACKET_LEN for small buffers.
> >>>>
> >>>> Thanks
> >>>
> >>> XDP seems to have a bunch of weird restrictions, I just
> >>> do not like it that the logic spills out to all drivers.
> >>> What if someone decides to extend it to two pages in the future?
> >>> Recode it all in all drivers ...
> >>>
> >>> Why can't net core enforce mtu?
> >>>
> >>
> >> OK I agree I'll put most the logic in rtnetlink.c when the program is added
> >> or removed.
> >>
> >> But, I'm looking at the non-XDP receive_small path now and wondering how does
> >> multiple buffer receives work (e.g. packet larger than GOOD_PACKET_LEN?)
> > 
> > I don't understand the question. Look at add_recvbuf_small,
> > it adds a tiny buffer for head and then the skb.
> > 
> 
> Specifically this seems to fail with mergeable buffers disabled
> 
> On the host:
> 
> # ip link set dev tap0 mtu 9000
> # ping 22.2 -s 2048
> 
> On the guest:
> 
> # insmod ./drivers/net/virtio_net.ko
> # ip link set dev eth0 mtu 9000

Why would it work? You are sending a packet larger than ethernet MTU.

> With mergeable buffers enabled no problems it works as I expect at least.

We don't expect to get these packets but
mergeable is able to process them anyway.
It's an accident :) 

> 
> > 
> >> I think
> >> this is what Jason is looking at as well? The mergeable case clearly looks at
> >> num_bufs in the descriptor to construct multi-buffer packets but nothing like
> >> that exists in the small_receive path as best I can tell.
> >>
> >> .John
> > 
> > There's always a single buffer there.
> > BTW it was always a legacy path but if it's now important for people we
> > should probably check ANY_LAYOUT and put header linearly with the packet
> > if there.
> >
Michael S. Tsirkin Jan. 10, 2017, 2:51 p.m. UTC | #19
On Tue, Jan 10, 2017 at 04:51:34AM +0200, Michael S. Tsirkin wrote:
> XDP seems to have a bunch of weird restrictions, I just
> do not like it that the logic spills out to all drivers.
> What if someone decides to extend it to two pages in the future?
> Recode it all in all drivers ...
> 
> Why can't net core enforce mtu?

And BTW, limits on MTU are a problem that will have to be
addressed sooner or later: disabling offloads on the NIC
is one thing, but reconfiguring all of the network
with a lower MTU is another.


> -- 
> MST
Jason Wang Jan. 11, 2017, 3:37 a.m. UTC | #20
On 2017-01-10 13:00, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2017 at 08:25:43PM -0800, John Fastabend wrote:
>> On 17-01-09 07:55 PM, Michael S. Tsirkin wrote:
>>> On Mon, Jan 09, 2017 at 07:30:34PM -0800, John Fastabend wrote:
>>>> On 17-01-09 06:51 PM, Michael S. Tsirkin wrote:
>>>>> On Tue, Jan 10, 2017 at 10:29:39AM +0800, Jason Wang wrote:
>>>>>> On 2017年01月10日 07:58, Michael S. Tsirkin wrote:
>>>>>>> On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
>>>>>>>> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
>>>>>>>>> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
>>>>>>>>>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
>>>>>>>>>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
>>>>>>>>>>>> On 2017年01月05日 02:57, John Fastabend wrote:
>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2017年01月04日 00:48, John Fastabend wrote:
>>>>>>>>>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>>>>>>>>>>>>>> On 2017年01月03日 06:30, John Fastabend wrote:
>>>>>>>>>>>>>>>>> XDP programs can not consume multiple pages so we cap the MTU to
>>>>>>>>>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>>>>>>>>>>>>>> program load and does not block MTU changes after the program
>>>>>>>>>>>>>>>>> has loaded.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Signed-off-by: John Fastabend<john.r.fastabend@intel.com>
>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> OK so this logic is a bit too simply. When it resets the max_mtu I guess it
>>>>>>>>>>>>>>> needs to read the mtu via
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>        virtio_cread16(vdev, ...)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> or we may break the negotiated mtu.
>>>>>>>>>>>>>> Yes, this is a problem (even use ETH_MAX_MTU). We may need a method to notify
>>>>>>>>>>>>>> the device about the mtu in this case which is not supported by virtio now.
>>>>>>>>>>>>> Note this is not really a XDP specific problem. The guest can change the MTU
>>>>>>>>>>>>> after init time even without XDP which I assume should ideally result in a
>>>>>>>>>>>>> notification if the MTU is negotiated.
>>>>>>>>>>>> Yes, Michael, do you think we need add some mechanism to notify host about
>>>>>>>>>>>> MTU change in this case?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>> Why does host care?
>>>>>>>>>>>
>>>>>>>>>> Well the guest will drop packets after mtu has been reduced.
>>>>>>>>> I didn't know. What place in code does this?
>>>>>>>>>
>>>>>>>> hmm in many of the drivers it is convention to use the mtu to set the rx
>>>>>>>> buffer sizes and a receive side max length filter. For example in the Intel
>>>>>>>> drivers if a packet with length greater than MTU + some headroom is received we
>>>>>>>> drop it. I guess in the networking stack RX path though nothing forces this and
>>>>>>>> virtio doesn't have any code to drop packets on rx size.
>>>>>>>>
>>>>>>>> In virtio I don't see any existing case currently. In the XDP case though we
>>>>>>>> need to ensure packets fit in a page for the time being which is why I was
>>>>>>>> looking at this code and generated this patch.
>>>>>>> I'd say just look at the hardware max mtu. Ignore the configured mtu.
>>>>>>>
>>>>>>>
>>>>>> Does this work for small buffers consider it always allocate skb with size
>>>>>> of GOOD_PACKET_LEN?
>>>>> Spec says hardware won't send in packets > max mtu in config space.
>>>>>
>>>>>> I think in any case, we should limit max_mtu to
>>>>>> GOOD_PACKET_LEN for small buffers.
>>>>>>
>>>>>> Thanks
>>>>> XDP seems to have a bunch of weird restrictions, I just
>>>>> do not like it that the logic spills out to all drivers.
>>>>> What if someone decides to extend it to two pages in the future?
>>>>> Recode it all in all drivers ...
>>>>>
>>>>> Why can't net core enforce mtu?
>>>>>
>>>> OK I agree I'll put most the logic in rtnetlink.c when the program is added
>>>> or removed.
>>>>
>>>> But, I'm looking at the non-XDP receive_small path now and wondering how does
>>>> multiple buffer receives work (e.g. packet larger than GOOD_PACKET_LEN?)
>>> I don't understand the question. Look at add_recvbuf_small,
>>> it adds a tiny buffer for head and then the skb.
>>>
>> Specifically this seems to fail with mergeable buffers disabled
>>
>> On the host:
>>
>> # ip link set dev tap0 mtu 9000
>> # ping 22.2 -s 2048
>>
>> On the guest:
>>
>> # insmod ./drivers/net/virtio_net.ko
>> # ip link set dev eth0 mtu 9000
> Why would it work? You are sending a packet larger than ethernet MTU.

OK, does that mean virtio-net does not support jumbo frames? And if it
can't work, using MAX_MTU as max_mtu looks like a bug to me.

>
>> With mergeable buffers enabled no problems it works as I expect at least.
> We don't expect to get these packets but
> mergeable is able to process them anyway.
> It's an accident:)  
>

But path MTU discovery indeed benefits from this "accident".

Thanks

Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5deeda6..783e842 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1699,6 +1699,9 @@  static void virtnet_init_settings(struct net_device *dev)
 	.set_settings = virtnet_set_settings,
 };
 
+#define MIN_MTU ETH_MIN_MTU
+#define MAX_MTU ETH_MAX_MTU
+
 static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 {
 	unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
@@ -1748,6 +1751,9 @@  static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 			virtnet_set_queues(vi, curr_qp);
 			return PTR_ERR(prog);
 		}
+		dev->max_mtu = max_sz;
+	} else {
+		dev->max_mtu = ETH_MAX_MTU;
 	}
 
 	vi->xdp_queue_pairs = xdp_qp;
@@ -2133,9 +2139,6 @@  static bool virtnet_validate_features(struct virtio_device *vdev)
 	return true;
 }
 
-#define MIN_MTU ETH_MIN_MTU
-#define MAX_MTU ETH_MAX_MTU
-
 static int virtnet_probe(struct virtio_device *vdev)
 {
 	int i, err;