[REGRESSION] Select hang with zero sized UDP packets

Message ID 1471979019.14381.37.camel@edumazet-glaptop3.roam.corp.google.com
State RFC, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Aug. 23, 2016, 7:03 p.m. UTC
On Tue, 2016-08-23 at 11:25 -0700, David Miller wrote:
> From: Laura Abbott <labbott@redhat.com>
> Date: Tue, 23 Aug 2016 10:53:26 -0700
> 
> > Fedora received a report[1] of a unit test failing on Ruby when using the
> > 4.7 kernel. This was a test to send a zero sized UDP packet. With the
> > 4.7 kernel, the test now times out on a select instead of completing.
> > The reduced Ruby test is:
> > 
> >   def test_udp_recvfrom_nonblock
> >     u1 = UDPSocket.new
> >     u2 = UDPSocket.new
> >     u1.bind("127.0.0.1", 0)
> >     u2.send("", 0, u1.getsockname)
> >     IO.select [u1]  # test gets stuck here
> >   ensure
> >     u1.close if u1
> >     u2.close if u2
> >   end
> 
> Well, if there is no data, should select really wake up?
> 
> I think it's valid not to.

There is an skb in the receive queue, with skb->len == 0.

This looks like a bug in first_packet_length() or poll logic.

Definitely something we can fix.

Maybe with:

Comments

Laura Abbott Aug. 23, 2016, 8:06 p.m. UTC | #1
On 08/23/2016 12:03 PM, Eric Dumazet wrote:
> On Tue, 2016-08-23 at 11:25 -0700, David Miller wrote:
>> From: Laura Abbott <labbott@redhat.com>
>> Date: Tue, 23 Aug 2016 10:53:26 -0700
>>
>>> Fedora received a report[1] of a unit test failing on Ruby when using the
>>> 4.7 kernel. This was a test to send a zero sized UDP packet. With the
>>> 4.7 kernel, the test now times out on a select instead of completing.
>>> The reduced Ruby test is:
>>>
>>>   def test_udp_recvfrom_nonblock
>>>     u1 = UDPSocket.new
>>>     u2 = UDPSocket.new
>>>     u1.bind("127.0.0.1", 0)
>>>     u2.send("", 0, u1.getsockname)
>>>     IO.select [u1]  # test gets stuck here
>>>   ensure
>>>     u1.close if u1
>>>     u2.close if u2
>>>   end
>>
>> Well, if there is no data, should select really wake up?
>>
>> I think it's valid not to.
> There is an skb in the receive queue, with skb->len == 0.
>
> This looks like a bug in first_packet_length() or poll logic.
>
> Definitely something we can fix.
>
> Maybe with :
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index e61f7cd65d08..380c05a84041 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1184,11 +1184,11 @@ out:
>   *	Drops all bad checksum frames, until a valid one is found.
>   *	Returns the length of found skb, or 0 if none is found.
>   */
> -static unsigned int first_packet_length(struct sock *sk)
> +static int first_packet_length(struct sock *sk)
>  {
>  	struct sk_buff_head list_kill, *rcvq = &sk->sk_receive_queue;
>  	struct sk_buff *skb;
> -	unsigned int res;
> +	int res;
>
>  	__skb_queue_head_init(&list_kill);
>
> @@ -1203,7 +1203,7 @@ static unsigned int first_packet_length(struct sock *sk)
>  		__skb_unlink(skb, rcvq);
>  		__skb_queue_tail(&list_kill, skb);
>  	}
> -	res = skb ? skb->len : 0;
> +	res = skb ? skb->len : -1;
>  	spin_unlock_bh(&rcvq->lock);
>
>  	if (!skb_queue_empty(&list_kill)) {
> @@ -1232,7 +1232,7 @@ int udp_ioctl(struct sock *sk, int cmd, unsigned long arg)
>
>  	case SIOCINQ:
>  	{
> -		unsigned int amount = first_packet_length(sk);
> +		int amount = max(0, first_packet_length(sk));
>
>  		return put_user(amount, (int __user *)arg);
>  	}
> @@ -2184,7 +2184,7 @@ unsigned int udp_poll(struct file *file, struct socket *sock, poll_table *wait)
>
>  	/* Check for false positives due to checksum errors */
>  	if ((mask & POLLRDNORM) && !(file->f_flags & O_NONBLOCK) &&
> -	    !(sk->sk_shutdown & RCV_SHUTDOWN) && !first_packet_length(sk))
> +	    !(sk->sk_shutdown & RCV_SHUTDOWN) && first_packet_length(sk) == -1)
>  		mask &= ~(POLLIN | POLLRDNORM);
>
>  	return mask;
>
>

Fixes the test for me. You're welcome to take this as a Tested-by.

Thanks,
Laura

Eric Dumazet Aug. 23, 2016, 8:42 p.m. UTC | #2
On Tue, 2016-08-23 at 13:06 -0700, Laura Abbott wrote:

> 
> Fixes the test for me. You're welcome to take this as a Tested-by.

Thanks Laura, I will submit an official patch immediately.
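
For readers without a Ruby setup, the reduced test quoted above can be approximated in plain C. This is only a sketch under stated assumptions: the helper name zero_len_udp_is_readable() and its timeout parameter are hypothetical and do not appear in the thread, and a timeout is used so the program reports the hang instead of blocking forever. The sockets are left in blocking mode, which is the case the udp_poll() check in the patch applies to.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): sends one zero-length UDP
 * datagram over loopback and waits up to timeout_ms for the receiving
 * socket to become readable.
 * Returns 1 if select() reports the socket readable, 0 on timeout,
 * and -1 on any setup or select() error.
 */
int zero_len_udp_is_readable(int timeout_ms)
{
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	struct timeval tv;
	fd_set rfds;
	int u1, u2, ret = -1;

	u1 = socket(AF_INET, SOCK_DGRAM, 0);
	u2 = socket(AF_INET, SOCK_DGRAM, 0);
	if (u1 < 0 || u2 < 0)
		goto out;

	/* Equivalent of u1.bind("127.0.0.1", 0): the kernel picks the port. */
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;
	if (bind(u1, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	if (getsockname(u1, (struct sockaddr *)&addr, &alen) < 0)
		goto out;

	/* Zero-length payload: the receiver queues an skb with skb->len == 0. */
	if (sendto(u2, "", 0, 0, (struct sockaddr *)&addr, alen) < 0)
		goto out;

	FD_ZERO(&rfds);
	FD_SET(u1, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	/* Equivalent of IO.select [u1], but bounded by the timeout. */
	ret = select(u1 + 1, &rfds, NULL, NULL, &tv);
	if (ret > 0)
		ret = FD_ISSET(u1, &rfds) ? 1 : 0;
out:
	if (u1 >= 0)
		close(u1);
	if (u2 >= 0)
		close(u2);
	return ret;
}
```

On a kernel with the fix applied, zero_len_udp_is_readable(1000) would be expected to return 1. On the affected 4.7 kernel described in the report, it would be expected to return 0 once the timeout expires.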

Patch

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e61f7cd65d08..380c05a84041 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1184,11 +1184,11 @@  out:
  *	Drops all bad checksum frames, until a valid one is found.
- *	Returns the length of found skb, or 0 if none is found.
+ *	Returns the length of found skb, or -1 if none is found.
  */
-static unsigned int first_packet_length(struct sock *sk)
+static int first_packet_length(struct sock *sk)
 {
 	struct sk_buff_head list_kill, *rcvq = &sk->sk_receive_queue;
 	struct sk_buff *skb;
-	unsigned int res;
+	int res;
 
 	__skb_queue_head_init(&list_kill);
 
@@ -1203,7 +1203,7 @@  static unsigned int first_packet_length(struct sock *sk)
 		__skb_unlink(skb, rcvq);
 		__skb_queue_tail(&list_kill, skb);
 	}
-	res = skb ? skb->len : 0;
+	res = skb ? skb->len : -1;
 	spin_unlock_bh(&rcvq->lock);
 
 	if (!skb_queue_empty(&list_kill)) {
@@ -1232,7 +1232,7 @@  int udp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 
 	case SIOCINQ:
 	{
-		unsigned int amount = first_packet_length(sk);
+		int amount = max(0, first_packet_length(sk));
 
 		return put_user(amount, (int __user *)arg);
 	}
@@ -2184,7 +2184,7 @@  unsigned int udp_poll(struct file *file, struct socket *sock, poll_table *wait)
 
 	/* Check for false positives due to checksum errors */
 	if ((mask & POLLRDNORM) && !(file->f_flags & O_NONBLOCK) &&
-	    !(sk->sk_shutdown & RCV_SHUTDOWN) && !first_packet_length(sk))
+	    !(sk->sk_shutdown & RCV_SHUTDOWN) && first_packet_length(sk) == -1)
 		mask &= ~(POLLIN | POLLRDNORM);
 
 	return mask;
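
The ambiguity the patch removes can be illustrated with a small userspace model. This is a simplified sketch, not kernel code: every name prefixed with model_ is hypothetical, and the model leaves out the locking, the bad-checksum drop loop, and the other conditions tested in udp_poll(). It only shows why a return value of 0 could not distinguish "queue empty" from "zero-length packet queued", and how reserving -1 for the empty case resolves that.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; all names are hypothetical. */
struct model_pkt {
	int len;			/* payload length, may legitimately be 0 */
	struct model_pkt *next;
};

struct model_queue {
	struct model_pkt *head;		/* NULL when the queue is empty */
};

/* Pre-patch behaviour: 0 means both "queue empty" and "zero-length packet". */
unsigned int model_first_len_old(const struct model_queue *q)
{
	return q->head ? (unsigned int)q->head->len : 0;
}

/* Post-patch behaviour: -1 is reserved for "queue empty". */
int model_first_len_new(const struct model_queue *q)
{
	return q->head ? q->head->len : -1;
}

/* Old poll check: a length of 0 is treated as "nothing to read". */
int model_readable_old(const struct model_queue *q)
{
	return model_first_len_old(q) != 0;
}

/* New poll check: only the -1 sentinel is treated as "nothing to read". */
int model_readable_new(const struct model_queue *q)
{
	return model_first_len_new(q) != -1;
}

/* SIOCINQ-style query: the sentinel is clamped so callers never see -1. */
int model_inq_new(const struct model_queue *q)
{
	int len = model_first_len_new(q);

	return len < 0 ? 0 : len;
}
```

With a single zero-length packet queued, model_readable_old() reports not readable while model_readable_new() reports readable. With an empty queue, both report not readable, and model_inq_new() returns 0 in either case, matching the max(0, ...) clamp in the SIOCINQ hunk.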