
[v2,1/2] can: Decrease default size of CAN_RAW socket send queue

Message ID 1390379257-9040-1-git-send-email-sojkam1@fel.cvut.cz
State Rejected, archived
Delegated to: David Miller

Commit Message

Michal Sojka Jan. 22, 2014, 8:27 a.m. UTC
This fixes the infamous ENOBUFS problem, which appears when an
application sends CAN frames faster than they leave the interface.

Packets to be sent are queued at the queueing discipline (qdisc). The
qdisc queue has two limits: a maximum length and a per-socket byte
limit (SO_SNDBUF). Only the latter limit can cause the sender to
block. If the maximum queue length is reached before the per-socket
limit, the application receives ENOBUFS and there is no way for it to
wait for the queue to become free again. Since the length of the qdisc
queue was set by default to 10 packets, this is exactly what was
happening.
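
For illustration, a minimal userspace sketch (not part of this series)
that reproduces the problem; the interface name "can0" and the CAN ID
are examples only:

/* flood a raw CAN socket and report how write() fails */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>

int main(void)
{
	struct sockaddr_can addr = { .can_family = AF_CAN };
	struct can_frame frame = { .can_id = 0x123, .can_dlc = 8 };
	struct ifreq ifr;
	int s = socket(PF_CAN, SOCK_RAW, CAN_RAW);

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, "can0", IFNAMSIZ - 1);
	ioctl(s, SIOCGIFINDEX, &ifr);
	addr.can_ifindex = ifr.ifr_ifindex;
	bind(s, (struct sockaddr *)&addr, sizeof(addr));

	for (;;) {
		if (write(s, &frame, sizeof(frame)) < 0) {
			if (errno == ENOBUFS)
				fprintf(stderr, "qdisc full before SO_SNDBUF was reached\n");
			else
				perror("write");
			return 1;
		}
	}
}

Without this patch the loop above fails with ENOBUFS almost
immediately; with it, write() should block once the per-socket limit
is reached.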

This patch decreases the default per-socket limit to approximately 3
CAN frames and increases the length of the qdisc queue to 100 frames.
This setting allows for at least 33 CAN_RAW sockets to send
simultaneously to the same CAN interface without getting ENOBUFS
errors.

The exact maximum number of CAN frames that fit into the per-socket
limit is: 1+floor(sk_sndbuf/skb->truesize)

The calculation of the default sk_sndbuf value in the patch is only
approximate, because skb->truesize can be slightly greater than
SKB_TRUESIZE(). For example, for CAN frames on my 32 bit PowerPC
system, SKB_TRUESIZE() = 408, but skb->truesize = 448. Therefore, on
my system the per-socket limit allows 1+floor(3*408/448) =
1+floor(2.73) = 3 CAN frames to be queued.
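
A tiny standalone check of this arithmetic, using the values observed
on that machine (448, 408 and 163840 are measurements from above, not
universal constants):

#include <stdio.h>

int main(void)
{
	unsigned int truesize = 448;		/* skb->truesize of a CAN frame, observed */
	unsigned int sndbuf_patch = 3 * 408;	/* 3 * SKB_TRUESIZE(...) as set by this patch */
	unsigned int sndbuf_default = 163840;	/* /proc/sys/net/core/wmem_default, observed */

	printf("with patch:    %u frames\n", 1 + sndbuf_patch / truesize);	/* prints 3 */
	printf("without patch: %u frames\n", 1 + sndbuf_default / truesize);	/* prints 366 */
	return 0;
}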

Without this patch, the default per-socket limit is
/proc/sys/net/core/wmem_default, which is 163840 on my system. This
limit allows 1+floor(163840/448) = 366 CAN frames to be queued, which
is clearly more than the queue length (10 frames).

Since the per-socket limit is expressed in bytes, fewer CAN FD frames,
which are bigger than classic CAN frames, may fit into it. This is not
a big problem, because at least one frame can always be queued.

Changes since v1:
- Improved the commit message, added some numbers from my system.

Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
---
 drivers/net/can/dev.c | 2 +-
 net/can/raw.c         | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

Comments

David Miller Jan. 23, 2014, 8:41 p.m. UTC | #1
From: Michal Sojka <sojkam1@fel.cvut.cz>
Date: Wed, 22 Jan 2014 09:27:36 +0100

> Since the length of the qdisc queue was set by default to 10
> packets, this is exactly what was happening.

This is your bug, set the qdisc limit to something more reasonable.

Something large enough to absorb the traffic wrt. the speed at which
the CAN device can sink the data.
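
One way to do this from userspace, sketched here only as an
illustration (the interface name and the queue length are examples;
the effect is equivalent to "ip link set can0 txqueuelen 100"):

#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* raise the interface tx queue length via the SIOCSIFTXQLEN ioctl */
static int set_txqueuelen(const char *ifname, int qlen)
{
	struct ifreq ifr;
	int ret, s = socket(AF_INET, SOCK_DGRAM, 0);

	if (s < 0)
		return -1;
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
	ifr.ifr_qlen = qlen;		/* e.g. set_txqueuelen("can0", 100) */
	ret = ioctl(s, SIOCSIFTXQLEN, &ifr);
	close(s);
	return ret;
}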

These two patches are something I am not willing to apply to my
tree, this is not how you solve this problem, sorry.
Marc Kleine-Budde Jan. 23, 2014, 9:01 p.m. UTC | #2
On 01/23/2014 09:41 PM, David Miller wrote:
> From: Michal Sojka <sojkam1@fel.cvut.cz>
> Date: Wed, 22 Jan 2014 09:27:36 +0100
> 
>> Since the length of the qdisc queue was set by default to 10
>> packets, this is exactly what was happening.
> 
> This is your bug, set the qdisc limit to something more reasonable.
> 
> Something large enough to absorb the traffic wrt. the speed at which
> the CAN device can sink the data.

Hmmm, the problem is that on i686 I have to increase the txqueuelen to
366 before the socket works as expected (writes are blocking). This
means that if there are two processes, one sending a stream of bulk
data and the other occasionally sending more important data, there
will be a lot of frames queued in front of the important ones.

With Michal's patch we can limit the number of frames in flight to a
reasonably low amount.

> These two patches are something I am not willing to apply to my
> tree, this is not how you solve this problem, sorry.

Marc
David Miller Jan. 23, 2014, 10:42 p.m. UTC | #3
From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Thu, 23 Jan 2014 22:01:37 +0100

> On 01/23/2014 09:41 PM, David Miller wrote:
>> From: Michal Sojka <sojkam1@fel.cvut.cz>
>> Date: Wed, 22 Jan 2014 09:27:36 +0100
>> 
>>> Since the length of the qdisc queue was set by default to 10
>>> packets, this is exactly what was happening.
>> 
>> This is your bug, set the qdisc limit to something more reasonable.
>> 
>> Something large enough to absorb the traffic wrt. the speed at which
>> the CAN device can sink the data.
> 
> Hmmm, the problem is that on i686 I have to increase the txqueuelen to
> 366 before the socket works as expected (writes are blocking). This
> means that if there are two processes, one sending a stream of bulk
> data and the other occasionally sending more important data, there
> will be a lot of frames queued in front of the important ones.
> 
> With Michal's patch we can limit the number of frames in flight to a
> reasonably low amount.

What do you think UDP applications have to do on slow interfaces?
If they care about overrunning the device queue and getting packet
drops, they themselves set a smaller socket send buffer size.

CAN is not special in this regard, please stop treating it as such.
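
For illustration, a sketch of what such an application can do on its
own, assuming it already has a raw CAN socket; the value below is an
arbitrary example, and the kernel doubles the requested value and
enforces a lower bound, so the effective buffer can be larger than
asked for:

#include <sys/socket.h>

/* shrink this socket's send buffer so write() blocks instead of the
 * qdisc overflowing and returning ENOBUFS */
static int shrink_sndbuf(int can_sock)
{
	int sndbuf = 1500;	/* roughly a few CAN frames worth of truesize */

	return setsockopt(can_sock, SOL_SOCKET, SO_SNDBUF,
			  &sndbuf, sizeof(sndbuf));
}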
Michal Sojka Jan. 24, 2014, 12:14 p.m. UTC | #4
On Thu, Jan 23 2014, Marc Kleine-Budde wrote:
> On 01/23/2014 09:41 PM, David Miller wrote:
>> From: Michal Sojka <sojkam1@fel.cvut.cz>
>> Date: Wed, 22 Jan 2014 09:27:36 +0100
>> 
>>> Since the length of the qdisc queue was set by default to 10
>>> packets, this is exactly what was happening.
>> 
>> This is your bug, set the qdisc limit to something more reasonable.
>> 
>> Something large enough to absorb the traffic wrt. the speed at which
>> the CAN device can sink the data.
>
> Hmmm, the problem is that on i686 I have to increase the txqueuelen to
> 366 before the socket works as expected (writes are blocking). This
> means that if there are two processes, one sending a stream of bulk
> data and the other occasionally sending more important data, there
> will be a lot of frames queued in front of the important ones.

This problem can be solved with queuing disciplines. See
https://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf.

This reminds me that I have another patch that makes qdisc work better
with CAN raw sockets. I'll send the patch to linux-can.

-Michal

Patch

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 1870c47..a0bce83 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -492,7 +492,7 @@  static void can_setup(struct net_device *dev)
 	dev->mtu = CAN_MTU;
 	dev->hard_header_len = 0;
 	dev->addr_len = 0;
-	dev->tx_queue_len = 10;
+	dev->tx_queue_len = 100;
 
 	/* New-style flags. */
 	dev->flags = IFF_NOARP;
diff --git a/net/can/raw.c b/net/can/raw.c
index fdda5f6..4ad0bb2 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -291,6 +291,10 @@  static int raw_init(struct sock *sk)
 {
 	struct raw_sock *ro = raw_sk(sk);
 
+	/* allow at most 3 frames to wait for transmission in socket queue */
+	sk->sk_sndbuf = 3 * SKB_TRUESIZE(sizeof(struct can_frame) +
+					 sizeof(struct can_skb_priv));
+
 	ro->bound            = 0;
 	ro->ifindex          = 0;