Message ID | 1390379257-9040-1-git-send-email-sojkam1@fel.cvut.cz |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
From: Michal Sojka <sojkam1@fel.cvut.cz> Date: Wed, 22 Jan 2014 09:27:36 +0100 > Since the length of the qdisc queue was set by default to 10 > packets, this is exactly what was happening. This is your bug, set the qdisc limit to something more reasonable. Something large enough to absorb the traffic wrt. the speed at which the CAN device can sink the data. These two patches are something I am not willing to apply to my tree, this is not how you solve this problem, sorry. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/23/2014 09:41 PM, David Miller wrote: > From: Michal Sojka <sojkam1@fel.cvut.cz> > Date: Wed, 22 Jan 2014 09:27:36 +0100 > >> Since the length of the qdisc queue was set by default to 10 >> packets, this is exactly what was happening. > > This is your bug, set the qdisc limit to something more reasonable. > > Something large enough to absorb the traffic wrt. the speed at which > the CAN device can sink the data. Hmmm, the problem is on i686 I have to increase the txqueuelen to 366 before the socket works as expected (writes are blocking). This means if there are two processes one sending a stream of bulk data and the other one occasionally more important data there will be a lot of frames in front of the important ones. With Michal's patch we can limit the number of frames in flight to a reasonably low amount. > These two patches are something I am not willing to apply to my > tree, this is not how you solve this problem, sorry. Marc
From: Marc Kleine-Budde <mkl@pengutronix.de> Date: Thu, 23 Jan 2014 22:01:37 +0100 > On 01/23/2014 09:41 PM, David Miller wrote: >> From: Michal Sojka <sojkam1@fel.cvut.cz> >> Date: Wed, 22 Jan 2014 09:27:36 +0100 >> >>> Since the length of the qdisc queue was set by default to 10 >>> packets, this is exactly what was happening. >> >> This is your bug, set the qdisc limit to something more reasonable. >> >> Something large enough to absorb the traffic wrt. the speed at which >> the CAN device can sink the data. > > Hmmm, the problem is on i686 I have to increase the txqueuelen to 366 > before the socket works as expected (writes are blocking). This means if > there are two processes one sending a stream of bulk data and the other > one occasionally more important data there will be a lot of frames in > front of the important ones. > > With Michal's patch we can limit the number of frames in flight to a > reasonably low amount. What do you think UDP applications have to do on slow interfaces? If they care about overrunning the device queue and getting packet drops, they themselves set a smaller socket send buffer size. CAN is not special in this regard, please stop treating it as such. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jan 23 2014, Marc Kleine-Budde wrote: > On 01/23/2014 09:41 PM, David Miller wrote: >> From: Michal Sojka <sojkam1@fel.cvut.cz> >> Date: Wed, 22 Jan 2014 09:27:36 +0100 >> >>> Since the length of the qdisc queue was set by default to 10 >>> packets, this is exactly what was happening. >> >> This is your bug, set the qdisc limit to something more reasonable. >> >> Something large enough to absorb the traffic wrt. the speed at which >> the CAN device can sink the data. > > Hmmm, the problem is on i686 I have to increase the txqueuelen to 366 > before the socket works as expected (writes are blocking). This means if > there are two processes one sending a stream of bulk data and the other > one occasionally more important data there will be a lot of frames in > front of the important ones. This problem can be solved with queuing disciplines. See https://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf. This reminds me that I have another patch that makes qdisc work better with CAN raw sockets. I'll send the patch to linux-can. -Michal -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index 1870c47..a0bce83 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -492,7 +492,7 @@ static void can_setup(struct net_device *dev) dev->mtu = CAN_MTU; dev->hard_header_len = 0; dev->addr_len = 0; - dev->tx_queue_len = 10; + dev->tx_queue_len = 100; /* New-style flags. */ dev->flags = IFF_NOARP; diff --git a/net/can/raw.c b/net/can/raw.c index fdda5f6..4ad0bb2 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -291,6 +291,10 @@ static int raw_init(struct sock *sk) { struct raw_sock *ro = raw_sk(sk); + /* allow at most 3 frames to wait for transmission in socket queue */ + sk->sk_sndbuf = 3 * SKB_TRUESIZE(sizeof(struct can_frame) + + sizeof(struct can_skb_priv)); + ro->bound = 0; ro->ifindex = 0;
This fixes the infamous ENOBUFS problem, which appears when an application sends CAN frames faster than they leave the interface. Packets for sending can be queued at queueing discipline. Qdisc queue has two limits: maximum length and per-socket byte limit (SO_SNDBUF). Only the later limit can cause the sender to block. If maximum queue length limit is reached before the per-socket limit, the application receives ENOBUFS and there is no way how it can wait for the queue to become free again. Since the length of the qdisc queue was set by default to 10 packets, this is exactly what was happening. This patch decreases the default per-socket limit to approximately 3 CAN frames and increases the length of the qdisc queue to 100 frames. This setting allows for at least 33 CAN_RAW sockets to send simultaneously to the same CAN interface without getting ENOBUFS errors. The exact maximum number of CAN frames that fit into the per-socket limit is: 1+floor(sk_sndbuf/skb->truesize) The calculation of the default sk_sndbuf value in the patch is only approximate, because skb->truesize can be slightly greater than SKB_TRUESIZE(). For example, for CAN frames on my 32 bit PowerPC system, SKB_TRUESIZE() = 408, but skb->truesize = 448. Therefore, on my system the per-socket limit allows 1+floor(3*408/448) = 1+floor(2.73) = 3 CAN frames to be queued. Without this patch, the default per-socket limit is /proc/sys/net/core/wmem_default, which is 163840 on my system. This limit allows for queuing of 1+163840/448 = 366 CAN frames, which is clearly more than the queue length (10 frames). Since the per-socket limit is expressed in bytes, the number of queued CANFD frames, which are bigger than CAN frames, may be lower. This is not a big problem, because at least one frame could be always queued. Changes since v1: - Improved the commit message, added some number from my system. Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz> --- drivers/net/can/dev.c | 2 +- net/can/raw.c | 4 ++++ 2 files changed, 5 insertions(+), 1 deletion(-)