diff mbox

This extends tx_data and and iscsit_do_tx_data with the additional parameter flags and avoids sending multiple TCP packets in iscsit_fe_sendpage_sg

Message ID 1391949039.10160.129.camel@edumazet-glaptop2.roam.corp.google.com
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Feb. 9, 2014, 12:30 p.m. UTC
On Sun, 2014-02-09 at 08:42 +0100, Eric Dumazet wrote:
> The new infrastructure is used in iscsit_fe_sendpage_sg to avoid sending three
> TCP packets instead of one by settings the MSG_MORE when calling kernel_sendmsg
> via the wrapper functions tx_data and iscsit_do_tx_data. This reduces the TCP
> overhead by sending the same data in less TCP packets and minimized the TCP RTP
> when TCP auto corking is enabled. When creating a 500 GB VMFS filesystem the
> filesystem is created in 3 seconds instead of 4 seconds.
> 
> Signed-off-by: Thomas Glanzmann <thomas@glanzmann.de>
> X-tested-by: Thomas Glanzmann <thomas@glanzmann.de>
> ---

Hmm, thanks but this is not how to do this.

When you submit a patch written by someone else, you should :

1) Use your own identity as the sender, not impersonate me.
( thats standard convention )

2) Put following line as first line of the mail
( Documentation/SubmittingPatches lines ~565)

From: Eric Dumazet <edumazet@google.com>

Then I'll add my :
Signed-off-by: Eric Dumazet <edumazet@google.com>

Anyway, patch is not yet complete : We also want to set
MSG_MORE/MSG_SENDPAGE_NOTLAST for all pages but last one in a sg list.


This will fix suboptimal traffic :

13:32:04.976923 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 289953:292849, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
13:32:04.976936 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 292849:295745, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
13:32:04.976944 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [P.], seq 295745:298193, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2448
13:32:04.976952 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 298193:301089, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
13:32:04.976960 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 301089:303985, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
13:32:04.976998 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [P.], seq 303985:306385, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2400

Please try following updated patch, thanks !

Once tested, we'll submit it formally.

 drivers/target/iscsi/iscsi_target_parameters.c |    2 
 drivers/target/iscsi/iscsi_target_util.c       |   38 +++++++++------
 drivers/target/iscsi/iscsi_target_util.h       |    2 
 3 files changed, 25 insertions(+), 17 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Thomas Glanzmann Feb. 9, 2014, 3:07 p.m. UTC | #1
Hello Eric,

> 1) Use your own identity as the sender, not impersonate me.
> ( thats standard convention )

sorry about that, will not happen ever again.

> 2) Put following line as first line of the mail
> ( Documentation/SubmittingPatches lines ~565)

> From: Eric Dumazet <edumazet@google.com>

> Then I'll add my :
> Signed-off-by: Eric Dumazet <edumazet@google.com>

I see. Thank you for the awareness training. I read SubmittingPatches
completly.

> Anyway, patch is not yet complete : We also want to set
> MSG_MORE/MSG_SENDPAGE_NOTLAST for all pages but last one in a sg list.

I see.

> This will fix suboptimal traffic :

> 13:32:04.976923 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 289953:292849, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
> 13:32:04.976936 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 292849:295745, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
> 13:32:04.976944 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [P.], seq 295745:298193, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2448
> 13:32:04.976952 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 298193:301089, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
> 13:32:04.976960 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 301089:303985, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
> 13:32:04.976998 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [P.], seq 303985:306385, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2400

What is suboptimal about the traffic, could they all go in one packet?
Since my MTU is 1500 I assume that the network card will split this then
in MTU sized packets, is that correct? Should I repeat the test with MTU
9000 as well?

> Please try following updated patch, thanks!

This time it took 2 seconds instead of 4 seconds (3.12) to create the
filesystem. Find pcap here:

https://thomas.glanzmann.de/tmp/tcp_auto_corking_on_patched_tcp_more_notlast.pcap.bz2

> Once tested, we'll submit it formally.

Let me know if you want to submit or I should. If I should do it I would
split it up in two patches, one for the interface change and one for the
packet submission logic. Btw. your last patches did not apply for me
because I cut & pasted them from e-mail instead of saving it in an
editor this one. So your patch was fine but they way I tried to apply it
was flawed.

Cheers,
        Thomas
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger Feb. 10, 2014, 6:58 p.m. UTC | #2
Hi Eric & Thomas,

On Sun, 2014-02-09 at 04:30 -0800, Eric Dumazet wrote:
> On Sun, 2014-02-09 at 08:42 +0100, Eric Dumazet wrote:
> > The new infrastructure is used in iscsit_fe_sendpage_sg to avoid sending three
> > TCP packets instead of one by settings the MSG_MORE when calling kernel_sendmsg
> > via the wrapper functions tx_data and iscsit_do_tx_data. This reduces the TCP
> > overhead by sending the same data in less TCP packets and minimized the TCP RTP
> > when TCP auto corking is enabled. When creating a 500 GB VMFS filesystem the
> > filesystem is created in 3 seconds instead of 4 seconds.
> > 
> > Signed-off-by: Thomas Glanzmann <thomas@glanzmann.de>
> > X-tested-by: Thomas Glanzmann <thomas@glanzmann.de>
> > ---
> 
> Hmm, thanks but this is not how to do this.
> 
> When you submit a patch written by someone else, you should :
> 
> 1) Use your own identity as the sender, not impersonate me.
> ( thats standard convention )
> 
> 2) Put following line as first line of the mail
> ( Documentation/SubmittingPatches lines ~565)
> 
> From: Eric Dumazet <edumazet@google.com>
> 
> Then I'll add my :
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> Anyway, patch is not yet complete : We also want to set
> MSG_MORE/MSG_SENDPAGE_NOTLAST for all pages but last one in a sg list.
> 
> 
> This will fix suboptimal traffic :
> 
> 13:32:04.976923 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 289953:292849, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
> 13:32:04.976936 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 292849:295745, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
> 13:32:04.976944 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [P.], seq 295745:298193, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2448
> 13:32:04.976952 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 298193:301089, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
> 13:32:04.976960 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 301089:303985, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896
> 13:32:04.976998 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [P.], seq 303985:306385, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2400
> 
> Please try following updated patch, thanks !
> 
> Once tested, we'll submit it formally.
> 
>  drivers/target/iscsi/iscsi_target_parameters.c |    2 
>  drivers/target/iscsi/iscsi_target_util.c       |   38 +++++++++------
>  drivers/target/iscsi/iscsi_target_util.h       |    2 
>  3 files changed, 25 insertions(+), 17 deletions(-)
> 

This looks correct to me..

Thomas, once your able to confirm please include your 'Tested-by' and
I'll include for the next -rc3 PULL request.

Thanks!

--nab

> diff --git a/drivers/target/iscsi/iscsi_target_parameters.c b/drivers/target/iscsi/iscsi_target_parameters.c
> index 4d2e23fc76fd..b80239250a1c 100644
> --- a/drivers/target/iscsi/iscsi_target_parameters.c
> +++ b/drivers/target/iscsi/iscsi_target_parameters.c
> @@ -79,7 +79,7 @@ int iscsi_login_tx_data(
>  	 */
>  	conn->if_marker += length;
>  
> -	tx_sent = tx_data(conn, &iov[0], iov_cnt, length);
> +	tx_sent = tx_data(conn, &iov[0], iov_cnt, length, 0);
>  	if (tx_sent != length) {
>  		pr_err("tx_data returned %d, expecting %d.\n",
>  				tx_sent, length);
> diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
> index 0819e688a398..3c529f7c61ce 100644
> --- a/drivers/target/iscsi/iscsi_target_util.c
> +++ b/drivers/target/iscsi/iscsi_target_util.c
> @@ -1165,7 +1165,7 @@ send_data:
>  		iov_count = cmd->iov_misc_count;
>  	}
>  
> -	tx_sent = tx_data(conn, &iov[0], iov_count, tx_size);
> +	tx_sent = tx_data(conn, &iov[0], iov_count, tx_size, 0);
>  	if (tx_size != tx_sent) {
>  		if (tx_sent == -EAGAIN) {
>  			pr_err("tx_data() returned -EAGAIN\n");
> @@ -1196,7 +1196,8 @@ send_hdr:
>  	iov.iov_base = cmd->pdu;
>  	iov.iov_len = tx_hdr_size;
>  
> -	tx_sent = tx_data(conn, &iov, 1, tx_hdr_size);
> +	tx_sent = tx_data(conn, &iov, 1, tx_hdr_size,
> +			  cmd->tx_size != tx_hdr_size ? MSG_MORE : 0);
>  	if (tx_hdr_size != tx_sent) {
>  		if (tx_sent == -EAGAIN) {
>  			pr_err("tx_data() returned -EAGAIN\n");
> @@ -1225,18 +1226,24 @@ send_hdr:
>  	while (data_len) {
>  		u32 space = (sg->length - offset);
>  		u32 sub_len = min_t(u32, data_len, space);
> +		int flags = 0;
> +	
> +		if ((data_len != sub_len) || cmd->padding ||
> +		    conn->conn_ops->DataDigest)
> +			flags = MSG_SENDPAGE_NOTLAST | MSG_MORE;
> +
>  send_pg:
>  		tx_sent = conn->sock->ops->sendpage(conn->sock,
> -					sg_page(sg), sg->offset + offset, sub_len, 0);
> +						    sg_page(sg),
> +						    sg->offset + offset,
> +						    sub_len, flags);
>  		if (tx_sent != sub_len) {
>  			if (tx_sent == -EAGAIN) {
> -				pr_err("tcp_sendpage() returned"
> -						" -EAGAIN\n");
> +				pr_err("tcp_sendpage() returned -EAGAIN\n");
>  				goto send_pg;
>  			}
>  
> -			pr_err("tcp_sendpage() failure: %d\n",
> -					tx_sent);
> +			pr_err("tcp_sendpage() failure: %d\n", tx_sent);
>  			return -1;
>  		}
>  
> @@ -1249,7 +1256,8 @@ send_padding:
>  	if (cmd->padding) {
>  		struct kvec *iov_p = &cmd->iov_data[iov_off++];
>  
> -		tx_sent = tx_data(conn, iov_p, 1, cmd->padding);
> +		tx_sent = tx_data(conn, iov_p, 1, cmd->padding,
> +				  conn->conn_ops->DataDigest ? MSG_MORE : 0);
>  		if (cmd->padding != tx_sent) {
>  			if (tx_sent == -EAGAIN) {
>  				pr_err("tx_data() returned -EAGAIN\n");
> @@ -1263,7 +1271,7 @@ send_datacrc:
>  	if (conn->conn_ops->DataDigest) {
>  		struct kvec *iov_d = &cmd->iov_data[iov_off];
>  
> -		tx_sent = tx_data(conn, iov_d, 1, ISCSI_CRC_LEN);
> +		tx_sent = tx_data(conn, iov_d, 1, ISCSI_CRC_LEN, 0);
>  		if (ISCSI_CRC_LEN != tx_sent) {
>  			if (tx_sent == -EAGAIN) {
>  				pr_err("tx_data() returned -EAGAIN\n");
> @@ -1349,11 +1357,12 @@ static int iscsit_do_rx_data(
>  
>  static int iscsit_do_tx_data(
>  	struct iscsi_conn *conn,
> -	struct iscsi_data_count *count)
> +	struct iscsi_data_count *count,
> +	int flags)
>  {
>  	int data = count->data_length, total_tx = 0, tx_loop = 0, iov_len;
>  	struct kvec *iov_p;
> -	struct msghdr msg;
> +	struct msghdr msg = { .msg_flags = flags };
>  
>  	if (!conn || !conn->sock || !conn->conn_ops)
>  		return -1;
> @@ -1363,8 +1372,6 @@ static int iscsit_do_tx_data(
>  		return -1;
>  	}
>  
> -	memset(&msg, 0, sizeof(struct msghdr));
> -
>  	iov_p = count->iov;
>  	iov_len = count->iov_count;
>  
> @@ -1408,7 +1415,8 @@ int tx_data(
>  	struct iscsi_conn *conn,
>  	struct kvec *iov,
>  	int iov_count,
> -	int data)
> +	int data,
> +	int flags)
>  {
>  	struct iscsi_data_count c;
>  
> @@ -1421,7 +1429,7 @@ int tx_data(
>  	c.data_length = data;
>  	c.type = ISCSI_TX_DATA;
>  
> -	return iscsit_do_tx_data(conn, &c);
> +	return iscsit_do_tx_data(conn, &c, flags);
>  }
>  
>  void iscsit_collect_login_stats(
> diff --git a/drivers/target/iscsi/iscsi_target_util.h b/drivers/target/iscsi/iscsi_target_util.h
> index e4fc34a02f57..1b4f06801adc 100644
> --- a/drivers/target/iscsi/iscsi_target_util.h
> +++ b/drivers/target/iscsi/iscsi_target_util.h
> @@ -54,7 +54,7 @@ extern int iscsit_print_dev_to_proc(char *, char **, off_t, int);
>  extern int iscsit_print_sessions_to_proc(char *, char **, off_t, int);
>  extern int iscsit_print_tpg_to_proc(char *, char **, off_t, int);
>  extern int rx_data(struct iscsi_conn *, struct kvec *, int, int);
> -extern int tx_data(struct iscsi_conn *, struct kvec *, int, int);
> +extern int tx_data(struct iscsi_conn *, struct kvec *, int, int, int);
>  extern void iscsit_collect_login_stats(struct iscsi_conn *, u8, u8);
>  extern struct iscsi_tiqn *iscsit_snmp_get_tiqn(struct iscsi_conn *);
>  
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Glanzmann Feb. 10, 2014, 8:56 p.m. UTC | #3
Hello Nab,

> This looks correct to me.  Thomas, once your able to confirm please
> include your 'Tested-by' and I'll include for the next -rc3 PULL
> request.

Eric is currently reviewing our latest iteration with MSG_MORE for
kernel_sendmsg and MSG_MORE | MSG_SENDPAGE_NOTLAST for sendpage. However
with the last iteration we had again a high RTT for some packets. But
than Eric let me tune net.ipv4.tcp_min_tso_segs to 8 and the RTT went
down to what it used before auto corking was enabled. At least almost.

I'm having a steep learning curve but Eric hopefully knows how to get
this back in check. Nevertheless the regression I saw are history
because I saw that Eric has submitted the patch to David S. Miller which
fixes the two bugs that killed the iSCSI performance when tcp auto
corking was on. So currently we're just optimizing to get the last 20%
or so out of it. Quite interesting. Especially how much bandwidth can be
saved by coalescing packets.

Cheers,
        Thomas
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Feb. 10, 2014, 9:01 p.m. UTC | #4
On Mon, 2014-02-10 at 21:56 +0100, Thomas Glanzmann wrote:
> Hello Nab,
> 
> > This looks correct to me.  Thomas, once your able to confirm please
> > include your 'Tested-by' and I'll include for the next -rc3 PULL
> > request.
> 
> Eric is currently reviewing our latest iteration with MSG_MORE for
> kernel_sendmsg and MSG_MORE | MSG_SENDPAGE_NOTLAST for sendpage. However
> with the last iteration we had again a high RTT for some packets. But
> than Eric let me tune net.ipv4.tcp_min_tso_segs to 8 and the RTT went
> down to what it used before auto corking was enabled. At least almost.
> 

Hmm.. I was not aware of high RTT for some packets.

Can you spot this on the pcap you provided ?


> I'm having a steep learning curve but Eric hopefully knows how to get
> this back in check. Nevertheless the regression I saw are history
> because I saw that Eric has submitted the patch to David S. Miller which
> fixes the two bugs that killed the iSCSI performance when tcp auto
> corking was on. So currently we're just optimizing to get the last 20%
> or so out of it. Quite interesting. Especially how much bandwidth can be
> saved by coalescing packets.

It depends on the ratio payload/headers.

The beginning of your pcap show a lot of 512 bytes requests, so for this
kind of requests, the gain is huge (maybe 50%), but for 32K or 64K
request, gain would be marginal.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Glanzmann Feb. 10, 2014, 9:14 p.m. UTC | #5
Hello Eric,

> Hmm.. I was not aware of high RTT for some packets.
> Can you spot this on the pcap you provided ?

with the latest patch as in:

(node-62) [~/work/linux-2.6] git diff | pbot
http://pbot.rmdir.de/CQwqI6b7wJProw_xaukmEg

with net.ipv4.tcp_min_tso_segs=2 we had this pcap:
https://thomas.glanzmann.de/tmp/tcp_auto_corking_on_patched_tcp_more_notlast.pcap.bz2
And here is the RTT TCP graph:
Wireshark > Statistics > TCP Stream Graph > Round Trip Time Graph
https://thomas.glanzmann.de/tmp/screenshot-mini-2014-02-10-22_04_30.png

with net.ipv4.tcp_min_tso_segs=8 we have this pcap:
https://thomas.glanzmann.de/tmp/tcp_auto_corking_on_patched_tcp_more_notlast_min_tso_segs_8.pcap.bz2
And here is the RTT TCP graph:
Wireshark > Statistics > TCP Stream Graph > Round Trip Time Graph

This gives us 0.0015 seconds RTT (1.5 ms)

Without TCP autocorking we had 0.0005 (0.5 ms).

https://thomas.glanzmann.de/tmp/screenshot-mini-2014-02-08-09:53:17.png

Cheers,
        Thomas
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/target/iscsi/iscsi_target_parameters.c b/drivers/target/iscsi/iscsi_target_parameters.c
index 4d2e23fc76fd..b80239250a1c 100644
--- a/drivers/target/iscsi/iscsi_target_parameters.c
+++ b/drivers/target/iscsi/iscsi_target_parameters.c
@@ -79,7 +79,7 @@  int iscsi_login_tx_data(
 	 */
 	conn->if_marker += length;
 
-	tx_sent = tx_data(conn, &iov[0], iov_cnt, length);
+	tx_sent = tx_data(conn, &iov[0], iov_cnt, length, 0);
 	if (tx_sent != length) {
 		pr_err("tx_data returned %d, expecting %d.\n",
 				tx_sent, length);
diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index 0819e688a398..3c529f7c61ce 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -1165,7 +1165,7 @@  send_data:
 		iov_count = cmd->iov_misc_count;
 	}
 
-	tx_sent = tx_data(conn, &iov[0], iov_count, tx_size);
+	tx_sent = tx_data(conn, &iov[0], iov_count, tx_size, 0);
 	if (tx_size != tx_sent) {
 		if (tx_sent == -EAGAIN) {
 			pr_err("tx_data() returned -EAGAIN\n");
@@ -1196,7 +1196,8 @@  send_hdr:
 	iov.iov_base = cmd->pdu;
 	iov.iov_len = tx_hdr_size;
 
-	tx_sent = tx_data(conn, &iov, 1, tx_hdr_size);
+	tx_sent = tx_data(conn, &iov, 1, tx_hdr_size,
+			  cmd->tx_size != tx_hdr_size ? MSG_MORE : 0);
 	if (tx_hdr_size != tx_sent) {
 		if (tx_sent == -EAGAIN) {
 			pr_err("tx_data() returned -EAGAIN\n");
@@ -1225,18 +1226,24 @@  send_hdr:
 	while (data_len) {
 		u32 space = (sg->length - offset);
 		u32 sub_len = min_t(u32, data_len, space);
+		int flags = 0;
+	
+		if ((data_len != sub_len) || cmd->padding ||
+		    conn->conn_ops->DataDigest)
+			flags = MSG_SENDPAGE_NOTLAST | MSG_MORE;
+
 send_pg:
 		tx_sent = conn->sock->ops->sendpage(conn->sock,
-					sg_page(sg), sg->offset + offset, sub_len, 0);
+						    sg_page(sg),
+						    sg->offset + offset,
+						    sub_len, flags);
 		if (tx_sent != sub_len) {
 			if (tx_sent == -EAGAIN) {
-				pr_err("tcp_sendpage() returned"
-						" -EAGAIN\n");
+				pr_err("tcp_sendpage() returned -EAGAIN\n");
 				goto send_pg;
 			}
 
-			pr_err("tcp_sendpage() failure: %d\n",
-					tx_sent);
+			pr_err("tcp_sendpage() failure: %d\n", tx_sent);
 			return -1;
 		}
 
@@ -1249,7 +1256,8 @@  send_padding:
 	if (cmd->padding) {
 		struct kvec *iov_p = &cmd->iov_data[iov_off++];
 
-		tx_sent = tx_data(conn, iov_p, 1, cmd->padding);
+		tx_sent = tx_data(conn, iov_p, 1, cmd->padding,
+				  conn->conn_ops->DataDigest ? MSG_MORE : 0);
 		if (cmd->padding != tx_sent) {
 			if (tx_sent == -EAGAIN) {
 				pr_err("tx_data() returned -EAGAIN\n");
@@ -1263,7 +1271,7 @@  send_datacrc:
 	if (conn->conn_ops->DataDigest) {
 		struct kvec *iov_d = &cmd->iov_data[iov_off];
 
-		tx_sent = tx_data(conn, iov_d, 1, ISCSI_CRC_LEN);
+		tx_sent = tx_data(conn, iov_d, 1, ISCSI_CRC_LEN, 0);
 		if (ISCSI_CRC_LEN != tx_sent) {
 			if (tx_sent == -EAGAIN) {
 				pr_err("tx_data() returned -EAGAIN\n");
@@ -1349,11 +1357,12 @@  static int iscsit_do_rx_data(
 
 static int iscsit_do_tx_data(
 	struct iscsi_conn *conn,
-	struct iscsi_data_count *count)
+	struct iscsi_data_count *count,
+	int flags)
 {
 	int data = count->data_length, total_tx = 0, tx_loop = 0, iov_len;
 	struct kvec *iov_p;
-	struct msghdr msg;
+	struct msghdr msg = { .msg_flags = flags };
 
 	if (!conn || !conn->sock || !conn->conn_ops)
 		return -1;
@@ -1363,8 +1372,6 @@  static int iscsit_do_tx_data(
 		return -1;
 	}
 
-	memset(&msg, 0, sizeof(struct msghdr));
-
 	iov_p = count->iov;
 	iov_len = count->iov_count;
 
@@ -1408,7 +1415,8 @@  int tx_data(
 	struct iscsi_conn *conn,
 	struct kvec *iov,
 	int iov_count,
-	int data)
+	int data,
+	int flags)
 {
 	struct iscsi_data_count c;
 
@@ -1421,7 +1429,7 @@  int tx_data(
 	c.data_length = data;
 	c.type = ISCSI_TX_DATA;
 
-	return iscsit_do_tx_data(conn, &c);
+	return iscsit_do_tx_data(conn, &c, flags);
 }
 
 void iscsit_collect_login_stats(
diff --git a/drivers/target/iscsi/iscsi_target_util.h b/drivers/target/iscsi/iscsi_target_util.h
index e4fc34a02f57..1b4f06801adc 100644
--- a/drivers/target/iscsi/iscsi_target_util.h
+++ b/drivers/target/iscsi/iscsi_target_util.h
@@ -54,7 +54,7 @@  extern int iscsit_print_dev_to_proc(char *, char **, off_t, int);
 extern int iscsit_print_sessions_to_proc(char *, char **, off_t, int);
 extern int iscsit_print_tpg_to_proc(char *, char **, off_t, int);
 extern int rx_data(struct iscsi_conn *, struct kvec *, int, int);
-extern int tx_data(struct iscsi_conn *, struct kvec *, int, int);
+extern int tx_data(struct iscsi_conn *, struct kvec *, int, int, int);
 extern void iscsit_collect_login_stats(struct iscsi_conn *, u8, u8);
 extern struct iscsi_tiqn *iscsit_snmp_get_tiqn(struct iscsi_conn *);