NT_STATUS_INSUFFICIENT_RESOURCES and retrying writes to Windows 10 servers

Message ID CAH2r5mvo9sWf8VoPb8puCDh4HM6WnrMgjs+HyhUzqEZXtuQwtA@mail.gmail.com
State New

Commit Message

Steve French June 16, 2019, 4:18 a.m. UTC
By default, a large file copy to Windows 10 can return MANY potentially
retryable errors on write (which the Linux cifs client does not retry),
and that can cause cp to fail.

It did look like my patch for the problem worked (see below). Windows
10 returns *A LOT* of NT_STATUS_INSUFFICIENT_RESOURCES errors (about
1/3 of writes in some cases I tried), presumably due to the
'blocking operation credit' max of 64 in Windows 10 (see note 203 of
MS-SMB2).

"<203> Section 3.3.4.2: Windows-based servers enforce a configurable
blocking operation credit,
which defaults to 64 on Windows Vista SP1, Windows 7, Windows 8,
Windows 8.1, and, Windows 10,
and defaults to 512 on Windows Server 2008, Windows Server 2008 R2,
Windows Server 2012 ..."

This patch did seem to work around the problem, but perhaps we should
use far fewer credits when mounting to Windows 10, even though the
server grants us enough credits for more? Or change how we do writes
so that they are not synchronous? I haven't seen this problem against
Windows Server 2016 or 2019, but perhaps the explanation in note 203 is
all we need to know, i.e. that Windows client versions enforce a lower
limit (64) than the 512 used by the server versions.

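~/cifs-2.6/fs/cifs$ git diff -a
diff --git a/fs/cifs/smb2maperror.c b/fs/cifs/smb2maperror.c
index e32c264e3adb..82ade16c9501 100644
--- a/fs/cifs/smb2maperror.c
+++ b/fs/cifs/smb2maperror.c
@@ -457,7 +457,7 @@ static const struct status_to_posix_error smb2_error_map_table[] = {
        {STATUS_FILE_INVALID, -EIO, "STATUS_FILE_INVALID"},
        {STATUS_ALLOTTED_SPACE_EXCEEDED, -EIO,
        "STATUS_ALLOTTED_SPACE_EXCEEDED"},
-       {STATUS_INSUFFICIENT_RESOURCES, -EREMOTEIO,
+       {STATUS_INSUFFICIENT_RESOURCES, -EAGAIN,
                                "STATUS_INSUFFICIENT_RESOURCES"},
        {STATUS_DFS_EXIT_PATH_FOUND, -EIO, "STATUS_DFS_EXIT_PATH_FOUND"},
        {STATUS_DEVICE_DATA_ERROR, -EIO, "STATUS_DEVICE_DATA_ERROR"},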
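
To illustrate why the one-line mapping change matters, here is a minimal
userspace sketch. It is not the actual cifs write path. It assumes a
trimmed-down table (toy_map), a hypothetical fake server that answers
"busy" a fixed number of times (struct fake_server, fake_send_write), and
a hypothetical bounded retry helper (write_with_retry). Only the struct
layout and the status names come from the patch; the rest is made up for
illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* NT status codes (host byte order in this sketch). */
#define STATUS_SUCCESS                0x00000000U
#define STATUS_INSUFFICIENT_RESOURCES 0xC000009AU
#define STATUS_DEVICE_DATA_ERROR      0xC000009CU

/* Trimmed-down stand-in for the kernel's smb2_error_map_table. */
struct status_to_posix_error {
	uint32_t smb2_status;
	int posix_error;
	const char *status_string;
};

static const struct status_to_posix_error toy_map[] = {
	{STATUS_SUCCESS, 0, "STATUS_SUCCESS"},
	/* The patched mapping: retryable instead of -EREMOTEIO. */
	{STATUS_INSUFFICIENT_RESOURCES, -EAGAIN,
				"STATUS_INSUFFICIENT_RESOURCES"},
	{STATUS_DEVICE_DATA_ERROR, -EIO, "STATUS_DEVICE_DATA_ERROR"},
};

/* Look up a status; anything missing from the toy table becomes -EIO. */
static int map_status(uint32_t status)
{
	for (size_t i = 0; i < sizeof(toy_map) / sizeof(toy_map[0]); i++)
		if (toy_map[i].smb2_status == status)
			return toy_map[i].posix_error;
	return -EIO;
}

/* Hypothetical fake server: replies "busy" busy_replies times, then OK. */
struct fake_server {
	int busy_replies;
	int calls;
};

static uint32_t fake_send_write(struct fake_server *srv)
{
	srv->calls++;
	if (srv->busy_replies > 0) {
		srv->busy_replies--;
		return STATUS_INSUFFICIENT_RESOURCES;
	}
	return STATUS_SUCCESS;
}

/* Resend while the mapped error is -EAGAIN, for at most max_tries sends. */
static int write_with_retry(struct fake_server *srv, int max_tries)
{
	int rc = -EAGAIN;

	for (int i = 0; i < max_tries && rc == -EAGAIN; i++)
		rc = map_status(fake_send_write(srv));
	return rc;
}
```

With the old -EREMOTEIO mapping the loop in write_with_retry would stop
after the first busy reply and hand the error back to the caller. With
-EAGAIN it resends until the server accepts the write or the attempt
budget runs out.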


For example, see the number of failed writes during an 8GB copy in my test below:

# cat /proc/fs/cifs/Stats
Resources in use
CIFS Session: 1
Share (unique mount targets): 2
SMB Request/Response Buffer: 1 Pool size: 5
SMB Small Req/Resp Buffer: 1 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 363 maximum at one time: 2

1) \\10.0.3.4\public-share
SMBs: 14879
Bytes read: 0  Bytes written: 8589934592
Open files: 2 total (local), 0 open on server
TreeConnects: 3 total 0 failed
TreeDisconnects: 0 total 0 failed
Creates: 12 total 0 failed
Closes: 10 total 0 failed
Flushes: 0 total 0 failed
Reads: 0 total 0 failed
Writes: 14838 total 5624 failed
...
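
As a quick check of the "about 1/3" figure against the Writes line above,
here is a small sketch that parses such a line and computes the failed
fraction. The helper names (parse_writes_line, failed_fraction) are
hypothetical, and the line format is assumed to match the output shown
here.

```c
#include <stdio.h>

/*
 * Parse a "Writes: <total> total <failed> failed" line as printed in
 * /proc/fs/cifs/Stats above. Returns 0 on success, -1 if the line does
 * not match.
 */
static int parse_writes_line(const char *line, unsigned long *total,
			     unsigned long *failed)
{
	if (sscanf(line, "Writes: %lu total %lu failed", total, failed) != 2)
		return -1;
	return 0;
}

/* Fraction of write requests that failed; 0.0 when nothing was sent. */
static double failed_fraction(unsigned long total, unsigned long failed)
{
	return total ? (double)failed / (double)total : 0.0;
}
```

For the numbers above (14838 total, 5624 failed) this gives a fraction
between 0.37 and 0.38, i.e. slightly more than one write in three.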

Any thoughts?

Is there any risk that we could run into places where EAGAIN would not
be handled? (In theory, there are SMB3 commands other than read and
write where NT_STATUS_INSUFFICIENT_RESOURCES could be returned.)

Comments

Steve French June 17, 2019, 7:51 p.m. UTC | #1
Attached is a patch with updated comments and cc:stable:


ronnie sahlberg June 17, 2019, 8:45 p.m. UTC | #2
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>

Pavel Shilovsky June 17, 2019, 9:25 p.m. UTC | #3
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>

--
Best regards,
Pavel Shilovsky


Patch

diff --git a/fs/cifs/smb2maperror.c b/fs/cifs/smb2maperror.c
index e32c264e3adb..82ade16c9501 100644
--- a/fs/cifs/smb2maperror.c
+++ b/fs/cifs/smb2maperror.c
@@ -457,7 +457,7 @@ static const struct status_to_posix_error smb2_error_map_table[] = {
        {STATUS_FILE_INVALID, -EIO, "STATUS_FILE_INVALID"},
        {STATUS_ALLOTTED_SPACE_EXCEEDED, -EIO,
        "STATUS_ALLOTTED_SPACE_EXCEEDED"},
-       {STATUS_INSUFFICIENT_RESOURCES, -EREMOTEIO,
+       {STATUS_INSUFFICIENT_RESOURCES, -EAGAIN,
                                "STATUS_INSUFFICIENT_RESOURCES"},
        {STATUS_DFS_EXIT_PATH_FOUND, -EIO, "STATUS_DFS_EXIT_PATH_FOUND"},