diff mbox series

smb: client: fix delay on concurrent opens

Message ID 20250428140500.1462107-1-pc@manguebit.com
State New
Headers show
Series smb: client: fix delay on concurrent opens | expand

Commit Message

Paulo Alcantara April 28, 2025, 2:05 p.m. UTC
Customers have reported open(2) calls being delayed by 30s or so, and
looking through the network traces, it is related to the client not
sending lease break acks to the server when a lease is being
downgraded from RWH to RW while having an open handle, causing
concurrent opens to be delayed and then impacting performance.

MS-SMB2 3.2.5.19.2:

  | If all open handles on this file are closed (that is, File.OpenTable
  | is empty for this file), the client SHOULD consider it as an implicit
  | acknowledgment of the lease break. No explicit acknowledgment is
  | required.

Since we hold an active reference of the open file to process the
lease break, then we should always send a lease break ack if required
by the server.

Cc: linux-cifs@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Reported-by: Pierguido Lambri <plambri@redhat.com>
Fixes: da787d5b7498 ("SMB3: Do not send lease break acknowledgment if all file handles have been closed")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
---
 fs/smb/client/file.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

Comments

Steve French April 28, 2025, 2:40 p.m. UTC | #1
Were you able to simulate a reproducer for this?

On Mon, Apr 28, 2025 at 9:05 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Customers have reported open(2) calls being delayed by 30s or so, and
> looking through the network traces, it is related to the client not
> sending lease break acks to the server when a lease is being
> downgraded from RWH to RW while having an open handle, causing
> concurrent opens to be delayed and then impacting performance.
>
> MS-SMB2 3.2.5.19.2:
>
>   | If all open handles on this file are closed (that is, File.OpenTable
>   | is empty for this file), the client SHOULD consider it as an implicit
>   | acknowledgment of the lease break. No explicit acknowledgment is
>   | required.
>
> Since we hold an active reference of the open file to process the
> lease break, then we should always send a lease break ack if required
> by the server.
>
> Cc: linux-cifs@vger.kernel.org
> Cc: David Howells <dhowells@redhat.com>
> Reported-by: Pierguido Lambri <plambri@redhat.com>
> Fixes: da787d5b7498 ("SMB3: Do not send lease break acknowledgment if all file handles have been closed")
> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
> ---
>  fs/smb/client/file.c | 13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
> index 9e8f404b9e56..be9a07f2e8c6 100644
> --- a/fs/smb/client/file.c
> +++ b/fs/smb/client/file.c
> @@ -3146,19 +3146,12 @@ void cifs_oplock_break(struct work_struct *work)
>         oplock_break_cancelled = cfile->oplock_break_cancelled;
>
>         _cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
> -       /*
> -        * MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) do not require
> -        * an acknowledgment to be sent when the file has already been closed.
> -        */
> -       spin_lock(&cinode->open_file_lock);
> -       /* check list empty since can race with kill_sb calling tree disconnect */
> -       if (!oplock_break_cancelled && !list_empty(&cinode->openFileList)) {
> -               spin_unlock(&cinode->open_file_lock);
> +       /* MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) */
> +       if (!oplock_break_cancelled) {
>                 rc = server->ops->oplock_response(tcon, persistent_fid,
>                                                   volatile_fid, net_fid, cinode);
>                 cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
> -       } else
> -               spin_unlock(&cinode->open_file_lock);
> +       }
>
>         cifs_put_tlink(tlink);
>  out:
> --
> 2.49.0
>
Paulo Alcantara April 28, 2025, 2:44 p.m. UTC | #2
Steve French <smfrench@gmail.com> writes:

> Were you able to simulate a reproducer for this?

No.  Customers just confirmed that fix works.
Steve French April 30, 2025, 2:15 p.m. UTC | #3
Did you mean

downgraded "RWH to RW"  or downgraded from "RWH to RH"

But it is puzzling why file would still be open if not in openfile list

On Mon, Apr 28, 2025 at 9:05 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Customers have reported open(2) calls being delayed by 30s or so, and
> looking through the network traces, it is related to the client not
> sending lease break acks to the server when a lease is being
> downgraded from RWH to RW while having an open handle, causing
> concurrent opens to be delayed and then impacting performance.
>
> MS-SMB2 3.2.5.19.2:
>
>   | If all open handles on this file are closed (that is, File.OpenTable
>   | is empty for this file), the client SHOULD consider it as an implicit
>   | acknowledgment of the lease break. No explicit acknowledgment is
>   | required.
>
> Since we hold an active reference of the open file to process the
> lease break, then we should always send a lease break ack if required
> by the server.
>
> Cc: linux-cifs@vger.kernel.org
> Cc: David Howells <dhowells@redhat.com>
> Reported-by: Pierguido Lambri <plambri@redhat.com>
> Fixes: da787d5b7498 ("SMB3: Do not send lease break acknowledgment if all file handles have been closed")
> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
> ---
>  fs/smb/client/file.c | 13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
> index 9e8f404b9e56..be9a07f2e8c6 100644
> --- a/fs/smb/client/file.c
> +++ b/fs/smb/client/file.c
> @@ -3146,19 +3146,12 @@ void cifs_oplock_break(struct work_struct *work)
>         oplock_break_cancelled = cfile->oplock_break_cancelled;
>
>         _cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
> -       /*
> -        * MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) do not require
> -        * an acknowledgment to be sent when the file has already been closed.
> -        */
> -       spin_lock(&cinode->open_file_lock);
> -       /* check list empty since can race with kill_sb calling tree disconnect */
> -       if (!oplock_break_cancelled && !list_empty(&cinode->openFileList)) {
> -               spin_unlock(&cinode->open_file_lock);
> +       /* MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) */
> +       if (!oplock_break_cancelled) {
>                 rc = server->ops->oplock_response(tcon, persistent_fid,
>                                                   volatile_fid, net_fid, cinode);
>                 cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
> -       } else
> -               spin_unlock(&cinode->open_file_lock);
> +       }
>
>         cifs_put_tlink(tlink);
>  out:
> --
> 2.49.0
>
Paulo Alcantara April 30, 2025, 7:14 p.m. UTC | #4
Steve French <smfrench@gmail.com> writes:

> Did you mean
>
> downgraded "RWH to RW"  or downgraded from "RWH to RH"

Downgraded from RWH to RW.

> But it is puzzling why file would still be open if not in openfile list

The server is breaking the H lease.  cifs_close_deferred_file() will put
all cifsFileInfo references from deferred closes in the inode, meaning
that _cifsFileInfo_put() call in cifs_oplock_break() will effectively
put the last reference and then closing the open handle.

The check for a non-empty list after that makes no sense as we already
closed all open handles.  As I understand it, the implicit ACK would
work if we have no open handles at the time we receive the lease break,
but that couldn't work as we always get an active reference of the open
handle before processing the lease break.

What am I missing?
Paulo Alcantara April 30, 2025, 7:44 p.m. UTC | #5
Steve French <smfrench@gmail.com> writes:

> What causes it to lose handle lease but still have read and write?

We don't know what caused it from the network traces we have.  I was
able to reproduce the downgrade from RWH to RW against Windows Server
2022 by mounting a share twice with 'nosharesock' and then having one
client writing to a file and the other client remaning the file's parent
directory.

For more information, see MS-SMB2 3.3.1.4.
Bharath SM May 5, 2025, 10:41 a.m. UTC | #6
On Thu, May 1, 2025 at 12:44 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Steve French <smfrench@gmail.com> writes:
>
> > Did you mean
> >
> > downgraded "RWH to RW"  or downgraded from "RWH to RH"
>
> Downgraded from RWH to RW.
>
> > But it is puzzling why file would still be open if not in openfile list
>
> The server is breaking the H lease.  cifs_close_deferred_file() will put
> all cifsFileInfo references from deferred closes in the inode, meaning
> that _cifsFileInfo_put() call in cifs_oplock_break() will effectively
> put the last reference and then closing the open handle.



> The check for a non-empty list after that makes no sense as we already
> closed all open handles.  As I understand it, the implicit ACK would
> work if we have no open handles at the time we receive the lease break,
> but that couldn't work as we always get an active reference of the open
> handle before processing the lease break.
>
> What am I missing?
>

If the file has only deferred handles and a handle lease break occurs,
closing all the handles triggers an implicit acknowledgment. After the
last handle is closed, the server may release the structures
associated with the file handle. If the acknowledgment is sent after
closing all the handles, the server may ignore it or it may return an
invalid file error, as it could have already freed the handle/lease
key and related structures. Additionally, this would result in an
extra command being sent to the server. This check was added to avoid
this case to send lease break ack only when openfilelist is not empty.


> We don't know what caused it from the network traces we have.  I was
> able to reproduce the downgrade from RWH to RW against Windows Server
> 2022 by mounting a share twice with 'nosharesock' and then having one
> client writing to a file and the other client remaning the file's parent
> directory.

> For more information, see MS-SMB2 3.3.1.4.

I didn't understand why a client would break 'H' lease on a file if
"one client writing to a file and other client remained the file's
parent directory."
Can you please help sharing more details and exact repro steps.?
Paulo Alcantara May 5, 2025, 1:12 p.m. UTC | #7
Bharath SM <bharathsm.hsk@gmail.com> writes:

> If the file has only deferred handles and a handle lease break occurs,
> closing all the handles triggers an implicit acknowledgment. After the
> last handle is closed, the server may release the structures
> associated with the file handle. If the acknowledgment is sent after
> closing all the handles, the server may ignore it or it may return an
> invalid file error, as it could have already freed the handle/lease
> key and related structures. 

I couldn't find anything in the specs related to the above.  Could you
please point me out to the correct specs or is this just theorical?

Have you been able to reproduce the above?  If so, please share the
details.

If the server is returning an invalid file error for a lease break ack
sent by the client that should be a no-op, isn't that a server bug then?

> Additionally, this would result in an extra command being sent to the
> server. This check was added to avoid this case to send lease break
> ack only when openfilelist is not empty.

I understand.  The problem with attempting to save that extra roundtrip
has caused performance problems with our customers accessing their
Windows Server SMB shares.

> I didn't understand why a client would break 'H' lease on a file if
> "one client writing to a file and other client remained the file's
> parent directory."
> Can you please help sharing more details and exact repro steps.?

mount.cifs //srv/share /mnt/1 -o ...,nosharesock
mount.cifs //srv/share /mnt/2 -o ...,nosharesock
tail -f /dev/null > /mnt/1/dir/foo &
mv /mnt/2/dir /mnt/2/dir2
Bharath SM May 5, 2025, 2:16 p.m. UTC | #8
Thanks Paulo for sharing details. Please find my responses.

On Mon, May 5, 2025 at 6:42 PM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Bharath SM <bharathsm.hsk@gmail.com> writes:
>
> > If the file has only deferred handles and a handle lease break occurs,
> > closing all the handles triggers an implicit acknowledgment. After the
> > last handle is closed, the server may release the structures
> > associated with the file handle. If the acknowledgment is sent after
> > closing all the handles, the server may ignore it or it may return an
> > invalid file error, as it could have already freed the handle/lease
> > key and related structures.
>
> I couldn't find anything in the specs related to the above.  Could you
> please point me out to the correct specs or is this just theorical?

Sorry for confusion, this is not from spec. I was trying to say, if we
remove the check
 "&& !list_empty(&cinode->openFileList" in code then client may get
"STATUS_OBJECT_NAME_NOT_FOUND" error when client sends lease break ack after
closing all file handles. Attached the pcap for ref.

> Have you been able to reproduce the above?  If so, please share the
> details.

Yes, attached the wireshark capture. please take a look.
steps:
1) Build cifs.ko by removing  "&& !list_empty(&cinode->openFileList)"
in cifs_oplock_break
2) Mount file share with "nosharesock,closetimeo=30" at /mnt/test1 and
/mnt/test2 ...
3) cd  /mnt/test1; echo "aaa" >> testfile
4) On other shell, cd /mnt/test2; echo "aaa" >> testfile

> If the server is returning an invalid file error for a lease break ack
> sent by the client that should be a no-op, isn't that a server bug then?

Windows Server returns STATUS_OBJECT_NAME_NOT_FOUND error code in such cases.
But I am not sure if this is a server bug.


> > Additionally, this would result in an extra command being sent to the
> > server. This check was added to avoid this case to send lease break
> > ack only when openfilelist is not empty.
>
> I understand.  The problem with attempting to save that extra roundtrip
> has caused performance problems with our customers accessing their
> Windows Server SMB shares.

I agree that we need to fix the customer issue on priority, but just
pointing out that when
we remove the existing check we will end up with this behavior. But if
we can reproduce
the cx issue or scenario then may be able to find a better fix which
can handle both cases.?

Please let me know your comments.


> > I didn't understand why a client would break 'H' lease on a file if
> > "one client writing to a file and other client remained the file's
> > parent directory."
> > Can you please help sharing more details and exact repro steps.?
>
> mount.cifs //srv/share /mnt/1 -o ...,nosharesock
> mount.cifs //srv/share /mnt/2 -o ...,nosharesock
> tail -f /dev/null > /mnt/1/dir/foo &
> mv /mnt/2/dir /mnt/2/dir2

Thanks for sharing details. I will try this out.
Paulo Alcantara May 5, 2025, 3:05 p.m. UTC | #9
Hi Bharath,

Bharath SM <bharathsm.hsk@gmail.com> writes:

> On Mon, May 5, 2025 at 6:42 PM Paulo Alcantara <pc@manguebit.com> wrote:
>>
>> Bharath SM <bharathsm.hsk@gmail.com> writes:
>>
>> > If the file has only deferred handles and a handle lease break occurs,
>> > closing all the handles triggers an implicit acknowledgment. After the
>> > last handle is closed, the server may release the structures
>> > associated with the file handle. If the acknowledgment is sent after
>> > closing all the handles, the server may ignore it or it may return an
>> > invalid file error, as it could have already freed the handle/lease
>> > key and related structures.
>>
>> I couldn't find anything in the specs related to the above.  Could you
>> please point me out to the correct specs or is this just theorical?
>
> Sorry for confusion, this is not from spec. I was trying to say, if we
> remove the check
>  "&& !list_empty(&cinode->openFileList" in code then client may get
> "STATUS_OBJECT_NAME_NOT_FOUND" error when client sends lease break ack after
> closing all file handles. Attached the pcap for ref.

Ah, thanks for the explanation!  Yes, that's indeed a problem.  We
should be calling _cifsFileInfo_put() right after sending the lease
break ack, as the old code did.

>> Have you been able to reproduce the above?  If so, please share the
>> details.
>
> Yes, attached the wireshark capture. please take a look.
> steps:
> 1) Build cifs.ko by removing  "&& !list_empty(&cinode->openFileList)"
> in cifs_oplock_break
> 2) Mount file share with "nosharesock,closetimeo=30" at /mnt/test1 and
> /mnt/test2 ...
> 3) cd  /mnt/test1; echo "aaa" >> testfile
> 4) On other shell, cd /mnt/test2; echo "aaa" >> testfile
>
>> If the server is returning an invalid file error for a lease break ack
>> sent by the client that should be a no-op, isn't that a server bug then?
>
> Windows Server returns STATUS_OBJECT_NAME_NOT_FOUND error code in such cases.
> But I am not sure if this is a server bug.

That's a legitimate error.  See above.

>> > Additionally, this would result in an extra command being sent to the
>> > server. This check was added to avoid this case to send lease break
>> > ack only when openfilelist is not empty.
>>
>> I understand.  The problem with attempting to save that extra roundtrip
>> has caused performance problems with our customers accessing their
>> Windows Server SMB shares.
>
> I agree that we need to fix the customer issue on priority, but just
> pointing out that when
> we remove the existing check we will end up with this behavior. But if
> we can reproduce
> the cx issue or scenario then may be able to find a better fix which
> can handle both cases.?
>
> Please let me know your comments.

Unfortunately I don't have any reproducers for the customer issue.

With these changes I no longer get the STATUS_OBJECT_NAME_NOT_FOUND
error with your reproducer.  Let me know.

diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index 851b74f557c1..5facc85b408a 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -3083,11 +3083,10 @@ void cifs_oplock_break(struct work_struct *work)
 	struct cifsInodeInfo *cinode = CIFS_I(inode);
 	struct cifs_tcon *tcon;
 	struct TCP_Server_Info *server;
+	bool purge_cache = false;
 	struct tcon_link *tlink;
+	struct cifs_fid *fid;
 	int rc = 0;
-	bool purge_cache = false, oplock_break_cancelled;
-	__u64 persistent_fid, volatile_fid;
-	__u16 net_fid;
 
 	wait_on_bit(&cinode->flags, CIFS_INODE_PENDING_WRITERS,
 			TASK_UNINTERRUPTIBLE);
@@ -3134,32 +3133,21 @@ void cifs_oplock_break(struct work_struct *work)
 	 * file handles but cached, then schedule deferred close immediately.
 	 * So, new open will not use cached handle.
 	 */
-
 	if (!CIFS_CACHE_HANDLE(cinode) && !list_empty(&cinode->deferred_closes))
 		cifs_close_deferred_file(cinode);
 
-	persistent_fid = cfile->fid.persistent_fid;
-	volatile_fid = cfile->fid.volatile_fid;
-	net_fid = cfile->fid.netfid;
-	oplock_break_cancelled = cfile->oplock_break_cancelled;
-
-	_cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
-	/*
-	 * MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) do not require
-	 * an acknowledgment to be sent when the file has already been closed.
-	 */
-	spin_lock(&cinode->open_file_lock);
-	/* check list empty since can race with kill_sb calling tree disconnect */
-	if (!oplock_break_cancelled && !list_empty(&cinode->openFileList)) {
-		spin_unlock(&cinode->open_file_lock);
-		rc = server->ops->oplock_response(tcon, persistent_fid,
-						  volatile_fid, net_fid, cinode);
+	fid = &cfile->fid;
+	/* MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) */
+	if (!cfile->oplock_break_cancelled) {
+		rc = server->ops->oplock_response(tcon, fid->persistent_fid,
+						  fid->volatile_fid,
+						  fid->netfid, cinode);
 		cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
-	} else
-		spin_unlock(&cinode->open_file_lock);
+	}
 
 	cifs_put_tlink(tlink);
 out:
+	_cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
 	cifs_done_oplock_break(cinode);
 }
Shyam Prasad N May 9, 2025, 11:22 a.m. UTC | #10
On Mon, May 5, 2025 at 8:35 PM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Hi Bharath,
>
> Bharath SM <bharathsm.hsk@gmail.com> writes:
>
> > On Mon, May 5, 2025 at 6:42 PM Paulo Alcantara <pc@manguebit.com> wrote:
> >>
> >> Bharath SM <bharathsm.hsk@gmail.com> writes:
> >>
> >> > If the file has only deferred handles and a handle lease break occurs,
> >> > closing all the handles triggers an implicit acknowledgment. After the
> >> > last handle is closed, the server may release the structures
> >> > associated with the file handle. If the acknowledgment is sent after
> >> > closing all the handles, the server may ignore it or it may return an
> >> > invalid file error, as it could have already freed the handle/lease
> >> > key and related structures.
> >>
> >> I couldn't find anything in the specs related to the above.  Could you
> >> please point me out to the correct specs or is this just theorical?
> >
> > Sorry for confusion, this is not from spec. I was trying to say, if we
> > remove the check
> >  "&& !list_empty(&cinode->openFileList" in code then client may get
> > "STATUS_OBJECT_NAME_NOT_FOUND" error when client sends lease break ack after
> > closing all file handles. Attached the pcap for ref.
>
> Ah, thanks for the explanation!  Yes, that's indeed a problem.  We
> should be calling _cifsFileInfo_put() right after sending the lease
> break ack, as the old code did.
>
> >> Have you been able to reproduce the above?  If so, please share the
> >> details.
> >
> > Yes, attached the wireshark capture. please take a look.
> > steps:
> > 1) Build cifs.ko by removing  "&& !list_empty(&cinode->openFileList)"
> > in cifs_oplock_break
> > 2) Mount file share with "nosharesock,closetimeo=30" at /mnt/test1 and
> > /mnt/test2 ...
> > 3) cd  /mnt/test1; echo "aaa" >> testfile
> > 4) On other shell, cd /mnt/test2; echo "aaa" >> testfile
> >
> >> If the server is returning an invalid file error for a lease break ack
> >> sent by the client that should be a no-op, isn't that a server bug then?
> >
> > Windows Server returns STATUS_OBJECT_NAME_NOT_FOUND error code in such cases.
> > But I am not sure if this is a server bug.
>
> That's a legitimate error.  See above.
>
> >> > Additionally, this would result in an extra command being sent to the
> >> > server. This check was added to avoid this case to send lease break
> >> > ack only when openfilelist is not empty.
> >>
> >> I understand.  The problem with attempting to save that extra roundtrip
> >> has caused performance problems with our customers accessing their
> >> Windows Server SMB shares.
> >
> > I agree that we need to fix the customer issue on priority, but just
> > pointing out that when
> > we remove the existing check we will end up with this behavior. But if
> > we can reproduce
> > the cx issue or scenario then may be able to find a better fix which
> > can handle both cases.?
> >
> > Please let me know your comments.
>
> Unfortunately I don't have any reproducers for the customer issue.
>
> With these changes I no longer get the STATUS_OBJECT_NAME_NOT_FOUND
> error with your reproducer.  Let me know.
>
> diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
> index 851b74f557c1..5facc85b408a 100644
> --- a/fs/smb/client/file.c
> +++ b/fs/smb/client/file.c
> @@ -3083,11 +3083,10 @@ void cifs_oplock_break(struct work_struct *work)
>         struct cifsInodeInfo *cinode = CIFS_I(inode);
>         struct cifs_tcon *tcon;
>         struct TCP_Server_Info *server;
> +       bool purge_cache = false;
>         struct tcon_link *tlink;
> +       struct cifs_fid *fid;
>         int rc = 0;
> -       bool purge_cache = false, oplock_break_cancelled;
> -       __u64 persistent_fid, volatile_fid;
> -       __u16 net_fid;
>
>         wait_on_bit(&cinode->flags, CIFS_INODE_PENDING_WRITERS,
>                         TASK_UNINTERRUPTIBLE);
> @@ -3134,32 +3133,21 @@ void cifs_oplock_break(struct work_struct *work)
>          * file handles but cached, then schedule deferred close immediately.
>          * So, new open will not use cached handle.
>          */
> -
>         if (!CIFS_CACHE_HANDLE(cinode) && !list_empty(&cinode->deferred_closes))
>                 cifs_close_deferred_file(cinode);
>
> -       persistent_fid = cfile->fid.persistent_fid;
> -       volatile_fid = cfile->fid.volatile_fid;
> -       net_fid = cfile->fid.netfid;
> -       oplock_break_cancelled = cfile->oplock_break_cancelled;
> -
> -       _cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
> -       /*
> -        * MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) do not require
> -        * an acknowledgment to be sent when the file has already been closed.
> -        */
> -       spin_lock(&cinode->open_file_lock);
> -       /* check list empty since can race with kill_sb calling tree disconnect */
> -       if (!oplock_break_cancelled && !list_empty(&cinode->openFileList)) {
> -               spin_unlock(&cinode->open_file_lock);
> -               rc = server->ops->oplock_response(tcon, persistent_fid,
> -                                                 volatile_fid, net_fid, cinode);
> +       fid = &cfile->fid;
> +       /* MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) */
> +       if (!cfile->oplock_break_cancelled) {
> +               rc = server->ops->oplock_response(tcon, fid->persistent_fid,
> +                                                 fid->volatile_fid,
> +                                                 fid->netfid, cinode);
>                 cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
> -       } else
> -               spin_unlock(&cinode->open_file_lock);
> +       }
>
>         cifs_put_tlink(tlink);
>  out:
> +       _cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
>         cifs_done_oplock_break(cinode);
>  }
>

Hi Paulo,

I think this version of the patch will be more problematic, as it will
open up a time window between the lease break ack and the file close.
Which is why we moved the _cifsFileInfo_put above as it is today.
Specifically, it is still not clear to me what was the exact bug that
your customer hit. And why is your patch fixing that issue? Can you
please elaborate on that?
On RWH to RW lease break, we would force close all deferred handles.
But any active handles (that are still kept open by the user) will
continue to remain open. Which is what this check is about.
Paulo Alcantara May 12, 2025, 1:31 p.m. UTC | #11
Shyam Prasad N <nspmangalore@gmail.com> writes:

> I think this version of the patch will be more problematic, as it will
> open up a time window between the lease break ack and the file close.
> Which is why we moved the _cifsFileInfo_put above as it is today.

Can you explain why it would be a problem sending a lease break ack
and then closing the file?

Do you have any reproducers for such problem?

> Specifically, it is still not clear to me what was the exact bug that
> your customer hit. And why is your patch fixing that issue? Can you
> please elaborate on that?

(1) CREATE foo
(2) CREATE resp
(3) CREATE foo
(4) LEASE BREAK (RWH -> RW)
(5) CLOSE foo
(6) CLOSE resp
...30s later...
(7) CREATE resp for (3)

Windows Server is delaying the CREATE request in (3) by 30s because the
client hasn't sent the lease break ack for (4).

Since the client closed the last open handle,
!list_empty(&cinode->openFileList) is obviously false and hence the
lease break ack isn't sent.

Pierguido suggested 'closetimeo=0' to the customer and it seemed to help
them with the performance issues due to delayed opens.

What is your suggestion to get rid of this optimization and then sending
lease break acks regardless whether there are open handles or not?

> On RWH to RW lease break, we would force close all deferred handles.
> But any active handles (that are still kept open by the user) will
> continue to remain open. Which is what this check is about.

The check you're referring to is related to whether or not sending lease
break acks.  What do you mean about "continue to remain open"?
Steve French May 12, 2025, 10:47 p.m. UTC | #12
On Mon, May 12, 2025 at 8:31 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Shyam Prasad N <nspmangalore@gmail.com> writes:
>
> > I think this version of the patch will be more problematic, as it will
> > open up a time window between the lease break ack and the file close.
> > Which is why we moved the _cifsFileInfo_put above as it is today.
>
> Can you explain why it would be a problem sending a lease break ack
> and then closing the file?

It is a performance issue to send an extra roundtrip that is unneeded, so
key question is why was the close not sent immediately if it was
a deferred close case.
Paulo Alcantara May 12, 2025, 10:57 p.m. UTC | #13
Steve French <smfrench@gmail.com> writes:

> On Mon, May 12, 2025 at 8:31 AM Paulo Alcantara <pc@manguebit.com> wrote:
>>
>> Shyam Prasad N <nspmangalore@gmail.com> writes:
>>
>> > I think this version of the patch will be more problematic, as it will
>> > open up a time window between the lease break ack and the file close.
>> > Which is why we moved the _cifsFileInfo_put above as it is today.
>>
>> Can you explain why it would be a problem sending a lease break ack
>> and then closing the file?
>
> It is a performance issue to send an extra roundtrip that is unneeded, so
> key question is why was the close not sent immediately if it was
> a deferred close case.

Agreed.  Even worse when the saved roundtrip causes concurrent opens to
be delayed by 30s.

Look at the commands I shared.  The close is sent immediately after the
lease break is received by the client.
diff mbox series

Patch

diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index 9e8f404b9e56..be9a07f2e8c6 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -3146,19 +3146,12 @@  void cifs_oplock_break(struct work_struct *work)
 	oplock_break_cancelled = cfile->oplock_break_cancelled;
 
 	_cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
-	/*
-	 * MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) do not require
-	 * an acknowledgment to be sent when the file has already been closed.
-	 */
-	spin_lock(&cinode->open_file_lock);
-	/* check list empty since can race with kill_sb calling tree disconnect */
-	if (!oplock_break_cancelled && !list_empty(&cinode->openFileList)) {
-		spin_unlock(&cinode->open_file_lock);
+	/* MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) */
+	if (!oplock_break_cancelled) {
 		rc = server->ops->oplock_response(tcon, persistent_fid,
 						  volatile_fid, net_fid, cinode);
 		cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
-	} else
-		spin_unlock(&cinode->open_file_lock);
+	}
 
 	cifs_put_tlink(tlink);
 out: