diff mbox series

[RFC] cifs: retry lookup and readdir when EAGAIN is returned.

Message ID YLplrk3FQiUtVoWi@nyarly.rlyeh.local
State New
Headers show
Series [RFC] cifs: retry lookup and readdir when EAGAIN is returned. | expand

Commit Message

Thiago Rafael Becker June 4, 2021, 5:41 p.m. UTC
According to the investigation performed by Jacob Shivers at Red Hat,
cifs_lookup and cifs_readdir leak EAGAIN when the user session is
deleted on the server. Fix this issue by implementing a retry with
limits, as is implemented in cifs_revalidate_dentry_attr.

Signed-off-by: Thiago Rafael Becker <trbecker@gmail.com>
---
 fs/cifs/dir.c     | 4 ++++
 fs/cifs/readdir.c | 6 ++++++
 2 files changed, 10 insertions(+)

Comments

Aurélien Aptel June 7, 2021, 9:32 a.m. UTC | #1
Thiago Rafael Becker <trbecker@gmail.com> writes:

> According to the investigation performed by Jacob Shivers at Red Hat,
> cifs_lookup and cifs_readdir leak EAGAIN when the user session is
> deleted on the server. Fix this issue by implementing a retry with
> limits, as is implemented in cifs_revalidate_dentry_attr.

If the user session is deleted trying again will always fail. Are you
sure this is the reason you get this issue?

Cheers,
Thiago Rafael Becker June 7, 2021, 1:57 p.m. UTC | #2
On Mon, Jun 07, 2021 at 11:32:46AM +0200, Aurélien Aptel wrote:
> If the user session is deleted trying again will always fail. Are you
> sure this is the reason you get this issue?

It is. We can explicitly delete the user sesssion on windows prior to one
of these calls, which will be replyed with STATUS_USER_SESSION_DELETED.


  470 0.0 client → server SMB2 198 Create Request File:
  471 0.n server → client SMB2 143 Create Response, Error: STATUS_USER_SESSION_DELETED

Which is converted to EAGAIN with the expectation that someone will
handle it down the stack while the user session is restablished. This
doesn't happen currently, and EAGAIN is leaking to userspace.

For getdents, EAGAIN is unexpected, and most applications don't bother
handling it, including coreutils (ls and stat were used to test this
patch). And for several syscalls that rely on lookup, returning EAGAIN is
unexpected, so we shouldn't leak it.

In our testing, sending the call again handles the problem with no userspace disrruption.

Best,
Thiago
Thiago Rafael Becker June 7, 2021, 3:27 p.m. UTC | #3
The following capture for a run of ls after running close-smbsession on
windows and clearing the caches on the linux client running the upstream
kernel.

    1   0.000000 client → server SMB2 410 Create Request File: dir;GetInfo Request FILE_INFO/SMB2_FILE_ALL_INFO;Close Request
    2   0.004586 server → client  SMB2 274 Create Response, Error: STATUS_USER_SESSION_DELETED;GetInfo Response, Error: STATUS_INVALID_PARAMETER;Close Response, Error: STATUS_INVALID_PARAMETER
    9   0.024997 client → server SMB2 290 Negotiate Protocol Request
   10   0.033179 server → client  SMB2 566 Negotiate Protocol Response
   12   0.033578 client → server SMB2 178 Session Setup Request, NTLMSSP_NEGOTIATE
   13   0.041297 server → client  SMB2 368 Session Setup Response, Error: STATUS_MORE_PROCESSING_REQUIRED, NTLMSSP_CHALLENGE
   15   0.041792 client → server SMB2 434 Session Setup Request, NTLMSSP_AUTH, User: \user
   16   0.047307 server → client  SMB2 130 Session Setup Response
   18   0.047832 client → server SMB2 174 Tree Connect Request Tree: \\server\root
   19   0.057075 server → client  SMB2 138 Tree Connect Response
   20   0.057586 client → server SMB2 172 Tree Connect Request Tree: \\server\IPC$
   21   0.062169 server → client  SMB2 138 Tree Connect Response
   22   0.062273 client → server SMB2 410 Create Request File: dir;GetInfo Request FILE_INFO/SMB2_FILE_ALL_INFO;Close Request
   23   0.069050 server → client  SMB2 570 Create Response File: dir;GetInfo Response;Close Response
   24   0.069943 client → server SMB2 316 Create Request File: dir;Find Request SMB2_FIND_ID_FULL_DIRECTORY_INFO Pattern: *
   25   0.079125 server → client  SMB2 3842 Create Response File: dir;Find Response SMB2_FIND_ID_FULL_DIRECTORY_INFO Pattern: *
   27   0.081926 client → server SMB2 156 Find Request File: dir SMB2_FIND_ID_FULL_DIRECTORY_INFO Pattern: *
   28   0.093209 server → client  SMB2 130 Find Response, Error: STATUS_NO_MORE_FILES SMB2_FIND_ID_FULL_DIRECTORY_INFO Pattern: *
   29   0.093743 client → server SMB2 146 Close Request File: dir
   30   0.099034 server → client  SMB2 182 Close Response
   
Similar pattern for stat.

Best,
Thiago
diff mbox series

Patch

diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 6bcd3e8f7cda..7c641f9a3dac 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -630,6 +630,7 @@  cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 	struct inode *newInode = NULL;
 	const char *full_path;
 	void *page;
+	int retry_count = 0;
 
 	xid = get_xid();
 
@@ -673,6 +674,7 @@  cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 	cifs_dbg(FYI, "Full path: %s inode = 0x%p\n",
 		 full_path, d_inode(direntry));
 
+again:
 	if (pTcon->posix_extensions)
 		rc = smb311_posix_get_inode_info(&newInode, full_path, parent_dir_inode->i_sb, xid);
 	else if (pTcon->unix_ext) {
@@ -687,6 +689,8 @@  cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 		/* since paths are not looked up by component - the parent
 		   directories are presumed to be good here */
 		renew_parental_timestamps(direntry);
+	} else if (rc == -EAGAIN && retry_count++ < 10) {
+		goto again;
 	} else if (rc == -ENOENT) {
 		cifs_set_time(direntry, jiffies);
 		newInode = NULL;
diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
index 63bfc533c9fb..01e6d41e387f 100644
--- a/fs/cifs/readdir.c
+++ b/fs/cifs/readdir.c
@@ -844,9 +844,15 @@  static int cifs_filldir(char *find_entry, struct file *file,
 	struct qstr name;
 	int rc = 0;
 	ino_t ino;
+	int retry_count = 0;
 
+again:
 	rc = cifs_fill_dirent(&de, find_entry, file_info->srch_inf.info_level,
 			      file_info->srch_inf.unicode);
+
+	if (rc == -EAGAIN && retry_count++ < 10)
+		goto again;
+
 	if (rc)
 		return rc;