Patchwork prevent slab corruption by fixing race codition in cifs

login
register
mail settings
Submitter Dave Kleikamp
Date Aug. 25, 2009, 9:23 p.m.
Message ID <1251235415.27359.116.camel@norville.austin.ibm.com>
Download mbox | patch
Permalink /patch/32094/
State New
Headers show

Comments

Dave Kleikamp - Aug. 25, 2009, 9:23 p.m.
On Tue, 2009-08-18 at 12:43 -0400, Jeff Layton wrote:
> On Tue, 18 Aug 2009 10:23:09 -0500
> Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
> 
> > On Tue, Aug 18, 2009 at 6:15 AM, Jeff Layton<jlayton@redhat.com> wrote:

> > > I'm less than thrilled with this patch. This looks like like it's
> > > layering more complexity onto a codepath that is already far too
> > > complex for what it does.
> > >
> > > Is it not possible to just use regular old refcounting for the open
> > > filehandles? i.e. have each user of the filehandle take a reference,
> > > and the last one to put it does the actual close. That seems like a
> > > much better approach to me than all of this crazy flag business.
> > >
> > 
> > I think this needs, some re-designing the code right?
> > 
> 
> My vote would be "yes". Redesign and simplify the code rather than
> adding in new hacks to work around the flaws in the existing design.
> 
> Redesigning it with actual refcounting (and not this half-assed
> wrtPending stuff) seems like a much better approach.

How's this?  Untested so far:

cifs: Replace wrtPending with a real reference count

Currently, cifs_close() tries to wait until all I/O is complete and then
frees the file private data.  If I/O does not completely in a reasonable
amount of time it frees the structure anyway, leaving a potential use-
after-free situation.

This patch changes the wrtPending counter to a complete reference count and
lets the last user free the structure.

WARNING: compile tested only at this time

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Jeff Layton - Aug. 25, 2009, 11:32 p.m.
On Tue, 25 Aug 2009 16:23:35 -0500
Dave Kleikamp <shaggy@linux.vnet.ibm.com> wrote:

> On Tue, 2009-08-18 at 12:43 -0400, Jeff Layton wrote:
> > On Tue, 18 Aug 2009 10:23:09 -0500
> > Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
> > 
> > > On Tue, Aug 18, 2009 at 6:15 AM, Jeff Layton<jlayton@redhat.com> wrote:
> 
> > > > I'm less than thrilled with this patch. This looks like like it's
> > > > layering more complexity onto a codepath that is already far too
> > > > complex for what it does.
> > > >
> > > > Is it not possible to just use regular old refcounting for the open
> > > > filehandles? i.e. have each user of the filehandle take a reference,
> > > > and the last one to put it does the actual close. That seems like a
> > > > much better approach to me than all of this crazy flag business.
> > > >
> > > 
> > > I think this needs, some re-designing the code right?
> > > 
> > 
> > My vote would be "yes". Redesign and simplify the code rather than
> > adding in new hacks to work around the flaws in the existing design.
> > 
> > Redesigning it with actual refcounting (and not this half-assed
> > wrtPending stuff) seems like a much better approach.
> 
> How's this?  Untested so far:
> 
> cifs: Replace wrtPending with a real reference count
> 
> Currently, cifs_close() tries to wait until all I/O is complete and then
> frees the file private data.  If I/O does not completely in a reasonable
> amount of time it frees the structure anyway, leaving a potential use-
> after-free situation.
> 
> This patch changes the wrtPending counter to a complete reference count and
> lets the last user free the structure.
> 
> WARNING: compile tested only at this time
> 
> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
> 

Much better. Looks like a step in the right direction.

-- Jeff
Jeff Layton - Aug. 26, 2009, 10:59 a.m.
On Tue, 25 Aug 2009 16:23:35 -0500
Dave Kleikamp <shaggy@linux.vnet.ibm.com> wrote:

> On Tue, 2009-08-18 at 12:43 -0400, Jeff Layton wrote:
> > On Tue, 18 Aug 2009 10:23:09 -0500
> > Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
> > 
> > > On Tue, Aug 18, 2009 at 6:15 AM, Jeff Layton<jlayton@redhat.com> wrote:
> 
> > > > I'm less than thrilled with this patch. This looks like like it's
> > > > layering more complexity onto a codepath that is already far too
> > > > complex for what it does.
> > > >
> > > > Is it not possible to just use regular old refcounting for the open
> > > > filehandles? i.e. have each user of the filehandle take a reference,
> > > > and the last one to put it does the actual close. That seems like a
> > > > much better approach to me than all of this crazy flag business.
> > > >
> > > 
> > > I think this needs, some re-designing the code right?
> > > 
> > 
> > My vote would be "yes". Redesign and simplify the code rather than
> > adding in new hacks to work around the flaws in the existing design.
> > 
> > Redesigning it with actual refcounting (and not this half-assed
> > wrtPending stuff) seems like a much better approach.
> 
> How's this?  Untested so far:
> 
> cifs: Replace wrtPending with a real reference count
> 
> Currently, cifs_close() tries to wait until all I/O is complete and then
> frees the file private data.  If I/O does not completely in a reasonable
> amount of time it frees the structure anyway, leaving a potential use-
> after-free situation.
> 
> This patch changes the wrtPending counter to a complete reference count and
> lets the last user free the structure.
> 
> WARNING: compile tested only at this time
> 
> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
> 

Follow up to yesterday's email.

As I said yesterday, this is a step in the right direction, but there's
still room for improvement in this area:

1) rather than having cifs_close wait for all of the users of the
filehandle to finish, should we just have the last user issue the close
on the wire? cifsFileInfo is really just a representation of an open
filehandle on the server. CIFS should probably just create them and
attach VFS filehandles to them on an as-needed basis.

2) find_readable_file and find_writable_file could be consolidated.
Rather than finding and matching filehandles based on the f_flags in
the filp, it seems like it would be preferable to track the actual
flags that were sent to the server and have an interface that searches
for open filehandles based on those. That might also allow us to reduce
the number of open filehandles in use. When a VFS open call comes in,
we could search for an existing filehandle that has the right flags and
just bump the refcount instead of doing a new open if one is found.

Some other comments below...

> diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
> index 6941c22..7dfe084 100644
> --- a/fs/cifs/cifsacl.c
> +++ b/fs/cifs/cifsacl.c
> @@ -607,7 +607,7 @@ static struct cifs_ntsd *get_cifs_acl(struct cifs_sb_info *cifs_sb,
>  		return get_cifs_acl_by_path(cifs_sb, path, pacllen);
>  
>  	pntsd = get_cifs_acl_by_fid(cifs_sb, open_file->netfid, pacllen);
> -	atomic_dec(&open_file->wrtPending);
> +	cifsFileInfo_put(open_file);
>  	return pntsd;
>  }
>  
> @@ -665,7 +665,7 @@ static int set_cifs_acl(struct cifs_ntsd *pnntsd, __u32 acllen,
>  		return set_cifs_acl_by_path(cifs_sb, path, pnntsd, acllen);
>  
>  	rc = set_cifs_acl_by_fid(cifs_sb, open_file->netfid, pnntsd, acllen);
> -	atomic_dec(&open_file->wrtPending);
> +	cifsFileInfo_put(open_file);
>  	return rc;
>  }
>  
> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
> index 6084d63..e3b1161 100644
> --- a/fs/cifs/cifsglob.h
> +++ b/fs/cifs/cifsglob.h
> @@ -351,11 +351,24 @@ struct cifsFileInfo {
>  	bool closePend:1;	/* file is marked to close */
		^^^^
Not directly related to this patch, but I think we could eliminate this
flag by being more careful about how and when we add and remove this
struct from the lists.

>  	bool invalidHandle:1;	/* file closed via session abend */
>  	bool messageMode:1;	/* for pipes: message vs byte mode */
> -	atomic_t wrtPending;   /* handle in use - defer close */
> +	atomic_t count;		/* reference count */
>  	struct mutex fh_mutex; /* prevents reopen race after dead ses*/
>  	struct cifs_search_info srch_inf;
>  };
>  
> +/* Take a reference on the file private data */
> +static inline void cifsFileInfo_get(struct cifsFileInfo *cifs_file)
> +{
> +	atomic_inc(&cifs_file->count);
> +}
> +

Would krefs be better here? Probably not possible unless we change it
so that cifs_close doesn't wait for the refcount to drop.

> +/* Release a reference on the file private data */
> +static inline void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> +{
> +	if (atomic_dec_and_test(&cifs_file->count))
> +		kfree(cifs_file);
> +}
> +

The above looks racy...what prevents the count from going high again
before you kfree? In practice, it looks like the
find_readable/writable_file functions use the closePend flag to prevent
this from occurring, but it would be cleaner if that logic were more
self-contained.

>  /*
>   * One of these for each file inode
>   */
> diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
> index 4326ffd..a6424cf 100644
> --- a/fs/cifs/dir.c
> +++ b/fs/cifs/dir.c
> @@ -153,7 +153,7 @@ cifs_fill_fileinfo(struct inode *newinode, __u16 fileHandle,
>  	mutex_init(&pCifsFile->fh_mutex);
>  	mutex_init(&pCifsFile->lock_mutex);
>  	INIT_LIST_HEAD(&pCifsFile->llist);
> -	atomic_set(&pCifsFile->wrtPending, 0);
> +	atomic_set(&pCifsFile->count, 1);
>  
>  	/* set the following in open now
>  			pCifsFile->pfile = file; */
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index c34b7f8..fa7beac 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -53,11 +53,9 @@ static inline struct cifsFileInfo *cifs_init_private(
>  	private_data->pInode = inode;
>  	private_data->invalidHandle = false;
>  	private_data->closePend = false;
> -	/* we have to track num writers to the inode, since writepages
> -	does not tell us which handle the write is for so there can
> -	be a close (overlapping with write) of the filehandle that
> -	cifs_writepages chose to use */
> -	atomic_set(&private_data->wrtPending, 0);
> +	/* Initialize reference count to one.  The private data is
> +	freed on the release of the last reference */
> +	atomic_set(&private_data->count, 1);
>  
>  	return private_data;
>  }
> @@ -643,7 +641,7 @@ int cifs_close(struct inode *inode, struct file *file)
>  			if (!pTcon->need_reconnect) {
>  				write_unlock(&GlobalSMBSeslock);
>  				timeout = 2;
> -				while ((atomic_read(&pSMBFile->wrtPending) != 0)
> +				while ((atomic_read(&pSMBFile->count) != 1)
>  					&& (timeout <= 2048)) {
>  					/* Give write a better chance to get to
>  					server ahead of the close.  We do not
> @@ -657,8 +655,6 @@ int cifs_close(struct inode *inode, struct file *file)
>  					msleep(timeout);
>  					timeout *= 4;
>  				}
> -				if (atomic_read(&pSMBFile->wrtPending))
> -					cERROR(1, ("close with pending write"));
>  				if (!pTcon->need_reconnect &&
>  				    !pSMBFile->invalidHandle)
>  					rc = CIFSSMBClose(xid, pTcon,
> @@ -681,24 +677,7 @@ int cifs_close(struct inode *inode, struct file *file)
>  		list_del(&pSMBFile->flist);
>  		list_del(&pSMBFile->tlist);
>  		write_unlock(&GlobalSMBSeslock);
> -		timeout = 10;
> -		/* We waited above to give the SMBWrite a chance to issue
> -		   on the wire (so we do not get SMBWrite returning EBADF
> -		   if writepages is racing with close.  Note that writepages
> -		   does not specify a file handle, so it is possible for a file
> -		   to be opened twice, and the application close the "wrong"
> -		   file handle - in these cases we delay long enough to allow
> -		   the SMBWrite to get on the wire before the SMB Close.
> -		   We allow total wait here over 45 seconds, more than
> -		   oplock break time, and more than enough to allow any write
> -		   to complete on the server, or to time out on the client */
> -		while ((atomic_read(&pSMBFile->wrtPending) != 0)
> -				&& (timeout <= 50000)) {
> -			cERROR(1, ("writes pending, delay free of handle"));
> -			msleep(timeout);
> -			timeout *= 8;
> -		}
> -		kfree(file->private_data);
> +		cifsFileInfo_put(file->private_data);
>  		file->private_data = NULL;
>  	} else
>  		rc = -EBADF;
> @@ -1236,7 +1215,7 @@ struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *cifs_inode)
>  			if (!open_file->invalidHandle) {
>  				/* found a good file */
>  				/* lock it so it will not be closed on us */
> -				atomic_inc(&open_file->wrtPending);
> +				cifsFileInfo_get(open_file);
>  				read_unlock(&GlobalSMBSeslock);
>  				return open_file;
>  			} /* else might as well continue, and look for
> @@ -1276,7 +1255,7 @@ refind_writable:
>  		if (open_file->pfile &&
>  		    ((open_file->pfile->f_flags & O_RDWR) ||
>  		     (open_file->pfile->f_flags & O_WRONLY))) {
> -			atomic_inc(&open_file->wrtPending);
> +			cifsFileInfo_get(open_file);
>  
>  			if (!open_file->invalidHandle) {
>  				/* found a good writable file */
> @@ -1293,7 +1272,7 @@ refind_writable:
>  				else { /* start over in case this was deleted */
>  				       /* since the list could be modified */
>  					read_lock(&GlobalSMBSeslock);
> -					atomic_dec(&open_file->wrtPending);
> +					cifsFileInfo_put(open_file);
>  					goto refind_writable;
>  				}
>  			}
> @@ -1309,7 +1288,7 @@ refind_writable:
>  			read_lock(&GlobalSMBSeslock);
>  			/* can not use this handle, no write
>  			   pending on this one after all */
> -			atomic_dec(&open_file->wrtPending);
> +			cifsFileInfo_put(open_file);
>  
>  			if (open_file->closePend) /* list could have changed */
>  				goto refind_writable;
> @@ -1373,7 +1352,7 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to)
>  	if (open_file) {
>  		bytes_written = cifs_write(open_file->pfile, write_data,
>  					   to-from, &offset);
> -		atomic_dec(&open_file->wrtPending);
> +		cifsFileInfo_put(open_file);
>  		/* Does mm or vfs already set times? */
>  		inode->i_atime = inode->i_mtime = current_fs_time(inode->i_sb);
>  		if ((bytes_written > 0) && (offset))
> @@ -1562,7 +1541,7 @@ retry:
>  						   bytes_to_write, offset,
>  						   &bytes_written, iov, n_iov,
>  						   long_op);
> -				atomic_dec(&open_file->wrtPending);
> +				cifsFileInfo_put(open_file);
>  				cifs_update_eof(cifsi, offset, bytes_written);
>  
>  				if (rc || bytes_written < bytes_to_write) {
> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> index 82d8383..1f09c76 100644
> --- a/fs/cifs/inode.c
> +++ b/fs/cifs/inode.c
> @@ -800,7 +800,7 @@ set_via_filehandle:
>  	if (open_file == NULL)
>  		CIFSSMBClose(xid, pTcon, netfid);
>  	else
> -		atomic_dec(&open_file->wrtPending);
> +		cifsFileInfo_put(open_file);
>  out:
>  	return rc;
>  }
> @@ -1635,7 +1635,7 @@ cifs_set_file_size(struct inode *inode, struct iattr *attrs,
>  		__u32 npid = open_file->pid;
>  		rc = CIFSSMBSetFileSize(xid, pTcon, attrs->ia_size, nfid,
>  					npid, false);
> -		atomic_dec(&open_file->wrtPending);
> +		cifsFileInfo_put(open_file);
>  		cFYI(1, ("SetFSize for attrs rc = %d", rc));
>  		if ((rc == -EINVAL) || (rc == -EOPNOTSUPP)) {
>  			unsigned int bytes_written;
> @@ -1790,7 +1790,7 @@ cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs)
>  		u16 nfid = open_file->netfid;
>  		u32 npid = open_file->pid;
>  		rc = CIFSSMBUnixSetFileInfo(xid, pTcon, args, nfid, npid);
> -		atomic_dec(&open_file->wrtPending);
> +		cifsFileInfo_put(open_file);
>  	} else {
>  		rc = CIFSSMBUnixSetPathInfo(xid, pTcon, full_path, args,
>  				    cifs_sb->local_nls,
>
Jeff Layton - Aug. 26, 2009, 7:02 p.m.
On Wed, 26 Aug 2009 06:59:17 -0400
Jeff Layton <jlayton@redhat.com> wrote:

> On Tue, 25 Aug 2009 16:23:35 -0500
> Dave Kleikamp <shaggy@linux.vnet.ibm.com> wrote:
> 
> > On Tue, 2009-08-18 at 12:43 -0400, Jeff Layton wrote:
> > > On Tue, 18 Aug 2009 10:23:09 -0500
> > > Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
> > > 
> > > > On Tue, Aug 18, 2009 at 6:15 AM, Jeff Layton<jlayton@redhat.com> wrote:
> > 
> > > > > I'm less than thrilled with this patch. This looks like like it's
> > > > > layering more complexity onto a codepath that is already far too
> > > > > complex for what it does.
> > > > >
> > > > > Is it not possible to just use regular old refcounting for the open
> > > > > filehandles? i.e. have each user of the filehandle take a reference,
> > > > > and the last one to put it does the actual close. That seems like a
> > > > > much better approach to me than all of this crazy flag business.
> > > > >
> > > > 
> > > > I think this needs, some re-designing the code right?
> > > > 
> > > 
> > > My vote would be "yes". Redesign and simplify the code rather than
> > > adding in new hacks to work around the flaws in the existing design.
> > > 
> > > Redesigning it with actual refcounting (and not this half-assed
> > > wrtPending stuff) seems like a much better approach.
> > 
> > How's this?  Untested so far:
> > 
> > cifs: Replace wrtPending with a real reference count
> > 
> > Currently, cifs_close() tries to wait until all I/O is complete and then
> > frees the file private data.  If I/O does not completely in a reasonable
> > amount of time it frees the structure anyway, leaving a potential use-
> > after-free situation.
> > 
> > This patch changes the wrtPending counter to a complete reference count and
> > lets the last user free the structure.
> > 
> > WARNING: compile tested only at this time
> > 
> > Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
> > 
> 
> Follow up to yesterday's email.
> 
> As I said yesterday, this is a step in the right direction, but there's
> still room for improvement in this area:
> 
> 1) rather than having cifs_close wait for all of the users of the
> filehandle to finish, should we just have the last user issue the close
> on the wire? cifsFileInfo is really just a representation of an open
> filehandle on the server. CIFS should probably just create them and
> attach VFS filehandles to them on an as-needed basis.
> 
> 2) find_readable_file and find_writable_file could be consolidated.
> Rather than finding and matching filehandles based on the f_flags in
> the filp, it seems like it would be preferable to track the actual
> flags that were sent to the server and have an interface that searches
> for open filehandles based on those. That might also allow us to reduce
> the number of open filehandles in use. When a VFS open call comes in,
> we could search for an existing filehandle that has the right flags and
> just bump the refcount instead of doing a new open if one is found.
> 
> Some other comments below...
> 

Let me be clear here...

This code is still an improvement and will likely fix the
use-after-free that Shirish identified. Assuming that it tests out OK,
I'm inclined to take this patch in the near term and then plan an
overhaul of the open filehandle code in the future.

Sound reasonable?
Shirish Pargaonkar - Aug. 30, 2009, 3:30 p.m.
The testing, so far looks good, the same tests that detected this slab
memory corruption
have run without any errors/corruption for more than 24 hours.

On Tue, Aug 25, 2009 at 4:23 PM, Dave Kleikamp<shaggy@linux.vnet.ibm.com> wrote:
> On Tue, 2009-08-18 at 12:43 -0400, Jeff Layton wrote:
>> On Tue, 18 Aug 2009 10:23:09 -0500
>> Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
>>
>> > On Tue, Aug 18, 2009 at 6:15 AM, Jeff Layton<jlayton@redhat.com> wrote:
>
>> > > I'm less than thrilled with this patch. This looks like like it's
>> > > layering more complexity onto a codepath that is already far too
>> > > complex for what it does.
>> > >
>> > > Is it not possible to just use regular old refcounting for the open
>> > > filehandles? i.e. have each user of the filehandle take a reference,
>> > > and the last one to put it does the actual close. That seems like a
>> > > much better approach to me than all of this crazy flag business.
>> > >
>> >
>> > I think this needs, some re-designing the code right?
>> >
>>
>> My vote would be "yes". Redesign and simplify the code rather than
>> adding in new hacks to work around the flaws in the existing design.
>>
>> Redesigning it with actual refcounting (and not this half-assed
>> wrtPending stuff) seems like a much better approach.
>
> How's this?  Untested so far:
>
> cifs: Replace wrtPending with a real reference count
>
> Currently, cifs_close() tries to wait until all I/O is complete and then
> frees the file private data.  If I/O does not completely in a reasonable
> amount of time it frees the structure anyway, leaving a potential use-
> after-free situation.
>
> This patch changes the wrtPending counter to a complete reference count and
> lets the last user free the structure.
>
> WARNING: compile tested only at this time
>
> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
>
> diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
> index 6941c22..7dfe084 100644
> --- a/fs/cifs/cifsacl.c
> +++ b/fs/cifs/cifsacl.c
> @@ -607,7 +607,7 @@ static struct cifs_ntsd *get_cifs_acl(struct cifs_sb_info *cifs_sb,
>                return get_cifs_acl_by_path(cifs_sb, path, pacllen);
>
>        pntsd = get_cifs_acl_by_fid(cifs_sb, open_file->netfid, pacllen);
> -       atomic_dec(&open_file->wrtPending);
> +       cifsFileInfo_put(open_file);
>        return pntsd;
>  }
>
> @@ -665,7 +665,7 @@ static int set_cifs_acl(struct cifs_ntsd *pnntsd, __u32 acllen,
>                return set_cifs_acl_by_path(cifs_sb, path, pnntsd, acllen);
>
>        rc = set_cifs_acl_by_fid(cifs_sb, open_file->netfid, pnntsd, acllen);
> -       atomic_dec(&open_file->wrtPending);
> +       cifsFileInfo_put(open_file);
>        return rc;
>  }
>
> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
> index 6084d63..e3b1161 100644
> --- a/fs/cifs/cifsglob.h
> +++ b/fs/cifs/cifsglob.h
> @@ -351,11 +351,24 @@ struct cifsFileInfo {
>        bool closePend:1;       /* file is marked to close */
>        bool invalidHandle:1;   /* file closed via session abend */
>        bool messageMode:1;     /* for pipes: message vs byte mode */
> -       atomic_t wrtPending;   /* handle in use - defer close */
> +       atomic_t count;         /* reference count */
>        struct mutex fh_mutex; /* prevents reopen race after dead ses*/
>        struct cifs_search_info srch_inf;
>  };
>
> +/* Take a reference on the file private data */
> +static inline void cifsFileInfo_get(struct cifsFileInfo *cifs_file)
> +{
> +       atomic_inc(&cifs_file->count);
> +}
> +
> +/* Release a reference on the file private data */
> +static inline void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> +{
> +       if (atomic_dec_and_test(&cifs_file->count))
> +               kfree(cifs_file);
> +}
> +
>  /*
>  * One of these for each file inode
>  */
> diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
> index 4326ffd..a6424cf 100644
> --- a/fs/cifs/dir.c
> +++ b/fs/cifs/dir.c
> @@ -153,7 +153,7 @@ cifs_fill_fileinfo(struct inode *newinode, __u16 fileHandle,
>        mutex_init(&pCifsFile->fh_mutex);
>        mutex_init(&pCifsFile->lock_mutex);
>        INIT_LIST_HEAD(&pCifsFile->llist);
> -       atomic_set(&pCifsFile->wrtPending, 0);
> +       atomic_set(&pCifsFile->count, 1);
>
>        /* set the following in open now
>                        pCifsFile->pfile = file; */
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index c34b7f8..fa7beac 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -53,11 +53,9 @@ static inline struct cifsFileInfo *cifs_init_private(
>        private_data->pInode = inode;
>        private_data->invalidHandle = false;
>        private_data->closePend = false;
> -       /* we have to track num writers to the inode, since writepages
> -       does not tell us which handle the write is for so there can
> -       be a close (overlapping with write) of the filehandle that
> -       cifs_writepages chose to use */
> -       atomic_set(&private_data->wrtPending, 0);
> +       /* Initialize reference count to one.  The private data is
> +       freed on the release of the last reference */
> +       atomic_set(&private_data->count, 1);
>
>        return private_data;
>  }
> @@ -643,7 +641,7 @@ int cifs_close(struct inode *inode, struct file *file)
>                        if (!pTcon->need_reconnect) {
>                                write_unlock(&GlobalSMBSeslock);
>                                timeout = 2;
> -                               while ((atomic_read(&pSMBFile->wrtPending) != 0)
> +                               while ((atomic_read(&pSMBFile->count) != 1)
>                                        && (timeout <= 2048)) {
>                                        /* Give write a better chance to get to
>                                        server ahead of the close.  We do not
> @@ -657,8 +655,6 @@ int cifs_close(struct inode *inode, struct file *file)
>                                        msleep(timeout);
>                                        timeout *= 4;
>                                }
> -                               if (atomic_read(&pSMBFile->wrtPending))
> -                                       cERROR(1, ("close with pending write"));
>                                if (!pTcon->need_reconnect &&
>                                    !pSMBFile->invalidHandle)
>                                        rc = CIFSSMBClose(xid, pTcon,
> @@ -681,24 +677,7 @@ int cifs_close(struct inode *inode, struct file *file)
>                list_del(&pSMBFile->flist);
>                list_del(&pSMBFile->tlist);
>                write_unlock(&GlobalSMBSeslock);
> -               timeout = 10;
> -               /* We waited above to give the SMBWrite a chance to issue
> -                  on the wire (so we do not get SMBWrite returning EBADF
> -                  if writepages is racing with close.  Note that writepages
> -                  does not specify a file handle, so it is possible for a file
> -                  to be opened twice, and the application close the "wrong"
> -                  file handle - in these cases we delay long enough to allow
> -                  the SMBWrite to get on the wire before the SMB Close.
> -                  We allow total wait here over 45 seconds, more than
> -                  oplock break time, and more than enough to allow any write
> -                  to complete on the server, or to time out on the client */
> -               while ((atomic_read(&pSMBFile->wrtPending) != 0)
> -                               && (timeout <= 50000)) {
> -                       cERROR(1, ("writes pending, delay free of handle"));
> -                       msleep(timeout);
> -                       timeout *= 8;
> -               }
> -               kfree(file->private_data);
> +               cifsFileInfo_put(file->private_data);
>                file->private_data = NULL;
>        } else
>                rc = -EBADF;
> @@ -1236,7 +1215,7 @@ struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *cifs_inode)
>                        if (!open_file->invalidHandle) {
>                                /* found a good file */
>                                /* lock it so it will not be closed on us */
> -                               atomic_inc(&open_file->wrtPending);
> +                               cifsFileInfo_get(open_file);
>                                read_unlock(&GlobalSMBSeslock);
>                                return open_file;
>                        } /* else might as well continue, and look for
> @@ -1276,7 +1255,7 @@ refind_writable:
>                if (open_file->pfile &&
>                    ((open_file->pfile->f_flags & O_RDWR) ||
>                     (open_file->pfile->f_flags & O_WRONLY))) {
> -                       atomic_inc(&open_file->wrtPending);
> +                       cifsFileInfo_get(open_file);
>
>                        if (!open_file->invalidHandle) {
>                                /* found a good writable file */
> @@ -1293,7 +1272,7 @@ refind_writable:
>                                else { /* start over in case this was deleted */
>                                       /* since the list could be modified */
>                                        read_lock(&GlobalSMBSeslock);
> -                                       atomic_dec(&open_file->wrtPending);
> +                                       cifsFileInfo_put(open_file);
>                                        goto refind_writable;
>                                }
>                        }
> @@ -1309,7 +1288,7 @@ refind_writable:
>                        read_lock(&GlobalSMBSeslock);
>                        /* can not use this handle, no write
>                           pending on this one after all */
> -                       atomic_dec(&open_file->wrtPending);
> +                       cifsFileInfo_put(open_file);
>
>                        if (open_file->closePend) /* list could have changed */
>                                goto refind_writable;
> @@ -1373,7 +1352,7 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to)
>        if (open_file) {
>                bytes_written = cifs_write(open_file->pfile, write_data,
>                                           to-from, &offset);
> -               atomic_dec(&open_file->wrtPending);
> +               cifsFileInfo_put(open_file);
>                /* Does mm or vfs already set times? */
>                inode->i_atime = inode->i_mtime = current_fs_time(inode->i_sb);
>                if ((bytes_written > 0) && (offset))
> @@ -1562,7 +1541,7 @@ retry:
>                                                   bytes_to_write, offset,
>                                                   &bytes_written, iov, n_iov,
>                                                   long_op);
> -                               atomic_dec(&open_file->wrtPending);
> +                               cifsFileInfo_put(open_file);
>                                cifs_update_eof(cifsi, offset, bytes_written);
>
>                                if (rc || bytes_written < bytes_to_write) {
> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> index 82d8383..1f09c76 100644
> --- a/fs/cifs/inode.c
> +++ b/fs/cifs/inode.c
> @@ -800,7 +800,7 @@ set_via_filehandle:
>        if (open_file == NULL)
>                CIFSSMBClose(xid, pTcon, netfid);
>        else
> -               atomic_dec(&open_file->wrtPending);
> +               cifsFileInfo_put(open_file);
>  out:
>        return rc;
>  }
> @@ -1635,7 +1635,7 @@ cifs_set_file_size(struct inode *inode, struct iattr *attrs,
>                __u32 npid = open_file->pid;
>                rc = CIFSSMBSetFileSize(xid, pTcon, attrs->ia_size, nfid,
>                                        npid, false);
> -               atomic_dec(&open_file->wrtPending);
> +               cifsFileInfo_put(open_file);
>                cFYI(1, ("SetFSize for attrs rc = %d", rc));
>                if ((rc == -EINVAL) || (rc == -EOPNOTSUPP)) {
>                        unsigned int bytes_written;
> @@ -1790,7 +1790,7 @@ cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs)
>                u16 nfid = open_file->netfid;
>                u32 npid = open_file->pid;
>                rc = CIFSSMBUnixSetFileInfo(xid, pTcon, args, nfid, npid);
> -               atomic_dec(&open_file->wrtPending);
> +               cifsFileInfo_put(open_file);
>        } else {
>                rc = CIFSSMBUnixSetPathInfo(xid, pTcon, full_path, args,
>                                    cifs_sb->local_nls,
>
> --
> David Kleikamp
> IBM Linux Technology Center
>
>

Patch

diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
index 6941c22..7dfe084 100644
--- a/fs/cifs/cifsacl.c
+++ b/fs/cifs/cifsacl.c
@@ -607,7 +607,7 @@  static struct cifs_ntsd *get_cifs_acl(struct cifs_sb_info *cifs_sb,
 		return get_cifs_acl_by_path(cifs_sb, path, pacllen);
 
 	pntsd = get_cifs_acl_by_fid(cifs_sb, open_file->netfid, pacllen);
-	atomic_dec(&open_file->wrtPending);
+	cifsFileInfo_put(open_file);
 	return pntsd;
 }
 
@@ -665,7 +665,7 @@  static int set_cifs_acl(struct cifs_ntsd *pnntsd, __u32 acllen,
 		return set_cifs_acl_by_path(cifs_sb, path, pnntsd, acllen);
 
 	rc = set_cifs_acl_by_fid(cifs_sb, open_file->netfid, pnntsd, acllen);
-	atomic_dec(&open_file->wrtPending);
+	cifsFileInfo_put(open_file);
 	return rc;
 }
 
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 6084d63..e3b1161 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -351,11 +351,24 @@  struct cifsFileInfo {
 	bool closePend:1;	/* file is marked to close */
 	bool invalidHandle:1;	/* file closed via session abend */
 	bool messageMode:1;	/* for pipes: message vs byte mode */
-	atomic_t wrtPending;   /* handle in use - defer close */
+	atomic_t count;		/* reference count */
 	struct mutex fh_mutex; /* prevents reopen race after dead ses*/
 	struct cifs_search_info srch_inf;
 };
 
+/* Take a reference on the file private data */
+static inline void cifsFileInfo_get(struct cifsFileInfo *cifs_file)
+{
+	atomic_inc(&cifs_file->count);
+}
+
+/* Release a reference on the file private data */
+static inline void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
+{
+	if (atomic_dec_and_test(&cifs_file->count))
+		kfree(cifs_file);
+}
+
 /*
  * One of these for each file inode
  */
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 4326ffd..a6424cf 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -153,7 +153,7 @@  cifs_fill_fileinfo(struct inode *newinode, __u16 fileHandle,
 	mutex_init(&pCifsFile->fh_mutex);
 	mutex_init(&pCifsFile->lock_mutex);
 	INIT_LIST_HEAD(&pCifsFile->llist);
-	atomic_set(&pCifsFile->wrtPending, 0);
+	atomic_set(&pCifsFile->count, 1);
 
 	/* set the following in open now
 			pCifsFile->pfile = file; */
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index c34b7f8..fa7beac 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -53,11 +53,9 @@  static inline struct cifsFileInfo *cifs_init_private(
 	private_data->pInode = inode;
 	private_data->invalidHandle = false;
 	private_data->closePend = false;
-	/* we have to track num writers to the inode, since writepages
-	does not tell us which handle the write is for so there can
-	be a close (overlapping with write) of the filehandle that
-	cifs_writepages chose to use */
-	atomic_set(&private_data->wrtPending, 0);
+	/* Initialize reference count to one.  The private data is
+	freed on the release of the last reference */
+	atomic_set(&private_data->count, 1);
 
 	return private_data;
 }
@@ -643,7 +641,7 @@  int cifs_close(struct inode *inode, struct file *file)
 			if (!pTcon->need_reconnect) {
 				write_unlock(&GlobalSMBSeslock);
 				timeout = 2;
-				while ((atomic_read(&pSMBFile->wrtPending) != 0)
+				while ((atomic_read(&pSMBFile->count) != 1)
 					&& (timeout <= 2048)) {
 					/* Give write a better chance to get to
 					server ahead of the close.  We do not
@@ -657,8 +655,6 @@  int cifs_close(struct inode *inode, struct file *file)
 					msleep(timeout);
 					timeout *= 4;
 				}
-				if (atomic_read(&pSMBFile->wrtPending))
-					cERROR(1, ("close with pending write"));
 				if (!pTcon->need_reconnect &&
 				    !pSMBFile->invalidHandle)
 					rc = CIFSSMBClose(xid, pTcon,
@@ -681,24 +677,7 @@  int cifs_close(struct inode *inode, struct file *file)
 		list_del(&pSMBFile->flist);
 		list_del(&pSMBFile->tlist);
 		write_unlock(&GlobalSMBSeslock);
-		timeout = 10;
-		/* We waited above to give the SMBWrite a chance to issue
-		   on the wire (so we do not get SMBWrite returning EBADF
-		   if writepages is racing with close.  Note that writepages
-		   does not specify a file handle, so it is possible for a file
-		   to be opened twice, and the application close the "wrong"
-		   file handle - in these cases we delay long enough to allow
-		   the SMBWrite to get on the wire before the SMB Close.
-		   We allow total wait here over 45 seconds, more than
-		   oplock break time, and more than enough to allow any write
-		   to complete on the server, or to time out on the client */
-		while ((atomic_read(&pSMBFile->wrtPending) != 0)
-				&& (timeout <= 50000)) {
-			cERROR(1, ("writes pending, delay free of handle"));
-			msleep(timeout);
-			timeout *= 8;
-		}
-		kfree(file->private_data);
+		cifsFileInfo_put(file->private_data);
 		file->private_data = NULL;
 	} else
 		rc = -EBADF;
@@ -1236,7 +1215,7 @@  struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *cifs_inode)
 			if (!open_file->invalidHandle) {
 				/* found a good file */
 				/* lock it so it will not be closed on us */
-				atomic_inc(&open_file->wrtPending);
+				cifsFileInfo_get(open_file);
 				read_unlock(&GlobalSMBSeslock);
 				return open_file;
 			} /* else might as well continue, and look for
@@ -1276,7 +1255,7 @@  refind_writable:
 		if (open_file->pfile &&
 		    ((open_file->pfile->f_flags & O_RDWR) ||
 		     (open_file->pfile->f_flags & O_WRONLY))) {
-			atomic_inc(&open_file->wrtPending);
+			cifsFileInfo_get(open_file);
 
 			if (!open_file->invalidHandle) {
 				/* found a good writable file */
@@ -1293,7 +1272,7 @@  refind_writable:
 				else { /* start over in case this was deleted */
 				       /* since the list could be modified */
 					read_lock(&GlobalSMBSeslock);
-					atomic_dec(&open_file->wrtPending);
+					cifsFileInfo_put(open_file);
 					goto refind_writable;
 				}
 			}
@@ -1309,7 +1288,7 @@  refind_writable:
 			read_lock(&GlobalSMBSeslock);
 			/* can not use this handle, no write
 			   pending on this one after all */
-			atomic_dec(&open_file->wrtPending);
+			cifsFileInfo_put(open_file);
 
 			if (open_file->closePend) /* list could have changed */
 				goto refind_writable;
@@ -1373,7 +1352,7 @@  static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to)
 	if (open_file) {
 		bytes_written = cifs_write(open_file->pfile, write_data,
 					   to-from, &offset);
-		atomic_dec(&open_file->wrtPending);
+		cifsFileInfo_put(open_file);
 		/* Does mm or vfs already set times? */
 		inode->i_atime = inode->i_mtime = current_fs_time(inode->i_sb);
 		if ((bytes_written > 0) && (offset))
@@ -1562,7 +1541,7 @@  retry:
 						   bytes_to_write, offset,
 						   &bytes_written, iov, n_iov,
 						   long_op);
-				atomic_dec(&open_file->wrtPending);
+				cifsFileInfo_put(open_file);
 				cifs_update_eof(cifsi, offset, bytes_written);
 
 				if (rc || bytes_written < bytes_to_write) {
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 82d8383..1f09c76 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -800,7 +800,7 @@  set_via_filehandle:
 	if (open_file == NULL)
 		CIFSSMBClose(xid, pTcon, netfid);
 	else
-		atomic_dec(&open_file->wrtPending);
+		cifsFileInfo_put(open_file);
 out:
 	return rc;
 }
@@ -1635,7 +1635,7 @@  cifs_set_file_size(struct inode *inode, struct iattr *attrs,
 		__u32 npid = open_file->pid;
 		rc = CIFSSMBSetFileSize(xid, pTcon, attrs->ia_size, nfid,
 					npid, false);
-		atomic_dec(&open_file->wrtPending);
+		cifsFileInfo_put(open_file);
 		cFYI(1, ("SetFSize for attrs rc = %d", rc));
 		if ((rc == -EINVAL) || (rc == -EOPNOTSUPP)) {
 			unsigned int bytes_written;
@@ -1790,7 +1790,7 @@  cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs)
 		u16 nfid = open_file->netfid;
 		u32 npid = open_file->pid;
 		rc = CIFSSMBUnixSetFileInfo(xid, pTcon, args, nfid, npid);
-		atomic_dec(&open_file->wrtPending);
+		cifsFileInfo_put(open_file);
 	} else {
 		rc = CIFSSMBUnixSetPathInfo(xid, pTcon, full_path, args,
 				    cifs_sb->local_nls,