diff mbox series

[v5,1/5] linux: Use getdents64 on non-LFS readdir

Message ID 20230127172834.391311-2-adhemerval.zanella@linaro.org
State New
Headers show
Series Fix opendir regression on some FS | expand

Commit Message

Adhemerval Zanella Netto Jan. 27, 2023, 5:28 p.m. UTC
The non-LFS opendir reserves a translation entry to be used to return
the entry and the dirent64 struct is translated to the temporary buffer
on each readdir call.

Entries that overflow d_off/d_ino and the buffer reallocation failure
(in case of large d_name) are ignored.

Checked on x86_64-linux-gnu and i686-linux-gnu.
---
 dirent/tst-scandir.c                |  6 ++-
 include/dirent.h                    |  2 +-
 sysdeps/unix/sysv/linux/dirstream.h |  5 ++
 sysdeps/unix/sysv/linux/readdir.c   | 83 +++++++++++++++++++----------
 4 files changed, 67 insertions(+), 29 deletions(-)

Comments

Andreas Schwab Jan. 27, 2023, 5:38 p.m. UTC | #1
On Jan 27 2023, Adhemerval Zanella via Libc-alpha wrote:

> +      /* Non-lfs opendir skips entries that can not be represented (for
> +	 instance if d_off is not an offset but rather an internal filesystem
> +	 representation.  For this case there is no point in continue the
> +	 testcase.  */

Missing close paren.  Either "in continuing" or "to continue".
Florian Weimer Feb. 20, 2023, 1:31 p.m. UTC | #2
* Adhemerval Zanella:

> diff --git a/sysdeps/unix/sysv/linux/dirstream.h b/sysdeps/unix/sysv/linux/dirstream.h
> index 3cb313b410..adcf8234f1 100644
> --- a/sysdeps/unix/sysv/linux/dirstream.h
> +++ b/sysdeps/unix/sysv/linux/dirstream.h
> @@ -18,6 +18,7 @@
>  #ifndef	_DIRSTREAM_H
>  #define	_DIRSTREAM_H	1
>  
> +#include <dirent.h>
>  #include <sys/types.h>
>  
>  #include <libc-lock.h>
> @@ -41,6 +42,10 @@ struct __dirstream
>  
>      int errcode;		/* Delayed error code.  */
>  
> +#if !defined __OFF_T_MATCHES_OFF64_T || !defined __INO_T_MATCHES_INO64_T
> +    struct dirent tdp;
> +#endif

I don't quite see how this can work given that d_name in the struct may
not be large enough.  The new member needs a comment to explain its
purpose.

> diff --git a/sysdeps/unix/sysv/linux/readdir.c b/sysdeps/unix/sysv/linux/readdir.c
> index 4a4c00ea07..cd0ccaf33a 100644
> --- a/sysdeps/unix/sysv/linux/readdir.c
> +++ b/sysdeps/unix/sysv/linux/readdir.c
> @@ -21,42 +21,71 @@
>  #if !_DIRENT_MATCHES_DIRENT64
>  #include <dirstream.h>
>  
> +/* Translate the DP64 entry to the non-LFS one in the translation entry
> +   at dirstream DS.  Return true is the translation was possible or
> +   false if either an internal field can not be represented in the non-LFS
> +   entry or if the name is too long.  */
> +static bool
> +dirstream_entry (struct __dirstream *ds, const struct dirent64 *dp64)
> +{
> +  /* Check for overflow.  */
> +  if (!in_off_t_range (dp64->d_off) || !in_ino_t_range (dp64->d_ino))
> +    return false;
> +
> +  /* And if name is too large.  */
> +  if (dp64->d_reclen - offsetof (struct dirent64, d_name) > NAME_MAX)
> +    return false;

Sorry, I don't think this is the direction we should be going.  In
readdir_r, we at least delay the NAME_MAX error to the end of the
directory.  This just adds another rare case where 32-bit code fails and
64-bit code works.

struct dirent is always shorter than struct dirent64, right?  It should
be possible to do the translation in-place.  Or turn tdp into a pointer
and reallocate as needed.

However, I think we should fix only readdir64, not readdir.  It's simply
not possible to fix readdir fully because of d_ino, so applications
which use readdir instead of readdir64 will remain buggy even with this
change.

What do you think?

Thanks,
Florian
Florian Weimer Feb. 20, 2023, 5:55 p.m. UTC | #3
* Florian Weimer:

>> diff --git a/sysdeps/unix/sysv/linux/readdir.c b/sysdeps/unix/sysv/linux/readdir.c
>> index 4a4c00ea07..cd0ccaf33a 100644
>> --- a/sysdeps/unix/sysv/linux/readdir.c
>> +++ b/sysdeps/unix/sysv/linux/readdir.c
>> @@ -21,42 +21,71 @@
>>  #if !_DIRENT_MATCHES_DIRENT64
>>  #include <dirstream.h>
>>  
>> +/* Translate the DP64 entry to the non-LFS one in the translation entry
>> +   at dirstream DS.  Return true is the translation was possible or
>> +   false if either an internal field can not be represented in the non-LFS
>> +   entry or if the name is too long.  */
>> +static bool
>> +dirstream_entry (struct __dirstream *ds, const struct dirent64 *dp64)
>> +{
>> +  /* Check for overflow.  */
>> +  if (!in_off_t_range (dp64->d_off) || !in_ino_t_range (dp64->d_ino))
>> +    return false;
>> +
>> +  /* And if name is too large.  */
>> +  if (dp64->d_reclen - offsetof (struct dirent64, d_name) > NAME_MAX)
>> +    return false;
>
> Sorry, I don't think this is the direction we should be going.  In
> readdir_r, we at least delay the NAME_MAX error to the end of the
> directory.  This just adds another rare case where 32-bit code fails and
> 64-bit code works.
>
> struct dirent is always shorter than struct dirent64, right?  It should
> be possible to do the translation in-place.  Or turn tdp into a pointer
> and reallocate as needed.
>
> However, I think we should fix only readdir64, not readdir.  It's simply
> not possible to fix readdir fully because of d_ino, so applications
> which use readdir instead of readdir64 will remain buggy even with this
> change.

Meh, during a walk it occurred tome that 64-bit d_ino is far less common
than 64-bit d_off.  So we really need all this rewriting. 8-(

Could you do it in-place at least?  To address the d_name sizing issue?

Thanks,
Florian
Adhemerval Zanella Netto March 1, 2023, 4:34 p.m. UTC | #4
On 20/02/23 14:55, Florian Weimer wrote:
> * Florian Weimer:
> 
>>> diff --git a/sysdeps/unix/sysv/linux/readdir.c b/sysdeps/unix/sysv/linux/readdir.c
>>> index 4a4c00ea07..cd0ccaf33a 100644
>>> --- a/sysdeps/unix/sysv/linux/readdir.c
>>> +++ b/sysdeps/unix/sysv/linux/readdir.c
>>> @@ -21,42 +21,71 @@
>>>  #if !_DIRENT_MATCHES_DIRENT64
>>>  #include <dirstream.h>
>>>  
>>> +/* Translate the DP64 entry to the non-LFS one in the translation entry
>>> +   at dirstream DS.  Return true is the translation was possible or
>>> +   false if either an internal field can not be represented in the non-LFS
>>> +   entry or if the name is too long.  */
>>> +static bool
>>> +dirstream_entry (struct __dirstream *ds, const struct dirent64 *dp64)
>>> +{
>>> +  /* Check for overflow.  */
>>> +  if (!in_off_t_range (dp64->d_off) || !in_ino_t_range (dp64->d_ino))
>>> +    return false;
>>> +
>>> +  /* And if name is too large.  */
>>> +  if (dp64->d_reclen - offsetof (struct dirent64, d_name) > NAME_MAX)
>>> +    return false;
>>
>> Sorry, I don't think this is the direction we should be going.  In
>> readdir_r, we at least delay the NAME_MAX error to the end of the
>> directory.  This just adds another rare case where 32-bit code fails and
>> 64-bit code works.
>>
>> struct dirent is always shorter than struct dirent64, right?  It should
>> be possible to do the translation in-place.  Or turn tdp into a pointer
>> and reallocate as needed.
>>
>> However, I think we should fix only readdir64, not readdir.  It's simply
>> not possible to fix readdir fully because of d_ino, so applications
>> which use readdir instead of readdir64 will remain buggy even with this
>> change.
> 
> Meh, during a walk it occurred tome that 64-bit d_ino is far less common
> than 64-bit d_off.  So we really need all this rewriting. 8-(
> 
> Could you do it in-place at least?  To address the d_name sizing issue?

I added the translation buffer only to simplify the copy logic, but it should
be doable to do it in place (similar to what getdents64 for mips64 does).
diff mbox series

Patch

diff --git a/dirent/tst-scandir.c b/dirent/tst-scandir.c
index 8d87d4dd74..7bc666449e 100644
--- a/dirent/tst-scandir.c
+++ b/dirent/tst-scandir.c
@@ -155,8 +155,12 @@  do_test (void)
     }
   if (n != 6)
     {
+      /* Non-lfs opendir skips entries that can not be represented (for
+	 instance if d_off is not an offset but rather an internal filesystem
+	 representation.  For this case there is no point in continue the
+	 testcase.  */
       printf ("scandir returned %d entries instead of 6\n", n);
-      return 1;
+      return EXIT_UNSUPPORTED;
     }
 
   struct
diff --git a/include/dirent.h b/include/dirent.h
index d7567f5e86..17827176ba 100644
--- a/include/dirent.h
+++ b/include/dirent.h
@@ -1,8 +1,8 @@ 
 #ifndef _DIRENT_H
+# include <dirent/dirent.h>
 # ifndef _ISOMAC
 #  include <dirstream.h>
 # endif
-# include <dirent/dirent.h>
 # ifndef _ISOMAC
 # include <sys/stat.h>
 # include <stdbool.h>
diff --git a/sysdeps/unix/sysv/linux/dirstream.h b/sysdeps/unix/sysv/linux/dirstream.h
index 3cb313b410..adcf8234f1 100644
--- a/sysdeps/unix/sysv/linux/dirstream.h
+++ b/sysdeps/unix/sysv/linux/dirstream.h
@@ -18,6 +18,7 @@ 
 #ifndef	_DIRSTREAM_H
 #define	_DIRSTREAM_H	1
 
+#include <dirent.h>
 #include <sys/types.h>
 
 #include <libc-lock.h>
@@ -41,6 +42,10 @@  struct __dirstream
 
     int errcode;		/* Delayed error code.  */
 
+#if !defined __OFF_T_MATCHES_OFF64_T || !defined __INO_T_MATCHES_INO64_T
+    struct dirent tdp;
+#endif
+
     /* Directory block.  We must make sure that this block starts
        at an address that is aligned adequately enough to store
        dirent entries.  Using the alignment of "void *" is not
diff --git a/sysdeps/unix/sysv/linux/readdir.c b/sysdeps/unix/sysv/linux/readdir.c
index 4a4c00ea07..cd0ccaf33a 100644
--- a/sysdeps/unix/sysv/linux/readdir.c
+++ b/sysdeps/unix/sysv/linux/readdir.c
@@ -21,42 +21,71 @@ 
 #if !_DIRENT_MATCHES_DIRENT64
 #include <dirstream.h>
 
+/* Translate the DP64 entry to the non-LFS one in the translation entry
+   at dirstream DS.  Return true is the translation was possible or
+   false if either an internal field can not be represented in the non-LFS
+   entry or if the name is too long.  */
+static bool
+dirstream_entry (struct __dirstream *ds, const struct dirent64 *dp64)
+{
+  /* Check for overflow.  */
+  if (!in_off_t_range (dp64->d_off) || !in_ino_t_range (dp64->d_ino))
+    return false;
+
+  /* And if name is too large.  */
+  if (dp64->d_reclen - offsetof (struct dirent64, d_name) > NAME_MAX)
+    return false;
+
+  ds->filepos = dp64->d_off;
+
+  ds->tdp.d_off = dp64->d_off;
+  ds->tdp.d_ino = dp64->d_ino;
+  ds->tdp.d_reclen = sizeof (struct dirent)
+    + dp64->d_reclen - offsetof (struct dirent64, d_name);
+  ds->tdp.d_type = dp64->d_type;
+  memcpy (ds->tdp.d_name, dp64->d_name,
+	  dp64->d_reclen - offsetof (struct dirent64, d_name));
+
+  return true;
+}
+
 /* Read a directory entry from DIRP.  */
 struct dirent *
 __readdir_unlocked (DIR *dirp)
 {
-  struct dirent *dp;
   int saved_errno = errno;
 
-  if (dirp->offset >= dirp->size)
+  while (1)
     {
-      /* We've emptied out our buffer.  Refill it.  */
-
-      size_t maxread = dirp->allocation;
-      ssize_t bytes;
-
-      bytes = __getdents (dirp->fd, dirp->data, maxread);
-      if (bytes <= 0)
+      if (dirp->offset >= dirp->size)
 	{
-	  /* Linux may fail with ENOENT on some file systems if the
-	     directory inode is marked as dead (deleted).  POSIX
-	     treats this as a regular end-of-directory condition, so
-	     do not set errno in that case, to indicate success.  */
-	  if (bytes == 0 || errno == ENOENT)
-	    __set_errno (saved_errno);
-	  return NULL;
-	}
-      dirp->size = (size_t) bytes;
-
-      /* Reset the offset into the buffer.  */
-      dirp->offset = 0;
+	  /* We've emptied out our buffer.  Refill it.  */
+	  ssize_t bytes = __getdents64 (dirp->fd, dirp->data,
+					dirp->allocation);
+	  if (bytes <= 0)
+	    {
+	      /* Linux may fail with ENOENT on some file systems if the
+		 directory inode is marked as dead (deleted).  POSIX
+		 treats this as a regular end-of-directory condition, so
+		 do not set errno in that case, to indicate success.  */
+	      if (bytes < 0 && errno == ENOENT)
+		__set_errno (saved_errno);
+	      return NULL;
+	    }
+	  dirp->size = bytes;
+
+ 	  /* Reset the offset into the buffer.  */
+	  dirp->offset = 0;
+ 	}
+
+      struct dirent64 *dp64 = (struct dirent64 *) &dirp->data[dirp->offset];
+      dirp->offset += dp64->d_reclen;
+
+      /* Skip entries which might overflow d_off/d_ino or if the translation
+	 buffer can not be resized.  */
+      if (dirstream_entry (dirp, dp64))
+	return &dirp->tdp;
     }
-
-  dp = (struct dirent *) &dirp->data[dirp->offset];
-  dirp->offset += dp->d_reclen;
-  dirp->filepos = dp->d_off;
-
-  return dp;
 }
 
 struct dirent *