diff mbox series

posix/glob.c: update from gnulib

Message ID xn4k3fw1f0.fsf@greed.delorie.com
State New
Headers show
Series posix/glob.c: update from gnulib | expand

Commit Message

DJ Delorie March 30, 2022, 9:55 p.m. UTC
Copied from gnulib/lib/glob.c in order to fix rhbz 1982608

Used config.h instead of libc-config.h

The #ifdef around #define dirfd() was changed to #undef due to
conflicts between glibc's internal and external definitions of
dirfd().  This has been reported to gnulib.

Comments

Paul Eggert March 30, 2022, 11:40 p.m. UTC | #1
On 3/30/22 14:55, DJ Delorie via Libc-alpha wrote:

> Copied from gnulib/lib/glob.c in order to fix rhbz 1982608
> 
> Used config.h instead of libc-config.h

I don't see why this change is needed. This code is inside "#ifndef 
_LIBC" so this change should have no effect for glibc. And the change is 
harmful for Gnulib, since for this file Gnulib relies on including 
libc-config.h instead of plain config.h.

> The #ifdef around #define dirfd() was changed to #undef due to
> conflicts between glibc's internal and external definitions of
> dirfd().  This has been reported to gnulib.

I updated Gnulib to reflect this change; see first attached patch. That 
being said, I don't fully understand it. Wouldn't it be more efficient 
for glibc glob to use glibc's internal dirfd by whatever name you prefer?

Anyway, the only difference between what you proposed for glibc and 
current Gnulib glob is the second attached patch; could you please merge 
that into your proposal? That way, the two glob.c files can be 
identical, which is a good thing.

Thanks for following up on this.
Paul Eggert March 30, 2022, 11:45 p.m. UTC | #2
On 3/30/22 16:40, Paul Eggert wrote:
> 
> I updated Gnulib to reflect this change; see first attached patch.

Oops, forgot to attach that patch. Here it is. Also cc'ing to bug-gnulib.
DJ Delorie March 30, 2022, 11:54 p.m. UTC | #3
Paul Eggert <eggert@cs.ucla.edu> writes:
>> Used config.h instead of libc-config.h
>
> I don't see why this change is needed. This code is inside "#ifndef 
> _LIBC" so this change should have no effect for glibc. And the change is 
> harmful for Gnulib, since for this file Gnulib relies on including 
> libc-config.h instead of plain config.h.

I was just leaving that line as it was in glibc, but as you note, it
shouldn't make a difference.  glibc does have a config.h though, and no
libc-config.h, which is what made me think this was significant.

>> The #ifdef around #define dirfd() was changed to #undef due to
>> conflicts between glibc's internal and external definitions of
>> dirfd().  This has been reported to gnulib.
>
> I updated Gnulib to reflect this change; see first attached patch. That 
> being said, I don't fully understand it. Wouldn't it be more efficient 
> for glibc glob to use glibc's internal dirfd by whatever name you prefer?

Perhaps, but doing so involved more than just using a macro, because a
lot of what the internal macro does depends on having insight into the
internals of other modules (like typedef names, for example), which
involves other includes (for those typedefs) and such.  It was far
easier to just leave it alone than to go down that (perhaps shallow ;)
rabbit hole.  I.e. it was the minimum fix.

Since it's just a dereference I don't think it will affect performance.

> Anyway, the only difference between what you proposed for glibc and 
> current Gnulib glob is the second attached patch; could you please merge 
> that into your proposal? That way, the two glob.c files can be 
> identical, which is a good thing.

Done.
Adhemerval Zanella Netto March 31, 2022, 10:56 a.m. UTC | #4
On 30/03/2022 18:55, DJ Delorie via Libc-alpha wrote:
> 
> Copied from gnulib/lib/glob.c in order to fix rhbz 1982608
> 
> Used config.h instead of libc-config.h
> 
> The #ifdef around #define dirfd() was changed to #undef due to
> conflicts between glibc's internal and external definitions of
> dirfd().  This has been reported to gnulib.


Could you create a bug report to track this? If I recall correctly from
gnulib discussion, this came from an issue with XFS. Would be possible
to craft a regression test?


> 
> diff --git a/posix/glob.c b/posix/glob.c
> index a2b5aabada..19b876f9f0 100644
> --- a/posix/glob.c
> +++ b/posix/glob.c
> @@ -28,6 +28,7 @@
>  #include <glob.h>
>  
>  #include <errno.h>
> +#include <fcntl.h>
>  #include <sys/types.h>
>  #include <sys/stat.h>
>  #include <stdbool.h>
> @@ -56,6 +57,8 @@
>  # define sysconf(id) __sysconf (id)
>  # define closedir(dir) __closedir (dir)
>  # define opendir(name) __opendir (name)
> +# undef dirfd
> +# define dirfd(str) __dirfd (str)
>  # define readdir(str) __readdir64 (str)
>  # define getpwnam_r(name, bufp, buf, len, res) \
>      __getpwnam_r (name, bufp, buf, len, res)
> @@ -69,11 +72,8 @@
>  # ifndef GLOB_LSTAT
>  #  define GLOB_LSTAT            gl_lstat
>  # endif
> -# ifndef GLOB_STAT64
> -#  define GLOB_STAT64           __stat64
> -# endif
> -# ifndef GLOB_LSTAT64
> -#  define GLOB_LSTAT64          __lstat64
> +# ifndef GLOB_FSTATAT64
> +#  define GLOB_FSTATAT64        __fstatat64
>  # endif
>  # include <shlib-compat.h>
>  #else /* !_LIBC */
> @@ -88,8 +88,7 @@
>  # define struct_stat            struct stat
>  # define struct_stat64          struct stat
>  # define GLOB_LSTAT             gl_lstat
> -# define GLOB_STAT64            stat
> -# define GLOB_LSTAT64           lstat
> +# define GLOB_FSTATAT64         fstatat
>  #endif /* _LIBC */
>  
>  #include <fnmatch.h>
> @@ -215,7 +214,8 @@ glob_lstat (glob_t *pglob, int flags, const char *fullname)
>    } ust;
>    return (__glibc_unlikely (flags & GLOB_ALTDIRFUNC)
>            ? pglob->GLOB_LSTAT (fullname, &ust.st)
> -          : GLOB_LSTAT64 (fullname, &ust.st64));
> +          : GLOB_FSTATAT64 (AT_FDCWD, fullname, &ust.st64,
> +                            AT_SYMLINK_NOFOLLOW));
>  }
>  
>  /* Set *R = A + B.  Return true if the answer is mathematically
> @@ -257,7 +257,8 @@ is_dir (char const *filename, int flags, glob_t const *pglob)
>    struct_stat64 st64;
>    return (__glibc_unlikely (flags & GLOB_ALTDIRFUNC)
>            ? pglob->gl_stat (filename, &st) == 0 && S_ISDIR (st.st_mode)
> -          : GLOB_STAT64 (filename, &st64) == 0 && S_ISDIR (st64.st_mode));
> +          : (GLOB_FSTATAT64 (AT_FDCWD, filename, &st64, 0) == 0
> +             && S_ISDIR (st64.st_mode)));
>  }
>  
>  /* Find the end of the sub-pattern in a brace expression.  */
> @@ -747,6 +748,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>        else
>          {
>  #ifndef WINDOWS32
> +          /* Recognize ~user as a shorthand for the specified user's home
> +             directory.  */
>            char *end_name = strchr (dirname, '/');
>            char *user_name;
>            int malloc_user_name = 0;
> @@ -885,7 +888,22 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
>                }
>              scratch_buffer_free (&pwtmpbuf);
>            }
> -#endif /* !WINDOWS32 */
> +#else /* WINDOWS32 */
> +          /* On native Windows, access to a user's home directory
> +             (via GetUserProfileDirectory) or to a user's environment
> +             variables (via ExpandEnvironmentStringsForUser) requires
> +             the credentials of the user.  Therefore we cannot support
> +             the ~user syntax on this platform.
> +             Handling ~user specially (and treat it like plain ~) if
> +             user is getenv ("USERNAME") would not be a good idea,
> +             since it would make people think that ~user is supported
> +             in general.  */
> +          if (flags & GLOB_TILDE_CHECK)
> +            {
> +              retval = GLOB_NOMATCH;
> +              goto out;
> +            }
> +#endif /* WINDOWS32 */
>          }
>      }
>  
> @@ -1266,6 +1284,8 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
>  {
>    size_t dirlen = strlen (directory);
>    void *stream = NULL;
> +  struct scratch_buffer s;
> +  scratch_buffer_init (&s);
>  # define GLOBNAMES_MEMBERS(nnames) \
>      struct globnames *next; size_t count; char *name[nnames];
>    struct globnames { GLOBNAMES_MEMBERS (FLEXIBLE_ARRAY_MEMBER) };
> @@ -1337,6 +1357,7 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
>          }
>        else
>          {
> +          int dfd = dirfd (stream);
>            int fnm_flags = ((!(flags & GLOB_PERIOD) ? FNM_PERIOD : 0)
>                             | ((flags & GLOB_NOESCAPE) ? FNM_NOESCAPE : 0));
>            flags |= GLOB_MAGCHAR;
> @@ -1364,8 +1385,32 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
>                if (flags & GLOB_ONLYDIR)
>                  switch (readdir_result_type (d))
>                    {
> -                  case DT_DIR: case DT_LNK: case DT_UNKNOWN: break;
>                    default: continue;
> +                  case DT_DIR: break;
> +                  case DT_LNK: case DT_UNKNOWN:
> +                    /* The filesystem was too lazy to give us a hint,
> +                       so we have to do it the hard way.  */
> +                    if (__glibc_unlikely (dfd < 0 || flags & GLOB_ALTDIRFUNC))
> +                      {
> +                        size_t namelen = strlen (d.name);
> +                        size_t need = dirlen + 1 + namelen + 1;
> +                        if (s.length < need
> +                            && !scratch_buffer_set_array_size (&s, need, 1))
> +                          goto memory_error;
> +                        char *p = mempcpy (s.data, directory, dirlen);
> +                        *p = '/';
> +                        p += p[-1] != '/';
> +                        memcpy (p, d.name, namelen + 1);
> +                        if (! is_dir (s.data, flags, pglob))
> +                          continue;
> +                      }
> +                    else
> +                      {
> +                        struct_stat64 st64;
> +                        if (! (GLOB_FSTATAT64 (dfd, d.name, &st64, 0) == 0
> +                               && S_ISDIR (st64.st_mode)))
> +                          continue;
> +                      }
>                    }
>  
>                if (fnmatch (pattern, d.name, fnm_flags) == 0)
> @@ -1497,5 +1542,6 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
>        __set_errno (save);
>      }
>  
> +  scratch_buffer_free (&s);
>    return result;
>  }
>
DJ Delorie March 31, 2022, 5:38 p.m. UTC | #5
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
> Could you create a bug report to track this?

It would be a copy of rhbz 1982608 and it originated in make, I don't
see any advantage of copying it just to sync with gnulib.


> If I recall correctly from gnulib discussion, this came from an issue
> with XFS. Would be possible to craft a regression test?

Not really.  It requires either an old XFS, or an XFS built with custom
command line options to emulate the old XFS.
Adhemerval Zanella Netto March 31, 2022, 5:44 p.m. UTC | #6
On 31/03/2022 14:38, DJ Delorie wrote:
> Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
>> Could you create a bug report to track this?
> 
> It would be a copy of rhbz 1982608 and it originated in make, I don't
> see any advantage of copying it just to sync with gnulib.
> 

Yes, but we usually reference sourceware bug list for fixed bugs.  I think
mirroring the bug is better than just reference to an external database.

> 
>> If I recall correctly from gnulib discussion, this came from an issue
>> with XFS. Would be possible to craft a regression test?
> 
> Not really.  It requires either an old XFS, or an XFS built with custom
> command line options to emulate the old XFS.
>
DJ Delorie March 31, 2022, 7:29 p.m. UTC | #7
Conveniently...
https://sourceware.org/bugzilla/show_bug.cgi?id=25659
Adhemerval Zanella Netto March 31, 2022, 7:38 p.m. UTC | #8
On 31/03/2022 16:29, DJ Delorie wrote:
> 
> Conveniently...
> https://sourceware.org/bugzilla/show_bug.cgi?id=25659
> 

Thanks.
diff mbox series

Patch

diff --git a/posix/glob.c b/posix/glob.c
index a2b5aabada..19b876f9f0 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -28,6 +28,7 @@ 
 #include <glob.h>
 
 #include <errno.h>
+#include <fcntl.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <stdbool.h>
@@ -56,6 +57,8 @@ 
 # define sysconf(id) __sysconf (id)
 # define closedir(dir) __closedir (dir)
 # define opendir(name) __opendir (name)
+# undef dirfd
+# define dirfd(str) __dirfd (str)
 # define readdir(str) __readdir64 (str)
 # define getpwnam_r(name, bufp, buf, len, res) \
     __getpwnam_r (name, bufp, buf, len, res)
@@ -69,11 +72,8 @@ 
 # ifndef GLOB_LSTAT
 #  define GLOB_LSTAT            gl_lstat
 # endif
-# ifndef GLOB_STAT64
-#  define GLOB_STAT64           __stat64
-# endif
-# ifndef GLOB_LSTAT64
-#  define GLOB_LSTAT64          __lstat64
+# ifndef GLOB_FSTATAT64
+#  define GLOB_FSTATAT64        __fstatat64
 # endif
 # include <shlib-compat.h>
 #else /* !_LIBC */
@@ -88,8 +88,7 @@ 
 # define struct_stat            struct stat
 # define struct_stat64          struct stat
 # define GLOB_LSTAT             gl_lstat
-# define GLOB_STAT64            stat
-# define GLOB_LSTAT64           lstat
+# define GLOB_FSTATAT64         fstatat
 #endif /* _LIBC */
 
 #include <fnmatch.h>
@@ -215,7 +214,8 @@  glob_lstat (glob_t *pglob, int flags, const char *fullname)
   } ust;
   return (__glibc_unlikely (flags & GLOB_ALTDIRFUNC)
           ? pglob->GLOB_LSTAT (fullname, &ust.st)
-          : GLOB_LSTAT64 (fullname, &ust.st64));
+          : GLOB_FSTATAT64 (AT_FDCWD, fullname, &ust.st64,
+                            AT_SYMLINK_NOFOLLOW));
 }
 
 /* Set *R = A + B.  Return true if the answer is mathematically
@@ -257,7 +257,8 @@  is_dir (char const *filename, int flags, glob_t const *pglob)
   struct_stat64 st64;
   return (__glibc_unlikely (flags & GLOB_ALTDIRFUNC)
           ? pglob->gl_stat (filename, &st) == 0 && S_ISDIR (st.st_mode)
-          : GLOB_STAT64 (filename, &st64) == 0 && S_ISDIR (st64.st_mode));
+          : (GLOB_FSTATAT64 (AT_FDCWD, filename, &st64, 0) == 0
+             && S_ISDIR (st64.st_mode)));
 }
 
 /* Find the end of the sub-pattern in a brace expression.  */
@@ -747,6 +748,8 @@  __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
       else
         {
 #ifndef WINDOWS32
+          /* Recognize ~user as a shorthand for the specified user's home
+             directory.  */
           char *end_name = strchr (dirname, '/');
           char *user_name;
           int malloc_user_name = 0;
@@ -885,7 +888,22 @@  __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               }
             scratch_buffer_free (&pwtmpbuf);
           }
-#endif /* !WINDOWS32 */
+#else /* WINDOWS32 */
+          /* On native Windows, access to a user's home directory
+             (via GetUserProfileDirectory) or to a user's environment
+             variables (via ExpandEnvironmentStringsForUser) requires
+             the credentials of the user.  Therefore we cannot support
+             the ~user syntax on this platform.
+             Handling ~user specially (and treat it like plain ~) if
+             user is getenv ("USERNAME") would not be a good idea,
+             since it would make people think that ~user is supported
+             in general.  */
+          if (flags & GLOB_TILDE_CHECK)
+            {
+              retval = GLOB_NOMATCH;
+              goto out;
+            }
+#endif /* WINDOWS32 */
         }
     }
 
@@ -1266,6 +1284,8 @@  glob_in_dir (const char *pattern, const char *directory, int flags,
 {
   size_t dirlen = strlen (directory);
   void *stream = NULL;
+  struct scratch_buffer s;
+  scratch_buffer_init (&s);
 # define GLOBNAMES_MEMBERS(nnames) \
     struct globnames *next; size_t count; char *name[nnames];
   struct globnames { GLOBNAMES_MEMBERS (FLEXIBLE_ARRAY_MEMBER) };
@@ -1337,6 +1357,7 @@  glob_in_dir (const char *pattern, const char *directory, int flags,
         }
       else
         {
+          int dfd = dirfd (stream);
           int fnm_flags = ((!(flags & GLOB_PERIOD) ? FNM_PERIOD : 0)
                            | ((flags & GLOB_NOESCAPE) ? FNM_NOESCAPE : 0));
           flags |= GLOB_MAGCHAR;
@@ -1364,8 +1385,32 @@  glob_in_dir (const char *pattern, const char *directory, int flags,
               if (flags & GLOB_ONLYDIR)
                 switch (readdir_result_type (d))
                   {
-                  case DT_DIR: case DT_LNK: case DT_UNKNOWN: break;
                   default: continue;
+                  case DT_DIR: break;
+                  case DT_LNK: case DT_UNKNOWN:
+                    /* The filesystem was too lazy to give us a hint,
+                       so we have to do it the hard way.  */
+                    if (__glibc_unlikely (dfd < 0 || flags & GLOB_ALTDIRFUNC))
+                      {
+                        size_t namelen = strlen (d.name);
+                        size_t need = dirlen + 1 + namelen + 1;
+                        if (s.length < need
+                            && !scratch_buffer_set_array_size (&s, need, 1))
+                          goto memory_error;
+                        char *p = mempcpy (s.data, directory, dirlen);
+                        *p = '/';
+                        p += p[-1] != '/';
+                        memcpy (p, d.name, namelen + 1);
+                        if (! is_dir (s.data, flags, pglob))
+                          continue;
+                      }
+                    else
+                      {
+                        struct_stat64 st64;
+                        if (! (GLOB_FSTATAT64 (dfd, d.name, &st64, 0) == 0
+                               && S_ISDIR (st64.st_mode)))
+                          continue;
+                      }
                   }
 
               if (fnmatch (pattern, d.name, fnm_flags) == 0)
@@ -1497,5 +1542,6 @@  glob_in_dir (const char *pattern, const char *directory, int flags,
       __set_errno (save);
     }
 
+  scratch_buffer_free (&s);
   return result;
 }