diff mbox

PR 13631 Problems in messages

Message ID 547CE0AA.50209@gmail.com
State New
Headers show

Commit Message

François Dumont Dec. 1, 2014, 9:42 p.m. UTC
Hi

     Here is another proposal that consider all your remarks except one. 
I finally prefer to go with std::vector of pointers. Dynamically 
allocating Catalog_info allow to avoid numerous copies of locale when we 
find this pointer from the catalog info.


On 28/11/2014 11:49, Jonathan Wakely wrote:
> + class Catalogs
>> +  {
>> +  public:
>> +    typedef messages_base::catalog catalog_id;
>> +
>> +    struct catalog_info
>> +    {
>> +      catalog_info()
>> +      { }
>> +
>> +      catalog_info(catalog_id __id, const char* __domain, locale __loc)
>> +    : _M_id(__id), _M_domain(__domain), _M_locale(__loc)
>> +      { }
>> +
>
> I don't really like that this type doesn't own _M_domain and it has to
> be freed by Catalog, but without a move constructor (which requires
> C++11) it would be awkward to set _M_domain=0 after copy constructing
> a new catalog_info.
>
> So it's OK like this.

I don't liked it neither so now it is a usual std::string.

> + result_type
>> +    _M_get(catalog_id __c) const
>> +    {
>> +      __gnu_cxx::__scoped_lock lock(_M_mutex);
>> +
>> +      const catalog_info* __entry =
>> +    lower_bound(_M_map, _M_map + _M_nb_entry, __c, _Comp());
>> +      if (__entry != _M_map + _M_nb_entry && __entry->_M_id == __c)
>> +    return result_type(__entry->_M_domain, __entry->_M_locale);
>> +      return result_type(0, locale());
>
> I assume the second return was just for testing, it should be removed.

I don't get this one. I still need to return something when I can't find 
the catalog so the 2 return statements.

> Index: config/locale/gnu/messages_members.h
>> ===================================================================
>> --- config/locale/gnu/messages_members.h    (revision 218027)
>> +++ config/locale/gnu/messages_members.h    (working copy)
>> @@ -83,22 +83,6 @@
>>       _S_destroy_c_locale(_M_c_locale_messages);     }
>>
>> -  template<typename _CharT>
>> -    typename messages<_CharT>::catalog - 
>> messages<_CharT>::do_open(const basic_string<char>& __s, 
>> -                  const locale&) const
>> -    { -      // No error checking is done, assume the catalog exists 
>> and can
>> -      // be used.
>> -      textdomain(__s.c_str());
>> -      return 0;
>> -    }
>> -
>> -  template<typename _CharT>
>> -    void    -    messages<_CharT>::do_close(catalog) const -    { }
>> -
>>    // messages_byname
>>    template<typename _CharT>
>>      messages_byname<_CharT>::messages_byname(const char* __s, size_t 
>> __refs)
>
> Unless I'm misreading this patch, you've removed the definitions of
> messages<_CharT>::do_open() and messages<_CharT>::do_close() for the
> primary template. They would stil be needed if users instantiate e.g.
> messages<char16_t> or messages<signed char>.

Yes but do you confirm that it is already the same problem with do_get ?

In my opinion we could provide template implementations of all those 
methods relying on codecvt<_CharT, char, mbstate_t>, even for do_get. 
But in this case some implementation details will be exposed in the 
header files and additional symbols will have to be exported I think.

In fact I have already started doing something like that but then start 
facing issue accessing nl_langinfo_l. Shall I go further and provide a 
patch doing this ?

François

Comments

Jonathan Wakely Dec. 2, 2014, 10:55 a.m. UTC | #1
On 01/12/14 22:42 +0100, François Dumont wrote:
>Hi
>
>    Here is another proposal that consider all your remarks except 
>one. I finally prefer to go with std::vector of pointers. Dynamically 
>allocating Catalog_info allow to avoid numerous copies of locale when 
>we find this pointer from the catalog info.
>
>
>>+ result_type
>>>+    _M_get(catalog_id __c) const
>>>+    {
>>>+      __gnu_cxx::__scoped_lock lock(_M_mutex);
>>>+
>>>+      const catalog_info* __entry =
>>>+    lower_bound(_M_map, _M_map + _M_nb_entry, __c, _Comp());
>>>+      if (__entry != _M_map + _M_nb_entry && __entry->_M_id == __c)
>>>+    return result_type(__entry->_M_domain, __entry->_M_locale);
>>>+      return result_type(0, locale());
>>
>>I assume the second return was just for testing, it should be removed.
>
>I don't get this one. I still need to return something when I can't 
>find the catalog so the 2 return statements.

I thought I remembered seeing two returns at the same level of
indentation, but obviously they are not! Maybe I was looking it it in
gmail's attachment viewer which eats leading whitespace. Ignore that
comment, sorry.

>>Index: config/locale/gnu/messages_members.h
>>>===================================================================
>>>--- config/locale/gnu/messages_members.h    (revision 218027)
>>>+++ config/locale/gnu/messages_members.h    (working copy)
>>>@@ -83,22 +83,6 @@
>>>      _S_destroy_c_locale(_M_c_locale_messages);     }
>>>
>>>-  template<typename _CharT>
>>>-    typename messages<_CharT>::catalog - 
>>>messages<_CharT>::do_open(const basic_string<char>& __s, -                  
>>>const locale&) const
>>>-    { -      // No error checking is done, assume the catalog 
>>>exists and can
>>>-      // be used.
>>>-      textdomain(__s.c_str());
>>>-      return 0;
>>>-    }
>>>-
>>>-  template<typename _CharT>
>>>-    void    -    messages<_CharT>::do_close(catalog) const -    { }
>>>-
>>>   // messages_byname
>>>   template<typename _CharT>
>>>     messages_byname<_CharT>::messages_byname(const char* __s, 
>>>size_t __refs)
>>
>>Unless I'm misreading this patch, you've removed the definitions of
>>messages<_CharT>::do_open() and messages<_CharT>::do_close() for the
>>primary template. They would stil be needed if users instantiate e.g.
>>messages<char16_t> or messages<signed char>.
>
>Yes but do you confirm that it is already the same problem with do_get ?

Yes, I'm just wondering why make the situation worse than it already
is. Does leaving the default definitions cause a problem?

If some user has got code that uses messages<signed char> and they
provide a definition for messages<signed char>::do_get() their code
will break if do_open and do_close disappear. (Realistically I doubt
anyone is doing that though, it may not even work.)

>In my opinion we could provide template implementations of all those 
>methods relying on codecvt<_CharT, char, mbstate_t>, even for do_get. 
>But in this case some implementation details will be exposed in the 
>header files and additional symbols will have to be exported I think.
>
>In fact I have already started doing something like that but then 
>start facing issue accessing nl_langinfo_l. Shall I go further and 
>provide a patch doing this ?

IMHO no.

>+    catalog
>+    _M_add(const string& __domain, locale __l)
>+    {
>+      __gnu_cxx::__scoped_lock lock(_M_mutex);
>+
>+      // The counter is not likely to roll unless catalogs keep on being
>+      // opened/closed which is consider as an application mistake for the
>+      // moment.
>+      if (_M_catalog_counter == numeric_limits<catalog>::max())
>+	return -1;
>+
>+      _M_infos.push_back(new Catalog_info(_M_catalog_counter++, __domain, __l));
>+      return _M_infos.back()->_M_id;

This is not exception safe. You can use auto_ptr to fix it:

      std::auto_ptr<Catalog_info> info(new Catalog_info(_M_catalog_counter++,
                                                        __domain, __l));
      _M_infos.push_back(info.get());
      return info.release()->_M_id;
diff mbox

Patch

Index: config/locale/gnu/messages_members.cc
===================================================================
--- config/locale/gnu/messages_members.cc	(revision 218247)
+++ config/locale/gnu/messages_members.cc	(working copy)
@@ -31,20 +31,120 @@ 
 #include <locale>
 #include <bits/c++locale_internal.h>
 
-namespace std _GLIBCXX_VISIBILITY(default)
+#include <limits>
+#include <algorithm>
+#include <vector>
+
+#include <ext/concurrence.h>
+
+namespace
 {
-_GLIBCXX_BEGIN_NAMESPACE_VERSION
+  using namespace std;
 
-  // Specializations.
-  template<>
-    string
-    messages<char>::do_get(catalog, int, int, const string& __dfault) const
+  typedef messages_base::catalog catalog;
+
+  struct Catalog_info
     {
+    Catalog_info(catalog __id, const string& __domain, locale __loc)
+      : _M_id(__id), _M_domain(__domain), _M_locale(__loc)
+    { }
+
+    catalog _M_id;
+    string _M_domain;
+    locale _M_locale;
+  };
+
+  class Catalogs
+  {
+  public:
+    Catalogs() : _M_catalog_counter(0) { }
+
+    ~Catalogs()
+    {
+      for (vector<Catalog_info*>::iterator __it = _M_infos.begin();
+	   __it != _M_infos.end(); ++__it)
+	delete *__it;
+    }
+
+    catalog
+    _M_add(const string& __domain, locale __l)
+    {
+      __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+      // The counter is not likely to roll unless catalogs keep on being
+      // opened/closed which is consider as an application mistake for the
+      // moment.
+      if (_M_catalog_counter == numeric_limits<catalog>::max())
+	return -1;
+
+      _M_infos.push_back(new Catalog_info(_M_catalog_counter++, __domain, __l));
+      return _M_infos.back()->_M_id;
+    }
+
+    void
+    _M_erase(catalog __c)
+    {
+      __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+      vector<Catalog_info*>::iterator __res =
+	lower_bound(_M_infos.begin(), _M_infos.end(), __c, _Comp());
+      if (__res == _M_infos.end() || (*__res)->_M_id != __c)
+	return;
+
+      delete *__res;
+      _M_infos.erase(__res);
+
+      // Just in case closed catalog was the last open.
+      if (__c == _M_catalog_counter - 1)
+	--_M_catalog_counter;
+    }
+
+    const Catalog_info*
+    _M_get(catalog __c) const
+    {
+      __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+      vector<Catalog_info*>::const_iterator __res =
+	lower_bound(_M_infos.begin(), _M_infos.end(), __c, _Comp());
+
+      if (__res != _M_infos.end() && (*__res)->_M_id == __c)
+	return *__res;
+
+      return 0;
+    }
+
+  private:
+    struct _Comp
+    {
+      bool operator()(catalog __cat, const Catalog_info* __info) const
+      { return __cat < __info->_M_id; }
+
+      bool operator()(const Catalog_info* __info, catalog __cat) const
+      { return __info->_M_id < __cat; }
+    };
+
+    mutable __gnu_cxx::__mutex _M_mutex;
+    catalog _M_catalog_counter;
+    std::vector<Catalog_info*> _M_infos;
+  };
+
+  Catalogs&
+  get_catalogs()
+  {
+    static Catalogs __catalogs;
+    return __catalogs;
+  }
+
+  const char*
+  get_glibc_msg(__c_locale __attribute__((unused)) __locale_messages,
+		const char* __domainname,
+		const char* __dfault)
+  {
 #if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ > 2)
-      __c_locale __old = __uselocale(_M_c_locale_messages);
-      const char* __msg = const_cast<const char*>(gettext(__dfault.c_str()));
+    std::__c_locale __old = __uselocale(__locale_messages);
+    const char* __msg =
+      const_cast<const char*>(dgettext(__domainname, __dfault));
       __uselocale(__old);
-      return string(__msg);
 #else
       char* __old = setlocale(LC_ALL, 0);
       const size_t __len = strlen(__old) + 1;
@@ -51,35 +151,131 @@ 
       char* __sav = new char[__len];
       memcpy(__sav, __old, __len);
       setlocale(LC_ALL, _M_name_messages);
-      const char* __msg = gettext(__dfault.c_str());
+    const char* __msg = dgettext(__domainname, __dfault);
       setlocale(LC_ALL, __sav);
       delete [] __sav;
-      return string(__msg);
 #endif
+
+    return __msg;
     }
+}
 
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  // Specializations.
+  template<>
+    typename messages<char>::catalog
+    messages<char>::do_open(const basic_string<char>& __s,
+			    const locale& __l) const
+  {
+    typedef codecvt<char, char, mbstate_t> __codecvt_t;
+    const __codecvt_t& __codecvt = use_facet<__codecvt_t>(__l);
+
+    bind_textdomain_codeset(__s.c_str(),
+	__nl_langinfo_l(CODESET, __codecvt._M_c_locale_codecvt));
+    return get_catalogs()._M_add(__s, __l);
+  }
+
+  template<>
+    void
+    messages<char>::do_close(catalog __c) const
+    { get_catalogs()._M_erase(__c); }
+
+  template<>
+    string
+    messages<char>::do_get(catalog __c, int, int,
+			   const string& __dfault) const
+    {
+      if (__c < 0 || __dfault.empty())
+	return __dfault;
+
+      const Catalog_info* __cat_info = get_catalogs()._M_get(__c);
+
+      if (!__cat_info)
+	return __dfault;
+
+      return get_glibc_msg(_M_c_locale_messages,
+			   __cat_info->_M_domain.c_str(),
+			   __dfault.c_str());
+    }
+
 #ifdef _GLIBCXX_USE_WCHAR_T
   template<>
+    typename messages<wchar_t>::catalog
+    messages<wchar_t>::do_open(const basic_string<char>& __s,
+			       const locale& __l) const
+  {
+    typedef codecvt<wchar_t, char, mbstate_t> __codecvt_t;
+    const __codecvt_t& __codecvt = use_facet<__codecvt_t>(__l);
+
+    bind_textdomain_codeset(__s.c_str(),
+	__nl_langinfo_l(CODESET, __codecvt._M_c_locale_codecvt));
+
+    return get_catalogs()._M_add(__s, __l);
+  }
+
+  template<>
+    void
+    messages<wchar_t>::do_close(catalog __c) const
+    { get_catalogs()._M_erase(__c); }
+
+  template<>
     wstring
-    messages<wchar_t>::do_get(catalog, int, int, const wstring& __dfault) const
+    messages<wchar_t>::do_get(catalog __c, int, int,
+			      const wstring& __wdfault) const
     {
-# if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ > 2)
-      __c_locale __old = __uselocale(_M_c_locale_messages);
-      char* __msg = gettext(_M_convert_to_char(__dfault));
-      __uselocale(__old);
-      return _M_convert_from_char(__msg);
-# else
-      char* __old = setlocale(LC_ALL, 0);
-      const size_t __len = strlen(__old) + 1;
-      char* __sav = new char[__len];
-      memcpy(__sav, __old, __len);
-      setlocale(LC_ALL, _M_name_messages);
-      char* __msg = gettext(_M_convert_to_char(__dfault));
-      setlocale(LC_ALL, __sav);
-      delete [] __sav;
-      return _M_convert_from_char(__msg);
-# endif
+      if (__c < 0 || __wdfault.empty())
+	return __wdfault;
+
+      const Catalog_info* __cat_info = get_catalogs()._M_get(__c);
+
+      if (!__cat_info)
+	return __wdfault;
+
+      typedef codecvt<wchar_t, char, mbstate_t> __codecvt_t;
+      const __codecvt_t& __conv =
+	use_facet<__codecvt_t>(__cat_info->_M_locale);
+
+      const char* __translation;
+      mbstate_t __state;
+      __builtin_memset(&__state, 0, sizeof(mbstate_t));
+      {
+	const wchar_t* __wdfault_next;
+	size_t __mb_size = __wdfault.size() * __conv.max_length();;
+	char* __dfault =
+	  static_cast<char*>(__builtin_alloca(sizeof(char) * (__mb_size + 1)));
+	char* __dfault_next;
+	__conv.out(__state,
+		   __wdfault.data(), __wdfault.data() + __wdfault.size(),
+		   __wdfault_next,
+		   __dfault, __dfault + __mb_size, __dfault_next);
+
+	// Make sure string passed to dgettext is \0 terminated.
+	*__dfault_next = '\0';
+	__translation
+	  = get_glibc_msg(_M_c_locale_messages,
+			  __cat_info->_M_domain.c_str(), __dfault);
+
+	// If we end up getting default value back we can simply return original
+	// default value.
+	if (__translation == __dfault)
+	  return __wdfault;
     }
+
+      __builtin_memset(&__state, 0, sizeof(mbstate_t));
+      size_t __size = __builtin_strlen(__translation);
+      const char* __translation_next;
+      wchar_t* __wtranslation =
+	static_cast<wchar_t*>(__builtin_alloca(sizeof(wchar_t) * (__size + 1)));
+      wchar_t* __wtranslation_next;
+      __conv.in(__state, __translation, __translation + __size,
+		__translation_next,
+		__wtranslation, __wtranslation + __size,
+		__wtranslation_next);
+      return wstring(__wtranslation, __wtranslation_next);
+    }
 #endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
Index: config/locale/gnu/messages_members.h
===================================================================
--- config/locale/gnu/messages_members.h	(revision 218247)
+++ config/locale/gnu/messages_members.h	(working copy)
@@ -83,22 +83,6 @@ 
       _S_destroy_c_locale(_M_c_locale_messages); 
     }
 
-  template<typename _CharT>
-    typename messages<_CharT>::catalog 
-    messages<_CharT>::do_open(const basic_string<char>& __s, 
-			      const locale&) const
-    { 
-      // No error checking is done, assume the catalog exists and can
-      // be used.
-      textdomain(__s.c_str());
-      return 0;
-    }
-
-  template<typename _CharT>
-    void    
-    messages<_CharT>::do_close(catalog) const 
-    { }
-
    // messages_byname
    template<typename _CharT>
      messages_byname<_CharT>::messages_byname(const char* __s, size_t __refs)
@@ -126,5 +110,25 @@ 
 	 }
      }
 
+    template<>
+      typename messages<char>::catalog
+      messages<char>::do_open(const basic_string<char>&,
+			      const locale&) const;
+
+    template<>
+      void
+      messages<char>::do_close(catalog) const;
+
+#ifdef _GLIBCXX_USE_WCHAR_T
+    template<>
+      typename messages<wchar_t>::catalog
+      messages<wchar_t>::do_open(const basic_string<char>&,
+				 const locale&) const;
+
+    template<>
+      void
+      messages<wchar_t>::do_close(catalog) const;
+#endif
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
Index: include/bits/codecvt.h
===================================================================
--- include/bits/codecvt.h	(revision 218247)
+++ include/bits/codecvt.h	(working copy)
@@ -263,8 +263,6 @@ 
       do_max_length() const throw() = 0;
     };
 
-
-
   /**
    *  @brief  Primary class template codecvt.
    *  @ingroup locales
@@ -340,6 +338,8 @@ 
     class codecvt<char, char, mbstate_t>
     : public __codecvt_abstract_base<char, char, mbstate_t>
     {
+      friend class messages<char>;
+
     public:
       // Types:
       typedef char			intern_type;
@@ -398,6 +398,8 @@ 
     class codecvt<wchar_t, char, mbstate_t>
     : public __codecvt_abstract_base<wchar_t, char, mbstate_t>
     {
+      friend class messages<wchar_t>;
+
     public:
       // Types:
       typedef wchar_t			intern_type;
Index: testsuite/22_locale/messages/13631.cc
===================================================================
--- testsuite/22_locale/messages/13631.cc	(revision 0)
+++ testsuite/22_locale/messages/13631.cc	(working copy)
@@ -0,0 +1,99 @@ 
+// { dg-require-namedlocale "fr_FR" }
+
+// Copyright (C) 2014 Free Software Foundation
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <locale>
+#include <testsuite_hooks.h>
+
+void test01()
+{
+  bool test __attribute__((unused)) = true;
+
+  // This is defined through CXXFLAGS in scripts/testsuite_flags[.in].
+  const char* dir = LOCALEDIR;
+
+  std::locale l("fr_FR");
+
+  typedef std::messages<char> messages;
+
+  const messages &msgs_facet = std::use_facet<messages>(l);
+
+  messages::catalog msgs = msgs_facet.open("libstdc++", l, dir);
+  VERIFY( msgs >= 0 );
+
+  const char msgid[] = "please";
+  std::string translation1 = msgs_facet.get(msgs, 0, 0, msgid);
+
+  // Without a real translation this test doesn't mean anything:
+  VERIFY( translation1 != msgid );
+
+  // Opening another catalog was enough to show the problem, even a fake
+  // catalog.
+  messages::catalog fake_msgs = msgs_facet.open("fake", l);
+
+  std::string translation2 = msgs_facet.get(msgs, 0, 0, msgid);
+
+  // Close catalogs before doing the check to avoid leaks.
+  msgs_facet.close(fake_msgs);
+  msgs_facet.close(msgs);
+
+  VERIFY( translation1 == translation2 );
+}
+
+void test02()
+{
+  bool test __attribute__((unused)) = true;
+
+  // This is defined through CXXFLAGS in scripts/testsuite_flags[.in].
+  const char* dir = LOCALEDIR;
+
+  std::locale l("fr_FR");
+
+  typedef std::messages<wchar_t> messages;
+
+  const messages &msgs_facet = std::use_facet<messages>(l);
+
+  messages::catalog msgs = msgs_facet.open("libstdc++", l, dir);
+  VERIFY( msgs >= 0 );
+
+  const wchar_t msgid[] = L"please";
+  std::wstring translation1 = msgs_facet.get(msgs, 0, 0, msgid);
+
+  // Without a real translation this test doesn't mean anything:
+  VERIFY( !translation1.empty() );
+  VERIFY( translation1 != msgid );
+
+  // Opening another catalog was enough to show the problem, even a fake
+  // catalog.
+  messages::catalog fake_msgs = msgs_facet.open("fake", l);
+
+  std::wstring translation2 = msgs_facet.get(msgs, 0, 0, msgid);
+
+  // Close catalogs before doing the check to avoid leaks.
+  msgs_facet.close(fake_msgs);
+  msgs_facet.close(msgs);
+
+  VERIFY( translation1 == translation2 );
+}
+
+int main()
+{
+  test01();
+  test02();
+  return 0;
+}
Index: testsuite/22_locale/messages/members/char/2.cc
===================================================================
--- testsuite/22_locale/messages/members/char/2.cc	(revision 218247)
+++ testsuite/22_locale/messages/members/char/2.cc	(working copy)
@@ -35,9 +35,8 @@ 
   const char* dir = LOCALEDIR;
 
   // basic construction
-  locale loc_c = locale::classic();
   locale loc_fr = locale("fr_FR");
-  VERIFY( loc_c != loc_fr );
+  VERIFY( locale::classic() != loc_fr );
 
   // cache the messages facets
   const messages<char>& mssg_fr = use_facet<messages<char> >(loc_fr); 
@@ -47,7 +46,7 @@ 
   // void close(catalog) const;
 
   // Check French (fr_FR) locale.
-  catalog cat_fr = mssg_fr.open("libstdc++", loc_c, dir);
+  catalog cat_fr = mssg_fr.open("libstdc++", loc_fr, dir);
   string s01 = mssg_fr.get(cat_fr, 0, 0, "please");
   string s02 = mssg_fr.get(cat_fr, 0, 0, "thank you");
   VERIFY ( s01 == "s'il vous plaît" );