Patchwork [RFC,v3,11/11] QMP/qmp.py: set locale for exceptions to display non-ascii messages correctly

login
register
mail settings
Submitter Tomoki Sekiyama
Date May 21, 2013, 3:34 p.m.
Message ID <20130521153415.4880.53934.stgit@hds.com>
Download mbox | patch
Permalink /patch/245329/
State New
Headers show

Comments

Tomoki Sekiyama - May 21, 2013, 3:34 p.m.
qemu-ga in Windows may return error message with multibyte characters
when the guest OS language is set to other than English. To display such
messages correctly, this encodes the message based on the locale settings.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
---
 QMP/qmp.py |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
Stefan Hajnoczi - May 23, 2013, 12:30 p.m.
On Tue, May 21, 2013 at 11:34:16AM -0400, Tomoki Sekiyama wrote:
> qemu-ga in Windows may return error message with multibyte characters
> when the guest OS language is set to other than English. To display such
> messages correctly, this encodes the message based on the locale settings.
> 
> Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
> ---
>  QMP/qmp.py |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/QMP/qmp.py b/QMP/qmp.py
> index c551df1..ee21819 100644
> --- a/QMP/qmp.py
> +++ b/QMP/qmp.py
> @@ -11,6 +11,7 @@
>  import json
>  import errno
>  import socket
> +import locale
>  
>  class QMPError(Exception):
>      pass
> @@ -133,7 +134,8 @@ class QEMUMonitorProtocol:
>      def command(self, cmd, **kwds):
>          ret = self.cmd(cmd, kwds)
>          if ret.has_key('error'):
> -            raise Exception(ret['error']['desc'])
> +            enc = locale.getpreferredencoding()
> +            raise Exception(ret['error']['desc'].encode(enc))

You should not need to explicitly encode the error descriptor.  The
error description should be UTF-8 on the wire and a Unicode Python
string in this script.

I think the real problem is:

1. Guest qga is writing strings in local encoding onto the wire.

or

2. qmp.py isn't UTF-8-decoding strings received over the wire.

Either or both bugs could be present.  Once they are fixed you shouldn't
see encoding problems.

Stefan
Tomoki Sekiyama - May 24, 2013, 6:55 p.m.
On 5/23/13 8:30 , "Stefan Hajnoczi" <stefanha@gmail.com> wrote:

>On Tue, May 21, 2013 at 11:34:16AM -0400, Tomoki Sekiyama wrote:
>> qemu-ga in Windows may return error message with multibyte characters
>> when the guest OS language is set to other than English. To display such
>> messages correctly, this encodes the message based on the locale
>>settings.
>> 
>> Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
>> ---
>>  QMP/qmp.py |    4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>> 
>> diff --git a/QMP/qmp.py b/QMP/qmp.py
>> index c551df1..ee21819 100644
>> --- a/QMP/qmp.py
>> +++ b/QMP/qmp.py
>> @@ -11,6 +11,7 @@
>>  import json
>>  import errno
>>  import socket
>> +import locale
>>  
>>  class QMPError(Exception):
>>      pass
>> @@ -133,7 +134,8 @@ class QEMUMonitorProtocol:
>>      def command(self, cmd, **kwds):
>>          ret = self.cmd(cmd, kwds)
>>          if ret.has_key('error'):
>> -            raise Exception(ret['error']['desc'])
>> +            enc = locale.getpreferredencoding()
>> +            raise Exception(ret['error']['desc'].encode(enc))
>
>You should not need to explicitly encode the error descriptor.  The
>error description should be UTF-8 on the wire and a Unicode Python
>string in this script.
>
>I think the real problem is:
>
>1. Guest qga is writing strings in local encoding onto the wire.

The error description Guest qga writes is encoded like:

  {"error": {"class": "GenericError", "desc": "\u64CD\u4F5C\u306F..."}}

I feel this is correct (\u64CD is a correct representation of single
character in Unicode).


>or
>
>2. qmp.py isn't UTF-8-decoding strings received over the wire.

And qmp.py can decode this correctly, and ret['error']['desc'] is
Unicode Python string.

The problem looks like in python Exception, that cannot print out
the message at all if it contains non-ascii Unicode string:

% python

>>> raise Exception(u"abc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: abc

>>> raise Exception(u"abcü")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception

while the "print" doesn't have this issue:

>>> print(u"abcü")
abcü



>Either or both bugs could be present.  Once they are fixed you shouldn't
>see encoding problems.
>
>Stefan

Anyway, as I understood this patch for qmp.py is not correct way to fix
this issue, I'm going to drop it from the series.

Thanks,

Tomoki Sekiyama
Stefan Hajnoczi - May 27, 2013, 12:06 p.m.
On Fri, May 24, 2013 at 06:55:21PM +0000, Tomoki Sekiyama wrote:
> On 5/23/13 8:30 , "Stefan Hajnoczi" <stefanha@gmail.com> wrote:
> >Either or both bugs could be present.  Once they are fixed you shouldn't
> >see encoding problems.
> >
> >Stefan
> 
> Anyway, as I understood this patch for qmp.py is not correct way to fix
> this issue, I'm going to drop it from the series.

Interesting, thanks for showing that that QEMU and qmp.py are behaving
correctly.

For reference, the Python unicode exception bug is here and doesn't seem
to be fixable in Python 2.x:

http://bugs.python.org/issue2517

Stefan

Patch

diff --git a/QMP/qmp.py b/QMP/qmp.py
index c551df1..ee21819 100644
--- a/QMP/qmp.py
+++ b/QMP/qmp.py
@@ -11,6 +11,7 @@ 
 import json
 import errno
 import socket
+import locale
 
 class QMPError(Exception):
     pass
@@ -133,7 +134,8 @@  class QEMUMonitorProtocol:
     def command(self, cmd, **kwds):
         ret = self.cmd(cmd, kwds)
         if ret.has_key('error'):
-            raise Exception(ret['error']['desc'])
+            enc = locale.getpreferredencoding()
+            raise Exception(ret['error']['desc'].encode(enc))
         return ret['return']
 
     def pull_event(self, wait=False):