From patchwork Thu Jul 26 11:22:00 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Markus Armbruster
X-Patchwork-Id: 173405
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17])
 (using TLSv1 with cipher AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by ozlabs.org (Postfix) with ESMTPS id 6A1BD2C0097
 for ; Thu, 26 Jul 2012 21:22:21 +1000 (EST)
Received: from localhost ([::1]:55150 helo=lists.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1SuM95-0005hc-7e
 for incoming@patchwork.ozlabs.org; Thu, 26 Jul 2012 07:22:19 -0400
Received: from eggs.gnu.org ([208.118.235.92]:59917)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1SuM8v-0005h7-VU
 for qemu-devel@nongnu.org; Thu, 26 Jul 2012 07:22:11 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1SuM8r-0005Bu-KK
 for qemu-devel@nongnu.org; Thu, 26 Jul 2012 07:22:09 -0400
Received: from mx1.redhat.com ([209.132.183.28]:64904)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1SuM8r-0005Bo-Bb
 for qemu-devel@nongnu.org; Thu, 26 Jul 2012 07:22:05 -0400
Received: from int-mx12.intmail.prod.int.phx2.redhat.com
 (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25])
 by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id q6QBM2xU006563
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
 Thu, 26 Jul 2012 07:22:02 -0400
Received: from blackfin.pond.sub.org (ovpn-116-63.ams2.redhat.com [10.36.116.63])
 by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP
 id q6QBM17p009748; Thu, 26 Jul 2012 07:22:01 -0400
Received: by blackfin.pond.sub.org (Postfix, from userid 1000)
 id C37BA20053; Thu, 26 Jul 2012 13:22:00 +0200 (CEST)
From: Markus Armbruster
To: Peter Maydell
References:
 <1343235256-26310-1-git-send-email-lcapitulino@redhat.com>
 <1343235256-26310-8-git-send-email-lcapitulino@redhat.com>
 <20120725161813.538012f0@doriath.home>
Date: Thu, 26 Jul 2012 13:22:00 +0200
In-Reply-To: (Peter Maydell's message of "Wed, 25 Jul 2012 20:47:24 +0100")
Message-ID: <878ve6epif.fsf@blackfin.pond.sub.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.97 (gnu/linux)
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.132.183.28
Cc: qemu-devel@nongnu.org, pbonzini@redhat.com, aliguori@us.ibm.com,
 mdroth@linux.vnet.ibm.com, Luiz Capitulino
Subject: Re: [Qemu-devel] [PATCH 07/11] qapi: qapi.py: allow the "'" character be escaped
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

Peter Maydell writes:

> On 25 July 2012 20:18, Luiz Capitulino wrote:
>> Peter Maydell wrote:
>>> On 25 July 2012 17:54, Luiz Capitulino wrote:
>>> > --- a/scripts/qapi.py
>>> > +++ b/scripts/qapi.py
>>> > @@ -21,7 +21,9 @@ def tokenize(data):
>>> >          elif data[0] == "'":
>>> >              data = data[1:]
>>> >              string = ''
>>> > -            while data[0] != "'":
>>> > +            while True:
>>> > +                if data[0] == "'" and string[len(string)-1] != "\\":
>>> > +                    break
>>> >                  string += data[0]
>>> >                  data = data[1:]
>>> >              data = data[1:]
>>>
>>> Won't this cause us to look at string[-1] if
>>> the input data has two ' characters in a row?
>>
>> Non escaped? If you meant '' that's a zero length string and should work, but
>> if you meant 'foo '' bar' that's illegal, because ' characters
>> should be escaped.
>
> I meant the zero length string case. yes. We come in with data = "''",
> strip the first ' and set string to empty.
> Then in the first time
> in the while loop data[0] is "'" but len(string) is 0 and so we'll
> do string[-1] which I think will throw an exception.
>
> ...and yep, quick test of a nobbled qapi-schema.json confirms:
> $ python /home/pm215/src/qemu/qemu/scripts/qapi-types.py -h -o "." <
> /home/pm215/src/qemu/qemu/qapi-schema.json
> Traceback (most recent call last):
>   File "/home/pm215/src/qemu/qemu/scripts/qapi-types.py", line 260, in <module>
>     exprs = parse_schema(sys.stdin)
>   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 78, in parse_schema
>     expr_eval = evaluate(expr)
>   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 64, in evaluate
>     return parse(map(lambda x: x, tokenize(string)))[0]
>   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 25, in tokenize
>     if data[0] == "'" and string[len(string)-1] != "\\":
> IndexError: string index out of range
>
> Try this (very lightly tested but seems to work):
> (feel free to do something nicer than raising an exception on
> the syntax error, and sorry I'm feeling too lazy to make this
> an actual patch email)
>
> Signed-off-by: Peter Maydell
>
> --- a/scripts/qapi.py
> +++ b/scripts/qapi.py
> @@ -21,10 +21,16 @@ def tokenize(data):
>          elif data[0] == "'":
>              data = data[1:]
>              string = ''
> -            while data[0] != "'":
> -                string += data[0]
> -                data = data[1:]
> -            data = data[1:]
> +            while True:
> +                pos = data.find("'")
> +                if pos == -1:
> +                    raise Exception("Mismatched quotes")
> +                string += data[0:pos]
> +                data = data[pos+1:]
> +                if len(string) == 0 or string[-1] != "\\":
> +                    # found a ' and it wasn't escaped
> +                    break
> +                string = string[0:-1] + "'"
>              yield string
>
>  def parse(tokens):
>
> (if anybody wants to be able to use '\\' to escape escapes then
> this approach is a bit stuffed, of course.)

For what it's worth, the orthodox way to lexically analyze strings is a
finite automaton. Utterly untested sketch:

Doesn't handle missing close quote gracefully; you may want to add that.
>> PS: Peter, I get claustrophobic when reading emails from you :)
>
> I can add more blank lines if that helps? :-)
>
> -- PMM

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 8082af3..a745e92 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -21,8 +21,17 @@ def tokenize(data):
         elif data[0] == "'":
             data = data[1:]
             string = ''
-            while data[0] != "'":
-                string += data[0]
+            esc = False
+            while True:
+                if esc:
+                    string += data[0]
+                    esc = False
+                elif data[0] == "\\":
+                    esc = True
+                elif data[0] == "'":
+                    break
+                else:
+                    string += data[0]
                 data = data[1:]
             data = data[1:]
             yield string
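
For illustration only, here is the same automaton pulled out into a
standalone function, with the missing close quote case handled. This is
a rough sketch, not part of the patch: the name scan_string, the
(string, rest) return value and the choice of ValueError are all
assumptions made for the example, not anything qapi.py defines.

```python
def scan_string(data):
    """Scan one single-quoted string at the start of data.

    Hypothetical helper, not part of scripts/qapi.py. Returns a tuple
    (string, rest): the unescaped string contents and the input that
    follows the closing quote.
    """
    if not data.startswith("'"):
        raise ValueError("expected opening quote")
    string = ''
    esc = False          # True right after an unescaped backslash
    pos = 1              # skip the opening quote
    while pos < len(data):
        ch = data[pos]
        if esc:
            # Previous character was a backslash: take this one literally.
            string += ch
            esc = False
        elif ch == "\\":
            esc = True
        elif ch == "'":
            # Unescaped quote closes the string.
            return string, data[pos + 1:]
        else:
            string += ch
        pos += 1
    # Ran off the end of the input without seeing a closing quote.
    raise ValueError("missing closing quote")
```

With this shape the empty string '' scans without touching string[-1],
and both \' and \\ work as escapes, because the state lives in the esc
flag instead of being inferred from the last character collected so far.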