From patchwork Mon Jan 7 22:05:33 2019
X-Patchwork-Submitter: "Yann E. MORIN"
X-Patchwork-Id: 1021624
From: "Yann E. MORIN"
To: buildroot@buildroot.org
Cc: "Yann E. MORIN"
Date: Mon, 7 Jan 2019 23:05:33 +0100
Subject: [Buildroot] [PATCH 11/19] support: introduce new format for
 packages-file-list files
List-Id: Discussion and development of buildroot

The existing format for the packages-file-list files has two drawbacks:

  - it is not very resilient against filenames with weird characters,
    like \n or \b or spaces...

  - it is not easily expandable, partly because of the above.

As such, introduce a new format for those files, that solves both
issues.

First, we must find a way to unambiguously separate fields. There is
one single byte that can never ever occur in filenames or package
names, i.e. the NUL character. So, we use that as the field separator.

Second, we must find a way to unambiguously separate records. Except
for \0, any character may occur in filenames, but the other existing
field we currently have is the package name, which we do know does not
contain any weird byte (e.g. it's basically limited to [[:alnum:]_-]).
Thus, we can't use a single character as record separator: \0 is
already taken, and any other byte may legitimately appear in a
filename. A solution is to use \0\n as the record separator.

Third, we must ensure that filenames never interfere with our separators.
By making the filename the first field, we can be sure that it is
properly terminated by a field separator, and that any leading \n does
not interfere with a previous field separator to form a spurious record
separator.

So, the new format is now (without spaces):

    filename \0 package-name \0\n

Update the parser accordingly.

Signed-off-by: "Yann E. MORIN"
---
 package/pkg-generic.mk       |  8 ++++++--
 support/scripts/brpkgutil.py | 28 ++++++++++++++++++++++++----
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index 7daea190a6..d261b5bf76 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -59,6 +59,10 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
 
 # The suffix is typically empty for the target variant, for legacy backward
 # compatibility.
+# Files are record-formatted, with \0\n as record separator, and \0 as
+# field separator. A record is made of these fields:
+#  - file path
+#  - package name
 # $(1): package name
 # $(2): base directory to search in
 # $(3): suffix of file (optional)
@@ -66,8 +70,8 @@ define step_pkg_size_inner
 	cd $(2); \
 	find . \( -type f -o -type l \) \
 		-newer $@_before \
-		-exec printf '$(1),%s\n' {} + \
-		>> $(BUILD_DIR)/packages-file-list$(3).txt
+		|sed -r -e 's/$$/\x00$(1)\x00/' \
+		>> $(BUILD_DIR)/packages-file-list$(3).txt
 endef
 
 define step_pkg_size
diff --git a/support/scripts/brpkgutil.py b/support/scripts/brpkgutil.py
index d15b18845b..f6ef4b3dca 100644
--- a/support/scripts/brpkgutil.py
+++ b/support/scripts/brpkgutil.py
@@ -5,6 +5,26 @@ import sys
 import subprocess
 
 
+# Read the binary-opened file object f with \0\n separated records (aka lines).
+# Highly inspired by:
+# https://stackoverflow.com/questions/19600475/how-to-read-records-terminated-by-custom-separator-from-file-in-python
+def _readlines0n(f):
+    buf = b''
+    while True:
+        newbuf = f.read(1048576)
+        if not newbuf:
+            if buf:
+                yield buf
+            return
+        if buf is None:
+            buf = b''
+        buf += newbuf
+        lines = buf.split(b'\x00\n')
+        for line in lines[:-1]:
+            yield line
+        buf = lines[-1]
+
+
 # Iterate on all records of the packages-file-list file passed as filename
 # Returns an iterator over a list of dictionaries. Each dictionary contains
 # these keys (others maybe added in the future):
@@ -12,11 +32,11 @@ import subprocess
 # 'pkg': the last package that installed that file
 def parse_pkg_file_list(path):
     with open(path, 'rb') as f:
-        for rec in f.readlines():
-            l = rec.split(',0')
+        for rec in _readlines0n(f):
+            srec = rec.split(b'\x00')
             d = {
-                'file': l[0],
-                'pkg': l[1],
+                'file': srec[0],
+                'pkg': srec[1],
             }
             yield d
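
For illustration only (this is not part of the patch): a minimal Python 3
sketch of the record format described in the commit log, i.e.
"filename \0 package-name \0\n". The names encode_record, iter_records
and parse_records are hypothetical helpers invented for this example;
iter_records follows the same chunked split-on-\0\n idea as
_readlines0n above. Records are written directly from Python here, so
the sketch says nothing about how the find | sed pipeline behaves.

```python
import io

FIELD_SEP = b"\x00"     # NUL can never occur in a filename or package name
RECORD_SEP = b"\x00\n"  # NUL followed by a newline terminates a record


def encode_record(filename, pkg):
    """Serialise one record as: filename \\0 package-name \\0\\n (hypothetical helper)."""
    return filename + FIELD_SEP + pkg + RECORD_SEP


def iter_records(f, chunk_size=4096):
    """Yield raw records from a binary file object, splitting on \\0\\n."""
    buf = b""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            # End of input: flush a trailing, unterminated record if any.
            if buf:
                yield buf
            return
        buf += chunk
        # Everything but the last piece is a complete record; the last
        # piece is kept, since its separator may span the next chunk.
        *complete, buf = buf.split(RECORD_SEP)
        for rec in complete:
            yield rec


def parse_records(f, chunk_size=4096):
    """Yield one dict per record, with the 'file' and 'pkg' keys as bytes."""
    for rec in iter_records(f, chunk_size):
        filename, pkg = rec.split(FIELD_SEP)
        yield {"file": filename, "pkg": pkg}


if __name__ == "__main__":
    sample = [
        (b"./usr/bin/foo", b"busybox"),
        (b"./weird\nname, with spaces", b"libfoo"),
        (b"\n./leading-newline", b"host-bar"),
    ]
    blob = b"".join(encode_record(name, pkg) for name, pkg in sample)
    for d in parse_records(io.BytesIO(blob)):
        print(d)
```

Because the filename comes first and the package name never starts with
\n, the two-byte sequence \0\n can only appear at the end of a record,
even when a filename itself starts with or contains \n.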