diff mbox

[2/4] pkg-infra: add possiblity to check downloaded files against known hashes

Message ID 3ab303bd4f1ee78900e7fafc90947e30319635b7.1404416102.git.yann.morin.1998@free.fr
State Accepted
Headers show

Commit Message

Yann E. MORIN July 3, 2014, 7:36 p.m. UTC
Some of the packages that Buildroot might build are sensitive packages,
related to security: openssl, dropbear, ca-certificates...

Some of those packages are downloaded over plain http, because there is
no way to get them over a secure channel, such as https.

In these dark times of pervasive surveillance, the potential for harm that
a tampered-with package could generate, we may want to check the integrity
of those sensitive packages.

So, each package may now provide a list of hashes for all files that needs
to be downloaded, and Buildroot will just fail if any downloaded file does
not match its known hash, in which case it is removed.

Hashes can be any of the md5, sha1 or sha2 variants, and will be checked
even if the file was pre-downloaded.

Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Baruch Siach <baruch@tkos.co.il>
Cc: Arnout Vandecappelle <arnout@mind.be>
Cc: Gustavo Zacarias <gustavo@zacarias.com.ar>
Reviewed-by: Samuel Martin <s.martin49@gmail.com>
Cc: Thomas De Schampheleire <patrickdepinguin@gmail.com>

---
Changes v9 -> v10:
  - use bash as shell  (Peter)

Changes v7 -> v8
  - expand MITM to its full-length meaning  (Thomas DS)
  - typo  (Thomas DS)

Changes v4 -> v5:
  - fix detection of comments and empty lines

---
Note: this is not a bullet-proof solution, since Buildroot may itself be
compromised. But since we do sign our releases, then we secure the list of
hashes at the same time. Only random snapshots from the repository may be
at risk of tampering, although this is highly doubtfull, given how git
stores its data.
---
 package/pkg-download.mk     | 20 ++++++++++--
 support/download/check-hash | 76 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+), 3 deletions(-)
 create mode 100755 support/download/check-hash

Comments

Peter Korsgaard July 4, 2014, 9:49 p.m. UTC | #1
>>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:

 > Some of the packages that Buildroot might build are sensitive packages,
 > related to security: openssl, dropbear, ca-certificates...

 > Some of those packages are downloaded over plain http, because there is
 > no way to get them over a secure channel, such as https.

 > In these dark times of pervasive surveillance, the potential for harm that
 > a tampered-with package could generate, we may want to check the integrity
 > of those sensitive packages.

 > So, each package may now provide a list of hashes for all files that needs
 > to be downloaded, and Buildroot will just fail if any downloaded file does
 > not match its known hash, in which case it is removed.

 > Hashes can be any of the md5, sha1 or sha2 variants, and will be checked
 > even if the file was pre-downloaded.

 > Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
 > Cc: Baruch Siach <baruch@tkos.co.il>
 > Cc: Arnout Vandecappelle <arnout@mind.be>
 > Cc: Gustavo Zacarias <gustavo@zacarias.com.ar>
 > Reviewed-by: Samuel Martin <s.martin49@gmail.com>
 > Cc: Thomas De Schampheleire <patrickdepinguin@gmail.com>

 > ---
 > Changes v9 -> v10:
 >   - use bash as shell  (Peter)

 > Changes v7 -> v8
 >   - expand MITM to its full-length meaning  (Thomas DS)
 >   - typo  (Thomas DS)

 > Changes v4 -> v5:
 >   - fix detection of comments and empty lines

 > ---
 > Note: this is not a bullet-proof solution, since Buildroot may itself be
 > compromised. But since we do sign our releases, then we secure the list of
 > hashes at the same time. Only random snapshots from the repository may be
 > at risk of tampering, although this is highly doubtfull, given how git
 > stores its data.
 > ---
 >  package/pkg-download.mk     | 20 ++++++++++--
 >  support/download/check-hash | 76 +++++++++++++++++++++++++++++++++++++++++++++
 >  2 files changed, 93 insertions(+), 3 deletions(-)
 >  create mode 100755 support/download/check-hash

 > diff --git a/package/pkg-download.mk b/package/pkg-download.mk
 > index d3cd0c1..7f208d5 100644
 > --- a/package/pkg-download.mk
 > +++ b/package/pkg-download.mk
 > @@ -58,6 +58,17 @@ domainseparator=$(if $(1),$(1),/)
 >  # github(user,package,version): returns site of GitHub repository
 >  github = https://github.com/$(1)/$(2)/archive/$(3)
 
 > +# Helper for checking a tarball's checksum
 > +# If the hash does not match, remove the incorrect file
 > +# $(1): the path to the file with the hashes
 > +# $(2): the full path to the file to check
 > +define VERIFY_HASH
 > +	if ! support/download/check-hash $(1) $(2); then \
 > +		rm -f $(2); \
 > +		exit 1; \
 > +	fi

I wonder if it would be a worthwhile optimization to do the -f
<whatever>.hash file check here, so we don't need to run the script at
all for packages without hashes, but perhaps that's not measurable.


 > +++ b/support/download/check-hash
 > @@ -0,0 +1,76 @@
 > +#!/bin/bash
 > +set -e
 > +
 > +# Helper to check a file matches its known hash
 > +# Call it with:
 > +#   $1: the full path to the file to check
 > +#   $2: the path of the file containing all the the expected hashes
 > +
 > +h_file="${1}"
 > +file="${2}"
 > +
 > +# Does the hash-file exist?
 > +if [ ! -f "${h_file}" ]; then
 > +    exit 0
 > +fi
 > +
 > +# Check one hash for a file
 > +# $1: known hash
 > +# $2: file (full path)
 > +check_one_hash() {
 > +    _h="${1}"
 > +    _known="${2}"
 > +    _file="${3}"
 > +
 > +    # Note: sha3 is not supported, since there is currently no implementation
 > +    #       (the NIST has yet to publish the parameters).
 > +    case "${_h}" in
 > +        md5|sha1)                       ;;
 > +        sha224|sha256|sha384|sha512)    ;;

All of these come from coreutils and have been available for ages so
they are basically guaranteed to be available on the build machine,
right?

If not, we should probably warn with a sensible error message instead of
what we get now:

sudo mv /usr/bin/sha1sum{,-dontuse}
make ca-certificates-source
support/download/check-hash: line 39: sha1sum: command not found
ERROR: ca-certificates_20140223.tar.xz has wrong sha1 hash:
ERROR: expected: ad57a45f0422fafd78a2e8191e5204f2306cc91b
ERROR: got     : 
ERROR: Incomplete download, or man-in-the-middle (MITM) attack
package/pkg-generic.mk:73: recipe for target '/home/peko/source/buildroot/output/build/ca-certificates-20140223/.stamp_downloaded' failed


But we can fix that up later if needed - Committed, thanks.
Bernd Kuhls Nov. 5, 2020, 6:38 p.m. UTC | #2
Hi Yann,

Am Thu, 03 Jul 2014 21:36:21 +0200 schrieb Yann E. MORIN:

> +    # Note: sha3 is not supported, since there is currently no 
implementation
> +    #       (the NIST has yet to publish the parameters).

sqlite.org started to release sha3 hashes for their tarballs:
https://www.sqlite.org/download.html

Apparently a sha3sum utility is not available so I tried

openssl dgst -sha3-256 dl/sqlite/sqlite-autoconf-3330000.tar.gz

which shows the hash mentioned upstream:

SHA3-256(dl/sqlite/sqlite-autoconf-3330000.tar.gz)= 
6e94e9453cedf8f2023e3923f856741d1e28a2271e9f93d24d95fa48870edaad

Should we add the openssl-way to verify sha3 hashes?

Should our hash files support different sha3 methods?
Openssl support sha3-224, sha3-256, sha3-384 & sha3-512.

Regards, Bernd
Yann E. MORIN Nov. 5, 2020, 9:12 p.m. UTC | #3
Bernd, All,

On 2020-11-05 19:38 +0100, Bernd Kuhls spake thusly:
> Am Thu, 03 Jul 2014 21:36:21 +0200 schrieb Yann E. MORIN:
> sqlite.org started to release sha3 hashes for their tarballs:
> https://www.sqlite.org/download.html
> 
> Apparently a sha3sum utility is not available so I tried
> 
> openssl dgst -sha3-256 dl/sqlite/sqlite-autoconf-3330000.tar.gz
> 
> which shows the hash mentioned upstream:
> 
> SHA3-256(dl/sqlite/sqlite-autoconf-3330000.tar.gz)= 
> 6e94e9453cedf8f2023e3923f856741d1e28a2271e9f93d24d95fa48870edaad
> 
> Should we add the openssl-way to verify sha3 hashes?

Unfortunately, this is only present in very recent versions of openssl.
It was missing in Ubuntu 16.04, while it is present in 18.04 (both still
supported LTS versions)

So, I am not really sure how we can move forward...

If we were to add it, and were to make it mandatory that we be able to
validate them, then it would mean we would have to build our own
host-openssl prior to doing downloads. This is very not nice (see the
existing issue with host-tar, which we are trying to get rid of).

If we were to add it, and only check for them if the host openssl did
support them, and ignore those hashes otherwise, then it would mean some
packages which have only sha3 hashes would not be checked (even if
in-tree we ensure a fallback hash exists, that might not be the case for
external trees, or the situation can slip our attention).

If we don't add it, and upstream does not provide anything else, then we
will have to resort to locally computing our own hashes.

Franckly, my preference would got for the third option: not support sha3,
and add our own hashes. Adding our own hashes is anyway what we already
do for a lot of packages already. sha3 does provide extra resilience,
thanks to its novel design, but sha2 is still far from being considered
broken yet [0].

One thing we may consider adding to reinforce our robustness, is to
store the file size in the hash file, in addition to the hash, e.g.:

    sha256  c35d87f1d0...bbff51fe689  2439463  busybox-1.32.0.tar.bz2

This would protect against size-extension attacks, which afaiu are the
only attacks really considered for now against sha2 [1]...

And we could be backward compatible and recognise 3- or 4-field lines,
to decide whether the size is present of not, and not checking it in the
latter case.

> Should our hash files support different sha3 methods?
> Openssl support sha3-224, sha3-256, sha3-384 & sha3-512.

If we were to add such support, then yes, we would have to add all four
lengths, as we do for sha224, sha256, sha384, and sha512, which all four
are really just sha2 variants.

But unfortunately, I am not convinced we can implement it for now. And
be sure it saddens me... :-(

[0] https://issha2brokenyet.com/  (just for the lols)
[1] https://en.wikipedia.org/wiki/SHA-2#Comparison_of_SHA_functions

Regards,
Yann E. MORIN.
Bernd Kuhls Nov. 6, 2020, 8:28 p.m. UTC | #4
Am Thu, 05 Nov 2020 22:12:32 +0100 schrieb Yann E. MORIN:

> Bernd, All,
> 
> On 2020-11-05 19:38 +0100, Bernd Kuhls spake thusly:
>> Am Thu, 03 Jul 2014 21:36:21 +0200 schrieb Yann E. MORIN: sqlite.org
>> started to release sha3 hashes for their tarballs:
>> https://www.sqlite.org/download.html
>> 
>> Apparently a sha3sum utility is not available so I tried
>> 
>> openssl dgst -sha3-256 dl/sqlite/sqlite-autoconf-3330000.tar.gz
>> 
>> which shows the hash mentioned upstream:
>> 
>> SHA3-256(dl/sqlite/sqlite-autoconf-3330000.tar.gz)=
>> 6e94e9453cedf8f2023e3923f856741d1e28a2271e9f93d24d95fa48870edaad
>> 
>> Should we add the openssl-way to verify sha3 hashes?
> 
> Unfortunately, this is only present in very recent versions of openssl.
> It was missing in Ubuntu 16.04, while it is present in 18.04 (both still
> supported LTS versions)
> 
> So, I am not really sure how we can move forward...

Hi Yann,

what about host-rhash?
http://patchwork.ozlabs.org/project/buildroot/list/?series=212773

$ output/host/bin/rhash --sha3-256 dl/sqlite/sqlite-
autoconf-3330000.tar.gz 
6e94e9453cedf8f2023e3923f856741d1e28a2271e9f93d24d95fa48870edaad  dl/
sqlite/sqlite-autoconf-3330000.tar.gz

Regards, Bernd
Peter Korsgaard Nov. 7, 2020, 5:27 p.m. UTC | #5
>>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:

Hi,

 > So, I am not really sure how we can move forward...

 > If we were to add it, and were to make it mandatory that we be able to
 > validate them, then it would mean we would have to build our own
 > host-openssl prior to doing downloads. This is very not nice (see the
 > existing issue with host-tar, which we are trying to get rid of).

Indeed, lets not go there before a significant amount of upstreams start
to only provide sha3 hashes.


 > Franckly, my preference would got for the third option: not support sha3,
 > and add our own hashes. Adding our own hashes is anyway what we already
 > do for a lot of packages already. sha3 does provide extra resilience,
 > thanks to its novel design, but sha2 is still far from being considered
 > broken yet [0].

Agreed!

 > One thing we may consider adding to reinforce our robustness, is to
 > store the file size in the hash file, in addition to the hash, e.g.:

 >     sha256  c35d87f1d0...bbff51fe689  2439463  busybox-1.32.0.tar.bz2

 > This would protect against size-extension attacks, which afaiu are the
 > only attacks really considered for now against sha2 [1]...

 > And we could be backward compatible and recognise 3- or 4-field lines,
 > to decide whether the size is present of not, and not checking it in the
 > latter case.

I wonder if the gain is worth the extra complexity for our users and in
the implementation. Are there are any realistic size extension attacks
against sha256?
Yann E. MORIN Nov. 7, 2020, 9:32 p.m. UTC | #6
Peter, All,

On 2020-11-07 18:27 +0100, Peter Korsgaard spake thusly:
[--SNIP--]
>  > One thing we may consider adding to reinforce our robustness, is to
>  > store the file size in the hash file, in addition to the hash, e.g.:
[--SNIP--]
>  > This would protect against size-extension attacks, which afaiu are the
>  > only attacks really considered for now against sha2 [1]...
[--SNIP--]
> I wonder if the gain is worth the extra complexity for our users and in
> the implementation.

The implementation is pretty trivial. I have more changes against the
manual than I have against the code...

> Are there are any realistic size extension attacks
> against sha256?

Good question... At least, it looks like it was known from the onset,
during the standardisation process:

    https://www.cryptologie.net/article/417/how-did-length-extension-attacks-made-it-into-sha-2/
    http://www.cs.utsa.edu/~wagner/CS4363/SHS/dfips-180-2-comments1.pdf (comment #3)

And the following seems to imply it is pretty trivial (for a seasoned
cryptographer at least):

    https://www.whitehatsec.com/blog/hash-length-extension-attacks/
    https://blog.skullsecurity.org/2012/everything-you-need-to-know-about-hash-length-extension-attacks

So yes, it looks like length extension attacks (LEA) are easy...

However, now that I've read a bit more, especially that last article, I
doubt we'd be susceptible to such attacks. Indeed, LEA target MACs, that
is signatures. We're not using hashes that way; we just hash files, not
secrets.

So maybe this is not so interesting to add the size to the hash file...

Crypto is hard, damit... ;-)

Regards,
Yann E. MORIN.
Peter Korsgaard Nov. 8, 2020, 5:14 p.m. UTC | #7
>>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:

Hi,

 >> I wonder if the gain is worth the extra complexity for our users and in
 >> the implementation.

 > The implementation is pretty trivial. I have more changes against the
 > manual than I have against the code...

yes, I was mainly referring to the first part, E.G. our users. Now the
hash lines are an algorithm prefix and then the output of <alg>sum, but
with the suggested change this is no longer the case.

But yes, it isn't a big complication.


 > However, now that I've read a bit more, especially that last article, I
 > doubt we'd be susceptible to such attacks. Indeed, LEA target MACs, that
 > is signatures. We're not using hashes that way; we just hash files, not
 > secrets.

I am not a cryptographer, but I would imagine that creating LEA attacks
against the kind of hashes we have is HARD to do, otherwise a lot of
things would break, and all those upstreams publishing hashes with their
releases would be for nothing.
diff mbox

Patch

diff --git a/package/pkg-download.mk b/package/pkg-download.mk
index d3cd0c1..7f208d5 100644
--- a/package/pkg-download.mk
+++ b/package/pkg-download.mk
@@ -58,6 +58,17 @@  domainseparator=$(if $(1),$(1),/)
 # github(user,package,version): returns site of GitHub repository
 github = https://github.com/$(1)/$(2)/archive/$(3)
 
+# Helper for checking a tarball's checksum
+# If the hash does not match, remove the incorrect file
+# $(1): the path to the file with the hashes
+# $(2): the full path to the file to check
+define VERIFY_HASH
+	if ! support/download/check-hash $(1) $(2); then \
+		rm -f $(2); \
+		exit 1; \
+	fi
+endef
+
 ################################################################################
 # The DOWNLOAD_* helpers are in charge of getting a working copy
 # of the source repository for their corresponding SCM,
@@ -148,7 +159,8 @@  endef
 define DOWNLOAD_SCP
 	test -e $(DL_DIR)/$(2) || \
 	$(EXTRA_ENV) support/download/scp '$(call stripurischeme,$(call qstrip,$(1)))' \
-					  $(DL_DIR)/$(2)
+					  $(DL_DIR)/$(2) && \
+	$(call VERIFY_HASH,$(PKGDIR)/$($(PKG)_NAME).hash,$(DL_DIR)/$(2))
 endef
 
 define SOURCE_CHECK_SCP
@@ -179,7 +191,8 @@  endef
 
 define DOWNLOAD_WGET
 	test -e $(DL_DIR)/$(2) || \
-	$(EXTRA_ENV) support/download/wget '$(call qstrip,$(1))' $(DL_DIR)/$(2)
+	$(EXTRA_ENV) support/download/wget '$(call qstrip,$(1))' $(DL_DIR)/$(2) && \
+	$(call VERIFY_HASH,$(PKGDIR)/$($(PKG)_NAME).hash,$(DL_DIR)/$(2))
 endef
 
 define SOURCE_CHECK_WGET
@@ -193,7 +206,8 @@  endef
 define DOWNLOAD_LOCALFILES
 	test -e $(DL_DIR)/$(2) || \
 	$(EXTRA_ENV) support/download/cp $(call stripurischeme,$(call qstrip,$(1))) \
-					 $(DL_DIR)
+					 $(DL_DIR) && \
+	$(call VERIFY_HASH,$(PKGDIR)/$($(PKG)_NAME).hash,$(DL_DIR)/$(2))
 endef
 
 define SOURCE_CHECK_LOCALFILES
diff --git a/support/download/check-hash b/support/download/check-hash
new file mode 100755
index 0000000..9ea7c41
--- /dev/null
+++ b/support/download/check-hash
@@ -0,0 +1,76 @@ 
+#!/bin/bash
+set -e
+
+# Helper to check a file matches its known hash
+# Call it with:
+#   $1: the full path to the file to check
+#   $2: the path of the file containing all the the expected hashes
+
+h_file="${1}"
+file="${2}"
+
+# Does the hash-file exist?
+if [ ! -f "${h_file}" ]; then
+    exit 0
+fi
+
+# Check one hash for a file
+# $1: known hash
+# $2: file (full path)
+check_one_hash() {
+    _h="${1}"
+    _known="${2}"
+    _file="${3}"
+
+    # Note: sha3 is not supported, since there is currently no implementation
+    #       (the NIST has yet to publish the parameters).
+    case "${_h}" in
+        md5|sha1)                       ;;
+        sha224|sha256|sha384|sha512)    ;;
+        *) # Unknown hash, exit with error
+            printf "ERROR: unknown hash '%s' for '%s'\n"  \
+                   "${_h}" "${_file##*/}" >&2
+            exit 1
+            ;;
+    esac
+
+    # Do the hashes match?
+    _hash=$( ${_h}sum "${_file}" |cut -d ' ' -f 1 )
+    if [ "${_hash}" = "${_known}" ]; then
+        printf "%s: OK (%s: %s)\n" "${_file##*/}" "${_h}" "${_hash}"
+        return 0
+    fi
+
+    printf "ERROR: %s has wrong %s hash:\n" "${_file##*/}" "${_h}" >&2
+    printf "ERROR: expected: %s\n" "${_known}" >&2
+    printf "ERROR: got     : %s\n" "${_hash}" >&2
+    printf "ERROR: Incomplete download, or man-in-the-middle (MITM) attack\n" >&2
+
+    exit 1
+}
+
+# Do we know one or more hashes for that file?
+nb_checks=0
+while read t h f; do
+    case "${t}" in
+        ''|'#'*)
+            # Skip comments and empty lines
+            continue
+            ;;
+        *)
+            if [ "${f}" = "${file##*/}" ]; then
+                check_one_hash "${t}" "${h}" "${file}"
+                : $((nb_checks++))
+            fi
+            ;;
+    esac
+done <"${h_file}"
+
+if [ ${nb_checks} -eq 0 ]; then
+    if [ -n "${BR2_ENFORCE_CHECK_HASH}" ]; then
+        printf "ERROR: No hash found for %s\n" "${file}" >&2
+        exit 1
+    else
+        printf "WARNING: No hash found for %s\n" "${file}" >&2
+    fi
+fi