[v3,06/10] package/pkg-cargo.mk: Introduce the cargo dl backend

Message ID 20200220160119.3407-6-patrick.havelange@essensium.com
State Superseded
Series [v3,01/10] package/pkg-cargo.mk: Introduce the cargo package infrastructure.

Commit Message

Patrick Havelange Feb. 20, 2020, 4:01 p.m. UTC
Cargo is now a fully supported PKGMGR, automatically set for any
package using the cargo infrastructure.
This effectively splits the download phase and the build phase.

The cargo backend sets CARGO_HOME inside DL_DIR during download.
The files cached in CARGO_HOME allow the crates index and each
dependency to be downloaded only once.
During the download phase, it calls cargo vendor to copy the
dependencies into the VENDOR directory inside the source archive.

A local cargo config is also inserted into the archive so that the
VENDOR directory is used during the build phase.
The build phase is forced not to query the online repository and
thus uses the vendored dependencies from the tarball.
This also allows offline builds.

Signed-off-by: Patrick Havelange <patrick.havelange@essensium.com>
---
 package/pkg-cargo.mk         | 10 ++++--
 package/ripgrep/ripgrep.hash |  2 +-
 support/download/cargo       | 65 ++++++++++++++++++++++++++++++++++++
 support/download/dl-wrapper  |  2 +-
 4 files changed, 75 insertions(+), 4 deletions(-)
 create mode 100755 support/download/cargo

Comments

Thomas Petazzoni Aug. 26, 2020, 7:22 p.m. UTC | #1
Hello,

On Thu, 20 Feb 2020 17:01:15 +0100
Patrick Havelange <patrick.havelange@essensium.com> wrote:

> Cargo is now a fully supported PKGMGR, automatically set for any
> package using the cargo infrastructure.
> This effectively splits the download phase and the build phase.
> 
> The cargo backend will set CARGO_HOME inside DL_DIR during download.
> The cached files in CARGO_HOME permits to download only once the
> crates index and each dependencies.
> During download phase, it will call cargo vendor to copy the
> dependencies inside the VENDOR directory inside the source archive.
> 
> A local cargo config is also inserted inside the archive in order
> to use the VENDOR dir during the build phase.
> The build phase is forced to not query the online repository anymore
> and thus will be using the vendored dependencies from the tarball.
> This also permits to have offline builds.
> 
> Signed-off-by: Patrick Havelange <patrick.havelange@essensium.com>

We looked at this patch series during the previous (virtual) Buildroot
Developers Meeting in late July, and we forgot to give some feedback.
I'll provide the raw IRC logs of the discussion we had. However, I see
there has been more discussion tonight on the IRC channel on this topic.

[13:54:34] <kos_tom> wow, wow this series https://patchwork.ozlabs.org/project/buildroot/list/?series=159771 has some pretty weird download infrastructure changes
[13:55:17] <y_morin> Yes, it scares me quite a bit...
[13:55:33] <y_morin> However, I think we already discussed that here some time ago.
[13:55:52] <y_morin> The proposal I did was:
[13:56:19] <y_morin> 1) introduce a cargo download backend
[13:56:47] <y_morin> 2) have the cargo backend call the real backend (git, wget, etc...) for the cargo package,
[13:57:21] <y_morin> 3) have the cargo backend call "cargo bundle" to complete the package archive.
[13:57:26] <y_morin> kos_tom: ^^^
[13:59:50] <kos_tom> y_morin: I'm not sure how the "cargo backend" can call the "real backend"
[14:00:10] <kos_tom> y_morin: what the patch series does is call "cargo vendor" to download all the dependencies
[14:00:34] <kos_tom> and then it makes a tarball of the whole thing, i.e the actual package source code + all its cargo dependencies
[14:00:48] <kos_tom> and that's what you get as a tarball for this package in DL_DIR
[14:01:09] <y_morin> kos_tom: It is easy for the cargo backend to call the adequate "real" backend:  "${0%/*}/wget"
[14:01:30] <y_morin> We "just" need to be able to pass that info to the cargo backend
[14:01:32] <kos_tom> y_morin: the cargo backend does not do the download of dependencies by itself
[14:01:37] <kos_tom> it calls the "cargo vendor" command
[14:01:55] <y_morin> kos_tom: I don't yet care about the dependencies. I care about the main download for now.
[14:02:15] <y_morin> kos_tom: Hold on a sec, I'll post a simple script that shows that...
[14:02:46] <kos_tom> y_morin: look at the patch series
[14:02:52] <kos_tom> y_morin: it does use *two* backends
[14:02:56] <kos_tom> it uses the normal backend for the main download
[14:03:04] <y_morin> kos_tom: I know.
[14:03:08] <kos_tom> + an extra secondary backend, called "cargo" to download the dependencies
[14:03:11] <y_morin> And I am not OK with that.
[14:03:23] <y_morin> I am proposing something else.
[14:03:40] <y_morin> I.e. to invert the logic
[14:03:48] <kos_tom> yeah, you are proposing that the "cargo" backend takes care of both the main download and the dependencies download
[14:03:57] <kos_tom> and not have this concept of "secondary backend"
[14:04:02] <y_morin> Exactly.
[14:04:05] <kos_tom> Ok.
[14:04:15] <kos_tom> the question is: do we want this kind of download integration at all?
[14:04:22] <kos_tom> if we do, does it work for NodeJS ? PHP ?
[14:05:40] <y_morin> kos_tom: https://pastebin.com/c3K42Lb7
[14:06:03] <y_morin> kos_tom: That is also the idea. For go, it would be very similar too.
[14:06:19] <kos_tom> y_morin: but how do you pass the "real backend" if the backend is "cargo" ?
[14:06:22] <y_morin> And for npm and php, afaiu (which is not much), as well
[14:06:28] <kos_tom> i.e if you have a cargo package to fetch from Git, and another from https
[14:07:06] <y_morin> kos_tom: Pretty easy. In the cargo infra:    FOO_SITE_METHOD = cargo  FOO_EXTRA_DLOPTS += --backend=$(whatever)
[14:07:22] <y_morin> kos_tom: Hmm...
[14:07:39] <y_morin> you mean "cargo vendor" can fetch from various locations?
[14:08:07] <y_morin> I meant: from various locations using various methods?
[14:08:21] <y_morin> If that is so, then my proposal is not good, indeed.
[14:08:55] <kos_tom> y_morin: no I meant the primary download
[14:09:11] <kos_tom> y_morin: ie. for the ripgrep package, the actual download of the ripgrep code
[14:09:44] <kos_tom> of the cargo series, I'm wondering if I should apply patches 1/2/3, which create the package infra, but not the download stuff
[14:09:53] <y_morin> kos_tom: Ah, but the primary download is exactly that: FOO_EXTRA_DLOPTS += --backend=$(geturischeme $(FOO_SITE))
[14:10:08] <kos_tom> my main concern is that we have a single package that would use this infra right now, which doesn't make merging such a simple infra very convincing.
[14:10:30] <kos_tom> y_morin: instead of an extra backend, can we do with a post-download hook ?
[14:10:54] <y_morin> kos_tom: I had a good reason not to use such a hook.
[14:11:06] <y_morin> kos_tom: I would have to dig my irc logs...
[14:11:54] <y_morin> kos_tom: Ah yes, I know...
[14:12:19] <y_morin> kos_tom: If you use a post-dl hook, it means the archive with just the main package is already in $(FOO_DL_DIR)
[15:12:14] <kos_tom> y_morin: so can we try to conclude on the cargo download thing?
[15:13:08] <y_morin> kos_tom: I provided some info above, not sure you saw it...
[15:13:29] <y_morin> kos_tom: If you use a post-dl hook, it means the archive with just the main package is already in $(FOO_DL_DIR)
[15:13:47] <kos_tom> y_morin: but is that a problem ?
[15:14:00] <kos_tom> I guess it is indeed simpler to have a single tarball that has the source code for everything, including dependencies
[15:14:05] <y_morin> kos_tom: Yes it is, for a subsequent build: it would not match the hashes.
[15:14:35] <kos_tom> are the hashes anyway going to match when you see (from the "bat" example being discussed with Romain) that dependency versions anyway can change
[15:14:40] <y_morin> kos_tom: That is the idea that the archive contains the main package *and* all its dependencies.
[15:14:56] <y_morin> For those package managers that allow pinning the version, we can get hashes of the archive, then

Best regards,

Thomas
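
The inverted logic y_morin proposes in the log above (a single "cargo" backend that delegates the main download to the real backend through "${0%/*}/<backend>", selected with a --backend= option) can be sketched roughly as below. This is only an illustration under assumptions: the contents of the pastebin are not reproduced here, the function names sibling_backend and parse_backend are hypothetical, and the --backend= option is the one suggested in the discussion, not an existing dl-wrapper option.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of how a "cargo" download backend could find and
# call the "real" backend (git, wget, ...) that sits next to it in
# support/download/. Names and option handling are assumptions.

# Print the path of a sibling backend of the script given as $1.
# "${self%/*}" strips the last path component, as in "${0%/*}/wget".
# Assumes the script was invoked through a path that contains a slash.
sibling_backend() {
    local self="$1" name="$2"
    printf '%s/%s\n' "${self%/*}" "${name}"
}

# Print NAME from the first "--backend=NAME" argument.
# Returns non-zero when no such argument is present.
parse_backend() {
    local arg
    for arg in "$@"; do
        case "${arg}" in
            --backend=*) printf '%s\n' "${arg#--backend=}"; return 0 ;;
        esac
    done
    return 1
}

# Intended flow inside such a backend (not executed in this sketch):
#   real="$(sibling_backend "$0" "$(parse_backend "$@")")"
#   "${real}" <usual backend arguments>   # 1) main download
#   cargo vendor ...                      # 2) complete the archive
```

With this layout the package infrastructure would only need to pass the primary method along, for instance via the FOO_EXTRA_DLOPTS line quoted in the log, and no "secondary backend" concept is needed in dl-wrapper.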
Patch

diff --git a/package/pkg-cargo.mk b/package/pkg-cargo.mk
index 35f7c15ad9..c084f7b35e 100644
--- a/package/pkg-cargo.mk
+++ b/package/pkg-cargo.mk
@@ -39,6 +39,8 @@  define inner-cargo-package
 # We need host-rustc to run cargo
 $(2)_DEPENDENCIES += host-rustc
 
+$(2)_PKGMGR = cargo\|
+
 $(2)_CARGO_ENV = \
 	CARGO_HOME=$(HOST_DIR)/share/cargo \
 	$(TARGET_CONFIGURE_OPTS)
@@ -55,11 +57,13 @@  endif
 #
 ifndef $(2)_BUILD_CMDS
 define $(2)_BUILD_CMDS
+	cd $$(@D) && \
 	$(TARGET_MAKE_ENV) $$($(2)_CARGO_ENV) \
 		cargo build \
 			--$$($(2)_CARGO_MODE) \
 			$$($(2)_CARGO_TARGET_OPT) \
-			--manifest-path $$(@D)/Cargo.toml
+			--manifest-path Cargo.toml \
+			--offline
 endef
 endif
 
@@ -69,12 +73,14 @@  endif
 #
 ifndef $(2)_INSTALL_TARGET_CMDS
 define $(2)_INSTALL_TARGET_CMDS
+	cd $$(@D) && \
 	$(TARGET_MAKE_ENV) $$($(2)_CARGO_ENV) \
 		cargo install \
 			--root $(TARGET_DIR)/usr/ \
 			--bins \
-			--path $$(@D) \
+			--path ./ \
 			$$($(2)_CARGO_TARGET_OPT) \
+			--offline \
 			--force
 endef
 endif
diff --git a/package/ripgrep/ripgrep.hash b/package/ripgrep/ripgrep.hash
index 0841c0185c..8c48458cd8 100644
--- a/package/ripgrep/ripgrep.hash
+++ b/package/ripgrep/ripgrep.hash
@@ -1,3 +1,3 @@ 
 # Locally calculated
-sha256 7035379fce0c1e32552e8ee528b92c3d01b8d3935ea31d26c51a73287be74bb3 ripgrep-0.8.1.tar.gz
+sha256 cb895cff182740c219fefbbaaf903e60f005a4ac4688cac1e43e5fbbbccc5a94 ripgrep-0.8.1.tar.gz
 sha256 0f96a83840e146e43c0ec96a22ec1f392e0680e6c1226e6f3ba87e0740af850f LICENSE-MIT
diff --git a/support/download/cargo b/support/download/cargo
new file mode 100755
index 0000000000..0ce94cf16b
--- /dev/null
+++ b/support/download/cargo
@@ -0,0 +1,65 @@ 
+#!/usr/bin/env bash
+
+# We want to catch any unexpected failure, and exit immediately
+set -e
+
+# Download helper for cargo package. It will populate the VENDOR directory
+# inside the archive with the dependencies of the package
+#
+# Arguments are in this order:
+# $1 mandatory fullpath to an already downloaded source file of a cargo package
+# $2 mandatory fullpath to a temp dir
+# $3 mandatory fullpath to the DL_DIR
+#
+# Environment:
+# cargo in PATH
+
+tmpd=""
+
+do_clean() {
+    if [ -n "${tmpd}" ]; then
+        rm -rf "${tmpd}"
+    fi
+}
+
+trap do_clean EXIT
+
+if [ $# -le 2 ] ; then
+    echo 'Need at least 3 arguments' >&2
+    exit 1
+fi;
+
+tmpd="$(mktemp -d -p "$2")"
+cd "${tmpd}"
+
+tar xf "${1}"
+cd ./*/
+echo "Running cargo vendor."
+CARGO_HOME="${3}"/cargo_home cargo vendor -q --locked VENDOR
+# Create the local .cargo/config with vendor info
+mkdir -p .cargo/
+cat <<EOF >.cargo/config
+[source.crates-io]
+replace-with = "vendored-sources"
+
+[source.vendored-sources]
+directory = "VENDOR"
+EOF
+
+cd ..
+
+# Generate the archive, sort with the C locale so that it is reproducible.
+find "$(basename "$OLDPWD")" -not -type d -print0 >files.list
+LC_ALL=C sort -z <files.list >files.list.sorted
+# let's use a fixed hardcoded date to be reproducible
+date="2020-02-06 01:02:03"
+
+# Create GNU-format tarballs, since that's the format of the tarballs on
+# sources.buildroot.org and used in the *.hash files
+echo "Creating final archive."
+tar cf new.tar --null --verbatim-files-from --numeric-owner --format=gnu \
+    --owner=0 --group=0 --mtime="${date}" -T files.list.sorted
+gzip -6 -n <new.tar >new.tar.gz
+mv "${1}" "${1}".old
+mv new.tar.gz "${1}"
+rm "${1}".old
diff --git a/support/download/dl-wrapper b/support/download/dl-wrapper
index 3f613bb622..5e52b3e60f 100755
--- a/support/download/dl-wrapper
+++ b/support/download/dl-wrapper
@@ -98,7 +98,7 @@  main() {
             case "${b}" in
                 urlencode) urlencode="${b}" ;;
                 git|svn|cvs|bzr|file|scp|hg) backend="${b}" ;;
-                # insert here supported second backends) backend2="${b}" ;;
+                cargo) backend2="${b}" ;;
             esac
         done
         uri=${uri#*+}
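
The archive step of support/download/cargo above relies on a sorted file list plus fixed owner, group and mtime so that the resulting tarball hashes identically on every run. A minimal sketch of that idea, assuming GNU tar, GNU sort and sha256sum are available, is shown below. The function name repro_tarball_sha256 is hypothetical. The sketch also appends "UTC" to the date as a precaution, since as far as I know GNU tar reads a date without a zone in local time; the patch itself does not do this.

```shell
#!/usr/bin/env bash
# Sketch: pack directory $2 (relative to $1) the way the cargo helper
# does, and print the sha256 of the resulting .tar.gz.
# Requires GNU tar for --verbatim-files-from. Function name is hypothetical.
repro_tarball_sha256() {
    local parent="$1" dir="$2" list out
    list="$(mktemp)"
    out="$(mktemp)"
    (
        cd "${parent}" || exit 1
        # NUL-separated file names, sorted in the C locale for a stable order.
        find "${dir}" -not -type d -print0 | LC_ALL=C sort -z >"${list}"
        # Fixed owner, group and mtime; "UTC" is an addition of this sketch.
        # gzip -n leaves the original name and timestamp out of the header.
        tar cf - --null --verbatim-files-from --numeric-owner --format=gnu \
            --owner=0 --group=0 --mtime="2020-02-06 01:02:03 UTC" \
            -T "${list}" | gzip -6 -n >"${out}"
    )
    sha256sum "${out}" | cut -d' ' -f1
    rm -f "${list}" "${out}"
}
```

Calling the function twice on an unchanged tree should print the same hash even if file timestamps on disk differ, while adding a file under VENDOR changes it. That second effect is why ripgrep.hash has to be updated in this patch, and why, as noted in the IRC log, an archive holding only the main package would not match the hash of the completed one.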