Message ID | 20161021202715.16528-1-arnout@mind.be |
---|---|
State | Accepted |
Headers | show |
Hello, There is nothing named "SECONDARY" in Buildroot, it's called "BACKUP". This can be fixed when applying the patch. On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle (Essensium/Mind) wrote: > Some packages download files (especially patches) with a ? in the > URL. The ? marks the query part of the URL. However, the downloaded > file still contains the ? but from then on it doesn't designate a > query part anymore. Therefore, when fetching from PRIMARY or > SECONDARY site over http, the server will report a 404 Not Found. > > To fix, we need to replace the ? with %3F. Obviously, this should > be done only when fetching from PRIMARY or SECONDARY. For fetching > from the real upstream, the ? really does designate the query part. One thing I'm wondering is if the '?' character is the only one that needs to be escaped like this, or not. Thanks, Thomas
>>>>> "Thomas" == Thomas Petazzoni <thomas.petazzoni@free-electrons.com> writes: > Hello, > There is nothing named "SECONDARY" in Buildroot, it's called "BACKUP". > This can be fixed when applying the patch. > On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle > (Essensium/Mind) wrote: >> Some packages download files (especially patches) with a ? in the >> URL. The ? marks the query part of the URL. However, the downloaded >> file still contains the ? but from then on it doesn't designate a >> query part anymore. Therefore, when fetching from PRIMARY or >> SECONDARY site over http, the server will report a 404 Not Found. >> >> To fix, we need to replace the ? with %3F. Obviously, this should >> be done only when fetching from PRIMARY or SECONDARY. For fetching >> from the real upstream, the ? really does designate the query part. > One thing I'm wondering is if the '?' character is the only one that > needs to be escaped like this, or not. I believe it is, as we already quote the URLs correctly so they don't confuse the shell and wget afaik correctly URL encodes URL before requesting the location from the web server. It works here at least with lighttpd: echo ok > '/var/www/url?with&special!characters in' wget 'http://127.0.0.1/url?with&special!characters in' --2016-10-23 11:04:49-- http://127.0.0.1/url?with&special!characters%20in Connecting to 127.0.0.1:80... connected. HTTP request sent, awaiting response... 404 Not Found 2016-10-23 11:04:49 ERROR 404: Not Found. wget 'http://127.0.0.1/url%3fwith&special!characters in' --2016-10-23 11:04:27-- http://127.0.0.1/url%3fwith&special!characters%20in Connecting to 127.0.0.1:80... connected. HTTP request sent, awaiting response... 200 OK Length: 3 [application/octet-stream] Saving to: ‘url?with&special!characters in’ 2016-10-23 11:04:27 (316 KB/s) - ‘url?with&special!characters in’ saved [3/3] We have seen this problem on sources.buildroot.org as well. I have so far manually fixed it up by renaming the files to drop ?.., so that will need to be undone when this gets applied.
Hello, On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle (Essensium/Mind) wrote: > Some packages download files (especially patches) with a ? in the > URL. The ? marks the query part of the URL. However, the downloaded > file still contains the ? but from then on it doesn't designate a > query part anymore. Therefore, when fetching from PRIMARY or > SECONDARY site over http, the server will report a 404 Not Found. > > To fix, we need to replace the ? with %3F. Obviously, this should > be done only when fetching from PRIMARY or SECONDARY. For fetching > from the real upstream, the ? really does designate the query part. > > Fixes #9371. > > Reported-by: Johan Derycke <johanderycke@gmail.com> > Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be> > --- > I tested with > make BR2_PRIMARY_SITE=http://localhost/dl BR2_PRIMARY_SITE_ONLY=y BR2_DL_DIR=/tmp/dl openssl > and also checked with a file:// primary site, with secondary site, and > openssl-source-check > --- > package/pkg-download.mk | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) Applied to master, after s/SECONDARY/BACKUP/. Peter can you take the appropriate actions in sources.b.o ? Thanks! Thomas
On 22-10-16 15:04, Thomas Petazzoni wrote: > Hello, > > There is nothing named "SECONDARY" in Buildroot, it's called "BACKUP". > This can be fixed when applying the patch. > > On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle > (Essensium/Mind) wrote: >> Some packages download files (especially patches) with a ? in the >> URL. The ? marks the query part of the URL. However, the downloaded >> file still contains the ? but from then on it doesn't designate a >> query part anymore. Therefore, when fetching from PRIMARY or >> SECONDARY site over http, the server will report a 404 Not Found. >> >> To fix, we need to replace the ? with %3F. Obviously, this should >> be done only when fetching from PRIMARY or SECONDARY. For fetching >> from the real upstream, the ? really does designate the query part. > > One thing I'm wondering is if the '?' character is the only one that > needs to be escaped like this, or not. From wikipedia, an URL is: scheme:[//[user:password@]host[:port]][/]path[?query][#fragment] so after the path, only ? and # are interpreted differently between the normal download and the PRIMARY/BACKUP download. You might think that & (or ;) is also special, but those characters are only special when they appear _after_ '?'. Just to be sure I tested a ?...&... scenario (just adding \&foo after one of the openssl patches) and with this patch it works correctly both for normal and PRIMARY downloads. [1] As to #fragment, that's very simple: it can never occur because it's interpreted by the client and doesn't get sent to the server. If a # ends up in a URL somehow, it already has to be escaped anyway or it won't get passed by wget. Note: I haven't tested a PRIMARY that has scp or file download method, but those are not affected by this patch anyway. Regards, Arnout [1] Well, actually it doesn't work at all, our download handling fails dramatically when there is a & in the URL. I tried fixing it by adding quoting here and there, but it's not exactly trivial. Since we don't have URLs with & and I don't expect any, I'm not continuing work on this. But at least using our dl-wrapper works for a URL like ...%3F...&...
On 23-10-16 14:54, Thomas Petazzoni wrote: > Hello, > > On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle > (Essensium/Mind) wrote: >> Some packages download files (especially patches) with a ? in the >> URL. The ? marks the query part of the URL. However, the downloaded >> file still contains the ? but from then on it doesn't designate a >> query part anymore. Therefore, when fetching from PRIMARY or >> SECONDARY site over http, the server will report a 404 Not Found. >> >> To fix, we need to replace the ? with %3F. Obviously, this should >> be done only when fetching from PRIMARY or SECONDARY. For fetching >> from the real upstream, the ? really does designate the query part. >> >> Fixes #9371. >> >> Reported-by: Johan Derycke <johanderycke@gmail.com> >> Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be> >> --- >> I tested with >> make BR2_PRIMARY_SITE=http://localhost/dl BR2_PRIMARY_SITE_ONLY=y BR2_DL_DIR=/tmp/dl openssl >> and also checked with a file:// primary site, with secondary site, and >> openssl-source-check >> --- >> package/pkg-download.mk | 8 ++++++-- >> 1 file changed, 6 insertions(+), 2 deletions(-) > > Applied to master, after s/SECONDARY/BACKUP/. > > Peter can you take the appropriate actions in sources.b.o ? There's nothing to do AFAICS. I did test a download of the openssl patches from sources.buildroot.net and that worked. Regards, Arnout
>>>>> "Thomas" == Thomas Petazzoni <thomas.petazzoni@free-electrons.com> writes: Hi, >> package/pkg-download.mk | 8 ++++++-- >> 1 file changed, 6 insertions(+), 2 deletions(-) > Applied to master, after s/SECONDARY/BACKUP/. > Peter can you take the appropriate actions in sources.b.o ? Yes. From a quick look it seems I copied files instead of renaming them, so everything should be good (I'll leave the copies for older BR versions).
diff --git a/package/pkg-download.mk b/package/pkg-download.mk index 0f542e6..65e753c 100644 --- a/package/pkg-download.mk +++ b/package/pkg-download.mk @@ -210,6 +210,10 @@ endef # # E.G. use like this: # $(call DOWNLOAD,$(FOO_SITE)) +# +# For PRIMARY and SECONDARY site, any ? in the URL is replaced by %3F. A ? in +# the URL is used to separate query arguments, but the PRIMARY and SECONDARY +# sites serve just plain files. ################################################################################ define DOWNLOAD @@ -226,7 +230,7 @@ define DOWNLOAD_INNER case "$(call geturischeme,$(BR2_PRIMARY_SITE))" in \ file) $(call $(3)_LOCALFILES,$(BR2_PRIMARY_SITE)/$(2),$(2)) && exit ;; \ scp) $(call $(3)_SCP,$(BR2_PRIMARY_SITE)/$(2),$(2)) && exit ;; \ - *) $(call $(3)_WGET,$(BR2_PRIMARY_SITE)/$(2),$(2)) && exit ;; \ + *) $(call $(3)_WGET,$(BR2_PRIMARY_SITE)/$(subst ?,%3F,$(2)),$(2)) && exit ;; \ esac ; \ fi ; \ if test "$(BR2_PRIMARY_SITE_ONLY)" = "y" ; then \ @@ -245,7 +249,7 @@ define DOWNLOAD_INNER esac ; \ fi ; \ if test -n "$(call qstrip,$(BR2_BACKUP_SITE))" ; then \ - $(call $(3)_WGET,$(BR2_BACKUP_SITE)/$(2),$(2)) && exit ; \ + $(call $(3)_WGET,$(BR2_BACKUP_SITE)/$(subst ?,%3F,$(2)),$(2)) && exit ; \ fi ; \ exit 1 endef
Some packages download files (especially patches) with a ? in the URL. The ? marks the query part of the URL. However, the downloaded file still contains the ? but from then on it doesn't designate a query part anymore. Therefore, when fetching from PRIMARY or SECONDARY site over http, the server will report a 404 Not Found. To fix, we need to replace the ? with %3F. Obviously, this should be done only when fetching from PRIMARY or SECONDARY. For fetching from the real upstream, the ? really does designate the query part. Fixes #9371. Reported-by: Johan Derycke <johanderycke@gmail.com> Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be> --- I tested with make BR2_PRIMARY_SITE=http://localhost/dl BR2_PRIMARY_SITE_ONLY=y BR2_DL_DIR=/tmp/dl openssl and also checked with a file:// primary site, with secondary site, and openssl-source-check --- package/pkg-download.mk | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)