Message ID | 1437570442-30203-2-git-send-email-rjones@redhat.com |
---|---|
State | New |
Headers | show |
On Wed, Jul 22, 2015 at 02:07:22PM +0100, Richard W.M. Jones wrote: > On some (but not all) systems: > > $ qemu-img create -f qcow2 overlay -b ssh://xen/ > Segmentation fault > > It turns out this happens when inet_connect returns -1 in the > following code, but errno == 0. > > s->sock = inet_connect(s->hostport, errp); > if (s->sock < 0) { > ret = -errno; > goto err; > } > > In the test case above, no host called "xen" exists, so getaddrinfo fails. > > On Fedora 22, getaddrinfo happens to set errno = ENOENT (although it > is *not* documented to do that), so it doesn't segfault. > > On RHEL 7, errno is not set by the failing getaddrinfo, so ret = > -errno = 0, so the caller doesn't know there was an error and > continues with a half-initialized BDRVSSHState struct, and everything > goes south from there, eventually resulting in a segfault. > > Fix this by setting ret to -EINVAL. The real error is saved in the > Error** errp struct, so it is printed correctly: > > $ ./qemu-img create -f qcow2 overlay -b ssh://xen/ > qemu-img: overlay: address resolution failed for xen:22: No address associated with hostname > > Signed-off-by: Richard W.M. Jones <rjones@redhat.com> > Reported-by: Jun Li > BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1147343 > --- > block/ssh.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/ssh.c b/block/ssh.c > index aebb18c..8d4dc2a 100644 > --- a/block/ssh.c > +++ b/block/ssh.c > @@ -563,7 +563,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options, > /* Open the socket and connect. */ > s->sock = inet_connect(s->hostport, errp); > if (s->sock < 0) { > - ret = -errno; > + ret = -EINVAL; > goto err; There are a reasonable number of other uses of inet_connect() in QEMU, so can't we fix inet_connect() itself to set EINVAL in the error case instead of just fixing one caller. Regards, Daniel
On Wed, Jul 22, 2015 at 02:10:51PM +0100, Daniel P. Berrange wrote: > There are a reasonable number of other uses of inet_connect() in QEMU, > so can't we fix inet_connect() itself to set EINVAL in the error case > instead of just fixing one caller. The only users I can find are block/nbd.c and block/sheepdog.c, and neither of those is expecting errno to be set. So I think my use of errno in block/ssh.c was flat out wrong, although it happened to work most of the time. Rich.
On Wed, Jul 22, 2015 at 02:07:22PM +0100, Richard W.M. Jones wrote: > On some (but not all) systems: > > $ qemu-img create -f qcow2 overlay -b ssh://xen/ > Segmentation fault > > It turns out this happens when inet_connect returns -1 in the > following code, but errno == 0. > > s->sock = inet_connect(s->hostport, errp); > if (s->sock < 0) { > ret = -errno; > goto err; > } > > In the test case above, no host called "xen" exists, so getaddrinfo fails. > > On Fedora 22, getaddrinfo happens to set errno = ENOENT (although it > is *not* documented to do that), so it doesn't segfault. > > On RHEL 7, errno is not set by the failing getaddrinfo, so ret = > -errno = 0, so the caller doesn't know there was an error and > continues with a half-initialized BDRVSSHState struct, and everything > goes south from there, eventually resulting in a segfault. > > Fix this by setting ret to -EINVAL. The real error is saved in the > Error** errp struct, so it is printed correctly: > > $ ./qemu-img create -f qcow2 overlay -b ssh://xen/ > qemu-img: overlay: address resolution failed for xen:22: No address associated with hostname > > Signed-off-by: Richard W.M. Jones <rjones@redhat.com> > Reported-by: Jun Li > BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1147343 > --- > block/ssh.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/ssh.c b/block/ssh.c > index aebb18c..8d4dc2a 100644 > --- a/block/ssh.c > +++ b/block/ssh.c > @@ -563,7 +563,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options, > /* Open the socket and connect. */ > s->sock = inet_connect(s->hostport, errp); > if (s->sock < 0) { > - ret = -errno; > + ret = -EINVAL; Both nbd and sheepdog handle it in a similar fashion (i.e. not relying on errno being set on inet_connect failure). However, both nbd and sheepdog use -EIO as the error return, and I think that makes more sense. What do you think? > goto err; > } > > -- > 2.4.3 >
On Wed, Jul 22, 2015 at 09:17:49AM -0400, Jeff Cody wrote: > Both nbd and sheepdog handle it in a similar fashion (i.e. not relying > on errno being set on inet_connect failure). However, both nbd and > sheepdog use -EIO as the error return, and I think that makes more > sense. What do you think? If you prefer -- I don't mind as long as it doesn't segfault :-) Rich.
Am 22.07.2015 um 15:10 hat Daniel P. Berrange geschrieben: > On Wed, Jul 22, 2015 at 02:07:22PM +0100, Richard W.M. Jones wrote: > > On some (but not all) systems: > > > > $ qemu-img create -f qcow2 overlay -b ssh://xen/ > > Segmentation fault > > > > It turns out this happens when inet_connect returns -1 in the > > following code, but errno == 0. > > > > s->sock = inet_connect(s->hostport, errp); > > if (s->sock < 0) { > > ret = -errno; > > goto err; > > } > > > > In the test case above, no host called "xen" exists, so getaddrinfo fails. > > > > On Fedora 22, getaddrinfo happens to set errno = ENOENT (although it > > is *not* documented to do that), so it doesn't segfault. > > > > On RHEL 7, errno is not set by the failing getaddrinfo, so ret = > > -errno = 0, so the caller doesn't know there was an error and > > continues with a half-initialized BDRVSSHState struct, and everything > > goes south from there, eventually resulting in a segfault. > > > > Fix this by setting ret to -EINVAL. The real error is saved in the > > Error** errp struct, so it is printed correctly: > > > > $ ./qemu-img create -f qcow2 overlay -b ssh://xen/ > > qemu-img: overlay: address resolution failed for xen:22: No address associated with hostname > > > > Signed-off-by: Richard W.M. Jones <rjones@redhat.com> > > Reported-by: Jun Li > > BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1147343 > > --- > > block/ssh.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/block/ssh.c b/block/ssh.c > > index aebb18c..8d4dc2a 100644 > > --- a/block/ssh.c > > +++ b/block/ssh.c > > @@ -563,7 +563,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options, > > /* Open the socket and connect. */ > > s->sock = inet_connect(s->hostport, errp); > > if (s->sock < 0) { > > - ret = -errno; > > + ret = -EINVAL; > > goto err; > > There are a reasonable number of other uses of inet_connect() in QEMU, > so can't we fix inet_connect() itself to set EINVAL in the error case > instead of just fixing one caller. None of the other callers try to use errno, which isn't even documented as part of the inet_connect() interface. So I think setting errno inside inet_connect() would be wrong. If we were to change anything, we might consider returning -EINVAL instead of -1, though I'm not sure how useful that change would be. We would still have to fix this call in the ssh block driver. Kevin
On Wed, Jul 22, 2015 at 02:22:18PM +0100, Richard W.M. Jones wrote: > On Wed, Jul 22, 2015 at 09:17:49AM -0400, Jeff Cody wrote: > > Both nbd and sheepdog handle it in a similar fashion (i.e. not relying > > on errno being set on inet_connect failure). However, both nbd and > > sheepdog use -EIO as the error return, and I think that makes more > > sense. What do you think? > > If you prefer -- I don't mind as long as it doesn't segfault :-) > I think I would - mainly for consistency, but also because -EIO better describes all the various failures that may occur in inet_connect(). Thanks! Jeff
diff --git a/block/ssh.c b/block/ssh.c index aebb18c..8d4dc2a 100644 --- a/block/ssh.c +++ b/block/ssh.c @@ -563,7 +563,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options, /* Open the socket and connect. */ s->sock = inet_connect(s->hostport, errp); if (s->sock < 0) { - ret = -errno; + ret = -EINVAL; goto err; }
On some (but not all) systems: $ qemu-img create -f qcow2 overlay -b ssh://xen/ Segmentation fault It turns out this happens when inet_connect returns -1 in the following code, but errno == 0. s->sock = inet_connect(s->hostport, errp); if (s->sock < 0) { ret = -errno; goto err; } In the test case above, no host called "xen" exists, so getaddrinfo fails. On Fedora 22, getaddrinfo happens to set errno = ENOENT (although it is *not* documented to do that), so it doesn't segfault. On RHEL 7, errno is not set by the failing getaddrinfo, so ret = -errno = 0, so the caller doesn't know there was an error and continues with a half-initialized BDRVSSHState struct, and everything goes south from there, eventually resulting in a segfault. Fix this by setting ret to -EINVAL. The real error is saved in the Error** errp struct, so it is printed correctly: $ ./qemu-img create -f qcow2 overlay -b ssh://xen/ qemu-img: overlay: address resolution failed for xen:22: No address associated with hostname Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Reported-by: Jun Li BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1147343 --- block/ssh.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)