mbox

[PULL,0/2] ppc-for-3.0 queue 20180801

Message ID 20180801035357.7804-1-david@gibson.dropbear.id.au
State New
Headers show

Pull-request

git://github.com/dgibson/qemu.git tags/ppc-for-3.0-20180801

Message

David Gibson Aug. 1, 2018, 3:53 a.m. UTC
The following changes since commit f7502360397d291be04bc040e9f96c92ff2d8030:

  Update version for v3.0.0-rc3 release (2018-07-31 19:30:17 +0100)

are available in the Git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-3.0-20180801

for you to fetch changes up to 6484ab3dffadc79020a71376010f517d60b81b83:

  sam460ex: Fix PCI interrupts with multiple devices (2018-08-01 11:01:38 +1000)

----------------------------------------------------------------
ppc patch queue for 2018-08-01

Here are a final couple of fixes for the 3.0 release.

----------------------------------------------------------------
BALATON Zoltan (1):
      sam460ex: Fix PCI interrupts with multiple devices

Thomas Huth (1):
      hw/misc/macio: Fix device introspection problems in macio devices

 hw/misc/macio/cuda.c  |  5 ++---
 hw/misc/macio/macio.c | 24 ++++++++----------------
 hw/misc/macio/pmu.c   |  5 ++---
 hw/ppc/ppc440_pcix.c  | 21 ++++++++-------------
 hw/ppc/sam460ex.c     |  6 ++----
 5 files changed, 22 insertions(+), 39 deletions(-)

Comments

Peter Maydell Aug. 1, 2018, 10:11 a.m. UTC | #1
On 1 August 2018 at 04:53, David Gibson <david@gibson.dropbear.id.au> wrote:
> The following changes since commit f7502360397d291be04bc040e9f96c92ff2d8030:
>
>   Update version for v3.0.0-rc3 release (2018-07-31 19:30:17 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/dgibson/qemu.git tags/ppc-for-3.0-20180801
>
> for you to fetch changes up to 6484ab3dffadc79020a71376010f517d60b81b83:
>
>   sam460ex: Fix PCI interrupts with multiple devices (2018-08-01 11:01:38 +1000)
>
> ----------------------------------------------------------------
> ppc patch queue for 2018-08-01
>
> Here are a final couple of fixes for the 3.0 release.

So, we've just put out rc3, which in an ideal world is our
final release candidate for 3.0. Are these bugs regressions from
2.12 ?

thanks
-- PMM
BALATON Zoltan Aug. 1, 2018, 11:24 a.m. UTC | #2
On Wed, 1 Aug 2018, Peter Maydell wrote:
> On 1 August 2018 at 04:53, David Gibson <david@gibson.dropbear.id.au> wrote:
>> The following changes since commit f7502360397d291be04bc040e9f96c92ff2d8030:
>>
>>   Update version for v3.0.0-rc3 release (2018-07-31 19:30:17 +0100)
>>
>> are available in the Git repository at:
>>
>>   git://github.com/dgibson/qemu.git tags/ppc-for-3.0-20180801
>>
>> for you to fetch changes up to 6484ab3dffadc79020a71376010f517d60b81b83:
>>
>>   sam460ex: Fix PCI interrupts with multiple devices (2018-08-01 11:01:38 +1000)
>>
>> ----------------------------------------------------------------
>> ppc patch queue for 2018-08-01
>>
>> Here are a final couple of fixes for the 3.0 release.
>
> So, we've just put out rc3, which in an ideal world is our
> final release candidate for 3.0. Are these bugs regressions from
> 2.12 ?

I don't know about the macio one but the sam460ex PCI interrupts were 
broken in 2.12 too. However it's a fix for a device only used in sam460ex 
which is now fixed by this patch so including it is not high risk for 
breaking anything else than sam460ex which is known to be not finished yet 
so I would not worry too much. But which is better? Releasing 3.0 with a 
known bug or including this fix without an rc4? In case of machines that 
are in QEMU for a long time, widely used and other infrastructure is 
dependent upon it's good to have strict rules to avoid regressions but for 
the newly added sam460ex which is really only expected to be first usable 
in 3.0 and not many people depend on it yet delaying a fix for a known 
problem until December just to follow a rule looks counterproductive. So I 
would say it's OK to include this fix but skip rc4 and if it makes 
sam460ex worse than it is now then at worst we would have a broken 3.0 
that will need to be fixed in next release. But we have a good chance to 
release a better 3.0 as opposed to not include the fix and release a known 
broken 3.0 that surely will have to be fixed in next release. (This fix 
may improve sound and network support so it would be good to have it in 
3.0. A lot of potential users can only use release and rc versions because 
they can't build QEMU from source so they will only get this fix in next 
release otherwise.)

But if it's not possible to have this in 3.0 without rc4 and there won't 
be other changes needing an rc4 we can live with this not getting in 3.0 
but due to the above it might delay this for the general public by a few 
months.

Regards,
BALATON Zoltan
Peter Maydell Aug. 1, 2018, 1:04 p.m. UTC | #3
On 1 August 2018 at 12:24, BALATON Zoltan <balaton@eik.bme.hu> wrote:
> On Wed, 1 Aug 2018, Peter Maydell wrote:
>> So, we've just put out rc3, which in an ideal world is our
>> final release candidate for 3.0. Are these bugs regressions from
>> 2.12 ?
>
>
> I don't know about the macio one but the sam460ex PCI interrupts were broken
> in 2.12 too. However it's a fix for a device only used in sam460ex which is
> now fixed by this patch so including it is not high risk for breaking
> anything else than sam460ex which is known to be not finished yet so I would
> not worry too much. But which is better? Releasing 3.0 with a known bug or
> including this fix without an rc4?

The problem with continuing to delay 3.0 while we have known bugs
is that bugs generally come in at an even rate, so we *always*
have known bugs, and so "we found another bug, let's delay 3.0
again to put in a fix for it" is a recipe for never doing a release.
That's why we gradually wind up the bar for "should this go in",
from "any bug" to "regressions" to "really really serious showstopper
regressions".

We never do a final release without a last rc (it is too risky),
so that is not an option.

thanks
-- PMM
David Gibson Aug. 2, 2018, 7:08 a.m. UTC | #4
On Wed, Aug 01, 2018 at 02:04:04PM +0100, Peter Maydell wrote:
> On 1 August 2018 at 12:24, BALATON Zoltan <balaton@eik.bme.hu> wrote:
> > On Wed, 1 Aug 2018, Peter Maydell wrote:
> >> So, we've just put out rc3, which in an ideal world is our
> >> final release candidate for 3.0. Are these bugs regressions from
> >> 2.12 ?
> >
> >
> > I don't know about the macio one but the sam460ex PCI interrupts were broken
> > in 2.12 too. However it's a fix for a device only used in sam460ex which is
> > now fixed by this patch so including it is not high risk for breaking
> > anything else than sam460ex which is known to be not finished yet so I would
> > not worry too much. But which is better? Releasing 3.0 with a known bug or
> > including this fix without an rc4?
> 
> The problem with continuing to delay 3.0 while we have known bugs
> is that bugs generally come in at an even rate, so we *always*
> have known bugs, and so "we found another bug, let's delay 3.0
> again to put in a fix for it" is a recipe for never doing a release.
> That's why we gradually wind up the bar for "should this go in",
> from "any bug" to "regressions" to "really really serious showstopper
> regressions".
> 
> We never do a final release without a last rc (it is too risky),
> so that is not an option.

Ugh, sorry about this.  I made an off-by-one error in my mental
calculation of whether it would be ok to include non-regression
bugfixes aet this point.  Also, I originally meant to send this before
the last -rc, but stuff happened.

So, the sam460 fix is, indeed, not a regression and I'm happy to punt
it to 3.1.

The macio fix, however, *is* a regression from 2.12.  Whether it's
severe enough to warrant another -rc, I'm not sure.  It is a bad
pointer access which is, well, bad.  It doesn't seem to bite
obviously, needing valgrind to pick it up, but possibly that's just
luck.
Peter Maydell Aug. 2, 2018, 9:16 a.m. UTC | #5
On 2 August 2018 at 08:08, David Gibson <david@gibson.dropbear.id.au> wrote:
> The macio fix, however, *is* a regression from 2.12.  Whether it's
> severe enough to warrant another -rc, I'm not sure.  It is a bad
> pointer access which is, well, bad.  It doesn't seem to bite
> obviously, needing valgrind to pick it up, but possibly that's just
> luck.

I thought those introspection-bugs like the macio ones weren't
regressions ? Certainly the devices have been leaking a refcount
in their init methods for a long time. I don't view these as
very important bugs, though, as they only manifest if you
try to introspect fixed-always-present device/SoC objects,
which in practice I don't think anybody does except as
part of "test every device via introspection" test cases.

thanks
-- PMM
David Gibson Aug. 2, 2018, 2:07 p.m. UTC | #6
On Thu, Aug 02, 2018 at 10:16:32AM +0100, Peter Maydell wrote:
> On 2 August 2018 at 08:08, David Gibson <david@gibson.dropbear.id.au> wrote:
> > The macio fix, however, *is* a regression from 2.12.  Whether it's
> > severe enough to warrant another -rc, I'm not sure.  It is a bad
> > pointer access which is, well, bad.  It doesn't seem to bite
> > obviously, needing valgrind to pick it up, but possibly that's just
> > luck.
> 
> I thought those introspection-bugs like the macio ones weren't
> regressions ?

Well, I ran Thomas's testcase on master and it generates several
valgrind warnings, which don't appear on either 2.12 or master+the
patch.

> Certainly the devices have been leaking a refcount
> in their init methods for a long time. I don't view these as
> very important bugs, though, as they only manifest if you
> try to introspect fixed-always-present device/SoC objects,
> which in practice I don't think anybody does except as
> part of "test every device via introspection" test cases.
> 
> thanks
> -- PMM
>
Thomas Huth Aug. 3, 2018, 5:49 a.m. UTC | #7
On 08/02/2018 04:07 PM, David Gibson wrote:
> On Thu, Aug 02, 2018 at 10:16:32AM +0100, Peter Maydell wrote:
>> On 2 August 2018 at 08:08, David Gibson <david@gibson.dropbear.id.au> wrote:
>>> The macio fix, however, *is* a regression from 2.12.  Whether it's
>>> severe enough to warrant another -rc, I'm not sure.  It is a bad
>>> pointer access which is, well, bad.  It doesn't seem to bite
>>> obviously, needing valgrind to pick it up, but possibly that's just
>>> luck.
>>
>> I thought those introspection-bugs like the macio ones weren't
>> regressions ?
> 
> Well, I ran Thomas's testcase on master and it generates several
> valgrind warnings, which don't appear on either 2.12 or master+the
> patch.

Maybe the macio bug is something new, but we had plenty of these
introspetion bugs in the other code (mainly the ARM code) which were
clearly there since a looong time already and nobody ever complained. So
it seems quite unusual that upper layer tools / the users are using the
introspection feature of QEMU. Thus I'd say this bug is not important
enough to block the release. We could fix it in the stable branch instead.

 Thomas
David Gibson Aug. 3, 2018, 6:49 a.m. UTC | #8
On Fri, Aug 03, 2018 at 07:49:12AM +0200, Thomas Huth wrote:
> On 08/02/2018 04:07 PM, David Gibson wrote:
> > On Thu, Aug 02, 2018 at 10:16:32AM +0100, Peter Maydell wrote:
> >> On 2 August 2018 at 08:08, David Gibson <david@gibson.dropbear.id.au> wrote:
> >>> The macio fix, however, *is* a regression from 2.12.  Whether it's
> >>> severe enough to warrant another -rc, I'm not sure.  It is a bad
> >>> pointer access which is, well, bad.  It doesn't seem to bite
> >>> obviously, needing valgrind to pick it up, but possibly that's just
> >>> luck.
> >>
> >> I thought those introspection-bugs like the macio ones weren't
> >> regressions ?
> > 
> > Well, I ran Thomas's testcase on master and it generates several
> > valgrind warnings, which don't appear on either 2.12 or master+the
> > patch.
> 
> Maybe the macio bug is something new, but we had plenty of these
> introspetion bugs in the other code (mainly the ARM code) which were
> clearly there since a looong time already and nobody ever complained. So
> it seems quite unusual that upper layer tools / the users are using the
> introspection feature of QEMU. Thus I'd say this bug is not important
> enough to block the release. We could fix it in the stable branch
> instead.

Understood, I'll punt the patch to my 3.1 staging tree.
BALATON Zoltan Aug. 5, 2018, 3:38 p.m. UTC | #9
On Wed, 1 Aug 2018, Peter Maydell wrote:
> On 1 August 2018 at 12:24, BALATON Zoltan <balaton@eik.bme.hu> wrote:
>> On Wed, 1 Aug 2018, Peter Maydell wrote:
>>> So, we've just put out rc3, which in an ideal world is our
>>> final release candidate for 3.0. Are these bugs regressions from
>>> 2.12 ?
>>
>>
>> I don't know about the macio one but the sam460ex PCI interrupts were broken
>> in 2.12 too. However it's a fix for a device only used in sam460ex which is
>> now fixed by this patch so including it is not high risk for breaking
>> anything else than sam460ex which is known to be not finished yet so I would
>> not worry too much. But which is better? Releasing 3.0 with a known bug or
>> including this fix without an rc4?
>
> The problem with continuing to delay 3.0 while we have known bugs
> is that bugs generally come in at an even rate, so we *always*
> have known bugs, and so "we found another bug, let's delay 3.0
> again to put in a fix for it" is a recipe for never doing a release.
> That's why we gradually wind up the bar for "should this go in",
> from "any bug" to "regressions" to "really really serious showstopper
> regressions".
>
> We never do a final release without a last rc (it is too risky),
> so that is not an option.

Now that it looks like we'll have an rc4 due to other fixes can these be 
included as well despite not being regressions? These may not have been 
serious enough to fix when we wouldn't have rc4 otherwise but holding on 
to broken implementation just because they were also broken in the 
previous release does not make sense if we'll have another rc anyway.

Regards,
BALATON Zoltan
Peter Maydell Aug. 6, 2018, 8:39 a.m. UTC | #10
On 5 August 2018 at 16:38, BALATON Zoltan <balaton@eik.bme.hu> wrote:
> Now that it looks like we'll have an rc4 due to other fixes can these be
> included as well despite not being regressions? These may not have been
> serious enough to fix when we wouldn't have rc4 otherwise but holding on to
> broken implementation just because they were also broken in the previous
> release does not make sense if we'll have another rc anyway.

I'm considering it, yes, but it increases risk, so it is not
a completely trivial decision. (All changes later in the release
cycle increase the risk of something unexpected breaking as a result
of the change.)

thanks
-- PMM
Peter Maydell Aug. 6, 2018, 10:30 a.m. UTC | #11
On 1 August 2018 at 04:53, David Gibson <david@gibson.dropbear.id.au> wrote:
> The following changes since commit f7502360397d291be04bc040e9f96c92ff2d8030:
>
>   Update version for v3.0.0-rc3 release (2018-07-31 19:30:17 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/dgibson/qemu.git tags/ppc-for-3.0-20180801
>
> for you to fetch changes up to 6484ab3dffadc79020a71376010f517d60b81b83:
>
>   sam460ex: Fix PCI interrupts with multiple devices (2018-08-01 11:01:38 +1000)
>
> ----------------------------------------------------------------
> ppc patch queue for 2018-08-01
>
> Here are a final couple of fixes for the 3.0 release.
>
> ----------------------------------------------------------------
> BALATON Zoltan (1):
>       sam460ex: Fix PCI interrupts with multiple devices
>
> Thomas Huth (1):
>       hw/misc/macio: Fix device introspection problems in macio devices

Since we needed an rc4, I've applied this to master (the bug
fixes are not strict regressions but they are limited in
scope to particular machines).

thanks
-- PMM