mbox series

[0/2] s390x: reset handling for ccw devices

Message ID 20180507155130.21085-1-cohuck@redhat.com
Headers show
Series s390x: reset handling for ccw devices | expand

Message

Cornelia Huck May 7, 2018, 3:51 p.m. UTC
On Friday, Thomas noticed some problems with 3270 devices. One result
was "s390x/css: disabled subchannels cannot be status pending", but
a reboot did not cure the previous broken status. Turns out that 3270
devices are missing a reset handler.

This series cleans up reset handling a bit and makes sure that the base
ccw device class provides a subchannel reset handler. I'm currently
not sure what we should do with vfio-ccw, so the behaviour there is
left unchanged.

Cornelia Huck (2):
  virtio-ccw: common reset handler
  s390x/ccw: make sure all ccw devices are properly reset

 hw/s390x/ccw-device.c |  8 ++++++++
 hw/s390x/virtio-ccw.c | 20 ++++++--------------
 hw/s390x/virtio-ccw.h |  1 +
 3 files changed, 15 insertions(+), 14 deletions(-)

Comments

Cornelia Huck May 8, 2018, 12:55 p.m. UTC | #1
On Mon,  7 May 2018 17:51:28 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Friday, Thomas noticed some problems with 3270 devices. One result
> was "s390x/css: disabled subchannels cannot be status pending", but
> a reboot did not cure the previous broken status. Turns out that 3270
> devices are missing a reset handler.
> 
> This series cleans up reset handling a bit and makes sure that the base
> ccw device class provides a subchannel reset handler. I'm currently
> not sure what we should do with vfio-ccw, so the behaviour there is
> left unchanged.
> 
> Cornelia Huck (2):
>   virtio-ccw: common reset handler
>   s390x/ccw: make sure all ccw devices are properly reset
> 
>  hw/s390x/ccw-device.c |  8 ++++++++
>  hw/s390x/virtio-ccw.c | 20 ++++++--------------
>  hw/s390x/virtio-ccw.h |  1 +
>  3 files changed, 15 insertions(+), 14 deletions(-)
> 

Queued to s390-next (with cc:stable added).
Halil Pasic May 8, 2018, 1:29 p.m. UTC | #2
On 05/07/2018 05:51 PM, Cornelia Huck wrote:
> On Friday, Thomas noticed some problems with 3270 devices. One result
> was "s390x/css: disabled subchannels cannot be status pending", but
> a reboot did not cure the previous broken status. Turns out that 3270
> devices are missing a reset handler.
> 
> This series cleans up reset handling a bit and makes sure that the base
> ccw device class provides a subchannel reset handler. I'm currently
> not sure what we should do with vfio-ccw, so the behaviour there is
> left unchanged.

Had a look, and LGTM (will tag separately) modulo what follows here. I'm
a bit concerned about vfio-ccw not being made compliant to this new
he reset of CCWDeviceClass is taking care of resetting the subchannel data
structures. This feels like introducing a common abstraction
to me, but then some things bother me.

In particular the the pim, the lpm and the pam set in css_reset_sch
seems to suit only devices that use the virtual chp. That
is it ain't suits any CCWDevice instance.

Do you plan to tackle vfio-ccw reset?

> 
> Cornelia Huck (2):
>    virtio-ccw: common reset handler
>    s390x/ccw: make sure all ccw devices are properly reset
> 
>   hw/s390x/ccw-device.c |  8 ++++++++
>   hw/s390x/virtio-ccw.c | 20 ++++++--------------
>   hw/s390x/virtio-ccw.h |  1 +
>   3 files changed, 15 insertions(+), 14 deletions(-)
>
Cornelia Huck May 8, 2018, 1:55 p.m. UTC | #3
On Tue, 8 May 2018 15:29:50 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On 05/07/2018 05:51 PM, Cornelia Huck wrote:
> > On Friday, Thomas noticed some problems with 3270 devices. One result
> > was "s390x/css: disabled subchannels cannot be status pending", but
> > a reboot did not cure the previous broken status. Turns out that 3270
> > devices are missing a reset handler.
> > 
> > This series cleans up reset handling a bit and makes sure that the base
> > ccw device class provides a subchannel reset handler. I'm currently
> > not sure what we should do with vfio-ccw, so the behaviour there is
> > left unchanged.  
> 
> Had a look, and LGTM (will tag separately) modulo what follows here. I'm
> a bit concerned about vfio-ccw not being made compliant to this new
> he reset of CCWDeviceClass is taking care of resetting the subchannel data
> structures. This feels like introducing a common abstraction
> to me, but then some things bother me.

We are having a common abstraction that can be overwritten by any
specialized implementation - and this is what vfio-ccw is doing,
therefore nothing changes for it.

> In particular the the pim, the lpm and the pam set in css_reset_sch
> seems to suit only devices that use the virtual chp. That
> is it ain't suits any CCWDevice instance.

Yes, we need to revisit this code and split out what makes sense and
what doesn't. For now, we only have virtual devices and vfio-ccw, so
we're fine. It even might make sense to keep them separate in the
future, as having a virtual device and one only mirroring some state in
QEMU sound like they want to be handled differently.

> 
> Do you plan to tackle vfio-ccw reset?

It's on my to-do list (which is sadly quite crowded...)

> 
> > 
> > Cornelia Huck (2):
> >    virtio-ccw: common reset handler
> >    s390x/ccw: make sure all ccw devices are properly reset
> > 
> >   hw/s390x/ccw-device.c |  8 ++++++++
> >   hw/s390x/virtio-ccw.c | 20 ++++++--------------
> >   hw/s390x/virtio-ccw.h |  1 +
> >   3 files changed, 15 insertions(+), 14 deletions(-)
> >   
>
Halil Pasic May 8, 2018, 2:24 p.m. UTC | #4
On 05/08/2018 03:55 PM, Cornelia Huck wrote:
> On Tue, 8 May 2018 15:29:50 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> On 05/07/2018 05:51 PM, Cornelia Huck wrote:
>>> On Friday, Thomas noticed some problems with 3270 devices. One result
>>> was "s390x/css: disabled subchannels cannot be status pending", but
>>> a reboot did not cure the previous broken status. Turns out that 3270
>>> devices are missing a reset handler.
>>>
>>> This series cleans up reset handling a bit and makes sure that the base
>>> ccw device class provides a subchannel reset handler. I'm currently
>>> not sure what we should do with vfio-ccw, so the behaviour there is
>>> left unchanged.
>>
>> Had a look, and LGTM (will tag separately) modulo what follows here. I'm
>> a bit concerned about vfio-ccw not being made compliant to this new
>> he reset of CCWDeviceClass is taking care of resetting the subchannel data
>> structures. This feels like introducing a common abstraction
>> to me, but then some things bother me.
> 
> We are having a common abstraction that can be overwritten by any
> specialized implementation - and this is what vfio-ccw is doing,
> therefore nothing changes for it.
> 

Sorry my OO background is making me overthink things. I was on
the separation of concerns line: base class takes care of its
own class invariants and the the derived of its own. Considering
what your statements below, I think we are in agreement.

>> In particular the the pim, the lpm and the pam set in css_reset_sch
>> seems to suit only devices that use the virtual chp. That
>> is it ain't suits any CCWDevice instance.
> 
> Yes, we need to revisit this code and split out what makes sense and
> what doesn't. For now, we only have virtual devices and vfio-ccw, so
> we're fine. It even might make sense to keep them separate in the
> future, as having a virtual device and one only mirroring some state in
> QEMU sound like they want to be handled differently.
> 

I agree. The last sentence probably means that resetting the in QEMU
state may not be sufficient.

Another to me somewhat strange thing I noticed is this disabled_cb
used by virtio. It's is I guess the way it it is (specified in
the OASIS spec and everything), but I don't really understand how
this aligns with what the PoP says about MSCH. I mean AFAIU
MSCH does not trigger any communication between the channel subsystem
and the CU and or the device. I read the PoP as nothing is supposed
to change expect the things specified where MSCH is described. I guess
it is just another strange thing we have to live with -- for historical
reasons.

Regards,
Halil
Cornelia Huck May 8, 2018, 3:22 p.m. UTC | #5
On Tue, 8 May 2018 16:24:08 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On 05/08/2018 03:55 PM, Cornelia Huck wrote:
> > On Tue, 8 May 2018 15:29:50 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> >> In particular the the pim, the lpm and the pam set in css_reset_sch
> >> seems to suit only devices that use the virtual chp. That
> >> is it ain't suits any CCWDevice instance.  
> > 
> > Yes, we need to revisit this code and split out what makes sense and
> > what doesn't. For now, we only have virtual devices and vfio-ccw, so
> > we're fine. It even might make sense to keep them separate in the
> > future, as having a virtual device and one only mirroring some state in
> > QEMU sound like they want to be handled differently.
> >   
> 
> I agree. The last sentence probably means that resetting the in QEMU
> state may not be sufficient.

We currently have vfio-ccw's reset handler calling into the kernel and
triggering an disable/enable. What's missing is resetting QEMU's
internal state (or rather, syncing it up with the hardware state).
Needs some thinking.

> Another to me somewhat strange thing I noticed is this disabled_cb
> used by virtio. It's is I guess the way it it is (specified in
> the OASIS spec and everything), but I don't really understand how
> this aligns with what the PoP says about MSCH. I mean AFAIU
> MSCH does not trigger any communication between the channel subsystem
> and the CU and or the device. I read the PoP as nothing is supposed
> to change expect the things specified where MSCH is described. I guess
> it is just another strange thing we have to live with -- for historical
> reasons.

OTOH, I'd expect to have to setup things again if I disabled and
afterwards enabled a subchannel again. We should be able to overwrite
any old state after doing that. This was the best way to get virtio to
start with a clean slate again -- we don't want virtio reset to clean
the revision.