[2/2] xen-netback: cancel the credit timer when taking the vif down

Message ID 1360847938-11185-3-git-send-email-david.vrabel@citrix.com
State Accepted, archived
Delegated to: David Miller

Commit Message

David Vrabel Feb. 14, 2013, 1:18 p.m. UTC
From: David Vrabel <david.vrabel@citrix.com>

If the credit timer is left armed after calling
xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
the vif which will then oops as vif->netbk == NULL.

This may happen both in the fatal error path and during normal
disconnection from the front end.

The sequencing during shutdown is critical to ensure that: a)
vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
is not freed.

1. Mark as unschedulable (netif_carrier_off()).
2. Synchronously cancel the timer.
3. Remove the vif from the schedule list.
4. Remove it from its netback thread group.
5. Wait for vif->refcnt to become 0.
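
The five steps above can be pictured with a small userspace model. This is
only a sketch under stated assumptions, not the netback code: `toy_vif`,
`toy_netbk`, `toy_run_timers()` and `toy_shutdown()` are hypothetical names,
and the `assert()` in the timer callback stands in for the oops on
vif->netbk == NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; none of this is real netback code. */
struct toy_netbk {
	int scheduled;             /* how many times a vif was scheduled */
};

struct toy_vif {
	struct toy_netbk *netbk;   /* NULL once removed from its thread group */
	bool carrier_ok;           /* models the carrier state */
	bool timer_pending;        /* models an armed credit timer */
	bool on_schedule_list;
	int refcnt;
};

/* Models the credit timer callback: scheduling the vif dereferences netbk. */
static void toy_credit_timeout(struct toy_vif *vif)
{
	assert(vif->netbk != NULL);    /* stands in for the oops */
	vif->netbk->scheduled++;
	vif->on_schedule_list = true;
}

/* Fires the timer if it is still armed; returns true if the callback ran. */
static bool toy_run_timers(struct toy_vif *vif)
{
	if (!vif->timer_pending)
		return false;
	vif->timer_pending = false;
	toy_credit_timeout(vif);
	return true;
}

/* Walks the shutdown steps in the order given in the commit message. */
static void toy_shutdown(struct toy_vif *vif)
{
	vif->carrier_ok = false;       /* 1. mark as unschedulable */
	vif->timer_pending = false;    /* 2. cancel the timer */
	vif->on_schedule_list = false; /* 3. remove from the schedule list */
	vif->netbk = NULL;             /* 4. remove from the thread group */
	vif->refcnt--;                 /* 5. drop our reference; the real code
	                                *    then waits for refcnt to reach 0 */
}
```

In this model, swapping steps 2 and 4 and letting the timer run in between
would trip the assertion, which is the failure the patch is meant to avoid.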

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/net/xen-netback/interface.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

Comments

Wei Liu Feb. 14, 2013, 1:53 p.m. UTC | #1
On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> If the credit timer is left armed after calling
> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> the vif which will then oops as vif->netbk == NULL.
> 
> This may happen both in the fatal error path and during normal
> disconnection from the front end.
> 
> The sequencing during shutdown is critical to ensure that: a)
> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> is not freed.
> 
> 1. Mark as unschedulable (netif_carrier_off()).
> 2. Synchronously cancel the timer.
> 3. Remove the vif from the schedule list.
> 4. Remove it from its netback thread group.
> 5. Wait for vif->refcnt to become 0.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

You would need to reinitialize the timer in xenvif_up, given that user
might `ifconfig vifX.X down; ifconfig vifX.X up`.

Another less desirable but simpler fix would be to leave the timer alone but
check for vif->netbk != NULL in the timer callback.
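
A rough userspace sketch of that alternative, for illustration only:
`toy_vif`, `toy_netbk` and `toy_credit_timeout_guarded()` are hypothetical
names, not functions from the driver. The sketch does not address whether the
check could race with removal of the vif, which a real implementation would
have to consider.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; not the real netback structures. */
struct toy_netbk {
	int scheduled;             /* how many times a vif was scheduled */
};

struct toy_vif {
	struct toy_netbk *netbk;   /* NULL once removed from its thread group */
};

/*
 * Models a timer callback that leaves the timer armed during shutdown but
 * refuses to schedule a vif that no longer belongs to a netback group.
 * Returns true if the vif was scheduled.
 */
static bool toy_credit_timeout_guarded(struct toy_vif *vif)
{
	if (vif->netbk == NULL)
		return false;          /* vif already taken down: do nothing */
	vif->netbk->scheduled++;
	return true;
}
```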


Wei.

> ---
>  drivers/net/xen-netback/interface.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index b8c5193..221f426 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
>  static void xenvif_down(struct xenvif *vif)
>  {
>  	disable_irq(vif->irq);
> +	del_timer_sync(&vif->credit_timeout);
>  	xen_netbk_deschedule_xenvif(vif);
>  	xen_netbk_remove_xenvif(vif);
>  }
> @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
>  	atomic_dec(&vif->refcnt);
>  	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
>  
> -	del_timer_sync(&vif->credit_timeout);
> -
>  	if (vif->irq)
>  		unbind_from_irqhandler(vif->irq, vif);
>  


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Vrabel Feb. 14, 2013, 1:56 p.m. UTC | #2
On 14/02/13 13:53, Wei Liu wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> If the credit timer is left armed after calling
>> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
>> the vif which will then oops as vif->netbk == NULL.
>>
>> This may happen both in the fatal error path and during normal
>> disconnection from the front end.
>>
>> The sequencing during shutdown is critical to ensure that: a)
>> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
>> is not freed.
>>
>> 1. Mark as unschedulable (netif_carrier_off()).
>> 2. Synchronously cancel the timer.
>> 3. Remove the vif from the schedule list.
>> 4. Remove it from its netback thread group.
>> 5. Wait for vif->refcnt to become 0.
>>
>> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> 
> You would need to reinitialize the timer in xenvif_up, given that user
> might `ifconfig vifX.X down; ifconfig vifX.X up`.

No.  Deleted timers do not need to be reinitialized.  The timer will be
armed as usual with mod_timer() when credit is next exhausted.
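
A toy userspace model of that behaviour, as a sketch only: `toy_timer` and
the `toy_*` helpers are hypothetical and merely imitate the return
conventions of del_timer() and mod_timer() (nonzero if the timer was
pending). Deleting clears the pending state but leaves the one-time setup
intact, so a later modify arms the timer again.

```c
#include <stdbool.h>

/* Toy timer; it only tracks the state that matters for this discussion. */
struct toy_timer {
	bool initialized;          /* set once at setup time */
	bool pending;              /* true while the timer is armed */
	unsigned long expires;
};

/* One-time initialization, done when the vif is created. */
static void toy_setup_timer(struct toy_timer *t)
{
	t->initialized = true;
	t->pending = false;
	t->expires = 0;
}

/* Deactivates the timer; returns true if it was pending. */
static bool toy_del_timer(struct toy_timer *t)
{
	bool was_pending = t->pending;

	t->pending = false;        /* initialization state is left untouched */
	return was_pending;
}

/* Arms (or re-arms) the timer; returns true if it was already pending. */
static bool toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
	bool was_pending = t->pending;

	t->expires = expires;
	t->pending = true;
	return was_pending;
}
```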

David
Wei Liu Feb. 14, 2013, 2:10 p.m. UTC | #3
On Thu, 2013-02-14 at 13:56 +0000, David Vrabel wrote:
> On 14/02/13 13:53, Wei Liu wrote:
> > On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@citrix.com>
> >>
> >> If the credit timer is left armed after calling
> >> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> >> the vif which will then oops as vif->netbk == NULL.
> >>
> >> This may happen both in the fatal error path and during normal
> >> disconnection from the front end.
> >>
> >> The sequencing during shutdown is critical to ensure that: a)
> >> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> >> is not freed.
> >>
> >> 1. Mark as unschedulable (netif_carrier_off()).
> >> 2. Synchronously cancel the timer.
> >> 3. Remove the vif from the schedule list.
> >> 4. Remove it from its netback thread group.
> >> 5. Wait for vif->refcnt to become 0.
> >>
> >> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> > 
> > You would need to reinitialize the timer in xenvif_up, given that user
> > might `ifconfig vifX.X down; ifconfig vifX.X up`.
> 
> No.  Deleted timers do not need to be reinitialized.  The timer will be

What I really meant was to "rearm"...

> armed as usual with mod_timer() when credit is next exhausted.
> 

Ah, ok.


Wei.

Jan Beulich Feb. 14, 2013, 2:15 p.m. UTC | #4
>>> On 14.02.13 at 14:53, Wei Liu <wei.liu2@citrix.com> wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>> 
>> If the credit timer is left armed after calling
>> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
>> the vif which will then oops as vif->netbk == NULL.
>> 
>> This may happen both in the fatal error path and during normal
>> disconnection from the front end.
>> 
>> The sequencing during shutdown is critical to ensure that: a)
>> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
>> is not freed.
>> 
>> 1. Mark as unschedulable (netif_carrier_off()).
>> 2. Synchronously cancel the timer.
>> 3. Remove the vif from the schedule list.
>> 4. Remove it from its netback thread group.
>> 5. Wait for vif->refcnt to become 0.
>> 
>> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> 
> You would need to reinitialize the timer in xenvif_up, given that user
> might `ifconfig vifX.X down; ifconfig vifX.X up`.

Which gets us to another aspect of the original fix that I don't
think was considered: Is there anything preventing the interface
from being brought back up after fatal_tx_err() shut it down?

Jan

Wei Liu Feb. 14, 2013, 2:21 p.m. UTC | #5
On Thu, 2013-02-14 at 14:15 +0000, Jan Beulich wrote:
> >>> On 14.02.13 at 14:53, Wei Liu <wei.liu2@citrix.com> wrote:
> > On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@citrix.com>
> >> 
> >> If the credit timer is left armed after calling
> >> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> >> the vif which will then oops as vif->netbk == NULL.
> >> 
> >> This may happen both in the fatal error path and during normal
> >> disconnection from the front end.
> >> 
> >> The sequencing during shutdown is critical to ensure that: a)
> >> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> >> is not freed.
> >> 
> >> 1. Mark as unschedulable (netif_carrier_off()).
> >> 2. Synchronously cancel the timer.
> >> 3. Remove the vif from the schedule list.
> >> 4. Remove it from its netback thread group.
> >> 5. Wait for vif->refcnt to become 0.
> >> 
> >> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> > 
> > You would need to reinitialize the timer in xenvif_up, given that user
> > might `ifconfig vifX.X down; ifconfig vifX.X up`.
> 
> Which gets us to another aspect of the original fix that I don't
> think was considered: Is there anything preventing the interface
> from being brought back up after fatal_tx_err() shut it down?
> 

I don't think so. Code could not, and should not, prevent the host admin
from doing anything he wants, even if that is re-enabling a malicious vif. ;-)


Wei.

> Jan
> 


Ian Campbell Feb. 14, 2013, 4:39 p.m. UTC | #6
On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> If the credit timer is left armed after calling
> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> the vif which will then oops as vif->netbk == NULL.
> 
> This may happen both in the fatal error path and during normal
> disconnection from the front end.
> 
> The sequencing during shutdown is critical to ensure that: a)
> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> is not freed.
> 
> 1. Mark as unschedulable (netif_carrier_off()).
> 2. Synchronously cancel the timer.
> 3. Remove the vif from the schedule list.
> 4. Remove it from its netback thread group.
> 5. Wait for vif->refcnt to become 0.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Was this one also Reported-by Christopher S. Aker or was it just
discovered in the process of investigating?

Another stable candidate please Dave.

> ---
>  drivers/net/xen-netback/interface.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index b8c5193..221f426 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
>  static void xenvif_down(struct xenvif *vif)
>  {
>  	disable_irq(vif->irq);
> +	del_timer_sync(&vif->credit_timeout);
>  	xen_netbk_deschedule_xenvif(vif);
>  	xen_netbk_remove_xenvif(vif);
>  }
> @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
>  	atomic_dec(&vif->refcnt);
>  	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
>  
> -	del_timer_sync(&vif->credit_timeout);
> -
>  	if (vif->irq)
>  		unbind_from_irqhandler(vif->irq, vif);
>  


Wei Liu Feb. 14, 2013, 4:48 p.m. UTC | #7
On Thu, 2013-02-14 at 16:39 +0000, Ian Campbell wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> > From: David Vrabel <david.vrabel@citrix.com>
> > 
> > If the credit timer is left armed after calling
> > xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> > the vif which will then oops as vif->netbk == NULL.
> > 
> > This may happen both in the fatal error path and during normal
> > disconnection from the front end.
> > 
> > The sequencing during shutdown is critical to ensure that: a)
> > vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> > is not freed.
> > 
> > 1. Mark as unschedulable (netif_carrier_off()).
> > 2. Synchronously cancel the timer.
> > 3. Remove the vif from the schedule list.
> > 4. Remove it from its netback thread group.
> > 5. Wait for vif->refcnt to become 0.
> > 
> > Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Was this one also Reported-by Christopher S. Aker or was it just
> discovered in the process of investigating?
> 

His bug report did prod me to look into this, so I think it is worth
adding

Reported-by: Christopher S. Aker <caker@theshore.net>


Wei.

> Another stable candidate please Dave.
> 
> > ---
> >  drivers/net/xen-netback/interface.c |    3 +--
> >  1 files changed, 1 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> > index b8c5193..221f426 100644
> > --- a/drivers/net/xen-netback/interface.c
> > +++ b/drivers/net/xen-netback/interface.c
> > @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
> >  static void xenvif_down(struct xenvif *vif)
> >  {
> >  	disable_irq(vif->irq);
> > +	del_timer_sync(&vif->credit_timeout);
> >  	xen_netbk_deschedule_xenvif(vif);
> >  	xen_netbk_remove_xenvif(vif);
> >  }
> > @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
> >  	atomic_dec(&vif->refcnt);
> >  	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
> >  
> > -	del_timer_sync(&vif->credit_timeout);
> > -
> >  	if (vif->irq)
> >  		unbind_from_irqhandler(vif->irq, vif);
> >  
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


Patch

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index b8c5193..221f426 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -132,6 +132,7 @@  static void xenvif_up(struct xenvif *vif)
 static void xenvif_down(struct xenvif *vif)
 {
 	disable_irq(vif->irq);
+	del_timer_sync(&vif->credit_timeout);
 	xen_netbk_deschedule_xenvif(vif);
 	xen_netbk_remove_xenvif(vif);
 }
@@ -363,8 +364,6 @@  void xenvif_disconnect(struct xenvif *vif)
 	atomic_dec(&vif->refcnt);
 	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
 
-	del_timer_sync(&vif->credit_timeout);
-
 	if (vif->irq)
 		unbind_from_irqhandler(vif->irq, vif);