diff mbox

[3/9] pci: Make bounds checks on config space accesses actually work

Message ID 1326347188-12119-4-git-send-email-david@gibson.dropbear.id.au
State New
Headers show

Commit Message

David Gibson Jan. 12, 2012, 5:46 a.m. UTC
The pci_host_config_{read,write}_common() functions perform PCI config
accesses.  They take a limit parameter which they appear to be supposed
to bounds check against, however the bounds checking logic, such as it is,
is completely broken.

Currently, it takes the minimum of the supplied length and the remaining
space in the region and passes that as the length to the underlying
config_{read,write} function pointer.  This means that accesses which
partially overrun the region will be silently truncated - which makes
little sense.  Accesses which entirely overrun the region will *not*
be blocked (an exploitable bug), because in that case (limit - addr) will
be negative and so the unsigned MIN will always return len instead.  Even
if signed arithmetic was used, the config_{read,write} callback wouldn't
know what to do with a negative len parameter.

This patch handles things more sanely by simply ignoring writes which
overrun, and returning -1 for reads, which is the usual hardware convention
for reads to unpopulated IO regions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/pci_host.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

Comments

Alexander Graf Jan. 12, 2012, 1:19 p.m. UTC | #1
On 01/12/2012 06:46 AM, David Gibson wrote:
> The pci_host_config_{read,write}_common() functions perform PCI config
> accesses.  They take a limit parameter which they appear to be supposed
> to bounds check against, however the bounds checking logic, such as it is,
> is completely broken.
>
> Currently, it takes the minimum of the supplied length and the remaining
> space in the region and passes that as the length to the underlying
> config_{read,write} function pointer.  This means that accesses which
> partially overrun the region will be silently truncated - which makes
> little sense.  Accesses which entirely overrun the region will *not*
> be blocked (an exploitable bug), because in that case (limit - addr) will
> be negative and so the unsigned MIN will always return len instead.  Even
> if signed arithmetic was used, the config_{read,write} callback wouldn't
> know what to do with a negative len parameter.
>
> This patch handles things more sanely by simply ignoring writes which
> overrun, and returning -1 for reads, which is the usual hardware convention
> for reads to unpopulated IO regions.
>
> Signed-off-by: David Gibson<david@gibson.dropbear.id.au>

Michael, please ack or apply yourself. I'll cache this in my tree 
regardless so it doesn't get lost.


Alex

> ---
>   hw/pci_host.c |   10 ++++++++--
>   1 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/hw/pci_host.c b/hw/pci_host.c
> index 44c6c20..16b3ac3 100644
> --- a/hw/pci_host.c
> +++ b/hw/pci_host.c
> @@ -51,14 +51,20 @@ void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr,
>                                     uint32_t limit, uint32_t val, uint32_t len)
>   {
>       assert(len<= 4);
> -    pci_dev->config_write(pci_dev, addr, val, MIN(len, limit - addr));
> +    if ((addr + len)<= limit) {
> +        pci_dev->config_write(pci_dev, addr, val, len);
> +    }
>   }
>
>   uint32_t pci_host_config_read_common(PCIDevice *pci_dev, uint32_t addr,
>                                        uint32_t limit, uint32_t len)
>   {
>       assert(len<= 4);
> -    return pci_dev->config_read(pci_dev, addr, MIN(len, limit - addr));
> +    if ((addr + len)<= limit) {
> +        return pci_dev->config_read(pci_dev, addr, len);
> +    } else {
> +        return ~0x0;
> +    }
>   }
>
>   void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len)
Michael S. Tsirkin Jan. 12, 2012, 1:32 p.m. UTC | #2
On Thu, Jan 12, 2012 at 04:46:22PM +1100, David Gibson wrote:
> The pci_host_config_{read,write}_common() functions perform PCI config
> accesses.  They take a limit parameter which they appear to be supposed
> to bounds check against, however the bounds checking logic, such as it is,
> is completely broken.
> 
> Currently, it takes the minimum of the supplied length and the remaining
> space in the region and passes that as the length to the underlying
> config_{read,write} function pointer.  This means that accesses which
> partially overrun the region will be silently truncated - which makes
> little sense.

Why does it make little sense? Makes sense to me.

>  Accesses which entirely overrun the region will *not*
> be blocked (an exploitable bug)
>, because in that case (limit - addr) will
> be negative and so the unsigned MIN will always return len instead.  Even
> if signed arithmetic was used, the config_{read,write} callback wouldn't
> know what to do with a negative len parameter.

The assumption was callers never pass in such values.
Could you please give an example how this exploitable bug
can get triggered?

> 
> This patch handles things more sanely by simply ignoring writes which
> overrun, and returning -1 for reads, which is the usual hardware convention
> for reads to unpopulated IO regions.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/pci_host.c |   10 ++++++++--
>  1 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/pci_host.c b/hw/pci_host.c
> index 44c6c20..16b3ac3 100644
> --- a/hw/pci_host.c
> +++ b/hw/pci_host.c
> @@ -51,14 +51,20 @@ void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr,
>                                    uint32_t limit, uint32_t val, uint32_t len)
>  {
>      assert(len <= 4);
> -    pci_dev->config_write(pci_dev, addr, val, MIN(len, limit - addr));
> +    if ((addr + len) <= limit) {
> +        pci_dev->config_write(pci_dev, addr, val, len);
> +    }
>  }
>  
>  uint32_t pci_host_config_read_common(PCIDevice *pci_dev, uint32_t addr,
>                                       uint32_t limit, uint32_t len)
>  {
>      assert(len <= 4);
> -    return pci_dev->config_read(pci_dev, addr, MIN(len, limit - addr));
> +    if ((addr + len) <= limit) {
> +        return pci_dev->config_read(pci_dev, addr, len);
> +    } else {
> +        return ~0x0;
> +    }
>  }
>  
>  void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len)
> -- 
> 1.7.7.3
>
Michael S. Tsirkin Jan. 12, 2012, 1:33 p.m. UTC | #3
On Thu, Jan 12, 2012 at 02:19:59PM +0100, Alexander Graf wrote:
> On 01/12/2012 06:46 AM, David Gibson wrote:
> >The pci_host_config_{read,write}_common() functions perform PCI config
> >accesses.  They take a limit parameter which they appear to be supposed
> >to bounds check against, however the bounds checking logic, such as it is,
> >is completely broken.
> >
> >Currently, it takes the minimum of the supplied length and the remaining
> >space in the region and passes that as the length to the underlying
> >config_{read,write} function pointer.  This means that accesses which
> >partially overrun the region will be silently truncated - which makes
> >little sense.  Accesses which entirely overrun the region will *not*
> >be blocked (an exploitable bug), because in that case (limit - addr) will
> >be negative and so the unsigned MIN will always return len instead.  Even
> >if signed arithmetic was used, the config_{read,write} callback wouldn't
> >know what to do with a negative len parameter.
> >
> >This patch handles things more sanely by simply ignoring writes which
> >overrun, and returning -1 for reads, which is the usual hardware convention
> >for reads to unpopulated IO regions.
> >
> >Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
> 
> Michael, please ack or apply yourself. I'll cache this in my tree
> regardless so it doesn't get lost.
> 
> 
> Alex

I'd like to understand the bug a bit.

> >---
> >  hw/pci_host.c |   10 ++++++++--
> >  1 files changed, 8 insertions(+), 2 deletions(-)
> >
> >diff --git a/hw/pci_host.c b/hw/pci_host.c
> >index 44c6c20..16b3ac3 100644
> >--- a/hw/pci_host.c
> >+++ b/hw/pci_host.c
> >@@ -51,14 +51,20 @@ void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr,
> >                                    uint32_t limit, uint32_t val, uint32_t len)
> >  {
> >      assert(len<= 4);
> >-    pci_dev->config_write(pci_dev, addr, val, MIN(len, limit - addr));
> >+    if ((addr + len)<= limit) {
> >+        pci_dev->config_write(pci_dev, addr, val, len);
> >+    }
> >  }
> >
> >  uint32_t pci_host_config_read_common(PCIDevice *pci_dev, uint32_t addr,
> >                                       uint32_t limit, uint32_t len)
> >  {
> >      assert(len<= 4);
> >-    return pci_dev->config_read(pci_dev, addr, MIN(len, limit - addr));
> >+    if ((addr + len)<= limit) {
> >+        return pci_dev->config_read(pci_dev, addr, len);
> >+    } else {
> >+        return ~0x0;
> >+    }
> >  }
> >
> >  void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len)
>
David Gibson Jan. 13, 2012, 12:26 a.m. UTC | #4
On Thu, Jan 12, 2012 at 03:32:32PM +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 12, 2012 at 04:46:22PM +1100, David Gibson wrote:
> > The pci_host_config_{read,write}_common() functions perform PCI config
> > accesses.  They take a limit parameter which they appear to be supposed
> > to bounds check against, however the bounds checking logic, such as it is,
> > is completely broken.
> > 
> > Currently, it takes the minimum of the supplied length and the remaining
> > space in the region and passes that as the length to the underlying
> > config_{read,write} function pointer.  This means that accesses which
> > partially overrun the region will be silently truncated - which makes
> > little sense.
> 
> Why does it make little sense? Makes sense to me.

Well, for starters a partial overrun would have to be an unaligned
config space access, which is not supported by PCI.  The behaviour if
you try is undefined on most bridges and unlikely to be a partial
read/write (ignoring the low addr bits would be more likely).

Even if a bridge somehow did a partial access, it's still wrong for
reads, since the overrunning bits would typically return 0xff not
0x00, and so you'd need to 0xff pad the result.

There's just no point doing anything other than simply failing partial
overruns.

> >  Accesses which entirely overrun the region will *not*
> > be blocked (an exploitable bug)
> >, because in that case (limit - addr) will
> > be negative and so the unsigned MIN will always return len instead.  Even
> > if signed arithmetic was used, the config_{read,write} callback wouldn't
> > know what to do with a negative len parameter.
> 
> The assumption was callers never pass in such values.

So, callers are to to treat this function, taking a limit parameter as
having semantics of "checks a pointless edge case, but not the obvious
type of overrun".  You think that's a sensible semantic for a general
helper function?  Seriously?

> Could you please give an example how this exploitable bug
> can get triggered?

Ah, yes, it's not actually exploitable in the current code, since the
only callers mask the address down.  It becomes exploitable if someone
writes a new bridge which doesn't go via the existing accessors and
assumes that the function which looks like it bounds checks actually
bounds checks (which I'm about to do for the pseries PCI code.

Factoring the bounds checking into pci_host_config_{read,write} just
makes more sense.
Michael S. Tsirkin Jan. 13, 2012, 1:23 a.m. UTC | #5
On Fri, Jan 13, 2012 at 11:26:12AM +1100, David Gibson wrote:
> On Thu, Jan 12, 2012 at 03:32:32PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Jan 12, 2012 at 04:46:22PM +1100, David Gibson wrote:
> > > The pci_host_config_{read,write}_common() functions perform PCI config
> > > accesses.  They take a limit parameter which they appear to be supposed
> > > to bounds check against, however the bounds checking logic, such as it is,
> > > is completely broken.
> > > 
> > > Currently, it takes the minimum of the supplied length and the remaining
> > > space in the region and passes that as the length to the underlying
> > > config_{read,write} function pointer.  This means that accesses which
> > > partially overrun the region will be silently truncated - which makes
> > > little sense.
> > 
> > Why does it make little sense? Makes sense to me.
> 
> Well, for starters a partial overrun would have to be an unaligned
> config space access, which is not supported by PCI.  The behaviour if
> you try is undefined on most bridges and unlikely to be a partial
> read/write (ignoring the low addr bits would be more likely).

Yes, bus level cycles have dword granularity in PCI.
But look e.g. at our express implementation.
Config is memory mapped, so we simply map this as memory into guest.
If you do a large read, what happens on real hardware?
I'm not sure, but it looks possible that it will get split
and multiple dword transactions generated on the bus.
OTOH our implementation passes the read on as is, so
it can get a multi-dword.
I'll try to play with some real systems to see what they do.

> Even if a bridge somehow did a partial access, it's still wrong for
> reads, since the overrunning bits would typically return 0xff not
> 0x00, and so you'd need to 0xff pad the result.

Or maybe an error needs to be generated. We really don't know,
it's up to the caller.

> There's just no point doing anything other than simply failing partial
> overruns.

There's no point in random code churn either.

> > >  Accesses which entirely overrun the region will *not*
> > > be blocked (an exploitable bug)
> > >, because in that case (limit - addr) will
> > > be negative and so the unsigned MIN will always return len instead.  Even
> > > if signed arithmetic was used, the config_{read,write} callback wouldn't
> > > know what to do with a negative len parameter.
> > 
> > The assumption was callers never pass in such values.
> 
> So, callers are to to treat this function, taking a limit parameter as
> having semantics of "checks a pointless edge case, but not the obvious
> type of overrun".  You think that's a sensible semantic for a general
> helper function?  Seriously?

I don't mind adding an extra check at this level. But
the comment would need to be reworded from
'fix a bug' to 'pseries wants to pass out of range
values so let's check this'.

> > Could you please give an example how this exploitable bug
> > can get triggered?
> 
> Ah, yes, it's not actually exploitable in the current code, since the
> only callers mask the address down.  It becomes exploitable if someone
> writes a new bridge which doesn't go via the existing accessors and
> assumes that the function which looks like it bounds checks actually
> bounds checks (which I'm about to do for the pseries PCI code.
> 
> Factoring the bounds checking into pci_host_config_{read,write} just
> makes more sense.
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
David Gibson Jan. 13, 2012, 1:44 a.m. UTC | #6
On Fri, Jan 13, 2012 at 03:23:37AM +0200, Michael S. Tsirkin wrote:
> On Fri, Jan 13, 2012 at 11:26:12AM +1100, David Gibson wrote:
> > On Thu, Jan 12, 2012 at 03:32:32PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Jan 12, 2012 at 04:46:22PM +1100, David Gibson wrote:
> > > > The pci_host_config_{read,write}_common() functions perform PCI config
> > > > accesses.  They take a limit parameter which they appear to be supposed
> > > > to bounds check against, however the bounds checking logic, such as it is,
> > > > is completely broken.
> > > > 
> > > > Currently, it takes the minimum of the supplied length and the remaining
> > > > space in the region and passes that as the length to the underlying
> > > > config_{read,write} function pointer.  This means that accesses which
> > > > partially overrun the region will be silently truncated - which makes
> > > > little sense.
> > > 
> > > Why does it make little sense? Makes sense to me.
> > 
> > Well, for starters a partial overrun would have to be an unaligned
> > config space access, which is not supported by PCI.  The behaviour if
> > you try is undefined on most bridges and unlikely to be a partial
> > read/write (ignoring the low addr bits would be more likely).
> 
> Yes, bus level cycles have dword granularity in PCI.
> But look e.g. at our express implementation.
> Config is memory mapped, so we simply map this as memory into guest.

So?

> If you do a large read, what happens on real hardware?
> I'm not sure, but it looks possible that it will get split
> and multiple dword transactions generated on the bus.

Totally irrelevant in this context.  The function definition only
allows a maximum of 4 byte transfers (because of the uint32_ts), so
the bus emulation logic would already have to split a long read into
multiple calls to the function.

> OTOH our implementation passes the read on as is, so
> it can get a multi-dword.
> I'll try to play with some real systems to see what they do.
> 
> > Even if a bridge somehow did a partial access, it's still wrong for
> > reads, since the overrunning bits would typically return 0xff not
> > 0x00, and so you'd need to 0xff pad the result.
> 
> Or maybe an error needs to be generated. We really don't know,
> it's up to the caller.
> 
> > There's just no point doing anything other than simply failing partial
> > overruns.
> 
> There's no point in random code churn either.

Removing an insane semantic os not random code churn.

> > > >  Accesses which entirely overrun the region will *not*
> > > > be blocked (an exploitable bug)
> > > >, because in that case (limit - addr) will
> > > > be negative and so the unsigned MIN will always return len instead.  Even
> > > > if signed arithmetic was used, the config_{read,write} callback wouldn't
> > > > know what to do with a negative len parameter.
> > > 
> > > The assumption was callers never pass in such values.
> > 
> > So, callers are to to treat this function, taking a limit parameter as
> > having semantics of "checks a pointless edge case, but not the obvious
> > type of overrun".  You think that's a sensible semantic for a general
> > helper function?  Seriously?
> 
> I don't mind adding an extra check at this level. But
> the comment would need to be reworded from
> 'fix a bug' to 'pseries wants to pass out of range
> values so let's check this'.

Oh, FFS, you're just making excuses for not making a simple fix to
stupid code.
Alexander Graf Jan. 16, 2012, 11:24 a.m. UTC | #7
On 13.01.2012, at 02:44, David Gibson wrote:

> On Fri, Jan 13, 2012 at 03:23:37AM +0200, Michael S. Tsirkin wrote:
>> On Fri, Jan 13, 2012 at 11:26:12AM +1100, David Gibson wrote:
>>> On Thu, Jan 12, 2012 at 03:32:32PM +0200, Michael S. Tsirkin wrote:
>>>> On Thu, Jan 12, 2012 at 04:46:22PM +1100, David Gibson wrote:
>>>>> The pci_host_config_{read,write}_common() functions perform PCI config
>>>>> accesses.  They take a limit parameter which they appear to be supposed
>>>>> to bounds check against, however the bounds checking logic, such as it is,
>>>>> is completely broken.
>>>>> 
>>>>> Currently, it takes the minimum of the supplied length and the remaining
>>>>> space in the region and passes that as the length to the underlying
>>>>> config_{read,write} function pointer.  This means that accesses which
>>>>> partially overrun the region will be silently truncated - which makes
>>>>> little sense.
>>>> 
>>>> Why does it make little sense? Makes sense to me.
>>> 
>>> Well, for starters a partial overrun would have to be an unaligned
>>> config space access, which is not supported by PCI.  The behaviour if
>>> you try is undefined on most bridges and unlikely to be a partial
>>> read/write (ignoring the low addr bits would be more likely).
>> 
>> Yes, bus level cycles have dword granularity in PCI.
>> But look e.g. at our express implementation.
>> Config is memory mapped, so we simply map this as memory into guest.
> 
> So?
> 
>> If you do a large read, what happens on real hardware?
>> I'm not sure, but it looks possible that it will get split
>> and multiple dword transactions generated on the bus.
> 
> Totally irrelevant in this context.  The function definition only
> allows a maximum of 4 byte transfers (because of the uint32_ts), so
> the bus emulation logic would already have to split a long read into
> multiple calls to the function.
> 
>> OTOH our implementation passes the read on as is, so
>> it can get a multi-dword.
>> I'll try to play with some real systems to see what they do.
>> 
>>> Even if a bridge somehow did a partial access, it's still wrong for
>>> reads, since the overrunning bits would typically return 0xff not
>>> 0x00, and so you'd need to 0xff pad the result.
>> 
>> Or maybe an error needs to be generated. We really don't know,
>> it's up to the caller.
>> 
>>> There's just no point doing anything other than simply failing partial
>>> overruns.
>> 
>> There's no point in random code churn either.
> 
> Removing an insane semantic os not random code churn.
> 
>>>>> Accesses which entirely overrun the region will *not*
>>>>> be blocked (an exploitable bug)
>>>>> , because in that case (limit - addr) will
>>>>> be negative and so the unsigned MIN will always return len instead.  Even
>>>>> if signed arithmetic was used, the config_{read,write} callback wouldn't
>>>>> know what to do with a negative len parameter.
>>>> 
>>>> The assumption was callers never pass in such values.
>>> 
>>> So, callers are to to treat this function, taking a limit parameter as
>>> having semantics of "checks a pointless edge case, but not the obvious
>>> type of overrun".  You think that's a sensible semantic for a general
>>> helper function?  Seriously?
>> 
>> I don't mind adding an extra check at this level. But
>> the comment would need to be reworded from
>> 'fix a bug' to 'pseries wants to pass out of range
>> values so let's check this'.
> 
> Oh, FFS, you're just making excuses for not making a simple fix to
> stupid code.

ping?


Alex
diff mbox

Patch

diff --git a/hw/pci_host.c b/hw/pci_host.c
index 44c6c20..16b3ac3 100644
--- a/hw/pci_host.c
+++ b/hw/pci_host.c
@@ -51,14 +51,20 @@  void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr,
                                   uint32_t limit, uint32_t val, uint32_t len)
 {
     assert(len <= 4);
-    pci_dev->config_write(pci_dev, addr, val, MIN(len, limit - addr));
+    if ((addr + len) <= limit) {
+        pci_dev->config_write(pci_dev, addr, val, len);
+    }
 }
 
 uint32_t pci_host_config_read_common(PCIDevice *pci_dev, uint32_t addr,
                                      uint32_t limit, uint32_t len)
 {
     assert(len <= 4);
-    return pci_dev->config_read(pci_dev, addr, MIN(len, limit - addr));
+    if ((addr + len) <= limit) {
+        return pci_dev->config_read(pci_dev, addr, len);
+    } else {
+        return ~0x0;
+    }
 }
 
 void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len)