diff mbox series

[v4,5/9] pcie_sriov: Validate NumVFs

Message ID 20240214-reuse-v4-5-89ad093a07f4@daynix.com
State New
Headers show
Series hw/pci: SR-IOV related fixes and improvements | expand

Commit Message

Akihiko Odaki Feb. 14, 2024, 5:13 a.m. UTC
The guest may write NumVFs greater than TotalVFs and that can lead
to buffer overflow in VF implementations.

Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/pci/pcie_sriov.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Michael S. Tsirkin Feb. 14, 2024, 6:52 a.m. UTC | #1
On Wed, Feb 14, 2024 at 02:13:43PM +0900, Akihiko Odaki wrote:
> The guest may write NumVFs greater than TotalVFs and that can lead
> to buffer overflow in VF implementations.
> 
> Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  hw/pci/pcie_sriov.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
> index a1fe65f5d801..da209b7f47fd 100644
> --- a/hw/pci/pcie_sriov.c
> +++ b/hw/pci/pcie_sriov.c
> @@ -176,6 +176,9 @@ static void register_vfs(PCIDevice *dev)
>  
>      assert(sriov_cap > 0);
>      num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
> +    if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) {
> +        return;
> +    }


yes but with your change PCI_SRIOV_NUM_VF no longer reflects the
number of registered VFs and that might lead to uninitialized
data read which is not better :(.

How about just forcing the PCI_SRIOV_NUM_VF register to be
below PCI_SRIOV_TOTAL_VF at all times?

>      dev->exp.sriov_pf.vf = g_new(PCIDevice *, num_vfs);
>  
> 
> -- 
> 2.43.0
Michael Tokarev Feb. 14, 2024, 8:58 a.m. UTC | #2
14.02.2024 08:13, Akihiko Odaki wrote:
> The guest may write NumVFs greater than TotalVFs and that can lead
> to buffer overflow in VF implementations.

This seems to be stable-worthy (Cc'd), and maybe even CVE-worthy?

Thanks,

/mjt

> Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>   hw/pci/pcie_sriov.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
> index a1fe65f5d801..da209b7f47fd 100644
> --- a/hw/pci/pcie_sriov.c
> +++ b/hw/pci/pcie_sriov.c
> @@ -176,6 +176,9 @@ static void register_vfs(PCIDevice *dev)
>   
>       assert(sriov_cap > 0);
>       num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
> +    if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) {
> +        return;
> +    }
>   
>       dev->exp.sriov_pf.vf = g_new(PCIDevice *, num_vfs);
>   
>
Akihiko Odaki Feb. 14, 2024, 2:49 p.m. UTC | #3
On 2024/02/14 15:52, Michael S. Tsirkin wrote:
> On Wed, Feb 14, 2024 at 02:13:43PM +0900, Akihiko Odaki wrote:
>> The guest may write NumVFs greater than TotalVFs and that can lead
>> to buffer overflow in VF implementations.
>>
>> Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>>   hw/pci/pcie_sriov.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
>> index a1fe65f5d801..da209b7f47fd 100644
>> --- a/hw/pci/pcie_sriov.c
>> +++ b/hw/pci/pcie_sriov.c
>> @@ -176,6 +176,9 @@ static void register_vfs(PCIDevice *dev)
>>   
>>       assert(sriov_cap > 0);
>>       num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
>> +    if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) {
>> +        return;
>> +    }
> 
> 
> yes but with your change PCI_SRIOV_NUM_VF no longer reflects the
> number of registered VFs and that might lead to uninitialized
> data read which is not better :(.
> 
> How about just forcing the PCI_SRIOV_NUM_VF register to be
> below PCI_SRIOV_TOTAL_VF at all times?

PCI_SRIOV_NUM_VF is already divergent from the number of registered VFs. 
It may have a number greater than the current registered VFs before 
setting VF Enable.

The guest may also change PCI_SRIOV_NUM_VF while VF Enable is set; the 
behavior is undefined in such a case but we still accept such a write. A 
value written in such a case won't be reflected to the actual number of 
enabled VFs.
Akihiko Odaki Feb. 14, 2024, 2:54 p.m. UTC | #4
On 2024/02/14 17:58, Michael Tokarev wrote:
> 14.02.2024 08:13, Akihiko Odaki wrote:
>> The guest may write NumVFs greater than TotalVFs and that can lead
>> to buffer overflow in VF implementations.
> 
> This seems to be stable-worthy (Cc'd), and maybe even CVE-worthy?

Perhaps so. The scope of the bug is limited to emulated SR-IOV devices, 
and I think nobody use them except for development, but it may be still 
nice to have a CVE.

Can anyone help assign a CVE? I don't know the procedure.

Regards,
Akihiko Odaki

> 
> Thanks,
> 
> /mjt
> 
>> Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O 
>> Virtualization (SR/IOV)")
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>>   hw/pci/pcie_sriov.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
>> index a1fe65f5d801..da209b7f47fd 100644
>> --- a/hw/pci/pcie_sriov.c
>> +++ b/hw/pci/pcie_sriov.c
>> @@ -176,6 +176,9 @@ static void register_vfs(PCIDevice *dev)
>>       assert(sriov_cap > 0);
>>       num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
>> +    if (num_vfs > pci_get_word(dev->config + sriov_cap + 
>> PCI_SRIOV_TOTAL_VF)) {
>> +        return;
>> +    }
>>       dev->exp.sriov_pf.vf = g_new(PCIDevice *, num_vfs);
>>
>
Michael Tokarev Feb. 14, 2024, 3:53 p.m. UTC | #5
14.02.2024 17:54, Akihiko Odaki wrote:
> On 2024/02/14 17:58, Michael Tokarev wrote:
>> 14.02.2024 08:13, Akihiko Odaki wrote:
>>> The guest may write NumVFs greater than TotalVFs and that can lead
>>> to buffer overflow in VF implementations.
>>
>> This seems to be stable-worthy (Cc'd), and maybe even CVE-worthy?
> 
> Perhaps so. The scope of the bug is limited to emulated SR-IOV devices, and I think nobody use them except for development, but it may be still nice 
> to have a CVE.
> 
> Can anyone help assign a CVE? I don't know the procedure.

Heh. Usually I ask exactly the opposite question: how to avoid assigning
a CVE# for a non-issue which they most likely think is a serious security
bug?  We've plenty of these in qemu, collecting dust for years...  For
example, for things like some actions by privileged guest process (or kernel)
which leads to qemu dying with assertion failure, which, on a real HW, will
cause hardware lockup.

Nope, I don't remember how to request a CVE ;)

/mjt
Michael S. Tsirkin Feb. 14, 2024, 3:54 p.m. UTC | #6
On Wed, Feb 14, 2024 at 11:49:52PM +0900, Akihiko Odaki wrote:
> On 2024/02/14 15:52, Michael S. Tsirkin wrote:
> > On Wed, Feb 14, 2024 at 02:13:43PM +0900, Akihiko Odaki wrote:
> > > The guest may write NumVFs greater than TotalVFs and that can lead
> > > to buffer overflow in VF implementations.
> > > 
> > > Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
> > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> > > ---
> > >   hw/pci/pcie_sriov.c | 3 +++
> > >   1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
> > > index a1fe65f5d801..da209b7f47fd 100644
> > > --- a/hw/pci/pcie_sriov.c
> > > +++ b/hw/pci/pcie_sriov.c
> > > @@ -176,6 +176,9 @@ static void register_vfs(PCIDevice *dev)
> > >       assert(sriov_cap > 0);
> > >       num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
> > > +    if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) {
> > > +        return;
> > > +    }
> > 
> > 
> > yes but with your change PCI_SRIOV_NUM_VF no longer reflects the
> > number of registered VFs and that might lead to uninitialized
> > data read which is not better :(.
> > 
> > How about just forcing the PCI_SRIOV_NUM_VF register to be
> > below PCI_SRIOV_TOTAL_VF at all times?
> 
> PCI_SRIOV_NUM_VF is already divergent from the number of registered VFs. It
> may have a number greater than the current registered VFs before setting VF
> Enable.
> 
> The guest may also change PCI_SRIOV_NUM_VF while VF Enable is set; the
> behavior is undefined in such a case but we still accept such a write. A
> value written in such a case won't be reflected to the actual number of
> enabled VFs.

OK then let's add a comment near num_vfs explaining all this and saying
only to use this value. I also would prefer to have this if
just where we set num_vfs. And maybe after all do set num_vfs to
PCI_SRIOV_TOTAL_VF? Less of a chance of breaking something I feel...
Michael S. Tsirkin Feb. 14, 2024, 3:55 p.m. UTC | #7
On Wed, Feb 14, 2024 at 06:53:43PM +0300, Michael Tokarev wrote:
> Nope, I don't remember how to request a CVE ;)

https://www.qemu.org/contribute/security-process/
Akihiko Odaki Feb. 14, 2024, 4:22 p.m. UTC | #8
On 2024/02/15 0:54, Michael S. Tsirkin wrote:
> On Wed, Feb 14, 2024 at 11:49:52PM +0900, Akihiko Odaki wrote:
>> On 2024/02/14 15:52, Michael S. Tsirkin wrote:
>>> On Wed, Feb 14, 2024 at 02:13:43PM +0900, Akihiko Odaki wrote:
>>>> The guest may write NumVFs greater than TotalVFs and that can lead
>>>> to buffer overflow in VF implementations.
>>>>
>>>> Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
>>>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>>>> ---
>>>>    hw/pci/pcie_sriov.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
>>>> index a1fe65f5d801..da209b7f47fd 100644
>>>> --- a/hw/pci/pcie_sriov.c
>>>> +++ b/hw/pci/pcie_sriov.c
>>>> @@ -176,6 +176,9 @@ static void register_vfs(PCIDevice *dev)
>>>>        assert(sriov_cap > 0);
>>>>        num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
>>>> +    if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) {
>>>> +        return;
>>>> +    }
>>>
>>>
>>> yes but with your change PCI_SRIOV_NUM_VF no longer reflects the
>>> number of registered VFs and that might lead to uninitialized
>>> data read which is not better :(.
>>>
>>> How about just forcing the PCI_SRIOV_NUM_VF register to be
>>> below PCI_SRIOV_TOTAL_VF at all times?
>>
>> PCI_SRIOV_NUM_VF is already divergent from the number of registered VFs. It
>> may have a number greater than the current registered VFs before setting VF
>> Enable.
>>
>> The guest may also change PCI_SRIOV_NUM_VF while VF Enable is set; the
>> behavior is undefined in such a case but we still accept such a write. A
>> value written in such a case won't be reflected to the actual number of
>> enabled VFs.
> 
> OK then let's add a comment near num_vfs explaining all this and saying
> only to use this value. I also would prefer to have this if
> just where we set num_vfs. And maybe after all do set num_vfs to
> PCI_SRIOV_TOTAL_VF? Less of a chance of breaking something I feel...

I don't think we need a comment for this. In general, it should be the 
last thing to do to read the PCI configuration space, which is writable 
by the guest, without proper validation. And such a validation should 
happen in the internal of the capability implementation.

I think the core of the problem in nvme is that it decided to take a 
look into SR-IOV configurations. It should have never looked at 
PCI_SRIOV_CTRL or PCI_SRIOV_NUM_VF.
Akihiko Odaki Feb. 17, 2024, 11:32 a.m. UTC | #9
On 2024/02/15 0:55, Michael S. Tsirkin wrote:
> On Wed, Feb 14, 2024 at 06:53:43PM +0300, Michael Tokarev wrote:
>> Nope, I don't remember how to request a CVE ;)
> 
> https://www.qemu.org/contribute/security-process/

Thanks, I requested CVEs with the form.

QEMU doesn't have any list of features without security guarantee so I 
think it's "kind" to have CVEs for any features though I would certainly 
add igb and NVMe's SR-IOV feature to such a list if there is.

It should be harmless to have extra CVEs at least. Well, some may think 
all CVEs serious but that's not my fault.
diff mbox series

Patch

diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
index a1fe65f5d801..da209b7f47fd 100644
--- a/hw/pci/pcie_sriov.c
+++ b/hw/pci/pcie_sriov.c
@@ -176,6 +176,9 @@  static void register_vfs(PCIDevice *dev)
 
     assert(sriov_cap > 0);
     num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
+    if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) {
+        return;
+    }
 
     dev->exp.sriov_pf.vf = g_new(PCIDevice *, num_vfs);