diff mbox

use an unsigned long for the max_sz parameter in load_image_targphys

Message ID 1331225951-31306-1-git-send-email-mark.langsdorf@calxeda.com
State New
Headers show

Commit Message

Mark Langsdorf March 8, 2012, 4:59 p.m. UTC
Allow load_image_targphys to load files on systems with more than 2G of
emulated memory by changing the max_sz parameter from an int to an
unsigned long.

Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
---
 hw/loader.c |    4 ++--
 hw/loader.h |    3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

Comments

Eric Blake March 8, 2012, 5:56 p.m. UTC | #1
On 03/08/2012 09:59 AM, Mark Langsdorf wrote:
> Allow load_image_targphys to load files on systems with more than 2G of
> emulated memory by changing the max_sz parameter from an int to an
> unsigned long.

unsigned long is still 32-bits on a 32-bit host.  You probably want to
be using off_t.
Mark Langsdorf March 8, 2012, 6:13 p.m. UTC | #2
On 03/08/2012 11:56 AM, Eric Blake wrote:
> On 03/08/2012 09:59 AM, Mark Langsdorf wrote:
>> Allow load_image_targphys to load files on systems with more than 2G of
>> emulated memory by changing the max_sz parameter from an int to an
>> unsigned long.
> 
> unsigned long is still 32-bits on a 32-bit host.  You probably want to
> be using off_t.

I know that unsigned long is 32-bits. The issue is more that
comparing 0xf000_0000 > 0x1000_0000 returns FALSE if both values
are compared as signed ints, the way the current code does.

Strict correctness would be for max_sz to be of type size_t, and I
can change it to that if people would prefer, but unsigned long is
clear enough in this instance.

--Mark Langsdorf
Calxeda, Inc.
Markus Armbruster March 9, 2012, 9:25 a.m. UTC | #3
Mark Langsdorf <mark.langsdorf@calxeda.com> writes:

> Allow load_image_targphys to load files on systems with more than 2G of
> emulated memory by changing the max_sz parameter from an int to an
> unsigned long.
>
> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
> ---
>  hw/loader.c |    4 ++--
>  hw/loader.h |    3 ++-
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/hw/loader.c b/hw/loader.c
> index 415cdce..a5333d0 100644
> --- a/hw/loader.c
> +++ b/hw/loader.c
> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name,
>  
>  /* return the size or -1 if error */
>  int load_image_targphys(const char *filename,
> -			target_phys_addr_t addr, int max_sz)
> +                        target_phys_addr_t addr, unsigned long max_sz)
>  {
> -    int size;
> +    unsigned long size;
>  
>      size = get_image_size(filename);
>      if (size > max_sz) {

get_image_size() returns int.  How does widening size and max_sz here
improve things?

> diff --git a/hw/loader.h b/hw/loader.h
> index fbcaba9..35c1652 100644
> --- a/hw/loader.h
> +++ b/hw/loader.h
> @@ -4,7 +4,8 @@
>  /* loader.c */
>  int get_image_size(const char *filename);
>  int load_image(const char *filename, uint8_t *addr); /* deprecated */
> -int load_image_targphys(const char *filename, target_phys_addr_t, int max_sz);
> +int load_image_targphys(const char *filename, target_phys_addr_t,
> +                        unsigned long max_sz);
>  int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
>               void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
>               uint64_t *highaddr, int big_endian, int elf_machine,
Mark Langsdorf March 9, 2012, 1:15 p.m. UTC | #4
On 03/09/2012 03:25 AM, Markus Armbruster wrote:
> Mark Langsdorf <mark.langsdorf@calxeda.com> writes:
> 
>> Allow load_image_targphys to load files on systems with more than 2G of
>> emulated memory by changing the max_sz parameter from an int to an
>> unsigned long.
>>
>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
>> ---
>>  hw/loader.c |    4 ++--
>>  hw/loader.h |    3 ++-
>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/loader.c b/hw/loader.c
>> index 415cdce..a5333d0 100644
>> --- a/hw/loader.c
>> +++ b/hw/loader.c
>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name,
>>  
>>  /* return the size or -1 if error */
>>  int load_image_targphys(const char *filename,
>> -			target_phys_addr_t addr, int max_sz)
>> +                        target_phys_addr_t addr, unsigned long max_sz)
>>  {
>> -    int size;
>> +    unsigned long size;
>>  
>>      size = get_image_size(filename);
>>      if (size > max_sz) {
> 
> get_image_size() returns int.  How does widening size and max_sz here
> improve things?

If max_sz is greater than 2GB, then:
  int max_sz = 0xc0000000;
  int size =   0x300;
  if (size > max_sz)
      return -1;

returns -1, even though size is much less than max_sz.

doing it my way:
  unsigned long max_sz = 0xc0000000;
  unsigned long size =   0x300;
  if (size > max_sz)
      return -1;

does not return -1.

--Mark Langsdorf
Calxeda, Inc.
Alexander Graf March 9, 2012, 1:21 p.m. UTC | #5
On 09.03.2012, at 14:15, Mark Langsdorf wrote:

> On 03/09/2012 03:25 AM, Markus Armbruster wrote:
>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes:
>> 
>>> Allow load_image_targphys to load files on systems with more than 2G of
>>> emulated memory by changing the max_sz parameter from an int to an
>>> unsigned long.
>>> 
>>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
>>> ---
>>> hw/loader.c |    4 ++--
>>> hw/loader.h |    3 ++-
>>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/hw/loader.c b/hw/loader.c
>>> index 415cdce..a5333d0 100644
>>> --- a/hw/loader.c
>>> +++ b/hw/loader.c
>>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name,
>>> 
>>> /* return the size or -1 if error */
>>> int load_image_targphys(const char *filename,
>>> -			target_phys_addr_t addr, int max_sz)
>>> +                        target_phys_addr_t addr, unsigned long max_sz)
>>> {
>>> -    int size;
>>> +    unsigned long size;
>>> 
>>>     size = get_image_size(filename);
>>>     if (size > max_sz) {
>> 
>> get_image_size() returns int.  How does widening size and max_sz here
>> improve things?
> 
> If max_sz is greater than 2GB, then:
>  int max_sz = 0xc0000000;
>  int size =   0x300;
>  if (size > max_sz)
>      return -1;
> 
> returns -1, even though size is much less than max_sz.
> 
> doing it my way:
>  unsigned long max_sz = 0xc0000000;
>  unsigned long size =   0x300;
>  if (size > max_sz)
>      return -1;
> 
> does not return -1.

So why change it to long then? Unsigned int would have the same effect.

But really what this should be changed to is target_phys_addr_t. That way it's aligned with the address. I guess we can leave int for return values for now though, since we won't get images that big.

Also, why are you hitting this in the first place? How are you calling read_targphys that you end up with such a big max_sz? Using INT_MAX as max_sz should work just as well, no? It's probably cleaner to change the size type, but I'm curious why nobody else hit this before :).


Alex
Mark Langsdorf March 9, 2012, 1:34 p.m. UTC | #6
On 03/09/2012 07:21 AM, Alexander Graf wrote:
> 
> On 09.03.2012, at 14:15, Mark Langsdorf wrote:
> 
>> On 03/09/2012 03:25 AM, Markus Armbruster wrote:
>>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes:
>>>
>>>> Allow load_image_targphys to load files on systems with more than 2G of
>>>> emulated memory by changing the max_sz parameter from an int to an
>>>> unsigned long.
>>>>
>>>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
>>>> ---
>>>> hw/loader.c |    4 ++--
>>>> hw/loader.h |    3 ++-
>>>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/hw/loader.c b/hw/loader.c
>>>> index 415cdce..a5333d0 100644
>>>> --- a/hw/loader.c
>>>> +++ b/hw/loader.c
>>>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name,
>>>>
>>>> /* return the size or -1 if error */
>>>> int load_image_targphys(const char *filename,
>>>> -			target_phys_addr_t addr, int max_sz)
>>>> +                        target_phys_addr_t addr, unsigned long max_sz)
>>>> {
>>>> -    int size;
>>>> +    unsigned long size;
>>>>
>>>>     size = get_image_size(filename);
>>>>     if (size > max_sz) {
>>>
>>> get_image_size() returns int.  How does widening size and max_sz here
>>> improve things?
>>
>> If max_sz is greater than 2GB, then:
>>  int max_sz = 0xc0000000;
>>  int size =   0x300;
>>  if (size > max_sz)
>>      return -1;
>>
>> returns -1, even though size is much less than max_sz.
>>
>> doing it my way:
>>  unsigned long max_sz = 0xc0000000;
>>  unsigned long size =   0x300;
>>  if (size > max_sz)
>>      return -1;
>>
>> does not return -1.
> 
> So why change it to long then? Unsigned int would have the same effect.

Well, I hit this bug because the arm_loader code passes the system's
memory size as the max_sz argument. Currently, we have a 32-bit memory
bus, but I know we'll move to 64-bits in the future, and I wanted to
be type safe.

> But really what this should be changed to is target_phys_addr_t. That
> way it's aligned with the address. I guess we can leave int for return
> values for now though, since we won't get images that big.

Or convert all this stuff to size_t, since that's also appropriate.

> Also, why are you hitting this in the first place? How are you calling
> read_targphys that you end up with such a big max_sz? Using INT_MAX
> as max_sz should work just as well, no? It's probably cleaner to
> change the size type, but I'm curious why nobody else hit this before :).

Well, arm_load_kernel calls read_targphys and passes
(ram_size - KERNEL_LOAD_ADDR) as the max_sz argument. As far as I know,
the highbank model is the only ARM model that uses more than 2 GB as
ram_size. The change that actually tested the max_sz argument didn't
go in until January 2012, and our internal tree was lagging until
quite recently, so our testing didn't catch it either.

I'll resubmit the patch with target_phys_addr_t.

--Mark Langsdorf
Calxeda, Inc.
Alexander Graf March 9, 2012, 1:50 p.m. UTC | #7
On 09.03.2012, at 14:34, Mark Langsdorf wrote:

> On 03/09/2012 07:21 AM, Alexander Graf wrote:
>> 
>> On 09.03.2012, at 14:15, Mark Langsdorf wrote:
>> 
>>> On 03/09/2012 03:25 AM, Markus Armbruster wrote:
>>>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes:
>>>> 
>>>>> Allow load_image_targphys to load files on systems with more than 2G of
>>>>> emulated memory by changing the max_sz parameter from an int to an
>>>>> unsigned long.
>>>>> 
>>>>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
>>>>> ---
>>>>> hw/loader.c |    4 ++--
>>>>> hw/loader.h |    3 ++-
>>>>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>>>> 
>>>>> diff --git a/hw/loader.c b/hw/loader.c
>>>>> index 415cdce..a5333d0 100644
>>>>> --- a/hw/loader.c
>>>>> +++ b/hw/loader.c
>>>>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name,
>>>>> 
>>>>> /* return the size or -1 if error */
>>>>> int load_image_targphys(const char *filename,
>>>>> -			target_phys_addr_t addr, int max_sz)
>>>>> +                        target_phys_addr_t addr, unsigned long max_sz)
>>>>> {
>>>>> -    int size;
>>>>> +    unsigned long size;
>>>>> 
>>>>>    size = get_image_size(filename);
>>>>>    if (size > max_sz) {
>>>> 
>>>> get_image_size() returns int.  How does widening size and max_sz here
>>>> improve things?
>>> 
>>> If max_sz is greater than 2GB, then:
>>> int max_sz = 0xc0000000;
>>> int size =   0x300;
>>> if (size > max_sz)
>>>     return -1;
>>> 
>>> returns -1, even though size is much less than max_sz.
>>> 
>>> doing it my way:
>>> unsigned long max_sz = 0xc0000000;
>>> unsigned long size =   0x300;
>>> if (size > max_sz)
>>>     return -1;
>>> 
>>> does not return -1.
>> 
>> So why change it to long then? Unsigned int would have the same effect.
> 
> Well, I hit this bug because the arm_loader code passes the system's
> memory size as the max_sz argument. Currently, we have a 32-bit memory
> bus, but I know we'll move to 64-bits in the future, and I wanted to
> be type safe.

Then use uint64_t or target_phys_addr_t really. Longs are almost always wrong in qemu code, because the guest shouldn't care about the host's bitness.

> 
>> But really what this should be changed to is target_phys_addr_t. That
>> way it's aligned with the address. I guess we can leave int for return
>> values for now though, since we won't get images that big.
> 
> Or convert all this stuff to size_t, since that's also appropriate.

Semantically, I would rather go with target_phys_addr_t. You're trying to describe addresses. If anything, ram_addr_t might work too - never quite grasped the difference.

> 
>> Also, why are you hitting this in the first place? How are you calling
>> read_targphys that you end up with such a big max_sz? Using INT_MAX
>> as max_sz should work just as well, no? It's probably cleaner to
>> change the size type, but I'm curious why nobody else hit this before :).
> 
> Well, arm_load_kernel calls read_targphys and passes
> (ram_size - KERNEL_LOAD_ADDR) as the max_sz argument. As far as I know,
> the highbank model is the only ARM model that uses more than 2 GB as
> ram_size. The change that actually tested the max_sz argument didn't
> go in until January 2012, and our internal tree was lagging until
> quite recently, so our testing didn't catch it either.

Ah, that makes sense :).

> I'll resubmit the patch with target_phys_addr_t.

Thanks! And please check for the other loaders too if they suffer from the same badness.


Alex
Peter Maydell March 9, 2012, 1:58 p.m. UTC | #8
On 9 March 2012 13:50, Alexander Graf <agraf@suse.de> wrote:
> Semantically, I would rather go with target_phys_addr_t. You're
> trying to describe addresses. If anything, ram_addr_t might work too
> - never quite grasped the difference.

HACKING says "ram_addr_t is a QEMU internal address space that maps
guest RAM physical addresses into an intermediate address space that
can map to host virtual address spaces". Unless you're actually
dealing with things in that intermediate address space you don't
want ram_addr_t.

-- PMM
Markus Armbruster March 9, 2012, 2:17 p.m. UTC | #9
Mark Langsdorf <mark.langsdorf@calxeda.com> writes:

> On 03/09/2012 03:25 AM, Markus Armbruster wrote:
>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes:
>> 
>>> Allow load_image_targphys to load files on systems with more than 2G of
>>> emulated memory by changing the max_sz parameter from an int to an
>>> unsigned long.
>>>
>>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
>>> ---
>>>  hw/loader.c |    4 ++--
>>>  hw/loader.h |    3 ++-
>>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/loader.c b/hw/loader.c
>>> index 415cdce..a5333d0 100644
>>> --- a/hw/loader.c
>>> +++ b/hw/loader.c
>>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name,
>>>  
>>>  /* return the size or -1 if error */
>>>  int load_image_targphys(const char *filename,
>>> -			target_phys_addr_t addr, int max_sz)
>>> +                        target_phys_addr_t addr, unsigned long max_sz)
>>>  {
>>> -    int size;
>>> +    unsigned long size;
>>>  
>>>      size = get_image_size(filename);
>>>      if (size > max_sz) {
>> 
>> get_image_size() returns int.  How does widening size and max_sz here
>> improve things?
>
> If max_sz is greater than 2GB, then:
>   int max_sz = 0xc0000000;
>   int size =   0x300;
>   if (size > max_sz)
>       return -1;
>
> returns -1, even though size is much less than max_sz.
>
> doing it my way:
>   unsigned long max_sz = 0xc0000000;
>   unsigned long size =   0x300;
>   if (size > max_sz)
>       return -1;
>
> does not return -1.

The current code limits max_sz to INT_MAX.  Passing 0xc0000000 is a bug.

You want to relax the limit.  That's fair.  But I'm afraid your patch
doesn't really relax the limit, or rather it relaxes it just by one bit.

Your example shows it works for a case where the higher limit isn't
needed.  Let's examine three cases where it is: image sizes 2GiB, 4GiB
and 5GiB on a host with 32 bit twos complement int, and 64 bit unsigned
size_t, max_sz 0xc0000000.

Expected result: the first one works, the other two exceed max_sz and
get rejected.

Actual result, correct me if I'm wrong:

1. get_image_size() calls lseek(), which returns (off_t)2147483648 for
   2GiB, (off_t)4294967296 for 4GiB, and (off_t)5368709120 for 5GiB.

2. get_image_size() assigns it to int size.  Since int can't hold the
   value, the result is technically implementation-defined, but in
   practice it's simply -2147483648 for 2GiB, 0 for 4GiB, and 1073741824
   for 5GiB.

3. get_image_size() returns the same.

4. load_image_targphys() assigns the return value to unsigned long size.
   This changes the value back to 2147483648 for 2GiB, leaves it at 0
   for 4GiB, and at 1073741824 for 5GiB.

5. load_image_targphys() fails if size > max_sz.  Doesn't fail for any
   of the three.

6. load_image_targphys() adds the ROM file unless size is zero.  Adds
   the first and the last file.

   Bug: the second case's image is ignored silently, not rejected for
   exceeding max_sz.

7. rom_add_file() creates a Rom object.  It gets the image size again
   (hello TOCTTOU, not your patch's fault), and stores it in member
   size_t romsize.

   Bug: the third case's image is loaded even though it exceeds max_sz.
   Hello buffer overrun.

I'm afraid you need to do more work to solve this problem.  If you're
willing to do that, please check out
http://lists.nongnu.org/archive/html/qemu-devel/2012-02/msg03627.html
Andreas Färber March 9, 2012, 2:28 p.m. UTC | #10
Am 09.03.2012 14:50, schrieb Alexander Graf:
> 
> On 09.03.2012, at 14:34, Mark Langsdorf wrote:
> 
>> On 03/09/2012 07:21 AM, Alexander Graf wrote:
>>>
>>> On 09.03.2012, at 14:15, Mark Langsdorf wrote:
>>>
>>>> On 03/09/2012 03:25 AM, Markus Armbruster wrote:
>>>>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes:
>>>>>
>>>>>> Allow load_image_targphys to load files on systems with more than 2G of
>>>>>> emulated memory by changing the max_sz parameter from an int to an
>>>>>> unsigned long.
>>>>>>
>>>>>> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
>>>>>> ---
>>>>>> hw/loader.c |    4 ++--
>>>>>> hw/loader.h |    3 ++-
>>>>>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/loader.c b/hw/loader.c
>>>>>> index 415cdce..a5333d0 100644
>>>>>> --- a/hw/loader.c
>>>>>> +++ b/hw/loader.c
>>>>>> @@ -103,9 +103,9 @@ ssize_t read_targphys(const char *name,
>>>>>>
>>>>>> /* return the size or -1 if error */
>>>>>> int load_image_targphys(const char *filename,
>>>>>> -			target_phys_addr_t addr, int max_sz)
>>>>>> +                        target_phys_addr_t addr, unsigned long max_sz)
>>>>>> {
>>>>>> -    int size;
>>>>>> +    unsigned long size;
>>>>>>
>>>>>>    size = get_image_size(filename);
>>>>>>    if (size > max_sz) {
>>>>>
>>>>> get_image_size() returns int.  How does widening size and max_sz here
>>>>> improve things?
>>>>
>>>> If max_sz is greater than 2GB, then:
>>>> int max_sz = 0xc0000000;
>>>> int size =   0x300;
>>>> if (size > max_sz)
>>>>     return -1;
>>>>
>>>> returns -1, even though size is much less than max_sz.
>>>>
>>>> doing it my way:
>>>> unsigned long max_sz = 0xc0000000;
>>>> unsigned long size =   0x300;
>>>> if (size > max_sz)
>>>>     return -1;
>>>>
>>>> does not return -1.
>>>
>>> So why change it to long then? Unsigned int would have the same effect.
>>
>> Well, I hit this bug because the arm_loader code passes the system's
>> memory size as the max_sz argument. Currently, we have a 32-bit memory
>> bus, but I know we'll move to 64-bits in the future, and I wanted to
>> be type safe.
> 
> Then use uint64_t or target_phys_addr_t really. Longs are almost always wrong in qemu code, because the guest shouldn't care about the host's bitness.
> 
>>
>>> But really what this should be changed to is target_phys_addr_t. That
>>> way it's aligned with the address. I guess we can leave int for return
>>> values for now though, since we won't get images that big.
>>
>> Or convert all this stuff to size_t, since that's also appropriate.
> 
> Semantically, I would rather go with target_phys_addr_t. You're trying to describe addresses. If anything, ram_addr_t might work too - never quite grasped the difference.
> 
>>
>>> Also, why are you hitting this in the first place? How are you calling
>>> read_targphys that you end up with such a big max_sz? Using INT_MAX
>>> as max_sz should work just as well, no? It's probably cleaner to
>>> change the size type, but I'm curious why nobody else hit this before :).
>>
>> Well, arm_load_kernel calls read_targphys and passes
>> (ram_size - KERNEL_LOAD_ADDR) as the max_sz argument. As far as I know,
>> the highbank model is the only ARM model that uses more than 2 GB as
>> ram_size. The change that actually tested the max_sz argument didn't
>> go in until January 2012, and our internal tree was lagging until
>> quite recently, so our testing didn't catch it either.
> 
> Ah, that makes sense :).
> 
>> I'll resubmit the patch with target_phys_addr_t.

No, please. We're describing sizes, not addresses. target_phys_addr_t
thus is semantically wrong here. The RAM size is unsigned long IIRC (it
is limited by the host's available memory). If you subtract something
from a size it remains a size. I had therefore suggested size_t before.
I expect sizeof(size_t) >= sizeof(unsigned long).

In the previous discussion someone suggested off_t due to some function
used internally returning it. I don't know the exact difference between
the two, but off_t still sounds wrong to me since we want a size, not a
file offset.

Andreas

> Thanks! And please check for the other loaders too if they suffer from the same badness.
> 
> 
> Alex
Mark Langsdorf March 9, 2012, 2:52 p.m. UTC | #11
On 03/09/2012 08:17 AM, Markus Armbruster wrote:
> Mark Langsdorf <mark.langsdorf@calxeda.com> writes:
> 
>> On 03/09/2012 03:25 AM, Markus Armbruster wrote:
>>> get_image_size() returns int.  How does widening size and max_sz here
>>> improve things?
>>
>> If max_sz is greater than 2GB, then:
>>   int max_sz = 0xc0000000;
>>   int size =   0x300;
>>   if (size > max_sz)
>>       return -1;
>>
>> returns -1, even though size is much less than max_sz.
>>
>> doing it my way:
>>   unsigned long max_sz = 0xc0000000;
>>   unsigned long size =   0x300;
>>   if (size > max_sz)
>>       return -1;
>>
>> does not return -1.
> 
> The current code limits max_sz to INT_MAX.  Passing 0xc0000000 is a bug.
> 
> You want to relax the limit.  That's fair.  But I'm afraid your patch
> doesn't really relax the limit, or rather it relaxes it just by one bit.
> 
> Your example shows it works for a case where the higher limit isn't
> needed.  Let's examine three cases where it is: image sizes 2GiB, 4GiB
> and 5GiB on a host with 32 bit twos complement int, and 64 bit unsigned
> size_t, max_sz 0xc0000000.

I'm quite willing to leave the problem of people loading 2GiB images to
another day. Loading a 300KiB on an ARM machine with 4GiB of memory is
a problem I'm facing right now, though.

> I'm afraid you need to do more work to solve this problem.  If you're
> willing to do that, please check out
> http://lists.nongnu.org/archive/html/qemu-devel/2012-02/msg03627.html

I'm inclined to agree with David that this is not the issue I'm trying
to solve =)

--Mark Langsdorf
Calxeda, Inc.
Markus Armbruster March 9, 2012, 3:12 p.m. UTC | #12
Mark Langsdorf <mark.langsdorf@calxeda.com> writes:

> On 03/09/2012 08:17 AM, Markus Armbruster wrote:
>> Mark Langsdorf <mark.langsdorf@calxeda.com> writes:
>> 
>>> On 03/09/2012 03:25 AM, Markus Armbruster wrote:
>>>> get_image_size() returns int.  How does widening size and max_sz here
>>>> improve things?
>>>
>>> If max_sz is greater than 2GB, then:
>>>   int max_sz = 0xc0000000;
>>>   int size =   0x300;
>>>   if (size > max_sz)
>>>       return -1;
>>>
>>> returns -1, even though size is much less than max_sz.
>>>
>>> doing it my way:
>>>   unsigned long max_sz = 0xc0000000;
>>>   unsigned long size =   0x300;
>>>   if (size > max_sz)
>>>       return -1;
>>>
>>> does not return -1.
>> 
>> The current code limits max_sz to INT_MAX.  Passing 0xc0000000 is a bug.
>> 
>> You want to relax the limit.  That's fair.  But I'm afraid your patch
>> doesn't really relax the limit, or rather it relaxes it just by one bit.
>> 
>> Your example shows it works for a case where the higher limit isn't
>> needed.  Let's examine three cases where it is: image sizes 2GiB, 4GiB
>> and 5GiB on a host with 32 bit twos complement int, and 64 bit unsigned
>> size_t, max_sz 0xc0000000.
>
> I'm quite willing to leave the problem of people loading 2GiB images to
> another day. Loading a 300KiB on an ARM machine with 4GiB of memory is
> a problem I'm facing right now, though.

Easiest fix for that is passing a max_sz argument that doesn't overflow
the parameter type :)

[...]
Peter Maydell March 9, 2012, 5:11 p.m. UTC | #13
On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote:
> No, please. We're describing sizes, not addresses. target_phys_addr_t
> thus is semantically wrong here. The RAM size is unsigned long IIRC (it
> is limited by the host's available memory). If you subtract something
> from a size it remains a size. I had therefore suggested size_t before.
> I expect sizeof(size_t) >= sizeof(unsigned long).

We're discussing target sizes. size_t might be smaller than
target_phys_addr_t, so it's also semantically wrong. We don't
have a target_size_t, though, and I think "use an address
related type for an offset" is less bad than "use a host
sized type for a guest sized value".

Compare the way we use target_phys_addr_t for the offset arguments
to MemoryRegion read/write functions.

-- PMM
Andreas Färber March 9, 2012, 6:47 p.m. UTC | #14
Am 09.03.2012 18:11, schrieb Peter Maydell:
> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote:
>> No, please. We're describing sizes, not addresses. target_phys_addr_t
>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it
>> is limited by the host's available memory). If you subtract something
>> from a size it remains a size. I had therefore suggested size_t before.
>> I expect sizeof(size_t) >= sizeof(unsigned long).
> 
> We're discussing target sizes. size_t might be smaller than
> target_phys_addr_t, so it's also semantically wrong. We don't
> have a target_size_t, though, and I think "use an address
> related type for an offset" is less bad than "use a host
> sized type for a guest sized value".

That is a moot point. There is no such thing as a "target size". The
size is defined by the guest OS, not by the architecture. And it doesn't
matter if the guest OS's size is defined larger than the host's because
we process those files on the host and they must fit into host memory.

So unsigned long would be perfectly fine if ignoring oddball win64.

> Compare the way we use target_phys_addr_t for the offset arguments
> to MemoryRegion read/write functions.

Nobody here has been arguing against using target_phys_addr_t for guest
memory *offsets*.

Actually, the size (1, 2, 4) is an unsigned int there. :) Fine with me.

And due to very similar signed overflow issues with int64_t, 128-bit
integer support was introduced as a workaround. ;)


Anyway, POSIX:2008 says this about sys/types.h:

off_t
	Used for file sizes.
size_t
	Used for sizes of objects.
ssize_t
	Used for a count of bytes or an error indication.

So off_t seems right for get_image_size() although a bit
counter-intuitive. However later is said, "off_t shall be signed integer
types". So on a 32-bit host that does not necessarily help for the case
at hands. (Since get_image_size() gets its value as an off_t though, it
doesn't matter there and improves the 64-bit host case.)

By comparison, "size_t shall be an unsigned integer type" and "The
implementation shall support one or more programming environments in
which the widths of [...] size_t, ssize_t, [...] are no greater than the
width of type long".

So I still think that size_t is our best bet for potentially large sizes
here. Comparison with off_t would cause warnings though...

Andreas
Alexander Graf March 9, 2012, 7:04 p.m. UTC | #15
On 09.03.2012, at 19:47, Andreas Färber wrote:

> Am 09.03.2012 18:11, schrieb Peter Maydell:
>> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote:
>>> No, please. We're describing sizes, not addresses. target_phys_addr_t
>>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it
>>> is limited by the host's available memory). If you subtract something
>>> from a size it remains a size. I had therefore suggested size_t before.
>>> I expect sizeof(size_t) >= sizeof(unsigned long).
>> 
>> We're discussing target sizes. size_t might be smaller than
>> target_phys_addr_t, so it's also semantically wrong. We don't
>> have a target_size_t, though, and I think "use an address
>> related type for an offset" is less bad than "use a host
>> sized type for a guest sized value".
> 
> That is a moot point. There is no such thing as a "target size". The
> size is defined by the guest OS, not by the architecture. And it doesn't
> matter if the guest OS's size is defined larger than the host's because
> we process those files on the host and they must fit into host memory.
> 
> So unsigned long would be perfectly fine if ignoring oddball win64.
> 
>> Compare the way we use target_phys_addr_t for the offset arguments
>> to MemoryRegion read/write functions.
> 
> Nobody here has been arguing against using target_phys_addr_t for guest
> memory *offsets*.
> 
> Actually, the size (1, 2, 4) is an unsigned int there. :) Fine with me.
> 
> And due to very similar signed overflow issues with int64_t, 128-bit
> integer support was introduced as a workaround. ;)
> 
> 
> Anyway, POSIX:2008 says this about sys/types.h:
> 
> off_t
> 	Used for file sizes.
> size_t
> 	Used for sizes of objects.
> ssize_t
> 	Used for a count of bytes or an error indication.
> 
> So off_t seems right for get_image_size() although a bit
> counter-intuitive. However later is said, "off_t shall be signed integer
> types". So on a 32-bit host that does not necessarily help for the case
> at hands. (Since get_image_size() gets its value as an off_t though, it
> doesn't matter there and improves the 64-bit host case.)
> 
> By comparison, "size_t shall be an unsigned integer type" and "The
> implementation shall support one or more programming environments in
> which the widths of [...] size_t, ssize_t, [...] are no greater than the
> width of type long".
> 
> So I still think that size_t is our best bet for potentially large sizes
> here. Comparison with off_t would cause warnings though...

You're on the wrong track here really :). Size, max_size and friends are all offsets into the guest's address space, so that's the type they should have. We don't care about host POSIX whatever semantics here - it's a question of how big can guest address space grow.

Imagine, your host can only do 32 bit file offsets. You want to emulate a 64bit guest though. To load an image, you say "load the image at offset x, max size <guest ram size - x>". If you use size_t, it would break for large ram guests, because the size really means semantically that you want to be able to load into address [x...ram_size] of the guest's physical memory range. If the host can actually load images that big is an orthogonal question.


Alex
Markus Armbruster March 10, 2012, 6:24 a.m. UTC | #16
Alexander Graf <agraf@suse.de> writes:

[...]
> Imagine, your host can only do 32 bit file offsets. You want to
> emulate a 64bit guest though. To load an image, you say "load the
> image at offset x, max size <guest ram size - x>". If you use size_t,
> it would break for large ram guests, because the size really means
> semantically that you want to be able to load into address
> [x...ram_size] of the guest's physical memory range. If the host can
> actually load images that big is an orthogonal question.

A host with 64 bit virtual address space, yet file offsets (and thus
file sizes) limited to 32 bits?  That's a contrived example if I ever
saw one.  You need to come up with a remotely practical one to convince
me :)
Peter Maydell March 10, 2012, 1:51 p.m. UTC | #17
On 9 March 2012 18:47, Andreas Färber <afaerber@suse.de> wrote:
> Am 09.03.2012 18:11, schrieb Peter Maydell:
>> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote:
>>> No, please. We're describing sizes, not addresses. target_phys_addr_t
>>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it
>>> is limited by the host's available memory). If you subtract something
>>> from a size it remains a size. I had therefore suggested size_t before.
>>> I expect sizeof(size_t) >= sizeof(unsigned long).
>>
>> We're discussing target sizes. size_t might be smaller than
>> target_phys_addr_t, so it's also semantically wrong. We don't
>> have a target_size_t, though, and I think "use an address
>> related type for an offset" is less bad than "use a host
>> sized type for a guest sized value".
>
> That is a moot point. There is no such thing as a "target size".

"Length of a block of memory on the guest" is what I meant.
What you need is "an integer type large enough to hold the
difference between two guest pointer values". The size of
that type should depend only on the guest config, not on the
host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong.

-- PMM
Andreas Färber March 10, 2012, 2:08 p.m. UTC | #18
Am 10.03.2012 14:51, schrieb Peter Maydell:
> On 9 March 2012 18:47, Andreas Färber <afaerber@suse.de> wrote:
>> Am 09.03.2012 18:11, schrieb Peter Maydell:
>>> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote:
>>>> No, please. We're describing sizes, not addresses. target_phys_addr_t
>>>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it
>>>> is limited by the host's available memory). If you subtract something
>>>> from a size it remains a size. I had therefore suggested size_t before.
>>>> I expect sizeof(size_t) >= sizeof(unsigned long).
>>>
>>> We're discussing target sizes. size_t might be smaller than
>>> target_phys_addr_t, so it's also semantically wrong. We don't
>>> have a target_size_t, though, and I think "use an address
>>> related type for an offset" is less bad than "use a host
>>> sized type for a guest sized value".
>>
>> That is a moot point. There is no such thing as a "target size".
> 
> "Length of a block of memory on the guest" is what I meant.
> What you need is "an integer type large enough to hold the
> difference between two guest pointer values". The size of
> that type should depend only on the guest config, not on the
> host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong.

Your view is very ARM-centric. In the PowerPC domain for instance, we
have a number of usages where we hardcode a size of, e.g., 1 MB. And
Alex should know that. I don't want to use target_phys_addr_t for that
and forcing an end address calculation, as suggested by Alex, would be
possible but is not as convenient as the current API.

Doing a size check as Mark has demonstrated in ARM code (one place!)
seems much simpler to me than hurting all targets just because ARM wants
to pass a possibly stupid value unchecked to the common API.

Compare David's off_t patch from March 8th: We'll never get an image
size larger than off_t's max. Using target_phys_addr_t, uint64_t or
__int128_t for the max are all moot (academic) because we'll never get
to their max if it's larger than off_t. Therefore I've been saying, the
host's limit is the upper realistic limit for image_load_targphys().

ELF may be a different case to consider since it can be sparse.

Andreas
Andreas Färber March 10, 2012, 2:22 p.m. UTC | #19
Am 09.03.2012 20:04, schrieb Alexander Graf:
> 
> On 09.03.2012, at 19:47, Andreas Färber wrote:
> 
>> Am 09.03.2012 18:11, schrieb Peter Maydell:
>>> On 9 March 2012 14:28, Andreas Färber <afaerber@suse.de> wrote:
>>>> No, please. We're describing sizes, not addresses. target_phys_addr_t
>>>> thus is semantically wrong here. The RAM size is unsigned long IIRC (it
>>>> is limited by the host's available memory). If you subtract something
>>>> from a size it remains a size. I had therefore suggested size_t before.
>>>> I expect sizeof(size_t) >= sizeof(unsigned long).
>>>
>>> We're discussing target sizes. size_t might be smaller than
>>> target_phys_addr_t, so it's also semantically wrong. We don't
>>> have a target_size_t, though, and I think "use an address
>>> related type for an offset" is less bad than "use a host
>>> sized type for a guest sized value".
>>
>> That is a moot point. There is no such thing as a "target size". The
>> size is defined by the guest OS, not by the architecture. And it doesn't
>> matter if the guest OS's size is defined larger than the host's because
>> we process those files on the host and they must fit into host memory.
>>
>> So unsigned long would be perfectly fine if ignoring oddball win64.
>>
>>> Compare the way we use target_phys_addr_t for the offset arguments
>>> to MemoryRegion read/write functions.
>>
>> Nobody here has been arguing against using target_phys_addr_t for guest
>> memory *offsets*.
>>
>> Actually, the size (1, 2, 4) is an unsigned int there. :) Fine with me.
>>
>> And due to very similar signed overflow issues with int64_t, 128-bit
>> integer support was introduced as a workaround. ;)
>>
>>
>> Anyway, POSIX:2008 says this about sys/types.h:
>>
>> off_t
>> 	Used for file sizes.
>> size_t
>> 	Used for sizes of objects.
>> ssize_t
>> 	Used for a count of bytes or an error indication.
>>
>> So off_t seems right for get_image_size() although a bit
>> counter-intuitive. However later is said, "off_t shall be signed integer
>> types". So on a 32-bit host that does not necessarily help for the case
>> at hands. (Since get_image_size() gets its value as an off_t though, it
>> doesn't matter there and improves the 64-bit host case.)
>>
>> By comparison, "size_t shall be an unsigned integer type" and "The
>> implementation shall support one or more programming environments in
>> which the widths of [...] size_t, ssize_t, [...] are no greater than the
>> width of type long".
>>
>> So I still think that size_t is our best bet for potentially large sizes
>> here. Comparison with off_t would cause warnings though...
> 
> You're on the wrong track here really :). Size, max_size and friends are all offsets into the guest's address space, so that's the type they should have. We don't care about host POSIX whatever semantics here - it's a question of how big can guest address space grow.

No, we're talking about different use cases here. You *assume* the size
depends on guest RAM. I don't.

And no, max_size is not an offset. start + max_size is. The difference
is that this expression may overflow to zero in cases I care about.

For both PReP and Macs the size depends solely on the physical address
space and we know the max size ahead of time.

For your use case you can easily provide a separate function as a
wrapper, taking target_phys_addr_t start, end if you like (you're right,
that's possible) and calculating the off_t/int/... max_sz there and pass
it on to the existing API. Anything bigger we just cannot load as a blob.

Andreas
Peter Maydell March 10, 2012, 3:27 p.m. UTC | #20
On 10 March 2012 14:08, Andreas Färber <afaerber@suse.de> wrote:
> Am 10.03.2012 14:51, schrieb Peter Maydell:
>> "Length of a block of memory on the guest" is what I meant.
>> What you need is "an integer type large enough to hold the
>> difference between two guest pointer values". The size of
>> that type should depend only on the guest config, not on the
>> host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong.
>
> Your view is very ARM-centric.

I don't understand this remark. Nothing about the above explanation
is ARM-centric -- it's just pointing out that the guest and the
host are two separate things and maximum widths of size and
pointer types aren't necessarily the same. Assuming they were the
same would be an x86-64-ism :-)

> In the PowerPC domain for instance, we
> have a number of usages where we hardcode a size of, e.g., 1 MB. And
> Alex should know that. I don't want to use target_phys_addr_t for that
> and forcing an end address calculation, as suggested by Alex, would be
> possible but is not as convenient as the current API.
>
> Doing a size check as Mark has demonstrated in ARM code (one place!)
> seems much simpler to me than hurting all targets just because ARM wants
> to pass a possibly stupid value unchecked to the common API.

Where the ARM specific code has particular restrictions then
I'm happy to have the ARM specific code either relax those, or
do explicit checks so we can fail cleanly or whatever.

Putting a "restrict to INT_MAX" in the highbank code is definitely
wrong (not least because passing values to load_image_targphys isn't
the only thing we use that field in arm_boot_info for!)
The ARM boot code needs updating because it shouldn't be
using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t
the same as we do for initrd_size is the obvious thing. I have no
particular objection to having some new target_phys_size_t or whatever,
and I could be persuaded that we should follow the memory API in
using uint64_t for sizes, but it needs to be a type that either follows
guest phys addr restrictions or a fixed-width type which we have decreed
is always large enough, not a type which varies based on host properties.

(arm_boot also needs to fail if the size is > 4GB since at the moment
we do assume it to be 32 bits wide, but that's a different problem.)

> Compare David's off_t patch from March 8th: We'll never get an image
> size larger than off_t's max. Using target_phys_addr_t, uint64_t or
> __int128_t for the max are all moot (academic) because we'll never get
> to their max if it's larger than off_t. Therefore I've been saying, the
> host's limit is the upper realistic limit for image_load_targphys().

You mean this patch?
http://patchwork.ozlabs.org/patch/140064/
get_image_size() is asking a question about the host (ie "what is
the size of this host file?"). It therefore makes perfect sense
for its return type to be one whose size depends on properties of
the host, like off_t. load_image_targphys()'s max_sz parameter is
passing it information which is about the guest (usually "how big is
this block of RAM we're going to try to load this thing into?").
It therefore makes perfect sense for its type to be one whose size
depends on properties of the guest, or alternatively one whose
type is guaranteed to always be at least as large as the worst
case guest we support (so you could use uint64_t, as the memory
region API does for region sizes).

-- PMM
Mark Langsdorf March 12, 2012, 3:28 p.m. UTC | #21
On 03/10/2012 09:27 AM, Peter Maydell wrote:
> On 10 March 2012 14:08, Andreas Färber <afaerber@suse.de> wrote:
>> Am 10.03.2012 14:51, schrieb Peter Maydell:
>>> "Length of a block of memory on the guest" is what I meant.
>>> What you need is "an integer type large enough to hold the
>>> difference between two guest pointer values". The size of
>>> that type should depend only on the guest config, not on the
>>> host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong.
>>
>> Your view is very ARM-centric.
> 
> I don't understand this remark. Nothing about the above explanation
> is ARM-centric -- it's just pointing out that the guest and the
> host are two separate things and maximum widths of size and
> pointer types aren't necessarily the same. Assuming they were the
> same would be an x86-64-ism :-)
>>
>> Doing a size check as Mark has demonstrated in ARM code (one place!)
>> seems much simpler to me than hurting all targets just because ARM wants
>> to pass a possibly stupid value unchecked to the common API.
> 
> Where the ARM specific code has particular restrictions then
> I'm happy to have the ARM specific code either relax those, or
> do explicit checks so we can fail cleanly or whatever.
> 
> Putting a "restrict to INT_MAX" in the highbank code is definitely
> wrong (not least because passing values to load_image_targphys isn't
> the only thing we use that field in arm_boot_info for!)
> The ARM boot code needs updating because it shouldn't be
> using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t
> the same as we do for initrd_size is the obvious thing. I have no
> particular objection to having some new target_phys_size_t or whatever,
> and I could be persuaded that we should follow the memory API in
> using uint64_t for sizes, but it needs to be a type that either follows
> guest phys addr restrictions or a fixed-width type which we have decreed
> is always large enough, not a type which varies based on host properties.

I'm reluctant to continue this thread, but I feel obliged to ask:

what types of changes should I make to allow the highbank model to put
with more than 2GB of emulated DRAM?

I've fired off 3 versions of a patch that answers that question, some
of which I've liked more than others. I'm willing to do a reasonable
amount of refactoring the general QEMU image loading code, but I don't
want to do that until I have a sense that the maintainers agree on the
general solution and that I'm working toward their understand.

Thanks for thinking this over.

--Mark Langsdorf
Calxeda, Inc.
Markus Armbruster March 12, 2012, 3:53 p.m. UTC | #22
Peter Maydell <peter.maydell@linaro.org> writes:

[...]
> Putting a "restrict to INT_MAX" in the highbank code is definitely
> wrong (not least because passing values to load_image_targphys isn't
> the only thing we use that field in arm_boot_info for!)
> The ARM boot code needs updating because it shouldn't be
> using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t
> the same as we do for initrd_size is the obvious thing. I have no
> particular objection to having some new target_phys_size_t or whatever,
> and I could be persuaded that we should follow the memory API in
> using uint64_t for sizes, but it needs to be a type that either follows
> guest phys addr restrictions or a fixed-width type which we have decreed
> is always large enough,

There is already a type that is defined to be wide enough for any object
size: size_t.

>                         not a type which varies based on host properties.

To be honest, the whole debate feels like bikeshedding to me.  Yes,
load_image_targphys()'s argument max_sz is the size of a slice of guest
memory.  It's also the size of a host object, allocated with g_malloc0()
in rom_add_file().  It's also the size of a disk file.

I'd make it size_t and be done with it.  If you absolutely must
overengineer things, go ahead and create a new type for target sizes.  I
doubt making it wider than size_t will work in practice without a lot of
hoop jumping, though.

[...]
Alexander Graf March 12, 2012, 4:04 p.m. UTC | #23
On 12.03.2012, at 16:53, Markus Armbruster wrote:

> Peter Maydell <peter.maydell@linaro.org> writes:
> 
> [...]
>> Putting a "restrict to INT_MAX" in the highbank code is definitely
>> wrong (not least because passing values to load_image_targphys isn't
>> the only thing we use that field in arm_boot_info for!)
>> The ARM boot code needs updating because it shouldn't be
>> using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t
>> the same as we do for initrd_size is the obvious thing. I have no
>> particular objection to having some new target_phys_size_t or whatever,
>> and I could be persuaded that we should follow the memory API in
>> using uint64_t for sizes, but it needs to be a type that either follows
>> guest phys addr restrictions or a fixed-width type which we have decreed
>> is always large enough,
> 
> There is already a type that is defined to be wide enough for any object
> size: size_t.
> 
>>                        not a type which varies based on host properties.
> 
> To be honest, the whole debate feels like bikeshedding to me.  Yes,
> load_image_targphys()'s argument max_sz is the size of a slice of guest
> memory.  It's also the size of a host object, allocated with g_malloc0()
> in rom_add_file().  It's also the size of a disk file.
> 
> I'd make it size_t and be done with it.  If you absolutely must
> overengineer things, go ahead and create a new type for target sizes.  I
> doubt making it wider than size_t will work in practice without a lot of
> hoop jumping, though.

I agree. Let's end this discussion and use the biggest variable type we support for addresses: uint64_t. That way we have host independent predictability, but don't use _addr_t types which Andreas seems to dislike.


Alex
Peter Maydell March 12, 2012, 4:09 p.m. UTC | #24
On 12 March 2012 16:04, Alexander Graf <agraf@suse.de> wrote:
> On 12.03.2012, at 16:53, Markus Armbruster wrote:
>> To be honest, the whole debate feels like bikeshedding to me.  Yes,
>> load_image_targphys()'s argument max_sz is the size of a slice of guest
>> memory.  It's also the size of a host object, allocated with g_malloc0()
>> in rom_add_file().  It's also the size of a disk file.
>>
>> I'd make it size_t and be done with it.  If you absolutely must
>> overengineer things, go ahead and create a new type for target sizes.  I
>> doubt making it wider than size_t will work in practice without a lot of
>> hoop jumping, though.
>
> I agree. Let's end this discussion and use the biggest variable type we
> support for addresses: uint64_t. That way we have host independent
> predictability, but don't use _addr_t types which Andreas seems to dislike.

Works for me, and follows the precedent of the memory region API, which
makes sense.

-- PMM
Andreas Färber March 12, 2012, 4:12 p.m. UTC | #25
Am 10.03.2012 16:27, schrieb Peter Maydell:
> On 10 March 2012 14:08, Andreas Färber <afaerber@suse.de> wrote:
>> Am 10.03.2012 14:51, schrieb Peter Maydell:
>>> "Length of a block of memory on the guest" is what I meant.
>>> What you need is "an integer type large enough to hold the
>>> difference between two guest pointer values". The size of
>>> that type should depend only on the guest config, not on the
>>> host, so 'unsigned long', 'size_t', 'off_t' etc are all wrong.
>>
>> Your view is very ARM-centric.
> 
> I don't understand this remark. Nothing about the above explanation
> is ARM-centric
[snip]

ARM-centric is that you and Mark are trying to solve the issue of ARM
boards supplying guest RAM size as limit for image loading. I'm saying
many other boards don't have that issue because they're using a constant
size value way below any theoretical limit.

Now the theory behind this part of the thread is about what type to use
for a delta vs. absolute value. If you subtract void * from void * you
get a ptrdiff_t. Similarly in C# or Java or whatever subtracing two Time
objects will return a TimeDiff object or so. Whether you specify a
temperature in degrees Celsius or Kelvin, the difference is in degrees.
In that same spirit I'm opposed to using target_phys_addr_t for a size
(delta) between two addresses. HTE.

Andreas
Andreas Färber March 12, 2012, 4:14 p.m. UTC | #26
Am 12.03.2012 17:04, schrieb Alexander Graf:
> 
> On 12.03.2012, at 16:53, Markus Armbruster wrote:
> 
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>> [...]
>>> Putting a "restrict to INT_MAX" in the highbank code is definitely
>>> wrong (not least because passing values to load_image_targphys isn't
>>> the only thing we use that field in arm_boot_info for!)
>>> The ARM boot code needs updating because it shouldn't be
>>> using 'int' for arm_boot_info.ram_size, and using target_phys_addr_t
>>> the same as we do for initrd_size is the obvious thing. I have no
>>> particular objection to having some new target_phys_size_t or whatever,
>>> and I could be persuaded that we should follow the memory API in
>>> using uint64_t for sizes, but it needs to be a type that either follows
>>> guest phys addr restrictions or a fixed-width type which we have decreed
>>> is always large enough,
>>
>> There is already a type that is defined to be wide enough for any object
>> size: size_t.
>>
>>>                        not a type which varies based on host properties.
>>
>> To be honest, the whole debate feels like bikeshedding to me.  Yes,
>> load_image_targphys()'s argument max_sz is the size of a slice of guest
>> memory.  It's also the size of a host object, allocated with g_malloc0()
>> in rom_add_file().  It's also the size of a disk file.
>>
>> I'd make it size_t and be done with it.  If you absolutely must
>> overengineer things, go ahead and create a new type for target sizes.  I
>> doubt making it wider than size_t will work in practice without a lot of
>> hoop jumping, though.
> 
> I agree. Let's end this discussion and use the biggest variable type we support for addresses: uint64_t. That way we have host independent predictability, but don't use _addr_t types which Andreas seems to dislike.

Fine with me.

Andreas
diff mbox

Patch

diff --git a/hw/loader.c b/hw/loader.c
index 415cdce..a5333d0 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -103,9 +103,9 @@  ssize_t read_targphys(const char *name,
 
 /* return the size or -1 if error */
 int load_image_targphys(const char *filename,
-			target_phys_addr_t addr, int max_sz)
+                        target_phys_addr_t addr, unsigned long max_sz)
 {
-    int size;
+    unsigned long size;
 
     size = get_image_size(filename);
     if (size > max_sz) {
diff --git a/hw/loader.h b/hw/loader.h
index fbcaba9..35c1652 100644
--- a/hw/loader.h
+++ b/hw/loader.h
@@ -4,7 +4,8 @@ 
 /* loader.c */
 int get_image_size(const char *filename);
 int load_image(const char *filename, uint8_t *addr); /* deprecated */
-int load_image_targphys(const char *filename, target_phys_addr_t, int max_sz);
+int load_image_targphys(const char *filename, target_phys_addr_t,
+                        unsigned long max_sz);
 int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
              void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
              uint64_t *highaddr, int big_endian, int elf_machine,