Patchwork [2/5] Support human unit formats in strtobytes, eg. 1.0G

login
register
mail settings
Submitter Jes Sorensen
Date Sept. 15, 2010, 12:23 p.m.
Message ID <1284553440-17985-3-git-send-email-Jes.Sorensen@redhat.com>
Download mbox | patch
Permalink /patch/64804/
State New
Headers show

Comments

Jes Sorensen - Sept. 15, 2010, 12:23 p.m.
From: Jes Sorensen <Jes.Sorensen@redhat.com>

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
---
 cutils.c |   27 ++++++++++++++++++++++++++-
 1 files changed, 26 insertions(+), 1 deletions(-)
Juan Quintela - Sept. 15, 2010, 2:50 p.m.
Jes.Sorensen@redhat.com wrote:
> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>          value <<= 10;
>          break;
>      case 0:
> +        if (divider) {
> +            value = 0;
> +            break;

changing break by goto fail here?
1.5G and 1.0G is ok, but using 1024.00  or similar should be one error,
no?

nice cleanup for the rest of the patch series.

Later, Juan.


> +        }
>      case 'M':
>      case 'm':
>          value <<= 20;
Paolo Bonzini - Sept. 15, 2010, 3:45 p.m.
On 09/15/2010 02:23 PM, Jes.Sorensen@redhat.com wrote:
>       switch (*endptr++) {
>       case 'K':
>       case 'k':
>           value<<= 10;
>           break;
>       case 0:
> +        if (divider) {
> +            value = 0;
> +            break;
> +        }
>       case 'M':
>       case 'm':
>           value<<= 20;
> @@ -284,9 +306,12 @@ uint64_t strtobytes(const char *nptr, char **end)
>       default:
>           value = 0;
>       }
> +    if (divider)
> +        value /= divider;
>

This risks overflow if you do 1.00000000000000G or something similarly 
braindead.  Do we loathe floating point so much that you cannot use 
strtod, like

     endptr1 = nptr + strspn(s, "0123456789.");
     switch (*endptr1)
     {
     case 0: divider = 1; break;
     case 'm': divider = 1 << 20; break;
     ...
     default: /* error, including for 1.0e+5 and negative */
     }
     value = (uint64_t) (strtod(nptr, &endptr2) / divider);
     if (endptr1 != endptr2) /* error, e.g. 1.2.3 */

     return value;

Paolo
Anthony Liguori - Sept. 15, 2010, 3:50 p.m.
On 09/15/2010 10:45 AM, Paolo Bonzini wrote:
> On 09/15/2010 02:23 PM, Jes.Sorensen@redhat.com wrote:
>>       switch (*endptr++) {
>>       case 'K':
>>       case 'k':
>>           value<<= 10;
>>           break;
>>       case 0:
>> +        if (divider) {
>> +            value = 0;
>> +            break;
>> +        }
>>       case 'M':
>>       case 'm':
>>           value<<= 20;
>> @@ -284,9 +306,12 @@ uint64_t strtobytes(const char *nptr, char **end)
>>       default:
>>           value = 0;
>>       }
>> +    if (divider)
>> +        value /= divider;
>>
>
> This risks overflow if you do 1.00000000000000G or something similarly 
> braindead.  Do we loathe floating point so much that you cannot use 
> strtod, like

It should be strtod.  Only badness can happen otherwise.

Regards,

Anthony Liguori

>     endptr1 = nptr + strspn(s, "0123456789.");
>     switch (*endptr1)
>     {
>     case 0: divider = 1; break;
>     case 'm': divider = 1 << 20; break;
>     ...
>     default: /* error, including for 1.0e+5 and negative */
>     }
>     value = (uint64_t) (strtod(nptr, &endptr2) / divider);
>     if (endptr1 != endptr2) /* error, e.g. 1.2.3 */
>
>     return value;
> Paolo
>
Jes Sorensen - Sept. 15, 2010, 7:29 p.m.
On 09/15/10 16:50, Juan Quintela wrote:
> Jes.Sorensen@redhat.com wrote:
>> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>>          value <<= 10;
>>          break;
>>      case 0:
>> +        if (divider) {
>> +            value = 0;
>> +            break;
> 
> changing break by goto fail here?
> 1.5G and 1.0G is ok, but using 1024.00  or similar should be one error,
> no?
> 
> nice cleanup for the rest of the patch series.

In my testing, 1234.5 fails as expected, so I expect 1024.00 to fail as
well.

Cheers,
Jes
Jes Sorensen - Sept. 15, 2010, 7:31 p.m.
On 09/15/10 17:45, Paolo Bonzini wrote:
> On 09/15/2010 02:23 PM, Jes.Sorensen@redhat.com wrote:
>>       switch (*endptr++) {
>>       case 'K':
>>       case 'k':
>>           value<<= 10;
>>           break;
>>       case 0:
>> +        if (divider) {
>> +            value = 0;
>> +            break;
>> +        }
>>       case 'M':
>>       case 'm':
>>           value<<= 20;
>> @@ -284,9 +306,12 @@ uint64_t strtobytes(const char *nptr, char **end)
>>       default:
>>           value = 0;
>>       }
>> +    if (divider)
>> +        value /= divider;
>>
> 
> This risks overflow if you do 1.00000000000000G or something similarly
> braindead.  Do we loathe floating point so much that you cannot use
> strtod, like

Floating point is just plain wrong. If someone wants to do something
like in your example they really ask for an error.

Jes
Paolo Bonzini - Sept. 16, 2010, 7:19 a.m.
On 09/15/2010 09:31 PM, Jes Sorensen wrote:
> On 09/15/10 17:45, Paolo Bonzini wrote:
>> On 09/15/2010 02:23 PM, Jes.Sorensen@redhat.com wrote:
>>>        switch (*endptr++) {
>>>        case 'K':
>>>        case 'k':
>>>            value<<= 10;
>>>            break;
>>>        case 0:
>>> +        if (divider) {
>>> +            value = 0;
>>> +            break;
>>> +        }
>>>        case 'M':
>>>        case 'm':
>>>            value<<= 20;
>>> @@ -284,9 +306,12 @@ uint64_t strtobytes(const char *nptr, char **end)
>>>        default:
>>>            value = 0;
>>>        }
>>> +    if (divider)
>>> +        value /= divider;
>>>
>>
>> This risks overflow if you do 1.00000000000000G or something similarly
>> braindead.  Do we loathe floating point so much that you cannot use
>> strtod, like
>
> Floating point is just plain wrong. If someone wants to do something
> like in your example they really ask for an error.

An error, not an overflow.

Adding overflow checking on top of your patch is also fine.  Another 
possibility is to look ahead for the multiplier so that you correctly 
base the divider and do everything in 64.64 fixed point.  But it seems 
overkill compared to floating-point, whose 53-bit mantissa precision 
will almost always lead to exact results (large numbers usually have a 
lot of zeros at the end, both in binary and in decimal).

Paolo
Jes Sorensen - Sept. 16, 2010, 10:14 a.m.
On 09/16/10 09:19, Paolo Bonzini wrote:
> On 09/15/2010 09:31 PM, Jes Sorensen wrote:
>> Floating point is just plain wrong. If someone wants to do something
>> like in your example they really ask for an error.
> 
> An error, not an overflow.
> 
> Adding overflow checking on top of your patch is also fine.  Another
> possibility is to look ahead for the multiplier so that you correctly
> base the divider and do everything in 64.64 fixed point.  But it seems
> overkill compared to floating-point, whose 53-bit mantissa precision
> will almost always lead to exact results (large numbers usually have a
> lot of zeros at the end, both in binary and in decimal).

I think it would be quite reasonable not to accept anything more than
say 3-4 decimal points, since there are the t/g/m/k options as well.

Cheers,
Jes
Avi Kivity - Sept. 16, 2010, 10:40 a.m.
On 09/15/2010 09:31 PM, Jes Sorensen wrote:
> Floating point is just plain wrong.

Why?  If command-line processing becomes too slow, you can always buy a 
math co-processor.
Jes Sorensen - Sept. 16, 2010, 10:42 a.m.
On 09/16/10 12:40, Avi Kivity wrote:
>  On 09/15/2010 09:31 PM, Jes Sorensen wrote:
>> Floating point is just plain wrong.
> 
> Why?  If command-line processing becomes too slow, you can always buy a
> math co-processor.
> 

Because it's imprecise anyway and requires dealing with fp regs.
Besides, most users will probably hit their shell command line limit
before hitting the problem with the decimals.

Jes
Avi Kivity - Sept. 16, 2010, 10:46 a.m.
On 09/16/2010 12:42 PM, Jes Sorensen wrote:
> On 09/16/10 12:40, Avi Kivity wrote:
> >   On 09/15/2010 09:31 PM, Jes Sorensen wrote:
> >>  Floating point is just plain wrong.
> >
> >  Why?  If command-line processing becomes too slow, you can always buy a
> >  math co-processor.
> >
>
> Because it's imprecise anyway

52 bits = 4PB.  At that point some rounding will take place.

> and requires dealing with fp regs.

The compiler takes care of allocating registers.

> Besides, most users will probably hit their shell command line limit
> before hitting the problem with the decimals.
>

20 digits will overflow your divider.
Paolo Bonzini - Sept. 16, 2010, 11:09 a.m.
On 09/16/2010 12:42 PM, Jes Sorensen wrote:
> On 09/16/10 12:40, Avi Kivity wrote:
>>   On 09/15/2010 09:31 PM, Jes Sorensen wrote:
>>> Floating point is just plain wrong.
>>
>> Why?  If command-line processing becomes too slow, you can always buy a
>> math co-processor.
>
> Because it's imprecise anyway

As Avi mentioned, this is only true if you need byte precision beyond 4 
PB.  But most of the time byte precision is not necessary so in practice 
floating-point will be indistinguishable: all exact powers of 10 up to 
10^22 (beyond 64-bits) can be represented correctly by an IEEE double.

There's also strtold, if you're worried about precision...

> Besides, most users will probably hit their shell command line limit
> before hitting the problem with the decimals.

Value is first shifted and then multiplied, so that 6-7 digits may 
already overflow if the unit is terabytes.

Paolo

Patch

diff --git a/cutils.c b/cutils.c
index a3087fe..d34ed08 100644
--- a/cutils.c
+++ b/cutils.c
@@ -259,16 +259,38 @@  int fcntl_setfl(int fd, int flag)
  */
 uint64_t strtobytes(const char *nptr, char **end)
 {
-    uint64_t value;
+    uint64_t value, value2;
     char *endptr;
+    int divider = 0;
 
     value = strtoll(nptr, &endptr, 0);
+    if (endptr[0] == '.') {
+        endptr++;
+        value2 = 0;
+        divider = 10;
+        while ((endptr[0] == '0') && (endptr[1] >= '0') && (endptr[1] <= '9')) {
+            divider = divider * 10;
+            endptr++;
+        }
+
+        if ((endptr[0] >= '0') && (endptr[0] <= '9')) {
+            value2 = strtoll(endptr, &endptr, 0);
+            value = value * divider + value2;
+        } else {
+            value = 0;
+            goto fail;
+        }
+    }
     switch (*endptr++) {
     case 'K':
     case 'k':
         value <<= 10;
         break;
     case 0:
+        if (divider) {
+            value = 0;
+            break;
+        }
     case 'M':
     case 'm':
         value <<= 20;
@@ -284,9 +306,12 @@  uint64_t strtobytes(const char *nptr, char **end)
     default:
         value = 0;
     }
+    if (divider)
+        value /= divider;
 
     if (end)
         *end = endptr;
 
+fail:
     return value;
 }