Message ID | 20230525161450.3704901-1-qing.zhao@oracle.com |
---|---|
Headers | show |
Series | New attribute "element_count" to annotate bounds for C99 FAM(PR108896) | expand |
On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote: > This patch set introduces a new attribute "element_count" to annotate bounds > for C99 flexible array member. Thank you for this work! I'm really excited to start using it in the Linux kernel. I'll give this a spin, but I know you've already been testing this series against the test cases I created earlier, so I don't expect any problems. :) One bike-shedding note: with the recent "-fbounds-safety" RFC posted for LLVM, we may want to consider renaming "element_count" to "counted_by": https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/ Thanks again! -Kees
On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote: > GCC will pass the number of elements info from the attached attribute to both > __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds > or dynamic object size issues during runtime for flexible array members. > > This new feature will provide nice protection to flexible array members (which > currently are completely ignored by both __builtin_dynamic_object_size and > bounds sanitizers). Testing went pretty well, though I think I found some bdos issues: - some things that bdos can't know the size of, and correctly returned SIZE_MAX in the past, now thinks are 0-sized. - while bdos correctly knows the size of an element_count-annotated flexible array, it doesn't know the size of the containing object (i.e. it returns SIZE_MAX). Also, I think I found a precedence issue: - if both __alloc_size and 'element_count' are in use, the _smallest_ of the two is what I would expect to be enforced by the sanitizer and reported by __bdos. As is, alloc_size appears to be used when it is available, regardless of what 'element_count' shows. I've updated my test cases to show it more clearly, but here is the before/after: GCC 13 (correctly does not implement "element_count"): $ ./array-bounds 2>&1 | grep -v ^'#' TAP version 13 1..12 ok 1 global.fixed_size_seen_by_bdos ok 2 global.fixed_size_enforced_by_sanitizer ok 3 global.unknown_size_unknown_to_bdos ok 4 global.unknown_size_ignored_by_sanitizer ok 5 global.alloc_size_seen_by_bdos ok 6 global.alloc_size_enforced_by_sanitizer not ok 7 global.element_count_seen_by_bdos not ok 8 global.element_count_enforced_by_sanitizer not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer ToT GCC + this element_count series: $ ./array-bounds 2>&1 | grep -v ^'#' TAP version 13 1..12 ok 1 global.fixed_size_seen_by_bdos ok 2 global.fixed_size_enforced_by_sanitizer not ok 3 global.unknown_size_unknown_to_bdos not ok 4 global.unknown_size_ignored_by_sanitizer ok 5 global.alloc_size_seen_by_bdos ok 6 global.alloc_size_enforced_by_sanitizer not ok 7 global.element_count_seen_by_bdos ok 8 global.element_count_enforced_by_sanitizer not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer Test suite is here: https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c
> On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote: > > On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote: >> GCC will pass the number of elements info from the attached attribute to both >> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds >> or dynamic object size issues during runtime for flexible array members. >> >> This new feature will provide nice protection to flexible array members (which >> currently are completely ignored by both __builtin_dynamic_object_size and >> bounds sanitizers). > > Testing went pretty well, though I think I found some bdos issues: > > - some things that bdos can't know the size of, and correctly returned > SIZE_MAX in the past, now thinks are 0-sized. Will check this issue and fix it. > - while bdos correctly knows the size of an element_count-annotated > flexible array, it doesn't know the size of the containing object > (i.e. it returns SIZE_MAX). This is a known issue I found during the implementation, and filed a bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557 for it And turned out that this was expected behavior. > > Also, I think I found a precedence issue: > > - if both __alloc_size and 'element_count' are in use, the _smallest_ > of the two is what I would expect to be enforced by the sanitizer > and reported by __bdos. As is, alloc_size appears to be used when > it is available, regardless of what 'element_count' shows. Will check on this and fix it. > > I've updated my test cases to show it more clearly, but here is the > before/after: > > > GCC 13 (correctly does not implement "element_count"): > > $ ./array-bounds 2>&1 | grep -v ^'#' > TAP version 13 > 1..12 > ok 1 global.fixed_size_seen_by_bdos > ok 2 global.fixed_size_enforced_by_sanitizer > ok 3 global.unknown_size_unknown_to_bdos > ok 4 global.unknown_size_ignored_by_sanitizer > ok 5 global.alloc_size_seen_by_bdos > ok 6 global.alloc_size_enforced_by_sanitizer > not ok 7 global.element_count_seen_by_bdos > not ok 8 global.element_count_enforced_by_sanitizer > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer > > > ToT GCC + this element_count series: > > $ ./array-bounds 2>&1 | grep -v ^'#' > TAP version 13 > 1..12 > ok 1 global.fixed_size_seen_by_bdos > ok 2 global.fixed_size_enforced_by_sanitizer > not ok 3 global.unknown_size_unknown_to_bdos > not ok 4 global.unknown_size_ignored_by_sanitizer > ok 5 global.alloc_size_seen_by_bdos > ok 6 global.alloc_size_enforced_by_sanitizer > not ok 7 global.element_count_seen_by_bdos > ok 8 global.element_count_enforced_by_sanitizer > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer > > > Test suite is here: > https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c Thanks a lot for the testing, this is really helpful. Will study and fix all these issues. Qing > > -- > Kees Cook
> On May 26, 2023, at 12:12 PM, Kees Cook <keescook@chromium.org> wrote: > > On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote: >> This patch set introduces a new attribute "element_count" to annotate bounds >> for C99 flexible array member. > > Thank you for this work! I'm really excited to start using it in the > Linux kernel. I'll give this a spin, but I know you've already been > testing this series against the test cases I created earlier, so I don't > expect any problems. :) > > One bike-shedding note: with the recent "-fbounds-safety" RFC posted for > LLVM, we may want to consider renaming "element_count" to "counted_by": > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/ Yeah, we can do that. thanks. Qing > > Thanks again! > > -Kees > > -- > Kees Cook
Hi, Kees, I have updated my V1 patch with the following changes: A. changed the name to "counted_by" B. changed the argument from a string to an identifier C. updated the documentation and testing cases accordingly. And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change) [opc@qinzhao-ol8u3-x86 Kees]$ !1091 diff array-bounds.c array-bounds.c.org 32c32 < # define __counted_by(member) __attribute__((counted_by (member))) --- > # define __counted_by(member) __attribute__((__element_count__(#member))) 34c34 < # define __counted_by(member) __attribute__((counted_by (member))) --- > # define __counted_by(member) /* __attribute__((__element_count__(#member))) */ Then I got the following result: [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#' TAP version 13 1..12 ok 1 global.fixed_size_seen_by_bdos ok 2 global.fixed_size_enforced_by_sanitizer not ok 3 global.unknown_size_unknown_to_bdos not ok 4 global.unknown_size_ignored_by_sanitizer ok 5 global.alloc_size_seen_by_bdos ok 6 global.alloc_size_enforced_by_sanitizer not ok 7 global.element_count_seen_by_bdos ok 8 global.element_count_enforced_by_sanitizer not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them. in a summary, there are two major issues: 1. The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557 Which is not a bug, it’s an expected behavior. 2. The common issue for the failed testing 3, 4, 9, 10 is: for the following annotated structure: ==== struct annotated { unsigned long flags; size_t foo; int array[] __attribute__((counted_by (foo))); }; struct annotated *p; int index = 16; p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 or p->foo was not set to any value as in test 3 and 4 ==== i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. I think that this should be considered as an user error, and the documentation of the attribute should include this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) We can add a new warning option -Wcounted-by to report such user error if needed. What’s your opinion on this? thanks. Qing > On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote: > > On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote: >> GCC will pass the number of elements info from the attached attribute to both >> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds >> or dynamic object size issues during runtime for flexible array members. >> >> This new feature will provide nice protection to flexible array members (which >> currently are completely ignored by both __builtin_dynamic_object_size and >> bounds sanitizers). > > Testing went pretty well, though I think I found some bdos issues: > > - some things that bdos can't know the size of, and correctly returned > SIZE_MAX in the past, now thinks are 0-sized. > - while bdos correctly knows the size of an element_count-annotated > flexible array, it doesn't know the size of the containing object > (i.e. it returns SIZE_MAX). > > Also, I think I found a precedence issue: > > - if both __alloc_size and 'element_count' are in use, the _smallest_ > of the two is what I would expect to be enforced by the sanitizer > and reported by __bdos. As is, alloc_size appears to be used when > it is available, regardless of what 'element_count' shows. > > I've updated my test cases to show it more clearly, but here is the > before/after: > > > GCC 13 (correctly does not implement "element_count"): > > $ ./array-bounds 2>&1 | grep -v ^'#' > TAP version 13 > 1..12 > ok 1 global.fixed_size_seen_by_bdos > ok 2 global.fixed_size_enforced_by_sanitizer > ok 3 global.unknown_size_unknown_to_bdos > ok 4 global.unknown_size_ignored_by_sanitizer > ok 5 global.alloc_size_seen_by_bdos > ok 6 global.alloc_size_enforced_by_sanitizer > not ok 7 global.element_count_seen_by_bdos > not ok 8 global.element_count_enforced_by_sanitizer > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer > > > ToT GCC + this element_count series: > > $ ./array-bounds 2>&1 | grep -v ^'#' > TAP version 13 > 1..12 > ok 1 global.fixed_size_seen_by_bdos > ok 2 global.fixed_size_enforced_by_sanitizer > not ok 3 global.unknown_size_unknown_to_bdos > not ok 4 global.unknown_size_ignored_by_sanitizer > ok 5 global.alloc_size_seen_by_bdos > ok 6 global.alloc_size_enforced_by_sanitizer > not ok 7 global.element_count_seen_by_bdos > ok 8 global.element_count_enforced_by_sanitizer > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer > > > Test suite is here: > https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c > > -- > Kees Cook
Am Donnerstag, dem 06.07.2023 um 18:56 +0000 schrieb Qing Zhao: > Hi, Kees, > > I have updated my V1 patch with the following changes: > A. changed the name to "counted_by" > B. changed the argument from a string to an identifier > C. updated the documentation and testing cases accordingly. > > And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change) > [opc@qinzhao-ol8u3-x86 Kees]$ !1091 > diff array-bounds.c array-bounds.c.org > 32c32 > < # define __counted_by(member) __attribute__((counted_by (member))) > --- > > # define __counted_by(member) __attribute__((__element_count__(#member))) > 34c34 > < # define __counted_by(member) __attribute__((counted_by (member))) > --- > > # define __counted_by(member) /* __attribute__((__element_count__(#member))) */ > > Then I got the following result: > [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#' > TAP version 13 > 1..12 > ok 1 global.fixed_size_seen_by_bdos > ok 2 global.fixed_size_enforced_by_sanitizer > not ok 3 global.unknown_size_unknown_to_bdos > not ok 4 global.unknown_size_ignored_by_sanitizer > ok 5 global.alloc_size_seen_by_bdos > ok 6 global.alloc_size_enforced_by_sanitizer > not ok 7 global.element_count_seen_by_bdos > ok 8 global.element_count_enforced_by_sanitizer > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer > > The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them. > > in a summary, there are two major issues: > 1. The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557 > Which is not a bug, it’s an expected behavior. > > 2. The common issue for the failed testing 3, 4, 9, 10 is: > > for the following annotated structure: > > ==== > struct annotated { > unsigned long flags; > size_t foo; > int array[] __attribute__((counted_by (foo))); > }; > > > struct annotated *p; > int index = 16; > > p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size > > p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 > or > p->foo was not set to any value as in test 3 and 4 > > ==== > > i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. > > I think that this should be considered as an user error, and the documentation of the attribute should include > this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) > > We can add a new warning option -Wcounted-by to report such user error if needed. > > What’s your opinion on this? Additionally, we could also have a sanitizer that checks this at run-time. Personally, I am still not very happy that in the following example the two 'n's refer to different entities: void f(int n) { struct foo { int n; int (*p[])[n] [[counted_by(n)]]; }; } But I guess it will be difficult to convince everybody that it would be wise to use a new syntax for disambiguation: void f(int n) { struct foo { int n; int (*p[])[n] [[counted_by(.n)]]; }; } Martin > > thanks. > > Qing > > > > On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote: > > > > On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote: > > > GCC will pass the number of elements info from the attached attribute to both > > > __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds > > > or dynamic object size issues during runtime for flexible array members. > > > > > > This new feature will provide nice protection to flexible array members (which > > > currently are completely ignored by both __builtin_dynamic_object_size and > > > bounds sanitizers). > > > > Testing went pretty well, though I think I found some bdos issues: > > > > - some things that bdos can't know the size of, and correctly returned > > SIZE_MAX in the past, now thinks are 0-sized. > > - while bdos correctly knows the size of an element_count-annotated > > flexible array, it doesn't know the size of the containing object > > (i.e. it returns SIZE_MAX). > > > > Also, I think I found a precedence issue: > > > > - if both __alloc_size and 'element_count' are in use, the _smallest_ > > of the two is what I would expect to be enforced by the sanitizer > > and reported by __bdos. As is, alloc_size appears to be used when > > it is available, regardless of what 'element_count' shows. > > > > I've updated my test cases to show it more clearly, but here is the > > before/after: > > > > > > GCC 13 (correctly does not implement "element_count"): > > > > $ ./array-bounds 2>&1 | grep -v ^'#' > > TAP version 13 > > 1..12 > > ok 1 global.fixed_size_seen_by_bdos > > ok 2 global.fixed_size_enforced_by_sanitizer > > ok 3 global.unknown_size_unknown_to_bdos > > ok 4 global.unknown_size_ignored_by_sanitizer > > ok 5 global.alloc_size_seen_by_bdos > > ok 6 global.alloc_size_enforced_by_sanitizer > > not ok 7 global.element_count_seen_by_bdos > > not ok 8 global.element_count_enforced_by_sanitizer > > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos > > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer > > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos > > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer > > > > > > ToT GCC + this element_count series: > > > > $ ./array-bounds 2>&1 | grep -v ^'#' > > TAP version 13 > > 1..12 > > ok 1 global.fixed_size_seen_by_bdos > > ok 2 global.fixed_size_enforced_by_sanitizer > > not ok 3 global.unknown_size_unknown_to_bdos > > not ok 4 global.unknown_size_ignored_by_sanitizer > > ok 5 global.alloc_size_seen_by_bdos > > ok 6 global.alloc_size_enforced_by_sanitizer > > not ok 7 global.element_count_seen_by_bdos > > ok 8 global.element_count_enforced_by_sanitizer > > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos > > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer > > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos > > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer > > > > > > Test suite is here: > > https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c > > > > -- > > Kees Cook >
> On Jul 6, 2023, at 5:10 PM, Martin Uecker <uecker@tugraz.at> wrote: > > Am Donnerstag, dem 06.07.2023 um 18:56 +0000 schrieb Qing Zhao: >> Hi, Kees, >> >> I have updated my V1 patch with the following changes: >> A. changed the name to "counted_by" >> B. changed the argument from a string to an identifier >> C. updated the documentation and testing cases accordingly. >> >> And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change) >> [opc@qinzhao-ol8u3-x86 Kees]$ !1091 >> diff array-bounds.c array-bounds.c.org >> 32c32 >> < # define __counted_by(member) __attribute__((counted_by (member))) >> --- >>> # define __counted_by(member) __attribute__((__element_count__(#member))) >> 34c34 >> < # define __counted_by(member) __attribute__((counted_by (member))) >> --- >>> # define __counted_by(member) /* __attribute__((__element_count__(#member))) */ >> >> Then I got the following result: >> [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#' >> TAP version 13 >> 1..12 >> ok 1 global.fixed_size_seen_by_bdos >> ok 2 global.fixed_size_enforced_by_sanitizer >> not ok 3 global.unknown_size_unknown_to_bdos >> not ok 4 global.unknown_size_ignored_by_sanitizer >> ok 5 global.alloc_size_seen_by_bdos >> ok 6 global.alloc_size_enforced_by_sanitizer >> not ok 7 global.element_count_seen_by_bdos >> ok 8 global.element_count_enforced_by_sanitizer >> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos >> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer >> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos >> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer >> >> The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them. >> >> in a summary, there are two major issues: >> 1. The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557 >> Which is not a bug, it’s an expected behavior. >> >> 2. The common issue for the failed testing 3, 4, 9, 10 is: >> >> for the following annotated structure: >> >> ==== >> struct annotated { >> unsigned long flags; >> size_t foo; >> int array[] __attribute__((counted_by (foo))); >> }; >> >> >> struct annotated *p; >> int index = 16; >> >> p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size >> >> p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 >> or >> p->foo was not set to any value as in test 3 and 4 >> >> ==== >> >> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. >> >> I think that this should be considered as an user error, and the documentation of the attribute should include >> this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: >> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) >> >> We can add a new warning option -Wcounted-by to report such user error if needed. >> >> What’s your opinion on this? > > > Additionally, we could also have a sanitizer that > checks this at run-time. Yes, that’s also a nice feature to have. I think that the main point here is to catch such user errors during compilation time or run time. I will add one or two separate patches for these compilation warning and sanitizer feature. > > Personally, I am still not very happy that in the > following example the two 'n's refer to different > entities: > > void f(int n) > { > struct foo { > int n; > int (*p[])[n] [[counted_by(n)]]; > }; > } > Me either )-: > But I guess it will be difficult to convince everybody > that it would be wise to use a new syntax for > disambiguation: > > void f(int n) > { > struct foo { > int n; > int (*p[])[n] [[counted_by(.n)]]; > }; > } > I guess that it’s quite hard to convince everyone that the new syntax is the best solution at this moment. And it might not worth the effort at this time. We can do the new syntax later if necessary. thanks. Qing > Martin > > >> >> thanks. >> >> Qing >> >> >>> On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote: >>> >>> On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote: >>>> GCC will pass the number of elements info from the attached attribute to both >>>> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds >>>> or dynamic object size issues during runtime for flexible array members. >>>> >>>> This new feature will provide nice protection to flexible array members (which >>>> currently are completely ignored by both __builtin_dynamic_object_size and >>>> bounds sanitizers). >>> >>> Testing went pretty well, though I think I found some bdos issues: >>> >>> - some things that bdos can't know the size of, and correctly returned >>> SIZE_MAX in the past, now thinks are 0-sized. >>> - while bdos correctly knows the size of an element_count-annotated >>> flexible array, it doesn't know the size of the containing object >>> (i.e. it returns SIZE_MAX). >>> >>> Also, I think I found a precedence issue: >>> >>> - if both __alloc_size and 'element_count' are in use, the _smallest_ >>> of the two is what I would expect to be enforced by the sanitizer >>> and reported by __bdos. As is, alloc_size appears to be used when >>> it is available, regardless of what 'element_count' shows. >>> >>> I've updated my test cases to show it more clearly, but here is the >>> before/after: >>> >>> >>> GCC 13 (correctly does not implement "element_count"): >>> >>> $ ./array-bounds 2>&1 | grep -v ^'#' >>> TAP version 13 >>> 1..12 >>> ok 1 global.fixed_size_seen_by_bdos >>> ok 2 global.fixed_size_enforced_by_sanitizer >>> ok 3 global.unknown_size_unknown_to_bdos >>> ok 4 global.unknown_size_ignored_by_sanitizer >>> ok 5 global.alloc_size_seen_by_bdos >>> ok 6 global.alloc_size_enforced_by_sanitizer >>> not ok 7 global.element_count_seen_by_bdos >>> not ok 8 global.element_count_enforced_by_sanitizer >>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos >>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer >>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos >>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer >>> >>> >>> ToT GCC + this element_count series: >>> >>> $ ./array-bounds 2>&1 | grep -v ^'#' >>> TAP version 13 >>> 1..12 >>> ok 1 global.fixed_size_seen_by_bdos >>> ok 2 global.fixed_size_enforced_by_sanitizer >>> not ok 3 global.unknown_size_unknown_to_bdos >>> not ok 4 global.unknown_size_ignored_by_sanitizer >>> ok 5 global.alloc_size_seen_by_bdos >>> ok 6 global.alloc_size_enforced_by_sanitizer >>> not ok 7 global.element_count_seen_by_bdos >>> ok 8 global.element_count_enforced_by_sanitizer >>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos >>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer >>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos >>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer >>> >>> >>> Test suite is here: >>> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c >>> >>> -- >>> Kees Cook >> > >
The following is the updated documentation on this new attribute, please let me know any suggestion and comment: ====== 'counted_by (COUNT)' The 'counted_by' attribute may be attached to the flexible array member of a structure. It indicates that the number of the elements of the array is given by the field named "COUNT" in the same structure as the flexible array member. GCC uses this information to improve the results of '__builtin_dynamic_object_size' and array bound sanitizer. For instance, the following declaration: struct P { size_t count; int array[] __attribute__ ((counted_by (count))); } *p; specifies that the 'array' is a flexible array member whose number of elements is given by the field 'count' in the same structure. The field that represents the number of the elements should have an integer type. An explicit 'counted_by' annotation defines a relationship between two objects, 'p->array' and 'p->count', that 'p->array' has 'p->count' number of elements available. This relationship must hold even after any of these related objects are updated. It's the user's responsibility to make sure this relationship to be kept all the time. Otherwise the results of the '__builtin_dynamic_object_size' and array bound sanitizer might be incorrect. For instance, the following 2nd update to the field 'count' of the above structure will permit out-of-bounds access to the array 'sbuf>array': struct P *sbuf; void alloc_buf (size_t nelems) { sbuf = (int *) malloc (sizeof (struct P) + sizeof (int) * nelems); sbuf->count = nelems; } void use_buf (int index) { sbuf->count++; /* Now the value of sbuf->count and the number of elements of sbuf->array is out of sync. */ sbuf->array[index] = 0; /* then the out-of-bound access to this array might not be detected. */ } The users can use the warning option '-Wcounted-by-attribute' to detect such user errors during compilation time, or the sanitizer option '-fsanitize=counted-by-attribute' to detect such user errors during runtime. ===== Qing > On Jul 7, 2023, at 11:47 AM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > > >> On Jul 6, 2023, at 5:10 PM, Martin Uecker <uecker@tugraz.at> wrote: >> >> Am Donnerstag, dem 06.07.2023 um 18:56 +0000 schrieb Qing Zhao: >>> Hi, Kees, >>> >>> I have updated my V1 patch with the following changes: >>> A. changed the name to "counted_by" >>> B. changed the argument from a string to an identifier >>> C. updated the documentation and testing cases accordingly. >>> >>> And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change) >>> [opc@qinzhao-ol8u3-x86 Kees]$ !1091 >>> diff array-bounds.c array-bounds.c.org >>> 32c32 >>> < # define __counted_by(member) __attribute__((counted_by (member))) >>> --- >>>> # define __counted_by(member) __attribute__((__element_count__(#member))) >>> 34c34 >>> < # define __counted_by(member) __attribute__((counted_by (member))) >>> --- >>>> # define __counted_by(member) /* __attribute__((__element_count__(#member))) */ >>> >>> Then I got the following result: >>> [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#' >>> TAP version 13 >>> 1..12 >>> ok 1 global.fixed_size_seen_by_bdos >>> ok 2 global.fixed_size_enforced_by_sanitizer >>> not ok 3 global.unknown_size_unknown_to_bdos >>> not ok 4 global.unknown_size_ignored_by_sanitizer >>> ok 5 global.alloc_size_seen_by_bdos >>> ok 6 global.alloc_size_enforced_by_sanitizer >>> not ok 7 global.element_count_seen_by_bdos >>> ok 8 global.element_count_enforced_by_sanitizer >>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos >>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer >>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos >>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer >>> >>> The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them. >>> >>> in a summary, there are two major issues: >>> 1. The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557 >>> Which is not a bug, it’s an expected behavior. >>> >>> 2. The common issue for the failed testing 3, 4, 9, 10 is: >>> >>> for the following annotated structure: >>> >>> ==== >>> struct annotated { >>> unsigned long flags; >>> size_t foo; >>> int array[] __attribute__((counted_by (foo))); >>> }; >>> >>> >>> struct annotated *p; >>> int index = 16; >>> >>> p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size >>> >>> p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 >>> or >>> p->foo was not set to any value as in test 3 and 4 >>> >>> ==== >>> >>> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. >>> >>> I think that this should be considered as an user error, and the documentation of the attribute should include >>> this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: >>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) >>> >>> We can add a new warning option -Wcounted-by to report such user error if needed. >>> >>> What’s your opinion on this? >> >> >> Additionally, we could also have a sanitizer that >> checks this at run-time. > > Yes, that’s also a nice feature to have. > I think that the main point here is to catch such user errors during compilation time or run time. > > I will add one or two separate patches for these compilation warning and sanitizer feature. > > >> >> Personally, I am still not very happy that in the >> following example the two 'n's refer to different >> entities: >> >> void f(int n) >> { >> struct foo { >> int n; >> int (*p[])[n] [[counted_by(n)]]; >> }; >> } >> > Me either )-: > > >> But I guess it will be difficult to convince everybody >> that it would be wise to use a new syntax for >> disambiguation: >> >> void f(int n) >> { >> struct foo { >> int n; >> int (*p[])[n] [[counted_by(.n)]]; >> }; >> } >> > > I guess that it’s quite hard to convince everyone that the new syntax is the best solution at this moment. > And it might not worth the effort at this time. > > We can do the new syntax later if necessary. > > thanks. > > Qing > >> Martin >> >> >>> >>> thanks. >>> >>> Qing >>> >>> >>>> On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote: >>>> >>>> On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote: >>>>> GCC will pass the number of elements info from the attached attribute to both >>>>> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds >>>>> or dynamic object size issues during runtime for flexible array members. >>>>> >>>>> This new feature will provide nice protection to flexible array members (which >>>>> currently are completely ignored by both __builtin_dynamic_object_size and >>>>> bounds sanitizers). >>>> >>>> Testing went pretty well, though I think I found some bdos issues: >>>> >>>> - some things that bdos can't know the size of, and correctly returned >>>> SIZE_MAX in the past, now thinks are 0-sized. >>>> - while bdos correctly knows the size of an element_count-annotated >>>> flexible array, it doesn't know the size of the containing object >>>> (i.e. it returns SIZE_MAX). >>>> >>>> Also, I think I found a precedence issue: >>>> >>>> - if both __alloc_size and 'element_count' are in use, the _smallest_ >>>> of the two is what I would expect to be enforced by the sanitizer >>>> and reported by __bdos. As is, alloc_size appears to be used when >>>> it is available, regardless of what 'element_count' shows. >>>> >>>> I've updated my test cases to show it more clearly, but here is the >>>> before/after: >>>> >>>> >>>> GCC 13 (correctly does not implement "element_count"): >>>> >>>> $ ./array-bounds 2>&1 | grep -v ^'#' >>>> TAP version 13 >>>> 1..12 >>>> ok 1 global.fixed_size_seen_by_bdos >>>> ok 2 global.fixed_size_enforced_by_sanitizer >>>> ok 3 global.unknown_size_unknown_to_bdos >>>> ok 4 global.unknown_size_ignored_by_sanitizer >>>> ok 5 global.alloc_size_seen_by_bdos >>>> ok 6 global.alloc_size_enforced_by_sanitizer >>>> not ok 7 global.element_count_seen_by_bdos >>>> not ok 8 global.element_count_enforced_by_sanitizer >>>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos >>>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer >>>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos >>>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer >>>> >>>> >>>> ToT GCC + this element_count series: >>>> >>>> $ ./array-bounds 2>&1 | grep -v ^'#' >>>> TAP version 13 >>>> 1..12 >>>> ok 1 global.fixed_size_seen_by_bdos >>>> ok 2 global.fixed_size_enforced_by_sanitizer >>>> not ok 3 global.unknown_size_unknown_to_bdos >>>> not ok 4 global.unknown_size_ignored_by_sanitizer >>>> ok 5 global.alloc_size_seen_by_bdos >>>> ok 6 global.alloc_size_enforced_by_sanitizer >>>> not ok 7 global.element_count_seen_by_bdos >>>> ok 8 global.element_count_enforced_by_sanitizer >>>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos >>>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer >>>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos >>>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer >>>> >>>> >>>> Test suite is here: >>>> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c >>>> >>>> -- >>>> Kees Cook
On Thu, Jul 06, 2023 at 06:56:21PM +0000, Qing Zhao wrote: > Hi, Kees, > > I have updated my V1 patch with the following changes: > A. changed the name to "counted_by" > B. changed the argument from a string to an identifier > C. updated the documentation and testing cases accordingly. Sounds great! > > And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change) > [opc@qinzhao-ol8u3-x86 Kees]$ !1091 > diff array-bounds.c array-bounds.c.org > 32c32 > < # define __counted_by(member) __attribute__((counted_by (member))) > --- > > # define __counted_by(member) __attribute__((__element_count__(#member))) > 34c34 > < # define __counted_by(member) __attribute__((counted_by (member))) > --- > > # define __counted_by(member) /* __attribute__((__element_count__(#member))) */ > > Then I got the following result: > [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#' > TAP version 13 > 1..12 > ok 1 global.fixed_size_seen_by_bdos > ok 2 global.fixed_size_enforced_by_sanitizer > not ok 3 global.unknown_size_unknown_to_bdos > not ok 4 global.unknown_size_ignored_by_sanitizer > ok 5 global.alloc_size_seen_by_bdos > ok 6 global.alloc_size_enforced_by_sanitizer > not ok 7 global.element_count_seen_by_bdos > ok 8 global.element_count_enforced_by_sanitizer > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer > > The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them. > > in a summary, there are two major issues: > 1. The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557 > Which is not a bug, it’s an expected behavior. In the bug, the problem is that "p" isn't known to be allocated, if I'm reading that correctly? I'm not sure this is a reasonable behavior, but let me get into the specific test, which looks like this: TEST(counted_by_seen_by_bdos) { struct annotated *p; int index = MAX_INDEX + unconst; p = alloc_annotated(index); REPORT_SIZE(p->array); /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array)); /* Check array size alone. */ /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX); /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array)); /* Check check entire object size. */ /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX); /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array)); } Test 1 trivially passes -- this is just a sanity check. Test 2 should be SIZE_MAX because __bos only knows compile-time sizes. Test 3 should pass: counted_by knows the allocation size and can be queried as part of the "p" object. Test 4 should be SIZE_MAX because __bos only knows compile-time sizes. Test 5 should pass as well, since, again, p can be examined. Passing p to __bdos implies it is allocated and the __counted_by annotation can be examined. If "return p->array[index];" would be caught by the sanitizer, then it follows that __builtin_dynamic_object_size(p, 1) must also know the size. i.e. both must examine "p" and its trailing flexible array and __counted_by annotation. > > 2. The common issue for the failed testing 3, 4, 9, 10 is: > > for the following annotated structure: > > ==== > struct annotated { > unsigned long flags; > size_t foo; > int array[] __attribute__((counted_by (foo))); > }; > > > struct annotated *p; > int index = 16; > > p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size > > p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 Right, tests 9 and 10 are checking that the _smallest_ possible value of the array is used. (There are two sources of information: the allocation size and the size calculated by counted_by. The smaller of the two should be used when both are available.) > or > p->foo was not set to any value as in test 3 and 4 For tests 3 and 4, yes, this was my mistake. I have fixed up the tests now. Bill noticed the same issue. Sorry for the think-o! > ==== > > i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. > > I think that this should be considered as an user error, and the documentation of the attribute should include > this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) > > We can add a new warning option -Wcounted-by to report such user error if needed. > > What’s your opinion on this? I think it is correct to allow mismatch between allocation and counted_by as long as only the least of the two is used. This may be desirable in a few situations. One example would be a large allocation that is slowly filled up by the program. I.e. the counted_by member is slowly increased during runtime (but not beyond the true allocation size). Of course allocation size is only available in limited situations, so the loss of that info is fine: we have counted_by for everything else.
> On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> wrote: > > In the bug, the problem is that "p" isn't known to be allocated, if I'm > reading that correctly? I think that the major point in PR109557 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557): for the following pointer p.3_1, p.3_1 = p; _2 = __builtin_object_size (p.3_1, 0); Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p when the TYPE_SIZE can be determined at compile time? Answer: From just knowing the type of the pointee of p, the compiler cannot determine the size of the object. Therefore the bug has been closed. In your following testing 5: > I'm not sure this is a reasonable behavior, but > let me get into the specific test, which looks like this: > > TEST(counted_by_seen_by_bdos) > { > struct annotated *p; > int index = MAX_INDEX + unconst; > > p = alloc_annotated(index); > > REPORT_SIZE(p->array); > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array)); > /* Check array size alone. */ > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX); > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array)); > /* Check check entire object size. */ > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX); > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array)); > } > > Test 5 should pass as well, since, again, p can be examined. Passing p > to __bdos implies it is allocated and the __counted_by annotation can be > examined. Since the call to the routine “alloc_annotated" cannot be inlined, GCC does not see any allocation calls for the pointer p. At the same time, due to the same reason as PR109986, GCC cannot determine the size of the object by just knowing the TYPE_SIZE of the pointee of p. So, this is exactly the same issue as PR109557. It’s an existing behavior per the current __buitlin_object_size algorithm. I am still not very sure whether the situation in PR109557 can be improved or not, but either way, it’s a separate issue. Please see the new testing case I added for PR109557, you will see that the above case 5 is a similar case as the new testing case in PR109557. > > If "return p->array[index];" would be caught by the sanitizer, then > it follows that __builtin_dynamic_object_size(p, 1) must also know the > size. i.e. both must examine "p" and its trailing flexible array and > __counted_by annotation. > >> >> 2. The common issue for the failed testing 3, 4, 9, 10 is: >> >> for the following annotated structure: >> >> ==== >> struct annotated { >> unsigned long flags; >> size_t foo; >> int array[] __attribute__((counted_by (foo))); >> }; >> >> >> struct annotated *p; >> int index = 16; >> >> p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size >> >> p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 > > Right, tests 9 and 10 are checking that the _smallest_ possible value of > the array is used. (There are two sources of information: the allocation > size and the size calculated by counted_by. The smaller of the two > should be used when both are available.) The counted_by attribute is used to annotate a Flexible array member on how many elements it will have. However, if this information can not accurately reflect the real number of elements for the array allocated, What’s the purpose of such information? >> or >> p->foo was not set to any value as in test 3 and 4 > > For tests 3 and 4, yes, this was my mistake. I have fixed up the tests > now. Bill noticed the same issue. Sorry for the think-o! > >> ==== >> >> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. >> >> I think that this should be considered as an user error, and the documentation of the attribute should include >> this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: >> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) >> >> We can add a new warning option -Wcounted-by to report such user error if needed. >> >> What’s your opinion on this? > > I think it is correct to allow mismatch between allocation and > counted_by as long as only the least of the two is used. What’s your mean by “only the least of the two is used”? > This may be > desirable in a few situations. One example would be a large allocation > that is slowly filled up by the program. So, for such situation, whenever the allocation is filled up, the field that hold the “counted_by” attribute should be increased at the same time, Then, the “counted_by” value always sync with the real allocation. > I.e. the counted_by member is > slowly increased during runtime (but not beyond the true allocation size). Then there should be source code to increase the “counted_by” field whenever the allocated space increased too. > > Of course allocation size is only available in limited situations, so > the loss of that info is fine: we have counted_by for everything else. The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement: https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 Qing > > -- > Kees Cook
On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote: > > > On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> wrote: > > > > In the bug, the problem is that "p" isn't known to be allocated, if I'm > > reading that correctly? > > I think that the major point in PR109557 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557): > > for the following pointer p.3_1, > > p.3_1 = p; > _2 = __builtin_object_size (p.3_1, 0); > > Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p when the TYPE_SIZE can be determined at compile time? > > Answer: From just knowing the type of the pointee of p, the compiler cannot determine the size of the object. Why is that? "p" points to "struct P", which has a fixed size. There must be an assumption somewhere that a pointer is allocated, otherwise __bos would almost never work? > Therefore the bug has been closed. > > In your following testing 5: > > > I'm not sure this is a reasonable behavior, but > > let me get into the specific test, which looks like this: > > > > TEST(counted_by_seen_by_bdos) > > { > > struct annotated *p; > > int index = MAX_INDEX + unconst; > > > > p = alloc_annotated(index); > > > > REPORT_SIZE(p->array); > > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array)); > > /* Check array size alone. */ > > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX); > > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array)); > > /* Check check entire object size. */ > > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX); > > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array)); > > } > > > > Test 5 should pass as well, since, again, p can be examined. Passing p > > to __bdos implies it is allocated and the __counted_by annotation can be > > examined. > > Since the call to the routine “alloc_annotated" cannot be inlined, GCC does not see any allocation calls for the pointer p. Correct. > At the same time, due to the same reason as PR109986, GCC cannot determine the size of the object by just knowing the TYPE_SIZE > of the pointee of p. So the difference between test 3 and test 5 is that "p" is explicitly dereferenced to find "array", and therefore an assumption is established that "p" is allocated? > So, this is exactly the same issue as PR109557. It’s an existing behavior per the current __buitlin_object_size algorithm. > I am still not very sure whether the situation in PR109557 can be improved or not, but either way, it’s a separate issue. Okay, so the issue is one of object allocation visibility (or assumptions there in)? > Please see the new testing case I added for PR109557, you will see that the above case 5 is a similar case as the new testing case in PR109557. I will ponder this a bit more to see if I can come up with a useful test case to replace the part from "test 5" above. > > > > If "return p->array[index];" would be caught by the sanitizer, then > > it follows that __builtin_dynamic_object_size(p, 1) must also know the > > size. i.e. both must examine "p" and its trailing flexible array and > > __counted_by annotation. > > > >> > >> 2. The common issue for the failed testing 3, 4, 9, 10 is: > >> > >> for the following annotated structure: > >> > >> ==== > >> struct annotated { > >> unsigned long flags; > >> size_t foo; > >> int array[] __attribute__((counted_by (foo))); > >> }; > >> > >> > >> struct annotated *p; > >> int index = 16; > >> > >> p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size > >> > >> p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 > > > > Right, tests 9 and 10 are checking that the _smallest_ possible value of > > the array is used. (There are two sources of information: the allocation > > size and the size calculated by counted_by. The smaller of the two > > should be used when both are available.) > > The counted_by attribute is used to annotate a Flexible array member on how many elements it will have. > However, if this information can not accurately reflect the real number of elements for the array allocated, > What’s the purpose of such information? For example, imagine code that allocates space for 100 elements since the common case is that the number of elements will grow over time. Elements are added as it goes. For example: struct grows { int alloc_count; int valid_count; struct element item[] __counted_by(valid_count); } *p; void something(void) { p = malloc(sizeof(*p) + sizeof(*p->item) * 100); p->alloc_count = 100; p->valid_count = 0; /* this loop doesn't check that we don't go over 100. */ while (items_to_copy) { struct element *item_ptr = get_next_item(); /* __counted_by stays in sync: */ p->valid_count++; p->item[p->valid_count - 1] = *item_ptr; } } We would want to catch cases there p->item[] is accessed with an index that is >= p->valid_count, even though the allocation is (currently) larger. However, if we ever reached valid_count >= alloc_count, we need to trap too, since we can still "see" the true allocation size. Now, the __alloc_size hint is visible in very few places, so if there is a strong reason to do so, I can live with saying that __counted_by takes full precedence over __alloc_size. It seems it should be possible to compare when both are present, but I can live with __counted_by being the universal truth. :) > > >> or > >> p->foo was not set to any value as in test 3 and 4 > > > > For tests 3 and 4, yes, this was my mistake. I have fixed up the tests > > now. Bill noticed the same issue. Sorry for the think-o! > > > >> ==== > >> > >> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. > >> > >> I think that this should be considered as an user error, and the documentation of the attribute should include > >> this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: > >> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) > >> > >> We can add a new warning option -Wcounted-by to report such user error if needed. > >> > >> What’s your opinion on this? > > > > I think it is correct to allow mismatch between allocation and > > counted_by as long as only the least of the two is used. > > What’s your mean by “only the least of the two is used”? I was just restating the above: if size information is available via both __alloc_size and __counted_by, the smaller of the two should be used. > > > This may be > > desirable in a few situations. One example would be a large allocation > > that is slowly filled up by the program. > > So, for such situation, whenever the allocation is filled up, the field that hold the “counted_by” attribute should be increased at the same time, > Then, the “counted_by” value always sync with the real allocation. > > I.e. the counted_by member is > > slowly increased during runtime (but not beyond the true allocation size). > > Then there should be source code to increase the “counted_by” field whenever the allocated space increased too. > > > > Of course allocation size is only available in limited situations, so > > the loss of that info is fine: we have counted_by for everything else. > > The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement: > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 Right, I'm saying it would be nice if __alloc_size was checked as well, in the sense that if it is available, it knows without question what the size of the allocation is. If __alloc_size and __counted_by conflict, the smaller of the two should be the truth. But, as I said, if there is some need to explicitly ignore __alloc_size when __counted_by is present, I can live with it; we just need to document it. If the RFC and you agree that the __counted_by variable can only ever be (re)assigned after the flex array has been (re)allocated, then I guess we'll see how it goes. :) I think most places in the kernel using __counted_by will be fine, but I suspect we may have cases where we need to update it like in the loop I described above. If that's true, we can revisit the requirement then. :) -Kees
> On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> wrote: > > On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote: >> >>> On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> wrote: >>> >>> In the bug, the problem is that "p" isn't known to be allocated, if I'm >>> reading that correctly? >> >> I think that the major point in PR109557 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557): >> >> for the following pointer p.3_1, >> >> p.3_1 = p; >> _2 = __builtin_object_size (p.3_1, 0); >> >> Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p when the TYPE_SIZE can be determined at compile time? >> >> Answer: From just knowing the type of the pointee of p, the compiler cannot determine the size of the object. > > Why is that? "p" points to "struct P", which has a fixed size. There > must be an assumption somewhere that a pointer is allocated, otherwise > __bos would almost never work? My understanding from the comments in PR109557 was: In general the pointer could point to the first object of an array that has more elements, or to an object of a different type. Without seeing the real allocation to the pointer, the compiler cannot assume that the pointer point to an object that has the exact same type as its declaration. Sid and Martin, is the above understand correctly? Honestly, I am still not 100% clear on this yet. Jakub, Sid and Martin, could you please explain a little bit more here, especially for the following small example? [opc@qinzhao-ol8u3-x86 109557]$ cat t.c #include <stdlib.h> #include <assert.h> struct P { int k; int x[10]; } *p; void store(int a, int b) { p = (struct P *)malloc (sizeof (struct P)); p->k = a; p->x[b] = 0; assert (__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10])); assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P)); return; } int main() { store(7, 7); assert (__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10])); assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P)); free (p); } [opc@qinzhao-ol8u3-x86 109557]$ sh t /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10])' failed. t: line 19: 859944 Aborted (core dumped) ./a.out In the above, among the 4 assertions, only the last one failed. Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as the size of the object, but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the size of the object? > >> Therefore the bug has been closed. >> >> In your following testing 5: >> >>> I'm not sure this is a reasonable behavior, but >>> let me get into the specific test, which looks like this: >>> >>> TEST(counted_by_seen_by_bdos) >>> { >>> struct annotated *p; >>> int index = MAX_INDEX + unconst; >>> >>> p = alloc_annotated(index); >>> >>> REPORT_SIZE(p->array); >>> /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array)); >>> /* Check array size alone. */ >>> /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX); >>> /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array)); >>> /* Check check entire object size. */ >>> /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX); >>> /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array)); >>> } >>> >>> Test 5 should pass as well, since, again, p can be examined. Passing p >>> to __bdos implies it is allocated and the __counted_by annotation can be >>> examined. >> >> Since the call to the routine “alloc_annotated" cannot be inlined, GCC does not see any allocation calls for the pointer p. > > Correct. > >> At the same time, due to the same reason as PR109986, GCC cannot determine the size of the object by just knowing the TYPE_SIZE >> of the pointee of p. > > So the difference between test 3 and test 5 is that "p" is explicitly > dereferenced to find "array", and therefore an assumption is established > that "p" is allocated? Yes, this might be the assumption that GCC made -:) > >> So, this is exactly the same issue as PR109557. It’s an existing behavior per the current __buitlin_object_size algorithm. >> I am still not very sure whether the situation in PR109557 can be improved or not, but either way, it’s a separate issue. > > Okay, so the issue is one of object allocation visibility (or > assumptions there in)? Might be, let’s see what Sid or Martin will say on this. > >> Please see the new testing case I added for PR109557, you will see that the above case 5 is a similar case as the new testing case in PR109557. > > I will ponder this a bit more to see if I can come up with a useful > test case to replace the part from "test 5" above. > >>> >>> If "return p->array[index];" would be caught by the sanitizer, then >>> it follows that __builtin_dynamic_object_size(p, 1) must also know the >>> size. i.e. both must examine "p" and its trailing flexible array and >>> __counted_by annotation. >>> >>>> >>>> 2. The common issue for the failed testing 3, 4, 9, 10 is: >>>> >>>> for the following annotated structure: >>>> >>>> ==== >>>> struct annotated { >>>> unsigned long flags; >>>> size_t foo; >>>> int array[] __attribute__((counted_by (foo))); >>>> }; >>>> >>>> >>>> struct annotated *p; >>>> int index = 16; >>>> >>>> p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size >>>> >>>> p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 >>> >>> Right, tests 9 and 10 are checking that the _smallest_ possible value of >>> the array is used. (There are two sources of information: the allocation >>> size and the size calculated by counted_by. The smaller of the two >>> should be used when both are available.) >> >> The counted_by attribute is used to annotate a Flexible array member on how many elements it will have. >> However, if this information can not accurately reflect the real number of elements for the array allocated, >> What’s the purpose of such information? > > For example, imagine code that allocates space for 100 elements since > the common case is that the number of elements will grow over time. > Elements are added as it goes. For example: > > struct grows { > int alloc_count; > int valid_count; > struct element item[] __counted_by(valid_count); > } *p; > > void something(void) > { > p = malloc(sizeof(*p) + sizeof(*p->item) * 100); > p->alloc_count = 100; > p->valid_count = 0; > > /* this loop doesn't check that we don't go over 100. */ > while (items_to_copy) { > struct element *item_ptr = get_next_item(); > /* __counted_by stays in sync: */ > p->valid_count++; > p->item[p->valid_count - 1] = *item_ptr; > } > } > > We would want to catch cases there p->item[] is accessed with an index > that is >= p->valid_count, even though the allocation is (currently) > larger. > > However, if we ever reached valid_count >= alloc_count, we need to trap > too, since we can still "see" the true allocation size. > > Now, the __alloc_size hint is visible in very few places, so if there is > a strong reason to do so, I can live with saying that __counted_by takes > full precedence over __alloc_size. It seems it should be possible to > compare when both are present, but I can live with __counted_by being > the universal truth. :) > Thanks for the example and the explanation. Understood now. LLVM’s RFC requires the following relationship: (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) Buffer’s real allocated size >= counted_by value Should be true all the time. I think that this is a reasonable requirement. (Otherwise, if counted_by > buffer’s real allocated size, overflow might happen) Is the above enough to cover your use cases? If so, I will study a little bit more to see how to implement this. >> >>>> or >>>> p->foo was not set to any value as in test 3 and 4 >>> >>> For tests 3 and 4, yes, this was my mistake. I have fixed up the tests >>> now. Bill noticed the same issue. Sorry for the think-o! >>> >>>> ==== >>>> >>>> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. >>>> >>>> I think that this should be considered as an user error, and the documentation of the attribute should include >>>> this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: >>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) >>>> >>>> We can add a new warning option -Wcounted-by to report such user error if needed. >>>> >>>> What’s your opinion on this? >>> >>> I think it is correct to allow mismatch between allocation and >>> counted_by as long as only the least of the two is used. >> >> What’s your mean by “only the least of the two is used”? > > I was just restating the above: if size information is available via > both __alloc_size and __counted_by, the smaller of the two should be > used. If we are consistent with LLVM’s RFC, __alloc_size >= __counted_by all the time. Then we can always use the counted_by info for the array size. > >> >>> This may be >>> desirable in a few situations. One example would be a large allocation >>> that is slowly filled up by the program. >> >> So, for such situation, whenever the allocation is filled up, the field that hold the “counted_by” attribute should be increased at the same time, >> Then, the “counted_by” value always sync with the real allocation. >>> I.e. the counted_by member is >>> slowly increased during runtime (but not beyond the true allocation size). >> >> Then there should be source code to increase the “counted_by” field whenever the allocated space increased too. >>> >>> Of course allocation size is only available in limited situations, so >>> the loss of that info is fine: we have counted_by for everything else. >> >> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement: >> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 > > Right, I'm saying it would be nice if __alloc_size was checked as well, > in the sense that if it is available, it knows without question what the > size of the allocation is. If __alloc_size and __counted_by conflict, > the smaller of the two should be the truth. > > But, as I said, if there is some need to explicitly ignore __alloc_size > when __counted_by is present, I can live with it; we just need to > document it. > > If the RFC and you agree that the __counted_by variable can only ever be > (re)assigned after the flex array has been (re)allocated, then I guess > we'll see how it goes. :) I think most places in the kernel using > __counted_by will be fine, but I suspect we may have cases where we need > to update it like in the loop I described above. If that's true, we can > revisit the requirement then. :) I guess if we can always keep the relationship: __alloc_size >= __counted_by all the time. Should be fine. Please check whether this is enough for kernel, I will check whether this is doable For GCC. thanks. Qing > > -Kees > > -- > Kees Cook
Am Dienstag, dem 18.07.2023 um 15:37 +0000 schrieb Qing Zhao: > > > > On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> > > wrote: > > > > On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote: > > > > > > > On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> > > > > wrote: > > > > > > > > In the bug, the problem is that "p" isn't known to be > > > > allocated, if I'm > > > > reading that correctly? > > > > > > I think that the major point in PR109557 > > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557): > > > > > > for the following pointer p.3_1, > > > > > > p.3_1 = p; > > > _2 = __builtin_object_size (p.3_1, 0); > > > > > > Question: why the size of p.3_1 cannot use the TYPE_SIZE of the > > > pointee of p when the TYPE_SIZE can be determined at compile > > > time? > > > > > > Answer: From just knowing the type of the pointee of p, the > > > compiler cannot determine the size of the object. > > > > Why is that? "p" points to "struct P", which has a fixed size. > > There > > must be an assumption somewhere that a pointer is allocated, > > otherwise > > __bos would almost never work? > > My understanding from the comments in PR109557 was: > > In general the pointer could point to the first object of an array > that has more elements, or to an object of a different type. > Without seeing the real allocation to the pointer, the compiler > cannot assume that the pointer point to an object that has > the exact same type as its declaration. > > Sid and Martin, is the above understand correctly? Yes. In the example, it *could* work because the compiler could inline 'store' or otherwise use its knowledge about the function definition to conclude that 'p' points to an object of size 'sizeof (struct P)'. But this is fragile because it relies on optimization and will not work across TUs. > Honestly, I am still not 100% clear on this yet. > Jakub, Sid and Martin, could you please explain a little bit more > here, especially for the following small example? > > [opc@qinzhao-ol8u3-x86 109557]$ cat t.c > #include <stdlib.h> > #include <assert.h> > struct P { > int k; > int x[10]; > } *p; > > void store(int a, int b) > { > p = (struct P *)malloc (sizeof (struct P)); > p->k = a; > p->x[b] = 0; > assert (__builtin_dynamic_object_size (p->x, 1) == sizeof > (int[10])); > assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P)); > return; > } > > int main() > { > store(7, 7); > assert (__builtin_dynamic_object_size (p->x, 1) == sizeof > (int[10])); > assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P)); > free (p); > } > [opc@qinzhao-ol8u3-x86 109557]$ sh t > /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c > a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x, > 1) == sizeof (int[10])' failed. > t: line 19: 859944 Aborted (core dumped) ./a.out > > > In the above, among the 4 assertions, only the last one failed. I don't know why this is the case. > > Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as > the size of the object, I do not think it can do this in general. Is this how it is implemented? Thus would then seem incorrect to me. > but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the > size of the object? In general, the type of a pointer does not say anything about the object it points to, because you could cast the pointer to a different type, pass it around, and then cast it back before use. Only observed allocations or observed accesses provide information about the type / existence of an object at the corresponding address. The only other way in C which ensures that a pointer actually points to an object of the correct type is 'static': void foo(struct P *p[static 1]); Martin > > > > > > Therefore the bug has been closed. > > > > > > In your following testing 5: > > > > > > > I'm not sure this is a reasonable behavior, but > > > > let me get into the specific test, which looks like this: > > > > > > > > TEST(counted_by_seen_by_bdos) > > > > { > > > > struct annotated *p; > > > > int index = MAX_INDEX + unconst; > > > > > > > > p = alloc_annotated(index); > > > > > > > > REPORT_SIZE(p->array); > > > > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array)); > > > > /* Check array size alone. */ > > > > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), > > > > SIZE_MAX); > > > > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), > > > > p->foo * sizeof(*p->array)); > > > > /* Check check entire object size. */ > > > > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX); > > > > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), > > > > sizeof(*p) + p->foo * sizeof(*p->array)); > > > > } > > > > > > > > Test 5 should pass as well, since, again, p can be examined. > > > > Passing p > > > > to __bdos implies it is allocated and the __counted_by > > > > annotation can be > > > > examined. > > > > > > Since the call to the routine “alloc_annotated" cannot be > > > inlined, GCC does not see any allocation calls for the pointer p. > > > > Correct. > > > > > At the same time, due to the same reason as PR109986, GCC cannot > > > determine the size of the object by just knowing the TYPE_SIZE > > > of the pointee of p. > > > > So the difference between test 3 and test 5 is that "p" is > > explicitly > > dereferenced to find "array", and therefore an assumption is > > established > > that "p" is allocated? > > Yes, this might be the assumption that GCC made -:) > > > > > So, this is exactly the same issue as PR109557. It’s an existing > > > behavior per the current __buitlin_object_size algorithm. > > > I am still not very sure whether the situation in PR109557 can be > > > improved or not, but either way, it’s a separate issue. > > > > Okay, so the issue is one of object allocation visibility (or > > assumptions there in)? > > Might be, let’s see what Sid or Martin will say on this. > > > > > Please see the new testing case I added for PR109557, you will > > > see that the above case 5 is a similar case as the new testing > > > case in PR109557. > > > > I will ponder this a bit more to see if I can come up with a useful > > test case to replace the part from "test 5" above. > > > > > > > > > > If "return p->array[index];" would be caught by the sanitizer, > > > > then > > > > it follows that __builtin_dynamic_object_size(p, 1) must also > > > > know the > > > > size. i.e. both must examine "p" and its trailing flexible > > > > array and > > > > __counted_by annotation. > > > > > > > > > > > > > > 2. The common issue for the failed testing 3, 4, 9, 10 is: > > > > > > > > > > for the following annotated structure: > > > > > > > > > > ==== > > > > > struct annotated { > > > > > unsigned long flags; > > > > > size_t foo; > > > > > int array[] __attribute__((counted_by (foo))); > > > > > }; > > > > > > > > > > > > > > > struct annotated *p; > > > > > int index = 16; > > > > > > > > > > p = malloc(sizeof(*p) + index * sizeof(*p->array)); // > > > > > allocated real size > > > > > > > > > > p->foo = index + 2; // p->foo was set by a different value > > > > > than the real size of p->array as in test 9 and 10 > > > > > > > > Right, tests 9 and 10 are checking that the _smallest_ possible > > > > value of > > > > the array is used. (There are two sources of information: the > > > > allocation > > > > size and the size calculated by counted_by. The smaller of the > > > > two > > > > should be used when both are available.) > > > > > > The counted_by attribute is used to annotate a Flexible array > > > member on how many elements it will have. > > > However, if this information can not accurately reflect the real > > > number of elements for the array allocated, > > > What’s the purpose of such information? > > > > For example, imagine code that allocates space for 100 elements > > since > > the common case is that the number of elements will grow over time. > > Elements are added as it goes. For example: > > > > struct grows { > > int alloc_count; > > int valid_count; > > struct element item[] __counted_by(valid_count); > > } *p; > > > > void something(void) > > { > > p = malloc(sizeof(*p) + sizeof(*p->item) * 100); > > p->alloc_count = 100; > > p->valid_count = 0; > > > > /* this loop doesn't check that we don't go over 100. */ > > while (items_to_copy) { > > struct element *item_ptr = get_next_item(); > > /* __counted_by stays in sync: */ > > p->valid_count++; > > p->item[p->valid_count - 1] = *item_ptr; > > } > > } > > > > We would want to catch cases there p->item[] is accessed with an > > index > > that is >= p->valid_count, even though the allocation is > > (currently) > > larger. > > > > However, if we ever reached valid_count >= alloc_count, we need to > > trap > > too, since we can still "see" the true allocation size. > > > > Now, the __alloc_size hint is visible in very few places, so if > > there is > > a strong reason to do so, I can live with saying that __counted_by > > takes > > full precedence over __alloc_size. It seems it should be possible > > to > > compare when both are present, but I can live with __counted_by > > being > > the universal truth. :) > > > > Thanks for the example and the explanation. Understood now. > > LLVM’s RFC requires the following relationship: > (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbound > s-safety/70854#maintaining-correctness-of-bounds-annotations-18) > > Buffer’s real allocated size >= counted_by value > > Should be true all the time. > > I think that this is a reasonable requirement. > > (Otherwise, if counted_by > buffer’s real allocated size, overflow > might happen) > > Is the above enough to cover your use cases? > > If so, I will study a little bit more to see how to implement this. > > > > > > > > or > > > > > p->foo was not set to any value as in test 3 and 4 > > > > > > > > For tests 3 and 4, yes, this was my mistake. I have fixed up > > > > the tests > > > > now. Bill noticed the same issue. Sorry for the think-o! > > > > > > > > > ==== > > > > > > > > > > i.e, the value of p->foo is NOT synced with the number of > > > > > elements allocated for the array p->array. > > > > > > > > > > I think that this should be considered as an user error, and > > > > > the documentation of the attribute should include > > > > > this requirement. (In the LLVM’s RFC, such requirement was > > > > > included in the programing model: > > > > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 > > > > > ) > > > > > > > > > > We can add a new warning option -Wcounted-by to report such > > > > > user error if needed. > > > > > > > > > > What’s your opinion on this? > > > > > > > > I think it is correct to allow mismatch between allocation and > > > > counted_by as long as only the least of the two is used. > > > > > > What’s your mean by “only the least of the two is used”? > > > > I was just restating the above: if size information is available > > via > > both __alloc_size and __counted_by, the smaller of the two should > > be > > used. > > If we are consistent with LLVM’s RFC, __alloc_size >= __counted_by > all the time. > Then we can always use the counted_by info for the array size. > > > > > > > > > This may be > > > > desirable in a few situations. One example would be a large > > > > allocation > > > > that is slowly filled up by the program. > > > > > > So, for such situation, whenever the allocation is filled up, the > > > field that hold the “counted_by” attribute should be increased at > > > the same time, > > > Then, the “counted_by” value always sync with the real > > > allocation. > > > > I.e. the counted_by member is > > > > slowly increased during runtime (but not beyond the true > > > > allocation size). > > > > > > Then there should be source code to increase the “counted_by” > > > field whenever the allocated space increased too. > > > > > > > > Of course allocation size is only available in limited > > > > situations, so > > > > the loss of that info is fine: we have counted_by for > > > > everything else. > > > > > > The point is: allocation size should synced with the value of > > > “counted_by”. LLVM’s RFC also have the similar requirement: > > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 > > > > Right, I'm saying it would be nice if __alloc_size was checked as > > well, > > in the sense that if it is available, it knows without question > > what the > > size of the allocation is. If __alloc_size and __counted_by > > conflict, > > the smaller of the two should be the truth. > > > > > But, as I said, if there is some need to explicitly ignore > > __alloc_size > > when __counted_by is present, I can live with it; we just need to > > document it. > > > > If the RFC and you agree that the __counted_by variable can only > > ever be > > (re)assigned after the flex array has been (re)allocated, then I > > guess > > we'll see how it goes. :) I think most places in the kernel using > > __counted_by will be fine, but I suspect we may have cases where we > > need > > to update it like in the loop I described above. If that's true, we > > can > > revisit the requirement then. :) > > I guess if we can always keep the relationship: __alloc_size >= > __counted_by all the time. Should be fine. > > Please check whether this is enough for kernel, I will check whether > this is doable For GCC. > > thanks. > > > Qing > > > > -Kees > > > > -- > > Kees Cook >
> On Jul 18, 2023, at 12:03 PM, Martin Uecker <uecker@tugraz.at> wrote: > > Am Dienstag, dem 18.07.2023 um 15:37 +0000 schrieb Qing Zhao: >> >> >>> On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> >>> wrote: >>> >>> On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote: >>>> >>>>> On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> >>>>> wrote: >>>>> >>>>> In the bug, the problem is that "p" isn't known to be >>>>> allocated, if I'm >>>>> reading that correctly? >>>> >>>> I think that the major point in PR109557 >>>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557): >>>> >>>> for the following pointer p.3_1, >>>> >>>> p.3_1 = p; >>>> _2 = __builtin_object_size (p.3_1, 0); >>>> >>>> Question: why the size of p.3_1 cannot use the TYPE_SIZE of the >>>> pointee of p when the TYPE_SIZE can be determined at compile >>>> time? >>>> >>>> Answer: From just knowing the type of the pointee of p, the >>>> compiler cannot determine the size of the object. >>> >>> Why is that? "p" points to "struct P", which has a fixed size. >>> There >>> must be an assumption somewhere that a pointer is allocated, >>> otherwise >>> __bos would almost never work? >> >> My understanding from the comments in PR109557 was: >> >> In general the pointer could point to the first object of an array >> that has more elements, or to an object of a different type. >> Without seeing the real allocation to the pointer, the compiler >> cannot assume that the pointer point to an object that has >> the exact same type as its declaration. >> >> Sid and Martin, is the above understand correctly? > > Yes. > > In the example, it *could* work because the compiler > could inline 'store' or otherwise use its knowledge about > the function definition to conclude that 'p' points to > an object of size 'sizeof (struct P)'. But this is fragile > because it relies on optimization and will not work across > TUs. > >> Honestly, I am still not 100% clear on this yet. > >> Jakub, Sid and Martin, could you please explain a little bit more >> here, especially for the following small example? >> >> [opc@qinzhao-ol8u3-x86 109557]$ cat t.c >> #include <stdlib.h> >> #include <assert.h> >> struct P { >> int k; >> int x[10]; >> } *p; >> >> void store(int a, int b) >> { >> p = (struct P *)malloc (sizeof (struct P)); >> p->k = a; >> p->x[b] = 0; >> assert (__builtin_dynamic_object_size (p->x, 1) == sizeof >> (int[10])); >> assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P)); >> return; >> } >> >> int main() >> { >> store(7, 7); >> assert (__builtin_dynamic_object_size (p->x, 1) == sizeof >> (int[10])); >> assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P)); >> free (p); >> } >> [opc@qinzhao-ol8u3-x86 109557]$ sh t >> /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c >> a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x, >> 1) == sizeof (int[10])' failed. >> t: line 19: 859944 Aborted (core dumped) ./a.out >> >> >> In the above, among the 4 assertions, only the last one failed. > > I don't know why this is the case. > >> >> Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as >> the size of the object, > > I do not think it can do this in general. Is this how it > is implemented? No. -:) I guess that the implementation of this should base on your following case, “observed accesses”: Although I am not 100% sure on the definition of “observed accesses”. p->x is an access to the field of the object, so compiler can assume that the object exist and have the type associate with this access? On the other hand, “p” is just a plain pointer, no observed access. > Thus would then seem incorrect to me. > >> but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the >> size of the object? > > In general, the type of a pointer does not say anything about the > object it points to, because you could cast the pointer to a different > type, pass it around, and then cast it back before use. Okay, I see. > > Only observed allocations or observed accesses provide information > about the type / existence of an object at the corresponding address. What will be included in “observed accesses”? > > The only other way in C which ensures that a pointer actually points > to an object of the correct type is 'static': > > void foo(struct P *p[static 1]); Thanks for the info. Qing > > > > Martin > > >> >>> >>>> Therefore the bug has been closed. >>>> >>>> In your following testing 5: >>>> >>>>> I'm not sure this is a reasonable behavior, but >>>>> let me get into the specific test, which looks like this: >>>>> >>>>> TEST(counted_by_seen_by_bdos) >>>>> { >>>>> struct annotated *p; >>>>> int index = MAX_INDEX + unconst; >>>>> >>>>> p = alloc_annotated(index); >>>>> >>>>> REPORT_SIZE(p->array); >>>>> /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array)); >>>>> /* Check array size alone. */ >>>>> /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), >>>>> SIZE_MAX); >>>>> /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), >>>>> p->foo * sizeof(*p->array)); >>>>> /* Check check entire object size. */ >>>>> /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX); >>>>> /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), >>>>> sizeof(*p) + p->foo * sizeof(*p->array)); >>>>> } >>>>> >>>>> Test 5 should pass as well, since, again, p can be examined. >>>>> Passing p >>>>> to __bdos implies it is allocated and the __counted_by >>>>> annotation can be >>>>> examined. >>>> >>>> Since the call to the routine “alloc_annotated" cannot be >>>> inlined, GCC does not see any allocation calls for the pointer p. >>> >>> Correct. >>> >>>> At the same time, due to the same reason as PR109986, GCC cannot >>>> determine the size of the object by just knowing the TYPE_SIZE >>>> of the pointee of p. >>> >>> So the difference between test 3 and test 5 is that "p" is >>> explicitly >>> dereferenced to find "array", and therefore an assumption is >>> established >>> that "p" is allocated? >> >> Yes, this might be the assumption that GCC made -:) >>> >>>> So, this is exactly the same issue as PR109557. It’s an existing >>>> behavior per the current __buitlin_object_size algorithm. >>>> I am still not very sure whether the situation in PR109557 can be >>>> improved or not, but either way, it’s a separate issue. >>> >>> Okay, so the issue is one of object allocation visibility (or >>> assumptions there in)? >> >> Might be, let’s see what Sid or Martin will say on this. >>> >>>> Please see the new testing case I added for PR109557, you will >>>> see that the above case 5 is a similar case as the new testing >>>> case in PR109557. >>> >>> I will ponder this a bit more to see if I can come up with a useful >>> test case to replace the part from "test 5" above. >>> >>>>> >>>>> If "return p->array[index];" would be caught by the sanitizer, >>>>> then >>>>> it follows that __builtin_dynamic_object_size(p, 1) must also >>>>> know the >>>>> size. i.e. both must examine "p" and its trailing flexible >>>>> array and >>>>> __counted_by annotation. >>>>> >>>>>> >>>>>> 2. The common issue for the failed testing 3, 4, 9, 10 is: >>>>>> >>>>>> for the following annotated structure: >>>>>> >>>>>> ==== >>>>>> struct annotated { >>>>>> unsigned long flags; >>>>>> size_t foo; >>>>>> int array[] __attribute__((counted_by (foo))); >>>>>> }; >>>>>> >>>>>> >>>>>> struct annotated *p; >>>>>> int index = 16; >>>>>> >>>>>> p = malloc(sizeof(*p) + index * sizeof(*p->array)); // >>>>>> allocated real size >>>>>> >>>>>> p->foo = index + 2; // p->foo was set by a different value >>>>>> than the real size of p->array as in test 9 and 10 >>>>> >>>>> Right, tests 9 and 10 are checking that the _smallest_ possible >>>>> value of >>>>> the array is used. (There are two sources of information: the >>>>> allocation >>>>> size and the size calculated by counted_by. The smaller of the >>>>> two >>>>> should be used when both are available.) >>>> >>>> The counted_by attribute is used to annotate a Flexible array >>>> member on how many elements it will have. >>>> However, if this information can not accurately reflect the real >>>> number of elements for the array allocated, >>>> What’s the purpose of such information? >>> >>> For example, imagine code that allocates space for 100 elements >>> since >>> the common case is that the number of elements will grow over time. >>> Elements are added as it goes. For example: >>> >>> struct grows { >>> int alloc_count; >>> int valid_count; >>> struct element item[] __counted_by(valid_count); >>> } *p; >>> >>> void something(void) >>> { >>> p = malloc(sizeof(*p) + sizeof(*p->item) * 100); >>> p->alloc_count = 100; >>> p->valid_count = 0; >>> >>> /* this loop doesn't check that we don't go over 100. */ >>> while (items_to_copy) { >>> struct element *item_ptr = get_next_item(); >>> /* __counted_by stays in sync: */ >>> p->valid_count++; >>> p->item[p->valid_count - 1] = *item_ptr; >>> } >>> } >>> >>> We would want to catch cases there p->item[] is accessed with an >>> index >>> that is >= p->valid_count, even though the allocation is >>> (currently) >>> larger. >>> >>> However, if we ever reached valid_count >= alloc_count, we need to >>> trap >>> too, since we can still "see" the true allocation size. >>> >>> Now, the __alloc_size hint is visible in very few places, so if >>> there is >>> a strong reason to do so, I can live with saying that __counted_by >>> takes >>> full precedence over __alloc_size. It seems it should be possible >>> to >>> compare when both are present, but I can live with __counted_by >>> being >>> the universal truth. :) >>> >> >> Thanks for the example and the explanation. Understood now. >> >> LLVM’s RFC requires the following relationship: >> (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbound >> s-safety/70854#maintaining-correctness-of-bounds-annotations-18) >> >> Buffer’s real allocated size >= counted_by value >> >> Should be true all the time. >> >> I think that this is a reasonable requirement. >> >> (Otherwise, if counted_by > buffer’s real allocated size, overflow >> might happen) >> >> Is the above enough to cover your use cases? >> >> If so, I will study a little bit more to see how to implement this. >>>> >>>>>> or >>>>>> p->foo was not set to any value as in test 3 and 4 >>>>> >>>>> For tests 3 and 4, yes, this was my mistake. I have fixed up >>>>> the tests >>>>> now. Bill noticed the same issue. Sorry for the think-o! >>>>> >>>>>> ==== >>>>>> >>>>>> i.e, the value of p->foo is NOT synced with the number of >>>>>> elements allocated for the array p->array. >>>>>> >>>>>> I think that this should be considered as an user error, and >>>>>> the documentation of the attribute should include >>>>>> this requirement. (In the LLVM’s RFC, such requirement was >>>>>> included in the programing model: >>>>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 >>>>>> ) >>>>>> >>>>>> We can add a new warning option -Wcounted-by to report such >>>>>> user error if needed. >>>>>> >>>>>> What’s your opinion on this? >>>>> >>>>> I think it is correct to allow mismatch between allocation and >>>>> counted_by as long as only the least of the two is used. >>>> >>>> What’s your mean by “only the least of the two is used”? >>> >>> I was just restating the above: if size information is available >>> via >>> both __alloc_size and __counted_by, the smaller of the two should >>> be >>> used. >> >> If we are consistent with LLVM’s RFC, __alloc_size >= __counted_by >> all the time. >> Then we can always use the counted_by info for the array size. >>> >>>> >>>>> This may be >>>>> desirable in a few situations. One example would be a large >>>>> allocation >>>>> that is slowly filled up by the program. >>>> >>>> So, for such situation, whenever the allocation is filled up, the >>>> field that hold the “counted_by” attribute should be increased at >>>> the same time, >>>> Then, the “counted_by” value always sync with the real >>>> allocation. >>>>> I.e. the counted_by member is >>>>> slowly increased during runtime (but not beyond the true >>>>> allocation size). >>>> >>>> Then there should be source code to increase the “counted_by” >>>> field whenever the allocated space increased too. >>>>> >>>>> Of course allocation size is only available in limited >>>>> situations, so >>>>> the loss of that info is fine: we have counted_by for >>>>> everything else. >>>> >>>> The point is: allocation size should synced with the value of >>>> “counted_by”. LLVM’s RFC also have the similar requirement: >>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 >>> >>> Right, I'm saying it would be nice if __alloc_size was checked as >>> well, >>> in the sense that if it is available, it knows without question >>> what the >>> size of the allocation is. If __alloc_size and __counted_by >>> conflict, >>> the smaller of the two should be the truth. >> >>> >>> But, as I said, if there is some need to explicitly ignore >>> __alloc_size >>> when __counted_by is present, I can live with it; we just need to >>> document it. >>> >>> If the RFC and you agree that the __counted_by variable can only >>> ever be >>> (re)assigned after the flex array has been (re)allocated, then I >>> guess >>> we'll see how it goes. :) I think most places in the kernel using >>> __counted_by will be fine, but I suspect we may have cases where we >>> need >>> to update it like in the loop I described above. If that's true, we >>> can >>> revisit the requirement then. :) >> >> I guess if we can always keep the relationship: __alloc_size >= >> __counted_by all the time. Should be fine. >> >> Please check whether this is enough for kernel, I will check whether >> this is doable For GCC. >> >> thanks. >> >> >> Qing >>> >>> -Kees >>> >>> -- >>> Kees Cook >> > > -- > Univ.-Prof. Dr. rer. nat. Martin Uecker > Graz University of Technology > Institute of Biomedical Imaging
Am Dienstag, dem 18.07.2023 um 16:25 +0000 schrieb Qing Zhao: > > > > On Jul 18, 2023, at 12:03 PM, Martin Uecker <uecker@tugraz.at> > > wrote: > > > > Am Dienstag, dem 18.07.2023 um 15:37 +0000 schrieb Qing Zhao: > > > > > > > > > > On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> > > > > wrote: > > > > > > > > On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote: > > > > > > > > > > > On Jul 13, 2023, at 4:31 PM, Kees Cook > > > > > > <keescook@chromium.org> > > > > > > wrote: > > > > > > > > > > > > In the bug, the problem is that "p" isn't known to be > > > > > > allocated, if I'm > > > > > > reading that correctly? > > > > > > > > > > I think that the major point in PR109557 > > > > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557): > > > > > > > > > > for the following pointer p.3_1, > > > > > > > > > > p.3_1 = p; > > > > > _2 = __builtin_object_size (p.3_1, 0); > > > > > > > > > > Question: why the size of p.3_1 cannot use the TYPE_SIZE of > > > > > the > > > > > pointee of p when the TYPE_SIZE can be determined at compile > > > > > time? > > > > > > > > > > Answer: From just knowing the type of the pointee of p, the > > > > > compiler cannot determine the size of the object. > > > > > > > > Why is that? "p" points to "struct P", which has a fixed size. > > > > There > > > > must be an assumption somewhere that a pointer is allocated, > > > > otherwise > > > > __bos would almost never work? > > > > > > My understanding from the comments in PR109557 was: > > > > > > In general the pointer could point to the first object of an > > > array > > > that has more elements, or to an object of a different type. > > > Without seeing the real allocation to the pointer, the compiler > > > cannot assume that the pointer point to an object that has > > > the exact same type as its declaration. > > > > > > Sid and Martin, is the above understand correctly? > > > > Yes. > > > > In the example, it *could* work because the compiler > > could inline 'store' or otherwise use its knowledge about > > the function definition to conclude that 'p' points to > > an object of size 'sizeof (struct P)'. But this is fragile > > because it relies on optimization and will not work across > > TUs. > > > > > Honestly, I am still not 100% clear on this yet. > > > > > Jakub, Sid and Martin, could you please explain a little bit > > > more > > > here, especially for the following small example? > > > > > > [opc@qinzhao-ol8u3-x86 109557]$ cat t.c > > > #include <stdlib.h> > > > #include <assert.h> > > > struct P { > > > int k; > > > int x[10]; > > > } *p; > > > > > > void store(int a, int b) > > > { > > > p = (struct P *)malloc (sizeof (struct P)); > > > p->k = a; > > > p->x[b] = 0; > > > assert (__builtin_dynamic_object_size (p->x, 1) == sizeof > > > (int[10])); > > > assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct > > > P)); > > > return; > > > } > > > > > > int main() > > > { > > > store(7, 7); > > > assert (__builtin_dynamic_object_size (p->x, 1) == sizeof > > > (int[10])); > > > assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct > > > P)); > > > free (p); > > > } > > > [opc@qinzhao-ol8u3-x86 109557]$ sh t > > > /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c > > > a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p- > > > >x, > > > 1) == sizeof (int[10])' failed. > > > t: line 19: 859944 Aborted (core dumped) ./a.out > > > > > > > > > In the above, among the 4 assertions, only the last one failed. > > > > I don't know why this is the case. > > > > > > > > Why GCC can use the TYPE_SIZE of the pointee of the pointer “p- > > > >x” as > > > the size of the object, > > > > I do not think it can do this in general. Is this how it > > is implemented? > > No. -:) > > I guess that the implementation of this should base on your > following case, “observed accesses”: > Although I am not 100% sure on the definition of “observed accesses”. > > p->x is an access to the field of the object, so compiler can assume > that the object exist and have > the type associate with this access? > Only if the access happens at run-time, but the argument to BDOS is not evaluated, so there is no access. At least, this is my guess based on C's semantics. > On the other hand, “p” is just a plain pointer, no observed access. > > > Thus would then seem incorrect to me. > > > > > but cannot use the TYPE_SIZE of the pointee of the pointer “p” as > > > the > > > size of the object? > > > > In general, the type of a pointer does not say anything about the > > object it points to, because you could cast the pointer to a > > different > > type, pass it around, and then cast it back before use. > > Okay, I see. > > > > Only observed allocations or observed accesses provide information > > about the type / existence of an object at the corresponding > > address. > > What will be included in “observed accesses”? Any read or write access can be used to determine that there must be an object of the corresponding type at this location. (A write access could possibly create an object.) This still does not tell one anything about the size of the storage region, because it could be larger than one object of this type. Martin > > > > The only other way in C which ensures that a pointer actually > > points > > to an object of the correct type is 'static': > > > > void foo(struct P *p[static 1]); > > Thanks for the info. > > Qing > > > > > > > > Martin > > > > > > > > > > > > > > > > Therefore the bug has been closed. > > > > > > > > > > In your following testing 5: > > > > > > > > > > > I'm not sure this is a reasonable behavior, but > > > > > > let me get into the specific test, which looks like this: > > > > > > > > > > > > TEST(counted_by_seen_by_bdos) > > > > > > { > > > > > > struct annotated *p; > > > > > > int index = MAX_INDEX + unconst; > > > > > > > > > > > > p = alloc_annotated(index); > > > > > > > > > > > > REPORT_SIZE(p->array); > > > > > > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array)); > > > > > > /* Check array size alone. */ > > > > > > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), > > > > > > SIZE_MAX); > > > > > > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, > > > > > > 1), > > > > > > p->foo * sizeof(*p->array)); > > > > > > /* Check check entire object size. */ > > > > > > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX); > > > > > > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), > > > > > > sizeof(*p) + p->foo * sizeof(*p->array)); > > > > > > } > > > > > > > > > > > > Test 5 should pass as well, since, again, p can be > > > > > > examined. > > > > > > Passing p > > > > > > to __bdos implies it is allocated and the __counted_by > > > > > > annotation can be > > > > > > examined. > > > > > > > > > > Since the call to the routine “alloc_annotated" cannot be > > > > > inlined, GCC does not see any allocation calls for the > > > > > pointer p. > > > > > > > > Correct. > > > > > > > > > At the same time, due to the same reason as PR109986, GCC > > > > > cannot > > > > > determine the size of the object by just knowing the > > > > > TYPE_SIZE > > > > > of the pointee of p. > > > > > > > > So the difference between test 3 and test 5 is that "p" is > > > > explicitly > > > > dereferenced to find "array", and therefore an assumption is > > > > established > > > > that "p" is allocated? > > > > > > Yes, this might be the assumption that GCC made -:) > > > > > > > > > So, this is exactly the same issue as PR109557. It’s an > > > > > existing > > > > > behavior per the current __buitlin_object_size algorithm. > > > > > I am still not very sure whether the situation in PR109557 > > > > > can be > > > > > improved or not, but either way, it’s a separate issue. > > > > > > > > Okay, so the issue is one of object allocation visibility (or > > > > assumptions there in)? > > > > > > Might be, let’s see what Sid or Martin will say on this. > > > > > > > > > Please see the new testing case I added for PR109557, you > > > > > will > > > > > see that the above case 5 is a similar case as the new > > > > > testing > > > > > case in PR109557. > > > > > > > > I will ponder this a bit more to see if I can come up with a > > > > useful > > > > test case to replace the part from "test 5" above. > > > > > > > > > > > > > > > > If "return p->array[index];" would be caught by the > > > > > > sanitizer, > > > > > > then > > > > > > it follows that __builtin_dynamic_object_size(p, 1) must > > > > > > also > > > > > > know the > > > > > > size. i.e. both must examine "p" and its trailing flexible > > > > > > array and > > > > > > __counted_by annotation. > > > > > > > > > > > > > > > > > > > > 2. The common issue for the failed testing 3, 4, 9, 10 > > > > > > > is: > > > > > > > > > > > > > > for the following annotated structure: > > > > > > > > > > > > > > ==== > > > > > > > struct annotated { > > > > > > > unsigned long flags; > > > > > > > size_t foo; > > > > > > > int array[] __attribute__((counted_by (foo))); > > > > > > > }; > > > > > > > > > > > > > > > > > > > > > struct annotated *p; > > > > > > > int index = 16; > > > > > > > > > > > > > > p = malloc(sizeof(*p) + index * sizeof(*p->array)); // > > > > > > > allocated real size > > > > > > > > > > > > > > p->foo = index + 2; // p->foo was set by a different > > > > > > > value > > > > > > > than the real size of p->array as in test 9 and 10 > > > > > > > > > > > > Right, tests 9 and 10 are checking that the _smallest_ > > > > > > possible > > > > > > value of > > > > > > the array is used. (There are two sources of information: > > > > > > the > > > > > > allocation > > > > > > size and the size calculated by counted_by. The smaller of > > > > > > the > > > > > > two > > > > > > should be used when both are available.) > > > > > > > > > > The counted_by attribute is used to annotate a Flexible array > > > > > member on how many elements it will have. > > > > > However, if this information can not accurately reflect the > > > > > real > > > > > number of elements for the array allocated, > > > > > What’s the purpose of such information? > > > > > > > > For example, imagine code that allocates space for 100 elements > > > > since > > > > the common case is that the number of elements will grow over > > > > time. > > > > Elements are added as it goes. For example: > > > > > > > > struct grows { > > > > int alloc_count; > > > > int valid_count; > > > > struct element item[] __counted_by(valid_count); > > > > } *p; > > > > > > > > void something(void) > > > > { > > > > p = malloc(sizeof(*p) + sizeof(*p->item) * 100); > > > > p->alloc_count = 100; > > > > p->valid_count = 0; > > > > > > > > /* this loop doesn't check that we don't go over 100. > > > > */ > > > > while (items_to_copy) { > > > > struct element *item_ptr = get_next_item(); > > > > /* __counted_by stays in sync: */ > > > > p->valid_count++; > > > > p->item[p->valid_count - 1] = *item_ptr; > > > > } > > > > } > > > > > > > > We would want to catch cases there p->item[] is accessed with > > > > an > > > > index > > > > that is >= p->valid_count, even though the allocation is > > > > (currently) > > > > larger. > > > > > > > > However, if we ever reached valid_count >= alloc_count, we need > > > > to > > > > trap > > > > too, since we can still "see" the true allocation size. > > > > > > > > Now, the __alloc_size hint is visible in very few places, so if > > > > there is > > > > a strong reason to do so, I can live with saying that > > > > __counted_by > > > > takes > > > > full precedence over __alloc_size. It seems it should be > > > > possible > > > > to > > > > compare when both are present, but I can live with __counted_by > > > > being > > > > the universal truth. :) > > > > > > > > > > Thanks for the example and the explanation. Understood now. > > > > > > LLVM’s RFC requires the following relationship: > > > ( > > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbou > > > nd > > > s-safety/70854#maintaining-correctness-of-bounds-annotations-18) > > > > > > Buffer’s real allocated size >= counted_by value > > > > > > Should be true all the time. > > > > > > I think that this is a reasonable requirement. > > > > > > (Otherwise, if counted_by > buffer’s real allocated size, > > > overflow > > > might happen) > > > > > > Is the above enough to cover your use cases? > > > > > > If so, I will study a little bit more to see how to implement > > > this. > > > > > > > > > > > > or > > > > > > > p->foo was not set to any value as in test 3 and 4 > > > > > > > > > > > > For tests 3 and 4, yes, this was my mistake. I have fixed > > > > > > up > > > > > > the tests > > > > > > now. Bill noticed the same issue. Sorry for the think-o! > > > > > > > > > > > > > ==== > > > > > > > > > > > > > > i.e, the value of p->foo is NOT synced with the number of > > > > > > > elements allocated for the array p->array. > > > > > > > > > > > > > > I think that this should be considered as an user error, > > > > > > > and > > > > > > > the documentation of the attribute should include > > > > > > > this requirement. (In the LLVM’s RFC, such requirement > > > > > > > was > > > > > > > included in the programing model: > > > > > > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 > > > > > > > ) > > > > > > > > > > > > > > We can add a new warning option -Wcounted-by to report > > > > > > > such > > > > > > > user error if needed. > > > > > > > > > > > > > > What’s your opinion on this? > > > > > > > > > > > > I think it is correct to allow mismatch between allocation > > > > > > and > > > > > > counted_by as long as only the least of the two is used. > > > > > > > > > > What’s your mean by “only the least of the two is used”? > > > > > > > > I was just restating the above: if size information is > > > > available > > > > via > > > > both __alloc_size and __counted_by, the smaller of the two > > > > should > > > > be > > > > used. > > > > > > If we are consistent with LLVM’s RFC, __alloc_size >= > > > __counted_by > > > all the time. > > > Then we can always use the counted_by info for the array size. > > > > > > > > > > > > > > > This may be > > > > > > desirable in a few situations. One example would be a large > > > > > > allocation > > > > > > that is slowly filled up by the program. > > > > > > > > > > So, for such situation, whenever the allocation is filled up, > > > > > the > > > > > field that hold the “counted_by” attribute should be > > > > > increased at > > > > > the same time, > > > > > Then, the “counted_by” value always sync with the real > > > > > allocation. > > > > > > I.e. the counted_by member is > > > > > > slowly increased during runtime (but not beyond the true > > > > > > allocation size). > > > > > > > > > > Then there should be source code to increase the “counted_by” > > > > > field whenever the allocated space increased too. > > > > > > > > > > > > Of course allocation size is only available in limited > > > > > > situations, so > > > > > > the loss of that info is fine: we have counted_by for > > > > > > everything else. > > > > > > > > > > The point is: allocation size should synced with the value of > > > > > “counted_by”. LLVM’s RFC also have the similar requirement: > > > > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 > > > > > > > > Right, I'm saying it would be nice if __alloc_size was checked > > > > as > > > > well, > > > > in the sense that if it is available, it knows without question > > > > what the > > > > size of the allocation is. If __alloc_size and __counted_by > > > > conflict, > > > > the smaller of the two should be the truth. > > > > > > > > > > > But, as I said, if there is some need to explicitly ignore > > > > __alloc_size > > > > when __counted_by is present, I can live with it; we just need > > > > to > > > > document it. > > > > > > > > If the RFC and you agree that the __counted_by variable can > > > > only > > > > ever be > > > > (re)assigned after the flex array has been (re)allocated, then > > > > I > > > > guess > > > > we'll see how it goes. :) I think most places in the kernel > > > > using > > > > __counted_by will be fine, but I suspect we may have cases > > > > where we > > > > need > > > > to update it like in the loop I described above. If that's > > > > true, we > > > > can > > > > revisit the requirement then. :) > > > > > > I guess if we can always keep the relationship: __alloc_size > > > >= > > > __counted_by all the time. Should be fine. > > > > > > Please check whether this is enough for kernel, I will check > > > whether > > > this is doable For GCC. > > > > > > thanks. > > > > > > > > > Qing > > > > > > > > -Kees > > > > > > > > -- > > > > Kees Cook > > > > > > > -- > > Univ.-Prof. Dr. rer. nat. Martin Uecker > > Graz University of Technology > > Institute of Biomedical Imaging >
> On Jul 18, 2023, at 11:37 AM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > > >> On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> wrote: >> >> On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote: >>> >>>> On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> wrote: >>>> >>>> In the bug, the problem is that "p" isn't known to be allocated, if I'm >>>> reading that correctly? >>> >>> I think that the major point in PR109557 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557): >>> >>> for the following pointer p.3_1, >>> >>> p.3_1 = p; >>> _2 = __builtin_object_size (p.3_1, 0); >>> >>> Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p when the TYPE_SIZE can be determined at compile time? >>> >>> Answer: From just knowing the type of the pointee of p, the compiler cannot determine the size of the object. >> >> Why is that? "p" points to "struct P", which has a fixed size. There >> must be an assumption somewhere that a pointer is allocated, otherwise >> __bos would almost never work? > > My understanding from the comments in PR109557 was: > > In general the pointer could point to the first object of an array that has more elements, or to an object of a different type. > Without seeing the real allocation to the pointer, the compiler cannot assume that the pointer point to an object that has > the exact same type as its declaration. > > Sid and Martin, is the above understand correctly? > > Honestly, I am still not 100% clear on this yet. > > Jakub, Sid and Martin, could you please explain a little bit more here, especially for the following small example? > > [opc@qinzhao-ol8u3-x86 109557]$ cat t.c > #include <stdlib.h> > #include <assert.h> > struct P { > int k; > int x[10]; > } *p; > > void store(int a, int b) > { > p = (struct P *)malloc (sizeof (struct P)); > p->k = a; > p->x[b] = 0; > assert (__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10])); > assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P)); > return; > } > > int main() > { > store(7, 7); > assert (__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10])); > assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P)); > free (p); > } > [opc@qinzhao-ol8u3-x86 109557]$ sh t > /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c > a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10])' failed. > t: line 19: 859944 Aborted (core dumped) ./a.out > > A correction to the above compilation option: /home/opc/Install/latest-d/bin/gcc -O -fstrict-flex-arrays=3 t.c a.out: t.c:22: main: Assertion `__builtin_dynamic_object_size (p, 1) == sizeof (struct P)' failed. t: line 19: 918833 Aborted (core dumped) ./a.out (All others keep the same). Sorry for the mistake. Qing > In the above, among the 4 assertions, only the last one failed. > > Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as the size of the object, > but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the size of the object? > > >> >>> Therefore the bug has been closed. >>> >>> In your following testing 5: >>> >>>> I'm not sure this is a reasonable behavior, but >>>> let me get into the specific test, which looks like this: >>>> >>>> TEST(counted_by_seen_by_bdos) >>>> { >>>> struct annotated *p; >>>> int index = MAX_INDEX + unconst; >>>> >>>> p = alloc_annotated(index); >>>> >>>> REPORT_SIZE(p->array); >>>> /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array)); >>>> /* Check array size alone. */ >>>> /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX); >>>> /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array)); >>>> /* Check check entire object size. */ >>>> /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX); >>>> /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array)); >>>> } >>>> >>>> Test 5 should pass as well, since, again, p can be examined. Passing p >>>> to __bdos implies it is allocated and the __counted_by annotation can be >>>> examined. >>> >>> Since the call to the routine “alloc_annotated" cannot be inlined, GCC does not see any allocation calls for the pointer p. >> >> Correct. >> >>> At the same time, due to the same reason as PR109986, GCC cannot determine the size of the object by just knowing the TYPE_SIZE >>> of the pointee of p. >> >> So the difference between test 3 and test 5 is that "p" is explicitly >> dereferenced to find "array", and therefore an assumption is established >> that "p" is allocated? > > Yes, this might be the assumption that GCC made -:) >> >>> So, this is exactly the same issue as PR109557. It’s an existing behavior per the current __buitlin_object_size algorithm. >>> I am still not very sure whether the situation in PR109557 can be improved or not, but either way, it’s a separate issue. >> >> Okay, so the issue is one of object allocation visibility (or >> assumptions there in)? > > Might be, let’s see what Sid or Martin will say on this. >> >>> Please see the new testing case I added for PR109557, you will see that the above case 5 is a similar case as the new testing case in PR109557. >> >> I will ponder this a bit more to see if I can come up with a useful >> test case to replace the part from "test 5" above. >> >>>> >>>> If "return p->array[index];" would be caught by the sanitizer, then >>>> it follows that __builtin_dynamic_object_size(p, 1) must also know the >>>> size. i.e. both must examine "p" and its trailing flexible array and >>>> __counted_by annotation. >>>> >>>>> >>>>> 2. The common issue for the failed testing 3, 4, 9, 10 is: >>>>> >>>>> for the following annotated structure: >>>>> >>>>> ==== >>>>> struct annotated { >>>>> unsigned long flags; >>>>> size_t foo; >>>>> int array[] __attribute__((counted_by (foo))); >>>>> }; >>>>> >>>>> >>>>> struct annotated *p; >>>>> int index = 16; >>>>> >>>>> p = malloc(sizeof(*p) + index * sizeof(*p->array)); // allocated real size >>>>> >>>>> p->foo = index + 2; // p->foo was set by a different value than the real size of p->array as in test 9 and 10 >>>> >>>> Right, tests 9 and 10 are checking that the _smallest_ possible value of >>>> the array is used. (There are two sources of information: the allocation >>>> size and the size calculated by counted_by. The smaller of the two >>>> should be used when both are available.) >>> >>> The counted_by attribute is used to annotate a Flexible array member on how many elements it will have. >>> However, if this information can not accurately reflect the real number of elements for the array allocated, >>> What’s the purpose of such information? >> >> For example, imagine code that allocates space for 100 elements since >> the common case is that the number of elements will grow over time. >> Elements are added as it goes. For example: >> >> struct grows { >> int alloc_count; >> int valid_count; >> struct element item[] __counted_by(valid_count); >> } *p; >> >> void something(void) >> { >> p = malloc(sizeof(*p) + sizeof(*p->item) * 100); >> p->alloc_count = 100; >> p->valid_count = 0; >> >> /* this loop doesn't check that we don't go over 100. */ >> while (items_to_copy) { >> struct element *item_ptr = get_next_item(); >> /* __counted_by stays in sync: */ >> p->valid_count++; >> p->item[p->valid_count - 1] = *item_ptr; >> } >> } >> >> We would want to catch cases there p->item[] is accessed with an index >> that is >= p->valid_count, even though the allocation is (currently) >> larger. >> >> However, if we ever reached valid_count >= alloc_count, we need to trap >> too, since we can still "see" the true allocation size. >> >> Now, the __alloc_size hint is visible in very few places, so if there is >> a strong reason to do so, I can live with saying that __counted_by takes >> full precedence over __alloc_size. It seems it should be possible to >> compare when both are present, but I can live with __counted_by being >> the universal truth. :) >> > > Thanks for the example and the explanation. Understood now. > > LLVM’s RFC requires the following relationship: (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) > > Buffer’s real allocated size >= counted_by value > > Should be true all the time. > > I think that this is a reasonable requirement. > > (Otherwise, if counted_by > buffer’s real allocated size, overflow might happen) > > Is the above enough to cover your use cases? > > If so, I will study a little bit more to see how to implement this. >>> >>>>> or >>>>> p->foo was not set to any value as in test 3 and 4 >>>> >>>> For tests 3 and 4, yes, this was my mistake. I have fixed up the tests >>>> now. Bill noticed the same issue. Sorry for the think-o! >>>> >>>>> ==== >>>>> >>>>> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array. >>>>> >>>>> I think that this should be considered as an user error, and the documentation of the attribute should include >>>>> this requirement. (In the LLVM’s RFC, such requirement was included in the programing model: >>>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18) >>>>> >>>>> We can add a new warning option -Wcounted-by to report such user error if needed. >>>>> >>>>> What’s your opinion on this? >>>> >>>> I think it is correct to allow mismatch between allocation and >>>> counted_by as long as only the least of the two is used. >>> >>> What’s your mean by “only the least of the two is used”? >> >> I was just restating the above: if size information is available via >> both __alloc_size and __counted_by, the smaller of the two should be >> used. > > If we are consistent with LLVM’s RFC, __alloc_size >= __counted_by all the time. > Then we can always use the counted_by info for the array size. >> >>> >>>> This may be >>>> desirable in a few situations. One example would be a large allocation >>>> that is slowly filled up by the program. >>> >>> So, for such situation, whenever the allocation is filled up, the field that hold the “counted_by” attribute should be increased at the same time, >>> Then, the “counted_by” value always sync with the real allocation. >>>> I.e. the counted_by member is >>>> slowly increased during runtime (but not beyond the true allocation size). >>> >>> Then there should be source code to increase the “counted_by” field whenever the allocated space increased too. >>>> >>>> Of course allocation size is only available in limited situations, so >>>> the loss of that info is fine: we have counted_by for everything else. >>> >>> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement: >>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 >> >> Right, I'm saying it would be nice if __alloc_size was checked as well, >> in the sense that if it is available, it knows without question what the >> size of the allocation is. If __alloc_size and __counted_by conflict, >> the smaller of the two should be the truth. > >> >> But, as I said, if there is some need to explicitly ignore __alloc_size >> when __counted_by is present, I can live with it; we just need to >> document it. >> >> If the RFC and you agree that the __counted_by variable can only ever be >> (re)assigned after the flex array has been (re)allocated, then I guess >> we'll see how it goes. :) I think most places in the kernel using >> __counted_by will be fine, but I suspect we may have cases where we need >> to update it like in the loop I described above. If that's true, we can >> revisit the requirement then. :) > > I guess if we can always keep the relationship: __alloc_size >= __counted_by all the time. Should be fine. > > Please check whether this is enough for kernel, I will check whether this is doable For GCC. > > thanks. > > > Qing >> >> -Kees >> >> -- >> Kees Cook
Am Montag, dem 17.07.2023 um 16:40 -0700 schrieb Kees Cook: > On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote: > > > > > On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> > > > wrote: > > > > > > In the bug, the problem is that "p" isn't known to be allocated, > > > if I'm > > > reading that correctly? > > > > I think that the major point in PR109557 > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557): > > > > for the following pointer p.3_1, > > > > p.3_1 = p; > > _2 = __builtin_object_size (p.3_1, 0); > > > > Question: why the size of p.3_1 cannot use the TYPE_SIZE of the > > pointee of p when the TYPE_SIZE can be determined at compile time? > > > > Answer: From just knowing the type of the pointee of p, the > > compiler cannot determine the size of the object. > > Why is that? "p" points to "struct P", which has a fixed size. There > must be an assumption somewhere that a pointer is allocated, > otherwise > __bos would almost never work? It often does not work, because it relies on the optimizer propagating the information instead of the type system. This is why it would be better to have proper *bounds* checks, and not just object size checks. It is not quite clear to me how BOS and bounds checking is supposed to work together, but FAMs should be bounds checked. ... > > > > > > This may be > > > desirable in a few situations. One example would be a large > > > allocation > > > that is slowly filled up by the program. > > > > So, for such situation, whenever the allocation is filled up, the > > field that hold the “counted_by” attribute should be increased at > > the same time, > > Then, the “counted_by” value always sync with the real allocation. > > > I.e. the counted_by member is > > > slowly increased during runtime (but not beyond the true > > > allocation size). > > > > Then there should be source code to increase the “counted_by” field > > whenever the allocated space increased too. > > > > > > Of course allocation size is only available in limited > > > situations, so > > > the loss of that info is fine: we have counted_by for everything > > > else. > > > > The point is: allocation size should synced with the value of > > “counted_by”. LLVM’s RFC also have the similar requirement: > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 > > Right, I'm saying it would be nice if __alloc_size was checked as > well, > in the sense that if it is available, it knows without question what > the > size of the allocation is. If __alloc_size and __counted_by conflict, > the smaller of the two should be the truth. > > But, as I said, if there is some need to explicitly ignore > __alloc_size > when __counted_by is present, I can live with it; we just need to > document it. > > If the RFC and you agree that the __counted_by variable can only ever > be > (re)assigned after the flex array has been (re)allocated, then I > guess > we'll see how it goes. :) I think most places in the kernel using > __counted_by will be fine, but I suspect we may have cases where we > need > to update it like in the loop I described above. If that's true, we > can > revisit the requirement then. :) It should be the other way round: You should first set 'count' and then reassign the pointer, because you can then often check the pointer assignment (reading 'count'). The other way round this works only sometimes, i.e. if both assignments are close together and the optimizer can see this. Martin
More thoughts on the following example Kees provided: > On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> wrote: >> >> The counted_by attribute is used to annotate a Flexible array member on how many elements it will have. >> However, if this information can not accurately reflect the real number of elements for the array allocated, >> What’s the purpose of such information? > > For example, imagine code that allocates space for 100 elements since > the common case is that the number of elements will grow over time. > Elements are added as it goes. For example: > > struct grows { > int alloc_count; > int valid_count; > struct element item[] __counted_by(valid_count); > } *p; > > void something(void) > { > p = malloc(sizeof(*p) + sizeof(*p->item) * 100); > p->alloc_count = 100; > p->valid_count = 0; > > /* this loop doesn't check that we don't go over 100. */ > while (items_to_copy) { > struct element *item_ptr = get_next_item(); > /* __counted_by stays in sync: */ > p->valid_count++; > p->item[p->valid_count - 1] = *item_ptr; > } > } > > We would want to catch cases there p->item[] is accessed with an index > that is >= p->valid_count, even though the allocation is (currently) > larger. > > However, if we ever reached valid_count >= alloc_count, we need to trap > too, since we can still "see" the true allocation size. > > Now, the __alloc_size hint is visible in very few places, so if there is > a strong reason to do so, I can live with saying that __counted_by takes > full precedence over __alloc_size. It seems it should be possible to > compare when both are present, but I can live with __counted_by being > the universal truth. :) In the above use case (not sure how popular such user case is?), the major questions are: for one object with flexible array member, 1. Shall we allow the situation when the allocated size for the object and the number of element for the contained FAM are mismatched? If the answer to 1 is YES (to support such user cases), then 2. If there is a mismatch between these two, should the number of element impact the allocated size for the object? (__builtin_object_size()) From the doc of object size checking: (https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html) ===== Built-in Function: size_t __builtin_object_size (const void * ptr, int type) is a built-in construct that returns a constant number of bytes from ptr to the end of the object ptr pointer points to (if known at compile time). To determine the sizes of dynamically allocated objects the function relies on the allocation functions called to obtain the storage to be declared with the alloc_size attribute (see Common Function Attributes). __builtin_object_size never evaluates its arguments for side effects. If there are any side effects in them, it returns (size_t) -1 for type 0 or 1 and (size_t) 0 for type 2 or 3. If there are multiple objects ptr can point to and all of them are known at compile time, the returned number is the maximum of remaining byte counts in those objects if type & 2 is 0 and minimum if nonzero. If it is not possible to determine which objects ptr points to at compile time, __builtin_object_size should return (size_t) -1 for type 0 or 1 and (size_t) 0 for type 2 or 3. ===== Based on the current documentation for __bos, I think that the answer should be NO, i.e, we should not use the counted_by info to change the REAL allocated size for the object. 3. Then, As pointed out also by Martin, only the bounds check (including -Warray-bounds or -fsanitizer=bounds) should be impacted by the counted_by information, since these checks are based on the TYPE system, and “counted_by” info should be treated as a complement to the TYPE system. Let me know your opinions. Qing
>> >> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement: >> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 > > Right, I'm saying it would be nice if __alloc_size was checked as well, > in the sense that if it is available, it knows without question what the > size of the allocation is. If __alloc_size and __counted_by conflict, > the smaller of the two should be the truth. I don’t think that “if __alloc_size and __counted_by conflict, the smaller of the two should be the truth” will work correctly. When __alloc_size is larger than the value of __counted_by, it’s okay. But when the value of __counted_by is larger than the __alloc_size, the array bound check or object size sanitizer might not work correctly. Please see the following example: struct grows { int alloc_count; int valid_count; int item[] __counted_by(valid_count); } *p; void __attribute__((__noinline__)) something (int n) { p = malloc(sizeof(*p) + sizeof(*p->item) * 100); p->alloc_count = 100; p->valid_count = 102; p->item[n] = 10; // both _alloc_size and the value of __counted_by are available in this routine, the smaller one is , 100; } void __attribute__((__noinline__)) something_2 (int n) { p->item[n] = 10; // only the value of __counted_by is available in this routine, which is 102; } Int main { Something (101); Something_2 (101); } For the above example, the out-of-bound array access in routine “something” should be able to be caught by the compiler. However, the out-of-bound array access in the routine “something_2” will NOT be able to be caught by the compiler. Since in the routine “something_2” , the compiler don’t know the alloc_size, the only available info is the counted_by value through the attribute. But this value is bigger than the REAL size of the array. Therefore the compiler cannot detect the out-of-bound array access in the routine something_2 Based on the above observation, I think we should add the following requirement: The value of “counted_by” should be equal or SMALLER than the real alloc_size for the flexible array member. This is the same requirement as the LLVM RFC. https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 "the compiler inserts additional checks to ensure the new buf has at least as many elements as the new count indicates.” LLVM has additional requirement in addition to this, we might need to consider those requirement too. Qing > But, as I said, if there is some need to explicitly ignore __alloc_size > when __counted_by is present, I can live with it; we just need to > document it. > > If the RFC and you agree that the __counted_by variable can only ever be > (re)assigned after the flex array has been (re)allocated, then I guess > we'll see how it goes. :) I think most places in the kernel using > __counted_by will be fine, but I suspect we may have cases where we need > to update it like in the loop I described above. If that's true, we can > revisit the requirement then. :) > > -Kees > > -- > Kees Cook
Hi, After some detailed study and consideration on how to use the new attribute “counted_by” in __builtin_dynamic_object_size, I came up with the following example with detailed explanation on the expected behavior from GCC on using this new attribute. Please take a look on this example and the explanation embedded, and let me know if you have further Comments or suggestions. Thanks a lot. Qing =========================================== #include <stdint.h> #include <malloc.h> #include <string.h> #include <stdio.h> struct annotated { size_t foo; int array[] __attribute__((counted_by (foo))); }; #define expect(p, _v) do { \ size_t v = _v; \ if (p == v) \ __builtin_printf ("ok: %s == %zd\n", #p, p); \ else \ { \ __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \ } \ } while (0); #define noinline __attribute__((__noinline__)) #define SIZE_BUMP 2 /* In general, Due to type casting, the type for the pointee of a pointer does not say anything about the object it points to, So, __builtin_object_size can not directly use the type of the pointee to decide the size of the object the pointer points to. there are only two reliable ways: A. observed allocations (call to the allocation functions in the routine) B. observed accesses (read or write access to the location of the pointer points to) that provide information about the type/existence of an object at the corresponding address. for A, we use the "alloc_size" attribute for the corresponding allocation functions to determine the object size; For B, we use the SIZE info of the TYPE attached to the corresponding access. (We treat counted_by attribute as a complement to the SIZE info of the TYPE for FMA) The only other way in C which ensures that a pointer actually points to an object of the correct type is 'static': void foo(struct P *p[static 1]); See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html for more details. */ /* in the following function, malloc allocated more space than the value of counted_by attribute. Then what's the correct behavior we expect the __builtin_dynamic_object_size should have for each of the cases? */ static struct annotated * noinline alloc_buf (int index) { struct annotated *p; p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int)); p->foo = index; /*when checking the observed access p->array, we have info on both observered allocation and observed access, A. from observed allocation: (index + SIZE_BUMP) * sizeof (int) B. from observed access: p->foo * sizeof (int) in the above, p->foo = index. */ /* for MAXIMUM sub-object size: chose the smaller of A and B. * Please see https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625891.html * for details on why. */ expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int)); expect(__builtin_dynamic_object_size(p->array, 0), sizeof (*p) + (p->foo) * sizeof(int)); /* for MINIMUM sub-object size: chose the smaller of A and B too. */ expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int)); expect(__builtin_dynamic_object_size(p->array, 2), sizeof (*p) + p->foo * sizeof(int)); /*when checking the pointer p, we only have info on the observed allocation. So, the object size info can only been obtained from the call to malloc. for both MAXIMUM and MINIMUM: A = (index + SIZE_BUMP) * sizeof (int) */ expect(__builtin_dynamic_object_size(p, 1), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int)); expect(__builtin_dynamic_object_size(p, 0), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int)); expect(__builtin_dynamic_object_size(p, 3), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int)); expect(__builtin_dynamic_object_size(p, 2), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int)); return p; } int main () { struct annotated *p; p = alloc_buf (10); /*when checking the observed access p->array, we only have info on the observed access, i.e, the TYPE_SIZE info from the access. We don't have info on the whole object. */ /*For MAXIMUM size, We know the SIZE info of the TYPE from the access to the sub-object p->array. but don't know the whole object the pointer p points to. */ expect(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(int)); expect(__builtin_dynamic_object_size(p->array, 0), -1); /*for MINIMUM size, We know the TYPE_SIZE from the access to the sub-oject p->array. but don't know the whole object the pointer p points to. */ expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int)); expect(__builtin_dynamic_object_size(p->array, 2), 0); /*when checking the pointer p, we have no observed allocation nor observed access. therefore, we cannot determine the size info here. */ expect(__builtin_dynamic_object_size(p, 1), -1); expect(__builtin_dynamic_object_size(p, 0), -1); expect(__builtin_dynamic_object_size(p, 3), 0); expect(__builtin_dynamic_object_size(p, 2), 0); return 0; } > On Jul 19, 2023, at 2:52 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > >>> >>> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement: >>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 >> >> Right, I'm saying it would be nice if __alloc_size was checked as well, >> in the sense that if it is available, it knows without question what the >> size of the allocation is. If __alloc_size and __counted_by conflict, >> the smaller of the two should be the truth. > > I don’t think that “if __alloc_size and __counted_by conflict, the smaller of the two should be the truth” will work correctly. > > When __alloc_size is larger than the value of __counted_by, it’s okay. > But when the value of __counted_by is larger than the __alloc_size, the array bound check or object size sanitizer might not work correctly. > > > Please see the following example: > > struct grows { > int alloc_count; > int valid_count; > int item[] __counted_by(valid_count); > } *p; > > void __attribute__((__noinline__)) something (int n) > { > p = malloc(sizeof(*p) + sizeof(*p->item) * 100); > p->alloc_count = 100; > p->valid_count = 102; > p->item[n] = 10; // both _alloc_size and the value of __counted_by are available in this routine, the smaller one is , 100; > > } > > void __attribute__((__noinline__)) something_2 (int n) > { > p->item[n] = 10; // only the value of __counted_by is available in this routine, which is 102; > } > > Int main > { > Something (101); > Something_2 (101); > } > > > For the above example, the out-of-bound array access in routine “something” should be able to be caught by the compiler. > However, the out-of-bound array access in the routine “something_2” will NOT be able to be caught by the compiler. > > Since in the routine “something_2” , the compiler don’t know the alloc_size, the only available info is the counted_by value > through the attribute. But this value is bigger than the REAL size of the array. Therefore the compiler cannot detect the > out-of-bound array access in the routine something_2 > > > Based on the above observation, I think we should add the following requirement: > > The value of “counted_by” should be equal or SMALLER than the real alloc_size for the flexible array member. > > This is the same requirement as the LLVM RFC. > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18 > > "the compiler inserts additional checks to ensure the new buf has at least as many elements as the new count indicates.” > LLVM has additional requirement in addition to this, we might need to consider those requirement too. > > Qing > >> But, as I said, if there is some need to explicitly ignore __alloc_size >> when __counted_by is present, I can live with it; we just need to >> document it. >> >> If the RFC and you agree that the __counted_by variable can only ever be >> (re)assigned after the flex array has been (re)allocated, then I guess >> we'll see how it goes. :) I think most places in the kernel using >> __counted_by will be fine, but I suspect we may have cases where we need >> to update it like in the loop I described above. If that's true, we can >> revisit the requirement then. :) >> >> -Kees >> >> -- >> Kees Cook >
On Mon, Jul 31, 2023 at 08:14:42PM +0000, Qing Zhao wrote: > /* In general, Due to type casting, the type for the pointee of a pointer > does not say anything about the object it points to, > So, __builtin_object_size can not directly use the type of the pointee > to decide the size of the object the pointer points to. > > there are only two reliable ways: > A. observed allocations (call to the allocation functions in the routine) > B. observed accesses (read or write access to the location of the > pointer points to) > > that provide information about the type/existence of an object at > the corresponding address. > > for A, we use the "alloc_size" attribute for the corresponding allocation > functions to determine the object size; > > For B, we use the SIZE info of the TYPE attached to the corresponding access. > (We treat counted_by attribute as a complement to the SIZE info of the TYPE > for FMA) > > The only other way in C which ensures that a pointer actually points > to an object of the correct type is 'static': > > void foo(struct P *p[static 1]); > > See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html > for more details. */ This is a great explanation; thank you! In the future I might want to have a new builtin that will allow a program to query a pointer when neither A nor B have happened. But for the first version of the __counted_by infrastructure, the above limitations seen fine. For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) + sizeof(*p->flex_array_member) * p->counted_by_member). Though since there might be multiple flex array members, maybe this can't work. :) -Kees
Am Dienstag, dem 01.08.2023 um 15:45 -0700 schrieb Kees Cook: > On Mon, Jul 31, 2023 at 08:14:42PM +0000, Qing Zhao wrote: > > /* In general, Due to type casting, the type for the pointee of a pointer > > does not say anything about the object it points to, > > So, __builtin_object_size can not directly use the type of the pointee > > to decide the size of the object the pointer points to. > > > > there are only two reliable ways: > > A. observed allocations (call to the allocation functions in the routine) > > B. observed accesses (read or write access to the location of the > > pointer points to) > > > > that provide information about the type/existence of an object at > > the corresponding address. > > > > for A, we use the "alloc_size" attribute for the corresponding allocation > > functions to determine the object size; > > > > For B, we use the SIZE info of the TYPE attached to the corresponding access. > > (We treat counted_by attribute as a complement to the SIZE info of the TYPE > > for FMA) > > > > The only other way in C which ensures that a pointer actually points > > to an object of the correct type is 'static': > > > > void foo(struct P *p[static 1]); > > > > See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html > > for more details. */ > > This is a great explanation; thank you! > > In the future I might want to have a new builtin that will allow > a program to query a pointer when neither A nor B have happened. But > for the first version of the __counted_by infrastructure, the above > limitations seen fine. > > For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) + > sizeof(*p->flex_array_member) * p->counted_by_member). Though since > there might be multiple flex array members, maybe this can't work. :) We had a _Lengthof proposal for arrays (instead of sizeof/sizeof) and thought about how to extend this to structs with FAM. The problem is that it can not rely on an attribute. With GCC's VLA in structs you could do struct foo { int n; char buf[n_init]; } *p = malloc(sizeof *p); p->n_init = n; and get sizeof and bounds checking with UBSan https://godbolt.org/z/d4nneqs3P (but also compiler bugs and other issues) Also see my experimental container library, where you can do: vec_decl(int); vec(int)* v = vec_alloc(int); vec_push(&v, 1); vec_push(&v, 3); auto p = &vec_array(v); (*p)[1] = 1; // bounds check Here, "vec_array()" would give you a regular C array view of the vector contant and with correct dynamic size, so you can apply "sizeof" and have bounds checking with UBSan and it just works (with clang / GCC without changes). https://github.com/uecker/noplate Martin
> On Aug 2, 2023, at 2:25 AM, Martin Uecker <uecker@tugraz.at> wrote: > > Am Dienstag, dem 01.08.2023 um 15:45 -0700 schrieb Kees Cook: >> On Mon, Jul 31, 2023 at 08:14:42PM +0000, Qing Zhao wrote: >>> /* In general, Due to type casting, the type for the pointee of a pointer >>> does not say anything about the object it points to, >>> So, __builtin_object_size can not directly use the type of the pointee >>> to decide the size of the object the pointer points to. >>> >>> there are only two reliable ways: >>> A. observed allocations (call to the allocation functions in the routine) >>> B. observed accesses (read or write access to the location of the >>> pointer points to) >>> >>> that provide information about the type/existence of an object at >>> the corresponding address. >>> >>> for A, we use the "alloc_size" attribute for the corresponding allocation >>> functions to determine the object size; >>> >>> For B, we use the SIZE info of the TYPE attached to the corresponding access. >>> (We treat counted_by attribute as a complement to the SIZE info of the TYPE >>> for FMA) >>> >>> The only other way in C which ensures that a pointer actually points >>> to an object of the correct type is 'static': >>> >>> void foo(struct P *p[static 1]); >>> >>> See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html >>> for more details. */ >> >> This is a great explanation; thank you! >> >> In the future I might want to have a new builtin that will allow >> a program to query a pointer when neither A nor B have happened. But >> for the first version of the __counted_by infrastructure, the above >> limitations seen fine. >> >> For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) + >> sizeof(*p->flex_array_member) * p->counted_by_member). Though since >> there might be multiple flex array members, maybe this can't work. :) > > We had a _Lengthof proposal for arrays (instead of sizeof/sizeof) > and thought about how to extend this to structs with FAM. The > problem is that it can not rely on an attribute. > > With GCC's VLA in structs you could do > > struct foo { int n; char buf[n_init]; } *p = malloc(sizeof *p); > p->n_init = n; > > and get sizeof and bounds checking with UBSan > https://godbolt.org/z/d4nneqs3P > > (but also compiler bugs and other issues) This works great! If later the bounds information for the FAM can be integrated into TYPE system just like the VLA, That will be ideal, then we don’t need to hack the compiler here and there to handle the FMA specially. > > > Also see my experimental container library, where you can do: > > vec_decl(int); > vec(int)* v = vec_alloc(int); > > vec_push(&v, 1); > vec_push(&v, 3); > > auto p = &vec_array(v); > (*p)[1] = 1; // bounds check > > Here, "vec_array()" would give you a regular C array view > of the vector contant and with correct dynamic size, so you > can apply "sizeof" and have bounds checking with UBSan and > it just works (with clang / GCC without changes). > https://github.com/uecker/noplate Yes, the idea of providing a type-safe library for C also looks promising. thanks Qing > > > > Martin
> On Aug 1, 2023, at 6:45 PM, Kees Cook <keescook@chromium.org> wrote: > > On Mon, Jul 31, 2023 at 08:14:42PM +0000, Qing Zhao wrote: >> /* In general, Due to type casting, the type for the pointee of a pointer >> does not say anything about the object it points to, >> So, __builtin_object_size can not directly use the type of the pointee >> to decide the size of the object the pointer points to. >> >> there are only two reliable ways: >> A. observed allocations (call to the allocation functions in the routine) >> B. observed accesses (read or write access to the location of the >> pointer points to) >> >> that provide information about the type/existence of an object at >> the corresponding address. >> >> for A, we use the "alloc_size" attribute for the corresponding allocation >> functions to determine the object size; >> >> For B, we use the SIZE info of the TYPE attached to the corresponding access. >> (We treat counted_by attribute as a complement to the SIZE info of the TYPE >> for FMA) >> >> The only other way in C which ensures that a pointer actually points >> to an object of the correct type is 'static': >> >> void foo(struct P *p[static 1]); >> >> See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html >> for more details. */ > > This is a great explanation; thank you! > > In the future I might want to have a new builtin that will allow > a program to query a pointer when neither A nor B have happened. But > for the first version of the __counted_by infrastructure, the above > limitations seen fine. > > For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) + > sizeof(*p->flex_array_member) * p->counted_by_member). Though since > there might be multiple flex array members, maybe this can't work. :) What do you mean by “there might be multiple flex array members”? Do you mean the following example: struct annotated { size_t foo; int array[] __attribute__((counted_by (foo))); }; static struct annotated * noinline alloc_buf (int index) { struct annotated *p; p = malloc(sizeof (*p) + (index) * sizeof (int)); p->foo = index; return p; } Int main () { struct annotated *p1, *p2; p1 = alloc_buf (10); p2 = alloc_buf (20); __builtin_counted_size(p1)??? __builtin_counted_size(p2)??? } Or something else? Qing > > -Kees > > -- > Kees Cook