Message ID | ri64jl1cyky.fsf@suse.cz |
---|---|
State | New |
Headers | show |
Series | Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR 110677) | expand |
Hi Martin, Am 14.08.23 um 19:39 schrieb Martin Jambor: > Hello, > > this patch addresses an issue uncovered by the undefined behavior > sanitizer. In function resolve_structure_cons in resolve.cc there is > a test starting with: > > if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl > && comp->ts.u.cl->length > && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT > > and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of > integer value 1818451807 which is outside of the value range expr_t > enum. If I understand the code correctly it the entire load was > unwanted because comp->ts.type in those cases is BT_CLASS and not > BT_CHARACTER. This patch simply adds a check to make sure it is only > accessed in those cases. > > I have verified that the UPBSAN failure goes away with this patch, it > also passes bootstrap and testing on x86_64-linux. OK for master? this looks good to me. Looking at that code block, there is a potential other UB a few lines below, where (hopefully integer) string lengths are to be passed to mpz_cmp. If the string length is ill-defined (e.g. non-integer), value.integer is undefined. We've seen this elsewhere, where on BE platforms that undefined value was interpreted as some large integer and giving failures on those platforms. One could similarly add the following checks here (on top of your patch): diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index e7c8d919bef..43095406c16 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -1401,6 +1401,8 @@ resolve_structure_cons (gfc_expr *expr, int init) && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT + && cons->expr->ts.u.cl->length->ts.type == BT_INTEGER + && comp->ts.u.cl->length->ts.type == BT_INTEGER && mpz_cmp (cons->expr->ts.u.cl->length->value.integer, comp->ts.u.cl->length->value.integer) != 0) { It is up to you whether you want to add this. Thanks for the patch! Harald > Thanks, > > Martin > > > > gcc/fortran/ChangeLog: > > 2023-08-14 Martin Jambor <mjambor@suse.cz> > > PR fortran/110677 > * resolve.cc (resolve_structure_cons): Check comp->ts is character > type before accessing stuff through comp->ts.u.cl. > --- > gcc/fortran/resolve.cc | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc > index e7c8d919bef..5b4dfc5fcd2 100644 > --- a/gcc/fortran/resolve.cc > +++ b/gcc/fortran/resolve.cc > @@ -1396,8 +1396,9 @@ resolve_structure_cons (gfc_expr *expr, int init) > the one of the structure, ensure this if the lengths are known at > compile time and when we are dealing with PARAMETER or structure > constructors. */ > - if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl > - && comp->ts.u.cl->length > + if (cons->expr->ts.type == BT_CHARACTER > + && comp->ts.type == BT_CHARACTER > + && comp->ts.u.cl && comp->ts.u.cl->length > && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT > && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length > && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT
Hello, On Mon, Aug 14 2023, Harald Anlauf via Gcc-patches wrote: > Hi Martin, > > Am 14.08.23 um 19:39 schrieb Martin Jambor: >> Hello, >> >> this patch addresses an issue uncovered by the undefined behavior >> sanitizer. In function resolve_structure_cons in resolve.cc there is >> a test starting with: >> >> if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl >> && comp->ts.u.cl->length >> && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT >> >> and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of >> integer value 1818451807 which is outside of the value range expr_t >> enum. If I understand the code correctly it the entire load was >> unwanted because comp->ts.type in those cases is BT_CLASS and not >> BT_CHARACTER. This patch simply adds a check to make sure it is only >> accessed in those cases. >> >> I have verified that the UPBSAN failure goes away with this patch, it >> also passes bootstrap and testing on x86_64-linux. OK for master? > > this looks good to me. > > Looking at that code block, there is a potential other UB a few lines > below, where (hopefully integer) string lengths are to be passed to > mpz_cmp. > > If the string length is ill-defined (e.g. non-integer), value.integer > is undefined. We've seen this elsewhere, where on BE platforms that > undefined value was interpreted as some large integer and giving > failures on those platforms. One could similarly add the following > checks here (on top of your patch): Thank you very much for the approval and the improvement. I have committed the following (after another round of testing). Martin Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR 110677) This patch addresses an issue uncovered by the undefined behavior sanitizer. In function resolve_structure_cons in resolve.cc there is a test starting with: if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl && comp->ts.u.cl->length && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of integer value 1818451807 which is outside of the value range expr_t enum. If I understand the code correctly it the entire load was unwanted because comp->ts.type in those cases is BT_CLASS and not BT_CHARACTER. This patch simply adds a check to make sure it is only accessed in those cases. During review, Harald Anlauf noticed that length types also need to be checked and so I added also checks that he suggested to the condition. Co-authored-by: Harald Anlauf <anlauf@gmx.de> gcc/fortran/ChangeLog: 2023-08-14 Martin Jambor <mjambor@suse.cz> PR fortran/110677 * resolve.cc (resolve_structure_cons): Check comp->ts is character type before accessing stuff through comp->ts.u.cl. --- gcc/fortran/resolve.cc | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index e7c8d919bef..f51674f7faa 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -1396,11 +1396,14 @@ resolve_structure_cons (gfc_expr *expr, int init) the one of the structure, ensure this if the lengths are known at compile time and when we are dealing with PARAMETER or structure constructors. */ - if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl - && comp->ts.u.cl->length + if (cons->expr->ts.type == BT_CHARACTER + && comp->ts.type == BT_CHARACTER + && comp->ts.u.cl && comp->ts.u.cl->length && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT + && cons->expr->ts.u.cl->length->ts.type == BT_INTEGER + && comp->ts.u.cl->length->ts.type == BT_INTEGER && mpz_cmp (cons->expr->ts.u.cl->length->value.integer, comp->ts.u.cl->length->value.integer) != 0) {
diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index e7c8d919bef..5b4dfc5fcd2 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -1396,8 +1396,9 @@ resolve_structure_cons (gfc_expr *expr, int init) the one of the structure, ensure this if the lengths are known at compile time and when we are dealing with PARAMETER or structure constructors. */ - if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl - && comp->ts.u.cl->length + if (cons->expr->ts.type == BT_CHARACTER + && comp->ts.type == BT_CHARACTER + && comp->ts.u.cl && comp->ts.u.cl->length && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT