Message ID | 20200122170231.GA10992@dyn-9-152-222-55.boeblingen.de.ibm.com |
---|---|
State | New |
Headers | show |
Series | Fix comparison of trees via tree_cmp | expand |
On Wed, 22 Jan 2020, Stefan Schulze Frielinghaus wrote: > Hi David, > > In function `tree_cmp` an invariant [1] is assumed which does not necessarily > exist. In case both input trees are finally compared via `strcmp`, then > > tree_cmp (t1, t2) == -tree_cmp (t2, t1) > > does not hold in general, since function `strcmp (x, y)` guarantees only that a > negative integer is returned in case x<y or a positive integer in case x>y. > Currently this breaks s390x where, for example, for certain inputs x,y > `tree_cmp (x, y) == 1` and `tree_cmp (y, x) == -2` hold. The attached patch > normalizes the output from `strcmp` to -1, 0, or 1 while using an auxiliary > function `sign` (stolen from the Hacker's Delight book ;-)). > > Bootstrapped and tested on s390x. Any thoughts? It's more appropriate to fix the assert rather than the comparator, like gcc_assert (sign (reversed) == -sign (result)); But qsort_chk already checks that, and more, so why is the assert there? Shouldn't it be simply removed? Alexander
On Wed, Jan 22, 2020 at 08:08:32PM +0300, Alexander Monakov wrote: > > > On Wed, 22 Jan 2020, Stefan Schulze Frielinghaus wrote: > > > Hi David, > > > > In function `tree_cmp` an invariant [1] is assumed which does not necessarily > > exist. In case both input trees are finally compared via `strcmp`, then > > > > tree_cmp (t1, t2) == -tree_cmp (t2, t1) > > > > does not hold in general, since function `strcmp (x, y)` guarantees only that a > > negative integer is returned in case x<y or a positive integer in case x>y. > > Currently this breaks s390x where, for example, for certain inputs x,y > > `tree_cmp (x, y) == 1` and `tree_cmp (y, x) == -2` hold. The attached patch > > normalizes the output from `strcmp` to -1, 0, or 1 while using an auxiliary > > function `sign` (stolen from the Hacker's Delight book ;-)). > > > > Bootstrapped and tested on s390x. Any thoughts? > > It's more appropriate to fix the assert rather than the comparator, like > > gcc_assert (sign (reversed) == -sign (result)); > > But qsort_chk already checks that, and more, so why is the assert there? > Shouldn't it be simply removed? Yeah. Note there is also return DECL_UID (t1) - DECL_UID (t2); that also doesn't guarantee -1/0/1 return values, so the patch as posted isn't sufficient. Jakub
On Wed, 2020-01-22 at 19:02 +0100, Jakub Jelinek wrote: > On Wed, Jan 22, 2020 at 08:08:32PM +0300, Alexander Monakov wrote: > > > > On Wed, 22 Jan 2020, Stefan Schulze Frielinghaus wrote: > > > > > Hi David, > > > > > > In function `tree_cmp` an invariant [1] is assumed which does not > > > necessarily > > > exist. In case both input trees are finally compared via > > > `strcmp`, then > > > > > > tree_cmp (t1, t2) == -tree_cmp (t2, t1) > > > > > > does not hold in general, since function `strcmp (x, y)` > > > guarantees only that a > > > negative integer is returned in case x<y or a positive integer in > > > case x>y. > > > Currently this breaks s390x where, for example, for certain > > > inputs x,y > > > `tree_cmp (x, y) == 1` and `tree_cmp (y, x) == -2` hold. The > > > attached patch > > > normalizes the output from `strcmp` to -1, 0, or 1 while using an > > > auxiliary > > > function `sign` (stolen from the Hacker's Delight book ;-)). > > > > > > Bootstrapped and tested on s390x. Any thoughts? > > > > It's more appropriate to fix the assert rather than the comparator, > > like > > > > gcc_assert (sign (reversed) == -sign (result)); > > > > But qsort_chk already checks that, and more, so why is the assert > > there? > > Shouldn't it be simply removed? > > Yeah. Note there is also return DECL_UID (t1) - DECL_UID (t2); that > also > doesn't guarantee -1/0/1 return values, so the patch as posted isn't > sufficient. Sorry about the breakage; to be specific, I've reproduced this on s390x-ibm-linux-gnu, where it fails the selftests in stage 1, breaking the build (unless configured with --disable-analyzer). Removing the assertions fixes it for me (a stage1 build, at least, and it then passes the testsuite). I've made this blunder in four places in the analyzer: call-string.cc:162: call_string::cmp engine.cc:1820: worklist::key_t::cmp program-point.cc:461: function_point::cmp_within_supernode region-model.cc:1878: tree_cmp IIRC, I added these checks as I was finding it difficult to debug things when qsort_chk failed - the first three of the comparators in the list above build on each other: worklist::key_t::cmp uses both function_point::cmp_within_supernode and call_string::cmp (and various other things), so if the worklist qsort_chk fails, it sometimes required a fair bit of digging into which of the nested comparators had failed (also, all the void * don't help; I'd love to have a template that does a typesafe qsort without needing casts in the comparator). Some approaches for fixing this: (a) I can simply remove these checks on all the comparators (b) I could fix the checks so that they merely look at the signedness of the results (c) Remove the checks from tree_cmp, but retain them in the other places, fixing them to just look at signedness, for ease of debugging. I think I prefer (c) as it makes debugging failures easier, though I appreciate it's rather redundant. Thoughts? Dave
On Thu, 23 Jan 2020, David Malcolm wrote: > Removing the assertions fixes it for me (a stage1 build, at least, and > it then passes the testsuite). > > I've made this blunder in four places in the analyzer: > > call-string.cc:162: call_string::cmp > program-point.cc:461: function_point::cmp_within_supernode These two don't raise concerns on brief inspection. > engine.cc:1820: worklist::key_t::cmp This one tries to use signed difference of hashes as return value, this is not going to work in general while passing your assert almost always (except when difference of hashes is exactly -2^31). The problem is, difference of hashes leads to non-transitive comparison and qsort_chk can catch it on a suitable testcase. > region-model.cc:1878: tree_cmp This is the one under discussion here. > IIRC, I added these checks as I was finding it difficult to debug > things when qsort_chk failed - the first three of the comparators in > the list above build on each other: worklist::key_t::cmp uses both > function_point::cmp_within_supernode and call_string::cmp (and various > other things), so if the worklist qsort_chk fails, it sometimes > required a fair bit of digging into which of the nested comparators had > failed (also, all the void * don't help; I'd love to have a template > that does a typesafe qsort without needing casts in the comparator). FWIW qsort_chk_error is a separate function to make that easier: if you place a breakpoint on qsort_chk_error you just need to step a few times to get into a comparator that is under suspicion, and don't need to type out casts in gdb. Alexander
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index 9474c6737d5..bdd9a97e5f8 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -62,6 +62,20 @@ along with GCC; see the file COPYING3. If not see #if ENABLE_ANALYZER +/* Normalize X to either -1, 0, or 1. + + Basically performs the same as + if (x < 0) return -1; + else if (x > 0) return 1; + else return 0; + but in a concise way trying to prevent branches. */ + +static int +sign (int x) +{ + return (x > 0) - (x < 0); +} + /* Dump T to PP in language-independent form, for debugging/logging/dumping purposes. */ @@ -1768,8 +1782,8 @@ tree_cmp (const_tree t1, const_tree t2) if (DECL_P (t1)) { if (DECL_NAME (t1) && DECL_NAME (t2)) - return strcmp (IDENTIFIER_POINTER (DECL_NAME (t1)), - IDENTIFIER_POINTER (DECL_NAME (t2))); + return sign (strcmp (IDENTIFIER_POINTER (DECL_NAME (t1)), + IDENTIFIER_POINTER (DECL_NAME (t2)))); else { if (DECL_NAME (t1)) @@ -1819,8 +1833,8 @@ tree_cmp (const_tree t1, const_tree t2) } case STRING_CST: - return strcmp (TREE_STRING_POINTER (t1), - TREE_STRING_POINTER (t2)); + return sign (strcmp (TREE_STRING_POINTER (t1), + TREE_STRING_POINTER (t2))); default: gcc_unreachable ();