target/i386: fix pcmpxstrx substring search

Message ID alpine.DEB.2.20.1708102139310.8101@digraph.polyomino.org.uk
State New
Headers show

Commit Message

Joseph Myers Aug. 10, 2017, 9:40 p.m.
One of the cases of the SSE4.2 pcmpestri / pcmpestrm / pcmpistri /
pcmpistrm instructions does a substring search.  The implementation of
this case in the pcmpxstrx helper is incorrect.  The operation in this
case is a search for a string (argument d to the helper) in another
string (argument s to the helper); if a copy of d at a particular
position would run off the end of s, the resulting output bit should
be 0 whether or not the strings match in the region where they
overlap, but the QEMU implementation was wrongly comparing only up to
the point where s ends and counting it as a match if an initial
segment of d matched a terminal segment of s.  Here, "run off the end
of s" means that some byte of d would overlap some byte outside of s;
thus, if d has zero length, it is considered to match everywhere,
including after the end of s.  This patch fixes the implementation to
correspond with the proper instruction semantics.  This fixes four gcc
test failures in my GCC 6-based testing.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

Comments

Paolo Bonzini Aug. 11, 2017, 2:06 p.m. | #1
On 10/08/2017 23:40, Joseph Myers wrote:
> One of the cases of the SSE4.2 pcmpestri / pcmpestrm / pcmpistri /
> pcmpistrm instructions does a substring search.  The implementation of
> this case in the pcmpxstrx helper is incorrect.  The operation in this
> case is a search for a string (argument d to the helper) in another
> string (argument s to the helper); if a copy of d at a particular
> position would run off the end of s, the resulting output bit should
> be 0 whether or not the strings match in the region where they
> overlap, but the QEMU implementation was wrongly comparing only up to
> the point where s ends and counting it as a match if an initial
> segment of d matched a terminal segment of s.  Here, "run off the end
> of s" means that some byte of d would overlap some byte outside of s;
> thus, if d has zero length, it is considered to match everywhere,
> including after the end of s.  This patch fixes the implementation to
> correspond with the proper instruction semantics.  This fixes four gcc
> test failures in my GCC 6-based testing.
> 
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
> 
> ---
> 
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index 16509d0..9f1b351 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -2037,10 +2040,14 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
>          }
>          break;
>      case 3:
> -        for (j = valids; j >= 0; j--) {
> +        if (validd == -1) {
> +            res = (2 << upper) - 1;
> +            break;
> +        }
> +        for (j = valids - validd; j >= 0; j--) {
>              res <<= 1;
>              v = 1;
> -            for (i = MIN(valids - j, validd); i >= 0; i--) {
> +            for (i = validd; i >= 0; i--) {
>                  v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
>              }
>              res |= v;

Ah, so j=valids-validd was correct before commit 54c54f8b56
("target-i386: fix pcmpxstrx equal-ordered (strstr) mode", 2015-11-04),
it was the upper bound of i that was incorrect in ignoring validd.

Queued, thanks.

Paolo

Patch

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 16509d0..9f1b351 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2037,10 +2040,14 @@  static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
         }
         break;
     case 3:
-        for (j = valids; j >= 0; j--) {
+        if (validd == -1) {
+            res = (2 << upper) - 1;
+            break;
+        }
+        for (j = valids - validd; j >= 0; j--) {
             res <<= 1;
             v = 1;
-            for (i = MIN(valids - j, validd); i >= 0; i--) {
+            for (i = validd; i >= 0; i--) {
                 v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
             }
             res |= v;