diff mbox

fw_cfg: fix endianness in fw_cfg_data_mem_read() / _write()

Message ID 1420024880-15416-1-git-send-email-lersek@redhat.com
State New
Headers show

Commit Message

Laszlo Ersek Dec. 31, 2014, 11:21 a.m. UTC
(1) Let's contemplate what device endianness means, for a memory mapped
device register (independently of QEMU -- that is, on physical hardware).

It determines the byte order that the device will put on the data bus when
the device is producing a *numerical value* for the CPU. This byte order
may differ from the CPU's own byte order, therefore when software wants to
consume the *numerical value*, it may have to swap the byte order first.

For example, suppose we have a device that exposes in a 2-byte register
the number of sheep we have to count before falling asleep. If the value
is decimal 37 (0x0025), then a big endian register will produce [0x00,
0x25], while a little endian register will produce [0x25, 0x00].

If the device register is big endian, but the CPU is little endian, the
numerical value will read as 0x2500 (decimal 9472), which software has to
byte swap before use.

However... if we ask the device about who stole our herd of sheep, and it
answers "XY", then the byte representation coming out of the register must
be [0x58, 0x59], regardless of the device register's endianness for
numeric values. And, software needs to copy these bytes into a string
field regardless of the CPU's own endianness.

(2) QEMU's device register accessor functions work with *numerical values*
exclusively, not strings:

The emulated register's read accessor function returns the numerical value
(eg. 37 decimal, 0x0025) as a *host-encoded* uint64_t. QEMU translates
this value for the guest to the endianness of the emulated device register
(which is recorded in MemoryRegionOps.endianness). Then guest code must
translate the numerical value from device register to guest CPU
endianness, before including it in any computation (see (1)).

(3) However, the data register of the fw_cfg device shall transfer strings
*only* -- that is, opaque blobs. Interpretation of any given blob is
subject to further agreement -- it can be an integer in an independently
determined byte order, or a genuine string, or an array of structs of
integers (in some byte order) and fixed size strings, and so on.

Because register emulation in QEMU is integer-preserving, not
string-preserving (see (2)), we have to jump through a few hoops.

(3a) We defined the memory mapped fw_cfg data register as
DEVICE_BIG_ENDIAN.

The particular choice is not really relevant -- we picked BE only for
consistency with the control register, which *does* transfer integers --
but our choice affects how we must host-encode values from fw_cfg strings.

(3b) Since we want the fw_cfg string "XY" to appear as the [0x58, 0x59]
array on the data register, *and* we picked DEVICE_BIG_ENDIAN, we must
compose the host (== C language) value 0x5859 in the read accessor
function.

(3c) When the guest performs the read access, the immediate uint16_t value
will be 0x5958 (in LE guests) and 0x5859 (in BE guests). However, the
uint16_t value does not matter. The only thing that matters is the byte
pattern [0x58, 0x59], which the guest code must copy into the target
string *without* any byte-swapping.

(4) Now I get to explain where I screwed up. :(

When we decided for big endian *integer* representation in the MMIO data
register -- see (3a) --, I mindlessly added an indiscriminate
byte-swizzling step to the (little endian) guest firmware.

This was a grave error -- it violates (3c) --, but I didn't realize it. I
only saw that the code I otherwise intended for fw_cfg_data_mem_read():

    value = 0;
    for (i = 0; i < size; ++i) {
        value = (value << 8) | fw_cfg_read(s);
    }

didn't produce the expected result in the guest.

In true facepalm style, instead of blaming my guest code (which violated
(3c)), I blamed my host code (which was correct). Ultimately, I coded
ldX_he_p() into fw_cfg_data_mem_read(), because that happened to work.

Obviously (...in retrospect) that was wrong. Only because my host happened
to be LE, ldX_he_p() composed the (otherwise incorrect) host value 0x5958
from the fw_cfg string "XY". And that happened to compensate for the bogus
indiscriminate byte-swizzling in my guest code.

Clearly the current code leaks the host endianness through to the guest,
which is wrong. Any device should work the same regardless of host
endianness.

The solution is to compose the host-endian representation (2) of the big
endian interpretation (3a, 3b) of the fw_cfg string, and to drop the wrong
byte-swizzling in the guest (3c).

Brown paper bag time for me.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---

Notes:
    I apologize for this mess -- I really had to let this endianness stuff
    brew in my subconscious for several days. I could only untangle it for
    myself this morning. I had to sit in an unlit room for hours, without a
    computer, until it dawned on me (both literally and figuratively).

 hw/nvram/fw_cfg.c | 41 +++++++----------------------------------
 1 file changed, 7 insertions(+), 34 deletions(-)

Comments

Paolo Bonzini Dec. 31, 2014, 1:20 p.m. UTC | #1
On 31/12/2014 12:21, Laszlo Ersek wrote:
> Because register emulation in QEMU is integer-preserving, not
> string-preserving (see (2)), we have to jump through a few hoops.
> 
> (3a) We defined the memory mapped fw_cfg data register as
> DEVICE_BIG_ENDIAN.
> 
> The particular choice is not really relevant -- we picked BE only for
> consistency with the control register, which *does* transfer integers --
> but our choice affects how we must host-encode values from fw_cfg strings.
> 
> (3b) Since we want the fw_cfg string "XY" to appear as the [0x58, 0x59]
> array on the data register, *and* we picked DEVICE_BIG_ENDIAN, we must
> compose the host (== C language) value 0x5859 in the read accessor
> function.

I'm not sure this is right.

DEVICE_BIG_ENDIAN means that if we return 0xaabb and the guest is little
endian, it will return 0xbbaa.  But the value returned by the accessor
is always in host endianness.

And it makes sense to swap in the guest if the register is big endian
but the guest is little endian.

So IMHO your old code is right.  Either you are overthinking it, or I'm
underthinking it...  Knowing our respective personalities, either
possibility is just as likely... ;)

Paolo
Laszlo Ersek Dec. 31, 2014, 2:07 p.m. UTC | #2
On 12/31/14 14:20, Paolo Bonzini wrote:
> 
> 
> On 31/12/2014 12:21, Laszlo Ersek wrote:
>> Because register emulation in QEMU is integer-preserving, not
>> string-preserving (see (2)), we have to jump through a few hoops.
>>
>> (3a) We defined the memory mapped fw_cfg data register as
>> DEVICE_BIG_ENDIAN.
>>
>> The particular choice is not really relevant -- we picked BE only for
>> consistency with the control register, which *does* transfer integers --
>> but our choice affects how we must host-encode values from fw_cfg strings.
>>
>> (3b) Since we want the fw_cfg string "XY" to appear as the [0x58, 0x59]
>> array on the data register, *and* we picked DEVICE_BIG_ENDIAN, we must
>> compose the host (== C language) value 0x5859 in the read accessor
>> function.
> 
> I'm not sure this is right.
> 
> DEVICE_BIG_ENDIAN means that if we return 0xaabb and the guest is little
> endian, it will return 0xbbaa.

Let me decode that...
- the logical value (which we return from the accessor in host-endian)
is 0xaabb
- the device is big endian, so 0xaabb will mean [0xaa, 0xbb] for it
- you're saying that if the guest is little endian, then it will
  read 0xbbaa.

I agree fully.

> But the value returned by the accessor
> is always in host endianness.

Yes.

> And it makes sense to swap in the guest if the register is big endian
> but the guest is little endian.

Absolutely -- *if* in the guest you care about the integer value that
the accessor function originally returned. (Ie. you want 0xaabb in the
guest.)

But we don't care about that integer value. What we care about is the
fw_cfg string underlying the host value. And the current code will
return a different host value, dependent on host endianness, for the
same underlying string.

> So IMHO your old code is right.

The immediate sign that makes me nervous is the ldX_he_p() calls in the
accessors. "Host endian" means that the same fw_cfg string will cause
different host (== C) integer values to be returned from the accessor if
you change the host from LE to BE. It doesn't seem right.

Obviously this issue could be put to rest if I could test on any big
endian host. Too bad the last sparc64 I used to have access to was
decommissioned years ago. The True64 / alpha box even earlier. :(

Of course this also renders the issue mostly moot -- if none of us can
test the code on a BE host, then that use case simply doesn't exist in
practice.

(Maybe I should build a big endian target of qemu, like mips or powerpc
or sparc, then install a matching Debian distro in this TCG guest, then
build & install qemu-system-aarch64 in it, then test the ARM firmware in
*that*...)

> Either you are overthinking it, or I'm
> underthinking it...  Knowing our respective personalities, either
> possibility is just as likely... ;)

Touché :)

I just wanted to have a clear conscience. The ldX_he_p() thing kept
bugging me, and I would have *hated* to see this patch down the line
from someone else, saying (or not saying, but thinking) "lol, after all
this noise, Laszlo messed it up nonetheless". So, I wrote it up as
precisely as I could for discussion's sake.

You could argue that the commit message is actually the more important
half of the patch, because it tries to codify silent interface
contracts, and I definitely wanted to run those statements by you guys.

If the statements in the commit message are incorrect, then I'm more
than relieved to leave the in-tree code as-is, because then you'll have
investigated my concerns and put them to rest. :) I felt that
*submitting* (but not necessarily committing) this patch was necessary
for keeping my integrity / credibility.

Thanks!
Laszlo
Peter Maydell Dec. 31, 2014, 3:17 p.m. UTC | #3
On 31 December 2014 at 14:07, Laszlo Ersek <lersek@redhat.com> wrote:
> Of course this also renders the issue mostly moot -- if none of us can
> test the code on a BE host, then that use case simply doesn't exist in
> practice.

If you can give me a test image and a command line I can test
it on one of the PPC64 boxes in the GCC compile farm.

-- PMM
Paolo Bonzini Dec. 31, 2014, 4:23 p.m. UTC | #4
On 31/12/2014 16:17, Peter Maydell wrote:
> On 31 December 2014 at 14:07, Laszlo Ersek <lersek@redhat.com> wrote:
>> > Of course this also renders the issue mostly moot -- if none of us can
>> > test the code on a BE host, then that use case simply doesn't exist in
>> > practice.
> If you can give me a test image and a command line I can test
> it on one of the PPC64 boxes in the GCC compile farm.

You can follow the qtest steps in the commit messages of
6c87e3d5967a1d731b5f591a8f0ee6c319c14ca8:

$ arm-softmmu/qemu-system-arm -M virt -machine accel=qtest \
             -qtest stdio -uuid 4600cb32-38ec-4b2f-8acb-81c6ea54f2d8
writew 0x9020008 0x0200
readl 0x9020000

The readl should return "OK 0x000000004600cb32".

Paolo
Laszlo Ersek Dec. 31, 2014, 4:27 p.m. UTC | #5
On 12/31/14 16:17, Peter Maydell wrote:
> On 31 December 2014 at 14:07, Laszlo Ersek <lersek@redhat.com> wrote:
>> Of course this also renders the issue mostly moot -- if none of us can
>> test the code on a BE host, then that use case simply doesn't exist in
>> practice.
> 
> If you can give me a test image and a command line I can test
> it on one of the PPC64 boxes in the GCC compile farm.

Sorry for the late answer, I've been laying out stuff and uploading it
(at 50KB/s :().

I'm very thankful for your help -- please see
<http://people.redhat.com/~lersek/for_Peter/>.

Thanks!
Laszlo
Laszlo Ersek Dec. 31, 2014, 5:04 p.m. UTC | #6
On 12/31/14 17:23, Paolo Bonzini wrote:
> 
> 
> On 31/12/2014 16:17, Peter Maydell wrote:
>> On 31 December 2014 at 14:07, Laszlo Ersek <lersek@redhat.com> wrote:
>>>> Of course this also renders the issue mostly moot -- if none of us can
>>>> test the code on a BE host, then that use case simply doesn't exist in
>>>> practice.
>> If you can give me a test image and a command line I can test
>> it on one of the PPC64 boxes in the GCC compile farm.
> 
> You can follow the qtest steps in the commit messages of
> 6c87e3d5967a1d731b5f591a8f0ee6c319c14ca8:
> 
> $ arm-softmmu/qemu-system-arm -M virt -machine accel=qtest \
>              -qtest stdio -uuid 4600cb32-38ec-4b2f-8acb-81c6ea54f2d8
> writew 0x9020008 0x0200
> readl 0x9020000
> 
> The readl should return "OK 0x000000004600cb32".

The -uuid switch parses the UUID string with qemu_uuid_parse() into the
qemu_uuid array, and the first four bytes are stored with
"%02hhx%02hhx%02hhx%02hhx"; ie. byte-wise.

The fw_cfg_init1() function adds these bytes as a string (== saves a
reference to qemu_uuid), with fw_cfg_add_bytes(). Good.

I looked at the "readl" command implementation in
qtest_process_command(). I don't know what it intends to do.

*If* it prints the number that the guest CPU sees immediately when it
performs the wide read, then it should print 0x0000000032cb0046, on both
big and little endian hosts; assuming a little endian guest.

Namely, the fw_cfg (sub)string in question is [0x46, 0x00, 0xcb, 0x32].
The device is big endian, and the register accessor function should
return the 0x4600cb32 host value in qemu. The guest CPU should see the
same byte array [0x46, 0x00, 0xcb, 0x32], whose direct interpretation in
the little endian guest is 0x32cb0046.

... I think this is a good test case for what we *think* should happen :)

I think the following will happen if Peter executes this test:
- With the current code, QEMU will print different values when the host
endianness changes.
- With the patch under discussion, QEMU will print 0x0000000032cb0046 on
both host endiannesses.

Thanks
Laszlo
Peter Maydell Dec. 31, 2014, 5:17 p.m. UTC | #7
On 31 December 2014 at 16:27, Laszlo Ersek <lersek@redhat.com> wrote:
> On 12/31/14 16:17, Peter Maydell wrote:
>> On 31 December 2014 at 14:07, Laszlo Ersek <lersek@redhat.com> wrote:
>>> Of course this also renders the issue mostly moot -- if none of us can
>>> test the code on a BE host, then that use case simply doesn't exist in
>>> practice.
>>
>> If you can give me a test image and a command line I can test
>> it on one of the PPC64 boxes in the GCC compile farm.
>
> Sorry for the late answer, I've been laying out stuff and uploading it
> (at 50KB/s :().
>
> I'm very thankful for your help -- please see
> <http://people.redhat.com/~lersek/for_Peter/>.

Unpatched QEMU + QEMU_EFI.fd.v4 : doesn't boot
Patched QEMU + QEMU_EFI.fd.v4_noswizzle : boots OK (dracut-initqueue
starts downloading things).

One thing I did notice in the dmesg:

[   35.798423] alg: hash: Test 1 failed for sha1-ce
[   35.799135] 00000000: d3 5b 9a 85 7f 18 48 21 97 5c 12 72 a8 96 62 88
[   35.799815] 00000010: c3 d2 e1 f0
[   35.807121] alg: hash: Test 1 failed for sha224-ce
[   35.807458] 00000000: 75 49 49 85 18 97 6f 0c a7 d3 c3 ee 54 5d b3 59
[   35.807910] 00000010: a0 97 b3 34 f8 9c ab c1 91 e1 24 9e
[   35.814059] alg: hash: Test 1 failed for sha256-ce
[   35.814450] 00000000: b9 cb b3 d4 f5 29 fa 2c 8f 9b 8e 67 e9 6d 24 9b
[   35.814897] 00000010: 83 da e8 b1 ba 32 6c bb 4e 53 77 76 fc 11 e4 aa
[   35.827619] alg: cipher: Test 1 failed on encryption for aes-ce
[   35.828042] 00000000: ba 4a ec ea df 05 7d 76 56 af 92 77 5d 00 e0 90

...does that happen on little endian TCG hosts too, or do we have
a bug in our encryption emulation?

-- PMM
Peter Maydell Dec. 31, 2014, 5:29 p.m. UTC | #8
On 31 December 2014 at 16:23, Paolo Bonzini <pbonzini@redhat.com> wrote:
> $ arm-softmmu/qemu-system-arm -M virt -machine accel=qtest \
>              -qtest stdio -uuid 4600cb32-38ec-4b2f-8acb-81c6ea54f2d8
> writew 0x9020008 0x0200
> readl 0x9020000
>
> The readl should return "OK 0x000000004600cb32".

On BE ppc64 host I get:
OK 0x0000000032cb0046
with or without Laszlo's patch.

On LE x86-64 host I get
OK 0x000000004600cb32
without the patch, and
OK 0x0000000032cb0046
with it.

So I think that the patch is good (in that it removes a host-endian
dependency which we should not have) and your intuition for the
correct result is possibly not so good :-)

-- PMM
Peter Maydell Dec. 31, 2014, 5:37 p.m. UTC | #9
On 31 December 2014 at 17:17, Peter Maydell <peter.maydell@linaro.org> wrote:
> One thing I did notice in the dmesg:
>
> [   35.798423] alg: hash: Test 1 failed for sha1-ce
> [   35.799135] 00000000: d3 5b 9a 85 7f 18 48 21 97 5c 12 72 a8 96 62 88
> [   35.799815] 00000010: c3 d2 e1 f0
> [   35.807121] alg: hash: Test 1 failed for sha224-ce
> [   35.807458] 00000000: 75 49 49 85 18 97 6f 0c a7 d3 c3 ee 54 5d b3 59
> [   35.807910] 00000010: a0 97 b3 34 f8 9c ab c1 91 e1 24 9e
> [   35.814059] alg: hash: Test 1 failed for sha256-ce
> [   35.814450] 00000000: b9 cb b3 d4 f5 29 fa 2c 8f 9b 8e 67 e9 6d 24 9b
> [   35.814897] 00000010: 83 da e8 b1 ba 32 6c bb 4e 53 77 76 fc 11 e4 aa
> [   35.827619] alg: cipher: Test 1 failed on encryption for aes-ce
> [   35.828042] 00000000: ba 4a ec ea df 05 7d 76 56 af 92 77 5d 00 e0 90
>
> ...does that happen on little endian TCG hosts too, or do we have
> a bug in our encryption emulation?

It doesn't happen on LE TCG hosts. Joy :-)

-- PMM
Peter Maydell Dec. 31, 2014, 5:44 p.m. UTC | #10
On 31 December 2014 at 17:17, Peter Maydell <peter.maydell@linaro.org> wrote:
> Unpatched QEMU + QEMU_EFI.fd.v4 : doesn't boot (doesn't load kernel)
> Patched QEMU + QEMU_EFI.fd.v4_noswizzle : boots OK (dracut-initqueue
> starts downloading things).

Just noticed you asked for the full matrix of tests:
patched QEMU + fd.v4 : doesn't boot (doesn't load kernel)
unpatched + noswizzle : boots OK

That's a bit confusing...

-- PMM
Paolo Bonzini Dec. 31, 2014, 5:56 p.m. UTC | #11
On 31/12/2014 18:04, Laszlo Ersek wrote:
> 
> *If* it prints the number that the guest CPU sees immediately when it
> performs the wide read, then it should print 0x0000000032cb0046, on both
> big and little endian hosts; assuming a little endian guest.

Yes.

Paolo

> Namely, the fw_cfg (sub)string in question is [0x46, 0x00, 0xcb, 0x32].
> The device is big endian, and the register accessor function should
> return the 0x4600cb32 host value in qemu. The guest CPU should see the
> same byte array [0x46, 0x00, 0xcb, 0x32], whose direct interpretation in
> the little endian guest is 0x32cb0046.
Laszlo Ersek Dec. 31, 2014, 5:58 p.m. UTC | #12
On 12/31/14 18:44, Peter Maydell wrote:
> On 31 December 2014 at 17:17, Peter Maydell <peter.maydell@linaro.org> wrote:
>> Unpatched QEMU + QEMU_EFI.fd.v4 : doesn't boot (doesn't load kernel)
>> Patched QEMU + QEMU_EFI.fd.v4_noswizzle : boots OK (dracut-initqueue
>> starts downloading things).
> 
> Just noticed you asked for the full matrix of tests:
> patched QEMU + fd.v4 : doesn't boot (doesn't load kernel)
> unpatched + noswizzle : boots OK
> 
> That's a bit confusing...

It's not confusing. When you run unpatched qemu *on a big endian host*,
that's identical to what the patched code does *on a big endian host*.

The full matrix actually has 8 elements (3 dimensions with 2 values per
dimension) -- host endianness, patched qemu vs. unpatched qemu, and
swizzling vs. non-swizzling firmware.

Thanks,
Laszlo
Laszlo Ersek Dec. 31, 2014, 5:59 p.m. UTC | #13
On 12/31/14 18:37, Peter Maydell wrote:
> On 31 December 2014 at 17:17, Peter Maydell <peter.maydell@linaro.org> wrote:
>> One thing I did notice in the dmesg:
>>
>> [   35.798423] alg: hash: Test 1 failed for sha1-ce
>> [   35.799135] 00000000: d3 5b 9a 85 7f 18 48 21 97 5c 12 72 a8 96 62 88
>> [   35.799815] 00000010: c3 d2 e1 f0
>> [   35.807121] alg: hash: Test 1 failed for sha224-ce
>> [   35.807458] 00000000: 75 49 49 85 18 97 6f 0c a7 d3 c3 ee 54 5d b3 59
>> [   35.807910] 00000010: a0 97 b3 34 f8 9c ab c1 91 e1 24 9e
>> [   35.814059] alg: hash: Test 1 failed for sha256-ce
>> [   35.814450] 00000000: b9 cb b3 d4 f5 29 fa 2c 8f 9b 8e 67 e9 6d 24 9b
>> [   35.814897] 00000010: 83 da e8 b1 ba 32 6c bb 4e 53 77 76 fc 11 e4 aa
>> [   35.827619] alg: cipher: Test 1 failed on encryption for aes-ce
>> [   35.828042] 00000000: ba 4a ec ea df 05 7d 76 56 af 92 77 5d 00 e0 90
>>
>> ...does that happen on little endian TCG hosts too, or do we have
>> a bug in our encryption emulation?
> 
> It doesn't happen on LE TCG hosts. Joy :-)

I think I agree, I repeated the test on my x86_64 laptop, and the only
"alg:" line I see is

[   39.286903] alg: No test for stdrng (krng)

Thanks
Laszlo
Ard Biesheuvel Dec. 31, 2014, 6:08 p.m. UTC | #14
On 31 December 2014 at 17:37, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 31 December 2014 at 17:17, Peter Maydell <peter.maydell@linaro.org> wrote:
>> One thing I did notice in the dmesg:
>>
>> [   35.798423] alg: hash: Test 1 failed for sha1-ce
>> [   35.799135] 00000000: d3 5b 9a 85 7f 18 48 21 97 5c 12 72 a8 96 62 88
>> [   35.799815] 00000010: c3 d2 e1 f0
>> [   35.807121] alg: hash: Test 1 failed for sha224-ce
>> [   35.807458] 00000000: 75 49 49 85 18 97 6f 0c a7 d3 c3 ee 54 5d b3 59
>> [   35.807910] 00000010: a0 97 b3 34 f8 9c ab c1 91 e1 24 9e
>> [   35.814059] alg: hash: Test 1 failed for sha256-ce
>> [   35.814450] 00000000: b9 cb b3 d4 f5 29 fa 2c 8f 9b 8e 67 e9 6d 24 9b
>> [   35.814897] 00000010: 83 da e8 b1 ba 32 6c bb 4e 53 77 76 fc 11 e4 aa
>> [   35.827619] alg: cipher: Test 1 failed on encryption for aes-ce
>> [   35.828042] 00000000: ba 4a ec ea df 05 7d 76 56 af 92 77 5d 00 e0 90
>>
>> ...does that happen on little endian TCG hosts too, or do we have
>> a bug in our encryption emulation?
>
> It doesn't happen on LE TCG hosts. Joy :-)
>

I will need to find out first if this is the kernel driver or the
emulation failing. I never tested the former on BE, as the models
don't support the crypto extensions (at least, not without a plugin
that I don't have access to) and the actual hardware boots via EFI
which is LE only, and with no 'ground truth' as a reference, it is
hard to be 100% certain both implementations are in spec, even if they
work in combination.

The SHA emulation code only operates on 32-bit quantities, and the
only memory access it does is populating the CRYPTO_STATE union using
float64_val(). Does anyone feel there is anything obviously wrong with
that?
Laszlo Ersek Dec. 31, 2014, 6:22 p.m. UTC | #15
On 12/31/14 18:58, Laszlo Ersek wrote:
> On 12/31/14 18:44, Peter Maydell wrote:
>> On 31 December 2014 at 17:17, Peter Maydell <peter.maydell@linaro.org> wrote:
>>> Unpatched QEMU + QEMU_EFI.fd.v4 : doesn't boot (doesn't load kernel)
>>> Patched QEMU + QEMU_EFI.fd.v4_noswizzle : boots OK (dracut-initqueue
>>> starts downloading things).
>>
>> Just noticed you asked for the full matrix of tests:
>> patched QEMU + fd.v4 : doesn't boot (doesn't load kernel)
>> unpatched + noswizzle : boots OK
>>
>> That's a bit confusing...
> 
> It's not confusing. When you run unpatched qemu *on a big endian host*,
> that's identical to what the patched code does *on a big endian host*.
> 
> The full matrix actually has 8 elements (3 dimensions with 2 values per
> dimension) -- host endianness, patched qemu vs. unpatched qemu, and
> swizzling vs. non-swizzling firmware.

To elaborate a bit more (I hope I can manage after several glasses of wine):

The central idea is that the firmware should not swizzle directly in the
transport code. It should only swizzle dependent on the individual
integer encodings in the payload for a given key. QEMU's read accessor
should build the host-endian representation of the big-endian
interpretation of the fw_cfg *(sub)string*. This is what the patch does.

The matrix is (guest endianness is invariably LE):

qemu     firmware swizzles       host        boots
patched  (== firmware is buggy)  endianness
-------  ----------------------  ----------  -------------------------
0        0                       BE          yes (your test #4)
0        0                       LE          no (tested right now)

0        1                       BE          no (your test #1)
0        1                       LE          yes (my earliest test)

1        0                       BE          yes (your test #2)
1        0                       LE          yes (my test for this p.)

1        1                       BE          no (your test #3)
1        1                       LE          no (tested right now)

Thanks
Laszlo
Peter Maydell Dec. 31, 2014, 6:25 p.m. UTC | #16
On 31 December 2014 at 18:08, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 31 December 2014 at 17:37, Peter Maydell <peter.maydell@linaro.org> wrote:
>> It doesn't happen on LE TCG hosts. Joy :-)

> I will need to find out first if this is the kernel driver or the
> emulation failing. I never tested the former on BE, as the models
> don't support the crypto extensions (at least, not without a plugin
> that I don't have access to) and the actual hardware boots via EFI
> which is LE only, and with no 'ground truth' as a reference, it is
> hard to be 100% certain both implementations are in spec, even if they
> work in combination.

It's an LE guest in both cases (AArch64); the only difference is the
host system (LE x86-64 vs BE PPC64). If our emulated CPU behaves
differently based on the endianness of the host, it's always an
emulation bug by definition. (Of course it's possible that we're
currently correct on BE host, wrong on LE host, and with the
corresponding bug in LE guest kernels, but that seems unlikely -- I
did do cross-checks against the fast models for the crypto insns
using an LE-host QEMU.)

> The SHA emulation code only operates on 32-bit quantities, and the
> only memory access it does is populating the CRYPTO_STATE union using
> float64_val(). Does anyone feel there is anything obviously wrong with
> that?

How do you extract the 32 bit quantities from the float64_val ?

Definition of vfp.regs[] from cpu.h:
   * In AArch32:
   *  Qn = regs[2n+1]:regs[2n]
   *  Dn = regs[n]
   *  Sn = regs[n/2] bits 31..0 for even n, and bits 63..32 for odd n
   * (and regs[32] to regs[63] are inaccessible)
   * In AArch64:
   *  Qn = regs[2n+1]:regs[2n]
   *  Dn = regs[2n]
   *  Sn = regs[2n] bits 31..0

where regs[x] is always a 64-bit value in host endianness
(ie if you store 0x112233445566778899aabbccddeeff00 as
a 64 bit write to VFP register D0 then regs[0] will be
0x112233445566778899aabbccddeeff00 regardless of host
endianness. That is, the least significant 8 bits of D0
will be (regs[0] & 0xff). (This isn't the same number as
if you do the union-type-punning thing with union {
uint64_t l; uint8_t b[8]; } and look at b[0].)

Notice that for 128-bit AArch64 vectors, the high half
is always regs[2n+1] and the low half regs[2n], regardless
of host endianness. This implies that on big-endian hosts
a 128-bit vector is not represented as a 128-bit host
endian vector at any offset in regs[]. The function
vec_reg_offset() in translate-a64.c may be helpful.

thanks
-- PMM
Laszlo Ersek Dec. 31, 2014, 6:26 p.m. UTC | #17
On 12/31/14 19:08, Ard Biesheuvel wrote:
> On 31 December 2014 at 17:37, Peter Maydell <peter.maydell@linaro.org> wrote:
>> On 31 December 2014 at 17:17, Peter Maydell <peter.maydell@linaro.org> wrote:
>>> One thing I did notice in the dmesg:
>>>
>>> [   35.798423] alg: hash: Test 1 failed for sha1-ce
>>> [   35.799135] 00000000: d3 5b 9a 85 7f 18 48 21 97 5c 12 72 a8 96 62 88
>>> [   35.799815] 00000010: c3 d2 e1 f0
>>> [   35.807121] alg: hash: Test 1 failed for sha224-ce
>>> [   35.807458] 00000000: 75 49 49 85 18 97 6f 0c a7 d3 c3 ee 54 5d b3 59
>>> [   35.807910] 00000010: a0 97 b3 34 f8 9c ab c1 91 e1 24 9e
>>> [   35.814059] alg: hash: Test 1 failed for sha256-ce
>>> [   35.814450] 00000000: b9 cb b3 d4 f5 29 fa 2c 8f 9b 8e 67 e9 6d 24 9b
>>> [   35.814897] 00000010: 83 da e8 b1 ba 32 6c bb 4e 53 77 76 fc 11 e4 aa
>>> [   35.827619] alg: cipher: Test 1 failed on encryption for aes-ce
>>> [   35.828042] 00000000: ba 4a ec ea df 05 7d 76 56 af 92 77 5d 00 e0 90
>>>
>>> ...does that happen on little endian TCG hosts too, or do we have
>>> a bug in our encryption emulation?
>>
>> It doesn't happen on LE TCG hosts. Joy :-)
>>
> 
> I will need to find out first if this is the kernel driver or the
> emulation failing. I never tested the former on BE, as the models
> don't support the crypto extensions (at least, not without a plugin
> that I don't have access to) and the actual hardware boots via EFI
> which is LE only, and with no 'ground truth' as a reference, it is
> hard to be 100% certain both implementations are in spec, even if they
> work in combination.
> 
> The SHA emulation code only operates on 32-bit quantities, and the
> only memory access it does is populating the CRYPTO_STATE union using
> float64_val(). Does anyone feel there is anything obviously wrong with
> that?

I'm not sure, but I seem to recall that Peter said that TCG held all the
virtual registers in host-endianness.

Laszlo
Paolo Bonzini Dec. 31, 2014, 7:25 p.m. UTC | #18
On 31/12/2014 18:29, Peter Maydell wrote:
> On 31 December 2014 at 16:23, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> $ arm-softmmu/qemu-system-arm -M virt -machine accel=qtest \
>>              -qtest stdio -uuid 4600cb32-38ec-4b2f-8acb-81c6ea54f2d8
>> writew 0x9020008 0x0200
>> readl 0x9020000
>>
>> The readl should return "OK 0x000000004600cb32".
> 
> On BE ppc64 host I get:
> OK 0x0000000032cb0046
> with or without Laszlo's patch.
> 
> On LE x86-64 host I get
> OK 0x000000004600cb32
> without the patch, and
> OK 0x0000000032cb0046
> with it.
> 
> So I think that the patch is good (in that it removes a host-endian
> dependency which we should not have) and your intuition for the
> correct result is possibly not so good :-)

Indeed.  I give up until at least the end of the holiday break.

Paolo
Laszlo Ersek Dec. 31, 2014, 7:32 p.m. UTC | #19
On 12/31/14 20:25, Paolo Bonzini wrote:
> 
> 
> On 31/12/2014 18:29, Peter Maydell wrote:
>> On 31 December 2014 at 16:23, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> $ arm-softmmu/qemu-system-arm -M virt -machine accel=qtest \
>>>              -qtest stdio -uuid 4600cb32-38ec-4b2f-8acb-81c6ea54f2d8
>>> writew 0x9020008 0x0200
>>> readl 0x9020000
>>>
>>> The readl should return "OK 0x000000004600cb32".
>>
>> On BE ppc64 host I get:
>> OK 0x0000000032cb0046
>> with or without Laszlo's patch.
>>
>> On LE x86-64 host I get
>> OK 0x000000004600cb32
>> without the patch, and
>> OK 0x0000000032cb0046
>> with it.
>>
>> So I think that the patch is good (in that it removes a host-endian
>> dependency which we should not have) and your intuition for the
>> correct result is possibly not so good :-)
> 
> Indeed.  I give up until at least the end of the holiday break.

I disagree with the wording "give up". I owe you a great deal of
gratitude for the technical advice and the time & attention you've spent
helping me with this. I hope I can reciprocate sometime.

Laszlo
Ard Biesheuvel Jan. 1, 2015, 10:27 a.m. UTC | #20
> On 31 dec. 2014, at 18:25, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
>> On 31 December 2014 at 18:08, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>> On 31 December 2014 at 17:37, Peter Maydell <peter.maydell@linaro.org> wrote:
>>> It doesn't happen on LE TCG hosts. Joy :-)
> 
>> I will need to find out first if this is the kernel driver or the
>> emulation failing. I never tested the former on BE, as the models
>> don't support the crypto extensions (at least, not without a plugin
>> that I don't have access to) and the actual hardware boots via EFI
>> which is LE only, and with no 'ground truth' as a reference, it is
>> hard to be 100% certain both implementations are in spec, even if they
>> work in combination.
> 
> It's an LE guest in both cases (AArch64); the only difference is the
> host system (LE x86-64 vs BE PPC64). If our emulated CPU behaves
> differently based on the endianness of the host, it's always an
> emulation bug by definition. (Of course it's possible that we're
> currently correct on BE host, wrong on LE host, and with the
> corresponding bug in LE guest kernels, but that seems unlikely -- I
> did do cross-checks against the fast models for the crypto insns
> using an LE-host QEMU.)
> 

Ok, I had missed that bit of context, but obviously, if the only difference is the endianness of the host, it must be an emulation bug.

>> The SHA emulation code only operates on 32-bit quantities, and the
>> only memory access it does is populating the CRYPTO_STATE union using
>> float64_val(). Does anyone feel there is anything obviously wrong with
>> that?
> 
> How do you extract the 32 bit quantities from the float64_val ?
> 
> Definition of vfp.regs[] from cpu.h:
>   * In AArch32:
>   *  Qn = regs[2n+1]:regs[2n]
>   *  Dn = regs[n]
>   *  Sn = regs[n/2] bits 31..0 for even n, and bits 63..32 for odd n
>   * (and regs[32] to regs[63] are inaccessible)
>   * In AArch64:
>   *  Qn = regs[2n+1]:regs[2n]
>   *  Dn = regs[2n]
>   *  Sn = regs[2n] bits 31..0
> 
> where regs[x] is always a 64-bit value in host endianness
> (ie if you store 0x112233445566778899aabbccddeeff00 as
> a 64 bit write to VFP register D0 then regs[0] will be
> 0x112233445566778899aabbccddeeff00 regardless of host
> endianness. That is, the least significant 8 bits of D0
> will be (regs[0] & 0xff). (This isn't the same number as
> if you do the union-type-punning thing with union {
> uint64_t l; uint8_t b[8]; } and look at b[0].)
> 

Ok, that must be the culprit then. I indeed use a union to convert between the various operand sizes.

> Notice that for 128-bit AArch64 vectors, the high half
> is always regs[2n+1] and the low half regs[2n], regardless
> of host endianness. This implies that on big-endian hosts
> a 128-bit vector is not represented as a 128-bit host
> endian vector at any offset in regs[]. The function
> vec_reg_offset() in translate-a64.c may be helpful.

I cannot easily test this myself, but i will try to cook up a patch anyway.

Cheers,
Ard.
Peter Maydell Jan. 5, 2015, 3:24 p.m. UTC | #21
On 31 December 2014 at 11:21, Laszlo Ersek <lersek@redhat.com> wrote:
> (1) Let's contemplate what device endianness means

...let's not, I value my sanity :-)

>  hw/nvram/fw_cfg.c | 41 +++++++----------------------------------
>  1 file changed, 7 insertions(+), 34 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

and tested on BE host, and applied to target-arm.next.

thanks
-- PMM
Laszlo Ersek Jan. 5, 2015, 3:33 p.m. UTC | #22
On 01/05/15 16:24, Peter Maydell wrote:
> On 31 December 2014 at 11:21, Laszlo Ersek <lersek@redhat.com> wrote:
>> (1) Let's contemplate what device endianness means
> 
> ...let's not, I value my sanity :-)

Lol. :)

I'm happy to have sacrificed mine for this matter! ;)

>>  hw/nvram/fw_cfg.c | 41 +++++++----------------------------------
>>  1 file changed, 7 insertions(+), 34 deletions(-)
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> 
> and tested on BE host, and applied to target-arm.next.

Yay!

(Also -- do you tend to push that branch too? I've been stalking it in
<git://git.linaro.org/people/pmaydell/qemu-arm.git> for a few days now
but I'm not seeing new commits.)

Thanks!
Laszlo
diff mbox

Patch

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index fcdf821..78a37be 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -287,51 +287,24 @@  static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
                                      unsigned size)
 {
     FWCfgState *s = opaque;
-    uint8_t buf[8];
+    uint64_t value = 0;
     unsigned i;
 
     for (i = 0; i < size; ++i) {
-        buf[i] = fw_cfg_read(s);
+        value = (value << 8) | fw_cfg_read(s);
     }
-    switch (size) {
-    case 1:
-        return buf[0];
-    case 2:
-        return lduw_he_p(buf);
-    case 4:
-        return (uint32_t)ldl_he_p(buf);
-    case 8:
-        return ldq_he_p(buf);
-    }
-    abort();
+    return value;
 }
 
 static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
                                   uint64_t value, unsigned size)
 {
     FWCfgState *s = opaque;
-    uint8_t buf[8];
-    unsigned i;
+    unsigned i = size;
 
-    switch (size) {
-    case 1:
-        buf[0] = value;
-        break;
-    case 2:
-        stw_he_p(buf, value);
-        break;
-    case 4:
-        stl_he_p(buf, value);
-        break;
-    case 8:
-        stq_he_p(buf, value);
-        break;
-    default:
-        abort();
-    }
-    for (i = 0; i < size; ++i) {
-        fw_cfg_write(s, buf[i]);
-    }
+    do {
+        fw_cfg_write(s, value >> (8 * --i));
+    } while (i);
 }
 
 static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr,