diff mbox

[RFC,1/1] memory: Support unaligned accesses on aligned-only models

Message ID 20170630030058.28943-1-andrew@aj.id.au
State New
Headers show

Commit Message

Andrew Jeffery June 30, 2017, 3 a.m. UTC
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
---
Hello,

This RFC patch stems from a discussion on a patch for an ADC model[1] where it
was pointed out that I should be able to use the .impl member of
MemoryRegionOps to constrain how my read() and write() callbacks where invoked.

I tried Phil's suggested approach and found I got reads of size 4, but with an
address that was not 4-byte aligned.

Looking at the source for access_with_adjusted_size() lead to the comment

     /* FIXME: support unaligned access? */

which at least suggests that the implementation isn't complete.

So, this patch is a quick and incomplete attempt at resolving the issue to see
whether I'm on the right track or way off in the weeds.

I've lightly tested it with the ADC model mentioned above, and it appears to do
the right thing (I changed the values generated by the ADC to distinguish
between the lower and upper 16 bits).

Things the patch is not:

1. Capable of handling unaligned writes
2. Tested on big-endian models

If the general idea of the patch is reasonable I'll look to resolve the above
to points.

Cheers,

Andrew

[1] http://lists.nongnu.org/archive/html/qemu-arm/2017-05/msg00400.html
 memory.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 137 insertions(+), 7 deletions(-)

Comments

Paolo Bonzini July 13, 2017, 12:05 p.m. UTC | #1
On 30/06/2017 05:00, Andrew Jeffery wrote:
> This RFC patch stems from a discussion on a patch for an ADC model[1] where it
> was pointed out that I should be able to use the .impl member of
> MemoryRegionOps to constrain how my read() and write() callbacks where invoked.
> 
> I tried Phil's suggested approach and found I got reads of size 4, but with an
> address that was not 4-byte aligned.
> 
> Looking at the source for access_with_adjusted_size() lead to the comment
> 
>      /* FIXME: support unaligned access? */
> 
> which at least suggests that the implementation isn't complete.
> 
> So, this patch is a quick and incomplete attempt at resolving the issue to see
> whether I'm on the right track or way off in the weeds.
> 
> I've lightly tested it with the ADC model mentioned above, and it appears to do
> the right thing (I changed the values generated by the ADC to distinguish
> between the lower and upper 16 bits).

I think the idea is okay.

> +    access_addr[0] = align_down(addr, access_size);
> +    access_addr[1] = align_up(addr + size, access_size);
> +
> +        for (cur = access_addr[0]; cur < access_addr[1]; cur += access_size) {
> +            uint64_t mask_bounds[2];
> +            mask_bounds[0] = MAX(addr, cur) - cur;
> +            mask_bounds[1] =
> +                MIN(addr + size, align_up(cur + 1, access_size)) - cur;
> +
> +            access_mask = (-1ULL << mask_bounds[0] * 8) &
> +                (-1ULL >> (64 - mask_bounds[1] * 8));

Please use MAKE_64BIT_MASK.

> +            r |= access(mr, cur, &access_value, access_size,
> +                  (MAX(addr, cur) - addr), access_mask, attrs);
> +
> +            /* XXX: Can't do this hack for writes */
> +            access_value >>= mask_bounds[0] * 8;
> +        }

Can you subtract access_addr[0] from mask_bounds[0] and mask_bounds[1]
(instead of cur) to remove the need for this right shift?

Thanks,

Paolo
Andrew Jeffery July 14, 2017, 6:20 a.m. UTC | #2
Hi Paolo,

Thanks for taking a look!

On Thu, 2017-07-13 at 14:05 +0200, Paolo Bonzini wrote:
> On 30/06/2017 05:00, Andrew Jeffery wrote:
> > This RFC patch stems from a discussion on a patch for an ADC model[1] where it
> > was pointed out that I should be able to use the .impl member of
> > MemoryRegionOps to constrain how my read() and write() callbacks where invoked.
> > 
> > I tried Phil's suggested approach and found I got reads of size 4, but with an
> > address that was not 4-byte aligned.
> > 
> > Looking at the source for access_with_adjusted_size() lead to the comment
> > 
> >      /* FIXME: support unaligned access? */
> > 
> > which at least suggests that the implementation isn't complete.
> > 
> > So, this patch is a quick and incomplete attempt at resolving the issue to see
> > whether I'm on the right track or way off in the weeds.
> > 
> > I've lightly tested it with the ADC model mentioned above, and it appears to do
> > the right thing (I changed the values generated by the ADC to distinguish
> > between the lower and upper 16 bits).
> 
> I think the idea is okay.
> 
> > +    access_addr[0] = align_down(addr, access_size);
> > +    access_addr[1] = align_up(addr + size, access_size);
> > +
> > +        for (cur = access_addr[0]; cur < access_addr[1]; cur += access_size) {
> > +            uint64_t mask_bounds[2];
> > +            mask_bounds[0] = MAX(addr, cur) - cur;
> > +            mask_bounds[1] =
> > +                MIN(addr + size, align_up(cur + 1, access_size)) - cur;
> > +
> > +            access_mask = (-1ULL << mask_bounds[0] * 8) &
> > +                (-1ULL >> (64 - mask_bounds[1] * 8));
> 
> Please use MAKE_64BIT_MASK.

Okay.

> 
> > +            r |= access(mr, cur, &access_value, access_size,
> > +                  (MAX(addr, cur) - addr), access_mask, attrs);
> > +
> > +            /* XXX: Can't do this hack for writes */
> > +            access_value >>= mask_bounds[0] * 8;
> > +        }
> 
> Can you subtract access_addr[0] from mask_bounds[0] and mask_bounds[1]
> (instead of cur) to remove the need for this right shift?

I haven't looked at the patch since I sent it. Given you think the idea
 of the patch is okay I'll get back to working on it and think about
this.

Cheers,

Andrew

> 
> Thanks,
> 
> Paolo
Cédric Le Goater Sept. 20, 2018, 12:13 p.m. UTC | #3
Andrew,

On 07/14/2017 08:20 AM, Andrew Jeffery wrote:
> Hi Paolo,
> 
> Thanks for taking a look!
> 
> On Thu, 2017-07-13 at 14:05 +0200, Paolo Bonzini wrote:
>> On 30/06/2017 05:00, Andrew Jeffery wrote:
>>> This RFC patch stems from a discussion on a patch for an ADC model[1] where it
>>> was pointed out that I should be able to use the .impl member of
>>> MemoryRegionOps to constrain how my read() and write() callbacks where invoked.
>>>
>>> I tried Phil's suggested approach and found I got reads of size 4, but with an
>>> address that was not 4-byte aligned.
>>>
>>> Looking at the source for access_with_adjusted_size() lead to the comment
>>>
>>>      /* FIXME: support unaligned access? */
>>>
>>> which at least suggests that the implementation isn't complete.
>>>
>>> So, this patch is a quick and incomplete attempt at resolving the issue to see
>>> whether I'm on the right track or way off in the weeds.
>>>
>>> I've lightly tested it with the ADC model mentioned above, and it appears to do
>>> the right thing (I changed the values generated by the ADC to distinguish
>>> between the lower and upper 16 bits).
>>
>> I think the idea is okay.
>>
>>> +    access_addr[0] = align_down(addr, access_size);
>>> +    access_addr[1] = align_up(addr + size, access_size);
>>> +
>>> +        for (cur = access_addr[0]; cur < access_addr[1]; cur += access_size) {
>>> +            uint64_t mask_bounds[2];
>>> +            mask_bounds[0] = MAX(addr, cur) - cur;
>>> +            mask_bounds[1] =
>>> +                MIN(addr + size, align_up(cur + 1, access_size)) - cur;
>>> +
>>> +            access_mask = (-1ULL << mask_bounds[0] * 8) &
>>> +                (-1ULL >> (64 - mask_bounds[1] * 8));
>>
>> Please use MAKE_64BIT_MASK.
> 
> Okay.
> 
>>
>>> +            r |= access(mr, cur, &access_value, access_size,
>>> +                  (MAX(addr, cur) - addr), access_mask, attrs);
>>> +
>>> +            /* XXX: Can't do this hack for writes */
>>> +            access_value >>= mask_bounds[0] * 8;
>>> +        }
>>
>> Can you subtract access_addr[0] from mask_bounds[0] and mask_bounds[1]
>> (instead of cur) to remove the need for this right shift?
> 
> I haven't looked at the patch since I sent it. Given you think the idea
> of the patch is okay I'll get back to working on it and think about
> this.

We have been using successfully this patch for over a year now and it 
would be nice to get it merged along with the ADC model. Do you have 
a new version addressing Paolo's remarks ?

Thanks,

C.
Andrew Jeffery Sept. 20, 2018, 1:42 p.m. UTC | #4
On Thu, 20 Sep 2018, at 21:43, Cédric Le Goater wrote:
> Andrew,
> 
> On 07/14/2017 08:20 AM, Andrew Jeffery wrote:
> > Hi Paolo,
> > 
> > Thanks for taking a look!
> > 
> > On Thu, 2017-07-13 at 14:05 +0200, Paolo Bonzini wrote:
> >> On 30/06/2017 05:00, Andrew Jeffery wrote:
> >>> This RFC patch stems from a discussion on a patch for an ADC model[1] where it
> >>> was pointed out that I should be able to use the .impl member of
> >>> MemoryRegionOps to constrain how my read() and write() callbacks where invoked.
> >>>
> >>> I tried Phil's suggested approach and found I got reads of size 4, but with an
> >>> address that was not 4-byte aligned.
> >>>
> >>> Looking at the source for access_with_adjusted_size() lead to the comment
> >>>
> >>>      /* FIXME: support unaligned access? */
> >>>
> >>> which at least suggests that the implementation isn't complete.
> >>>
> >>> So, this patch is a quick and incomplete attempt at resolving the issue to see
> >>> whether I'm on the right track or way off in the weeds.
> >>>
> >>> I've lightly tested it with the ADC model mentioned above, and it appears to do
> >>> the right thing (I changed the values generated by the ADC to distinguish
> >>> between the lower and upper 16 bits).
> >>
> >> I think the idea is okay.
> >>
> >>> +    access_addr[0] = align_down(addr, access_size);
> >>> +    access_addr[1] = align_up(addr + size, access_size);
> >>> +
> >>> +        for (cur = access_addr[0]; cur < access_addr[1]; cur += access_size) {
> >>> +            uint64_t mask_bounds[2];
> >>> +            mask_bounds[0] = MAX(addr, cur) - cur;
> >>> +            mask_bounds[1] =
> >>> +                MIN(addr + size, align_up(cur + 1, access_size)) - cur;
> >>> +
> >>> +            access_mask = (-1ULL << mask_bounds[0] * 8) &
> >>> +                (-1ULL >> (64 - mask_bounds[1] * 8));
> >>
> >> Please use MAKE_64BIT_MASK.
> > 
> > Okay.
> > 
> >>
> >>> +            r |= access(mr, cur, &access_value, access_size,
> >>> +                  (MAX(addr, cur) - addr), access_mask, attrs);
> >>> +
> >>> +            /* XXX: Can't do this hack for writes */
> >>> +            access_value >>= mask_bounds[0] * 8;
> >>> +        }
> >>
> >> Can you subtract access_addr[0] from mask_bounds[0] and mask_bounds[1]
> >> (instead of cur) to remove the need for this right shift?
> > 
> > I haven't looked at the patch since I sent it. Given you think the idea
> > of the patch is okay I'll get back to working on it and think about
> > this.
> 
> We have been using successfully this patch for over a year now and it 
> would be nice to get it merged along with the ADC model. Do you have 
> a new version addressing Paolo's remarks ?

Aggh, no :( I've been meaning to get to it, but have been significantly distracted
for quite some time now. I'm not sure when I'll be able to look at it. I'm happy for
someone else to have a crack at fixing it in the mean time :)

Thanks for the reminder.

Andrew
diff mbox

Patch

diff --git a/memory.c b/memory.c
index 0ddc4cc28deb..b9fae8d382bc 100644
--- a/memory.c
+++ b/memory.c
@@ -432,6 +432,7 @@  static MemTxResult  memory_region_read_accessor(MemoryRegion *mr,
 {
     uint64_t tmp;
 
+
     tmp = mr->ops->read(mr->opaque, addr, size);
     if (mr->subpage) {
         trace_memory_region_subpage_read(get_cpu_index(), mr, addr, tmp, size);
@@ -552,7 +553,7 @@  static MemTxResult memory_region_write_with_attrs_accessor(MemoryRegion *mr,
     return mr->ops->write_with_attrs(mr->opaque, addr, tmp, size, attrs);
 }
 
-static MemTxResult access_with_adjusted_size(hwaddr addr,
+static MemTxResult access_with_adjusted_size_aligned(hwaddr addr,
                                       uint64_t *value,
                                       unsigned size,
                                       unsigned access_size_min,
@@ -567,10 +568,11 @@  static MemTxResult access_with_adjusted_size(hwaddr addr,
                                       MemoryRegion *mr,
                                       MemTxAttrs attrs)
 {
+
+    MemTxResult r = MEMTX_OK;
     uint64_t access_mask;
     unsigned access_size;
     unsigned i;
-    MemTxResult r = MEMTX_OK;
 
     if (!access_size_min) {
         access_size_min = 1;
@@ -579,7 +581,6 @@  static MemTxResult access_with_adjusted_size(hwaddr addr,
         access_size_max = 4;
     }
 
-    /* FIXME: support unaligned access? */
     access_size = MAX(MIN(size, access_size_max), access_size_min);
     access_mask = -1ULL >> (64 - access_size * 8);
     if (memory_region_big_endian(mr)) {
@@ -596,6 +597,133 @@  static MemTxResult access_with_adjusted_size(hwaddr addr,
     return r;
 }
 
+/* Assume power-of-two size */
+#define align_down(addr, size) ((addr) & ~((size) - 1))
+#define align_up(addr, size) \
+    ({ typeof(size) __size = size; align_down((addr) + (__size) - 1, (__size)); })
+
+static MemTxResult access_with_adjusted_size_unaligned(hwaddr addr,
+                                      uint64_t *value,
+                                      unsigned size,
+                                      unsigned access_size_min,
+                                      unsigned access_size_max,
+                                      bool unaligned,
+                                      MemTxResult (*access)(MemoryRegion *mr,
+                                                            hwaddr addr,
+                                                            uint64_t *value,
+                                                            unsigned size,
+                                                            unsigned shift,
+                                                            uint64_t mask,
+                                                            MemTxAttrs attrs),
+                                      MemoryRegion *mr,
+                                      MemTxAttrs attrs)
+{
+    uint64_t access_value = 0;
+    MemTxResult r = MEMTX_OK;
+    hwaddr access_addr[2];
+    uint64_t access_mask;
+    unsigned access_size;
+
+    if (unlikely(!access_size_min)) {
+        access_size_min = 1;
+    }
+    if (unlikely(!access_size_max)) {
+        access_size_max = 4;
+    }
+
+    access_size = MAX(MIN(size, access_size_max), access_size_min);
+    access_addr[0] = align_down(addr, access_size);
+    access_addr[1] = align_up(addr + size, access_size);
+
+    if (memory_region_big_endian(mr)) {
+        hwaddr cur;
+
+        /* XXX: Big-endian path is untested...  */
+
+        for (cur = access_addr[0]; cur < access_addr[1]; cur += access_size) {
+            uint64_t mask_bounds[2];
+
+            mask_bounds[0] = MAX(addr, cur) - cur;
+            mask_bounds[1] =
+                MIN(addr + size, align_up(cur + 1, access_size)) - cur;
+
+            access_mask = (-1ULL << mask_bounds[0] * 8) &
+                (-1ULL >> (64 - mask_bounds[1] * 8));
+
+            r |= access(mr, cur, &access_value, access_size,
+                  (size - access_size - (MAX(addr, cur) - addr)),
+                  access_mask, attrs);
+
+            /* XXX: Can't do this hack for writes */
+            access_value >>= mask_bounds[0] * 8;
+        }
+    } else {
+        hwaddr cur;
+
+        for (cur = access_addr[0]; cur < access_addr[1]; cur += access_size) {
+            uint64_t mask_bounds[2];
+
+            mask_bounds[0] = MAX(addr, cur) - cur;
+            mask_bounds[1] =
+                MIN(addr + size, align_up(cur + 1, access_size)) - cur;
+
+            access_mask = (-1ULL << mask_bounds[0] * 8) &
+                (-1ULL >> (64 - mask_bounds[1] * 8));
+
+            r |= access(mr, cur, &access_value, access_size,
+                  (MAX(addr, cur) - addr), access_mask, attrs);
+
+            /* XXX: Can't do this hack for writes */
+            access_value >>= mask_bounds[0] * 8;
+        }
+    }
+
+    *value = access_value;
+
+    return r;
+}
+
+static inline MemTxResult access_with_adjusted_size(hwaddr addr,
+                                      uint64_t *value,
+                                      unsigned size,
+                                      unsigned access_size_min,
+                                      unsigned access_size_max,
+                                      bool unaligned,
+                                      MemTxResult (*access)(MemoryRegion *mr,
+                                                            hwaddr addr,
+                                                            uint64_t *value,
+                                                            unsigned size,
+                                                            unsigned shift,
+                                                            uint64_t mask,
+                                                            MemTxAttrs attrs),
+                                      MemoryRegion *mr,
+                                      MemTxAttrs attrs)
+{
+    unsigned access_size;
+
+    if (!access_size_min) {
+        access_size_min = 1;
+    }
+    if (!access_size_max) {
+        access_size_max = 4;
+    }
+
+    access_size = MAX(MIN(size, access_size_max), access_size_min);
+
+    /* Handle unaligned accesses if the model only supports natural alignment */
+    if (unlikely((addr & (access_size - 1)) && !unaligned)) {
+        return access_with_adjusted_size_unaligned(addr, value, size,
+                access_size_min, access_size_max, unaligned, access, mr, attrs);
+    }
+
+    /*
+     * Otherwise, if the access is aligned or the model specifies it can handle
+     * unaligned accesses, use the 'aligned' handler
+     */
+    return access_with_adjusted_size_aligned(addr, value, size,
+            access_size_min, access_size_max, access, mr, attrs);
+}
+
 static AddressSpace *memory_region_to_address_space(MemoryRegion *mr)
 {
     AddressSpace *as;
@@ -1238,16 +1366,18 @@  static MemTxResult memory_region_dispatch_read1(MemoryRegion *mr,
         return access_with_adjusted_size(addr, pval, size,
                                          mr->ops->impl.min_access_size,
                                          mr->ops->impl.max_access_size,
+                                         mr->ops->impl.unaligned,
                                          memory_region_read_accessor,
                                          mr, attrs);
     } else if (mr->ops->read_with_attrs) {
         return access_with_adjusted_size(addr, pval, size,
                                          mr->ops->impl.min_access_size,
                                          mr->ops->impl.max_access_size,
+                                         mr->ops->impl.unaligned,
                                          memory_region_read_with_attrs_accessor,
                                          mr, attrs);
     } else {
-        return access_with_adjusted_size(addr, pval, size, 1, 4,
+        return access_with_adjusted_size(addr, pval, size, 1, 4, true,
                                          memory_region_oldmmio_read_accessor,
                                          mr, attrs);
     }
@@ -1316,20 +1446,20 @@  MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
     }
 
     if (mr->ops->write) {
-        return access_with_adjusted_size(addr, &data, size,
+        return access_with_adjusted_size_aligned(addr, &data, size,
                                          mr->ops->impl.min_access_size,
                                          mr->ops->impl.max_access_size,
                                          memory_region_write_accessor, mr,
                                          attrs);
     } else if (mr->ops->write_with_attrs) {
         return
-            access_with_adjusted_size(addr, &data, size,
+            access_with_adjusted_size_aligned(addr, &data, size,
                                       mr->ops->impl.min_access_size,
                                       mr->ops->impl.max_access_size,
                                       memory_region_write_with_attrs_accessor,
                                       mr, attrs);
     } else {
-        return access_with_adjusted_size(addr, &data, size, 1, 4,
+        return access_with_adjusted_size_aligned(addr, &data, size, 1, 4,
                                          memory_region_oldmmio_write_accessor,
                                          mr, attrs);
     }