
Add NX P7+ support

Message ID 1425950787-821-1-git-send-email-ddstreet@ieee.org
State Accepted

Commit Message

Dan Streetman March 10, 2015, 1:26 a.m. UTC
Add NX config register values for P7+.  Remove "P8" from all register
defines, where the define is common to P7+ and P8.  For values new to P8
(specifically 842 prefetching), only enable on P8.

This should correctly set up the NX coprocessors on P7+ systems.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
---

I don't have access to any P7+ bare metal system right now, so I haven't
tested this to verify it works on P7+, but I did verify it still works
on a P8 system.

 hw/nx-842.c    |  59 ++++++++++++++------------
 hw/nx-crypto.c |  82 ++++++++++++++++++------------------
 hw/nx-rng.c    |  19 ++++-----
 include/nx.h   | 131 ++++++++++++++++++++++++++++++---------------------------
 4 files changed, 150 insertions(+), 141 deletions(-)
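
For illustration only, the P8-only prefetch handling described in the commit
message boils down to something like the sketch below. The EX_ names and the
two bit positions are placeholders invented for this example (the real masks
live in include/nx.h), and ex_set_bit_field() is a simplified stand-in for
skiboot's SETFIELD, limited to single-bit fields.

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(bit)	(0x8000000000000000ULL >> (bit))

/* Placeholder single-bit masks (assumed positions, not taken from nx.h). */
#define EX_DMA_CFG_842_COMPRESS_PREFETCH	PPC_BIT(23)
#define EX_DMA_CFG_842_DECOMPRESS_PREFETCH	PPC_BIT(24)

/* Set or clear a single-bit field in a register image. */
uint64_t ex_set_bit_field(uint64_t reg, uint64_t mask, int enable)
{
	return enable ? (reg | mask) : (reg & ~mask);
}

/* Enable 842 prefetching only on P8; write the bits as 0 on P7+. */
uint64_t ex_cfg_prefetch(uint64_t cfg, int pnum)
{
	assert(pnum == 7 || pnum == 8);

	cfg = ex_set_bit_field(cfg, EX_DMA_CFG_842_COMPRESS_PREFETCH,
			       pnum == 8);
	cfg = ex_set_bit_field(cfg, EX_DMA_CFG_842_DECOMPRESS_PREFETCH,
			       pnum == 8);
	return cfg;
}
```

All other bits of the register image pass through unchanged, which matches
the read-modify-write pattern used by nx_cfg_dma() in the patch.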

Comments

Stewart Smith March 13, 2015, 2:59 a.m. UTC | #1
Dan Streetman <ddstreet@ieee.org> writes:
> Add NX config register values for P7+.  Remove "P8" from all register
> defines, where the define is common to P7+ and P8.  For values new to P8
> (specifically 842 prefetching), only enable on P8.
>
> This should correctly set up the NX coprocessors on P7+ systems.
>
> Signed-off-by: Dan Streetman <ddstreet@ieee.org>
> ---
>
> I don't have access to any P7+ bare metal system right now, so I haven't
> tested this to verify it works on P7+, but I did verify it still works
> on a P8 system.

I have P7+ in the lab in CBR that I can test on. I guess the best way is
with the kernel patches?

From a quick glance, there's nothing we can really do to abstract away
the slight differences between P7 and P8 for kernel?

It doesn't really matter anyway, as P7 with OPAL has only ever been used
internally for development purposes.
Dan Streetman March 16, 2015, 6:03 p.m. UTC | #2
On Thu, Mar 12, 2015 at 10:59 PM, Stewart Smith
<stewart@linux.vnet.ibm.com> wrote:
> Dan Streetman <ddstreet@ieee.org> writes:
>> Add NX config register values for P7+.  Remove "P8" from all register
>> defines, where the define is common to P7+ and P8.  For values new to P8
>> (specifically 842 prefetching), only enable on P8.
>>
>> This should correctly set up the NX coprocessors on P7+ systems.
>>
>> Signed-off-by: Dan Streetman <ddstreet@ieee.org>
>> ---
>>
>> I don't have access to any P7+ bare metal system right now, so I haven't
>> tested this to verify it works on P7+, but I did verify it still works
>> on a P8 system.
>
> I have P7+ in the lab in CBR that I can test on. I guess the best way is
> with the kernel patches?

yep.  I'll resend the patch set, updated for the nx node location per
the other skiboot patch I sent.  I can provide a pre-built kernel with
the patches too if you want, either using pkvm 3.1 src or the latest
upstream kernel src.

once you get the patched kernel loaded on the system, you just need to
load the module nx-compress-powernv.  To test, you could use zswap,
although it's a bit complicated to set up; or you can use the
compression test module i added, comp_selftest.  Once you load
comp_selftest there's a debugfs interface; in the
/sys/kernel/debugfs/comp_selftest dir you can do:

echo 842 > compressor
echo 1 > running

that should start it testing, you can do
cat status

for a continually updated status, and you can adjust the number of
threads and other params using the other nodes in the dir.  You can
stop the test by writing 0 to the running node.  (I still plan to
adjust the interface here, breaking out each of the status info values
into their own nodes, and maybe removing the status node, as you can
easily script the constant checking...)
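
The debugfs steps above could also be driven from a small C helper instead
of the shell. This is only a sketch under assumptions: the function names
are made up for illustration, only the "compressor" and "running" nodes
named above are used, and the comp_selftest directory is passed in by the
caller rather than hard-coded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write `value` plus a newline to <dir>/<node>; returns 0 on success. */
int write_node(const char *dir, const char *node, const char *value)
{
	char path[512];
	FILE *f;
	int n, rc;

	assert(dir && node && value);

	n = snprintf(path, sizeof(path), "%s/%s", dir, node);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* path would have been truncated */

	f = fopen(path, "w");
	if (!f)
		return -1;

	rc = (fprintf(f, "%s\n", value) < 0) ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}

/* Equivalent of: echo 842 > compressor; echo 1 > running */
int start_842_selftest(const char *dir)
{
	if (write_node(dir, "compressor", "842"))
		return -1;
	return write_node(dir, "running", "1");
}

/* Equivalent of: echo 0 > running */
int stop_selftest(const char *dir)
{
	return write_node(dir, "running", "0");
}
```

A caller would pass the comp_selftest debugfs directory as `dir`, call
start_842_selftest(), poll the status node, and finish with stop_selftest().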

On the P8 test systems, I was getting some pretty impressive
compression throughput (between 5 and 10 GBps per thread; it seemed to
be only about 1/4 or so of the throughput of the memcpy-only null
compressor), so I'm interested to see what it looks like on the p7+
system.

>
> From a quick glance, there's nothing we can really do to abstract away
> the slight differences between P7 and P8 for kernel?

Well the kernel driver doesn't care at all about P7/P8 except for the
xscom r/w, since the xscom addrs are different between P7/P8.  As I
haven't actually added any error recovery (via xscoms) to the kernel
driver yet, I could remove the xscom stuff from the first kernel patch
set.  However, if the kernel driver is who eventually monitors and
handles NX errors, I don't see any way around r/w directly from the NX
xscoms.  The P7/P8 differences should be able to be abstracted in the
nx-842-xscom.h header though, and I don't think the actual driver will
need to know the P num.

Also, we could use skiboot to add device tree nodes telling the driver
what NX xscom register addrs to use, which would remove the P7/P8
specifics from the kernel completely, but then we rely on having to
update skiboot if we ever want to r/w a new NX xscom register in the
kernel driver, which isn't ideal.
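
As a rough illustration of the header-based abstraction mentioned above,
the P7/P8 differences could be reduced to a lookup table keyed on the NX
compatible string, so the rest of the driver never sees the P number. The
struct, the function name and all offset values below are hypothetical
placeholders, not the real xscom addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per-chip register offsets the driver needs (illustrative subset). */
struct ex_nx_regs {
	const char *compatible;
	uint64_t dma_cfg;
	uint64_t cfg_842;
	uint64_t ee_cfg;
};

/* Placeholder offsets only; the real values come from the hardware docs. */
static const struct ex_nx_regs ex_nx_reg_table[] = {
	{ "ibm,power7-nx", 0x100, 0x110, 0x120 },
	{ "ibm,power8-nx", 0x200, 0x210, 0x220 },
};

/* Return the register set for a compatible string, or NULL if unknown. */
const struct ex_nx_regs *ex_nx_regs_for(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);

	for (i = 0; i < sizeof(ex_nx_reg_table) / sizeof(ex_nx_reg_table[0]); i++) {
		if (strcmp(ex_nx_reg_table[i].compatible, compatible) == 0)
			return &ex_nx_reg_table[i];
	}
	return NULL;
}
```

Supporting a new register then means adding one field and one value per
chip to the table, without touching the code that uses it.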


>
> It doesn't really matter anyway, as P7 with OPAL has only ever been used
> internally for development purposes.
>

yeah, but it would be nice to be able to test on p7+ also, as you said
before there's more of them out there than P8's for people to work
with.  I managed to get 2 surplus p7+ systems myself, although they
are still being moved to our lab space.

i assume p7+ systems can be updated to fw830?
Stewart Smith March 16, 2015, 6:23 p.m. UTC | #3
Dan Streetman <ddstreet@ieee.org> writes:
> yeah, but it would be nice to be able to test on p7+ also, as you said
> before there's more of them out there than P8's for people to work
> with.  I managed to get 2 surplus p7+ systems myself, although they
> are still being moved to our lab space.
>
> i assume p7+ systems can be updated to fw830?

No, you're still stuck on the old hacked version, but you can copy new
skiboot on there :)
Benjamin Herrenschmidt March 16, 2015, 9:01 p.m. UTC | #4
On Mon, 2015-03-16 at 14:03 -0400, Dan Streetman wrote:
> 
> Well the kernel driver doesn't care at all about P7/P8 except for the
> xscom r/w, since the xscom addrs are different between P7/P8.  As I
> haven't actually added any error recovery (via xscoms) to the kernel
> driver yet, I could remove the xscom stuff from the first kernel patch
> set.  However, if the kernel driver is who eventually monitors and
> handles NX errors, I don't see any way around r/w directly from the NX
> xscoms.  The P7/P8 differences should be able to be abstracted in the
> nx-842-xscom.h header though, and I don't think the actual driver will
> need to know the P num.

I don't want Linux to know about SCOMs; instead, create OPAL calls, for
example OPAL_GET_NX_STATUS or whatever makes sense for you.

In any case, a XSCOM is a roundtrip to OPAL so the above won't be
slower.

Cheers,
Ben.
Dan Streetman March 16, 2015, 9:26 p.m. UTC | #5
On Mon, Mar 16, 2015 at 5:01 PM, Benjamin Herrenschmidt
<benh@au1.ibm.com> wrote:
> On Mon, 2015-03-16 at 14:03 -0400, Dan Streetman wrote:
>>
>> Well the kernel driver doesn't care at all about P7/P8 except for the
>> xscom r/w, since the xscom addrs are different between P7/P8.  As I
>> haven't actually added any error recovery (via xscoms) to the kernel
>> driver yet, I could remove the xscom stuff from the first kernel patch
>> set.  However, if the kernel driver is who eventually monitors and
>> handles NX errors, I don't see any way around r/w directly from the NX
>> xscoms.  The P7/P8 differences should be able to be abstracted in the
>> nx-842-xscom.h header though, and I don't think the actual driver will
>> need to know the P num.
>
> I don't want linux to know about SCOMS, instead, create OPAL calls, for
> example OPAL_GET_NX_STATUS or whatever makes sense for you.
>
> In any case, a XSCOM is a roundtrip to OPAL so the above won't be
> slower.

ok will do, i'll update the patch set and resend here for more comments.

>
> Cheers,
> Ben.
>
>
Dan Streetman March 16, 2015, 9:28 p.m. UTC | #6
On Mon, Mar 16, 2015 at 2:03 PM, Dan Streetman <ddstreet@ieee.org> wrote:
> On Thu, Mar 12, 2015 at 10:59 PM, Stewart Smith
> <stewart@linux.vnet.ibm.com> wrote:
>> Dan Streetman <ddstreet@ieee.org> writes:
>>> Add NX config register values for P7+.  Remove "P8" from all register
>>> defines, where the define is common to P7+ and P8.  For values new to P8
>>> (specifically 842 prefetching), only enable on P8.
>>>
>>> This should correctly set up the NX coprocessors on P7+ systems.
>>>
>>> Signed-off-by: Dan Streetman <ddstreet@ieee.org>
>>> ---
>>>
>>> I don't have access to any P7+ bare metal system right now, so I haven't
>>> tested this to verify it works on P7+, but I did verify it still works
>>> on a P8 system.
>>
>> I have P7+ in the lab in CBR that I can test on. I guess the best way is
>> with the kernel patches?
>
> yep.  I'll resend the patch set, updated for the nx node location per
> the other skiboot patch I sent.  I can provide a pre-built kernel with
> the patches too if you want, either using pkvm 3.1 src or the latest
> upstream kernel src.

I just re-sent the patch set with a change to use the relocated nx
device tree nodes, so you can test with that kernel patch set if
you've included my last two skiboot patches.  I'll update this patch
set as mentioned to Ben, to change the xscom r/w into opal calls.

I can provide a pre-built test kernel also if you want, let me know.

>
> once you get the patched kernel loaded on the system, you just need to
> load the module nx-compress-powernv.  To test, you could use zswap,
> although it's a bit complicated to set up; or you can use the
> compression test module i added, comp_selftest.  Once you load
> comp_selftest there's a debugfs interface; in the
> /sys/kernel/debugfs/comp_selftest dir you can do:
>
> echo 842 > compressor
> echo 1 > running
>
> that should start it testing, you can do
> cat status
>
> for a continually-updated status, and you can adjust the number of
> threads and other params using the other nodes in the dir.  you can
> stop the test by writing 0 to the running node.  (I still plan to
> adjust the interface here, breaking out each of the status info values
> into their own nodes, and maybe removing the status node as you can
> script a constant checking easily...)
>
> On the P8 test systems, I was getting some pretty impressive
> compression throughput (between 5-10 GBps, per thread, it seemed to be
> only about 1/4 or so of the throughput of the memcpy-only null
> compressor), so I'm interested to see what it looks like on the p7+
> system.
>
>>
>> From a quick glance, there's nothing we can really do to abstract away
>> the slight differences between P7 and P8 for kernel?
>
> Well the kernel driver doesn't care at all about P7/P8 except for the
> xscom r/w, since the xscom addrs are different between P7/P8.  As I
> haven't actually added any error recovery (via xscoms) to the kernel
> driver yet, I could remove the xscom stuff from the first kernel patch
> set.  However, if the kernel driver is who eventually monitors and
> handles NX errors, I don't see any way around r/w directly from the NX
> xscoms.  The P7/P8 differences should be able to be abstracted in the
> nx-842-xscom.h header though, and I don't think the actual driver will
> need to know the P num.
>
> Also, we could use skiboot to add device tree nodes telling the driver
> what NX xscom register addrs to use, which would remove the P7/P8
> specifics from the kernel completely, but then we rely on having to
> update skiboot if we ever want to r/w a new NX xscom register in the
> kernel driver, which isn't ideal.
>
>
>>
>> It doesn't really matter anyway, as P7 with OPAL has only ever been used
>> internally for development purposes.
>>
>
> yeah, but it would be nice to be able to test on p7+ also, as you said
> before there's more of them out there than P8's for people to work
> with.  I managed to get 2 surplus p7+ systems myself, although they
> are still being moved to our lab space.
>
> i assume p7+ systems can be updated to fw830?
Stewart Smith March 17, 2015, 5:34 a.m. UTC | #7
Benjamin Herrenschmidt <benh@au1.ibm.com> writes:
> On Mon, 2015-03-16 at 14:03 -0400, Dan Streetman wrote:
>> 
>> Well the kernel driver doesn't care at all about P7/P8 except for the
>> xscom r/w, since the xscom addrs are different between P7/P8.  As I
>> haven't actually added any error recovery (via xscoms) to the kernel
>> driver yet, I could remove the xscom stuff from the first kernel patch
>> set.  However, if the kernel driver is who eventually monitors and
>> handles NX errors, I don't see any way around r/w directly from the NX
>> xscoms.  The P7/P8 differences should be able to be abstracted in the
>> nx-842-xscom.h header though, and I don't think the actual driver will
>> need to know the P num.
>
> I don't want linux to know about SCOMS, instead, create OPAL calls, for
> example OPAL_GET_NX_STATUS or whatever makes sense for you.
>
> In any case, a XSCOM is a roundtrip to OPAL so the above won't be
> slower.

I'll merge this patch and possibly disable the 842 DT node creation
before tagging skiboot-5.0 depending on if the patch that adds OPAL
calls arrives in time or not.

(open question: should the protocol for discovering whether firmware
supports a thing be "check device tree entry exists and OPAL calls exist",
or just "check device tree" or "check OPAL call"?)
Benjamin Herrenschmidt March 17, 2015, 5:38 a.m. UTC | #8
On Mon, 2015-03-16 at 22:34 -0700, Stewart Smith wrote:

> I'll merge this patch and possibly disable the 842 DT node creation
> before tagging skiboot-5.0 depending on if the patch that adds OPAL
> calls arrives in time or not.
> 
> (open question: should the protocol for discovering if firmware supports
> a thing be "check device tree entry exists and OPAL calls exist" or just
> "check device tree" OR check opal call)

Checking DT should be enough. The check OPAL call was added for things
we did wrong :-)

Cheers,
Ben.
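
To illustrate the device-tree-only check suggested above, a consumer could
simply look for a node with the expected compatible string and treat its
presence as "firmware supports this". The sketch below uses a mock,
flattened node list rather than a real device tree API; the struct, the
function name and the "example,..." compatible strings are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mock device tree node: just a name and one compatible string. */
struct ex_dt_node {
	const char *name;
	const char *compatible;
};

/* Return true if any node in the mock tree carries `compat`. */
bool ex_dt_has_compatible(const struct ex_dt_node *nodes, size_t count,
			  const char *compat)
{
	size_t i;

	assert(compat != NULL);

	for (i = 0; i < count; i++) {
		if (nodes[i].compatible &&
		    strcmp(nodes[i].compatible, compat) == 0)
			return true;
	}
	return false;
}
```

With this approach, firmware that does not support a feature simply never
creates the node, and no separate "does this call exist" probe is needed.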
Dan Streetman March 25, 2015, 3:57 p.m. UTC | #9
On Tue, Mar 17, 2015 at 1:34 AM, Stewart Smith
<stewart@linux.vnet.ibm.com> wrote:
> Benjamin Herrenschmidt <benh@au1.ibm.com> writes:
>> On Mon, 2015-03-16 at 14:03 -0400, Dan Streetman wrote:
>>>
>>> Well the kernel driver doesn't care at all about P7/P8 except for the
>>> xscom r/w, since the xscom addrs are different between P7/P8.  As I
>>> haven't actually added any error recovery (via xscoms) to the kernel
>>> driver yet, I could remove the xscom stuff from the first kernel patch
>>> set.  However, if the kernel driver is who eventually monitors and
>>> handles NX errors, I don't see any way around r/w directly from the NX
>>> xscoms.  The P7/P8 differences should be able to be abstracted in the
>>> nx-842-xscom.h header though, and I don't think the actual driver will
>>> need to know the P num.
>>
>> I don't want linux to know about SCOMS, instead, create OPAL calls, for
>> example OPAL_GET_NX_STATUS or whatever makes sense for you.
>>
>> In any case, a XSCOM is a roundtrip to OPAL so the above won't be
>> slower.
>
> I'll merge this patch and possibly disable the 842 DT node creation
> before tagging skiboot-5.0 depending on if the patch that adds OPAL
> calls arrives in time or not.

Sorry for the delay.  I sent the patch with the NX opal calls earlier
this week, so let me know if you think it needs any adjustment.

>
> (open question: should the protocol for discovering if firmware supports
> a thing be "check device tree entry exists and OPAL calls exist" or just
> "check device tree" OR check opal call)
>
Dan Streetman March 27, 2015, 4:06 a.m. UTC | #10
On Wed, Mar 25, 2015 at 11:57 AM, Dan Streetman <ddstreet@ieee.org> wrote:
> On Tue, Mar 17, 2015 at 1:34 AM, Stewart Smith
> <stewart@linux.vnet.ibm.com> wrote:
>> Benjamin Herrenschmidt <benh@au1.ibm.com> writes:
>>> On Mon, 2015-03-16 at 14:03 -0400, Dan Streetman wrote:
>>>>
>>>> Well the kernel driver doesn't care at all about P7/P8 except for the
>>>> xscom r/w, since the xscom addrs are different between P7/P8.  As I
>>>> haven't actually added any error recovery (via xscoms) to the kernel
>>>> driver yet, I could remove the xscom stuff from the first kernel patch
>>>> set.  However, if the kernel driver is who eventually monitors and
>>>> handles NX errors, I don't see any way around r/w directly from the NX
>>>> xscoms.  The P7/P8 differences should be able to be abstracted in the
>>>> nx-842-xscom.h header though, and I don't think the actual driver will
>>>> need to know the P num.
>>>
>>> I don't want linux to know about SCOMS, instead, create OPAL calls, for
>>> example OPAL_GET_NX_STATUS or whatever makes sense for you.
>>>
>>> In any case, a XSCOM is a roundtrip to OPAL so the above won't be
>>> slower.
>>
>> I'll merge this patch and possibly disable the 842 DT node creation
>> before tagging skiboot-5.0 depending on if the patch that adds OPAL
>> calls arrives in time or not.
>
> Sorry for the delay.  I sent the patch with the NX opal calls earlier
> this week, so let me know if you think it needs any adjustment.

Just to follow up on this via email, as we discussed in irc the FSP
will be monitoring the NX status/err registers, so we don't need to
add these calls since the kernel won't need to use them.  You can
ignore the patch I sent with the opal calls.

>
>>
>> (open question: should the protocol for discovering if firmware supports
>> a thing be "check device tree entry exists and OPAL calls exist" or just
>> "check device tree" OR check opal call)
>>
Stewart Smith March 27, 2015, 4:27 a.m. UTC | #11
Dan Streetman <ddstreet@ieee.org> writes:
> Just to follow up on this via email, as we discussed in irc the FSP
> will be monitoring the NX status/err registers, so we don't need to
> add these calls since the kernel won't need to use them.  You can
> ignore the patch I sent with the opal calls.

and on systems without an FSP, opal-prd will be doing it.

Maybe I need to go and hit some hardware with a hammer to trigger an
error though? :)

Patch

diff --git a/hw/nx-842.c b/hw/nx-842.c
index b2ae425..1cf2e12 100644
--- a/hw/nx-842.c
+++ b/hw/nx-842.c
@@ -23,8 +23,8 @@ 
 /* Configuration settings */
 #define CFG_842_FC_ENABLE	(0x1f) /* enable all 842 functions */
 #define CFG_842_ENABLE		(1) /* enable 842 engines */
-#define DMA_COMPRESS_PREFETCH	(1) /* enable prefetching */
-#define DMA_DECOMPRESS_PREFETCH	(1) /* enable prefetching */
+#define DMA_COMPRESS_PREFETCH	(1) /* enable prefetching (on P8) */
+#define DMA_DECOMPRESS_PREFETCH	(1) /* enable prefetching (on P8) */
 #define DMA_COMPRESS_MAX_RR	(15) /* range 1-15 */
 #define DMA_DECOMPRESS_MAX_RR	(15) /* range 1-15 */
 #define DMA_SPBC		(1) /* write SPBC in CPB */
@@ -32,8 +32,8 @@ 
 #define DMA_COMPLETION_MODE	NX_DMA_COMPLETION_MODE_CI
 #define DMA_CPB_WR		NX_DMA_CPB_WR_CI_PAD
 #define DMA_OUTPUT_DATA_WR	NX_DMA_OUTPUT_DATA_WR_CI
-#define EE_0			(1) /* enable engine 0 */
-#define EE_1			(1) /* enable engine 1 */
+#define EE_1			(1) /* enable engine 842 1 */
+#define EE_0			(1) /* enable engine 842 0 */
 
 /* counter used to provide unique Coprocessor Instance number */
 static u64 nx_842_ci_counter = 1;
@@ -43,9 +43,9 @@  static int nx_cfg_842(u32 gcid, u64 xcfg, u64 instance)
 	u64 cfg, ci, ct;
 	int rc;
 
-	if (instance > NX_P8_842_CFG_CI_MAX) {
+	if (instance > NX_842_CFG_CI_MAX) {
 		prerror("NX%d: ERROR: 842 CI %u exceeds max %u\n",
-			gcid, (unsigned int)instance, NX_P8_842_CFG_CI_MAX);
+			gcid, (unsigned int)instance, NX_842_CFG_CI_MAX);
 		return OPAL_INTERNAL_ERROR;
 	}
 
@@ -53,7 +53,7 @@  static int nx_cfg_842(u32 gcid, u64 xcfg, u64 instance)
 	if (rc)
 		return rc;
 
-	ct = GETFIELD(NX_P8_842_CFG_CT, cfg);
+	ct = GETFIELD(NX_842_CFG_CT, cfg);
 	if (!ct)
 		prlog(PR_INFO, "NX%d:   842 CT set to %u\n", gcid, NX_CT_842);
 	else if (ct == NX_CT_842)
@@ -63,12 +63,12 @@  static int nx_cfg_842(u32 gcid, u64 xcfg, u64 instance)
 		prlog(PR_INFO, "NX%d:   842 CT already set to %u, "
 		      "changing to %u\n", gcid, (unsigned int)ct, NX_CT_842);
 	ct = NX_CT_842;
-	cfg = SETFIELD(NX_P8_842_CFG_CT, cfg, ct);
+	cfg = SETFIELD(NX_842_CFG_CT, cfg, ct);
 
 	/* Coprocessor Instance must be shifted left.
 	 * See hw doc Section 5.5.1.
 	 */
-	ci = GETFIELD(NX_P8_842_CFG_CI, cfg) >> NX_P8_842_CFG_CI_LSHIFT;
+	ci = GETFIELD(NX_842_CFG_CI, cfg) >> NX_842_CFG_CI_LSHIFT;
 	if (!ci)
 		prlog(PR_INFO, "NX%d:   842 CI set to %u\n", gcid,
 		      (unsigned int)instance);
@@ -80,12 +80,12 @@  static int nx_cfg_842(u32 gcid, u64 xcfg, u64 instance)
 		      "changing to %u\n", gcid,
 		      (unsigned int)ci, (unsigned int)instance);
 	ci = instance;
-	cfg = SETFIELD(NX_P8_842_CFG_CI, cfg, ci << NX_P8_842_CFG_CI_LSHIFT);
+	cfg = SETFIELD(NX_842_CFG_CI, cfg, ci << NX_842_CFG_CI_LSHIFT);
 
 	/* Enable all functions */
-	cfg = SETFIELD(NX_P8_842_CFG_FC_ENABLE, cfg, CFG_842_FC_ENABLE);
+	cfg = SETFIELD(NX_842_CFG_FC_ENABLE, cfg, CFG_842_FC_ENABLE);
 
-	cfg = SETFIELD(NX_P8_842_CFG_ENABLE, cfg, CFG_842_ENABLE);
+	cfg = SETFIELD(NX_842_CFG_ENABLE, cfg, CFG_842_ENABLE);
 
 	rc = xscom_write(gcid, xcfg, cfg);
 	if (rc)
@@ -98,7 +98,7 @@  static int nx_cfg_842(u32 gcid, u64 xcfg, u64 instance)
 	return rc;
 }
 
-static int nx_cfg_dma(u32 gcid, u64 xcfg)
+static int nx_cfg_dma(u32 gcid, u64 xcfg, int pnum)
 {
 	u64 cfg;
 	int rc;
@@ -108,22 +108,22 @@  static int nx_cfg_dma(u32 gcid, u64 xcfg)
 		return rc;
 
 	cfg = SETFIELD(NX_P8_DMA_CFG_842_COMPRESS_PREFETCH, cfg,
-		       DMA_COMPRESS_PREFETCH);
+		       pnum == 8 ? DMA_COMPRESS_PREFETCH : 0);
 	cfg = SETFIELD(NX_P8_DMA_CFG_842_DECOMPRESS_PREFETCH, cfg,
-		       DMA_DECOMPRESS_PREFETCH);
-	cfg = SETFIELD(NX_P8_DMA_CFG_842_COMPRESS_MAX_RR, cfg,
+		       pnum == 8 ? DMA_DECOMPRESS_PREFETCH : 0);
+	cfg = SETFIELD(NX_DMA_CFG_842_COMPRESS_MAX_RR, cfg,
 		       DMA_COMPRESS_MAX_RR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_842_DECOMPRESS_MAX_RR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_842_DECOMPRESS_MAX_RR, cfg,
 		       DMA_DECOMPRESS_MAX_RR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_842_SPBC, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_842_SPBC, cfg,
 		       DMA_SPBC);
-	cfg = SETFIELD(NX_P8_DMA_CFG_842_CSB_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_842_CSB_WR, cfg,
 		       DMA_CSB_WR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_842_COMPLETION_MODE, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_842_COMPLETION_MODE, cfg,
 		       DMA_COMPLETION_MODE);
-	cfg = SETFIELD(NX_P8_DMA_CFG_842_CPB_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_842_CPB_WR, cfg,
 		       DMA_CPB_WR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_842_OUTPUT_DATA_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_842_OUTPUT_DATA_WR, cfg,
 		       DMA_OUTPUT_DATA_WR);
 
 	rc = xscom_write(gcid, xcfg, cfg);
@@ -145,8 +145,8 @@  static int nx_cfg_ee(u32 gcid, u64 xcfg)
 	if (rc)
 		return rc;
 
-	cfg = SETFIELD(NX_P8_EE_CFG_842_0, cfg, EE_0);
-	cfg = SETFIELD(NX_P8_EE_CFG_842_1, cfg, EE_1);
+	cfg = SETFIELD(NX_EE_CFG_CH1, cfg, EE_1);
+	cfg = SETFIELD(NX_EE_CFG_CH0, cfg, EE_0);
 
 	rc = xscom_write(gcid, xcfg, cfg);
 	if (rc)
@@ -165,7 +165,7 @@  void nx_create_842_node(struct dt_node *node)
 	u64 cfg_dma, cfg_842, cfg_ee;
 	u64 instance;
 	struct dt_node *dt_842;
-	int rc;
+	int rc, pnum;
 	char node_name[32];
 
 	gcid = dt_get_chip_id(node);
@@ -174,18 +174,21 @@  void nx_create_842_node(struct dt_node *node)
 	prlog(PR_INFO, "NX%d: 842 at 0x%x\n", gcid, pb_base);
 
 	if (dt_node_is_compatible(node, "ibm,power7-nx")) {
-		prerror("NX%d: ERROR: 842 not supported on Power7\n", gcid);
-		return;
+		cfg_dma = pb_base + NX_P7_DMA_CFG;
+		cfg_842 = pb_base + NX_P7_842_CFG;
+		cfg_ee = pb_base + NX_P7_EE_CFG;
+		pnum = 7;
 	} else if (dt_node_is_compatible(node, "ibm,power8-nx")) {
 		cfg_dma = pb_base + NX_P8_DMA_CFG;
 		cfg_842 = pb_base + NX_P8_842_CFG;
 		cfg_ee = pb_base + NX_P8_EE_CFG;
+		pnum = 8;
 	} else {
 		prerror("NX%d: ERROR: Unknown NX type!\n", gcid);
 		return;
 	}
 
-	rc = nx_cfg_dma(gcid, cfg_dma);
+	rc = nx_cfg_dma(gcid, cfg_dma, pnum);
 	if (rc)
 		return;
 
diff --git a/hw/nx-crypto.c b/hw/nx-crypto.c
index 623ab84..9d15e87 100644
--- a/hw/nx-crypto.c
+++ b/hw/nx-crypto.c
@@ -35,12 +35,12 @@ 
 #define AMF_COMPLETION_MODE	NX_DMA_COMPLETION_MODE_PDMA
 #define AMF_CPB_WR		(0) /* CPB WR not done with AMF */
 #define AMF_OUTPUT_DATA_WR	NX_DMA_OUTPUT_DATA_WR_DMA
-#define EE_AMF_0		(0) /* disable AMF engine 0 */
-#define EE_AMF_1		(0) /* disable AMF engine 1 */
-#define EE_AMF_2		(0) /* disable AMF engine 2 */
-#define EE_AMF_3		(0) /* disable AMF engine 3 */
-#define EE_SYM_0		(0) /* disable SYM engine 0 */
-#define EE_SYM_1		(0) /* disable SYM engine 1 */
+#define EE_CH7			(0) /* disable engine AMF 2(P7) / 3(P8) */
+#define EE_CH6			(0) /* disable engine AMF 1(P7) / 2(P8) */
+#define EE_CH5			(0) /* disable engine AMF 0(P7) / 1(P8) */
+#define EE_CH4			(0) /* disable engine SYM 2(P7) / AMF 0(P8) */
+#define EE_CH3			(0) /* disable engine SYM 1 */
+#define EE_CH2			(0) /* disable engine SYM 0 */
 
 /* counters used to provide unique Coprocessor Instance numbers */
 static u64 nx_sym_ci_counter = 1;
@@ -51,9 +51,9 @@  static int nx_cfg_sym(u32 gcid, u64 xcfg, u64 instance)
 	u64 cfg, ci, ct;
 	int rc;
 
-	if (instance > NX_P8_SYM_CFG_CI_MAX) {
+	if (instance > NX_SYM_CFG_CI_MAX) {
 		prerror("NX%d: ERROR: SYM CI %u exceeds max %u\n",
-			gcid, (unsigned int)instance, NX_P8_SYM_CFG_CI_MAX);
+			gcid, (unsigned int)instance, NX_SYM_CFG_CI_MAX);
 		return OPAL_INTERNAL_ERROR;
 	}
 
@@ -61,7 +61,7 @@  static int nx_cfg_sym(u32 gcid, u64 xcfg, u64 instance)
 	if (rc)
 		return rc;
 
-	ct = GETFIELD(NX_P8_SYM_CFG_CT, cfg);
+	ct = GETFIELD(NX_SYM_CFG_CT, cfg);
 	if (!ct)
 		prlog(PR_INFO, "NX%d:   SYM CT set to %u\n", gcid, NX_CT_SYM);
 	else if (ct == NX_CT_SYM)
@@ -71,12 +71,12 @@  static int nx_cfg_sym(u32 gcid, u64 xcfg, u64 instance)
 		prlog(PR_INFO, "NX%d:   SYM CT already set to %u, "
 		      "changing to %u\n", gcid, (unsigned int)ct, NX_CT_SYM);
 	ct = NX_CT_SYM;
-	cfg = SETFIELD(NX_P8_SYM_CFG_CT, cfg, ct);
+	cfg = SETFIELD(NX_SYM_CFG_CT, cfg, ct);
 
 	/* Coprocessor Instance must be shifted left.
 	 * See hw doc Section 5.5.1.
 	 */
-	ci = GETFIELD(NX_P8_SYM_CFG_CI, cfg) >> NX_P8_SYM_CFG_CI_LSHIFT;
+	ci = GETFIELD(NX_SYM_CFG_CI, cfg) >> NX_SYM_CFG_CI_LSHIFT;
 	if (!ci)
 		prlog(PR_INFO, "NX%d:   SYM CI set to %u\n", gcid,
 		      (unsigned int)instance);
@@ -88,11 +88,11 @@  static int nx_cfg_sym(u32 gcid, u64 xcfg, u64 instance)
 		      "changing to %u\n", gcid,
 		      (unsigned int)ci, (unsigned int)instance);
 	ci = instance;
-	cfg = SETFIELD(NX_P8_SYM_CFG_CI, cfg, ci << NX_P8_SYM_CFG_CI_LSHIFT);
+	cfg = SETFIELD(NX_SYM_CFG_CI, cfg, ci << NX_SYM_CFG_CI_LSHIFT);
 
-	cfg = SETFIELD(NX_P8_SYM_CFG_FC_ENABLE, cfg, CFG_SYM_FC_ENABLE);
+	cfg = SETFIELD(NX_SYM_CFG_FC_ENABLE, cfg, CFG_SYM_FC_ENABLE);
 
-	cfg = SETFIELD(NX_P8_SYM_CFG_ENABLE, cfg, CFG_SYM_ENABLE);
+	cfg = SETFIELD(NX_SYM_CFG_ENABLE, cfg, CFG_SYM_ENABLE);
 
 	rc = xscom_write(gcid, xcfg, cfg);
 	if (rc)
@@ -110,9 +110,9 @@  static int nx_cfg_asym(u32 gcid, u64 xcfg, u64 instance)
 	u64 cfg, ci, ct;
 	int rc;
 
-	if (instance > NX_P8_ASYM_CFG_CI_MAX) {
+	if (instance > NX_ASYM_CFG_CI_MAX) {
 		prerror("NX%d: ERROR: ASYM CI %u exceeds max %u\n",
-			gcid, (unsigned int)instance, NX_P8_ASYM_CFG_CI_MAX);
+			gcid, (unsigned int)instance, NX_ASYM_CFG_CI_MAX);
 		return OPAL_INTERNAL_ERROR;
 	}
 
@@ -120,7 +120,7 @@  static int nx_cfg_asym(u32 gcid, u64 xcfg, u64 instance)
 	if (rc)
 		return rc;
 
-	ct = GETFIELD(NX_P8_ASYM_CFG_CT, cfg);
+	ct = GETFIELD(NX_ASYM_CFG_CT, cfg);
 	if (!ct)
 		prlog(PR_INFO, "NX%d:   ASYM CT set to %u\n",
 		      gcid, NX_CT_ASYM);
@@ -131,12 +131,12 @@  static int nx_cfg_asym(u32 gcid, u64 xcfg, u64 instance)
 		prlog(PR_INFO, "NX%d:   ASYM CT already set to %u, "
 		      "changing to %u\n", gcid, (unsigned int)ct, NX_CT_ASYM);
 	ct = NX_CT_ASYM;
-	cfg = SETFIELD(NX_P8_ASYM_CFG_CT, cfg, ct);
+	cfg = SETFIELD(NX_ASYM_CFG_CT, cfg, ct);
 
 	/* Coprocessor Instance must be shifted left.
 	 * See hw doc Section 5.5.1.
 	 */
-	ci = GETFIELD(NX_P8_ASYM_CFG_CI, cfg) >> NX_P8_ASYM_CFG_CI_LSHIFT;
+	ci = GETFIELD(NX_ASYM_CFG_CI, cfg) >> NX_ASYM_CFG_CI_LSHIFT;
 	if (!ci)
 		prlog(PR_INFO, "NX%d:   ASYM CI set to %u\n", gcid,
 		      (unsigned int)instance);
@@ -148,11 +148,11 @@  static int nx_cfg_asym(u32 gcid, u64 xcfg, u64 instance)
 		      "changing to %u\n", gcid,
 		      (unsigned int)ci, (unsigned int)instance);
 	ci = instance;
-	cfg = SETFIELD(NX_P8_ASYM_CFG_CI, cfg, ci << NX_P8_ASYM_CFG_CI_LSHIFT);
+	cfg = SETFIELD(NX_ASYM_CFG_CI, cfg, ci << NX_ASYM_CFG_CI_LSHIFT);
 
-	cfg = SETFIELD(NX_P8_ASYM_CFG_FC_ENABLE, cfg, CFG_ASYM_FC_ENABLE);
+	cfg = SETFIELD(NX_ASYM_CFG_FC_ENABLE, cfg, CFG_ASYM_FC_ENABLE);
 
-	cfg = SETFIELD(NX_P8_ASYM_CFG_ENABLE, cfg, CFG_ASYM_ENABLE);
+	cfg = SETFIELD(NX_ASYM_CFG_ENABLE, cfg, CFG_ASYM_ENABLE);
 
 	rc = xscom_write(gcid, xcfg, cfg);
 	if (rc)
@@ -174,26 +174,26 @@  static int nx_cfg_dma(u32 gcid, u64 xcfg)
 	if (rc)
 		return rc;
 
-	cfg = SETFIELD(NX_P8_DMA_CFG_AES_SHA_MAX_RR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AES_SHA_MAX_RR, cfg,
 		       AES_SHA_MAX_RR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_AES_SHA_CSB_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AES_SHA_CSB_WR, cfg,
 		       AES_SHA_CSB_WR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_AES_SHA_COMPLETION_MODE, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AES_SHA_COMPLETION_MODE, cfg,
 		       AES_SHA_COMPLETION_MODE);
-	cfg = SETFIELD(NX_P8_DMA_CFG_AES_SHA_CPB_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AES_SHA_CPB_WR, cfg,
 		       AES_SHA_CPB_WR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_AES_SHA_OUTPUT_DATA_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AES_SHA_OUTPUT_DATA_WR, cfg,
 		       AES_SHA_OUTPUT_DATA_WR);
 
-	cfg = SETFIELD(NX_P8_DMA_CFG_AMF_MAX_RR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AMF_MAX_RR, cfg,
 		       AMF_MAX_RR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_AMF_CSB_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AMF_CSB_WR, cfg,
 		       AMF_CSB_WR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_AMF_COMPLETION_MODE, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AMF_COMPLETION_MODE, cfg,
 		       AMF_COMPLETION_MODE);
-	cfg = SETFIELD(NX_P8_DMA_CFG_AMF_CPB_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AMF_CPB_WR, cfg,
 		       AMF_CPB_WR);
-	cfg = SETFIELD(NX_P8_DMA_CFG_AMF_OUTPUT_DATA_WR, cfg,
+	cfg = SETFIELD(NX_DMA_CFG_AMF_OUTPUT_DATA_WR, cfg,
 		       AMF_OUTPUT_DATA_WR);
 
 	rc = xscom_write(gcid, xcfg, cfg);
@@ -215,12 +215,12 @@  static int nx_cfg_ee(u32 gcid, u64 xcfg)
 	if (rc)
 		return rc;
 
-	cfg = SETFIELD(NX_P8_EE_CFG_AMF_0, cfg, EE_AMF_0);
-	cfg = SETFIELD(NX_P8_EE_CFG_AMF_1, cfg, EE_AMF_1);
-	cfg = SETFIELD(NX_P8_EE_CFG_AMF_2, cfg, EE_AMF_2);
-	cfg = SETFIELD(NX_P8_EE_CFG_AMF_3, cfg, EE_AMF_3);
-	cfg = SETFIELD(NX_P8_EE_CFG_SYM_0, cfg, EE_SYM_0);
-	cfg = SETFIELD(NX_P8_EE_CFG_SYM_1, cfg, EE_SYM_1);
+	cfg = SETFIELD(NX_EE_CFG_CH7, cfg, EE_CH7);
+	cfg = SETFIELD(NX_EE_CFG_CH6, cfg, EE_CH6);
+	cfg = SETFIELD(NX_EE_CFG_CH5, cfg, EE_CH5);
+	cfg = SETFIELD(NX_EE_CFG_CH4, cfg, EE_CH4);
+	cfg = SETFIELD(NX_EE_CFG_CH3, cfg, EE_CH3);
+	cfg = SETFIELD(NX_EE_CFG_CH2, cfg, EE_CH2);
 
 	rc = xscom_write(gcid, xcfg, cfg);
 	if (rc)
@@ -245,8 +245,10 @@  void nx_create_crypto_node(struct dt_node *node)
 	prlog(PR_INFO, "NX%d: Crypto at 0x%x\n", gcid, pb_base);
 
 	if (dt_node_is_compatible(node, "ibm,power7-nx")) {
-		prerror("NX%d: ERROR: Crypto not supported on Power7\n", gcid);
-		return;
+		cfg_dma = pb_base + NX_P7_DMA_CFG;
+		cfg_sym = pb_base + NX_P7_SYM_CFG;
+		cfg_asym = pb_base + NX_P7_ASYM_CFG;
+		cfg_ee = pb_base + NX_P7_EE_CFG;
 	} else if (dt_node_is_compatible(node, "ibm,power8-nx")) {
 		cfg_dma = pb_base + NX_P8_DMA_CFG;
 		cfg_sym = pb_base + NX_P8_SYM_CFG;
diff --git a/hw/nx-rng.c b/hw/nx-rng.c
index 9cc5317..063848d 100644
--- a/hw/nx-rng.c
+++ b/hw/nx-rng.c
@@ -28,7 +28,7 @@  void nx_create_rng_node(struct dt_node *node)
 	u64 xbar, xcfg;
 	u32 pb_base;
 	u32 gcid;
-	u64 rng_addr, rng_len, len;
+	u64 rng_addr, rng_len, len, addr_mask;
 	struct dt_node *rng;
 	int rc;
 
@@ -38,9 +38,11 @@  void nx_create_rng_node(struct dt_node *node)
 	if (dt_node_is_compatible(node, "ibm,power7-nx")) {
 		xbar = pb_base + NX_P7_RNG_BAR;
 		xcfg = pb_base + NX_P7_RNG_CFG;
+		addr_mask = NX_P7_RNG_BAR_ADDR;
 	} else if (dt_node_is_compatible(node, "ibm,power8-nx")) {
 		xbar = pb_base + NX_P8_RNG_BAR;
 		xcfg = pb_base + NX_P8_RNG_CFG;
+		addr_mask = NX_P8_RNG_BAR_ADDR;
 	} else {
 		prerror("NX%d: Unknown NX type!\n", gcid);
 		return;
@@ -55,16 +57,13 @@  void nx_create_rng_node(struct dt_node *node)
 		return;
 
 	/*
-	 * We use the P8 BAR constants. The layout of the BAR is the
-	 * same, with more bits at the top of P8 which are hard wired to
-	 * 0 on P7. We also mask in-place rather than using GETFIELD
-	 * for the base address as we happen to *know* that it's properly
-	 * aligned in the register.
+	 * We mask in-place rather than using GETFIELD for the base address
+	 * as we happen to *know* that it's properly aligned in the register.
 	 *
 	 * FIXME? Always assusme BAR gets a valid address from FSP
 	 */
-	rng_addr = bar & NX_P8_RNG_BAR_ADDR;
-	len  = GETFIELD(NX_P8_RNG_BAR_SIZE, bar);
+	rng_addr = bar & addr_mask;
+	len  = GETFIELD(NX_RNG_BAR_SIZE, bar);
 	if (len > 4) {
 		prerror("NX%d: Corrupted bar size %lld\n", gcid, len);
 		return;
@@ -80,12 +79,12 @@  void nx_create_rng_node(struct dt_node *node)
 	      gcid, rng_addr, rng_addr + rng_len - 1);
 
 	/* RNG must be enabled before MMIO is enabled */
-	rc = xscom_write(gcid, xcfg, cfg | NX_P8_RNG_CFG_ENABLE);
+	rc = xscom_write(gcid, xcfg, cfg | NX_RNG_CFG_ENABLE);
 	if (rc)
 		return;
 
 	/* The BAR needs to be enabled too */
-	rc = xscom_write(gcid, xbar, bar | NX_P8_RNG_BAR_ENABLE);
+	rc = xscom_write(gcid, xbar, bar | NX_RNG_BAR_ENABLE);
 	if (rc)
 		return;
 	rng = dt_new_addr(dt_root, "hwrng", rng_addr);
diff --git a/include/nx.h b/include/nx.h
index 45344da..cf887c0 100644
--- a/include/nx.h
+++ b/include/nx.h
@@ -21,75 +21,80 @@ 
 /* Register addresses and bit fields */
 /*************************************/
 
+#define NX_P7_SAT(sat, offset)	XSCOM_SAT(0x1, sat, offset)
+#define NX_P8_SAT(sat, offset)	XSCOM_SAT(0xc, sat, offset)
+
 /* Random Number Generator */
-#define NX_P7_RNG_BAR		XSCOM_SAT(0x1, 0x2, 0x0c)
+#define NX_P7_RNG_BAR		NX_P7_SAT(0x2, 0x0c)
+#define NX_P8_RNG_BAR		NX_P8_SAT(0x2, 0x0d)
 #define   NX_P7_RNG_BAR_ADDR		PPC_BITMASK(18, 51)
-#define   NX_P7_RNG_BAR_SIZE		PPC_BITMASK(53, 55)
-#define   NX_P7_RNG_BAR_ENABLE		PPC_BIT(52)
-#define NX_P8_RNG_BAR		XSCOM_SAT(0xc, 0x2, 0x0d)
 #define   NX_P8_RNG_BAR_ADDR		PPC_BITMASK(14, 51)
-#define   NX_P8_RNG_BAR_SIZE		PPC_BITMASK(53, 55)
-#define   NX_P8_RNG_BAR_ENABLE		PPC_BIT(52)
+#define   NX_RNG_BAR_SIZE		PPC_BITMASK(53, 55)
+#define   NX_RNG_BAR_ENABLE		PPC_BIT(52)
 
-#define NX_P7_RNG_CFG		XSCOM_SAT(0x1, 0x2, 0x12)
-#define   NX_P7_RNG_CFG_ENABLE		PPC_BIT(63)
-#define NX_P8_RNG_CFG		XSCOM_SAT(0xc, 0x2, 0x12)
-#define   NX_P8_RNG_CFG_ENABLE		PPC_BIT(63)
+#define NX_P7_RNG_CFG		NX_P7_SAT(0x2, 0x12)
+#define NX_P8_RNG_CFG		NX_P8_SAT(0x2, 0x12)
+#define   NX_RNG_CFG_ENABLE		PPC_BIT(63)
 
 /* Symmetric Crypto */
-#define NX_P8_SYM_CFG		XSCOM_SAT(0xc, 0x2, 0x0a)
-#define   NX_P8_SYM_CFG_CI		PPC_BITMASK(2, 14)
-#define   NX_P8_SYM_CFG_CT		PPC_BITMASK(18, 23)
-#define   NX_P8_SYM_CFG_FC_ENABLE	PPC_BITMASK(32, 39)
-#define   NX_P8_SYM_CFG_ENABLE		PPC_BIT(63)
+#define NX_P7_SYM_CFG		NX_P7_SAT(0x2, 0x09)
+#define NX_P8_SYM_CFG		NX_P8_SAT(0x2, 0x0a)
+#define   NX_SYM_CFG_CI			PPC_BITMASK(2, 14)
+#define   NX_SYM_CFG_CT			PPC_BITMASK(18, 23)
+#define   NX_SYM_CFG_FC_ENABLE		PPC_BITMASK(32, 39)
+#define   NX_SYM_CFG_ENABLE		PPC_BIT(63)
 
 /* Asymmetric Crypto */
-#define NX_P8_ASYM_CFG		XSCOM_SAT(0xc, 0x2, 0x0b)
-#define   NX_P8_ASYM_CFG_CI		PPC_BITMASK(2, 14)
-#define   NX_P8_ASYM_CFG_CT		PPC_BITMASK(18, 23)
-#define   NX_P8_ASYM_CFG_FC_ENABLE	PPC_BITMASK(32, 52)
-#define   NX_P8_ASYM_CFG_ENABLE		PPC_BIT(63)
+#define NX_P7_ASYM_CFG		NX_P7_SAT(0x2, 0x0a)
+#define NX_P8_ASYM_CFG		NX_P8_SAT(0x2, 0x0b)
+#define   NX_ASYM_CFG_CI		PPC_BITMASK(2, 14)
+#define   NX_ASYM_CFG_CT		PPC_BITMASK(18, 23)
+#define   NX_ASYM_CFG_FC_ENABLE		PPC_BITMASK(32, 52)
+#define   NX_ASYM_CFG_ENABLE		PPC_BIT(63)
 
 /* 842 Compression */
-#define NX_P8_842_CFG		XSCOM_SAT(0xc, 0x2, 0x0c)
-#define   NX_P8_842_CFG_CI		PPC_BITMASK(2, 14)
-#define   NX_P8_842_CFG_CT		PPC_BITMASK(18, 23)
-#define   NX_P8_842_CFG_FC_ENABLE	PPC_BITMASK(32, 36)
-#define   NX_P8_842_CFG_ENABLE		PPC_BIT(63)
+#define NX_P7_842_CFG		NX_P7_SAT(0x2, 0x0b)
+#define NX_P8_842_CFG		NX_P8_SAT(0x2, 0x0c)
+#define   NX_842_CFG_CI			PPC_BITMASK(2, 14)
+#define   NX_842_CFG_CT			PPC_BITMASK(18, 23)
+#define   NX_842_CFG_FC_ENABLE		PPC_BITMASK(32, 36)
+#define   NX_842_CFG_ENABLE		PPC_BIT(63)
 
 /* DMA */
-#define NX_P8_DMA_CFG		XSCOM_SAT(0xc, 0x1, 0x02)
-#define   NX_P8_DMA_CFG_842_COMPRESS_PREFETCH		PPC_BIT(23)
-#define   NX_P8_DMA_CFG_842_DECOMPRESS_PREFETCH		PPC_BIT(24)
-#define   NX_P8_DMA_CFG_AES_SHA_MAX_RR			PPC_BITMASK(25, 28)
-#define   NX_P8_DMA_CFG_AMF_MAX_RR			PPC_BITMASK(29, 32)
-#define   NX_P8_DMA_CFG_842_COMPRESS_MAX_RR		PPC_BITMASK(33, 36)
-#define   NX_P8_DMA_CFG_842_DECOMPRESS_MAX_RR		PPC_BITMASK(37, 40)
-#define   NX_P8_DMA_CFG_AES_SHA_CSB_WR			PPC_BITMASK(41, 42)
-#define   NX_P8_DMA_CFG_AES_SHA_COMPLETION_MODE		PPC_BITMASK(43, 44)
-#define   NX_P8_DMA_CFG_AES_SHA_CPB_WR			PPC_BITMASK(45, 46)
-#define   NX_P8_DMA_CFG_AES_SHA_OUTPUT_DATA_WR		PPC_BIT(47)
-#define   NX_P8_DMA_CFG_AMF_CSB_WR			PPC_BITMASK(49, 50)
-#define   NX_P8_DMA_CFG_AMF_COMPLETION_MODE		PPC_BITMASK(51, 52)
-#define   NX_P8_DMA_CFG_AMF_CPB_WR			PPC_BITMASK(53, 54)
-#define   NX_P8_DMA_CFG_AMF_OUTPUT_DATA_WR		PPC_BIT(55)
-#define   NX_P8_DMA_CFG_842_SPBC			PPC_BIT(56)
-#define   NX_P8_DMA_CFG_842_CSB_WR			PPC_BITMASK(57, 58)
-#define   NX_P8_DMA_CFG_842_COMPLETION_MODE		PPC_BITMASK(59, 60)
-#define   NX_P8_DMA_CFG_842_CPB_WR			PPC_BITMASK(61, 62)
-#define   NX_P8_DMA_CFG_842_OUTPUT_DATA_WR		PPC_BIT(63)
+#define NX_P7_DMA_CFG		NX_P7_SAT(0x1, 0x02)
+#define NX_P8_DMA_CFG		NX_P8_SAT(0x1, 0x02)
+#define   NX_P8_DMA_CFG_842_COMPRESS_PREFETCH	PPC_BIT(23)
+#define   NX_P8_DMA_CFG_842_DECOMPRESS_PREFETCH	PPC_BIT(24)
+#define   NX_DMA_CFG_AES_SHA_MAX_RR		PPC_BITMASK(25, 28)
+#define   NX_DMA_CFG_AMF_MAX_RR			PPC_BITMASK(29, 32)
+#define   NX_DMA_CFG_842_COMPRESS_MAX_RR	PPC_BITMASK(33, 36)
+#define   NX_DMA_CFG_842_DECOMPRESS_MAX_RR	PPC_BITMASK(37, 40)
+#define   NX_DMA_CFG_AES_SHA_CSB_WR		PPC_BITMASK(41, 42)
+#define   NX_DMA_CFG_AES_SHA_COMPLETION_MODE	PPC_BITMASK(43, 44)
+#define   NX_DMA_CFG_AES_SHA_CPB_WR		PPC_BITMASK(45, 46)
+#define   NX_DMA_CFG_AES_SHA_OUTPUT_DATA_WR	PPC_BIT(47)
+#define   NX_DMA_CFG_AMF_CSB_WR			PPC_BITMASK(49, 50)
+#define   NX_DMA_CFG_AMF_COMPLETION_MODE	PPC_BITMASK(51, 52)
+#define   NX_DMA_CFG_AMF_CPB_WR			PPC_BITMASK(53, 54)
+#define   NX_DMA_CFG_AMF_OUTPUT_DATA_WR		PPC_BIT(55)
+#define   NX_DMA_CFG_842_SPBC			PPC_BIT(56)
+#define   NX_DMA_CFG_842_CSB_WR			PPC_BITMASK(57, 58)
+#define   NX_DMA_CFG_842_COMPLETION_MODE	PPC_BITMASK(59, 60)
+#define   NX_DMA_CFG_842_CPB_WR			PPC_BITMASK(61, 62)
+#define   NX_DMA_CFG_842_OUTPUT_DATA_WR		PPC_BIT(63)
 
 /* Engine Enable Register */
-#define NX_P8_EE_CFG	XSCOM_SAT(0xc, 0x1, 0x01)
-#define   NX_P8_EE_CFG_EFUSE	PPC_BIT(0)
-#define   NX_P8_EE_CFG_AMF_3	PPC_BIT(53)
-#define   NX_P8_EE_CFG_AMF_2	PPC_BIT(54)
-#define   NX_P8_EE_CFG_AMF_1	PPC_BIT(55)
-#define   NX_P8_EE_CFG_AMF_0	PPC_BIT(56)
-#define   NX_P8_EE_CFG_SYM_1	PPC_BIT(57)
-#define   NX_P8_EE_CFG_SYM_0	PPC_BIT(58)
-#define   NX_P8_EE_CFG_842_1	PPC_BIT(62)
-#define   NX_P8_EE_CFG_842_0	PPC_BIT(63)
+#define NX_P7_EE_CFG		NX_P7_SAT(0x1, 0x01)
+#define NX_P8_EE_CFG		NX_P8_SAT(0x1, 0x01)
+#define   NX_EE_CFG_EFUSE		PPC_BIT(0)
+#define   NX_EE_CFG_CH7			PPC_BIT(53) /* AMF */
+#define   NX_EE_CFG_CH6			PPC_BIT(54) /* AMF */
+#define   NX_EE_CFG_CH5			PPC_BIT(55) /* AMF */
+#define   NX_EE_CFG_CH4			PPC_BIT(56) /* P7: SYM, P8: AMF */
+#define   NX_EE_CFG_CH3			PPC_BIT(57) /* SYM */
+#define   NX_EE_CFG_CH2			PPC_BIT(58) /* SYM */
+#define   NX_EE_CFG_CH1			PPC_BIT(62) /* 842 */
+#define   NX_EE_CFG_CH0			PPC_BIT(63) /* 842 */
 
 
 /**************************************/
@@ -102,18 +107,18 @@ 
 #define NX_CT_842	(3)
 
 /* Coprocessor Instance counter
- * P8 NX workbook, section 5.5.1
+ * NX workbook, section 5.5.1
  * "Assigning <CT,CI> Values"
  */
-#define NX_P8_SYM_CFG_CI_MAX		(511)
-#define NX_P8_SYM_CFG_CI_LSHIFT		(2)
-#define NX_P8_ASYM_CFG_CI_MAX		(127)
-#define NX_P8_ASYM_CFG_CI_LSHIFT	(4)
-#define NX_P8_842_CFG_CI_MAX		(511)
-#define NX_P8_842_CFG_CI_LSHIFT		(2)
+#define NX_SYM_CFG_CI_MAX	(511)
+#define NX_SYM_CFG_CI_LSHIFT	(2)
+#define NX_ASYM_CFG_CI_MAX	(127)
+#define NX_ASYM_CFG_CI_LSHIFT	(4)
+#define NX_842_CFG_CI_MAX	(511)
+#define NX_842_CFG_CI_LSHIFT	(2)
 
 /* DMA configuration values
- * P8 NX workbook, section 5.2.3, table 5-4
+ * NX workbook, section 5.2.3, table 5-4
  * "DMA Configuration Register Bits"
  *
  * These values can be used for the AES/SHA, AMF, and 842 DMA