Patchwork [1/3] User OProfile support for the IBM CELL processor SPU event profiling

login
register
mail settings
Submitter Carl Love
Date Dec. 2, 2008, 12:18 a.m.
Message ID <1228177110.6509.286.camel@carll-linux-desktop>
Download mbox | patch
Permalink /patch/11685/
State New
Headers show

Comments

Carl Love - Dec. 2, 2008, 12:18 a.m.
This patch adds the SPU event profiling support for the IBM Cell
processor to the list of available events. The opcontrol script
patches include a test to see if there is a new cell specific file
in the kernel oprofile file system.  If the file exists, then the
kernel supports SPU event profiling.  

Signed-off-by: Carl Love <carll@us.ibm.com>
Michael Ellerman - Dec. 2, 2008, 1:02 a.m.
On Mon, 2008-12-01 at 16:18 -0800, Carl Love wrote:
> This patch adds the SPU event profiling support for the IBM Cell
> processor to the list of available events. The opcontrol script
> patches include a test to see if there is a new cell specific file
> in the kernel oprofile file system.  If the file exists, then the
> kernel supports SPU event profiling.  
> 
> Signed-off-by: Carl Love <carll@us.ibm.com>
> 

> Index: oprofile-cvs/doc/oprofile.xml
> ===================================================================
> --- oprofile-cvs.orig/doc/oprofile.xml
> +++ oprofile-cvs/doc/oprofile.xml
> @@ -123,6 +123,10 @@ For information on how to use OProfile's
>                         of 2.6.22 or more recent.  Additionally, full support of SPE profiling requires a BFD library
>                         from binutils code dated January 2007 or later.  To ensure the proper BFD support exists, run
>                         the <code>configure</code> utility with <code>--with-target=cell-be</code>.
> +
> +		       Profiling the Cell Broadband Engine using SPU events requires a kernel version of 2.6.TBD or
                                                                                                               ^^^

Not sure if you missed that, or it's still TBD, but careful it doesn't
get merged like that.

cheers
Carl Love - Dec. 2, 2008, 3:57 p.m.
On Tue, 2008-12-02 at 12:02 +1100, Michael Ellerman wrote:
> On Mon, 2008-12-01 at 16:18 -0800, Carl Love wrote:
> > This patch adds the SPU event profiling support for the IBM Cell
> > processor to the list of available events. The opcontrol script
> > patches include a test to see if there is a new cell specific file
> > in the kernel oprofile file system.  If the file exists, then the
> > kernel supports SPU event profiling.  
> > 
> > Signed-off-by: Carl Love <carll@us.ibm.com>
> > 
> 
> > Index: oprofile-cvs/doc/oprofile.xml
> > ===================================================================
> > --- oprofile-cvs.orig/doc/oprofile.xml
> > +++ oprofile-cvs/doc/oprofile.xml
> > @@ -123,6 +123,10 @@ For information on how to use OProfile's
> >                         of 2.6.22 or more recent.  Additionally, full support of SPE profiling requires a BFD library
> >                         from binutils code dated January 2007 or later.  To ensure the proper BFD support exists, run
> >                         the <code>configure</code> utility with <code>--with-target=cell-be</code>.
> > +
> > +		       Profiling the Cell Broadband Engine using SPU events requires a kernel version of 2.6.TBD or
>                                                                                                                ^^^
> 
> Not sure if you missed that, or it's still TBD, but careful it doesn't
> get merged like that.
> 
> cheers
> 
Michael:

Yes, I am aware of it.  I felt it was important to post user and kernel
so everyone knows that user and kernel development is done.  I have
talked with the OProfile maintainer about these patches already.  The
plan is to update the user patch with the correct kernel version, once
we know which version the patches will be going into.  

I should have mentioned this specifically in patch 0/3.  I will need a
final spin of the user tool patch once the kernel patches are accepted.

Thanks.

                Carl Love

Patch

Index: oprofile-cvs/events/ppc64/cell-be/events
===================================================================
--- oprofile-cvs.orig/events/ppc64/cell-be/events
+++ oprofile-cvs/events/ppc64/cell-be/events
@@ -108,12 +108,42 @@  event:0xdbe counters:0,1,2,3 um:PPU_0_cy
 event:0xdbf counters:0,1,2,3 um:PPU_0_cycles          minimum:10000	name:stb_not_empty		: At least one store gather buffer not empty.
 
 # Cell BE Island 4 - Synergistic Processor Unit (SPU)
-
+#
+# OPROFILE FOR CELL ONLY SUPPORTS PROFILING ON ONE SPU EVENT AT A TIME
+#
 # CBE Signal Group 41 - SPU (NClk)
+event:0x1004 counters:0 um:SPU_02_cycles          minimum:10000	name:dual_instrctn_commit	: Dual instruction committed.
+event:0x1005 counters:0 um:SPU_02_cycles          minimum:10000	name:sngl_instrctn_commit	: Single instruction committed.
+event:0x1006 counters:0 um:SPU_02_cycles          minimum:10000	name:ppln0_instrctn_commit	: Pipeline 0 instruction committed.
+event:0x1007 counters:0 um:SPU_02_cycles          minimum:10000	name:ppln1_instrctn_commit	: Pipeline 1 instruction committed.
+event:0x1008 counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:instrctn_ftch_stll	: Instruction fetch stall.
+event:0x1009 counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:lcl_strg_bsy		: Local storage busy.
+event:0x100A counters:0 um:SPU_02_cycles          minimum:10000	name:dma_cnflct_ld_st	: DMA may conflict with load or store.
+event:0x100B counters:0 um:SPU_02_cycles          minimum:10000	name:str_to_lcl_strg	: Store instruction to local storage issued.
+event:0x100C counters:0 um:SPU_02_cycles          minimum:10000	name:ld_frm_lcl_strg	: Load intruction from local storage issued.
+event:0x100D counters:0 um:SPU_02_cycles          minimum:10000	name:fpu_exctn		: Floating-Point Unit (FPU) exception.
+event:0x100E counters:0 um:SPU_02_cycles          minimum:10000	name:brnch_instrctn_commit	: Branch instruction committed.
+event:0x100F counters:0 um:SPU_02_cycles          minimum:10000	name:change_of_flow	: Non-sequential change of the SPU program counter, which can be caused by branch, asynchronous interrupt, stalled wait on channel, error correction code (ECC) error, and so forth.
+event:0x1010 counters:0 um:SPU_02_cycles          minimum:10000	name:brnch_not_tkn		: Branch not taken.
+event:0x1011 counters:0 um:SPU_02_cycles          minimum:10000	name:brnch_mss_prdctn	: Branch miss prediction; not exact. Certain other code sequences can cause additional pulses on this signal (see Note 2).
+event:0x1012 counters:0 um:SPU_02_cycles          minimum:10000	name:brnch_hnt_mss_prdctn	: Branch hint miss prediction; not exact. Certain other code sequences can cause additional pulses on this signal (see Note 2).
+event:0x1013 counters:0 um:SPU_02_cycles          minimum:10000	name:instrctn_seqnc_err	: Instruction sequence error.
+event:0x1015 counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl_wrt	: Stalled waiting on any blocking channel write (see Note 3).
+event:0x1016 counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl0	: Stalled waiting on External Event Status (Channel 0) (see Note 3).
+event:0x1017 counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl3	: Stalled waiting on Signal Notification 1 (Channel 3) (see Note 3).
+event:0x1018 counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl4	: Stalled waiting on Signal Notification 2 (Channel 4) (see Note 3).
+event:0x1019 counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl21	: Stalled waiting on DMA Command Opcode or ClassID Register (Channel 21) (see Note 3).
+event:0x101A counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl24	: Stalled waiting on Tag Group Status (Channel 24) (see Note 3).
+event:0x101B counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl25	: Stalled waiting on List Stall-and-Notify Tag Status (Channel 25) (see Note 3).
+event:0x101C counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl28	: Stalled waiting on PPU Mailbox (Channel 28) (see Note 3).
+event:0x1022 counters:0 um:SPU_02_cycles_or_edges minimum:10000	name:stlld_wait_on_chnl29	: Stalled waiting on SPU Mailbox (Channel 29) (see Note 3).
+
 
 # CBE Signal Group 42 - SPU Trigger (NClk)
+event:0x10A1 counters:0 um:SPU_Trigger_cycles_or_edges minimum:10000	name:stld_wait_chnl_op	: Stalled waiting on channel operation (See Note 2).
 
 # CBE Signal Group 43 - SPU Event (NClk)
+event:0x1107 counters:0 um:SPU_Event_cycles_or_edges minimum:10000	name:instrctn_ftch_stll	: Instruction fetch stall.
 
 # Cell BE Island 6 - Element Interconnect Bus (EIB)
 
Index: oprofile-cvs/events/ppc64/cell-be/unit_masks
===================================================================
--- oprofile-cvs.orig/events/ppc64/cell-be/unit_masks
+++ oprofile-cvs/events/ppc64/cell-be/unit_masks
@@ -72,3 +72,66 @@  name:PPU_0123_cycles type:bitmask defaul
 	0x002 Positive polarity				[default  ]
 	0x030 PPU Bus Word 0/1				[default  ]
 	0x0c0 PPU Bus Word 2/3				[optional ]
+name:SPU_02_cycles type:bitmask default:0x0113
+	0x0001 Count cycles				[mandatory]
+	0x0000 Negative polarity			[optional ]
+	0x0002 Positive polarity			[default  ]
+	0x0110 SPU Bus Word 0				[default  ]
+	0x0140 SPU Bus Word 2				[optional ]
+	0x0000 SPU 0					[default  ]
+	0x1000 SPU 1					[optional ]
+	0x2000 SPU 2					[optional ]
+	0x3000 SPU 3					[optional ]
+	0x4000 SPU 4					[optional ]
+	0x5000 SPU 5					[optional ]
+	0x6000 SPU 6					[optional ]
+	0x7000 SPU 7					[optional ]
+name:SPU_02_cycles_or_edges type:bitmask default:0x0113
+	0x0000 Count edges				[optional ]
+	0x0001 Count cycles				[default  ]
+	0x0000 Negative polarity			[optional ]
+	0x0002 Positive polarity			[default  ]
+	0x0110 SPU Bus Word 0				[default  ]
+	0x0140 SPU Bus Word 2				[optional ]
+	0x0000 SPU 0					[default  ]
+	0x1000 SPU 1					[optional ]
+	0x2000 SPU 2					[optional ]
+	0x3000 SPU 3					[optional ]
+	0x4000 SPU 4					[optional ]
+	0x5000 SPU 5					[optional ]
+	0x6000 SPU 6					[optional ]
+	0x7000 SPU 7					[optional ]
+name:SPU_Trigger_cycles_or_edges type:bitmask default:0x0107
+	0x0000 Count edges				[optional ]
+	0x0001 Count cycles				[default  ]
+	0x0000 Negative polarity			[optional ]
+	0x0002 Positive polarity			[default  ]
+	0x0104 SPU Trigger 0				[default  ]
+	0x0114 SPU Trigger 1				[optional ]
+	0x0124 SPU Trigger 2				[optional ]
+	0x0134 SPU Trigger 3				[optional ]
+	0x0000 SPU 0					[default  ]
+	0x1000 SPU 1					[optional ]
+	0x2000 SPU 2					[optional ]
+	0x3000 SPU 3					[optional ]
+	0x4000 SPU 4					[optional ]
+	0x5000 SPU 5					[optional ]
+	0x6000 SPU 6					[optional ]
+	0x7000 SPU 7					[optional ]
+name:SPU_Event_cycles_or_edges type:bitmask default:0x0147
+	0x0000 Count edges				[optional ]
+	0x0001 Count cycles				[default  ]
+	0x0000 Negative polarity			[optional ]
+	0x0002 Positive polarity			[default  ]
+	0x0144 SPU Event 0				[default  ]
+	0x0154 SPU Event 1				[optional ]
+	0x0164 SPU Event 2				[optional ]
+	0x0174 SPU Event 3				[optional ]
+	0x0000 SPU 0					[default  ]
+	0x1000 SPU 1					[optional ]
+	0x2000 SPU 2					[optional ]
+	0x3000 SPU 3					[optional ]
+	0x4000 SPU 4					[optional ]
+	0x5000 SPU 5					[optional ]
+	0x6000 SPU 6					[optional ]
+	0x7000 SPU 7					[optional ]
Index: oprofile-cvs/ChangeLog
===================================================================
--- oprofile-cvs.orig/ChangeLog
+++ oprofile-cvs/ChangeLog
@@ -1,3 +1,12 @@ 
+2008-11-17  Carl Love  <carll@us.ibm.com
+
+	* Added IBM CELL SPU event profiling support.
+	* utils/opcontrol - rewrote Cell tests to validate which
+	  events can be selected at the same time.
+	* events/ppc64/cell-be/events - add CELL SPU events
+	* events/ppc64/cell-be/unit_masks - added CELL SPU
+	  unit masks
+
 2008-11-24  Robert Richter <robert.richter@amd.com>
 
 	* utils/opcontrol: Correct spelling error
Index: oprofile-cvs/utils/opcontrol
===================================================================
--- oprofile-cvs.orig/utils/opcontrol
+++ oprofile-cvs/utils/opcontrol
@@ -1154,31 +1154,67 @@  check_event_mapping_data()
 	fi
 	if [ "$CPUTYPE" = "ppc64/cell-be" ]; then
 		event_num=`echo $EVENT_STR | awk '{print $1}'`
-		if [ "$event_num" = "2" ] ; then
-			if [ "$CYCLES" = "1" \
-				-o "$GRP_NUM_CK_VAL" != "" ] ; then
-				echo "ERROR: SPU CYCLES cannot be counted with any other events." >&2
-				exit 1
-			else
-				SPU_CYCLES=2
-				return
-			fi
-		fi
-		if [ "$event_num" = "1" ] ; then
-			# CYCLES is compatible with any event except SPU_CYCLES
-			if [ "$SPU_CYCLES" = "2" ] ; then
-				echo "ERROR: PPU CYCLES and SPU CYCLES cannot be counted simultaneously." >&2
-				exit 1
-			else
-				CYCLES=1
-				return
-			fi
+
+		# PPU event and cycle events can be measured at
+		# the same time.  SPU event can not be measured
+		# at the same time as any other event.  Similarly for
+		# SPU Cycles
+
+		# We use EVNT_MSK to track what events have already
+		# been seen.  Valid values are:
+		#    NULL string -  no events seen yet
+		#    1 - PPU CYCLES or PPU Event seen
+		#    2 - SPU CYCLES seen
+		#    3 - SPU EVENT seen
+
+		# check if event is PPU_CYCLES
+		if [ "$event_num" = "1" ]; then
+		    if [ "$EVNT_MSK" = "1" ] || [ "$EVNT_MSK" = "" ]; then
+			EVNT_MSK=1
+		    else
+			echo "PPU CYCLES not compatible with previously specified event"
+			exit 1
+		    fi
+
+		# check if event is SPU_CYCLES
+		elif [ "$event_num" = "2" ]; then
+		    if [ "$EVNT_MSK" = "" ]; then
+			EVNT_MSK=2
+		    else
+			echo "SPU CYCLES not compatible with any other event"
+			exit 1
+		    fi
+
+		# check if event is SPU Event profiling
+		elif [ "$event_num" -ge "4100" ] && [ "$event_num" -le "4163" ] ; then
+		    if [ "$EVNT_MSK" = "" ]; then
+			EVNT_MSK=3
+		    else
+			echo "SPU event profiling not compatible with any other event"
+			exit 1
+		    fi
+
+		    # Check to see that the kernel supports SPU event profiling
+		    # Note, if the file exits it should have the LSB bit set to
+		    # to 1 indicating SPU event profiling support. For now, it
+		    # is sufficient to test that the file exists.
+		    if test ! -f /dev/oprofile/cell_support; then
+			echo "Kernel does not support SPU event profiling"
+			exit 1
+		    fi
+
+		# check if event is PPU Event profiling (all other events
+		#     are PPU events)
+
 		else
-			if [ "$SPU_CYCLES" = "2" ] ; then
-				echo "ERROR: SPU CYCLES cannot be counted with any other events." >&2
+		    if [ "$EVNT_MSK" = "1" ] || [ "$EVNT_MSK" = "" ]; then
+			EVNT_MSK=1
+		    else
+			echo "PPU profiling not compatible with previously specified event"
 				exit 1
-			fi
+		    fi
 		fi
+
 		len=`echo -n $event_num | wc -m`
 		num_chars_in_grpid=`expr $len - 2`
 		GRP_NUM_VAL=`echo | awk '{print substr("'"${event_num}"'",1,"'"${num_chars_in_grpid}"'")}'`
Index: oprofile-cvs/doc/oprofile.xml
===================================================================
--- oprofile-cvs.orig/doc/oprofile.xml
+++ oprofile-cvs/doc/oprofile.xml
@@ -123,6 +123,10 @@  For information on how to use OProfile's
                        of 2.6.22 or more recent.  Additionally, full support of SPE profiling requires a BFD library
                        from binutils code dated January 2007 or later.  To ensure the proper BFD support exists, run
                        the <code>configure</code> utility with <code>--with-target=cell-be</code>.
+
+		       Profiling the Cell Broadband Engine using SPU events requires a kernel version of 2.6.TBD or
+		       more recent.
+
                        <note>Attempting to profile SPEs with kernel versions older than 2.6.22 may cause the
                        system to crash.</note>
 		</para></listitem>
@@ -1195,6 +1199,12 @@  is collected in the sample data in order
 the performance of each separate SPU is not necessary.
 </para>
 <para>
+Profiling with an SPU event (events 4100 through 4163) is not compatible with any other
+event.  Further more, only one SPU event can be specified at a time.  The hardware only
+supports profiling on one SPU per node at a time.  The OProfile kernel code time slices
+between the eight SPUs to collect data on all SPUs.
+</para>
+<para>
 SPU profile reports have some unique characteristics compared to reports for
 standard architectures:
 </para>