Patchwork [v3,5/5] powernv:cpufreq: Implement the driver->get() method

login
register
mail settings
Submitter Gautham R. Shenoy
Date March 20, 2014, 12:11 p.m.
Message ID <1395317460-14811-6-git-send-email-ego@linux.vnet.ibm.com>
Download mbox | patch
Permalink /patch/332100/
State Changes Requested
Headers show

Comments

Gautham R. Shenoy - March 20, 2014, 12:11 p.m.
From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

The current frequency of a cpu is reported through the sysfs file
cpuinfo_cur_freq. This requires the driver to implement a
"->get(unsigned int cpu)" method which will return the current
operating frequency.

Implement a function named powernv_cpufreq_get() which reads the local
pstate from the PMSR and returns the corresponding frequency.

Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
Gautham R. Shenoy - March 21, 2014, 6:25 a.m.
On Thu, Mar 20, 2014 at 05:41:00PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> The current frequency of a cpu is reported through the sysfs file
> cpuinfo_cur_freq. This requires the driver to implement a
> "->get(unsigned int cpu)" method which will return the current
> operating frequency.
> 
> Implement a function named powernv_cpufreq_get() which reads the local
> pstate from the PMSR and returns the corresponding frequency.
> 
> Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().

Forgot to add 

  Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 46bee8a..ef6ed8c 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
>  	BUG();
>  }
> 
> +/*
> + * Computes the current frequency on this cpu
> + * and stores the result in *ret_freq.
> + */
> +static void powernv_read_cpu_freq(void *ret_freq)
> +{
> +	unsigned long pmspr_val;
> +	s8 local_pstate_id;
> +	int *cur_freq, freq, pstate_id;
> +
> +	cur_freq = (int *)ret_freq;
> +	pmspr_val = get_pmspr(SPRN_PMSR);
> +
> +	/* The local pstate id corresponds bits 48..55 in the PMSR.
> +         * Note: Watch out for the sign! */
> +	local_pstate_id = (pmspr_val >> 48) & 0xFF;
> +	pstate_id = local_pstate_id;
> +
> +	freq = pstate_id_to_freq(pstate_id);
> +	pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> +		smp_processor_id(), pmspr_val, pstate_id, freq);
> +	*cur_freq = freq;
> +}
> +
> +/*
> + * Returns the cpu frequency as reported by the firmware for 'cpu'.
> + * This value is reported through the sysfs file cpuinfo_cur_freq.
> + */
> +unsigned int powernv_cpufreq_get(unsigned int cpu)
> +{
> +	int ret_freq;
> +
> +	smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
> +			&ret_freq, 1);
> +	return ret_freq;
> +}
> +
>  static void set_pstate(void *pstate)
>  {
>  	unsigned long val;
> @@ -297,6 +334,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  static struct cpufreq_driver powernv_cpufreq_driver = {
>  	.verify		= powernv_cpufreq_verify,
>  	.target		= powernv_cpufreq_target,
> +	.get		= powernv_cpufreq_get,
>  	.init		= powernv_cpufreq_cpu_init,
>  	.exit		= powernv_cpufreq_cpu_exit,
>  	.name		= "powernv-cpufreq",
> -- 
> 1.8.3.1
>
viresh kumar - March 21, 2014, 9:31 a.m.
On Thu, Mar 20, 2014 at 5:41 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> The current frequency of a cpu is reported through the sysfs file
> cpuinfo_cur_freq. This requires the driver to implement a
> "->get(unsigned int cpu)" method which will return the current
> operating frequency.
>
> Implement a function named powernv_cpufreq_get() which reads the local
> pstate from the PMSR and returns the corresponding frequency.
>
> Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)

Please merge these fixups with the first patch which is creating the driver.
I understand that a different guy has created this patch and so wanted
to have a patch on his name but its really difficult to review this way.
Better add your signed-off in the first patch instead. Sending such changes
once the driver is mainlined looks fine.

> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 46bee8a..ef6ed8c 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
>         BUG();
>  }
>
> +/*
> + * Computes the current frequency on this cpu
> + * and stores the result in *ret_freq.
> + */
> +static void powernv_read_cpu_freq(void *ret_freq)
> +{
> +       unsigned long pmspr_val;
> +       s8 local_pstate_id;
> +       int *cur_freq, freq, pstate_id;
> +
> +       cur_freq = (int *)ret_freq;

You don't need cur_freq variable at all..

> +       pmspr_val = get_pmspr(SPRN_PMSR);
> +
> +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> +         * Note: Watch out for the sign! */
> +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> +       pstate_id = local_pstate_id;

similarly local_pstate_id

> +
> +       freq = pstate_id_to_freq(pstate_id);
> +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> +               smp_processor_id(), pmspr_val, pstate_id, freq);
> +       *cur_freq = freq;

Move above print here after a blank line. Also remove freq variable as
well and use *cur_freq directly.. And then you can rename it to freq as well.

> +}
> +
> +/*
> + * Returns the cpu frequency as reported by the firmware for 'cpu'.
> + * This value is reported through the sysfs file cpuinfo_cur_freq.
> + */
> +unsigned int powernv_cpufreq_get(unsigned int cpu)
> +{
> +       int ret_freq;
> +
> +       smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
> +                       &ret_freq, 1);
> +       return ret_freq;
> +}
> +
>  static void set_pstate(void *pstate)
>  {
>         unsigned long val;
> @@ -297,6 +334,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  static struct cpufreq_driver powernv_cpufreq_driver = {
>         .verify         = powernv_cpufreq_verify,
>         .target         = powernv_cpufreq_target,
> +       .get            = powernv_cpufreq_get,
>         .init           = powernv_cpufreq_cpu_init,
>         .exit           = powernv_cpufreq_cpu_exit,
>         .name           = "powernv-cpufreq",
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gautham R. Shenoy - March 21, 2014, 11:04 a.m.
On Fri, Mar 21, 2014 at 03:01:29PM +0530, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:41 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >
> > The current frequency of a cpu is reported through the sysfs file
> > cpuinfo_cur_freq. This requires the driver to implement a
> > "->get(unsigned int cpu)" method which will return the current
> > operating frequency.
> >
> > Implement a function named powernv_cpufreq_get() which reads the local
> > pstate from the PMSR and returns the corresponding frequency.
> >
> > Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().
> >
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> >  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 38 insertions(+)
> 
> Please merge these fixups with the first patch which is creating the driver.
> I understand that a different guy has created this patch and so wanted
> to have a patch on his name but its really difficult to review this
way.

Heh! Well, that wasn't the reason why this was sent out as a separate
patch, but never mind. Though I don't understand why it would be
difficult to review the patch though.

> Better add your signed-off in the first patch instead. Sending such changes
> once the driver is mainlined looks fine.

Sure, this makes sense.

> 
> > diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> > index 46bee8a..ef6ed8c 100644
> > --- a/drivers/cpufreq/powernv-cpufreq.c
> > +++ b/drivers/cpufreq/powernv-cpufreq.c
> > @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
> >         BUG();
> >  }
> >
> > +/*
> > + * Computes the current frequency on this cpu
> > + * and stores the result in *ret_freq.
> > + */
> > +static void powernv_read_cpu_freq(void *ret_freq)
> > +{
> > +       unsigned long pmspr_val;
> > +       s8 local_pstate_id;
> > +       int *cur_freq, freq, pstate_id;
> > +
> > +       cur_freq = (int *)ret_freq;
> 
> You don't need cur_freq variable at all..

I don't like it either. But the compiler complains without this hack
:-(

> 
> > +       pmspr_val = get_pmspr(SPRN_PMSR);
> > +
> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> > +         * Note: Watch out for the sign! */
> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> > +       pstate_id = local_pstate_id;
> 
> similarly local_pstate_id

well, I am interested in the bits 48..55 of pmspr_val. That's the
pstate_id which can be negative. So I'll like to keep
local_pstate_id. 
> 
> > +
> > +       freq = pstate_id_to_freq(pstate_id);
> > +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> > +               smp_processor_id(), pmspr_val, pstate_id, freq);
> > +       *cur_freq = freq;
> 
> Move above print here after a blank line. Also remove freq variable as
> well and use *cur_freq directly.. And then you can rename it to freq
as well.

We can get rid of freq and use *cur_freq in its place.


Thanks for the detailed review.

--
Regards
gautham.
viresh kumar - March 21, 2014, 11:45 a.m.
On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Heh! Well, that wasn't the reason why this was sent out as a separate
> patch, but never mind. Though I don't understand why it would be
> difficult to review the patch though.

Because the initial driver wasn't complete earlier. There were 2-3 patches
after the first one which are changing what the first patch has added.
Nothing else :)

>> > +static void powernv_read_cpu_freq(void *ret_freq)
>> > +{
>> > +       unsigned long pmspr_val;
>> > +       s8 local_pstate_id;
>> > +       int *cur_freq, freq, pstate_id;
>> > +
>> > +       cur_freq = (int *)ret_freq;
>>
>> You don't need cur_freq variable at all..
>
> I don't like it either. But the compiler complains without this hack
> :-(

Why would the compiler warn for doing this?:

*(int *)ret_freq = freq;

>> > +       pmspr_val = get_pmspr(SPRN_PMSR);
>> > +
>> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
>> > +         * Note: Watch out for the sign! */
>> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
>> > +       pstate_id = local_pstate_id;
>>
>> similarly local_pstate_id
>
> well, I am interested in the bits 48..55 of pmspr_val. That's the
> pstate_id which can be negative. So I'll like to keep
> local_pstate_id.

Can you please explain at bit level how getting the value in a s8
first and then into a s32 will help you here? Instead of getting it
directly into s32.
David Laight - March 21, 2014, 12:01 p.m.
From: Viresh Kumar

 
> On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:

> > Heh! Well, that wasn't the reason why this was sent out as a separate

> > patch, but never mind. Though I don't understand why it would be

> > difficult to review the patch though.

> 

> Because the initial driver wasn't complete earlier. There were 2-3 patches

> after the first one which are changing what the first patch has added.

> Nothing else :)

> 

> >> > +static void powernv_read_cpu_freq(void *ret_freq)

> >> > +{

> >> > +       unsigned long pmspr_val;

> >> > +       s8 local_pstate_id;

> >> > +       int *cur_freq, freq, pstate_id;

> >> > +

> >> > +       cur_freq = (int *)ret_freq;

> >>

> >> You don't need cur_freq variable at all..

> >

> > I don't like it either. But the compiler complains without this hack

> > :-(

> 

> Why would the compiler warn for doing this?:

> 

> *(int *)ret_freq = freq;


Because it is very likely to be wrong.
In general casts of pointers to integer types are dangerous.
In this case why not make the function return the value?

	David
Gautham R. Shenoy - March 21, 2014, 12:31 p.m.
On Fri, Mar 21, 2014 at 12:01:07PM +0000, David Laight wrote:
> From: Viresh Kumar
>  
> > On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > > Heh! Well, that wasn't the reason why this was sent out as a separate
> > > patch, but never mind. Though I don't understand why it would be
> > > difficult to review the patch though.
> > 
> > Because the initial driver wasn't complete earlier. There were 2-3 patches
> > after the first one which are changing what the first patch has added.
> > Nothing else :)
> > 
> > >> > +static void powernv_read_cpu_freq(void *ret_freq)
> > >> > +{
> > >> > +       unsigned long pmspr_val;
> > >> > +       s8 local_pstate_id;
> > >> > +       int *cur_freq, freq, pstate_id;
> > >> > +
> > >> > +       cur_freq = (int *)ret_freq;
> > >>
> > >> You don't need cur_freq variable at all..
> > >
> > > I don't like it either. But the compiler complains without this hack
> > > :-(
> > 
> > Why would the compiler warn for doing this?:
> > 
> > *(int *)ret_freq = freq;
> 
> Because it is very likely to be wrong.
> In general casts of pointers to integer types are dangerous.
> In this case why not make the function return the value?

Because this function is called via an smp_call_function(). And we
need a way of returning the value to the caller.

> 
> 	David
> 
>
viresh kumar - March 21, 2014, 12:34 p.m.
On 21 March 2014 17:31, David Laight <David.Laight@aculab.com> wrote:
>> *(int *)ret_freq = freq;
>
> Because it is very likely to be wrong.
> In general casts of pointers to integer types are dangerous.

Where are we converting pointers to integers? We are doing a
cast from 'void * ' to 'int *' and then using indirection operator
to set its value.
Gautham R. Shenoy - March 21, 2014, 1:04 p.m.
On Fri, Mar 21, 2014 at 05:15:34PM +0530, Viresh Kumar wrote:
> On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > Heh! Well, that wasn't the reason why this was sent out as a separate
> > patch, but never mind. Though I don't understand why it would be
> > difficult to review the patch though.
> 
> Because the initial driver wasn't complete earlier. There were 2-3 patches
> after the first one which are changing what the first patch has added.
> Nothing else :)
> 
> >> > +static void powernv_read_cpu_freq(void *ret_freq)
> >> > +{
> >> > +       unsigned long pmspr_val;
> >> > +       s8 local_pstate_id;
> >> > +       int *cur_freq, freq, pstate_id;
> >> > +
> >> > +       cur_freq = (int *)ret_freq;
> >>
> >> You don't need cur_freq variable at all..
> >
> > I don't like it either. But the compiler complains without this hack
> > :-(
> 
> Why would the compiler warn for doing this?:
> 
> *(int *)ret_freq = freq;

The compiler doesn't complain if I do this.
> 
> >> > +       pmspr_val = get_pmspr(SPRN_PMSR);
> >> > +
> >> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> >> > +         * Note: Watch out for the sign! */
> >> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> >> > +       pstate_id = local_pstate_id;
> >>
> >> similarly local_pstate_id
> >
> > well, I am interested in the bits 48..55 of pmspr_val. That's the
> > pstate_id which can be negative. So I'll like to keep
> > local_pstate_id.
> 
> Can you please explain at bit level how getting the value in a s8
> first and then into a s32 will help you here? Instead of getting it
> directly into s32.

Consider the case when pmspr = 0x00feffffffffffff;

We are interested in extracting the value 'fe'. And ensure that when
we store this value into an int, we get the sign extension right.

So the following doesn't work: 

   pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;

Since we get pstate_id = 0x000000fe, which is not the same as
0xfffffffe.

Of course, we could do the following, but I wasn't sure if
two levels of type casting helped on the readability front.

   pstate_id = (int) ((s8)((pmspr_val >> 48) & 0xFF));

--
Thanks and Regards
gautham.
>
viresh kumar - March 21, 2014, 1:12 p.m.
On 21 March 2014 18:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Consider the case when pmspr = 0x00feffffffffffff;
>
> We are interested in extracting the value 'fe'. And ensure that when
> we store this value into an int, we get the sign extension right.
>
> So the following doesn't work:
>
>    pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;

What about pstate_id = (pmspr_val >> 48) & 0xFF; ??
Gautham R. Shenoy - March 21, 2014, 2:25 p.m.
On Fri, Mar 21, 2014 at 06:42:50PM +0530, Viresh Kumar wrote:
> On 21 March 2014 18:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > Consider the case when pmspr = 0x00feffffffffffff;
> >
> > We are interested in extracting the value 'fe'. And ensure that when
> > we store this value into an int, we get the sign extension right.
> >
> > So the following doesn't work:
> >
> >    pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;
> 
> What about pstate_id = (pmspr_val >> 48) & 0xFF; ??

Nope. 

Suppose the pmspr_val is contained in the register 
%rax, and pstate_id corresponds to the address -20(%rbp) then:

	pstate_id = (pmspr_val >> 48) & 0xFF;

would corrspond to 

        shrq    $48, %rax        // Left shift by 48 bits
        andl    $255, %eax       // Mask the lower 32 bits of %rax with 0x000000FF
        movl    %eax, -20(%rbp)  // Store the lower 32 bits of %rax
                                    into pstate_id


On the other hand, 

   pstate_id = (int)((s8)((pmspr_val >> 48) & 0xFF));

would correspond to:

        shrq    $48, %rax  // Left shift by 48 bits.
        movsbl  %al, %eax  // Move the lower 8 bits of %rax to %eax
                              with sign-extension. 
        movl    %eax, -20(%rbp) // store the result in pstate_id;

So with this, we are getting the sign extension due to the use of movsbl.

And if local_pstate_id corresponds to the address -1(%rbp) then:

       local_pstate_id = (pmspr_val >> 48) & 0xFF;
       pstate_id = local_pstate_id;

would correspond to:

        shrq    $48, %rax        // Left shift by 48 bits
        movb    %al, -1(%rbp)    //copy the lower 8 bits to local_pstate_id
        movsbl  -1(%rbp), %eax   //move the contents of
                                   local_pstate_id to %eax with sign extension.
        movl    %eax, -20(%rbp)  // Store the result in pstate_id


Hope this helps :-)

--
Thanks and Regards
gautham.
David Laight - March 21, 2014, 3:01 p.m.
From: Viresh Kumar [mailto:viresh.kumar@linaro.org]
> On 21 March 2014 17:31, David Laight <David.Laight@aculab.com> wrote:
> >> *(int *)ret_freq = freq;
> >
> > Because it is very likely to be wrong.
> > In general casts of pointers to integer types are dangerous.
> 
> Where are we converting pointers to integers? We are doing a
> cast from 'void * ' to 'int *' and then using indirection operator
> to set its value.

You mis-parsed what I wrote, try:
In general casts of 'pointer to integer' types are dangerous.

Somewhere, much higher up the call stack, the address of an integer
variable is being taken and then passed as the 'void *' parameter.

The 'problem' is that it is quite easily to pass the address of
a 'long' instead. On 32bit and LE systems this won't always
be a problem. On 64bit BE it all goes wrong.

	David
Benjamin Herrenschmidt - March 21, 2014, 10:56 p.m.
On Fri, 2014-03-21 at 16:34 +0530, Gautham R Shenoy wrote:

> > >
> > > +/*
> > > + * Computes the current frequency on this cpu
> > > + * and stores the result in *ret_freq.
> > > + */
> > > +static void powernv_read_cpu_freq(void *ret_freq)
> > > +{
> > > +       unsigned long pmspr_val;
> > > +       s8 local_pstate_id;
> > > +       int *cur_freq, freq, pstate_id;
> > > +
> > > +       cur_freq = (int *)ret_freq;
> > 
> > You don't need cur_freq variable at all..
> 
> I don't like it either. But the compiler complains without this hack
> :-(

Casting integers into void * is a recipe for disaster... what is that
supposed to be about ? We lose all type checking and get exposed
to endian issues etc... the day somebody uses a different type on both
sides.

Also is "freq" a frequency ? In this case an int isn't big enough.

Cheers,
Ben.
Gautham R. Shenoy - March 22, 2014, 7:53 a.m.
Hi Ben,

On Sat, Mar 22, 2014 at 09:56:30AM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2014-03-21 at 16:34 +0530, Gautham R Shenoy wrote:
> 
> > > >
> > > > +/*
> > > > + * Computes the current frequency on this cpu
> > > > + * and stores the result in *ret_freq.
> > > > + */
> > > > +static void powernv_read_cpu_freq(void *ret_freq)
> > > > +{
> > > > +       unsigned long pmspr_val;
> > > > +       s8 local_pstate_id;
> > > > +       int *cur_freq, freq, pstate_id;
> > > > +
> > > > +       cur_freq = (int *)ret_freq;
> > > 
> > > You don't need cur_freq variable at all..
> > 
> > I don't like it either. But the compiler complains without this hack
> > :-(
> 
> Casting integers into void * is a recipe for disaster... what is that
> supposed to be about ?

Like I mentioned elsewhere on this thread, we're calling
powernv_read_cpu_freq via an smp_call_function(). We use this to
obtain the frequency on the cpu where powernv_read_cpu_freq
executes and return it to the caller of smp_call_function.

> We lose all type checking and get exposed
> to endian issues etc... the day somebody uses a different type on both
> sides.
> 
Yes, I understand the problem now. I'll think of a safer way to pass
the return value.

> Also is "freq" a frequency ? In this case an int isn't big enough.

freq is the frequency stored in the cpufreq_table. The value is in
kHz. So, int should be big enough.

> Cheers,
> Ben.
> 
>

--
Thanks and Regards
gautham.

Patch

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 46bee8a..ef6ed8c 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -206,6 +206,43 @@  static inline void set_pmspr(unsigned long sprn, unsigned long val)
 	BUG();
 }
 
+/*
+ * Computes the current frequency on this cpu
+ * and stores the result in *ret_freq.
+ */
+static void powernv_read_cpu_freq(void *ret_freq)
+{
+	unsigned long pmspr_val;
+	s8 local_pstate_id;
+	int *cur_freq, freq, pstate_id;
+
+	cur_freq = (int *)ret_freq;
+	pmspr_val = get_pmspr(SPRN_PMSR);
+
+	/* The local pstate id corresponds bits 48..55 in the PMSR.
+         * Note: Watch out for the sign! */
+	local_pstate_id = (pmspr_val >> 48) & 0xFF;
+	pstate_id = local_pstate_id;
+
+	freq = pstate_id_to_freq(pstate_id);
+	pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
+		smp_processor_id(), pmspr_val, pstate_id, freq);
+	*cur_freq = freq;
+}
+
+/*
+ * Returns the cpu frequency as reported by the firmware for 'cpu'.
+ * This value is reported through the sysfs file cpuinfo_cur_freq.
+ */
+unsigned int powernv_cpufreq_get(unsigned int cpu)
+{
+	int ret_freq;
+
+	smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
+			&ret_freq, 1);
+	return ret_freq;
+}
+
 static void set_pstate(void *pstate)
 {
 	unsigned long val;
@@ -297,6 +334,7 @@  static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 static struct cpufreq_driver powernv_cpufreq_driver = {
 	.verify		= powernv_cpufreq_verify,
 	.target		= powernv_cpufreq_target,
+	.get		= powernv_cpufreq_get,
 	.init		= powernv_cpufreq_cpu_init,
 	.exit		= powernv_cpufreq_cpu_exit,
 	.name		= "powernv-cpufreq",