Patchwork [U-Boot,1/7] Tegra30: Add AVP (arm720t) files

Submitter: Tom Warren
Date: Oct. 2, 2012, 10:45 p.m.
Message ID: <1349217955-8729-2-git-send-email-twarren@nvidia.com>
Permalink: /patch/188675/
State: Superseded
Delegated to: Tom Warren

Comments

Tom Warren - Oct. 2, 2012, 10:45 p.m.
This provides SPL support for T30 boards - AVP early init, plus
CPU (A9) init/jump to main U-Boot.

Signed-off-by: Tom Warren <twarren@nvidia.com>
---
 arch/arm/cpu/arm720t/tegra-common/cpu.h            |   48 +--
 arch/arm/cpu/arm720t/tegra-common/spl.c            |    3 +-
 .../arm/cpu/arm720t/tegra30}/Makefile              |   20 +-
 arch/arm/cpu/arm720t/tegra30/config.mk             |   19 +
 arch/arm/cpu/arm720t/tegra30/cpu.c                 |  516 ++++++++++++++++++++
 5 files changed, 554 insertions(+), 52 deletions(-)
 copy {board/compal/paz00 => arch/arm/cpu/arm720t/tegra30}/Makefile (68%)
 create mode 100644 arch/arm/cpu/arm720t/tegra30/config.mk
 create mode 100644 arch/arm/cpu/arm720t/tegra30/cpu.c
Stephen Warren - Oct. 3, 2012, 6:23 p.m.
On 10/02/2012 04:45 PM, Tom Warren wrote:
> This provides SPL support for T30 boards - AVP early init, plus
> CPU (A9) init/jump to main U-Boot.

> diff --git a/arch/arm/cpu/arm720t/tegra30/cpu.c b/arch/arm/cpu/arm720t/tegra30/cpu.c

> +/*
> + * Timing tables for each SOC for all four oscillator options.
> + */
> +static struct clk_pll_table tegra_pll_x_table[TEGRA_SOC_COUNT]
> +						[CLOCK_OSC_FREQ_COUNT] = {
> +	/* T20: 1 GHz */

This is odd; it's a Tegra30-specific file, yet has data tables for
Tegra20 too, and various code that only makes sense to differentiate
Tegra20 and Tegra30 at runtime.

Either everything in this file should be Tegra30-specific, or whatever
new code is being added here should be added to a file in tegra-common
if it needs to handle both SoCs.
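
One way to read the first option is a table with no SoC dimension at all. The following is only a sketch of that idea: the rows are the "T30(high): 1.4 GHz" entries copied from the patch, while the type names, the oscillator enum, and the rate helper are hypothetical. The helper assumes the usual PLL relation (output = oscillator * n / m, divided by 2^p), which the thread itself does not spell out.

```c
#include <stdint.h>

/* Hypothetical enum; order follows the patch comments: 13, 19.2, 12, 26 MHz. */
enum osc_freq { OSC_13M, OSC_19_2M, OSC_12M, OSC_26M, OSC_COUNT };

/* Same fields as struct clk_pll_table in the patch. */
struct pll_entry {
	uint16_t n;	/* feedback divider */
	uint16_t m;	/* input divider */
	uint8_t p;	/* post divider, treated here as a power of two (assumption) */
	uint8_t cpcon;	/* charge pump setting */
};

/* Oscillator rates in kHz, indexed by enum osc_freq. */
static const uint32_t osc_khz[OSC_COUNT] = { 13000, 19200, 12000, 26000 };

/* Tegra30-only table: the "T30(high): 1.4 GHz" rows, no Tegra20 rows. */
static const struct pll_entry t30_pllx_high[OSC_COUNT] = {
	[OSC_13M]   = { 862,  8, 0, 8 },
	[OSC_19_2M] = { 583,  8, 0, 4 },
	[OSC_12M]   = { 700,  6, 0, 8 },
	[OSC_26M]   = { 700, 13, 0, 8 },
};

/* Resulting PLLX rate in kHz under the assumed osc * n / m / 2^p relation. */
static uint32_t pllx_rate_khz(enum osc_freq osc, const struct pll_entry *tbl)
{
	uint64_t rate = (uint64_t)osc_khz[osc] * tbl[osc].n / tbl[osc].m;

	return (uint32_t)(rate >> tbl[osc].p);
}
```

Under that assumption the 12 MHz and 26 MHz rows land exactly on 1.4 GHz, while the 13 MHz and 19.2 MHz rows are approximations of it.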

> +static int get_chip_type(void)
> +{
> +	/*
> +	 * T30 has two options. We will return TEGRA_SOC_T30 until
> +	 * we have the fdt set up when it may change to
> +	 * TEGRA_SOC_T30_408MHZ depending on what we set PLLP to.
> +	 */

I'm not sure what the FDT has to do with it? Certainly at this point in
the series, I doubt that comment is accurate. Do we actually reprogram
the PLLs based on FDT later?

> +	if (clock_get_rate(CLOCK_ID_PERIPH) == 408000000)
> +		return TEGRA_SOC_T30_408MHZ;
> +	else
> +		return TEGRA_SOC_T30;

I thought we'd decided not to support one of those options, but perhaps
it's used somewhere...

> +static void adjust_pllp_out_freqs(void)
> +{
> +	struct clk_rst_ctlr *clkrst = (struct clk_rst_ctlr *)NV_PA_CLK_RST_BASE;
> +	struct clk_pll *pll = &clkrst->crc_pll[CLOCK_ID_PERIPH];
> +	u32 reg;
> +
> +	/* Set T30 PLLP_OUT1, 2, 3 & 4 freqs to 9.6, 48, 102 & 204MHz */
> +	reg = readl(&pll->pll_out[0]);     /* OUTA, contains OUT2 / OUT1 */
> +	reg |= (IN_408_OUT_48_DIVISOR << PLLP_OUT2_RATIO) | PLLP_OUT2_OVR
> +		| (IN_408_OUT_9_6_DIVISOR << PLLP_OUT1_RATIO) | PLLP_OUT1_OVR;
> +	writel(reg, &pll->pll_out[0]);
> +
> +	reg = readl(&pll->pll_out[1]);   /* OUTB, contains OUT4 / OUT3 */
> +	reg |= (IN_408_OUT_204_DIVISOR << PLLP_OUT4_RATIO) | PLLP_OUT4_OVR
> +		| (IN_408_OUT_102_DIVISOR << PLLP_OUT3_RATIO) | PLLP_OUT3_OVR;
> +	writel(reg, &pll->pll_out[1]);
> +}

I don't think that code is ever used, since init_pllx() below is only
ever called with slow==1, so never calls it.

> +void tegra_i2c_ll_write_addr(uint addr, uint config)
> +{
> +	struct i2c_ctlr *reg = (struct i2c_ctlr *)TEGRA_DVC_BASE;
> +
> +	writel(addr, &reg->cmd_addr0);
> +	writel(config, &reg->cnfg);
> +}
> +
> +void tegra_i2c_ll_write_data(uint data, uint config)
> +{
> +	struct i2c_ctlr *reg = (struct i2c_ctlr *)TEGRA_DVC_BASE;
> +
> +	writel(data, &reg->cmd_data1);
> +	writel(config, &reg->cnfg);
> +}
> +
> +static void enable_cpu_power_rail(void)
> +{
> +	struct pmc_ctlr *pmc = (struct pmc_ctlr *)NV_PA_PMC_BASE;
> +	u32 reg;
> +
> +	debug("enable_cpu_power_rail entry\n");
> +	reg = readl(&pmc->pmc_cntrl);
> +	reg |= CPUPWRREQ_OE;
> +	writel(reg, &pmc->pmc_cntrl);
> +
> +	/*
> +	 * Pulse PWRREQ via I2C.  We need to find out what this is
> +	 * doing, tidy up the code and maybe find a better place for it.
> +	 */
> +	tegra_i2c_ll_write_addr(0x005a, 0x0002);
> +	tegra_i2c_ll_write_data(0x2328, 0x0a02);

With a TPS65911x attached to the DVC I2C bus, that sets VDD to 1.4V (I
assume that's both the VDD1 and VDD2 outputs, but the PMIC datasheet
isn't too clear on that at a quick glance), and:

> +	udelay(1000);
> +	tegra_i2c_ll_write_data(0x0127, 0x0a02);

... and this enables the VDD regulator.

So, this is:
a) Entirely specific to a TPS65911x regulator. I think this warrants a
very large FIXME comment.
b) Nothing to do with pulsing PWRREQ via I2C.
c) Really should be done via programming the EEPROM on the PMIC so that
SW doesn't have to do this, but I guess that didn't happen.
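
As a rough illustration of what named constants for these writes might look like, here is a sketch. All macro and function names are hypothetical. It assumes that each 16-bit word carries the PMIC register number in the low byte (sent first) and the register value in the high byte, and that 0x5a is a 7-bit slave address of 0x2d shifted left by one. Neither assumption is confirmed in this thread.

```c
#include <stdint.h>

/* Hypothetical names; the numeric values are the magic numbers from the patch. */
#define PMU_I2C_ADDR_8BIT	0x5a	/* assumed to be 0x2d << 1 */
#define PMU_REG_VDD_OP		0x28	/* assumed: voltage select register */
#define PMU_REG_VDD_CTRL	0x27	/* assumed: regulator control register */
#define PMU_VDD_SEL		0x23	/* voltage selector value */
#define PMU_VDD_ENABLE		0x01	/* regulator enable value */

/*
 * Pack one register write into the word handed to the data register:
 * register number in the low byte, value in the high byte (assumed order).
 */
static uint32_t pmu_pack(uint8_t reg, uint8_t val)
{
	return ((uint32_t)val << 8) | reg;
}
```

With these definitions, `pmu_pack(PMU_REG_VDD_OP, PMU_VDD_SEL)` reproduces 0x2328 and `pmu_pack(PMU_REG_VDD_CTRL, PMU_VDD_ENABLE)` reproduces 0x0127, so the two calls in the patch could name what they write instead of passing bare constants.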

> +static void reset_A9_cpu(int reset)
> +{
> +	/*
> +	* NOTE:  Regardless of whether the request is to hold the CPU in reset
> +	*        or take it out of reset, every processor in the CPU complex
> +	*        except the master (CPU 0) will be held in reset because the
> +	*        AVP only talks to the master. The AVP does not know that there
> +	*        are multiple processors in the CPU complex.
> +	*/

At least the last sentence there is false; this code is running on the
AVP and there's explicit code above that determines the number of CPUs
in the main CPU complex (get_num_cpus) and sets separate reset bits for
each of them (enable_cpu_clock). Oh, and ~7 lines below, there's a loop
over num_cpus:

> +	int mask = crc_rst_cpu | crc_rst_de | crc_rst_debug;
> +	int num_cpus = get_num_cpus();
> +	int cpu;
> +
> +	debug("reset_a9_cpu entry\n");
> +	/* Hold CPUs 1 onwards in reset, and CPU 0 if asked */
> +	for (cpu = 1; cpu < num_cpus; cpu++)
> +		reset_cmplx_set_enable(cpu, mask, 1);
> +	reset_cmplx_set_enable(0, mask, reset);
Tom Warren - Oct. 3, 2012, 8:15 p.m.
Stephen,

On Wed, Oct 3, 2012 at 11:23 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 10/02/2012 04:45 PM, Tom Warren wrote:
>> This provides SPL support for T30 boards - AVP early init, plus
>> CPU (A9) init/jump to main U-Boot.
>
>> diff --git a/arch/arm/cpu/arm720t/tegra30/cpu.c b/arch/arm/cpu/arm720t/tegra30/cpu.c
>
>> +/*
>> + * Timing tables for each SOC for all four oscillator options.
>> + */
>> +static struct clk_pll_table tegra_pll_x_table[TEGRA_SOC_COUNT]
>> +                                             [CLOCK_OSC_FREQ_COUNT] = {
>> +     /* T20: 1 GHz */
>
> This is odd; it's a Tegra30-specific file, yet has data tables for
> Tegra20 too, and various code that only makes sense to differentiate
> Tegra20 and Tegra30 at runtime.

I'd meant to pull this out, or commonize the cpu.c files in arm720t,
but forgot about it.

I'll move common code to arm720t/tegra/cpu.c and leave HW-specific
stuff in tegra[23]0/cpu.c on the next rev.

>
> Either everything in this file should be Tegra30-specific, or whatever
> new code is being added here should be added to a file in tegra-common
> if it needs to handle both SoCs.
>
>> +static int get_chip_type(void)
>> +{
>> +     /*
>> +      * T30 has two options. We will return TEGRA_SOC_T30 until
>> +      * we have the fdt set up when it may change to
>> +      * TEGRA_SOC_T30_408MHZ depending on what we set PLLP to.
>> +      */
>
> I'm not sure what the FDT has to do with it? Certainly at this point in
> the series, I doubt that comment is accurate. Do we actually reprogram
> the PLLs based on FDT later?

Most of this file was brought in during a merge with our internal T30
U-Boot repo, so most of these comments aren't from me, but from the
folks that brought up T30 in-house. I chose to start at the slower
PLLX rate (216?) for bringup, and planned to phase in 408MHz support
later, once I was sure of a stable SW base.

>
>> +     if (clock_get_rate(CLOCK_ID_PERIPH) == 408000000)
>> +             return TEGRA_SOC_T30_408MHZ;
>> +     else
>> +             return TEGRA_SOC_T30;
>
> I thought we'd decided not to support one of those options, but perhaps
> it's used somewhere...

See above. Once 408MHz is ported/working, I can remove the slower
clock rate entirely.

>
>> +static void adjust_pllp_out_freqs(void)
>> +{
>> +     struct clk_rst_ctlr *clkrst = (struct clk_rst_ctlr *)NV_PA_CLK_RST_BASE;
>> +     struct clk_pll *pll = &clkrst->crc_pll[CLOCK_ID_PERIPH];
>> +     u32 reg;
>> +
>> +     /* Set T30 PLLP_OUT1, 2, 3 & 4 freqs to 9.6, 48, 102 & 204MHz */
>> +     reg = readl(&pll->pll_out[0]);     /* OUTA, contains OUT2 / OUT1 */
>> +     reg |= (IN_408_OUT_48_DIVISOR << PLLP_OUT2_RATIO) | PLLP_OUT2_OVR
>> +             | (IN_408_OUT_9_6_DIVISOR << PLLP_OUT1_RATIO) | PLLP_OUT1_OVR;
>> +     writel(reg, &pll->pll_out[0]);
>> +
>> +     reg = readl(&pll->pll_out[1]);   /* OUTB, contains OUT4 / OUT3 */
>> +     reg |= (IN_408_OUT_204_DIVISOR << PLLP_OUT4_RATIO) | PLLP_OUT4_OVR
>> +             | (IN_408_OUT_102_DIVISOR << PLLP_OUT3_RATIO) | PLLP_OUT3_OVR;
>> +     writel(reg, &pll->pll_out[1]);
>> +}
>
> I don't think that code is ever used, since init_pllx() below is only
> ever called with slow==1, so never calls it.

Yeah, this'll be used later to get to 408MHz. I'd planned to use it
but didn't get around to it. I can remove it in v2, or try for full
speed (408MHz) instead. Depends on bandwidth & priorities.
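
For reference, the divider arithmetic behind the 9.6, 48, 102 and 204MHz targets can be sketched as below. This assumes the PLLP output ratio fields use a 7.1 fixed-point encoding, where the output is the input divided by (1 + ratio / 2). The function name is hypothetical, and the IN_408_OUT_*_DIVISOR values are not shown in this thread, so whether they match is not confirmed here.

```c
#include <stdint.h>

/*
 * Assumed encoding: out = in / (1 + ratio / 2), hence
 * ratio = 2 * in / out - 2. Rates are in kHz so 9.6 MHz stays integral.
 */
static uint32_t pllp_out_ratio(uint32_t in_khz, uint32_t out_khz)
{
	return 2 * in_khz / out_khz - 2;
}
```

Every target in the patch comment divides 2 * 408MHz evenly, so the integer division loses nothing for these four outputs.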

>
>> +void tegra_i2c_ll_write_addr(uint addr, uint config)
>> +{
>> +     struct i2c_ctlr *reg = (struct i2c_ctlr *)TEGRA_DVC_BASE;
>> +
>> +     writel(addr, &reg->cmd_addr0);
>> +     writel(config, &reg->cnfg);
>> +}
>> +
>> +void tegra_i2c_ll_write_data(uint data, uint config)
>> +{
>> +     struct i2c_ctlr *reg = (struct i2c_ctlr *)TEGRA_DVC_BASE;
>> +
>> +     writel(data, &reg->cmd_data1);
>> +     writel(config, &reg->cnfg);
>> +}
>> +
>> +static void enable_cpu_power_rail(void)
>> +{
>> +     struct pmc_ctlr *pmc = (struct pmc_ctlr *)NV_PA_PMC_BASE;
>> +     u32 reg;
>> +
>> +     debug("enable_cpu_power_rail entry\n");
>> +     reg = readl(&pmc->pmc_cntrl);
>> +     reg |= CPUPWRREQ_OE;
>> +     writel(reg, &pmc->pmc_cntrl);
>> +
>> +     /*
>> +      * Pulse PWRREQ via I2C.  We need to find out what this is
>> +      * doing, tidy up the code and maybe find a better place for it.
>> +      */
>> +     tegra_i2c_ll_write_addr(0x005a, 0x0002);
>> +     tegra_i2c_ll_write_data(0x2328, 0x0a02);
>
> With a TPS65911x attached to the DVC I2C bus, that sets VDD to 1.4V (I
> assume that's both the VDD1 and VDD2 outputs, but the PMIC datasheet
> isn't too clear on that at a quick glance), and:
>
>> +     udelay(1000);
>> +     tegra_i2c_ll_write_data(0x0127, 0x0a02);
>
> ... and this enables the VDD regulator.
>
> So, this is:
> a) Entirely specific to a TPS65911x regulator. I think this warrants a
> very large FIXME comment.
> b) Nothing to do with pulsing PWRREQ via I2C.
> c) Really should be done via programming the EEPROM on the PMIC so that
> SW doesn't have to do this, but I guess that didn't happen.

Again, this comes from our internal repo & wasn't done by me, so I had
no knowledge of what exactly it was doing. But removing it resulted in
a hung system, so I had to leave it in. Your comments are helpful -
I'll revise this in v2 with new comments and better reg/data defines,
etc.

>
>> +static void reset_A9_cpu(int reset)
>> +{
>> +     /*
>> +     * NOTE:  Regardless of whether the request is to hold the CPU in reset
>> +     *        or take it out of reset, every processor in the CPU complex
>> +     *        except the master (CPU 0) will be held in reset because the
>> +     *        AVP only talks to the master. The AVP does not know that there
>> +     *        are multiple processors in the CPU complex.
>> +     */
>
> At least the last sentence there is false; this code is running on the
> AVP and there's explicit code above that determines the number of CPUs
> in the main CPU complex (get_num_cpus) and sets separate reset bits for
> each of them (enable_cpu_clock). Oh, and ~7 lines below, there's a loop
> over num_cpus:

Again, a comment from a previous dev, not me. I think they're just
saying that the AVP is only working on 1 CPU, and doesn't do full init
of the other ones, since they aren't needed for U-Boot to
find/fetch a kernel, etc.
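
To make the behaviour of the quoted loop concrete, here is a small model of it with no hardware access. The function name and the bitmask return value are hypothetical; only the loop structure mirrors the patch.

```c
#include <stdint.h>

/*
 * Model of the loop in reset_A9_cpu(): returns a bitmask with bit N set
 * when CPU N ends up held in reset.
 */
static uint32_t cpus_held_in_reset(int num_cpus, int reset_cpu0)
{
	uint32_t held = 0;
	int cpu;

	/* CPUs 1 onwards are always held in reset. */
	for (cpu = 1; cpu < num_cpus; cpu++)
		held |= 1u << cpu;

	/* CPU 0 follows the caller's request. */
	if (reset_cpu0)
		held |= 1u << 0;

	return held;
}
```

The loop bound depends on the CPU count, so the code does account for every core in the complex even though only CPU 0 is ever released.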

>
>> +     int mask = crc_rst_cpu | crc_rst_de | crc_rst_debug;
>> +     int num_cpus = get_num_cpus();
>> +     int cpu;
>> +
>> +     debug("reset_a9_cpu entry\n");
>> +     /* Hold CPUs 1 onwards in reset, and CPU 0 if asked */
>> +     for (cpu = 1; cpu < num_cpus; cpu++)
>> +             reset_cmplx_set_enable(cpu, mask, 1);
>> +     reset_cmplx_set_enable(0, mask, reset);

Thanks for the thorough review,

Tom
Simon Glass - Oct. 4, 2012, 12:57 a.m.
Hi Tom,

On Tue, Oct 2, 2012 at 3:45 PM, Tom Warren <twarren.nvidia@gmail.com> wrote:
> This provides SPL support for T30 boards - AVP early init, plus
> CPU (A9) init/jump to main U-Boot.
>
> Signed-off-by: Tom Warren <twarren@nvidia.com>
> ---
>  arch/arm/cpu/arm720t/tegra-common/cpu.h            |   48 +--
>  arch/arm/cpu/arm720t/tegra-common/spl.c            |    3 +-
>  .../arm/cpu/arm720t/tegra30}/Makefile              |   20 +-
>  arch/arm/cpu/arm720t/tegra30/config.mk             |   19 +
>  arch/arm/cpu/arm720t/tegra30/cpu.c                 |  516 ++++++++++++++++++++
>  5 files changed, 554 insertions(+), 52 deletions(-)
>  copy {board/compal/paz00 => arch/arm/cpu/arm720t/tegra30}/Makefile (68%)
>  create mode 100644 arch/arm/cpu/arm720t/tegra30/config.mk
>  create mode 100644 arch/arm/cpu/arm720t/tegra30/cpu.c

I just have a few additional comments.
>
> diff --git a/arch/arm/cpu/arm720t/tegra-common/cpu.h b/arch/arm/cpu/arm720t/tegra-common/cpu.h
> index 6804cd7..0d9a3c2 100644


> diff --git a/arch/arm/cpu/arm720t/tegra30/cpu.c b/arch/arm/cpu/arm720t/tegra30/cpu.c
> new file mode 100644
> index 0000000..e0821ef
> --- /dev/null
> +++ b/arch/arm/cpu/arm720t/tegra30/cpu.c
> @@ -0,0 +1,516 @@
> +/*
> + * Copyright (c) 2010-2012, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <common.h>
> +#include <asm/io.h>
> +#include <asm/arch/clock.h>
> +#include <asm/arch/flow.h>
> +#include <asm/arch/pinmux.h>
> +#include <asm/arch/tegra.h>
> +#include <asm/arch-tegra/clk_rst.h>
> +#include <asm/arch-tegra/fuse.h>
> +#include <asm/arch-tegra/pmc.h>
> +#include <asm/arch-tegra/scu.h>
> +#include <asm/arch-tegra/tegra_i2c.h>
> +#include "../tegra-common/cpu.h"
> +
> +struct clk_pll_table {
> +       u16             n;
> +       u16             m;
> +       u8              p;
> +       u8              cpcon;
> +};
> +
> +/* ~0=uninitialized/unknown, 0=false, 1=true */
> +uint32_t is_tegra_processor_reset = 0xffffffff;
> +
> +/*
> + * Timing tables for each SOC for all four oscillator options.
> + */
> +static struct clk_pll_table tegra_pll_x_table[TEGRA_SOC_COUNT]
> +                                               [CLOCK_OSC_FREQ_COUNT] = {
> +       /* T20: 1 GHz */
> +       {{ 1000, 13, 0, 12},    /* OSC 13M */
> +        { 625,  12, 0, 8},     /* OSC 19.2M */
> +        { 1000, 12, 0, 12},    /* OSC 12M */
> +        { 1000, 26, 0, 12},    /* OSC 26M */
> +       },
> +
> +       /* T25: 1.2 GHz */
> +       {{ 923, 10, 0, 12},
> +        { 750, 12, 0, 8},
> +        { 600,  6, 0, 12},
> +        { 600, 13, 0, 12},
> +       },
> +
> +       /* T30(slow): 1.0 GHz */
> +       {{ 1000,  13, 0, 8},
> +        { 625,  12, 0, 4},
> +        { 1000, 12, 0, 8},
> +        { 1000,  26, 0, 8},
> +       },
> +
> +       /* T30(high): 1.4 GHz */
> +       {{ 862, 8, 0, 8},
> +        { 583, 8, 0, 4},
> +        { 700, 6, 0, 8},
> +        { 700, 13, 0, 8},
> +       },
> +
> +       /* TEGRA_SOC2_SLOW: 312 MHz */
> +       {{ 312, 13, 0, 12},     /* OSC 13M */
> +        { 260, 16, 0, 8},      /* OSC 19.2M */
> +        { 312, 12, 0, 12},     /* OSC 12M */
> +        { 312, 26, 0, 12},     /* OSC 26M */
> +       },
> +};
> +
> +enum tegra_family_t {
> +       TEGRA_FAMILY_T2x,
> +       TEGRA_FAMILY_T3x,
> +};

This is fine here since the function that uses it is static. I wonder
if we want to export that function one day?

> +
> +static int get_chip_type(void)
> +{
> +       /*
> +        * T30 has two options. We will return TEGRA_SOC_T30 until
> +        * we have the fdt set up when it may change to
> +        * TEGRA_SOC_T30_408MHZ depending on what we set PLLP to.
> +        */
> +       if (clock_get_rate(CLOCK_ID_PERIPH) == 408000000)
> +               return TEGRA_SOC_T30_408MHZ;
> +       else
> +               return TEGRA_SOC_T30;
> +}
> +
> +static enum tegra_family_t get_family(void)
> +{
> +       u32 reg, chip_id;
> +
> +       debug("tegra_get_family entry\n");
> +       reg = readl(NV_PA_APB_MISC_BASE + GP_HIDREV);
> +
> +       chip_id = reg >> 8;
> +       chip_id &= 0xff;
> +       debug("  tegra_get_family: chip_id = %x\n", chip_id);
> +       if (chip_id == 0x30)
> +               return TEGRA_FAMILY_T3x;
> +       else
> +               return TEGRA_FAMILY_T2x;
> +}
> +
> +int get_num_cpus(void)
> +{
> +       return get_family() == TEGRA_FAMILY_T3x ? 4 : 2;
> +}
> +
> +static void adjust_pllp_out_freqs(void)
> +{
> +       struct clk_rst_ctlr *clkrst = (struct clk_rst_ctlr *)NV_PA_CLK_RST_BASE;
> +       struct clk_pll *pll = &clkrst->crc_pll[CLOCK_ID_PERIPH];
> +       u32 reg;
> +
> +       /* Set T30 PLLP_OUT1, 2, 3 & 4 freqs to 9.6, 48, 102 & 204MHz */
> +       reg = readl(&pll->pll_out[0]);     /* OUTA, contains OUT2 / OUT1 */
> +       reg |= (IN_408_OUT_48_DIVISOR << PLLP_OUT2_RATIO) | PLLP_OUT2_OVR
> +               | (IN_408_OUT_9_6_DIVISOR << PLLP_OUT1_RATIO) | PLLP_OUT1_OVR;
> +       writel(reg, &pll->pll_out[0]);
> +
> +       reg = readl(&pll->pll_out[1]);   /* OUTB, contains OUT4 / OUT3 */
> +       reg |= (IN_408_OUT_204_DIVISOR << PLLP_OUT4_RATIO) | PLLP_OUT4_OVR
> +               | (IN_408_OUT_102_DIVISOR << PLLP_OUT3_RATIO) | PLLP_OUT3_OVR;
> +       writel(reg, &pll->pll_out[1]);
> +}
> +
> +static int pllx_set_rate(struct clk_pll_simple *pll , u32 divn, u32 divm,
> +               u32 divp, u32 cpcon)

From this function on I see quite a bit of similarity with the tegra20
version. IMO the number of cpus can just be the result of
get_num_cpus(), most of the clocks are the same, and differences can
either be in this file or behind a conditional check for the family.
Otherwise we have duplicated code here.
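
A sketch of how the family check could be kept as a pure decode step that shared code branches on. The bit positions, the 0x30 chip ID, and the CPU counts come from get_family() and get_num_cpus() in the patch; the function and enum names are hypothetical, and the register read is replaced by a parameter so the logic can be exercised off-target.

```c
#include <stdint.h>

enum tegra_family { FAMILY_T2X, FAMILY_T3X };

/* Decode the chip ID from bits 15:8 of the HIDREV value, as in the patch. */
static enum tegra_family family_from_hidrev(uint32_t hidrev)
{
	uint32_t chip_id = (hidrev >> 8) & 0xff;

	return chip_id == 0x30 ? FAMILY_T3X : FAMILY_T2X;
}

/* CPU count per family, matching get_num_cpus() in the patch. */
static int num_cpus_for(enum tegra_family family)
{
	return family == FAMILY_T3X ? 4 : 2;
}
```

Note that, as in the patch, any chip ID other than 0x30 falls through to the T2x family.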

Regards,
Simon
Simon Glass - Oct. 4, 2012, 1:11 a.m.
Hi Stephen,

On Wed, Oct 3, 2012 at 11:23 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 10/02/2012 04:45 PM, Tom Warren wrote:
>> This provides SPL support for T30 boards - AVP early init, plus
>> CPU (A9) init/jump to main U-Boot.
>
>> diff --git a/arch/arm/cpu/arm720t/tegra30/cpu.c b/arch/arm/cpu/arm720t/tegra30/cpu.c

>> +static int get_chip_type(void)
>> +{
>> +     /*
>> +      * T30 has two options. We will return TEGRA_SOC_T30 until
>> +      * we have the fdt set up when it may change to
>> +      * TEGRA_SOC_T30_408MHZ depending on what we set PLLP to.
>> +      */
>
> I'm not sure what the FDT has to do with it? Certainly at this point in
> the series, I doubt that comment is accurate. Do we actually reprogram
> the PLLs based on FDT later?

Yes, and that changes the SOC here. It's a bit odd but the AVP
couldn't know what speed we wanted so we updated it later when the
Cortex-A9 was running.

>
>> +     if (clock_get_rate(CLOCK_ID_PERIPH) == 408000000)
>> +             return TEGRA_SOC_T30_408MHZ;
>> +     else
>> +             return TEGRA_SOC_T30;
>
> I thought we'd decided not to support one of those options, but perhaps
> it's used somewhere...

Yes, in fact we started using 408MHz exclusively. Not sure if that is
possible upstream - unless anyone knows for sure I suggest we leave it
as is and remove it later if no one needs it.

>
>> +static void adjust_pllp_out_freqs(void)
>> +{
>> +     struct clk_rst_ctlr *clkrst = (struct clk_rst_ctlr *)NV_PA_CLK_RST_BASE;
>> +     struct clk_pll *pll = &clkrst->crc_pll[CLOCK_ID_PERIPH];
>> +     u32 reg;
>> +
>> +     /* Set T30 PLLP_OUT1, 2, 3 & 4 freqs to 9.6, 48, 102 & 204MHz */
>> +     reg = readl(&pll->pll_out[0]);     /* OUTA, contains OUT2 / OUT1 */
>> +     reg |= (IN_408_OUT_48_DIVISOR << PLLP_OUT2_RATIO) | PLLP_OUT2_OVR
>> +             | (IN_408_OUT_9_6_DIVISOR << PLLP_OUT1_RATIO) | PLLP_OUT1_OVR;
>> +     writel(reg, &pll->pll_out[0]);
>> +
>> +     reg = readl(&pll->pll_out[1]);   /* OUTB, contains OUT4 / OUT3 */
>> +     reg |= (IN_408_OUT_204_DIVISOR << PLLP_OUT4_RATIO) | PLLP_OUT4_OVR
>> +             | (IN_408_OUT_102_DIVISOR << PLLP_OUT3_RATIO) | PLLP_OUT3_OVR;
>> +     writel(reg, &pll->pll_out[1]);
>> +}
>
> I don't think that code is ever used, since init_pllx() below is only
> ever called with slow==1, so never calls it.

This probably came from this function that we had in
arch/arm/cpu/armv7/tegra-common/board.c:

void arch_full_speed(void)
{
	ap20_init_pllx(0);
	debug("CPU at %d\n", clock_get_rate(CLOCK_ID_XCPU));
	debug("Memory at %d\n", clock_get_rate(CLOCK_ID_MEMORY));
	debug("Periph at %d\n", clock_get_rate(CLOCK_ID_PERIPH));
}

It was called from board/nvidia/common/board.

If you like, take a look at the chromeos-v2011.06 branch of U-Boot for
anything about this.

>
>> +void tegra_i2c_ll_write_addr(uint addr, uint config)
>> +{
>> +     struct i2c_ctlr *reg = (struct i2c_ctlr *)TEGRA_DVC_BASE;
>> +
>> +     writel(addr, &reg->cmd_addr0);
>> +     writel(config, &reg->cnfg);
>> +}
>> +
>> +void tegra_i2c_ll_write_data(uint data, uint config)
>> +{
>> +     struct i2c_ctlr *reg = (struct i2c_ctlr *)TEGRA_DVC_BASE;
>> +
>> +     writel(data, &reg->cmd_data1);
>> +     writel(config, &reg->cnfg);
>> +}

Well this is a lot better than the ugliness I had. I don't think you
can put this into the i2c driver since the AVP doesn't have it, right?

>> +
>> +static void enable_cpu_power_rail(void)
>> +{
>> +     struct pmc_ctlr *pmc = (struct pmc_ctlr *)NV_PA_PMC_BASE;
>> +     u32 reg;
>> +
>> +     debug("enable_cpu_power_rail entry\n");
>> +     reg = readl(&pmc->pmc_cntrl);
>> +     reg |= CPUPWRREQ_OE;
>> +     writel(reg, &pmc->pmc_cntrl);
>> +
>> +     /*
>> +      * Pulse PWRREQ via I2C.  We need to find out what this is
>> +      * doing, tidy up the code and maybe find a better place for it.
>> +      */
>> +     tegra_i2c_ll_write_addr(0x005a, 0x0002);
>> +     tegra_i2c_ll_write_data(0x2328, 0x0a02);
>
> With a TPS65911x attached to the DVC I2C bus, that sets VDD to 1.4V (I
> assume that's both the VDD1 and VDD2 outputs, but the PMIC datasheet
> isn't too clear on that at a quick glance), and:
>
>> +     udelay(1000);
>> +     tegra_i2c_ll_write_data(0x0127, 0x0a02);
>
> ... and this enables the VDD regulator.
>
> So, this is:
> a) Entirely specific to a TPS65911x regulator. I think this warrants a
> very large FIXME comment.
> b) Nothing to do with pulsing PWRREQ via I2C.
> c) Really should be done via programming the EEPROM on the PMIC so that
> SW doesn't have to do this, but I guess that didn't happen.
>
>> +static void reset_A9_cpu(int reset)
>> +{
>> +     /*
>> +     * NOTE:  Regardless of whether the request is to hold the CPU in reset
>> +     *        or take it out of reset, every processor in the CPU complex
>> +     *        except the master (CPU 0) will be held in reset because the
>> +     *        AVP only talks to the master. The AVP does not know that there
>> +     *        are multiple processors in the CPU complex.
>> +     */
>
> At least the last sentence there is false; this code is running on the
> AVP and there's explicit code above that determines the number of CPUs
> in the main CPU complex (get_num_cpus) and sets separate reset bits for
> each of them (enable_cpu_clock). Oh, and ~7 lines below, there's a loop
> over num_cpus:
>
>> +     int mask = crc_rst_cpu | crc_rst_de | crc_rst_debug;
>> +     int num_cpus = get_num_cpus();
>> +     int cpu;
>> +
>> +     debug("reset_a9_cpu entry\n");
>> +     /* Hold CPUs 1 onwards in reset, and CPU 0 if asked */
>> +     for (cpu = 1; cpu < num_cpus; cpu++)
>> +             reset_cmplx_set_enable(cpu, mask, 1);
>> +     reset_cmplx_set_enable(0, mask, reset);

Regards,
Simon

Patch

diff --git a/arch/arm/cpu/arm720t/tegra-common/cpu.h b/arch/arm/cpu/arm720t/tegra-common/cpu.h
index 6804cd7..0d9a3c2 100644
--- a/arch/arm/cpu/arm720t/tegra-common/cpu.h
+++ b/arch/arm/cpu/arm720t/tegra-common/cpu.h
@@ -44,50 +44,11 @@ 
 
 #define CORESIGHT_UNLOCK	0xC5ACCE55;
 
-/* AP20-Specific Base Addresses */
-
-/* AP20 Base physical address of SDRAM. */
-#define AP20_BASE_PA_SDRAM      0x00000000
-/* AP20 Base physical address of internal SRAM. */
-#define AP20_BASE_PA_SRAM       0x40000000
-/* AP20 Size of internal SRAM (256KB). */
-#define AP20_BASE_PA_SRAM_SIZE  0x00040000
-/* AP20 Base physical address of flash. */
-#define AP20_BASE_PA_NOR_FLASH  0xD0000000
-/* AP20 Base physical address of boot information table. */
-#define AP20_BASE_PA_BOOT_INFO  AP20_BASE_PA_SRAM
-
-/*
- * Super-temporary stacks for EXTREMELY early startup. The values chosen for
- * these addresses must be valid on ALL SOCs because this value is used before
- * we are able to differentiate between the SOC types.
- *
- * NOTE: The since CPU's stack will eventually be moved from IRAM to SDRAM, its
- *       stack is placed below the AVP stack. Once the CPU stack has been moved,
- *       the AVP is free to use the IRAM the CPU stack previously occupied if
- *       it should need to do so.
- *
- * NOTE: In multi-processor CPU complex configurations, each processor will have
- *       its own stack of size CPU_EARLY_BOOT_STACK_SIZE. CPU 0 will have a
- *       limit of CPU_EARLY_BOOT_STACK_LIMIT. Each successive CPU will have a
- *       stack limit that is CPU_EARLY_BOOT_STACK_SIZE less then the previous
- *       CPU.
- */
-
-/* Common AVP early boot stack limit */
-#define AVP_EARLY_BOOT_STACK_LIMIT	\
-	(AP20_BASE_PA_SRAM + (AP20_BASE_PA_SRAM_SIZE/2))
-/* Common AVP early boot stack size */
-#define AVP_EARLY_BOOT_STACK_SIZE	0x1000
-/* Common CPU early boot stack limit */
-#define CPU_EARLY_BOOT_STACK_LIMIT	\
-	(AVP_EARLY_BOOT_STACK_LIMIT - AVP_EARLY_BOOT_STACK_SIZE)
-/* Common CPU early boot stack size */
-#define CPU_EARLY_BOOT_STACK_SIZE	0x1000
-
 #define EXCEP_VECTOR_CPU_RESET_VECTOR	(NV_PA_EVP_BASE + 0x100)
 #define CSITE_CPU_DBG0_LAR		(NV_PA_CSITE_BASE + 0x10FB0)
 #define CSITE_CPU_DBG1_LAR		(NV_PA_CSITE_BASE + 0x12FB0)
+#define CSITE_CPU_DBG2_LAR		(NV_PA_CSITE_BASE + 0x14FB0)
+#define CSITE_CPU_DBG3_LAR		(NV_PA_CSITE_BASE + 0x16FB0)
 
 #define FLOW_CTLR_HALT_COP_EVENTS	(NV_PA_FLOW_BASE + 4)
 #define FLOW_MODE_STOP			2
@@ -95,6 +56,11 @@ 
 #define HALT_COP_EVENT_IRQ_1		(1 << 11)
 #define HALT_COP_EVENT_FIQ_1		(1 << 9)
 
+#define FLOW_MODE_NONE		0
+
+#define SIMPLE_PLLX     (CLOCK_ID_XCPU - CLOCK_ID_FIRST_SIMPLE)
+#define GP_HIDREV	0x804
+
 void start_cpu(u32 reset_vector);
 int ap20_cpu_is_cortexa9(void);
 void halt_avp(void)  __attribute__ ((noreturn));
diff --git a/arch/arm/cpu/arm720t/tegra-common/spl.c b/arch/arm/cpu/arm720t/tegra-common/spl.c
index 0d37ce8..2e8d9ca 100644
--- a/arch/arm/cpu/arm720t/tegra-common/spl.c
+++ b/arch/arm/cpu/arm720t/tegra-common/spl.c
@@ -33,8 +33,6 @@ 
 #include <image.h>
 #include <malloc.h>
 #include <linux/compiler.h>
-#include "cpu.h"
-
 #include <asm/io.h>
 #include <asm/arch/clock.h>
 #include <asm/arch/pinmux.h>
@@ -44,6 +42,7 @@ 
 #include <asm/arch-tegra/pmc.h>
 #include <asm/arch-tegra/scu.h>
 #include <asm/arch-tegra/sys_proto.h>
+#include "cpu.h"
 
 DECLARE_GLOBAL_DATA_PTR;
 
diff --git a/board/compal/paz00/Makefile b/arch/arm/cpu/arm720t/tegra30/Makefile
similarity index 68%
copy from board/compal/paz00/Makefile
copy to arch/arm/cpu/arm720t/tegra30/Makefile
index 7f7287e..bd96997 100644
--- a/board/compal/paz00/Makefile
+++ b/arch/arm/cpu/arm720t/tegra30/Makefile
@@ -1,8 +1,8 @@ 
 #
 # Copyright (c) 2010-2012, NVIDIA CORPORATION.  All rights reserved.
 #
-# See file CREDITS for list of people who contributed to this
-# project.
+# (C) Copyright 2000-2008
+# Wolfgang Denk, DENX Software Engineering, wd@denx.de.
 #
 # This program is free software; you can redistribute it and/or modify it
 # under the terms and conditions of the GNU General Public License,
@@ -13,20 +13,22 @@ 
 # FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
 # more details.
 #
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
 
 include $(TOPDIR)/config.mk
 
-$(shell mkdir -p $(obj)../../nvidia/common)
+LIB	= $(obj)lib$(SOC).o
 
-LIB	= $(obj)lib$(BOARD).o
+COBJS-y	+= cpu.o
 
-COBJS	:= $(BOARD).o
-COBJS	+= ../../nvidia/common/board.o
+SRCS	:= $(COBJS-y:.o=.c)
+OBJS	:= $(addprefix $(obj),$(COBJS-y))
 
-SRCS	:= $(COBJS:.o=.c)
-OBJS	:= $(addprefix $(obj),$(COBJS))
+all:	$(obj).depend $(LIB)
 
-$(LIB):	$(obj).depend $(OBJS)
+$(LIB):	$(OBJS)
 	$(call cmd_link_o_target, $(OBJS))
 
 #########################################################################
diff --git a/arch/arm/cpu/arm720t/tegra30/config.mk b/arch/arm/cpu/arm720t/tegra30/config.mk
new file mode 100644
index 0000000..2388c56
--- /dev/null
+++ b/arch/arm/cpu/arm720t/tegra30/config.mk
@@ -0,0 +1,19 @@ 
+#
+# Copyright (c) 2010-2012, NVIDIA CORPORATION.  All rights reserved.
+#
+# (C) Copyright 2002
+# Gary Jennejohn, DENX Software Engineering, <garyj@denx.de>
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+# This program is distributed in the hope it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+USE_PRIVATE_LIBGCC = yes
diff --git a/arch/arm/cpu/arm720t/tegra30/cpu.c b/arch/arm/cpu/arm720t/tegra30/cpu.c
new file mode 100644
index 0000000..e0821ef
--- /dev/null
+++ b/arch/arm/cpu/arm720t/tegra30/cpu.c
@@ -0,0 +1,516 @@ 
+/*
+ * Copyright (c) 2010-2012, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <common.h>
+#include <asm/io.h>
+#include <asm/arch/clock.h>
+#include <asm/arch/flow.h>
+#include <asm/arch/pinmux.h>
+#include <asm/arch/tegra.h>
+#include <asm/arch-tegra/clk_rst.h>
+#include <asm/arch-tegra/fuse.h>
+#include <asm/arch-tegra/pmc.h>
+#include <asm/arch-tegra/scu.h>
+#include <asm/arch-tegra/tegra_i2c.h>
+#include "../tegra-common/cpu.h"
+
+struct clk_pll_table {
+	u16		n;
+	u16		m;
+	u8		p;
+	u8		cpcon;
+};
+
+/* ~0=uninitialized/unknown, 0=false, 1=true */
+uint32_t is_tegra_processor_reset = 0xffffffff;
+
+/*
+ * Timing tables for each SOC for all four oscillator options.
+ */
+static struct clk_pll_table tegra_pll_x_table[TEGRA_SOC_COUNT]
+						[CLOCK_OSC_FREQ_COUNT] = {
+	/* T20: 1 GHz */
+	{{ 1000, 13, 0, 12},	/* OSC 13M */
+	 { 625,  12, 0, 8},	/* OSC 19.2M */
+	 { 1000, 12, 0, 12},	/* OSC 12M */
+	 { 1000, 26, 0, 12},	/* OSC 26M */
+	},
+
+	/* T25: 1.2 GHz */
+	{{ 923, 10, 0, 12},
+	 { 750, 12, 0, 8},
+	 { 600,  6, 0, 12},
+	 { 600, 13, 0, 12},
+	},
+
+	/* T30(slow): 1.0 GHz */
+	{{ 1000,  13, 0, 8},
+	 { 625,  12, 0, 4},
+	 { 1000, 12, 0, 8},
+	 { 1000,  26, 0, 8},
+	},
+
+	/* T30(high): 1.4 GHz */
+	{{ 862, 8, 0, 8},
+	 { 583, 8, 0, 4},
+	 { 700, 6, 0, 8},
+	 { 700, 13, 0, 8},
+	},
+
+	/* TEGRA_SOC2_SLOW: 312 MHz */
+	{{ 312, 13, 0, 12},	/* OSC 13M */
+	 { 260, 16, 0, 8},	/* OSC 19.2M */
+	 { 312, 12, 0, 12},	/* OSC 12M */
+	 { 312, 26, 0, 12},	/* OSC 26M */
+	},
+};
+
+enum tegra_family_t {
+	TEGRA_FAMILY_T2x,
+	TEGRA_FAMILY_T3x,
+};
+
+static int get_chip_type(void)
+{
+	/*
+	 * T30 has two options. We will return TEGRA_SOC_T30 until
+	 * we have the fdt set up when it may change to
+	 * TEGRA_SOC_T30_408MHZ depending on what we set PLLP to.
+	 */
+	if (clock_get_rate(CLOCK_ID_PERIPH) == 408000000)
+		return TEGRA_SOC_T30_408MHZ;
+	else
+		return TEGRA_SOC_T30;
+}
+
+static enum tegra_family_t get_family(void)
+{
+	u32 reg, chip_id;
+
+	debug("get_family entry\n");
+	reg = readl(NV_PA_APB_MISC_BASE + GP_HIDREV);
+
+	chip_id = reg >> 8;
+	chip_id &= 0xff;
+	debug("  get_family: chip_id = %x\n", chip_id);
+	if (chip_id == 0x30)
+		return TEGRA_FAMILY_T3x;
+	else
+		return TEGRA_FAMILY_T2x;
+}
+
+int get_num_cpus(void)
+{
+	return get_family() == TEGRA_FAMILY_T3x ? 4 : 2;
+}
+
+static void adjust_pllp_out_freqs(void)
+{
+	struct clk_rst_ctlr *clkrst = (struct clk_rst_ctlr *)NV_PA_CLK_RST_BASE;
+	struct clk_pll *pll = &clkrst->crc_pll[CLOCK_ID_PERIPH];
+	u32 reg;
+
+	/* Set T30 PLLP_OUT1, 2, 3 & 4 freqs to 9.6, 48, 102 & 204MHz */
+	reg = readl(&pll->pll_out[0]);     /* OUTA, contains OUT2 / OUT1 */
+	reg |= (IN_408_OUT_48_DIVISOR << PLLP_OUT2_RATIO) | PLLP_OUT2_OVR
+		| (IN_408_OUT_9_6_DIVISOR << PLLP_OUT1_RATIO) | PLLP_OUT1_OVR;
+	writel(reg, &pll->pll_out[0]);
+
+	reg = readl(&pll->pll_out[1]);   /* OUTB, contains OUT4 / OUT3 */
+	reg |= (IN_408_OUT_204_DIVISOR << PLLP_OUT4_RATIO) | PLLP_OUT4_OVR
+		| (IN_408_OUT_102_DIVISOR << PLLP_OUT3_RATIO) | PLLP_OUT3_OVR;
+	writel(reg, &pll->pll_out[1]);
+}
+
+static int pllx_set_rate(struct clk_pll_simple *pll, u32 divn, u32 divm,
+		u32 divp, u32 cpcon)
+{
+	u32 reg;
+
+	/* If PLLX is already enabled, just return */
+	if (readl(&pll->pll_base) & PLL_ENABLE_MASK) {
+		debug("pllx_set_rate: PLLX already enabled, returning\n");
+		return 0;
+	}
+
+	debug(" pllx_set_rate entry\n");
+
+	/* Set BYPASS, m, n and p to PLLX_BASE */
+	reg = PLL_BYPASS_MASK | (divm << PLL_DIVM_SHIFT);
+	reg |= ((divn << PLL_DIVN_SHIFT) | (divp << PLL_DIVP_SHIFT));
+	writel(reg, &pll->pll_base);
+
+	/* Set cpcon to PLLX_MISC */
+	reg = (cpcon << PLL_CPCON_SHIFT);
+
+	/* Set dccon to PLLX_MISC if freq > 600MHz */
+	if (divn > 600)
+		reg |= (1 << PLL_DCCON_SHIFT);
+	writel(reg, &pll->pll_misc);
+
+	/* Enable PLLX */
+	reg = readl(&pll->pll_base);
+	reg |= PLL_ENABLE_MASK;
+
+	/* Disable BYPASS */
+	reg &= ~PLL_BYPASS_MASK;
+	writel(reg, &pll->pll_base);
+
+	/* Set lock_enable to PLLX_MISC */
+	reg = readl(&pll->pll_misc);
+	reg |= PLL_LOCK_ENABLE_MASK;
+	writel(reg, &pll->pll_misc);
+
+	return 0;
+}
+
+void init_pllx(int slow)
+{
+	struct clk_rst_ctlr *clkrst = (struct clk_rst_ctlr *)NV_PA_CLK_RST_BASE;
+	struct clk_pll_simple *pll = &clkrst->crc_pll_simple[SIMPLE_PLLX];
+	int chip_type;
+	enum clock_osc_freq osc;
+	struct clk_pll_table *sel;
+
+	debug("init_pllx entry\n");
+
+	/* get chip type. If unknown, assign to T30 */
+	chip_type = get_chip_type();
+	debug(" init_pllx: chip_type = %d\n", chip_type);
+
+	/* get osc freq */
+	osc = clock_get_osc_freq();
+	debug("  init_pllx: osc = %d\n", osc);
+
+	/* set pllx */
+	sel = &tegra_pll_x_table[chip_type][osc];
+	pllx_set_rate(pll, sel->n, sel->m, sel->p, sel->cpcon);
+
+	/* once we are out of slow mode, set up the T30 PLLs also */
+	if (!slow && chip_type == TEGRA_SOC_T30_408MHZ) {
+		debug("  init_pllx: adjusting PLLP out freqs\n");
+		adjust_pllp_out_freqs();
+	}
+}
+
+static void enable_cpu_clock(int enable)
+{
+	struct clk_rst_ctlr *clkrst = (struct clk_rst_ctlr *)NV_PA_CLK_RST_BASE;
+	u32 clk;
+
+	debug("enable_cpu_clock entry, enable = %d\n", enable);
+	/*
+	 * NOTE:
+	 * Regardless of whether the request is to enable or disable the CPU
+	 * clock, every processor in the CPU complex except the master (CPU 0)
+	 * will have its clock stopped because the AVP only talks to the
+	 * master. The AVP does not know (nor does it need to know) that there
+	 * are multiple processors in the CPU complex.
+	 */
+
+	if (enable) {
+		/* Initialize PLLX in 'slow' mode */
+		init_pllx(1);
+
+		/* Wait until all clocks are stable */
+		udelay(PLL_STABILIZATION_DELAY);
+
+		writel(CCLK_BURST_POLICY, &clkrst->crc_cclk_brst_pol);
+		writel(SUPER_CCLK_DIVIDER, &clkrst->crc_super_cclk_div);
+	}
+
+	/*
+	 * Read the register containing the individual CPU clock enables and
+	 * always stop the clock to CPUs 1, 2 & 3.
+	 */
+	clk = readl(&clkrst->crc_clk_cpu_cmplx);
+	clk |= 1 << CPU1_CLK_STP_SHIFT;
+	clk |= 1 << CPU2_CLK_STP_SHIFT;
+	clk |= 1 << CPU3_CLK_STP_SHIFT;
+
+	/* Stop/Unstop the CPU clock */
+	clk &= ~CPU0_CLK_STP_MASK;
+	clk |= !enable << CPU0_CLK_STP_SHIFT;
+	writel(clk, &clkrst->crc_clk_cpu_cmplx);
+
+	clock_enable(PERIPH_ID_CPU);
+	debug("enable_cpu_clock exit, enable = %d\n", enable);
+}
+
+static int is_cpu_powered(void)
+{
+	struct pmc_ctlr *pmc = (struct pmc_ctlr *)NV_PA_PMC_BASE;
+
+	debug("is_cpu_powered entry\n");
+	return (readl(&pmc->pmc_pwrgate_status) & CPU_PWRED) ? 1 : 0;
+}
+
+static void remove_cpu_io_clamps(void)
+{
+	struct pmc_ctlr *pmc = (struct pmc_ctlr *)NV_PA_PMC_BASE;
+	u32 reg;
+
+	debug("remove_cpu_io_clamps entry\n");
+	/* Remove the clamps on the CPU I/O signals */
+	reg = readl(&pmc->pmc_remove_clamping);
+	reg |= CPU_CLMP;
+	writel(reg, &pmc->pmc_remove_clamping);
+
+	/* Give I/O signals time to stabilize */
+	udelay(IO_STABILIZATION_DELAY);
+}
+
+static void powerup_cpu(void)
+{
+	struct pmc_ctlr *pmc = (struct pmc_ctlr *)NV_PA_PMC_BASE;
+	u32 reg;
+	int timeout = IO_STABILIZATION_DELAY;
+
+	debug("powerup_cpu entry\n");
+	if (!is_cpu_powered()) {
+		/* Toggle the CPU power state (OFF -> ON) */
+		reg = readl(&pmc->pmc_pwrgate_toggle);
+		reg &= PARTID_CP;
+		reg |= START_CP;
+		writel(reg, &pmc->pmc_pwrgate_toggle);
+
+		/* Wait for the power to come up */
+		while (!is_cpu_powered()) {
+			if (timeout-- == 0)
+				printf("CPU failed to power up!\n");
+			else
+				udelay(10);
+		}
+
+		/*
+		 * Remove the I/O clamps from CPU power partition.
+		 * Recommended only on a Warm boot, if the CPU partition gets
+		 * power gated. Shouldn't cause any harm when called after a
+		 * cold boot according to HW, probably just redundant.
+		 */
+		remove_cpu_io_clamps();
+	}
+}
+
+void tegra_i2c_ll_write_addr(uint addr, uint config)
+{
+	struct i2c_ctlr *reg = (struct i2c_ctlr *)TEGRA_DVC_BASE;
+
+	writel(addr, &reg->cmd_addr0);
+	writel(config, &reg->cnfg);
+}
+
+void tegra_i2c_ll_write_data(uint data, uint config)
+{
+	struct i2c_ctlr *reg = (struct i2c_ctlr *)TEGRA_DVC_BASE;
+
+	writel(data, &reg->cmd_data1);
+	writel(config, &reg->cnfg);
+}
+
+static void enable_cpu_power_rail(void)
+{
+	struct pmc_ctlr *pmc = (struct pmc_ctlr *)NV_PA_PMC_BASE;
+	u32 reg;
+
+	debug("enable_cpu_power_rail entry\n");
+	reg = readl(&pmc->pmc_cntrl);
+	reg |= CPUPWRREQ_OE;
+	writel(reg, &pmc->pmc_cntrl);
+
+	/*
+	 * Pulse PWRREQ via I2C.  We need to find out what this is
+	 * doing, tidy up the code and maybe find a better place for it.
+	 */
+	tegra_i2c_ll_write_addr(0x005a, 0x0002);
+	tegra_i2c_ll_write_data(0x2328, 0x0a02);
+	udelay(1000);
+	tegra_i2c_ll_write_data(0x0127, 0x0a02);
+	udelay(10 * 1000);
+}
+
+static void reset_A9_cpu(int reset)
+{
+	/*
+	 * NOTE:  Regardless of whether the request is to hold the CPU in reset
+	 *        or take it out of reset, every processor in the CPU complex
+	 *        except the master (CPU 0) will be held in reset because the
+	 *        AVP only talks to the master. The AVP does not know that there
+	 *        are multiple processors in the CPU complex.
+	 */
+	int mask = crc_rst_cpu | crc_rst_de | crc_rst_debug;
+	int num_cpus = get_num_cpus();
+	int cpu;
+
+	debug("reset_A9_cpu entry\n");
+	/* Hold CPUs 1 onwards in reset, and CPU 0 if asked */
+	for (cpu = 1; cpu < num_cpus; cpu++)
+		reset_cmplx_set_enable(cpu, mask, 1);
+	reset_cmplx_set_enable(0, mask, reset);
+
+	/* Enable/Disable master CPU reset */
+	reset_set_enable(PERIPH_ID_CPU, reset);
+}
+
+/**
+ * The T30 requires some special clock initialization, including setting up
+ * the dvc i2c, turning on mselect and selecting the G CPU cluster
+ */
+void t30_init_clocks(void)
+{
+	struct clk_rst_ctlr *clkrst =
+			(struct clk_rst_ctlr *)NV_PA_CLK_RST_BASE;
+	struct flow_ctlr *flow = (struct flow_ctlr *)NV_PA_FLOW_BASE;
+	u32 val;
+
+	debug("t30_init_clocks entry\n");
+	/* Set active CPU cluster to G */
+	clrbits_le32(&flow->cluster_control, 1 << 0);
+
+	/*
+	 * Switch system clock to PLLP_OUT4 (108 MHz), AVP will now run
+	 * at 108 MHz. This is glitch free as only the source is changed, no
+	 * special precaution needed.
+	 */
+	val = (SCLK_SOURCE_PLLP_OUT4 << SCLK_SWAKEUP_FIQ_SOURCE_SHIFT) |
+		(SCLK_SOURCE_PLLP_OUT4 << SCLK_SWAKEUP_IRQ_SOURCE_SHIFT) |
+		(SCLK_SOURCE_PLLP_OUT4 << SCLK_SWAKEUP_RUN_SOURCE_SHIFT) |
+		(SCLK_SOURCE_PLLP_OUT4 << SCLK_SWAKEUP_IDLE_SOURCE_SHIFT) |
+		(SCLK_SYS_STATE_RUN << SCLK_SYS_STATE_SHIFT);
+	writel(val, &clkrst->crc_sclk_brst_pol);
+
+	writel(SUPER_SCLK_ENB_MASK, &clkrst->crc_super_sclk_div);
+
+	val = (0 << CLK_SYS_RATE_HCLK_DISABLE_SHIFT) |
+		(1 << CLK_SYS_RATE_AHB_RATE_SHIFT) |
+		(0 << CLK_SYS_RATE_PCLK_DISABLE_SHIFT) |
+		(0 << CLK_SYS_RATE_APB_RATE_SHIFT);
+	writel(val, &clkrst->crc_clk_sys_rate);
+
+	/* Put i2c, mselect in reset and enable clocks */
+	reset_set_enable(PERIPH_ID_DVC_I2C, 1);
+	clock_set_enable(PERIPH_ID_DVC_I2C, 1);
+	reset_set_enable(PERIPH_ID_MSELECT, 1);
+	clock_set_enable(PERIPH_ID_MSELECT, 1);
+
+	/* Switch MSELECT clock to PLLP (00) */
+	clock_ll_set_source(PERIPH_ID_MSELECT, 0);
+
+	/*
+	 * Our high-level clock routines are not available prior to
+	 * relocation. We use the low-level functions which require a
+	 * hard-coded divisor. Use CLK_M with divide by (n + 1 = 17)
+	 */
+	clock_ll_set_source_divisor(PERIPH_ID_DVC_I2C, 3, 16);
+
+	/*
+	 * Give clocks time to stabilize, then take i2c and mselect out of
+	 * reset
+	 */
+	udelay(1000);
+	reset_set_enable(PERIPH_ID_DVC_I2C, 0);
+	reset_set_enable(PERIPH_ID_MSELECT, 0);
+}
+
+static void clock_enable_coresight(int enable)
+{
+	u32 rst, src;
+
+	debug("clock_enable_coresight entry\n");
+	clock_set_enable(PERIPH_ID_CORESIGHT, enable);
+	reset_set_enable(PERIPH_ID_CORESIGHT, !enable);
+
+	if (enable) {
+		/*
+		 * Put CoreSight on PLLP_OUT0 (216 MHz) and divide it down by
+		 *  1.5, giving an effective frequency of 144MHz.
+		 * Set PLLP_OUT0 [bits31:30 = 00], and use a 7.1 divisor
+		 *  (bits 7:0), so 00000001b == 1.5 (n+1 + .5)
+		 *
+		 * Clock divider request for 204MHz would setup CSITE clock as
+		 * 144MHz for PLLP base 216MHz and 204MHz for PLLP base 408MHz
+		 */
+		if (get_chip_type() == TEGRA_SOC_T30_408MHZ)
+			src = CLK_DIVIDER(NVBL_PLLP_KHZ, 204000);
+		else
+			src = CLK_DIVIDER(NVBL_PLLP_KHZ, 144000);
+		clock_ll_set_source_divisor(PERIPH_ID_CSI, 0, src);
+
+		/* Unlock the CPU CoreSight interfaces */
+		rst = CORESIGHT_UNLOCK;
+		writel(rst, CSITE_CPU_DBG0_LAR);
+		writel(rst, CSITE_CPU_DBG1_LAR);
+		writel(rst, CSITE_CPU_DBG2_LAR);
+		writel(rst, CSITE_CPU_DBG3_LAR);
+	}
+}
+
+static void set_cpu_running(int run)
+{
+	struct flow_ctlr *flow = (struct flow_ctlr *)NV_PA_FLOW_BASE;
+
+	debug("set_cpu_running entry, run = %d\n", run);
+	writel(run ? FLOW_MODE_NONE : FLOW_MODE_STOP, &flow->halt_cpu_events);
+}
+
+void start_cpu(u32 reset_vector)
+{
+	debug("start_cpu entry, reset_vector = %x\n", reset_vector);
+	t30_init_clocks();
+
+	/* Enable VDD_CPU */
+	enable_cpu_power_rail();
+
+	set_cpu_running(0);
+
+	/* Hold the CPUs in reset */
+	reset_A9_cpu(1);
+
+	/* Disable the CPU clock */
+	enable_cpu_clock(0);
+
+	/* Enable CoreSight */
+	clock_enable_coresight(1);
+
+	/*
+	 * Set the entry point for CPU execution from reset,
+	 *  if it's a non-zero value.
+	 */
+	if (reset_vector)
+		writel(reset_vector, EXCEP_VECTOR_CPU_RESET_VECTOR);
+
+	/* Enable the CPU clock */
+	enable_cpu_clock(1);
+
+	/* If the CPU doesn't already have power, power it up */
+	powerup_cpu();
+
+	/* Take the CPU out of reset */
+	reset_A9_cpu(0);
+
+	set_cpu_running(1);
+}
+
+
+void halt_avp(void)
+{
+	debug("halt_avp entry\n");
+	for (;;) {
+		writel((HALT_COP_EVENT_JTAG | HALT_COP_EVENT_IRQ_1
+			| HALT_COP_EVENT_FIQ_1 | (FLOW_MODE_STOP << 29)),
+			FLOW_CTLR_HALT_COP_EVENTS);
+	}
+}