From patchwork Wed Feb 23 16:35:20 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyle Moffett X-Patchwork-Id: 84207 X-Patchwork-Delegate: galak@kernel.crashing.org Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from theia.denx.de (theia.denx.de [85.214.87.163]) by ozlabs.org (Postfix) with ESMTP id AF154B712A for ; Thu, 24 Feb 2011 03:35:39 +1100 (EST) Received: from localhost (localhost [127.0.0.1]) by theia.denx.de (Postfix) with ESMTP id 3E9A6281BA; Wed, 23 Feb 2011 17:35:38 +0100 (CET) X-Virus-Scanned: Debian amavisd-new at theia.denx.de Received: from theia.denx.de ([127.0.0.1]) by localhost (theia.denx.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8eYNvE2gJIYW; Wed, 23 Feb 2011 17:35:38 +0100 (CET) Received: from theia.denx.de (localhost [127.0.0.1]) by theia.denx.de (Postfix) with ESMTP id 9535B281BB; Wed, 23 Feb 2011 17:35:35 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by theia.denx.de (Postfix) with ESMTP id D605F281BB for ; Wed, 23 Feb 2011 17:35:32 +0100 (CET) X-Virus-Scanned: Debian amavisd-new at theia.denx.de Received: from theia.denx.de ([127.0.0.1]) by localhost (theia.denx.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Bamsggq8SOJw for ; Wed, 23 Feb 2011 17:35:31 +0100 (CET) X-policyd-weight: NOT_IN_SBL_XBL_SPAMHAUS=-1.5 NOT_IN_SPAMCOP=-1.5 NOT_IN_BL_NJABL=-1.5 (only DNSBL check requested) Received: from border.exmeritus.com (wsip-70-167-241-26.dc.dc.cox.net [70.167.241.26]) by theia.denx.de (Postfix) with ESMTP id 06EB8281BA for ; Wed, 23 Feb 2011 17:35:29 +0100 (CET) Received: from ysera.exmeritus.com (firewall2.exmeritus.com [10.13.38.2]) by border.exmeritus.com (Postfix) with ESMTP id B59E1AC06F; Wed, 23 Feb 2011 11:35:27 -0500 (EST) From: Kyle Moffett To: u-boot@lists.denx.de Date: Wed, 23 Feb 2011 11:35:20 -0500 Message-Id: <1298478920-22044-1-git-send-email-Kyle.D.Moffett@boeing.com> X-Mailer: git-send-email 1.7.2.3 Cc: Kim Phillips , Kumar Gala , Kyle Moffett , Andy Fleming , John Livingston Subject: [U-Boot] [PATCHv2] fsl_ddr: Don't use full 64-bit divides on 32-bit PowerPC X-BeenThere: u-boot@lists.denx.de X-Mailman-Version: 2.1.9 Precedence: list List-Id: U-Boot discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: u-boot-bounces@lists.denx.de Errors-To: u-boot-bounces@lists.denx.de The current FreeScale MPC-8xxx DDR SPD interpreter is using full 64-bit integer divide operations to convert between nanoseconds and DDR clock cycles given arbitrary DDR clock frequencies. Since all of the inputs to this are 32-bit (nanoseconds, clock cycles, and DDR frequencies), we can easily restructure the computation to use the "do_div()" function to perform 64-bit/32-bit divide operations. This decreases compute time rather significantly for each conversion and avoids bringing in a very complicated function from libgcc. It should be noted that nothing else in U-Boot or the Linux kernel seems to require a full 64-bit divide on any 32-bit PowerPC. Build-and-boot-tested on the HWW-1U-1A board using DDR2 SPD detection. Signed-off-by: Kyle Moffett Cc: Andy Fleming Cc: Kumar Gala Cc: Wolfgang Denk Cc: Kim Phillips Acked-by: York Sun --- Ok, so this patch touches a rather sensitive part of the fsl_ddr logic. I spent a fair amount of time trying to verify that the resulting math is exactly the same as it was before, but the consequences of a mistake are insideous timing problems. Additional in-depth review would be much appreciated. Cheers, Kyle Moffett Changelog: v2: Resubmitted separately from the other HWW-1U-1A patches arch/powerpc/cpu/mpc8xxx/ddr/util.c | 58 ++++++++++++++++++++++++---------- 1 files changed, 41 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/cpu/mpc8xxx/ddr/util.c b/arch/powerpc/cpu/mpc8xxx/ddr/util.c index 1e2d921..c545d59 100644 --- a/arch/powerpc/cpu/mpc8xxx/ddr/util.c +++ b/arch/powerpc/cpu/mpc8xxx/ddr/util.c @@ -8,11 +8,19 @@ #include #include +#include #include "ddr.h" unsigned int fsl_ddr_get_mem_data_rate(void); +/* To avoid 64-bit full-divides, we factor this here */ +#define ULL_2e12 2000000000000ULL +#define UL_5pow12 244140625UL +#define UL_2pow13 (1UL << 13) + +#define ULL_8Fs 0xFFFFFFFFULL + /* * Round mclk_ps to nearest 10 ps in memory controller code. * @@ -22,36 +30,52 @@ unsigned int fsl_ddr_get_mem_data_rate(void); */ unsigned int get_memory_clk_period_ps(void) { - unsigned int mclk_ps; + unsigned int data_rate = fsl_ddr_get_mem_data_rate(); + unsigned int result; + + /* Round to nearest 10ps, being careful about 64-bit multiply/divide */ + unsigned long long mclk_ps = ULL_2e12; - mclk_ps = 2000000000000ULL / fsl_ddr_get_mem_data_rate(); - /* round to nearest 10 ps */ - return 10 * ((mclk_ps + 5) / 10); + /* Add 5*data_rate, for rounding */ + mclk_ps += 5*(unsigned long long)data_rate; + + /* Now perform the big divide, the result fits in 32-bits */ + do_div(mclk_ps, data_rate); + result = mclk_ps; + + /* We still need to round to 10ps */ + return 10 * (result/10); } /* Convert picoseconds into DRAM clock cycles (rounding up if needed). */ unsigned int picos_to_mclk(unsigned int picos) { - const unsigned long long ULL_2e12 = 2000000000000ULL; - const unsigned long long ULL_8Fs = 0xFFFFFFFFULL; - unsigned long long clks; - unsigned long long clks_temp; + unsigned long long clks, clks_rem; + /* Short circuit for zero picos */ if (!picos) return 0; - clks = fsl_ddr_get_mem_data_rate() * (unsigned long long) picos; - clks_temp = clks; - clks = clks / ULL_2e12; - if (clks_temp % ULL_2e12) { + /* First multiply the time by the data rate (32x32 => 64) */ + clks = picos * (unsigned long long)fsl_ddr_get_mem_data_rate(); + + /* + * Now divide by 5^12 and track the 32-bit remainder, then divide + * by 2*(2^12) using shifts (and updating the remainder). + */ + clks_rem = do_div(clks, UL_5pow12); + clks_rem <<= 13; + clks_rem |= clks & (UL_2pow13-1); + clks >>= 13; + + /* If we had a remainder, then round up */ + if (clks_rem) clks++; - } - if (clks > ULL_8Fs) { + /* Clamp to the maximum representable value */ + if (clks > ULL_8Fs) clks = ULL_8Fs; - } - - return (unsigned int) clks; + return (unsigned int)clks; } unsigned int mclk_to_picos(unsigned int mclk)