From patchwork Mon Jun 28 14:35:18 2010
X-Patchwork-Submitter: Aldy Hernandez
X-Patchwork-Id: 57144
Date: Mon, 28 Jun 2010 10:35:18 -0400
From: Aldy Hernandez
To: rth@redhat.com, gcc-patches@gcc.gnu.org
Subject: [trans-mem] comment memcpy code
Message-ID: <20100628143516.GA8684@redhat.com>

OK?

	* memcpy.cc (do_memcpy): Comment.

Index: memcpy.cc
===================================================================
--- memcpy.cc	(revision 161318)
+++ memcpy.cc	(working copy)
@@ -31,7 +31,9 @@ do_memcpy (uintptr_t idst, uintptr_t isr
 	   gtm_dispatch::lock_type W, gtm_dispatch::lock_type R)
 {
   gtm_dispatch *disp = gtm_disp();
+  // The position in the destination cacheline where *IDST starts.
   uintptr_t dofs = idst & (CACHELINE_SIZE - 1);
+  // The position in the source cacheline where *ISRC starts.
   uintptr_t sofs = isrc & (CACHELINE_SIZE - 1);
   const gtm_cacheline *src
     = reinterpret_cast<const gtm_cacheline *>(isrc & -CACHELINE_SIZE);
@@ -43,8 +45,14 @@ do_memcpy (uintptr_t idst, uintptr_t isr
   if (size == 0)
     return;
 
+  // If both SRC and DST data start at the same position in the cachelines,
+  // we can easily copy the data in tandem, cacheline by cacheline...
   if (dofs == sofs)
     {
+      // We copy the data in three stages:
+
+      // (a) Copy stray bytes at the beginning that are smaller than a
+      // cacheline.
       if (sofs != 0)
         {
           size_t sleft = CACHELINE_SIZE - sofs;
@@ -59,6 +67,7 @@ do_memcpy (uintptr_t idst, uintptr_t isr
           size -= min;
         }
 
+      // (b) Copy subsequent cacheline sized chunks.
       while (size >= CACHELINE_SIZE)
         {
           dpair = disp->write_lock(dst, W);
@@ -70,6 +79,7 @@ do_memcpy (uintptr_t idst, uintptr_t isr
           size -= CACHELINE_SIZE;
         }
 
+      // (c) Copy anything left over.
       if (size != 0)
         {
           dpair = disp->write_lock(dst, W);
@@ -78,12 +88,19 @@ do_memcpy (uintptr_t idst, uintptr_t isr
           memcpy (dpair.line, sline, size);
         }
     }
+  // ... otherwise, we must copy the data in disparate hunks using
+  // temporary storage.
   else
     {
       gtm_cacheline c;
       size_t sleft = CACHELINE_SIZE - sofs;
 
       sline = disp->read_lock(src, R);
+
+      // As above, we copy the data in three stages:
+
+      // (a) Copy stray bytes at the beginning that are smaller than a
+      // cacheline.
       if (dofs != 0)
         {
           size_t dleft = CACHELINE_SIZE - dofs;
@@ -91,11 +108,18 @@ do_memcpy (uintptr_t idst, uintptr_t isr
 
           dpair = disp->write_lock(dst, W);
           *dpair.mask |= (((gtm_cacheline_mask)1 << min) - 1) << dofs;
+
+          // If what's left in the source cacheline will fit in the
+          // rest of the destination cacheline, straight up copy it.
           if (min <= sleft)
             {
               memcpy (&dpair.line->b[dofs], &sline->b[sofs], min);
               sofs += min;
             }
+          // Otherwise, we need more bits from the source cacheline
+          // than are available. Piece together what we need from
+          // contiguous (source) cachelines, into temp space, and copy
+          // it over.
           else
             {
               memcpy (&c, &sline->b[sofs], sleft);
@@ -110,8 +134,14 @@ do_memcpy (uintptr_t idst, uintptr_t isr
           size -= min;
         }
 
+      // (b) Copy subsequent cacheline sized chunks.
       while (size >= CACHELINE_SIZE)
         {
+          // We have a full (destination) cacheline where to put the
+          // data, but to get to the corresponding cacheline sized
+          // chunk in the source, we have to piece together two
+          // contiguous source cachelines.
+
           memcpy (&c, &sline->b[sofs], sleft);
           sline = disp->read_lock(++src, R);
           memcpy (&c.b[sleft], sline, sofs);
@@ -124,12 +154,16 @@ do_memcpy (uintptr_t idst, uintptr_t isr
           size -= CACHELINE_SIZE;
         }
 
+      // (c) Copy anything left over.
       if (size != 0)
         {
           dpair = disp->write_lock(dst, W);
           *dpair.mask |= ((gtm_cacheline_mask)1 << size) - 1;
+          // If what's left to copy is entirely in the remaining
+          // source cacheline, do it.
           if (size <= sleft)
             memcpy (dpair.line, &sline->b[sofs], size);
+          // Otherwise, piece together the remaining bits, and copy.
           else
             {
               memcpy (&c, &sline->b[sofs], sleft);
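
For readers without the rest of memcpy.cc at hand, here is a minimal standalone sketch of the arithmetic the new comments describe. It is not the libitm code. It leaves out the read/write locks and the write mask entirely, and the names kLine, Stages, line_offset, line_base, split and gather_line are hypothetical, as is the 64-byte line size. It only shows how an address is split into a line base and an offset, how a copy with equal offsets breaks into stages (a), (b) and (c), and how one full destination line is pieced together from two contiguous source lines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed line size for this sketch only (must be a power of two).
// The real CACHELINE_SIZE is defined elsewhere in the library.
constexpr std::size_t kLine = 64;

// Position of ADDR within its line, like "idst & (CACHELINE_SIZE - 1)".
inline std::size_t line_offset (std::uintptr_t addr)
{
  return addr & (kLine - 1);
}

// Address of the start of the line holding ADDR, like
// "isrc & -CACHELINE_SIZE".
inline std::uintptr_t line_base (std::uintptr_t addr)
{
  return addr & -static_cast<std::uintptr_t> (kLine);
}

// Sizes of the three stages when source and destination offsets match.
struct Stages
{
  std::size_t head;   // (a) stray bytes before the first line boundary
  std::size_t lines;  // (b) number of whole lines
  std::size_t tail;   // (c) bytes left over
};

Stages split (std::uintptr_t addr, std::size_t size)
{
  Stages s = {0, 0, 0};
  std::size_t ofs = line_offset (addr);
  if (ofs != 0)
    {
      // Only the remainder of the first line belongs to stage (a).
      std::size_t left = kLine - ofs;
      s.head = size < left ? size : left;
      size -= s.head;
    }
  s.lines = size / kLine;
  s.tail = size % kLine;
  return s;
}

// Differing offsets: build one full line in TMP from the last
// (kLine - sofs) bytes of CUR followed by the first SOFS bytes of NEXT.
void gather_line (unsigned char *tmp, const unsigned char *cur,
                  const unsigned char *next, std::size_t sofs)
{
  std::size_t sleft = kLine - sofs;
  std::memcpy (tmp, cur + sofs, sleft);
  std::memcpy (tmp + sleft, next, sofs);
}
```

With these assumptions, a 200-byte copy starting at offset 52 of a line splits into 12 head bytes, two whole lines and 60 tail bytes.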