From patchwork Tue Nov 24 14:55:13 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Liška
X-Patchwork-Id: 548122
To: GCC Patches
From: Martin Liška
Subject: [hsa] Redesign busy loop waiting so that a kernel dispatch signal can be reused
Message-ID: <56547A51.5040209@suse.cz>
Date: Tue, 24 Nov 2015 15:55:13 +0100

Hello.

The following patch is a workaround for Carrizo devices, which tend to have
problems propagating signal values because of an issue with the L2 cache.

Committed to the branch.

Martin

From ca4475aedb47e49b4bdc0a8980f200ec93b31d61 Mon Sep 17 00:00:00 2001
From: marxin
Date: Tue, 24 Nov 2015 10:41:54 +0100
Subject: [PATCH 1/2] Redesign busy loop waiting so that a kernel dispatch signal can be reused

libgomp/ChangeLog:

2015-11-24  Martin Liska

	* plugin/plugin-hsa.c (GOMP_OFFLOAD_run): Rewrite busy loop
	that does a workaround for Carrizo machines.
---
 libgomp/plugin/plugin-hsa.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/libgomp/plugin/plugin-hsa.c b/libgomp/plugin/plugin-hsa.c
index b866a78..99ec8e1 100644
--- a/libgomp/plugin/plugin-hsa.c
+++ b/libgomp/plugin/plugin-hsa.c
@@ -1219,24 +1219,26 @@ GOMP_OFFLOAD_run (int n, void *fn_ptr, void *vars, void** args)
   __atomic_store_n ((uint16_t*)(&packet->header), header, __ATOMIC_RELEASE);
   hsa_signal_store_release (agent->command_q->doorbell_signal, index);
 
-  /* TODO: fixup, following workaround is necessary to run kernel from
-     kernel dispatch mechanism on a Carrizo machine.  */
-
-  for (unsigned i = 0; i < shadow->kernel_dispatch_count; i++)
-    {
-      hsa_signal_t child_s;
-      child_s.handle = shadow->children_dispatches[i]->signal;
-
-      HSA_DEBUG ("Waiting for children completion signal: %lu\n",
-                 shadow->children_dispatches[i]->signal);
-      while (hsa_signal_wait_acquire
-             (child_s, HSA_SIGNAL_CONDITION_LT, 1, UINT64_MAX,
-              HSA_WAIT_STATE_BLOCKED) != 0);
-    }
+  /* TODO: GPU agents in Carrizo APUs cannot properly update L2 cache for
+     signal wait and signal load operations on their own and we need to
+     periodically call the hsa_signal_load_acquire on completion signals of
+     children kernels in the CPU to make that happen.  As soon the
+     limitation will be resolved, this workaround can be removed.  */
 
   HSA_DEBUG ("Kernel dispatched, waiting for completion\n");
-  while (hsa_signal_wait_acquire (s, HSA_SIGNAL_CONDITION_LT, 1,
-                                  UINT64_MAX, HSA_WAIT_STATE_BLOCKED) != 0);
+
+  /* Root signal waits with 1ms timeout.  */
+  while (hsa_signal_wait_acquire (s, HSA_SIGNAL_CONDITION_LT, 1, 1000 * 1000,
+                                  HSA_WAIT_STATE_BLOCKED) != 0)
+    for (unsigned i = 0; i < shadow->kernel_dispatch_count; i++)
+      {
+        hsa_signal_t child_s;
+        child_s.handle = shadow->children_dispatches[i]->signal;
+
+        HSA_DEBUG ("Waiting for children completion signal: %lu\n",
+                   shadow->children_dispatches[i]->signal);
+        hsa_signal_load_acquire (child_s);
+      }
 
   release_kernel_dispatch (shadow);
 
-- 
2.6.3
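
For readers without HSA hardware at hand, the control flow of the new loop can be sketched in isolation. The code below is only an illustration under stated assumptions: mock_signal, mock_timed_wait, mock_load and wait_for_root are hypothetical stand-ins invented for this sketch and are not part of the HSA runtime or of plugin-hsa.c. mock_timed_wait plays the role of hsa_signal_wait_acquire with a finite timeout, and mock_load plays the role of hsa_signal_load_acquire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an HSA completion signal; this is NOT the
   real hsa_signal_t.  */
typedef struct
{
  int64_t value;                 /* 0 means the kernel has completed.  */
  unsigned timeouts_before_done; /* Timed-out waits before completion.  */
  unsigned loads;                /* How many times the CPU loaded it.  */
} mock_signal;

/* Stand-in for a timed wait: pretends the timeout expired and returns the
   value observed at that point.  The signal drops to 0 once the
   configured number of timeouts has elapsed.  */
int64_t
mock_timed_wait (mock_signal *s)
{
  if (s->timeouts_before_done == 0)
    s->value = 0;
  else
    s->timeouts_before_done--;
  return s->value;
}

/* Stand-in for a plain load of the signal value from the CPU side.  */
int64_t
mock_load (mock_signal *s)
{
  s->loads++;
  return s->value;
}

/* Wait on the root signal with a bounded timeout; after every timeout,
   load each child signal once instead of blocking on it.  Returns the
   number of timeouts that occurred before the root completed.  */
unsigned
wait_for_root (mock_signal *root, mock_signal *children, size_t n_children)
{
  assert (root != NULL);
  unsigned timeouts = 0;

  while (mock_timed_wait (root) != 0)
    {
      timeouts++;
      for (size_t i = 0; i < n_children; i++)
        (void) mock_load (&children[i]);
    }
  return timeouts;
}
```

Calling wait_for_root on a root signal configured with two timeouts and two children touches each child twice and then returns. The point of the shape is that the host never blocks on a child signal, so the only signal it waits on is the root one, and the children are merely read between timeouts.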