From patchwork Fri Feb 1 13:28:30 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thierry Reding
X-Patchwork-Id: 1034765
From: Thierry Reding
To: Thierry Reding
Cc: Mikko Perttunen, Dmitry Osipenko,
 dri-devel@lists.freedesktop.org, linux-tegra@vger.kernel.org
Subject: [PATCH v3 09/16] gpu: host1x: Optimize CDMA push buffer memory usage
Date: Fri, 1 Feb 2019 14:28:30 +0100
Message-Id: <20190201132837.12327-10-thierry.reding@gmail.com>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20190201132837.12327-1-thierry.reding@gmail.com>
References: <20190201132837.12327-1-thierry.reding@gmail.com>
X-Mailing-List: linux-tegra@vger.kernel.org

From: Thierry Reding

The host1x CDMA push buffer is terminated by a special opcode (RESTART)
that tells the CDMA to wrap around to the beginning of the push buffer.
To accommodate the RESTART opcode, an extra 4 bytes are allocated on top
of the 512 * 8 = 4096 bytes needed for the 512 slots (1 slot = 2 words)
that are used for other commands passed to CDMA. This requires that two
memory pages are allocated, but most of the second page (4092 bytes) is
never used.

Decrease the number of slots to 511 so that the RESTART opcode fits
within the page. Adjust the push buffer wraparound code to take into
account push buffer sizes that are not a power of two.

Signed-off-by: Thierry Reding
Reviewed-by: Dmitry Osipenko
Tested-by: Dmitry Osipenko
---
 drivers/gpu/host1x/cdma.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
index a96c4dd1e449..50c1370b56c7 100644
--- a/drivers/gpu/host1x/cdma.c
+++ b/drivers/gpu/host1x/cdma.c
@@ -42,7 +42,17 @@
  * means that the push buffer is full, not empty.
  */
 
-#define HOST1X_PUSHBUFFER_SLOTS 512
+/*
+ * Typically the commands written into the push buffer are a pair of words. We
+ * use slots to represent each of these pairs and to simplify things. Note the
+ * strange number of slots allocated here. 512 slots will fit exactly within a
+ * single memory page. We also need one additional word at the end of the push
+ * buffer for the RESTART opcode that will instruct the CDMA to jump back to
+ * the beginning of the push buffer. With 512 slots, this means that we'll use
+ * 2 memory pages and waste 4092 bytes of the second page that will never be
+ * used.
+ */
+#define HOST1X_PUSHBUFFER_SLOTS 511
 
 /*
  * Clean up push buffer resources
@@ -148,7 +158,10 @@ static void host1x_pushbuffer_push(struct push_buffer *pb, u32 op1, u32 op2)
 	WARN_ON(pb->pos == pb->fence);
 	*(p++) = op1;
 	*(p++) = op2;
-	pb->pos = (pb->pos + 8) & (pb->size - 1);
+	pb->pos += 8;
+
+	if (pb->pos >= pb->size)
+		pb->pos -= pb->size;
 }
 
 /*
@@ -158,7 +171,10 @@ static void host1x_pushbuffer_push(struct push_buffer *pb, u32 op1, u32 op2)
 static void host1x_pushbuffer_pop(struct push_buffer *pb, unsigned int slots)
 {
 	/* Advance the next write position */
-	pb->fence = (pb->fence + slots * 8) & (pb->size - 1);
+	pb->fence += slots * 8;
+
+	if (pb->fence >= pb->size)
+		pb->fence -= pb->size;
 }
 
 /*
@@ -166,7 +182,12 @@ static void host1x_pushbuffer_pop(struct push_buffer *pb, unsigned int slots)
  */
 static u32 host1x_pushbuffer_space(struct push_buffer *pb)
 {
-	return ((pb->fence - pb->pos) & (pb->size - 1)) / 8;
+	unsigned int fence = pb->fence;
+
+	if (pb->fence < pb->pos)
+		fence += pb->size;
+
+	return (fence - pb->pos) / 8;
 }
 
 /*
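
The size and wraparound arithmetic described in the commit message can be
checked with a small standalone C sketch. This is an illustration under
stated assumptions, not the driver code: struct pb_sketch and the helpers
pb_alloc_bytes(), pb_pages(), pb_init(), pb_push(), pb_pop(), pb_space(),
wrap_mask() and wrap_sub() are hypothetical names, a 4096-byte page is
assumed, and the initial fence value of size - 8 is an assumption about how
the buffer is set up.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_BYTES 8u    /* one slot = two 32-bit words */
#define PAGE_BYTES 4096u /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the driver's struct push_buffer. */
struct pb_sketch {
	uint32_t pos;   /* byte offset of the next slot to write */
	uint32_t fence; /* byte offset at which writing must stop */
	uint32_t size;  /* slots * SLOT_BYTES, excluding the RESTART word */
};

/* Bytes needed for the slots plus one trailing word for RESTART. */
static uint32_t pb_alloc_bytes(uint32_t slots)
{
	return slots * SLOT_BYTES + 4u;
}

/* Number of pages needed to hold that allocation (rounded up). */
static uint32_t pb_pages(uint32_t slots)
{
	return (pb_alloc_bytes(slots) + PAGE_BYTES - 1u) / PAGE_BYTES;
}

/* Old wrap: only correct when size is a power of two. */
static uint32_t wrap_mask(uint32_t off, uint32_t size)
{
	return off & (size - 1u);
}

/* New wrap: correct for any size, provided off < 2 * size. */
static uint32_t wrap_sub(uint32_t off, uint32_t size)
{
	return off >= size ? off - size : off;
}

/* Assumed initial state: empty buffer, fence one slot behind pos. */
static void pb_init(struct pb_sketch *pb, uint32_t slots)
{
	pb->size = slots * SLOT_BYTES;
	pb->pos = 0;
	pb->fence = pb->size - SLOT_BYTES;
}

/* Advance the write position by one slot (opcode writes omitted). */
static void pb_push(struct pb_sketch *pb)
{
	pb->pos = wrap_sub(pb->pos + SLOT_BYTES, pb->size);
}

/* Release slots that have been consumed by advancing the fence. */
static void pb_pop(struct pb_sketch *pb, uint32_t slots)
{
	pb->fence = wrap_sub(pb->fence + slots * SLOT_BYTES, pb->size);
}

/* Free slots between pos and fence, without relying on a mask. */
static uint32_t pb_space(const struct pb_sketch *pb)
{
	uint32_t fence = pb->fence;

	if (fence < pb->pos)
		fence += pb->size;

	return (fence - pb->pos) / SLOT_BYTES;
}
```

With 511 slots the buffer size is 4088 bytes, so the mask-based wrap maps
offset 4088 to 4080 instead of 0, while the compare-and-subtract form
returns 0. A single subtraction is only sufficient while an advance stays
below the buffer size, which always holds for one-slot pushes.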