From patchwork Fri Mar 5 17:40:48 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tim Gardner
X-Patchwork-Id: 1448105
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19;
 helo=huckleberry.canonical.com;
 envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=)
Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 4DsZln2vwRz9sWj;
 Sat, 6 Mar 2021 04:41:08 +1100 (AEDT)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com)
 by huckleberry.canonical.com with esmtp (Exim 4.86_2)
 (envelope-from )
 id 1lIERl-0003OQ-Vg; Fri, 05 Mar 2021 17:41:05 +0000
Received: from youngberry.canonical.com ([91.189.89.112])
 by huckleberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from )
 id 1lIERe-0003Lc-U6
 for kernel-team@lists.ubuntu.com; Fri, 05 Mar 2021 17:40:58 +0000
Received: from mail-pg1-f200.google.com ([209.85.215.200])
 by youngberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from )
 id 1lIERe-0004qm-IO
 for kernel-team@lists.ubuntu.com; Fri, 05 Mar 2021 17:40:58 +0000
Received: by mail-pg1-f200.google.com with SMTP id j3so1816503pgb.3
 for ; Fri, 05 Mar 2021 09:40:58 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=U8YBz0J4H+OkzM3nIHmfAVeI3tic2HzS+aSDca47T0U=;
 b=KPRUkAuip5r2ymgLcyy3IL6vD8KN+JZQklzY1In7ZZsImdODR5EBxQkWxYveFk2++t
 D1MgokAF8W4AEji2UuHI0ohrzTb1Gyi+9y3D72m5u3rcwGBwhmXToC+KpVzsn4mj6ubO
 jtj7srUTDacKTrI4AYExZgtQ1EtqVcnem2HU9zrqgtOkjDGNy+s9cuFJ449Uzc5Uz8t1
 sQ9YSxHunzvjw3MsM+9SRNOLBRBGBPp+eGM4oJqEQrSgG7j6frh6SXFlsxknlfBethxC
 QcCG3A0qPTBm/1NKSBJkqTGVxiffVhYVCQz+VuEfJzvvYw+S/tBWSGruIrH+NRvkSsEj
 +3Fg==
X-Gm-Message-State: AOAM531Tzb6vOUZiL1wraVyX3jo1J6pHwnkrVITgUh9rqskinx0uz47G
 Tvn4WwtM35O+UOkmUt5ViFCMwgu0zAtU12gE0Jx7mVnnkqETx5a53TCJE7jZ8qYFc2gDit1yxSG
 adL3OwRW8nHyFGIBe23DnNSOVwhr9r71ziYNMJ/gTFg==
X-Received: by 2002:a65:4785:: with SMTP id e5mr9886922pgs.0.1614966057043;
 Fri, 05 Mar 2021 09:40:57 -0800 (PST)
X-Google-Smtp-Source: ABdhPJxbyJEYZo1KTbaYYndBaXGA19/22ERRWyZ/9Ni1RWpiSxniDeF9xKoBnNx3v0IPtrSk16FYmw==
X-Received: by 2002:a65:4785:: with SMTP id e5mr9886910pgs.0.1614966056823;
 Fri, 05 Mar 2021 09:40:56 -0800 (PST)
Received: from localhost.localdomain ([69.163.84.166])
 by smtp.gmail.com with ESMTPSA id i6sm1608837pgj.85.2021.03.05.09.40.55
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Fri, 05 Mar 2021 09:40:56 -0800 (PST)
From: Tim Gardner
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 1/4] block: add blk_queue_fua() helper function
Date: Fri, 5 Mar 2021 10:40:48 -0700
Message-Id: <20210305174051.20097-2-tim.gardner@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210305174051.20097-1-tim.gardner@canonical.com>
References: <20210305174051.20097-1-tim.gardner@canonical.com>
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Dave Chinner

BugLink: https://bugs.launchpad.net/bugs/1917918

So we can check FUA support status from the iomap direct IO code.

Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner
Signed-off-by: Jens Axboe
(cherry picked from commit 0ce9144471de9ee09306ca0127e7cd27521ccc3f)
Signed-off-by: Marcelo Henrique Cerri
Tested-by: Tim Gardner
---
 include/linux/blkdev.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 9eb91f690322..cbf8a17ae692 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -753,6 +753,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 #define blk_queue_quiesced(q)	test_bit(QUEUE_FLAG_QUIESCED, &(q)->queue_flags)
 #define blk_queue_preempt_only(q)				\
 	test_bit(QUEUE_FLAG_PREEMPT_ONLY, &(q)->queue_flags)
+#define blk_queue_fua(q)	test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags)
 
 extern int blk_set_preempt_only(struct request_queue *q);
 extern void blk_clear_preempt_only(struct request_queue *q);

From patchwork Fri Mar 5 17:40:49 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tim Gardner
X-Patchwork-Id: 1448102
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19;
 helo=huckleberry.canonical.com;
 envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=)
Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 4DsZlm2tDBz9sWS;
 Sat, 6 Mar 2021 04:41:07 +1100 (AEDT)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com)
 by huckleberry.canonical.com with esmtp (Exim 4.86_2)
 (envelope-from )
 id 1lIERh-0003MD-Hl; Fri, 05 Mar 2021 17:41:01 +0000
Received: from youngberry.canonical.com ([91.189.89.112])
 by huckleberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from )
 id 1lIERg-0003Lj-2l
 for kernel-team@lists.ubuntu.com; Fri, 05 Mar 2021 17:41:00 +0000
Received: from mail-pj1-f70.google.com ([209.85.216.70])
 by youngberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from )
 id 1lIERf-0004r0-Mn
 for kernel-team@lists.ubuntu.com; Fri, 05 Mar 2021 17:40:59 +0000
Received: by mail-pj1-f70.google.com with SMTP id w2so2088685pjk.4
 for ; Fri, 05 Mar 2021 09:40:59 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=3FcAhozCi/85u7N46I+YkHWXKoPFE0GVKaKc4wOUco0=;
 b=NHsB0CilV9r9+3BSRUZTW7zRj2pEiWVHiyhCn+qIICcbOFRrqyOO84eyjBDK/SKobW
 JPtnc0OSHM+95KdBzLs18ur2l7ABfGlidYD6bD3sMdpTBI49cUBAmMhkfsZbNgxi0Jyu
 0kMCyWNbx8MQP6o0V6uJw4lQ/7Bqa/g8EufQT2Ht0+UBjhI/m7eLIz2UeGQW7C1dBUIK
 xGzF6PNIcpk/0VASI/I4avQy//D71kM8qS7ApQwE/8NHgxkE9SWzqTXJLeF3gyTsEbA4
 p+sHyRGKKeZyTzk0RDLzmb/qql/QkoiM+yJhVRzW/8PQjf/IeZrkpXe4vhAZA3zSuzeV
 rntA==
X-Gm-Message-State: AOAM531eRe5Cy85ipS/4Rl44Wz491NLyTwdZJVAJObF/1exW5buKiHzq
 HdVpE5Qy9iMXiBXnqU81y7nlEOHWxKUNoJelcmBEHQs8ZLRtltQCDi7xxpm9OLnmJ2l6jhqYRs1
 g1ZdPasVyGj2+vcVz4Hl3gr1s1r3KznPXfEDSX+M8mQ==
X-Received: by 2002:a63:1542:: with SMTP id 2mr9768155pgv.338.1614966058015;
 Fri, 05 Mar 2021 09:40:58 -0800 (PST)
X-Google-Smtp-Source: ABdhPJzOF3J9qYT/EcSwpte6SKbvthQ5bzisQ3Pd5KE9iJ9F18mLSOj83f9Y9sOMPX6TOsYo9yJZug==
X-Received: by 2002:a63:1542:: with SMTP id 2mr9768142pgv.338.1614966057767;
 Fri, 05 Mar 2021 09:40:57 -0800 (PST)
Received: from localhost.localdomain ([69.163.84.166])
 by smtp.gmail.com with ESMTPSA id i6sm1608837pgj.85.2021.03.05.09.40.56
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Fri, 05 Mar 2021 09:40:57 -0800 (PST)
From: Tim Gardner
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 2/4] xfs: move generic_write_sync calls inwards
Date: Fri, 5 Mar 2021 10:40:49
 -0700
Message-Id: <20210305174051.20097-3-tim.gardner@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210305174051.20097-1-tim.gardner@canonical.com>
References: <20210305174051.20097-1-tim.gardner@canonical.com>
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Dave Chinner

BugLink: https://bugs.launchpad.net/bugs/1917918

To prepare for iomap infrastructure based DSYNC optimisations. While
moving the code around, move the XFS write bytes metric update for
direct IO into the xfs_dio_write_end_io callback so that we always
capture the amount of data written via AIO+DIO. This fixes the
problem where queued AIO+DIO writes are not accounted to this
metric.

Signed-Off-By: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
(cherry picked from commit ed5c3e66a32883e2b3d119d358d23fd5990dc9c2)
Signed-off-by: Marcelo Henrique Cerri
Tested-by: Tim Gardner
---
 fs/xfs/xfs_file.c | 48 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 448b2dd7c1d7..a58312deec04 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -448,6 +448,12 @@ xfs_dio_write_end_io(
 	if (size <= 0)
 		return size;
 
+	/*
+	 * Capture amount written on completion as we can't reliably account
+	 * for it on submission.
+	 */
+	XFS_STATS_ADD(ip->i_mount, xs_write_bytes, size);
+
 	if (flags & IOMAP_DIO_COW) {
 		error = xfs_reflink_end_cow(ip, offset, size);
 		if (error)
@@ -603,6 +609,11 @@ xfs_file_dio_aio_write(
 	 * complete fully or fail.
 	 */
 	ASSERT(ret < 0 || ret == count);
+
+	if (ret > 0) {
+		/* Handle various SYNC-type writes */
+		ret = generic_write_sync(iocb, ret);
+	}
 	return ret;
 }
 
@@ -640,7 +651,16 @@ xfs_file_dax_write(
 	}
 out:
 	xfs_iunlock(ip, iolock);
-	return error ? error : ret;
+	if (error)
+		return error;
+
+	if (ret > 0) {
+		XFS_STATS_ADD(ip->i_mount, xs_write_bytes, ret);
+
+		/* Handle various SYNC-type writes */
+		ret = generic_write_sync(iocb, ret);
+	}
+	return ret;
 }
 
 STATIC ssize_t
@@ -710,6 +730,12 @@ xfs_file_buffered_aio_write(
 out:
 	if (iolock)
 		xfs_iunlock(ip, iolock);
+
+	if (ret > 0) {
+		XFS_STATS_ADD(ip->i_mount, xs_write_bytes, ret);
+		/* Handle various SYNC-type writes */
+		ret = generic_write_sync(iocb, ret);
+	}
 	return ret;
 }
 
@@ -734,8 +760,9 @@ xfs_file_write_iter(
 		return -EIO;
 
 	if (IS_DAX(inode))
-		ret = xfs_file_dax_write(iocb, from);
-	else if (iocb->ki_flags & IOCB_DIRECT) {
+		return xfs_file_dax_write(iocb, from);
+
+	if (iocb->ki_flags & IOCB_DIRECT) {
 		/*
 		 * Allow a directio write to fall back to a buffered
 		 * write *only* in the case that we're doing a reflink
@@ -743,20 +770,11 @@ xfs_file_write_iter(
 		 * allow an operation to fall back to buffered mode.
 		 */
 		ret = xfs_file_dio_aio_write(iocb, from);
-		if (ret == -EREMCHG)
-			goto buffered;
-	} else {
-buffered:
-		ret = xfs_file_buffered_aio_write(iocb, from);
+		if (ret != -EREMCHG)
+			return ret;
 	}
 
-	if (ret > 0) {
-		XFS_STATS_ADD(ip->i_mount, xs_write_bytes, ret);
-
-		/* Handle various SYNC-type writes */
-		ret = generic_write_sync(iocb, ret);
-	}
-	return ret;
+	return xfs_file_buffered_aio_write(iocb, from);
 }
 
 #define	XFS_FALLOC_FL_SUPPORTED						\

From patchwork Fri Mar 5 17:40:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tim Gardner
X-Patchwork-Id: 1448104
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19;
 helo=huckleberry.canonical.com;
 envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=)
Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 4DsZlm2Dhvz9sWL;
 Sat, 6 Mar 2021 04:41:07 +1100 (AEDT)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com)
 by huckleberry.canonical.com with esmtp (Exim 4.86_2)
 (envelope-from )
 id 1lIERi-0003N9-P1; Fri, 05 Mar 2021 17:41:02 +0000
Received: from youngberry.canonical.com ([91.189.89.112])
 by huckleberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from )
 id 1lIERh-0003Lx-D1
 for kernel-team@lists.ubuntu.com; Fri, 05 Mar 2021 17:41:01 +0000
Received: from mail-pg1-f199.google.com ([209.85.215.199])
 by youngberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from )
 id 1lIERh-0004rC-0I
 for kernel-team@lists.ubuntu.com; Fri, 05 Mar 2021 17:41:01 +0000
Received: by mail-pg1-f199.google.com with SMTP
 id y26so1824978pga.10
 for ; Fri, 05 Mar 2021 09:41:00 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=9H6fT3LskyX1H/WQacfPWFCtq7veKRS8Hko1d5Hq6zY=;
 b=MWqBwpyYExgpnA7hYii1x/37bmmzVyEYDEoBaIo/VtdkM7Mz6Quz3t49HiHHXfX4I3
 2cM4p7DYkselSXootiLN9HA/RNMDznQMMjpe0vCHLIzcd0s/gaKCZI4iJC4ChccPaS9D
 GPwmukXVErxWtKVgVQ+IXXcpjG82DNUq7riGl4dk/ai3uc8ks8lRGeEkKPSnyIEzpW5/
 1mGIeanSy4MilH70ZmCtRq2Xy872LfiOeNx8k97TH431phz/jS135o7Po8NdtUx5ddyA
 j/Mz7XN3SfbgvB++qCCei00ESY/SGv1SydedqN15YMOdzqoO1nPMIY1+sDzYPgdaUt1w
 cFIQ==
X-Gm-Message-State: AOAM532aEOk/LhB3UMssHVX1yFRHtsGX8aYQxta3tR/bj0Sply4D9TtK
 jf1VP06HguJNdQYr5QmdVBTXLBNyXUOa1foQBjYRmwLP7d+BQyKtxOPmgfgqKfTazhgvD0XgS85
 gEU7c7Ha9NxHDuznj3Zd71cLer2IoqwH0fCBnvuABFA==
X-Received: by 2002:a17:90a:f0c9:: with SMTP id fa9mr11325491pjb.39.1614966059089;
 Fri, 05 Mar 2021 09:40:59 -0800 (PST)
X-Google-Smtp-Source: ABdhPJxgxge96AU/4yPT1vdMOYEPFkM7ksNikLaj+qwoNjyEMrXT9cs6QJZc0ycw9GPH+j866Wn1TQ==
X-Received: by 2002:a17:90a:f0c9:: with SMTP id fa9mr11325473pjb.39.1614966058801;
 Fri, 05 Mar 2021 09:40:58 -0800 (PST)
Received: from localhost.localdomain ([69.163.84.166])
 by smtp.gmail.com with ESMTPSA id i6sm1608837pgj.85.2021.03.05.09.40.57
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Fri, 05 Mar 2021 09:40:58 -0800 (PST)
From: Tim Gardner
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 3/4] iomap: iomap_dio_rw() handles all sync writes
Date: Fri, 5 Mar 2021 10:40:50 -0700
Message-Id: <20210305174051.20097-4-tim.gardner@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210305174051.20097-1-tim.gardner@canonical.com>
References: <20210305174051.20097-1-tim.gardner@canonical.com>
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Dave Chinner

BugLink: https://bugs.launchpad.net/bugs/1917918

Currently iomap_dio_rw() only handles (data)sync write completions
for AIO. This means we can't optimise non-AIO IO to minimise device
flushes as we can't tell the caller whether a flush is required or
not.

To solve this problem and enable further optimisations, make
iomap_dio_rw() responsible for data sync behaviour for all IO, not
just AIO.

In doing so, the sync operation is now accounted as part of the DIO
IO by inode_dio_end(), hence post-IO data stability updates will no
longer race against operations that serialise via inode_dio_wait()
such as truncate or hole punch.

Signed-Off-By: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
(cherry picked from commit 4f8ff44ba0ad82a6f51c1bf381d7bad346464b09)
Signed-off-by: Marcelo Henrique Cerri
Tested-by: Tim Gardner
---
 fs/iomap.c        | 21 +++++++++++++++------
 fs/xfs/xfs_file.c |  5 -----
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index 965ef9b4ddcf..f66a3bf2ba72 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -683,6 +683,7 @@ EXPORT_SYMBOL_GPL(iomap_seek_data);
  * Private flags for iomap_dio, must not overlap with the public ones in
  * iomap.h:
  */
+#define IOMAP_DIO_NEED_SYNC	(1 << 29)
 #define IOMAP_DIO_WRITE		(1 << 30)
 #define IOMAP_DIO_DIRTY		(1 << 31)
 
@@ -758,6 +759,13 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 			dio_warn_stale_pagecache(iocb->ki_filp);
 	}
 
+	/*
+	 * If this is a DSYNC write, make sure we push it to stable storage now
+	 * that we've written data.
+	 */
+	if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
+		ret = generic_write_sync(iocb, ret);
+
 	inode_dio_end(file_inode(iocb->ki_filp));
 	kfree(dio);
 
@@ -768,13 +776,8 @@ static void iomap_dio_complete_work(struct work_struct *work)
 {
 	struct iomap_dio *dio = container_of(work, struct iomap_dio, aio.work);
 	struct kiocb *iocb = dio->iocb;
-	bool is_write = (dio->flags & IOMAP_DIO_WRITE);
-	ssize_t ret;
 
-	ret = iomap_dio_complete(dio);
-	if (is_write && ret > 0)
-		ret = generic_write_sync(iocb, ret);
-	iocb->ki_complete(iocb, ret, 0);
+	iocb->ki_complete(iocb, iomap_dio_complete(dio), 0);
 }
 
 /*
@@ -973,6 +976,10 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 	return copied ? copied : ret;
 }
 
+/*
+ * iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO
+ * is being issued as AIO or not.
+ */
 ssize_t
 iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, iomap_dio_end_io_t end_io)
@@ -1017,6 +1024,8 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			dio->flags |= IOMAP_DIO_DIRTY;
 	} else {
 		dio->flags |= IOMAP_DIO_WRITE;
+		if (iocb->ki_flags & IOCB_DSYNC)
+			dio->flags |= IOMAP_DIO_NEED_SYNC;
 		flags |= IOMAP_WRITE;
 	}
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index a58312deec04..e05e4d5610ff 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -609,11 +609,6 @@ xfs_file_dio_aio_write(
 	 * complete fully or fail.
 	 */
 	ASSERT(ret < 0 || ret == count);
-
-	if (ret > 0) {
-		/* Handle various SYNC-type writes */
-		ret = generic_write_sync(iocb, ret);
-	}
 	return ret;
 }
 

From patchwork Fri Mar 5 17:40:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tim Gardner
X-Patchwork-Id: 1448106
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19;
 helo=huckleberry.canonical.com;
 envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=)
Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 4DsZln4CHNz9sWk;
 Sat, 6 Mar 2021 04:41:09 +1100 (AEDT)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com)
 by huckleberry.canonical.com with esmtp (Exim 4.86_2)
 (envelope-from )
 id 1lIERm-0003Oo-6m; Fri, 05 Mar 2021 17:41:06 +0000
Received: from youngberry.canonical.com ([91.189.89.112])
 by huckleberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from )
 id 1lIERi-0003MQ-04
 for kernel-team@lists.ubuntu.com; Fri, 05 Mar 2021 17:41:02 +0000
Received: from mail-pj1-f70.google.com ([209.85.216.70])
 by youngberry.canonical.com with esmtps
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
 (envelope-from )
 id 1lIERh-0004rD-Ix
 for kernel-team@lists.ubuntu.com; Fri, 05 Mar 2021 17:41:01 +0000
Received: by mail-pj1-f70.google.com with SMTP id x20so1906364pjk.4
 for ; Fri, 05 Mar 2021 09:41:01 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=gMXYwHt3X2kzN8sUrBIRqbMMEHVNIhCqEZGz0Sca21Y=;
 b=G2mhumaz9t54RtGM790KXvphZa5VHwIphaqO/mwlC2tegQ3s6ZW99ApoKsbf7Wlztd
 FikpKk3dEBC+PMJdDVr1CU4kegx2ldbRjUwwjkBrKZ2GMMJD160FMHpDkF/mHpHBYsH2
 A3D3ehIxODmKPtDkanrAAfpxKOH63xniJqQaw9+tFEldrlY13qWyZUcKcHpk74nNjULl
 fJjU5KcDeXj3Qmxvr5ugrDfoIWN0RTOuw4jr0A8WQRYk5d/wuIs/BkCbAm4cZnLLVPIZ
 QtdpfZLaSNGDlwLsXeRuDg7GZUUENgI2ZIHUko6QIIFratvufKCvvnvxOY1xokZC39G5
 nldQ==
X-Gm-Message-State: AOAM531HTK8njHRWUGC7eAsV2MeKwdGdk0zu6A81Xa5dl4+nnTyH0kMu
 ZgDAMn69y1i5tfWcIOByukdpXXZIACjnpWLmnYk1A0LdLIG1Ky6jXVDrrmo6B9Kcks312AAI108
 x9YWYMln9Xr9wR85XReZExIAF2H8ZorT32zNiPaS9kw==
X-Received: by 2002:a63:54:: with SMTP id 81mr9916342pga.410.1614966059934;
 Fri, 05 Mar 2021 09:40:59 -0800 (PST)
X-Google-Smtp-Source: ABdhPJzsbdQ9ycaNM88o7r/NRHEMT/0lkQUPIWnYjK3s9AA9yN8c0j6Rr8OM126v07wfiXmY2fFuuA==
X-Received: by 2002:a63:54:: with SMTP id 81mr9916326pga.410.1614966059683;
 Fri, 05 Mar 2021 09:40:59 -0800 (PST)
Received: from localhost.localdomain ([69.163.84.166])
 by smtp.gmail.com with ESMTPSA id i6sm1608837pgj.85.2021.03.05.09.40.58
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Fri, 05 Mar 2021 09:40:59 -0800 (PST)
From: Tim Gardner
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 4/4] iomap: Use FUA for pure data O_DSYNC DIO writes
Date: Fri, 5 Mar 2021 10:40:51 -0700
Message-Id: <20210305174051.20097-5-tim.gardner@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210305174051.20097-1-tim.gardner@canonical.com>
References: <20210305174051.20097-1-tim.gardner@canonical.com>
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Dave Chinner

BugLink: https://bugs.launchpad.net/bugs/1917918

If we are doing direct IO writes with datasync semantics, we often
have to flush metadata changes along with the data write.
However, if we are overwriting existing data, there are no metadata
changes that we need to flush. In this case, optimising the IO by
using a FUA write makes sense.

The IOMAP_F_DIRTY flag tells us whether a specific inode requires a
metadata flush - this is currently used by DAX to ensure extent
modifications are stable in page fault operations. For direct IO
writes, we can use it to determine whether we need to flush metadata
once the data is on disk.

Hence if we have been returned a mapped extent that is not new and
the IO mapping is not dirty, then we can use a FUA write to provide
datasync semantics. This allows us to short-cut the
generic_write_sync() call in IO completion and hence avoid
unnecessary operations. This makes pure direct IO data write
behaviour identical to the way block devices use REQ_FUA to provide
datasync semantics.

On a FUA enabled device, a synchronous direct IO write workload
(sequential 4k overwrites in a 32MB file) had the following results:

kernel        time    write()s    write iops    write b/w
------        ----    --------    ----------    ---------
(no dsync)      4s     2173/s         2173       8.5MB/s
vanilla        22s      370/s          750       1.4MB/s
patched        19s      420/s          420       1.6MB/s

The patched code clearly doesn't send cache flushes anymore, but
instead uses FUA (confirmed via blktrace), and performance improves
a bit as a result. However, the benefits will be higher on workloads
that mix O_DSYNC overwrites with other write IO as we won't be
flushing the entire device cache on every DSYNC overwrite IO
anymore.

Signed-Off-By: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J.
Wong
(backported from commit 3460cac1ca76215a60acb086ebe97b3e50731628)
[marcelo.cerri: fixed context due to some upstream stable updates we
 have applied to our 4.15 kernels]
Signed-off-by: Marcelo Henrique Cerri
Tested-by: Tim Gardner
---
 fs/iomap.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 46 insertions(+), 5 deletions(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index f66a3bf2ba72..b66b047ac1ac 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -683,6 +683,7 @@ EXPORT_SYMBOL_GPL(iomap_seek_data);
  * Private flags for iomap_dio, must not overlap with the public ones in
  * iomap.h:
  */
+#define IOMAP_DIO_WRITE_FUA	(1 << 28)
 #define IOMAP_DIO_NEED_SYNC	(1 << 29)
 #define IOMAP_DIO_WRITE		(1 << 30)
 #define IOMAP_DIO_DIRTY		(1 << 31)
@@ -859,6 +860,7 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 	struct iov_iter iter;
 	struct bio *bio;
 	bool need_zeroout = false;
+	bool use_fua = false;
 	int nr_pages, ret = 0;
 	size_t copied = 0;
 
@@ -882,8 +884,20 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 	case IOMAP_MAPPED:
 		if (iomap->flags & IOMAP_F_SHARED)
 			dio->flags |= IOMAP_DIO_COW;
-		if (iomap->flags & IOMAP_F_NEW)
+		if (iomap->flags & IOMAP_F_NEW) {
 			need_zeroout = true;
+		} else {
+			/*
+			 * Use a FUA write if we need datasync semantics, this
+			 * is a pure data IO that doesn't require any metadata
+			 * updates and the underlying device supports FUA. This
+			 * allows us to avoid cache flushes on IO completion.
+			 */
+			if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) &&
+			    (dio->flags & IOMAP_DIO_WRITE_FUA) &&
+			    blk_queue_fua(bdev_get_queue(iomap->bdev)))
+				use_fua = true;
+		}
 		break;
 	default:
 		WARN_ON_ONCE(1);
@@ -937,10 +951,14 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 
 		n = bio->bi_iter.bi_size;
 		if (dio->flags & IOMAP_DIO_WRITE) {
-			bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC | REQ_IDLE);
+			bio->bi_opf = REQ_OP_WRITE | REQ_SYNC | REQ_IDLE;
+			if (use_fua)
+				bio->bi_opf |= REQ_FUA;
+			else
+				dio->flags &= ~IOMAP_DIO_WRITE_FUA;
 			task_io_account_write(n);
 		} else {
-			bio_set_op_attrs(bio, REQ_OP_READ, 0);
+			bio->bi_opf = REQ_OP_READ;
 			if (dio->flags & IOMAP_DIO_DIRTY)
 				bio_set_pages_dirty(bio);
 		}
@@ -978,7 +996,12 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 
 /*
  * iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO
- * is being issued as AIO or not.
+ * is being issued as AIO or not. This allows us to optimise pure data writes
+ * to use REQ_FUA rather than requiring generic_write_sync() to issue a
+ * REQ_FLUSH post write. This is slightly tricky because a single request here
+ * can be mapped into multiple disjoint IOs and only a subset of the IOs issued
+ * may be pure data writes. In that case, we still need to do a full data sync
+ * completion.
  */
 ssize_t
 iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
@@ -1023,10 +1046,21 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		if (iter->type == ITER_IOVEC)
 			dio->flags |= IOMAP_DIO_DIRTY;
 	} else {
+		flags |= IOMAP_WRITE;
 		dio->flags |= IOMAP_DIO_WRITE;
+
+		/* for data sync or sync, we need sync completion processing */
 		if (iocb->ki_flags & IOCB_DSYNC)
 			dio->flags |= IOMAP_DIO_NEED_SYNC;
-		flags |= IOMAP_WRITE;
+
+		/*
+		 * For datasync only writes, we optimistically try using FUA for
+		 * this IO. Any non-FUA write that occurs will clear this flag,
+		 * hence we know before completion whether a cache flush is
+		 * necessary.
+		 */
+		if ((iocb->ki_flags & (IOCB_DSYNC | IOCB_SYNC)) == IOCB_DSYNC)
+			dio->flags |= IOMAP_DIO_WRITE_FUA;
 	}
 
 	if (iocb->ki_flags & IOCB_NOWAIT) {
@@ -1091,6 +1125,13 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	if (ret < 0)
 		iomap_dio_set_error(dio, ret);
 
+	/*
+	 * If all the writes we issued were FUA, we don't need to flush the
+	 * cache on IO completion. Clear the sync flag for this case.
+	 */
+	if (dio->flags & IOMAP_DIO_WRITE_FUA)
+		dio->flags &= ~IOMAP_DIO_NEED_SYNC;
+
 	/*
 	 * We are about to drop our additional submission reference, which
 	 * might be the last reference to the dio. There are three three