From patchwork Fri Nov 16 08:37:37 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eiichi Tsukata
X-Patchwork-Id: 998938
From: Eiichi Tsukata
To: andi@firstfloor.org, tytso@mit.edu, adilger.kernel@dilger.ca,
 linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: Eiichi Tsukata
Subject: [RFC PATCH 1/1] ext4: fix race between llseek SEEK_END and write
Date: Fri, 16 Nov 2018 17:37:37 +0900
Message-Id: <20181116083737.10596-2-devel@etsukata.com>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20181116083737.10596-1-devel@etsukata.com>
References: <20181116083737.10596-1-devel@etsukata.com>
X-Mailing-List: linux-ext4@vger.kernel.org

Commit ef3d0fd27e90 ("vfs: do (nearly) lockless generic_file_llseek")
removed almost all locking from llseek(), including for SEEK_END. It was
based on the idea that write() updates the file size atomically. In
fact, generic_perform_write() can split a single write() into two or
more parts when the written range straddles a page boundary (PAGE_SIZE),
so the size is updated multiple times within one write(). This means
that llseek() can observe an intermediate size while a write() is still
in progress.

This race changes the behavior of some applications, and 'tail' is one
of them. It reads the range [pos, pos_end], where pos_end is obtained
via llseek() SEEK_END, so it sometimes reads a truncated line.
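
To make the split concrete, here is a minimal userspace sketch (not
kernel code) of the per-page chunking described above. The function name
append_sizes() and its parameters are hypothetical and exist only for
this illustration. It assumes an appending write that starts at the
current end of file, so the size after each chunk equals the position
reached so far.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of a buffered append that is copied one page at a
 * time. For a write of `len` bytes starting at `pos` (assumed to be the
 * current file size), store in sizes[] the file size that would be
 * visible after each chunk. Returns the number of chunks recorded,
 * at most `max`.
 */
size_t append_sizes(unsigned long long pos, size_t len, size_t page_size,
		    unsigned long long *sizes, size_t max)
{
	size_t n = 0;

	assert(page_size > 0);

	while (len > 0 && n < max) {
		/* Offset of pos inside its page. */
		size_t offset = (size_t)(pos % page_size);
		/* A chunk never crosses a page boundary. */
		size_t bytes = page_size - offset;

		if (bytes > len)
			bytes = len;

		pos += bytes;
		len -= bytes;
		/* The size is published once per chunk, not once per write. */
		sizes[n++] = pos;
	}
	return n;
}
```

With a page size of 4096, appending the 7 bytes of "123456\n" at
position 4090 gives two chunks and two visible sizes, 4096 and then
4097. A reader that samples the size between the two updates sees the
line without its trailing newline.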

reproducer:

  $ while true; do echo 123456 >> out; done
  $ while true; do tail out | grep -v 123456 ; done

example output (takes 30 secs):

  12345
  1
  1234
  1
  12
  1234

Signed-off-by: Eiichi Tsukata
---
 fs/ext4/file.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 69d65d49837b..6479f3066043 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -477,6 +477,16 @@ loff_t ext4_llseek(struct file *file, loff_t offset, int whence)
 	default:
 		return generic_file_llseek_size(file, offset, whence,
 						maxbytes, i_size_read(inode));
+	case SEEK_END:
+		/*
+		 * protects against inode size race with write so that llseek
+		 * doesn't see inode size being updated in generic_perform_write
+		 */
+		inode_lock_shared(inode);
+		offset = generic_file_llseek_size(file, offset, whence,
+						  maxbytes, i_size_read(inode));
+		inode_unlock_shared(inode);
+		return offset;
 	case SEEK_HOLE:
 		inode_lock_shared(inode);
 		offset = iomap_seek_hole(inode, offset, &ext4_iomap_ops);
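
For checking the size directly instead of going through tail and grep,
the reproducer can be approximated in C. This is only a sketch: the
helper names append_record(), seek_end_size() and is_torn() are
hypothetical, and the check assumes the file holds nothing but whole
"123456\n" records, so any size that is not a multiple of 7 was sampled
in the middle of a write.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD     "123456\n"
#define RECORD_LEN (sizeof(RECORD) - 1)

/* Append one record, like `echo 123456 >> out`. Returns 0 on success. */
int append_record(const char *path)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = write(fd, RECORD, RECORD_LEN);
	close(fd);
	return n == (ssize_t)RECORD_LEN ? 0 : -1;
}

/* Return the offset reported by lseek(SEEK_END), or -1 on error. */
off_t seek_end_size(const char *path)
{
	int fd = open(path, O_RDONLY);
	off_t end;

	if (fd < 0)
		return -1;
	end = lseek(fd, 0, SEEK_END);
	close(fd);
	return end;
}

/* Nonzero if size is not a whole number of records. */
int is_torn(off_t size)
{
	return size % (off_t)RECORD_LEN != 0;
}
```

One process would call append_record() in a loop while another calls
seek_end_size() in a loop and counts how often is_torn() is nonzero. How
often that happens, if at all, depends on the kernel, filesystem and
timing.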