From patchwork Thu Apr 12 16:06:59 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zheng Liu
X-Patchwork-Id: 152116
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 177E3B7012
	for ; Fri, 13 Apr 2012 02:00:41 +1000 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932974Ab2DLQAj (ORCPT );
	Thu, 12 Apr 2012 12:00:39 -0400
Received: from mail-pz0-f52.google.com ([209.85.210.52]:56860 "EHLO
	mail-pz0-f52.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S932965Ab2DLQAi (ORCPT );
	Thu, 12 Apr 2012 12:00:38 -0400
Received: by dake40 with SMTP id e40so2675784dak.11
	for ; Thu, 12 Apr 2012 09:00:38 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gmail.com; s=20120113;
	h=date:from:to:cc:subject:message-id:mail-followup-to:references
	:mime-version:content-type:content-disposition:in-reply-to
	:user-agent;
	bh=1PiMvasLNeGsE1klf4u3LnDGBAQT9X6VO2hiwJQA1JM=;
	b=SIAxVgtBWyFGRnWxUQQMLiGdE4lLGIpEt82IjUwMSR2Gum9RuhSQk5zF+hdyiOdn27
	f8ydV46B2OU+FQoIdKfLIixu1Q6ao0n0Opfi+fjuB+4PrxYZU8o19UcnWQ1eQLoevVuy
	JSdAXYEEihkzldfuMwdhLgX91JKRAQcA9mjNLDaz3KaV8SkOUfpEOoCsfmisetJxAG0Y
	Ng5ZLh0j9kDdZmep7MWa/7yKav8dCiPcQxRt2LcjUQn0HcDAKoS6AWxNdhdcl855R7ed
	L7/FlSGWD5maPle2tdPm/Dz8/5TWkUdGUw2d6pSJSx/UFf4JKm7f6BscUCskzwRP4uEe
	rXTg==
Received: by 10.68.226.5 with SMTP id ro5mr3727193pbc.74.1334246438165;
	Thu, 12 Apr 2012 09:00:38 -0700 (PDT)
Received: from gmail.com ([182.92.247.2])
	by mx.google.com with ESMTPS id qb10sm1495351pbb.75.2012.04.12.09.00.36
	(version=TLSv1/SSLv3 cipher=OTHER);
	Thu, 12 Apr 2012 09:00:37 -0700 (PDT)
Date: Fri, 13 Apr 2012 00:06:59 +0800
From: Zheng Liu
To: Jouni Siren
Cc: linux-ext4@vger.kernel.org
Subject: Re: Bug: Large writes can fail on ext4 if the write buffer is not empty
Message-ID:
 <20120412160658.GA9697@gmail.com>
Mail-Followup-To: Jouni Siren , linux-ext4@vger.kernel.org
References: <793C2320-255A-4894-AA07-70EDBB1DDDA5@iki.fi>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <793C2320-255A-4894-AA07-70EDBB1DDDA5@iki.fi>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-ext4-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-ext4@vger.kernel.org

On Thu, Apr 12, 2012 at 05:47:41PM +0300, Jouni Siren wrote:
> Hi,
>
> I recently ran into problems when writing large blocks of data (more than about 2 GB) with a single call, if there is already some data in the write buffer. The problem seems to be specific to ext4, or at least it does not happen when writing to nfs on the same system. Also, the problem does not happen, if the write buffer is flushed before the large write.
>
> The following C++ program should write a total of 4294967304 bytes, but I end up with a file of size 2147483664.
>
> #include <fstream>
>
> int
> main(int argc, char** argv)
> {
>   std::streamsize data_size = (std::streamsize)1 << 31;
>   char* data = new char[data_size];
>
>   std::ofstream output("test.dat", std::ios_base::binary);
>   output.write(data, 8);
>   output.write(data, data_size);
>   output.write(data, data_size);
>   output.close();
>
>   delete[] data;
>   return 0;
> }
>
>
> The relevant part of strace is the following:
>
> open("test.dat", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
> writev(3, [{"\0\0\0\0\0\0\0\0", 8}, {"", 2147483648}], 2) = -2147483640
> writev(3, [{0xffffffff80c6d258, 2147483648}, {"", 2147483648}], 2) = -1 EFAULT (Bad address)
> write(3, "\0\0\0\0\0\0\0\0", 8) = 8
> close(3) = 0
>
>
> The first two writes are combined into a single writev call that reports having written -2147483640 bytes. This is the same as 8 + 2147483648, when interpreted as a signed 32-bit integer. After the first call, everything more or less fails.
> This happens on a Linux system, where uname -a returns
>
> Linux alm01 2.6.32-220.7.1.el6.x86_64 #1 SMP Tue Mar 6 15:45:33 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
>
>
> I believe that the bug can be found in file.c, function ext4_file_write, where variable ret has type int. Function generic_file_aio_write returns the number of bytes written as a ssize_t, and the returned value is stored in ret and eventually returned by ext4_file_write. If the number of bytes written is more than INT_MAX, the value returned by ext4_file_write will be incorrect.
>
> If you need more information on the problem, I will be happy to provide it.

Hi Jouni,

Indeed, I think that it is a bug, and the solution is straightforward. Could you please try this patch? Thank you.

Regards,
Zheng

From: Zheng Liu
Subject: [PATCH] ext4: change return value from int to ssize_t in ext4_file_write

On 64-bit platforms, when we do a write operation with a huge amount of
data, the ret variable overflows because it is declared as int while
generic_file_aio_write returns ssize_t. So it is replaced with ssize_t.

Reported-by: Jouni Siren
Signed-off-by: Zheng Liu
Tested-by: Jouko Orava
---
 fs/ext4/file.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index cb70f18..8c7642a 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -95,7 +95,7 @@ ext4_file_write(struct kiocb *iocb, const struct iovec *iov,
 {
 	struct inode *inode = iocb->ki_filp->f_path.dentry->d_inode;
 	int unaligned_aio = 0;
-	int ret;
+	ssize_t ret;
 
 	/*
 	 * If we have encountered a bitmap-format file, the size limit