diff mbox

[v2] ext4: Deprecate data=journal mount option

Message ID 1309260363-19012-1-git-send-email-lczerner@redhat.com
State Rejected, archived
Headers show

Commit Message

Lukas Czerner June 28, 2011, 11:26 a.m. UTC
Data journalling mode (data=journal) is known to be neglected by
developers and only minority of people is actually using it. This
mode is also less tested than the other two modes by the developers.

This creates a dangerous combination, because the option which seems
*safer* is actually less safe the others. So this commit adds a warning
message in case that data=journal mode is used, so the user is informed
that the mode might be removed in the future.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/ext4/super.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

Comments

Bruce Guenter June 29, 2011, 4:49 a.m. UTC | #1
On Tue, Jun 28, 2011 at 01:26:03PM +0200, Lukas Czerner wrote:
> Data journalling mode (data=journal) is known to be neglected by
> developers and only minority of people is actually using it. This
> mode is also less tested than the other two modes by the developers.
> 
> This creates a dangerous combination, because the option which seems
> *safer* is actually less safe the others. So this commit adds a warning
> message in case that data=journal mode is used, so the user is informed
> that the mode might be removed in the future.

When I last benchmarked it, data=journal mode was considerably faster
than the other modes for sync heavy work loads, such as mail servers.
Have there been improvements to other (safe) modes that reverse this?
Lukas Czerner June 29, 2011, 11:49 a.m. UTC | #2
On Tue, 28 Jun 2011, Bruce Guenter wrote:

> On Tue, Jun 28, 2011 at 01:26:03PM +0200, Lukas Czerner wrote:
> > Data journalling mode (data=journal) is known to be neglected by
> > developers and only minority of people is actually using it. This
> > mode is also less tested than the other two modes by the developers.
> > 
> > This creates a dangerous combination, because the option which seems
> > *safer* is actually less safe the others. So this commit adds a warning
> > message in case that data=journal mode is used, so the user is informed
> > that the mode might be removed in the future.
> 
> When I last benchmarked it, data=journal mode was considerably faster
> than the other modes for sync heavy work loads, such as mail servers.
> Have there been improvements to other (safe) modes that reverse this?
> 

No, not any specific change that I know of. The problem with
data=journal from performance perspective is that, it can improve fsync
heavy loads, however just for a limited bandwidth. Because as you
probably know in this mode it will write all data twice, however it will
generate fewer seeks. So yes, for sync heavy work load with small writes
(such as mail servers might do) the performance might be better. But it
is always the matter of benchmarking your particular case to decide if
it really helps, it is not like this is a general fact that
data=journal implies better performance for fsync heavy writes.

Also note that the even though bandwidth limit might go away (or scale
up significantly) for newer drives like SSD's, the same is true for
seeks, so I guess data=journal would not have any benefits on future
drives. Also as I said those code paths are not as well tested as the
other modes.

Thanks!
-Lukas

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner Aug. 11, 2011, 3:01 p.m. UTC | #3
On Tue, 28 Jun 2011, Lukas Czerner wrote:

> Data journalling mode (data=journal) is known to be neglected by
> developers and only minority of people is actually using it. This
> mode is also less tested than the other two modes by the developers.
> 
> This creates a dangerous combination, because the option which seems
> *safer* is actually less safe the others. So this commit adds a warning
> message in case that data=journal mode is used, so the user is informed
> that the mode might be removed in the future.

Any comments on this ? Is that feasible to remove is someday ?

Thanks!
-Lukas

> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
>  fs/ext4/super.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 9ea71aa..9d189cf 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1631,6 +1631,11 @@ static int parse_options(char *options, struct super_block *sb,
>  			sbi->s_min_batch_time = option;
>  			break;
>  		case Opt_data_journal:
> +			ext4_msg(sb, KERN_WARNING,
> +				 "Using data=journal may be removed in the "
> +				 "future. Please, contact "
> +				 "linux-ext4@vger.kernel.org if you are "
> +				 "using this feature.");
>  			data_opt = EXT4_MOUNT_JOURNAL_DATA;
>  			goto datacheck;
>  		case Opt_data_ordered:
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andreas Dilger Aug. 11, 2011, 9:08 p.m. UTC | #4
On 2011-08-11, at 9:01 AM, Lukas Czerner wrote:
> On Tue, 28 Jun 2011, Lukas Czerner wrote:
>> Data journalling mode (data=journal) is known to be neglected by
>> developers and only minority of people is actually using it. This
>> mode is also less tested than the other two modes by the developers.
>> 
>> This creates a dangerous combination, because the option which seems
>> *safer* is actually less safe the others. So this commit adds a warning
>> message in case that data=journal mode is used, so the user is informed
>> that the mode might be removed in the future.
> 
> Any comments on this ? Is that feasible to remove is someday ?

I'm less in favour of removing data=journal.  Jan made some fixes to
data=journal mode in the last few weeks, which means that people are
still using this.

>> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
>> ---
>> fs/ext4/super.c |    5 +++++
>> 1 files changed, 5 insertions(+), 0 deletions(-)
>> 
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 9ea71aa..9d189cf 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -1631,6 +1631,11 @@ static int parse_options(char *options, struct super_block *sb,
>> 			sbi->s_min_batch_time = option;
>> 			break;
>> 		case Opt_data_journal:
>> +			ext4_msg(sb, KERN_WARNING,
>> +				 "Using data=journal may be removed in the "
>> +				 "future. Please, contact "
>> +				 "linux-ext4@vger.kernel.org if you are "
>> +				 "using this feature.");
>> 			data_opt = EXT4_MOUNT_JOURNAL_DATA;
>> 			goto datacheck;
>> 		case Opt_data_ordered:
>> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Cheers, Andreas





--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner Aug. 12, 2011, 8:16 a.m. UTC | #5
On Thu, 11 Aug 2011, Andreas Dilger wrote:

> On 2011-08-11, at 9:01 AM, Lukas Czerner wrote:
> > On Tue, 28 Jun 2011, Lukas Czerner wrote:
> >> Data journalling mode (data=journal) is known to be neglected by
> >> developers and only minority of people is actually using it. This
> >> mode is also less tested than the other two modes by the developers.
> >> 
> >> This creates a dangerous combination, because the option which seems
> >> *safer* is actually less safe the others. So this commit adds a warning
> >> message in case that data=journal mode is used, so the user is informed
> >> that the mode might be removed in the future.
> > 
> > Any comments on this ? Is that feasible to remove is someday ?
> 
> I'm less in favour of removing data=journal.  Jan made some fixes to
> data=journal mode in the last few weeks, which means that people are
> still using this.

I think that Jan was actually the one who was in favour of this change
IIRC. But you're right that there are still some (very little possibly?)
users of this. But this change does not remove it, but just let the
users know that it might be removed someday, hence discouraging them from
using it.

Also we were discussing that several times, so I think that letting
users know that we are considering it is a good thing.

Thanks!
-Lukas

> 
> >> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> >> ---
> >> fs/ext4/super.c |    5 +++++
> >> 1 files changed, 5 insertions(+), 0 deletions(-)
> >> 
> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> >> index 9ea71aa..9d189cf 100644
> >> --- a/fs/ext4/super.c
> >> +++ b/fs/ext4/super.c
> >> @@ -1631,6 +1631,11 @@ static int parse_options(char *options, struct super_block *sb,
> >> 			sbi->s_min_batch_time = option;
> >> 			break;
> >> 		case Opt_data_journal:
> >> +			ext4_msg(sb, KERN_WARNING,
> >> +				 "Using data=journal may be removed in the "
> >> +				 "future. Please, contact "
> >> +				 "linux-ext4@vger.kernel.org if you are "
> >> +				 "using this feature.");
> >> 			data_opt = EXT4_MOUNT_JOURNAL_DATA;
> >> 			goto datacheck;
> >> 		case Opt_data_ordered:
> >> 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> Cheers, Andreas
> 
> 
> 
> 
> 
>
Ric Wheeler Aug. 12, 2011, 8:25 a.m. UTC | #6
On 08/12/2011 09:16 AM, Lukas Czerner wrote:
> On Thu, 11 Aug 2011, Andreas Dilger wrote:
>
>> On 2011-08-11, at 9:01 AM, Lukas Czerner wrote:
>>> On Tue, 28 Jun 2011, Lukas Czerner wrote:
>>>> Data journalling mode (data=journal) is known to be neglected by
>>>> developers and only minority of people is actually using it. This
>>>> mode is also less tested than the other two modes by the developers.
>>>>
>>>> This creates a dangerous combination, because the option which seems
>>>> *safer* is actually less safe the others. So this commit adds a warning
>>>> message in case that data=journal mode is used, so the user is informed
>>>> that the mode might be removed in the future.
>>> Any comments on this ? Is that feasible to remove is someday ?
>> I'm less in favour of removing data=journal.  Jan made some fixes to
>> data=journal mode in the last few weeks, which means that people are
>> still using this.
> I think that Jan was actually the one who was in favour of this change
> IIRC. But you're right that there are still some (very little possibly?)
> users of this. But this change does not remove it, but just let the
> users know that it might be removed someday, hence discouraging them from
> using it.
>
> Also we were discussing that several times, so I think that letting
> users know that we are considering it is a good thing.
>
> Thanks!
> -Lukas

I think that this will be very useful to have - users should definitely chime in 
when they see this warning if they are using data journal mode.

The only work load that I know that benefits from a performance point of view is 
one which involves an fsync() heavy, small file creation workload.  Any workload 
with larger files tends to lose roughly 50% of the write bandwidth under 
streaming writes since we end up writing everything twice.

Regards,

Ric


>
>>>> Signed-off-by: Lukas Czerner<lczerner@redhat.com>
>>>> ---
>>>> fs/ext4/super.c |    5 +++++
>>>> 1 files changed, 5 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>>>> index 9ea71aa..9d189cf 100644
>>>> --- a/fs/ext4/super.c
>>>> +++ b/fs/ext4/super.c
>>>> @@ -1631,6 +1631,11 @@ static int parse_options(char *options, struct super_block *sb,
>>>> 			sbi->s_min_batch_time = option;
>>>> 			break;
>>>> 		case Opt_data_journal:
>>>> +			ext4_msg(sb, KERN_WARNING,
>>>> +				 "Using data=journal may be removed in the "
>>>> +				 "future. Please, contact "
>>>> +				 "linux-ext4@vger.kernel.org if you are "
>>>> +				 "using this feature.");
>>>> 			data_opt = EXT4_MOUNT_JOURNAL_DATA;
>>>> 			goto datacheck;
>>>> 		case Opt_data_ordered:
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>> Cheers, Andreas
>>
>>
>>
>>
>>
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Curt Wohlgemuth Aug. 12, 2011, 3:45 p.m. UTC | #7
I don't know much about data=journal, but I've been running xfstests
with it, and it's a disaster, given that data=journal doesn't support
O_DIRECT.  What kind of testing do people do on data=journal?

And worse, I consistently crash my machine running xfstests 074 on a
data=journal partition.  I just repeated this to make sure, on
3.1.0-rc1; I've also seen it with 3.0.0.  There's a BUG_ON firing in
jbd2_journal_dirty_metadata():

[ 2174.697692] ------------[ cut here ]------------
[ 2174.698627] kernel BUG at fs/jbd2/transaction.c:1112!
[ 2174.698627] invalid opcode: 0000 [#1] SMP
[ 2174.698627] CPU 11
...
[ 2174.698627] Call Trace:
[ 2174.698627]  [<ffffffff8127f9a3>] __ext4_handle_dirty_metadata+0x83/0x120
[ 2174.698627]  [<ffffffff8127fd1e>] ? __ext4_journal_get_write_access+0x3e/0x80
[ 2174.698627]  [<ffffffff81253a78>] __ext4_journalled_writepage+0x338/0x3e0
[ 2174.698627]  [<ffffffff81254244>] ext4_writepage+0x234/0x2f0
[ 2174.698627]  [<ffffffff81158327>] __writepage+0x17/0x40
[ 2174.698627]  [<ffffffff811597ae>] write_cache_pages+0x1fe/0x650

This is at the J_ASSERT_JH below:

	/*
	 * Metadata already on the current transaction list doesn't
	 * need to be filed.  Metadata on another transaction's list must
	 * be committing, and will be refiled once the commit completes:
	 * leave it alone for now.
	 */
	if (jh->b_transaction != transaction) {
		JBUFFER_TRACE(jh, "already on other transaction");
		J_ASSERT_JH(jh, jh->b_transaction ==
					journal->j_committing_transaction);      <===============
		J_ASSERT_JH(jh, jh->b_next_transaction == transaction);
		/* And this case is illegal: we can't reuse another
		 * transaction's data buffer, ever. */
		goto out_unlock_bh;
	}

Curt

On Fri, Aug 12, 2011 at 1:25 AM, Ric Wheeler <rwheeler@redhat.com> wrote:
> On 08/12/2011 09:16 AM, Lukas Czerner wrote:
>>
>> On Thu, 11 Aug 2011, Andreas Dilger wrote:
>>
>>> On 2011-08-11, at 9:01 AM, Lukas Czerner wrote:
>>>>
>>>> On Tue, 28 Jun 2011, Lukas Czerner wrote:
>>>>>
>>>>> Data journalling mode (data=journal) is known to be neglected by
>>>>> developers and only minority of people is actually using it. This
>>>>> mode is also less tested than the other two modes by the developers.
>>>>>
>>>>> This creates a dangerous combination, because the option which seems
>>>>> *safer* is actually less safe the others. So this commit adds a warning
>>>>> message in case that data=journal mode is used, so the user is informed
>>>>> that the mode might be removed in the future.
>>>>
>>>> Any comments on this ? Is that feasible to remove is someday ?
>>>
>>> I'm less in favour of removing data=journal.  Jan made some fixes to
>>> data=journal mode in the last few weeks, which means that people are
>>> still using this.
>>
>> I think that Jan was actually the one who was in favour of this change
>> IIRC. But you're right that there are still some (very little possibly?)
>> users of this. But this change does not remove it, but just let the
>> users know that it might be removed someday, hence discouraging them from
>> using it.
>>
>> Also we were discussing that several times, so I think that letting
>> users know that we are considering it is a good thing.
>>
>> Thanks!
>> -Lukas
>
> I think that this will be very useful to have - users should definitely
> chime in when they see this warning if they are using data journal mode.
>
> The only work load that I know that benefits from a performance point of
> view is one which involves an fsync() heavy, small file creation workload.
>  Any workload with larger files tends to lose roughly 50% of the write
> bandwidth under streaming writes since we end up writing everything twice.
>
> Regards,
>
> Ric
>
>
>>
>>>>> Signed-off-by: Lukas Czerner<lczerner@redhat.com>
>>>>> ---
>>>>> fs/ext4/super.c |    5 +++++
>>>>> 1 files changed, 5 insertions(+), 0 deletions(-)
>>>>>
>>>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>>>>> index 9ea71aa..9d189cf 100644
>>>>> --- a/fs/ext4/super.c
>>>>> +++ b/fs/ext4/super.c
>>>>> @@ -1631,6 +1631,11 @@ static int parse_options(char *options, struct
>>>>> super_block *sb,
>>>>>                        sbi->s_min_batch_time = option;
>>>>>                        break;
>>>>>                case Opt_data_journal:
>>>>> +                       ext4_msg(sb, KERN_WARNING,
>>>>> +                                "Using data=journal may be removed in
>>>>> the "
>>>>> +                                "future. Please, contact "
>>>>> +                                "linux-ext4@vger.kernel.org if you are
>>>>> "
>>>>> +                                "using this feature.");
>>>>>                        data_opt = EXT4_MOUNT_JOURNAL_DATA;
>>>>>                        goto datacheck;
>>>>>                case Opt_data_ordered:
>>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>> Cheers, Andreas
>>>
>>>
>>>
>>>
>>>
>>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner Aug. 12, 2011, 4:08 p.m. UTC | #8
On Fri, 12 Aug 2011, Curt Wohlgemuth wrote:

> I don't know much about data=journal, but I've been running xfstests
> with it, and it's a disaster, given that data=journal doesn't support
> O_DIRECT.  What kind of testing do people do on data=journal?

Short answer is probably none :). Even though that it seems like an
radical answer I believe that it is mostly true, because simply said
almost no-one care. But I think that Ted mentioned that he actually do
some tests with that mode, but I am not sure about that.

> 
> And worse, I consistently crash my machine running xfstests 074 on a
> data=journal partition.  I just repeated this to make sure, on
> 3.1.0-rc1; I've also seen it with 3.0.0.  There's a BUG_ON firing in
> jbd2_journal_dirty_metadata():
> 
> [ 2174.697692] ------------[ cut here ]------------
> [ 2174.698627] kernel BUG at fs/jbd2/transaction.c:1112!
> [ 2174.698627] invalid opcode: 0000 [#1] SMP
> [ 2174.698627] CPU 11
> ...
> [ 2174.698627] Call Trace:
> [ 2174.698627]  [<ffffffff8127f9a3>] __ext4_handle_dirty_metadata+0x83/0x120
> [ 2174.698627]  [<ffffffff8127fd1e>] ? __ext4_journal_get_write_access+0x3e/0x80
> [ 2174.698627]  [<ffffffff81253a78>] __ext4_journalled_writepage+0x338/0x3e0
> [ 2174.698627]  [<ffffffff81254244>] ext4_writepage+0x234/0x2f0
> [ 2174.698627]  [<ffffffff81158327>] __writepage+0x17/0x40
> [ 2174.698627]  [<ffffffff811597ae>] write_cache_pages+0x1fe/0x650
> 
> This is at the J_ASSERT_JH below:
> 
> 	/*
> 	 * Metadata already on the current transaction list doesn't
> 	 * need to be filed.  Metadata on another transaction's list must
> 	 * be committing, and will be refiled once the commit completes:
> 	 * leave it alone for now.
> 	 */
> 	if (jh->b_transaction != transaction) {
> 		JBUFFER_TRACE(jh, "already on other transaction");
> 		J_ASSERT_JH(jh, jh->b_transaction ==
> 					journal->j_committing_transaction);      <===============
> 		J_ASSERT_JH(jh, jh->b_next_transaction == transaction);
> 		/* And this case is illegal: we can't reuse another
> 		 * transaction's data buffer, ever. */
> 		goto out_unlock_bh;
> 	}

Wow, that backtrace seems like a flash-back to me I believe that I have
seen it several times, probably in different modes and probably already
fixed. Do no know about this one though.

Thanks!
-Lukas

> 
> Curt
> 
> On Fri, Aug 12, 2011 at 1:25 AM, Ric Wheeler <rwheeler@redhat.com> wrote:
> > On 08/12/2011 09:16 AM, Lukas Czerner wrote:
> >>
> >> On Thu, 11 Aug 2011, Andreas Dilger wrote:
> >>
> >>> On 2011-08-11, at 9:01 AM, Lukas Czerner wrote:
> >>>>
> >>>> On Tue, 28 Jun 2011, Lukas Czerner wrote:
> >>>>>
> >>>>> Data journalling mode (data=journal) is known to be neglected by
> >>>>> developers and only minority of people is actually using it. This
> >>>>> mode is also less tested than the other two modes by the developers.
> >>>>>
> >>>>> This creates a dangerous combination, because the option which seems
> >>>>> *safer* is actually less safe the others. So this commit adds a warning
> >>>>> message in case that data=journal mode is used, so the user is informed
> >>>>> that the mode might be removed in the future.
> >>>>
> >>>> Any comments on this ? Is that feasible to remove is someday ?
> >>>
> >>> I'm less in favour of removing data=journal.  Jan made some fixes to
> >>> data=journal mode in the last few weeks, which means that people are
> >>> still using this.
> >>
> >> I think that Jan was actually the one who was in favour of this change
> >> IIRC. But you're right that there are still some (very little possibly?)
> >> users of this. But this change does not remove it, but just let the
> >> users know that it might be removed someday, hence discouraging them from
> >> using it.
> >>
> >> Also we were discussing that several times, so I think that letting
> >> users know that we are considering it is a good thing.
> >>
> >> Thanks!
> >> -Lukas
> >
> > I think that this will be very useful to have - users should definitely
> > chime in when they see this warning if they are using data journal mode.
> >
> > The only work load that I know that benefits from a performance point of
> > view is one which involves an fsync() heavy, small file creation workload.
> >  Any workload with larger files tends to lose roughly 50% of the write
> > bandwidth under streaming writes since we end up writing everything twice.
> >
> > Regards,
> >
> > Ric
> >
> >
> >>
> >>>>> Signed-off-by: Lukas Czerner<lczerner@redhat.com>
> >>>>> ---
> >>>>> fs/ext4/super.c |    5 +++++
> >>>>> 1 files changed, 5 insertions(+), 0 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> >>>>> index 9ea71aa..9d189cf 100644
> >>>>> --- a/fs/ext4/super.c
> >>>>> +++ b/fs/ext4/super.c
> >>>>> @@ -1631,6 +1631,11 @@ static int parse_options(char *options, struct
> >>>>> super_block *sb,
> >>>>>                        sbi->s_min_batch_time = option;
> >>>>>                        break;
> >>>>>                case Opt_data_journal:
> >>>>> +                       ext4_msg(sb, KERN_WARNING,
> >>>>> +                                "Using data=journal may be removed in
> >>>>> the "
> >>>>> +                                "future. Please, contact "
> >>>>> +                                "linux-ext4@vger.kernel.org if you are
> >>>>> "
> >>>>> +                                "using this feature.");
> >>>>>                        data_opt = EXT4_MOUNT_JOURNAL_DATA;
> >>>>>                        goto datacheck;
> >>>>>                case Opt_data_ordered:
> >>>>>
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> >>>> the body of a message to majordomo@vger.kernel.org
> >>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>
> >>> Cheers, Andreas
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> 

--
Theodore Ts'o Aug. 12, 2011, 6:13 p.m. UTC | #9
On Fri, Aug 12, 2011 at 06:08:21PM +0200, Lukas Czerner wrote:
> On Fri, 12 Aug 2011, Curt Wohlgemuth wrote:
> 
> > I don't know much about data=journal, but I've been running xfstests
> > with it, and it's a disaster, given that data=journal doesn't support
> > O_DIRECT.  What kind of testing do people do on data=journal?

I have a rather long list of expected failures, mostly having to do
with xfstests assuming that O_DIRECT has to be supported.  On my todo
list is to scrub through the list failures that I've seen, make sure
they are indeed related to O_DIRECT, and then see if I can figure out
some way of telling xfstests to skip O_DIRECT tests via some
environment variable or command line option.

For the record this is what I mostly expect (from a somewhat older
xfstests) in the data=journal case:

Ran: 001 002 005 006 007 011 013 014 053 069 070 074 075 076 077 088 089 100 105 112 113 123 124 125 126 127 128 129 130 131 132 133 135 141 169 184 193 198 204 207 208 209 210 211 212 213 214 215 219 221 223 224 225 226 228 230 231 232 233 234 235 236 237 239 240 243 245 246 247 248 249 256
Failures: 113 125 130 133 135 198 207 208 209 210 214 223 226 239 240 245

BTW, with the very latest xfstests, I'm seeing new across-the-board
(not just data=journal) failures for tests #62 (caused by the presence
of the lost+found directory and differences in error code returns for
xattrs) and #79 (a failure in the append-only handling which I don't
completely understand yet).

> Short answer is probably none :). Even though that it seems like an
> radical answer I believe that it is mostly true, because simply said
> almost no-one care. But I think that Ted mentioned that he actually do
> some tests with that mode, but I am not sure about that.

Yes, I'm testing with data=journal, and I haven't been able to
reproduce the crash which curtw is seeing (see above; test #74 passes
in my 2cpu/512mb KVM test environment).  I'll add some instrumentation
to the BUG_ON in question and then try to reproduce it in curtw's
environment.

							- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Aug. 12, 2011, 8:57 p.m. UTC | #10
On Fri, Aug 12, 2011 at 02:13:30PM -0400, Ted Ts'o wrote:
> I have a rather long list of expected failures, mostly having to do
> with xfstests assuming that O_DIRECT has to be supported.  On my todo
> list is to scrub through the list failures that I've seen, make sure
> they are indeed related to O_DIRECT, and then see if I can figure out
> some way of telling xfstests to skip O_DIRECT tests via some
> environment variable or command line option.

If you do it please do it by returning a defined failure from the
test programs and then just exiting the test with _notrun.  But given
that xfstests does a lot of O_DIRECT testing it might be quite involved.

To be honest I'd expect a Linux filesystem without O_DIRECT not working
overly well in practical setup - it's pretty widely used these days.
A better fix might be simply accept O_DIRECT for data=journal, but
use the pagecache with a use once hint.

> BTW, with the very latest xfstests, I'm seeing new across-the-board
> (not just data=journal) failures for tests #62 (caused by the presence
> of the lost+found directory

062 fails because Andreas changed the error returns from the xattr
calls.  He sent me a patch to accept the new errors, but I'm still
undecided wether the new ones are correct enough.  I'll wait another
kernel release to see if real users complain about the change, and
will apply it then.

> and differences in error code returns for
> xattrs) and #79 (a failure in the append-only handling which I don't
> completely understand yet).

This was discussed on fsdevel lately.  All filesystem but xfs inherit
the append only bit from directories to files created inside them.
This not only is not very useful behaviour, but also exposes a bug
in the vfs that allows to create these new files, but fail the open
unless using O_APPEND which is against the posix create semantics.

We'll have to either adopt the oether filesystems to the xfs semantics,
or adopt xfs to the common dumb semantics and fix that O_CREAT bug.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o Aug. 14, 2011, 2:06 a.m. UTC | #11
On Fri, Aug 12, 2011 at 04:57:19PM -0400, Christoph Hellwig wrote:
> > and differences in error code returns for
> > xattrs) and #79 (a failure in the append-only handling which I don't
> > completely understand yet).
> 
> This was discussed on fsdevel lately.  All filesystem but xfs inherit
> the append only bit from directories to files created inside them.
> This not only is not very useful behaviour, but also exposes a bug
> in the vfs that allows to create these new files, but fail the open
> unless using O_APPEND which is against the posix create semantics.

I agree that disabling inheritance of the APPEND_FL flag makes sense.
I propose we do this for ext2/3/4 and any other file systems which is
currently following the ext2/3/4 behavior.

Any objections?

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara Aug. 15, 2011, 11:19 a.m. UTC | #12
On Sat 13-08-11 22:06:17, Ted Tso wrote:
> On Fri, Aug 12, 2011 at 04:57:19PM -0400, Christoph Hellwig wrote:
> > > and differences in error code returns for
> > > xattrs) and #79 (a failure in the append-only handling which I don't
> > > completely understand yet).
> > 
> > This was discussed on fsdevel lately.  All filesystem but xfs inherit
> > the append only bit from directories to files created inside them.
> > This not only is not very useful behaviour, but also exposes a bug
> > in the vfs that allows to create these new files, but fail the open
> > unless using O_APPEND which is against the posix create semantics.
> 
> I agree that disabling inheritance of the APPEND_FL flag makes sense.
> I propose we do this for ext2/3/4 and any other file systems which is
> currently following the ext2/3/4 behavior.
> 
> Any objections?
  I'm fine with that. It also looks more like a bug (ommission) than
anything else.

								Honza
diff mbox

Patch

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9ea71aa..9d189cf 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1631,6 +1631,11 @@  static int parse_options(char *options, struct super_block *sb,
 			sbi->s_min_batch_time = option;
 			break;
 		case Opt_data_journal:
+			ext4_msg(sb, KERN_WARNING,
+				 "Using data=journal may be removed in the "
+				 "future. Please, contact "
+				 "linux-ext4@vger.kernel.org if you are "
+				 "using this feature.");
 			data_opt = EXT4_MOUNT_JOURNAL_DATA;
 			goto datacheck;
 		case Opt_data_ordered: