[OpenMP/OpenACC/Fortran] Fix mapping of optional (present|absent) arguments
diff mbox series

Message ID 8be82276-81b1-817c-fcd2-51f24f5fe2d2@codesourcery.com
State New
Headers show
Series
  • [OpenMP/OpenACC/Fortran] Fix mapping of optional (present|absent) arguments
Related show

Commit Message

Tobias Burnus Nov. 20, 2019, 1:06 p.m. UTC
This patch does two things regarding explicit and automatical variable 
mapping to offloaded devices:

* Fixes bugs with optional arguments, which are present. They were 
mapped but the mapping had issues causing run-time failures.
* It now also handles absent optional arguments.

Compared to the previous patch set,** I added several OpenMP test cases 
– and fixed the fallout.

Except for trivial changes to libgomp/oacc-mem.c and omp-low.c, all 
changes are in fortran/trans-openmp.c and only affect optional arguments.

The patch was bootstrapped and tested on x86_64-gnu-linux w/o 
offloading-support configured and with nvptx offloading.

Tobias

** Included in the attached patch are the following previously posted 
patches: [1] the trivial libgomp/oacc-mem.c change, [2] only the 
remaining single-line change in omp-low.c, [3] the trans-openmp.c 
changes (which had to be modified+extended), and [5] the test cases. 
([2] and [4] are already in GCC 10.) See: 
https://gcc.gnu.org/ml/gcc-patches/2019-07/threads.html#00960 for the 
original patches.

PS: For full OpenMP support, (absent) optional arguments also needed to 
be handled for data-share clauses.

Comments

Tobias Burnus Nov. 29, 2019, 12:07 p.m. UTC | #1
Early *PING*.

Tobias Burnus wrote:
> This patch does two things regarding explicit and automatical variable 
> mapping to offloaded devices:
>
> * Fixes bugs with optional arguments, which are present. They were 
> mapped but the mapping had issues causing run-time failures.
> * It now also handles absent optional arguments.
>
> Compared to the previous patch set,** I added several OpenMP test 
> cases – and fixed the fallout.
>
> Except for trivial changes to libgomp/oacc-mem.c and omp-low.c, all 
> changes are in fortran/trans-openmp.c and only affect optional arguments.
>
> The patch was bootstrapped and tested on x86_64-gnu-linux w/o 
> offloading-support configured and with nvptx offloading.
>
> Tobias
>
> ** Included in the attached patch are the following previously posted 
> patches: [1] the trivial libgomp/oacc-mem.c change, [2] only the 
> remaining single-line change in omp-low.c, [3] the trans-openmp.c 
> changes (which had to be modified+extended), and [5] the test cases. 
> ([2] and [4] are already in GCC 10.) See: 
> https://gcc.gnu.org/ml/gcc-patches/2019-07/threads.html#00960 for the 
> original patches.
>
> PS: For full OpenMP support, (absent) optional arguments also needed 
> to be handled for data-share clauses.
>
Jakub Jelinek Dec. 5, 2019, 3:15 p.m. UTC | #2
On Wed, Nov 20, 2019 at 02:06:18PM +0100, Tobias Burnus wrote:
> ** Included in the attached patch are the following previously posted
> patches: [1] the trivial libgomp/oacc-mem.c change, [2] only the remaining
> single-line change in omp-low.c, [3] the trans-openmp.c changes (which had
> to be modified+extended), and [5] the test cases. ([2] and [4] are already
> in GCC 10.) See:
> https://gcc.gnu.org/ml/gcc-patches/2019-07/threads.html#00960 for the
> original patches.
> 
> PS: For full OpenMP support, (absent) optional arguments also needed to be
> handled for data-share clauses.

Sure.

> 2019-10-20  Tobias Burnus  <tobias@codesourcery.com>
> 	    Kwok Cheung Yeung <kcy@codesourcery.com>
> 
> 	gcc/fortran/
> 	* trans-openmp.c (gfc_build_conditional_assign, 
> 	gfc_build_conditional_assign_expr): New static functions.
> 	(gfc_omp_finish_clause, gfc_trans_omp_clauses): Handle mapping of
> 	absent optional arguments and fix mapping of present optional args.
> 
> 	gcc/
> 	* omp-low.c (lower_omp_target): For optional arguments, deref once
> 	more to obtain the type.
> 
> 	libgomp/
> 	* oacc-mem.c (update_dev_host, gomp_acc_insert_pointer): Just return
> 	if input it a NULL pointer.
> 	* testsuite/libgomp.oacc-c-c++-common/lib-43.c: Remove; dependent on
> 	diagnostic of NULL pointer.
> 	* testsuite/libgomp.oacc-c-c++-common/lib-47.c: Ditto.
> 	* testsuite/libgomp.fortran/optional-map.f90: New.
> 	* testsuite/libgomp.fortran/use_device_addr-1.f90
> 	(test_dummy_opt_callee_1_absent): New.
> 	(test_dummy_opt_call_1): Call it.
> 	* testsuite/libgomp.fortran/use_device_addr-2.f90: Likewise.
> 	* testsuite/libgomp.fortran/use_device_addr-3.f90: Likewise.
> 	* testsuite/libgomp.fortran/use_device_addr-4.f90: Likewise.
> 	* testsuite/libgomp.oacc-fortran/optional-cache.f95: New.
> 	* testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-data-copyin.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-data-copyout.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-declare.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-firstprivate.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-host_data.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-nested-calls.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-private.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-reduction.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-update-device.f90: New.
> 	* testsuite/libgomp.oacc-fortran/optional-update-host.f90: New.

Ok, with some formatting nits fixed.

> @@ -1199,6 +1257,8 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
>      }
>  
>    tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE;
> +  tree present = gfc_omp_is_optional_argument (decl)
> +		 ? gfc_omp_check_optional_argument (decl, true) : NULL_TREE;

I think emacs users (I'm not one of them) want ()s around, otherwise emacs
misformats that.  So
  tree present = (gfc_omp_is_optional_argument (decl)
		  ? gfc_omp_check_optional_argument (decl, true) : NULL_TREE);

> @@ -1232,17 +1314,43 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
>        stmtblock_t block;
>        gfc_start_block (&block);
>        tree type = TREE_TYPE (decl);
> -      tree ptr = gfc_conv_descriptor_data_get (decl);
> +      tree ptr;
> +
> +      if (present)
> +	ptr = gfc_build_conditional_assign_expr (
> +		&block, present,
> +		gfc_conv_descriptor_data_get (decl),
> +		null_pointer_node);

I must say I don't like very much formatting like that, I'd find it cleaner
to use temporary to have shorter argument and put all the arguments after
the ( column.

> +      else
> +	ptr = gfc_conv_descriptor_data_get (decl);

In this case, it could even be:
      ptr = gfc_conv_descriptor_data_get (decl);
      if (present)
	ptr = gfc_build_conditional_assign_expr (&block, present, ptr,
						 null_pointer_node);
by just using the same call from the else.

> @@ -2252,6 +2385,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>  		TREE_ADDRESSABLE (decl) = 1;
>  	      if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL)
>  		{
> +		  tree present = gfc_omp_is_optional_argument (decl)
> +				 ? gfc_omp_check_optional_argument (decl, true)
> +				 : NULL_TREE;
>  		  if (POINTER_TYPE_P (TREE_TYPE (decl))
>  		      && (gfc_omp_privatize_by_reference (decl)
>  			  || GFC_DECL_GET_SCALAR_POINTER (decl)

See above.

> @@ -2284,6 +2420,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>  		    {
>  		      tree type = TREE_TYPE (decl);
>  		      tree ptr = gfc_conv_descriptor_data_get (decl);
> +		      if (present)
> +			ptr = gfc_build_conditional_assign_expr (
> +			       block, present, ptr,
> +			       null_pointer_node);

And here the other comment.  It is much more indented though, but you could
use a temporary, like:
		      tree nullarg = null_pointer_node;
		      if (present)
			ptr
			  = gfc_build_conditional_assign_expr (block, present,
							       ptr, nullarg);
Though, if it is too much for you, ignore.
Another option would be shorten the name of the function, say
s/conditional/cond/.
There were some discussions about lifting the 80 column restriction and bump
it to something like +-130, but nothing happened yet.

> +			  ptr = gfc_build_conditional_assign_expr (
> +				  block, present, ptr,
> +				  null_pointer_node);

Again.

> +			  stmtblock_t cond_block;
> +			  gfc_init_block (&cond_block);
> +			  tree size = gfc_full_array_size (&cond_block, decl,
> +					GFC_TYPE_ARRAY_RANK (type));

Here one could use a temporary for GFC_TYPE_ARRAY_RANK (type);

> +			  if (present)
> +			    {
> +			      tree var = gfc_create_var (gfc_array_index_type,
> +							 NULL);
> +			      tree cond = fold_build2_loc (input_location,
> +							   NE_EXPR,
> +							   boolean_type_node,
> +							   present,
> +							   null_pointer_node);
> +			      gfc_add_modify (&cond_block, var, size);
> +			      gfc_add_expr_to_block (block,
> +				build3_loc (input_location, COND_EXPR,
> +					    void_type_node, cond,
> +					    gfc_finish_block (&cond_block),
> +					    NULL_TREE));

And here for the expr, perhaps just reuse the cond variable.

> @@ -2346,6 +2534,18 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>  					   OMP_CLAUSE_SIZE (node), elemsz);
>  			}
>  		    }
> +		  else if (present
> +			   && TREE_CODE (decl) == INDIRECT_REF
> +			   && TREE_CODE (TREE_OPERAND (decl, 0))
> +				== INDIRECT_REF)

Above I'd expect
			   && (TREE_CODE (TREE_OPERAND (decl, 0))
			       == INDIRECT_REF))
but perhaps I'm just pushing my coding style too much, ignore in that case.

	Jakub
Thomas Schwinge Dec. 5, 2019, 3:46 p.m. UTC | #3
Hi!

;-P Jakub, thanks for furnishing me a fit occasion here:

On 2019-12-05T16:15:15+0100, Jakub Jelinek <jakub@redhat.com> wrote:
> [...] much more indented though, but you could
> use a temporary, like:
> 		      tree nullarg = null_pointer_node;

I object to cluttering the code by introducing temporary variables/names
just for the sake of a few characters of screen width.  Even if located
close lexically, when reading the following code you still have to trace
back from the 'nullarg' usage to its 'null_pointer_node' definition in
order to figure out what a 'nullarg' might be:

> 		      if (present)
> 			ptr
> 			  = gfc_build_conditional_assign_expr (block, present,
> 							       ptr, nullarg);

> Another option would be shorten the name of the function, say
> s/conditional/cond/.

Likewise I object to "crippling" identifier names like that just for the
sake of a few characters of screen width.  (Here of course, "cond", or
the existing "expr" might be fine abbreviations, but my point is about
the general case.)

> There were some discussions about lifting the 80 column restriction and bump
> it to something like +-130, but nothing happened yet.

Indeed.  :-)

In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner
stated that even he is not using a 80 x 24 terminal anymore, and that
should tell us something.  ;-)

So, I formally propose that we lift this characters per line restriction
from IBM punch card (80) to mainframe line printer (132).

Nonwithstanding that, we should try to not overstrain that; deep
indentation often is a sign that code should be split out into a separate
function, for example.  My point is just to avoid things like the two
examples cited above.

Also, I'm not proposing any mass-reformatting of the existing code, or
re-writing all "expr" into "expression".

Tasks:

  - Discussion.
  - Get agreement/make a decision (by means still to be determined).
  - Put suitable Emacs/Vim configuration files into the source tree?
  - Update coding style guidelines.


Grüße
 Thomas
Jakub Jelinek Dec. 5, 2019, 4:04 p.m. UTC | #4
On Thu, Dec 05, 2019 at 04:46:45PM +0100, Thomas Schwinge wrote:
> Hi!
> 
> ;-P Jakub, thanks for furnishing me a fit occasion here:
> 
> On 2019-12-05T16:15:15+0100, Jakub Jelinek <jakub@redhat.com> wrote:
> > [...] much more indented though, but you could
> > use a temporary, like:
> > 		      tree nullarg = null_pointer_node;
> 
> I object to cluttering the code by introducing temporary variables/names
> just for the sake of a few characters of screen width.  Even if located
> close lexically, when reading the following code you still have to trace
> back from the 'nullarg' usage to its 'null_pointer_node' definition in
> order to figure out what a 'nullarg' might be:
> 
> > 		      if (present)
> > 			ptr
> > 			  = gfc_build_conditional_assign_expr (block, present,
> > 							       ptr, nullarg);
> 
> > Another option would be shorten the name of the function, say
> > s/conditional/cond/.
> 
> Likewise I object to "crippling" identifier names like that just for the
> sake of a few characters of screen width.  (Here of course, "cond", or
> the existing "expr" might be fine abbreviations, but my point is about
> the general case.)

The point about temporaries is general, and I believe they actually make
code much more readable.  Mostly about coding style like:
	t = fold_build2_loc (loc, code, fold_build2_loc (loc, code2,
							 something1,
							 something2),
			     fold_build2_loc (loc, code3, something3,
					      something4));
vs.
	tree op1 = fold_build2_loc (loc, code2, something1, something2);
	tree op2 = fold_build2_loc (loc, code3, something3, something4);
	t = fold_build2_loc (loc, code, op1, op2);
The above case is extreme in both being indented quite a lot (general rule
is to consider outlining something into a function then) and using
way too long function names.  If you look at the earlier suggestion where
the code is indented reasonably, using the temporary there makes the code more
readable and shorter.

> 
> > There were some discussions about lifting the 80 column restriction and bump
> > it to something like +-130, but nothing happened yet.
> 
> Indeed.  :-)
> 
> In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner
> stated that even he is not using a 80 x 24 terminal anymore, and that
> should tell us something.  ;-)
> 
> So, I formally propose that we lift this characters per line restriction
> from IBM punch card (80) to mainframe line printer (132).

Such a proposal would need to be accompanied with a wwwdocs
codingconventions.html patch and contrib/check_GNU_style.{sh,py} patch I
guess ;)

	Jakub
Joseph Myers Dec. 5, 2019, 4:17 p.m. UTC | #5
On Thu, 5 Dec 2019, Thomas Schwinge wrote:

> In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner
> stated that even he is not using a 80 x 24 terminal anymore, and that
> should tell us something.  ;-)
> 
> So, I formally propose that we lift this characters per line restriction
> from IBM punch card (80) to mainframe line printer (132).

I thought these line lengths were based on readability studies suggesting 
lengths that lines shorter than 80 columns were more readable?

Longer lines mean less space for multiple terminal / editor windows 
side-by-side to look at different pieces of code.  I don't think that's an 
improvement.
Paul Koning Dec. 5, 2019, 4:24 p.m. UTC | #6
> On Dec 5, 2019, at 11:17 AM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Thu, 5 Dec 2019, Thomas Schwinge wrote:
> 
>> In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner
>> stated that even he is not using a 80 x 24 terminal anymore, and that
>> should tell us something.  ;-)
>> 
>> So, I formally propose that we lift this characters per line restriction
>> from IBM punch card (80) to mainframe line printer (132).
> 
> I thought these line lengths were based on readability studies suggesting 
> lengths that lines shorter than 80 columns were more readable?

That's certainly a general rule.  There is a reason why books aren't wide, and why newspapers have columns.  The eye can't deal well with long lines.  So while 132 column lines are certainly possible with modern computers, it doesn't mean they are desirable.

	paul
Jeff Law Dec. 5, 2019, 4:40 p.m. UTC | #7
On 12/5/19 9:24 AM, Paul Koning wrote:
> 
> 
>> On Dec 5, 2019, at 11:17 AM, Joseph Myers <joseph@codesourcery.com>
>> wrote:
>> 
>> On Thu, 5 Dec 2019, Thomas Schwinge wrote:
>> 
>>> In the relevant session at the GNU Tools Cauldron 2019, Michael
>>> Meissner stated that even he is not using a 80 x 24 terminal
>>> anymore, and that should tell us something.  ;-)
>>> 
>>> So, I formally propose that we lift this characters per line
>>> restriction from IBM punch card (80) to mainframe line printer
>>> (132).
>> 
>> I thought these line lengths were based on readability studies
>> suggesting lengths that lines shorter than 80 columns were more
>> readable?
> 
> That's certainly a general rule.  There is a reason why books aren't
> wide, and why newspapers have columns.  The eye can't deal well with
> long lines.  So while 132 column lines are certainly possible with
> modern computers, it doesn't mean they are desirable.
I'd like to see the restriction relaxed.  THe 80 column limit really
presents readability problems and excessive expression wrapping to
accommodate the limit.  132 seems like a very reasonable compromise.

My biggest worry with moving to 132 columns is that it will discourage
refactoring when indention levels cause excessive wrapping.

Jeff
Michael Matz Dec. 5, 2019, 4:44 p.m. UTC | #8
Hello,

(oh a flame bait :) )

On Thu, 5 Dec 2019, Thomas Schwinge wrote:

> So, I formally propose that we lift this characters per line restriction
> from IBM punch card (80) to mainframe line printer (132).
> 
> Tasks:
> 
>   - Discussion.

I object to cluttering code in excuse for using sensible function names or 
temporaries that otherwise can help clearing up code.  Using 132-char 
lines is cluttering code:
- long lines are harder to read/grasp: vertical eye movement is easier 
  than horizontal, and source code should be optimized for 
  reading, not writing
- long lines make it impossible to have two files next to each other at a 
  comfortable font size
- long lines are incompatible with existing netiquette re emails, for 
  instance

So, at least for me, that my terminals are 80 wide (but not x24) has 
multiple reasons, and the _least_ of it is because that's what punch cards 
had.


Ciao,
Michael.
Florian Weimer Dec. 5, 2019, 4:55 p.m. UTC | #9
* Paul Koning:

> That's certainly a general rule.  There is a reason why books aren't
> wide, and why newspapers have columns.  The eye can't deal well with
> long lines.  So while 132 column lines are certainly possible with
> modern computers, it doesn't mean they are desirable.

If the line starts at column 40 or so, I don't think readability suffers
too much if it goes to column 100.  If it starts at column 2, then it
might be problematic.

Frequent long lines reduce the usefulness of side-by-side diff viewers.

And there are those of us who use screens in portrait mode, following
the rule that a good function is not longer than a single screen.  1080
pixels give you 8 pixels per character, which isn't that much.

Thanks,
Florian
Jonathan Wakely Dec. 5, 2019, 5:03 p.m. UTC | #10
On Thu, 5 Dec 2019 at 16:44, Michael Matz <matz@suse.de> wrote:
>
> Hello,
>
> (oh a flame bait :) )
>
> On Thu, 5 Dec 2019, Thomas Schwinge wrote:
>
> > So, I formally propose that we lift this characters per line restriction
> > from IBM punch card (80) to mainframe line printer (132).
> >
> > Tasks:
> >
> >   - Discussion.
>
> I object to cluttering code in excuse for using sensible function names or
> temporaries that otherwise can help clearing up code.  Using 132-char
> lines is cluttering code:
> - long lines are harder to read/grasp: vertical eye movement is easier
>   than horizontal, and source code should be optimized for
>   reading, not writing
> - long lines make it impossible to have two files next to each other at a
>   comfortable font size
> - long lines are incompatible with existing netiquette re emails, for
>   instance
>
> So, at least for me, that my terminals are 80 wide (but not x24) has
> multiple reasons, and the _least_ of it is because that's what punch cards
> had.

C++17 introduces a nice feature, with rationale similar to declaring
variables in a for-loop init-statement:

if (auto var = foo(); bar(var))

The variable is only in scope for the block where you need it, just
like a for-loop.

Unfortunately nearly every time I've tried to use this recently, I've
found it's impossible in 80 columns, e.g. this from yesterday:

    if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=>
0; __c != 0)
      return __c;

When you're forced to uglify every variable with a leading __ you run
out of characters pretty damn quickly.

I can either not use the feature (and have the variable defined in a
larger scope than it needs to be) or add fairly arbitrary line breaks:

            if (auto __c
            = __builtin_memcmp(&*__first1, &*__first2, __len)
            <=> 0; __c != 0)
              return __c;

or try to give the variables shorter (and less meaningful) names.
Adding line breaks or picking shorter names doesn't help readability.
So I end up not using the feature.

I'm loosely in favour of relaxing the rule for libstdc++ code. I don't
really care what the rest of GCC looks like ;-)
N.M. Maclaren Dec. 5, 2019, 5:29 p.m. UTC | #11
On Dec 5 2019, Michael Matz wrote:
>
>(oh a flame bait :) )

Quite.  I shall try not to make it too much worse, but there's another point
that needs mentioning.

I find long names hard to read, with either short or long lines, especially
when combined with variants like negotiate_twisty_little_passage, and
negotiate_little_twisty_passage.  And extending line lengths will probably
encourage the use of longer names.

As people say, there are conficting requirements, but I side with you in
preferring 80 column lines.  Actually, I tend to use less, but that's
because of the (still current) frequency with which utilities make a mess
of lines exactly 80 columns long in an 80 column field.

Regards,
Nick Maclaren.
Andrew Stubbs Dec. 5, 2019, 5:55 p.m. UTC | #12
On 05/12/2019 16:17, Joseph Myers wrote:
> Longer lines mean less space for multiple terminal / editor windows
> side-by-side to look at different pieces of code.  I don't think that's an
> improvement.

Here's a data-point ....

My 1920 pixel-wide screen, in the default font, allows 239 columns; not 
enough for two 130-wide editors.  Especially not with line numbers and 
"gutter" columns.

On the other hand, 80 columns does tend to cause some formatting 
contortions, with long function names and deeper indentations.

I think a nice round 100 would be a good compromise.

Andrew
Marek Polacek Dec. 5, 2019, 6:07 p.m. UTC | #13
On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote:
> On Thu, 5 Dec 2019 at 16:44, Michael Matz <matz@suse.de> wrote:
> >
> > Hello,
> >
> > (oh a flame bait :) )
> >
> > On Thu, 5 Dec 2019, Thomas Schwinge wrote:
> >
> > > So, I formally propose that we lift this characters per line restriction
> > > from IBM punch card (80) to mainframe line printer (132).
> > >
> > > Tasks:
> > >
> > >   - Discussion.
> >
> > I object to cluttering code in excuse for using sensible function names or
> > temporaries that otherwise can help clearing up code.  Using 132-char
> > lines is cluttering code:
> > - long lines are harder to read/grasp: vertical eye movement is easier
> >   than horizontal, and source code should be optimized for
> >   reading, not writing
> > - long lines make it impossible to have two files next to each other at a
> >   comfortable font size
> > - long lines are incompatible with existing netiquette re emails, for
> >   instance
> >
> > So, at least for me, that my terminals are 80 wide (but not x24) has
> > multiple reasons, and the _least_ of it is because that's what punch cards
> > had.
> 
> C++17 introduces a nice feature, with rationale similar to declaring
> variables in a for-loop init-statement:
> 
> if (auto var = foo(); bar(var))
> 
> The variable is only in scope for the block where you need it, just
> like a for-loop.
> 
> Unfortunately nearly every time I've tried to use this recently, I've
> found it's impossible in 80 columns, e.g. this from yesterday:
> 
>     if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=>
> 0; __c != 0)
>       return __c;
> 
> When you're forced to uglify every variable with a leading __ you run
> out of characters pretty damn quickly.
> 
> I can either not use the feature (and have the variable defined in a
> larger scope than it needs to be) or add fairly arbitrary line breaks:
> 
>             if (auto __c
>             = __builtin_memcmp(&*__first1, &*__first2, __len)
>             <=> 0; __c != 0)
>               return __c;
> 
> or try to give the variables shorter (and less meaningful) names.
> Adding line breaks or picking shorter names doesn't help readability.
> So I end up not using the feature.
> 
> I'm loosely in favour of relaxing the rule for libstdc++ code. I don't
> really care what the rest of GCC looks like ;-)

Not using such a nice feature just because of formatting sounds really
shameful.  Would the compromise of 100 chars make things any better here?

Marek
Eric Gallager Dec. 5, 2019, 6:12 p.m. UTC | #14
On 12/5/19, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 05/12/2019 16:17, Joseph Myers wrote:
>> Longer lines mean less space for multiple terminal / editor windows
>> side-by-side to look at different pieces of code.  I don't think that's
>> an
>> improvement.
>
> Here's a data-point ....
>
> My 1920 pixel-wide screen, in the default font, allows 239 columns; not
> enough for two 130-wide editors.  Especially not with line numbers and
> "gutter" columns.
>
> On the other hand, 80 columns does tend to cause some formatting
> contortions, with long function names and deeper indentations.
>
> I think a nice round 100 would be a good compromise.

Here's mine:

My 1280 pixel-wide screen allows 179 columns, which comes to 89.5
columns when divided by 2. I think rounding up to 90 columns would be
a good compromise, although if that's too small, 100 is good as well.

>
> Andrew
>
Robin Curtis Dec. 5, 2019, 6:21 p.m. UTC | #15
My IBM Selectric golfball electronic printer only does 90 characters on A4 in portrait mode………(at 10 cps) 

(as for my all electric TELEX Teleprinter machine !) 

Is this debate for real ?!  - or is this a Christmas spoof ? 

External observer…..keep up the great work. :)

(while I punch out a few more 80 column cards). 



> On 5 Dec 2019, at 17:55, Andrew Stubbs <ams@codesourcery.com> wrote:
> 
> On 05/12/2019 16:17, Joseph Myers wrote:
>> Longer lines mean less space for multiple terminal / editor windows
>> side-by-side to look at different pieces of code.  I don't think that's an
>> improvement.
> 
> Here's a data-point ....
> 
> My 1920 pixel-wide screen, in the default font, allows 239 columns; not enough for two 130-wide editors.  Especially not with line numbers and "gutter" columns.
> 
> On the other hand, 80 columns does tend to cause some formatting contortions, with long function names and deeper indentations.
> 
> I think a nice round 100 would be a good compromise.
> 
> Andrew
Martin Sebor Dec. 5, 2019, 6:54 p.m. UTC | #16
On 12/5/19 8:46 AM, Thomas Schwinge wrote:
> Hi!
> 
> ;-P Jakub, thanks for furnishing me a fit occasion here:
> 
> On 2019-12-05T16:15:15+0100, Jakub Jelinek <jakub@redhat.com> wrote:
>> [...] much more indented though, but you could
>> use a temporary, like:
>> 		      tree nullarg = null_pointer_node;
> 
> I object to cluttering the code by introducing temporary variables/names
> just for the sake of a few characters of screen width.  Even if located
> close lexically, when reading the following code you still have to trace
> back from the 'nullarg' usage to its 'null_pointer_node' definition in
> order to figure out what a 'nullarg' might be:
> 
>> 		      if (present)
>> 			ptr
>> 			  = gfc_build_conditional_assign_expr (block, present,
>> 							       ptr, nullarg);

The snippet of code above looks like it might be the symptom
of another common problem: deeply nested conditionals, case
statements, or loops in very large functions.  Those usually
make it much harder to follow code than local variables or
expressions that are broken up to fit the width limit.
Shorter functions typically also means fewer local variables.

One thing I find improves readability in functions with many
local variables is declaring const those that don't change
after initialization.  That also enforces the initialization-
on-declaration coding style, and can result in more efficient
code.

Another solution that might help in this context is default
function arguments: if the last argument may be null, making
it the default in the function declaration avoids having to
pass it explicitly.

> 
>> Another option would be shorten the name of the function, say
>> s/conditional/cond/.

As long as it doesn't compromise readability this sounds like
a good suggestion for a change to the function above.  _cond_
is no less clear or descriptive than _conditional_, similarly
to _expr vs _expression.

> 
> Likewise I object to "crippling" identifier names like that just for the
> sake of a few characters of screen width.  (Here of course, "cond", or
> the existing "expr" might be fine abbreviations, but my point is about
> the general case.)
> 
>> There were some discussions about lifting the 80 column restriction and bump
>> it to something like +-130, but nothing happened yet.
> 
> Indeed.  :-)
> 
> In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner
> stated that even he is not using a 80 x 24 terminal anymore, and that
> should tell us something.  ;-)
> 
> So, I formally propose that we lift this characters per line restriction
> from IBM punch card (80) to mainframe line printer (132).

I'm not a fan of rigid rules, especially those that are subject
to personal style preferences.  At the same time I wouldn't like
to see lines become as long as this as the norm.  As others,
I have windows om my desktop arranged in a way to maximize screen
real estate: three columns of editor/debugger and a couple of
terminals on of top of the other.  They only fit because they're
all 80 characters wide.  But more important:

> deep
> indentation often is a sign that code should be split out into a separate
> function, for example.

Exactly.

Martin

   My point is just to avoid things like the two
> examples cited above.
> 
> Also, I'm not proposing any mass-reformatting of the existing code, or
> re-writing all "expr" into "expression".
> 
> Tasks:
> 
>    - Discussion.
>    - Get agreement/make a decision (by means still to be determined).
>    - Put suitable Emacs/Vim configuration files into the source tree?
>    - Update coding style guidelines.
> 
> 
> Grüße
>   Thomas
>
James Secan Dec. 5, 2019, 7:16 p.m. UTC | #17
All,

As a certified Old Guy (coding in FORTRAN since 1967) my default mode is to stop at 72 characters.  It would be nice to push the max out a bit, but I’ll most likely keep my lines (and my function and variable names!) short.  As always, brevity is the soul of wit.

And yes, please keep up the good work.

Jim
3222 NE 89th St
Seattle, WA 98115
(206) 430-0109

> On Dec 5, 2019, at 10:21 AM, Robin Curtis <Curtis@geoscience.co.uk> wrote:
> 
> Is this debate for real ?!  - or is this a Christmas spoof ? 
> 
> External observer…..keep up the great work. :)
> 
> (while I punch out a few more 80 column cards).
Segher Boessenkool Dec. 5, 2019, 8:06 p.m. UTC | #18
Hi!

On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote:
> C++17 introduces a nice feature, with rationale similar to declaring
> variables in a for-loop init-statement:
> 
> if (auto var = foo(); bar(var))

Similar to GNU C statement expressions, which are *also* only a good
idea to use in limited cases.

> The variable is only in scope for the block where you need it, just
> like a for-loop.
> 
> Unfortunately nearly every time I've tried to use this recently, I've
> found it's impossible in 80 columns, e.g. this from yesterday:
> 
>     if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=>
> 0; __c != 0)
>       return __c;
> 
> When you're forced to uglify every variable with a leading __ you run
> out of characters pretty damn quickly.

If using this "nice feature" forces you to uglify your code, then maybe
it is not such a nice feature, and you should not use it.

If you have issues with scoping your functions are WAY too long already.


Segher
Segher Boessenkool Dec. 5, 2019, 8:12 p.m. UTC | #19
On Thu, Dec 05, 2019 at 04:44:21PM +0000, Michael Matz wrote:
> (oh a flame bait :) )

I will blissfully ignore that warning.

> On Thu, 5 Dec 2019, Thomas Schwinge wrote:
> I object to cluttering code in excuse for using sensible function names or 
> temporaries that otherwise can help clearing up code.  Using 132-char 
> lines is cluttering code:
> - long lines are harder to read/grasp: vertical eye movement is easier 
>   than horizontal, and source code should be optimized for 
>   reading, not writing
> - long lines make it impossible to have two files next to each other at a 
>   comfortable font size
> - long lines are incompatible with existing netiquette re emails, for 
>   instance
> 
> So, at least for me, that my terminals are 80 wide (but not x24) has 
> multiple reasons, and the _least_ of it is because that's what punch cards 
> had.

I agree with all of this.

If you have a hard time writing nicely readable code in 80 columns, you
will have a much harder time still using more columns.  80 is somewhat
too long already (~60 is better), but we have indents in source code,
so that's alright.  And you should not indent that far, so this all
works out splendidly.


Segher
Segher Boessenkool Dec. 5, 2019, 8:21 p.m. UTC | #20
On Thu, Dec 05, 2019 at 05:04:12PM +0100, Jakub Jelinek wrote:
> On Thu, Dec 05, 2019 at 04:46:45PM +0100, Thomas Schwinge wrote:
> > On 2019-12-05T16:15:15+0100, Jakub Jelinek <jakub@redhat.com> wrote:
> > > [...] much more indented though, but you could
> > > use a temporary, like:
> > > 		      tree nullarg = null_pointer_node;
> > 
> > I object to cluttering the code by introducing temporary variables/names
> > just for the sake of a few characters of screen width.  Even if located
> > close lexically, when reading the following code you still have to trace
> > back from the 'nullarg' usage to its 'null_pointer_node' definition in
> > order to figure out what a 'nullarg' might be:
> > 
> > > 		      if (present)
> > > 			ptr
> > > 			  = gfc_build_conditional_assign_expr (block, present,
> > > 							       ptr, nullarg);
> > 
> > > Another option would be shorten the name of the function, say
> > > s/conditional/cond/.
> > 
> > Likewise I object to "crippling" identifier names like that just for the
> > sake of a few characters of screen width.  (Here of course, "cond", or
> > the existing "expr" might be fine abbreviations, but my point is about
> > the general case.)
> 
> The point about temporaries is general, and I believe they actually make
> code much more readable.  Mostly about coding style like:
> 	t = fold_build2_loc (loc, code, fold_build2_loc (loc, code2,
> 							 something1,
> 							 something2),
> 			     fold_build2_loc (loc, code3, something3,
> 					      something4));
> vs.
> 	tree op1 = fold_build2_loc (loc, code2, something1, something2);
> 	tree op2 = fold_build2_loc (loc, code3, something3, something4);
> 	t = fold_build2_loc (loc, code, op1, op2);

Yes.  And the names, even if they do not say much, *do* say enough to
help comprehending the code.  They help structure it.

> The above case is extreme in both being indented quite a lot (general rule
> is to consider outlining something into a function then)

I hope you mean actual factoring, not just outlining :-)  If you pick good
factors you can give them good names, too.

Good names help reading the code.  And on the other hand, when it is hard
to come up with a good name for a piece of code, it is probably not chosen
as a good factor anyway!

> and using
> way too long function names.  If you look at the earlier suggestion where
> the code is indented reasonably, using the temporary there makes the code more
> readable and shorter.

Yup.


Segher
Segher Boessenkool Dec. 5, 2019, 8:32 p.m. UTC | #21
On Thu, Dec 05, 2019 at 11:54:04AM -0700, Martin Sebor wrote:
> >>		      if (present)
> >>			ptr
> >>			  = gfc_build_conditional_assign_expr (block, 
> >>			  present,
> >>							       ptr, nullarg);
> 
> The snippet of code above looks like it might be the symptom
> of another common problem: deeply nested conditionals, case
> statements, or loops in very large functions.  Those usually
> make it much harder to follow code than local variables or
> expressions that are broken up to fit the width limit.
> Shorter functions typically also means fewer local variables.

Yes, and you only get problems that you do not know which var is which,
or you do not know what value it is set to because it was set some 2800
(or 40 or whatever) lines ago, in WAY too long functions.

All those problems do not exist in well-factored code.  The point is not
to have short routines: the point is to not have too much (external)
complexity per routine.  A routine should ideally only do one thing, and
its name should describe what it does.


Segher
Marek Polacek Dec. 5, 2019, 8:38 p.m. UTC | #22
On Thu, Dec 05, 2019 at 02:06:50PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote:
> > C++17 introduces a nice feature, with rationale similar to declaring
> > variables in a for-loop init-statement:
> > 
> > if (auto var = foo(); bar(var))
> 
> Similar to GNU C statement expressions, which are *also* only a good
> idea to use in limited cases.
> 
> > The variable is only in scope for the block where you need it, just
> > like a for-loop.
> > 
> > Unfortunately nearly every time I've tried to use this recently, I've
> > found it's impossible in 80 columns, e.g. this from yesterday:
> > 
> >     if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=>
> > 0; __c != 0)
> >       return __c;
> > 
> > When you're forced to uglify every variable with a leading __ you run
> > out of characters pretty damn quickly.
> 
> If using this "nice feature" forces you to uglify your code, then maybe
> it is not such a nice feature, and you should not use it.

I disagree, it is a nice feature, without quotes.  It's Good Style not to
leak variables into enclosing scopes.

> If you have issues with scoping your functions are WAY too long already.

I don't think that's the case here.

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
Jason Merrill Dec. 5, 2019, 8:41 p.m. UTC | #23
On Thu, Dec 5, 2019 at 11:51 AM Michael Matz <matz@suse.de> wrote:

> Hello,
>
> (oh a flame bait :) )
>
> On Thu, 5 Dec 2019, Thomas Schwinge wrote:
>
> > So, I formally propose that we lift this characters per line restriction
> > from IBM punch card (80) to mainframe line printer (132).
> >
> > Tasks:
> >
> >   - Discussion.
>
> I object to cluttering code in excuse for using sensible function names or
> temporaries that otherwise can help clearing up code.  Using 132-char
> lines is cluttering code:
> - long lines are harder to read/grasp: vertical eye movement is easier
>   than horizontal, and source code should be optimized for
>   reading, not writing
> - long lines make it impossible to have two files next to each other at a
>   comfortable font size
> - long lines are incompatible with existing netiquette re emails, for
>   instance
>
> So, at least for me, that my terminals are 80 wide (but not x24) has
> multiple reasons, and the _least_ of it is because that's what punch cards
> had.
>

Agreed.  I work with two side-by-side terminals, one 80x50 and the other as
wide as fits in the rest of the screen, which currently happens to be
111x50.

Jason
Jonathan Wakely Dec. 5, 2019, 8:56 p.m. UTC | #24
On Thu, 5 Dec 2019 at 20:07, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Hi!
>
> On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote:
> > C++17 introduces a nice feature, with rationale similar to declaring
> > variables in a for-loop init-statement:
> >
> > if (auto var = foo(); bar(var))
>
> Similar to GNU C statement expressions, which are *also* only a good
> idea to use in limited cases.
>
> > The variable is only in scope for the block where you need it, just
> > like a for-loop.
> >
> > Unfortunately nearly every time I've tried to use this recently, I've
> > found it's impossible in 80 columns, e.g. this from yesterday:
> >
> >     if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=>
> > 0; __c != 0)
> >       return __c;
> >
> > When you're forced to uglify every variable with a leading __ you run
> > out of characters pretty damn quickly.
>
> If using this "nice feature" forces you to uglify your code, then maybe
> it is not such a nice feature, and you should not use it.

The uglification has absolutely nothing to do with the 'if'
init-statement feature, all code in libstdc++ headers has to be
uglified, always. Blame the C preprocessor for that, not C++ features.

My point is that 80 characters runs out quicker when 10% of it goes on
visual noise that's only needed because the C preprocessor means we
can't have nice names.

> If you have issues with scoping your functions are WAY too long already.

I don't have issues with scoping, it's just good practice to limit the
scope to the minimum necessary.

The example I'm talking about is:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/bits/stl_algobase.h;h=a2fd306e6d0cca579b510148ba1a7089e2b2f3a2;hb=HEAD#l1499

That function is 46 lines long, including a micro-optimisation to use
memcmpy when appropriate. About 15% of that is on assertions to help
users debug their mistakes. Another five lines are the compile-time
if-constexpr checks to decide when the optimisation is appropriate,
which result in deep nesting, but not because of any complex if-else
branches. It could be nested less deeply to reduce indentation by 4
spaces, but that would result in more template instantiations for
users, which would take more time and memory to compile. When you have
a million users including the header in every C++ program, little
things like that help. So the code is written in an unidiomatic way to
optimise for compilation speed, not readability.

But the end result is that the call to __min_cmp is indented about 25%
of the way into the available space, and that the FIRST line of
executable code in the function. The code is not way too long, and
writing it differently would compile slower. The constraints that
apply to that code are quite different to the internals of GCC, which
most users never see or even compile themselves.

(Yes, maybe libstdc++ should stop indenting everything by 2 columns
when it's inside a namespace, given that almost everything is inside
at least one level of namespace ... that might be a good idea even if
the column limit is extended to 100 or 132, but we'll still have all
the __ prefixes eating up space.)
Segher Boessenkool Dec. 5, 2019, 10:02 p.m. UTC | #25
On Thu, Dec 05, 2019 at 03:38:15PM -0500, Marek Polacek wrote:
> On Thu, Dec 05, 2019 at 02:06:50PM -0600, Segher Boessenkool wrote:
> > > When you're forced to uglify every variable with a leading __ you run
> > > out of characters pretty damn quickly.
> > 
> > If using this "nice feature" forces you to uglify your code, then maybe
> > it is not such a nice feature, and you should not use it.
> 
> I disagree, it is a nice feature, without quotes.  It's Good Style not to
> leak variables into enclosing scopes.

It *is* a quote, from Jonathan's mail.

Why is this Good Style?  (And according to who?)

> > If you have issues with scoping your functions are WAY too long already.
> 
> I don't think that's the case here.

Then it does not hurt to have a local that is visible in slightly longer
scope than necessary.

Simpler code is better code.


Segher
Segher Boessenkool Dec. 5, 2019, 10:19 p.m. UTC | #26
On Thu, Dec 05, 2019 at 08:56:35PM +0000, Jonathan Wakely wrote:
> On Thu, 5 Dec 2019 at 20:07, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote:
> > > C++17 introduces a nice feature, with rationale similar to declaring
> > > variables in a for-loop init-statement:

> > > Unfortunately nearly every time I've tried to use this recently, I've
> > > found it's impossible in 80 columns, e.g. this from yesterday:
> > >
> > >     if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=>
> > > 0; __c != 0)
> > >       return __c;
> > >
> > > When you're forced to uglify every variable with a leading __ you run
> > > out of characters pretty damn quickly.
> >
> > If using this "nice feature" forces you to uglify your code, then maybe
> > it is not such a nice feature, and you should not use it.
> 
> The uglification has absolutely nothing to do with the 'if'
> init-statement feature, all code in libstdc++ headers has to be
> uglified, always. Blame the C preprocessor for that, not C++ features.
> 
> My point is that 80 characters runs out quicker when 10% of it goes on
> visual noise that's only needed because the C preprocessor means we
> can't have nice names.

(Not sure where the preprocessor comes in, these underscores are just to
satisfy language rules afaics, but maybe you mean something else?)

Or you could write

    auto __c = (__builtin_memcmp(&*__first1, &*__first2, __len) <=> 0);
    if (__c)
      return __c;

which is much easier to read, to my eyes anyway.  And it is exactly the
same for the compiler.

Wrapping everything inside-out just for the artificial goal of having
things defined in slightly tighter scopes feels futile.


Segher
Jonathan Wakely Dec. 5, 2019, 10:33 p.m. UTC | #27
On Thu, 5 Dec 2019 at 22:19, Segher Boessenkool wrote:
>
> On Thu, Dec 05, 2019 at 08:56:35PM +0000, Jonathan Wakely wrote:
> > On Thu, 5 Dec 2019 at 20:07, Segher Boessenkool
> > <segher@kernel.crashing.org> wrote:
> > > On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote:
> > > > C++17 introduces a nice feature, with rationale similar to declaring
> > > > variables in a for-loop init-statement:
>
> > > > Unfortunately nearly every time I've tried to use this recently, I've
> > > > found it's impossible in 80 columns, e.g. this from yesterday:
> > > >
> > > >     if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=>
> > > > 0; __c != 0)
> > > >       return __c;
> > > >
> > > > When you're forced to uglify every variable with a leading __ you run
> > > > out of characters pretty damn quickly.
> > >
> > > If using this "nice feature" forces you to uglify your code, then maybe
> > > it is not such a nice feature, and you should not use it.
> >
> > The uglification has absolutely nothing to do with the 'if'
> > init-statement feature, all code in libstdc++ headers has to be
> > uglified, always. Blame the C preprocessor for that, not C++ features.
> >
> > My point is that 80 characters runs out quicker when 10% of it goes on
> > visual noise that's only needed because the C preprocessor means we
> > can't have nice names.
>
> (Not sure where the preprocessor comes in, these underscores are just to
> satisfy language rules afaics, but maybe you mean something else?)

The language rules are only necessary because of the preprocessor.
Users can define macros with names like "pred" and "cmp" before
including any standard library header, and because macros don't
respect lexical scope, that would break any standard library code
using "pred" and "cmp". So the std::lib has to use reserved names.

It's entirely due to the preprocessor that we can't use sane names for
local variables, or for any implementation detail in a header (unlike
in C, our helper types and functions are in a namespace so won't
collide with users' types and functions, but we still have to use
reserved names because the preprocessor doesn't respect namespaces).
Jonathan Wakely Dec. 5, 2019, 10:37 p.m. UTC | #28
On Thu, 5 Dec 2019 at 22:19, Segher Boessenkool wrote:
> Or you could write
>
>     auto __c = (__builtin_memcmp(&*__first1, &*__first2, __len) <=> 0);
>     if (__c)
>       return __c;
>
> which is much easier to read, to my eyes anyway.  And it is exactly the
> same for the compiler.

In this case yes, but not in general.

Given:

auto x = foo();
if (bar(x))
{ }
some_type y;

The destructor of x won't run until after y has been destroyed. That's
not at all identical to:

if (auto x = foo(); bar(x))
{ }
some_type y;

Please don't try to tell me how C++ works :-)
Segher Boessenkool Dec. 5, 2019, 11:02 p.m. UTC | #29
On Thu, Dec 05, 2019 at 10:37:33PM +0000, Jonathan Wakely wrote:
> On Thu, 5 Dec 2019 at 22:19, Segher Boessenkool wrote:
> > Or you could write
> >
> >     auto __c = (__builtin_memcmp(&*__first1, &*__first2, __len) <=> 0);
> >     if (__c)
> >       return __c;
> >
> > which is much easier to read, to my eyes anyway.  And it is exactly the
> > same for the compiler.
> 
> In this case yes, but not in general.
> 
> Given:
> 
> auto x = foo();
> if (bar(x))
> { }
> some_type y;
> 
> The destructor of x won't run until after y has been destroyed. That's
> not at all identical to:
> 
> if (auto x = foo(); bar(x))
> { }
> some_type y;
> 
> Please don't try to tell me how C++ works :-)

I don't, I wouldn't even *know* that.  But this is just the same as in C,
and I do know how to write good C code.

I don't think doing non-trivial things with constructors and destructors
(or anything else!) implicitly is a good idea at all, but that's an
altogether different subject.


Segher
Andrew Stubbs Dec. 6, 2019, 9:22 a.m. UTC | #30
On 05/12/2019 18:21, Robin Curtis wrote:
> My IBM Selectric golfball electronic printer only does 90 characters on A4 in portrait mode………(at 10 cps)
> 
> (as for my all electric TELEX Teleprinter machine !)
> 
> Is this debate for real ?!  - or is this a Christmas spoof ?

I can't speak for the debate, but the pain is real.

Andrew
Thomas Schwinge Dec. 7, 2019, 2:49 p.m. UTC | #31
Hi Tobias!

On 2019-11-20T14:06:18+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> This patch does two things regarding explicit and automatical variable 
> mapping to offloaded devices:

Thanks!


> Compared to the previous patch set,** I added several OpenMP test cases 
> – and fixed the fallout.

I'm seeing:

    PASS: libgomp.fortran/use_device_addr-3.f90   -O0  (test for excess errors)
    PASS: libgomp.fortran/use_device_addr-3.f90   -O0  execution test
    PASS: libgomp.fortran/use_device_addr-3.f90   -O1  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90   -O1  execution test
    PASS: libgomp.fortran/use_device_addr-3.f90   -O2  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90   -O2  execution test
    PASS: libgomp.fortran/use_device_addr-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    PASS: libgomp.fortran/use_device_addr-3.f90   -O3 -g  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90   -O3 -g  execution test
    PASS: libgomp.fortran/use_device_addr-3.f90   -Os  (test for excess errors)
    PASS: libgomp.fortran/use_device_addr-3.f90   -Os  execution test
    
    libgomp: Trying to map into device [0x7ffd7915a500..0x7ffd7915a500) object when [0x7ffd7915c440..0x7ffd7915c440) is already mapped


    PASS: libgomp.fortran/use_device_addr-4.f90   -O0  (test for excess errors)
    PASS: libgomp.fortran/use_device_addr-4.f90   -O0  execution test
    PASS: libgomp.fortran/use_device_addr-4.f90   -O1  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-4.f90   -O1  execution test
    PASS: libgomp.fortran/use_device_addr-4.f90   -O2  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-4.f90   -O2  execution test
    PASS: libgomp.fortran/use_device_addr-4.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-4.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    PASS: libgomp.fortran/use_device_addr-4.f90   -O3 -g  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-4.f90   -O3 -g  execution test
    PASS: libgomp.fortran/use_device_addr-4.f90   -Os  (test for excess errors)
    PASS: libgomp.fortran/use_device_addr-4.f90   -Os  execution test
    
    libgomp: Trying to map into device [0x7ffc70529ca0..0x7ffc70529ca0) object when [0x7ffc7052ac40..0x7ffc7052ac40) is already mapped


> Except for trivial changes to libgomp/oacc-mem.c

Just because something is just a few lines of code doesn't mean that it's
trivial.  I had asked you to first resolve that issue separately
(referencing PR92726), instead of mixing these changes into the Fortran
optional agruments commit:
<http://mid.mail-archive.com/87lfrylp8k.fsf@euler.schwinge.homeip.net>
"[PR92726] OpenACC: 'NULL'-in -> no-op, and/or 'NULL'-out".

There was no reason to rush in the Fortran optional arguments changes
before resolving the 'NULL' changes, was there?


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95
> @@ -0,0 +1,23 @@
> +! Test that the cache directives work with optional arguments.  [...]

Missing '{ dg-do run }'.


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95

> +! of giving a non-present argument to the cache directive is not tested as
> +! it is undefined.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90
> @@ -0,0 +1,112 @@
> +! Test that optional arguments work in firstprivate clauses.  The effect of
> +! non-present arguments in firstprivate clauses is undefined, and is not
> +! tested for.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90
> @@ -0,0 +1,115 @@
> +! Test that optional arguments work in private clauses.  The effect of
> +! non-present arguments in private clauses is undefined, and is not tested

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90
> @@ -0,0 +1,69 @@
> +! Test optional arguments in reduction clauses.  The effect of
> +! non-present arguments in reduction clauses is undefined, and is not tested

Once you've got access, please file a ticket at
<https://github.com/OpenACC/openacc-spec/issues> so this gets clarified.


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90

> +! { dg-additional-options "-w" }

Why that?


Grüße
 Thomas
Tobias Burnus Dec. 11, 2019, 10:51 a.m. UTC | #32
Hi Thomas,

[Attached patch committed as Rev. 279217]
[One OpenMP (+OpenACC) patch and one OpenACC-only patch pending review 
are linked below.]

On 12/7/19 3:49 PM, Thomas Schwinge wrote:

> I'm seeing:
>      [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90   -O1  execution test

Whether it passed or not depended whether the stack was NULL for the 
local "array_arg.0" variable (which got assigned "array_arg->data" if 
the argument was present). Fixed by the following patch, which also 
fixes some more corner cases:

https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00707.html – pending 
Jakub's review

>> Except for trivial changes to libgomp/oacc-mem.c
> Just because something is just a few lines of code doesn't mean that it's
> trivial.  I had asked you to first resolve that issue separately
> (referencing PR92726)

I vaguely remembered the this email – but couldn't find it in the 
thread. (This thread has 24 emails!)
Turned out that you wrote tht email in a different thread – and while 
this patch is mainly about Fortran, your email was also not set to 
fortran@ (arguably, your quoted text didn't include any Fortran bits).

Pending OpenACC review the thread you mentedion: 
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00062.html

>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95
>> @@ -0,0 +1,23 @@
>> +! Test that the cache directives work with optional arguments.  [...]
> Missing '{ dg-do run }'.

Fixed/committed.

>> +! of giving a non-present argument to the cache directive is not tested as
>> +! it is undefined. […]
>> +! The effect of
>> +! non-present aguments in firstprivate clauses is undefined […] The effect of
>> +! non-present arguments in reduction clauses is undefined
> Once you've got access, please file a ticket at
> <https://github.com/OpenACC/openacc-spec/issues> so this gets clarified.

I will try to remember this – it's now on my to-do list.

>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90
>> +! { dg-additional-options "-w" }
> Why that?

No idea (was added by Kwok) – I do see warnings with "-Wall", but not 
with default options. (It then shows warnings like: "‘rg.25’ is used 
uninitialized in this function [-Wuninitialized]".)

I have now also removed this line → committed after testing that it runs 
through with nvptx.

Tobias

Patch
diff mbox series

2019-10-20  Tobias Burnus  <tobias@codesourcery.com>
	    Kwok Cheung Yeung <kcy@codesourcery.com>

	gcc/fortran/
	* trans-openmp.c (gfc_build_conditional_assign, 
	gfc_build_conditional_assign_expr): New static functions.
	(gfc_omp_finish_clause, gfc_trans_omp_clauses): Handle mapping of
	absent optional arguments and fix mapping of present optional args.

	gcc/
	* omp-low.c (lower_omp_target): For optional arguments, deref once
	more to obtain the type.

	libgomp/
	* oacc-mem.c (update_dev_host, gomp_acc_insert_pointer): Just return
	if input it a NULL pointer.
	* testsuite/libgomp.oacc-c-c++-common/lib-43.c: Remove; dependent on
	diagnostic of NULL pointer.
	* testsuite/libgomp.oacc-c-c++-common/lib-47.c: Ditto.
	* testsuite/libgomp.fortran/optional-map.f90: New.
	* testsuite/libgomp.fortran/use_device_addr-1.f90
	(test_dummy_opt_callee_1_absent): New.
	(test_dummy_opt_call_1): Call it.
	* testsuite/libgomp.fortran/use_device_addr-2.f90: Likewise.
	* testsuite/libgomp.fortran/use_device_addr-3.f90: Likewise.
	* testsuite/libgomp.fortran/use_device_addr-4.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/optional-cache.f95: New.
	* testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-data-copyin.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-data-copyout.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-declare.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-firstprivate.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-host_data.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-nested-calls.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-private.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-reduction.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-update-device.f90: New.
	* testsuite/libgomp.oacc-fortran/optional-update-host.f90: New.

 gcc/fortran/trans-openmp.c                                               | 224 ++++++++++++++++++++++++++++++++++++---
 gcc/omp-low.c                                                            |   3 +-
 libgomp/oacc-mem.c                                                       |   9 ++
 libgomp/testsuite/libgomp.fortran/optional-map.f90                       | 119 +++++++++++++++++++++
 libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90                  |  36 +++++++
 libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90                  |  36 +++++++
 libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90                  |  27 +++++
 libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90                  |  27 +++++
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c                     |  51 ---------
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c                     |  49 ---------
 libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95                |  23 ++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90 |  29 +++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin.f90          | 140 ++++++++++++++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyout.f90         |  96 +++++++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90      |  91 ++++++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-declare.f90              |  87 +++++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90         | 112 ++++++++++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-host_data.f90            |  39 +++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-nested-calls.f90         | 135 +++++++++++++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90              | 115 ++++++++++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90            |  69 ++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-update-device.f90        | 121 +++++++++++++++++++++
 libgomp/testsuite/libgomp.oacc-fortran/optional-update-host.f90          | 115 ++++++++++++++++++++
 23 files changed, 1640 insertions(+), 113 deletions(-)

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index d9dfcabc65e..77bc9120d85 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1175,6 +1175,64 @@  gfc_omp_clause_dtor (tree clause, tree decl)
   return tem;
 }
 
+/* Build a conditional expression in BLOCK.  If COND_VAL is not
+   null, then the block THEN_B is executed, otherwise ELSE_VAL
+   is assigned to VAL.  */
+
+static void
+gfc_build_conditional_assign (stmtblock_t *block,
+			      tree val,
+			      tree cond_val,
+			      tree then_b,
+			      tree else_val)
+{
+  stmtblock_t cond_block;
+  tree cond, else_b = NULL_TREE;
+  tree val_ty = TREE_TYPE (val);
+
+  if (else_val)
+    {
+      gfc_init_block (&cond_block);
+      gfc_add_modify (&cond_block, val, fold_convert (val_ty, else_val));
+      else_b = gfc_finish_block (&cond_block);
+    }
+  cond = fold_build2_loc (input_location, NE_EXPR,
+			  logical_type_node,
+			  cond_val, null_pointer_node);
+  gfc_add_expr_to_block (block,
+			 build3_loc (input_location,
+				     COND_EXPR,
+				     void_type_node,
+				     cond, then_b,
+				     else_b));
+}
+
+/* Build a conditional expression in BLOCK, returning a temporary
+   variable containing the result.  If COND_VAL is not null, then
+   THEN_VAL will be assigned to the variable, otherwise ELSE_VAL
+   is assigned.
+ */
+
+static tree
+gfc_build_conditional_assign_expr (stmtblock_t *block,
+				   tree cond_val,
+				   tree then_val,
+				   tree else_val)
+{
+  tree val;
+  tree val_ty = TREE_TYPE (then_val);
+  stmtblock_t cond_block;
+
+  val = create_tmp_var (val_ty);
+
+  gfc_init_block (&cond_block);
+  gfc_add_modify (&cond_block, val, then_val);
+  tree then_b = gfc_finish_block (&cond_block);
+
+  gfc_build_conditional_assign (block, val, cond_val, then_b, else_val);
+
+  return val;
+}
 
 void
 gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
@@ -1199,6 +1257,8 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
     }
 
   tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE;
+  tree present = gfc_omp_is_optional_argument (decl)
+		 ? gfc_omp_check_optional_argument (decl, true) : NULL_TREE;
   if (POINTER_TYPE_P (TREE_TYPE (decl)))
     {
       if (!gfc_omp_privatize_by_reference (decl)
@@ -1213,8 +1273,30 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
       OMP_CLAUSE_DECL (c4) = decl;
       OMP_CLAUSE_SIZE (c4) = size_int (0);
       decl = build_fold_indirect_ref (decl);
-      OMP_CLAUSE_DECL (c) = decl;
-      OMP_CLAUSE_SIZE (c) = NULL_TREE;
+      if (present
+	  && (GFC_DECL_GET_SCALAR_POINTER (orig_decl)
+	      || GFC_DECL_GET_SCALAR_ALLOCATABLE (orig_decl)))
+	{
+	  c2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
+	  OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_POINTER);
+	  OMP_CLAUSE_DECL (c2) = decl;
+	  OMP_CLAUSE_SIZE (c2) = size_int (0);
+
+	  stmtblock_t block;
+	  gfc_start_block (&block);
+	  tree ptr = decl;
+	  ptr = gfc_build_conditional_assign_expr (&block, present, decl,
+						   null_pointer_node);
+	  gimplify_and_add (gfc_finish_block (&block), pre_p);
+	  ptr = build_fold_indirect_ref (ptr);
+	  OMP_CLAUSE_DECL (c) = ptr;
+	  OMP_CLAUSE_SIZE (c) = TYPE_SIZE_UNIT (TREE_TYPE (ptr));
+	}
+      else
+	{
+	  OMP_CLAUSE_DECL (c) = decl;
+	  OMP_CLAUSE_SIZE (c) = NULL_TREE;
+	}
       if (TREE_CODE (TREE_TYPE (orig_decl)) == REFERENCE_TYPE
 	  && (GFC_DECL_GET_SCALAR_POINTER (orig_decl)
 	      || GFC_DECL_GET_SCALAR_ALLOCATABLE (orig_decl)))
@@ -1232,17 +1314,43 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
       stmtblock_t block;
       gfc_start_block (&block);
       tree type = TREE_TYPE (decl);
-      tree ptr = gfc_conv_descriptor_data_get (decl);
+      tree ptr;
+
+      if (present)
+	ptr = gfc_build_conditional_assign_expr (
+		&block, present,
+		gfc_conv_descriptor_data_get (decl),
+		null_pointer_node);
+      else
+	ptr = gfc_conv_descriptor_data_get (decl);
       ptr = fold_convert (build_pointer_type (char_type_node), ptr);
       ptr = build_fold_indirect_ref (ptr);
       OMP_CLAUSE_DECL (c) = ptr;
       c2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
       OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_TO_PSET);
-      OMP_CLAUSE_DECL (c2) = decl;
+      if (present)
+	{
+	  ptr = create_tmp_var (TREE_TYPE (TREE_OPERAND (decl, 0)));
+	  gfc_add_modify (&block, ptr, TREE_OPERAND (decl, 0));
+
+	  OMP_CLAUSE_DECL (c2) = build_fold_indirect_ref (ptr);
+	}
+      else
+	OMP_CLAUSE_DECL (c2) = decl;
       OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (type);
       c3 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
       OMP_CLAUSE_SET_MAP_KIND (c3, GOMP_MAP_POINTER);
-      OMP_CLAUSE_DECL (c3) = gfc_conv_descriptor_data_get (decl);
+      if (present)
+	{
+	  ptr = gfc_conv_descriptor_data_get (decl);
+	  ptr = gfc_build_addr_expr (NULL, ptr);
+	  ptr = gfc_build_conditional_assign_expr (&block, present,
+						   ptr, null_pointer_node);
+	  ptr = build_fold_indirect_ref (ptr);
+	  OMP_CLAUSE_DECL (c3) = ptr;
+	}
+      else
+	OMP_CLAUSE_DECL (c3) = gfc_conv_descriptor_data_get (decl);
       OMP_CLAUSE_SIZE (c3) = size_int (0);
       tree size = create_tmp_var (gfc_array_index_type);
       tree elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type));
@@ -1268,11 +1376,36 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
 	  tem = gfc_conv_descriptor_data_get (decl);
 	  tem = fold_convert (pvoid_type_node, tem);
 	  cond = fold_build2_loc (input_location, NE_EXPR,
-				  logical_type_node, tem, null_pointer_node);
+				  boolean_type_node, tem, null_pointer_node);
+	  if (present)
+	    {
+	      tem = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
+				     present, null_pointer_node);
+	      cond = fold_build2_loc (input_location, TRUTH_ANDIF_EXPR,
+				      boolean_type_node, tem, cond);
+	    }
 	  gfc_add_expr_to_block (&block, build3_loc (input_location, COND_EXPR,
 						     void_type_node, cond,
 						     then_b, else_b));
 	}
+      else if (present)
+	{
+	  stmtblock_t cond_block;
+	  tree then_b;
+
+	  gfc_init_block (&cond_block);
+	  gfc_add_modify (&cond_block, size,
+			  gfc_full_array_size (&cond_block, decl,
+					       GFC_TYPE_ARRAY_RANK (type)));
+	  gfc_add_modify (&cond_block, size,
+			  fold_build2 (MULT_EXPR, gfc_array_index_type,
+				       size, elemsz));
+	  then_b = gfc_finish_block (&cond_block);
+
+	  gfc_build_conditional_assign (&block, size, present, then_b,
+					build_int_cst (gfc_array_index_type,
+						       0));
+	}
       else
 	{
 	  gfc_add_modify (&block, size,
@@ -2252,6 +2385,9 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		TREE_ADDRESSABLE (decl) = 1;
 	      if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL)
 		{
+		  tree present = gfc_omp_is_optional_argument (decl)
+				 ? gfc_omp_check_optional_argument (decl, true)
+				 : NULL_TREE;
 		  if (POINTER_TYPE_P (TREE_TYPE (decl))
 		      && (gfc_omp_privatize_by_reference (decl)
 			  || GFC_DECL_GET_SCALAR_POINTER (decl)
@@ -2284,6 +2420,10 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		    {
 		      tree type = TREE_TYPE (decl);
 		      tree ptr = gfc_conv_descriptor_data_get (decl);
+		      if (present)
+			ptr = gfc_build_conditional_assign_expr (
+			       block, present, ptr,
+			       null_pointer_node);
 		      ptr = fold_convert (build_pointer_type (char_type_node),
 					  ptr);
 		      ptr = build_fold_indirect_ref (ptr);
@@ -2296,8 +2436,19 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
 		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
-		      OMP_CLAUSE_DECL (node3)
-			= gfc_conv_descriptor_data_get (decl);
+		      if (present)
+			{
+			  ptr = gfc_conv_descriptor_data_get (decl);
+			  ptr = gfc_build_addr_expr (NULL, ptr);
+			  ptr = gfc_build_conditional_assign_expr (
+				  block, present, ptr,
+				  null_pointer_node);
+			  ptr = build_fold_indirect_ref (ptr);
+			  OMP_CLAUSE_DECL (node3) = ptr;
+			}
+		      else
+			OMP_CLAUSE_DECL (node3)
+			  = gfc_conv_descriptor_data_get (decl);
 		      OMP_CLAUSE_SIZE (node3) = size_int (0);
 
 		      /* We have to check for n->sym->attr.dimension because
@@ -2322,8 +2473,20 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			  tem = gfc_conv_descriptor_data_get (decl);
 			  tem = fold_convert (pvoid_type_node, tem);
 			  cond = fold_build2_loc (input_location, NE_EXPR,
-						  logical_type_node,
+						  boolean_type_node,
 						  tem, null_pointer_node);
+			  if (present)
+			    {
+			      tree tmp = fold_build2_loc (input_location,
+							  NE_EXPR,
+							  boolean_type_node,
+							  present,
+							  null_pointer_node);
+			      cond = fold_build2_loc (input_location,
+						      TRUTH_ANDIF_EXPR,
+						      boolean_type_node,
+						      tmp, cond);
+			    }
 			  gfc_add_expr_to_block (block,
 						 build3_loc (input_location,
 							     COND_EXPR,
@@ -2333,9 +2496,34 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			  OMP_CLAUSE_SIZE (node) = size;
 			}
 		      else if (n->sym->attr.dimension)
-			OMP_CLAUSE_SIZE (node)
-			  = gfc_full_array_size (block, decl,
-						 GFC_TYPE_ARRAY_RANK (type));
+			{
+			  stmtblock_t cond_block;
+			  gfc_init_block (&cond_block);
+			  tree size = gfc_full_array_size (&cond_block, decl,
+					GFC_TYPE_ARRAY_RANK (type));
+			  if (present)
+			    {
+			      tree var = gfc_create_var (gfc_array_index_type,
+							 NULL);
+			      tree cond = fold_build2_loc (input_location,
+							   NE_EXPR,
+							   boolean_type_node,
+							   present,
+							   null_pointer_node);
+			      gfc_add_modify (&cond_block, var, size);
+			      gfc_add_expr_to_block (block,
+				build3_loc (input_location, COND_EXPR,
+					    void_type_node, cond,
+					    gfc_finish_block (&cond_block),
+					    NULL_TREE));
+			      OMP_CLAUSE_SIZE (node) = var;
+			    }
+			  else
+			    {
+			      gfc_add_block_to_block (block, &cond_block);
+			      OMP_CLAUSE_SIZE (node) = size;
+			    }
+			}
 		      if (n->sym->attr.dimension)
 			{
 			  tree elemsz
@@ -2346,6 +2534,18 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 					   OMP_CLAUSE_SIZE (node), elemsz);
 			}
 		    }
+		  else if (present
+			   && TREE_CODE (decl) == INDIRECT_REF
+			   && TREE_CODE (TREE_OPERAND (decl, 0))
+				== INDIRECT_REF)
+		    {
+		      /* A single indirectref is handled by the middle end.  */
+		      gcc_assert (!POINTER_TYPE_P (TREE_TYPE (decl)));
+		      decl = TREE_OPERAND (decl, 0);
+		      decl = gfc_build_conditional_assign_expr (
+			       block, present, decl, null_pointer_node);
+		      OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (decl);
+		    }
 		  else
 		    OMP_CLAUSE_DECL (node) = decl;
 		}
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 19132f76da2..8d6742e7223 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -11817,7 +11817,8 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	      {
 		gcc_checking_assert (is_gimple_omp_oacc (ctx->stmt));
 		s = TREE_TYPE (ovar);
-		if (TREE_CODE (s) == REFERENCE_TYPE)
+		if (TREE_CODE (s) == REFERENCE_TYPE
+		    || omp_check_optional_argument (ovar, false))
 		  s = TREE_TYPE (s);
 		s = TYPE_SIZE_UNIT (s);
 	      }
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 2f271009fb8..e5088014ccc 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -831,6 +831,12 @@  update_dev_host (int is_dev, void *h, size_t s, int async)
   if (acc_dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
     return;
 
+  /* Fortran optional arguments that are non-present result in a
+     NULL host address here.  This can safely be ignored as it is
+     not possible to 'update' a non-present optional argument.  */
+  if (h == NULL)
+    return;
+
   acc_prof_info prof_info;
   acc_api_info api_info;
   bool profiling_p = GOACC_PROFILING_SETUP_P (thr, &prof_info, &api_info);
@@ -901,6 +907,9 @@  gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes,
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
 
+  if (*hostaddrs == NULL)
+    return;
+
   if (acc_is_present (*hostaddrs, *sizes))
     {
       splay_tree_key n;
diff --git a/libgomp/testsuite/libgomp.fortran/optional-map.f90 b/libgomp/testsuite/libgomp.fortran/optional-map.f90
new file mode 100644
index 00000000000..56f6e59e815
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/optional-map.f90
@@ -0,0 +1,119 @@ 
+implicit none (type, external)
+call sub()
+call sub2()
+call call_present_1()
+call call_present_2()
+
+contains
+
+subroutine call_present_1()
+  integer :: ii, ival, iarr, iptr, iparr
+  pointer :: iptr, iparr
+  dimension :: iarr(2), iparr(:)
+  allocate(iptr,iparr(2))
+  ii = 101
+  ival = 102
+  iptr = 103
+  iarr = 104
+  iparr = 105
+  call sub_present(ii, ival, iarr, iptr, iparr)
+  deallocate(iptr,iparr)
+end subroutine
+
+subroutine call_present_2()
+  integer :: ii, ival, iarr, iptr, iparr
+  pointer :: iptr, iparr
+  dimension :: iarr(2), iparr(:)
+  allocate(iptr,iparr(2))
+  ii = 201
+  ival = 202
+  iptr = 203
+  iarr = 204
+  iparr = 205
+  call sub2_present(ii, ival, iarr, iptr, iparr)
+  deallocate(iptr,iparr)
+end subroutine
+
+subroutine sub(ii, ival, iarr, iptr, iparr)
+  integer, optional :: ii, ival, iarr, iptr, iparr
+  pointer :: iptr, iparr
+  dimension :: iarr(:), iparr(:)
+  value :: ival
+  integer :: err
+  err = 42
+  !$omp target map(ii, ival, iarr, iptr, iparr, err)
+  if (present(ii)) then
+    ii = iptr + ival
+    iarr = iparr
+  else
+    err = 0
+  end if
+  if (present(ii)) err = 1
+  if (present(ival)) err = 2
+  if (present(iarr)) err = 3
+  if (present(iptr)) err = 4
+  if (present(iparr)) err = 5
+  !$omp end target
+  if (err /= 0) stop 1
+end subroutine sub
+
+subroutine sub2(ii, ival, iarr, iptr, iparr)
+  integer, optional :: ii, ival, iarr, iptr, iparr
+  pointer :: iptr, iparr
+  dimension :: iarr(:), iparr(:)
+  value :: ival
+  integer :: err(1) ! otherwise, implied defaultmap is firstprivate
+  err(1) = 42
+  !$omp target  ! automatic mapping with implied defaultmap(tofrom) 
+  if (present(ii)) then
+    ii = iptr + ival
+    iarr = iparr
+  else
+    err(1) = 0
+  end if
+  if (present(ii)) err(1) = 1
+  if (present(ival)) err(1) = 2
+  if (present(iarr)) err(1) = 3
+  if (present(iptr)) err(1) = 4
+  if (present(iparr)) err(1) = 5
+  !$omp end target
+  if (err(1) /= 0) stop 2
+end subroutine sub2
+
+subroutine sub_present(ii, ival, iarr, iptr, iparr)
+  integer, optional :: ii, ival, iarr, iptr, iparr
+  pointer :: iptr, iparr
+  dimension :: iarr(:), iparr(:)
+  value :: ival
+  integer :: err
+  err = 42
+  !$omp target map(ii, ival, iarr, iptr, iparr, err)
+  if (.not.present(ii)) err = 1
+  if (.not.present(ival)) err = 2
+  if (.not.present(iarr)) err = 3
+  if (.not.present(iptr)) err = 4
+  if (.not.present(iparr)) err = 5
+  err = err - 42 - 101-102-103-104-105 + ii+ival+iarr(2)+iptr+iparr(2)
+  !$omp end target
+  if (err /= 0) stop 3
+end subroutine sub_present
+
+subroutine sub2_present(ii, ival, iarr, iptr, iparr)
+  integer, optional :: ii, ival, iarr, iptr, iparr
+  pointer :: iptr, iparr
+  dimension :: iarr(:), iparr(:)
+  value :: ival
+  integer :: err(1) ! otherwise, implied defaultmap is firstprivate
+  err(1) = 53
+  !$omp target  ! automatic mapping with implied defaultmap(tofrom) 
+  ! Note: OpenMP 4.5's 'defaultmap' is not yet supported, PR 92568
+  if (.not.present(ii)) err = 1
+  if (.not.present(ival)) err = 2
+  if (.not.present(iarr)) err = 3
+  if (.not.present(iptr)) err = 4
+  if (.not.present(iparr)) err = 5
+  err = err - 53 - 201-202-203-204-205 + ii+ival+iarr(2)+iptr+iparr(2)
+  !$omp end target
+  if (err(1) /= 0) stop 4
+end subroutine sub2_present
+end
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90
index 94ac76f5700..0254f2dc196 100644
--- a/libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90
@@ -472,6 +472,7 @@  contains
      hh = 88.0_c_double
 
      call test_dummy_opt_callee_1(aa, bb, cc, dd, ee, ff, gg, hh, N)
+     call test_dummy_opt_callee_1_absent(N=N)
      deallocate(ee, ff) ! pointers, only
   end subroutine test_dummy_opt_call_1
 
@@ -527,6 +528,41 @@  contains
      if (any(abs(3.0_c_double * gg - hh) > 10.0_c_double * epsilon(gg))) stop 72
   end subroutine test_dummy_opt_callee_1
 
+  subroutine test_dummy_opt_callee_1_absent(aa, bb, cc, dd, ee, ff, gg, hh, N)
+     ! scalars
+     real(c_double), optional, target :: aa, bb
+     real(c_double), optional, target, allocatable :: cc, dd
+     real(c_double), optional, pointer :: ee, ff
+
+     ! non-descriptor arrays
+     real(c_double), optional, target :: gg(N), hh(N)
+     integer, value :: N
+
+     integer :: err
+
+     ! All shall be absent
+     if (present(aa) .or. present(bb)) stop 243
+     if (present(cc) .or. present(dd)) stop 244
+     if (present(ee) .or. present(ff)) stop 245
+     if (present(gg) .or. present(hh)) stop 246
+
+     !$omp target data map(to:aa) map(from:bb) use_device_addr(aa,bb)
+     if (present(aa) .or. present(bb)) stop 247
+     !$omp end target data
+
+     !$omp target data map(to:cc) map(from:dd) use_device_addr(cc,dd)
+     if (present(cc) .or. present(dd)) stop 248
+     !$omp end target data
+
+     !$omp target data map(to:ee) map(from:ff) use_device_addr(ee,ff)
+     if (present(ee) .or. present(ff)) stop 249
+     !$omp end target data
+
+     !$omp target data map(to:gg) map(from:hh) use_device_addr(gg,hh)
+     if (present(gg) .or. present(hh)) stop 250
+     !$omp end target data
+  end subroutine test_dummy_opt_callee_1_absent
+
   ! Save device ptr - and recall pointer
   subroutine test_dummy_opt_call_2()
      integer, parameter :: N = 1000
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90
index d6c5a672370..3dd1f90f04c 100644
--- a/libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90
+++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90
@@ -472,6 +472,7 @@  contains
      hh = 88.0_c_float
 
      call test_dummy_opt_callee_1(aa, bb, cc, dd, ee, ff, gg, hh, N)
+     call test_dummy_opt_callee_1_absent(N=N)
      deallocate(ee, ff) ! pointers, only
   end subroutine test_dummy_opt_call_1
 
@@ -527,6 +528,41 @@  contains
      if (any(abs(3.0_c_float * gg - hh) > 10.0_c_float * epsilon(gg))) stop 72
   end subroutine test_dummy_opt_callee_1
 
+  subroutine test_dummy_opt_callee_1_absent(aa, bb, cc, dd, ee, ff, gg, hh, N)
+     ! scalars
+     real(c_float), optional, target :: aa, bb
+     real(c_float), optional, target, allocatable :: cc, dd
+     real(c_float), optional, pointer :: ee, ff
+
+     ! non-descriptor arrays
+     real(c_float), optional, target :: gg(N), hh(N)
+     integer, value :: N
+
+     integer :: err
+
+     ! All shall be absent
+     if (present(aa) .or. present(bb)) stop 243
+     if (present(cc) .or. present(dd)) stop 244
+     if (present(ee) .or. present(ff)) stop 245
+     if (present(gg) .or. present(hh)) stop 246
+
+     !$omp target data map(to:aa) map(from:bb) use_device_addr(aa,bb)
+     if (present(aa) .or. present(bb)) stop 247
+     !$omp end target data
+
+     !$omp target data map(to:cc) map(from:dd) use_device_addr(cc,dd)
+     if (present(cc) .or. present(dd)) stop 248
+     !$omp end target data
+
+     !$omp target data map(to:ee) map(from:ff) use_device_addr(ee,ff)
+     if (present(ee) .or. present(ff)) stop 249
+     !$omp end target data
+
+     !$omp target data map(to:gg) map(from:hh) use_device_addr(gg,hh)
+     if (present(gg) .or. present(hh)) stop 250
+     !$omp end target data
+  end subroutine test_dummy_opt_callee_1_absent
+
   ! Save device ptr - and recall pointer
   subroutine test_dummy_opt_call_2()
      integer, parameter :: N = 1000
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90
index 5c42bee718c..82cf9ac8070 100644
--- a/libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90
+++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90
@@ -290,6 +290,7 @@  contains
      ff = 66.0_c_double
 
      call test_dummy_opt_callee_1(aa, bb, cc, dd, ee, ff, N)
+     call test_dummy_opt_callee_1_absent(N=N)
      deallocate(ee, ff) ! pointers, only
   end subroutine test_dummy_opt_call_1
 
@@ -336,6 +337,32 @@  contains
      if (any(abs(3.0_c_double * ee - ff) > 10.0_c_double * epsilon(ee))) stop 1
   end subroutine test_dummy_opt_callee_1
 
+  subroutine test_dummy_opt_callee_1_absent(aa, bb, cc, dd, ee, ff, N)
+     ! scalars
+     real(c_double), optional, target :: aa(:), bb(:)
+     real(c_double), optional, target, allocatable :: cc(:), dd(:)
+     real(c_double), optional, pointer :: ee(:), ff(:)
+
+     integer, value :: N
+
+     ! All shall be absent
+     if (present(aa) .or. present(bb)) stop 1
+     if (present(cc) .or. present(dd)) stop 1
+     if (present(ee) .or. present(ff)) stop 1
+
+     !$omp target data map(to:aa) map(from:bb) use_device_addr(aa,bb)
+     if (present(aa) .or. present(bb)) stop 1
+     !$omp end target data
+
+     !$omp target data map(to:cc) map(from:dd) use_device_addr(cc,dd)
+     if (present(cc) .or. present(dd)) stop 1
+     !$omp end target data
+
+     !$omp target data map(to:ee) map(from:ff) use_device_addr(ee,ff)
+     if (present(ee) .or. present(ff)) stop 1
+     !$omp end target data
+  end subroutine test_dummy_opt_callee_1_absent
+
   ! Save device ptr - and recall pointer
   subroutine test_dummy_opt_call_2()
      integer, parameter :: N = 1000
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90
index 5e66a79da90..d17249de2bc 100644
--- a/libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90
+++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90
@@ -290,6 +290,7 @@  contains
      ff = 66.0_c_float
 
      call test_dummy_opt_callee_1(aa, bb, cc, dd, ee, ff, N)
+     call test_dummy_opt_callee_1_absent(N=N)
      deallocate(ee, ff) ! pointers, only
   end subroutine test_dummy_opt_call_1
 
@@ -336,6 +337,32 @@  contains
      if (any(abs(3.0_c_float * ee - ff) > 10.0_c_float * epsilon(ee))) stop 1
   end subroutine test_dummy_opt_callee_1
 
+  subroutine test_dummy_opt_callee_1_absent(aa, bb, cc, dd, ee, ff, N)
+     ! scalars
+     real(c_float), optional, target :: aa(:), bb(:)
+     real(c_float), optional, target, allocatable :: cc(:), dd(:)
+     real(c_float), optional, pointer :: ee(:), ff(:)
+
+     integer, value :: N
+
+     ! All shall be absent
+     if (present(aa) .or. present(bb)) stop 1
+     if (present(cc) .or. present(dd)) stop 1
+     if (present(ee) .or. present(ff)) stop 1
+
+     !$omp target data map(to:aa) map(from:bb) use_device_addr(aa,bb)
+     if (present(aa) .or. present(bb)) stop 1
+     !$omp end target data
+
+     !$omp target data map(to:cc) map(from:dd) use_device_addr(cc,dd)
+     if (present(cc) .or. present(dd)) stop 1
+     !$omp end target data
+
+     !$omp target data map(to:ee) map(from:ff) use_device_addr(ee,ff)
+     if (present(ee) .or. present(ff)) stop 1
+     !$omp end target data
+  end subroutine test_dummy_opt_callee_1_absent
+
   ! Save device ptr - and recall pointer
   subroutine test_dummy_opt_call_2()
      integer, parameter :: N = 1000
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c
deleted file mode 100644
index 5db29124e9e..00000000000
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c
+++ /dev/null
@@ -1,51 +0,0 @@ 
-/* Exercise acc_update_device with a NULL data address on nvidia targets.  */
-
-/* { dg-do run { target openacc_nvidia_accel_selected } } */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <openacc.h>
-
-int
-main (int argc, char **argv)
-{
-  const int N = 256;
-  int i;
-  unsigned char *h;
-  void *d;
-
-  h = (unsigned char *) malloc (N);
-
-  for (i = 0; i < N; i++)
-    {
-      h[i] = i;
-    }
-
-  d = acc_copyin (h, N);
-  if (!d)
-    abort ();
-
-  for (i = 0; i < N; i++)
-    {
-      h[i] = 0xab;
-    }
-
-  fprintf (stderr, "CheCKpOInT\n");
-  acc_update_device (0, N);
-
-  acc_copyout (h, N);
-
-  for (i = 0; i < N; i++)
-    {
-      if (h[i] != 0xab)
-	abort ();
-    }
-
-  free (h);
-
-  return 0;
-}
-
-/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */
-/* { dg-output "\\\[\[^\n\r]*,256\\\] is not mapped" } */
-/* { dg-shouldfail "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c
deleted file mode 100644
index c2140429cb1..00000000000
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c
+++ /dev/null
@@ -1,49 +0,0 @@ 
-/* Exercise acc_update_self with a NULL data mapping on nvidia targets.  */
-
-/* { dg-do run { target openacc_nvidia_accel_selected } } */
-
-#include <stdio.h>
-#include <string.h>
-#include <stdlib.h>
-#include <openacc.h>
-
-int
-main (int argc, char **argv)
-{
-  const int N = 256;
-  int i;
-  unsigned char *h;
-  void *d;
-
-  h = (unsigned char *) malloc (N);
-
-  for (i = 0; i < N; i++)
-    {
-      h[i] = i;
-    }
-
-  d = acc_copyin (h, N);
-  if (!d)
-    abort ();
-
-  memset (&h[0], 0, N);
-
-  fprintf (stderr, "CheCKpOInT\n");
-  acc_update_self (0, N);
-
-  for (i = 0; i < N; i++)
-    {
-      if (h[i] != i)
-	abort ();
-    }
-
-  acc_delete (h, N);
-
-  free (h);
-
-  return 0;
-}
-
-/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */
-/* { dg-output "\\\[\[^\n\r]*,256\\\] is not mapped" } */
-/* { dg-shouldfail "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95
new file mode 100644
index 00000000000..00f7472ae6e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95
@@ -0,0 +1,23 @@ 
+! Test that the cache directives work with optional arguments.  The effect
+! of giving a non-present argument to the cache directive is not tested as
+! it is undefined.  The test is based on gfortran.dg/goacc/cache-1.f95.
+
+! { dg-additional-options "-std=f2008" }
+
+program cache_test
+  implicit none
+  integer :: d(10), e(7,13)
+
+  call do_test(d, e)
+contains
+  subroutine do_test(d, e)
+    integer, optional :: d(10), e(7,13)
+    integer :: i
+    do concurrent (i=1:5)
+      !$acc cache (d(1:3))
+      !$acc cache (d(i:i+2))
+      !$acc cache (e(1:3,2:4))
+      !$acc cache (e(i:i+2,i+1:i+3))
+    enddo
+  end
+end
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90
new file mode 100644
index 00000000000..5cadeed44b4
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90
@@ -0,0 +1,29 @@ 
+! Test OpenACC data regions with optional arguments passed by value.
+
+! { dg-do run }
+
+program test
+  implicit none
+
+  integer :: res
+
+  if (foo(27) .ne. 27) stop 1
+  if (foo(16, 18) .ne. 288) stop 1
+contains
+  function foo(x, y)
+    integer, value :: x
+    integer, value, optional :: y
+    integer :: res, foo
+
+    !$acc data copyin(x, y) copyout(res)
+    !$acc parallel
+    res = x
+    if (present(y)) then
+      res = res * y
+    end if
+    !$acc end parallel
+    !$acc end data
+
+    foo = res
+  end function foo
+end program test
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin.f90
new file mode 100644
index 00000000000..a30908d61a5
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin.f90
@@ -0,0 +1,140 @@ 
+! Test OpenACC data regions with a copy-in of optional arguments.
+
+! { dg-do run }
+
+program test
+  implicit none
+
+  integer, parameter :: n = 64
+  integer :: i
+  integer :: a_int, b_int, c_int, res_int
+  integer :: a_arr(n), b_arr(n), c_arr(n), res_arr(n)
+  integer, allocatable :: a_alloc(:), b_alloc(:), c_alloc(:), res_alloc(:)
+
+  a_int = 7
+  b_int = 3
+  c_int = 11
+
+  call test_int(res_int, a_int)
+  if (res_int .ne. a_int) stop 1
+
+  call test_int(res_int, a_int, b_int)
+  if (res_int .ne. a_int * b_int) stop 2
+
+  call test_int(res_int, a_int, b_int, c_int)
+  if (res_int .ne. a_int * b_int + c_int) stop 3
+
+  do i = 1, n
+    a_arr(i) = i
+    b_arr(i) = n - i + 1
+    c_arr(i) = i * 3
+  end do
+
+  call test_array(res_arr, a_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i)) stop 4
+  end do
+
+  call test_array(res_arr, a_arr, b_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 5
+  end do
+
+  call test_array(res_arr, a_arr, b_arr, c_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i) + c_arr(i)) stop 6
+  end do
+
+  allocate (a_alloc(n))
+  allocate (b_alloc(n))
+  allocate (c_alloc(n))
+  allocate (res_alloc(n))
+
+  do i = 1, n
+    a_alloc(i) = i
+    b_alloc(i) = n - i + 1
+    c_alloc(i) = i * 3
+  end do
+
+  call test_allocatable(res_alloc, a_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. a_alloc(i)) stop 7
+  end do
+
+  call test_allocatable(res_alloc, a_alloc, b_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i)) stop 8
+  end do
+
+  call test_allocatable(res_alloc, a_alloc, b_alloc, c_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i) + c_alloc(i)) stop 9
+  end do
+
+  deallocate (a_alloc)
+  deallocate (b_alloc)
+  deallocate (c_alloc)
+  deallocate (res_alloc)
+contains
+  subroutine test_int(res, a, b, c)
+    integer :: res
+    integer :: a
+    integer, optional :: b, c
+
+    !$acc data copyin(a, b, c) copyout(res)
+    !$acc parallel
+    res = a
+
+    if (present(b)) res = res * b
+
+    if (present(c)) res = res + c
+    !$acc end parallel
+    !$acc end data
+  end subroutine test_int
+
+  subroutine test_array(res, a, b, c)
+    integer :: res(n)
+    integer :: a(n)
+    integer, optional :: b(n), c(n)
+
+    !$acc data copyin(a, b, c) copyout(res)
+    !$acc parallel loop
+    do i = 1, n
+      res(i) = a(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(b)) res(i) = res(i) * b(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(c)) res(i) = res(i) + c(i)
+    end do
+    !$acc end data
+  end subroutine test_array
+
+  subroutine test_allocatable(res, a, b, c)
+    integer, allocatable :: res(:)
+    integer, allocatable  :: a(:)
+    integer, allocatable, optional :: b(:), c(:)
+
+    !$acc data copyin(a, b, c) copyout(res)
+    !$acc parallel loop
+    do i = 1, n
+      res(i) = a(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(b)) res(i) = res(i) * b(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(c)) res(i) = res(i) + c(i)
+    end do
+    !$acc end data
+  end subroutine test_allocatable
+end program test
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyout.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyout.f90
new file mode 100644
index 00000000000..feaa31fa423
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyout.f90
@@ -0,0 +1,96 @@ 
+! Test OpenACC data regions with a copy-out of optional arguments.
+
+! { dg-do run }
+
+program test
+  implicit none
+
+  integer, parameter :: n = 64
+  integer :: i
+  integer :: a_int, b_int, res_int
+  integer :: a_arr(n), b_arr(n), res_arr(n)
+  integer, allocatable :: a_alloc(:), b_alloc(:), res_alloc(:)
+
+  res_int = 0
+
+  call test_int(a_int, b_int)
+  if (res_int .ne. 0) stop 1
+
+  call test_int(a_int, b_int, res_int)
+  if (res_int .ne. a_int * b_int) stop 2
+
+  res_arr(:) = 0
+  do i = 1, n
+    a_arr(i) = i
+    b_arr(i) = n - i + 1
+  end do
+
+  call test_array(a_arr, b_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. 0) stop 3
+  end do
+
+  call test_array(a_arr, b_arr, res_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 4
+  end do
+
+  allocate (a_alloc(n))
+  allocate (b_alloc(n))
+  allocate (res_alloc(n))
+
+  res_alloc(:) = 0
+  do i = 1, n
+    a_alloc(i) = i
+    b_alloc(i) = n - i + 1
+  end do
+
+  call test_allocatable(a_alloc, b_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. 0) stop 5
+  end do
+
+  call test_allocatable(a_alloc, b_alloc, res_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i)) stop 6
+  end do
+
+  deallocate (a_alloc)
+  deallocate (b_alloc)
+  deallocate (res_alloc)
+contains
+  subroutine test_int(a, b, res)
+    integer :: a, b
+    integer, optional :: res
+
+    !$acc data copyin(a, b) copyout(res)
+    !$acc parallel
+    if (present(res)) res = a * b
+    !$acc end parallel
+    !$acc end data
+  end subroutine test_int
+
+  subroutine test_array(a, b, res)
+    integer :: a(n), b(n)
+    integer, optional :: res(n)
+
+    !$acc data copyin(a, b) copyout(res)
+    !$acc parallel loop
+    do i = 1, n
+      if (present(res)) res(i) = a(i) * b(i)
+    end do
+    !$acc end data
+  end subroutine test_array
+
+  subroutine test_allocatable(a, b, res)
+    integer, allocatable :: a(:), b(:)
+    integer, allocatable, optional :: res(:)
+
+    !$acc data copyin(a, b) copyout(res)
+    !$acc parallel loop
+    do i = 1, n
+      if (present(res)) res(i) = a(i) * b(i)
+    end do
+    !$acc end data
+  end subroutine test_allocatable
+end program test
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90
new file mode 100644
index 00000000000..9ed0f753ea5
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90
@@ -0,0 +1,91 @@ 
+! Test OpenACC unstructured enter data/exit data regions with optional
+! arguments.
+
+! { dg-do run }
+
+program test
+  implicit none
+
+  integer, parameter :: n = 64
+  integer :: a(n), b(n), c(n), res(n)
+  integer :: x, y, z, r, i
+
+  do i = 1, n
+    a(i) = i
+    b(i) = n - i + 1
+    c(i) = i * 3
+  end do
+
+  res = test_array(a)
+  do i = 1, n
+    if (res(i) .ne. a(i)) stop 1
+  end do
+
+  res = test_array(a, b)
+  do i = 1, n
+    if (res(i) .ne. a(i) * b(i)) stop 2
+  end do
+
+  res = test_array(a, b, c)
+  do i = 1, n
+    if (res(i) .ne. a(i) * b(i) + c(i)) stop 3
+  end do
+
+  x = 7
+  y = 3
+  z = 11
+
+  r = test_int(x)
+  if (r .ne. x) stop 4
+
+  r = test_int(x, y)
+  if (r .ne. x * y) stop 5
+
+  r = test_int(x, y, z)
+  if (r .ne. x * y + z) stop 6
+contains
+  function test_array(a, b, c)
+    integer :: a(n)
+    integer, optional :: b(n), c(n)
+    integer :: test_array(n), res(n)
+
+    !$acc enter data copyin(a, b, c) create(res)
+    !$acc parallel loop
+    do i = 1, n
+      res(i) = a(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(b)) then
+        res(i) = res(i) * b(i)
+      end if
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(c)) then
+        res(i) = res(i) + c(i)
+      end if
+    end do
+    !$acc exit data copyout(res) delete(a, b, c)
+
+    test_array = res
+  end function test_array
+
+  function test_int(a, b, c)
+    integer :: a
+    integer, optional :: b, c
+    integer :: test_int, res
+
+    !$acc enter data copyin(a, b, c) create(res)
+    !$acc parallel present(a, b, c, res)
+    res = a
+    if (present(b)) res = res * b
+    if (present(c)) res = res + c
+    !$acc end parallel
+    !$acc exit data copyout(res) delete(a, b, c)
+
+    test_int = res
+  end function test_int
+end program test
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-declare.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-declare.f90
new file mode 100644
index 00000000000..074e5a2abb6
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-declare.f90
@@ -0,0 +1,87 @@ 
+! Test OpenACC declare directives with optional arguments.
+
+! { dg-do run }
+
+program test
+  implicit none
+
+  integer, parameter :: n = 64
+  integer :: i
+  integer :: a_int, b_int, c_int, res_int
+  integer :: a_arr(n), b_arr(n), c_arr(n), res_arr(n)
+
+  a_int = 7
+  b_int = 3
+  c_int = 11
+
+  call test_int(res_int, a_int)
+  if (res_int .ne. a_int) stop 1
+
+  call test_int(res_int, a_int, b_int)
+  if (res_int .ne. a_int * b_int) stop 2
+
+  call test_int(res_int, a_int, b_int, c_int)
+  if (res_int .ne. a_int * b_int + c_int) stop 3
+
+  do i = 1, n
+    a_arr(i) = i
+    b_arr(i) = n - i + 1
+    c_arr(i) = i * 3
+  end do
+
+  call test_array(res_arr, a_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i)) stop 4
+  end do
+
+  call test_array(res_arr, a_arr, b_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 5
+  end do
+
+  call test_array(res_arr, a_arr, b_arr, c_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i) + c_arr(i)) stop 6
+  end do
+contains
+  subroutine test_int(res, a, b, c)
+    integer :: a
+    integer, optional :: b, c
+    !$acc declare present_or_copyin(a, b, c)
+    integer :: res
+    !$acc declare present_or_copyout(res)
+
+    !$acc parallel
+    res = a
+    if (present(b)) res = res * b
+    if (present(c)) res = res + c
+    !$acc end parallel
+  end subroutine test_int
+
+  subroutine test_array(res, a, b, c)
+    integer :: a(n)
+    integer, optional :: b(n), c(n)
+    !$acc declare present_or_copyin(a, b, c)
+    integer :: res(n)
+    !$acc declare present_or_copyout(res)
+
+    !$acc parallel loop
+    do i = 1, n
+      res(i) = a(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(b)) then
+        res(i) = res(i) * b(i)
+      end if
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(c)) then
+        res(i) = res(i) + c(i)
+      end if
+    end do
+  end subroutine test_array
+end program test
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90
new file mode 100644
index 00000000000..693e6118489
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90
@@ -0,0 +1,112 @@ 
+! Test that optional arguments work in firstprivate clauses.  The effect of
+! non-present arguments in firstprivate clauses is undefined, and is not
+! tested for.
+
+! { dg-do run }
+
+program test_firstprivate
+  implicit none
+  integer, parameter :: n = 64
+
+  integer :: i, j
+  integer :: a_int, b_int, c_int, res_int
+  integer :: a_arr(n), b_arr(n), c_arr(n), res_arr(n)
+  integer, allocatable :: a_alloc(:), b_alloc(:), c_alloc(:), res_alloc(:)
+
+  a_int = 14
+  b_int = 5
+  c_int = 12
+
+  call test_int(res_int, a_int, b_int, c_int)
+  if (res_int .ne. a_int * b_int + c_int) stop 1
+
+  do i = 1, n
+    a_arr(i) = i
+    b_arr(i) = n - i + 1
+    c_arr(i) = i * 3
+  end do
+
+  call test_array(res_arr, a_arr, b_arr, c_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i) + c_arr(i)) stop 2
+  end do
+
+  allocate(a_alloc(n))
+  allocate(b_alloc(n))
+  allocate(c_alloc(n))
+  allocate(res_alloc(n))
+
+  do i = 1, n
+    a_arr(i) = i
+    b_arr(i) = n - i + 1
+    c_arr(i) = i * 3
+  end do
+
+  call test_allocatable(res_alloc, a_alloc, b_alloc, c_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i) + c_alloc(i)) stop 2
+  end do
+
+  deallocate(a_alloc)
+  deallocate(b_alloc)
+  deallocate(c_alloc)
+  deallocate(res_alloc)
+contains
+  subroutine test_int(res, a, b, c)
+    integer :: a
+    integer, optional :: b, c
+    integer :: res
+
+    !$acc parallel firstprivate(a, b, c) copyout(res)
+    res = a
+    if (present(b)) res = res * b
+    if (present(c)) res = res + c
+    !$acc end parallel
+  end subroutine test_int
+
+  subroutine test_array(res, a, b, c)
+    integer :: a(n)
+    integer, optional :: b(n), c(n)
+    integer :: res(n)
+
+    !$acc data copyin(a, b, c) copyout(res)
+    !$acc parallel loop firstprivate(a)
+    do i = 1, n
+      res(i) = a(i)
+    end do
+
+    !$acc parallel loop firstprivate(b)
+    do i = 1, n
+      if (present(b)) res(i) = res(i) * b(i)
+    end do
+
+    !$acc parallel loop firstprivate(c)
+    do i = 1, n
+      if (present(c)) res(i) = res(i) + c(i)
+    end do
+    !$acc end data
+  end subroutine test_array
+
+  subroutine test_allocatable(res, a, b, c)
+    integer, allocatable :: a(:)
+    integer, allocatable, optional :: b(:), c(:)
+    integer, allocatable :: res(:)
+
+    !$acc data copyin(a, b, c) copyout(res)
+    !$acc parallel loop firstprivate(a)
+    do i = 1, n
+      res(i) = a(i)
+    end do
+
+    !$acc parallel loop firstprivate(b)
+    do i = 1, n
+      if (present(b)) res(i) = res(i) * b(i)
+    end do
+
+    !$acc parallel loop firstprivate(c)
+    do i = 1, n
+      if (present(c)) res(i) = res(i) + c(i)
+    end do
+    !$acc end data
+  end subroutine test_allocatable
+end program test_firstprivate
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-host_data.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-host_data.f90
new file mode 100644
index 00000000000..a6e41e28b0b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-host_data.f90
@@ -0,0 +1,39 @@ 
+! Test the host_data construct with optional arguments.
+! Based on host_data-1.f90.
+
+! { dg-do run }
+! { dg-additional-options "-cpp" }
+
+program test
+  implicit none
+
+  integer, target :: i
+  integer, pointer :: ip, iph
+
+  ! Assign the same targets
+  ip => i
+  iph => i
+
+  call foo(iph)
+  call foo(iph, ip)
+contains
+  subroutine foo(iph, ip)
+    integer, pointer :: iph
+    integer, pointer, optional :: ip
+
+    !$acc data copyin(i)
+    !$acc host_data use_device(ip)
+
+    ! Test how the pointers compare inside a host_data construct
+    if (present(ip)) then
+#if ACC_MEM_SHARED
+      if (.not. associated(ip, iph)) STOP 1
+#else
+      if (associated(ip, iph)) STOP 2
+#endif
+    end if
+
+    !$acc end host_data
+    !$acc end data
+  end subroutine foo
+end program test
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-nested-calls.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-nested-calls.f90
new file mode 100644
index 00000000000..279139f7c59
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-nested-calls.f90
@@ -0,0 +1,135 @@ 
+! Test propagation of optional arguments from within an OpenACC parallel region.
+
+! { dg-do run }
+
+program test
+  implicit none
+
+  integer, parameter :: n = 64
+  integer :: i
+  integer :: res_int
+  integer :: a_arr(n), b_arr(n), res_arr(n)
+  integer, allocatable :: a_alloc(:), b_alloc(:), res_alloc(:)
+
+  call test_int_caller(res_int, 5)
+  if (res_int .ne. 10) stop 1
+
+  call test_int_caller(res_int, 2, 3)
+  if (res_int .ne. 11) stop 2
+
+  do i = 1, n
+    a_arr(i) = i
+    b_arr(i) = n - i + 1
+  end do
+
+  call test_array_caller(res_arr, a_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. 2 * a_arr(i)) stop 3
+  end do
+
+  call test_array_caller(res_arr, a_arr, b_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i) + a_arr(i) + b_arr(i)) stop 4
+  end do
+
+  allocate(a_alloc(n))
+  allocate(b_alloc(n))
+  allocate(res_alloc(n))
+
+  do i = 1, n
+    a_alloc(i) = i
+    b_alloc(i) = n - i + 1
+  end do
+
+  call test_array_caller(res_arr, a_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. 2 * a_alloc(i)) stop 5
+  end do
+
+  call test_array_caller(res_arr, a_arr, b_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_alloc(i) + a_alloc(i) + b_alloc(i)) stop 6
+  end do
+
+  deallocate(a_alloc)
+  deallocate(b_alloc)
+  deallocate(res_alloc)
+contains
+  subroutine test_int_caller(res, a, b)
+    integer :: res, a
+    integer, optional :: b
+
+    !$acc data copyin(a, b) copyout (res)
+    !$acc parallel
+    res = a
+    if (present(b)) res = res * b
+    call test_int_callee(res, a, b)
+    !$acc end parallel
+    !$acc end data
+  end subroutine test_int_caller
+
+  subroutine test_int_callee(res, a, b)
+    !$acc routine seq
+    integer :: res, a
+    integer, optional :: b
+
+    res = res + a
+    if (present(b)) res = res + b
+  end subroutine test_int_callee
+
+  subroutine test_array_caller(res, a, b)
+    integer :: res(n), a(n), i
+    integer, optional :: b(n)
+
+    !$acc data copyin(a, b) copyout(res)
+    !$acc parallel
+    !$acc loop seq
+    do i = 1, n
+      res(i) = a(i)
+      if (present(b)) res(i) = res(i) * b(i)
+    end do
+    call test_array_callee(res, a, b)
+    !$acc end parallel
+    !$acc end data
+  end subroutine test_array_caller
+
+  subroutine test_array_callee(res, a, b)
+    !$acc routine seq
+    integer :: res(n), a(n), i
+    integer, optional :: b(n)
+
+    do i = 1, n
+      res(i) = res(i) + a(i)
+      if (present(b)) res(i) = res(i) + b(i)
+    end do
+  end subroutine test_array_callee
+
+  subroutine test_allocatable_caller(res, a, b)
+    integer :: i
+    integer, allocatable :: res(:), a(:)
+    integer, allocatable, optional :: b(:)
+
+    !$acc data copyin(a, b) copyout(res)
+    !$acc parallel
+    !$acc loop seq
+    do i = 1, n
+      res(i) = a(i)
+      if (present(b)) res(i) = res(i) * b(i)
+    end do
+    call test_array_callee(res, a, b)
+    !$acc end parallel
+    !$acc end data
+  end subroutine test_allocatable_caller
+
+  subroutine test_allocatable_callee(res, a, b)
+    !$acc routine seq
+    integer :: i
+    integer, allocatable :: res(:), a(:)
+    integer, allocatable, optional :: b(:)
+
+    do i = 1, n
+      res(i) = res(i) + a(i)
+      if (present(b)) res(i) = res(i) + b(i)
+    end do
+  end subroutine test_allocatable_callee
+end program test
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90
new file mode 100644
index 00000000000..0320bbb3bc9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90
@@ -0,0 +1,115 @@ 
+! Test that optional arguments work in private clauses.  The effect of
+! non-present arguments in private clauses is undefined, and is not tested
+! for.  The tests are based on those in private-variables.f90.
+
+! { dg-do run }
+
+program main
+  implicit none
+
+  type vec3
+     integer x, y, z, attr(13)
+  end type vec3
+  integer :: x
+  type(vec3) :: pt
+  integer :: arr(2)
+
+  call t1(x)
+  call t2(pt)
+  call t3(arr)
+contains
+
+  ! Test of gang-private variables declared on loop directive.
+
+  subroutine t1(x)
+    integer, optional :: x
+    integer :: i, arr(32)
+
+    do i = 1, 32
+       arr(i) = i
+    end do
+
+    !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+    !$acc loop gang private(x)
+    do i = 1, 32
+       x = i * 2;
+       arr(i) = arr(i) + x
+    end do
+    !$acc end parallel
+
+    do i = 1, 32
+       if (arr(i) .ne. i * 3) STOP 1
+    end do
+  end subroutine t1
+
+
+  ! Test of gang-private addressable variable declared on loop directive, with
+  ! broadcasting to partitioned workers.
+
+  subroutine t2(pt)
+    integer i, j, arr(0:32*32)
+    type(vec3), optional :: pt
+
+    do i = 0, 32*32-1
+       arr(i) = i
+    end do
+
+    !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+    !$acc loop gang private(pt)
+    do i = 0, 31
+       pt%x = i
+       pt%y = i * 2
+       pt%z = i * 4
+       pt%attr(5) = i * 6
+
+       !$acc loop vector
+       do j = 0, 31
+          arr(i * 32 + j) = arr(i * 32 + j) + pt%x + pt%y + pt%z + pt%attr(5);
+       end do
+    end do
+    !$acc end parallel
+
+    do i = 0, 32 * 32 - 1
+       if (arr(i) .ne. i + (i / 32) * 13) STOP 2
+    end do
+  end subroutine t2
+
+  ! Test of vector-private variables declared on loop directive. Array type.
+
+  subroutine t3(pt)
+    integer, optional :: pt(2)
+    integer :: i, j, k, idx, arr(0:32*32*32)
+
+    do i = 0, 32*32*32-1
+       arr(i) = i
+    end do
+
+    !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+    !$acc loop gang
+    do i = 0, 31
+       !$acc loop worker
+       do j = 0, 31
+          !$acc loop vector private(pt)
+          do k = 0, 31
+             pt(1) = ieor(i, j * 3)
+             pt(2) = ior(i, j * 5)
+             arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(1) * k
+             arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(2) * k
+          end do
+       end do
+    end do
+    !$acc end parallel
+
+    do i = 0, 32 - 1
+       do j = 0, 32 -1
+          do k = 0, 32 - 1
+             idx = i * 1024 + j * 32 + k
+             if (arr(idx) .ne. idx + ieor(i, j * 3) * k + ior(i, j * 5) * k) then
+                STOP 3
+             end if
+          end do
+       end do
+    end do
+  end subroutine t3
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90
new file mode 100644
index 00000000000..b76db3ef6d3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90
@@ -0,0 +1,69 @@ 
+! Test optional arguments in reduction clauses.  The effect of
+! non-present arguments in reduction clauses is undefined, and is not tested
+! for.  The tests are based on those in reduction-1.f90.
+
+! { dg-do run }
+! { dg-additional-options "-w" }
+
+program optional_reduction
+  implicit none
+
+  integer :: rg, rw, rv, rc
+
+  rg = 0
+  rw = 0
+  rv = 0
+  rc = 0
+
+  call do_test(rg, rw, rv, rc)
+contains
+  subroutine do_test(rg, rw, rv, rc)
+    integer, parameter     :: n = 10, ng = 8, nw = 4, vl = 32
+    integer, optional      :: rg, rw, rv, rc
+    integer                :: i, vresult
+    integer, dimension (n) :: array
+
+    vresult = 0
+    do i = 1, n
+       array(i) = i
+    end do
+
+    !$acc parallel num_gangs(ng) copy(rg)
+    !$acc loop reduction(+:rg) gang
+    do i = 1, n
+       rg = rg + array(i)
+    end do
+    !$acc end parallel
+
+    !$acc parallel num_workers(nw) copy(rw)
+    !$acc loop reduction(+:rw) worker
+    do i = 1, n
+       rw = rw + array(i)
+    end do
+    !$acc end parallel
+
+    !$acc parallel vector_length(vl) copy(rv)
+    !$acc loop reduction(+:rv) vector
+    do i = 1, n
+       rv = rv + array(i)
+    end do
+    !$acc end parallel
+
+    !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+    !$acc loop reduction(+:rc) gang worker vector
+    do i = 1, n
+       rc = rc + array(i)
+    end do
+    !$acc end parallel
+
+    ! Verify the results
+    do i = 1, n
+       vresult = vresult + array(i)
+    end do
+
+    if (rg .ne. vresult) STOP 1
+    if (rw .ne. vresult) STOP 2
+    if (rv .ne. vresult) STOP 3
+    if (rc .ne. vresult) STOP 4
+  end subroutine do_test
+end program optional_reduction
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-update-device.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-update-device.f90
new file mode 100644
index 00000000000..57f69001d3d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-update-device.f90
@@ -0,0 +1,121 @@ 
+! Test OpenACC update to device with an optional argument.
+
+! { dg-do run }
+
+program optional_update_device
+  implicit none
+
+  integer, parameter :: n = 64
+  integer :: i
+  integer :: a_int, b_int, res_int
+  integer :: a_arr(n), b_arr(n), res_arr(n)
+  integer, allocatable :: a_alloc(:), b_alloc(:), res_alloc(:)
+
+  a_int = 5
+  b_int = 11
+
+  call test_int(res_int, a_int)
+  if (res_int .ne. a_int) stop 1
+
+  call test_int(res_int, a_int, b_int)
+  if (res_int .ne. a_int * b_int) stop 2
+
+  res_arr(:) = 0
+  do i = 1, n
+    a_arr(i) = i
+    b_arr(i) = n - i + 1
+  end do
+
+  call test_array(res_arr, a_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i)) stop 3
+  end do
+
+  call test_array(res_arr, a_arr, b_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 4
+  end do
+
+  allocate (a_alloc(n))
+  allocate (b_alloc(n))
+  allocate (res_alloc(n))
+
+  res_alloc(:) = 0
+  do i = 1, n
+    a_alloc(i) = i
+    b_alloc(i) = n - i + 1
+  end do
+
+  call test_allocatable(res_alloc, a_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. a_alloc(i)) stop 5
+  end do
+
+  call test_allocatable(res_alloc, a_alloc, b_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i)) stop 6
+  end do
+
+  deallocate (a_alloc)
+  deallocate (b_alloc)
+  deallocate (res_alloc)
+contains
+  subroutine test_int(res, a, b)
+    integer :: res
+    integer :: a
+    integer, optional :: b
+
+    !$acc data create(a, b, res)
+    !$acc update device(a, b)
+    !$acc parallel
+    res = a
+    if (present(b)) res = res * b
+    !$acc end parallel
+    !$acc update self(res)
+    !$acc end data
+  end subroutine test_int
+
+  subroutine test_array(res, a, b)
+    integer :: res(n)
+    integer :: a(n)
+    integer, optional :: b(n)
+
+    !$acc data create(a, b, res)
+    !$acc update device(a, b)
+    !$acc parallel loop
+    do i = 1, n
+      res(i) = a(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(b)) then
+        res(i) = res(i) * b(i)
+      end if
+    end do
+    !$acc update self(res)
+    !$acc end data
+  end subroutine test_array
+
+  subroutine test_allocatable(res, a, b)
+    integer, allocatable :: res(:)
+    integer, allocatable :: a(:)
+    integer, allocatable, optional :: b(:)
+
+    !$acc data create(a, b, res)
+    !$acc update device(a, b)
+    !$acc parallel loop
+    do i = 1, n
+      res(i) = a(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(b)) then
+        res(i) = res(i) * b(i)
+      end if
+    end do
+    !$acc update self(res)
+    !$acc end data
+  end subroutine test_allocatable
+end program optional_update_device
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-update-host.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-update-host.f90
new file mode 100644
index 00000000000..36b94241b11
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-update-host.f90
@@ -0,0 +1,115 @@ 
+! Test OpenACC update to host with an optional argument.
+
+! { dg-do run }
+
+program optional_update_host
+  implicit none
+
+  integer, parameter :: n = 64
+  integer :: i
+  integer :: a_int, b_int, res_int
+  integer :: a_arr(n), b_arr(n), res_arr(n)
+  integer, allocatable :: a_alloc(:), b_alloc(:), res_alloc(:)
+
+  a_int = 5
+  b_int = 11
+  res_int = 0
+
+  call test_int(a_int, b_int)
+  if (res_int .ne. 0) stop 1
+
+  call test_int(a_int, b_int, res_int)
+  if (res_int .ne. a_int * b_int) stop 2
+
+  res_arr(:) = 0
+  do i = 1, n
+    a_arr(i) = i
+    b_arr(i) = n - i + 1
+  end do
+
+  call test_array(a_arr, b_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. 0) stop 1
+  end do
+
+  call test_array(a_arr, b_arr, res_arr)
+  do i = 1, n
+    if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 2
+  end do
+
+  allocate(a_alloc(n))
+  allocate(b_alloc(n))
+  allocate(res_alloc(n))
+
+  res_alloc(:) = 0
+  do i = 1, n
+    a_alloc(i) = i
+    b_alloc(i) = n - i + 1
+  end do
+
+  call test_allocatable(a_alloc, b_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. 0) stop 1
+  end do
+
+  call test_allocatable(a_alloc, b_alloc, res_alloc)
+  do i = 1, n
+    if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i)) stop 2
+  end do
+
+  deallocate(a_alloc)
+  deallocate(b_alloc)
+  deallocate(res_alloc)
+contains
+  subroutine test_int(a, b, res)
+    integer :: a, b
+    integer, optional :: res
+
+    !$acc data create(a, b, res)
+    !$acc update device(a, b)
+    !$acc parallel
+    if (present(res)) res = a
+    if (present(res)) res = res * b
+    !$acc end parallel
+    !$acc update self(res)
+    !$acc end data
+  end subroutine test_int
+
+  subroutine test_array(a, b, res)
+    integer :: a(n), b(n)
+    integer, optional :: res(n)
+
+    !$acc data create(a, b, res)
+    !$acc update device(a, b)
+    !$acc parallel loop
+    do i = 1, n
+      if (present(res)) res(i) = a(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(res)) res(i) = res(i) * b(i)
+    end do
+    !$acc update self(res)
+    !$acc end data
+  end subroutine test_array
+
+  subroutine test_allocatable(a, b, res)
+    integer, allocatable :: a(:), b(:)
+    integer, allocatable, optional :: res(:)
+
+    !$acc data create(a, b, res)
+    !$acc update device(a, b)
+    !$acc parallel loop
+    do i = 1, n
+      if (present(res)) res(i) = a(i)
+    end do
+
+    !$acc parallel loop
+    do i = 1, n
+      if (present(res)) res(i) = res(i) * b(i)
+    end do
+    !$acc update self(res)
+    !$acc end data
+  end subroutine test_allocatable
+end program optional_update_host