Message ID | 8be82276-81b1-817c-fcd2-51f24f5fe2d2@codesourcery.com |
---|---|
State | New |
Headers | show |
Series | [OpenMP/OpenACC/Fortran] Fix mapping of optional (present|absent) arguments | expand |
Early *PING*. Tobias Burnus wrote: > This patch does two things regarding explicit and automatical variable > mapping to offloaded devices: > > * Fixes bugs with optional arguments, which are present. They were > mapped but the mapping had issues causing run-time failures. > * It now also handles absent optional arguments. > > Compared to the previous patch set,** I added several OpenMP test > cases – and fixed the fallout. > > Except for trivial changes to libgomp/oacc-mem.c and omp-low.c, all > changes are in fortran/trans-openmp.c and only affect optional arguments. > > The patch was bootstrapped and tested on x86_64-gnu-linux w/o > offloading-support configured and with nvptx offloading. > > Tobias > > ** Included in the attached patch are the following previously posted > patches: [1] the trivial libgomp/oacc-mem.c change, [2] only the > remaining single-line change in omp-low.c, [3] the trans-openmp.c > changes (which had to be modified+extended), and [5] the test cases. > ([2] and [4] are already in GCC 10.) See: > https://gcc.gnu.org/ml/gcc-patches/2019-07/threads.html#00960 for the > original patches. > > PS: For full OpenMP support, (absent) optional arguments also needed > to be handled for data-share clauses. >
On Wed, Nov 20, 2019 at 02:06:18PM +0100, Tobias Burnus wrote: > ** Included in the attached patch are the following previously posted > patches: [1] the trivial libgomp/oacc-mem.c change, [2] only the remaining > single-line change in omp-low.c, [3] the trans-openmp.c changes (which had > to be modified+extended), and [5] the test cases. ([2] and [4] are already > in GCC 10.) See: > https://gcc.gnu.org/ml/gcc-patches/2019-07/threads.html#00960 for the > original patches. > > PS: For full OpenMP support, (absent) optional arguments also needed to be > handled for data-share clauses. Sure. > 2019-10-20 Tobias Burnus <tobias@codesourcery.com> > Kwok Cheung Yeung <kcy@codesourcery.com> > > gcc/fortran/ > * trans-openmp.c (gfc_build_conditional_assign, > gfc_build_conditional_assign_expr): New static functions. > (gfc_omp_finish_clause, gfc_trans_omp_clauses): Handle mapping of > absent optional arguments and fix mapping of present optional args. > > gcc/ > * omp-low.c (lower_omp_target): For optional arguments, deref once > more to obtain the type. > > libgomp/ > * oacc-mem.c (update_dev_host, gomp_acc_insert_pointer): Just return > if input it a NULL pointer. > * testsuite/libgomp.oacc-c-c++-common/lib-43.c: Remove; dependent on > diagnostic of NULL pointer. > * testsuite/libgomp.oacc-c-c++-common/lib-47.c: Ditto. > * testsuite/libgomp.fortran/optional-map.f90: New. > * testsuite/libgomp.fortran/use_device_addr-1.f90 > (test_dummy_opt_callee_1_absent): New. > (test_dummy_opt_call_1): Call it. > * testsuite/libgomp.fortran/use_device_addr-2.f90: Likewise. > * testsuite/libgomp.fortran/use_device_addr-3.f90: Likewise. > * testsuite/libgomp.fortran/use_device_addr-4.f90: Likewise. > * testsuite/libgomp.oacc-fortran/optional-cache.f95: New. > * testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90: New. > * testsuite/libgomp.oacc-fortran/optional-data-copyin.f90: New. > * testsuite/libgomp.oacc-fortran/optional-data-copyout.f90: New. > * testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90: New. > * testsuite/libgomp.oacc-fortran/optional-declare.f90: New. > * testsuite/libgomp.oacc-fortran/optional-firstprivate.f90: New. > * testsuite/libgomp.oacc-fortran/optional-host_data.f90: New. > * testsuite/libgomp.oacc-fortran/optional-nested-calls.f90: New. > * testsuite/libgomp.oacc-fortran/optional-private.f90: New. > * testsuite/libgomp.oacc-fortran/optional-reduction.f90: New. > * testsuite/libgomp.oacc-fortran/optional-update-device.f90: New. > * testsuite/libgomp.oacc-fortran/optional-update-host.f90: New. Ok, with some formatting nits fixed. > @@ -1199,6 +1257,8 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p) > } > > tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE; > + tree present = gfc_omp_is_optional_argument (decl) > + ? gfc_omp_check_optional_argument (decl, true) : NULL_TREE; I think emacs users (I'm not one of them) want ()s around, otherwise emacs misformats that. So tree present = (gfc_omp_is_optional_argument (decl) ? gfc_omp_check_optional_argument (decl, true) : NULL_TREE); > @@ -1232,17 +1314,43 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p) > stmtblock_t block; > gfc_start_block (&block); > tree type = TREE_TYPE (decl); > - tree ptr = gfc_conv_descriptor_data_get (decl); > + tree ptr; > + > + if (present) > + ptr = gfc_build_conditional_assign_expr ( > + &block, present, > + gfc_conv_descriptor_data_get (decl), > + null_pointer_node); I must say I don't like very much formatting like that, I'd find it cleaner to use temporary to have shorter argument and put all the arguments after the ( column. > + else > + ptr = gfc_conv_descriptor_data_get (decl); In this case, it could even be: ptr = gfc_conv_descriptor_data_get (decl); if (present) ptr = gfc_build_conditional_assign_expr (&block, present, ptr, null_pointer_node); by just using the same call from the else. > @@ -2252,6 +2385,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, > TREE_ADDRESSABLE (decl) = 1; > if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL) > { > + tree present = gfc_omp_is_optional_argument (decl) > + ? gfc_omp_check_optional_argument (decl, true) > + : NULL_TREE; > if (POINTER_TYPE_P (TREE_TYPE (decl)) > && (gfc_omp_privatize_by_reference (decl) > || GFC_DECL_GET_SCALAR_POINTER (decl) See above. > @@ -2284,6 +2420,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, > { > tree type = TREE_TYPE (decl); > tree ptr = gfc_conv_descriptor_data_get (decl); > + if (present) > + ptr = gfc_build_conditional_assign_expr ( > + block, present, ptr, > + null_pointer_node); And here the other comment. It is much more indented though, but you could use a temporary, like: tree nullarg = null_pointer_node; if (present) ptr = gfc_build_conditional_assign_expr (block, present, ptr, nullarg); Though, if it is too much for you, ignore. Another option would be shorten the name of the function, say s/conditional/cond/. There were some discussions about lifting the 80 column restriction and bump it to something like +-130, but nothing happened yet. > + ptr = gfc_build_conditional_assign_expr ( > + block, present, ptr, > + null_pointer_node); Again. > + stmtblock_t cond_block; > + gfc_init_block (&cond_block); > + tree size = gfc_full_array_size (&cond_block, decl, > + GFC_TYPE_ARRAY_RANK (type)); Here one could use a temporary for GFC_TYPE_ARRAY_RANK (type); > + if (present) > + { > + tree var = gfc_create_var (gfc_array_index_type, > + NULL); > + tree cond = fold_build2_loc (input_location, > + NE_EXPR, > + boolean_type_node, > + present, > + null_pointer_node); > + gfc_add_modify (&cond_block, var, size); > + gfc_add_expr_to_block (block, > + build3_loc (input_location, COND_EXPR, > + void_type_node, cond, > + gfc_finish_block (&cond_block), > + NULL_TREE)); And here for the expr, perhaps just reuse the cond variable. > @@ -2346,6 +2534,18 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, > OMP_CLAUSE_SIZE (node), elemsz); > } > } > + else if (present > + && TREE_CODE (decl) == INDIRECT_REF > + && TREE_CODE (TREE_OPERAND (decl, 0)) > + == INDIRECT_REF) Above I'd expect && (TREE_CODE (TREE_OPERAND (decl, 0)) == INDIRECT_REF)) but perhaps I'm just pushing my coding style too much, ignore in that case. Jakub
Hi! ;-P Jakub, thanks for furnishing me a fit occasion here: On 2019-12-05T16:15:15+0100, Jakub Jelinek <jakub@redhat.com> wrote: > [...] much more indented though, but you could > use a temporary, like: > tree nullarg = null_pointer_node; I object to cluttering the code by introducing temporary variables/names just for the sake of a few characters of screen width. Even if located close lexically, when reading the following code you still have to trace back from the 'nullarg' usage to its 'null_pointer_node' definition in order to figure out what a 'nullarg' might be: > if (present) > ptr > = gfc_build_conditional_assign_expr (block, present, > ptr, nullarg); > Another option would be shorten the name of the function, say > s/conditional/cond/. Likewise I object to "crippling" identifier names like that just for the sake of a few characters of screen width. (Here of course, "cond", or the existing "expr" might be fine abbreviations, but my point is about the general case.) > There were some discussions about lifting the 80 column restriction and bump > it to something like +-130, but nothing happened yet. Indeed. :-) In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner stated that even he is not using a 80 x 24 terminal anymore, and that should tell us something. ;-) So, I formally propose that we lift this characters per line restriction from IBM punch card (80) to mainframe line printer (132). Nonwithstanding that, we should try to not overstrain that; deep indentation often is a sign that code should be split out into a separate function, for example. My point is just to avoid things like the two examples cited above. Also, I'm not proposing any mass-reformatting of the existing code, or re-writing all "expr" into "expression". Tasks: - Discussion. - Get agreement/make a decision (by means still to be determined). - Put suitable Emacs/Vim configuration files into the source tree? - Update coding style guidelines. Grüße Thomas
On Thu, Dec 05, 2019 at 04:46:45PM +0100, Thomas Schwinge wrote: > Hi! > > ;-P Jakub, thanks for furnishing me a fit occasion here: > > On 2019-12-05T16:15:15+0100, Jakub Jelinek <jakub@redhat.com> wrote: > > [...] much more indented though, but you could > > use a temporary, like: > > tree nullarg = null_pointer_node; > > I object to cluttering the code by introducing temporary variables/names > just for the sake of a few characters of screen width. Even if located > close lexically, when reading the following code you still have to trace > back from the 'nullarg' usage to its 'null_pointer_node' definition in > order to figure out what a 'nullarg' might be: > > > if (present) > > ptr > > = gfc_build_conditional_assign_expr (block, present, > > ptr, nullarg); > > > Another option would be shorten the name of the function, say > > s/conditional/cond/. > > Likewise I object to "crippling" identifier names like that just for the > sake of a few characters of screen width. (Here of course, "cond", or > the existing "expr" might be fine abbreviations, but my point is about > the general case.) The point about temporaries is general, and I believe they actually make code much more readable. Mostly about coding style like: t = fold_build2_loc (loc, code, fold_build2_loc (loc, code2, something1, something2), fold_build2_loc (loc, code3, something3, something4)); vs. tree op1 = fold_build2_loc (loc, code2, something1, something2); tree op2 = fold_build2_loc (loc, code3, something3, something4); t = fold_build2_loc (loc, code, op1, op2); The above case is extreme in both being indented quite a lot (general rule is to consider outlining something into a function then) and using way too long function names. If you look at the earlier suggestion where the code is indented reasonably, using the temporary there makes the code more readable and shorter. > > > There were some discussions about lifting the 80 column restriction and bump > > it to something like +-130, but nothing happened yet. > > Indeed. :-) > > In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner > stated that even he is not using a 80 x 24 terminal anymore, and that > should tell us something. ;-) > > So, I formally propose that we lift this characters per line restriction > from IBM punch card (80) to mainframe line printer (132). Such a proposal would need to be accompanied with a wwwdocs codingconventions.html patch and contrib/check_GNU_style.{sh,py} patch I guess ;) Jakub
On Thu, 5 Dec 2019, Thomas Schwinge wrote: > In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner > stated that even he is not using a 80 x 24 terminal anymore, and that > should tell us something. ;-) > > So, I formally propose that we lift this characters per line restriction > from IBM punch card (80) to mainframe line printer (132). I thought these line lengths were based on readability studies suggesting lengths that lines shorter than 80 columns were more readable? Longer lines mean less space for multiple terminal / editor windows side-by-side to look at different pieces of code. I don't think that's an improvement.
> On Dec 5, 2019, at 11:17 AM, Joseph Myers <joseph@codesourcery.com> wrote: > > On Thu, 5 Dec 2019, Thomas Schwinge wrote: > >> In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner >> stated that even he is not using a 80 x 24 terminal anymore, and that >> should tell us something. ;-) >> >> So, I formally propose that we lift this characters per line restriction >> from IBM punch card (80) to mainframe line printer (132). > > I thought these line lengths were based on readability studies suggesting > lengths that lines shorter than 80 columns were more readable? That's certainly a general rule. There is a reason why books aren't wide, and why newspapers have columns. The eye can't deal well with long lines. So while 132 column lines are certainly possible with modern computers, it doesn't mean they are desirable. paul
On 12/5/19 9:24 AM, Paul Koning wrote: > > >> On Dec 5, 2019, at 11:17 AM, Joseph Myers <joseph@codesourcery.com> >> wrote: >> >> On Thu, 5 Dec 2019, Thomas Schwinge wrote: >> >>> In the relevant session at the GNU Tools Cauldron 2019, Michael >>> Meissner stated that even he is not using a 80 x 24 terminal >>> anymore, and that should tell us something. ;-) >>> >>> So, I formally propose that we lift this characters per line >>> restriction from IBM punch card (80) to mainframe line printer >>> (132). >> >> I thought these line lengths were based on readability studies >> suggesting lengths that lines shorter than 80 columns were more >> readable? > > That's certainly a general rule. There is a reason why books aren't > wide, and why newspapers have columns. The eye can't deal well with > long lines. So while 132 column lines are certainly possible with > modern computers, it doesn't mean they are desirable. I'd like to see the restriction relaxed. THe 80 column limit really presents readability problems and excessive expression wrapping to accommodate the limit. 132 seems like a very reasonable compromise. My biggest worry with moving to 132 columns is that it will discourage refactoring when indention levels cause excessive wrapping. Jeff
Hello, (oh a flame bait :) ) On Thu, 5 Dec 2019, Thomas Schwinge wrote: > So, I formally propose that we lift this characters per line restriction > from IBM punch card (80) to mainframe line printer (132). > > Tasks: > > - Discussion. I object to cluttering code in excuse for using sensible function names or temporaries that otherwise can help clearing up code. Using 132-char lines is cluttering code: - long lines are harder to read/grasp: vertical eye movement is easier than horizontal, and source code should be optimized for reading, not writing - long lines make it impossible to have two files next to each other at a comfortable font size - long lines are incompatible with existing netiquette re emails, for instance So, at least for me, that my terminals are 80 wide (but not x24) has multiple reasons, and the _least_ of it is because that's what punch cards had. Ciao, Michael.
* Paul Koning: > That's certainly a general rule. There is a reason why books aren't > wide, and why newspapers have columns. The eye can't deal well with > long lines. So while 132 column lines are certainly possible with > modern computers, it doesn't mean they are desirable. If the line starts at column 40 or so, I don't think readability suffers too much if it goes to column 100. If it starts at column 2, then it might be problematic. Frequent long lines reduce the usefulness of side-by-side diff viewers. And there are those of us who use screens in portrait mode, following the rule that a good function is not longer than a single screen. 1080 pixels give you 8 pixels per character, which isn't that much. Thanks, Florian
On Thu, 5 Dec 2019 at 16:44, Michael Matz <matz@suse.de> wrote: > > Hello, > > (oh a flame bait :) ) > > On Thu, 5 Dec 2019, Thomas Schwinge wrote: > > > So, I formally propose that we lift this characters per line restriction > > from IBM punch card (80) to mainframe line printer (132). > > > > Tasks: > > > > - Discussion. > > I object to cluttering code in excuse for using sensible function names or > temporaries that otherwise can help clearing up code. Using 132-char > lines is cluttering code: > - long lines are harder to read/grasp: vertical eye movement is easier > than horizontal, and source code should be optimized for > reading, not writing > - long lines make it impossible to have two files next to each other at a > comfortable font size > - long lines are incompatible with existing netiquette re emails, for > instance > > So, at least for me, that my terminals are 80 wide (but not x24) has > multiple reasons, and the _least_ of it is because that's what punch cards > had. C++17 introduces a nice feature, with rationale similar to declaring variables in a for-loop init-statement: if (auto var = foo(); bar(var)) The variable is only in scope for the block where you need it, just like a for-loop. Unfortunately nearly every time I've tried to use this recently, I've found it's impossible in 80 columns, e.g. this from yesterday: if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0; __c != 0) return __c; When you're forced to uglify every variable with a leading __ you run out of characters pretty damn quickly. I can either not use the feature (and have the variable defined in a larger scope than it needs to be) or add fairly arbitrary line breaks: if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0; __c != 0) return __c; or try to give the variables shorter (and less meaningful) names. Adding line breaks or picking shorter names doesn't help readability. So I end up not using the feature. I'm loosely in favour of relaxing the rule for libstdc++ code. I don't really care what the rest of GCC looks like ;-)
On Dec 5 2019, Michael Matz wrote: > >(oh a flame bait :) ) Quite. I shall try not to make it too much worse, but there's another point that needs mentioning. I find long names hard to read, with either short or long lines, especially when combined with variants like negotiate_twisty_little_passage, and negotiate_little_twisty_passage. And extending line lengths will probably encourage the use of longer names. As people say, there are conficting requirements, but I side with you in preferring 80 column lines. Actually, I tend to use less, but that's because of the (still current) frequency with which utilities make a mess of lines exactly 80 columns long in an 80 column field. Regards, Nick Maclaren.
On 05/12/2019 16:17, Joseph Myers wrote: > Longer lines mean less space for multiple terminal / editor windows > side-by-side to look at different pieces of code. I don't think that's an > improvement. Here's a data-point .... My 1920 pixel-wide screen, in the default font, allows 239 columns; not enough for two 130-wide editors. Especially not with line numbers and "gutter" columns. On the other hand, 80 columns does tend to cause some formatting contortions, with long function names and deeper indentations. I think a nice round 100 would be a good compromise. Andrew
On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote: > On Thu, 5 Dec 2019 at 16:44, Michael Matz <matz@suse.de> wrote: > > > > Hello, > > > > (oh a flame bait :) ) > > > > On Thu, 5 Dec 2019, Thomas Schwinge wrote: > > > > > So, I formally propose that we lift this characters per line restriction > > > from IBM punch card (80) to mainframe line printer (132). > > > > > > Tasks: > > > > > > - Discussion. > > > > I object to cluttering code in excuse for using sensible function names or > > temporaries that otherwise can help clearing up code. Using 132-char > > lines is cluttering code: > > - long lines are harder to read/grasp: vertical eye movement is easier > > than horizontal, and source code should be optimized for > > reading, not writing > > - long lines make it impossible to have two files next to each other at a > > comfortable font size > > - long lines are incompatible with existing netiquette re emails, for > > instance > > > > So, at least for me, that my terminals are 80 wide (but not x24) has > > multiple reasons, and the _least_ of it is because that's what punch cards > > had. > > C++17 introduces a nice feature, with rationale similar to declaring > variables in a for-loop init-statement: > > if (auto var = foo(); bar(var)) > > The variable is only in scope for the block where you need it, just > like a for-loop. > > Unfortunately nearly every time I've tried to use this recently, I've > found it's impossible in 80 columns, e.g. this from yesterday: > > if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> > 0; __c != 0) > return __c; > > When you're forced to uglify every variable with a leading __ you run > out of characters pretty damn quickly. > > I can either not use the feature (and have the variable defined in a > larger scope than it needs to be) or add fairly arbitrary line breaks: > > if (auto __c > = __builtin_memcmp(&*__first1, &*__first2, __len) > <=> 0; __c != 0) > return __c; > > or try to give the variables shorter (and less meaningful) names. > Adding line breaks or picking shorter names doesn't help readability. > So I end up not using the feature. > > I'm loosely in favour of relaxing the rule for libstdc++ code. I don't > really care what the rest of GCC looks like ;-) Not using such a nice feature just because of formatting sounds really shameful. Would the compromise of 100 chars make things any better here? Marek
On 12/5/19, Andrew Stubbs <ams@codesourcery.com> wrote: > On 05/12/2019 16:17, Joseph Myers wrote: >> Longer lines mean less space for multiple terminal / editor windows >> side-by-side to look at different pieces of code. I don't think that's >> an >> improvement. > > Here's a data-point .... > > My 1920 pixel-wide screen, in the default font, allows 239 columns; not > enough for two 130-wide editors. Especially not with line numbers and > "gutter" columns. > > On the other hand, 80 columns does tend to cause some formatting > contortions, with long function names and deeper indentations. > > I think a nice round 100 would be a good compromise. Here's mine: My 1280 pixel-wide screen allows 179 columns, which comes to 89.5 columns when divided by 2. I think rounding up to 90 columns would be a good compromise, although if that's too small, 100 is good as well. > > Andrew >
My IBM Selectric golfball electronic printer only does 90 characters on A4 in portrait mode………(at 10 cps) (as for my all electric TELEX Teleprinter machine !) Is this debate for real ?! - or is this a Christmas spoof ? External observer…..keep up the great work. :) (while I punch out a few more 80 column cards). > On 5 Dec 2019, at 17:55, Andrew Stubbs <ams@codesourcery.com> wrote: > > On 05/12/2019 16:17, Joseph Myers wrote: >> Longer lines mean less space for multiple terminal / editor windows >> side-by-side to look at different pieces of code. I don't think that's an >> improvement. > > Here's a data-point .... > > My 1920 pixel-wide screen, in the default font, allows 239 columns; not enough for two 130-wide editors. Especially not with line numbers and "gutter" columns. > > On the other hand, 80 columns does tend to cause some formatting contortions, with long function names and deeper indentations. > > I think a nice round 100 would be a good compromise. > > Andrew
On 12/5/19 8:46 AM, Thomas Schwinge wrote: > Hi! > > ;-P Jakub, thanks for furnishing me a fit occasion here: > > On 2019-12-05T16:15:15+0100, Jakub Jelinek <jakub@redhat.com> wrote: >> [...] much more indented though, but you could >> use a temporary, like: >> tree nullarg = null_pointer_node; > > I object to cluttering the code by introducing temporary variables/names > just for the sake of a few characters of screen width. Even if located > close lexically, when reading the following code you still have to trace > back from the 'nullarg' usage to its 'null_pointer_node' definition in > order to figure out what a 'nullarg' might be: > >> if (present) >> ptr >> = gfc_build_conditional_assign_expr (block, present, >> ptr, nullarg); The snippet of code above looks like it might be the symptom of another common problem: deeply nested conditionals, case statements, or loops in very large functions. Those usually make it much harder to follow code than local variables or expressions that are broken up to fit the width limit. Shorter functions typically also means fewer local variables. One thing I find improves readability in functions with many local variables is declaring const those that don't change after initialization. That also enforces the initialization- on-declaration coding style, and can result in more efficient code. Another solution that might help in this context is default function arguments: if the last argument may be null, making it the default in the function declaration avoids having to pass it explicitly. > >> Another option would be shorten the name of the function, say >> s/conditional/cond/. As long as it doesn't compromise readability this sounds like a good suggestion for a change to the function above. _cond_ is no less clear or descriptive than _conditional_, similarly to _expr vs _expression. > > Likewise I object to "crippling" identifier names like that just for the > sake of a few characters of screen width. (Here of course, "cond", or > the existing "expr" might be fine abbreviations, but my point is about > the general case.) > >> There were some discussions about lifting the 80 column restriction and bump >> it to something like +-130, but nothing happened yet. > > Indeed. :-) > > In the relevant session at the GNU Tools Cauldron 2019, Michael Meissner > stated that even he is not using a 80 x 24 terminal anymore, and that > should tell us something. ;-) > > So, I formally propose that we lift this characters per line restriction > from IBM punch card (80) to mainframe line printer (132). I'm not a fan of rigid rules, especially those that are subject to personal style preferences. At the same time I wouldn't like to see lines become as long as this as the norm. As others, I have windows om my desktop arranged in a way to maximize screen real estate: three columns of editor/debugger and a couple of terminals on of top of the other. They only fit because they're all 80 characters wide. But more important: > deep > indentation often is a sign that code should be split out into a separate > function, for example. Exactly. Martin My point is just to avoid things like the two > examples cited above. > > Also, I'm not proposing any mass-reformatting of the existing code, or > re-writing all "expr" into "expression". > > Tasks: > > - Discussion. > - Get agreement/make a decision (by means still to be determined). > - Put suitable Emacs/Vim configuration files into the source tree? > - Update coding style guidelines. > > > Grüße > Thomas >
All, As a certified Old Guy (coding in FORTRAN since 1967) my default mode is to stop at 72 characters. It would be nice to push the max out a bit, but I’ll most likely keep my lines (and my function and variable names!) short. As always, brevity is the soul of wit. And yes, please keep up the good work. Jim 3222 NE 89th St Seattle, WA 98115 (206) 430-0109 > On Dec 5, 2019, at 10:21 AM, Robin Curtis <Curtis@geoscience.co.uk> wrote: > > Is this debate for real ?! - or is this a Christmas spoof ? > > External observer…..keep up the great work. :) > > (while I punch out a few more 80 column cards).
Hi! On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote: > C++17 introduces a nice feature, with rationale similar to declaring > variables in a for-loop init-statement: > > if (auto var = foo(); bar(var)) Similar to GNU C statement expressions, which are *also* only a good idea to use in limited cases. > The variable is only in scope for the block where you need it, just > like a for-loop. > > Unfortunately nearly every time I've tried to use this recently, I've > found it's impossible in 80 columns, e.g. this from yesterday: > > if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> > 0; __c != 0) > return __c; > > When you're forced to uglify every variable with a leading __ you run > out of characters pretty damn quickly. If using this "nice feature" forces you to uglify your code, then maybe it is not such a nice feature, and you should not use it. If you have issues with scoping your functions are WAY too long already. Segher
On Thu, Dec 05, 2019 at 04:44:21PM +0000, Michael Matz wrote: > (oh a flame bait :) ) I will blissfully ignore that warning. > On Thu, 5 Dec 2019, Thomas Schwinge wrote: > I object to cluttering code in excuse for using sensible function names or > temporaries that otherwise can help clearing up code. Using 132-char > lines is cluttering code: > - long lines are harder to read/grasp: vertical eye movement is easier > than horizontal, and source code should be optimized for > reading, not writing > - long lines make it impossible to have two files next to each other at a > comfortable font size > - long lines are incompatible with existing netiquette re emails, for > instance > > So, at least for me, that my terminals are 80 wide (but not x24) has > multiple reasons, and the _least_ of it is because that's what punch cards > had. I agree with all of this. If you have a hard time writing nicely readable code in 80 columns, you will have a much harder time still using more columns. 80 is somewhat too long already (~60 is better), but we have indents in source code, so that's alright. And you should not indent that far, so this all works out splendidly. Segher
On Thu, Dec 05, 2019 at 05:04:12PM +0100, Jakub Jelinek wrote: > On Thu, Dec 05, 2019 at 04:46:45PM +0100, Thomas Schwinge wrote: > > On 2019-12-05T16:15:15+0100, Jakub Jelinek <jakub@redhat.com> wrote: > > > [...] much more indented though, but you could > > > use a temporary, like: > > > tree nullarg = null_pointer_node; > > > > I object to cluttering the code by introducing temporary variables/names > > just for the sake of a few characters of screen width. Even if located > > close lexically, when reading the following code you still have to trace > > back from the 'nullarg' usage to its 'null_pointer_node' definition in > > order to figure out what a 'nullarg' might be: > > > > > if (present) > > > ptr > > > = gfc_build_conditional_assign_expr (block, present, > > > ptr, nullarg); > > > > > Another option would be shorten the name of the function, say > > > s/conditional/cond/. > > > > Likewise I object to "crippling" identifier names like that just for the > > sake of a few characters of screen width. (Here of course, "cond", or > > the existing "expr" might be fine abbreviations, but my point is about > > the general case.) > > The point about temporaries is general, and I believe they actually make > code much more readable. Mostly about coding style like: > t = fold_build2_loc (loc, code, fold_build2_loc (loc, code2, > something1, > something2), > fold_build2_loc (loc, code3, something3, > something4)); > vs. > tree op1 = fold_build2_loc (loc, code2, something1, something2); > tree op2 = fold_build2_loc (loc, code3, something3, something4); > t = fold_build2_loc (loc, code, op1, op2); Yes. And the names, even if they do not say much, *do* say enough to help comprehending the code. They help structure it. > The above case is extreme in both being indented quite a lot (general rule > is to consider outlining something into a function then) I hope you mean actual factoring, not just outlining :-) If you pick good factors you can give them good names, too. Good names help reading the code. And on the other hand, when it is hard to come up with a good name for a piece of code, it is probably not chosen as a good factor anyway! > and using > way too long function names. If you look at the earlier suggestion where > the code is indented reasonably, using the temporary there makes the code more > readable and shorter. Yup. Segher
On Thu, Dec 05, 2019 at 11:54:04AM -0700, Martin Sebor wrote: > >> if (present) > >> ptr > >> = gfc_build_conditional_assign_expr (block, > >> present, > >> ptr, nullarg); > > The snippet of code above looks like it might be the symptom > of another common problem: deeply nested conditionals, case > statements, or loops in very large functions. Those usually > make it much harder to follow code than local variables or > expressions that are broken up to fit the width limit. > Shorter functions typically also means fewer local variables. Yes, and you only get problems that you do not know which var is which, or you do not know what value it is set to because it was set some 2800 (or 40 or whatever) lines ago, in WAY too long functions. All those problems do not exist in well-factored code. The point is not to have short routines: the point is to not have too much (external) complexity per routine. A routine should ideally only do one thing, and its name should describe what it does. Segher
On Thu, Dec 05, 2019 at 02:06:50PM -0600, Segher Boessenkool wrote: > Hi! > > On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote: > > C++17 introduces a nice feature, with rationale similar to declaring > > variables in a for-loop init-statement: > > > > if (auto var = foo(); bar(var)) > > Similar to GNU C statement expressions, which are *also* only a good > idea to use in limited cases. > > > The variable is only in scope for the block where you need it, just > > like a for-loop. > > > > Unfortunately nearly every time I've tried to use this recently, I've > > found it's impossible in 80 columns, e.g. this from yesterday: > > > > if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> > > 0; __c != 0) > > return __c; > > > > When you're forced to uglify every variable with a leading __ you run > > out of characters pretty damn quickly. > > If using this "nice feature" forces you to uglify your code, then maybe > it is not such a nice feature, and you should not use it. I disagree, it is a nice feature, without quotes. It's Good Style not to leak variables into enclosing scopes. > If you have issues with scoping your functions are WAY too long already. I don't think that's the case here. -- Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
On Thu, Dec 5, 2019 at 11:51 AM Michael Matz <matz@suse.de> wrote: > Hello, > > (oh a flame bait :) ) > > On Thu, 5 Dec 2019, Thomas Schwinge wrote: > > > So, I formally propose that we lift this characters per line restriction > > from IBM punch card (80) to mainframe line printer (132). > > > > Tasks: > > > > - Discussion. > > I object to cluttering code in excuse for using sensible function names or > temporaries that otherwise can help clearing up code. Using 132-char > lines is cluttering code: > - long lines are harder to read/grasp: vertical eye movement is easier > than horizontal, and source code should be optimized for > reading, not writing > - long lines make it impossible to have two files next to each other at a > comfortable font size > - long lines are incompatible with existing netiquette re emails, for > instance > > So, at least for me, that my terminals are 80 wide (but not x24) has > multiple reasons, and the _least_ of it is because that's what punch cards > had. > Agreed. I work with two side-by-side terminals, one 80x50 and the other as wide as fits in the rest of the screen, which currently happens to be 111x50. Jason
On Thu, 5 Dec 2019 at 20:07, Segher Boessenkool <segher@kernel.crashing.org> wrote: > > Hi! > > On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote: > > C++17 introduces a nice feature, with rationale similar to declaring > > variables in a for-loop init-statement: > > > > if (auto var = foo(); bar(var)) > > Similar to GNU C statement expressions, which are *also* only a good > idea to use in limited cases. > > > The variable is only in scope for the block where you need it, just > > like a for-loop. > > > > Unfortunately nearly every time I've tried to use this recently, I've > > found it's impossible in 80 columns, e.g. this from yesterday: > > > > if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> > > 0; __c != 0) > > return __c; > > > > When you're forced to uglify every variable with a leading __ you run > > out of characters pretty damn quickly. > > If using this "nice feature" forces you to uglify your code, then maybe > it is not such a nice feature, and you should not use it. The uglification has absolutely nothing to do with the 'if' init-statement feature, all code in libstdc++ headers has to be uglified, always. Blame the C preprocessor for that, not C++ features. My point is that 80 characters runs out quicker when 10% of it goes on visual noise that's only needed because the C preprocessor means we can't have nice names. > If you have issues with scoping your functions are WAY too long already. I don't have issues with scoping, it's just good practice to limit the scope to the minimum necessary. The example I'm talking about is: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/bits/stl_algobase.h;h=a2fd306e6d0cca579b510148ba1a7089e2b2f3a2;hb=HEAD#l1499 That function is 46 lines long, including a micro-optimisation to use memcmpy when appropriate. About 15% of that is on assertions to help users debug their mistakes. Another five lines are the compile-time if-constexpr checks to decide when the optimisation is appropriate, which result in deep nesting, but not because of any complex if-else branches. It could be nested less deeply to reduce indentation by 4 spaces, but that would result in more template instantiations for users, which would take more time and memory to compile. When you have a million users including the header in every C++ program, little things like that help. So the code is written in an unidiomatic way to optimise for compilation speed, not readability. But the end result is that the call to __min_cmp is indented about 25% of the way into the available space, and that the FIRST line of executable code in the function. The code is not way too long, and writing it differently would compile slower. The constraints that apply to that code are quite different to the internals of GCC, which most users never see or even compile themselves. (Yes, maybe libstdc++ should stop indenting everything by 2 columns when it's inside a namespace, given that almost everything is inside at least one level of namespace ... that might be a good idea even if the column limit is extended to 100 or 132, but we'll still have all the __ prefixes eating up space.)
On Thu, Dec 05, 2019 at 03:38:15PM -0500, Marek Polacek wrote: > On Thu, Dec 05, 2019 at 02:06:50PM -0600, Segher Boessenkool wrote: > > > When you're forced to uglify every variable with a leading __ you run > > > out of characters pretty damn quickly. > > > > If using this "nice feature" forces you to uglify your code, then maybe > > it is not such a nice feature, and you should not use it. > > I disagree, it is a nice feature, without quotes. It's Good Style not to > leak variables into enclosing scopes. It *is* a quote, from Jonathan's mail. Why is this Good Style? (And according to who?) > > If you have issues with scoping your functions are WAY too long already. > > I don't think that's the case here. Then it does not hurt to have a local that is visible in slightly longer scope than necessary. Simpler code is better code. Segher
On Thu, Dec 05, 2019 at 08:56:35PM +0000, Jonathan Wakely wrote: > On Thu, 5 Dec 2019 at 20:07, Segher Boessenkool > <segher@kernel.crashing.org> wrote: > > On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote: > > > C++17 introduces a nice feature, with rationale similar to declaring > > > variables in a for-loop init-statement: > > > Unfortunately nearly every time I've tried to use this recently, I've > > > found it's impossible in 80 columns, e.g. this from yesterday: > > > > > > if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> > > > 0; __c != 0) > > > return __c; > > > > > > When you're forced to uglify every variable with a leading __ you run > > > out of characters pretty damn quickly. > > > > If using this "nice feature" forces you to uglify your code, then maybe > > it is not such a nice feature, and you should not use it. > > The uglification has absolutely nothing to do with the 'if' > init-statement feature, all code in libstdc++ headers has to be > uglified, always. Blame the C preprocessor for that, not C++ features. > > My point is that 80 characters runs out quicker when 10% of it goes on > visual noise that's only needed because the C preprocessor means we > can't have nice names. (Not sure where the preprocessor comes in, these underscores are just to satisfy language rules afaics, but maybe you mean something else?) Or you could write auto __c = (__builtin_memcmp(&*__first1, &*__first2, __len) <=> 0); if (__c) return __c; which is much easier to read, to my eyes anyway. And it is exactly the same for the compiler. Wrapping everything inside-out just for the artificial goal of having things defined in slightly tighter scopes feels futile. Segher
On Thu, 5 Dec 2019 at 22:19, Segher Boessenkool wrote: > > On Thu, Dec 05, 2019 at 08:56:35PM +0000, Jonathan Wakely wrote: > > On Thu, 5 Dec 2019 at 20:07, Segher Boessenkool > > <segher@kernel.crashing.org> wrote: > > > On Thu, Dec 05, 2019 at 05:03:43PM +0000, Jonathan Wakely wrote: > > > > C++17 introduces a nice feature, with rationale similar to declaring > > > > variables in a for-loop init-statement: > > > > > Unfortunately nearly every time I've tried to use this recently, I've > > > > found it's impossible in 80 columns, e.g. this from yesterday: > > > > > > > > if (auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> > > > > 0; __c != 0) > > > > return __c; > > > > > > > > When you're forced to uglify every variable with a leading __ you run > > > > out of characters pretty damn quickly. > > > > > > If using this "nice feature" forces you to uglify your code, then maybe > > > it is not such a nice feature, and you should not use it. > > > > The uglification has absolutely nothing to do with the 'if' > > init-statement feature, all code in libstdc++ headers has to be > > uglified, always. Blame the C preprocessor for that, not C++ features. > > > > My point is that 80 characters runs out quicker when 10% of it goes on > > visual noise that's only needed because the C preprocessor means we > > can't have nice names. > > (Not sure where the preprocessor comes in, these underscores are just to > satisfy language rules afaics, but maybe you mean something else?) The language rules are only necessary because of the preprocessor. Users can define macros with names like "pred" and "cmp" before including any standard library header, and because macros don't respect lexical scope, that would break any standard library code using "pred" and "cmp". So the std::lib has to use reserved names. It's entirely due to the preprocessor that we can't use sane names for local variables, or for any implementation detail in a header (unlike in C, our helper types and functions are in a namespace so won't collide with users' types and functions, but we still have to use reserved names because the preprocessor doesn't respect namespaces).
On Thu, 5 Dec 2019 at 22:19, Segher Boessenkool wrote: > Or you could write > > auto __c = (__builtin_memcmp(&*__first1, &*__first2, __len) <=> 0); > if (__c) > return __c; > > which is much easier to read, to my eyes anyway. And it is exactly the > same for the compiler. In this case yes, but not in general. Given: auto x = foo(); if (bar(x)) { } some_type y; The destructor of x won't run until after y has been destroyed. That's not at all identical to: if (auto x = foo(); bar(x)) { } some_type y; Please don't try to tell me how C++ works :-)
On Thu, Dec 05, 2019 at 10:37:33PM +0000, Jonathan Wakely wrote: > On Thu, 5 Dec 2019 at 22:19, Segher Boessenkool wrote: > > Or you could write > > > > auto __c = (__builtin_memcmp(&*__first1, &*__first2, __len) <=> 0); > > if (__c) > > return __c; > > > > which is much easier to read, to my eyes anyway. And it is exactly the > > same for the compiler. > > In this case yes, but not in general. > > Given: > > auto x = foo(); > if (bar(x)) > { } > some_type y; > > The destructor of x won't run until after y has been destroyed. That's > not at all identical to: > > if (auto x = foo(); bar(x)) > { } > some_type y; > > Please don't try to tell me how C++ works :-) I don't, I wouldn't even *know* that. But this is just the same as in C, and I do know how to write good C code. I don't think doing non-trivial things with constructors and destructors (or anything else!) implicitly is a good idea at all, but that's an altogether different subject. Segher
On 05/12/2019 18:21, Robin Curtis wrote: > My IBM Selectric golfball electronic printer only does 90 characters on A4 in portrait mode………(at 10 cps) > > (as for my all electric TELEX Teleprinter machine !) > > Is this debate for real ?! - or is this a Christmas spoof ? I can't speak for the debate, but the pain is real. Andrew
Hi Tobias! On 2019-11-20T14:06:18+0100, Tobias Burnus <tobias@codesourcery.com> wrote: > This patch does two things regarding explicit and automatical variable > mapping to offloaded devices: Thanks! > Compared to the previous patch set,** I added several OpenMP test cases > – and fixed the fallout. I'm seeing: PASS: libgomp.fortran/use_device_addr-3.f90 -O0 (test for excess errors) PASS: libgomp.fortran/use_device_addr-3.f90 -O0 execution test PASS: libgomp.fortran/use_device_addr-3.f90 -O1 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90 -O1 execution test PASS: libgomp.fortran/use_device_addr-3.f90 -O2 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90 -O2 execution test PASS: libgomp.fortran/use_device_addr-3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: libgomp.fortran/use_device_addr-3.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90 -O3 -g execution test PASS: libgomp.fortran/use_device_addr-3.f90 -Os (test for excess errors) PASS: libgomp.fortran/use_device_addr-3.f90 -Os execution test libgomp: Trying to map into device [0x7ffd7915a500..0x7ffd7915a500) object when [0x7ffd7915c440..0x7ffd7915c440) is already mapped PASS: libgomp.fortran/use_device_addr-4.f90 -O0 (test for excess errors) PASS: libgomp.fortran/use_device_addr-4.f90 -O0 execution test PASS: libgomp.fortran/use_device_addr-4.f90 -O1 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-4.f90 -O1 execution test PASS: libgomp.fortran/use_device_addr-4.f90 -O2 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-4.f90 -O2 execution test PASS: libgomp.fortran/use_device_addr-4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: libgomp.fortran/use_device_addr-4.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-4.f90 -O3 -g execution test PASS: libgomp.fortran/use_device_addr-4.f90 -Os (test for excess errors) PASS: libgomp.fortran/use_device_addr-4.f90 -Os execution test libgomp: Trying to map into device [0x7ffc70529ca0..0x7ffc70529ca0) object when [0x7ffc7052ac40..0x7ffc7052ac40) is already mapped > Except for trivial changes to libgomp/oacc-mem.c Just because something is just a few lines of code doesn't mean that it's trivial. I had asked you to first resolve that issue separately (referencing PR92726), instead of mixing these changes into the Fortran optional agruments commit: <http://mid.mail-archive.com/87lfrylp8k.fsf@euler.schwinge.homeip.net> "[PR92726] OpenACC: 'NULL'-in -> no-op, and/or 'NULL'-out". There was no reason to rush in the Fortran optional arguments changes before resolving the 'NULL' changes, was there? > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 > @@ -0,0 +1,23 @@ > +! Test that the cache directives work with optional arguments. [...] Missing '{ dg-do run }'. > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 > +! of giving a non-present argument to the cache directive is not tested as > +! it is undefined. > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90 > @@ -0,0 +1,112 @@ > +! Test that optional arguments work in firstprivate clauses. The effect of > +! non-present arguments in firstprivate clauses is undefined, and is not > +! tested for. > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90 > @@ -0,0 +1,115 @@ > +! Test that optional arguments work in private clauses. The effect of > +! non-present arguments in private clauses is undefined, and is not tested > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 > @@ -0,0 +1,69 @@ > +! Test optional arguments in reduction clauses. The effect of > +! non-present arguments in reduction clauses is undefined, and is not tested Once you've got access, please file a ticket at <https://github.com/OpenACC/openacc-spec/issues> so this gets clarified. > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 > +! { dg-additional-options "-w" } Why that? Grüße Thomas
Hi Thomas, [Attached patch committed as Rev. 279217] [One OpenMP (+OpenACC) patch and one OpenACC-only patch pending review are linked below.] On 12/7/19 3:49 PM, Thomas Schwinge wrote: > I'm seeing: > [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90 -O1 execution test Whether it passed or not depended whether the stack was NULL for the local "array_arg.0" variable (which got assigned "array_arg->data" if the argument was present). Fixed by the following patch, which also fixes some more corner cases: https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00707.html – pending Jakub's review >> Except for trivial changes to libgomp/oacc-mem.c > Just because something is just a few lines of code doesn't mean that it's > trivial. I had asked you to first resolve that issue separately > (referencing PR92726) I vaguely remembered the this email – but couldn't find it in the thread. (This thread has 24 emails!) Turned out that you wrote tht email in a different thread – and while this patch is mainly about Fortran, your email was also not set to fortran@ (arguably, your quoted text didn't include any Fortran bits). Pending OpenACC review the thread you mentedion: https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00062.html >> --- /dev/null >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 >> @@ -0,0 +1,23 @@ >> +! Test that the cache directives work with optional arguments. [...] > Missing '{ dg-do run }'. Fixed/committed. >> +! of giving a non-present argument to the cache directive is not tested as >> +! it is undefined. […] >> +! The effect of >> +! non-present aguments in firstprivate clauses is undefined […] The effect of >> +! non-present arguments in reduction clauses is undefined > Once you've got access, please file a ticket at > <https://github.com/OpenACC/openacc-spec/issues> so this gets clarified. I will try to remember this – it's now on my to-do list. >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 >> +! { dg-additional-options "-w" } > Why that? No idea (was added by Kwok) – I do see warnings with "-Wall", but not with default options. (It then shows warnings like: "‘rg.25’ is used uninitialized in this function [-Wuninitialized]".) I have now also removed this line → committed after testing that it runs through with nvptx. Tobias
2019-10-20 Tobias Burnus <tobias@codesourcery.com> Kwok Cheung Yeung <kcy@codesourcery.com> gcc/fortran/ * trans-openmp.c (gfc_build_conditional_assign, gfc_build_conditional_assign_expr): New static functions. (gfc_omp_finish_clause, gfc_trans_omp_clauses): Handle mapping of absent optional arguments and fix mapping of present optional args. gcc/ * omp-low.c (lower_omp_target): For optional arguments, deref once more to obtain the type. libgomp/ * oacc-mem.c (update_dev_host, gomp_acc_insert_pointer): Just return if input it a NULL pointer. * testsuite/libgomp.oacc-c-c++-common/lib-43.c: Remove; dependent on diagnostic of NULL pointer. * testsuite/libgomp.oacc-c-c++-common/lib-47.c: Ditto. * testsuite/libgomp.fortran/optional-map.f90: New. * testsuite/libgomp.fortran/use_device_addr-1.f90 (test_dummy_opt_callee_1_absent): New. (test_dummy_opt_call_1): Call it. * testsuite/libgomp.fortran/use_device_addr-2.f90: Likewise. * testsuite/libgomp.fortran/use_device_addr-3.f90: Likewise. * testsuite/libgomp.fortran/use_device_addr-4.f90: Likewise. * testsuite/libgomp.oacc-fortran/optional-cache.f95: New. * testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90: New. * testsuite/libgomp.oacc-fortran/optional-data-copyin.f90: New. * testsuite/libgomp.oacc-fortran/optional-data-copyout.f90: New. * testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90: New. * testsuite/libgomp.oacc-fortran/optional-declare.f90: New. * testsuite/libgomp.oacc-fortran/optional-firstprivate.f90: New. * testsuite/libgomp.oacc-fortran/optional-host_data.f90: New. * testsuite/libgomp.oacc-fortran/optional-nested-calls.f90: New. * testsuite/libgomp.oacc-fortran/optional-private.f90: New. * testsuite/libgomp.oacc-fortran/optional-reduction.f90: New. * testsuite/libgomp.oacc-fortran/optional-update-device.f90: New. * testsuite/libgomp.oacc-fortran/optional-update-host.f90: New. gcc/fortran/trans-openmp.c | 224 ++++++++++++++++++++++++++++++++++++--- gcc/omp-low.c | 3 +- libgomp/oacc-mem.c | 9 ++ libgomp/testsuite/libgomp.fortran/optional-map.f90 | 119 +++++++++++++++++++++ libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90 | 36 +++++++ libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90 | 36 +++++++ libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90 | 27 +++++ libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90 | 27 +++++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c | 51 --------- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c | 49 --------- libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 | 23 ++++ libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90 | 29 +++++ libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin.f90 | 140 ++++++++++++++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyout.f90 | 96 +++++++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90 | 91 ++++++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-declare.f90 | 87 +++++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90 | 112 ++++++++++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-host_data.f90 | 39 +++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-nested-calls.f90 | 135 +++++++++++++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90 | 115 ++++++++++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 | 69 ++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-update-device.f90 | 121 +++++++++++++++++++++ libgomp/testsuite/libgomp.oacc-fortran/optional-update-host.f90 | 115 ++++++++++++++++++++ 23 files changed, 1640 insertions(+), 113 deletions(-) diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index d9dfcabc65e..77bc9120d85 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -1175,6 +1175,64 @@ gfc_omp_clause_dtor (tree clause, tree decl) return tem; } +/* Build a conditional expression in BLOCK. If COND_VAL is not + null, then the block THEN_B is executed, otherwise ELSE_VAL + is assigned to VAL. */ + +static void +gfc_build_conditional_assign (stmtblock_t *block, + tree val, + tree cond_val, + tree then_b, + tree else_val) +{ + stmtblock_t cond_block; + tree cond, else_b = NULL_TREE; + tree val_ty = TREE_TYPE (val); + + if (else_val) + { + gfc_init_block (&cond_block); + gfc_add_modify (&cond_block, val, fold_convert (val_ty, else_val)); + else_b = gfc_finish_block (&cond_block); + } + cond = fold_build2_loc (input_location, NE_EXPR, + logical_type_node, + cond_val, null_pointer_node); + gfc_add_expr_to_block (block, + build3_loc (input_location, + COND_EXPR, + void_type_node, + cond, then_b, + else_b)); +} + +/* Build a conditional expression in BLOCK, returning a temporary + variable containing the result. If COND_VAL is not null, then + THEN_VAL will be assigned to the variable, otherwise ELSE_VAL + is assigned. + */ + +static tree +gfc_build_conditional_assign_expr (stmtblock_t *block, + tree cond_val, + tree then_val, + tree else_val) +{ + tree val; + tree val_ty = TREE_TYPE (then_val); + stmtblock_t cond_block; + + val = create_tmp_var (val_ty); + + gfc_init_block (&cond_block); + gfc_add_modify (&cond_block, val, then_val); + tree then_b = gfc_finish_block (&cond_block); + + gfc_build_conditional_assign (block, val, cond_val, then_b, else_val); + + return val; +} void gfc_omp_finish_clause (tree c, gimple_seq *pre_p) @@ -1199,6 +1257,8 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p) } tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE; + tree present = gfc_omp_is_optional_argument (decl) + ? gfc_omp_check_optional_argument (decl, true) : NULL_TREE; if (POINTER_TYPE_P (TREE_TYPE (decl))) { if (!gfc_omp_privatize_by_reference (decl) @@ -1213,8 +1273,30 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p) OMP_CLAUSE_DECL (c4) = decl; OMP_CLAUSE_SIZE (c4) = size_int (0); decl = build_fold_indirect_ref (decl); - OMP_CLAUSE_DECL (c) = decl; - OMP_CLAUSE_SIZE (c) = NULL_TREE; + if (present + && (GFC_DECL_GET_SCALAR_POINTER (orig_decl) + || GFC_DECL_GET_SCALAR_ALLOCATABLE (orig_decl))) + { + c2 = build_omp_clause (input_location, OMP_CLAUSE_MAP); + OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_POINTER); + OMP_CLAUSE_DECL (c2) = decl; + OMP_CLAUSE_SIZE (c2) = size_int (0); + + stmtblock_t block; + gfc_start_block (&block); + tree ptr = decl; + ptr = gfc_build_conditional_assign_expr (&block, present, decl, + null_pointer_node); + gimplify_and_add (gfc_finish_block (&block), pre_p); + ptr = build_fold_indirect_ref (ptr); + OMP_CLAUSE_DECL (c) = ptr; + OMP_CLAUSE_SIZE (c) = TYPE_SIZE_UNIT (TREE_TYPE (ptr)); + } + else + { + OMP_CLAUSE_DECL (c) = decl; + OMP_CLAUSE_SIZE (c) = NULL_TREE; + } if (TREE_CODE (TREE_TYPE (orig_decl)) == REFERENCE_TYPE && (GFC_DECL_GET_SCALAR_POINTER (orig_decl) || GFC_DECL_GET_SCALAR_ALLOCATABLE (orig_decl))) @@ -1232,17 +1314,43 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p) stmtblock_t block; gfc_start_block (&block); tree type = TREE_TYPE (decl); - tree ptr = gfc_conv_descriptor_data_get (decl); + tree ptr; + + if (present) + ptr = gfc_build_conditional_assign_expr ( + &block, present, + gfc_conv_descriptor_data_get (decl), + null_pointer_node); + else + ptr = gfc_conv_descriptor_data_get (decl); ptr = fold_convert (build_pointer_type (char_type_node), ptr); ptr = build_fold_indirect_ref (ptr); OMP_CLAUSE_DECL (c) = ptr; c2 = build_omp_clause (input_location, OMP_CLAUSE_MAP); OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_TO_PSET); - OMP_CLAUSE_DECL (c2) = decl; + if (present) + { + ptr = create_tmp_var (TREE_TYPE (TREE_OPERAND (decl, 0))); + gfc_add_modify (&block, ptr, TREE_OPERAND (decl, 0)); + + OMP_CLAUSE_DECL (c2) = build_fold_indirect_ref (ptr); + } + else + OMP_CLAUSE_DECL (c2) = decl; OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (type); c3 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); OMP_CLAUSE_SET_MAP_KIND (c3, GOMP_MAP_POINTER); - OMP_CLAUSE_DECL (c3) = gfc_conv_descriptor_data_get (decl); + if (present) + { + ptr = gfc_conv_descriptor_data_get (decl); + ptr = gfc_build_addr_expr (NULL, ptr); + ptr = gfc_build_conditional_assign_expr (&block, present, + ptr, null_pointer_node); + ptr = build_fold_indirect_ref (ptr); + OMP_CLAUSE_DECL (c3) = ptr; + } + else + OMP_CLAUSE_DECL (c3) = gfc_conv_descriptor_data_get (decl); OMP_CLAUSE_SIZE (c3) = size_int (0); tree size = create_tmp_var (gfc_array_index_type); tree elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type)); @@ -1268,11 +1376,36 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p) tem = gfc_conv_descriptor_data_get (decl); tem = fold_convert (pvoid_type_node, tem); cond = fold_build2_loc (input_location, NE_EXPR, - logical_type_node, tem, null_pointer_node); + boolean_type_node, tem, null_pointer_node); + if (present) + { + tem = fold_build2_loc (input_location, NE_EXPR, boolean_type_node, + present, null_pointer_node); + cond = fold_build2_loc (input_location, TRUTH_ANDIF_EXPR, + boolean_type_node, tem, cond); + } gfc_add_expr_to_block (&block, build3_loc (input_location, COND_EXPR, void_type_node, cond, then_b, else_b)); } + else if (present) + { + stmtblock_t cond_block; + tree then_b; + + gfc_init_block (&cond_block); + gfc_add_modify (&cond_block, size, + gfc_full_array_size (&cond_block, decl, + GFC_TYPE_ARRAY_RANK (type))); + gfc_add_modify (&cond_block, size, + fold_build2 (MULT_EXPR, gfc_array_index_type, + size, elemsz)); + then_b = gfc_finish_block (&cond_block); + + gfc_build_conditional_assign (&block, size, present, then_b, + build_int_cst (gfc_array_index_type, + 0)); + } else { gfc_add_modify (&block, size, @@ -2252,6 +2385,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, TREE_ADDRESSABLE (decl) = 1; if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL) { + tree present = gfc_omp_is_optional_argument (decl) + ? gfc_omp_check_optional_argument (decl, true) + : NULL_TREE; if (POINTER_TYPE_P (TREE_TYPE (decl)) && (gfc_omp_privatize_by_reference (decl) || GFC_DECL_GET_SCALAR_POINTER (decl) @@ -2284,6 +2420,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, { tree type = TREE_TYPE (decl); tree ptr = gfc_conv_descriptor_data_get (decl); + if (present) + ptr = gfc_build_conditional_assign_expr ( + block, present, ptr, + null_pointer_node); ptr = fold_convert (build_pointer_type (char_type_node), ptr); ptr = build_fold_indirect_ref (ptr); @@ -2296,8 +2436,19 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); - OMP_CLAUSE_DECL (node3) - = gfc_conv_descriptor_data_get (decl); + if (present) + { + ptr = gfc_conv_descriptor_data_get (decl); + ptr = gfc_build_addr_expr (NULL, ptr); + ptr = gfc_build_conditional_assign_expr ( + block, present, ptr, + null_pointer_node); + ptr = build_fold_indirect_ref (ptr); + OMP_CLAUSE_DECL (node3) = ptr; + } + else + OMP_CLAUSE_DECL (node3) + = gfc_conv_descriptor_data_get (decl); OMP_CLAUSE_SIZE (node3) = size_int (0); /* We have to check for n->sym->attr.dimension because @@ -2322,8 +2473,20 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, tem = gfc_conv_descriptor_data_get (decl); tem = fold_convert (pvoid_type_node, tem); cond = fold_build2_loc (input_location, NE_EXPR, - logical_type_node, + boolean_type_node, tem, null_pointer_node); + if (present) + { + tree tmp = fold_build2_loc (input_location, + NE_EXPR, + boolean_type_node, + present, + null_pointer_node); + cond = fold_build2_loc (input_location, + TRUTH_ANDIF_EXPR, + boolean_type_node, + tmp, cond); + } gfc_add_expr_to_block (block, build3_loc (input_location, COND_EXPR, @@ -2333,9 +2496,34 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, OMP_CLAUSE_SIZE (node) = size; } else if (n->sym->attr.dimension) - OMP_CLAUSE_SIZE (node) - = gfc_full_array_size (block, decl, - GFC_TYPE_ARRAY_RANK (type)); + { + stmtblock_t cond_block; + gfc_init_block (&cond_block); + tree size = gfc_full_array_size (&cond_block, decl, + GFC_TYPE_ARRAY_RANK (type)); + if (present) + { + tree var = gfc_create_var (gfc_array_index_type, + NULL); + tree cond = fold_build2_loc (input_location, + NE_EXPR, + boolean_type_node, + present, + null_pointer_node); + gfc_add_modify (&cond_block, var, size); + gfc_add_expr_to_block (block, + build3_loc (input_location, COND_EXPR, + void_type_node, cond, + gfc_finish_block (&cond_block), + NULL_TREE)); + OMP_CLAUSE_SIZE (node) = var; + } + else + { + gfc_add_block_to_block (block, &cond_block); + OMP_CLAUSE_SIZE (node) = size; + } + } if (n->sym->attr.dimension) { tree elemsz @@ -2346,6 +2534,18 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, OMP_CLAUSE_SIZE (node), elemsz); } } + else if (present + && TREE_CODE (decl) == INDIRECT_REF + && TREE_CODE (TREE_OPERAND (decl, 0)) + == INDIRECT_REF) + { + /* A single indirectref is handled by the middle end. */ + gcc_assert (!POINTER_TYPE_P (TREE_TYPE (decl))); + decl = TREE_OPERAND (decl, 0); + decl = gfc_build_conditional_assign_expr ( + block, present, decl, null_pointer_node); + OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (decl); + } else OMP_CLAUSE_DECL (node) = decl; } diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 19132f76da2..8d6742e7223 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -11817,7 +11817,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) { gcc_checking_assert (is_gimple_omp_oacc (ctx->stmt)); s = TREE_TYPE (ovar); - if (TREE_CODE (s) == REFERENCE_TYPE) + if (TREE_CODE (s) == REFERENCE_TYPE + || omp_check_optional_argument (ovar, false)) s = TREE_TYPE (s); s = TYPE_SIZE_UNIT (s); } diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index 2f271009fb8..e5088014ccc 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -831,6 +831,12 @@ update_dev_host (int is_dev, void *h, size_t s, int async) if (acc_dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) return; + /* Fortran optional arguments that are non-present result in a + NULL host address here. This can safely be ignored as it is + not possible to 'update' a non-present optional argument. */ + if (h == NULL) + return; + acc_prof_info prof_info; acc_api_info api_info; bool profiling_p = GOACC_PROFILING_SETUP_P (thr, &prof_info, &api_info); @@ -901,6 +907,9 @@ gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes, struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; + if (*hostaddrs == NULL) + return; + if (acc_is_present (*hostaddrs, *sizes)) { splay_tree_key n; diff --git a/libgomp/testsuite/libgomp.fortran/optional-map.f90 b/libgomp/testsuite/libgomp.fortran/optional-map.f90 new file mode 100644 index 00000000000..56f6e59e815 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/optional-map.f90 @@ -0,0 +1,119 @@ +implicit none (type, external) +call sub() +call sub2() +call call_present_1() +call call_present_2() + +contains + +subroutine call_present_1() + integer :: ii, ival, iarr, iptr, iparr + pointer :: iptr, iparr + dimension :: iarr(2), iparr(:) + allocate(iptr,iparr(2)) + ii = 101 + ival = 102 + iptr = 103 + iarr = 104 + iparr = 105 + call sub_present(ii, ival, iarr, iptr, iparr) + deallocate(iptr,iparr) +end subroutine + +subroutine call_present_2() + integer :: ii, ival, iarr, iptr, iparr + pointer :: iptr, iparr + dimension :: iarr(2), iparr(:) + allocate(iptr,iparr(2)) + ii = 201 + ival = 202 + iptr = 203 + iarr = 204 + iparr = 205 + call sub2_present(ii, ival, iarr, iptr, iparr) + deallocate(iptr,iparr) +end subroutine + +subroutine sub(ii, ival, iarr, iptr, iparr) + integer, optional :: ii, ival, iarr, iptr, iparr + pointer :: iptr, iparr + dimension :: iarr(:), iparr(:) + value :: ival + integer :: err + err = 42 + !$omp target map(ii, ival, iarr, iptr, iparr, err) + if (present(ii)) then + ii = iptr + ival + iarr = iparr + else + err = 0 + end if + if (present(ii)) err = 1 + if (present(ival)) err = 2 + if (present(iarr)) err = 3 + if (present(iptr)) err = 4 + if (present(iparr)) err = 5 + !$omp end target + if (err /= 0) stop 1 +end subroutine sub + +subroutine sub2(ii, ival, iarr, iptr, iparr) + integer, optional :: ii, ival, iarr, iptr, iparr + pointer :: iptr, iparr + dimension :: iarr(:), iparr(:) + value :: ival + integer :: err(1) ! otherwise, implied defaultmap is firstprivate + err(1) = 42 + !$omp target ! automatic mapping with implied defaultmap(tofrom) + if (present(ii)) then + ii = iptr + ival + iarr = iparr + else + err(1) = 0 + end if + if (present(ii)) err(1) = 1 + if (present(ival)) err(1) = 2 + if (present(iarr)) err(1) = 3 + if (present(iptr)) err(1) = 4 + if (present(iparr)) err(1) = 5 + !$omp end target + if (err(1) /= 0) stop 2 +end subroutine sub2 + +subroutine sub_present(ii, ival, iarr, iptr, iparr) + integer, optional :: ii, ival, iarr, iptr, iparr + pointer :: iptr, iparr + dimension :: iarr(:), iparr(:) + value :: ival + integer :: err + err = 42 + !$omp target map(ii, ival, iarr, iptr, iparr, err) + if (.not.present(ii)) err = 1 + if (.not.present(ival)) err = 2 + if (.not.present(iarr)) err = 3 + if (.not.present(iptr)) err = 4 + if (.not.present(iparr)) err = 5 + err = err - 42 - 101-102-103-104-105 + ii+ival+iarr(2)+iptr+iparr(2) + !$omp end target + if (err /= 0) stop 3 +end subroutine sub_present + +subroutine sub2_present(ii, ival, iarr, iptr, iparr) + integer, optional :: ii, ival, iarr, iptr, iparr + pointer :: iptr, iparr + dimension :: iarr(:), iparr(:) + value :: ival + integer :: err(1) ! otherwise, implied defaultmap is firstprivate + err(1) = 53 + !$omp target ! automatic mapping with implied defaultmap(tofrom) + ! Note: OpenMP 4.5's 'defaultmap' is not yet supported, PR 92568 + if (.not.present(ii)) err = 1 + if (.not.present(ival)) err = 2 + if (.not.present(iarr)) err = 3 + if (.not.present(iptr)) err = 4 + if (.not.present(iparr)) err = 5 + err = err - 53 - 201-202-203-204-205 + ii+ival+iarr(2)+iptr+iparr(2) + !$omp end target + if (err(1) /= 0) stop 4 +end subroutine sub2_present +end diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90 index 94ac76f5700..0254f2dc196 100644 --- a/libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90 +++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-1.f90 @@ -472,6 +472,7 @@ contains hh = 88.0_c_double call test_dummy_opt_callee_1(aa, bb, cc, dd, ee, ff, gg, hh, N) + call test_dummy_opt_callee_1_absent(N=N) deallocate(ee, ff) ! pointers, only end subroutine test_dummy_opt_call_1 @@ -527,6 +528,41 @@ contains if (any(abs(3.0_c_double * gg - hh) > 10.0_c_double * epsilon(gg))) stop 72 end subroutine test_dummy_opt_callee_1 + subroutine test_dummy_opt_callee_1_absent(aa, bb, cc, dd, ee, ff, gg, hh, N) + ! scalars + real(c_double), optional, target :: aa, bb + real(c_double), optional, target, allocatable :: cc, dd + real(c_double), optional, pointer :: ee, ff + + ! non-descriptor arrays + real(c_double), optional, target :: gg(N), hh(N) + integer, value :: N + + integer :: err + + ! All shall be absent + if (present(aa) .or. present(bb)) stop 243 + if (present(cc) .or. present(dd)) stop 244 + if (present(ee) .or. present(ff)) stop 245 + if (present(gg) .or. present(hh)) stop 246 + + !$omp target data map(to:aa) map(from:bb) use_device_addr(aa,bb) + if (present(aa) .or. present(bb)) stop 247 + !$omp end target data + + !$omp target data map(to:cc) map(from:dd) use_device_addr(cc,dd) + if (present(cc) .or. present(dd)) stop 248 + !$omp end target data + + !$omp target data map(to:ee) map(from:ff) use_device_addr(ee,ff) + if (present(ee) .or. present(ff)) stop 249 + !$omp end target data + + !$omp target data map(to:gg) map(from:hh) use_device_addr(gg,hh) + if (present(gg) .or. present(hh)) stop 250 + !$omp end target data + end subroutine test_dummy_opt_callee_1_absent + ! Save device ptr - and recall pointer subroutine test_dummy_opt_call_2() integer, parameter :: N = 1000 diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90 index d6c5a672370..3dd1f90f04c 100644 --- a/libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90 +++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-2.f90 @@ -472,6 +472,7 @@ contains hh = 88.0_c_float call test_dummy_opt_callee_1(aa, bb, cc, dd, ee, ff, gg, hh, N) + call test_dummy_opt_callee_1_absent(N=N) deallocate(ee, ff) ! pointers, only end subroutine test_dummy_opt_call_1 @@ -527,6 +528,41 @@ contains if (any(abs(3.0_c_float * gg - hh) > 10.0_c_float * epsilon(gg))) stop 72 end subroutine test_dummy_opt_callee_1 + subroutine test_dummy_opt_callee_1_absent(aa, bb, cc, dd, ee, ff, gg, hh, N) + ! scalars + real(c_float), optional, target :: aa, bb + real(c_float), optional, target, allocatable :: cc, dd + real(c_float), optional, pointer :: ee, ff + + ! non-descriptor arrays + real(c_float), optional, target :: gg(N), hh(N) + integer, value :: N + + integer :: err + + ! All shall be absent + if (present(aa) .or. present(bb)) stop 243 + if (present(cc) .or. present(dd)) stop 244 + if (present(ee) .or. present(ff)) stop 245 + if (present(gg) .or. present(hh)) stop 246 + + !$omp target data map(to:aa) map(from:bb) use_device_addr(aa,bb) + if (present(aa) .or. present(bb)) stop 247 + !$omp end target data + + !$omp target data map(to:cc) map(from:dd) use_device_addr(cc,dd) + if (present(cc) .or. present(dd)) stop 248 + !$omp end target data + + !$omp target data map(to:ee) map(from:ff) use_device_addr(ee,ff) + if (present(ee) .or. present(ff)) stop 249 + !$omp end target data + + !$omp target data map(to:gg) map(from:hh) use_device_addr(gg,hh) + if (present(gg) .or. present(hh)) stop 250 + !$omp end target data + end subroutine test_dummy_opt_callee_1_absent + ! Save device ptr - and recall pointer subroutine test_dummy_opt_call_2() integer, parameter :: N = 1000 diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90 index 5c42bee718c..82cf9ac8070 100644 --- a/libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90 +++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-3.f90 @@ -290,6 +290,7 @@ contains ff = 66.0_c_double call test_dummy_opt_callee_1(aa, bb, cc, dd, ee, ff, N) + call test_dummy_opt_callee_1_absent(N=N) deallocate(ee, ff) ! pointers, only end subroutine test_dummy_opt_call_1 @@ -336,6 +337,32 @@ contains if (any(abs(3.0_c_double * ee - ff) > 10.0_c_double * epsilon(ee))) stop 1 end subroutine test_dummy_opt_callee_1 + subroutine test_dummy_opt_callee_1_absent(aa, bb, cc, dd, ee, ff, N) + ! scalars + real(c_double), optional, target :: aa(:), bb(:) + real(c_double), optional, target, allocatable :: cc(:), dd(:) + real(c_double), optional, pointer :: ee(:), ff(:) + + integer, value :: N + + ! All shall be absent + if (present(aa) .or. present(bb)) stop 1 + if (present(cc) .or. present(dd)) stop 1 + if (present(ee) .or. present(ff)) stop 1 + + !$omp target data map(to:aa) map(from:bb) use_device_addr(aa,bb) + if (present(aa) .or. present(bb)) stop 1 + !$omp end target data + + !$omp target data map(to:cc) map(from:dd) use_device_addr(cc,dd) + if (present(cc) .or. present(dd)) stop 1 + !$omp end target data + + !$omp target data map(to:ee) map(from:ff) use_device_addr(ee,ff) + if (present(ee) .or. present(ff)) stop 1 + !$omp end target data + end subroutine test_dummy_opt_callee_1_absent + ! Save device ptr - and recall pointer subroutine test_dummy_opt_call_2() integer, parameter :: N = 1000 diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90 index 5e66a79da90..d17249de2bc 100644 --- a/libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90 +++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-4.f90 @@ -290,6 +290,7 @@ contains ff = 66.0_c_float call test_dummy_opt_callee_1(aa, bb, cc, dd, ee, ff, N) + call test_dummy_opt_callee_1_absent(N=N) deallocate(ee, ff) ! pointers, only end subroutine test_dummy_opt_call_1 @@ -336,6 +337,32 @@ contains if (any(abs(3.0_c_float * ee - ff) > 10.0_c_float * epsilon(ee))) stop 1 end subroutine test_dummy_opt_callee_1 + subroutine test_dummy_opt_callee_1_absent(aa, bb, cc, dd, ee, ff, N) + ! scalars + real(c_float), optional, target :: aa(:), bb(:) + real(c_float), optional, target, allocatable :: cc(:), dd(:) + real(c_float), optional, pointer :: ee(:), ff(:) + + integer, value :: N + + ! All shall be absent + if (present(aa) .or. present(bb)) stop 1 + if (present(cc) .or. present(dd)) stop 1 + if (present(ee) .or. present(ff)) stop 1 + + !$omp target data map(to:aa) map(from:bb) use_device_addr(aa,bb) + if (present(aa) .or. present(bb)) stop 1 + !$omp end target data + + !$omp target data map(to:cc) map(from:dd) use_device_addr(cc,dd) + if (present(cc) .or. present(dd)) stop 1 + !$omp end target data + + !$omp target data map(to:ee) map(from:ff) use_device_addr(ee,ff) + if (present(ee) .or. present(ff)) stop 1 + !$omp end target data + end subroutine test_dummy_opt_callee_1_absent + ! Save device ptr - and recall pointer subroutine test_dummy_opt_call_2() integer, parameter :: N = 1000 diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c deleted file mode 100644 index 5db29124e9e..00000000000 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c +++ /dev/null @@ -1,51 +0,0 @@ -/* Exercise acc_update_device with a NULL data address on nvidia targets. */ - -/* { dg-do run { target openacc_nvidia_accel_selected } } */ - -#include <stdio.h> -#include <stdlib.h> -#include <openacc.h> - -int -main (int argc, char **argv) -{ - const int N = 256; - int i; - unsigned char *h; - void *d; - - h = (unsigned char *) malloc (N); - - for (i = 0; i < N; i++) - { - h[i] = i; - } - - d = acc_copyin (h, N); - if (!d) - abort (); - - for (i = 0; i < N; i++) - { - h[i] = 0xab; - } - - fprintf (stderr, "CheCKpOInT\n"); - acc_update_device (0, N); - - acc_copyout (h, N); - - for (i = 0; i < N; i++) - { - if (h[i] != 0xab) - abort (); - } - - free (h); - - return 0; -} - -/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */ -/* { dg-output "\\\[\[^\n\r]*,256\\\] is not mapped" } */ -/* { dg-shouldfail "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c deleted file mode 100644 index c2140429cb1..00000000000 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c +++ /dev/null @@ -1,49 +0,0 @@ -/* Exercise acc_update_self with a NULL data mapping on nvidia targets. */ - -/* { dg-do run { target openacc_nvidia_accel_selected } } */ - -#include <stdio.h> -#include <string.h> -#include <stdlib.h> -#include <openacc.h> - -int -main (int argc, char **argv) -{ - const int N = 256; - int i; - unsigned char *h; - void *d; - - h = (unsigned char *) malloc (N); - - for (i = 0; i < N; i++) - { - h[i] = i; - } - - d = acc_copyin (h, N); - if (!d) - abort (); - - memset (&h[0], 0, N); - - fprintf (stderr, "CheCKpOInT\n"); - acc_update_self (0, N); - - for (i = 0; i < N; i++) - { - if (h[i] != i) - abort (); - } - - acc_delete (h, N); - - free (h); - - return 0; -} - -/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */ -/* { dg-output "\\\[\[^\n\r]*,256\\\] is not mapped" } */ -/* { dg-shouldfail "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 new file mode 100644 index 00000000000..00f7472ae6e --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 @@ -0,0 +1,23 @@ +! Test that the cache directives work with optional arguments. The effect +! of giving a non-present argument to the cache directive is not tested as +! it is undefined. The test is based on gfortran.dg/goacc/cache-1.f95. + +! { dg-additional-options "-std=f2008" } + +program cache_test + implicit none + integer :: d(10), e(7,13) + + call do_test(d, e) +contains + subroutine do_test(d, e) + integer, optional :: d(10), e(7,13) + integer :: i + do concurrent (i=1:5) + !$acc cache (d(1:3)) + !$acc cache (d(i:i+2)) + !$acc cache (e(1:3,2:4)) + !$acc cache (e(i:i+2,i+1:i+3)) + enddo + end +end diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90 new file mode 100644 index 00000000000..5cadeed44b4 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin-by-value.f90 @@ -0,0 +1,29 @@ +! Test OpenACC data regions with optional arguments passed by value. + +! { dg-do run } + +program test + implicit none + + integer :: res + + if (foo(27) .ne. 27) stop 1 + if (foo(16, 18) .ne. 288) stop 1 +contains + function foo(x, y) + integer, value :: x + integer, value, optional :: y + integer :: res, foo + + !$acc data copyin(x, y) copyout(res) + !$acc parallel + res = x + if (present(y)) then + res = res * y + end if + !$acc end parallel + !$acc end data + + foo = res + end function foo +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin.f90 new file mode 100644 index 00000000000..a30908d61a5 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyin.f90 @@ -0,0 +1,140 @@ +! Test OpenACC data regions with a copy-in of optional arguments. + +! { dg-do run } + +program test + implicit none + + integer, parameter :: n = 64 + integer :: i + integer :: a_int, b_int, c_int, res_int + integer :: a_arr(n), b_arr(n), c_arr(n), res_arr(n) + integer, allocatable :: a_alloc(:), b_alloc(:), c_alloc(:), res_alloc(:) + + a_int = 7 + b_int = 3 + c_int = 11 + + call test_int(res_int, a_int) + if (res_int .ne. a_int) stop 1 + + call test_int(res_int, a_int, b_int) + if (res_int .ne. a_int * b_int) stop 2 + + call test_int(res_int, a_int, b_int, c_int) + if (res_int .ne. a_int * b_int + c_int) stop 3 + + do i = 1, n + a_arr(i) = i + b_arr(i) = n - i + 1 + c_arr(i) = i * 3 + end do + + call test_array(res_arr, a_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i)) stop 4 + end do + + call test_array(res_arr, a_arr, b_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 5 + end do + + call test_array(res_arr, a_arr, b_arr, c_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i) + c_arr(i)) stop 6 + end do + + allocate (a_alloc(n)) + allocate (b_alloc(n)) + allocate (c_alloc(n)) + allocate (res_alloc(n)) + + do i = 1, n + a_alloc(i) = i + b_alloc(i) = n - i + 1 + c_alloc(i) = i * 3 + end do + + call test_allocatable(res_alloc, a_alloc) + do i = 1, n + if (res_alloc(i) .ne. a_alloc(i)) stop 7 + end do + + call test_allocatable(res_alloc, a_alloc, b_alloc) + do i = 1, n + if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i)) stop 8 + end do + + call test_allocatable(res_alloc, a_alloc, b_alloc, c_alloc) + do i = 1, n + if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i) + c_alloc(i)) stop 9 + end do + + deallocate (a_alloc) + deallocate (b_alloc) + deallocate (c_alloc) + deallocate (res_alloc) +contains + subroutine test_int(res, a, b, c) + integer :: res + integer :: a + integer, optional :: b, c + + !$acc data copyin(a, b, c) copyout(res) + !$acc parallel + res = a + + if (present(b)) res = res * b + + if (present(c)) res = res + c + !$acc end parallel + !$acc end data + end subroutine test_int + + subroutine test_array(res, a, b, c) + integer :: res(n) + integer :: a(n) + integer, optional :: b(n), c(n) + + !$acc data copyin(a, b, c) copyout(res) + !$acc parallel loop + do i = 1, n + res(i) = a(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(b)) res(i) = res(i) * b(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(c)) res(i) = res(i) + c(i) + end do + !$acc end data + end subroutine test_array + + subroutine test_allocatable(res, a, b, c) + integer, allocatable :: res(:) + integer, allocatable :: a(:) + integer, allocatable, optional :: b(:), c(:) + + !$acc data copyin(a, b, c) copyout(res) + !$acc parallel loop + do i = 1, n + res(i) = a(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(b)) res(i) = res(i) * b(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(c)) res(i) = res(i) + c(i) + end do + !$acc end data + end subroutine test_allocatable +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyout.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyout.f90 new file mode 100644 index 00000000000..feaa31fa423 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-copyout.f90 @@ -0,0 +1,96 @@ +! Test OpenACC data regions with a copy-out of optional arguments. + +! { dg-do run } + +program test + implicit none + + integer, parameter :: n = 64 + integer :: i + integer :: a_int, b_int, res_int + integer :: a_arr(n), b_arr(n), res_arr(n) + integer, allocatable :: a_alloc(:), b_alloc(:), res_alloc(:) + + res_int = 0 + + call test_int(a_int, b_int) + if (res_int .ne. 0) stop 1 + + call test_int(a_int, b_int, res_int) + if (res_int .ne. a_int * b_int) stop 2 + + res_arr(:) = 0 + do i = 1, n + a_arr(i) = i + b_arr(i) = n - i + 1 + end do + + call test_array(a_arr, b_arr) + do i = 1, n + if (res_arr(i) .ne. 0) stop 3 + end do + + call test_array(a_arr, b_arr, res_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 4 + end do + + allocate (a_alloc(n)) + allocate (b_alloc(n)) + allocate (res_alloc(n)) + + res_alloc(:) = 0 + do i = 1, n + a_alloc(i) = i + b_alloc(i) = n - i + 1 + end do + + call test_allocatable(a_alloc, b_alloc) + do i = 1, n + if (res_alloc(i) .ne. 0) stop 5 + end do + + call test_allocatable(a_alloc, b_alloc, res_alloc) + do i = 1, n + if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i)) stop 6 + end do + + deallocate (a_alloc) + deallocate (b_alloc) + deallocate (res_alloc) +contains + subroutine test_int(a, b, res) + integer :: a, b + integer, optional :: res + + !$acc data copyin(a, b) copyout(res) + !$acc parallel + if (present(res)) res = a * b + !$acc end parallel + !$acc end data + end subroutine test_int + + subroutine test_array(a, b, res) + integer :: a(n), b(n) + integer, optional :: res(n) + + !$acc data copyin(a, b) copyout(res) + !$acc parallel loop + do i = 1, n + if (present(res)) res(i) = a(i) * b(i) + end do + !$acc end data + end subroutine test_array + + subroutine test_allocatable(a, b, res) + integer, allocatable :: a(:), b(:) + integer, allocatable, optional :: res(:) + + !$acc data copyin(a, b) copyout(res) + !$acc parallel loop + do i = 1, n + if (present(res)) res(i) = a(i) * b(i) + end do + !$acc end data + end subroutine test_allocatable +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90 new file mode 100644 index 00000000000..9ed0f753ea5 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-data-enter-exit.f90 @@ -0,0 +1,91 @@ +! Test OpenACC unstructured enter data/exit data regions with optional +! arguments. + +! { dg-do run } + +program test + implicit none + + integer, parameter :: n = 64 + integer :: a(n), b(n), c(n), res(n) + integer :: x, y, z, r, i + + do i = 1, n + a(i) = i + b(i) = n - i + 1 + c(i) = i * 3 + end do + + res = test_array(a) + do i = 1, n + if (res(i) .ne. a(i)) stop 1 + end do + + res = test_array(a, b) + do i = 1, n + if (res(i) .ne. a(i) * b(i)) stop 2 + end do + + res = test_array(a, b, c) + do i = 1, n + if (res(i) .ne. a(i) * b(i) + c(i)) stop 3 + end do + + x = 7 + y = 3 + z = 11 + + r = test_int(x) + if (r .ne. x) stop 4 + + r = test_int(x, y) + if (r .ne. x * y) stop 5 + + r = test_int(x, y, z) + if (r .ne. x * y + z) stop 6 +contains + function test_array(a, b, c) + integer :: a(n) + integer, optional :: b(n), c(n) + integer :: test_array(n), res(n) + + !$acc enter data copyin(a, b, c) create(res) + !$acc parallel loop + do i = 1, n + res(i) = a(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(b)) then + res(i) = res(i) * b(i) + end if + end do + + !$acc parallel loop + do i = 1, n + if (present(c)) then + res(i) = res(i) + c(i) + end if + end do + !$acc exit data copyout(res) delete(a, b, c) + + test_array = res + end function test_array + + function test_int(a, b, c) + integer :: a + integer, optional :: b, c + integer :: test_int, res + + !$acc enter data copyin(a, b, c) create(res) + !$acc parallel present(a, b, c, res) + res = a + if (present(b)) res = res * b + if (present(c)) res = res + c + !$acc end parallel + !$acc exit data copyout(res) delete(a, b, c) + + test_int = res + end function test_int +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-declare.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-declare.f90 new file mode 100644 index 00000000000..074e5a2abb6 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-declare.f90 @@ -0,0 +1,87 @@ +! Test OpenACC declare directives with optional arguments. + +! { dg-do run } + +program test + implicit none + + integer, parameter :: n = 64 + integer :: i + integer :: a_int, b_int, c_int, res_int + integer :: a_arr(n), b_arr(n), c_arr(n), res_arr(n) + + a_int = 7 + b_int = 3 + c_int = 11 + + call test_int(res_int, a_int) + if (res_int .ne. a_int) stop 1 + + call test_int(res_int, a_int, b_int) + if (res_int .ne. a_int * b_int) stop 2 + + call test_int(res_int, a_int, b_int, c_int) + if (res_int .ne. a_int * b_int + c_int) stop 3 + + do i = 1, n + a_arr(i) = i + b_arr(i) = n - i + 1 + c_arr(i) = i * 3 + end do + + call test_array(res_arr, a_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i)) stop 4 + end do + + call test_array(res_arr, a_arr, b_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 5 + end do + + call test_array(res_arr, a_arr, b_arr, c_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i) + c_arr(i)) stop 6 + end do +contains + subroutine test_int(res, a, b, c) + integer :: a + integer, optional :: b, c + !$acc declare present_or_copyin(a, b, c) + integer :: res + !$acc declare present_or_copyout(res) + + !$acc parallel + res = a + if (present(b)) res = res * b + if (present(c)) res = res + c + !$acc end parallel + end subroutine test_int + + subroutine test_array(res, a, b, c) + integer :: a(n) + integer, optional :: b(n), c(n) + !$acc declare present_or_copyin(a, b, c) + integer :: res(n) + !$acc declare present_or_copyout(res) + + !$acc parallel loop + do i = 1, n + res(i) = a(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(b)) then + res(i) = res(i) * b(i) + end if + end do + + !$acc parallel loop + do i = 1, n + if (present(c)) then + res(i) = res(i) + c(i) + end if + end do + end subroutine test_array +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90 new file mode 100644 index 00000000000..693e6118489 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-firstprivate.f90 @@ -0,0 +1,112 @@ +! Test that optional arguments work in firstprivate clauses. The effect of +! non-present arguments in firstprivate clauses is undefined, and is not +! tested for. + +! { dg-do run } + +program test_firstprivate + implicit none + integer, parameter :: n = 64 + + integer :: i, j + integer :: a_int, b_int, c_int, res_int + integer :: a_arr(n), b_arr(n), c_arr(n), res_arr(n) + integer, allocatable :: a_alloc(:), b_alloc(:), c_alloc(:), res_alloc(:) + + a_int = 14 + b_int = 5 + c_int = 12 + + call test_int(res_int, a_int, b_int, c_int) + if (res_int .ne. a_int * b_int + c_int) stop 1 + + do i = 1, n + a_arr(i) = i + b_arr(i) = n - i + 1 + c_arr(i) = i * 3 + end do + + call test_array(res_arr, a_arr, b_arr, c_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i) + c_arr(i)) stop 2 + end do + + allocate(a_alloc(n)) + allocate(b_alloc(n)) + allocate(c_alloc(n)) + allocate(res_alloc(n)) + + do i = 1, n + a_arr(i) = i + b_arr(i) = n - i + 1 + c_arr(i) = i * 3 + end do + + call test_allocatable(res_alloc, a_alloc, b_alloc, c_alloc) + do i = 1, n + if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i) + c_alloc(i)) stop 2 + end do + + deallocate(a_alloc) + deallocate(b_alloc) + deallocate(c_alloc) + deallocate(res_alloc) +contains + subroutine test_int(res, a, b, c) + integer :: a + integer, optional :: b, c + integer :: res + + !$acc parallel firstprivate(a, b, c) copyout(res) + res = a + if (present(b)) res = res * b + if (present(c)) res = res + c + !$acc end parallel + end subroutine test_int + + subroutine test_array(res, a, b, c) + integer :: a(n) + integer, optional :: b(n), c(n) + integer :: res(n) + + !$acc data copyin(a, b, c) copyout(res) + !$acc parallel loop firstprivate(a) + do i = 1, n + res(i) = a(i) + end do + + !$acc parallel loop firstprivate(b) + do i = 1, n + if (present(b)) res(i) = res(i) * b(i) + end do + + !$acc parallel loop firstprivate(c) + do i = 1, n + if (present(c)) res(i) = res(i) + c(i) + end do + !$acc end data + end subroutine test_array + + subroutine test_allocatable(res, a, b, c) + integer, allocatable :: a(:) + integer, allocatable, optional :: b(:), c(:) + integer, allocatable :: res(:) + + !$acc data copyin(a, b, c) copyout(res) + !$acc parallel loop firstprivate(a) + do i = 1, n + res(i) = a(i) + end do + + !$acc parallel loop firstprivate(b) + do i = 1, n + if (present(b)) res(i) = res(i) * b(i) + end do + + !$acc parallel loop firstprivate(c) + do i = 1, n + if (present(c)) res(i) = res(i) + c(i) + end do + !$acc end data + end subroutine test_allocatable +end program test_firstprivate diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-host_data.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-host_data.f90 new file mode 100644 index 00000000000..a6e41e28b0b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-host_data.f90 @@ -0,0 +1,39 @@ +! Test the host_data construct with optional arguments. +! Based on host_data-1.f90. + +! { dg-do run } +! { dg-additional-options "-cpp" } + +program test + implicit none + + integer, target :: i + integer, pointer :: ip, iph + + ! Assign the same targets + ip => i + iph => i + + call foo(iph) + call foo(iph, ip) +contains + subroutine foo(iph, ip) + integer, pointer :: iph + integer, pointer, optional :: ip + + !$acc data copyin(i) + !$acc host_data use_device(ip) + + ! Test how the pointers compare inside a host_data construct + if (present(ip)) then +#if ACC_MEM_SHARED + if (.not. associated(ip, iph)) STOP 1 +#else + if (associated(ip, iph)) STOP 2 +#endif + end if + + !$acc end host_data + !$acc end data + end subroutine foo +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-nested-calls.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-nested-calls.f90 new file mode 100644 index 00000000000..279139f7c59 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-nested-calls.f90 @@ -0,0 +1,135 @@ +! Test propagation of optional arguments from within an OpenACC parallel region. + +! { dg-do run } + +program test + implicit none + + integer, parameter :: n = 64 + integer :: i + integer :: res_int + integer :: a_arr(n), b_arr(n), res_arr(n) + integer, allocatable :: a_alloc(:), b_alloc(:), res_alloc(:) + + call test_int_caller(res_int, 5) + if (res_int .ne. 10) stop 1 + + call test_int_caller(res_int, 2, 3) + if (res_int .ne. 11) stop 2 + + do i = 1, n + a_arr(i) = i + b_arr(i) = n - i + 1 + end do + + call test_array_caller(res_arr, a_arr) + do i = 1, n + if (res_arr(i) .ne. 2 * a_arr(i)) stop 3 + end do + + call test_array_caller(res_arr, a_arr, b_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i) + a_arr(i) + b_arr(i)) stop 4 + end do + + allocate(a_alloc(n)) + allocate(b_alloc(n)) + allocate(res_alloc(n)) + + do i = 1, n + a_alloc(i) = i + b_alloc(i) = n - i + 1 + end do + + call test_array_caller(res_arr, a_arr) + do i = 1, n + if (res_arr(i) .ne. 2 * a_alloc(i)) stop 5 + end do + + call test_array_caller(res_arr, a_arr, b_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_alloc(i) + a_alloc(i) + b_alloc(i)) stop 6 + end do + + deallocate(a_alloc) + deallocate(b_alloc) + deallocate(res_alloc) +contains + subroutine test_int_caller(res, a, b) + integer :: res, a + integer, optional :: b + + !$acc data copyin(a, b) copyout (res) + !$acc parallel + res = a + if (present(b)) res = res * b + call test_int_callee(res, a, b) + !$acc end parallel + !$acc end data + end subroutine test_int_caller + + subroutine test_int_callee(res, a, b) + !$acc routine seq + integer :: res, a + integer, optional :: b + + res = res + a + if (present(b)) res = res + b + end subroutine test_int_callee + + subroutine test_array_caller(res, a, b) + integer :: res(n), a(n), i + integer, optional :: b(n) + + !$acc data copyin(a, b) copyout(res) + !$acc parallel + !$acc loop seq + do i = 1, n + res(i) = a(i) + if (present(b)) res(i) = res(i) * b(i) + end do + call test_array_callee(res, a, b) + !$acc end parallel + !$acc end data + end subroutine test_array_caller + + subroutine test_array_callee(res, a, b) + !$acc routine seq + integer :: res(n), a(n), i + integer, optional :: b(n) + + do i = 1, n + res(i) = res(i) + a(i) + if (present(b)) res(i) = res(i) + b(i) + end do + end subroutine test_array_callee + + subroutine test_allocatable_caller(res, a, b) + integer :: i + integer, allocatable :: res(:), a(:) + integer, allocatable, optional :: b(:) + + !$acc data copyin(a, b) copyout(res) + !$acc parallel + !$acc loop seq + do i = 1, n + res(i) = a(i) + if (present(b)) res(i) = res(i) * b(i) + end do + call test_array_callee(res, a, b) + !$acc end parallel + !$acc end data + end subroutine test_allocatable_caller + + subroutine test_allocatable_callee(res, a, b) + !$acc routine seq + integer :: i + integer, allocatable :: res(:), a(:) + integer, allocatable, optional :: b(:) + + do i = 1, n + res(i) = res(i) + a(i) + if (present(b)) res(i) = res(i) + b(i) + end do + end subroutine test_allocatable_callee +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90 new file mode 100644 index 00000000000..0320bbb3bc9 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90 @@ -0,0 +1,115 @@ +! Test that optional arguments work in private clauses. The effect of +! non-present arguments in private clauses is undefined, and is not tested +! for. The tests are based on those in private-variables.f90. + +! { dg-do run } + +program main + implicit none + + type vec3 + integer x, y, z, attr(13) + end type vec3 + integer :: x + type(vec3) :: pt + integer :: arr(2) + + call t1(x) + call t2(pt) + call t3(arr) +contains + + ! Test of gang-private variables declared on loop directive. + + subroutine t1(x) + integer, optional :: x + integer :: i, arr(32) + + do i = 1, 32 + arr(i) = i + end do + + !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) + !$acc loop gang private(x) + do i = 1, 32 + x = i * 2; + arr(i) = arr(i) + x + end do + !$acc end parallel + + do i = 1, 32 + if (arr(i) .ne. i * 3) STOP 1 + end do + end subroutine t1 + + + ! Test of gang-private addressable variable declared on loop directive, with + ! broadcasting to partitioned workers. + + subroutine t2(pt) + integer i, j, arr(0:32*32) + type(vec3), optional :: pt + + do i = 0, 32*32-1 + arr(i) = i + end do + + !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) + !$acc loop gang private(pt) + do i = 0, 31 + pt%x = i + pt%y = i * 2 + pt%z = i * 4 + pt%attr(5) = i * 6 + + !$acc loop vector + do j = 0, 31 + arr(i * 32 + j) = arr(i * 32 + j) + pt%x + pt%y + pt%z + pt%attr(5); + end do + end do + !$acc end parallel + + do i = 0, 32 * 32 - 1 + if (arr(i) .ne. i + (i / 32) * 13) STOP 2 + end do + end subroutine t2 + + ! Test of vector-private variables declared on loop directive. Array type. + + subroutine t3(pt) + integer, optional :: pt(2) + integer :: i, j, k, idx, arr(0:32*32*32) + + do i = 0, 32*32*32-1 + arr(i) = i + end do + + !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) + !$acc loop gang + do i = 0, 31 + !$acc loop worker + do j = 0, 31 + !$acc loop vector private(pt) + do k = 0, 31 + pt(1) = ieor(i, j * 3) + pt(2) = ior(i, j * 5) + arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(1) * k + arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(2) * k + end do + end do + end do + !$acc end parallel + + do i = 0, 32 - 1 + do j = 0, 32 -1 + do k = 0, 32 - 1 + idx = i * 1024 + j * 32 + k + if (arr(idx) .ne. idx + ieor(i, j * 3) * k + ior(i, j * 5) * k) then + STOP 3 + end if + end do + end do + end do + end subroutine t3 + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 new file mode 100644 index 00000000000..b76db3ef6d3 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 @@ -0,0 +1,69 @@ +! Test optional arguments in reduction clauses. The effect of +! non-present arguments in reduction clauses is undefined, and is not tested +! for. The tests are based on those in reduction-1.f90. + +! { dg-do run } +! { dg-additional-options "-w" } + +program optional_reduction + implicit none + + integer :: rg, rw, rv, rc + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + + call do_test(rg, rw, rv, rc) +contains + subroutine do_test(rg, rw, rv, rc) + integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32 + integer, optional :: rg, rw, rv, rc + integer :: i, vresult + integer, dimension (n) :: array + + vresult = 0 + do i = 1, n + array(i) = i + end do + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(+:rg) gang + do i = 1, n + rg = rg + array(i) + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(+:rw) worker + do i = 1, n + rw = rw + array(i) + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(+:rv) vector + do i = 1, n + rv = rv + array(i) + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(+:rc) gang worker vector + do i = 1, n + rc = rc + array(i) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = vresult + array(i) + end do + + if (rg .ne. vresult) STOP 1 + if (rw .ne. vresult) STOP 2 + if (rv .ne. vresult) STOP 3 + if (rc .ne. vresult) STOP 4 + end subroutine do_test +end program optional_reduction diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-update-device.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-update-device.f90 new file mode 100644 index 00000000000..57f69001d3d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-update-device.f90 @@ -0,0 +1,121 @@ +! Test OpenACC update to device with an optional argument. + +! { dg-do run } + +program optional_update_device + implicit none + + integer, parameter :: n = 64 + integer :: i + integer :: a_int, b_int, res_int + integer :: a_arr(n), b_arr(n), res_arr(n) + integer, allocatable :: a_alloc(:), b_alloc(:), res_alloc(:) + + a_int = 5 + b_int = 11 + + call test_int(res_int, a_int) + if (res_int .ne. a_int) stop 1 + + call test_int(res_int, a_int, b_int) + if (res_int .ne. a_int * b_int) stop 2 + + res_arr(:) = 0 + do i = 1, n + a_arr(i) = i + b_arr(i) = n - i + 1 + end do + + call test_array(res_arr, a_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i)) stop 3 + end do + + call test_array(res_arr, a_arr, b_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 4 + end do + + allocate (a_alloc(n)) + allocate (b_alloc(n)) + allocate (res_alloc(n)) + + res_alloc(:) = 0 + do i = 1, n + a_alloc(i) = i + b_alloc(i) = n - i + 1 + end do + + call test_allocatable(res_alloc, a_alloc) + do i = 1, n + if (res_alloc(i) .ne. a_alloc(i)) stop 5 + end do + + call test_allocatable(res_alloc, a_alloc, b_alloc) + do i = 1, n + if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i)) stop 6 + end do + + deallocate (a_alloc) + deallocate (b_alloc) + deallocate (res_alloc) +contains + subroutine test_int(res, a, b) + integer :: res + integer :: a + integer, optional :: b + + !$acc data create(a, b, res) + !$acc update device(a, b) + !$acc parallel + res = a + if (present(b)) res = res * b + !$acc end parallel + !$acc update self(res) + !$acc end data + end subroutine test_int + + subroutine test_array(res, a, b) + integer :: res(n) + integer :: a(n) + integer, optional :: b(n) + + !$acc data create(a, b, res) + !$acc update device(a, b) + !$acc parallel loop + do i = 1, n + res(i) = a(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(b)) then + res(i) = res(i) * b(i) + end if + end do + !$acc update self(res) + !$acc end data + end subroutine test_array + + subroutine test_allocatable(res, a, b) + integer, allocatable :: res(:) + integer, allocatable :: a(:) + integer, allocatable, optional :: b(:) + + !$acc data create(a, b, res) + !$acc update device(a, b) + !$acc parallel loop + do i = 1, n + res(i) = a(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(b)) then + res(i) = res(i) * b(i) + end if + end do + !$acc update self(res) + !$acc end data + end subroutine test_allocatable +end program optional_update_device diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-update-host.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-update-host.f90 new file mode 100644 index 00000000000..36b94241b11 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-update-host.f90 @@ -0,0 +1,115 @@ +! Test OpenACC update to host with an optional argument. + +! { dg-do run } + +program optional_update_host + implicit none + + integer, parameter :: n = 64 + integer :: i + integer :: a_int, b_int, res_int + integer :: a_arr(n), b_arr(n), res_arr(n) + integer, allocatable :: a_alloc(:), b_alloc(:), res_alloc(:) + + a_int = 5 + b_int = 11 + res_int = 0 + + call test_int(a_int, b_int) + if (res_int .ne. 0) stop 1 + + call test_int(a_int, b_int, res_int) + if (res_int .ne. a_int * b_int) stop 2 + + res_arr(:) = 0 + do i = 1, n + a_arr(i) = i + b_arr(i) = n - i + 1 + end do + + call test_array(a_arr, b_arr) + do i = 1, n + if (res_arr(i) .ne. 0) stop 1 + end do + + call test_array(a_arr, b_arr, res_arr) + do i = 1, n + if (res_arr(i) .ne. a_arr(i) * b_arr(i)) stop 2 + end do + + allocate(a_alloc(n)) + allocate(b_alloc(n)) + allocate(res_alloc(n)) + + res_alloc(:) = 0 + do i = 1, n + a_alloc(i) = i + b_alloc(i) = n - i + 1 + end do + + call test_allocatable(a_alloc, b_alloc) + do i = 1, n + if (res_alloc(i) .ne. 0) stop 1 + end do + + call test_allocatable(a_alloc, b_alloc, res_alloc) + do i = 1, n + if (res_alloc(i) .ne. a_alloc(i) * b_alloc(i)) stop 2 + end do + + deallocate(a_alloc) + deallocate(b_alloc) + deallocate(res_alloc) +contains + subroutine test_int(a, b, res) + integer :: a, b + integer, optional :: res + + !$acc data create(a, b, res) + !$acc update device(a, b) + !$acc parallel + if (present(res)) res = a + if (present(res)) res = res * b + !$acc end parallel + !$acc update self(res) + !$acc end data + end subroutine test_int + + subroutine test_array(a, b, res) + integer :: a(n), b(n) + integer, optional :: res(n) + + !$acc data create(a, b, res) + !$acc update device(a, b) + !$acc parallel loop + do i = 1, n + if (present(res)) res(i) = a(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(res)) res(i) = res(i) * b(i) + end do + !$acc update self(res) + !$acc end data + end subroutine test_array + + subroutine test_allocatable(a, b, res) + integer, allocatable :: a(:), b(:) + integer, allocatable, optional :: res(:) + + !$acc data create(a, b, res) + !$acc update device(a, b) + !$acc parallel loop + do i = 1, n + if (present(res)) res(i) = a(i) + end do + + !$acc parallel loop + do i = 1, n + if (present(res)) res(i) = res(i) * b(i) + end do + !$acc update self(res) + !$acc end data + end subroutine test_allocatable +end program optional_update_host