Message ID | 531ABAF8.2070604@charter.net |
---|---|
State | New |
Headers | show |
Jerry DeLisle wrote: > The attached patch addresses the problem identified in comment #22 of the PR. > For character array internal unit reads, eat_spaces must call next_char to > advance every single character until the end of the string is reached. In the > case sited which is very contrived, this amounts to about 100000 calls to next_char. > Without the patch takes about 25 seconds to run. > With the patch this takes about 2.8 seconds. Nice! > The speedup is accomplished by simply skipping over spaces without calling > next_read, then backing up one character and letting the existing execution path > proceed, preserving all the end of record code needed in next_char. I think that code is okay. I wonder whether it can happen that we read one character too far: i.e. stell() == end of string; offset++ -> one behind. As one then immediately steps back (due to "offset < limit" plus offset--), the code should be handled just fine - except for the out of bounds read - which might be detected by, e.g., -fsanitizer=address (if libgfortran and the program use it). However, I am not sure whether eat_spaces would be called in that case or whether other conditions error out before. > I also remove some unneeded error checks. I have to admit that I do not really understand why the conditions are unreachable. Is it because "dtp->u.p.current_unit->bytes_left == 0" is already checked for "is_array_io (dtp)"? If so, can it happen for scalar internal I/O? I assume, it is obvious, but I don't quickly see it. Tobias > 2014-03-08 Jerry DeLisle <jvdelisle@gcc.gnu> > > PR libfortran/38199 > * io/list_read.c (next_char): Delete unuseful error checks. > (eat_spaces): For character array reading, skip ahead over > spaces rather than call next_char multiple times.
On Sat, Mar 8, 2014 at 7:38 AM, Jerry DeLisle wrote: > The speedup is accomplished by simply skipping over spaces without calling > next_read, then backing up one character and letting the existing execution path > proceed, preserving all the end of record code needed in next_char. > > I also remove some unneeded error checks. Would it be enough to make them "unlikely" instead? - if (length < 0) + if (unlikely(length < 0)) Ciao! Steven
On 03/08/2014 02:45 AM, Tobias Burnus wrote: --- snip --- > > I think that code is okay. > > I wonder whether it can happen that we read one character too far: i.e. stell() > == end of string; offset++ -> one behind. As one then immediately steps back > (due to "offset < limit" plus offset--), the code should be handled just fine - > except for the out of bounds read - which might be detected by, e.g., > -fsanitizer=address (if libgfortran and the program use it). However, I am not > sure whether eat_spaces would be called in that case or whether other conditions > error out before. > The strings are internally NULL terminated, which helps. I will do some tests to see. >> I also remove some unneeded error checks. > > I have to admit that I do not really understand why the conditions are > unreachable. Is it because "dtp->u.p.current_unit->bytes_left == 0" is already > checked for "is_array_io (dtp)"? If so, can it happen for scalar internal I/O? I > assume, it is obvious, but I don't quickly see it. The is_array_io (dtp) is checking for an array descriptor. For scalar internal I/O the descriptor is a NULL, so the new code is not executed in that case. I ran a test using a scalar string and the new code is not needed. The errors given are not normal error conditions. If we hit these errors there is a internal library bug and the program would fail out. Since we have had none reported ever I don't think we need it. As Steven suggested I can make them unlikely. Also I just realized I did not address the kind=4 case, so I will do that and resubmit. Thanks for review. Jerry
Am 08.03.2014 07:38, schrieb Jerry DeLisle: > The attached patch addresses the problem identified in comment #22 of the PR. > For character array internal unit reads, eat_spaces must call next_char to > advance every single character until the end of the string is reached. In the > case sited which is very contrived, this amounts to about 100000 calls to next_char. > > For clarity, this test case: > > character buffer(1)*100000 > integer i,j > > j = 1234 > write(buffer(1),'(i4)') j > > DO j=1,9999 > ! write(*,*) buffer(1)(1:4) > read(buffer,*) i > ! write(*,*) i > ENDDO > end > > Without the patch takes about 25 seconds to run. > > With the patch this takes about 2.8 seconds. This is great. However, this is still 10 times slower than the LEN_TRIM variant: character buffer(1)*100000 integer i,j j = 1234 write(buffer(1),'(i4)') j DO j=1,9999 ! write(*,*) buffer(1)(1:4) read(buffer(1)(1:LEN_TRIM(buffer(1))),*) i ! write(*,*) i ENDDO end which takes 0.23s on my box. So on the on hand the improvement is great, on the other hand it is really sad, because the user will still need to do manual LEN_TRIM's when reading larger strings to get optimal performance... Thanks, Manfred > > The speedup is accomplished by simply skipping over spaces without calling > next_read, then backing up one character and letting the existing execution path > proceed, preserving all the end of record code needed in next_char. > > I also remove some unneeded error checks. > > Regression tested on X86_64 gnu. No need for a new test case since no new > functionality is added. > > OK for trunk? The PR is marked as a regression, so I think this could be the > last piece and call it done. > > Regards, > > Jerry > > 2014-03-08 Jerry DeLisle <jvdelisle@gcc.gnu> > > PR libfortran/38199 > * io/list_read.c (next_char): Delete unuseful error checks. > (eat_spaces): For character array reading, skip ahead over > spaces rather than call next_char multiple times. >
Am 08.03.2014 23:37, schrieb Manfred Schwarb: > Am 08.03.2014 07:38, schrieb Jerry DeLisle: >> The attached patch addresses the problem identified in comment #22 of the PR. >> For character array internal unit reads, eat_spaces must call next_char to >> advance every single character until the end of the string is reached. In the >> case sited which is very contrived, this amounts to about 100000 calls to next_char. >> >> For clarity, this test case: >> >> character buffer(1)*100000 >> integer i,j >> >> j = 1234 >> write(buffer(1),'(i4)') j >> >> DO j=1,9999 >> ! write(*,*) buffer(1)(1:4) >> read(buffer,*) i >> ! write(*,*) i >> ENDDO >> end >> >> Without the patch takes about 25 seconds to run. >> >> With the patch this takes about 2.8 seconds. > > This is great. > > However, this is still 10 times slower than the LEN_TRIM variant: > character buffer(1)*100000 > integer i,j > > j = 1234 > write(buffer(1),'(i4)') j > > DO j=1,9999 > ! write(*,*) buffer(1)(1:4) > read(buffer(1)(1:LEN_TRIM(buffer(1))),*) i > ! write(*,*) i > ENDDO > end > and also 10 times slower than the scalar variant (which was fixed by Thomas): character buffer*100000 integer i,j j = 1234 write(buffer,'(i4)') j DO j=1,9999 ! write(*,*) buffer(1:4) read(buffer,*) i ! write(*,*) i ENDDO end > which takes 0.23s on my box. So on the on hand the improvement is great, > on the other hand it is really sad, because the user will still > need to do manual LEN_TRIM's when reading larger strings to get > optimal performance... > > Thanks, > Manfred > > > >> >> The speedup is accomplished by simply skipping over spaces without calling >> next_read, then backing up one character and letting the existing execution path >> proceed, preserving all the end of record code needed in next_char. >> >> I also remove some unneeded error checks. >> >> Regression tested on X86_64 gnu. No need for a new test case since no new >> functionality is added. >> >> OK for trunk? The PR is marked as a regression, so I think this could be the >> last piece and call it done. >> >> Regards, >> >> Jerry >> >> 2014-03-08 Jerry DeLisle <jvdelisle@gcc.gnu> >> >> PR libfortran/38199 >> * io/list_read.c (next_char): Delete unuseful error checks. >> (eat_spaces): For character array reading, skip ahead over >> spaces rather than call next_char multiple times. >> > >
Index: list_read.c =================================================================== --- list_read.c (revision 208303) +++ list_read.c (working copy) @@ -160,7 +160,7 @@ next_char (st_parameter_dt *dtp) dtp->u.p.line_buffer_pos = 0; dtp->u.p.line_buffer_enabled = 0; - } + } /* Handle the end-of-record and end-of-file conditions for internal array unit. */ @@ -208,20 +208,8 @@ next_char (st_parameter_dt *dtp) c = cc; } - if (length < 0) - { - generate_error (&dtp->common, LIBERROR_OS, NULL); - return '\0'; - } - if (is_array_io (dtp)) { - /* Check whether we hit EOF. */ - if (length == 0) - { - generate_error (&dtp->common, LIBERROR_INTERNAL_UNIT, NULL); - return '\0'; - } dtp->u.p.current_unit->bytes_left--; } else @@ -264,6 +252,28 @@ eat_spaces (st_parameter_dt *dtp) { int c; + /* If internal character array IO, peak ahead and seek past spaces. + This is an optimazation to eliminate numerous calls to + next character unique to character arrays with large character + lengths (PR38199). */ + if (is_array_io (dtp)) + { + gfc_offset offset = stell (dtp->u.p.current_unit->s); + gfc_offset limit = dtp->u.p.current_unit->bytes_left; + + do + { + c = dtp->internal_unit[offset++]; + dtp->u.p.current_unit->bytes_left--; + } + while (offset < limit && (c == ' ' || c == '\t')); + /* Back up, seek ahead, and fall through to complete the process + so that END conditions are handled correctly. */ + dtp->u.p.current_unit->bytes_left++; + sseek (dtp->u.p.current_unit->s, offset-1, SEEK_SET); + } + + /* Now skip spaces, EOF and EOL are handled in next_char. */ do c = next_char (dtp); while (c != EOF && (c == ' ' || c == '\t'));