Improving I/O performance in XL Fortran

By Archive User posted Thu May 12, 2011 04:45 PM

  

Originally posted by: Rafik_Zurob


Input / Output (I/O) performance can be a significant factor in overall program performance. The way you write I/O statements can make a big difference. Consider the following example:

 

subroutine sub(x, n)
  real x(n)

  do i = 1, n, 2
    write (unit=11) x(i)
  enddo
end subroutine

 

The I/O in sub might not be as fast as you'd like, because the overhead of each write statement accumulates. Every time the write statement executes, it has to do some preparation work before it can actually write anything, and some cleanup work after it finishes writing x(i). If you compile with one of the thread-safe invocations, such as xlf_r, preparation and cleanup also involve locking and unlocking unit 11, so that data transfer statements in other threads cannot get into race conditions by reading or writing unit 11 while x(i) is being written. In effect, the code above behaves like this:

 

subroutine sub(x, n)
  real x(n)

  do i = 1, n, 2
    ! prep, lock unit 11, ...
    write (unit=11) x(i)
    ! cleanup, unlock unit 11
  enddo
end subroutine

 

In other words, sub has to do preparation and cleanup, including locking and unlocking unit 11, once per loop iteration: about n/2 times!

You can get rid of most of this overhead, and drastically improve I/O performance, by replacing the DO loop with a single write statement that writes either an array section or an io-implied-do loop. For example:

 

subroutine sub(x, n)
  real x(n)

  write (unit=11) x(1:n:2) ! Use an array section
end subroutine

Or:

subroutine sub(x, n)
  real x(n)

  write (unit=11) (x(i), i=1, n, 2) ! Use an implied-do loop
end subroutine

The above is much faster because I/O preparation and cleanup, including locking and unlocking unit 11, have to be done only once.
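
If you want to see the difference on your own system, a minimal timing sketch along the following lines may help. It is not from the original measurements: the program name io_timing, the array size n, and the use of scratch files are illustrative assumptions, and the timings you get will depend on the compiler, the invocation (for example xlf versus xlf_r), and the file system.

```fortran
program io_timing
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)  ! wide integer for clock ticks
  integer, parameter :: n = 1000000                 ! hypothetical problem size
  real, allocatable :: x(:)
  integer :: i
  integer(i8) :: t0, t1, rate

  allocate (x(n))
  x = 1.0

  ! Version 1: one write statement per element (prep and cleanup every iteration)
  open (unit=11, status='scratch', form='unformatted')
  call system_clock(t0, rate)
  do i = 1, n, 2
    write (unit=11) x(i)
  end do
  call system_clock(t1)
  close (unit=11)
  print *, 'loop of writes, seconds: ', real(t1 - t0) / real(rate)

  ! Version 2: a single write statement with an array section
  open (unit=11, status='scratch', form='unformatted')
  call system_clock(t0)
  write (unit=11) x(1:n:2)
  call system_clock(t1)
  close (unit=11)
  print *, 'array section,  seconds: ', real(t1 - t0) / real(rate)
end program io_timing
```

Each version writes to its own scratch file so that neither run benefits from the file the other one created.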

 

Furthermore, using an array section gives better performance than using an io-implied-do loop. The reason is that the compiler translates an I/O statement containing an io-implied-do loop into a loop. For sub, the result is conceptually like the following:

subroutine sub(x, n)
  real x(n)

  ! prep, lock unit 11, ...
  do i = 1, n, 2
    write (unit=11) x(i)   ! one runtime call per item, no per-item prep or cleanup
  end do
  ! cleanup, unlock unit 11
end subroutine

With an array section, on the other hand, the compiler generates a single call to the I/O runtime routine and passes the whole array section as the argument. There is no loop at all, which improves performance further.
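
One thing to keep in mind when making this change: with unformatted sequential access, each write statement produces exactly one record. The DO loop version therefore writes many one-element records, while the single-statement versions write one record holding all the elements, so any code that reads the file has to match. The sketch below, which is only an illustration and uses a hypothetical program name (readback) and a small n, writes an array section in one statement and reads it back in one statement.

```fortran
program readback
  implicit none
  integer, parameter :: n = 9
  integer, parameter :: m = (n + 1) / 2   ! number of elements in x(1:n:2)
  real :: x(n), y(m)
  integer :: i

  x = [(real(i), i = 1, n)]

  open (unit=11, status='scratch', form='unformatted')
  write (unit=11) x(1:n:2)   ! one statement, one record with m values
  rewind (unit=11)
  read (unit=11) y           ! one statement reads that whole record
  close (unit=11)

  if (all(y == x(1:n:2))) then
    print *, 'round trip OK'
  else
    print *, 'mismatch'
  end if
end program readback
```

The same reasoning about per-statement overhead would be expected to apply to the read: a single read of the whole array avoids repeating the preparation and cleanup for every element.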
