OpenMP and array operations

I don’t know whether this is the actual source code you are compiling, but the $ is missing from every OpenMP directive, so they are treated as plain comments and the code runs sequentially. Assuming that was intentional, restoring the !$omp sentinel is still not enough: you open a second workshare section without closing the previous one, which is a compilation error with both gfortran and ifort.

Anyway, I got this compilable code:

        !$omp parallel
        !$omp workshare
        pderiv = diffw * pwest  + &
                 diffe * peast  + &
                 diffn * pnorth + &
                 diffs * psouth   &
                 - (diffw + diffe + diffn + diffs) * pcentre + pforce
        u = u + deltt * du
        !$omp end workshare
        !$omp end parallel

But execution fails with a segmentation fault with both gfortran and ifort (on Linux). This is probably because each thread tries to allocate a big temporary array in its own stack space. Assigning the result to an allocatable array tmp(:,:) instead of the pointer array pderiv(:,:), and then copying tmp to pderiv, fixes the problem. The very same problem actually happens with gfortran even without any OpenMP in the code. So in the end what I’m testing is rather:

        real, allocatable :: tmp(:,:)
        ...
        !$omp parallel
        !$omp workshare
        tmp = diffw * pwest  + &
              diffe * peast  + &
              diffn * pnorth + &
              diffs * psouth   &
              - (diffw + diffe + diffn + diffs) * pcentre + pforce
        pderiv = tmp
        u = u + deltt * du
        !$omp end workshare
        !$omp end parallel

gfortran, no OpenMP: 10.0" (1000 iterations max)
ifort, no OpenMP: 10.3"

gfortran, OpenMP workshare (1 thread): 15.6"
ifort, OpenMP workshare (1 thread): 10.8"

gfortran, OpenMP workshare (4 threads): 15.4"
ifort, OpenMP workshare (4 threads): 9.6"

And finally with classical !$omp parallel do loops:

gfortran, OpenMP do loops (4 threads): 5.8"
ifort, OpenMP do loops (4 threads): 4.2"
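The loop version timed above is not shown, so here is a minimal sketch of what it could look like. It is only an illustration under assumptions: the grid size (nx, ny), the value of deltt, the fill values, and the use of plain allocatable arrays instead of the original pointers are all hypothetical. The two statements are kept in separate loops with a barrier between them, so the ordering matches the workshare version even if the real arrays alias each other.

```fortran
program stencil_parallel_do
    implicit none
    ! Hypothetical grid size and time step (not from the original code).
    integer, parameter :: nx = 512, ny = 512
    real, parameter :: deltt = 0.01
    ! Plain allocatable arrays stand in for the original pointer arrays.
    real, allocatable :: diffw(:,:), diffe(:,:), diffn(:,:), diffs(:,:)
    real, allocatable :: pwest(:,:), peast(:,:), pnorth(:,:), psouth(:,:)
    real, allocatable :: pcentre(:,:), pforce(:,:), pderiv(:,:)
    real, allocatable :: u(:,:), du(:,:)
    integer :: i, j

    allocate(diffw(nx,ny), diffe(nx,ny), diffn(nx,ny), diffs(nx,ny))
    allocate(pwest(nx,ny), peast(nx,ny), pnorth(nx,ny), psouth(nx,ny))
    allocate(pcentre(nx,ny), pforce(nx,ny), pderiv(nx,ny))
    allocate(u(nx,ny), du(nx,ny))

    ! Arbitrary fill values, only there to make the sketch runnable.
    diffw = 1.0; diffe = 1.0; diffn = 1.0; diffs = 1.0
    pwest = 1.0; peast = 2.0; pnorth = 3.0; psouth = 4.0
    pcentre = 2.5; pforce = 0.0
    u = 0.0; du = 1.0

    !$omp parallel private(i, j)

    ! First array statement as an explicit loop nest; the outer (column)
    ! loop is shared among threads, the inner loop runs along contiguous memory.
    !$omp do
    do j = 1, ny
        do i = 1, nx
            pderiv(i,j) = diffw(i,j) * pwest(i,j)  + &
                          diffe(i,j) * peast(i,j)  + &
                          diffn(i,j) * pnorth(i,j) + &
                          diffs(i,j) * psouth(i,j)   &
                          - (diffw(i,j) + diffe(i,j) + diffn(i,j) + diffs(i,j)) &
                            * pcentre(i,j) + pforce(i,j)
        end do
    end do
    !$omp end do
    ! Implicit barrier here: the first statement is complete before the second starts.

    !$omp do
    do j = 1, ny
        do i = 1, nx
            u(i,j) = u(i,j) + deltt * du(i,j)
        end do
    end do
    !$omp end do

    !$omp end parallel

    print *, 'pderiv(1,1) =', pderiv(1,1), ' u(1,1) =', u(1,1)
end program stencil_parallel_do
```

With gfortran this builds with the -fopenmp flag. Since every element is written directly by an explicit loop, no whole-array temporary is needed, which may be part of why this form avoids the stack problem described earlier.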

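So yes, workshare doesn’t speed up anything here: with gfortran it is even slower than the sequential build, and with ifort the gain at 4 threads is marginal. Explicit parallel do loops, on the other hand, are about 1.7x (gfortran) to 2.5x (ifort) faster than the sequential build.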