Exercise: A 2D Jacobi smoother

The folder J2D contains subfolders whose names end in "-jacobi". They contain an OpenMP-parallel 2D stencil solver (taken from the RWTH Aachen examples collection) in C and Fortran 90.

Compile the code using the provided makefile. The program reads its input parameters from standard input, so redirect the input file into the command:

$ ./jacobi.exe < input

The program prints performance in MFlop/s.

  1. Looking at the code, calculate the conversion factor between MFlop/s and MLUP/s (million lattice site updates per second).
  2. Parallelize the code with OpenMP.
  3. Perform a roofline analysis, using the maximum memory bandwidth of 42 GB/s. What is the expected performance on a full Emmy socket (10 cores)? Does the measured performance come anywhere near that? Use the standard problem size of 4000x4000.
  4. Think about simple code optimizations. What is the best socket-level performance you can get?
  5. Does the performance of your code scale from 1 to 2 sockets? If it does not, fix it.
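
The arithmetic behind tasks 1 and 3 can be sketched as two small helper functions. This is only an illustration: the names roofline_mlups and mflops_to_mlups, and the parameters bytes_per_lup and flops_per_lup, are hypothetical and do not appear in the provided code. The actual flop count and memory traffic per update must be read off the solver's loop body.

```c
#include <assert.h>

/* Bandwidth-bound roofline limit in MLUP/s.
 * bandwidth_gbs: memory bandwidth in GB/s (1 GB = 1e9 bytes)
 * bytes_per_lup: memory traffic per lattice site update (code balance),
 *                an assumption the caller has to derive from the code */
double roofline_mlups(double bandwidth_gbs, double bytes_per_lup)
{
    assert(bandwidth_gbs > 0.0 && bytes_per_lup > 0.0);
    /* GB/s -> MB/s, then divide by bytes per update to get MLUP/s */
    return bandwidth_gbs * 1.0e3 / bytes_per_lup;
}

/* Convert a MFlop/s figure into MLUP/s.
 * flops_per_lup: floating-point operations per lattice site update,
 *                again to be counted in the actual loop body */
double mflops_to_mlups(double mflops, double flops_per_lup)
{
    assert(flops_per_lup > 0.0);
    return mflops / flops_per_lup;
}
```

For instance, with the 42 GB/s given above and a purely hypothetical code balance of 24 bytes per update, roofline_mlups(42.0, 24.0) evaluates to 1750 MLUP/s. The balance of the provided solver depends on how many arrays it streams per update and may well differ from this number.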

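For tasks 2 and 5, the following is a minimal sketch of an OpenMP-parallel sweep together with a parallel ("first touch") initialization, which is a common remedy when a memory-bound loop does not scale across sockets. It uses a generic out-of-place five-point Jacobi update, not the loop from the provided solver, whose structure may differ. The function names alloc_grid_first_touch and jacobi_sweep are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate an n x n grid and initialize it in a parallel loop that uses the
 * same static schedule as the sweep below. On a ccNUMA system each thread
 * thereby first touches roughly the rows it will later update, so the
 * operating system can place those pages in that thread's local memory. */
double *alloc_grid_first_touch(size_t n, double value)
{
    double *g = malloc(n * n * sizeof *g);
    if (g == NULL)
        return NULL;
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            g[i * n + j] = value;
    return g;
}

/* One out-of-place Jacobi sweep: every interior point of dst becomes the
 * average of its four neighbours in src. Boundary points are not written. */
void jacobi_sweep(size_t n, const double *restrict src, double *restrict dst)
{
    assert(n >= 3);
    #pragma omp parallel for schedule(static)
    for (size_t i = 1; i < n - 1; ++i)
        for (size_t j = 1; j < n - 1; ++j)
            dst[i * n + j] = 0.25 * (src[(i - 1) * n + j] + src[(i + 1) * n + j]
                                   + src[i * n + j - 1] + src[i * n + j + 1]);
}
```

A driver would allocate two grids with alloc_grid_first_touch, call jacobi_sweep repeatedly, and swap the two pointers after each sweep. Without the OpenMP compiler flag the pragmas are ignored and the code runs serially with the same result.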

Last modified: Tuesday, 18 September 2018, 10:43 PM