Exercise: Parallelization of STREAM triad

You will find crippled C and Fortran versions of the STREAM triad in the EX folder.

Your tasks are:

  1. Look at the code to understand what it is doing
  2. Compile it with the Intel compiler
  3. Run it on the Emmy cluster
  4. Parallelize the work loops with OpenMP worksharing constructs
  5. Perform a scaling run of the code inside a socket using likwid-pin
  6. Perform a scaling run across sockets
  7. Are the results reasonable?
  8. Issues to check: non-temporal stores and ccNUMA effects
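
As a starting point for the parallelization and ccNUMA tasks above, here is a minimal sketch in C of how the work loops might look. It assumes the usual STREAM triad kernel a[i] = b[i] + s * c[i]; the function names triad_init and triad are hypothetical and need not match the code in the EX folder. Both loops use the same static schedule, so under a first-touch page placement policy each thread initializes the part of the arrays it later works on.

```c
/* Sketch only: the function names and signatures are assumptions,
   not taken from the exercise code. Compile with OpenMP enabled;
   without it the pragmas are ignored and the loops run serially. */

/* Parallel initialization. With first-touch placement, each page ends
   up in the ccNUMA domain of the thread that writes it first. */
void triad_init(double *a, double *b, double *c, long n)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        a[i] = 0.0;
        b[i] = 1.0;
        c[i] = 2.0;
    }
}

/* STREAM triad: a[i] = b[i] + s * c[i].
   The same static schedule as in triad_init keeps each thread on the
   data it touched first. */
void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double s, long n)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        a[i] = b[i] + s * c[i];
    }
}
```

When judging the measured bandwidth, keep in mind that the kernel nominally moves 24 bytes per iteration (two loads, one store). If the store to a[] is not non-temporal, the write-allocate transfer adds another 8 bytes, so the actual memory traffic is 32 bytes per iteration.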

Last modified: Tuesday, 24 March 2015, 8:02 AM