Exercise: Parallelization of STREAM triad
You will find deliberately incomplete C and Fortran versions of the STREAM triad in the EX folder.
Your tasks are:
- Look at the code to understand what it is doing
- Compile it with the Intel compiler
- Run it on the Emmy cluster
- Parallelize the work loops with OpenMP worksharing constructs
- Perform a scaling run of the code inside a socket using likwid-pin
- Perform a scaling run across sockets
- Are the results reasonable?
- Issues to check: Non-temporal stores, ccNUMA issues
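As a starting point for the parallelization step, here is a minimal sketch of an OpenMP worksharing version of the triad kernel, a(i) = b(i) + s * c(i). It is not the code from the EX folder: the function names `init_arrays` and `triad`, the argument lists, and the initial values are assumptions made for illustration. The initialization loop is parallelized with the same static schedule as the triad loop, so that under a first-touch page placement policy each thread's share of the arrays should end up in its local ccNUMA domain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: initialize the arrays in parallel.
   Using the same static schedule as the triad loop means each thread
   first touches the pages it will later work on (relevant for ccNUMA). */
void init_arrays(double *a, double *b, double *c, long n)
{
    assert(a != NULL && b != NULL && c != NULL);
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        a[i] = 0.0;
        b[i] = 1.0;
        c[i] = 2.0;
    }
}

/* Hypothetical triad kernel: a[i] = b[i] + s * c[i].
   The restrict qualifiers tell the compiler the arrays do not overlap. */
void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double s, long n)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        a[i] = b[i] + s * c[i];
    }
}
```

When comparing measured bandwidth with expectations, keep in mind that one triad iteration moves 24 bytes (two 8-byte loads and one 8-byte store) if non-temporal stores are used, and 32 bytes if the store triggers a write-allocate transfer. Which of the two applies depends on the compiler and its options, so it is worth checking rather than assuming.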
Last modified: Tuesday, 24 March 2015, 8:02 AM