Weekly outline

  • General

    The seminar covers optimization and parallelization techniques for modern multi- and manycore systems. The topics are chosen from interesting contemporary problems in High Performance Computing on modern hardware like multicore processors, accelerators (e.g., GPGPUs or Xeon Phi), and clusters.

    Lecturer: Prof. G. Wellein (gerhard.wellein@fau.de), Martensstr. 1, Room 01.131. Phone -28136
    Location: 2.037 (e-Studio), Martensstr. 1 (RRZE), 2nd floor
    Time: Tuesday 16:00-18:00

    Either 2.5 or 5 ECTS credits will be granted, depending on whether the student gives one or two talks. In either case, a written seminar report is mandatory.

    Possible topics can be found in the intro talk (see below).


  • 20 April - 26 April

    Georg Hager: Report on the Dagstuhl Seminar "Advanced Stencil Code Engineering"  

    Faisal Shahzad: Paper walk-through "Building a fault tolerant application using GASPI communication layer"

  • 27 April - 3 May

    Artur Mariano (TU Darmstadt): Performance issues of sieving algorithms used to attack lattice-based cryptosystems

    Abstract: Lattice-based cryptography is emerging as one of the most promising post-quantum types of cryptography. However, its security in practice is yet to be determined, because most of the known attacks have not been fully analysed from a computational perspective. Sieving algorithms form a class of algorithms that can be used to attack several types of lattice-based cryptosystems. In this talk, I will present the most practical sieving algorithms to date and address the key issues of their parallelization and optimization on multi-core CPUs.

  • 4 May - 10 May

    Q&A for students (not in e-Studio)

  • 11 May - 17 May

    No seminar this week (will be shifted to a later date)

  • 18 May - 24 May

    Prof. Dr. R. Neder (LS Allgemeine Mineralogie/Kristallographie): Refinement of disordered crystal structures

  • 25 May - 31 May

    No seminar on Tuesday, May 26 (Berch-Dienstag)!

    The seminar from Tuesday, June 2 will be moved forward to Wednesday, May 27, 10am c.t., RRZE e-Studio:

    Sebastian Hack (Universität des Saarlandes): AnyDSL: Building Domain-Specific Languages for Productivity and Performance

    Abstract: To achieve good performance, programmers have to carefully tune their application for the target architecture. Optimizing compilers fail to produce the "optimal" code because their hardware models are too coarse-grained. Moreover, many important compiler optimizations are computationally hard even for simple cost models. It is unlikely that compilers will ever be able to produce high-performance code automatically for today's and future machines. Therefore, programmers often optimize their code manually. While manual optimization is often successful in achieving good performance, it is cumbersome, error-prone, and unportable. Creating and debugging dozens of variants of the same original code for different target platforms is just an engineering nightmare. Domain-specific languages (DSLs) are an appealing solution to this problem. A DSL offers language constructs that can express the abstractions used in the particular application domain. This way, programmers can write their code productively, on a high level of abstraction. Very often, DSL programs look similar to textbook algorithms. Domain and machine experts then provide efficient implementations of these abstractions. This way, DSLs enable the programmer to productively write portable and maintainable code that can be compiled to efficient implementations. However, writing a compiler for a DSL is a huge effort that people are often not willing to make. Therefore, DSLs are often embedded into existing languages to save some of the effort of writing a compiler. In this talk, I will present the AnyDSL framework we have developed over the last three years. AnyDSL provides the core language Impala that can serve as a starting point for almost "any" DSL. New DSL constructs can be embedded into Impala in a shallow way, that is, just by implementing the functionality as a (potentially higher-order) function. AnyDSL uses online partial evaluation to remove the overhead of the embedding entirely. To demonstrate the effectiveness of our approach, we generated code from generic, high-level textbook image-processing algorithms that has, on each and every hardware platform tested (Nvidia/AMD/Intel GPUs, SIMD CPUs), beaten the industry standard benchmark (OpenCV) by 10-35% (!), a standard that has been carefully hand-optimized for each architecture over many years. Furthermore, the implementation in Impala has one order of magnitude fewer lines of code than a corresponding hand-tuned expert code. We also obtained similar first results in other domains.

    This is joint work with Roland Leißa, Klaas Boesche, Richard Membarth, and Philipp Slusallek.

  • 1 June - 7 June

    Seminar moved forward to Wednesday, May 27, 10am c.t. (see the entry for 25 May - 31 May)

  • 8 June - 14 June

    Christopher Bross: What is HPCG?

  • 15 June - 21 June

    Christopher Bross: First seminar talk on "Relaxed Synchronization"

    Lena Leitenmaier: First seminar talk on "SPEC Benchmarks"

  • 22 June - 28 June

    G. Hager/G. Wellein: Fooling the masses with performance results on parallel computers

  • 29 June - 5 July

    Faisal Shahzad: Update on fault tolerance work with MPI-ULFM

    Moritz Kreutzer: Optimized Tall-Skinny MM

    BBQ after the seminar (17:30)

  • 6 July - 12 July

    Seminar shifted to Wednesday, July 8, 14:00 s.t. (e-Studio)

    Dominik Thoennes: 2nd seminar talk on OpenCL

  • 13 July - 19 July

    No seminar (ISC week)

  • 20 July - 26 July

    Ayesha Afzal: Master thesis final presentation on "The cost of computation"

  • 27 July - 2 August

    Tuesday, July 28, 11:00am, e-Studio:

    Anja Gerbes: Leistungsanalyse mit Hardware-Performanzzählern (Performance analysis with hardware performance counters)

    Abstract: Efficiency can be crucial in scientific computing. A profiler gives hints about inefficient parts of a program. Modern processors can count performance-relevant events during program execution using programmable counters in special registers. Unlike software-based approaches, these hardware performance counters do not affect the flow of the program. The set of available hardware performance counters varies with the processor architecture. This thesis points out potential weak spots in programs and helps to discover them using characteristic factors derived from hardware performance counters. By understanding the performance counters, we can find correlations between certain C++ constructs and these characteristic factors; this is done with specific benchmarks. Characteristic factors for evaluating C++ program code and analysing its efficiency will be presented. These factors indicate the performance of the program part under examination. In addition, this thesis provides a framework that enables the reader to explore the characteristic factors on other hardware platforms with minimal effort. To assess these factors, special benchmarks were developed that explore the characteristics of a processor using hardware performance counters. Depending on their configuration, these benchmarks are constructed to perform either optimally or poorly.

    Wednesday, July 29, 15:30: Christopher Bross: 2nd seminar talk (Relaxed Sync). Shifted to winter term!

    Wednesday, July 29, 16:30, e-Studio: Lena Leitenmaier: 2nd seminar talk (SPEC OMP Benchmarks)